LONI/LSU RP Update

Honggao Liu, Ph.D., Director, HPC @ LSU; NSF HPCOPS PI, LONI. November 5, 2009. LONI/LSU RP Update. Queen Bee Update: fairly reliable; down four times, for a total of 116 unavailable hours, in the past four months.


Presentation Transcript


  1. LONI/LSU RP Update
     • Honggao Liu, Ph.D., Director, HPC @ LSU; NSF HPCOPS PI, LONI
     • November 5, 2009

  2. Queen Bee Update
     • Fairly reliable: down four times, for a total of 116 unavailable hours, in the past four months.
     • The network connection between QB and the rest of the TeraGrid used a dedicated 1 Gbps connection to LONI and a 10 Gbps connection from LONI to Chicago. The planned 10 Gbps connection to QB has been delayed by physical construction at a carrier hotel building in downtown Baton Rouge, which is preventing the installation of new fiber. LONI ordered a 10GE Metro-E circuit between LSU and QB in September as a local wave service from AT&T. We expect the local wave service to be operational at the beginning of January 2010.
     • Four incidents occurred in the past four months. Each involved a user gaining root privilege on the QB head nodes. In each case, the affected users were notified and forced to change their passwords, the head nodes were reinstalled, and kernel patches were applied.
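The downtime figure on this slide can be turned into an availability percentage. The sketch below is illustrative only: the slide gives no exact dates, so it assumes "the past four months" means July 1 through October 31, 2009, and the function and variable names are made up for the example.

```python
from datetime import date

# ASSUMPTION: "the past four months" is taken to mean July 1 through
# October 31, 2009. The slide does not state the exact reporting period.
PERIOD_START = date(2009, 7, 1)
PERIOD_END = date(2009, 11, 1)  # exclusive end of the period

DOWNTIME_HOURS = 116  # total unavailable hours, from the slide


def availability(downtime_hours, start, end):
    """Return the fraction of wall-clock time the system was available."""
    total_hours = (end - start).days * 24
    return (total_hours - downtime_hours) / total_hours


total_hours = (PERIOD_END - PERIOD_START).days * 24
qb_availability = availability(DOWNTIME_HOURS, PERIOD_START, PERIOD_END)
print(f"{total_hours} hours in period, availability {qb_availability:.2%}")
```

Under that assumed period there are 2,952 wall-clock hours, so 116 hours of downtime corresponds to roughly 96% availability. A different choice of start and end dates would shift the figure slightly.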

  3. Queen Bee Usage
     • Queen Bee had over 85% total usage in the past four months

  4. Queen Bee Usage
     • Users, jobs and SUs for Queen Bee are shown relative to the peak of each data type.
     • This puts all three data types on the same graph, so one can see how they relate.
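The peak-relative scaling described on this slide can be sketched as follows. The monthly numbers are hypothetical placeholders, not Queen Bee data, and `normalize_to_peak` is a name chosen for this example.

```python
def normalize_to_peak(values):
    """Scale a series so that its largest value becomes 1.0."""
    peak = max(values)
    if peak == 0:
        # Avoid dividing by zero for an all-zero series.
        return [0.0 for _ in values]
    return [v / peak for v in values]


# HYPOTHETICAL monthly figures, for illustration only (not from the slides).
usage = {
    "users": [40, 50, 45, 30],
    "jobs": [1200, 900, 1500, 1100],
    "SUs": [2.0e6, 3.1e6, 2.8e6, 3.5e6],
}

# After scaling, every series lies in [0, 1] and can share one y-axis.
relative = {name: normalize_to_peak(series) for name, series in usage.items()}

for name, series in relative.items():
    print(name, [round(v, 2) for v in series])
```

Dividing each series by its own peak discards absolute magnitudes, which is the point here: users, jobs and SUs differ by orders of magnitude, and only their relative trends are being compared.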

  5. LONI’s New TeraGrid Projects
     • Project: SAGA (http://saga.cct.lsu.edu/) Deployment on TeraGrid
     • Project Aim I: Deploy SAGA on major TeraGrid resources (Kraken, Ranger, Abe/QB)
       • Get stable release (Ole Weidner)
         • Scheduled Date: 31st Oct, 2009
         • Estimated Date: 15th Nov, 2009
       • Make available via CTSS (Lukasz)
         • Work with RP and GIG (progressing well)
         • Test deployment available on QB
     • Project Aim II & III: SAGA-based Shell & Developing/Deploying FAUST (Framework for Adaptive Ubiquitous Scalable Tasks)
       • Planned for second half (Jan’10-May’10) of project
       • Depends upon stable and reliable deployment on TG

  6. SAGA Deployment on TeraGrid
     • Project Aim IV: Documentation (Andre)
       • Programming Manual and Exercise (Andre, Bety) in progress
       • http://faust.cct.lsu.edu/trac/saga/wiki/Tutorials/NeSC2009
     • Tutorial and Training
       • Held several training events in Fall 2009
         • International Summer School on Grid Computing
         • Advanced Distributed Summer School
         • NeSC-Edinburgh Training
       • Planned LONI training (January 2010)?
       • Is there interest in a TG-wide tutorial/training?
     • We currently provide source releases only; they are available at http://saga.cct.lsu.edu/download/
     • We are following a 6/8-weekly release cycle.
       • 1.4 release due date 15 Nov (TeraGrid version)
       • 1.5 release due date 15 Jan 2010
     • File a bug or feature request here: http://faust.cct.lsu.edu/trac/saga/

  7. LONI’s New TeraGrid Projects
     • Project: TeraGrid-LONI-DEISA Interoperability
     • Background: Demonstrate the advantages of Scale-Out and Interoperability (across TG and DEISA) for appropriate scientific problems
     • Aim: To enhance the understanding of HIV-1 enzymes using replica-based methods across federated TG-DEISA-LONI
       • Do so using a general-purpose, extensible, scalable approach
       • Test limits of Distributed Scale-Out, both algorithmic and infrastructure limits
       • As part of the VPH project, to ultimately help build the CI for quick, efficient (patient-specific) decision-tools using predictive MD of drugs and enzymatic targets (HIV-1 protease)
     • Application Models of HIV-1 and drugs created
     • Integration of LAMMPS with SAGA
     • Initial Replica-Exchange performed
     • Integration of LAMMPS with SAGA-based BigJob
     • Initial isolated runs on TeraGrid: Ranger and Abe
     • Working on launching on DEISA
     • SAGA-UNICORE (via GridSAM) testing in place

  8. TeraGrid-LONI-DEISA Interoperability: Next Steps
     • Integration of SAGA into Binding Affinity Calculator (BAC) tools to facilitate distributed Scale-Out
     • Protonation study of Ritonavir bound to HIV-1 Protease wild type (on QB/Ranger)
     • Study of binding affinity between 6 HIV-1 Protease mutants and the drug Ritonavir using SAGA-BAC Tools
     • Develop tools for Post-Processing on UK NGS and DEISA
     • Investigation of Reverse Transcriptase with Replica-Exchange (if time permits)

  9. LONI’s New TeraGrid Projects
     • Project: Extension of PetaShare to TeraGrid
     • PetaShare is an NSF-funded project that is deploying additional disk and tape storage at LONI sites and developing user-friendly data-aware storage systems, data-aware schedulers, and cross-domain metadata schemes.
     • PetaShare currently provides distributed data storage and management capabilities to nine LONI institutions connected via the high-speed LONI network.
     • This project will extend PetaShare to TeraGrid so that TeraGrid users can access their datasets more conveniently through the transparent PetaShare interfaces.
     • TeraGrid and LONI users will be able to easily share and exchange data with each other.
     • PetaShare data access and retrieval services are currently optimized for the LONI network and will need to be enhanced and optimized for the wide-area TeraGrid networks.
     • PetaShare services currently run only on Linux-based systems and will need to be ported to the different architectures and operating systems on TeraGrid.
     • Ahmet Topcu was hired from IU for the TG PetaShare project and started here on June 15.

  10. LSU HPC/CCT Update
     • New Linux Cluster: Philip
       • 38 nodes in total, each with 8 Intel “Nehalem” Xeon cores @ 2.93GHz, a 160GB HD, and 1Gb Ethernet
       • 32 nodes with 24GB of 1333MHz RAM, 3 nodes with 96GB of 1066MHz RAM, and 3 nodes with 48GB of 1066MHz RAM
       • Opened to users in September. Not a TeraGrid resource, but has potential for OSG jobs
     • New Educational Cluster dedicated to students: Arete
       • 72 nodes in total. 56 nodes have 8 AMD Opteron cores @ 2.3GHz and 16 nodes have cores @ 2.7GHz; 8GB RAM, 4x146GB HDD, InfiniBand and 1Gb Ethernet
       • Available for campus-wide use beginning in Spring 2010
     • New Lustre storage
       • 240TB of DDN storage, purchased through Dell, was received and deployed as long-term storage and will be allocated to LSU HPC users
       • The current 55TB of Panasas storage will be upgraded to 80TB in December
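As a quick check on the Philip figures, the per-node numbers above can be added up into cluster-wide totals. This is a back-of-the-envelope sketch using only the counts on the slide; the variable names are illustrative. Arete is left out because the slide does not give a core count for its 16 faster nodes.

```python
# Figures from the slide for the Philip cluster.
CORES_PER_NODE = 8

# (number of nodes, GB of RAM per node) for each memory configuration.
memory_groups = [
    (32, 24),
    (3, 96),
    (3, 48),
]

total_nodes = sum(nodes for nodes, _ in memory_groups)
total_cores = total_nodes * CORES_PER_NODE
total_ram_gb = sum(nodes * gb for nodes, gb in memory_groups)

print(f"nodes={total_nodes} cores={total_cores} ram={total_ram_gb}GB")
```

The three memory groups sum to the 38 nodes stated on the slide, which gives 304 cores and 1,200GB of RAM across the cluster.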

  11. LONI/LSU Training
     • Five workshops have been held at LONI/LSU since June
     • 13 tutorials have been provided since September at LSU and on Access Grid
