
CyberInfrastructure at NSF


Presentation Transcript


  1. CyberInfrastructure at NSF. Presented to: iGrid 2005, 27 Sep 2005. José Muñoz, Ph.D., Office of CyberInfrastructure: Deputy Office Director, Senior Science Advisor

  2. Outline • NSF organizational changes • CyberInfrastructure (CI) at NSF • New Acquisition Announcement • NSF’s TeraGrid Effort • TeraGrid Examples • Summary Muñoz

  3. NSF [will provide] world-class computing environments to academic researchers across the complete spectrum of basic research in science and engineering. In addition to raw computational capability, NSF will ensure that these environments include a balanced mix of architecturally diverse computing resources and associated high-end storage and visualization platforms. Excerpt from Report of the CIIWG May 2005 Muñoz

  4. Recent Happenings • CyberInfrastructure Council (CIC) established • Office of Cyberinfrastructure (OCI) established • Agency-wide Strategic Planning Process Underway • Search for OCI Office Director • Advisory Committee for CI Muñoz

  5. [Organization chart: NSF Director at the top, with OCI (successor to the SCI division) alongside CISE and its divisions CNS, IIS, and CCF] Muñoz

  6. Education & Training Cyberinfrastructure Components Collaboration, Communication & Remote Access Data, Data Analysis & Visualization High Performance Computing Muñoz

  7. CI Strategic Planning “CI Vision” document • Ch. 1: Call to Action • Ch. 2: Strategic Plan for High Performance Computing • Ch. 3: Strategic Plan for Data, Data Analysis & Visualization • Ch. 4: Strategic Plan for Collaboration, Communication & Remote Access • Ch. 5: Strategic Plan for Education & Workforce Development Muñoz

  8. Strategic Plan for High Performance Computing (FY 2006 – 2010): Creating a World-Class HPC Environment to Enable Petascale Science and Engineering, Driven by the Needs of the Science and Engineering Communities Muñoz

  9. Strategic Plan for High Performance Computing (2006-2010) [Diagram: software service providers (SSPs), the private sector, and agency partners supply portable, scalable applications software & services; HPC resource providers operate science-driven HPC systems (compute engines, local storage, visualization facilities) serving the S&E community] Muñoz

  10. Acquisition Context • One or two high performance computing systems will be acquired from one or more hardware vendors with funds supplied by NSF. • One or more resource providers (RPs) will manage and operate the system(s), including providing basic user support and services. • Two RFI activities have already been executed: one for potential resource providers and vendors, one for application scientists. • Formal solicitation announcement from NSF’s Office of CyberInfrastructure by 30 Sep 05. Muñoz

  11. Acquisition Strategy [Chart: science and engineering capability (logarithmic scale) vs. fiscal year, FY06-FY10, with Track 1 system(s) above Track 2 systems above typical university HPC systems] Muñoz

  12. TeraGrid (ETF) and HPC. HPC: a world-class HPC environment for petascale science and engineering • Production HPC is the focus • Portability and scalability addressed in software engineering services • NSF and partner agencies support a range of activities. TeraGrid: provides a unified user environment to support high-capability, production-quality CI services • Production HPC is one of several CI service components • Integration of services provided by grid technologies • Distributed, open architecture; sites may join. Muñoz

  13. [Image-only slide] Muñoz Courtesy of TeraGrid

  14. TeraGrid Partners: www.teragrid.org

  15. Computational Science is no longer a cottage industry • Can we make many computing/data centers behave as one center? • Defining accessibility, performance, administration, policy… • Can national and international resources be integrated with community portals? • Seamlessly extending portals built around local resources • What steps can be made towards a national or global cyberinfrastructure? • Establishing an extensible technical and cooperative basis Muñoz Courtesy of TeraGrid

  16. TeraGrid Today: the Extensible Terascale Facility • August 2005: Phase 2 begins • Science outreach and operations • Architectural & scale diversity • 13 partners: adding UNC, ISI, GaTech, UWisc • 16 systems, 9 architectures (adding Cray) • $148M over 5 years • 138 full-time equivalents. Organization: resource providers and facilities, plus an integration team (Grid Infrastructure Group) covering CI operations, networking & security; community engagement (Science Gateways); software integration; and user support. Muñoz Courtesy of TeraGrid

  17. Resource Providers: resources and services. Grid Infrastructure Group (GIG): architecture, software, operations, common services, coordinated user support, Science Gateways. Courtesy of TeraGrid

  18. The TeraGrid Strategy • TeraGrid DEEP: enabling terascale science. Make science more productive through a unified set of very-high-capability resources. • TeraGrid WIDE: empowering science communities to leverage TeraGrid capabilities. Bring TeraGrid capabilities to the broad science community. • TeraGrid OPEN: driving the evolution of cyberinfrastructure. Interoperation with other grids, and facilitating the addition of resources from new partners into the TeraGrid environment. Muñoz Courtesy of TeraGrid

  19. User Communities • DEEP: expert and advanced users (100s). Want to log into supercomputers, develop and run applications; interest in turnaround; can use a variety of platforms. • WIDE: broad science community (1,000s). Want to use applications provided by others to carry out studies; interest in turnaround and in avoiding the details of computing and data management; interest in workflow management tools to automate procedures. Also public access (10,000s, including education): want to use simple applications for a small, possibly fixed, set of jobs. Muñoz Courtesy of TeraGrid

  20. TeraGrid WIDE: What are Science Gateways? Gateways enable whole communities to take advantage of TeraGrid resources, and engage science communities that are not traditional users of supercomputing centers, by providing community-tailored access to TeraGrid services and capabilities. Models: • Web-based community portals employ grid services to provide TeraGrid-deployed applications. • Coordinated access points enable users to move seamlessly between TeraGrid and other grids. • Application programs running on users' machines access services in TeraGrid (and elsewhere). All take advantage of existing community investment in software, services, education, and other components of cyberinfrastructure. Muñoz Courtesy of TeraGrid
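The portal model above can be sketched in a few lines. Everything in this sketch (the user table, application paths, and the submit_to_grid stand-in) is hypothetical, not an actual TeraGrid or Globus API: the point is only that the portal authenticates its own users and then runs pre-deployed applications under a single community credential.

```python
"""Minimal sketch of the portal model for a Science Gateway (all names invented)."""
import uuid

REGISTERED_USERS = {"alice", "bob"}                           # portal-level accounts
DEPLOYED_APPS = {"storm-forecast": "/apps/forecast/bin/run"}  # assumed, pre-staged paths
COMMUNITY_CREDENTIAL = "community-cert.pem"                   # one credential for all users
USAGE_LOG = []                                                # stands in for accounting

def submit_to_grid(job):
    """Stand-in for a grid submission service (in practice, grid middleware)."""
    job_id = str(uuid.uuid4())
    print(f"submitted {job['executable']} as {job_id}")
    return job_id

def handle_portal_request(portal_user, app_name, args):
    if portal_user not in REGISTERED_USERS:       # portal authenticates its own users
        raise PermissionError(f"unknown portal user: {portal_user}")
    job = {"executable": DEPLOYED_APPS[app_name],
           "arguments": args,
           "credential": COMMUNITY_CREDENTIAL}    # community, not individual, credential
    job_id = submit_to_grid(job)
    USAGE_LOG.append((portal_user, job_id))       # attribute usage to the individual
    return job_id

handle_portal_request("alice", "storm-forecast", ["--region", "midwest"])
```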

  21. Challenges in Gateways • Many common needs and issues across Gateways • Accounts – support for community accounts • Accounting – services to track and audit usage • Security – individual, portal or community certificates • Scheduling – centralized job management • Web Services – standardized interfaces to the above • Portal Middleware – integration with available frameworks • Data Access – supporting data collections • Servers – for hosting portals within TeraGrid • Primer – for Science Gateways Muñoz Courtesy of TeraGrid
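As a hedged sketch of the accounts and accounting items above: because jobs run under a shared community account, a gateway needs its own audit trail mapping each grid job back to the individual portal user. The file name and record format below are invented for illustration; they are not TeraGrid's actual accounting service.

```python
"""Toy gateway accounting: one auditable record per job, aggregated per user."""
import csv
import time

AUDIT_FILE = "gateway_audit.csv"   # assumed local audit log

def audit(portal_user, community_account, job_id, su_charge):
    """Append one record per job for later reconciliation with TeraGrid usage."""
    with open(AUDIT_FILE, "a", newline="") as f:
        csv.writer(f).writerow(
            [time.strftime("%Y-%m-%dT%H:%M:%S"), portal_user,
             community_account, job_id, su_charge])

def usage_by_user(path=AUDIT_FILE):
    """Aggregate service units (SUs) charged, keyed by portal user."""
    totals = {}
    with open(path, newline="") as f:
        for _, user, _, _, su in csv.reader(f):
            totals[user] = totals.get(user, 0.0) + float(su)
    return totals

audit("alice", "community01", "job-42", 16.0)
print(usage_by_user())   # {'alice': 16.0}
```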

  22. Gateways that Bridge to Community Grids [Diagram: streaming observations feed data mining and a forecast model for forming storms, via on-demand grid computing] • Many community grids already exist or are being built: NEESGrid, LIGO, Earth Systems Grid, NVO, Open Science Grid, etc. • TeraGrid will provide a service framework to enable access that is transparent to users; the community maintains and controls the Gateway. • Different communities have different requirements: NEES and LEAD will use TeraGrid to provision compute services; LIGO and NVO have substantial data-distribution problems; all of them require remote execution of multi-step workflows. Muñoz Courtesy of TeraGrid
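The multi-step workflow requirement can be illustrated with a toy dependency-ordered runner. Community grids used full workflow systems rather than anything this simple; the three step names below just echo the storm-forecasting diagram on the slide.

```python
"""Toy workflow runner: each step runs once all of its dependencies have completed."""

# step name -> (dependencies, action); actions stand in for remote grid execution
WORKFLOW = {
    "ingest_observations": ((), lambda: print("streaming observations in")),
    "mine_data": (("ingest_observations",), lambda: print("mining for storm signatures")),
    "run_forecast_model": (("mine_data",), lambda: print("launching on-demand forecast")),
}

def run(workflow):
    done = set()
    while len(done) < len(workflow):
        for step, (deps, action) in workflow.items():
            if step not in done and all(d in done for d in deps):
                action()          # in practice: remote execution on a grid site
                done.add(step)

run(WORKFLOW)
```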

  23. Neutron Science Gateway: Spallation Neutron Source, Oak Ridge National Laboratory • 17 instruments • Users worldwide get “beam time” • Need access to their data and post-processing capabilities • Day 1: April 2006 • First users: September 2006 • General users: June 2007 • Opportunity to impact how large facilities are designed Muñoz Courtesy of TeraGrid

  24. Grid Portal Gateways [Screenshot: workflow composer] • A portal accessed through a browser or desktop tools • Provides grid authentication and access to services • Provides direct access to TeraGrid-hosted applications as services • Required support services: authorization services; application deployment services; searchable metadata catalogs; information space management; workflow managers; resource brokers • Builds on NSF & DOE software: NMI Portal Framework, GridPort; NMI grid tools: Condor, Globus, etc.; OSG and HEP tools: Clarens, MonALISA Muñoz Courtesy of TeraGrid
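One of the listed support services, a resource broker, might look like the following sketch. The site table and its numbers are made up; a real broker would query live grid information services rather than a static dictionary.

```python
"""Toy resource broker: pick the feasible site with the shortest estimated wait."""

# site -> (free CPUs, estimated queue wait in hours); numbers are invented
SITES = {"NCSA": (256, 2.0), "SDSC": (128, 0.5), "TACC": (512, 4.0)}

def broker(cpus_needed):
    """Return the site that can fit the job and has the lowest estimated wait."""
    feasible = {site: wait for site, (free, wait) in SITES.items()
                if free >= cpus_needed}
    if not feasible:
        raise RuntimeError("no site can satisfy the request")
    return min(feasible, key=feasible.get)

print(broker(200))   # -> NCSA: SDSC is too small, TACC has a longer wait
```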

  25. CMS on the TeraGrid: Experiments for the Large Hadron Collider (LHC) • Compact Muon Solenoid experiment; PI: Harvey Newman, Caltech; TeraGrid ASTA team: Tommy Minyard, Edward Walker, Kent Milfeld • Simulations run simultaneously across multiple TeraGrid sites (SDSC, NCSA, and TACC) using the grid middleware tool GridShell • Complex workflow consisting of multiple execution stages running a large number of serial jobs (~1000s), with very large datasets stored on SDSC HPSS and staged to local sites prior to job runs • Used 420K CPU hours on TeraGrid systems last year; usage is expected to increase this year and in coming years. The CMS experiment is looking for the Higgs particle, thought to be responsible for mass, and for supersymmetry, a necessary element of string theory; it is currently running event simulations and reconstructions to validate methods before experimental data become available. “Using the NSF TeraGrid for Parametric Sweep CMS Applications”, Proc. Int. Symp. on Nuclear Electronics and Computing (NEC’2005), Sofia, Bulgaria, Sept. 2005. Muñoz Courtesy of TeraGrid
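A parametric sweep of this shape can be sketched as below. The real runs used GridShell and the sites' batch systems; the command name, dataset paths, and staging convention here are invented, and the point is only that each of ~1000 serial jobs gets its own parameters plus a dataset staged in from archival storage before it runs.

```python
"""Sketch of generating job specs for a ~1000-job parametric sweep (names invented)."""

def make_jobs(n_jobs, dataset_template, seed0=1000):
    """Build one job spec per parameter point; each stages its own input dataset."""
    jobs = []
    for i in range(n_jobs):
        jobs.append({
            "stage_in": dataset_template.format(i),     # e.g. pulled from HPSS first
            "command": ["cms_simulate", "--seed", str(seed0 + i)],
            "stdout": f"event_sim_{i:04d}.log",
        })
    return jobs

sweep = make_jobs(1000, "/archive/cms/dataset_{:04d}.dat")
print(len(sweep), sweep[0])
```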

  26. CyberShake 2 • Incorporate dynamic ruptures into large propagation simulations • First attempt to calculate physics-based probabilistic hazard curves for Southern California using full waveform modeling • Uses TeraGrid compute and storage resources; a large-scale simulation of magnitude 7.7 seismic wave propagation on the San Andreas Fault generates more than 50 TB of output [Figure: major earthquakes on the San Andreas Fault, 1680-present: 1680 M 7.7, 1857 M 7.8, 1906 M 7.8] Such simulations provide potentially immense benefits, saving both many lives and billions in economic losses. Funded by NSF GEO/CISE. PIs: Olsen (SDSU), Okaya (USC); TG ASTA team: Cui (SDSC), Reddy (GIG). Muñoz

  27. Arterial Blood Flow Studies: Cross-Site Runs and Computational Steering on the TeraGrid • PIs: Karniadakis & Dong (Brown); Boghosian (Tufts) • Develop and optimize infrastructure by Nov. 2005: job manager and queuing system; Globus and MPICH-G2 installation; performance monitoring and optimization; real-time performance data gathering for visualization; various MPICH-G2 porting efforts; visualization support • First simulation of the complete human arterial tree, Mar. 2006 Muñoz
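The cross-site idea can be illustrated with a minimal MPI hello-world. The project itself used MPICH-G2 from C/MPI; this mpi4py version is only an assumed stand-in showing that each rank reports where it runs, so in a cross-site launch ranks would land on hosts at different TeraGrid sites.

```python
"""Minimal MPI illustration of a cross-site run (mpi4py stand-in for MPICH-G2)."""
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()              # this process's id within the whole run
size = comm.Get_size()              # total ranks, possibly spanning sites
host = MPI.Get_processor_name()    # the host this rank landed on

print(f"rank {rank} of {size} running on {host}")

# e.g. launched with: mpiexec -n 4 python cross_site_hello.py
```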

  28. TeraGrid Usage by Discipline: Jan. 2004 – June 2005. So far: 280 PIs, 550 projects, ~800 users, 42M SUs used (~900M Cray X-MP hours). [Pie chart, approximate shares: Biology 26%, Chemistry 21%, Physics 15%, Engineering 12%, Astronomy 9%, Geoscience 7%, Materials 5%, Computer Sci/Eng 4%, Math 1%] Courtesy of TeraGrid
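A quick check of the slide's numbers: pairing 42M service units (SUs) with ~900M Cray X-MP hours implies roughly 21 X-MP-equivalent hours per SU. The snippet below just recomputes that implied ratio; it is a reading of the slide's figures, not an official TeraGrid conversion factor.

```python
# Implied normalization factor from the slide's own numbers (not an official rate).
sus_used = 42e6      # service units consumed, Jan. 2004 - June 2005
xmp_hours = 900e6    # stated Cray X-MP-equivalent hours over the same period

print(f"~{xmp_hours / sus_used:.1f} Cray X-MP hours per SU")  # ~21.4
```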

  29. TeraGrid Success Stories • Large Earthquake Impact Models in TeraShake • S-CA model shows directional effects larger than expected [Olsen/Okaya/Minster] • Enhancing Oil Recovery Techniques with IPARS • Data-driven optimization employing 4 TeraGrid resources [Wheeler/Saltz/Parashar] • Improving Groundwater Cleanup Decisions • Identifies tradeoffs to reduce contamination at less cost [Minsker/Loftus] • Understanding Dark Energy with NVO • Comparing astronomical measurements with simulations [Connolly/Scranton] • Analysis of Amphiphilic Liquids in TeraGyroid • 2004 ISC Award for Integrated Data and Information Mgt. [Coveney/Boghosian] • Protein Sequence Analysis with GADU/GNARE • 2.3M sequences analyzed in 8.8 days [Maltsev] • Identifying Brain Disorders with BIRN • Analysis of Hippocampus shapes revealed disease patterns [Miller/Beg] Courtesy of TeraGrid Muñoz

  30. Cyberinfrastructure Vision: NSF will support the development and maintenance of a comprehensive cyberinfrastructure essential to 21st-century advances in science and engineering. [Map: Internet2 universities] Muñoz

  31. Thank You Muñoz
