Danny Powell Executive Director National Center for Supercomputing Applications

Presentation Transcript


  1. NCSA – Evolution of an HPC Center: Infrastructure and Services for Scientific Analysis and Decision Support • Danny Powell, Executive Director, National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign

  2. Talk Outline • About NCSA – Who we are now • Basic numbers • Mission • Basic methods of operation • Projects and Customers • Cyber-Infrastructure and Science Projects • Industry • Education • Government – Public Health • Evolving into a successful HPC Center • How we changed over the years • User service – centric focus • Your staff – it’s almost always about the people • Management – effective roles National Center for Supercomputing Applications

  3. University of Illinois at Urbana-Champaign: National Center for Supercomputing Applications • Applied Research Unit of University of Illinois • Origin: 1986 NSF-funded national supercomputing centers • Original Mission: Provide state-of-the-art computing and data capabilities to the nation’s scientists and engineers • Develop software tools and software systems needed to make full use of advanced computing and data systems (Mosaic, Apache Web Server, Telnet, D2K, MyProxy, numerous others…) • NCSA by the Numbers • Approximately 275 staff (250 technical/professional staff) • Two facilities (NCSA Building, NPCF) (>220k sq.ft)

  4. Basic Facts about NCSA • Computing/Data Resources • Blue Waters: 11+ Petaflop (1+ PF sustained) computer (Cray) • Most powerful machine in NSF portfolio – NSF’s only Tier One machine • $350 million project ($200 million construction - $150 million operations) • Mid-Range Supercomputing systems: ~200 TF • Archival storage system: 500+ PB • Advanced visualization systems • Types of projects • Local, National and Global scale • Individual tools to large CI frameworks • Point solutions to systemic improvements • IP • Majority of work at NCSA is open source. • Can effectively deal with secure environments, proprietary codes, confidentiality

  5. It is All About Working with Others • Funding • Federal Agencies, Industry, State of Illinois, Foundations, International sources • Most projects are partnerships with others (88%) • Leveraging skills/resources of others • Goal to be viewed as the “Partner of Choice” • IACAT (Institute for Advanced Computing Applications and Technologies) • Integrates applied research of NCSA with basic research teams of Universities • International Program • 30+ institutions from 22+ countries • Faculty and student exchanges, joint projects, workshops, technology sharing • Industrial Program • Nationally/internationally recognized for its level of functional interaction, technology transfer, student engagement • 23+ companies (Fortune 50/100/500, smaller technology companies)

  6. NCSA Bridges Basic Research and Commercialization with Application • Product Life Cycle: Phase 0 Concept/Vision; Phase 1 Feasibility; Phase 2 Design/Development; Phase 3 Prototyping; Phase 4 Production/Deployment • Theoretical & Basic Research; Applied Prototyping & Development; Optimization & Robustification; Commercialization & Production (.com or .org) • Universities & Labs; Private Industry; Economic Development • NCSA bridges the gap between basic research and commercialization: Application

  7. Mission: Enable Science/Engineering/Education • USERS: High End Computer & Data Needs • NCSA enables effective/efficient use of high end computer and data resources in support of science and education • Individual tools, System software, Analytics, Visualization, Integrated SW systems, Workflow, User Support, Training • Effective Resource Utilization • Scientific, Decision Support, Inquiry Results

  8. Projects and Customers: CyberInfrastructure Development • A Collaboration/Partnership with a Broad Set of Communities

  9. Blue Waters

  10. Blue Waters Project: Input from Scientific Community • D. Baker, University of Washington: Protein structure refinement and determination • M. Campanelli, RIT: Computational relativity and gravitation • D. Ceperley, UIUC: Quantum Monte Carlo molecular dynamics • J. P. Draayer, LSU: Ab initio nuclear structure calculations • P. Fussell, Boeing: Aircraft design optimization • C. C. Goodrich: Space weather modeling • M. Gordon, T. Windus, Iowa State University: Electronic structure of molecules • S. Gottlieb, Indiana University: Lattice quantum chromodynamics • V. Govindaraju: Image processing and feature extraction • M. L. Klein, University of Pennsylvania: Biophysical and materials simulations • J. B. Klemp et al., NCAR: Weather forecasting/hurricane modeling • R. Luettich, University of North Carolina: Coastal circulation and storm surge modeling • W. K. Liu, Northwestern University: Multiscale materials simulations • M. Maxey, Brown University: Multiphase turbulent flow in channels • S. McKee, University of Michigan: Analysis of ATLAS data • M. L. Norman, UCSD: Simulations in astrophysics and cosmology • J. P. Ostriker, Princeton University: Virtual universe • J. P. Schaefer, LSST Corporation: Analysis of LSST datasets • P. Spentzouris, Fermilab: Design of new accelerators • W. M. Tang, Princeton University: Simulation of fine-scale plasma turbulence • A. W. Thomas, D. Richards, Jefferson Lab: Lattice QCD for hadronic and nuclear physics • J. Tromp, Caltech/Princeton: Global and regional seismic wave propagation • P. R. Woodward, University of Minnesota: Astrophysical fluid dynamics

  11. [Diagram] Cray Linux Environment (CLE)/SUSE Linux • Languages: Fortran/CAF (OpenACC), C (OpenACC), C++ (OpenACC), Python, UPC • Compilers: Cray Compiling Environment (CCE), GNU • Programming Models: Distributed Memory (Cray MPT): MPI, SHMEM; Shared Memory: OpenMP 3.0; PGAS & Global View: UPC (CCE), CAF (CCE); Adaptive/Other: Charm++ • IO Libraries: NetCDF, HDF5, ADIOS • Optimized Scientific Libraries: LAPACK, ScaLAPACK, BLAS (libgoto), Iterative Refinement Toolkit, Cray Adaptive FFTs (CRAFFT), FFTW, Cray PETSc (with CASK), Cray Trilinos (with CASK) • Tools: Environment setup (Modules); Resource Manager; Debugging Support Tools and Debuggers (Fast Track Debugger (CCE w/ DDT), Abnormal Termination Processing, STAT, Cray Comparative Debugger#, Allinea DDT, lgdb); Performance Analysis (Cray Performance Monitoring and Analysis Tool, PAPI, PerfSuite, Tau); Visualization (VisIt, Paraview, YT); Data Transfer (GO, HPSS, RAIT); Prog. Env. (Eclipse, Traditional) • Legend: Cray developed; Under development; Licensed ISV SW; 3rd party packaging; NCSA supported; Cray added value to 3rd party • MWTCC - May 31, 2013

  12. Blue Waters • Designed to meet compute-intensive, memory-intensive, and data-intensive needs across a wide range of disciplines. • Peak performance: 11.61 PF • Cray XE6 cabinets: 237 • AMD Interlagos processors: >49,000 (2.3 GHz) • 22,640 compute nodes • 362,240 Bulldozer cores • Cray XK6 cabinets: >30 • NVIDIA GPUs: >3,000 • Interconnect: Cray Gemini / 3D torus • Usable storage: >25 PB • Usable storage bandwidth: >1 TB/s • Aggregate system memory: >1.5 PB • System Storage • Integrated near-line environment: scaling to 500 petabytes • Bandwidth to near-line storage: 100 GB/s • Memory per core: 4 GB • Number of disks: >17,000 • Number of memory DIMMs: >190,000 • External network bandwidth: 100 Gb/s scaling to 300 Gb/s
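
The node, core, and memory-per-core figures on the Blue Waters slide can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the original talk; the variable names are illustrative, and it assumes decimal units (1 PB = 1,000,000 GB) and that the 4 GB per core figure applies to the 362,240 Bulldozer cores listed.

```python
# Figures quoted on the Blue Waters slide.
COMPUTE_NODES = 22_640
BULLDOZER_CORES = 362_240
MEMORY_PER_CORE_GB = 4

# Cores per compute node implied by the two totals.
cores_per_node = BULLDOZER_CORES // COMPUTE_NODES

# Memory implied by the per-core figure, in decimal petabytes
# (assumption: 1 PB = 1,000,000 GB).
implied_memory_pb = BULLDOZER_CORES * MEMORY_PER_CORE_GB / 1_000_000

print(f"cores per node: {cores_per_node}")
print(f"memory implied by 4 GB/core: {implied_memory_pb:.2f} PB")
```

Under these assumptions the totals divide evenly into 16 cores per node, and the per-core memory accounts for roughly 1.45 PB. The slide quotes more than 1.5 PB of aggregate memory, so the remainder presumably sits outside these cores; the slide does not break that down.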

  13. XSEDE – National Compute and Data CyberInfrastructure • Collaboration between multiple US CI centers with deep experience: a partnership led by NCSA • PI: John Towns, NCSA/Univ of Illinois • Co-PIs: Jay Boisseau, TACC/Univ of Texas Austin; Gregg Peterson, NICS/Univ of Tenn-Knoxville; Ralph Roskies, PSC/CMU; Nancy Wilkins-Diehr, SDSC/UC-San Diego • Partners who complement these CI centers with expertise in science, engineering, technology and education • Univ of Virginia, Ohio Supercomputer Center, SURA, Cornell, Indiana Univ, Purdue, Univ of Chicago, Rice, Berkeley, NCAR, Shodor, Jülich Supercomputing Centre

  14. Advanced Information Systems: National Cyberinfrastructure • Hardware • Computers • Data sources • Data stores • Networks • Software • Middleware • Portals • Grid-enabled • Applications • Visualization • Data analysis • Workflows

  15. CyberInfrastructure is also about the tools/systems that allow effective use • Workflow • Data management • Software models/simulations • Compute resources • Software/Hardware optimization • Visualization tools and resources • Analytic tools • Collaborative environments • Resource sharing • Publishing support tools

  16. Examples: Community Infrastructure Projects • Earthquake Engineering • Consequence based risk management for seismic events • Environmental Observatories • Ocean Observatories, Coupled Human/Natural Systems, BioDiversity • Atmospheric Modeling • Severe Weather Predictions, Regional Climate Modeling • Astronomy • Very large data transport, processing, and analysis pipelines • BioMedical Informatics • Multisource infectious disease surveillance and patient safety • Humanities/Social Science Research • Digital libraries, Text/Image analysis, social networks • Science Educational Support Systems • Teaching support and educational enhancement systems

  17. Projects and Customers: Industrial Partnerships

  18. Private Sector Program Partners – August 2012

  19. Industrial Interests in HPC • PDM (Product Development Management) • CRM (Customer Relationship Management) • ERP (Enterprise Resource Planning) • SCM (Supply Chain Management) • BENEFITS: • Reduced Time-to-Market • Improved Product Quality • Reduced Prototyping Costs • Re-use original data • Reduced Waste • Framework for Optimization • Global Collaboration Courtesy of TranscenData.com Imaginations unbound

  20. Industrial Activities • Cycle provision • Overflow – when need exceeds their internal capacity • Testing – new architectures before purchasing • Research – testing new methods prior to large investments • Scalability, algorithms, optimization, security, … • Prototype tool/system development • Training • Peer discussions – on non-competitive basis • Stated as an important and unique reason for participating • Industrial park participation • Partners – proximity to expertise and students • New company spinoffs

  21. Projects and Customers: Education

  22. Training • Workshops • Train the trainer workshops • Targeted disciplinary/technology/techniques workshops • National conferences and other venues • Training materials • XSEDE https://www.xsede.org/training1 • Blue Waters – Petascale undergraduate education program http://www.shodor.org/petascale/ • Short courses • Virtual School of Computational Science and Engineering – petascale oriented (including big data) • http://www.vscse.org/ • Collaboration – multiple universities

  23. Outreach • Public awareness • Visualization of real scientific data in public venues • Planetariums – digital domes – astronomy • Hubble 3-D • Cosmic Voyage • Science and Technology Museums – weather, astronomy • Search for Life • Computational Tornado Science • Dynamic Earth • TV and Film • “Tree of Life” - Academy Award nomination – Cinematography and visual effects • “Hunt for the Supertwister” - a public television (NOVA) special • “Monster of the Milky Way” - NOVA PBS television special • Others …

  24. Educational Technology: In support of the learning process • Often, the technology used to support research is also valuable in supporting education • Digital informational resources • Books, references, lectures, photos, videos, audio • Virtual museums, artifacts • Data, experiments • Tools • Analysis, Inquiry, Applications, Visualization • Models and Simulations • Collaborative Environments • Virtual coordination, workflow spaces • Resource sharing – data, computation, visualization

  25. Projects and Customers: Government and Public Health Informatics

  26. Examples of Uses of HPC / Data Analytics • Illinois State Police – analysis of historical data to help determine crime (and hence staffing) patterns • Policy makers – hazard risk assessments and planning (and response) • Public health officials – early warning on disease outbreaks, with informed options to manage • National Archives – data tools for long term preservation and for public analysis of the data • Economic Development – agricultural marketing enhancement and monitoring program • Policy Decision Support - Urban Planners, Environmental Monitoring, Socio-Economic Modeling, Social Network analysis… many others

  27. Evolving into a successful HPC Center • How we have changed over time • User focus • Keeping your staff sharp – not complacent • Management

  28. Mission: Enable Science/Engineering/Education • USERS: High End Computer & Data Needs • NCSA enables effective/efficient use of high end computer and data resources in support of science and education • Individual tools, System software, Analytics, Visualization, Integrated SW systems, Workflow, User Support, Training • Effective Resource Utilization • Scientific, Decision Support, Inquiry Results

  29. Traditional Function: System Support • System Management • Resource and job scheduling • Storage Management • On-line and Near-line system and data administration • Information life cycle management • Cyber-protection • Networking provisioning and tuning • System Monitoring • System software upgrades and SW management. • Quality Assurance BW Full Service Overview

  30. User Support Function: Basic and Beyond • Requirement Analysis • Service Request Management • Application Services • Application analysis • Porting and Tuning at scale • Bottleneck reduction • Client consulting • Application re-engineering • Library and tools creation and support • Third Party Application support • Visualization and Data Analysis • Information provisioning • Documentation, notification, training, community • Account/allocation management • Quality Assurance

  31. Community Engagement Function: Relationship Building • Partnership/Team Building • Structured Requirement Analysis • Workflow Systems • Business / operation rules • Collaborative environments • Intuitive user interfaces • Data storage, data management tools • Visualization and data analytics tools • Community engagement • Work Plan Management • Participation in evaluation and planning • Trust

  32. Staff Changes (estimated numbers) • Technical staff breakdown (Current / Very Early Days) • Technical system administration: 50 / 70 • Applied R&D: 100 / 40 • User Support (from basic service to customized disciplinary support): 50 / 20 • Technical management (mid level to senior): 50 / 25
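
The shift in staffing mix is easier to see as shares of the total. The following is a minimal sketch, not part of the original talk; it assumes the first number in each pair on the slide is the "Current" column and the second is "Very Early Days", and the dictionary and variable names are illustrative.

```python
# Estimated technical staff per role: (current, very early days).
# Column order is assumed from the order of the slide's headers.
staff = {
    "Technical system administration": (50, 70),
    "Applied R&D": (100, 40),
    "User support": (50, 20),
    "Technical management": (50, 25),
}

total_current = sum(current for current, _ in staff.values())
total_early = sum(early for _, early in staff.values())

# Share of technical staff per role, as whole percentages (early, current).
shares = {
    role: (round(100 * early / total_early), round(100 * current / total_current))
    for role, (current, early) in staff.items()
}

print(f"total: {total_early} -> {total_current}")
for role, (early_pct, current_pct) in shares.items():
    print(f"{role}: {early_pct}% -> {current_pct}%")
```

On these estimates the current total comes to 250, which matches the "250 technical/professional staff" quoted earlier in the talk, and system administration falls from roughly 45% of technical staff to 20% while applied R&D grows from roughly 26% to 40%.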

  33. And Finally: Organizational Management • Hire and retain skilled staff • Continued professional development • Keep staff motivated and sharp • Proposals – competitions • Peer speaking engagements – personnel exchanges • Enable them to grow personally and professionally • Don’t micromanage – empower your staff to succeed, and let them • The MONEY – Always the Money!!! • Core funding – work closely with your core funding sources • Variety of competitive grant funding • Help your funding agencies understand the value of HPC and CyberInfrastructure, and what it takes to be successful. • It’s not cheap, and the ROI will take time to show value – but without a long term commitment from your core funding agency, it will be very, very difficult to accomplish.

  34. Questions? • STEM Smart Workshop • 10 April 2012 • Chicago, Illinois • http://iclcs.illinois.edu

  35. Imaginations unbound

  36. Building Integrated Application/Decision Support Systems – It’s an Iterative Process of Teamwork • User Representatives; Team Participation; Application Roadmaps; Technology Roadmaps • Requirements Analysis & Specification • Cyberarchitecture Working Group; Partners; Integrated Project Teams; TeraGrid Working Groups; Advisory Committees; Industrial Partners; International Partners • Portals & GUIs; Workflow Mgmt; S&E Applications; Data Mining & Analysis; Visualization; Webservices; Collaboratories; Middleware; Security • Development & System Integration • Prototype or Production Cyberenvironments • Situation Analysis; Knowledge and Decision Support

  37. Science & Engineering Application Support • Science Team (ST) Requirements and Challenges Gathering • SEAS Staff and Points of Contact (PoC)

  38. Advanced Information Systems: Major New Data Sources • Computers: New high-end computers are producing massive amounts of data from ever more detailed computational models • Sensors, Surveys and Satellites: Sensor arrays, aerial surveys and satellite data will revolutionize our understanding of the environment • Instruments: New instruments, e.g., telescopes and detectors, are using advanced digital technologies to support increasingly detailed observations

  39. NDEMC - OVERVIEW • $5M, 18-month Public-Private Partnership (PPP) • 4 OEMs; 4 solution providers; • Phase 1: 8 manufacturing sector SMEs • Advanced modeling, simulation & analysis (MS&A) • Rationale: • MS&A adoption by OEMs is high and growing • SMMs’ use of advanced MS&A is suboptimal • ROI is definitely favorable • Objectives: • Boost MS&A adoption at SMMs • Simplified access to advanced MS&A • Demonstrate a scalable business model

  40. Networks are Critical Infrastructure
