NERSC Status Update for
NERSC User Group Meeting
June 2006

William T.C. Kramer
kramer@nersc.gov
510-486-7577
Ernest Orlando Lawrence Berkeley National Laboratory

Thanks for 10 Years of Help
  • This is the 20th NUG meeting I have had the privilege of attending
  • Throughout the past 10 years you have provided NERSC with invaluable help and guidance
  • NUG is unique within the HPC community
  • NERSC and I are grateful for your help in making NERSC successful
NERSC Must Address Three Trends
  • The widening gap between application performance and peak performance of high-end computing systems
  • The recent emergence of large, multidisciplinary computational science teams in the DOE research community
  • The flood of scientific data from both simulations and experiments, and the convergence of computational simulation with experimental data collection and analysis in complex workflows
Science-Driven Systems
  • Balanced and timely introduction of best new technology for complete computational systems (computing, storage, networking, analytics)
  • Engage and work directly with vendors in addressing the SC requirements in their roadmaps
  • Collaborate with DOE labs and other sites in technology evaluation and introduction
Science-Driven Services
  • Provide the entire range of services from high-quality operations to direct scientific support
  • Enable a broad range of scientists to effectively use NERSC in their research
  • Concentrate on resources for scaling to large numbers of processors, and for supporting multidisciplinary computational science teams
Science-Driven Analytics
  • Provide architectural and systems enhancements and services to more closely integrate computational and storage resources
  • Provide scientists with new tools to effectively manipulate, visualize and analyze the huge data sets from both simulations and experiments
National Energy Research Scientific Computing (NERSC) Center Division (draft organization chart)

Division Director: Horst Simon
Division Deputy: William Kramer

NERSC Center General Manager & High Performance Computing Department Head: William Kramer
  • Science Driven System Architecture – John Shalf, Team Leader
  • Science Driven Systems – Howard Walter, Associate General Manager
  • Science Driven Services – Francesca Verdier, Associate General Manager
  • HENP Computing – Craig Tull, Group Leader
  • Computational Systems – James Craw, Group Leader
  • User Services – Jonathan Carter, Group Leader
  • Mass Storage – Jason Hick, Group Leader
  • Analytics – Wes Bethel, Team Leader (matrixed from CRD)
  • Open Software & Programming – David Skinner, Group Leader
  • Network, Security & Servers – Brent Draney, Group Leader
  • Computer Operations & ESnet Support – Steve Lowe, Group Leader
  • Accounts & Allocation Team – Clayton Bagwell, Team Leader
NERSC Center (draft organization chart)

NERSC Center General Manager: William Kramer
  • Science Driven Systems – Howard Walter, Associate General Manager
  • Science Driven Services – Francesca Verdier, Associate General Manager

Computational Systems – James Craw, Group Leader: Matthew Andrews (.5), William Baird, Nick Balthaser, Scott Burrow (V), Greg Butler, Tina Butler, Nicholas Cardo, Thomas Langley, Rei Lee, David Paul, Iwona Sakrejda, Jay Srinivasan, Cary Whitney (HEP/NP), open positions (2)

User Services – Jonathan Carter, Group Leader: Harsh Anand, Andrew Canning (.25 – CRD), Richard Gerber, Frank Hale, Helen He, Peter Nugent (.25 – CRD), David Skinner (.5), Mike Stewart, David Turner (.75)

Analytics – Wes Bethel, Team Leader (.5 – CRD): Cecilia Aragon (.2 – CRD), Julian Borrill (.5 – CRD), Chris Ding (.3 – CRD), Peter Nugent (.25 – CRD), Christina Siegrist (CRD), Dave Turner (.25), open positions (1.5)

Science Driven System Architecture Team – John Shalf, Team Leader: Andrew Canning (.25 – CRD), Chris Ding (.2 – CRD), Esmond Ng (.25 – CRD), Lenny Oliker (.25 – CRD), Hongzhang Shan (.5 – CRD), David Skinner (.5), E. Strohmaier (.25 – CRD), Lin-Wang Wang (.5 – CRD), Harvey Wasserman, Mike Welcome (.15 – CRD), Katherine Yelick (.05 – CRD)

Networking, Security, Servers & Workstations – Brent Draney, Group Leader: Elizabeth Bautista (DB), Scott Campbell, Steve Chan, Jed Donnelley, Craig Lant, Raymond Spence, Tavia Stone, open position (DB)

Computer Operations & ESnet Support – Steve Lowe, Group Leader: Richard Beard, Del Black, Aaron Garrett, Russell Huie (ES), Yulok Lam, Robert Neylan, Tony Quan (ES), Alex Ubungen

Open Software & Programming – David Skinner, Group Leader: Mikhail Avrekh, Tom Davis, RK Owen, open position (1) – Grid

Accounts & Allocations – Clayton Bagwell, Team Leader: Mark Heer, Karen Zukor (.5)

Mass Storage – Jason Hick, Group Leader: Matthew Andrews (.5), Shreyas Cholia, Damian Hazen, Wayne Hurlbert, open position (1)

Key: V – vendor staff; CRD – matrixed staff from CRD; ES – funded by ESnet; HEP/NP – funded by LBNL HEP and NP Divisions; DB – Division Burden
Large-Scale Capability Computing Is Addressing New Frontiers

DOE Joule metric

INCITE Program at NERSC in 2005:

  • Turbulent Angular Momentum Transport; Fausto Cattaneo, University of Chicago
    • Order-of-magnitude improvement in the simulation of accretion in stars and in the lab
  • Direct Numerical Simulation of Turbulent Non-premixed Combustion; Jackie Chen, Sandia Labs
    • The first 3D direct numerical simulation of a turbulent H2/CO/N2-air flame with detailed chemistry; found new flame phenomena unseen in 2D
  • Molecular Dynameomics; Valerie Daggett, University of Washington
    • Simulated folds for 38% of all known proteins
    • Created a 2 TB protein-fold database

  • Comprehensive Scientific Support:
    • 20–45% code performance improvements → 2M extra hours (see the sketch below)
    • All projects relied heavily on NERSC visualization services
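
The "2M extra hours" figure follows from simple arithmetic: a code that runs a fraction s faster finishes the same work in 1/(1+s) of the time. A minimal sketch of that calculation, in Python; the 10M-hour affected workload and 25% average improvement below are hypothetical illustrations, not figures from the slides:

    # A speedup of s (e.g. 0.30 for a 30% improvement) means work that used
    # H allocated hours now needs H / (1 + s) hours; the difference is freed up.
    def hours_freed(allocated_hours: float, speedup: float) -> float:
        return allocated_hours - allocated_hours / (1.0 + speedup)

    # Hypothetical: 10M hours of affected workload at a 25% average improvement
    print(hours_freed(10_000_000, 0.25))  # 2,000,000.0 hours freed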
The Good
  • Deployed Bassi – January 2006
    • One of the fastest installations and acceptances
    • Bassi providing exceptional service
  • Deployed NERSC Global File System – Sept 2005
    • Upgraded – January 2006
    • Excellent feedback from users
  • Stabilized Jacquard – October 2005 to April 2006
    • Resolved MCE errors
    • Installed 40 more nodes
The Good
  • Improved PDSF
    • Added processing and storage
    • Converted hundreds of NFS file systems to a few GPFS file systems
    • Access to NGF
  • Increased archive storage functionality and performance
    • Upgraded to HPSS 5.1 – April 2006
    • More tape drives
    • More disk cache
    • 10 GigE servers
  • NERSC 5 procurement
    • On schedule, and the cost of conducting the procurement is under budget
  • Continued Network tuning
The Good
  • Continued Network Tuning
  • Security
    • Continued to avoid major incidents
    • Good results from the “Site Assistance Visit” at LBNL
      • LBNL and NERSC “outstanding”
      • Still a lot of work to do – and some changes – before they return in a year
  • Over-allocation issues (AY 05) solved
    • Better queue responsiveness
    • Stable time allocations
The Good
  • Other
    • Thanks to ASCR, the NERSC budget appears to have stabilized
    • Worked with others to help define HPC business practices
    • Continued progress in influencing advanced HPC concepts
      • Cell, Power, Interconnects, Software roadmaps, evaluation methods, working methods,…
The Not So Good
  • Took a long time to stabilize Jacquard
    • Learned some lessons about lightweight requirements
  • Upgrades on systems have not gone as well as we would have liked
    • Extremely complex – and much is not controlled by NERSC
  • Attempted security attacks continue and are increasing in sophistication
    • We can expect continued evolution
      • User and NERSC database usage will be a point of focus
The Jury Is Still Out
  • Analytics ramp-up taking longer than we desired
    • NGF major step
    • Some success stories, but we don’t have breadth
  • Scalability of codes
    • DOE expects a significant fraction (>50%?) of time to go to jobs of more than 2,048-way parallelism in the first full year of NERSC-5 (see the sketch below)
    • Many of the most scalable applications are migrating to the LCFs – so some of the low-hanging fruit has already been harvested
    • Should be a continuing focus for NERSC and NUG
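
One way to track progress toward that target is to compute, from batch accounting data, the fraction of delivered CPU-hours that came from jobs at or above a given concurrency. A minimal Python sketch, assuming hypothetical (processors, wallclock-hours) records rather than any actual NERSC accounting format:

    # Fraction of delivered CPU-hours from jobs at or above a concurrency threshold.
    # Each record is (processors_used, wallclock_hours) for one job (hypothetical format).
    def capability_fraction(records, min_procs=2048):
        total = sum(p * h for p, h in records)
        big = sum(p * h for p, h in records if p >= min_procs)
        return big / total if total else 0.0

    # Hypothetical workload: two 4,096-way capability runs plus many smaller jobs
    jobs = [(4096, 12.0), (4096, 8.0), (256, 100.0), (64, 500.0)]
    print(f"{capability_fraction(jobs):.0%}")  # about 59% for this example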
FY 04-06 Overall Goals
  • (Support for DOE Office of Science) Support and assist the DOE Office of Science in meeting its goals and obligations through the research, development, deployment, and support of high performance computing and storage resources and advanced mathematical and computer systems software.
  • (Systems and Services) Provide leading-edge, open High Performance Computing (HPC) systems and services to enable scientific discovery. NERSC will use its expertise and leadership in HPC to provide reliable, timely, and excellent services to its users.
  • (Innovative Assistance) Provide innovative scientific and technical assistance to NERSC's users. NERSC will work closely with the user community and together produce significant scientific results while making the best use of NERSC facilities.
  • (Respond to Scientific Needs) Be an advocate for NERSC users within the HPC community. Respond to science-driven needs with new and innovative services and systems.
FY 04-06 Overall Goals
  • (Balanced Integration of New Products and Ideas) Judiciously integrate new products, technology, procedures, and practices into the NERSC production environment in order to enhance NERSC's ability to support scientific discovery.
  • (Advance Technology) Develop future cutting-edge strategies and technologies that will advance high performance scientific computing capabilities and effectiveness, allowing scientists to solve new and larger problems, and making HPC systems easier to use and manage.
  • (Export NERSC Knowledge) Export knowledge, experience, and technology developed at NERSC to benefit computer science and the high performance scientific computing community.
  • (Culture) Provide a facility that enables and stimulates scientific discovery by continually improving our systems, services, and processes. Cultivate a can-do approach to solving problems and making systems work, while maintaining high standards of ethics and integrity.
5 Year Plan Milestones
  • 2005
    • NCS enters full service – Completed
      • Focus is on modestly parallel and capacity computing
      • >15–20% of Seaborg
    • WAN upgrade to 10 Gb/s – Completed
    • Upgrade HPSS to 16 PB; storage upgrade to support 10 GB/s for higher density and increased bandwidth – Completed
    • Quadruple the size of the visualization/post-processing server – Completed
  • 2006
    • NCSb enters full service – Completed
      • Focus is on modestly parallel and capacity computing
      • Target of >30–40% of Seaborg – Completed; actually >85% of Seaborg SSP (see the sketch below)
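
SSP here is NERSC's Sustained System Performance metric, which (roughly) combines per-processor rates on a benchmark suite via a geometric mean and scales by system size, so two systems can be compared as a ratio. A minimal Python sketch under that rough definition; the per-processor benchmark rates below are invented for illustration, while the processor counts (6,080 for Seaborg, 888 for Bassi/NCSb) are the actual system sizes:

    import math

    # Rough SSP: geometric mean of per-processor benchmark rates (Gflop/s),
    # multiplied by the number of processors in the system.
    def ssp(per_proc_rates, num_procs):
        geo_mean = math.exp(sum(math.log(r) for r in per_proc_rates) / len(per_proc_rates))
        return geo_mean * num_procs

    # Hypothetical per-processor benchmark rates for the two systems:
    seaborg = ssp([0.25, 0.31, 0.22, 0.28], 6080)
    ncsb = ssp([1.5, 1.8, 1.4, 1.6], 888)
    print(f"NCSb / Seaborg SSP: {ncsb / seaborg:.0%}")  # about 87% with these rates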
5 Year Plan Milestones
  • 2006
    • NERSC-5: initial delivery, possibly phased – Expected, but most will be in FY 07
      • 3 to 4 times Seaborg in delivered performance – Overachieved; more later
      • Used for the entire workload, so it has to be balanced
    • Replace the security infrastructure for HPSS and add native Grid capability to HPSS – Completed and underway
    • Storage and facility-wide file system upgrade – Completed and underway
  • 2007
    • NERSC-5 enters full service – Expected
    • Storage and facility-wide file system upgrade – Expected
    • Double the size of the visualization/post-processing server – If usage dictates
Summary
  • It is a good time to be in HPC
  • NERSC has far more success stories than issues
  • NERSC Users are doing an outstanding job producing leading edge science for the Nation
    • More than 1,200 peer-reviewed papers for AY 05
  • DOE is extremely supportive of NERSC and its users