HPC USER FORUM Stuttgart Germany October 2010 Merle Giles Private Sector Program & Economic Development [email protected] Basic & Applied Research. College of Engineering. CSL. Digital Computer Lab. NCSA. ECE. Biotechnology Center. Computer Science. Institute for
College of Engineering
Over the horizon development …
Product Life Cycle
(.com or .org)
Bridges the Gap
Basic Research & Commercialization
Value ApplicationCreation and Economic Development
Value Creation and Economic ApplicationDevelopment
Industry Partners Application
CURRENT PARTNERS Application
Industrial Design Cycle
Bills Of Materials
Technical Regs and Standards
Supply Chain Mgmt
Smart-aggregation rules: Some forms of content will always need human curating.
Dumb-aggregation rules: And some forms of content won’t.
The trick is to figure out where algorithms beat humans, and vice versa.
Materials, fluids, mechanical,
aerothermal, electrical, etc.
Petascale, ISV code, data management, architecture, etc.
scaling, multiphysics, algorithms, etc.
Strategy, process, services, partnerships, etc.
New Performance Driver Application
NCSA’s Blue Waters is the first open-access system tasked to achieve ≥ 1 petaflop/s
on real applications.
From 1956 . . .
From 1956 . . .
A hard disk drive
with 5 MB storage
Leading-Edge Collaboration Application
Peta- 1015 (Bits, Bytes, Flop/s)
Integrated/Scalable System Application
Blue Waters will be the most powerful computer in the world for scientific research when it comes on line in 2011.
~10 PF Peak
~1 PF sustained
~1.2 PB of memory
>18 PB of disk storage
500 PB of archival storage
≥100 Gbps connectivity
Blue Waters Building Block
32 IH server nodes
256 TF (peak)
32 TB memory
128 TB/s memory bw
4 Storage systems (>500 TB)
10 Tape drive connections
IH Server Node
8 QCM’s (256 cores)
8 TF (peak)
1 TB memory
4 TB/s memory bw
8 Hub chips
Fully water cooled
4 Power7 chips
128 GB memory
512 GB/s memory bw
1 TF (peak)
1.128 TB/s bw
Blue Waters is built from components
that can be used to build other systems
with a wide range of capabilities—
from servers to beyond Blue Waters.
8 cores, 32 threads
L1, L2, L3 cache (32 MB)
Up to 256 GF (peak)
128 Gb/s memory bw
Data Center in a Rack Application
Programming Environment Application
Global, Parallel shared file system (>10 PB) and archival storage
Environment: Traditional (command line), Eclipse IDE (application development, debugging, performance tuning, job and workflow management)
Languages: C/C++, Fortran (77-2008 including CAF), UPC
Performance tuning: HPC and HPCS toolkits, open source tools
Resource manager: Batch and interactive access
Parallel debugging at full scale
Full – featured OS(AIX or Linux),
Sockets, threads, shared memory, checkpoint/restart
Libraries: MASS, ESSL, PESSL, PETSc, visualization…
Programming Models: MPI/MP2, OpenMP, PGAS, Charm++, Cactus
Low-level communications API supporting active messages (LAPI)
Multicore POWER7 processor with Simultaneous MultiThreading (SMT) and Vector MultiMedia Extensions (VSX)
Private L1, L2 cache per core, shared L3 cache per chip
High-Performance, low-latency interconnect supporting RDMA
NSF Challenge: ≥1 Sustained petaflop/s
Photos courtesy of NERSC, UIUC, IBM