Protein Explorer: A Petaflops Special Purpose Computer System for Molecular Dynamics Simulations

David Gobaud • Computational Drug Discovery • Stanford University • 7 March 2006

Outline • Overview • Background • Delft Molecular Dynamics Processor • GRAPE • Protein Explorer Summary • MDGRAPE-3 Chip • Force Calculation Pipeline • J-Particle Memory and Control Units • System Architecture • Software • Cost • Questions

Overview • Protein Explorer • Petaflop special-purpose computer system for molecular dynamics simulations • High-precision screening for drug design • Large-scale simulations of huge proteins/complexes • PC cluster with special-purpose engines to perform the most time-consuming calculations • Dedicated LSI MDGRAPE-3 chip performs force calculations at 165 Gflops or higher • ETA 2006

Background • PCs are universal machines • Various applications • Hardware can be designed independently of applications • Obstacles to high performance • Memory bandwidth bottleneck • Heat dissipation problem • Can be overcome by developing specialized architectures

Delft Molecular Dynamics Processor (DMDP) • Pioneered high-performance special-purpose systems • Not able to achieve effective cost-performance • Demanded too much time and money in the development stage • Speed of development is a crucial factor affecting cost-performance because electronic device technology continues to develop rapidly • Almost all calculations were performed by the DMDP, making the hardware very complex

GRAPE (GRAvity PipE) • One of the most successful attempts to develop high-performance special-purpose systems • Specialized for simulations of classical particles • Most time spent on calculation of long-range forces (gravitational, Coulomb, and van der Waals) • Thus special hardware only performs these calculations • Hardware very simple and cost-effective

GRAPE (GRAvity PipE) • In 1995, first machine to break the teraflops barrier in nominal peak performance • Since 2001, the leader in performance has been the Molecular Dynamics Machine at RIKEN, at 78 Tflops • In 2002, a 64-Tflops GRAPE-6 was completed at the University of Tokyo • Protein Explorer was launched based on the 2002 University of Tokyo success

Protein Explorer Summary • Host PC cluster with special-purpose boards attached • Boards calculate only non-bonded forces • Very simple hardware and software • No detailed knowledge of hardware needed to write programs • Communication time between host and boards is proportional to the number of particles, N • Calculation time is proportional to • N^2 for direct summation of long-range forces • N*Nc for short-range forces, where Nc is the average number of particles within the cutoff radius • Host-board communication: 0.25 byte per 1000 operations
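
The scaling on this slide can be made concrete with a small sketch. The code below is an illustration only, not the Protein Explorer library: the function names are hypothetical, units are chosen so that the Coulomb constant is 1, and the pair-count helper simply encodes the N^2 and N*Nc estimates quoted above.

```python
import math

def coulomb_forces(positions, charges):
    """Direct summation of Coulomb forces over all pairs (Coulomb constant = 1)."""
    n = len(positions)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # a particle exerts no force on itself
            d = [positions[i][k] - positions[j][k] for k in range(3)]
            r2 = sum(c * c for c in d)
            # F_i += q_i * q_j * (r_i - r_j) / |r_i - r_j|^3
            s = charges[i] * charges[j] / (r2 * math.sqrt(r2))
            for k in range(3):
                forces[i][k] += s * d[k]
    return forces

def pair_evaluations(n, n_cutoff=None):
    """Estimated pairwise evaluations: N*N for direct summation, N*Nc with a cutoff."""
    return n * n if n_cutoff is None else n * n_cutoff

forces = coulomb_forces([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [1.0, 1.0])
print(forces)
print(pair_evaluations(1000), pair_evaluations(1000, 50))
```

Because the host only ships N particles while the boards do on the order of N^2 or N*Nc evaluations, the work per transferred byte grows with system size, which is why a modest host-board link suffices.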

MDGRAPE-3 Chip - Force Calculation Pipeline • 3 subtractor units • 6 adder units • 8 multiplier units • 1 function-evaluation unit • Can perform ~33 equivalent operations per cycle when it calculates the Coulomb force

MDGRAPE-3 Chip - Force Calculation Pipeline

MDGRAPE-3 Chip - Force Calculation Pipeline • Most operations done in 32-bit single precision floating point format • Force accumulation is 80-bit fixed point format • Can be converted to 64-bit double precision floating point • Coordinates stored in 40-bit fixed-point format • Makes implementation of periodic boundary condition easy
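
One plausible reading of why fixed-point coordinates simplify periodic boundaries is that integer subtraction wraps around on its own. The sketch below assumes (this is not stated on the slide) that one box length is mapped onto the full 40-bit range, so a wrapped difference reinterpreted as a signed value is already the minimum-image displacement. All names are hypothetical.

```python
BITS = 40
SCALE = 1 << BITS  # assumption: one box length spans the whole 40-bit range

def to_fixed(x, box):
    """Map a coordinate in [0, box) to a 40-bit unsigned fixed-point integer."""
    return int(round(x / box * SCALE)) % SCALE

def min_image_delta(xi, xj):
    """Subtract with 40-bit wraparound, then reinterpret as a signed value."""
    d = (xi - xj) % SCALE      # Python's % keeps the result in [0, SCALE)
    if d >= SCALE // 2:        # upper half of the range means a negative offset
        d -= SCALE
    return d

def to_float(d, box):
    """Convert a signed fixed-point displacement back to box units."""
    return d * box / SCALE

box = 10.0
delta = min_image_delta(to_fixed(9.5, box), to_fixed(0.5, box))
print(to_float(delta, box))  # close to -1.0: the nearest image lies across the boundary
```

No branch on the box size or explicit "if distance > box/2" test is needed in the subtraction itself, which is the kind of simplification a hardware pipeline benefits from.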

MDGRAPE-3 Chip - Force Calculation Pipeline • Function Evaluator • Most important part of pipeline • Allows calculation of an arbitrary smooth function • Has a memory unit which contains a table of polynomial coefficients and exponents, and a hardwired pipeline for fourth-order polynomial evaluation • Interpolates an arbitrary smooth function g(x) using segmented fourth-order polynomials evaluated by Horner's method
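
The idea of a coefficient table plus Horner evaluation can be sketched in software. This is a simplified illustration, not the chip's actual scheme: it assumes uniform segments (the slide mentions exponents in the table, which suggests a different segmentation in hardware), uses Taylor coefficients rather than fitted ones, and takes g(x) = x^p with p = -1.5, the Coulomb force kernel when x is the squared distance. Function names are hypothetical.

```python
def build_table(p, x_min, x_max, n_seg):
    """Fourth-order Taylor coefficients of g(x) = x**p about each segment centre."""
    width = (x_max - x_min) / n_seg
    table = []
    for s in range(n_seg):
        c = x_min + (s + 0.5) * width
        coeffs, binom = [], 1.0
        for k in range(5):
            # k-th Taylor coefficient: p(p-1)...(p-k+1)/k! * c**(p-k)
            coeffs.append(binom * c ** (p - k))
            binom *= (p - k) / (k + 1)
        table.append((c, coeffs))
    return table, width

def evaluate(table, width, x_min, x):
    """Look up the segment for x, then evaluate its polynomial by Horner's method."""
    s = min(int((x - x_min) / width), len(table) - 1)
    c, a = table[s]
    t = x - c
    result = a[4]
    for k in (3, 2, 1, 0):
        result = result * t + a[k]  # four multiply-add steps in total
    return result

table, width = build_table(-1.5, 1.0, 4.0, 64)
x = 2.5
print(evaluate(table, width, 1.0, x), x ** -1.5)
```

Horner's method needs only four multiplications and four additions per evaluation, a fixed sequence that maps naturally onto a hardwired pipeline.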

MDGRAPE-3 Chip - J-Particle Memory and Control Units • 20 Force Calculation Pipelines • j-Particle Memory Unit • Holds 32,768 bodies • Acts as the "main memory" • 6.6 Mbits built from static RAM • Cell-Index Controller • Controls the j-particle memory by generating addresses • Force Summation Unit • Master Controller • Manages timings and inputs/outputs of the chip

MDGRAPE-3 Chip • 2 virtual pipelines/physical pipeline • Physical bandwidth of j-particle unit 2.5 Gbytes/sec but virtual bandwidth will reach 100 Gbytes/sec • 340 arithmetic units • 20 function-evaluator units which work simultaneously • 165 Gflops at 250MHz
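
The headline figures on this slide can be cross-checked against the per-pipeline numbers given earlier. The sketch below reads the roughly 33 equivalent operations as a per-cycle figure and assumes that the virtual bandwidth comes from each j-particle word being reused by every virtual pipeline; both readings are interpretations that happen to reproduce the quoted totals.

```python
PIPELINES = 20
VIRTUAL_PER_PHYSICAL = 2
OPS_PER_CYCLE = 33           # equivalent operations per pipeline (Coulomb force)
CLOCK_MHZ = 250
UNITS_PER_PIPELINE = 3 + 6 + 8   # subtractors + adders + multipliers
PHYSICAL_BW_GBYTES = 2.5

peak_gflops = PIPELINES * OPS_PER_CYCLE * CLOCK_MHZ / 1000
arithmetic_units = PIPELINES * UNITS_PER_PIPELINE
virtual_bw_gbytes = PHYSICAL_BW_GBYTES * PIPELINES * VIRTUAL_PER_PHYSICAL

print(peak_gflops, arithmetic_units, virtual_bw_gbytes)
```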

MDGRAPE-3 Chip

MDGRAPE-3 Chip • Chip made by Hitachi • 6M gates • 10M bits of memory • Chip size is ~220 mm^2 • Dissipates 20 W at a core voltage of +1.2 V • 0.12 W/Gflops, much better than a 3 GHz Pentium 4 at 14 W/Gflops
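
As a quick check of the power-efficiency claim, the per-chip figure follows from the 20 W dissipation and the 165 Gflops peak quoted for the chip. The variable names are illustrative.

```python
CHIP_WATTS = 20.0
CHIP_GFLOPS = 165.0
P4_WATTS_PER_GFLOPS = 14.0   # figure quoted on the slide for a 3 GHz Pentium 4

chip_watts_per_gflops = CHIP_WATTS / CHIP_GFLOPS
advantage = P4_WATTS_PER_GFLOPS * CHIP_GFLOPS / CHIP_WATTS

print(f"{chip_watts_per_gflops:.2f} W/Gflops, roughly {advantage:.1f}x less power per Gflops")
```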

System Architecture • Host PC cluster will use Itanium or Opteron CPUs • 256 nodes with 512 CPUs in total • Each node has 24 MDGRAPE-3 chips • Performance of a node is 3.96 Tflops • Total reaches a petaflops • Requires a 10 Gbit/sec network • InfiniBand, 10G Ethernet, or future Myrinet • Network topology will be a 2D hyper-crossbar • MDGRAPE-3 chips connected via 2 PCI-X buses at 133 MHz • A 19" rack can house 6 nodes • 43 racks total • Power dissipation ~150 kW • Occupies 100 m^2
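
The system-level totals follow from the per-chip and per-node figures. The sketch below only multiplies out numbers quoted on the slides (165 Gflops per chip, 24 chips per node, 256 nodes, 6 nodes per rack); the variable names are illustrative.

```python
import math

CHIP_GFLOPS = 165
CHIPS_PER_NODE = 24
NODES = 256
NODES_PER_RACK = 6

node_tflops = CHIPS_PER_NODE * CHIP_GFLOPS / 1000
total_chips = NODES * CHIPS_PER_NODE
total_pflops = NODES * CHIPS_PER_NODE * CHIP_GFLOPS / 1e6
racks = math.ceil(NODES / NODES_PER_RACK)

print(node_tflops, total_chips, total_pflops, racks)
```

The product lands just above one petaflops in nominal peak performance, and rounding 256/6 up gives the 43 racks listed.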

System Architecture

Protein Explorer Board

Software • Programs are very easy to write • All computational abilities are provided in a library • No special knowledge of the device needed

Cost • $20 million including labor • Less than $10/Gflop • At least ten times better than general-purpose computers even when compared with relatively cheap BlueGene/L ($140/Gflop)

Questions • What is Myrinet? • What is a two-dimensional hyper-crossbar network topology? • How does this compare to massive distributed computing such as Folding@Home? • Advantages? • Disadvantages?
