
Anton, a Special-Purpose Machine for Molecular Dynamics Simulation

Anton, a Special-Purpose Machine for Molecular Dynamics Simulation. By David E. Shaw et al. Presented by Bob Koutsoyannis. The Anton Legacy: Anton van Leeuwenhoek, “Father of Microscopy”, first to see bacteria and other microorganisms.




  1. Anton, a Special-Purpose Machine for Molecular Dynamics Simulation • By David E. Shaw et al. • Presented by Bob Koutsoyannis

  2. The Anton Legacy • Anton van Leeuwenhoek, “Father of Microscopy” • First to see bacteria and other microorganisms • Objective: Improve the tools available to scientists to further our understanding of organisms & diseases

  3. Anton the Machine • Specialized, massively parallel machine being built to improve molecular dynamics simulations. • In the works, to be completed by 2009 • Biological systems spatially distributed among many nodes in a 3D torus. • MD-specific hardware • Novel parallel algorithms

  4. Molecular Dynamics Simulation • Models the motions and interactions of molecular systems • Proteins • Cell Membranes • DNA • (atomic level simulations)

  5. Motivation • Life Saving… • Used to visualize biochemical phenomena that cannot be seen in lab experiments. • Protein folding • Protein-protein interactions • Protein-drug interactions • Key for developing drugs

  6. What makes one MD simulator better than the next? • Time Scale • Being able to simulate the interaction between molecules for more than a nanosecond. • Problem Size • Why is a millisecond of simulation out of the scope of our current technology? • Consider 200,000 molecules • 10^12 time steps to simulate a millisecond • Each time step requires intense arithmetic computation on all 200,000 molecules
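The scale of the problem on this slide can be checked with a few lines of arithmetic. This is a minimal sketch, assuming a 1 femtosecond integration time step (the slide does not state the step size, but its 10^12 figure implies it); the constant names are illustrative only.

```python
# Assumption: a 1 fs time step, which is what the slide's 10^12 figure implies.
TIME_STEP_FS = 1
SIMULATED_TIME_FS = 10**12   # 1 millisecond expressed in femtoseconds
PARTICLES = 200_000          # system size quoted on the slide

steps = SIMULATED_TIME_FS // TIME_STEP_FS
particle_updates = steps * PARTICLES

print(f"time steps:       {steps:.1e}")
print(f"particle updates: {particle_updates:.1e}")
```

Even before counting the pairwise interactions inside each step, a millisecond run touches every particle 10^12 times.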

  7. What makes one MD simulator better than the next? • Other Projects Addressing MD Sims • Folding@Home • Network of 200,000 PCs • Large sample for independent molecular sims • But no millisecond simulations • FASTRUN, MDGRAPE, MD Engine • Good with larger molecular system sims • Have strong arithmetic units • Still limited by communication bottlenecks

  8. MD Simulator Requirements • Force Calculation • (getting an idea of the level of computation needed) • Molecular mechanics force fields are used to model the total potential energy (PE) of a system. • Input: X, Y, Z • Outputs: force quantities • [Figure labels: M1, M2]

  9. MD Simulator Requirements • Force Calculation • (getting an idea of the level of computation needed) • For every time step, the force fields must be updated. • FFT, Convolution, Inverse FFT (Computationally expensive operations) • For 200,000 molecules/step… • 1) Need a huge number of arithmetic processing elements
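The FFT, convolution, inverse FFT sequence named on this slide can be illustrated in one dimension. The sketch below is not Anton's implementation: it uses a naive O(n^2) transform in place of a real FFT, a made-up periodic charge grid, and a made-up interaction kernel. It only shows that multiplying in Fourier space and transforming back gives the same result as a direct circular convolution.

```python
import cmath

def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform (stand-in for a real FFT)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(v * cmath.exp(sign * 2j * cmath.pi * k * j / n)
                    for j, v in enumerate(values))
        out.append(total / n if inverse else total)
    return out

def convolve_via_fourier(charges, kernel):
    """Transform, multiply pointwise, transform back."""
    product = [a * b for a, b in zip(dft(charges), dft(kernel))]
    return [x.real for x in dft(product, inverse=True)]

def convolve_direct(charges, kernel):
    """Reference circular convolution computed term by term."""
    n = len(charges)
    return [sum(charges[j] * kernel[(i - j) % n] for j in range(n))
            for i in range(n)]

# Hypothetical data: charges on an 8-point periodic grid and a symmetric kernel.
charges = [1.0, 0.0, -1.0, 0.0, 0.5, 0.0, 0.0, -0.5]
kernel = [1.0, 0.5, 0.25, 0.125, 0.0625, 0.125, 0.25, 0.5]

via_fourier = convolve_via_fourier(charges, kernel)
via_direct = convolve_direct(charges, kernel)
max_diff = max(abs(a - b) for a, b in zip(via_fourier, via_direct))
print(max_diff)
```

The direct form costs O(n^2) per step; with a true FFT the Fourier route costs O(n log n), which is why the transform pair is worth its expense on large grids.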

  10. MD Simulator Requirements • Integration • (getting an idea of the level of computation needed) • For every time step, updates of atomic positions and velocities must be made. • Global actions and constraints must be enforced on the entire system (temperature, pressure, optimizations).
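The per-step position and velocity update can be sketched with velocity Verlet, a commonly used MD integrator. The slides do not say which integrator Anton uses, so this choice is an assumption, as are the toy one-dimensional harmonic force and all parameter values.

```python
def force(x, k=1.0):
    """Toy restoring force of a 1D harmonic spring (illustrative only)."""
    return -k * x

def velocity_verlet_step(x, v, dt, mass=1.0):
    """Advance position and velocity by one time step."""
    a = force(x) / mass
    x_new = x + v * dt + 0.5 * a * dt * dt
    a_new = force(x_new) / mass          # forces are re-evaluated every step
    v_new = v + 0.5 * (a + a_new) * dt
    return x_new, v_new

def total_energy(x, v, k=1.0, mass=1.0):
    """Kinetic plus potential energy of the toy system."""
    return 0.5 * mass * v * v + 0.5 * k * x * x

x, v = 1.0, 0.0
e0 = total_energy(x, v)
for _ in range(1000):
    x, v = velocity_verlet_step(x, v, dt=0.01)
drift = abs(total_energy(x, v) - e0)
print(drift)
```

Tracking how far the total energy wanders from its starting value is one simple way to judge whether an integrator and its time step are behaving.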

  11. MD Simulator Requirements • Parallelization • (getting an idea of the level of computation needed) • For every time step, every atom must interact with every other atom within its cutoff radius. • 2) A lot of inter-processor communication that can be scaled well is needed.

  12. MD Simulator Requirements • Parallelization • (getting an idea of the level of computation needed) • Whole system is broken down into boxes (processing nodes) • Each node handles the bonded interactions within its box • NT (neutral territory) method for non-bonded interactions (much more common). • NT method for atom migration
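The box decomposition and cutoff radius from the two parallelization slides can be sketched as follows. This is a plain neighbor-box search, not the NT method, and every name and value here is hypothetical. Each box stands in for one processing node, and the periodic wrap mirrors the torus layout.

```python
import itertools

def min_image_distance(a, b, domain_len):
    """Distance between two points under periodic boundary conditions."""
    total = 0.0
    for ca, cb in zip(a, b):
        d = abs(ca - cb)
        d = min(d, domain_len - d)
        total += d * d
    return total ** 0.5

def pairs_within_cutoff(positions, domain_len, boxes_per_side, cutoff):
    """Return all index pairs (i, j), i < j, that lie within the cutoff."""
    box_len = domain_len / boxes_per_side
    if cutoff > box_len:
        raise ValueError("this sketch only searches adjacent boxes")
    # Assign every atom to its home box.
    boxes = {}
    for i, pos in enumerate(positions):
        key = tuple(int(c // box_len) % boxes_per_side for c in pos)
        boxes.setdefault(key, []).append(i)
    found = set()
    for key, members in boxes.items():
        # Visit the home box and its 26 neighbors, wrapping at the edges.
        for shift in itertools.product((-1, 0, 1), repeat=3):
            neighbor = tuple((k + s) % boxes_per_side for k, s in zip(key, shift))
            for i in members:
                for j in boxes.get(neighbor, ()):
                    if i < j and min_image_distance(
                            positions[i], positions[j], domain_len) <= cutoff:
                        found.add((i, j))
    return found

positions = [(0.5, 0.5, 0.5), (1.0, 0.5, 0.5), (9.8, 0.5, 0.5), (5.0, 5.0, 5.0)]
pairs = pairs_within_cutoff(positions, domain_len=10.0, boxes_per_side=4, cutoff=1.0)
print(sorted(pairs))
```

Atoms 0 and 2 sit in different boxes on opposite faces of the domain, yet they are within the cutoff through the periodic boundary. Pairs like this are what force data to move between nodes.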

  13. Why Specialized Hardware? • 1) Need a huge number of arithmetic processing elements • 2) A lot of inter-processor communication that can be scaled well is needed. • 3) Memory is not an issue • With 25,000 atoms (64 bytes each), total = 1.6 MB; over 512 nodes that is about 3.1 KB/node, which is smaller than most L1 caches • [Figure: Memory, Communication, and Computation needs]
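The memory estimate on this slide is simple enough to reproduce directly. The sketch below uses only the numbers quoted on the slide, with decimal units (1 MB = 10^6 bytes); the variable names are illustrative.

```python
ATOMS = 25_000
BYTES_PER_ATOM = 64
NODES = 512

total_bytes = ATOMS * BYTES_PER_ATOM
bytes_per_node = total_bytes / NODES

print(f"total:    {total_bytes / 1e6:.1f} MB")
print(f"per node: {bytes_per_node:.0f} bytes")
```

At roughly 3 KB per node, the particle state fits comfortably in on-chip memory, which is the slide's point that memory capacity is not the bottleneck.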

  14. Why Specialized Hardware? • Consider Moore’s Law at a 10X improvement in 5 years vs. Anton’s 1000X in 1 year. • Can great discoveries wait? • Can use custom pipelines with more precision and increased datapath logic speed in less silicon area. • Tailored ISAs for geometric calculations • Programmability for accommodating various force fields and integration algorithms • Dedicated memory for each particle to accumulate forces

  15. Communication Latency • Low-latency, high-bandwidth network within and between ASICs. • Push-based communication with counters (reduces waiting). • Set of autonomous Direct Memory Access (DMA) engines allowing for greater overlap of communication and computation. • Admission control features • [Figure labels: “Updating force field”; “This node may update for them”]
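One way to read "push-based communication with counters" is that senders write data straight into the receiver's memory and bump a counter, so the receiver knows its inputs are complete without sending a request. The sketch below is a single-threaded toy model of that idea, not Anton's hardware interface; the class and method names are hypothetical.

```python
class CountedBuffer:
    """Toy receive buffer that becomes ready after a known number of writes."""

    def __init__(self, expected_writes):
        self.expected_writes = expected_writes
        self.counter = 0
        self.slots = {}

    def remote_write(self, slot, value):
        """Called on behalf of a sender: store the data and bump the counter."""
        self.slots[slot] = value
        self.counter += 1

    def ready(self):
        """The receiver polls this instead of requesting data from senders."""
        return self.counter >= self.expected_writes

buf = CountedBuffer(expected_writes=3)
log = []
# Writes may arrive in any order; only the count matters for readiness.
for slot, value in [(0, 1.5), (2, -0.5), (1, 4.0)]:
    buf.remote_write(slot, value)
    log.append(buf.ready())
print(log)
```

Because the receiver only checks a local counter, no round-trip request is needed, which is where the reduced waiting comes from in this reading.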

  16. Subsystems of Anton • High-Throughput Interaction Subsystem (HTIS) • Flexible Subsystem • Communication Subsystem • Memory Subsystem

  17. High-Throughput Interaction Subsystem • Executes non-bonded MD interaction calculations (charge spreading & force interpolation) • Accumulates forces on each particle as data streams through. • The ICB controls the flow of data through the HTIS, provides programmable ISA extensions, and acts as a buffering, pre-fetching, synchronization, and write-back controller
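The idea of accumulating forces on each particle as pair data streams through can be sketched in software. This is a toy model under stated assumptions: a Coulomb-style pair force in reduced units (Coulomb constant set to 1), made-up positions and charges, and hypothetical function names. It says nothing about how the HTIS pipelines are actually built.

```python
import math

def coulomb_force(pos_i, pos_j, q_i, q_j):
    """Force on particle i due to particle j, in reduced units."""
    delta = [a - b for a, b in zip(pos_i, pos_j)]
    r = math.sqrt(sum(d * d for d in delta))
    scale = q_i * q_j / r**3
    return [scale * d for d in delta]

def accumulate_forces(positions, charges, pair_stream):
    """Add each pair's contribution to per-particle accumulators."""
    forces = [[0.0, 0.0, 0.0] for _ in positions]
    for i, j in pair_stream:
        f = coulomb_force(positions[i], positions[j], charges[i], charges[j])
        for axis in range(3):
            forces[i][axis] += f[axis]
            forces[j][axis] -= f[axis]   # equal and opposite reaction
    return forces

positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
charges = [1.0, -1.0, 1.0]
pairs = [(0, 1), (0, 2), (1, 2)]

forces = accumulate_forces(positions, charges, pairs)
net = [sum(f[axis] for f in forces) for axis in range(3)]
print(forces[0], net)
```

Each pair is visited once and updates two accumulators, so the net force over the whole system stays at zero up to rounding.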

  18. Flexible Subsystem • Initiates force computation phase • Calculates bonded force terms • Force correction terms • All integration tasks • Constraint calculations (temp & pressure) • Position and velocity updates • Atom migration • All maintenance activities (boot, diagnostic, self-test, loading sims, switching contexts, logging, checkpointing, error reporting).

  19. Flexible Subsystem • General-purpose core w/ caches • Remote Access Unit • Autonomous data transfers • Geometry Cores • Bonded MD calculations • Correction Pipeline • Computes force correction terms • Racetrack • Local, internal interconnect for flexible subsystem components • Ring Interface Unit • Lets the flexible subsystem transfer packets to/from the communication subsystem.

  20. Communications Subsystem • Routing: 48-bit address space • 16-bit node identifier, 32 bits of address per node • Flow control • Memory Subsystem • Provides access to ASIC DRAM • Supports accumulation and synchronization
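The 48-bit address split on this slide (16-bit node identifier plus 32 bits of per-node address) can be shown with a pack/unpack pair. The slide does not give the bit layout, so placing the node identifier in the high bits is an assumption, and the function names are hypothetical.

```python
NODE_BITS = 16
OFFSET_BITS = 32

def pack_address(node_id, offset):
    """Combine a node identifier and a per-node address into 48 bits.

    Assumption: the node identifier occupies the high 16 bits.
    """
    if not 0 <= node_id < (1 << NODE_BITS):
        raise ValueError("node_id does not fit in 16 bits")
    if not 0 <= offset < (1 << OFFSET_BITS):
        raise ValueError("offset does not fit in 32 bits")
    return (node_id << OFFSET_BITS) | offset

def unpack_address(address):
    """Split a 48-bit address back into (node_id, offset)."""
    return address >> OFFSET_BITS, address & ((1 << OFFSET_BITS) - 1)

addr = pack_address(0x01FF, 0xDEADBEEF)
print(hex(addr), unpack_address(addr))
```

Sixteen bits of node identifier can name up to 65,536 nodes, and 32 bits of offset address up to 4 GiB of byte-addressed space on each.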

  21. Simulation Evaluations • 500X NAMD • 80-100X Desmond • 100X Blue Matter

  22. Accuracy & Efficiency • Increasing the simulated system size increases efficiency. • Force error is measured as relative rms force error • Energy drift
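The slide names relative rms force error without defining it. A common definition, assumed here, is the rms of the per-atom force error divided by the rms of the reference force. The function name and the sample vectors below are illustrative only.

```python
import math

def relative_rms_force_error(forces, reference):
    """rms of the force error divided by rms of the reference force.

    Assumption: this is the definition intended on the slide.
    """
    err_sq = sum(sum((a - b) ** 2 for a, b in zip(f, r))
                 for f, r in zip(forces, reference))
    ref_sq = sum(sum(b * b for b in r) for r in reference)
    return math.sqrt(err_sq / ref_sq)

# One atom with a reference force of magnitude 5 and a small error in z.
reference = [(3.0, 4.0, 0.0)]
computed = [(3.0, 4.0, 0.05)]
error = relative_rms_force_error(computed, reference)
print(error)
```

Normalizing by the reference magnitude makes the figure comparable across systems of different size and force scale.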
