Atomistic Protein Folding Simulations on the Submillisecond Timescale Using Worldwide Distributed Computing


  1. Atomistic Protein Folding Simulations on the Submillisecond Timescale Using Worldwide Distributed Computing Qing Lu CMSC 838 Presentation

  2. Overview • Overview of talk • Motivation • Challenge • Methods • Ensemble Dynamics • Folding@Home • Evaluation • Observations CMSC 838T – Presentation

  3. Motivation • Atomistic simulation of protein folding • understand dynamics of folding • real-time folding in full atomic detail • large-scale parallelization methods • Benefits • protein folding & disease • proteins self-assemble to function • proteins misfold → diseases • nanotechnology • nanomachines • self-assembly on the nanoscale

  4. Challenge • Difficulties • limited by current computational techniques • fastest proteins fold in microseconds • one CPU: 1 ns/day, 30 years • 10,000-fold computational gap • 1,000 CPUs: 1 microsecond/day • traditional parallelization scheme • hard to scale to a large number of processors • requires extremely fast communication • complexity of coordination • expensive supercomputers • cost • time-sharing
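
The numbers on this slide can be checked with a few lines of arithmetic. The sketch below assumes a folding time of 10 microseconds (the slide only says "microseconds"; 10 is used here because it reproduces the 10,000-fold gap) and perfect scaling across CPUs. The function name and its parameters are illustrative, not taken from the talk.

```python
NS_PER_US = 1_000  # nanoseconds per microsecond


def days_to_simulate(target_us, ns_per_day_per_cpu=1.0, n_cpus=1):
    """Wall-clock days needed to simulate `target_us` microseconds.

    Assumes every CPU contributes its full rate (perfect scaling),
    which is exactly what traditional parallelization struggles to achieve.
    """
    total_ns = target_us * NS_PER_US
    return total_ns / (ns_per_day_per_cpu * n_cpus)


# Assumed 10-microsecond fold on one CPU at 1 ns/day.
single_cpu_days = days_to_simulate(10)
print("one CPU:", single_cpu_days, "days, about",
      round(single_cpu_days / 365, 1), "years")

# 1,000 perfectly scaling CPUs simulate 1 microsecond in one day.
print("1,000 CPUs, 1 microsecond:", days_to_simulate(1, n_cpus=1000), "day(s)")
```

Under these assumptions a single CPU needs 10,000 days, a little over 27 years, which the slide rounds to 30.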

  5. Method • ensemble dynamics • a new simulation algorithm • parallel simulation • Folding@Home • heterogeneous network, Internet • large-scale distributed platform

  6. Simulation of Dynamics • free energy barrier • progress from one state to another: transition • thermal fluctuations push the system over the free energy barrier • previous approaches: sampling • may get stuck in metastable free energy minima • sampling is computationally expensive

  7. Ensemble Dynamics • application scenario • waiting time for transitions dominates total time • protein folding • transition: free energy barrier crossing • coupled simulations: transition coupling • Algorithm • start M independent simulations from an initial condition • wait for the first simulation to cross the free energy barrier • the first crossing takes M times less time than the average crossing time • restart all M simulations from the new location after the transition • Near-linear speedup in the number of processors • exponential kinetics: f(t) = 1 - exp(-k t) • if k * t is small, f(t) ≈ k * t • M simulations → M * f(t) ≈ M * k * t folding events
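
The speedup claim on this slide follows from a standard property of exponential kinetics: the minimum of M independent exponential waiting times with rate k is itself exponential with rate M * k, so its mean is 1 / (M * k). The toy sketch below checks this numerically. It does not simulate a protein; each "simulation" is reduced to a single random barrier-crossing time, and the function names, rate, and trial counts are illustrative assumptions.

```python
import random


def first_crossing_time(m, k, rng):
    """Time until the first of m independent simulations crosses the barrier.

    Each simulation's crossing time is drawn from an exponential
    distribution with rate k (the slide's exponential-kinetics assumption).
    """
    return min(rng.expovariate(k) for _ in range(m))


def mean_first_crossing(m, k, trials, seed=0):
    """Average first-crossing time over many repeated ensembles."""
    rng = random.Random(seed)
    return sum(first_crossing_time(m, k, rng) for _ in range(trials)) / trials


k = 0.01  # assumed crossing rate per unit time for a single simulation
for m in (1, 10, 50):
    observed = mean_first_crossing(m, k, trials=2000)
    predicted = 1.0 / (m * k)
    print(f"M={m:3d}  observed mean={observed:8.2f}  predicted 1/(M*k)={predicted:8.2f}")
```

The observed means should track 1 / (M * k) up to sampling noise. The result depends entirely on the exponential assumption; if crossing times are not exponentially distributed, the M-fold speedup is not guaranteed.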

  8. Limitations • barrier crossing probability • relies on the exponential kinetics assumption • correct transition detection • transition: free energy barrier crossing • detected by a large variance in energy exceeding a threshold • correct detection is not guaranteed • multiple possible transitions • not addressed • selection of the first transition
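
To make the detection limitation concrete, here is a minimal sketch of threshold-based detection on an energy trace. It is one possible reading of the slide's "large variance in energy: threshold" bullet, not the detection method used in the talk. The window size, the threshold value, and the function name are all hypothetical.

```python
from statistics import pvariance


def detect_transition(energies, window, threshold):
    """Return the first index whose trailing window has variance > threshold.

    Returns None if no window exceeds the threshold. A poorly chosen
    threshold gives false positives (noise) or misses real transitions,
    which is the limitation the slide points out.
    """
    for end in range(window - 1, len(energies)):
        recent = energies[end - window + 1:end + 1]
        if pvariance(recent) > threshold:
            return end
    return None


# Synthetic trace: a flat stretch followed by large energy fluctuations.
trace = [1.0] * 11 + [5.0, 1.0, 5.0]
print(detect_transition(trace, window=4, threshold=1.0))
print(detect_transition([1.0] * 20, window=4, threshold=1.0))
```

Lowering the threshold makes the detector fire on ordinary thermal noise, while raising it can hide a genuine crossing, so detection is never guaranteed to be correct.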

  9. Distributed Computing • Distributed simulations • M processors for each run • simulate folding in atomic detail on each processor • restart once a barrier-crossing event occurs • Implementation: Folding@Home • worldwide distributed computing over the Internet • started in October 2000 • more than 200,000 participants • 10,000 CPU-years in the first 12 months

  10. Folding@Home

  11. Folding@Home • client-server architecture • server assigns jobs (work units) to clients • client sends back results after computation • ~100K of data transferred between client and server • why is ensemble dynamics good for Folding@Home? • CPU-intensive jobs: a few hours, often days • connection speed: a modem is good enough • suitable for Folding@Home
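
The assign, compute, and return cycle described on this slide can be sketched in-process as follows. This is a toy model under stated assumptions, not the Folding@Home protocol: there is no networking, and `WorkUnit`, `Server`, `client_compute`, and `run_client` are hypothetical names introduced only for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class WorkUnit:
    unit_id: int
    start_state: float  # stand-in for a small set of starting coordinates
    steps: int          # amount of CPU work requested


class Server:
    """Hands out work units and collects results (hypothetical sketch)."""

    def __init__(self, n_units, steps):
        self.pending = deque(WorkUnit(i, 0.0, steps) for i in range(n_units))
        self.results = {}

    def assign(self):
        # Small download: only the starting state is sent to the client.
        return self.pending.popleft() if self.pending else None

    def submit(self, unit_id, final_state):
        # Small upload: only the final state comes back.
        self.results[unit_id] = final_state


def client_compute(unit):
    """Long, CPU-bound phase; the placeholder update stands in for MD steps."""
    state = unit.start_state
    for _ in range(unit.steps):
        state += 1.0
    return state


def run_client(server):
    """Fetch, compute, and return work units until none remain."""
    while (unit := server.assign()) is not None:
        server.submit(unit.unit_id, client_compute(unit))


server = Server(n_units=3, steps=1000)
run_client(server)
print(server.results)
```

The point of the structure is the ratio: each exchange moves a small amount of data, while the compute phase between exchanges runs for hours or days, so a slow connection is not the bottleneck.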

  12. Other@Home Work • SETI@Home • search for intelligent life outside Earth • data analysis of signals • FightAIDS@Home • find drug therapies for HIV • how drugs interact with various HIV mutations • distributed projects • divide and conquer • CPU-intensive jobs • small pieces of data (kilobytes) transferred • communication is not a major concern

  13. Evaluation • Folding@Home • based on Tinker molecular dynamics code • voluntary participants worldwide, over 400,000 CPUs • simulate folding and unfolding • folding rates • simulations on small proteins

  14. Folding Rates

  15. Folding & Unfolding

  16. Observations • Sampling • too expensive to run for long timescales • wastes too much time lingering in local energy minima • Ensemble dynamics • speeds up simulations of dynamics • biological meaning of simulation results? • results on folding of large proteins? • limitations: correct transition detection, transition probability • Folding@Home • a cheap way to achieve supercomputer-level computing power • huge distributed computing platform: over 400,000 CPUs • an efficient approach for CPU-intensive jobs • Complexity of problems and size of data increase rapidly • finding better algorithms is preferable to buying supercomputers