
Cezary Czaplewski Faculty of Chemistry University of Gdańsk Poland

All-atom molecular simulations of protein folding and unfolded-state dynamics and structure with accelerated calculations on GPU. Cezary Czaplewski Faculty of Chemistry University of Gdańsk Poland. The 10th Protein Folding Winter School, KIAS, February, 7-11, 2011.


Presentation Transcript


  1. All-atom molecular simulations of protein folding and unfolded-state dynamics and structure with accelerated calculations on GPU. Cezary Czaplewski, Faculty of Chemistry, University of Gdańsk, Poland. The 10th Protein Folding Winter School, KIAS, February 7-11, 2011.

  2. Molecular Simulation of ab Initio Protein Folding for a Millisecond Folder NTL9(1-39). Vincent A. Voelz (1), Gregory R. Bowman (2), Kyle Beauchamp (2), Vijay S. Pande (1,2,3). (1) Department of Chemistry, Stanford University; (2) Biophysics Program, Stanford University; (3) Department of Structural Biology, Stanford University. J. Am. Chem. Soc. 2010, 132, 1526–1528.

  3. Computer simulations, validated by experiment, can help gain a complete understanding of how proteins fold. • Folding rates span more than a million-fold range, which suggests a possible diversity in folding mechanisms. • Folding@Home on GPUs made it possible to obtain several folding trajectories of the 39-residue NTL9(1-39), the slowest-folding protein (~1.5 ms folding time) folded ab initio with all-atom MD to date. • Insights into the folding mechanism are based on a Markov state model (MSM).

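  4. Timescales (figure). Axis: 10^-15 s (femto), 10^-12 s (pico), 10^-9 s (nano), 10^-6 s (micro), 10^-3 s (milli), 10^0 s (seconds). Processes marked on the axis: all-atom MD step, bond vibration, sidechain rotation, loop closure, helix formation, folding of β-hairpins, protein folding.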

  5. GPU • A type of processor attached to a graphics card, dedicated to floating-point calculations • Incorporates stream-processing microchips that contain special mathematical operations • Stream processing: applications can use multiple computational units without explicitly managing allocation, synchronization, or communication among those units.

  6. CPU vs. GPU CPU – 4 cores

  7. Floating-Point Operations per Second for the CPU and GPU

  8. Proteins folded ab initio by all-atom MD • Trp-cage, 4.1 μs: Pitera, Swope, PNAS 2003 • Fip35 WW, 13 μs: Ensign, Pande, Biophys. J. 2009 • Villin headpiece, 10 μs: Zagrovic, Snow, Shirts, Pande, JMB 2002 • Fast-folding villin variant, <1 μs: Ensign, Kasson, Pande, JMB 2007

  9. NTL9(1-39): ~1.5 ms experimental folding time

  10. Folding@Home using Gromacs with the OpenMM library, written specifically for GPUs, allowing dramatically longer trajectories • AMBER ff96 with the Onufriev, Bashford, Case GBSA implicit-solvent model • Up to 10,000 parallel MD simulations at 300, 330, 370 and 450 K • Starting from native, random-coil and extended conformations • Aggregate simulation time of 1.52 ms • Of the ~3000 trajectories started from unfolded states at 370 K, only two reach <3.5 Å RMSD and eight reach <4 Å RMSD • The number of folding events is consistent with a simple model of parallel uncoupled folding as a two-state Poisson process: 〈n〉 = ∫ M(t) k exp(−M(t) k t) dt, where M(t) is the number of parallel simulations that reach time t and k is the experimental folding rate, ~640/s.
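
The order of magnitude of the expected number of folding events can be checked with a short sketch. It does not evaluate the expression on the slide as transcribed. Instead it assumes the simplest independent-trajectory picture, in which a trajectory of length T folds with probability 1 − exp(−kT). The trajectory lengths below are hypothetical; only k ≈ 640/s comes from the slide.

```python
import math

def expected_folding_events(lengths_s, k_per_s):
    """Expected number of trajectories that fold before they end.

    Assumption: independent two-state trajectories, each folding with
    probability 1 - exp(-k * T) within its own length T (in seconds).
    """
    return sum(1.0 - math.exp(-k_per_s * T) for T in lengths_s)

k = 640.0                   # experimental folding rate quoted on the slide, 1/s
lengths = [0.5e-6] * 3000   # HYPOTHETICAL: 3000 trajectories of 0.5 microseconds each
n_expected = expected_folding_events(lengths, k)
aggregate_s = sum(lengths)  # total simulated time for these made-up lengths

print(f"aggregate time: {aggregate_s:.2e} s, expected events: {n_expected:.2f}")
```

Because k·T is very small for each trajectory, the result is close to k multiplied by the aggregate simulation time, so only a handful of folding events would be expected under these assumptions.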

  11. Figures: distributions of RMSD for native-state simulations of NTL9(1−39) after 10 μs; posterior predictions of the folding rate; the number of parallel simulations at 370 K that reach time t.

  12. A snapshot from a folding trajectory (3.1 Å RMSD). Non-native and native-like hydrophobic core arrangements.

  13. Markov state model (MSM) • An MSM constitutes a kinetic clustering • Conformations that can interconvert rapidly are grouped into the same state • Conformations that can only interconvert slowly are grouped into separate states • Satisfies the Markov property: the identity of the next state depends only on the identity of the current state and not on any of the previous states • The transition probability matrix T propagates the state probabilities p • An implied timescale τ_k for a given lag time τ can be calculated from the eigenvalues μ_k of matrix T
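
The last two points can be illustrated with a minimal sketch. It assumes a row-stochastic transition matrix, so that probabilities propagate as p(t + τ) = p(t) T, and uses the standard relation τ_k = −τ / ln μ_k for the implied timescale. The two-state matrix and the 10 ns lag time are made-up values, not data from the presentation.

```python
import math

lag_ns = 10.0            # HYPOTHETICAL lag time tau
T = [[0.9, 0.1],         # HYPOTHETICAL row-stochastic matrix: T[i][j] is the
     [0.2, 0.8]]         # probability of going from state i to state j in one lag

def propagate(p, T):
    """One lag-time step: p(t + tau) = p(t) T."""
    n = len(T)
    return [sum(p[i] * T[i][j] for i in range(n)) for j in range(n)]

# Start with all population in state 0 and propagate to equilibrium.
p = [1.0, 0.0]
for _ in range(200):
    p = propagate(p, T)

# For a 2x2 stochastic matrix the eigenvalues are 1 and (trace - 1).
mu_2 = T[0][0] + T[1][1] - 1.0
implied_timescale_ns = -lag_ns / math.log(mu_2)

print("equilibrium populations:", [round(x, 4) for x in p])
print("implied timescale (ns):", round(implied_timescale_ns, 2))
```

For larger models the eigenvalues would come from a numerical eigensolver rather than the closed form used here.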

  14. Details of the MSMBuilder package • 100,000 microstates were generated by clustering conformations separated by 10 ns using the k-centers algorithm • The remaining 90% of the data was then assigned to these clusters • The resulting microstates had an average radius of ~4.5 Å • A macrostate model was generated by lumping the microstates into 2,000 macrostates using the Robust Perron Cluster Analysis (PCCA+) algorithm • Although only a few folding trajectories were observed directly, a network of many possible pathways can be inferred from the overlapping sampling of local transitions • The top 10 folding fluxes were calculated by a greedy backtracking algorithm
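
The k-centers step can be sketched as a greedy farthest-point procedure. This is a generic illustration, not MSMBuilder's code: it uses Euclidean distance between 2D points as a stand-in for the RMSD between conformations, and the function name and sample points are hypothetical.

```python
import math

def k_centers(points, k, dist):
    """Greedy k-centers: repeatedly promote the point farthest from all
    current centers to a new center. Returns (center indices,
    cluster index per point, largest point-to-center distance)."""
    centers = [0]                                  # arbitrary first center
    d = [dist(points[0], p) for p in points]       # distance to nearest center
    assign = [0] * len(points)
    while len(centers) < k:
        new = max(range(len(points)), key=lambda i: d[i])
        centers.append(new)
        for i, p in enumerate(points):
            d_new = dist(points[new], p)
            if d_new < d[i]:                       # new center is closer
                d[i] = d_new
                assign[i] = len(centers) - 1
    return centers, assign, max(d)

def euclid(a, b):
    # Stand-in metric; a real MSM pipeline would use RMSD between structures.
    return math.hypot(a[0] - b[0], a[1] - b[1])

# HYPOTHETICAL data: three well-separated pairs of points.
points = [(0, 0), (0.1, 0), (5, 5), (5.1, 5), (10, 0), (10, 0.2)]
centers, assign, radius = k_centers(points, 3, euclid)
print(centers, assign, round(radius, 3))
```

Data left out of the clustering step can then be assigned to the nearest center with the same distance function.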

  15. Implied timescales for Markov State Models (MSMs) built at lag times between 1 and 32 ns. Panels: 100,000-microstate model; 2000-macrostate model.

  16. A scatter plot of the 2000 macrostates. Shown in red are the 14 macrostates transited by the top ten pathway fluxes.

  17. A 2000-state Markov State Model (MSM). The top 10 folding pathways account for ∼25% of the total flux and transit 14 of the 2000 macrostates

  18. Contact profile subspaces used to calculate Qα, Qβ12 and Qβ13. c(x): contact profile indexed by x = (i, j)

  19. The 14 macrostates plotted along structural and kinetic reaction coordinates

  20. Contact profiles for the 14 macrostates involved in the top folding pathways

  21. Values of Q for each of the 14 macrostates involved in the top ten folding pathways

  22. Q-values plotted versus pfold (committor) values

  23. Macrostates l, m and n have very similar structural ensembles and similar pfold values. These states differ mostly in their hairpin registrations and packing of the hairpin loop.
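
The pfold (committor) value of a state is the probability of reaching the folded state before the unfolded state. A minimal sketch of how it could be computed from a transition matrix follows, assuming a row-stochastic T and solving q_i = Σ_j T_ij q_j by simple fixed-point iteration with q = 0 on the unfolded state and q = 1 on the folded state. The four-state matrix is hypothetical and unrelated to the 2000-macrostate model.

```python
def committor(T, unfolded, folded, n_iter=500):
    """Iteratively solve q_i = sum_j T[i][j] * q_j for intermediate states,
    with q fixed at 0 on `unfolded` states and 1 on `folded` states."""
    n = len(T)
    q = [1.0 if i in folded else 0.0 for i in range(n)]
    for _ in range(n_iter):
        new_q = q[:]
        for i in range(n):
            if i in unfolded or i in folded:
                continue                      # boundary values stay fixed
            new_q[i] = sum(T[i][j] * q[j] for j in range(n))
        q = new_q
    return q

# HYPOTHETICAL linear chain: 0 (unfolded) <-> 1 <-> 2 <-> 3 (folded)
T = [[0.9, 0.1, 0.0, 0.0],
     [0.2, 0.6, 0.2, 0.0],
     [0.0, 0.2, 0.6, 0.2],
     [0.0, 0.0, 0.1, 0.9]]

q = committor(T, unfolded={0}, folded={3})
print([round(x, 4) for x in q])
```

States with similar q values lie at a similar kinetic distance from the folded state, even when they differ in structure.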

  24. Conclusions • Existing force field models using implicit solvent are accurate enough to fold proteins ab initio at long time scales, opening the door to simulating more structurally complex proteins. • There need not be a single pathway or a single, dominant mechanism for the folding of a given protein. • Multiple mechanisms could be simultaneously present. • The sequence of the protein, coupled with the chemical environment, controls the extent to which each mechanistic pathway is seen.

  25. Take-home message • GPUs can speed up your simulations 10 times. • Existing force field models using implicit solvent are accurate enough to fold proteins during MD. • With only a few folding trajectories observed directly, a network of many possible pathways can be inferred from kinetic clustering using the Markov State Model. • There can be several pathways for the folding of a given protein. • Multiple folding mechanisms (diffusion-collision or nucleation-condensation) could be simultaneously present.
