- 616 Views
- Updated On :
- Presentation posted in: Science / Technology

Molecular dynamics

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Molecular dynamics

ECE 697S: Topics in Computational Biology

Naomi Fox

March 13, 2006

- Why simulate motion of molecules?
- Energy model of conformation
- Two main approaches:
- Monte Carlo - stochastic
- Molecular dynamics - deterministic

- Applications of MD
- Speeding up MD

- Predict structure
- Understand interactions
- Understand properties
- Learn about normal modes of vibration
- Design of bio-nano materials
- Experiment on what cannot be studied experimentally
- Obtain a movie of the interacting molecules

- central dogma: sequence to structure
- After protein is generated, it folds without the help of external agents (except solvent)
- Difficult problem for humans to predict 3-d structure from 1-d protein sequence
- Organic chemists know how structure implies function for small molecular systems. Proteins are much larger, and more difficult. Open problem.

central dogma

2004 Winner of Visualization Challenge in Science and Engineering,

Organized by the National Science Foundation and Science Magazine.

Structure, Dynamics, and Function of Aquaporins

http://www.ks.uiuc.edu/Research/aquaporins/

- Use physics to find the potential energy between all pairs of atoms.
- Move atoms to the next state.
- Repeat.

- Target function that MD tries to optimize
- Describes the interaction energies of all atoms and molecules in the system
- Always an approximation
- Closer to real physics --> more realistic, more computation time (I.e. smaller time steps and more interactions increase accuracy)

Hy = Ey

continuum

mesoscale

Monte Carlo

Time Scale

10-6 S

molecular

dynamics

domain

10-8 S

quantum

D

exp(-

E/kT)

chemistry

10-12 S

F = MA

10-10 M

10-8 M

10-6 M

10-4 M

Length Scale

Taken from Grant D. Smith

Department of Materials Science and Engineering

Department of Chemical and Fuels Engineering

University of Utah

http://www.che.utah.edu/~gdsmith/tutorials/tutorial1.ppt

- Proposed by Linus Pauling in the 1930s
- Bond angles and lengths are almost always the same
- Energy model broken up into two parts:
- Covalent terms
- Bond distances (1-2 interactions)
- Bond angles (1-3)
- Dihedral angles (1-4)

- Non-covalent terms
- Forces at a distance between all non-bonded atoms

- Covalent terms

http://cmm.cit.nih.gov/modeling/guide_documents/molecular_mechanics_document.html

The NIH Guide to Molecular Modeling

Energy =

Stretching Energy +

Bending Energy +

Torsion Energy +

Non-Bonded Interaction Energy

These equations together with the data (parameters) required to describe the behavior of different kinds of atoms and bonds, is called a force-field.

kb is the spring constant of the bond.

r0 is the bond length at equilibrium.

Unique kb and r0 assigned for each bond pair, i.e. C-C, O-H

k is the spring constant of the bend.

0 is the bond length at equilibrium.

Unique parameters for angle bending are assigned to each bonded triplet of atoms based on their types (e.g. C-C-C, C-O-C, C-C-H, etc.)

kb and k broaden or steepen the slope of the parabola.

The larger the value of k, the more energy is required to deform an angle (or bond) from its equilibrium value.

The parameters are determined from curve fitting.

Unique parameters for torsional rotation are assigned to each bonded quartet of atoms based on their types (e.g. C-C-C-C, C-O-C-N, H-C-C-H, etc.)

- A controls the amplitude of the curve
- n controls its periodicity
- shifts the entire curve along the rotation angle axis ().

A is the amplitude.

n reflects the type symmetry in the dihedral angle.

used to synchronize the torsional potential to the initial rotameric state of the molecule

A determines the degree the attractiveness

B determines the degree of repulsion

q is the charge

A determines the degree the attractiveness

B determines the degree of repulsion

q is the charge

Stochastic method

aka Monte Carlo

Deterministic method

aka Molecular Dynamics (some ambiguity of how term is used)

Goal: energy minimization

- The smaller the system, the more particles on the surface
- 1000 atom cubic crystal, 49% on surface
- 10^6 atom cubic crystal, 6% on surface

- Would like to simulate infinite bulk surrounding N-particle system
- Two approaches:
- Implicitly
- Explicitly
- Ex. Periodic boundary
conditions

- Ex. Periodic boundary

Schematic representation of periodic boundary conditions.

http://www.ccl.net/cca/documents/molecular-modeling/node9.html

Explore the energy surface by randomly probing the configuration space

Metropolis method (avoids local minima):

- Specify the initial atom coordinates.
- Select atom i randomly and move it by random displacement.
- Calculate the change of potential energy, E corresponding to this displacement.
- If E < 0, accept the new coordinates and go to step 2.
- Otherwise, if E 0, select a random R in the range [0,1] and:
- If e-E/kT < R accept and go to step 2
- If e-E/kT R reject and go to step 2

- System behaves as a Markov process
- Current state depends only on previous.
- Converges quickly.

- System is ergodic (any state can be reached from any other state).
- Accepted more by physicists than by chemists.
- Not deterministic and does not offer time evolution of the system.

- Provides us with a trajectory of the system.
- Typical simulations of small proteins including surrounding solvent in the pico-seconds.

- From atom positions, velocities, and accelerations, calculate atom positions and velocities at the next time step.
- Integrating these infinitesimal steps yields the trajectory of the system for any desired time range.
- There are efficient methods for integrating these elementary steps with Verlet and leapfrog algorithms being the most commonly used.

{r(t+Dt), v(t+Dt)}

{r(t), v(t)}

- Initialize system
- Ensure particles do not overlap in initial positions (can use lattice)
- Randomly assign velocities.

- Move and integrate.

Leapfrog algorithm

- Molecular modeling for r+d of bionanotechnology
- Three case studies:
- Polarizability of carbon nanotubes
- Nanopores for DNA sequencing
- Nanodiscs for experimenting on membrane proteins

- Single-walled carbon nanotubes (SWNT) can be used for in vivo biosensor applications.
- SWNTs are highly-polarizable
- Used MD to develop accurate description of the influence of biomolecules on nanotube electronic properties.
- Furnishes ability to test nanotube electronic properties in realistic biological environments.

- Studied water-nanotube and ion-nanotube interactions.
- Engineers seek to control the transport of ions or charged molecules through SWNTs by means of external electric fields.

(a) Top view (b) side view of an ideal 12-section SWNT. © Potential energy profiles for same SWNT interacting with a water molecule of fixed orientation at various positions along tube axis.

- With current technology, individual genomes determined to 99.99% accuracy within 2 months for 10 million dollars.
- Describes a device that may provide alternative

- 2-nm-diameter pore in a 2-5nm thick synthetic membrane.
- Sequence of a DNA molecule can be read by such a device through a semiconductor detector in the pore
- A,C,G, and T nucleotides differ by only a few atoms
- Essential to characterize conformation of DNA inside the pore in atomic detail

- A 1.4V bias applied to membrane.
- 20 base-pair fragment of double stranded DNA placed in front of a nanopore.
- End of DNA nearest to the pore is pulled into the pore by its charged backbone (a,b)
- System reaches a meta-stable state (c) and translocation halts.
- Base pairs start to split. Some freed nucleotides adhere to pore surface.
- Voltage increased momentarily to drive system out of metastable state.
- DNA exits pore. One of the bases holds firmly to the pore surface.
- After 50ns, most of DNA has left pore. Nine of twenty base pairs are split.

- Nanodiscs provide a platform in which to embed, solubilize, and study single membrane proteins.
- Membrane proteins are difficult to study experimentally.
- Often studied under non-native conditions, with artifacts unknown.

- Using an optimized starting ratio of protein to lipid, homogeneous nanodiscs form with discrete size and composition.
- Predict structure of nandodiscs by simulating the experimental self-assembly.
- To account for full process, used Marrink et-al coarse-grained lipid-water model.
- Protein amino acid residues are represented by 2 beads
- DPPC lipids represented by 12 beads.
- 4 water molecules are represented 1 bead.

- Protein folds from 10 to 100s of s. Longest simulations ~1 s.
- Can we speed up MD?
- Would like:
- larger systems
- longer runs
- more trajectories

- approaches:
- Mix in experimental data
- Steered MD
- Massive parallel computation
- Course-grained methods

Steered MD example Bacteriorhodopsin

Steered molecular dynamics, performed by applying a series of external forces to retinal, allow one to extract retinal from bR once the Schiff base bond to Lys216 is cleaved

- Review of deterministic method for molecular dynamics
- Two examples of approaches for making MD ‘faster’
- Questions

- Initialize system
- Ensure particles do not overlap in initial positions (can use lattice)
- Randomly assign velocities.

- Compute Energy
Etotal =

Ebond + Etorsion + Ebond-angle + Enon-bond

- Get velocities of atoms, then move for timestep (~1fs)

{r(t+Dt), v(t+Dt)}

{r(t), v(t)}

- Collect large number of trajectories using traditional MD, and see if folding rates agree with lab results.
- Exploit geometry of the system to remove number of nodes that must be considered.

- Statistically compared computational predictions and experimentally determined mean folding times for small protein (BBA5).
- Used massively distributed cluster to accumulate thousands of 5 to 20 ns trajectories.
- Found computational predictions in excellent agreement with experimentally determined mean folding times and equilibrium times.

- Similar to SETI@home, volunteers run screensaver software to donate unused CPU cycles.
- Since October 1, 2000, over 1,000,000 CPUs throughout the world have participated in Folding@Home.

- Simulated two BBA5 mutants
- Considered fast-folding, mean folding time 10 s
- 23 residues
- Has -hairpin turn and -helix motif

- Statistically, 10 out of 10,000 molecules will be folded after 10 ns.
- Qualitatively reasonable but arbitrary criteria used to discriminate folded structures.
- Each final conformation aligned to C positions of low energy BBA5 NMR structure and calculated r.m.s.d.
- A conformation was folded if it had the helix, hairpin, and low r.m.s.d.C

- Acquired a total of 32500 trajectories
- Observed -hairpin in 1100 trajectories
- Observed -helix in 21000 trajectories
- Of 9000 double-mutant trajectories, 16 were folded after 20ns.

- Criteria for discrimination of folded structures is arbitrary
- Does not represent structure prediction, because relies on C positions of BBA5
- Native state flexibility may be indication that force field is insufficiently accurate
- Experimental folding rate uncertainty large (instrument latencies)

- Contributes to body of knowledge about rapid protein dynamics.
- Found following consistent between simulation and experimental evidence :
- Helical structure in the unfolded state
- Rate of helix formation
- Rate of hairpin formation
- Rate for folding

- “The ability to investigate directly folding dynamics and heterogeneity with molecular dynamics will bring molecular dynamics to the same prominence in protein folding that it has already achieved in structural modeling.”

- Old method
- MD considers all 3N degrees of motion
- Accurate trajectories for atoms only obtained with short time step (~femtosecond)
- 10ns requires weeks of CPU time to compute
- Most biologically relevant motions take place on milliseconds to seconds timescale

- New method
- Geometric simulation for proteins

- Not time-dependent, but distance dependent
- Obtain collections of possible conformations, which define a trajectory that respects stereo-chemistry
- Measure of the progression of motion is distance rather than time:
- Mobility is monitored by RMSD from a reference conformer.

- FIRST - Find which regions are rigid and which are flexible
- FRODA - generate sterically accurate motions of a protein using rigidity information

- Floppy Inclusion and Rigid Substructure Topography
- Given a structure (graph of atoms and covalent bonds), identify rigid and flexible regions

- Framework Rigidity Optimized Dynamic Algorithm
- Finds continuous pathways as it traverses allowed regions within conformational phase space.
- Evaluates non-covalent interaction constraints (i.e. steric clashes and hydrophobic tethers) simultaneously with the geometric constraints.

- Perform static rigidity analysis on the protein (FIRST)
- Perform geometric simulation (FRODA)

- Protein is a network. All covalent bond lengths and angles are constrained.

Ring of 6 atoms and bonds

G

A bond-bending graph G is rigid in R3 iff

e 3n- 6for every subgraph of G having n graph vertices and e graph edges.

G2

Bond Bending Graph:

|V|=6, |E| = 12

|E| = 3|V|-6

- A subgraph is “over-constrained” if e > 3n-6
- If you remove an edge, component is still rigid

- A subgraph is “under-constrained” if e < 3n-6
- Component has 3n-6 - e degrees of freedom.

Under-constrained

e = 5 n = 4 3n - 6 = 9

over-constrained

e = 8 n = 4 3n - 6 = 5

- Brute force method to find all rigid subcomponents takes exponential time (2N subgraphs in any graph).
- FIRST uses pebble game algorithm, which takes O(N2) time, and scales linearly in practice.
- Essence of pebble game: an algorithm for distributing the degrees of freedom belonging to the atoms (‘pebbles’) over the bonds (‘constraints’) to determine the rigidity.
- More on pebble games at http://flexweb.asu.edu

- decomposes a protein down to its rigid and flexible components

- Uses geometrical constraints
- Claim: geometric simulation is an efficient method to obtain the mobility that results from flexibility.
- Uses creative algorithm for generating motion while enforcing constraints
- Replace interatomic potentials by fictitious rigid bodies (the ghost templates)
- Atoms bound to the vertices of ghost templates.

- A ghost template is a rigid body that consists of set of mutually rigid atoms
- Since they are rigid only 6 DOF
- Atoms are bound to vertices of ghost templates

- Defined by least-square fit of atoms
- Vertex corresponds to specific edge

Fit templates

Perturb

Fit atoms

The motion of an ethane molecule (a) initial atomic positions; (b) ghost templates; (c) random atomic displacement; (d) fitting of ghost templates to atoms; (e) refitting of atoms to ghost templates; (f) and (g) further iterations of (d) and (e); (h) until a new conformer is found.

- Small bias to the atomic motion during the fitting (atoms to templates) process
- Cutoff of 62% of radius

- Sterics are always ensured (no rejections)
- Similarly, hydrophobic tethers are ensured
- Bias given if two atoms with hydrophobic tether move further apart than sum of radii plus .5 Å

- Individual step ~.1 to ~.4Å RMSD
- If algorithm does not converge in a fixed number of steps, restart from earlier conformer

50 CPU minutes!

Barnase

110 residue

Above left: NMR ensemble of 20 conformers

Above right: single x-ray structure plus FRODA conformers

Right: RMSD for each residue, determined from figures above.

- Given a starting point and ending point can we find a path?
- Rather than randomly perturb atoms, bias towards desired structure
- Include RMSD to target conformer in ghost/atom fitting

- Can generate a pathway from one conformer to the other.

- Rather than randomly perturb atoms, bias towards desired structure

DHFR

closed to occluded

Enzymatic protein

Has three very similar conformations (open, closed, and occluded), most significant differences in residues 14-24 (M-20 loop)/

Mobile loop in closed (red) and occluded (blue) positions.

From closed to occluded

- FRODA can produce a sterically correct pathway (all motions within allowed regions).
- Does not show time-dependency.
- A flexible motion of a typical protein (~100 residues) investigated in 10 to 100 min.

Massively parallel traditional MD

Can find trajectories that are at most ~10 to ~20ns time duration.

Time dependent. Most accurate simulation of actual motion of each atoms

Geometric abstraction method

Can find a full sterically correct pathway from initial conformation to target conformation

Not time dependent. Not an accurate simulation of motion of each atom.

- Can we make some relaxations to MD while minimizing loss of accuracy?
- Can we make geometric simulation time-dependent?
- Any other ideas for course-grained methods?

- Understanding Molecular Simulation: from Algorithms to Applications. Daan Frenkel, Berend Smith. 2002. Academic Press. San Diego, California.
- Deyu Lu, Aleksei Aksimentiev, AmyY. Shih, Eduardo Cruz-Chu, PeterL. Freddolino, Anton Arkhipov, and Klaus Schulten. The role of molecular modeling in bionanotechnology. Physical Biology, 3:S40-S53, 2006.
- Snow CD, Nguyen H, Pande VS, Gruebele M. Absolute comparison of simulated and experimental protein-folding dynamics. Nature 420(6911):33-4, Nov 7 2002.
- Stephen Wells, Scott Menor, Brandon Hespenheide, and MF Thorpe. Constrained geometric simulation of diffusive motion in proteins. Physical Biology 2:S127-S136, 2005.