Bioinformatics Data Analysis & Tools

PowerPoint Slideshow about ' Bioinformatics Data Analysis & Tools' - agnes



Molecular simulations & sampling techniques

Molecular Simulations: Brief History

Protein flexibility
  • Even a correctly folded protein is dynamic
    • A crystal structure yields the average positions of the atoms
    • Overall ‘breathing’ motion is possible

B-factors
  • The average motion of an atom around its average position

[Figure panels: alpha helices; beta-sheet]
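
As a rough illustration of what a B-factor measures, the sketch below assumes the standard isotropic relation B = (8π²/3)·⟨|Δr|²⟩, where Δr is the displacement of an atom from its average position. The function name `b_factor` and the input format (a list of positions in Å) are assumptions made for this example and do not come from the slides.

```python
import math

def b_factor(positions):
    """Isotropic B-factor (in Å^2) from a list of (x, y, z) positions in Å.

    Assumes B = (8 * pi^2 / 3) * <|r - <r>|^2>.
    """
    n = len(positions)
    # Average position of the atom over all frames
    mean = [sum(p[k] for p in positions) / n for k in range(3)]
    # Mean-square displacement around the average position
    msd = sum(
        sum((p[k] - mean[k]) ** 2 for k in range(3)) for p in positions
    ) / n
    return 8.0 * math.pi ** 2 / 3.0 * msd

# Hypothetical atom oscillating between two positions 2 Å apart
example_b = b_factor([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)])
```

A larger spread of positions gives a larger B-factor, which is why flexible loops usually show higher values than the cores of helices and sheets.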

Peptide folding from simulation
  • A small (beta-)peptide forms helical structure according to NMR
  • Computer simulations of the atomic motions: molecular dynamics


Folding and un-folding in 200 ns
  • Folded structures: all the same
  • Unfolded structures: all different? How different?
    • $3^{21} \approx 10^{10}$ possibilities!


Temperature dependence
  • Folding equilibrium (unfolded ⇌ folded) depends on temperature
  • [Figure panels: 360 K, 350 K, 340 K, 320 K, 298 K]


Pressure dependence
  • Folding equilibrium (unfolded ⇌ folded) depends on pressure
  • [Figure panels: 2000 atm, 1000 atm, 1 atm]

Surprising result
  • The number of relevant non-folded structures is very much smaller than the number of possible non-folded structures
  • If the number of relevant non-folded structures increases proportionally with the folding time, only $10^{9}$ protein structures need to be simulated instead of $10^{90}$ structures
  • Folding mechanism perhaps simpler after all…

Phase Space
  • Defines state of classical system of N particles:
    • coordinates $q = (x_1, y_1, z_1, x_2, \dots, z_N)$
    • momenta $p = (p_{x_1}, p_{y_1}, p_{z_1}, p_{x_2}, \dots, p_{z_N})$
  • One conformation (+ momenta) is one point (p,q) in phase space
  • Motion is a curved line in phase space
    • trajectory: (p(t),q(t))

Molecular Motions: Time & Length-scales

Newton Dynamics

[Figure: Sir Isaac Newton; the system at time $t$ and at time $t + \Delta t$]

Classical (Newton) Mechanics
  • A system has coordinates q and momenta p (= mv):

$p = (p_1, p_2, \dots, p_N)$

$q = (q_1, q_2, \dots, q_N)$

  • Together, p and q span the phase space (q alone spans the configuration space).
  • The total energy can be split into two components:
    • kinetic energy (K):

$K(p) = \tfrac{1}{2} m v^2 = \tfrac{1}{2} p^2 / m$

    • potential energy (V):

$V(q)$ depends on the interaction(s)

  • The potential energy is described by
    • bonded interactions (e.g. bond stretching, angle bending)
    • non-bonded interactions (e.g. van der Waals, electrostatic)
  • Non-bonded interactions determine the conformational variation that we observe for example in protein motions.

The Hamilton Function
  • The Hamiltonian function represents the total energy: $H(p,q) = K(p) + V(q)$
  • It is the generalised expression of classical mechanics
  • In two differential expressions:

$\dot{p}_k = \frac{dp_k}{dt} = -\frac{\partial H}{\partial q_k}$

$\dot{q}_k = \frac{dq_k}{dt} = \frac{\partial H}{\partial p_k}$

  • These are Newton's equations of motion, but in a very elegant way
  • Use 'generalised coordinates' (p and q):
    • can use any coordinate system
      • e.g., Cartesian coordinates or Euler angles
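
To make the link with Newton's equations explicit, here is a short worked derivation for a single Cartesian coordinate. It assumes the usual kinetic energy $K(p) = p^2/2m$ and writes the force as $F(q) = -dV/dq$; the symbol $F$ is introduced here for illustration.

```latex
\begin{aligned}
H(p,q) &= \frac{p^2}{2m} + V(q) \\
\dot{q} &= \frac{\partial H}{\partial p} = \frac{p}{m} \\
\dot{p} &= -\frac{\partial H}{\partial q} = -\frac{dV}{dq} = F(q) \\
\Rightarrow\quad m\,\ddot{q} &= \dot{p} = F(q)
\end{aligned}
```

The last line is Newton's second law, so in Cartesian coordinates the two descriptions coincide.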

Hamilton's Principle
  • The variation of the time integral of $p\,\dot{q} - H(p,q)$ vanishes: $\delta \int_{t_1}^{t_2} \left( p\,\dot{q} - H(p,q) \right) dt = 0$
  • Hamilton's principle is most fundamental
    • Newton's equations of motion are only one set of equations that can be derived from Hamilton's principle.
  • The integral is called the 'action', meaning:
    • If we integrate along the trajectory of an object in the phase space given by positions q and momenta p between time points (integration limits) $t_1$ and $t_2$, then the value of the integral (= the 'action') of a 'real' trajectory is a minimum (more precisely, an extremum) compared to all other trajectories.
  • Example: Why does a thrown stone follow a parabolic trajectory?
    • If you vary the trajectory and calculate the action, the parabolic trajectory will yield the smallest 'action'.

Harmonic oscillator:
  • 1-dimensional motion
  • 2 dimensions in phase-space:
    • position (1-dimensional)
    • momentum (1-dimensional)
  • analytical solution for integration:
    • $q(t) = b \cos\left( \sqrt{k/m}\; t \right)$
    • $p(t) = -b \sqrt{mk}\, \sin\left( \sqrt{k/m}\; t \right)$
  • [Figure: $q(t)$ and $p(t)$]
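
The analytical solution can be checked numerically: along the trajectory the Hamiltonian $H = p^2/2m + \tfrac{1}{2} k q^2$ should stay constant at $\tfrac{1}{2} k b^2$. The sketch below assumes that harmonic potential; the function names and parameter values are chosen only for illustration.

```python
import math

def harmonic_state(t, b=1.0, k=1.0, m=1.0):
    """Analytical (q, p) of a 1-D harmonic oscillator released at rest from q = b."""
    omega = math.sqrt(k / m)
    q = b * math.cos(omega * t)
    p = -b * math.sqrt(m * k) * math.sin(omega * t)
    return q, p

def hamiltonian(q, p, k=1.0, m=1.0):
    """Total energy: kinetic plus harmonic potential energy."""
    return p * p / (2.0 * m) + 0.5 * k * q * q

# Illustrative parameters (arbitrary units): b = 2, k = 3, m = 0.5
energies = [
    hamiltonian(*harmonic_state(0.1 * i, b=2.0, k=3.0, m=0.5), k=3.0, m=0.5)
    for i in range(100)
]
```

Every entry of `energies` equals ½·k·b² up to rounding, which means the trajectory is a closed ellipse in the two-dimensional phase space.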

Calculating Averages
  • Integration over phase space:
    • 2 values per coordinate (e.g. up, down):
      • 1 particle: 1×6 degrees of freedom (dof); $2^{6}$ = 64 points
      • 2 particles: 2×6 dof; $2^{12}$ = 4,096 points
      • 3 particles: 3×6 dof; $2^{18}$ = 262,144 points
      • 4 particles: 4×6 dof; $2^{24}$ = 16,777,216 points
  • Do we need the whole of phase space?
    • only low-energy states are relevant
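
The growth in the number of grid points can be reproduced with a few lines. This is only a sketch of the arithmetic, with 6 phase-space dimensions per particle (3 coordinates plus 3 momenta); the function name is hypothetical.

```python
def phase_space_points(n_particles, values_per_dof=2):
    """Number of grid points when every degree of freedom takes a few discrete values."""
    dof = 6 * n_particles  # 3 coordinates + 3 momenta per particle
    return values_per_dof ** dof

counts = {n: phase_space_points(n) for n in range(1, 5)}
```

Even at this very coarse resolution the count grows exponentially with the number of particles, which is why exhaustive integration is hopeless for a protein.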

Solving Complex systems
  • No analytical solutions
  • Numerical integration:
    • by time (Molecular Dynamics)
    • by ensemble (Monte-Carlo)
  • Molecular Dynamics: numerical integration in time
    • Euler’s approximation:
      • $q(t + \Delta t) = q(t) + \frac{p(t)}{m}\,\Delta t$
      • $p(t + \Delta t) = p(t) + m\,a(t)\,\Delta t$
    • Verlet / Leap-frog
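
A minimal sketch of Euler's approximation, applied to a harmonic oscillator with force $-kq$ (an assumption chosen so the result can be checked by hand). For this system each Euler step multiplies the energy by exactly $(1 + (k/m)\Delta t^2)$, so the energy drifts upwards; this is one reason Verlet-type schemes are preferred.

```python
def euler_step(q, p, dt, k=1.0, m=1.0):
    """One explicit Euler step for a harmonic oscillator with force -k*q."""
    a = -k * q / m
    q_new = q + p / m * dt
    p_new = p + m * a * dt
    return q_new, p_new

def energy(q, p, k=1.0, m=1.0):
    return p * p / (2.0 * m) + 0.5 * k * q * q

dt, n_steps = 0.01, 1000
q, p = 1.0, 0.0
e_start = energy(q, p)
for _ in range(n_steps):
    q, p = euler_step(q, p, dt)
e_end = energy(q, p)
energy_ratio = e_end / e_start  # grows with every step
```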

Features of Newton Dynamics
  • Newton’s equations:
    • Energy conserving
    • Time reversible
    • Deterministic
  • Numerical integration by the Verlet algorithm (‘simulation’): $r(t + \Delta t) \approx 2 r(t) - r(t - \Delta t) + \frac{F(t)}{m} \Delta t^2 \; [\, + O(\Delta t^4) \,]$
  • In a ‘real’ simulation: rounding errors (cumulative):
    → not fully reversible
    → no full energy conservation
      • Coupling to thermal bath → re-scaling
    → not fully deterministic
      • ‘Lyapunov’ instability → trajectories diverge
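
Time reversibility of the Verlet scheme can be illustrated with a small experiment: integrate forward, swap the last two positions, and integrate the same number of steps again. The sketch assumes a harmonic force $-kq$ with k = m = 1; all names are illustrative. In exact arithmetic the starting point is recovered exactly, and in floating point only up to accumulated rounding error.

```python
def verlet_step(q_prev, q_curr, dt, k=1.0, m=1.0):
    """Position Verlet: q(t+dt) = 2 q(t) - q(t-dt) + a(t) dt^2."""
    a = -k * q_curr / m
    return 2.0 * q_curr - q_prev + a * dt * dt

def run(q_prev, q_curr, n_steps, dt):
    """Advance the pair (q(t-dt), q(t)) by n_steps Verlet steps."""
    for _ in range(n_steps):
        q_prev, q_curr = q_curr, verlet_step(q_prev, q_curr, dt)
    return q_prev, q_curr

dt = 0.01
q0 = 1.0
q1 = q0 - 0.5 * q0 * dt * dt         # start at rest: q(dt) ~ q0 + a dt^2 / 2, a = -q0

qa, qb = run(q0, q1, 1000, dt)       # forward in time
back1, back0 = run(qb, qa, 1000, dt) # reversed: feed the last two positions swapped
reversal_error = abs(back0 - q0)
```

The residual `reversal_error` is tiny but in general not exactly zero, which is the "not fully reversible" caveat in practice.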

Derivation: Verlet
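  • Taylor expansion:
    • $q(t+\Delta t) = q(t) + q'(t)\Delta t + \frac{1}{2!} q''(t)\Delta t^2 + \frac{1}{3!} q'''(t)\Delta t^3 + \dots$
      • where: $q'(t) = v(t)$ (1st derivative, velocity)
      • and: $q''(t) = a(t)$ (2nd derivative, acceleration)
  • Add the forward and backward expansions:

$q(t+\Delta t) = q(t) + q'(t)\Delta t + \frac{1}{2!} q''(t)\Delta t^2 + \frac{1}{3!} q'''(t)\Delta t^3 + O(\Delta t^4)$

$q(t-\Delta t) = q(t) - q'(t)\Delta t + \frac{1}{2!} q''(t)\Delta t^2 - \frac{1}{3!} q'''(t)\Delta t^3 + O(\Delta t^4)$

$q(t+\Delta t) + q(t-\Delta t) = 2q(t) + 2\cdot\frac{1}{2!} q''(t)\Delta t^2 + O(\Delta t^4)$

    • Rearrange:

$q(t+\Delta t) = 2q(t) - q(t-\Delta t) + a(t)\Delta t^2 + O(\Delta t^4)$

  • Only terms up to 2nd order appear, but the 3rd-order terms cancel, so the step is accurate through 3rd order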
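
The resulting update rule is short enough to try directly. The sketch below integrates a harmonic oscillator (k = m = 1, chosen so that the exact answer is cos t) and compares it with the analytical solution. The first step uses a plain Taylor expansion to obtain q(Δt), which is a common choice but an assumption here.

```python
import math

def verlet_trajectory(q0, v0, dt, n_steps, k=1.0, m=1.0):
    """Positions q(0), q(dt), ..., q(n_steps*dt) from the Verlet update rule."""
    def acc(q):
        return -k * q / m

    q_prev = q0
    # Bootstrap the first step with a 2nd-order Taylor expansion
    q_curr = q0 + v0 * dt + 0.5 * acc(q0) * dt * dt
    traj = [q_prev, q_curr]
    for _ in range(n_steps - 1):
        q_next = 2.0 * q_curr - q_prev + acc(q_curr) * dt * dt
        traj.append(q_next)
        q_prev, q_curr = q_curr, q_next
    return traj

dt = 0.01
traj = verlet_trajectory(q0=1.0, v0=0.0, dt=dt, n_steps=1000)
max_error = max(abs(q - math.cos(i * dt)) for i, q in enumerate(traj))
```

Velocities do not appear in the update; if they are needed they can be estimated afterwards, for example as (q(t+Δt) − q(t−Δt)) / 2Δt.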

What do we obtain?
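  • Trajectory: $q(t)$ and $p(t)$
  • Probability of occurrence: $P(p,q) = \frac{1}{Z} e^{-H(p,q)/kT}$
  • Averages along the trajectory: $\langle A(p,q) \rangle_T = \frac{1}{T} \int_0^T A(p(t), q(t))\, dt$ (where T in $\langle \cdot \rangle_T$ and in the integral denotes total time, not temperature!)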

Convergence
  • Amount of phase-space covered
    • “Sampling”
  • Impossible to prove: you cannot know what you don’t know
  • Energy “landscape” in phase-space
    • there might be a “next valley”

Example: Convergence (1)

Example: Convergence (2)

Example: Convergence (3)
  • Apparent convergence on all timescales, 100 ps – 10 ns!

Efficiency
  • Time step limited by vibrational frequencies
    • heavy-atom–hydrogen bond vibration: $10^{-14}$ s (10 fs)
    • 10-20 integration steps per vibrational period:
      • 0.5 fs time step; 2,000,000 steps for 1 ns
  • Removal of fast vibrations (constraining):
    • hydrogen atom bond and angle motion
    • heavy-atom bond motion
    • out-of-plane motions (e.g. aromatic groups)
  • In practice: 1-2 fs time step
    • 5-7 fs maximum
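
The step counts follow from simple unit arithmetic (1 ns = 10^6 fs). A small sketch, with a hypothetical helper name:

```python
def steps_needed(total_ns, dt_fs):
    """Number of integration steps to cover total_ns nanoseconds with a dt_fs femtosecond step."""
    fs_per_ns = 1_000_000
    return round(total_ns * fs_per_ns / dt_fs)

steps_unconstrained = steps_needed(1.0, 0.5)  # 0.5 fs step, all vibrations kept
steps_constrained = steps_needed(1.0, 2.0)    # 2 fs step, fast vibrations constrained
```

Going from 0.5 fs to 2 fs cuts the number of force evaluations per nanosecond by a factor of four.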

Constraining
  • to remove degrees of freedom, e.g.:
    • bond i-j vibrations → keep distance i-j constant
    • angle i-j-k vibrations → keep distance i-k constant
  • Constraint Algorithms
    • SHAKE
      • iterative adjustment of Lagrange multipliers
    • LINCS
      • Taylor expansion of matrix inversion
      • non-iterative (more stable)
      • not for highly connected constraints
    • SETTLE
      • Analytical Solution
        • for symmetric 3-atom molecules (like water)
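
To give an idea of how an iterative constraint algorithm works, here is a minimal SHAKE-style sketch for a single bond between two atoms. It is a simplified illustration rather than the implementation of any particular MD package: the function name, the tolerance and the example coordinates are assumptions. After an unconstrained step, both atoms are displaced along the old bond vector, weighted by inverse mass, until the bond length is restored.

```python
import math

def shake_pair(r_i_old, r_j_old, r_i_new, r_j_new, d,
               m_i=1.0, m_j=1.0, tol=1e-10, max_iter=100):
    """Restore |r_i - r_j| = d after an unconstrained step (single constraint)."""
    # Bond vector before the step: corrections are applied along this direction
    ref = [a - b for a, b in zip(r_i_old, r_j_old)]
    r_i, r_j = list(r_i_new), list(r_j_new)
    for _ in range(max_iter):
        s = [a - b for a, b in zip(r_i, r_j)]
        diff = sum(x * x for x in s) - d * d
        if abs(diff) < tol:
            break
        # First-order estimate of the Lagrange multiplier
        g = diff / (2.0 * (1.0 / m_i + 1.0 / m_j)
                    * sum(a * b for a, b in zip(s, ref)))
        r_i = [a - g * b / m_i for a, b in zip(r_i, ref)]
        r_j = [a + g * b / m_j for a, b in zip(r_j, ref)]
    return r_i, r_j

# Hypothetical example: a bond of length 1 that was stretched by the integrator
r_i, r_j = shake_pair((0.0, 0.0, 0.0), (1.0, 0.0, 0.0),
                      (0.0, 0.05, 0.0), (1.1, -0.02, 0.0), d=1.0)
bond_length = math.dist(r_i, r_j)
```

With several coupled constraints the same correction is applied to each constraint in turn and the loop is repeated until all of them are satisfied.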

Improving Performance
  • Pairwise potential: $F_{ij} = -F_{ji}$
  • Potential $E(r) \approx 0$ at large r: cut-off
    • Coulomb: ~ $1/r$
    • Lennard-Jones: ~ $1/r^6$
  • Atoms move little in one step: pair-list
    • Evaluating r is expensive: $r = \sqrt{|\mathbf{r}_j - \mathbf{r}_i|^2}$
  • Large distances change less: twin-range
    • short range each step; long range less often
  • Multiple time-step methods
  • Many processor/compiler/language-specific optimizations:
    • use of Fortran vs. C
    • optimize cache performance
      • arrays of positions, velocities, forces, parameters are very large
    • compiler optimizations
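
A cut-off pair-list can be sketched in a few lines. The example below is a naive O(N²) construction without periodic boundaries (both simplifying assumptions). It compares squared distances with the squared cut-off, so no square root is needed, and it stores each pair only once because $F_{ij} = -F_{ji}$.

```python
def build_pair_list(positions, cutoff):
    """Return all index pairs (i, j), i < j, that are closer than the cut-off."""
    cutoff_sq = cutoff * cutoff
    pairs = []
    n = len(positions)
    for i in range(n - 1):
        xi, yi, zi = positions[i]
        for j in range(i + 1, n):
            xj, yj, zj = positions[j]
            dx, dy, dz = xj - xi, yj - yi, zj - zi
            # Compare squared distances: avoids the expensive square root
            if dx * dx + dy * dy + dz * dz < cutoff_sq:
                pairs.append((i, j))
    return pairs

# Hypothetical coordinates (arbitrary length units)
pairs = build_pair_list([(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (3.0, 0.0, 0.0)], cutoff=1.0)
```

In a simulation the list would be rebuilt only every few steps, since atoms move little within one step.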

Ignoring Degrees of Freedom
  • Internal:
    • bonds, angles → Constraint algorithm
      • larger time steps
  • External:
    • “Solvent” → Langevin dynamics
      • fewer (explicit) particles
    • Inertia & “solvent” → Brownian dynamics
      • larger time steps

Trajectory on Energy Surface

Sampling in Conformational Space
  • Most of the computational time is spent on calculating (local, harmonic) vibrations.
  • [Figure labels: $\Delta E \gg kT$; energy; vibration; entropy]

Barriers
  • Kitao et al. (1998) Proteins 33, 496-517.

Psychology of Theorists

Optimist scale, from 100% down to 0%:

“In theory, there should be no difference between theory and practice. In practice, however, there is always a difference...” (Witten and Frank)

“For every complex question there is a simple and wrong solution.” (Albert Einstein)

“All models are wrong, but some are useful.” (George Box)

Monte Carlo Sampling
  • Ergodic hypothesis:
    • Sampling over time (Molecular Dynamics approach); and
    • Ensemble averaging (Monte Carlo approach)
  • Yield the same result:

$\rho(r) = \langle \rho_i(r) \rangle_{NVE}$

  • Detailed balance condition:

$p(o)\, p(o \to n) = p(n)\, p(n \to o)$

Metropolis Selection Scheme
  • Metropolis acceptance rule that satisfies detailed balance:
    • $acc(o \to n) = p(n)/p(o) = e^{-\Delta E/kT}$ if $p(n) < p(o)$
    • $acc(o \to n) = 1$ if $p(n) \ge p(o)$

→ Metropolis Monte Carlo

  • Ergodic probability density for configurations around $r^N$: $p(r^N) = \frac{e^{-E/kT}}{\sum e^{-E/kT}}$
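
The acceptance rule translates almost literally into code. Below is a minimal Metropolis Monte Carlo sketch for one particle in a harmonic potential $E(x) = \tfrac{1}{2} k x^2$. The potential, the move size and the fixed random seed are assumptions made for illustration. For this potential the Boltzmann distribution gives $\langle x^2 \rangle = kT/k$, which provides a simple check.

```python
import math
import random

def metropolis_harmonic(n_steps, kT=1.0, k=1.0, max_move=2.0, seed=0):
    """Sample x from exp(-E(x)/kT) with E(x) = k x^2 / 2 using Metropolis moves."""
    rng = random.Random(seed)
    x = 0.0
    energy = 0.5 * k * x * x
    samples = []
    for _ in range(n_steps):
        x_new = x + rng.uniform(-max_move, max_move)   # symmetric trial move
        e_new = 0.5 * k * x_new * x_new
        d_e = e_new - energy
        # Accept downhill moves always, uphill moves with probability exp(-dE/kT)
        if d_e <= 0.0 or rng.random() < math.exp(-d_e / kT):
            x, energy = x_new, e_new
        samples.append(x)   # a rejected move counts the old state again
    return samples

samples = metropolis_harmonic(200_000)
mean_x2 = sum(x * x for x in samples) / len(samples)
```

Note that rejected moves still contribute a sample; leaving them out would bias the averages.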

Search Strategies

Leaps

Computational Scheme
  • Reduction of the leaps will lead to classical dynamics
  • Control parameter:
    • RMSD
    • Angle deviation

Computational Load: Solvation
  • Most computational time (>95%) spent on calculating (bulk) water-water interactions

Implicit Solvation

POPS
  • Solvent accessible area
    • fast and accurate area calculation
    • resolution:
      • POPS-A (per atom)
      • POPS-R (per residue)
    • parametrised on 120000 atoms and 12000 residues
    • differentiable → usable in MD
  • Free energy of solvation: $\Delta G^{solv}_i = \mathrm{area}_i \cdot \sigma_i$
  • POPS is implemented in GROMOS96
  • parameters 'sigma' from simulations in water:
    • amino acids in helix, sheet and extended conformation
    • peptides in helix and sheet conformation
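
The area-based solvation term is a simple weighted sum over atoms (or residues). The sketch below only illustrates the formula: the areas and 'sigma' values are made-up numbers, not POPS parameters, and the function is not part of the POPS or GROMOS96 interface.

```python
def solvation_free_energy(areas, sigmas):
    """Sum of area_i * sigma_i over all atoms (or residues)."""
    if len(areas) != len(sigmas):
        raise ValueError("need one sigma per area")
    return sum(a * s for a, s in zip(areas, sigmas))

# Hypothetical values: areas in nm^2, sigmas in kJ mol^-1 nm^-2
total_dg = solvation_free_energy([10.0, 20.0], [0.5, -0.1])
```

Because the area is a differentiable function of the coordinates, the same sum also yields forces, which is what makes the term usable in MD.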

POPS server

Test molecules: alanine dipeptide

Test molecules: BPTI / Y35G-BPTI

[Figure panels: Classical MD; Leap-dynamics; Essential dynamics]

Calmodulin domains
  • Apparent unfolding temperatures (CD)
    • C-domain: 315 K (42 °C)
    • N-domain: 328 K (55 °C)
  • LD simulations:
    • 3 ns
    • 4 trajectories
      • 290 K
      • 325 K
      • 360 K

Snapshots

Trajectories

Example: Protein & Ligand Dynamics

Example: Essential Dynamics Analysis

Cyt-P450 BM3: 7 × 10 ns “free” MD simulations

CD

Comparison CD / simulation

Example: Minima

Example: Conformations

Levinthal’s paradox
  • Protein folding problem:
    • Predict the 3D structure from the sequence
    • Understand the folding process


Folding energy
  • Each protein conformation has a certain energy and a certain flexibility (entropy)
  • Corresponds to a point on a multidimensional free energy surface
    • Three coordinates per atom: 3N − 6 dimensions possible
  • $\Delta G = \Delta H - T \Delta S$
  • [Figure: energy E(x) along coordinate x; one state may have higher energy but lower free energy than another]
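
The relation ΔG = ΔH − TΔS also explains why a folding equilibrium shifts with temperature. The sketch below assumes a simple two-state model (folded ⇌ unfolded) with temperature-independent ΔH and ΔS of unfolding; the numerical values are hypothetical and chosen only to give a midpoint near 333 K.

```python
import math

R = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1

def fraction_folded(temperature, dh_unfold, ds_unfold):
    """Two-state model: fraction folded = 1 / (1 + K), K = exp(-dG_unfold / RT)."""
    dg_unfold = dh_unfold - temperature * ds_unfold
    return 1.0 / (1.0 + math.exp(-dg_unfold / (R * temperature)))

# Hypothetical unfolding enthalpy (kJ/mol) and entropy (kJ/mol/K)
DH, DS = 200.0, 0.6
t_mid = DH / DS  # temperature at which dG = 0, i.e. half of the molecules are folded
fractions = {t: fraction_folded(t, DH, DS) for t in (298.0, t_mid, 360.0)}
```

Below the midpoint the enthalpy term dominates and the folded state is favoured; above it the entropy term takes over.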

Folded state
  • Native state = lowest point on the free energy landscape
  • Many possible routes
  • Many possible local minima (misfolded structures)

Molten globule
  • First step: hydrophobic collapse
  • Molten globule: globular structure, not yet correctly folded
  • Local minimum on the free energy surface

Force Field

“the collection of all forces that we consider to occur in a mechanical atomic system”

  • A generalised description:

$E_{\text{total}} = E_{\text{bonded}} + E_{\text{non-bonded}} + E_{\text{crossterm}}$

  • Crossterms:
    • non-bonded interactions influence the bonded interactions (and vice versa).
    • Some force fields neglect those terms.
  • Note that force fields are (mostly) designed for pairwise atom interactions.
    • Higher order interactions are implicitly included in the pairwise interaction parameters.

Force Field Components: Bonded Interactions

Force Field Components: Non-Bonded Interactions

All Together…

Reduced Units
  • Generalise description of (atomic) systems
    • express all quantities in basic units derived from the system's dimensions
  • For example, a Lennard-Jones interaction: $V_{LJ} = \epsilon f(r/\sigma)$, where $\epsilon$ is the characteristic interaction energy and $\sigma$ is the characteristic distance
  • Choose basic units:
    • unit of length, $\sigma$
    • unit of energy, $\epsilon$
    • unit of mass, m (mass of the atoms in the system)
  • all other units can be derived from these, e.g.:
    • time: $\sigma \sqrt{m/\epsilon}$
    • temperature: $\epsilon / k_B$

(from: Frenkel and Smit, 'Understanding Molecular Simulation', Academic Press.)

  • Other choices, e.g., ‘MD’ units:
    • length nm ($10^{-9}$ m), mass u, time ps ($10^{-12}$ s), charge e, temperature K
    • energy kJ mol$^{-1}$, velocity nm ps$^{-1}$, pressure kJ mol$^{-1}$ nm$^{-3}$
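
As a worked example of converting reduced units back to real units, the sketch below uses commonly quoted Lennard-Jones parameters for argon ($\epsilon/k_B$ = 119.8 K, $\sigma$ = 3.405 Å, m = 39.948 u). These values are brought in for illustration and are not taken from the slides.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
AMU = 1.66053906660e-27   # atomic mass unit, kg

def reduced_time_unit(sigma_m, epsilon_j, mass_kg):
    """Unit of time in reduced units: sigma * sqrt(m / epsilon), in seconds."""
    return sigma_m * math.sqrt(mass_kg / epsilon_j)

# Assumed argon parameters
sigma = 3.405e-10          # m
epsilon = 119.8 * K_B      # J
mass = 39.948 * AMU        # kg

tau_ps = reduced_time_unit(sigma, epsilon, mass) * 1e12   # reduced time unit in ps
temperature_unit_k = epsilon / K_B                        # reduced temperature unit in K
```

With these parameters one reduced time unit corresponds to roughly 2 ps, so a reduced time step of 0.005 is on the order of 10 fs.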

Main points
