Bioinformatics practical application of simulation and data mining protein folding i
This presentation is the property of its rightful owner.
Sponsored Links
1 / 22

Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding I PowerPoint PPT Presentation


  • 54 Views
  • Uploaded on
  • Presentation posted in: General

Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding I. Prof. Corey O’Hern Department of Mechanical Engineering Department of Physics Yale University. Statistical Mechanics. Protein Folding. Biochemistry. What are proteins?. Linear polymer. Folded state.

Download Presentation

Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding I

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Bioinformatics practical application of simulation and data mining protein folding i

Bioinformatics: Practical Application of Simulation and Data MiningProtein Folding I

  • Prof. Corey O’Hern

  • Department of Mechanical Engineering

  • Department of Physics

  • Yale University


Bioinformatics practical application of simulation and data mining protein folding i

Statistical

Mechanics

Protein

Folding

Biochemistry


Bioinformatics practical application of simulation and data mining protein folding i

What are proteins?

Linear polymer

Folded state

  • Proteins are important; e.g. for catalyzing and regulating biochemical reactions,

  • transporting molecules, …

  • Linear polymer chain composed of tens (peptides) to thousands (proteins) of monomers

  • Monomers are 20 naturally occurring amino acids

  • Different proteins have different amino acid sequences

  • Structureless, extended unfolded state

  • Compact, ‘unique’ native folded state (with secondary and tertiary structure) required

    for biological function

  • Sequence determines protein structure (or lack thereof)

  • Proteins unfold or denature with increasing temperature or chemical denaturants


Bioinformatics practical application of simulation and data mining protein folding i

Amino Acids I

N-terminal

C-terminal

C

peptide

bonds

R

variable

side chain

  • Side chains differentiate amino acid repeat units

  • Peptide bonds link residues into polypeptides


Bioinformatics practical application of simulation and data mining protein folding i

Amino Acids II


The protein folding problem

The Protein Folding Problem:

What is ‘unique’ folded 3D structure of a protein based on its amino acid

sequence? Sequence  Structure

Lys-Asn-Val-Arg-Ser-Lys-Val-Gly-Ser-Thr-Glu-Asn-Ile-Lys- His-Gln-Pro- Gly-Gly-Gly-…


Bioinformatics practical application of simulation and data mining protein folding i

Driving Forces

  • Folding: hydrophobicity, hydrogen bonding, van der

    Waals interactions, …

  • Unfolding: increase in conformational entropy,

    electric charge…

Hydrophobicity index

inside

H (hydrophobic)

outside

P (polar)


Bioinformatics practical application of simulation and data mining protein folding i

Higher-order Structure


Bioinformatics practical application of simulation and data mining protein folding i

Secondary Structure: Loops, -helices, -strands/sheets

-helix

-strand

-sheet

  • Right-handed; three turns

  • Vertical hydrogen bonds between NH2 (teal/white)

    backbone group and C=O (grey/red) backbone group

    four residues earlier in sequence

  • Side chains (R) on outside; point upwards toward NH2

  • Each amino acid corresponds to 100, 1.5Å, 3.6

    amino acids per turn

  • (,)=(-60,-45)

  • -helix propensities: Met, Ala, Leu, Glu

  • 5-10 residues; peptide backbones fully extended

  • NH (blue/white) of one strand hydrogen-bonded

    to C=O (black/red) of another strand

  • C ,side chains (yellow) on adjacent strands

    aligned; side chains along single strand alternate

    up and down

  • (,)=(-135,-135)

  • -strand propensities: Val, Thr, Tyr, Trp,

    Phe, Ile


Bioinformatics practical application of simulation and data mining protein folding i

Backbonde Dihedral Angles


Bioinformatics practical application of simulation and data mining protein folding i

Ramachandran Plot: Determining Steric Clashes

Gly

Non-Gly

Backbone

dihedral angles

theory

4 atoms define dihedral angle:

PDB

CCNC

=0,180

CN CC

N CCN

vdW radii

backbone

flexibility

< vdW radii


Bioinformatics practical application of simulation and data mining protein folding i

How can structures from PDB exist

outside Ramachadran bounds?

  • Studies were performed on alanine dipeptide

  • Fixed bond angle (=110) [105,115]

J. Mol. Biol. (1963) 7, 95-99


Bioinformatics practical application of simulation and data mining protein folding i

Side-Chain Dihedral Angles

4: Lys, Arg

5: Arg

Side chain: C-CH2-CH2-CH2-CH2-NH3

Use NCCCCCN to define 1, 2, 3,4


Bioinformatics practical application of simulation and data mining protein folding i

Your model is oversimplified

and has nothing to do with

biology!

Your model is too complicated

and has no predictive power!

Molecular biologist

Biological Physicist


Bioinformatics practical application of simulation and data mining protein folding i

Possible Strategies for Understanding Protein Folding

  • For all possible conformations, compute free energy

    from atomic interactions within protein and protein-

    solvent interactions; find conformation with lowest free

    energy…e.g using all-atom molecular dynamics

    simulations

Not possible?, limited time resolution

  • Use coarse-grained models with effective interactions

    between residues and residues and solvent

General, but qualitative


Bioinformatics practical application of simulation and data mining protein folding i

Why do proteins fold (correctly & rapidly)??

Levinthal’s paradox:

For a protein with N amino acids, number of backbone conformations/minima

 = # allowed dihedral angles

How does a protein find the global optimum w/o

global search? Proteins fold much faster.

Nc~ 3200 ~1095

fold ~ Nc sample ~1083 s

fold ~ 10-6-10-3 s

vs

universe ~ 1017 s


Bioinformatics practical application of simulation and data mining protein folding i

Energy Landscape

U, F =U-TS

S12

S23

S-1

M2

M3

M1

minimum

all atomic

coordinates;

dihedral angles

saddle point

maximum


Bioinformatics practical application of simulation and data mining protein folding i

Roughness of Energy Landscape

smooth, funneled

rough

(Wolynes et. al. 1997)


Bioinformatics practical application of simulation and data mining protein folding i

Folding Pathways

dead

end

similarity to

native state


Bioinformatics practical application of simulation and data mining protein folding i

Folding Phase Diagram

smooth

rough


Bioinformatics practical application of simulation and data mining protein folding i

Open Questions

  • What differentiates the native state from other low-lying energy minima?

  • How many low-lying energy minima are there? Can we calculate

    landscape roughness from sequence?

  • What determines whether protein will fold to the native state or become

  • trapped in another minimum?

  • What are the pathways in the energy landscape that a given protein

    follows to its native state?

NP Hard Problem!


Bioinformatics practical application of simulation and data mining protein folding i

Digression---Number of Energy Minima for Sticky Spheres

Nm

Ns

Np

4

1

1

sphere

packings

1

5

6

6

2

50

7

5

486

8

13

5500

polymer

packings

9

52

49029

10

-

-

Ns~exp(aNm); Np~exp(bNm) with b>a


  • Login