Protein Structural Prediction

Overview: Protein structure is hierarchical. Structure determines function. The protein folding problem: what determines structure (energy, kinematics), and how can we determine structure (experimental methods, computational predictions)?

PowerPoint Slideshow about 'Protein Structural Prediction' - paul2

Structure Determines Function

The Protein Folding Problem

  • What determines structure?
    • Energy
    • Kinematics
  • How can we determine structure?
    • Experimental methods
    • Computational predictions
Primary Structure: Sequence
  • The primary structure of a protein is the amino acid sequence
  • Twenty different amino acids have distinct shapes and properties

A useful mnemonic for the hydrophobic amino acids is "FAMILY VW"
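
The mnemonic can be turned into a small lookup. The sketch below treats exactly the eight letters of "FAMILY VW" as hydrophobic, which follows the slide rather than any particular hydrophobicity scale; the function name `to_hp` is made up for illustration.

```python
# One-letter codes spelled out by the "FAMILY VW" mnemonic.
HYDROPHOBIC = set("FAMILYVW")

def to_hp(sequence):
    """Map an amino-acid sequence to an H/P string.

    H = hydrophobic according to the mnemonic, P = everything else.
    This is a simplification: other scales draw the line differently.
    """
    return "".join("H" if aa in HYDROPHOBIC else "P" for aa in sequence.upper())

print(to_hp("MKVLA"))
```

A two-letter alphabet like this is also the input used by simple lattice folding models.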

Secondary Structure: α, β, & loops
  • α helices and β sheets are stabilized by hydrogen bonds between backbone oxygen and hydrogen atoms
Second-and-a-half-ary Structure: Motifs

beta helix

beta barrel

beta trefoil

Quaternary Structure: Multimeric Proteins or Functional Assemblies
  • Multimeric Proteins
    • A tetramer
  • Macromolecular Assemblies
    • Ribosome: protein synthesis
    • DNA copying

Protein Folding
  • The amino-acid sequence of a protein determines the 3D fold [Anfinsen et al., 1950s]
  • Some exceptions:
    • All proteins can be denatured
    • Some proteins have multiple conformations
    • Some proteins get folding help from chaperones
  • The function of a protein is determined by its 3D fold
  • Can we predict 3D fold of a protein given its amino-acid sequence?
The Levinthal Paradox
  • Given a small protein (100 aa), assume 3 possible conformations per peptide bond
  • $3^{100} \approx 5 \times 10^{47}$ conformations
  • The fastest motions take about $10^{-15}$ s, so sampling all conformations would take $5 \times 10^{32}$ s
  • 60 × 60 × 24 × 365 = 31536000 seconds in a year
  • Sampling all conformations would take $1.6 \times 10^{25}$ years
  • Yet each protein folds quickly into a single stable native conformation: the Levinthal paradox
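
The arithmetic on this slide is easy to check. The short script below reproduces it with the slide's own numbers (100 residues, 3 conformations per peptide bond, 10^-15 s per sampled conformation); the variable names are only illustrative.

```python
# Levinthal-style estimate using the figures quoted on the slide.
residues = 100                # small protein
states_per_bond = 3           # assumed conformations per peptide bond
time_per_sample_s = 1e-15     # fastest motions, in seconds
seconds_per_year = 60 * 60 * 24 * 365

conformations = states_per_bond ** residues          # 3^100
total_seconds = conformations * time_per_sample_s    # time to visit every one
total_years = total_seconds / seconds_per_year

print(f"conformations: {conformations:.1e}")
print(f"seconds:       {total_seconds:.1e}")
print(f"years:         {total_years:.1e}")
```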
The Hydrophobic Effect
  • Important for folding, because every amino acid participates!

Experimentally determined hydrophobicity levels: Fauchère and Pliška (1983). Eur. J. Med. Chem. 18, 369-75.

Protein Structure Determination
  • Experimental
    • X-ray crystallography
    • NMR spectroscopy
  • Computational – Structure Prediction (The Holy Grail)

Sequence implies structure, therefore in principle we can predict the structure from the sequence alone

Protein Structure Prediction
  • ab initio
    • Use just first principles: energy, geometry, and kinematics
  • Homology
    • Find the best match to a database of sequences with known 3D-structure
  • Threading
  • Meta-servers and other methods
Ab initio Prediction
  • Sampling the global conformation space
    • Lattice models / Discrete-state models
    • Molecular Dynamics
    • Pre-set libraries of fragment 3D motifs
  • Picking native conformations with an energy function
    • Solvation model: how protein interacts with water
    • Pair interactions between amino acids
  • Predicting secondary structure
    • Local homology
    • Fragment libraries
Lattice String Folding
  • HP model: main modeled force is hydrophobic attraction
    • Finding the optimal fold is NP-hard on both the 2-D square and 3-D cubic lattices
    • Constant-factor approximation algorithms exist
    • Not so relevant biologically
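
To make the HP model concrete, here is a minimal sketch of its usual energy function on the 2-D square lattice: each pair of H residues that sit on neighboring lattice sites, without being neighbors along the chain, contributes -1. The move encoding (U/D/L/R) and the name `hp_energy` are assumptions chosen for this example.

```python
def hp_energy(sequence, moves):
    """Energy of an HP chain folded on the 2-D square lattice.

    sequence: string over {"H", "P"}
    moves:    string over {"U", "D", "L", "R"}, one move per bond
    Returns the number of non-bonded H-H lattice contacts, negated,
    or None if the walk visits a lattice site twice.
    """
    steps = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}
    if len(moves) != len(sequence) - 1:
        raise ValueError("need exactly one move per bond")

    # Place residue 0 at the origin and follow the moves.
    coords = [(0, 0)]
    for move in moves:
        dx, dy = steps[move]
        x, y = coords[-1]
        coords.append((x + dx, y + dy))

    if len(set(coords)) != len(coords):
        return None  # not a self-avoiding walk

    index_at = {site: i for i, site in enumerate(coords)}
    energy = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so every neighboring pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and abs(i - j) > 1 and sequence[j] == "H":
                energy -= 1
    return energy

print(hp_energy("HPPH", "RUL"))  # chain bent into a square: the two H ends touch
print(hp_energy("HPPH", "RRR"))  # fully extended chain: no contacts
```

Evaluating one fold is cheap; the hardness result above concerns the search over all self-avoiding walks for the fold of minimum energy.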
Rosetta (http://www.bioinfo.rpi.edu/~bystrc/hmmstr/server.php)
  • Monte Carlo based method
  • Limit conformational search space by using the I-Sites library of sequence-structure motifs
    • 261 patterns in library
    • Certain positions in motif favor certain residues
  • Remove all sequences with <25% identity
  • Find structures of the 25 nearest sequence neighbors of each 9-mer


    • Local structures often fold independently of full protein
    • Can predict large areas of protein by matching sequence to I-Sites
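
The slides do not say which acceptance rule the Monte Carlo search uses. As a generic illustration only, the Metropolis criterion below is the textbook choice for Monte Carlo sampling; it is an assumption here, not a description of Rosetta's actual implementation, and `temperature` is expressed in the same units as the energy.

```python
import math
import random

def metropolis_accept(delta_energy, temperature, rng=random):
    """Textbook Metropolis rule for one proposed move.

    Downhill or neutral moves are always accepted; uphill moves are
    accepted with probability exp(-delta_energy / temperature).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if delta_energy <= 0:
        return True
    return rng.random() < math.exp(-delta_energy / temperature)

# Example: how often is a move that raises the energy by 1 unit accepted
# at temperature 1? The expected rate is exp(-1), roughly 0.37.
rng = random.Random(0)
trials = 10000
accepted = sum(metropolis_accept(1.0, 1.0, rng) for _ in range(trials))
print(accepted / trials)
```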
I-Sites Examples
  • Non-polar helix
    • Abundance of alanine at all positions
    • Non-polar side chains favored at positions 3, 6, 10 (methionine, leucine, isoleucine)
  • Amphipathic helix
    • Non-polar side chains favored at positions 6, 9, 13, 16 (methionine, leucine, isoleucine)
    • Polar side chains favored at positions 1, 8, 11, 18 (glutamic acid, lysine)
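
As a rough illustration of "certain positions favor certain residues", the sketch below scores an 18-residue window against the amphipathic helix preferences listed above. It is a deliberately crude stand-in: a real motif library would use per-position residue profiles, whereas this simply counts how many of the eight listed positions hold one of the residues named on the slide. The function name and scoring scheme are assumptions.

```python
# 1-based positions and residues taken from the amphipathic helix example.
NONPOLAR_POSITIONS = (6, 9, 13, 16)
POLAR_POSITIONS = (1, 8, 11, 18)
NONPOLAR_RESIDUES = set("MLI")   # methionine, leucine, isoleucine
POLAR_RESIDUES = set("EK")       # glutamic acid, lysine

def amphipathic_match(window):
    """Fraction (0.0 to 1.0) of the eight listed positions that match."""
    if len(window) < 18:
        raise ValueError("window must cover at least 18 residues")
    window = window.upper()
    hits = sum(window[p - 1] in NONPOLAR_RESIDUES for p in NONPOLAR_POSITIONS)
    hits += sum(window[p - 1] in POLAR_RESIDUES for p in POLAR_POSITIONS)
    return hits / 8

print(amphipathic_match("EAAAALAKLAEAMAAIAK"))  # every listed position matches
print(amphipathic_match("A" * 18))              # none of them match
```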
Rosetta Method
  • New structures generated by swapping compatible fragments
  • Accepted structures are clustered based on energy and structural size
  • The best cluster is the one with the greatest number of conformations within 4 Å rms deviation of the structure at its center
  • Representative structures taken from each of the best five clusters and returned to the user as predictions
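
The cluster selection rule can be sketched as follows, assuming a precomputed matrix of pairwise rms deviations between the accepted structures. The function name and the choice to include structures at exactly 4 Å are assumptions; the slide only states the "greatest number within 4 Å of the center" criterion.

```python
def largest_cluster(rmsd, cutoff=4.0):
    """Pick the structure with the most neighbors within `cutoff`.

    rmsd: square, symmetric list of lists; rmsd[i][j] is the rms
          deviation (in angstroms) between structures i and j.
    Returns (center_index, member_indices). The center counts as a
    member of its own cluster because rmsd[i][i] is 0.
    """
    best_center, best_members = None, []
    for i, row in enumerate(rmsd):
        members = [j for j, d in enumerate(row) if d <= cutoff]
        if len(members) > len(best_members):
            best_center, best_members = i, members
    return best_center, best_members

# Toy example: structures 0, 1, 2 are mutually close; structure 3 is an outlier.
pairwise = [
    [0.0, 2.0, 3.0, 9.0],
    [2.0, 0.0, 3.5, 8.0],
    [3.0, 3.5, 0.0, 7.0],
    [9.0, 8.0, 7.0, 0.0],
]
center, members = largest_cluster(pairwise)
print(center, members)
```

To obtain the best five clusters, one plausible approach is to remove the members of the winning cluster and repeat, though the slides do not specify how ties or overlaps are handled.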
Rosetta Results
  • In CASP4, Rosetta’s best models ranged from 6–10 Å rmsd Cα
  • For comparison, good comparative models give 2–5 Å rmsd Cα
  • Most effective with small proteins (<100 residues) and structures with helices
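
For reference, the Cα rmsd quoted above is the root-mean-square distance between corresponding Cα atoms of a model and the native structure. The sketch below assumes the two coordinate sets are already optimally superimposed and list the same residues in the same order; in practice a superposition step (for example the Kabsch algorithm) comes first and is not shown here.

```python
import math

def ca_rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two lists of (x, y, z) tuples.

    Assumes the structures are already superimposed and that the i-th
    entry of each list refers to the same residue.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

# Two atoms, each displaced by 3 angstroms along z: the rmsd is 3.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
native = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]
print(ca_rmsd(model, native))
```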
The SCOP Database

Structural Classification Of Proteins

FAMILY: proteins that are >30% similar, or >15% similar and have similar known structure/function

SUPERFAMILY: proteins whose families have some sequence and function/structure similarity suggesting a common evolutionary origin

COMMON FOLD: superfamilies that have the same secondary structures in the same arrangement, probably as a result of physics and chemistry

CLASS: alpha, beta, alpha/beta, alpha+beta, multidomain

Status of Protein Databases


SCOP: Structural Classification of Proteins, release 1.67: 24037 PDB entries (15 May 2004), 65122 domains.


Evolution of Proteins – Domains
  • The number of members in different families obeys a power law
  • 429 families are common to all 14 eukaryotes
  • 80% of animal domains, 90% of fungi domains
  • 80% of proteins in eukaryotes are multidomain
  • Domains usually combine pairwise in the same order. Why?

Chothia, Gough, Vogel, Teichmann, Science 300:1701-1703, 2003

Evolution of proteins happens mainly through duplication, recombination, and divergence

Homology-based Prediction
  • Align query sequence with sequences of known structure, usually >30% similar
  • Superimpose the aligned sequence onto the structure template, according to the computed sequence alignment
  • Perform local refinement of the resulting structure in 3D
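
The 30% rule of thumb in the first step can be checked on a pairwise alignment. The sketch below computes percent identity, which is a simpler stand-in for the "similarity" mentioned above. It uses one common convention (identical columns divided by columns where neither sequence has a gap); other conventions give different numbers, and the function name is made up for illustration.

```python
def percent_identity(aligned_query, aligned_template, gap="-"):
    """Percent identity over the gap-free columns of a pairwise alignment.

    Both arguments are rows of the same alignment, so they must have
    equal length and use `gap` for gap positions.
    """
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (q, t)
        for q, t in zip(aligned_query, aligned_template)
        if q != gap and t != gap
    ]
    if not columns:
        return 0.0
    matches = sum(q == t for q, t in columns)
    return 100.0 * matches / len(columns)

# Six gap-free columns, five of them identical.
identity = percent_identity("ACDE-FG", "ACDQKFG")
print(f"{identity:.1f}% identical; usable as a template: {identity > 30}")
```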

The number of unique structural folds is small (possibly a few thousand)

90% of new structures submitted to PDB in the past three years have similar folds in PDB

Homology-based Prediction
  • Raw model
  • Loop modeling
  • Side chain placement