BCB 444/544 - PowerPoint PPT Presentation


BCB 444/544

Lecture 22

  • Secondary Structure Prediction

  • Tertiary Structure Prediction

    #22_Oct10

BCB 444/544 F07 ISU Dobbs #22 - Secondary & Tertiary Structure Prediction


Required Reading (before lecture)

Mon Oct 8 - Lecture 20

Protein Secondary Structure Prediction

  • Chp 14 - pp 200 - 213

Wed Oct 10 - Lecture 21

Protein Tertiary Structure Prediction

  • Chp 15 - pp 214 - 230

Thurs Oct 11 & Fri Oct 12 - Lab 7 & Lecture 22

Protein Tertiary Structure Prediction

  • Chp 15 - pp 214 - 230

Assignments & Announcements

ALL: HomeWork #3

√Due: Mon Oct 8 by 5 PM

  • HW544: HW544Extra #1

    √Due: Task 1.1 - Mon Oct 1 by noon

    Due: Task 1.2 & Task 2 - Fri Oct 12 by 5 PM

  • 444 "Project-instead-of-Final" students should also submit:

    • HW544Extra #1

    • √Due: Task 1.1 - Mon Oct 8 by noon

    • Due: Task 1.2 - Fri Oct 12 by 5 PM

      <Task 2 NOT required for BCB444 students>

New Reading & Homework Assignment

ALL: HomeWork #4 (posted online today)

Due: Fri Oct 19 by 5 PM (one week from today)

Read:

Ginalski et al. (2005) Practical Lessons from Protein Structure Prediction. Nucleic Acids Res. 33:1874-91. http://nar.oxfordjournals.org/cgi/content/full/33/6/1874

(PDF posted on website)

  • Although somewhat dated, this paper provides a nice overview of protein structure prediction methods and evaluation of predicted structures.

  • Your assignment is to write a summary of this paper - for details see HW#4 posted online & sent by email on Fri Oct 12

Seminars this Week - (yesterday)

BCB List of URLs for Seminars related to Bioinformatics:

http://www.bcb.iastate.edu/seminars/index.html

  • Oct 11 Thurs

    • Dr. Klaus Schulten (Univ of Illinois) - Baker Center Seminar

      The Computational Microscope

      2:10 PM in E164 Lagomarcino

      http://www.bioinformatics.iastate.edu/seminars/abstracts/2007_2008/Klaus_Schulten_Seminar.pdf

    • Dr. Dan Gusfield (UC Davis) - Computer Science Colloquium

      ReCombinatorics: Combinatorial Algorithms for Studying History of Recombination in Populations

      3:30 PM in Howe Hall Auditorium

      http://www.cs.iastate.edu/~colloq/new/gusfield.shtml

Seminars this Week - Fri (today)

BCB List of URLs for Seminars related to Bioinformatics:

http://www.bcb.iastate.edu/seminars/index.html

  • Oct 12 Fri

    • Dr. Edward Yu (Physics/BBMB, ISU) - BCB Faculty Seminar

      TBA: "Structural Biology" (see URL below)

      2:10 PM in 102 Sci

      http://webdev.its.iastate.edu/webnews/data/site_gdcb_dept_seminars/30/webnewsfilefield_abstract/Dr.-Ed-Yu.pdf

    • Dr. Srinivas Aluru (ECprE, ISU) - GDCB Seminar

      Consensus Genetic Maps: A Graph Theoretic Approach

      4:10 PM in 1414 MBB

      http://webdev.its.iastate.edu/webnews/data/site_gdcb_dept_seminars/35/webnewsfilefield_abstract/Dr.-Srinivas-Aluru.pdf

Chp 12 - Protein Structure Basics

SECTION V STRUCTURAL BIOINFORMATICS

Xiong: Chp 12

Protein Structure Basics

  • Amino Acids

  • Peptide Bond Formation

  • Dihedral Angles

  • Hierarchy

  • Secondary Structures

  • Tertiary Structures

  • Determination of Protein 3-Dimensional Structure

  • Protein Structure DataBank (PDB)

Experimental Determination of 3D Structure

2 Major Methods to obtain high-resolution structures

  • X-ray Crystallography (most PDB structures)

  • Nuclear Magnetic Resonance (NMR) Spectroscopy

    Note Advantages & Limitations of each method

    • (See your lecture notes & textbook)

    • For more info: http://en.wikipedia.org/wiki/Protein_structure

  • Other methods (usually lower resolution, at present):

    • Electron Paramagnetic Resonance (EPR - also called ESR, EMR)

    • Electron microscopy (EM)

    • Cryo-EM

    • Scanning Probe Microscopies (AFM - Atomic Force Microscopy)

      • http://www.uweb.engr.washington.edu/research/tutorials/SPM.pdf

    • Circular Dichroism (CD), several other spectroscopic methods

"Best" Resolution of Protein Structures

  • High-resolution methods

    • X-ray crystallography (< 1A)

    • NMR (~1 - 2.5A)

  • Lower-resolution methods

    • Cryo-EM (~10-15A)

  • Theoretical Models?

    • Usually low resolution, at present, but

    • Highly variable - & a few are comparable to crystal data

Figure sources: Baker & Sali (2000); Pevsner Fig 9.36

Chp 13 - Protein Structure Visualization, Comparison & Classification

SECTION V STRUCTURAL BIOINFORMATICS

Xiong: Chp 13

Protein Structure Visualization, Comparison & Classification

  • Protein Structural Visualization

  • Protein Structure Comparison - later

  • Protein Structure Classification

Protein Structure Classification

  • SCOP = Structural Classification of Proteins

    Levels reflect both evolutionary and structural relationships

    http://scop.mrc-lmb.cam.ac.uk/scop

  • CATH = Classification by Class, Architecture, Topology & Homology

    http://cathwww.biochem.ucl.ac.uk/latest/

  • DALI - (recently moved to EBI & reorganized)

    DALI Database (fold classification)

    http://ekhidna.biocenter.helsinki.fi/dali/start

Each method has strengths & weaknesses….

Chp 14 - Secondary Structure Prediction

SECTION V STRUCTURAL BIOINFORMATICS

Xiong: Chp 14

Protein Secondary Structure Prediction

  • Secondary Structure Prediction for Globular Proteins

  • Secondary Structure Prediction for Transmembrane Proteins

  • Coiled-Coil Prediction

Secondary Structure Prediction

Has become highly accurate in recent years (>85%)

  • Usually 3 (or 4) state predictions:

    • H = -helix

    • E = -strand

    • C = coil (or loop)

    • (T = turn)

Secondary Structure Prediction Methods

  • 1st Generation methods

    Ab initio - used relatively small dataset of structures available

    Chou-Fasman - based on amino acid propensities (3-state)

    GOR - also propensity-based (4-state)

  • 2nd Generation methods

    based on much larger datasets of structures now available

    GOR II, III, IV, SOPM, GOR V, FDM

  • 3rd Generation methods

    Homology-based & Neural network based

    PHD, PSIPRED, SSPRO, PROF, HMMSTR, CDM

  • Meta-Servers

    combine several different methods

    Consensus & Ensemble based

    JPRED, PredictProtein, Proteus

Secondary Structure Prediction Servers

Prediction Evaluation?

  • Q3 score - % of residues correctly predicted (3-state)

    in cross-validation experiments

    Best results? Meta-servers

  • http://expasy.org/tools/ (scroll for 2° structure prediction)

  • http://www.russell.embl-heidelberg.de/gtsp/secstrucpred.html

  • JPred www.compbio.dundee.ac.uk/~www-jpred

  • PredictProtein http://www.predictprotein.org/ Rost, Columbia

    Best "individual" programs? ??

  • CDM http://gor.bb.iastate.edu/cdm/ Sen…Jernigan, ISU

  • FDM (not available separately as server) Cheng…Jernigan, ISU

  • GOR V http://gor.bb.iastate.edu/ Kloczkowski…Jernigan, ISU
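The Q3 score mentioned above is simple to compute once a predicted and an observed 3-state string are aligned residue by residue. A minimal sketch follows; the function name `q3_score` and the two toy strings are made up for illustration and are not real prediction data.

```python
def q3_score(predicted: str, observed: str) -> float:
    """Percentage of residues whose 3-state label (H, E or C) matches."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings must have equal length")
    if not predicted:
        raise ValueError("cannot score empty strings")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(predicted)


# Toy example (hypothetical strings): 14 of 16 positions agree.
observed = "CCHHHHHHCCEEEECC"
predicted = "CCHHHHHCCCEEECCC"
print(f"Q3 = {q3_score(predicted, observed):.1f}%")  # Q3 = 87.5%
```

In a cross-validation experiment this score would be averaged over proteins that were held out of training.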

    Consensus Data Mining (CDM)

    • Developed by Jernigan Group at ISU

      • Basic premise: combination of 2 complementary methods can enhance performance by harnessing distinct advantages of both methods; combines FDM & GOR V:

  • FDM - Fragment Data Mining - exploits availability of sequence-similar fragments in the PDB, which can lead to highly accurate prediction - much better than GOR V - for such fragments, but such fragments are not available for many cases

  • GOR V - Garnier, Osguthorpe, Robson V - predicts secondary structure of less similar fragments with good performance; these are protein fragments for which FDM method cannot find suitable structures

  • For references & additional details: http://gor.bb.iastate.edu/cdm/
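The premise above (use FDM where sequence-similar fragments exist, fall back to GOR V elsewhere) can be illustrated with a per-residue selection rule. This is only a sketch of the idea, not the published CDM algorithm: the `fdm_support` scores, the `threshold` value and the function name are all hypothetical.

```python
from typing import Optional, Sequence


def combine_predictions(fdm: str, gor: str,
                        fdm_support: Sequence[Optional[float]],
                        threshold: float = 0.5) -> str:
    """Pick the FDM label where fragment support is good, else the GOR V label.

    fdm_support[i] is a hypothetical similarity score for the best PDB
    fragment covering residue i, or None when no fragment was found.
    """
    if not (len(fdm) == len(gor) == len(fdm_support)):
        raise ValueError("all inputs must have the same length")
    combined = []
    for fdm_label, gor_label, support in zip(fdm, gor, fdm_support):
        use_fdm = support is not None and support >= threshold
        combined.append(fdm_label if use_fdm else gor_label)
    return "".join(combined)


# Toy example: residues 0 and 3 have well-supported fragments.
print(combine_predictions("EEEE", "HHHH", [0.9, None, 0.2, 0.5]))  # EHHE
```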

    Where to Find "Actual" Secondary Structure? In the PDB

    How Does Predicted Secondary Structure Compare? e.g., from CDM

    DSSP

    Author

    Query MAATAAEAVASGSGEPREEAGALGPAWDESQLRSYSFPTRPIPRLSQSDPRAEELIENEE

    GOR V CCCCHHHHHHHHCCHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHCCCC

    FDM CCCCCCCCCCCCCCCCCEECCCCCCCCCHHHCCCCCCEECCCCCCCCCCHHHHHHHHCCC

    CDM CCCCHHHHHHCCCCCCCEECCCCCCCCCHHHCCCCCCEECCCCCCCCCCHHHHHHHHCCC
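To compare 3-state predictions such as the GOR V, FDM and CDM rows with a DSSP assignment, the DSSP codes first have to be reduced to H/E/C. The sketch below uses one common reduction (H, G, I to helix; E, B to strand; everything else to coil). Other reductions are in use, so treat this mapping as an assumption rather than the one used in the lecture.

```python
# One common 8-state to 3-state reduction of DSSP codes (an assumption here).
DSSP_TO_3STATE = {
    "H": "H",  # alpha-helix
    "G": "H",  # 3-10 helix
    "I": "H",  # pi-helix
    "E": "E",  # extended strand
    "B": "E",  # isolated beta-bridge
}


def reduce_dssp(dssp: str) -> str:
    """Map a DSSP string to H/E/C; turns, bends and blanks become coil."""
    return "".join(DSSP_TO_3STATE.get(code, "C") for code in dssp)


print(reduce_dssp("HGIEBTS- "))  # HHHEECCCC
```

The reduced string can then be scored against a prediction position by position.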

    Secondary Structure Prediction: for Different Types of Proteins/Domains

    For Complete proteins:

    Globular Proteins - use methods previously described

    Transmembrane (TM) Proteins - use special methods

    (next slides)

    For Structural Domains: many under development:

    Coiled-Coil Domains (Protein interaction domains)

    Zinc Finger Domains (DNA binding domains),

    others…

    SS Prediction for Transmembrane Proteins

    Transmembrane (TM) Proteins

    • Only a few in the PDB - but ~ 30% of cellular proteins are membrane-associated !

    • Hard to determine experimentally, so prediction important

    • TM domains are relatively 'easy' to predict!

      Why? constraints due to hydrophobic environment

      2 main classes of TM proteins:

      α-helical

      β-barrel

    SS Prediction for TM Classification-Helices

    -Helical TM domains:

    • Helices are 17-25 amino acids long (span the membrane)

    • Predominantly hydrophobic residues

    • Helices oriented perpendicular to membrane

    • Orientation can be predicted using "positive inside" rule

      Residues at cytosolic (inside or cytoplasmic) side of TM helix, near hydrophobic anchor, are more positively charged than those on lumenal (inside an organelle in eukaryotes) or periplasmic side (space between inner & outer membrane in gram-negative bacteria)

    • Alternating polar & hydrophobic residues provide clues to interactions among helices within membrane

      Servers?

    • TMHMM or HMMTOP - 70% accuracy - confused by hydrophobic signal peptides (short hydrophobic sequences that target proteins to the endoplasmic reticulum, ER)

    • Phobius - 94% accuracy - uses distinct HMM models for TM helices

      & signal peptide sequences
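The two sequence features listed above (a hydrophobic stretch of about 20 residues, and more positive charge on the cytosolic flank) can be illustrated with a classic sliding-window hydropathy scan. This is not how TMHMM, HMMTOP or Phobius work (those are HMM-based). The sketch swaps in the Kyte-Doolittle scale with its conventional 19-residue window and 1.6 cutoff, and the toy sequence and 4-residue flank length are made up.

```python
# Kyte-Doolittle hydropathy scale (Kyte & Doolittle, 1982).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydropathy(seq, window=19):
    """Mean hydropathy of every full-length window along the sequence."""
    scores = [KD[aa] for aa in seq.upper()]  # KeyError on non-standard residues
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def candidate_tm_starts(seq, window=19, threshold=1.6):
    """Start indices of windows hydrophobic enough to be TM helix candidates."""
    return [i for i, mean in enumerate(window_hydropathy(seq, window))
            if mean >= threshold]


def positive_charges(segment):
    """Count Lys and Arg residues, the basis of the 'positive inside' rule."""
    return sum(segment.upper().count(aa) for aa in "KR")


# Toy sequence (hypothetical): basic flank, 20 Leu, acidic flank.
toy = "KRKR" + "L" * 20 + "DEDE"
n_flank, c_flank = toy[:4], toy[-4:]
inside = ("N-terminal"
          if positive_charges(n_flank) > positive_charges(c_flank)
          else "C-terminal")
print(candidate_tm_starts(toy), "cytosolic flank guess:", inside)
```

A plain hydropathy scan like this has the weakness noted above for TMHMM and HMMTOP: a hydrophobic signal peptide also passes the threshold.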

    SS Prediction for TM Classification-Barrels

    -Barrel TM domains: 

    • -strands are amphipathic(partly hydrophobic, partly hydrophilic)

    • Strands are 10 - 22 amino acids long

    • Every 2nd residue is hydrophobic, facing lipid bilayer

    • Other residues are hydrophilic, facing "pore" or opening

      Servers? Harder problem, fewer servers…

      TBBPred - uses NN or SVM (more on these ML methods later)

      Accuracy ?
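The "every 2nd residue is hydrophobic" pattern above can be checked with a simple alternation score. This is a toy illustration, not TBBPred's NN/SVM method: the hydrophobic residue set, the function name and the example strands are assumptions.

```python
# A simple hydrophobic residue set (an assumption; other definitions exist).
HYDROPHOBIC = set("AVILMFWC")


def alternation_score(strand):
    """Difference in hydrophobic fraction between even and odd positions.

    Values near 1.0 suggest one face is hydrophobic (lipid-facing) and the
    other hydrophilic (pore-facing); values near 0.0 suggest no alternation.
    """
    flags = [aa in HYDROPHOBIC for aa in strand.upper()]
    even, odd = flags[0::2], flags[1::2]
    if not even or not odd:
        raise ValueError("strand must contain at least two residues")
    return abs(sum(even) / len(even) - sum(odd) / len(odd))


print(alternation_score("VTVSVTVS"))  # strictly alternating toy strand: 1.0
print(alternation_score("VVVVVVVV"))  # uniformly hydrophobic: 0.0
```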

    Prediction of Coiled-Coil Domains

    Coiled-coils

    • Superhelical protein motifs or domains, with two or more interacting α-helices that form a "bundle"

    • Often mediate inter-protein (& intra-protein) interactions

      'Easy' to detect in primary sequence:

    • Internal repeat of 7 residues (heptad)

      • 1 & 4 = hydrophobic (facing helical interface)

      • 2,3,5,6,7 = hydrophilic (exposed to solvent)

    • Helical wheel representation - can be used to detect these manually, based on amino acid sequence

      Servers?

      Coils, Multicoil -probability-based methods

      2Zip - for Leucine zippers = special type of CC in TFs:

      characterized by Leu-rich motif: L-X(6)-L-X(6)-L-X(6)-L
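The leucine-zipper motif L-X(6)-L-X(6)-L-X(6)-L translates directly into a regular expression. The sketch below is a bare pattern search, not the 2Zip method. It wraps the pattern in a lookahead so overlapping hits are also reported, and the test sequence is synthetic.

```python
import re

# L-X(6)-L-X(6)-L-X(6)-L; the lookahead lets overlapping matches be found.
LEU_ZIPPER = re.compile(r"(?=(L.{6}L.{6}L.{6}L))")


def find_leucine_zipper_motifs(seq):
    """Return 0-based start positions of every L-X(6)-L-X(6)-L-X(6)-L match."""
    return [m.start() for m in LEU_ZIPPER.finditer(seq.upper())]


# Synthetic sequence: Leu every 7th residue, starting at index 2.
toy = "GG" + "LAAAAAA" * 3 + "L"
print(find_leucine_zipper_motifs(toy))  # [2]
```

A match only shows that the Leu spacing fits a heptad repeat. Whether the region is actually helical still has to come from a coiled-coil or secondary structure predictor.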

    Chp 15 - Tertiary Structure Prediction

    SECTION V STRUCTURAL BIOINFORMATICS

    Xiong: Chp 15

    Protein Tertiary Structure Prediction

    • Methods

    • Homology Modeling

    • Threading and Fold Recognition

    • Ab Initio Protein Structural Prediction

    • CASP

    Structural Genomics - Status & Goal

    ~ 20,000 "traditional" genes in human genome

    (recall, this is fewer than earlier estimate of 30,000)

    ~ 2,000 proteins in a typical cell

    > 4.9 million sequences in UniProt (Oct 2007)

    > 46,000 protein structures in the PDB (Oct 2007)

    Experimental determination of protein structure lags far behind sequence determination!

    • Goal: Determine structures of "all" protein folds in nature, using combination of experimental structure determination methods (X-ray crystallography, NMR, mass spectrometry) & structure prediction

    Structural Genomics Projects

    TargetDB: database of structural genomics targets

    http://targetdb.pdb.org

    Protein Sequence & Structure: Analysis

    • Diamond STING Millennium - Many useful structure analysis tools, including Protein Dossier http://trantor.bioc.columbia.edu/SMS/

    • SwissProt (UniProt)

      Protein knowledgebase

      http://us.expasy.org/sprot

    • InterPro

      Sequence analysis tools

      http://www.ebi.ac.uk/interpro

    Protein Structure Prediction or Protein Folding Problem

    "Major unsolved problem in molecular biology"

    In cells: spontaneous

    assisted by enzymes

    assisted by chaperones

    In vitro: many proteins can fold to their "native" states spontaneously & without assistance

    but, many do not!


    Deciphering the Protein Folding Code

    • Protein Structure Prediction

      or "Protein Folding" Problem

      Given the amino acid sequence of a protein, predict its 3-dimensional structure (fold)

    • "Inverse Folding" Problem

      Given a protein fold, identify every amino acid sequence that can adopt that 3-dimensional structure

    Protein Structure Prediction

    Structure is largely determined by sequence

    BUT:

    • Similar sequences can assume different structures

    • Dissimilar sequences can assume similar structures

    • Many proteins are multi-functional

      2 Major Protein Folding Problems:

      1- Determination of folding pathway

      2- Prediction of tertiary structure from sequence

      Both still largely unsolved problems

    Steps in Protein Folding

    1-"Collapse"- driving force is burial of hydrophobic aa’s

    (fast - msecs)

    2- Molten globule - helices & sheets form, but "loose"

    (slow - secs)

    3- "Final" native folded state - compaction & rearrangement of some 2° structures

    Native state? - assumed to be lowest free energy

    - may be an ensemble of structures

    Protein Dynamics

    • Protein in native state is NOT static

    • Function of many proteins requires conformational changes, sometimes large, sometimes small

    • Globular proteins are inherently "unstable"

      (NOT evolved for maximum stability)

    • Energy difference between native and denatured state is very small (5-15 kcal/mol)

      (this is equivalent to ~ 2 H-bonds!)

    • Folding involves changes in both entropy & enthalpy

    Difficulty of Tertiary Structure Prediction

    Folding or tertiary structure prediction problem can be formulated as a search for minimum energy conformation

    • Search space is defined by psi/phi angles of backbone and side-chain rotamers

    • Search space is enormous even for small proteins!

    • Number of local minima increases exponentially with number of residues

    Computationally it is an exceedingly difficult problem!
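A back-of-the-envelope count shows why exhaustive search is hopeless. The sketch below assumes, purely for illustration, 3 backbone conformations per residue and a sampling rate of 10^13 conformations per second. Neither number comes from the slides, and side-chain rotamers would only make the count larger.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def conformations(n_residues, states_per_residue=3):
    """Size of the search space if every residue independently takes k states."""
    return states_per_residue ** n_residues


def years_to_enumerate(n_conformations, rate_per_second=1e13):
    """Time to visit every conformation once at the given (assumed) rate."""
    return n_conformations / rate_per_second / SECONDS_PER_YEAR


for n in (10, 50, 100):
    total = conformations(n)
    print(f"{n:>3} residues: {total:.2e} conformations, "
          f"{years_to_enumerate(total):.2e} years")
```

Even for 100 residues the count is on the order of 10^47, so practical methods rely on templates (homology modeling, threading) or on heuristic search rather than enumeration.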

    From Thursday's Lab:

    • Homology Modeling - using SWISS-MODEL

      • http://swissmodel.expasy.org//SWISS-MODEL.html

    • Threading - using 3-D JURY (BioinfoBank, a METAserver)

      • http://meta.bioinfo.pl/submit_wizard.pl

      • Be sure to take a look at CASP contest:

        • http://predictioncenter.gc.ucdavis.edu/

        • CASP7 contest in 2006

        • http://www.predictioncenter.org/casp7/Casp7.html
