
aims



Presentation Transcript


  1. Aims • Structure prediction builds models of the 3D structures of proteins that can be useful for understanding structure-function relationships.

  2. Possible scenarios • 1. No homology: 1D predictions, sequence motifs, limited functional prediction, ab initio prediction • 2. Homology exists but cannot be recognized easily (PSI-BLAST, threading): low-resolution fold predictions are possible, but no functional information • 3. Homology can be recognized using sequence comparison tools or protein family databases (BLAST, Clustal, Pfam, ...): structural and functional predictions are feasible

  3. 1D prediction • Prediction is based on averaging amino acid properties over a window that slides along the sequence (example sequence: AGGCFHIKLAAGIHLLVILVVKLGFSTRDEEASS)

  4. 1D prediction. Properties • Secondary structure propensities • Hydrophobicity • Accessibility • ...
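The window averaging described above can be sketched in a few lines. This is a minimal illustration, not the method of any particular server: the Kyte-Doolittle hydropathy scale, the window length of 7 and the function name `window_average` are all assumptions chosen for the example.

```python
# Kyte-Doolittle hydropathy values (one common hydrophobicity scale;
# choosing it here is an assumption, any per-residue property would do).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_average(sequence, scale, window=7):
    """Average a per-residue property over every full window of the sequence."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    values = [scale[aa] for aa in sequence]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


seq = "AGGCFHIKLAAGIHLLVILVVKLGFSTRDEEASS"
profile = window_average(seq, KYTE_DOOLITTLE, window=7)

# Report the centre residue of the most hydrophobic window (0-based index).
best_start = max(range(len(profile)), key=profile.__getitem__)
centre = best_start + 7 // 2
print(f"Most hydrophobic window is centred on residue {centre + 1} ({seq[centre]})")
```

Larger windows give smoother profiles but blur short features, so the window length is usually tuned to the feature being sought.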

  5. Propensities • Chou-Fasman, Biochemistry 17, 4277 (1978) • [Table: per-residue propensities for α, β and turn]
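A propensity of this kind is commonly defined as the frequency of a residue type in a given conformation divided by the overall frequency of that conformation. The sketch below computes it from toy data; the function name `propensities`, the one-letter state codes and the tiny data set are assumptions for illustration only and are not values from the cited table.

```python
from collections import Counter


def propensities(sequences, structures, state):
    """Propensity of each residue type for `state` (for example 'H' for helix).

    P(aa) = (fraction of aa found in state) / (fraction of all residues in state)
    """
    aa_total = Counter()
    aa_in_state = Counter()
    for seq, ss in zip(sequences, structures):
        if len(seq) != len(ss):
            raise ValueError("sequence and structure strings must have equal length")
        for aa, s in zip(seq, ss):
            aa_total[aa] += 1
            if s == state:
                aa_in_state[aa] += 1
    n_total = sum(aa_total.values())
    n_state = sum(aa_in_state.values())
    if n_state == 0:
        raise ValueError(f"state {state!r} never occurs in the data")
    overall = n_state / n_total
    return {aa: (aa_in_state[aa] / aa_total[aa]) / overall for aa in aa_total}


# Toy data, invented for the example: H = helix, E = strand, C = other.
toy_sequences = ["AAELLKG", "GVIVTPG"]
toy_structures = ["HHHHHHC", "CEEEECC"]
helix = propensities(toy_sequences, toy_structures, "H")
for aa, p in sorted(helix.items(), key=lambda item: -item[1]):
    print(f"{aa}: {p:.2f}")
```

Values above 1 mark residues that favour the state and values below 1 mark residues that avoid it. Real tables are derived from large sets of known structures.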

  6. Some programs (www.expasy.org) • BCM PSSP - Baylor College of Medicine • Prof - Cascaded Multiple Classifiers for Secondary Structure Prediction • GOR I (Garnier et al, 1978) [At PBIL or at SBDS] • GOR II (Gibrat et al, 1987) • GOR IV (Garnier et al, 1996) • HNN - Hierarchical Neural Network method (Guermeur, 1997) • Jpred - A consensus method for protein secondary structure prediction at University of Dundee • nnPredict - University of California at San Francisco (UCSF) • PredictProtein - PHDsec, PHDacc, PHDhtm, PHDtopology, PHDthreader, MaxHom, EvalSec from Columbia University • PSA - BioMolecular Engineering Research Center (BMERC) / Boston • PSIpred - Various protein structure prediction methods at Brunel University • SOPM (Geourjon and Deléage, 1994) • SOPMA (Geourjon and Deléage, 1995) • AGADIR - An algorithm to predict the helical content of peptides

  7. 1D Prediction • Original methods: a single sequence and uniform parameters (25-30%) • Early improvements: parameters specific to protein classes • Present methods use sequence profiles obtained from multiple alignments, with neural networks to extract the parameters (70-75%, 98% for transmembrane helices)
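A sequence profile, as used by the present methods above, summarizes a multiple alignment as per-column residue frequencies. The following is a minimal sketch of that step only (no neural network); the function name `column_profile`, the choice to ignore gaps and the toy alignment are assumptions.

```python
from collections import Counter


def column_profile(alignment):
    """Return one {residue: frequency} dict per alignment column.

    Gap characters ('-') are ignored, which is one possible convention.
    """
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("alignment must be non-empty with rows of equal length")
    profile = []
    for column in zip(*alignment):
        counts = Counter(aa for aa in column if aa != "-")
        total = sum(counts.values())
        profile.append({aa: n / total for aa, n in counts.items()} if total else {})
    return profile


# Toy alignment, invented for the example.
toy_alignment = [
    "AGGCF-HIK",
    "AGACFQHLK",
    "SGGCF-HIR",
]
for position, freqs in enumerate(column_profile(toy_alignment), start=1):
    print(position, freqs)
```

Each column dict can then be turned into a fixed-length vector of 20 frequencies and fed to a predictor in place of a single residue.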

  8. PredictProtein (PHD) • Build a multiple alignment using Swiss-Prot, PROSITE, and domain databases • 1D prediction from the generated profile using neural networks • Fold recognition • Confidence evaluation

  9. PredictProtein: available information • Multiple alignments: MaxHom • PROSITE motifs • Composition bias: SEG • Threading: TOPITS • Secondary structure: PHDsec, PROFsec • Transmembrane helices: PHDhtm, PHDtop • Globularity: GLOBE • Coiled coil: COILS • Disulfide bridges: CYSPRED

  10. PredictProtein: available information • Signal peptides: SignalP • O-glycosylation: NetOglyc • Chloroplast import signal: ChloroP • Consensus secondary structure: JPRED • Transmembrane: TMHMM, TOPPRED • SwissModel

  11. Methods for remote homology • Homology can be recognized using PSI-BLAST • Fold prediction is possible using threading methods • Accurate 3D prediction is not possible: no structure-function relationship can be inferred from the models

  12. Threading • The unknown sequence is “folded” onto a number of known structures • Scoring functions evaluate the fit between sequence and structure using statistical functions and sequence comparison

  13. [Figure] Query ATTWV....PRKSCT scored against each fold in the library; the selected hit is the one with the highest score (10.5 > ... > 5.2)

  14. [Figure] Query: sequence ATTWV....PRKSCT; predicted secondary structure HHHHH....CCBBBB; predicted accessibility eeebb....eeebeb • Fold library: sequences GGTV....ATTW, ..., ATTVL....FFRK; observed secondary structure BBBB....CCHH, ..., HHHB.....CBCB; observed accessibility EEBE.....BBEB, ..., BBEBB....EBBE

  15. Technical aspects • Alignment: dynamic programming (Needleman & Wunsch, 1970) • Scoring function: w_seq · P_seq + w_str · (P_SS + P_AC) • P_seq: Dayhoff matrix • P_SS and P_AC: probability models on predicted secondary structure (SS) and accessibility (AC)
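The scoring scheme above can be sketched as a Needleman-Wunsch alignment whose per-position score combines a sequence term with secondary structure and accessibility terms. This is a toy illustration under stated assumptions: a simple match/mismatch score stands in for the Dayhoff matrix, plain agreement scores stand in for the probability models, the gap penalty is linear, and all names (`annotate`, `pair_score`, `threading_score`) and library entries are hypothetical.

```python
def annotate(sequence, sec_struc, accessibility):
    """Zip residues with their SS and accessibility states (case-normalized)."""
    if not (len(sequence) == len(sec_struc) == len(accessibility)):
        raise ValueError("all three strings must have the same length")
    return list(zip(sequence, sec_struc, accessibility.upper()))


def pair_score(q, t, w_seq=1.0, w_str=1.0):
    """w_seq * P_seq + w_str * (P_SS + P_AC) with toy +1/-1 agreement terms."""
    p_seq = 1.0 if q[0] == t[0] else -1.0  # stand-in for a Dayhoff matrix entry
    p_ss = 1.0 if q[1] == t[1] else -1.0   # predicted vs observed SS
    p_ac = 1.0 if q[2] == t[2] else -1.0   # predicted vs observed accessibility
    return w_seq * p_seq + w_str * (p_ss + p_ac)


def threading_score(query, template, gap=-2.0, w_seq=1.0, w_str=1.0):
    """Global alignment score by dynamic programming (linear gap penalty)."""
    n, m = len(query), len(template)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = dp[i - 1][j - 1] + pair_score(query[i - 1], template[j - 1], w_seq, w_str)
            dp[i][j] = max(match, dp[i - 1][j] + gap, dp[i][j - 1] + gap)
    return dp[n][m]


# Hypothetical query and fold library, invented for the example.
query = annotate("ATTWV", "HHHHH", "eeebb")
library = {
    "fold_1": annotate("GGTV", "BBBB", "EEBE"),
    "fold_2": annotate("ATTVL", "HHHHB", "EEEBB"),
}
scores = {name: threading_score(query, fold) for name, fold in library.items()}
selected = max(scores, key=scores.get)
print(selected, scores)
```

Raising `w_str` relative to `w_seq` makes the ranking depend more on structural agreement, which is the regime threading targets when sequence similarity is weak.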

  16. Threading accuracy

  17. Comparative modelling • Good for homology > 30% • Accuracy is very high for homology > 60%

  18. Reminder • The model must be USEFUL • Only the “interesting” regions of the protein need to be modelled

  19. Expected accuracy • Strongly dependent on the quality of the sequence alignment • Strongly dependent on the identity with the “template” structures; very good structures if identity > 60-70% • Model quality is better for the backbone than for the side chains • Model quality is better in conserved regions
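Because expected accuracy depends so strongly on identity with the template, a quick identity check on the alignment is a natural first step. The sketch below uses the 30% and 60% thresholds quoted in these slides; counting identity over columns where neither sequence has a gap is one of several conventions, and the function names and example sequences are assumptions.

```python
def percent_identity(aligned_target, aligned_template):
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    pairs = [
        (a, b)
        for a, b in zip(aligned_target, aligned_template)
        if a != "-" and b != "-"
    ]
    if not pairs:
        return 0.0
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)


def expected_quality(identity):
    """Rough guide based on the thresholds quoted in the slides."""
    if identity > 60:
        return "high accuracy expected"
    if identity > 30:
        return "comparative modelling feasible"
    return "remote homology: consider threading"


# Hypothetical aligned pair, invented for the example.
identity = percent_identity("ATTWV-PRKSCT", "ATTVLAPRK-CT")
print(f"{identity:.1f}% identity: {expected_quality(identity)}")
```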

  20. Steps • Choose templates: proteins with an experimental 3D structure and significant homology (BLAST, PFAM, PDB) • Build a multiple alignment of the templates • Alignment quality is critical for accuracy; always use a structure-based alignment • Reduce redundancy

  21. Steps • Alignment of template structures • Alignment of the unknown sequence against the template alignment • The structural alignment may not coincide with the evolution-based alignment • Gaps must be placed so as to minimize structural distortion

  22. Steps • Alignment of template structures • Alignment of the unknown sequence against the template alignment • Build the structurally conserved regions (SCR) • Coordinates come from either a single structure or an average over structures • Side chains follow the template conformation or are placed in standard conformations
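When coordinates for a conserved region are taken as an average, the calculation is a per-atom mean over equivalent atoms. The sketch below assumes the templates have already been superimposed and that atoms are listed in the same order in each template; the function name and the coordinates are invented for illustration.

```python
def average_coordinates(templates):
    """Average equivalent atoms across templates.

    `templates` is a list of coordinate lists, one per template, each holding
    (x, y, z) tuples for the same atoms in the same order. The structures are
    assumed to be superimposed beforehand.
    """
    if len({len(atoms) for atoms in templates}) != 1:
        raise ValueError("need at least one template, all with the same atom count")
    return [
        tuple(sum(axis) / len(templates) for axis in zip(*equivalent_atoms))
        for equivalent_atoms in zip(*templates)
    ]


# Two hypothetical templates with three equivalent C-alpha atoms each.
template_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
template_b = [(0.2, 0.4, 0.0), (3.6, 0.2, 0.2), (7.8, -0.2, 0.0)]
for atom in average_coordinates([template_a, template_b]):
    print(atom)
```

Averaging can produce slightly distorted bond lengths and angles, which is one reason a later optimization step is needed.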

  23. Steps • Alignment of template structures • Alignment of the unknown sequence against the template alignment • Build the structurally conserved regions (SCR) • Build the unconserved regions (usually “loops”)

  24. “Loops” • Built ab initio • Taken from the PDB

  25. “Loops” • Chosen manually or by energy-based criteria

  26. Optimization • Optimize side chain conformations: energy minimization restricted to standard conformers and VdW energy • Optimize everything: global energy minimization with restraints, or molecular dynamics
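Side chain optimization restricted to standard conformers and VdW energy amounts to scoring each candidate conformer against its surroundings and keeping the lowest-energy one. The sketch below uses a Lennard-Jones 12-6 term as the VdW energy; the parameter values, the single-atom "rotamers" and all names are assumptions for illustration, not values from any force field.

```python
import math


def lennard_jones(r, epsilon=0.2, sigma=3.5):
    """12-6 Lennard-Jones energy for two atoms at distance r (toy parameters)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def rotamer_energy(rotamer_atoms, environment_atoms):
    """Sum of pairwise VdW energies between a rotamer and fixed neighbours."""
    return sum(
        lennard_jones(math.dist(a, b))
        for a in rotamer_atoms
        for b in environment_atoms
    )


def best_rotamer(rotamers, environment_atoms):
    """Name of the candidate conformer with the lowest VdW energy."""
    return min(rotamers, key=lambda name: rotamer_energy(rotamers[name], environment_atoms))


# Hypothetical candidates: one clashes with a neighbour, one does not.
environment = [(0.0, 0.0, 0.0)]
candidates = {
    "rotamer_1": [(1.5, 0.0, 0.0)],  # too close: strong repulsion
    "rotamer_2": [(4.0, 0.0, 0.0)],  # near the energy minimum
}
print(best_rotamer(candidates, environment))
```

A full optimization would repeat this choice over all side chains, since the best conformer of one residue depends on those of its neighbours.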

  27. Quality test • Energy alone does not distinguish a correct model from a wrong one • The structure must be “chemically correct” to be used in quantitative predictions

  28. Analysis software • PROCHECK • WHATCHECK • Suite Biotech • PROSA

  29. Sources of information • 300 best structures in the PDB • Molecular geometry from the Cambridge Structural Database (CSD) • Theoretical data (Ramachandran, etc.)

  30. Procheck • Covalent geometry • Planarity • Dihedral angles • Chirality • Non-bonded interactions • Satisfied/unsatisfied hydrogen bonds • Disulfide bonds
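Several of these checks reduce to simple geometry on atomic coordinates. As an illustration of the dihedral angle check, the sketch below computes the torsion defined by four points (for example the backbone atoms that define phi or psi). It is a generic textbook calculation, not Procheck's own code, and the function names and test points are assumptions.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four points, in (-180, 180]."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    b2_length = math.sqrt(_dot(b2, b2))
    if b2_length == 0:
        raise ValueError("the two central points must be distinct")
    y = _dot(_cross(n1, n2), b2) / b2_length
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Planar cis and trans arrangements, plus a 90 degree twist.
print(dihedral((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 1, 0)))
print(dihedral((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, -1, 0)))
print(dihedral((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 0, 1)))
```

A validation program would compare such angles against the distributions observed in high-quality reference structures, for instance the allowed regions of the Ramachandran plot.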

  31. Whatcheck

  32. Prediction software • SwissModel (automatic) • http://www.expasy.org/swissmod/ • SwissModel Repository • http://swissmodel.expasy.org/repository/ • 3D-JIGSAW (M. Sternberg) • http://www.bmm.icnet.uk/servers/3djigsaw/ • Modeller (A. Sali) • http://salilab.org/modeller/modeller.html • MODBASE (A. Sali) • http://alto.compbio.ucsf.edu/modbase-cgi/index.cgi

  33. spdbv (Swiss-PdbViewer) • [Figure: result]

  34. Final test • The model must explain the experimental data (e.g. differences between the unknown sequence and the templates) and be useful for understanding function.
