
An Introduction to Bioinformatics


duaa

Presentation Transcript


  1. An Introduction to Bioinformatics Protein Structure Prediction

  2. Aims • Understand the use of algorithms • Recognize different approaches • Understand the limitations Objectives • Predict the occurrence of aspects of structure • Select appropriate tools

  3. Introduction • Structure has several levels • 1 primary • 2 secondary • 3 tertiary • 4 quaternary

  4. 1 primary • Amino acid sequence NH2-MRLSWYDPDFQARLTRSNSKCQGQLEV YLKDGWHMVC SQSWGRSSKQWEDPSQASKVCQRLNCGVPLSLGPFLVTYTP QSSIICYGQLGSFSNCSHSRNDMCHSLGLTCLE-COOH

  5. 2 secondary • Localized organisation: α-helices and β-sheets

  6. 3 tertiary • Three-dimensional organisation

  7. 4 quaternary • Multi-protein assembly

  8. The problem… • The best way to determine a structure is experimentally, by X-ray crystallography, NMR etc. • Structure databases hold only about 10,000+ structures • Therefore programs are devised to deduce structural solutions • Complex!

  9. Secondary Structure Prediction • Signal peptides • Intracellular targeting • Trans-membrane α-helices • α-helices and β-sheets • Super-secondary structure (motifs)

  10. Signal peptides • Short N-terminal amino acid sequences • Direct to membrane • Cleaved after translocation • SignalP • Nobel Prize 1999 Günter Blobel

  11. SignalP predicts signal peptide cleavage sites • Uses only the first 50-70 residues • Uses neural networks

  12. Is the sequence a signal peptide?
     # Measure   Position   Value   Cutoff   Conclusion
     max. C      25         0.910   0.37     YES
     max. Y      25         0.861   0.34     YES
     max. S      12         0.960   0.88     YES
     mean S      1-24       0.892   0.48     YES
     # Most likely cleavage site between pos. 24 and 25: SRA-LE
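The Conclusion column above amounts to comparing each score with its cutoff. A minimal sketch of that comparison follows, using the values from slide 12. The data layout, the function name and the rule "call it a signal peptide only if every measure passes" are assumptions made for illustration, not SignalP's own code or decision rule.

```python
# Values copied from the SignalP output on slide 12.
# The tuple layout and function name are illustrative, not part of SignalP.
scores = [
    ("max. C", "25", 0.910, 0.37),
    ("max. Y", "25", 0.861, 0.34),
    ("max. S", "12", 0.960, 0.88),
    ("mean S", "1-24", 0.892, 0.48),
]


def conclusions(rows):
    """Return {measure: 'YES' or 'NO'} by comparing each value with its cutoff."""
    return {
        name: ("YES" if value > cutoff else "NO")
        for name, _position, value, cutoff in rows
    }


result = conclusions(scores)
# Assumption of this sketch: require every measure to pass.
is_signal_peptide = all(answer == "YES" for answer in result.values())
print(result, is_signal_peptide)
```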

  13. Intracellular targeting • TargetP • Predict subcellular location of eukaryotic protein • Presequences • Chloroplasts • Mitochondria • signal peptide

  14. Transmembrane Domains • Lots of programs • TMHMM • α-helices • hydrophobic • helix topology • R or K (+ve charge) on the cytoplasmic side • Hidden Markov Modelling
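To make the "hydrophobic α-helix" idea concrete, here is a minimal sliding-window hydropathy sketch. It is not TMHMM and uses no hidden Markov model; it swaps in the much simpler Kyte-Doolittle scale. The window of 19 residues and the threshold of 1.6 are the values conventionally used with that scale and are assumptions here; the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy scale (one value per standard amino acid).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full window along the sequence."""
    values = [KD[aa] for aa in seq]  # raises KeyError on non-standard letters
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def candidate_tm_windows(seq, window=19, threshold=1.6):
    """0-based start indices of windows whose mean hydropathy exceeds the threshold."""
    return [
        i for i, score in enumerate(hydropathy_profile(seq, window))
        if score > threshold
    ]


# Toy sequence: a 20-residue leucine stretch flanked by charged residues.
toy = "D" * 10 + "L" * 20 + "D" * 10
print(candidate_tm_windows(toy))
```

A profile like this only flags hydrophobic stretches; it says nothing about orientation, which is where the positive-charge rule and the HMM come in.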

  15. Paste the sequence in FASTA format, e.g. a serotonin receptor
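For readers unfamiliar with FASTA format: each record is a header line starting with ">" followed by one or more lines of sequence. A minimal parsing sketch follows; the header text is made up for illustration and the residues are the first 40 of the sequence shown on slide 18, not a serotonin receptor.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a {header: sequence} dictionary."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]          # drop the leading ">"
            records[header] = []
        elif header is not None:
            records[header].append(line)
    # Join the wrapped sequence lines of each record into one string.
    return {h: "".join(parts) for h, parts in records.items()}


# Hypothetical record: the header is invented for this example.
example = """>example_protein hypothetical record for illustration
MKFSWRTALLWSLPLLVVGF
FFWQGSFGGADANLGSNTAN
"""
print(parse_fasta(example))
```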

  16. Predicts the transmembrane domains and orientation

  17. -helices and -sheets • GOR algorithim • Assigns each residue to one conformational state of -helix, extended chain, reverse turn or coil • 64.4% accurate • Many other sites • most use multiple alignments

  18. -helices and -sheets 10 20 30 40 50 60 70 | | | | | | | MKFSWRTALLWSLPLLVVGFFFWQGSFGGADANLGSNTANTRMTYGRFLEYVDAGRITSVDLYENGRTAI cccceeeeeecccceeeeeeeeccccccccccccccccccchhhhcceeeeccccceeeeeeccccceee VQVSDPEVDRTLRSRVDLPTNAPELIARLRDSNIRLDSHPVRNNGMVWGFVGNLIFPVLLIASLFFLFRR eeccccccchhhhccccccccchhhhhhhhhccccccccceecccceeeeecccccchhhhhhhhheeec SSNMPGGPGQAMNFGKSKARFQMDAKTGVMFDDVAGIDEAKEELQEVVTFLKQPERFTAVGAKIPKGVLL cccccccccchhhhcchhhhhhhhccceeeecchhhhhhhhhhhhhhhhhhcccchhhhhcccccceeee VGPPGTGKTLLAKAIAGEAGVPFFSISGSEFVEMFVGVGASRVRDLFKKAKENAPCLIFIDEIDAVGRQR ecccccchhhhhhhhhcccccceeecccccceeeeeecccchhhhhhhhhcccccceeeecchhhhcccc GAGIGGGNDEREQTLNQLLTEMDGFEGNTGIIIIAATNRPDVLDSALMRPGRFDRQVMVDAPDYSGRKEI ccccccccchhhhhhhhhhhhhcccccccceeeeeeccccchhhhhhccccccceeeeecccccccchhh LEVHARNKKLAPEVSIDSIARRTPGFSGADLANLLNEAAILTARRRKSAITLLEIDDAVDRVVAGMEGTP hhhhhhhhccccccchhhhccccccccchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhheeecccccc LVDSKSKRLIAYHEVGHAIVGTLLKDHDPVQKVTLIPRGQAQGLTWFTPNEEQGLTTKAQLMARIAGAMG cccccccchhhhhcccceeeeeecccccccceeeecccccccceeccccccccchhhhhhhhhhhhhhhh GRAAEEEVFGDDEVTTGAGGDLQQVTEMARQMVTRFGMSNLGPISLESSGGEVFLGGGLMNRSEYSEEVA hhhhhhhcccccceeeccccchhhhhhhhhhhhhhhccccccccccccccceeeecccccccccchhhhh TRIDAQVRQLAEQGHQMARKIVQEQREVVDRLVDLLIEKETIDGEEFRQIVAEYAEVPVKEQLIPQL hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhcccccchhhhhhhhhhcccccccccccc

  19. Super-secondary Structure • Secondary structure elements combined into specific geometric arrangements known as motifs • Example shown: beta corner

  20. Super-secondary Structure Several programs/websites for specific domains, e.g. • PAIRCOIL and MULTICOIL - detect coiled-coil regions • regions separating domains • TRESPASSER - detects leucine zippers • Leu-X6-Leu-X6-Leu-X6-Leu protein interaction domain • NPS@nalysis - Helix-Turn-Helix • Protein interaction/DNA binding
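The Leu-X6-Leu-X6-Leu-X6-Leu pattern translates directly into a regular expression. The following is a minimal sketch of a plain pattern search, not the method used by TRESPASSER or any other tool named above; the function name is hypothetical, and a pattern match on its own should be treated as a first filter only.

```python
import re

# Leu-X6-Leu-X6-Leu-X6-Leu: four leucines, each separated by any six residues.
# Wrapping the pattern in a lookahead lets overlapping hits be reported too.
ZIPPER = re.compile(r"(?=(L.{6}L.{6}L.{6}L))")


def find_leucine_zippers(seq):
    """Return (1-based start, matched 22-residue segment) for every pattern hit."""
    return [(m.start() + 1, m.group(1)) for m in ZIPPER.finditer(seq)]


# Toy sequence built to contain exactly one hit, starting at residue 3.
toy = "AA" + "LAAAAAA" * 3 + "L" + "GG"
print(find_leucine_zippers(toy))
```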

  21. Integrated structure prediction • One stop shop! • PredictProtein at EBI • secondary structure • solvent accessibility • globular regions • transmembrane helices • coiled-coil regions • a multiple sequence alignment • ProSite sequence motifs • low-complexity regions • ProDom domain assignments

  22. Tertiary Structure Prediction • Homology modelling • Fold recognition • Threading • Model building

  23. Protein sequence (primary structure) → Database searching for homologues → either: Homologue of known structure → Comparative modelling → 3D-structure; or: No homologue of known structure → Fold prediction, ab initio methods etc. → 3D-structure

  24. Homology Modelling • Method of choice following a BLAST search • SWISS-MODEL is a good WWW interface • URL: http://www.expasy.ch/swissmod/SWISS-MODEL.html

  25. Homology Modelling • Requires at least one sequence of known 3D-structure with significant similarity to the target sequence • Compare the target sequence with the database using FastA and BLAST • Sequences with a FastA score 10.0 standard deviations above the mean of the random scores, or a P(N) lower than 10^-5 (BLAST), are considered for model building • Restrict to those which share at least 30% residue identity
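The two template filters above (BLAST P(N) below 10^-5 and at least 30% residue identity) can be sketched as a small check on an existing pairwise alignment. This is an illustration only: the function names are hypothetical, and identity is computed over columns where neither sequence has a gap, which is one of several conventions in use.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if a != "-" and b != "-"
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def usable_template(aligned_target, aligned_template, blast_p,
                    max_p=1e-5, min_identity=30.0):
    """Apply the thresholds from the slide: P(N) < 10^-5 and identity >= 30%."""
    identity = percent_identity(aligned_target, aligned_template)
    return blast_p < max_p and identity >= min_identity


print(percent_identity("ACDE", "ACDF"))
print(usable_template("ACDE", "ACDF", blast_p=1e-8))
```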

  26. Homology Modelling • Framework construction • compare atom positions - Cα atoms • Build non-conserved loops • Complete backbone - add other atoms • Add side chains • Refine
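One common way to compare Cα positions between two structures is the root-mean-square deviation (RMSD). The sketch below assumes the two coordinate lists are already superposed and matched residue for residue; it is not necessarily the comparison SWISS-MODEL performs, and the function name is hypothetical.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) C-alpha coordinates.

    Assumes the structures are already superposed and the lists are paired
    residue by residue; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


# Toy coordinates: every atom in the second set is shifted by 1 unit along x.
template = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
model = [(1.0, 0.0, 0.0), (4.8, 0.0, 0.0)]
print(ca_rmsd(template, model))
```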

  27. Insulin-like gene from C. elegans • Red = Insulin • Blue = ILGF1

  28. What if I have no homologue? Ab initio methods - Threading • Sequence of unknown structure • Thread it through a sequence of known structure • Move the query sequence through residue by residue and compare computationally • Include thermodynamic criteria, solvent accessibility, secondary structure information • Computing intensive

  29. http://www.cs.bgu.ac.il/~bioinbgu/form.html
