Protein Structure Prediction Matthew Betts Russell Group, University of Heidelberg, Germany

Presentation Transcript


  1. Protein Structure Prediction Matthew Betts Russell Group, University of Heidelberg, Germany www.russelllab.org

  2. Function: Active/inactive? Binds/does not bind? Substrate specificity? (Diagram levels: Function, Structure, Sequence)

  3. What is this about? • What we do to find out what a protein might be doing • Looking at sequences, with a particular emphasis on finding out something about the protein structure • Some background for practical work

  4. Given a sequence, what should you look for? • Functional domains (Pfam, SMART, COGs, CDD, etc.) • Intrinsic features: signal peptides and transit peptides (SignalP); transmembrane segments (TMpred, etc.); coiled-coils (COILS server); low-complexity regions and disorder (e.g. SEG, DisEMBL) • Hints about structure?
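To make the idea of a low-complexity region concrete, here is a minimal sketch that slides a window along a sequence and flags windows whose residue composition has low Shannon entropy. It is not the SEG algorithm itself; the function names, the window size and the entropy cutoff are assumptions chosen for illustration.

```python
import math
from collections import Counter


def window_entropy(window: str) -> float:
    """Shannon entropy (in bits) of the residue composition of one window."""
    counts = Counter(window)
    n = len(window)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def low_complexity_windows(seq: str, size: int = 12, cutoff: float = 2.2):
    """Return (start, entropy) for each window with entropy <= cutoff.

    size and cutoff are illustrative defaults, not published parameters.
    """
    hits = []
    for start in range(len(seq) - size + 1):
        h = window_entropy(seq[start:start + size])
        if h <= cutoff:
            hits.append((start, h))
    return hits


if __name__ == "__main__":
    # A poly-Q stretch embedded in a more varied sequence (made-up example).
    example = "MKTAYIAKQRQQQQQQQQQQQQQQGHEDLWNPVS"
    for start, h in low_complexity_windows(example):
        print(start, round(h, 2))
```

A window made of a single repeated residue has entropy 0, while a window of 12 different residues has entropy log2(12), about 3.6 bits, so the cutoff sits between the two extremes.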

  5. Given a sequence, what should you look for? SMART domain ‘bubblegram’ for human fibroblast growth factor (FGF) receptor 1 (type P11362 into web site: smart.embl.de). Labelled features: • Signal peptide (secreted or membrane attached) • Immunoglobulin domains (bind ligands?) • Transmembrane segment (crosses the membrane) • Tyrosine kinase (phosphorylates Tyr) • “Low sequence complexity” (Linker regions? Flexible? Junk?)

  6. What about structure? • Intrinsic features generally mean trouble for structure determination, so they are usually skipped • The knock-on effect is that structures for large, flexible multi-domain proteins are rare • Structure determination/prediction is therefore typically restricted to parts (with exceptions, obviously)

  7. Structure prediction (diagram: Sequence, then algorithm, then Structure)

  8. Best predictions are by homology • Is your sequence homologous to a known structure? • If yes, then often very good models of structure can be constructed. • This is what we will do in the practical

  9. Homology Modelling algorithm +

  10. Homology Modelling Steps • Identify a homologue of known structure • Get the best alignment of your sequence to the structure • Model building: side-chain replacement; loop building; optimisation/relaxation/minimisation
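The quality of the alignment in the second step is usually summarised by its length and percentage sequence identity. As a minimal sketch, assuming the alignment is given as two equal-length gapped strings with "-" for gaps (the function name and input convention are illustrative, not taken from any particular modelling package), both numbers can be computed like this:

```python
def percent_identity(aligned_a: str, aligned_b: str) -> tuple[float, int]:
    """Return (percent identity, number of aligned columns).

    Columns where either sequence has a gap ('-') are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    aligned = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # gap column: not counted as aligned
        aligned += 1
        if a == b:
            identical += 1
    if aligned == 0:
        return 0.0, 0
    return 100.0 * identical / aligned, aligned


if __name__ == "__main__":
    # Made-up target/template alignment.
    target = "MKT-AYIAKQR"
    template = "MRTGAY-AKHR"
    print(percent_identity(target, template))
```

Other definitions of identity exist (for example dividing by the length of the shorter sequence), so the denominator used here is a design choice rather than a standard.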

  11. Problems with loops: two subtilisin-like serine proteases

  12. Sanchez et al, Nature Struct. Biol. (Suppl), 7, 986-990, 2000

  13. The Twilight Zone • Sander & Schneider (EMBL, ca. 1990) • Compared all known structures to each other using sequence comparison. • For each fragment of a particular length & sequence identity, simply asked the question: is the structure similar or different? • The line to the right is where one can be 90% confident that an alignment of a particular length & sequence identity implies a similar structure. • Below the line, structures can be either similar or different: the twilight zone. • (Basis for much of the sequence alignment statistics now in use.) Based on Sander & Schneider, Proteins, 9, 56, 1991
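The threshold line can be written as a simple function of alignment length. The sketch below uses the form commonly quoted for the Sander & Schneider curve, t(L) = 290.15 · L^(-0.562) percent identity for alignments of 10 to 80 residues and a flat 24.8% for longer ones. The constants are given here as an assumption and are worth checking against the 1991 paper; the function names are illustrative.

```python
def hssp_threshold(length: int) -> float:
    """Percent identity above which structural similarity is expected.

    Constants are the commonly quoted ones for the Sander & Schneider
    curve (an assumption here; verify against the original paper).
    """
    if length < 10:
        raise ValueError("the curve is not defined for alignments shorter than 10")
    if length > 80:
        return 24.8
    return 290.15 * length ** -0.562


def above_twilight_zone(length: int, identity_percent: float) -> bool:
    """True if an alignment lies on or above the threshold line."""
    return identity_percent >= hssp_threshold(length)


if __name__ == "__main__":
    for length, pid in [(30, 50.0), (30, 30.0), (150, 30.0), (150, 20.0)]:
        print(length, pid, above_twilight_zone(length, pid))
```

Short alignments need much higher identity than long ones to say anything about structure, which is why both length and identity have to be considered together.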

  14. Similar structures within the twilight zone (sequence identities of the examples shown: 80%, 8.8%, 4.4%) …can we find these similarities without known structures if sequence searches fail? Russell et al, J. Mol. Biol., 1997

  15. Fold Recognition (‘Threading’) >C562_RHOSH TQEPGYTRLQITLHWAIAGL… Does the sequence “fit” on any of a library of known 3D structures?

  16. Fold Recognition (‘Threading’) Jones, Taylor, Thornton, Nature, 358, 86-89, 1992.

  17. Residue pair potentials (figure: contacting residue pairs drawn from Phe, Asp and Arg, each labelled GOOD or BAD)
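A pair potential scores how favourable it is for two residue types to be in contact, and a threading score sums these values over the contacts of a candidate fold. The following is a minimal sketch with made-up energies (negative is favourable); the table, the template names and the assumption that the sequence is placed on the template without gaps are all illustrative simplifications, not the method of any published threading program.

```python
# Hypothetical pair energies, keyed by the unordered pair of residue types.
TOY_POTENTIAL = {
    frozenset(["F"]): -1.0,       # Phe-Phe: hydrophobic contact, favourable
    frozenset(["D", "R"]): -1.0,  # Asp-Arg: opposite charges, favourable
    frozenset(["D", "F"]): 1.0,   # Asp-Phe: charged next to hydrophobic
    frozenset(["D"]): 1.0,        # Asp-Asp: like charges
}


def threading_score(seq, contacts, potential=TOY_POTENTIAL, default=0.0):
    """Sum pair energies over template contacts given as (i, j) positions."""
    return sum(
        potential.get(frozenset((seq[i], seq[j])), default)
        for i, j in contacts
    )


def best_template(seq, templates):
    """Name of the template whose contacts give the lowest (best) score."""
    return min(templates, key=lambda name: threading_score(seq, templates[name]))


if __name__ == "__main__":
    # Two made-up folds described only by their contact lists.
    templates = {"fold_1": [(0, 2)], "fold_2": [(0, 1)]}
    print(best_template("FDF", templates))
```

Real potentials are derived from statistics over known structures and usually depend on distance as well as residue type.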

  18. Fold Recognition Executive Summary • Works some of the time • Probably best at identifying distant homologues, where sequence identity is in the twilight zone • Useful sites: • 3D-PSSM, FUGUE, (Gen)-Threader • Meta predictions are the best - combine all and get a consensus • E.g. bioinfo.pl/meta
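The consensus idea can be sketched as a majority vote over the top-ranked fold reported by each method. The server results below are invented for illustration, and the function name and the majority cutoff are assumptions; real meta servers combine scores in more elaborate ways.

```python
from collections import Counter


def consensus_fold(top_hits: dict[str, str], min_fraction: float = 0.5):
    """Return the fold named by more than min_fraction of methods, else None."""
    if not top_hits:
        return None
    fold, votes = Counter(top_hits.values()).most_common(1)[0]
    return fold if votes / len(top_hits) > min_fraction else None


if __name__ == "__main__":
    # Made-up top hits from three fold-recognition methods.
    hits = {
        "3D-PSSM": "four-helix bundle",
        "FUGUE": "four-helix bundle",
        "GenTHREADER": "TIM barrel",
    }
    print(consensus_fold(hits))
```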

  19. If no homology… • Is your sequence homologous to a known structure? • If no then actual models are less accurate, but structural insights still possible • First, secondary structure prediction

  20. Secondary-structure prediction algorithm • Neural networks • Inductive logic programming • Spin-glass theory • Human intuition

  21. Secondary-structure prediction, e.g. Chou & Fasman, 1974 • Helix forming: Glu, Ala, Leu • Helix breaking: Pro, Gly • Strand forming: Met, Val, Ile • Strand breaking: Glu, Lys, Ser, His, Asn • Etc. Numerical approach + simple protocol = prediction of secondary structure. Claimed accuracy was “80%”; in reality it is 50-60%. The method was tested on the same proteins used to derive the parameters… a big no-no.
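The flavour of a propensity-based method can be shown with a toy predictor that uses only the residue classes listed on this slide. Each former scores +1 and each breaker -1 (these values are assumptions, not the published Chou & Fasman parameters), scores are averaged over a short window, and the state with the higher positive average is assigned. This is a simplified illustration, not the Chou & Fasman protocol.

```python
# Toy propensities: +1 for formers, -1 for breakers (illustrative values only).
HELIX = {"E": 1, "A": 1, "L": 1, "P": -1, "G": -1}
STRAND = {"M": 1, "V": 1, "I": 1, "E": -1, "K": -1, "S": -1, "H": -1, "N": -1}


def predict_ss(seq: str, half_window: int = 2) -> str:
    """Assign H (helix), E (strand) or C (coil) to each residue."""
    states = []
    for i in range(len(seq)):
        # Window is truncated at the sequence ends.
        window = seq[max(0, i - half_window): i + half_window + 1]
        helix = sum(HELIX.get(r, 0) for r in window) / len(window)
        strand = sum(STRAND.get(r, 0) for r in window) / len(window)
        if helix > 0 and helix >= strand:
            states.append("H")
        elif strand > 0:
            states.append("E")
        else:
            states.append("C")
    return "".join(states)


if __name__ == "__main__":
    print(predict_ss("EALEALEALPGVIVIVIV"))
```

Any honest accuracy estimate for such a method has to come from proteins that were not used to choose the parameters, which is the point made on the slide.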

  22. Homologous proteins add a lot of information: 70% accuracy! (Figure: secondary-structure prediction from aligned homologues)

  23. What about de novo or ab initio prediction? • Can you simulate folding using physics to predict the structure of a protein? • No, not usually. • However, advances have been made… • David Baker, co-workers and subsequent followers: fragment-based structure prediction. This is de novo, not ab initio.

  24. Predicting Fragments Preferences learned from all stretches with a similar structure

  25. Assembling Fragments • Database of structures • Fragments matching the target sequence • Assembly of fragments • Selection of best model
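The fragment-selection step can be sketched as follows: for every window of the target sequence, rank library fragments by how many residues they share with the window and keep the best few. The library here is a made-up list of (sequence, secondary structure) pairs and the function names are hypothetical; real methods use profile similarity and predicted secondary structure, and the assembly and model-selection steps are not shown.

```python
def identity(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings agree."""
    return sum(x == y for x, y in zip(a, b))


def pick_fragments(target: str, library, size: int = 3, top: int = 2):
    """Map each window start to its best-matching library fragments.

    library is a list of (sequence, secondary_structure) tuples.
    """
    candidates = [frag for frag in library if len(frag[0]) == size]
    picks = {}
    for start in range(len(target) - size + 1):
        window = target[start:start + size]
        ranked = sorted(
            candidates,
            key=lambda frag: identity(window, frag[0]),
            reverse=True,
        )
        picks[start] = ranked[:top]
    return picks


if __name__ == "__main__":
    # Toy fragment library: sequence plus the structure it adopted.
    library = [("ALE", "HHH"), ("VIV", "EEE"), ("GPG", "CCC")]
    for start, frags in pick_fragments("ALEV", library, top=1).items():
        print(start, frags)
```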

  26. The Prediction Irony • General trend: increasing accuracy is more a function of data than algorithms • In other words: as we know more structure, and indeed even sequence data, we get better at predicting • Probably we will have a perfect algorithm for protein structure prediction when we know all of the answers • Structural genomics & the generally increased pace of structure predictions means there aren’t many really “new” structures anymore

  27. Things to Remember • Methods have mostly been developed for soluble, globular proteins or domains • Problems with membrane proteins, low-complexity, etc. • Many segments in proteins should be studied with other methods: • Signal peptides • TM regions • Coiled-coils • Intrinsic Disorder (e.g. http://dis.embl.de)

  28. What we use this for…

  29. We aim to: • Understand molecular interactions • Predict molecular interactions • Focus on those interactions of biomedical importance • Apply tools to large datasets • Use interaction networks predictively: to predict new interactions; to predict other details like pathologies, toxicities

  30. Modelling or predicting interactions by homology • Your favourite protein (N to C): match to known structure • Your second favourite protein (N to C): match to known structure • Templates in contact? • Modelled interaction (example shown: Histidyl adenylate tRNA Synthetase)
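The "templates in contact?" question can be reduced to a distance check between the atoms of the two template chains. This is a minimal sketch that takes coordinates as plain (x, y, z) tuples; the 5 Å cutoff, the minimum number of close pairs and the function name are assumptions for illustration, and parsing coordinates from a structure file is left out.

```python
import math


def chains_in_contact(chain_a, chain_b, cutoff=5.0, min_pairs=1):
    """True if at least min_pairs atom pairs lie within cutoff (in Å).

    chain_a and chain_b are sequences of (x, y, z) coordinates.
    """
    close = sum(
        1
        for p in chain_a
        for q in chain_b
        if math.dist(p, q) <= cutoff
    )
    return close >= min_pairs


if __name__ == "__main__":
    # Made-up coordinates for two short chains.
    a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
    b = [(3.8, 4.0, 0.0), (20.0, 0.0, 0.0)]
    print(chains_in_contact(a, b))
```

The all-against-all loop is quadratic in the number of atoms, which is fine for a sketch; for whole structures a spatial index would be the usual choice.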

  31. Prediction of Structures of Complexes (figure: two-hybrid network; homology (e.g. BLAST) to known structures; five-component complex; X-ray + Electron Microscopy & Mass Spectrometry). Russell et al, Curr. Opin. Struct. Biol. 2004; Aloy & Russell, Nature Rev. Mol. Cell. Biol. 2006; Taverner et al, Acc. Chem. Res. 2008

  32. Adding Mechanisms to Interaction Networks • Who interacts with whom? • What does the interaction look like? • How strong? How fast? • Which piece from which protein? (Figure labels: Gα/q, RGS-4, P, Gα/i, RGS-3)

  33. Bridging the information gap: modelled complexes. Aloy & Russell, Nature Rev. Mol. Cell. Biol., 2006.

  34. From Proteomics to Cellular Anatomy? Kuehner et al, Science, 2010

  35. From Proteomics to Cellular Anatomy? Kuehner et al, Science, 2010

  36. Some Links • www.russelllab.org/aas: Guide to the amino acids • www.russelllab.org/gtsp: Guide to Structure Prediction • meta.bioinfo.pl: Meta server (runs virtually all reliable prediction methods)

  37. Structure Prediction Practical: www.russelllab.org/wiki. In groups of two or more you will attempt to answer functional questions about a particular protein target. (Diagram levels: Function, Structure, Sequence. Function: Active/inactive? Binds/does not bind? Substrate specificity?)

  38. Acknowledgements (www.russelllab.org) • Current group members: Rob Russell (the boss), Matthew Betts, Leonardo Trabuco, Oliver Wichmann, Mathias Utz, Yvonne Lara • Alumni: Chad Davis, Olga Kalinina, Ricardo de la Vega, Victor Neduva, Evangelia Petsalaki, Damien Devos • Complex modeling & interactions collaborators: Patrick Aloy (IRB Barcelona), Anne-Claude Gavin (EMBL Heidelberg), Peer Bork (EMBL Heidelberg), Luis Serrano (CRG Barcelona), Achilleas Frangakis (Uni Frankfurt), Bettina Boettcher (Edinburgh)
