
Bioinformatics: Applications

Bioinformatics: Applications. ZOO 4903 Fall 2006, MW 10:30-11:45 Sutton Hall, Room 312 Jonathan Wren Protein-Protein Interaction Networks. Lecture overview. What we’ve talked about so far Proteins & their domains Protein 3D structure Overview Proteins do not function in a vacuum



Presentation Transcript


  1. Bioinformatics: Applications ZOO 4903 Fall 2006, MW 10:30-11:45 Sutton Hall, Room 312 Jonathan Wren Protein-Protein Interaction Networks

  2. Lecture overview • What we’ve talked about so far • Proteins & their domains • Protein 3D structure • Overview • Proteins do not function in a vacuum • Methods of detecting protein-protein interactions (PPI) • Structure and types of networks • Behavior of networks

  3. Cells are crowded places! Hoppert & Mayer, 1999, Prokaryotes. Am. Sci. 87:518

  4. Importance of protein-protein interactions • Many cellular processes are regulated by multiprotein complexes • Distortions of protein interactions can cause diseases • Protein function can be predicted by knowing functions of interacting partners (“guilt by association”) [Figure: a comparison of sequence (GenBank) and protein-protein interaction (DIP database) data. Adapted from S. Fields, FEBS, 2005]
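
A minimal sketch of "guilt by association": an unannotated protein is assigned the most common function among its interaction partners. The protein names, the annotations, and the `predict_function` helper are hypothetical and chosen only for illustration; real predictors weight the evidence far more carefully.

```python
from collections import Counter

def predict_function(protein, interactions, annotations):
    """Return the most common annotated function among a protein's partners."""
    partners = set()
    for a, b in interactions:
        if a == protein:
            partners.add(b)
        elif b == protein:
            partners.add(a)
    # Only partners with a known function get a vote.
    votes = Counter(annotations[p] for p in partners if p in annotations)
    if not votes:
        return None
    return votes.most_common(1)[0][0]

# Hypothetical toy data: X is unannotated and interacts with A, B and C.
interactions = [("X", "A"), ("X", "B"), ("C", "X"), ("A", "B")]
annotations = {"A": "DNA repair", "B": "DNA repair", "C": "transport"}
prediction = predict_function("X", interactions, annotations)
```

Here two of the three partners of X share one annotation, so that annotation wins the vote.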

  5. Types of protein-protein interactions (PPI) • Obligate PPI: usually permanent; the protomers are not found as stable structures on their own in vivo (example: obligate heterodimer, human cathepsin D) • Non-obligate PPI, stable (many enzyme-inhibitor complexes): dissociation constant Kd = [A][B] / [AB] of 10^-7 to 10^-13 M (example: non-obligate permanent heterodimer, thrombin and rhodniin inhibitor) • Non-obligate PPI, transient, weak (electron transport complexes): Kd mM-µM; the interaction is broken and formed continuously (example: non-obligate transient homodimer, sperm lysin) • Non-obligate PPI, transient, intermediate (antibody-antigen, TCR-MHC-peptide, signal transduction PPI): Kd µM-nM • Non-obligate PPI, transient, strong (requires a molecular trigger to shift the oligomeric equilibrium): Kd nM-fM (example: bovine G protein dissociates into Gα and Gβγ subunits upon GTP binding, but forms a stable trimer upon GDP binding)
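
To make the dissociation constant concrete, the sketch below computes Kd from equilibrium concentrations and the fraction of A that is bound at a given free concentration of B, which follows from the definition Kd = [A][B] / [AB] as [B] / (Kd + [B]). The function names and the example concentrations are illustrative assumptions, not values from the lecture.

```python
def dissociation_constant(conc_a, conc_b, conc_ab):
    """Kd = [A][B] / [AB], with all concentrations in mol/L at equilibrium."""
    return conc_a * conc_b / conc_ab

def fraction_bound(free_b, kd):
    """Fraction of A found in the AB complex at a given free [B]."""
    return free_b / (kd + free_b)

# Hypothetical comparison at 1 µM free partner concentration.
free_b = 1e-6
weak = fraction_bound(free_b, kd=1e-3)    # mM-range Kd: almost nothing bound
strong = fraction_bound(free_b, kd=1e-9)  # nM-range Kd: almost everything bound
```

When the free partner concentration equals Kd, exactly half of A is bound, which is why Kd is a convenient single-number measure of interaction strength.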

  6. Multiple interactions: Guanine-nucleotide binding protein Adapted from Vetter & Wittinghofer, Science 2001

  7. Multiple interactions: Guanine-nucleotide binding protein Question: How conserved are the interactive vs non-interactive portions of this protein? Adapted from Vetter & Wittinghofer, Science 2001

  8. Protein evolution - gene duplication [Figure: a pair of duplicated proteins and their shared interactions, shown right after duplication and over time]

  9. Methods of identifying PPIs • Experimental • Protein-protein arrays • Y2H assay • TAP assay • Computational/Inferential • Interolog analysis • Co-localization, co-expression • Correlated mutations • Text-mining

  10. Interologs • Homolog • Common ancestors • Common 3D structure • Common active sites • Ortholog • Derived from Speciation • Paralog • Derived from Duplication • Interolog • Conserved Protein-Protein Interaction Thus, finding one PPI may yield dividends!
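
A minimal sketch of interolog analysis, assuming a one-to-one ortholog table: an interaction observed in a source organism is transferred to a target organism when both partners have orthologs there. The identifiers and the `predict_interologs` helper are hypothetical.

```python
def predict_interologs(source_ppis, orthologs):
    """Map each source interaction onto the target organism via orthologs."""
    predicted = set()
    for a, b in source_ppis:
        # Both partners need a known ortholog for the interaction to transfer.
        if a in orthologs and b in orthologs:
            predicted.add(frozenset((orthologs[a], orthologs[b])))
    return predicted

# Hypothetical data: two interactions in organism 1, two known orthologs.
source_ppis = [("org1_A", "org1_B"), ("org1_B", "org1_C")]
orthologs = {"org1_A": "org2_A", "org1_B": "org2_B"}
predicted = predict_interologs(source_ppis, orthologs)
```

Only the first interaction is transferred, because org1_C has no ortholog in the table. Such predictions are hypotheses to be tested, not confirmed interactions.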

  11. Protein Arrays H Zhu et al (2000) “Analysis of yeast protein kinases using protein chips” Nature Genetics 26: 283-289

  12. The Two-Hybrid System • Two hybrid proteins are generated with transcription factor domains: the bait protein is fused to the DNA-binding domain and the prey protein to the activation domain • Both fusions are expressed in a yeast cell that carries a reporter gene whose expression is under the control of binding sites for the DNA-binding domain [Figure: bait protein fused to the binding domain, prey protein fused to the activation domain, and the reporter gene]

  13. The Two-Hybrid System • Interaction of bait and prey proteins localizes the activation domain to the reporter gene, thus activating transcription. • Since the reporter gene typically codes for a survival factor, yeast colonies will grow only when an interaction occurs. [Figure: the bait-prey interaction brings the activation domain to the reporter gene, which is transcribed into reporter mRNA]

  14. Genome-wide analysis by Y2H • Matrix approach: a matrix of prey clones is added to the matrix of bait clones. Diploids where X and Y interact are selected based on the expression of a reporter gene. • Library approach: one bait X is screened against an entire library. Positives are selected based on their ability to grow on specific substrates. • Uetz et al, Nature 2000: 957 putative interactions in yeast • Rain et al, Nature 2001: 1,200 putative interactions in H. pylori • Ho et al, Nature 2002: 3,617 putative interactions in yeast (mass spectrometry) Adapted from B. Causier, Mass Spectrometry Reviews, 2004

  15. Advantages of Y2H • In vivo technique, a good approximation of processes that occur in higher eukaryotes. • Transient interactions can be detected, and the affinity of an interaction can be estimated. • Can be used to detect potential interactions of genes not yet observed to be translated into proteins (e.g. rarely expressed) or novel constructs (e.g. therapeutics) • Relatively fast and efficient.

  16. Disadvantages of Y2H • Fusion of a protein into chimeras can change the structure of a target • Protein interactions can be different in yeast and in the organisms the genes came from • It is difficult to target extracellular proteins • It is hard to detect interactions between proteins that are active only in a complex • Proteins that interact in two-hybrid experiments may never interact in vivo

  17. Tandem affinity purification method (TAP) • The target protein ORF is fused with the DNA sequence encoding the TAP tag; • Tagged ORFs are expressed in yeast cells and form native complexes; • The complexes are purified by the TAP method; • Components of each complex are identified by gel electrophoresis or MS.

  18. Tandem affinity purification method (TAP) • The TAP tag consists of two IgG-binding domains of Staphylococcus protein A and a calmodulin-binding peptide • 7,123 interactions can be clustered into 547 complexes (Krogan et al, 2006) O. Puig et al, Methods, 2001

  19. Differences and similarities between Y2H and MS-TAP • TAP permits protein complexes to be isolated, but cannot detect weak/transient PPIs • Both methods generate a lot of false positives; only ~50% of interactions are biologically significant • Y2H is an in vivo technique • MS can detect large stable complexes and networks of interactions

  20. Text Mining • Searching Medline or PubMed for words or word combinations • Co-occurrence of terms is the simplest metric, yet leads to a higher false-positive (FP) rate • NLP methods are more specific (e.g., “X binds to Y”; “X interacts with Y”; “X associates with Y” etc.), yet such patterns are harder to detect, so these methods have a higher false-negative (FN) rate • Normally requires a list of known gene names or protein names for a given organism
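
The co-occurrence metric can be sketched as follows: count how many abstracts mention each pair of names from a known list. The abstracts, the protein names, and the `cooccurrence_counts` helper are invented for illustration; a real pipeline would query Medline or PubMed and handle synonyms.

```python
import re
from collections import Counter
from itertools import combinations

def cooccurrence_counts(abstracts, names):
    """Count abstracts in which each pair of known names appears together."""
    counts = Counter()
    for text in abstracts:
        tokens = set(re.findall(r"[A-Za-z0-9]+", text))
        found = sorted(name for name in names if name in tokens)
        for pair in combinations(found, 2):
            counts[pair] += 1
    return counts

# Invented abstracts: only the first one states an actual interaction.
abstracts = [
    "ProtA binds to ProtB in vitro.",
    "ProtA and ProtB were both measured in this screen.",
    "ProtC was not found to associate with ProtA.",
]
names = {"ProtA", "ProtB", "ProtC"}
counts = cooccurrence_counts(abstracts, names)
```

The second and third abstracts add to the counts without reporting an interaction, which illustrates why co-occurrence alone produces false positives.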

  21. Pre-BIND • Used a support vector machine (SVM) to scan the literature for PPIs • Precision, accuracy and recall of 92% for correctly classifying PPI abstracts • Estimated to capture 60% of all abstracted protein interactions for a given organism Donaldson et al. BMC Bioinformatics 2003 4:11
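
For reference, the three reported measures are defined from the counts of true and false positives and negatives. The sketch below uses hypothetical counts chosen so that all three come out at 0.92; they are not the counts from the Pre-BIND evaluation.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (precision, recall, accuracy) from confusion-matrix counts."""
    precision = tp / (tp + fp)   # fraction of predicted PPI abstracts that are correct
    recall = tp / (tp + fn)      # fraction of true PPI abstracts that were found
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return precision, recall, accuracy

# Hypothetical counts for a balanced test set of 200 abstracts.
precision, recall, accuracy = classification_metrics(tp=92, fp=8, tn=92, fn=8)
```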

  22. Drosophila interaction map From: A Protein Interaction Map of Drosophila, Giot et al. Science 302, 1727-1736 (2003)

  23. Comparing large-scale data of protein-protein interactions • All methods except Y2H and synthetic lethality are biased toward abundant proteins. • PPI data sets are biased toward certain cellular localizations. • Evolutionarily conserved proteins have much better coverage in Y2H than proteins restricted to a certain organism. Von Mering et al, Nature, 2002

  24. Functional organization of yeast proteome: network of protein complexes • Essential gene products are more likely to interact with essential rather than nonessential proteins • Orthologous proteins interact with complexes enriched with orthologs Gavin et al, Nature, 2002

  25. PPI Databases online • DIP: http://dip.doe-mbi.ucla.edu/ • MIPS (small scale): http://mips.gsf.de/proj/ppi/ • BIND (PPI, Prot-DNA, Prot-SM): http://www.bind.ca (now owned by Unleashed) • OPHID (predicted interactions): http://ophid.utoronto.ca/ophid/ • MINT - Molecular Interactions Database: http://mint.bio.uniroma2.it/mint/Welcome.do • IntAct (EBI): http://www.ebi.ac.uk/intact/site/ • InterDom (domain interactions): http://interdom.lit.org.sg/ • STRING (EMBL): http://string.embl.de/

  26. Interaction databases [Table legend] Types: Experiment (E), Structure detail (S), Predicted Physical (P), Functional (F), Curated (C), Homology modeling (H). *International Molecular Exchange (IMEx) consortium

  27. Comparing the DBs • High FP rate in high-throughput experiments • Disagreement between benchmark sets • Experimental PPI data is sparse relative to all PPIs, so dataset overlap is small and interactions are hard to confirm with multiple sources
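
Dataset overlap can be quantified with the Jaccard index, treating each interaction as an unordered pair so that (A, B) and (B, A) count as the same interaction. The choice of Jaccard as the measure, the helper names, and the toy datasets are assumptions made for illustration.

```python
def normalize(pairs):
    """Represent each interaction as an unordered pair."""
    return {frozenset(pair) for pair in pairs}

def jaccard_overlap(dataset_1, dataset_2):
    """Shared interactions divided by all interactions seen in either dataset."""
    a, b = normalize(dataset_1), normalize(dataset_2)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)

# Hypothetical datasets: one shared interaction out of four distinct ones.
dataset_1 = [("A", "B"), ("B", "C"), ("C", "D")]
dataset_2 = [("B", "A"), ("D", "E")]
overlap = jaccard_overlap(dataset_1, dataset_2)
```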

  28. PPI network properties Nodes & connections

  29. Characteristics of networks • n = nodes, k = connections or “edges” • In biology, n refers to genes/proteins (and/or metabolites) while k refers to interactions [Figure: example network with nodes labelled k = 2, k = 2, k = 3 and k = 1]
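
As a small sketch, the number of connections per node can be counted directly from an edge list. The four-node network below is a hypothetical example constructed so that its nodes have 2, 2, 3 and 1 connections, the same values labelled on this slide; the node names are arbitrary.

```python
from collections import defaultdict

def degrees(edges):
    """Count the number of connections (k) for each node in an undirected network."""
    degree = defaultdict(int)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return dict(degree)

# Hypothetical network: a triangle A-B-C with D attached to C.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
node_degrees = degrees(edges)
```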

  30. Examples of networks: Proximity-based interactions

  31. Examples of networks: Distant interactions

  32. Elementary features: node (n) diversity and dynamics

  33. Elementary features: edge (k) diversity and dynamics

  34. Elementary features: Network Evolution

  35. Network properties • Network Structure Metrics • Average path length • Degree distribution (connectivity) • Clustering coefficient • Network Structure Types • Regular • Random • Small-world • Scale-free

  36. Structural metrics: Path length & network diameter
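
A minimal sketch of both metrics for a connected, undirected network, using breadth-first search to find shortest paths. The average path length is the mean shortest-path length over all node pairs, and the diameter is the longest of those shortest paths. The toy network and function names are hypothetical.

```python
from collections import deque

def shortest_path_lengths(adjacency, source):
    """Breadth-first search: number of edges on the shortest path to each node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def path_metrics(adjacency):
    """Return (average path length, diameter); assumes a connected network."""
    lengths = []
    for source in adjacency:
        dist = shortest_path_lengths(adjacency, source)
        lengths.extend(d for target, d in dist.items() if target != source)
    return sum(lengths) / len(lengths), max(lengths)

# Hypothetical network: a triangle A-B-C with D attached to C.
adjacency = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
average_length, diameter = path_metrics(adjacency)
```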

  37. Structural Metrics: Degree distribution (connectivity)

  38. Structural Metrics: Clustering coefficient
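
The local clustering coefficient of a node is commonly defined as the fraction of possible links among its neighbors that actually exist, C = 2e / (k(k - 1)), where k is the number of neighbors and e the number of links between them. The sketch below uses that standard definition on a hypothetical toy network; treating nodes with fewer than two neighbors as having C = 0 is a convention assumed here.

```python
from itertools import combinations

def clustering_coefficient(adjacency, node):
    """Fraction of neighbor pairs of `node` that are themselves connected."""
    neighbors = adjacency[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # convention assumed here: undefined cases count as 0
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adjacency[u])
    return 2 * links / (k * (k - 1))

# Hypothetical network: a triangle A-B-C with D attached to C.
adjacency = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
coefficients = {node: clustering_coefficient(adjacency, node) for node in adjacency}
```

Node A sits in a closed triangle and scores 1, while C scores 1/3 because only one of its three neighbor pairs is linked.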

  39. Network properties • Network Metrics • Average path length • Degree distribution (connectivity) • Clustering coefficient • Network Structures • Regular • Random • Small-world • Scale-free

  40. Regular networks – fully connected

  41. Regular networks – Lattice

  42. Regular networks – Lattice: ring world

  43. Random networks

  44. Random Networks

  45. Small-world networks

  46. Exponential network degree distribution

  47. Scale-free networks • New nodes preferentially attach to highly connected ones • Term coined by A.L. Barabasi in 1998

  48. Different network models: Barabasi-Albert model of preferential attachment • At each step, a new node is added to the graph. • The new node is attached to one of the old nodes with probability proportional to that node's degree. • The resulting degree distribution is a power law, which appears as a straight line on a plot of ln(P(k)) versus ln(k). Barabasi & Albert, Science, 1999
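
The two steps of the model can be sketched in a few lines. This is an illustrative simulation, not the authors' code: it assumes the network starts from two connected nodes and that each new node adds exactly one edge. The function name, the seed, and the network size are arbitrary choices.

```python
import random
from collections import Counter

def preferential_attachment(n_nodes, seed=0):
    """Grow a network in which each new node links to one existing node,
    chosen with probability proportional to that node's degree."""
    rng = random.Random(seed)
    edges = [(0, 1)]
    # Every node appears in this list once per edge end, so a uniform draw
    # from it selects a node with probability proportional to its degree.
    endpoints = [0, 1]
    for new_node in range(2, n_nodes):
        target = rng.choice(endpoints)
        edges.append((new_node, target))
        endpoints.extend((new_node, target))
    return edges

edges = preferential_attachment(1000)
degree = Counter(node for edge in edges for node in edge)
# Number of nodes having each degree k; plot ln(count) against ln(k)
# to inspect the shape of the distribution.
distribution = Counter(degree.values())
```

Most nodes end up with a single connection while a few early nodes accumulate many, which is the hub structure that preferential attachment is meant to explain.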

  49. Properties of scale-free networks • Multiplying k by a constant does not change the shape of the distribution, hence "scale-free". From T. Przytycka • Small diameter • Tolerant to random errors (node failures) but vulnerable to targeted attacks on highly connected nodes • But: sub-networks can be scale-free while the underlying degree distribution is not.
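
The scale invariance can be checked in one line for a power-law degree distribution. The symbols C (normalization constant), γ (exponent) and a (rescaling constant) are notation introduced here for illustration.

```latex
P(k) = C\,k^{-\gamma}
\quad\Longrightarrow\quad
P(ak) = C\,(ak)^{-\gamma} = a^{-\gamma}\,C\,k^{-\gamma} = a^{-\gamma}\,P(k)
```

Rescaling k therefore only multiplies P(k) by the constant factor a^{-γ}; on a plot of ln(P(k)) versus ln(k) the line shifts vertically but keeps its slope of -γ.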

  50. Difference between scale-free and random graph models • Random networks are homogeneous: most nodes have approximately the same number of links. • Scale-free networks have a small number of highly connected vertices (hubs). Adapted from Jeong et al, Nature, 2000
