Protein-protein interactions

Protein Analysis Workshop 2010. Protein-protein interactions. Bioinformatics group, Institute of Biotechnology, University of Helsinki. Hung Ta, xuanhung.ta@helsinki.fi. Outline: Why are protein-protein interactions (PPIs) so important?


Presentation Transcript


  1. Protein Analysis Workshop 2010. Protein-protein interactions. Bioinformatics group, Institute of Biotechnology, University of Helsinki. Hung Ta, xuanhung.ta@helsinki.fi

  2. Outline
     • Why are protein-protein interactions (PPIs) so important?
     • Experimental methods (high throughput) for discovering PPIs:
       • Yeast two-hybrid
       • AP-MS
     • PPI databases: DIP, BioGRID, IntAct, HPRD…
     • Computational prediction of PPIs
       • Genomics methods
       • Biological context methods
       • Integrative methods
     • STRING (EMBL)

  3. Why are PPIs so important?
     • The gene is the basic unit of heredity. Genomes are available.
     • Proteins, the working molecules of a cell, carry out many biological activities.
     • Proteins function by interacting with other proteins, DNA, RNA and small molecules.
     • Genome, proteome, interactome.

  4. Search for drug molecules
     • The body produces a list of proteins: P1, P2, P3, … PN.
     • A pathogen (virus or bacterium) enters the body and produces its own protein, say X.
     • X interacts with one of the body's proteins, say P1, and inhibits its routine activities. Diseases emerge.
     • Introduce into the body a new molecule Y such that X is more attracted to Y than to P1, freeing P1 to get back to its routine work.
     (Figure labels: P1, P2, P3, P4, P5, PN, X, Y)

  5. Search for drug molecules
     Bringing an effective drug to market could:
     • Take 10-15 years
     • Cost up to US$800 million
     • Require testing up to 30,000 candidate molecules
     Databases of molecular interactions or linkages could help to cut down the search for drug molecules.

  6. The types of PPIs
     • Binary (physical) interactions: the binding between two proteins whose residues are in contact at some point in time.
     • Functional linkages: pairwise relationships between proteins that work together (participate in a common structural complex or pathway) to carry out biological tasks.

  7. Protein physical interactomes and functional linkage maps are available for
     • S. cerevisiae (Uetz et al. 2000; Ito et al. 2001; Ho et al. 2002; Gavin et al. 2002, 2006; Krogan et al. 2006; Tarassov et al. 2008; Yu et al. 2008)
     • E. coli (Butland et al. 2005; Arifuzzaman et al. 2006)
     • C. elegans (Li et al. 2004)
     • D. melanogaster (Giot et al. 2003)
     • Humans (Rual et al. 2005; Stelzl et al. 2005; Ewing et al. 2007)
     • …

  8. High throughput experimental methods for discovering PPIs
     • Yeast two-hybrid (Y2H)
       • Ito T. et al., 2001; Uetz P. et al., 2000; Yu H. et al., 2008
       • Rual et al. 2005; Stelzl et al. 2005
     • Affinity purification followed by mass spectrometry (AP-MS)
       • Gavin AC et al., 2002, 2006
       • Ho Y. et al., 2002
       • Krogan NJ et al., 2006

  9. Y2H experiments
     Idea:
     • Use a protein of interest as bait in order to discover proteins that physically interact with it; these are called prey.
     • A single transcription factor is split into two parts, the binding domain (BD) and the activation domain (AD). The bait protein is fused to the BD and the prey protein to the AD.
     • If the bait and prey proteins interact, the BD and AD are brought together and transcription of the reporter gene is initiated.
     • High-throughput screening tests the bait against a library of preys.

  10. AP-MS experiments
     • Fuse a TAP tag, consisting of protein A and calmodulin binding peptide (CBP) separated by a TEV protease cleavage site, to the target protein.
     • The first AP step, using an IgG matrix, eliminates many contaminants.
     • In the second AP step, CBP binds tightly to calmodulin-coated beads. After washing, which removes the remaining contaminants and the TEV protease, the bound material is released under mild conditions with EGTA.
     • Proteins are identified by mass spectrometry.

  11. AP-MS experiments
     • The MS output is a set of lists, each containing a bait protein and its co-purified partners (preys), each accompanied by a reliability score.
     • Use a scoring system combining the spoke and matrix models to generate a network of binary PPIs. Each interaction has a confidence score.
     • Eliminate low-scoring links to obtain a high-confidence network.
     • The network is partitioned into densely connected regions, which are called complexes.
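The spoke and matrix models mentioned on this slide can be illustrated with a minimal Python sketch. The purification below (bait A, preys B, C, D) and the function names are hypothetical, and the sketch only enumerates candidate pairs; it does not reproduce the confidence scoring used in the cited studies.

```python
from itertools import combinations


def spoke_pairs(bait, preys):
    """Spoke model: the bait is linked to each co-purified prey."""
    return {frozenset((bait, p)) for p in preys if p != bait}


def matrix_pairs(bait, preys):
    """Matrix model: every pair of proteins in the purification is linked."""
    members = sorted({bait, *preys})
    return {frozenset(pair) for pair in combinations(members, 2)}


# Hypothetical purification: bait A pulled down together with B, C and D.
spoke = spoke_pairs("A", ["B", "C", "D"])
matrix = matrix_pairs("A", ["B", "C", "D"])
print(len(spoke), len(matrix))
```

The spoke model yields fewer, bait-centred pairs, while the matrix model also links the preys to each other, so every spoke pair is also a matrix pair.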

  12. Computational methods of prediction
     • Comparative genomic methods
       • Gene neighbourhood
       • Gene fusion
       • Domain-based method
       • Phylogenetic
     • Integrative methods
     • Biological context methods
       • Co-expression
       • GO
       • Text mining

  13. Gene neighbourhood based method
     Proteins a and b whose genes lie close to each other in different genomes are predicted to interact.
     Dandekar, T. et al. (1998). Conservation of gene order: A fingerprint of proteins that physically interact. Trends in Biochemical Sciences, 23(9), 324–328.
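As a rough sketch of the gene neighbourhood idea, the Python snippet below counts in how many genomes two genes lie within a few positions of each other. The toy genomes, the `max_gap` threshold and the assumption that orthologous genes share one name across genomes are all illustrative, and circular chromosomes are ignored.

```python
def neighbour_genomes(gene_a, gene_b, genomes, max_gap=2):
    """Count genomes in which the two genes lie within max_gap positions."""
    count = 0
    for order in genomes.values():
        if gene_a in order and gene_b in order:
            if abs(order.index(gene_a) - order.index(gene_b)) <= max_gap:
                count += 1
    return count


# Hypothetical gene orders; each list is one linear genome.
genomes = {
    "g1": ["x", "a", "b", "y"],
    "g2": ["a", "z", "b", "x"],
    "g3": ["a", "x", "y", "z", "b"],
}
support = neighbour_genomes("a", "b", genomes)
print(support)
```

A pair supported by many genomes would be a stronger candidate than one that is adjacent in a single genome.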

  14. Gene fusion (Rosetta stone)
     Proteins a and b are predicted to interact if they are combined (fused) into one protein in another organism.
     Enright, A. J. et al. (1999). Protein interaction maps for complete genomes based on gene fusion events. Nature, 402(6757), 86–90.

  15. Domain based methods
     • Workflow: well-known experimental PPI data → AS, MLE or PE → inferred domain-domain interactions (DDIs) → interact/non-interact prediction for Protein A and Protein B.
     • AS: association; MLE: maximum likelihood estimation; PE: parsimony explanation.
     • Validation of inferred DDIs remains difficult due to the lack of sufficient and unbiased benchmark datasets.
     • The methods show limited performance at predicting PPIs.
     H.X. Ta, L. Holm, Biochem. Biophys. Res. Commun. (2009)
     (Figure labels: perl, Java, sql)
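To make the association (AS) idea concrete, here is a minimal Python sketch that scores each domain pair by the fraction of protein pairs carrying it that are known to interact. This is one common formulation of an association score and is an assumption here, not necessarily the exact definition used in the cited paper; the proteins P1-P4 and domains D1-D3 are hypothetical.

```python
from collections import Counter
from itertools import combinations, product


def association_scores(domains, interactions):
    """Score a domain pair as (interacting protein pairs with it) / (all protein pairs with it)."""
    seen, interacting = Counter(), Counter()
    for p, q in combinations(sorted(domains), 2):
        # All domain pairs formed between the two proteins (unordered).
        domain_pairs = {frozenset((a, b)) for a, b in product(domains[p], domains[q])}
        for dp in domain_pairs:
            seen[dp] += 1
            if frozenset((p, q)) in interactions:
                interacting[dp] += 1
    return {dp: interacting[dp] / seen[dp] for dp in seen}


# Hypothetical domain annotations and known interactions.
domains = {"P1": {"D1"}, "P2": {"D2"}, "P3": {"D1"}, "P4": {"D2", "D3"}}
interactions = {frozenset(("P1", "P2")), frozenset(("P3", "P4"))}
scores = association_scores(domains, interactions)
print(scores[frozenset(("D1", "D2"))])
```

In this toy data the D1-D2 pair occurs in four protein pairs, two of which interact, so it scores 0.5.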

  16. Phylogeny based methods
     Proteins a and c are predicted to interact if they have similar phylogenetic profiles.
     Pellegrini, M. et al. (1999). Assigning protein functions by comparative genome analysis: Protein phylogenetic profiles. PNAS, 96(8), 4285–4288.
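A phylogenetic profile can be written as a vector of 1s and 0s recording the presence or absence of a protein across genomes. The sketch below, in Python, pairs proteins whose profiles differ in at most `max_diff` positions; the profiles and the threshold are hypothetical, and Hamming distance is just one possible similarity measure.

```python
from itertools import combinations


def hamming(p, q):
    """Number of genomes in which the two profiles disagree."""
    return sum(x != y for x, y in zip(p, q))


def similar_pairs(profiles, max_diff=0):
    """Return protein pairs whose profiles differ in at most max_diff genomes."""
    return [
        (a, b)
        for a, b in combinations(sorted(profiles), 2)
        if hamming(profiles[a], profiles[b]) <= max_diff
    ]


# Hypothetical presence (1) / absence (0) of each protein in five genomes.
profiles = {
    "a": (1, 0, 1, 1, 0),
    "b": (0, 1, 1, 0, 1),
    "c": (1, 0, 1, 1, 0),
}
predicted = similar_pairs(profiles)
print(predicted)
```

Here only a and c share an identical profile, so they form the single predicted pair.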

  17. Biological context methods
     • Gene expression: two proteins whose genes exhibit very similar patterns of expression across multiple states or experiments may be considered candidates for functional association and possibly direct physical interaction.
     • GO (Gene Ontology) annotations: two interacting proteins are likely to share GO term annotations.
     • Text mining: extract information about interacting proteins from the literature (PubMed…): "is protein K mentioned with protein I in publications?"
     These techniques are used to validate PPIs discovered by other approaches, or are combined with other evidence in integrative approaches.
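The co-expression criterion can be sketched with a Pearson correlation between two expression vectors. The Python snippet below uses hypothetical expression values; Pearson correlation is one common choice of similarity measure and is an assumption here, since the slide does not name one.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical expression levels of three genes across four conditions.
gene1 = [1.0, 2.0, 3.0, 4.0]
gene2 = [2.0, 4.0, 6.0, 8.0]
gene3 = [4.0, 3.0, 2.0, 1.0]
r12 = pearson(gene1, gene2)  # close to +1: co-expressed
r13 = pearson(gene1, gene3)  # close to -1: anti-correlated
print(round(r12, 3), round(r13, 3))
```

Gene pairs with a correlation above a chosen cut-off would then be kept as candidate functional associations.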

  18. Integrative methods
     • Classifiers: naive Bayes, random forest, decision tree, kernels, logistic regression, support vector machines
     • Jansen R. et al., Science 2003
     • Bader J.S. et al., Nat Biotech 2004
     • Lin N. et al., BMC Bioinformatics 2004
     • Zhang L. et al., BMC Bioinformatics 2004
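As an illustration of how a naive Bayes integration can combine evidence, the Python sketch below multiplies per-evidence likelihood ratios into posterior odds, assuming the evidence sources are conditionally independent. The prior odds and likelihood ratios are made-up numbers, not values from the cited papers.

```python
from math import prod


def posterior_odds(prior_odds, likelihood_ratios):
    """Naive Bayes: posterior odds = prior odds x product of likelihood ratios."""
    return prior_odds * prod(likelihood_ratios)


def odds_to_probability(odds):
    """Convert odds to a probability."""
    return odds / (1.0 + odds)


# Hypothetical values: prior odds of interaction and two evidence sources
# (for example co-expression and shared GO annotation).
prior = 0.01
ratios = [20.0, 10.0]
odds = posterior_odds(prior, ratios)
probability = odds_to_probability(odds)
print(round(odds, 3), round(probability, 3))
```

A pair would typically be predicted to interact when the posterior odds exceed 1, although the cut-off is a design choice.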

  19. Databases of PPIs
     • DIP (http://dip.doe-mbi.ucla.edu)
       • 71,275 interactions
       • 23,200 proteins
       • 372 organisms
     • BioGRID (http://www.thebiogrid.org)
       • 247,366 non-redundant interactions
       • 31,254 unique proteins
       • 17 organisms
     • IntAct (http://www.ebi.ac.uk/intact)
       • 232,793 interactions
       • 69,335 proteins
     • MINT (http://mint.bio.uniroma2.it)
       • 89,956 interactions
       • 31,631 proteins
     • SGD (http://www.yeastgenome.org): Saccharomyces Genome Database
     • HPRD (http://www.hprd.org/)
       • 39,194 interactions
       • 30,047 proteins
     • MIPS: interactions, complexes
     • STRING: known and predicted protein-protein interactions

  20. DIP
     • Protein function
     • Protein-protein relationship
     • Evolution of protein-protein interaction
     • The network of interacting proteins
     • Unknown protein-protein interaction
     • The best interaction conditions

  21. DIP-Searching information

  22. Find information about your protein

  23. DIP Node (DIP:1143N)

  24. Graph of PPIs around DIP:1143N
     • Nodes are proteins.
     • Edges are PPIs.
     • The center node is DIP:1143N.
     • Edge width encodes the number of independent experiments identifying the interaction.
     • Core interactions are drawn in green and unverified interactions in red.
     • Click on a node or an edge to learn more about the protein or the interaction.

  25. List of interacting partners of DIP:1143N

  26. STRING: Search Tool for the Retrieval of Interacting Genes/Proteins
     • A database of known and predicted protein interactions
     • Direct (physical) and indirect (functional) associations
     • The database currently covers 2,590,259 proteins from 630 organisms
     • Derived from these sources:
     • Supported by

  27. Searching information
     Query information by protein name or protein sequence.

  28. Graph of PPIs
     • Nodes are proteins.
     • A colored line is evidence of an interaction between two proteins. The color encodes the method used to detect the interaction.
     • Click on a node to get information about the corresponding protein.
     • Click on an edge to get information about the interaction between the two proteins.

  29. List of predicted partners
     • Partners are listed with a description and a confidence score.
     • Choose different types of views to see more detail.

  30. Neighborhood View
     • The red block is the queried protein; the other blocks are its neighbors in each organism. Click on a block to obtain information about the corresponding protein.
     • Closely related organisms show similar protein neighborhood patterns.
     • Helps to find genes/proteins that lie close together in a genomic region.

  31. Occurrence View
     • Represents the phylogenetic profiles of proteins.
     • The color of a box indicates the sequence similarity between the protein and its homologous protein in the organism.
     • The size of a box shows how many members of the family have the reported sequence similarity.
     • Click on a box to see the sequence alignment.

  32. Gene Fusion View
     • This view shows the individual gene fusion events per species.
     • Two differently colored boxes next to each other indicate a fusion event.
     • Hovering over a region in a gene gives the gene name; clicking on a gene gives more detailed information.

  33. References
     • Skrabanek L, Saini HK, Bader GD, Enright AJ. Computational prediction of protein-protein interactions. Methods Mol Biol. 2004;261:445-68.
     • Shoemaker BA, Panchenko AR. Deciphering Protein–Protein Interactions. Part I. Experimental Techniques and Databases. PLoS Comput Biol 3(3): e42. doi:10.1371/journal.pcbi.0030042
     • Shoemaker BA, Panchenko AR. Deciphering Protein–Protein Interactions. Part II. Computational Methods to Predict Protein and Domain Interaction Partners. PLoS Comput Biol 3(4): e43. doi:10.1371/journal.pcbi.0030043
     • Pitre S, Alamgir M, Green JR, Dumontier M, Dehne F, Golshani A. Computational methods for predicting protein-protein interactions. Adv Biochem Eng Biotechnol. 2008;110:247-67.
     • Wodak SJ, Pu S, Vlasblom J, Séraphin B. Challenges and rewards of interaction proteomics. Mol Cell Proteomics. 2009 Jan;8(1):3-18.
     • Qi Y, Bar-Joseph Z, Klein-Seetharaman J. Evaluation of different biological data and computational classification methods for use in protein interaction prediction. Proteins: Structure, Function, and Bioinformatics. 63(3):490-500.

  34. Why protein-protein interactions (PPIs)?
     PPIs are involved in many biological processes:
     • Signal transduction
     • Protein complexes or molecular machinery
     • Protein carriers
     • Protein modifications (phosphorylation)
     • …
     PPIs help to decipher the molecular mechanisms underlying biological functions and improve approaches to drug discovery.

  35. Assessment of large-scale datasets of PPIs
     • Benchmarking high-throughput interactions:
       • Y2H: Uetz et al. 2000; Ito et al. 2001
       • AP-MS: Gavin et al. 2006; Krogan et al. 2006
     • Binary gold standard (GS): positive reference set (PRS) and random reference set (RRS).
     • MIPS co-complex gold standard.
     • Measure large-scale datasets against Binary-GS and MIPS-GS.
     Yu H, et al. (2008). Science 322: 104-110

  36. Assessment of large-scale datasets of PPIs
     • AP/MS performs well at detecting co-complex associations according to MIPS.
     • Y2H performs well at detecting binary interactions according to Binary-GS.
     Yu H, et al. (2008). Science 322: 104-110
     (Figure panels: Y2H, AP/MS)
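Measuring a dataset against a positive reference set (PRS) and a random reference set (RRS) can be sketched as the fraction of each reference set that the dataset recovers. The Python example below uses hypothetical protein pairs, and the simple recovered-fraction measure is an illustrative assumption rather than the exact procedure of Yu et al.

```python
def pair(a, b):
    """Unordered protein pair."""
    return frozenset((a, b))


def recovered_fraction(dataset, reference):
    """Fraction of reference pairs that also occur in the dataset."""
    return len(dataset & reference) / len(reference)


# Hypothetical reference sets and a hypothetical high-throughput dataset.
prs = {pair("A", "B"), pair("C", "D"), pair("E", "F"), pair("G", "H")}
rrs = {pair("A", "H"), pair("B", "E"), pair("C", "G"), pair("D", "F")}
dataset = {pair("A", "B"), pair("E", "F"), pair("D", "F"), pair("B", "C")}

prs_rate = recovered_fraction(dataset, prs)  # share of known positives detected
rrs_rate = recovered_fraction(dataset, rrs)  # share of random pairs detected
print(prs_rate, rrs_rate)
```

A dataset that recovers many PRS pairs and few RRS pairs is both sensitive and specific with respect to the binary gold standard.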
