Protein network analysis

floyd

  1. Protein network analysis • Network motifs • Network clusters / modules • Co-clustering networks & expression • Network comparison (species, conditions) • Integration of genetic & physical nets • Network visualization

  2. Network motifs

  3. Network Motifs (Milo, Alon et al.) • Motifs are “patterns of interconnections occurring in complex networks.” • That is, connected subgraphs of a particular isomorphic topology • The approach queries the network for small motifs (e.g., of < 5 nodes) that occur much more frequently than would be expected in random networks • Significant motifs have been found in a variety of biological networks and, for instance, correspond to feed-forward and feed-back loops that are well known in circuit design and other engineering fields • Pioneered by Uri Alon and colleagues

  4. Motif searches in 3 different contexts How many motifs (connected subgraph topologies) exist involving three nodes? If the graph is undirected? If the graph is directed?

  5. All 3-node directed subgraphs What is the frequency of each in the network?
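
The counting question on the two slides above can be checked by brute force. The sketch below is an illustration, not part of the original deck: it enumerates every edge configuration on three labelled nodes, keeps the connected ones, and groups them by isomorphism class. The names `canonical` and `count_classes` are made up for this example.

```python
from itertools import permutations, product

# The three unordered node pairs of a 3-node graph.
PAIRS = [(0, 1), (0, 2), (1, 2)]


def canonical(edges, directed):
    """Smallest relabelled edge tuple over all 6 node permutations."""
    forms = []
    for perm in permutations(range(3)):
        relabelled = [(perm[a], perm[b]) for a, b in edges]
        if not directed:
            # (a, b) and (b, a) are the same undirected edge.
            relabelled = [tuple(sorted(e)) for e in relabelled]
        forms.append(tuple(sorted(relabelled)))
    return min(forms)


def count_classes(directed):
    """Number of connected 3-node subgraph topologies, up to isomorphism."""
    if directed:
        # Each pair: no edge, a->b, b->a, or both directions.
        states = [(), ("fwd",), ("rev",), ("fwd", "rev")]
    else:
        states = [(), ("fwd",)]
    classes = set()
    for choice in product(states, repeat=3):
        edges = []
        linked_pairs = 0
        for (a, b), state in zip(PAIRS, choice):
            linked_pairs += bool(state)
            if "fwd" in state:
                edges.append((a, b))
            if "rev" in state:
                edges.append((b, a))
        # Three nodes are (weakly) connected iff at least two pairs are linked.
        if linked_pairs >= 2:
            classes.add(canonical(edges, directed))
    return len(classes)


print(count_classes(directed=False))  # 2  (path and triangle)
print(count_classes(directed=True))   # 13
```

Brute force is only practical for very small n; the number of topologies grows quickly with subgraph size, which is one reason motif searches stay at 3 or 4 nodes.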

  6. Outline of the Approach • Search network to identify all possible n-node connected subgraphs (here n=3 or 4) • Get # occurrences of each subgraph type • The significance for each type is determined using permutation testing, in which the above process is repeated for many randomized networks (preserving node degrees; why?) • Use random distributions to compute a p-value for each subgraph type. The “network motifs” are subgraphs with p < 0.001
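
The outline above can be sketched for a single subgraph type, the feed-forward loop. This is a minimal illustration under stated assumptions, not the published implementation: the network is a list of unique directed edges without self-loops, randomization uses pairwise target swaps (which keep every node's in- and out-degree), and the p-value is an empirical tail fraction. The function names and the `swaps_per_edge` parameter are hypothetical.

```python
import random


def count_ffl(edges):
    """Count feed-forward loops: X->Y, Y->Z and X->Z with distinct X, Y, Z."""
    succ = {}
    for u, v in edges:
        succ.setdefault(u, set()).add(v)
    total = 0
    for x, targets in succ.items():
        for y in targets:
            for z in succ.get(y, ()):
                if z != x and z != y and z in targets:
                    total += 1
    return total


def randomize(edges, rng, swaps_per_edge=10):
    """Degree-preserving randomization by swapping the targets of edge pairs."""
    edge_list = list(edges)
    edge_set = set(edge_list)
    for _ in range(swaps_per_edge * len(edge_list)):
        i = rng.randrange(len(edge_list))
        j = rng.randrange(len(edge_list))
        (a, b), (c, d) = edge_list[i], edge_list[j]
        # Proposed swap: a->b, c->d becomes a->d, c->b.
        if i == j or a == d or c == b:
            continue  # would create a self-loop
        if (a, d) in edge_set or (c, b) in edge_set:
            continue  # would create a duplicate edge
        edge_set -= {(a, b), (c, d)}
        edge_set |= {(a, d), (c, b)}
        edge_list[i], edge_list[j] = (a, d), (c, b)
    return edge_list


def motif_p_value(edges, n_random=1000, seed=0):
    """Observed count and empirical P(random count >= observed count)."""
    rng = random.Random(seed)
    observed = count_ffl(edges)
    hits = sum(
        count_ffl(randomize(edges, rng)) >= observed for _ in range(n_random)
    )
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return observed, (hits + 1) / (n_random + 1)
```

With this estimator the smallest reachable p-value is 1 / (n_random + 1), so resolving p < 0.001 needs more than 1000 randomized networks. Preserving degrees matters because hubs alone create many subgraphs; without that constraint a motif could look significant purely because of the degree distribution.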

  7. Schematic view of network motif detection Networks are randomized preserving node degree

  8. Concentration of feedforward motif: (Num. appearances of motif divided by all 3-node connected subgraphs) Mean ± SD of 400 subnetworks
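
The concentration defined above can be computed directly on a small network. The sketch below assumes that subgraphs are counted as induced node triples (the slide does not say whether counts are induced or not), and `ffl_concentration` is a name chosen for this example.

```python
from itertools import combinations, permutations


def ffl_concentration(edges):
    """Fraction of connected 3-node induced subgraphs that are feed-forward loops."""
    edge_set = set(edges)
    nodes = sorted({n for e in edge_set for n in e})
    connected = ffl = 0
    for triple in combinations(nodes, 3):
        induced = {
            (u, v)
            for u in triple
            for v in triple
            if u != v and (u, v) in edge_set
        }
        linked_pairs = {frozenset(e) for e in induced}
        if len(linked_pairs) < 2:
            continue  # this triple is not (weakly) connected
        connected += 1
        # Induced feed-forward loop: exactly the edges x->y, y->z, x->z.
        if any(
            induced == {(x, y), (y, z), (x, z)}
            for x, y, z in permutations(triple)
        ):
            ffl += 1
    return ffl / connected if connected else 0.0


# Toy network: one feed-forward loop (0, 1, 2) plus a tail edge 2->3.
print(ffl_concentration([(0, 1), (1, 2), (0, 2), (2, 3)]))  # 1 of 3 connected triples
```

Looping over all node triples is cubic in network size, so this form only suits small examples.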

  9. Transcriptional network results

  10. Neural networks

  11. Food webs

  12. World Wide Web

  13. Electronic circuits

  14. Interesting questions • Which networks have motifs in common? • Which networks have completely distinct motifs versus the others? • Does this tell us anything about the design constraints on each network? • E.g., the feedforward loop may function to activate output only if the input signal is persistent (i.e., reject noisy or transient signals) and to allow rapid deactivation when the input turns off • E.g., food webs evolve to allow flow of energy from top to bottom (?!**!???), whereas transcriptional networks evolve to process information

  15. Identifying modules in the network • Rives/Galitski PNAS paper 2003 • Define distance between each pair of proteins in the interaction network • E.g., d = shortest path length • To compute shortest path length, use Dijkstra’s algorithm • Cluster w/ pairwise node similarity = 1/d^2
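
The distance and similarity steps above can be sketched as follows. This is an illustration only: it assumes an undirected network stored as an adjacency dict in which every interaction has unit length, and the function names are hypothetical. The clustering step itself is left out; the resulting similarities could be passed to any standard hierarchical clustering routine.

```python
import heapq


def shortest_path_lengths(adj, source):
    """Dijkstra's algorithm from one source, with every edge of length 1."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v in adj[u]:
            nd = d + 1
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def similarity_matrix(adj):
    """Pairwise similarity 1/d^2 for every pair of mutually reachable proteins."""
    sim = {}
    for u in adj:
        for v, d in shortest_path_lengths(adj, u).items():
            if v != u:
                sim[(u, v)] = 1.0 / d ** 2
    return sim


# Toy chain A - B - C: A and C are two steps apart.
adj = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
sim = similarity_matrix(adj)
print(sim[("A", "B")], sim[("A", "C")])  # 1.0 0.25
```

With unit edge lengths a breadth-first search gives the same distances; Dijkstra's algorithm becomes necessary once interactions carry different weights.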

  16. Integration of networks and expression

  17. Querying biological networks for “Active Modules” • Color network nodes (genes/proteins) with: patient expression profile, protein states, patient genotype (SNP state), enzyme activity, RNAi phenotype • Interaction Database Dump, aka “Hairball” • Active Modules • Ideker et al. Bioinformatics (2002)

  18. A scoring system for expression “activity” A B C D

  19. Scoring over multiple perturbations/conditions Perturbations /conditions

  20. Searching for “active” pathways in a large network • Score subnetworks according to their overall amount of activity • Finding the highest scoring subnetworks is NP hard, so we use heuristic search algs. to identify a collection of high-scoring subnetworks (local optima) • Simulated annealing and/or greedy search starting from an initial subnetwork “seed” • During the search we must also worry about issues such as local topology and whether a subnetwork’s score is higher than would be expected at random
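
A greedy search from a seed can be sketched as below. The scoring function is an assumption for illustration: each gene carries a z-score for its expression change, and a subnetwork of k genes is scored by the sum of its z-scores divided by sqrt(k). The names `aggregate_score` and `greedy_module` are hypothetical.

```python
import math


def aggregate_score(z, members):
    """Sum of member z-scores, normalised by the square root of module size."""
    return sum(z[g] for g in members) / math.sqrt(len(members))


def greedy_module(adj, z, seed):
    """Grow a connected module from `seed`, adding the best neighbour each step."""
    module = {seed}
    best = aggregate_score(z, module)
    while True:
        # Only neighbours of current members are candidates, so the module stays connected.
        frontier = {n for m in module for n in adj[m]} - module
        candidates = [
            (aggregate_score(z, module | {n}), n) for n in sorted(frontier)
        ]
        if not candidates:
            break
        score, node = max(candidates)
        if score <= best:
            break  # local optimum: no single addition improves the score
        module.add(node)
        best = score
    return module, best


# Toy chain A - B - C - D with one strongly negative gene in the middle.
adj = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
z = {"A": 2.0, "B": 2.0, "C": -3.0, "D": 3.0}
print(greedy_module(adj, z, "A"))
```

Starting from A, the search stops at {A, B} because adding C lowers the score, even though D on the far side has a high z-score. This is the local-optimum behaviour the slide refers to, and it is why several seeds or an annealing schedule are used in practice.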

  21. Simulated Annealing Algorithm
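
The slide shows the algorithm only as a figure, so the following is a generic simulated-annealing sketch rather than a transcription of it. Assumptions made here: a move toggles one randomly chosen node in or out of the module, moves that disconnect or empty the module are rejected, worse moves are accepted with probability exp(delta / T), and the temperature cools geometrically. All function names and parameter values are hypothetical.

```python
import math
import random


def score(z, members):
    """Sum of member z-scores, normalised by the square root of module size."""
    return sum(z[g] for g in members) / math.sqrt(len(members))


def is_connected(adj, members):
    """True if `members` induces a connected subgraph of `adj`."""
    start = next(iter(members))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v in members and v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(members)


def anneal(adj, z, seed_node, steps=2000, t_start=1.0, t_end=0.01, rng_seed=0):
    """Return the best connected module seen during the annealing run."""
    rng = random.Random(rng_seed)
    nodes = sorted(adj)
    current = {seed_node}
    current_score = score(z, current)
    best, best_score = set(current), current_score
    for step in range(steps):
        # Geometric cooling from t_start down to t_end (assumes steps >= 2).
        temp = t_start * (t_end / t_start) ** (step / (steps - 1))
        proposal = current ^ {rng.choice(nodes)}  # toggle one node's membership
        if not proposal or not is_connected(adj, proposal):
            continue
        delta = score(z, proposal) - current_score
        # Always accept improvements; accept worse moves with probability exp(delta / T).
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = proposal, current_score + delta
            if current_score > best_score:
                best, best_score = set(current), current_score
    return best, best_score
```

Because the best module seen so far is tracked separately from the current state, the returned score is never lower than the score of the starting seed.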

  22. Network regions whose genes change on/off or off/on after knocking out different genes

  23. Initial Application to Toxicity: Networks responding to DNA damage in yeast • Tom Begley and Leona Samson; MIT Dept. of Bioengineering • Systematic phenotyping of gene knockout strains in yeast • Evaluation of growth of each strain in the presence of MMS (and other DNA damaging agents) • Sensitive • Not sensitive • Not tested • MMS sensitivity in ~25% of strains • Screening against a network of protein interactions…

  24. Begley et al., Mol Cancer Res, (2002)

  25. Networks responding to DNA damage as revealed by high-throughput phenotypic assays Begley et al., Mol Cancer Res, (2002)

  26. Host-pathogen interactions regulating early stage HIV-1 infection Genome-wide RNAi screens for genes required for infection utilizing a single cycle HIV-1 reporter virus engineered to encode luciferase and bearing the Vesicular Stomatitis Virus Glycoprotein (VSV-G) on its surface to facilitate efficient infection… Sumit Chanda

  27. Project onto a large network of human-human and human-HIV protein interactions

  28. Network modules associated with infection Konig et al. Cell 2008

  29. Network-based classification

  30. Disease aggression (Time from Sample Collection SC to Treatment TX) Network-based classification Chuang et al. MSB 2007; Lee et al. PLoS Comp Bio 2008; Ravasi et al. Cell 2010

  31. The Mammalian Cell Fate Map: Can we classify tissue type using expression, networks, etc? Gilbert Developmental Biology 4th Edition

  32. Interaction coherence within a tissue class r = 0.9 A B Endoderm r = 0.0 A B Mesoderm r = 0.2 A B Ectoderm (incl. CNS) Taylor et al. Nature Biotech 2009

  33. Protein interactions, not levels, dictate tissue specification

  34. Functional Enrichment

  35. ::: Introduction. Gene Set Enrichment Analysis - GSEA - GSEA, MIT Broad Institute: v 2.0 available since Jan 2007. Version 2.0 includes Biocarta, Broad Institute, GenMAPP, KEGG annotations and more... Platforms: Affymetrix, Agilent, CodeLink, custom... (Subramanian et al. PNAS. 2005.)

  36. ::: Introduction. Gene Set Enrichment Analysis - GSEA - GSEA applies a Kolmogorov-Smirnov test to find asymmetrical distributions for defined blocks of genes within the dataset's whole distribution. Is this particular Gene Set enriched in my experiment? Genes selected by researcher, Biocarta pathways, GenMAPP sets, genes sharing cytoband, genes targeted by common miRNAs …up to you…

  37. ::: Introduction. ::: K-S test Gene Set Enrichment Analysis - GSEA - The Kolmogorov–Smirnov test is used to determine whether two underlying one-dimensional probability distributions differ, or whether an underlying probability distribution differs from a hypothesized distribution, in either case based on finite samples. The one-sample KS test compares the empirical distribution function with the cumulative distribution function specified by the null hypothesis. The main applications are testing goodness of fit with the normal and uniform distributions. The two-sample KS test is one of the most useful and general nonparametric methods for comparing two samples, as it is sensitive to differences in both location and shape of the empirical cumulative distribution functions of the two samples. [Figure: Dataset distribution, Gene set 1 distribution, Gene set 2 distribution; axes: Number of genes, Gene Expression Level]
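
The two-sample KS statistic described above is the largest vertical gap between the two empirical cumulative distribution functions. A minimal sketch, with `ks_statistic` as a name chosen for this example (it returns only the statistic, not a p-value):

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max |ECDF_a(x) - ECDF_b(x)|."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d_max = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both samples past every value <= x, so ties are handled together.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d_max = max(d_max, abs(i / len(a) - j / len(b)))
    return d_max


# Hypothetical expression levels for a gene set versus the rest of the dataset.
gene_set = [1.0, 3.0]
background = [2.0, 4.0]
print(ks_statistic(gene_set, background))  # 0.5
```

The loop can stop as soon as one sample is exhausted: that sample's ECDF is already 1, so the gap can only shrink from there.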

  38. ::: Introduction. Gene Set Enrichment Analysis - GSEA - ...testing genes independently... [Figure: ClassA vs. ClassB with t-test cut-offs at FDR<0.05] Biological meaning?

  39. ::: Introduction. Gene Set Enrichment Analysis - GSEA - [Figure: Gene Set 1, Gene Set 2, Gene Set 3 along the correlation with class (- to +), ClassA vs. ClassB, with the t-test cut-off marked] Gene set 2 enriched in Class A; gene set 3 enriched in Class B

  40. Subramanian et al., PNAS 2005

  41. ::: Introduction. Gene Set Enrichment Analysis - GSEA - The Enrichment Score ::: NES, p-value, FDR (Benjamini-Hochberg)
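
The two quantities named on the slide above can be sketched in a few lines. Both functions are simplified illustrations with hypothetical names: `enrichment_score` is the unweighted, KS-like running sum over a ranked gene list (the published GSEA statistic additionally weights hits by their correlation with the phenotype), and `benjamini_hochberg` is the standard step-up adjustment of a list of p-values. Normalisation to NES via permutations is not shown.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score (largest deviation from zero).

    Assumes the gene set has at least one hit and one miss in the ranked list.
    """
    hits = [g in gene_set for g in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    running = best = 0.0
    for is_hit in hits:
        # Step up on a gene-set member, step down on any other gene.
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


ranked = ["a", "b", "c", "d", "e", "f"]  # genes ordered by correlation with class
print(enrichment_score(ranked, {"a", "b"}))  # 1.0: set sits at the top of the list
print(enrichment_score(ranked, {"e", "f"}))  # -1.0: set sits at the bottom
```

A positive score means the set is concentrated at the top of the ranking and a negative score at the bottom, which corresponds to enrichment in one class or the other.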

  42. Network Alignment Species 1 vs. species 2 Physical vs. genetic

  43. Cross-comparison of networks: • Conserved regions in the presence vs. absence of stimulus • Conserved regions across different species Suthram et al. Nature 2005 Sharan et al. RECOMB 2004 Kelley et al. PNAS 2003 Sharan & Ideker Nat. Biotech. 2006 Scott et al. RECOMB 2005 Ideker & Sharan Gen Res 2008

  44. Plasmodium: a network apart? Plasmodium-specific protein complexes Conserved Plasmodium / Saccharomyces protein complexes Suthram et al. Nature 2005; LaCount et al. Nature 2005

  45. Human vs. Mouse TF-TF Networks in Brain Tim Ravasi, RIKEN Consortium et al. Cell 2010

  46. Finding physical pathways to explain genetic interactions Genetic Interactions: • Classical method used to map pathways in model species • Highly analogous to multi-genic interaction in human disease and combination therapy • Thousands are being uncovered through systematic studies Thus as with other types, the number of known genetic interactions is exponentially increasing… Adapted from Tong et al., Science 2001

  47. Integration of genetic and physical interactions 160 between-pathway models 101 within-pathway models Num interactions: 1,102 genetic; 933 physical Kelley and Ideker Nature Biotechnology (2005)

  48. Systematic identification of “parallel pathway” relationships in yeast

  49. Unified Whole Cell Model of Genetic and Physical interactions
