Identifying Active Transcription Factors from Expression Data using Pathway Queries

Presentation Transcript


  1. Identifying Active Transcription Factors from Expression Data using Pathway Queries Florian Sohler, Ralf Zimmer Identifying Active Transcription Factors from Expression Data using Pathway Queries, Florian Sohler

  2. Outline
     • Pathway Queries
       • Query network information and functional annotations to find relevant contexts for experimental (expression) data
       • Query language structure
       • Matching algorithm
       • Visualization
     • Application: Transcription factor activities
       • Scoring methods
       • Data sets
       • Results

  3. Context information is necessary
     • Basic analysis steps for gene expression data
       • Image analysis
       • Normalization
       • Calculation of gene-wise features:
         • Fold changes
         • P-values for differential expression
     • Lists of these features need to be interpreted
       • What is the biological mechanism causing the regulation?
       • What is the effect of the observed regulation, e.g. on the metabolism?

  4. Regulation of Transcription: Signaling Pathways

  5. Metabolic Pathways

  6. Biological Networks
     Networks contain relevant information but are hard to screen manually…

  7. Biological networks

  8. Pathway Queries
     • Researchers often want to look at certain aspects of the data
       • Mechanisms explaining the data (hypotheses)
       • Effect on the metabolism or another biological process
       • Links to known disease-relevant genes
     • A natural formulation for many of these aspects is in terms of network-like structures
       • Hypotheses of biological mechanisms
       • Network context
     • Provide a language to formulate network templates and an algorithm that finds all instances

  9. Pathway Query Language: Example
     [Figure: query graph with the nodes Kinase, TranscriptionFactor and RegulatedGenes …; Kinase and TranscriptionFactor are linked with max. distance: 1, as are TranscriptionFactor and RegulatedGenes.]

  10. Pathway Query Language
     • Specifying genes/proteins:
       • Boolean expressions based on available annotations
       • Examples:
         • Gene appears differentially expressed (p-value < 0.01)
         • GO classification is Transcription Factor
     • Specifying connections:
       • Network distance
       • Proteins and interactions on the path
     • Multiplicity of nodes
       • Aggregate e.g. all regulated targets of the transcription factor
     • Scoring:
       • Different scoring methods can be indicated in the query
     • Visualization:
       • Visualization layers can be defined in the query
     • Recursive structure:
       • Other queries can be used as building blocks
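
The ingredients listed on this slide (Boolean node predicates over annotations, distance-limited connections, aggregated nodes) can be pictured with a small data structure. The following is only a minimal sketch in Python, not the ToPNet query language itself: the class names `QueryNode`, `QueryEdge` and `PathwayQuery`, the annotation keys `go` and `p_value`, and the gene names `K1`, `TF1`, `G1`, `G2` are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical annotation record for one gene/protein,
# e.g. {"go": "Transcription Factor", "p_value": 0.003}.
Annotations = Dict[str, object]


@dataclass
class QueryNode:
    name: str
    predicate: Callable[[Annotations], bool]  # Boolean expression over annotations
    aggregate: bool = False  # True: collect all matching genes into one node


@dataclass
class QueryEdge:
    source: str
    target: str
    max_distance: int = 1  # maximum network distance between the two nodes


@dataclass
class PathwayQuery:
    nodes: List[QueryNode] = field(default_factory=list)
    edges: List[QueryEdge] = field(default_factory=list)

    def candidates(self, annotations: Dict[str, Annotations]) -> Dict[str, List[str]]:
        """Return, per query node, the genes whose annotations satisfy its predicate."""
        return {
            node.name: [g for g, a in annotations.items() if node.predicate(a)]
            for node in self.nodes
        }


def is_kinase(a: Annotations) -> bool:
    return a.get("go") == "Kinase"


def is_transcription_factor(a: Annotations) -> bool:
    return a.get("go") == "Transcription Factor"


def is_regulated(a: Annotations) -> bool:
    # "Gene appears differentially expressed (p-value < 0.01)"
    return float(a.get("p_value", 1.0)) < 0.01


# The example query graph: Kinase, TranscriptionFactor, RegulatedGenes.
query = PathwayQuery(
    nodes=[
        QueryNode("Kinase", is_kinase),
        QueryNode("TranscriptionFactor", is_transcription_factor),
        QueryNode("RegulatedGenes", is_regulated, aggregate=True),
    ],
    edges=[
        QueryEdge("Kinase", "TranscriptionFactor", max_distance=1),
        QueryEdge("TranscriptionFactor", "RegulatedGenes", max_distance=1),
    ],
)

# Made-up annotations, only to show how the predicates select candidates.
annotations = {
    "K1": {"go": "Kinase"},
    "TF1": {"go": "Transcription Factor"},
    "G1": {"p_value": 0.001},
    "G2": {"p_value": 0.5},
}
selected = query.candidates(annotations)
```

Here `selected` maps each query node to its candidate genes; scoring, visualization layers and nested sub-queries are left out of the sketch.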

  11. Pathway Queries
     • Specification
       • Input:
         • Network, annotations (expression data, functional annotations)
         • Query description (network template, hypothesis of biological mechanisms)
       • Output:
         • All instances of the query in the network
         • Scores
         • Visualization
     • Framework for implementation
       • ToPNet: A Toolbox for Protein Networks
       • Sanofi-Aventis

  12. Matching Algorithm
     [Figure: the Query Graph (Kinase, TranscriptionFactor, RegulatedGenes …) is expanded into an Instance Graph over the same node types; a Clique Search on the instance graph yields the Pathway Instances. Edge labels in the instance graph: "Verified connections" and "Fully connected (unrestricted)".]
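
One way to read the figure: every (query node, candidate protein) pair becomes a vertex of the instance graph, two vertices are joined if the query's distance constraint between them has been verified (or if the query places no constraint on that pair), and a pathway instance is a clique that picks one candidate per query node. The sketch below follows that reading with a simple backtracking search. It is an assumption about the approach, not the authors' implementation; the function names and the toy network (`K1`, `K2`, `TF1`, `G1`..`G3`) are hypothetical, and aggregated nodes are not handled (each instance holds a single regulated gene).

```python
from collections import deque


def within_distance(network, start, goal, max_distance):
    """Breadth-first search: is goal reachable from start in at most max_distance steps?"""
    seen, frontier = {start}, deque([(start, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if node == goal:
            return True
        if dist < max_distance:
            for nxt in network.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, dist + 1))
    return False


def find_instances(network, candidates, constraints):
    """Enumerate assignments of one candidate per query node that form a clique.

    candidates:  query node name -> list of candidate proteins
    constraints: (source node, target node) -> maximum network distance
    """
    names = list(candidates)

    def compatible(n1, p1, n2, p2):
        if p1 == p2:
            return False  # one protein cannot fill two query nodes
        if (n1, n2) in constraints:
            return within_distance(network, p1, p2, constraints[(n1, n2)])
        if (n2, n1) in constraints:
            return within_distance(network, p2, p1, constraints[(n2, n1)])
        return True  # unrestricted pair: always joined in the instance graph

    def extend(partial):
        if len(partial) == len(names):
            yield dict(partial)
            return
        name = names[len(partial)]
        for protein in candidates[name]:
            # Only extend with vertices adjacent to everything chosen so far.
            if all(compatible(n, p, name, protein) for n, p in partial):
                yield from extend(partial + [(name, protein)])

    return list(extend([]))


# Toy directed network (adjacency lists); all names are made up.
network = {"K1": ["TF1"], "TF1": ["G1", "G2"], "K2": ["G3"]}
candidates = {
    "Kinase": ["K1", "K2"],
    "TranscriptionFactor": ["TF1"],
    "RegulatedGene": ["G1", "G2", "G3"],
}
constraints = {
    ("Kinase", "TranscriptionFactor"): 1,
    ("TranscriptionFactor", "RegulatedGene"): 1,
}
instances = find_instances(network, candidates, constraints)
```

On this toy input, `K2` is rejected because it has no connection to `TF1`, and `G3` is rejected because `TF1` does not reach it within distance 1.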

  13. Visualized Result

  14. Application: Activity of Transcription Factors
     [Figure labels: (Higher level) regulators; Activation/Inhibition; Transcription factors, Interactions; Expression]
     • Why infer transcription factor activities?
       • Expression levels of genes can be measured using microarrays
       • Expression is directly mediated by transcription factors
       • Transcription factor activity is not determined by the factor's expression level
     Inference of transcription factor activity is a first step in a causal analysis of gene expression data

  15. Transcription Factors
     [Figure: query graph with a node "Molecular function: Transcription Factor" linked by "max. distance: 1, regulation" to the node "… Target genes".]

  16. Scoring a Transcription Factor
     • Given a transcription factor T with M being the set of potential target genes of T
     • Strategy 1:
       • Compare the set of genes M with some other (relevant) set of genes (overrepresentation analysis)
       • Fisher’s exact test
     • Strategy 2:
       • Look at the expression data on M and test if the data have the same distribution on M as on the rest of the proteins
       • Wilcoxon rank test, t-test …
     Other possibilities: Tian et al., PNAS 102(39), Sep. 2005

  17. Scoring a Transcription Factor
     [Figure: the sets "Transcription Factor Targets" and "Regulated Genes" within "All Genes".]
     Fisher’s exact test computes the significance of the intersection
     Question: Which genes are ‘regulated’?
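
The overlap test on this slide can be written out with the hypergeometric distribution. The sketch below computes a one-sided (overrepresentation) p-value using only the Python standard library; the function name and the example counts are hypothetical, and the choice of which genes count as 'regulated' (for instance a p-value cut-off) is left to the caller, since the slide poses it as an open question.

```python
from math import comb


def overrepresentation_p_value(n_all, n_targets, n_regulated, n_overlap):
    """One-sided Fisher's exact test.

    Probability of seeing at least n_overlap regulated genes among the
    n_targets target genes if the n_regulated regulated genes were drawn
    at random from all n_all genes (hypergeometric null model).
    """
    max_overlap = min(n_targets, n_regulated)
    total = comb(n_all, n_regulated)
    tail = sum(
        comb(n_targets, k) * comb(n_all - n_targets, n_regulated - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Made-up counts: 10 genes in total, 3 targets, 3 regulated genes, all 3 overlapping.
p = overrepresentation_p_value(n_all=10, n_targets=3, n_regulated=3, n_overlap=3)
```

With these toy counts only one of the 120 possible draws of three regulated genes hits all three targets, so `p` equals 1/120.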

  18. Scoring a Transcription Factor
     [Figure: the set "Transcription Factor Targets" within "All Genes".]
     Wilcoxon rank test to compute the significance of the target regulation
     Distribution-free
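
The rank-based comparison of target genes against all other genes can be sketched as a Wilcoxon rank-sum test. The version below uses the normal approximation without a tie correction, so it is meant for illustration rather than for small or heavily tied samples; the function name and the input values are hypothetical, and which per-gene feature is ranked (e.g. ratios or p-values of differential expression) is an assumption left to the caller.

```python
from math import erfc, sqrt


def rank_sum_test(target_values, other_values):
    """Wilcoxon rank-sum test, normal approximation, no tie correction.

    Returns (z, two-sided p-value); z > 0 means the target values tend to
    rank higher than the other values.
    """
    pooled = sorted(
        [(v, True) for v in target_values] + [(v, False) for v in other_values],
        key=lambda pair: pair[0],
    )
    # Sum the ranks of the target values; tied values share their average rank.
    rank_sum = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of the 1-based ranks i+1 .. j
        rank_sum += midrank * sum(1 for k in range(i, j) if pooled[k][1])
        i = j
    n1, n2 = len(target_values), len(other_values)
    u = rank_sum - n1 * (n1 + 1) / 2          # Mann-Whitney U statistic
    mean = n1 * n2 / 2                        # expected U under the null
    sd = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)   # standard deviation of U under the null
    z = (u - mean) / sd
    p_two_sided = erfc(abs(z) / sqrt(2))      # equals 2 * (1 - Phi(|z|))
    return z, p_two_sided


# Made-up feature values: the targets rank above all other genes.
z, p_value = rank_sum_test([5, 6, 7, 8], [1, 2, 3, 4])
```

Because only ranks enter the statistic, no assumption about the shape of the expression distribution is needed, which is the "distribution-free" property noted on the slide.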

  19. Data Sets
     • Gene expression data
       • Hughes et al. 2000 conducted ~300 yeast knockout experiments and measured RNA expression levels.
       • For each gene, p-values and ratios of differential expression were computed.
     • Biological networks
       • Genome-wide location analysis of Lee et al. 2003 gives a network of ~100 transcription factors and putative regulatees.
       • For kinase activity: Database of Interacting Proteins (DIP) with ~4700 proteins and ~15000 interactions for yeast

  20. Results: Activity Scores of TFs
     Ste12 is an important regulator for mating functions
     Bas1 is an important regulator for purine biosynthesis
     Arg80 and Arg81 are regulators for arginine metabolism

  21. Correlation of Activity Scores

  22. Summary
     • Rich sources of network data are available and contain relevant context information for gene expression data
     • Pathway Queries provide a mechanism to exploit this context information
       • Interesting queries must be developed
       • Scoring methods must be applied
     • Successful application of the method
       • Transcription factor activity
       • Other examples in the paper:
         • Co-operating transcription factors
         • Kinases
