Microarrays to Functional Genomics: Generation of Transcriptional Networks from Microarray experiments

Joshua Stender. December 3, 2002. Department of Biochemistry.


Presentation Transcript

  1. Microarrays to Functional Genomics: Generation of Transcriptional Networks from Microarray Experiments. Joshua Stender, December 3, 2002, Department of Biochemistry

  2. What is a genetic network? Gene networks are usually represented as directed graphs in which the nodes are genes and the edges represent regulation. A network summarizes a limited set of relationships among a subset of genes, including positive and negative feedback loops. (Jenssen et al. 2001)

  3. Why are we interested in genetic networks? • Drug therapies for complex diseases • Gain insights into stimulus-response interactions • Identify novel pathways • Understand cell physiology • Understand multifactor gene-gene or gene-protein relationships in normal and disease states

  4. Modeling Network Framework • Need to define a map from sequence space to functional space • Stage of regulation (RNA, protein) • Temporal regulation • Spatial regulation (nucleus, cytoplasm, etc.)

  5. Brazhnik et al. Gene networks: how to put the function in genomics. Trends in Biotechnology 20: 467-72.

  6. Methods for Developing Gene Networks • Two types of experiments are used for network design: time series and steady-state gene knock-out • Co-expression clustering • Cis-acting elements in promoters (Amy Creekmore) • Reverse engineering: use of algorithms to generate new networks

  7. Time-Series Approach • The expression level of a gene at a given time point can be modeled as some function of expression levels at previous time points. • Dimensionality is a problem when there are more genes than time points; better results require more time points. • Solutions in the literature: basic linear models, singular value decomposition, and Bayesian networks
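A minimal sketch of the simplest case of the first bullet: the level of one gene at time t+1 is fitted as a linear function of one candidate regulator at time t. The function name and the expression values are made up for illustration; a real analysis would fit many regulators at once, which is where the dimensionality problem arises.

```python
def fit_lagged_linear(regulator, target):
    """Least-squares fit of target[t+1] = slope * regulator[t] + intercept."""
    x = regulator[:-1]   # regulator level at time t
    y = target[1:]       # target level at time t + 1
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Sum of squared residuals: small values mean the lagged model fits well.
    residual = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    return slope, intercept, residual


# Hypothetical expression levels at six time points (invented numbers).
gene_a = [1.0, 2.0, 3.0, 2.0, 1.0, 2.0]
# gene_b was constructed to follow gene_a with a one-step delay.
gene_b = [0.5, 2.5, 4.5, 6.5, 4.5, 2.5]

slope, intercept, residual = fit_lagged_linear(gene_a, gene_b)
print(slope, intercept, residual)
```

With only a handful of time points, a model with one coefficient per gene pair has far more unknowns than equations, which is why the slide stresses the need for more time points.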

  8. Steady-State Approach • Takes advantage of gene deletions or overexpression • If gene A goes up after gene B is deleted, perhaps gene B is a negative modulator of A, and so on • Microarrays offer opportunities to identify the consequences of a gene deletion across entire genomes
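The knock-out reasoning above can be sketched as a simple fold-change rule. The function name, the two-fold threshold, and the expression values are assumptions for illustration only; the rule cannot tell direct regulation from indirect effects, which is why the slide says "perhaps".

```python
def infer_regulation(wild_type, knockout, deleted_gene, threshold=2.0):
    """Guess how the deleted gene affects every other gene.

    A gene that rises at least `threshold`-fold after the deletion is
    labeled as repressed by the deleted gene; a gene that falls at least
    `threshold`-fold is labeled as activated by it.
    """
    edges = {}
    for gene, wt_level in wild_type.items():
        if gene == deleted_gene:
            continue
        ratio = knockout[gene] / wt_level
        if ratio >= threshold:
            edges[gene] = "repressed by " + deleted_gene
        elif ratio <= 1.0 / threshold:
            edges[gene] = "activated by " + deleted_gene
    return edges


# Hypothetical steady-state levels in wild type and in a strain lacking gene B.
wild_type = {"A": 100.0, "B": 80.0, "C": 50.0, "D": 40.0}
knockout_b = {"A": 300.0, "B": 0.0, "C": 10.0, "D": 42.0}

edges = infer_regulation(wild_type, knockout_b, "B")
print(edges)
```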

  9. Genetic Network Generation Schematic. de Jong. Modeling and simulation of genetic regulatory systems: a literature review. J Comput Biol 2002;9(1):67-103

  10. Algorithmic Approach to Network Design • Boolean: binary state, along with co-expression clustering • Continuous steady-state (non-linear): assumes genes can have intermediate states • Singular value decomposition

  11. Methods for Generating Gene Networks • D’Haeseleer et al. Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics 16(8): 707-26. • de la Fuente et al. Linking the genes: inferring quantitative gene networks from microarray data. Trends in Genetics 18(8): 395-98. • Toh et al. Inference of a genetic network by a combined approach of cluster analysis and graphical Gaussian modeling. Bioinformatics 18(2): 287-297.

  12. Types of Clustering • Non-hierarchical: partitions N objects into K groups, iterating until a preset threshold is reached. Examples include K-means, SOM, and expectation-maximization • Hierarchical: returns a hierarchy of nested clusters (agglomerative vs. divisive)

  13. Why use clustering? • The wealth of data from microarrays is overwhelming • Clustering narrows the gene list to genes that change significantly • Inference of functional annotation • Extraction of regulatory motifs • Molecular signature for distinguishing cell or tissue types • Use of learning machines to characterize unknown genes

  14. Determining Distances Between Genes • The majority of clustering algorithms use a matrix of pairwise distances between genes • Distances can be calculated based on: • Similarity according to positive correlations • Similarity based on positive and negative correlations • Similarity based on mutual information
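A minimal sketch of the first two distance choices, assuming Pearson correlation as the similarity measure (the slide does not name a specific coefficient). The function names and profiles are illustrative; mutual information is not covered here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def distance_positive(x, y):
    """Only positive correlation counts as similar: 0 (co-expressed) to 2."""
    return 1.0 - pearson(x, y)


def distance_absolute(x, y):
    """Positive and negative correlation both count as similar: 0 to 1."""
    return 1.0 - abs(pearson(x, y))


# Hypothetical profiles: g2 rises with g1, g3 falls as g1 rises.
g1 = [1.0, 2.0, 3.0, 4.0]
g2 = [2.0, 4.0, 6.0, 8.0]
g3 = [4.0, 3.0, 2.0, 1.0]

d_pos_12 = distance_positive(g1, g2)
d_pos_13 = distance_positive(g1, g3)
d_abs_13 = distance_absolute(g1, g3)
print(d_pos_12, d_pos_13, d_abs_13)
```

Under the first measure an activator and the gene it represses end up far apart; under the second they end up close together, so the choice changes which genes cluster.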

  15. Guilt-by-Association (GBA) • A gene is selected at random and its nearest neighbor is determined • Genes are clustered based on arbitrary cut-off distances in expression space • Assumes that genes regulated in the same pattern participate in similar processes
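A rough sketch of the idea, assuming Euclidean distance in expression space and a seed gene supplied by the caller rather than drawn at random. The gene names, profiles, and cut-off are hypothetical.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def guilt_by_association(profiles, seed_gene, cutoff):
    """Return genes within `cutoff` of the seed gene, nearest first."""
    seed = profiles[seed_gene]
    hits = [(euclidean(seed, profile), gene)
            for gene, profile in profiles.items() if gene != seed_gene]
    return [gene for dist, gene in sorted(hits) if dist <= cutoff]


# Hypothetical profiles over three conditions.
profiles = {
    "known_gene": [1.0, 2.0, 3.0],
    "orf1": [1.1, 2.0, 2.9],
    "orf2": [3.0, 1.0, 0.5],
    "orf3": [1.0, 2.2, 3.1],
}

neighbours = guilt_by_association(profiles, "known_gene", cutoff=0.5)
print(neighbours)
```

The genes returned would inherit a tentative annotation from the seed gene; the result depends entirely on the arbitrary cut-off, as the slide notes.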

  16. K-means Clustering • Partitions N genes into K groups • A centroid is the weighted center of a cluster • Each gene is assigned to a cluster and the centroid is calculated • The centroids are repeatedly recalculated and the genes reassigned
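The assign-and-recalculate loop can be sketched as below. Using the first K profiles as starting centroids and a fixed iteration count are simplifying assumptions, not something the slide specifies; the profiles are invented.

```python
def kmeans(profiles, k, iterations=20):
    """Plain K-means on a list of expression profiles (lists of floats)."""
    # Naive initialization (an assumption): the first k profiles.
    centroids = [list(p) for p in profiles[:k]]
    assignment = [0] * len(profiles)
    for _ in range(iterations):
        # Assignment step: each gene goes to its nearest centroid.
        for i, profile in enumerate(profiles):
            dists = [sum((a - b) ** 2 for a, b in zip(profile, c))
                     for c in centroids]
            assignment[i] = dists.index(min(dists))
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(profiles, assignment) if a == j]
            if members:
                centroids[j] = [sum(col) / len(members)
                                for col in zip(*members)]
    return assignment, centroids


# Hypothetical profiles: two low-expression and two high-expression genes.
profiles = [
    [1.0, 1.1, 0.9],
    [5.0, 5.2, 4.9],
    [1.2, 0.9, 1.0],
    [4.8, 5.1, 5.0],
]

assignment, centroids = kmeans(profiles, k=2)
print(assignment, centroids)
```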

  17. Self-Organizing Maps (SOM) • Very similar to K-means, except that the cluster centers are placed on a grid • At each iteration, a gene pattern is chosen at random, and the nearest cluster center and its grid neighbors are updated • Requires the user to define the number of clusters and the grid size

  18. Expectation-Maximization • Clustering similar to K-means, except that genes can be assigned to multiple clusters • Membership in a cluster is based on a Gaussian probability distribution • Memberships are updated iteratively, along with three parameters for each cluster: centroid, covariance, and mixture weight
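A minimal sketch of one EM iteration for a one-dimensional Gaussian mixture, so the covariance reduces to a single variance per cluster. The function names, starting parameters, and data are illustrative assumptions; real expression profiles are multi-dimensional.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_step(values, means, variances, weights):
    """One EM iteration: soft memberships, then updated cluster parameters."""
    k = len(means)
    # E-step: probability that each value belongs to each cluster.
    resp = []
    for x in values:
        dens = [weights[j] * gaussian_pdf(x, means[j], variances[j])
                for j in range(k)]
        total = sum(dens)
        resp.append([d / total for d in dens])
    # M-step: re-estimate centroid, variance, and mixture weight per cluster.
    new_means, new_vars, new_weights = [], [], []
    for j in range(k):
        rj = sum(r[j] for r in resp)
        mean = sum(r[j] * x for r, x in zip(resp, values)) / rj
        var = sum(r[j] * (x - mean) ** 2 for r, x in zip(resp, values)) / rj
        new_means.append(mean)
        new_vars.append(max(var, 1e-6))  # floor keeps the variance positive
        new_weights.append(rj / len(values))
    return new_means, new_vars, new_weights, resp


# Hypothetical expression values forming two groups, near 1 and near 5.
values = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
means, variances, weights = [0.0, 6.0], [1.0, 1.0], [0.5, 0.5]
for _ in range(10):
    means, variances, weights, resp = em_step(values, means, variances, weights)
print(means, weights)
```

Unlike K-means, each row of `resp` spreads a gene's membership across all clusters, although for well-separated data like this the memberships are close to 0 or 1.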

  19. Determining which clustering analysis to use • Each combination of distance measure and clustering algorithm emphasizes different types of regularities in the data • It is best to analyze the data with more than one clustering method, given the variety of algorithms and the multiple functions of each gene

  20. Construction of a Simple Network Clustering Brazhnik et al.

  21. Boolean Networks • Simplification: each gene is represented by a binary ON/OFF state • Each gene is regulated by other genes through Boolean functions • Caveat: in reality most genes are in an intermediate state, so expression is continuous
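A toy Boolean network with synchronous updates illustrates the idea. The three genes and their rules are hypothetical and are not taken from the presentation.

```python
def step(state):
    """One synchronous update: every rule reads the previous state."""
    return {
        "A": not state["C"],             # C represses A
        "B": state["A"],                 # A activates B
        "C": state["A"] and state["B"],  # A AND B together activate C
    }


def simulate(state, n_steps):
    """Return the trajectory: the initial state followed by n_steps updates."""
    trajectory = [state]
    for _ in range(n_steps):
        state = step(state)
        trajectory.append(state)
    return trajectory


start = {"A": True, "B": False, "C": False}
trajectory = simulate(start, 5)
for t, s in enumerate(trajectory):
    print(t, s)
```

From this starting state the network passes through five distinct states and returns to where it began, a small example of the cyclic behavior that negative feedback produces in Boolean models.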

  22. Example of a Boolean Network. de Jong. Modeling and simulation of genetic regulatory systems: a literature review. J Comput Biol 2002;9(1):67-103

  23. Limitations of Boolean Networks • Fail to reveal causality • Non-quantitative • Do not take into account multiple gene states • In the future, protein-protein interaction maps need to be included

  24. Graphical Gaussian Model • Toh et al. Inference of a Genetic Network by a Combined Approach of Cluster Analysis and Graphical Gaussian Modeling. Bioinformatics 18(2): 287-297. • Goal: To establish a method to combine Clustering and GGM for genetic network predictions.

  25. Graphical Gaussian Modeling • GGM is a multivariate analysis used to infer or test a statistical model of the relationships among a set of variables, using partial correlations • Data: 2467 Saccharomyces cerevisiae genes under 79 different conditions

  26. Graphical Gaussian Method • Genes were clustered into 34 distinct clusters • To reduce dimensionality, the expression profiles within each cluster were averaged for each condition
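The averaging step can be sketched as follows. The data layout (dictionaries keyed by gene name), the function name, and the numbers are assumptions made for illustration.

```python
def average_clusters(expression, assignment):
    """Collapse gene profiles into one mean profile per cluster.

    expression: gene name -> list of expression levels, one per condition
    assignment: gene name -> cluster id
    """
    grouped = {}
    for gene, cluster_id in assignment.items():
        grouped.setdefault(cluster_id, []).append(expression[gene])
    # zip(*rows) walks the conditions column by column.
    return {cluster_id: [sum(col) / len(col) for col in zip(*rows)]
            for cluster_id, rows in grouped.items()}


# Hypothetical data: three genes, two conditions, two clusters.
expression = {"g1": [1.0, 2.0], "g2": [3.0, 4.0], "g3": [10.0, 0.0]}
assignment = {"g1": 0, "g2": 0, "g3": 1}

cluster_profiles = average_clusters(expression, assignment)
print(cluster_profiles)
```

Applied to the data set on the slides, this would turn 2467 gene profiles into 34 cluster profiles over 79 conditions, small enough for a correlation matrix to be estimated reliably.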

  27. Stepwise iterative algorithm developed by Wermuth and Scheidt (1977). Step 0: Generate a complete graph with M nodes, with every node connected to every other node. Step 1: Calculate the partial correlation matrix P(t) from the correlation coefficient matrix C(t), where t indicates the iteration. Step 2: Find the element with the smallest absolute value in P(t) and replace it with 0. Step 3: Reconstruct C(t+1) from P(t). Step 4: Termination depends on the deviances Dev1 = N log(|C(t+1)| / |C(0)|) and Dev2 = N log(|C(t+1)| / |C(t)|). Calculate Dev1 and Dev2. If either is < .05, the iteration stops; otherwise go to Step 1.
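Steps 1 and 2 can be sketched for a single iteration as below, assuming the standard relation between partial correlations and the inverse of the correlation matrix (the slide does not say how P(t) is computed). The helper names and the 3x3 matrix are hypothetical; reconstructing C(t+1) and the deviance test are not shown.

```python
import math


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (matrix must be non-singular)."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def partial_correlations(corr):
    """Step 1: P[i][j] = -W[i][j] / sqrt(W[i][i] * W[j][j]), where W = inverse of C."""
    w = invert(corr)
    n = len(corr)
    return [[1.0 if i == j else -w[i][j] / math.sqrt(w[i][i] * w[j][j])
             for j in range(n)] for i in range(n)]


def weakest_edge(pcorr):
    """Step 2 (first iteration): off-diagonal pair with the smallest |P[i][j]|."""
    n = len(pcorr)
    pairs = [(abs(pcorr[i][j]), (i, j))
             for i in range(n) for j in range(i + 1, n)]
    return min(pairs)[1]


# Hypothetical correlations among three cluster profiles X, Y, Z, chosen so
# that X and Z are correlated only through Y (0.4 = 0.8 * 0.5).
corr = [
    [1.0, 0.8, 0.4],
    [0.8, 1.0, 0.5],
    [0.4, 0.5, 1.0],
]

pcorr = partial_correlations(corr)
edge = weakest_edge(pcorr)
print(pcorr, edge)
```

The X-Z pair has an ordinary correlation of 0.4 but a partial correlation of essentially zero, so it is the first edge removed from the complete graph. This is the distinction between direct and indirect association that the method relies on.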

  28. Graphical Gaussian Method. Subgraph of the independence graph corresponding to the partial correlation coefficient matrix

  29. Graphical Gaussian Method: Results and Conclusions • The algorithm stopped after 189 iterations • SUC2 (sucrose-hydrolyzing enzyme) was used as a model to evaluate the accuracy of the method: among 40 known correlations with other genes, the method identified 3 as belonging to the same cluster, 8 as having a correlation of 0, and 29 as interacting. • Concluded that the method is about 75% accurate. • Could be a highly effective method for gene network generation if combined with prior knowledge

  30. Linear Additive Method • de la Fuente et al. “Linking the genes: inferring quantitative gene networks from microarray data.” Trends in Genetics 18(8): 395-98. • Goal: To establish a method for inferring gene networks and the corresponding gene interaction strengths • Represents gene networks with expression levels treated as continuous variables

  31. Linear Additive Method: co-control coefficient (FR = fluorescence intensities)

  32. Linear Additive Method: Conclusions • An in silico approach is useful for testing inferred networks • Can be used with experiments that disrupt one gene at a time • Provides a method for developing gene networks that include quantitative interaction strengths

  33. New and Improved Network Designs • Continuous-value network inference: uses differential equations and allows genes to be continuous variables • Gene Duplication: Network nodes are randomly duplicated to help network connections evolve • Many computer simulations are being developed to help mimic real data to aid in the design of new algorithms

  34. Conclusion and Outlook • Integration of large amounts of biological data with computational power is increasing our knowledge of complex systems • Increasing need to standardize microarray experiments and create databases • Gradual improvement of clustering and gene network inference algorithms • Addition of differential proteomics and incorporation of multiple regulation steps
