MATISSE: A Modular Analysis Tool for Interactions and Gene Similarity Networks
MATISSE is an analysis tool that identifies functional modules in biological networks by combining network topology with high-throughput data such as gene expression. It builds on two lines of analysis: microarray data analysis (clustering, biclustering, extraction of regulatory networks) and protein interaction network analysis. Integrating the two gives combined support for low-quality data and joint visualization. MATISSE identifies gene sets that have correlated expression patterns and form connected subnetworks, without requiring the number of modules to be specified in advance. The extensions CEZANNE and DEGAS add support for confidence-based networks and for case-control expression data, respectively.
Presentation Transcript
MATISSE - Modular Analysis for Topology of Interactions and Similarity SEts http://acgt.cs.tau.ac.il/matisse Igor Ulitsky and Ron Shamir Identification of Functional Modules using Network Topology and High-Throughput Data. BMC Systems Biology 1:8 (2007).
Microarray data analysis • Input: expression levels of (all) genes in several conditions • Analysis methods: • Clustering (CLICK) • Biclustering (SAMBA) • Extraction of regulatory networks
Protein interaction network analysis • Input: Network with nodes=proteins/genes edges=interactions • Analysis methods: • Global properties • Motif content analysis • Complex extraction • Cross-species comparison
Integrated analysis • Combined support for low quality data • Joint visualization • Statistics of known pathways • Detection of “hot spots”
MATISSE • Identify sets of genes (modules) that • Have highly correlated expression patterns • Induce connected subgraphs in the interaction network (Figure legend: Interaction; High Similarity)
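The two module criteria on this slide can be checked on toy data. The following is a minimal sketch, not the MATISSE implementation: the gene names, expression values, edges, and the choice of Pearson correlation as the similarity measure are all assumptions made for illustration.

```python
from collections import deque
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def is_connected(genes, edges):
    """True if the genes induce a connected subgraph of the interaction network."""
    genes = set(genes)
    if not genes:
        return False
    # Keep only edges with both endpoints inside the gene set (induced subgraph).
    adj = {g: set() for g in genes}
    for u, v in edges:
        if u in genes and v in genes:
            adj[u].add(v)
            adj[v].add(u)
    start = next(iter(genes))
    seen = {start}
    queue = deque([start])
    while queue:  # breadth-first search from an arbitrary gene
        for nxt in adj[queue.popleft()]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen == genes


def mean_similarity(genes, expr):
    """Average pairwise correlation inside the gene set."""
    pairs = list(combinations(sorted(genes), 2))
    return sum(pearson(expr[a], expr[b]) for a, b in pairs) / len(pairs)


# Hypothetical toy data: four genes, four conditions, a path-shaped network.
expr = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.0, 6.0, 8.5],
    "C": [1.1, 2.1, 2.9, 4.2],
    "D": [4.0, 3.0, 2.0, 1.0],
}
edges = [("A", "B"), ("B", "C"), ("C", "D")]
candidate = {"A", "B", "C"}

print(is_connected(candidate, edges), mean_similarity(candidate, expr))
```

In this toy network the set {A, C} is co-expressed but not connected, so it would not qualify as a module; adding B, which links the two, satisfies both criteria.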
MATISSE workflow • Seed generation • Greedy optimization • Significance filtering
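The three workflow steps can be illustrated with a deliberately simplified sketch. This is not the published algorithm: the scoring function (sum of pairwise similarity minus a threshold), the seed rule (the most similar interacting pair), the growth rule (add one neighbor at a time while the score improves), and the random-set significance test are all assumptions chosen to keep the example short.

```python
import random
from itertools import combinations


def module_score(genes, sim, threshold=0.5):
    """Sum of (similarity - threshold) over all gene pairs in the set."""
    return sum(sim[frozenset(p)] - threshold
               for p in combinations(sorted(genes), 2))


def best_seed(edges, sim):
    """Seed generation: start from the most similar interacting pair."""
    return set(max(edges, key=lambda e: sim[frozenset(e)]))


def grow(seed, adj, sim, threshold=0.5):
    """Greedy optimization: add the neighbor with the largest positive gain."""
    module = set(seed)
    while True:
        # Only network neighbors are candidates, so the module stays connected.
        frontier = {n for g in module for n in adj[g]} - module
        current = module_score(module, sim, threshold)
        best, best_gain = None, 0.0
        for cand in sorted(frontier):
            gain = module_score(module | {cand}, sim, threshold) - current
            if gain > best_gain:
                best, best_gain = cand, gain
        if best is None:
            return module
        module.add(best)


def empirical_p(module, genes, sim, threshold=0.5, trials=200, rng=None):
    """Significance filtering: compare against random gene sets of equal size."""
    rng = rng or random.Random(0)
    observed = module_score(module, sim, threshold)
    hits = sum(
        module_score(rng.sample(genes, len(module)), sim, threshold) >= observed
        for _ in range(trials)
    )
    return (hits + 1) / (trials + 1)


# Hypothetical toy input: A, B, C are mutually similar; D and E are not.
genes = ["A", "B", "C", "D", "E"]
tight = {"A", "B", "C"}
sim = {frozenset(p): (0.9 if set(p) <= tight else 0.1)
       for p in combinations(genes, 2)}
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
adj = {g: set() for g in genes}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

module = grow(best_seed(edges, sim), adj, sim)
p_value = empirical_p(module, genes, sim)
print(sorted(module), p_value)
```

Because growth only considers neighbors of the current module, connectivity is preserved by construction, and because growth stops when no addition improves the score, the number of modules never has to be fixed in advance.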
Advantages of MATISSE • No need for confidence estimation on individual measurements • Works even when only a fraction of the genes have expression patterns • Can handle any similarity data, not only expression • Produces connected modules • No need to specify the number of modules
Osmotic shock response of S. cerevisiae • Network of 6,246 genes and 65,990 protein-protein and protein-DNA interactions • 133 experimental conditions – response of perturbed strains to osmotic shock (O’Rourke and Herskowitz, 2004) • 2,000 genes filtered based on variation criterion
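The slide does not state the variation criterion. One common choice is to rank genes by the variance of their expression across conditions and keep the top k; the sketch below assumes that reading, with invented gene names and values.

```python
from statistics import pvariance


def top_variable_genes(expr, k):
    """Return the k genes whose expression varies most across conditions."""
    ranked = sorted(expr, key=lambda g: pvariance(expr[g]), reverse=True)
    return ranked[:k]


# Hypothetical expression profiles over three conditions.
expr = {
    "flat": [1.0, 1.0, 1.1],
    "mild": [1.0, 1.5, 2.0],
    "strong": [0.0, 3.0, -2.0],
}

kept = top_variable_genes(expr, 2)
print(kept)
```

In the yeast study the same idea would be applied with k = 2,000 over the 133 conditions, assuming the 2,000 genes are the ones retained.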
Pheromone response subnetwork (Figure; legend: Back, Front)
Proteolysis subnetwork (Figure; legend: Back, Front)
Performance comparison • Figure: % of modules with category enrichment at p < 10^-3
Performance comparison (2) • Figure: % of annotations with enrichment at p < 10^-3 in modules
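The slides do not say which test produced the enrichment p-values. A hypergeometric tail test is a common choice for asking whether a module contains more genes from an annotation category than chance would predict; the sketch below assumes that test, and the counts in the example are invented.

```python
from math import comb


def hypergeom_pvalue(population, annotated, module_size, overlap):
    """P(X >= overlap) when module_size genes are drawn without replacement
    from a population in which `annotated` genes carry the category."""
    upper = min(annotated, module_size)
    total = comb(population, module_size)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, module_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 2,000 genes, 40 in the category,
# a module of 20 genes of which 8 are in the category.
p = hypergeom_pvalue(population=2000, annotated=40, module_size=20, overlap=8)
print(p < 1e-3)
```

A module would count as enriched for a category when this p-value falls below the 10^-3 cutoff used in the comparison; in practice a correction for testing many categories is usually applied as well.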
Human cell cycle • Constructed a network with 6,000 nodes, 25,000 edges • HPRD • BIND • Y2H studies • SPIKE • HeLa cell cycle time series (Whitfield ’02) • Produced subnetworks enriched with all the phases of the cell cycle
Extensions of MATISSE • CEZANNE • Utilizes confidence-based networks • Extracts subnetworks that are connected with high confidence and co-expressed • Applied to 11 studies of gene expression in the blood • Not yet implemented in the MATISSE application
Extensions of MATISSE • DEGAS • Utilizes case-control expression data • Identifies dysregulated pathways – areas in the network in which many genes are dysregulated in most of the cases • Beta version implemented in the MATISSE software • Ulitsky, Karp and Shamir RECOMB 2008
Difficulties with prior approaches • In case-control data, gene pattern correlation can be due to diverse non-disease related factors • Patients are different • Genetic background • Other diseases/confounding factors • Disease grade • Current methods assume that the same genes are dysregulated in all the patients • A weaker assumption – a lot of dysregulated genes appear in the same dysregulated pathway
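The weaker assumption can be made concrete on toy data: a pathway may be hit in most patients even though no single gene is dysregulated in all of them. The sketch below is only an illustration of that idea; the patient labels, gene names, and the cutoff k are assumptions, and it does not search the network for pathways as DEGAS does.

```python
def covered_cases(genes, dysregulated, k):
    """Cases in which at least k genes of the set are dysregulated."""
    return [case for case, hits in dysregulated.items()
            if len(hits & genes) >= k]


# Hypothetical per-patient sets of dysregulated genes.
dysregulated = {
    "p1": {"A", "B"},
    "p2": {"B", "C"},
    "p3": {"A", "C"},
    "p4": {"D"},
}
pathway = {"A", "B", "C"}

covered = covered_cases(pathway, dysregulated, k=2)
per_gene = {g: sum(g in hits for hits in dysregulated.values())
            for g in sorted(pathway)}
print(covered, per_gene)
```

Each gene is dysregulated in only two of the four patients, so a gene-by-gene analysis would see weak signals, while the pathway as a whole has at least two dysregulated genes in three of the four.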
HD down-regulated • The pathway down-regulated in Huntington’s disease (HD) • Enriched with: • HD modifiers • HD relevant genes • Calcium signalling (Figure labels: Clear outlier; Huntingtin)
Extensions of MATISSE • Identification of modules correlated with external parameters • Numerical parameters: Age, tumor grade etc. • Logical parameters: Gender, tumor type • Identifies subnetworks with genes that are both • Correlated with the clinical parameter • Correlated with one another
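The two requirements on this slide can be scored separately for a candidate gene set. The following is a minimal sketch under stated assumptions: Pearson correlation is used for both scores, a logical parameter would be encoded as 0/1 before scoring, and the ages and expression values are invented. It does not reproduce how the MATISSE software combines the two criteria.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def parameter_scores(genes, expr, param):
    """Mean correlation with the parameter, and mean pairwise correlation."""
    to_param = sum(pearson(expr[g], param) for g in genes) / len(genes)
    pairs = list(combinations(sorted(genes), 2))
    to_each_other = sum(pearson(expr[a], expr[b]) for a, b in pairs) / len(pairs)
    return to_param, to_each_other


# Hypothetical data: five samples with patient age as the numerical parameter.
age = [30, 40, 50, 60, 70]
expr = {
    "g1": [1.0, 1.4, 2.1, 2.4, 3.0],
    "g2": [0.9, 1.6, 1.9, 2.6, 3.1],
    "g3": [2.0, 1.0, 2.0, 1.0, 2.0],
}

with_age, with_each_other = parameter_scores(["g1", "g2"], expr, age)
print(with_age, with_each_other)
```

Here g1 and g2 rise with age and with each other, so both scores are high, whereas g3 shows no trend with age and would lower both scores if added to the set.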
MATISSE tool capabilities • MATISSE algorithm execution • Dynamic subnetwork layout • Customized node/edge highlighting • Dynamic expression matrix viewer • Module annotation • TANGO – Gene Ontology • Annotations with custom datasets • Calculation of different coefficients based on network/expression