
Basic Data Structures for Graph based Visualization and Analysis of Metabolic Networks

Jan Griebsch, Arno Buchner and Hanjo Täubig. Lehrstuhl für Effiziente Algorithmen (Prof. E.W. Mayr), Institut für Informatik, TU München. BFAM Workshop, 16.-17.01.2004.


Presentation Transcript


  1. Basic Data Structures for Graph based Visualization and Analysis of Metabolic Networks • Jan Griebsch & Arno Buchner & Hanjo Täubig • Lehrstuhl für Effiziente Algorithmen, Prof. E.W. Mayr, Institut für Informatik, TU München • BFAM Workshop, 16.-17.01.2004

  2. Outline • Application Requirements • Related Work • Graph Concepts • Existing Software • Conclusions for Data Models • A Test Case

  3. User-defined Requirement Profile • Work with (metabolic) networks of up to several thousand nodes (reactions, substrates) • Store arbitrary context information for each node • Search for, filter and extract enzymes, metabolites and pathways/subnetworks according to user-defined criteria • Visualize such networks • Support expanding and contracting (meta-)nodes • Enable efficient use of graph algorithms • Accommodate abstractions such as clusters of nodes

  4. Compound Graphs • Definition: A compound graph C = (G, D) consists of a graph G = (V, E_G) and a directed acyclic graph D = (V, E_D) that share the same set of nodes. [Sugiyama and Misue, 1991]
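The definition can be made concrete with a small sketch. It is only an illustration under stated assumptions: the class name `CompoundGraph`, its method names, and the choice to store E_G as undirected pairs are hypothetical and are not taken from Sugiyama and Misue. The one property the sketch enforces is that the inclusion graph D stays acyclic.

```python
from collections import defaultdict


class CompoundGraph:
    """Compound graph C = (G, D): adjacency edges E_G and inclusion edges E_D on one node set V."""

    def __init__(self, nodes):
        self.nodes = set(nodes)
        self.adjacency = set()            # E_G, stored as unordered pairs (an assumption)
        self.children = defaultdict(set)  # E_D, parent -> directly included nodes

    def _check(self, *nodes):
        for n in nodes:
            if n not in self.nodes:
                raise KeyError(n)

    def add_adjacency(self, u, v):
        self._check(u, v)
        self.adjacency.add(frozenset((u, v)))

    def add_inclusion(self, parent, child):
        """Add parent -> child to D, refusing any edge that would close a directed cycle."""
        self._check(parent, child)
        if parent == child or self._reaches(child, parent):
            raise ValueError("inclusion edge would make D cyclic")
        self.children[parent].add(child)

    def _reaches(self, start, target):
        # Depth-first search along inclusion edges.
        stack, seen = [start], set()
        while stack:
            n = stack.pop()
            if n == target:
                return True
            if n not in seen:
                seen.add(n)
                stack.extend(self.children[n])
        return False


# Hypothetical example: one pathway node that includes two metabolites.
c = CompoundGraph(["glycolysis", "glucose", "pyruvate"])
c.add_adjacency("glucose", "pyruvate")
c.add_inclusion("glycolysis", "glucose")
c.add_inclusion("glycolysis", "pyruvate")
```

Because D is only required to be acyclic, a node may have several parents here; restricting D to a tree gives the clustered-graph case.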

  5. Clustered Graphs • Definition: A clustered graph C = (G, T) consists of a base graph G and a rooted tree T, such that the leaves of T are exactly the vertices of G. [Eades and Feng, 1996]

  6. Graph Views Concept • Definition: Let G = (V_G, E_G) be the base graph. The hierarchy is defined by the tree T = (V_T, E_T), with the leaves L(T) = V_G. A view is defined as a subset of V_T that induces a partition of V_G. [Buchsbaum and Westbrook, 2000]
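To illustrate the definition, the following sketch tests whether a set of hierarchy nodes is a view, i.e. whether the leaf sets below the chosen nodes are pairwise disjoint and together cover V_G. The function names and the dictionary encoding of T (node to list of children) are assumptions made for this example only.

```python
def leaves_below(children, node):
    """Return the set of leaves of the subtree of T rooted at `node`."""
    kids = children.get(node, [])
    if not kids:
        return {node}
    result = set()
    for kid in kids:
        result |= leaves_below(children, kid)
    return result


def is_view(children, root, view):
    """True if the leaf sets below the nodes in `view` partition L(T) = V_G."""
    all_leaves = leaves_below(children, root)
    covered = set()
    for node in view:
        below = leaves_below(children, node)
        if covered & below:       # two view nodes share a leaf: not disjoint
            return False
        covered |= below
    return covered == all_leaves  # every base-graph vertex must be represented


# Hypothetical hierarchy: root -> {A, B}, A -> {a1, a2}, B -> {b1}
tree = {"root": ["A", "B"], "A": ["a1", "a2"], "B": ["b1"]}

print(is_view(tree, "root", {"A", "B"}))         # both clusters contracted
print(is_view(tree, "root", {"a1", "a2", "B"}))  # cluster A expanded
print(is_view(tree, "root", {"A"}))              # b1 is not represented
```

Expanding a node replaces it in the view by its children, and contracting does the reverse; both operations keep the partition property.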

  7. Existing Graph Software • Software/Libraries • LEDA, Boost, GTL: no concept of hierarchies • Wilmascope, GVF: clusters, but no concept of views

  8. Graph Class Diagram Example: WilmaScope
     • Data: + id : Integer; + name : String; <<constructor>> + Data(id : int); <<constructor>> + Data(id : int, name : String); <<getter>> + getID() : Integer; <<setter>> + setName(name : String) : void
     • GraphControl: + id : Integer; + name : String; <<constructor>> + Data(id : int); <<constructor>> + Data(id : int, name : String); <<getter>> + getID() : Integer; <<setter>> + setName(name : String) : void
     • GraphElement: + data : Data; + redraw() : void; + toString() : String
     • GraphNode: # edges : (Vector); <<constructor>> + GraphNode(data : Data); <<getter>> + getEdgesIterator() : Iterator; + addEdge(edge : Edge) : void; + removeEdge(edge : Edge) : void
     • Edge: + startNode : GraphNode; + endNode : GraphNode; + directed : boolean = false; <<constructor>> + Edge(data : Data); <<setter>> + setStartNode(node : GraphNode) : void; <<setter>> + setEndNode(node : GraphNode) : void
     • ClusterNode: - members : Vector; - internalEdges : Vector; <<constructor>> ~ClusterNode(id : int); <<getter>> + getNodes() : Vector; + containsNode : boolean; + addNode(node : GraphNode) : void; + removeNode(node : GraphNode) : void; + addInternalEdge(edge : Edge) : void
     • Graph: # clusters : Vector; # nodes : Vector; # edges : Vector; <<constructor>> + Graph(); <<getter>> + getNodes() : Vector; <<getter>> + getParentClusters(node : GraphNode) : Vector; + containsNode : boolean; + addNode(node : GraphNode) : void; + removeNode(node : GraphNode) : void; + addEdge(edge : Edge) : void; + removeEdge(edge : Edge) : void; + createCluster(clusteredNodes : Vector) : ClusterNode

  9. [Architecture diagram] Biochemical Visualisation and Analysis Framework for Metabolic Networks (BVAM) • Components shown: General User Interface (GUI); Graph Visualisation Tools; Graph Analysis Tools; Graph; Data Exchange Layer • Datasources shown: BioPath Database (CCC), KEGG, WIT, BRENDA • Also shown: Moses (CCC Group)

  10. Class Diagram • Classes: GraphElement, Node, Relation, Edge, Graph, View, Hierarchy
     • GraphElement: # PropertyMap : HashMap; + GraphElement(); + add_attribute(keytype, valuetype); + remove_attribute(keytype); + has_attribute(keytype) : bool; + get_value(keytype) : value
     • Graph: # adj : List<Edge>; # nodes : List<Node>; + Graph(); + add_node(Node); + add_edge(Node, Node); + remove_node(Node); + remove_edge(Edge)
     • Hierarchy: + hierarchy(Graph&); + add_node(Node); + remove_node(Node); + father_edge() : Edge; + son_edges() : edge_iterator; + is_predecessor(Node, Node) : bool; + induced_edge(Node, Node) : bool
     • View: + view(hierarchy&); + expand(node); + collapse(node)
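The property-map idea behind GraphElement can be sketched as follows. The method names follow the class diagram, but everything else is an assumption: the diagram does not show which classes inherit from GraphElement, so deriving Node and Edge from it is a guess, and the Python dictionary merely stands in for the HashMap.

```python
class GraphElement:
    """Base class carrying arbitrary key/value context information."""

    def __init__(self):
        self._property_map = {}  # stands in for the HashMap in the diagram

    def add_attribute(self, key, value):
        self._property_map[key] = value

    def remove_attribute(self, key):
        self._property_map.pop(key, None)

    def has_attribute(self, key):
        return key in self._property_map

    def get_value(self, key):
        return self._property_map[key]


class Node(GraphElement):  # inheritance is an assumption
    def __init__(self, name):
        super().__init__()
        self.name = name


class Edge(GraphElement):  # inheritance is an assumption
    def __init__(self, source, target):
        super().__init__()
        self.source, self.target = source, target


class Graph:
    def __init__(self):
        self.nodes = []  # List<Node>
        self.adj = []    # List<Edge>

    def add_node(self, node):
        self.nodes.append(node)
        return node

    def add_edge(self, source, target):
        edge = Edge(source, target)
        self.adj.append(edge)
        return edge

    def remove_edge(self, edge):
        self.adj.remove(edge)

    def remove_node(self, node):
        # Drop incident edges first so no edge refers to a missing node.
        self.adj = [e for e in self.adj if e.source is not node and e.target is not node]
        self.nodes.remove(node)


g = Graph()
atp = g.add_node(Node("ATP"))
adp = g.add_node(Node("ADP"))
atp.add_attribute("formula", "C10H16N5O13P3")
link = g.add_edge(atp, adp)
link.add_attribute("weight", 1.0)
```

Keeping the attributes in a per-element map is what lets each node store arbitrary context information without changing the class definitions.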

  11. Class Diagram • How can arbitrarily many hierarchies and views be modelled? • Observer pattern: View and Hierarchy are updated through callbacks [Raitner, 2003]. • Classes: GraphElement, Node, Relation, Edge, Graph, View, Hierarchy

  12. Class Diagram • Classes: GraphElement, Node, Relation, Edge, Graph, Observer, ObservedGraph, Hierarchy, View
     • Observer: + Observer(ObservedGraph&); + add_node_handler(Node); + remove_node_handler(Node); + add_edge(Node, Node); + remove_edge(Edge)
     • ObservedGraph: + ObservedGraph(); + add(Observer&); + remove(Observer&)
     • Multiplicity shown between ObservedGraph and Observer: 1 to 1..*
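A minimal sketch of this callback scheme, assuming node events only: the handler names `add_node_handler` and `remove_node_handler` and the `add`/`remove` registration calls come from the diagram, while the flat parent map inside `Hierarchy` and the `root` parameter are hypothetical simplifications.

```python
class ObservedGraph:
    """Base graph that notifies registered observers about node changes."""

    def __init__(self):
        self.nodes = set()
        self._observers = []

    def add(self, observer):
        self._observers.append(observer)

    def remove(self, observer):
        self._observers.remove(observer)

    def add_node(self, node):
        self.nodes.add(node)
        for obs in self._observers:
            obs.add_node_handler(node)

    def remove_node(self, node):
        self.nodes.discard(node)
        for obs in self._observers:
            obs.remove_node_handler(node)


class Observer:
    def __init__(self, graph):
        graph.add(self)  # register for callbacks on construction

    def add_node_handler(self, node):
        pass

    def remove_node_handler(self, node):
        pass


class Hierarchy(Observer):
    """Keeps a parent map in sync with the base graph (simplified)."""

    def __init__(self, graph, root="root"):
        self.root = root
        self.parent = {n: root for n in graph.nodes}
        super().__init__(graph)

    def add_node_handler(self, node):
        self.parent[node] = self.root  # new nodes start directly under the root

    def remove_node_handler(self, node):
        self.parent.pop(node, None)


g = ObservedGraph()
by_pathway = Hierarchy(g, root="all pathways")
by_compartment = Hierarchy(g, root="cell")
g.add_node("ATP")
```

Any number of hierarchies (and, in the same way, views) can be attached to one base graph, which is the point of the 1 to 1..* multiplicity.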

  13. Space-Time Trade-offs • Option 1: induced edges are calculated when needed • No redundant information • Expand/contract worst case: O(|E_G| + |V_G|) • Quick edit operations on the base graph • Option 2: induced edges are stored [Buchsbaum and Westbrook, 2000] • Expand/contract in optimal time • Space required: O(|E_G| · D²) • Updates of the base graph are more complicated and need O(D²) expected time • Is there a good compromise?
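The first alternative, computing induced edges only when a view is drawn, can be sketched as below. This is an illustration rather than the data structure of Buchsbaum and Westbrook: the function name, the parent-map encoding of the hierarchy, and the representative cache are assumptions. Each base edge is mapped to the pair of view nodes that contain its endpoints, so the work is roughly linear in the size of the base graph plus the hierarchy.

```python
def induced_edges(base_edges, parent, view):
    """Compute the edges between view nodes induced by the base graph, on demand.

    base_edges: iterable of (u, v) pairs over base-graph vertices
    parent:     dict mapping every non-root hierarchy node to its parent
    view:       set of hierarchy nodes forming a valid view
    """
    rep = {}  # cache: hierarchy node -> view node that represents it

    def representative(node):
        path = []
        while node not in view and node not in rep:
            path.append(node)
            node = parent[node]  # KeyError if the view does not cover this vertex
        target = node if node in view else rep[node]
        for n in path:
            rep[n] = target
        return target

    result = set()
    for u, v in base_edges:
        ru, rv = representative(u), representative(v)
        if ru != rv:  # edges inside one view node are hidden
            result.add((ru, rv))
    return result


# Hypothetical hierarchy: root -> {A, B}, A -> {a1, a2}, B -> {b1}
parent = {"a1": "A", "a2": "A", "b1": "B", "A": "root", "B": "root"}
edges = [("a1", "a2"), ("a2", "b1")]

print(induced_edges(edges, parent, {"A", "B"}))         # both clusters contracted
print(induced_edges(edges, parent, {"a1", "a2", "B"}))  # cluster A expanded
```

Nothing is stored between calls, so edits to the base graph stay cheap; the price is that every expand or contract recomputes the edge set.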

  14. Example: Constructing Hierarchies on Metabolic Networks • Motivation • Explore properties of hierarchies on metabolic networks • Test prototype implementations • Study to what extent metabolic networks can be said to be composed of distinct sub- and sub-subnetworks (betweenness centrality could also be used for detecting key reactions/enzymes) • Previous work • Large-scale organization of metabolic networks [Jeong et al., 2000] • The small world inside large metabolic networks [Fell and Wagner, 2001] • Exploring the pathway structure of metabolism [Schuster et al., 2002] • Subnetwork hierarchies of biochemical pathways [Holme et al., 2002] • Hierarchical analysis of dependency in metabolic networks [Gagneur et al., 2003]

  15. Example: Constructing Hierarchies on Metabolic Networks • Data • BioPath Database, Computer Chemie Centrum, Prof. Gasteiger • Decomposition • successively delete nodes according to a global centrality measure
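The decomposition step can be sketched with betweenness centrality as the global measure, computed with Brandes' algorithm for unweighted graphs. Several points are assumptions for illustration: the slides do not say whether centrality is recomputed after each deletion (it is here), the function names are hypothetical, and the toy network is not BioPath data. Scores count each unordered pair of endpoints twice, since the graph is undirected and every node serves as a source.

```python
from collections import deque


def betweenness(adj):
    """Brandes' algorithm for unweighted graphs; adj maps node -> set of neighbours."""
    score = dict.fromkeys(adj, 0.0)
    for s in adj:
        order = []                        # nodes by non-decreasing distance from s
        preds = {v: [] for v in adj}      # predecessors on shortest paths
        sigma = dict.fromkeys(adj, 0)     # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)   # dependency of s on each node
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score


def delete_most_central(adj, rounds):
    """Repeatedly remove the node with the highest betweenness (recomputed each round)."""
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    removed = []
    for _ in range(rounds):
        if not adj:
            break
        score = betweenness(adj)
        victim = max(sorted(adj), key=score.get)  # sorted() makes ties deterministic
        removed.append((victim, score[victim]))
        for n in adj.pop(victim):
            adj[n].discard(victim)
    return removed, adj


def components(adj):
    """Connected components of an undirected graph."""
    seen, parts = set(), []
    for start in adj:
        if start in seen:
            continue
        part, stack = set(), [start]
        while stack:
            v = stack.pop()
            if v not in part:
                part.add(v)
                stack.extend(adj[v] - part)
        seen |= part
        parts.append(part)
    return parts


# Toy network: two triangles joined only through a hub metabolite.
toy = {
    "a1": {"a2", "a3", "ATP"}, "a2": {"a1", "a3"}, "a3": {"a1", "a2"},
    "b1": {"b2", "b3", "ATP"}, "b2": {"b1", "b3"}, "b3": {"b1", "b2"},
    "ATP": {"a1", "b1"},
}
removed, rest = delete_most_central(toy, 1)
```

In the toy network the hub is deleted first and the remainder falls apart into the two triangles, which is the effect the decomposition relies on.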

  16. Example: Constructing Hierarchies on Metabolic Networks • Deleted Metabolites • Initial Graph: 3548 Nodes, 8956 Edges
      Metabolite                      Betweenness
      PROTON                          5.31227e+06
      WATER                           5.2434e+06
      ATP                             3.04506e+06
      NAD                             2.03219e+06
      NADP (reduced)                  2.26506e+06
      COENZYME A                      1.75499e+06
      NADP                            1.83248e+06
      NAD (reduced)                   2.23032e+06
      PYROPHOSPHATE                   2.04188e+06
      ADP                             2.08268e+06
      CARBON DIOXIDE                  1.89917e+06
      PHOSPHATE (with GTP)            2.16805e+06
      L-GLUTAMATE                     1.78612e+06
      ACETYL-COENZYME A               1.71162e+06
      AMP                             1.9995e+06
      PYRUVATE                        1.29878e+06
      GLYCINE                         1.31198e+06
      AMMONIA                         1.18789e+06
      2-OXOGLUTARATE                  1.19791e+06
      PHOSPHATE (protonated)          1.14856e+06
      FAD (linked with enzyme)        1.2225e+06
      OXALOACETATE                    1.55343e+06
      Listed without a value: SUCCINYL-COENZYME A, FORMATE, L-SERINE, L-METHIONINE, UTP, GLYCERALDEHYDE 3-PHOSPHATE, UDP, PALMITOYL-ENZYME

  17. Example: Constructing Hierarchies on Metabolic Networks

  18. Screenshots • Graph including all BioPath reactions (random layout with LEDA)

  19. Screenshots • Graph after deleting the 30 most central metabolites (spring-embedded 3D layout by LEDA)

  20. Screenshot • Graph with data from the citrate cycle (random layout, manually adjusted)

  21. Future Work • Implement graphical user interface • Include more data sources • Add chemical analysis abilities: interact with MOSES, Prof. Gasteiger • Visualization

  22. Thanks We want to thank Prof. E. W. Mayr, Dr. Jens Ernst, Klaus Holzapfel and Moritz Maass for ideas and discussion and Hanjo Täubig for practical support.

  23. References
      [1] Buchsbaum, A. L. and Westbrook, J. R. Maintaining Hierarchical Graph Views. 11th ACM-SIAM Symposium on Discrete Algorithms, 2000.
      [2] Eades, P. and Feng, Q. W. Multilevel Visualization of Clustered Graphs. Proc. Graph Drawing, LNCS, Vol. 1190, 101-112, Springer Verlag, 1996.
      [3] Sugiyama, K. and Misue, K. Visualization of Structural Information: Automatic Drawing of Compound Digraphs. IEEE Trans. Systems, Man and Cybernetics, 21(4), 876-892, 1991.
      [4] Brandes, U. A Faster Algorithm for Betweenness Centrality. Journal of Mathematical Sociology, 25(2), 163-177, 2001.
      [5] Gagneur, J., Jackson, D. B. and Casari, G. Hierarchical analysis of dependency in metabolic networks. Bioinformatics, Vol. 19, 2003.
      [6] Schuster, S., Pfeiffer, T., Moldenhauer, F., Koch, I. and Dandekar, T. Exploring the pathway structure of metabolism: decomposition into subnetworks and application to Mycoplasma pneumoniae. 2002.
      [7] Holme, P., Huss, M. and Jeong, H. Subnetwork hierarchies of biochemical pathways. 2002.
      [8] Raitner, M. A Library for Hierarchies, Graphs and Views. 2003.

  24. Visualization using Hierarchies • Example taken from EcoCyc (http://ecocyc.org/) • Useful for getting an overview • Limitations here: only a few levels, and the level of detail can only be changed globally

  25. Analysis using Hierarchies • Recent approaches: • Holme et al., 2002: detecting subnetwork hierarchies of biochemical networks using the betweenness centrality of reactions • Gagneur et al., 2003: analysis of hierarchical dependencies of subnetworks using connectivity ranking of metabolites • Schuster et al., 2002: decomposition of the metabolic network using connectivity ranking for metabolites

  26. Resulting Data Model: Bipartite Hierarchical Graph • Bipartite node structure with distinct representations for reactions and metabolites • Specific information can be attached to the respective graph element (and used for analysis and visualization) • Graph-specific algorithms can be implemented for calculations (pathway searches, statistics) • Biological concepts (pathways, cell compartments) can be modelled and visualized using hierarchical structures • Hierarchical clustering approaches with different criteria can be used for automated network decomposition
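The bipartite part of this model can be sketched as follows. The class `MetabolicGraph`, its methods, and the property names are hypothetical; the only rule taken from the slide is that reactions and metabolites are two distinct node kinds, so every edge must join one of each. The sample reaction is the citrate synthase step of the citrate cycle.

```python
class MetabolicGraph:
    """Directed bipartite graph: metabolite -> reaction -> metabolite."""

    KINDS = ("reaction", "metabolite")

    def __init__(self):
        self.kind = {}   # node id -> "reaction" or "metabolite"
        self.props = {}  # node id -> dict of arbitrary properties
        self.succ = {}   # node id -> set of successor ids

    def add_node(self, node_id, kind, **properties):
        if kind not in self.KINDS:
            raise ValueError(f"unknown node kind: {kind}")
        self.kind[node_id] = kind
        self.props[node_id] = dict(properties)
        self.succ.setdefault(node_id, set())

    def add_edge(self, source, target):
        # Bipartite rule: an edge never joins two nodes of the same kind.
        if self.kind[source] == self.kind[target]:
            raise ValueError("edges must connect a metabolite and a reaction")
        self.succ[source].add(target)

    def add_reaction(self, reaction_id, substrates, products, **properties):
        self.add_node(reaction_id, "reaction", **properties)
        for m in substrates:
            self.add_edge(m, reaction_id)
        for m in products:
            self.add_edge(reaction_id, m)

    def products_of(self, metabolite):
        """Metabolites reachable from `metabolite` through one reaction."""
        return {p for r in self.succ[metabolite] for p in self.succ[r]}


g = MetabolicGraph()
for name in ["oxaloacetate", "acetyl-CoA", "water", "citrate", "coenzyme A"]:
    g.add_node(name, "metabolite")
g.add_reaction(
    "citrate synthase step",
    substrates=["oxaloacetate", "acetyl-CoA", "water"],
    products=["citrate", "coenzyme A"],
    enzyme="citrate synthase",
)
```

Keeping reactions as nodes of their own, rather than as labelled edges, is what allows a reaction with several substrates and products to carry its own properties.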

  27. Wrapper Concept (Wrapper, Datasource) • Request: "Tell_IDSets" • Response: IDNameSet1, IDNameSet2, …

  28. Wrapper Concept (Wrapper, Datasource) • Request: "Build( IDNameSet )" • Response: Edgeset<IDValue1, IDValue2, IDName1, IDName2> : Set

  29. Wrapper Concept (Wrapper, Datasource) • Request: "Tell_PropertySet( IDName )" • Response: PropertyName-Set

  30. Wrapper Concept (Wrapper, Datasource) • Request: "Get_PropertySet( IDValueSet, PropertyName )" • Response: Propertyset<IDValue, PropertyValue> : Set
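One possible reading of the four wrapper calls is sketched below with an in-memory stand-in for a data source. The slides give only the call names and result shapes, so the class `InMemoryWrapper`, the lower-case method names, the storage layout, and the sample records are all assumptions.

```python
class InMemoryWrapper:
    """Hypothetical wrapper around a tiny in-memory 'data source'."""

    def __init__(self, links, properties):
        # links:      set of (id_value1, id_value2, id_name1, id_name2) tuples
        # properties: {id_name: {property_name: {id_value: property_value}}}
        self._links = set(links)
        self._properties = properties

    def tell_id_sets(self):
        """Tell_IDSets: names of the identifier sets this source offers."""
        return set(self._properties)

    def build(self, id_names):
        """Build(IDNameSet): edges whose endpoints both belong to the requested ID sets."""
        wanted = set(id_names)
        return {l for l in self._links if l[2] in wanted and l[3] in wanted}

    def tell_property_set(self, id_name):
        """Tell_PropertySet(IDName): property names available for one ID set."""
        return set(self._properties[id_name])

    def get_property_set(self, id_values, property_name):
        """Get_PropertySet(IDValueSet, PropertyName): (IDValue, PropertyValue) pairs."""
        result = set()
        for table in self._properties.values():
            values = table.get(property_name, {})
            result |= {(i, values[i]) for i in id_values if i in values}
        return result


# Made-up records, for illustration only.
wrapper = InMemoryWrapper(
    links={("M1", "R1", "metabolite", "reaction"), ("R1", "M2", "reaction", "metabolite")},
    properties={
        "metabolite": {"name": {"M1": "pyruvate", "M2": "acetyl-CoA"}},
        "reaction": {"name": {"R1": "example reaction"}},
    },
)
```

A graph builder would call `tell_id_sets` and `build` once to obtain the edge set, then fetch properties lazily with `get_property_set` for the nodes it actually displays.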

  31. Graph Concepts and Software • Definitions • Clustered Graphs, [Eades and Feng, 1996] • Compound Graphs, [Sugiyama and Misue 1991] • Hierarchical Graph Views, [Buchsbaum and Westbrook 2000] • Software/Libraries • LEDA • Boost • GTL • Wilmascope • GVF • HGV

  32. Supported Graph Operations • Navigation/View: expand a node; contract nodes • Structure, base graph: insert a new edge between two nodes s, t; delete an edge; insert a new node; delete a node • Structure, hierarchy: insert a new step into the hierarchy; remove a step from the hierarchy
