
Mining Graphs with Constraints on Symmetry and Diameter


Presentation Transcript


  1. Mining Graphs with Constraints on Symmetry and Diameter. Natalia Vanetik, Deutsche Telekom Laboratories at Ben-Gurion University. IWGD10 workshop, July 14th, 2010, Jiuzhaigou, China

  2. Graph mining (1): Problem statement

  3. Graph mining (2): Motivation
     • Graphs are everywhere
        • Chemical compounds (Cheminformatics)
        • Protein structures, biological pathways/networks (Bioinformatics)
        • Program control flow, traffic flow, and workflow analysis
        • XML databases, Web, and social network analysis
     • Graph is a general model
        • Trees, lattices, sequences, and items are degenerate graphs
     • Diversity of graphs
        • Directed vs. undirected, labeled vs. unlabeled (edges & vertices), weighted, with angles & geometry (topological vs. 2-D/3-D)
     • Complexity of algorithms: many problems are of high complexity (NP-complete or even PSPACE!)

  4. Graphs, graphs, everywhere
     [Figure panels: Aspirin; Yeast protein interaction network; Co-author network; Internet. Source credit on slide: H. Jeong et al., Nature 411, 41 (2001).]

  5. Constraints: diameter
     • Diameter d(G) of a graph G is the maximum among the minimal distances between pairs of its vertices.
     • d(G)=1 implies that G is complete.
     • d(G)=∞ implies that G is not connected.
     [Figure: four example graphs with d(G)=1, d(G)=2, d(G)=2, d(G)=∞.]
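The definition above can be illustrated with a minimal sketch that runs a breadth-first search from every vertex. The function names (`from_edges`, `bfs_distances`, `diameter`) and the adjacency-dictionary representation are assumptions made for this illustration, not part of the talk.

```python
from collections import deque
from math import inf


def from_edges(vertices, edges):
    """Build an undirected adjacency dictionary {vertex: set_of_neighbours}."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def bfs_distances(adj, source):
    """Shortest-path lengths (in edges) from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist


def diameter(adj):
    """Maximum over all vertex pairs of the shortest-path distance; inf if disconnected."""
    best = 0
    for v in adj:
        dist = bfs_distances(adj, v)
        if len(dist) < len(adj):  # some vertex is unreachable from v
            return inf
        best = max(best, max(dist.values()))
    return best


k4 = from_edges(range(4), [(a, b) for a in range(4) for b in range(a + 1, 4)])
c5 = from_edges(range(5), [(i, (i + 1) % 5) for i in range(5)])
split = from_edges([1, 2, 3], [(1, 2)])
print(diameter(k4), diameter(c5), diameter(split))
```

One BFS per vertex costs O(|V|·(|V|+|E|)) overall, which is polynomial in the size of the graph.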

  6. Constraints: symmetry
     • Symmetries of a graph G are determined by its automorphism group Aut(G).
     • Aut(G) is a permutation group.
     • The largest possible automorphism group for a graph of size n is Sn, which has order n!.
     [Figure: four example graphs with Aut(G)=S5, Aut(G)=S3, Aut(G)=D5, Aut(G)=S5.]

  7. Measuring symmetry and diameter (1)
     • Graph diameter is computable in polynomial time.
     • The automorphism group of a graph is not likely to be computable in polynomial time.
     • Best known algorithm: nauty by B. McKay, which outputs a set of generators of Aut(G).
     • Intuitively, graphs with smaller diameter and higher symmetry are more interesting.
     [Figure: two example graphs with d(G)=2 and d(G)=3.]

  8. Measuring symmetry and diameter (2)
     • Symmetry is harder to measure.
     • Observation: maximum symmetry of a graph is achieved when its automorphism group is the symmetric group on as many elements as the graph has vertices.
     • Suggestion: measure symmetry of G as s(G) = |Aut(G)| / |G|!, where |G| is the number of vertices of G.
     [Figure: example graphs on 5 vertices with s(G)=|S5|/|S5|=1, s(G)=|S3|/|S5|=1/20, s(G)=|D5|/|S5|=1/12.]
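The ratio s(G) = |Aut(G)| / |G|! suggested by the examples on this slide can be checked on very small graphs by brute force. The sketch below is an illustration only: it enumerates all vertex permutations, so it is exponential and is not a substitute for nauty. The function names are assumptions, and the graphs are assumed to be simple (no loops or parallel edges).

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def automorphism_count(vertices, edges):
    """Count vertex permutations that map every edge onto an edge (brute force)."""
    vertices = list(vertices)
    edge_set = {frozenset(e) for e in edges}
    count = 0
    for image in permutations(vertices):
        mapping = dict(zip(vertices, image))
        # A bijection on vertices that sends edges to edges is an automorphism,
        # because the edge set is finite and the induced edge map is injective.
        if all(frozenset(mapping[x] for x in e) in edge_set for e in edge_set):
            count += 1
    return count


def symmetry(vertices, edges):
    """s(G) = |Aut(G)| / n!, returned as an exact fraction."""
    vertices = list(vertices)
    return Fraction(automorphism_count(vertices, edges), factorial(len(vertices)))


k5_edges = [(a, b) for a in range(5) for b in range(a + 1, 5)]
c5_edges = [(i, (i + 1) % 5) for i in range(5)]
print(symmetry(range(5), k5_edges))  # complete graph: Aut = S5
print(symmetry(range(5), c5_edges))  # 5-cycle: Aut = D5, of order 10
```

The 5-cycle has the dihedral group D5 of order 10 as its automorphism group, so the second value is 10/120 = 1/12, which agrees with the D5 example on the slide.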

  9. Tree decomposition of a graph
     • Let G=(V,E) be a graph. Tree T is called a tree decomposition of G if
        • Nodes of T are subsets X1,…,Xn ⊆ V such that X1 ∪ … ∪ Xn = V.
        • If a vertex v ∈ Xi ∩ Xj, then every node Xk of T on the path from Xi to Xj contains v as well.
        • For every edge e=(v,u) there exists i so that u,v ∈ Xi.
     [Figure: a graph G on vertices 1, 2, 3, 4 with two tree decompositions: T1 = ({{1,2,3,4}}, ∅) and T2 = ({{1,2,4},{2,3,4}}, {({1,2,4},{2,3,4})}).]
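The three conditions can be checked mechanically. The following is a minimal sketch under stated assumptions: bags are given as a list of sets, the tree as a list of index pairs, and the edge list used for G is a guess at the slide's figure (the transcript does not preserve it), chosen so that both T1 and T2 are valid. All names are hypothetical.

```python
def _connected(nodes, tree_edges):
    """True if the given bag indices induce a connected subgraph of the tree."""
    nodes = set(nodes)
    if not nodes:
        return True
    adj = {i: set() for i in nodes}
    for i, j in tree_edges:
        if i in nodes and j in nodes:
            adj[i].add(j)
            adj[j].add(i)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        i = stack.pop()
        for j in adj[i] - seen:
            seen.add(j)
            stack.append(j)
    return seen == nodes


def is_tree_decomposition(vertices, edges, bags, tree_edges):
    """Check the three tree-decomposition conditions for (bags, tree_edges)."""
    vertices = set(vertices)
    # The bags must form a tree: n - 1 edges and connected.
    if not bags or len(tree_edges) != len(bags) - 1:
        return False
    if not _connected(range(len(bags)), tree_edges):
        return False
    # Condition 1: bags are subsets of V and cover V.
    if any(not bag <= vertices for bag in bags) or set().union(*bags) != vertices:
        return False
    # Condition 2: the bags containing any vertex v form a connected subtree.
    for v in vertices:
        holders = [i for i, bag in enumerate(bags) if v in bag]
        if not _connected(holders, tree_edges):
            return False
    # Condition 3: every edge of G lies inside some bag.
    return all(any({u, v} <= bag for bag in bags) for u, v in edges)


# Assumed edge list for the slide's 4-vertex graph G (not recoverable from the transcript).
V = {1, 2, 3, 4}
E = [(1, 2), (1, 4), (2, 4), (2, 3), (3, 4)]
T1 = ([{1, 2, 3, 4}], [])
T2 = ([{1, 2, 4}, {2, 3, 4}], [(0, 1)])
print(is_tree_decomposition(V, E, *T1), is_tree_decomposition(V, E, *T2))
```

In a tree, "every node on the path from Xi to Xj contains v" is equivalent to "the nodes containing v induce a connected subtree", which is the form the check uses.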

  10. Minimal tree decomposition
     • Width of a tree decomposition T is (max i |Xi|)-1.
     • The minimum width among all tree decompositions is called the tree width of a graph.
     • Tree width is at least the maximum clique size minus 1 (the two are equal for chordal graphs).
     • A tree decomposition of minimum width is called a minimal tree decomposition.
     • Computing a minimal tree decomposition is an NP-hard problem, as it contains the problem of finding all maximum cliques in a graph.
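As a small illustration of the width definition, the sketch below computes the width of a given decomposition and, by brute force, the maximum clique size of a small graph, so the two quantities can be compared. The function names and the example graphs are assumptions for illustration; the first edge list is the same guess at the 4-vertex example graph used for tree decompositions T1 and T2, and the second is a 4-cycle.

```python
from itertools import combinations


def decomposition_width(bags):
    """Width of a tree decomposition: size of the largest bag minus 1."""
    return max(len(bag) for bag in bags) - 1


def max_clique_size(vertices, edges):
    """Largest k such that some k vertices are pairwise adjacent (brute force)."""
    vertices = list(vertices)
    edge_set = {frozenset(e) for e in edges}
    for k in range(len(vertices), 0, -1):
        for subset in combinations(vertices, k):
            if all(frozenset(pair) in edge_set for pair in combinations(subset, 2)):
                return k
    return 0


# Assumed 4-vertex example graph: two triangles sharing the edge (2, 4).
two_triangles = [(1, 2), (1, 4), (2, 4), (2, 3), (3, 4)]
# A 4-cycle: no triangles, yet every cycle has tree width 2.
four_cycle = [(1, 2), (2, 3), (3, 4), (4, 1)]
bags = [{1, 2, 4}, {2, 3, 4}]  # a valid decomposition of both graphs

print(decomposition_width(bags))                          # width of the decomposition
print(max_clique_size({1, 2, 3, 4}, two_triangles) - 1)   # clique-based value, first graph
print(max_clique_size({1, 2, 3, 4}, four_cycle) - 1)      # clique-based value, 4-cycle
```

Every clique must fit inside a single bag, so the clique size minus 1 can never exceed the width of any decomposition. The 4-cycle shows that it can be strictly smaller.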

  11. Different tree decompositions
     [Figure: a graph on vertices 1 to 8, shown with a non-minimal tree decomposition and a minimal tree decomposition.]

  12. Intuition behind the proposed algorithm
     • Compute the finest tree decomposition possible for every DB transaction under the given time constraints.
     • Use a basic pattern-growing algorithm, such as FSG or gSpan, to extend instances of frequent patterns.
     • Every time an instance of a frequent pattern is extended by an edge or a node:
        • compute its diameter and symmetry estimates based on the pattern's position within the tree decomposition of the DB transaction;
        • if one of the estimates fails to meet the user-specified symmetry or diameter constraint, remove the pattern's instance from the instance list;
        • otherwise, keep the instance in the list.
     • If the count of instances is higher than the support bound, this is a frequent pattern.
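The filtering step in the list above can be sketched as follows. This is not the author's implementation: the `Instance` record, its field names, and the direction of the constraints (an upper limit on diameter, a lower limit on symmetry, following the remark that smaller diameter and higher symmetry are more interesting) are assumptions made for illustration. The estimates themselves are taken as given inputs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    """One occurrence of a pattern in a DB transaction, with its two estimates (hypothetical record)."""
    transaction_id: int
    diameter_lower_bound: float   # estimate: the instance's diameter is at least this
    symmetry_upper_bound: float   # estimate: the instance's symmetry is at most this


def keep_instance(inst, max_diameter, min_symmetry):
    """An instance survives only if neither estimate already rules it out."""
    return (inst.diameter_lower_bound <= max_diameter
            and inst.symmetry_upper_bound >= min_symmetry)


def filter_and_test(instances, max_diameter, min_symmetry, support_bound):
    """Drop instances that violate a constraint, then compare the count with the support bound."""
    kept = [i for i in instances if keep_instance(i, max_diameter, min_symmetry)]
    # "Higher than the support bound" is read literally as a strict comparison.
    return kept, len(kept) > support_bound


instances = [
    Instance(0, 1, 1.0),
    Instance(1, 2, 1 / 6),
    Instance(2, 3, 0.5),   # diameter estimate exceeds the limit below
]
kept, frequent = filter_and_test(instances, max_diameter=2, min_symmetry=0.1, support_bound=1)
print(len(kept), frequent)
```

Because the estimates are bounds rather than exact values, an instance is removed only when a bound already proves the constraint cannot be met.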

  13. How does it work?
     • Let T be a tree decomposition of a DB graph transaction t.
     • Let G ⊆ t be an instance of a candidate pattern.
     • Let T_G = (V_G, E_G) ⊆ T be the minimal subtree of T containing G.
     Claim 1. d(G) ≥ d(T_G).
     Claim 2. s(G) ≤ (|LAut(T_G)| · ∏_{X ∈ V_G} |X \ E_G|! · ∏_{e ∈ E_G} |e|!) / |G|!, where LAut(T_G) is the automorphism group of T_G viewed as a tree in which each node X is labeled by |X|.
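The bound in Claim 2 can be evaluated by brute force on small subtrees. The sketch below rests on one reading of the notation, which is an assumption: |e| is taken to be the number of vertices shared by the two nodes joined by tree edge e, and X \ E_G the vertices of X that are not shared along any tree edge. Under this reading, two nodes of size 3 sharing 2 vertices give 2·2!·1!·1!/4! = 1/6, which matches the arithmetic shown in Example (2). The bags used below are illustrative, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial, prod


def labeled_tree_automorphisms(labels, tree_edges):
    """Count node permutations that preserve both tree edges and node labels (brute force)."""
    n = len(labels)
    edge_set = {frozenset(e) for e in tree_edges}
    count = 0
    for image in permutations(range(n)):
        if any(labels[i] != labels[image[i]] for i in range(n)):
            continue
        if all(frozenset((image[i], image[j])) in edge_set for i, j in tree_edges):
            count += 1
    return count


def symmetry_upper_bound(bags, tree_edges):
    """Evaluate the right-hand side of Claim 2 under the reading described in the lead-in."""
    laut = labeled_tree_automorphisms([len(bag) for bag in bags], tree_edges)
    shared = [bags[i] & bags[j] for i, j in tree_edges]   # vertices on each tree edge
    on_some_edge = [set() for _ in bags]
    for (i, j), common in zip(tree_edges, shared):
        on_some_edge[i] |= common
        on_some_edge[j] |= common
    node_part = prod(factorial(len(bag - used)) for bag, used in zip(bags, on_some_edge))
    edge_part = prod(factorial(len(common)) for common in shared)
    size = len(set().union(*bags))                        # |G|: vertices covered by the bags
    return Fraction(laut * node_part * edge_part, factorial(size))


print(symmetry_upper_bound([{1, 2, 8}], []))                     # single node
print(symmetry_upper_bound([{2, 3, 4}, {2, 4, 6}], [(0, 1)]))    # two nodes sharing {2, 4}
```

For a single node the bound is 3!/3! = 1, so no pruning is possible from one bag alone. The bound becomes informative once the instance spreads over several nodes of the decomposition.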

  14. Example (1)
     [Figure: a graph on vertices 1 to 8 with its minimal tree decomposition; two pattern instances and the corresponding subtrees of the minimal tree decomposition. First instance: diameter is at least 1. Second instance: diameter is at least 2.]

  15. Example (2)
     [Figure: a graph on vertices 1 to 8 with its minimal tree decomposition; two pattern instances and the corresponding subtrees of the minimal tree decomposition. First instance: symmetry is at most 1. Second instance: symmetry is at most 2·2!·1!·1!/4! = 1/6.]

  16. Properties of estimates

  17. The algorithm

  18. Correctness

  19. Complexity concerns

  20. Test results (symmetry)

  21. Test results (symmetry)

  22. Test results (symmetry)

  23. Test results (diameter)

  24. Test results (diameter)

  25. Test results (diameter)
