
RTG: A Recursive Realistic Graph Generator using Random Typing


Presentation Transcript


  1. RTG: A Recursive Realistic Graph Generator using Random Typing. Leman Akoglu and Christos Faloutsos, Carnegie Mellon University

  2. Outline • Motivation • Problem Definition • Related Work • A Little History • Proposed Model • Experimental Results • Conclusion Akoglu, Faloutsos ECML PKDD 2009

  3. Motivation - 1 Complex graphs (WWW, computer, biological, social networks, etc.) exhibit many common properties: • power laws • small and shrinking diameter • community structure • … How can we produce synthetic but realistic graphs? [Image source: http://www.aharef.info/static/htmlgraph/]

  4. Motivation - 2 Why do we need synthetic graphs? • Simulation • Sampling/Extrapolation • Summarization/Compression • Motivation to understand pattern generating processes

  5. Problem Definition Discover a graph generator that is: G1. simple: the more intuitive the better! G2. realistic: outputs graphs that obey all "laws" G3. parsimonious: requires few parameters G4. flexible: able to produce the cross-product of un/weighted, un/directed, uni/bipartite graphs G5. fast: generation should take time linear in the size of the output graph

  6. Outline • Motivation • Problem Definition • Related Work • A Little History • Proposed Model • Experimental Results • Conclusion

  7. Related Work 1. Graph Properties: what we want to match 2. Graph Generators: what has been proposed earlier

  8. Related Work 1: Graph Properties

  9. Related Work 2: Graph Generators • Erdős-Rényi (ER) model [Erdős, Rényi `60] • Small-world model [Watts, Strogatz `98] • Preferential Attachment [Barabási, Albert `99] • Winners don't take all [Pennock et al. `02] • Forest Fire model [Leskovec, Faloutsos `05] • Butterfly model [McGlohon et al. `08]

  10. Related Work 2: Graph Generators • Erdős-Rényi (ER) model [Erdős, Rényi `60] • Small-world model [Watts, Strogatz `98] • Preferential Attachment [Barabási, Albert `99] • Winners don't take all [Pennock et al. `02] • Forest Fire model [Leskovec, Faloutsos `05] • Butterfly model [McGlohon et al. `08] Limitations: • Model some static graph property • Neglect dynamic properties • Cannot produce weighted graphs

  11. Related Work 2: Graph Generators • Random dot-product graphs [Kraetzl, Nickel `05] [Young, Scheinerman `07] • Utility-based models [Fabrikant et al. `02] [Even-Dar et al. `07] [Laoutaris, `08] • Kronecker graphs [Leskovec et al. `07] [Akoglu et al. `08]

  12. Related Work 2: Graph Generators • Produces only undirected graphs • Cannot produce weighted graphs • Requires quadratic time • Random dot-product graphs [Kraetzl, Nickel `05] [Young, Scheinerman `07] • Utility-based models [Fabrikant et al. `02] [Even-Dar et al. `07] [Laoutaris, `08] • Kronecker graphs [Leskovec et al. `07] [Akoglu et al. `08]

  13. Related Work 2: Graph Generators • Produces only undirected graphs • Cannot produce weighted graphs • Requires quadratic time • Random dot-product graphs [Kraetzl, Nickel `05] [Young, Scheinerman `07] • Utility-based models [Fabrikant et al. `02] [Even-Dar et al. `07] [Laoutaris, `08] • Kronecker graphs [Leskovec et al. `07] [Akoglu et al. `08] • Hard to analyze

  14. Related Work 2: Graph Generators • Produces only undirected graphs • Cannot produce weighted graphs • Requires quadratic time • Random dot-product graphs [Kraetzl, Nickel `05] [Young, Scheinerman `07] • Utility-based models [Fabrikant et al. `02] [Even-Dar et al. `07] [Laoutaris, `08] • Kronecker graphs [Leskovec et al. `07] [Akoglu et al. `08] • Hard to analyze • Multinomial/lognormal degree distribution • Fixed number of nodes

  15. Outline • Motivation • Problem Definition • Related Work • A Little History • Proposed Model • Experimental Results • Conclusion

  16. A Little History - 1 [Zipf, 1932] In many natural languages, the rank r and the frequency f_r of words follow a power law: f_r ∝ 1/r. [Plot: count vs. rank]

  17. A Little History - 2 [Mandelbrot, 1953] "Humans optimize avg. information per unit transmission cost."

  18. A Little History - 2 [Miller, 1957] "A monkey types randomly on a keyboard: the distribution of words follows a power law." [Figure: keyboard with k equiprobable keys (a, b, λ, $, +, …) and a Space key]
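The random-typing idea on this slide can be tried out with a minimal Python sketch. The function names `random_typing` and `rank_frequency`, the default values, and the choice to skip empty words are illustrative assumptions, not part of the original talk.

```python
import random
from collections import Counter

def random_typing(num_words, k=3, q=0.2, seed=0):
    """Type until num_words non-empty words exist.

    Assumption: the space key is hit with probability q, otherwise one of
    k equiprobable letter keys is hit. Empty words (two spaces in a row)
    are skipped.
    """
    rng = random.Random(seed)
    letters = [chr(ord("a") + i) for i in range(k)]
    words, current = [], []
    while len(words) < num_words:
        if rng.random() < q:          # space key: close the current word
            if current:
                words.append("".join(current))
                current = []
        else:                         # letter key: extend the current word
            current.append(rng.choice(letters))
    return words

def rank_frequency(words):
    """Return (word, count) pairs sorted by decreasing count (rank 1 first)."""
    return Counter(words).most_common()

if __name__ == "__main__":
    ranked = rank_frequency(random_typing(20000))
    for rank, (word, count) in enumerate(ranked[:10], start=1):
        print(rank, repr(word), count)
```

Plotting count against rank on log-log axes should show the roughly linear trend the slide refers to; with equiprobable keys the curve has visible steps, because all words of the same length are equally likely.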

  19. A Little History - 2 [Conrad and Mitzenmacher, 2004] "Same relation still holds when keys have unequal probabilities." [Figure: keyboard with keys of unequal size (a, b, λ, $, +, …) and a Space key]

  20. Outline • Motivation • Problem Definition • Related Work • A Little History • Proposed Model • Experimental Results • Conclusion

  21. Preliminary Model 1 RTG-IE: RTG with Independent Equiprobable keys [Figure: keyboard with equiprobable keys and a Space key]
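A minimal sketch of how RTG-IE might be simulated, assuming each edge is formed by typing a source label and then a destination label independently on the same keyboard. The names `type_label` and `rtg_ie`, the retry on empty labels, and the default parameter values are assumptions made for illustration.

```python
import random
from collections import Counter

def type_label(rng, letters, q):
    """Type equiprobable letters until the space key (probability q) is hit.

    Assumption: an empty label is retyped rather than kept.
    """
    while True:
        chars = []
        while rng.random() >= q:
            chars.append(rng.choice(letters))
        if chars:
            return "".join(chars)

def rtg_ie(W, k=5, q=0.2, seed=0):
    """Return a Counter mapping (source, destination) to edge weight.

    W is the total number of multi-edges; repeated pairs add up to weights.
    """
    rng = random.Random(seed)
    letters = [chr(ord("a") + i) for i in range(k)]
    edges = Counter()
    for _ in range(W):
        src = type_label(rng, letters, q)
        dst = type_label(rng, letters, q)
        edges[(src, dst)] += 1
    return edges

if __name__ == "__main__":
    edges = rtg_ie(10000)
    nodes = {n for pair in edges for n in pair}
    print("N =", len(nodes), "E =", len(edges), "W =", sum(edges.values()))
```

Each distinct label is a node, each distinct (source, destination) pair is an edge, and the pair count is the edge weight, so N, E and W can be read off the returned counter directly.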

  22. Preliminary Model 1 RTG-IE: RTG with Independent Equiprobable keys • Lemma 1. W is super-linear in N (power law). • Lemma 2. W is super-linear in E (power law). • Lemma 3. In(out)-weight W_n of node n is super-linear in in(out)-degree d_n (power law). Please find the proofs in the paper.

  23. Graph Properties

  24. Preliminary Model 1 RTG-IE: RTG with Independent Equiprobable keys • Lemma 1. W is super-linear in N (power law). • Lemma 2. W is super-linear in E (power law). • Lemma 3. In(out)-weight W_n of node n is super-linear in in(out)-degree d_n (power law). Laws covered: L05. Densification PL; L11. Weight PL; L10. Snapshot PL. Please find the proofs in the paper.

  25. Advantages of the Preliminary Model 1 • G1: Intuitive • G1: Easy to implement • G2: Realistic: provably follows several rules • G3: Handful of parameters: k, q, W • G5: Fast: only generates a random sequence of characters

  26. Problems of the Preliminary Model 1 1. Multinomial degree distributions [Plots: count vs. rank; count vs. in-degree]

  27. Problems of the Preliminary Model 1 2. No homophily, no community structure: node i connects to any node j with prob. d_i*d_j independently, rather than connecting to 'similar' nodes.

  28. Preliminary Model 2 RTG-IU: RTG with Independent Un-equiprobable keys Solution to Problem 1: [Conrad and Mitzenmacher, 2004] [Figure: keyboards with equiprobable and un-equiprobable keys (a, b, λ, $, +, …, Space), each with its count vs. rank and count vs. in-degree plots]

  29. Proposed Model RTG: Random Typing Graphs • Solution to Problem 2: "2D keyboard" • Generate source-destination labels in one shot. • Pick one of the nine keys randomly.

  30. Proposed Model RTG: Random Typing Graphs • Solution to Problem 2: "2D keyboard" • Repeat recursively. • Terminate each label when the space key is typed on each dimension (dark blue).

  31. Proposed Model RTG: Random Typing Graphs • Solution to Problem 2: "2D keyboard" • How do we choose the keys? The independent model does not yield community structure! [Figure: 3x3 keyboard over keys a, b, Space with independent key probabilities: p_a*p_a, p_a*p_b, p_a*q; p_b*p_a, p_b*p_b, p_b*q; q*p_a, q*p_b, q*q]

  32. Proposed Model RTG: Random Typing Graphs • Solution to Problem 2: "2D keyboard" • Boost the probability of diagonal keys and decrease the probability of off-diagonal ones (0<β<1: imbalance factor).

  33. Proposed Model RTG: Random Typing Graphs • Solution to Problem 2: "2D keyboard" • Boost the probability of diagonal keys and decrease the probability of off-diagonal ones (0<β<1: imbalance factor). • Favoring of diagonal keys creates homophily.
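One way to realize the imbalance factor is sketched below in Python. The slides only say that diagonal keys are boosted and off-diagonal keys are reduced; the specific rule used here (multiply every off-diagonal independent probability by β and give the removed mass to the diagonal key of the same row, so the per-dimension key probabilities stay unchanged) and the function name `keyboard_2d` are assumptions for illustration.

```python
def keyboard_2d(p, q, beta):
    """Build the 2D keyboard as a (k+1) x (k+1) probability matrix.

    p    : probabilities of the k letter keys (assumed: sum(p) + q == 1)
    q    : probability of the space key (last row and column)
    beta : imbalance factor, 0 < beta <= 1 (beta == 1 is the independent model)
    """
    marg = list(p) + [q]
    n = len(marg)
    # Off-diagonal keys: independent probability scaled down by beta.
    P = [[marg[i] * marg[j] * beta if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    # Diagonal keys: absorb the remaining mass so each row still sums to marg[i].
    for i in range(n):
        P[i][i] = marg[i] - sum(P[i])
    return P

if __name__ == "__main__":
    for row in keyboard_2d([0.4, 0.4], 0.2, 0.5):
        print(["%.3f" % x for x in row])
```

With two letter keys plus Space this gives the nine-key keyboard from the earlier slide. Smaller β puts more weight on the diagonal, so source and destination labels tend to share characters.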

  34. Proposed Model Parameters • k: number of keys • q: probability of hitting the space key S • W: number of multi-edges in the output graph G • β: imbalance factor
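Putting the four parameters together, a generator along these lines could look as follows. This is a hedged sketch rather than the authors' implementation: the function name `rtg`, equiprobable letter keys, the β rule (off-diagonal keys scaled by β, remainder moved to the diagonal), and the discarding of edges with an empty label are all assumptions.

```python
import random
from collections import Counter

def rtg(W, k=2, q=0.2, beta=0.5, seed=0):
    """Generate W multi-edges by typing on a 2D keyboard.

    Returns a Counter mapping (source_label, destination_label) to weight.
    """
    rng = random.Random(seed)
    keys = [chr(ord("a") + i) for i in range(k)]   # index k is the space key
    marg = [(1.0 - q) / k] * k + [q]
    cells, weights = [], []
    for i in range(k + 1):
        off = 0.0
        for j in range(k + 1):
            if i != j:
                w = marg[i] * marg[j] * beta       # penalized off-diagonal key
                cells.append((i, j))
                weights.append(w)
                off += w
        cells.append((i, i))
        weights.append(marg[i] - off)              # boosted diagonal key
    edges = Counter()
    made = 0
    while made < W:
        src, dst = [], []
        src_done = dst_done = False
        # Keep hitting 2D keys until space has been typed on both dimensions.
        while not (src_done and dst_done):
            i, j = rng.choices(cells, weights=weights)[0]
            if not src_done:
                if i == k:
                    src_done = True
                else:
                    src.append(keys[i])
            if not dst_done:
                if j == k:
                    dst_done = True
                else:
                    dst.append(keys[j])
        if src and dst:                            # assumption: drop empty labels
            edges[("".join(src), "".join(dst))] += 1
            made += 1
    return edges

if __name__ == "__main__":
    g = rtg(10000, k=5, q=0.2, beta=0.3)
    print("E =", len(g), "W =", sum(g.values()))
```

Once one label has seen the space key, further characters typed on that dimension are ignored until the other label also terminates, which is one reading of "terminate each label when the space key is typed on each dimension".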

  35. Proposed Model Up to this point, we discussed directed, weighted and unipartite graphs. Generalizations • Undirected graphs: ignore edge directions; edge generation is symmetric. • Unweighted graphs: ignore duplicate edges. • Bipartite graphs: different key sets on source and destination; labels are different.
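The first two generalizations amount to simple post-processing of a directed, weighted edge list. The sketch below assumes edges are stored as a mapping from (source, destination) to weight; the helper names `to_undirected` and `to_unweighted` are hypothetical.

```python
from collections import Counter

def to_undirected(edges):
    """Ignore edge directions: merge (u, v) and (v, u) into one edge."""
    out = Counter()
    for (u, v), w in edges.items():
        out[tuple(sorted((u, v)))] += w
    return out

def to_unweighted(edges):
    """Ignore duplicate edges: keep each (source, destination) pair once."""
    return set(edges)

if __name__ == "__main__":
    directed = Counter({("a", "b"): 2, ("b", "a"): 1, ("a", "a"): 3})
    print(to_undirected(directed))
    print(to_unweighted(directed))
```

The bipartite case is not covered by this post-processing, since it changes the generation step itself (separate key sets for the two dimensions).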

  36. Outline • Motivation • Problem Definition • Related Work • A Little History • Proposed Model • Experimental Results • Conclusion

  37. Experimental Results How well does RTG model real graphs? • Blognet: a social network of blogs based on citations (undirected, unweighted and unipartite); N = 27,726; E = 126,227; over 80 time ticks. • Com2Cand: the U.S. electoral campaign donations network from organizations to candidates (directed, weighted by $ amounts, and bipartite); N = 23,191; E = 877,721; W = 4,383,105,580; over 29 time ticks.

  38. Experimental Results 1: Blognet vs. RTG [Plots: count vs. degree] L01. Power-law degree distribution [Faloutsos et al. `99, Kleinberg et al. `99, Chakrabarti et al. `04, Newman `04]

  39. Experimental Results 1: Blognet vs. RTG [Plots: count vs. triangles] L02. Triangle Power Law (TPL) [Tsourakakis `08]

  40. Experimental Results 1: Blognet vs. RTG [Plots: λ_rank vs. rank] L03. Eigenvalue Power Law (EPL) [Siganos et al. `03]

  41. Graph Properties

  42. Experimental Results 1: Blognet vs. RTG [Plots: #edges vs. #nodes] L05. Densification Power Law (DPL) [Leskovec et al. `05]

  43. Experimental Results 1: Blognet vs. RTG [Plots: diameter vs. time] L06. Small and shrinking diameter [Albert and Barabási `99, Leskovec et al. `05]

  44. Experimental Results 1: Blognet vs. RTG [Plots: size vs. time] L07. Constant size 2nd and 3rd connected components [McGlohon et al. `08]

  45. Experimental Results 1: Blognet vs. RTG [Plots: λ1 vs. #edges] L08. Principal Eigenvalue Power Law (λ1PL) [Akoglu et al. `08]

  46. Experimental Results 1: Blognet vs. RTG [Plots: entropy vs. resolution] L09. Bursty/self-similar edge/weight additions [Gomez and Santonja `98, Gribble et al. `98, Crovella and Bestavros `99, McGlohon et al. `08]

  47. Graph Properties

  48. Experimental Results 2: Com2Cand vs. RTG [Plots: diameter vs. time; size vs. time]

  49. Experimental Results 2: Com2Cand vs. RTG [Plots: λ1 vs. #edges; λ_rank vs. rank]

  50. Experimental Results 2: Com2Cand vs. RTG [Plots: count vs. in-degree; entropy vs. resolution]
