1 / 59

Graph Mining, self-similarity and power laws

Graph Mining, self-similarity and power laws. Christos Faloutsos Carnegie Mellon University. Overview. Achievements global patterns and ‘laws’ (static/dynamic) generators influence propagation communities; graph partitioning local patterns: frequent subgraphs Challenges.

Download Presentation

Graph Mining, self-similarity and power laws

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Graph Mining, self-similarity and power laws Christos Faloutsos Carnegie Mellon University

  2. Overview • Achievements • global patterns and ‘laws’ (static/dynamic) • generators • influence propagation • communities; graph partitioning • local patterns: frequent subgraphs • Challenges C. Faloutsos

  3. Motivating questions: • How does the Internet look like? • What constitutes a ‘normal’ social network? • ‘network value’ of a customer? [Domingos+] • which gene/species affects the others the most? C. Faloutsos

  4. Problem #1 - topology How does the Internet look like? Any rules? C. Faloutsos

  5. -0.82 att.com log(degree) ibm.com log(rank) Solution#1: Rank exponent R • A1: Power law in the degree distribution [SIGCOMM99] internet domains C. Faloutsos

  6. Power laws • In- and out-degree distribution of web sites [Barabasi], [IBM-CLEVER] log(freq) from [Ravi Kumar, Prabhakar Raghavan, Sridhar Rajagopalan, Andrew Tomkins ] log indegree C. Faloutsos

  7. epinions.com count • who-trusts-whom [Richardson + Domingos, KDD 2001] (out) degree C. Faloutsos

  8. Web Site Traffic log(count) Zipf log(freq) Even more power laws: • web hit counts [w/ A. Montgomery] “yahoo.com” C. Faloutsos

  9. Overview • Achievements • global patterns and ‘laws’ (static/dynamic) • generators • influence propagation • communities; graph partitioning • local patterns: frequent subgraphs • Challenges C. Faloutsos

  10. Problem#2: evolution Given a graph: • how will it look like, next year? [from Lumeta: ISPs 6/1999] C. Faloutsos

  11. Evolution of diameter? • Prior analysis, on power-law-like graphs, hints that diameter ~ O(log(N)) or diameter ~ O( log(log(N))) • i.e.., slowly increasing with network size • Q: What is happening, in reality? C. Faloutsos

  12. Evolution of diameter? • Prior analysis, on power-law-like graphs, hints that diameter ~ O(log(N)) or diameter ~ O( log(log(N))) • i.e.., slowly increasing with network size • Q: What is happening, in reality? • A: It shrinks(!!), towards a constant value C. Faloutsos

  13. Shrinking diameter ArXiv physics papers and their citations [Leskovec+05a] C. Faloutsos

  14. Shrinking diameter ArXiv: who wrote what C. Faloutsos

  15. Shrinking diameter U.S. patents citing each other C. Faloutsos

  16. Shrinking diameter Autonomous systems C. Faloutsos

  17. Temporal evolution of graphs • N(t) nodes; E(t) edges at time t • suppose that N(t+1) = 2 * N(t) • Q: what is your guess for E(t+1) =? 2 * E(t) C. Faloutsos

  18. Temporal evolution of graphs • N(t) nodes; E(t) edges at time t • suppose that N(t+1) = 2 * N(t) • Q: what is your guess for E(t+1) =? 2 * E(t) • A: over-doubled! x C. Faloutsos

  19. E(t) N(t) Densification Power Law ArXiv: Physics papers and their citations 1.69 C. Faloutsos

  20. E(t) N(t) Densification Power Law U.S. Patents, citing each other 1.66 C. Faloutsos

  21. E(t) N(t) Densification Power Law Autonomous Systems 1.18 C. Faloutsos

  22. E(t) N(t) Densification Power Law ArXiv: who wrote what 1.15 C. Faloutsos

  23. Summary of ‘laws’ Static graphs • power law degrees; power law eigenvalues • communities (within communities) • small diameters Dynamic graphs • shrinking diameter • densification power law C. Faloutsos

  24. Overview • Achievements • global patterns and ‘laws’ (static/dynamic) • generators • influence propagation • communities; graph partitioning • local patterns: frequent subgraphs • Challenges C. Faloutsos

  25. Problem#3: Generators • Q: what local behavior can generate such graphs? C. Faloutsos

  26. Problem#3: Generators Q: what local behavior can generate such graphs? • A1: Preferential attachment [Barabasi+] • A2: ‘copying’ model [Kleinberg+] • A3: ‘forest-fire’ model [Leskovec+] • A4: Kronecker [Leskovec+] • A5: Economic reasons [Papadimitriou+] C. Faloutsos

  27. Problem#3: Generators Q: what local behavior can generate such graphs? • A1: Preferential attachment • A2: ‘copying’ model • A3: ‘forest-fire’ model • A4: Kronecker • A5: Economic reasons power law degree + communities + DPL + easily parallilizable C. Faloutsos

  28. Overview • Achievements • global patterns and ‘laws’ (static/dynamic) • generators • influence propagation • communities; graph partitioning • local patterns: frequent subgraphs • Challenges C. Faloutsos

  29. Problem#4: influence propagation • how do influence/rumors/viruses propagate? • what is the best customer to market to? C. Faloutsos

  30. Problem#4: influence propagation • how do influence/rumors/viruses propagate? • tipping point [Kleinberg+] • first eigenvalue -> epidemic threshold [Chakrabarti+] • what is the best customer to market to? • network value of a customer [Domingos+] C. Faloutsos

  31. Overview • Achievements • global patterns and ‘laws’ (static/dynamic) • generators • influence propagation • communities; graph partitioning • local patterns: frequent subgraphs • Challenges C. Faloutsos

  32. Problem #5: communities • how to find ‘natural’ communities, quickly? • how to find ‘strange’/suspicious/valuable edges? C. Faloutsos

  33. Problem #5: communities • how to find ‘natural’ communities, quickly? • network flow [Flake+] • node/edge betweeness (~ ‘stress’) • cross-associations [Chakrabarti+] • 2nd eigenvalue; METIS [Karypis+] • random walks [Newman] • etc etc etc C. Faloutsos

  34. Problem #5: communities • connection sub-graphs [Faloutsos+] C. Faloutsos

  35. Problem #5: communities • connection sub-graphs [Faloutsos+] C. Faloutsos

  36. Problem #5: communities • connection sub-graphs [Faloutsos+] • BANKS system [Chakrabarti+] • ObjectRank [Papakonstantinou+] C. Faloutsos

  37. Overview • Achievements • global patterns and ‘laws’ (static/dynamic) • generators • influence propagation • communities; graph partitioning • local patterns: frequent subgraphs • Challenges C. Faloutsos

  38. Problem #6: local patterns • Which sub-graphs are common/frequent? C C C C H N N C C H molecule #M molecule #1 C. Faloutsos

  39. Problem #6: local patterns • Which sub-graphs are common/frequent? C C C C H N N C C H molecule #M molecule #1 C. Faloutsos

  40. Problem #6: local patterns • Clever extensions of ‘Association Rules’ (= frequent itemsets / market basket analysis) • [Jiawei Han+] • [Jian Pei+] • [G. Karypis] • [M. Zaki+] • ... C. Faloutsos

  41. Overview • Achievements • Challenges • time evolving graphs • multi-graphs C. Faloutsos

  42. Time evolving graphs • Q: what will happen next? (eg., on a traffic matrix?) 1/1/2005 C. Faloutsos

  43. Time evolving graphs • Q: what will happen next? (eg., on a traffic matrix?) 1/2/2005 C. Faloutsos

  44. Time evolving graphs • Q: what will happen next? (eg., on a traffic matrix?) 1/3/2005 C. Faloutsos

  45. Time evolving graphs • Q: what will happen next? (eg., on a traffic matrix?) 1/4/2005 ? ? ? ? C. Faloutsos

  46. Overview • Achievements • Challenges • time evolving graphs • multi-graphs C. Faloutsos

  47. Multi-graphs • Patterns/outliers? friends co-authors C. Faloutsos

  48. Multi-graphs • Relational Learning • link prediction, feature extraction [Jensen+] etc • Dis-ambiguation, de-duplication [Getoor+] etc friends co-authors C. Faloutsos

  49. I x J x K Promising solution: tensors source person type of contact destination person C. Faloutsos

  50. K x R I x J x K C Ix R J x R B = A R x R x R +…+ Tensors: ~SVD, for >=3 modes ¼ Foil: from Tamara Kolda (Sandia) C. Faloutsos

More Related