DeltaCon: A Principled Massive-Graph Similarity Function
Danai Koutra, Joshua T. Vogelstein, Christos Faloutsos
SDM, 2-5 May 2013, Austin, Texas, USA
Problem Definition: Graph Similarity
• Given: (a) 2 graphs, GA and GB, with the same nodes and different edge sets; (b) node correspondence
• Find: similarity score s ∈ [0,1]
• s = 0: GA ≠ GB; s = 1: GA = GB
Danai Koutra (CMU)
Motivation (1)
• 1. Classification: different brain wiring?
• 2. Discontinuity detection [figure: Day 1, Day 2, Day 3, Day 4, Day 5]
Motivation (2)
• 3. Behavioral patterns: FB message graph vs. wall-to-wall network
• 4. Intrusion detection
Problem: Graph Similarity. Is there any obvious solution?
One Solution: Edge Overlap (EO)
• # of common edges of GA and GB (normalized or not)
… but “barbell”…
• EO(B10, mB10) == EO(B10, mmB10)
[figure: barbell graph GA compared with two modified versions, GB and GB’]
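The barbell objection can be checked with a minimal sketch. It assumes that B10 is a 10-node barbell (two 5-cliques plus one bridge edge) and that EO is normalized by the size of the edge union; the slide leaves both open, and the names `clique_edges` and `edge_overlap` are illustrative only.

```python
from itertools import combinations

def clique_edges(nodes):
    """All undirected edges of a clique over `nodes`, as sorted tuples."""
    return {tuple(sorted(e)) for e in combinations(nodes, 2)}

def edge_overlap(edges_a, edges_b):
    """Common edges divided by edges in either graph (one possible normalization)."""
    union = edges_a | edges_b
    return len(edges_a & edges_b) / len(union) if union else 1.0

# Assumed barbell on 10 nodes: two 5-cliques joined by the bridge (4, 5).
barbell = clique_edges(range(5)) | clique_edges(range(5, 10)) | {(4, 5)}
no_bridge = barbell - {(4, 5)}       # removing the bridge disconnects the graph
no_clique_edge = barbell - {(0, 1)}  # removing a clique edge keeps it connected

eo_bridge = edge_overlap(barbell, no_bridge)
eo_clique = edge_overlap(barbell, no_clique_edge)
print(eo_bridge, eo_clique)
```

Both scores come out identical because EO only counts edges, so it cannot tell a disconnecting change from a minor one.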
Contributions: Delta Connectivity
• Theory: axioms; desired properties
• Practice: DeltaCon algorithm; real-world applications; experiments on synthetic & real graphs
Roadmap: Intuition • Axioms & Properties • Proposed Algorithm: DeltaCon • Applications • Experiments • Related Work • Conclusions
Intuition (1)
• STEP 1: Compute the pairwise node influence, SA & SB.
[figure: graphs GA and GB with their influence matrices SA and SB]
Intuition (2)
• STEP 2: Find the similarity between SA & SB: sim(SA, SB) = 0.3
[figure: influence matrices SA and SB]
… many similarity functions can be defined … but what properties should a good similarity function have?
Axioms
• A1. Identity property: sim(GA, GA) = 1
• A2. Symmetric property: sim(GA, GB) = sim(GB, GA)
• A3. Zero property: sim( , ) = 0 [graph pictures not recovered]
Desired Properties (1)
• Intuitiveness: P1. Edge Importance; P2. Weight Awareness; P3. Edge-“Submodularity”; P4. Focus Awareness
• Scalability
Desired Properties (2)
• P1. Edge Importance: creation of disconnected components matters more than small connectivity changes.
Desired Properties (3)
• P2. Weight Awareness: the bigger the edge weight, the more the edge change matters.
[figure: removed edge with w=1 vs. removed edge with w=5]
Desired Properties (4)
• P3. Edge-“Submodularity” (“diminishing returns”): the sparser the graphs, the more important a “fixed” change is.
[figure: two pairs of graphs GA, GB with n=5]
Desired Properties (5)
• P4. Focus Awareness: targeted changes are more important than random changes of the same extent.
[figure: GA with random changes (GB) vs. targeted changes (GB’)]
How do state-of-the-art methods fare?
[table: methods vs. the properties edge importance, weight awareness, diminishing returns, focus awareness; cell values not recovered]
Roadmap: Intuition • Axioms & Properties • Proposed Algorithm: DeltaCon • Experiments • Applications • Related Work • Conclusions
Proposed algorithm: DeltaCon0 (base algorithm)
• Find the pairwise node influence, SA & SB.
STEP 1: How to compute node influence?
• A1: PageRank
• A2: Personalized Random Walk with Restart (RWR)
• A3: Lazy RWR
• A4: “Electrical network analogy” (resistances)
• A5: Belief Propagation (FaBP)
• …
STEP 1: Intuition of BP (background)
• Iterative message-based method
[figure: messages spreading over Iteration 1 and Iteration 2, e.g., from a “CS person” node]
STEP 1: Fast BP (1) (background)
• [I + ε^2 D − εA] · si = ei, where A is the adjacency matrix, D = diag(d1, d2, d3) holds the node degrees, and ei has a 1 in the i-th row
• ε: strength of influence between neighbors
• si: final influence from node i
• Similar to RWR
STEP 1: Fast BP (2)
• OR, in matrix form, the pairwise influence matrix, e.g.:
  1    0.2  0.1
  0.3  1    0.2
  0    0.5  1
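A minimal sketch of how a pairwise influence matrix can be computed. It assumes the FaBP form S = (I + ε²D − εA)⁻¹ as given in the published DeltaCon paper, a dense adjacency matrix with a zero diagonal, and an arbitrary ε = 0.1; the function name and the toy path graph are illustrative only.

```python
def fabp_influence(adj, eps):
    """Return S = (I + eps^2 * D - eps * A)^(-1) via Gauss-Jordan elimination."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    # Augmented matrix [M | I], with M = I + eps^2 * D - eps * A.
    aug = []
    for i in range(n):
        row = [-eps * adj[i][j] for j in range(n)]
        row[i] += 1.0 + eps * eps * deg[i]
        aug.append(row + [1.0 if j == i else 0.0 for j in range(n)])
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Hypothetical toy input: path graph 0 - 1 - 2.
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
S = fabp_influence(A, eps=0.1)
for row in S:
    print([round(x, 4) for x in row])
```

Dense inversion is only practical for small graphs; it is shown here because it makes the definition explicit.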
STEP 1: Why FaBP?
• Sound theoretical background (MLE on marginals)
• Fast: linear in the number of edges
• Attenuating neighboring influence: for small ε (0 < ε < 1), 1-hop influence ε > 2-hop influence ε^2 > …
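The attenuation and the per-edge cost can be illustrated with a small sketch. It assumes the FaBP system (I + ε²D − εA)s = e_i from the DeltaCon paper and solves it by a plain fixed-point iteration, which is a stand-in chosen for brevity rather than the authors' solver; `fabp_column`, the path graph, and ε = 0.1 are illustrative.

```python
def fabp_column(neighbors, source, eps, iters=50):
    """Approximate s = (I + eps^2 D - eps A)^(-1) e_source by fixed-point iteration.

    Each sweep visits every adjacency-list entry once, so its cost is linear
    in the number of edges.
    """
    n = len(neighbors)
    s = [0.0] * n
    for _ in range(iters):
        new = []
        for v in range(n):
            acc = 1.0 if v == source else 0.0
            acc += eps * sum(s[u] for u in neighbors[v])
            acc -= eps * eps * len(neighbors[v]) * s[v]
            new.append(acc)
        s = new
    return s

# Hypothetical toy input: path graph 0 - 1 - 2 - 3 - 4 as adjacency lists.
path = [[1], [0, 2], [1, 3], [2, 4], [3]]
influence = fabp_column(path, source=0, eps=0.1)
print([f"{x:.1e}" for x in influence])
```

On this path, the influence of node 0 shrinks by roughly a factor of ε with every extra hop. The iteration converges here because ε times the maximum degree is well below 1.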
Proposed algorithm: DeltaCon0 (base algorithm)
• Apply FaBP to find the pairwise influence matrices, SA & SB.
• Find the distance: d(SA, SB) = Matusita distance.
• Find the similarity. [formula images not recovered]
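The three steps of the base algorithm can be sketched end to end. The formulas are taken from the published DeltaCon paper rather than from the slide text: the Matusita (root Euclidean) distance d = sqrt(Σ (√sA,ij − √sB,ij)²) and the mapping sim = 1 / (1 + d). The value ε = 0.1, the helper names, and the 8-node barbell are assumptions made for illustration.

```python
import math

def influence_matrix(adj, eps):
    """S = (I + eps^2 D - eps A)^(-1), computed by Gauss-Jordan elimination."""
    n = len(adj)
    aug = []
    for i in range(n):
        row = [-eps * adj[i][j] for j in range(n)]
        row[i] += 1.0 + eps * eps * sum(adj[i])
        aug.append(row + [float(i == j) for j in range(n)])
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [x / piv for x in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def matusita(sa, sb):
    """Root Euclidean distance: sqrt(sum (sqrt(a) - sqrt(b))^2)."""
    total = 0.0
    for ra, rb in zip(sa, sb):
        for a, b in zip(ra, rb):
            # Clamp tiny negative round-off before taking square roots.
            total += (math.sqrt(max(a, 0.0)) - math.sqrt(max(b, 0.0))) ** 2
    return math.sqrt(total)

def deltacon0(adj_a, adj_b, eps=0.1):
    """Similarity in (0, 1]: 1 / (1 + Matusita distance of the influence matrices)."""
    d = matusita(influence_matrix(adj_a, eps), influence_matrix(adj_b, eps))
    return 1.0 / (1.0 + d)

def barbell(k, drop=None):
    """Two k-cliques joined by the bridge (k-1, k); optionally drop one edge."""
    n = 2 * k
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            same_side = (i < k) == (j < k)
            if i != j and (same_side or {i, j} == {k - 1, k}):
                adj[i][j] = 1
    if drop:
        u, v = drop
        adj[u][v] = adj[v][u] = 0
    return adj

full = barbell(4)
sim_bridge = deltacon0(full, barbell(4, drop=(3, 4)))  # bridge removed
sim_clique = deltacon0(full, barbell(4, drop=(0, 1)))  # clique edge removed
print(deltacon0(full, full), sim_bridge, sim_clique)
```

Unlike edge overlap, this score separates the two barbell changes: removing the bridge yields the lower similarity, in line with the edge-importance property.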
… but O(n^2) … faster?
Proposed Algorithm: DeltaCon, step 1 (1) (faster algorithm)
• 1a. Create g disjoint & covering node groups.
[figure: 4-node graph (nodes 1, 2, 3, 4) and its adjacency matrix A]
Proposed Algorithm: DeltaCon, step 1 (2) (faster algorithm)
• 1a. Create g disjoint & covering node groups.
• 1b. For group i, find the node-group influence (FaBP).
Proposed Algorithm: DeltaCon, step 1 (3) (intuition)
• 1b. E.g., for group 1, find the node-group influence (FaBP), row-wise.
[figure: SA over nodes 1, 2, 3, 4 and S’A over groups]
Proposed Algorithm: DeltaCon, step 1 (4) (faster algorithm)
• 1a. Create g disjoint & covering node groups.
• 1b. For group i, find the node-group influence (FaBP).
• 1c. Create the node-group influence matrices, S’A & S’B.
[figure: S’A and S’B, nodes 1, 2, 3, 4 vs. groups]
Proposed Algorithm: DeltaCon (5) (faster algorithm)
• 1a. Create g disjoint & covering node groups.
• 1b. For group i, find the node-group influence (FaBP).
• 1c. Create the node-group influence matrices, S’A & S’B.
• Steps 2 & 3 as before.
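Steps 1a to 1c followed by steps 2 and 3 can be sketched as below. The sketch assumes the FaBP system (I + ε²D − εA)x = Σ e_v over the nodes v of a group, the Matusita distance, and sim = 1 / (1 + d), all as in the published DeltaCon paper. The random partition with a fixed seed, the fixed-point solver, ε = 0.1, and the toy graphs are illustrative choices, not the authors' implementation.

```python
import math
import random

def solve_fabp(neighbors, rhs, eps, iters=100):
    """Fixed-point solve of (I + eps^2 D - eps A) x = rhs; one sweep is O(edges)."""
    x = [0.0] * len(rhs)
    for _ in range(iters):
        x = [rhs[v] + eps * sum(x[u] for u in neighbors[v])
             - eps * eps * len(neighbors[v]) * x[v]
             for v in range(len(rhs))]
    return x

def group_influence(neighbors, groups, eps):
    """Columns of S': influence of each node group on every node (steps 1b, 1c)."""
    n = len(neighbors)
    cols = []
    for members in groups:
        rhs = [0.0] * n
        for v in members:
            rhs[v] = 1.0  # sum of the indicator vectors e_v of the group
        cols.append(solve_fabp(neighbors, rhs, eps))
    return cols

def deltacon(neigh_a, neigh_b, g, eps=0.1, seed=0):
    n = len(neigh_a)
    order = list(range(n))
    random.Random(seed).shuffle(order)
    # Step 1a: g disjoint groups that cover all nodes, shared by both graphs.
    groups = [order[k::g] for k in range(g)]
    sa = group_influence(neigh_a, groups, eps)
    sb = group_influence(neigh_b, groups, eps)
    # Steps 2 and 3: Matusita distance, then similarity.
    d = math.sqrt(sum((math.sqrt(max(a, 0.0)) - math.sqrt(max(b, 0.0))) ** 2
                      for ca, cb in zip(sa, sb) for a, b in zip(ca, cb)))
    return 1.0 / (1.0 + d)

# Hypothetical toy input: a 6-cycle, and the same cycle with edge (0, 5) removed.
cycle = [[1, 5], [0, 2], [1, 3], [2, 4], [3, 5], [4, 0]]
broken = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]
print(deltacon(cycle, cycle, g=2), deltacon(cycle, broken, g=2))
```

Only g linear systems are solved per graph instead of n, which is where the saving over the base algorithm comes from.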
Roadmap: Intuition • Axioms & Properties • Proposed Algorithm: DeltaCon • Applications (ENRON: anomaly detection; Brain Graphs: clustering) • Experiments • Conclusions
Temporal Anomaly Detection in ENRON (1)
• Nodes: employees
• Edges: email exchange
• DeltaCon similarities of consecutive timestamps: sim1 (Day 1, Day 2), sim2 (Day 2, Day 3), sim3 (Day 3, Day 4), sim4 (Day 4, Day 5)
Temporal Anomaly Detection in ENRON (2)
[plot: IMR; x-axis: consecutive days; y-axis: similarity]
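One way to turn a series of consecutive-day similarities into anomaly flags is a lower control limit. This sketch assumes that “IMR” refers to an individuals/moving-range control chart, which the slide text does not confirm. The scores are made-up numbers, not ENRON results, and the multiplier k = 2 is an arbitrary choice.

```python
from statistics import mean

def lower_control_limit(values, k=2.0):
    """Individuals/moving-range chart: mean - k * (average moving range / 1.128)."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    sigma = mean(moving_ranges) / 1.128  # d2 constant for ranges of 2 points
    return mean(values) - k * sigma

# Hypothetical similarity scores between consecutive days (not ENRON data).
sims = [0.91, 0.90, 0.92, 0.89, 0.91, 0.40, 0.90, 0.92]
lcl = lower_control_limit(sims)
flagged = [i for i, s in enumerate(sims) if s < lcl]
print(lcl, flagged)
```

A flagged index i marks the transition between two consecutive snapshots whose similarity dropped well below the usual level.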
Roadmap: Intuition • Axioms & Properties • Proposed Algorithm: DeltaCon • Applications (ENRON: anomaly detection; Brain Graphs: clustering) • Experiments • Related Work • Conclusions
Brain Connectivity Graph Clustering (1)
• 114 aligned connectomes (FMRI)
• Nodes: 70 cortical regions
• Edges: connections
• Attributes: gender, IQ, age…
Brain Connectivity Graph Clustering (2)
• Pairwise DeltaCon similarities
• Hierarchical clustering (Ward’s linkage)
• t-test / ANOVA for given attributes
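The clustering step can be sketched with Ward's linkage written as a Lance-Williams update. The similarity matrix below is hypothetical, and converting similarity to distance as 1 − sim is an assumption; Ward's criterion is strictly defined for Euclidean distances, so applying it to other dissimilarities is a heuristic. The attribute tests (t-test / ANOVA) are not covered.

```python
def ward_clusters(dist, num_clusters):
    """Agglomerative clustering with Ward's linkage (Lance-Williams update).

    `dist` is a symmetric matrix of pairwise distances; returns sorted clusters.
    """
    clusters = {i: [i] for i in range(len(dist))}
    # Work on squared distances, for which the Ward update is stated.
    d2 = {(i, j): dist[i][j] ** 2
          for i in clusters for j in clusters if i < j}
    next_id = len(dist)
    while len(clusters) > num_clusters:
        (i, j), dij = min(d2.items(), key=lambda kv: kv[1])
        ni, nj = len(clusters[i]), len(clusters[j])
        merged = clusters.pop(i) + clusters.pop(j)
        for k, members in clusters.items():
            nk = len(members)
            dik = d2.pop((min(i, k), max(i, k)))
            djk = d2.pop((min(j, k), max(j, k)))
            d2[(k, next_id)] = ((ni + nk) * dik + (nj + nk) * djk
                                - nk * dij) / (ni + nj + nk)
        del d2[(i, j)]
        clusters[next_id] = merged
        next_id += 1
    return sorted(sorted(c) for c in clusters.values())

# Hypothetical pairwise similarities for 4 graphs (not the connectome data).
sim = [[1.00, 0.90, 0.30, 0.35],
       [0.90, 1.00, 0.40, 0.30],
       [0.30, 0.40, 1.00, 0.85],
       [0.35, 0.30, 0.85, 1.00]]
dist = [[1.0 - s for s in row] for row in sim]
groups = ward_clusters(dist, num_clusters=2)
print(groups)
```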
Brain Connectivity Graph Clustering (3)
• t-test / ANOVA for given attributes: p-value = 0.0057
[figure: clusters labeled High CCI and Low CCI]
Roadmap: Intuition • Axioms & Properties • Proposed Algorithm: DeltaCon • Applications • Experiments (Scalability) • Conclusions
Scalability
• Dataset: Kronecker graphs
• DeltaCon is linear in the number of edges and groups: O(g×n + g×(m1+m2)), where n is the # of nodes and m1, m2 are the # of edges in GA & GB.
[plot: runtime (min) vs. # of edges = max{m1, m2}; slope = 1]