Fast algorithm for detecting community structure in networks M. E. J. Newman

Presentation Transcript


  1. Fast algorithm for detecting community structure in networks M. E. J. Newman Department of Physics and Center for the Study of Complex Systems, University of Michigan Advanced Topics in on-line Social Network Analysis - 2014\Spring Fast algorithm for detecting community structure in networks (M. E. J. Newman)

  2. Sub-topics for today • A little step back... • Background and motivation • The Algorithm presented • The good, the bad, the ugly (advantages and drawbacks discussion) • Applications • Summary

  3. A little step back... • The edge betweenness of an edge is the number of shortest paths between pairs of nodes that run along it. [Figure: example graph with nodes labeled 0 to 5]
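The definition above can be turned into a direct (and deliberately slow) computation. The following is a minimal sketch, assuming an undirected, unweighted graph stored as a dict of neighbour sets, and assuming that when a pair of nodes has several shortest paths each path carries an equal fraction of the count. The function names and the path graph used as an example are illustrative, not taken from the slides.

```python
from collections import deque
from itertools import combinations

def bfs_counts(adj, s):
    """Return (dist, sigma): distance from s and number of shortest paths from s."""
    dist, sigma = {s: 0}, {s: 1}
    queue = deque([s])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:              # first time we reach w
                dist[w] = dist[v] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[v] + 1:     # v precedes w on a shortest path
                sigma[w] += sigma[v]
    return dist, sigma

def edge_betweenness(adj):
    """Count, for every edge, the shortest paths between node pairs that use it."""
    info = {s: bfs_counts(adj, s) for s in adj}
    bet = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s, t in combinations(adj, 2):
        ds, sig_s = info[s]
        dt, sig_t = info[t]
        if t not in ds:                    # s and t are not connected
            continue
        for u in adj:
            for v in adj[u]:
                # the step u -> v lies on a shortest s-t path
                if u in ds and v in dt and ds[u] + 1 + dt[v] == ds[t]:
                    bet[frozenset((u, v))] += sig_s[u] * sig_t[v] / sig_s[t]
    return bet

# Hypothetical example: the path graph 0 - 1 - 2 - 3
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(edge_betweenness(path)[frozenset((1, 2))])   # 4.0: pairs (0,2), (0,3), (1,2), (1,3)
```

The middle edge of the path scores highest because every pair with one node on each side must cross it, which is the property the Girvan and Newman algorithm exploits.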

  4. A little step back... • Quality function Q: • The fraction of within-community edges minus the expected value of the same quantity for a randomized network (one in which edges fall at random with no regard to community structure).
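The slide gives Q only in words. A common way to write it, and the one assumed in this sketch, is Q = sum over communities i of (e_ii - a_i^2), where e_ii is the fraction of edges inside community i and a_i is the fraction of edge ends attached to it. The function name, the edge-list input format and the example graph are illustrative assumptions.

```python
def modularity(edges, community):
    """Q = sum_i (e_ii - a_i^2) for an undirected edge list.

    edges: list of (u, v) pairs; community: dict mapping node -> community label.
    """
    m = len(edges)
    e_in = {}   # e_ii: fraction of edges with both ends in community i
    a = {}      # a_i: fraction of edge ends attached to community i
    for u, v in edges:
        cu, cv = community[u], community[v]
        if cu == cv:
            e_in[cu] = e_in.get(cu, 0.0) + 1.0 / m
        a[cu] = a.get(cu, 0.0) + 0.5 / m
        a[cv] = a.get(cv, 0.0) + 0.5 / m
    return sum(e_in.get(c, 0.0) - a[c] ** 2 for c in a)

# Hypothetical example: two triangles joined by a single edge (2, 3)
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
split = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
print(modularity(edges, split))                    # 5/14, about 0.357
print(modularity(edges, {n: "A" for n in split}))  # about 0: one big community
```

Putting every node in a single community gives Q = 0, so a positive Q signals more within-community edges than chance would produce.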

  5. Background and motivation • Community structure in networks is of increasing interest. • Networks tend to divide into tightly-knit groups: • Edges within groups? Many. • Edges between groups? A lot fewer. • Enter the Girvan and Newman algorithm.

  6. The Girvan and Newman Algorithm • Step 1: The betweenness of all existing edges in the network is calculated. • Step 2: The edge with the highest betweenness is removed. • Step 3: The betweenness of all edges affected by the removal is recalculated. • Step 4: Steps 2 and 3 are repeated until no edges remain.
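The steps above can be sketched as follows. This is an illustrative simplification, not the authors' code: it recomputes the betweenness of every edge after each removal rather than only the affected ones, it uses a Brandes-style accumulation for the betweenness, and it assumes a simple undirected graph with no self-loops. All function names and the example graph are hypothetical.

```python
from collections import deque

def edge_betweenness(adj):
    """Brandes-style edge betweenness for an undirected, unweighted graph."""
    bet = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        dist, sigma, preds = {s: 0}, {s: 1.0}, {s: []}
        order, queue = [], deque([s])
        while queue:                       # BFS that counts shortest paths
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w], sigma[w], preds[w] = dist[v] + 1, 0.0, []
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in order}
        for w in reversed(order):          # push path counts back toward s
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                bet[frozenset((v, w))] += share
                delta[v] += share
    return {e: b / 2.0 for e, b in bet.items()}   # each pair was seen twice

def components(adj):
    """Connected components as a list of frozensets."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = set(), [s]
        seen.add(s)
        while stack:
            v = stack.pop()
            comp.add(v)
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        comps.append(frozenset(comp))
    return comps

def girvan_newman(edges):
    """Return the successive partitions (levels of the dendrogram, top down)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    levels = [components(adj)]
    while any(adj.values()):               # until no edges remain
        bet = edge_betweenness(adj)        # steps 1 and 3 (full recomputation)
        u, v = tuple(max(bet, key=bet.get))
        adj[u].discard(v)                  # step 2: remove the top edge
        adj[v].discard(u)
        comps = components(adj)
        if len(comps) > len(levels[-1]):   # the removal split a community
            levels.append(comps)
    return levels

# Hypothetical example: two triangles joined by the edge (2, 3)
levels = girvan_newman([(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)])
print(levels[1])   # the first split separates the two triangles
```

Each entry of `levels` is one horizontal cut of the dendrogram; choosing which cut to keep is where a quality measure such as Q comes in.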

  7. The Girvan and Newman Algorithm [Figure: example network with edge-betweenness values on the edges, and the resulting DENDROGRAM over nodes 0 to 9] As we move down the tree, we see the partitioning of groups.

  8. Background and motivation • The G&N algorithm runs in worst-case O(m^2 n) time, or O(n^3) on a sparse graph (m edges, n nodes). • This limits us to networks with only thousands of nodes. • Skype: 300 million users. • WhatsApp: 450 million users. • Twitter: 243 million active users (monthly). • Facebook: 1.23 billion (!!!) users. • So obviously, we need to find a better solution.

  9. The Algorithm presented • The quality function "Q" presented earlier indicates whether a division is meaningful. • Why not use it? Optimize Q over all possible divisions and find the best one! • The problem is that doing this in a straightforward manner takes an exponential amount of time. • A possible solution is a greedy implementation.

  10. The Algorithm presented • Initially, each of the n nodes is the sole member of its own community. • We join communities together in pairs iteratively. • On each step, we choose the join that gives the largest increase (or smallest decrease) in Q.
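The greedy procedure above can be sketched in a few lines. This is a naive illustration under stated assumptions, not the optimized implementation: it assumes a simple undirected graph with no self-loops, it only considers joining pairs of communities that share at least one edge, it scores each join with ∆Q = 2(e_ij - a_i a_j), and it returns the partition with the highest Q seen along the way. The function name and data layout are hypothetical.

```python
def greedy_modularity(edges):
    """Agglomerative community detection by greedy maximization of Q."""
    m = len(edges)
    nodes = sorted({x for edge in edges for x in edge})
    index = {n: i for i, n in enumerate(nodes)}
    members = {i: {n} for i, n in enumerate(nodes)}   # start: one node each
    e = {i: {} for i in members}   # e[i][j]: half the fraction of edges between i and j
    a = {i: 0.0 for i in members}  # a[i]: fraction of edge ends in community i
    for u, v in edges:
        i, j = index[u], index[v]
        a[i] += 0.5 / m
        a[j] += 0.5 / m
        e[i][j] = e[i].get(j, 0.0) + 0.5 / m
        e[j][i] = e[j].get(i, 0.0) + 0.5 / m
    q = -sum(x * x for x in a.values())               # singletons: every e_ii is 0
    best_q, best_parts = q, [set(c) for c in members.values()]
    while len(members) > 1:
        best = None
        for i in e:                                   # scan all connected pairs
            for j, eij in e[i].items():
                if i < j:
                    dq = 2.0 * (eij - a[i] * a[j])
                    if best is None or dq > best[0]:
                        best = (dq, i, j)
        if best is None:                              # no connected pairs left
            break
        dq, i, j = best
        for k, ejk in e.pop(j).items():               # merge community j into i
            del e[k][j]
            if k != i:
                e[i][k] = e[i].get(k, 0.0) + ejk
                e[k][i] = e[i][k]
        a[i] += a.pop(j)
        members[i] |= members.pop(j)
        q += dq
        if q > best_q:
            best_q, best_parts = q, [set(c) for c in members.values()]
    return best_q, best_parts

# Hypothetical example: two triangles joined by the edge (2, 3)
q, parts = greedy_modularity([(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)])
print(q, parts)   # Q is 5/14 (about 0.357) for the two-triangle split
```

Because the joins continue until one community remains, the sequence of merges is itself a dendrogram; keeping the highest-Q level is one reasonable way to pick a single partition from it.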

  11. The Algorithm presented • Change in Q when communities i and j are joined: $\Delta Q = e_{ij} + e_{ji} - 2 a_i a_j = 2(e_{ij} - a_i a_j)$ • Example on four nodes A, B, C, D: • Start with singleton communities (A=1, B=2, C=3, D=4). • First join (4 choose 2 = 6 options): the best is 1∪2, giving (A,B=1, C=2, D=3). • Second join (3 choose 2 = 3 options): the best is 2∪3, giving (A,B=1, C,D=2). • Any further join has a negative ∆Q. [Figure: two drawings of the four-node example graph A, B, C, D]
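The ∆Q formula follows in a few lines if we assume the usual definition Q = Σ_i (e_ii − a_i²), where e_ij is half the fraction of edges running between communities i and j (so e_ij = e_ji), e_ii is the fraction of edges inside community i, and a_i = Σ_j e_ij. These definitions are an assumption here, since the slide shows only the result. Joining i and j into one community k leaves every other term of the sum unchanged:

```latex
\begin{aligned}
e_{kk} &= e_{ii} + e_{jj} + e_{ij} + e_{ji}, \qquad a_k = a_i + a_j, \\
\Delta Q &= \left(e_{kk} - a_k^2\right) - \left(e_{ii} - a_i^2\right) - \left(e_{jj} - a_j^2\right) \\
         &= e_{ij} + e_{ji} + a_i^2 + a_j^2 - (a_i + a_j)^2 \\
         &= e_{ij} + e_{ji} - 2 a_i a_j \\
         &= 2\left(e_{ij} - a_i a_j\right) \quad \text{since } e_{ij} = e_{ji}.
\end{aligned}
```

One consequence is that joining two communities with no edges between them (e_ij = 0) can never increase Q, so only connected pairs need to be examined.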

  12. The Algorithm presented • Adjacency matrix of the example network:

          0 1 2 3 4 5 6 7 8 9
        0 0 1 1 0 0 0 0 0 0 0
        1 1 0 1 0 0 0 0 0 0 0
        2 1 1 0 0 1 0 1 0 0 0
        3 0 0 0 0 1 1 0 0 0 0
        4 0 0 1 1 0 1 0 0 0 0
        5 0 0 0 1 1 0 0 0 0 0
        6 0 0 1 0 0 0 0 1 0 0
        7 0 0 0 0 0 0 1 0 1 1
        8 0 0 0 0 0 0 0 1 0 1
        9 0 0 0 0 0 0 0 1 1 0

      [Figure: the example network with nodes 0 to 9, and the DENDROGRAM over leaves 0 to 9]
      As the algorithm* iterates, we get a partition of the graph.
      * Algorithm implementation from: http://www.elemartelot.org/ (Erwan Le Martelot)

  13. The Algorithm presented • Operates on completely different principles than the G&N algorithm. • Agglomerative. • Runs in worst-case O((m+n)n) time, or O(n^2) on sparse graphs. • Completes in a reasonable time on a network with a million vertices.
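To get a feel for what the two bounds mean, the sketch below simply plugs numbers into the worst-case expressions O(m^2 n) for G&N and O((m+n)n) for the new algorithm. It ignores constant factors and is not a runtime prediction; the network sizes are made-up values chosen for illustration.

```python
def gn_cost(n, m):
    """Worst-case operation count for G&N, up to a constant: m^2 * n."""
    return m * m * n

def greedy_cost(n, m):
    """Worst-case operation count for the greedy algorithm, up to a constant: (m + n) * n."""
    return (m + n) * n

# Hypothetical sparse networks with twice as many edges as nodes
for n in (1_000, 100_000, 1_000_000):
    m = 2 * n
    print(n, gn_cost(n, m) // greedy_cost(n, m))
```

For a fixed edge-to-node ratio, the ratio of the two bounds grows linearly with n, which is why the gap becomes decisive on large networks.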

  14. Advantages and Drawbacks • Gives generally good divisions. • In practice, typically a lot faster than G&N. • THOUSANDS OF TIMES FASTER THAN G&N. • Usually not better than G&N at correctly identifying communities. • Why? Because our algorithm makes decisions based on local information, while G&N actively analyzes the entire network.

  15. Applications • Random graphs of n=128 vertices divided into 4 groups of 32, with varying average numbers of within-group edges (Zin) and between-group edges (Zout) per vertex, where Zin+Zout=16. • G&N generally performs better, although usually only by a ~1% difference in correctly identified vertices. At high Zin, the new algorithm wins.
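A test graph of this kind can be generated with a short script. The sketch below assumes each pair of vertices is linked independently, with probability Zin/31 inside a group and Zout/96 across groups, so that the expected per-vertex counts match Zin and Zout. That construction, the function name and the seed parameter are assumptions for illustration; the slide does not say how the graphs were built.

```python
import random
from itertools import combinations

def benchmark_graph(z_in, z_out, n=128, groups=4, seed=0):
    """Random graph with planted groups; returns (edge list, node -> group)."""
    rng = random.Random(seed)
    size = n // groups                         # 32 vertices per group
    group = {v: v // size for v in range(n)}
    p_in = z_in / (size - 1)                   # 31 possible partners inside the group
    p_out = z_out / (n - size)                 # 96 possible partners outside it
    edges = []
    for u, v in combinations(range(n), 2):
        p = p_in if group[u] == group[v] else p_out
        if rng.random() < p:
            edges.append((u, v))
    return edges, group

# Hypothetical setting: Zin = 10, Zout = 6 (so Zin + Zout = 16)
edges, group = benchmark_graph(10, 6)
print(2 * len(edges) / len(group))             # average degree, close to 16
```

Feeding `edges` to a community-detection routine and comparing its output with `group` gives the fraction of correctly identified vertices that the slide refers to.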

  16. Applications • Real-world networks • Zachary Karate Club. • Similar performance to G&N. • American college football teams. • G&N wins on points for accuracy. • New algorithm is faster. • Collaboration between physicists. • New algorithm wins by knockout on speed. • 42 minutes vs. an estimated 3-5 years. • Results correlate with human observation.

  18. Summary • The new algorithm is faster and pretty accurate, although not as accurate as G&N. • Allows us to study much larger systems than previously possible. • For smaller networks, use G&N. For larger networks, use the new algorithm. • As you'll see in the next presentation, there is always room for improvement.

  19. THANK YOU!
