
Community Structure in Large Complex Networks

Liaoruo Wang and John E. Hopcroft, Dept. of Computer Engineering & Computer Science, Cornell University. In Proc. 7th Annual Conference on Theory and Applications of Models of Computation (TAMC), June 2010. Presented by Nam Nguyen.



Presentation Transcript


  1. Community Structure in Large Complex Networks Liaoruo Wang and John E. Hopcroft Dept. of Computer Engineering & Computer Science, Cornell University In Proc. 7th Annual Conference on Theory and Applications of Models of Computation (TAMC), June 2010 Presented by Nam Nguyen

  2. Agenda • Motivation • Introduction • Contributions of the paper • Definitions • WHISKER is NP-Complete. • Algorithms.

  3. Motivation • Community structure (C.S.) is a classical but still active topic in complex networks. • Previous studies: Communities were assumed to be densely connected inside but sparsely connected to the outside. • A different point of view: We should disregard “whiskers” and focus on the “cores” of the network.

  4. Introduction • Roughly speaking • Whiskers: Subsets of vertices that are barely connected to the rest of the network. • Cores: Connected subgraphs that are densely connected inside and well connected to the rest of the network, i.e., “real communities” • Why??? • In real-world societies, communities are also well connected to the rest of the network. • Imagine a close-knit community, the CISE Dept., with only one connection to the outside world. • Definitions follow right away.

  5. Contributions • More concrete definitions of “whiskers” and “cores” in a network. • WHISKER is NP-Complete • Three heuristic algorithms for finding approximate cores. • Simulation results.

  6. Definition • G = (V, E) is an undirected graph with adjacency matrix A = (A_{ij}). For S ⊆ V, let S^C = V \ S. • Conductance of S • A suitable cut
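
The conductance formula itself did not survive in this transcript. As a rough illustration only, the sketch below uses the standard textbook definition, cut weight divided by the smaller of the two volumes, which is an assumption here and may differ in detail from the paper's. The function name `conductance` and the example graph are hypothetical.

```python
def conductance(A, S):
    """Conductance of vertex subset S for a symmetric adjacency matrix A.

    Assumed (standard) definition, not taken from the slides:
    cut(S, S^C) / min(vol(S), vol(S^C)), where vol(X) is the sum of
    degrees (row sums of A) over the vertices in X.
    """
    n = len(A)
    S = set(S)
    Sc = set(range(n)) - S
    cut = sum(A[i][j] for i in S for j in Sc)
    vol_S = sum(A[i][j] for i in S for j in range(n))
    vol_Sc = sum(A[i][j] for i in Sc for j in range(n))
    denom = min(vol_S, vol_Sc)
    if denom == 0:
        raise ValueError("conductance is undefined when one side has zero volume")
    return cut / denom


# Hypothetical example: two triangles joined by the single edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = [[0] * 6 for _ in range(6)]
for u, v in edges:
    A[u][v] = A[v][u] = 1

phi = conductance(A, {0, 1, 2})  # cut = 1, both volumes = 7
```

A low value such as this one is what the "densely connected inside, sparsely connected outside" view rewards; the talk argues that such sets can be whiskers rather than real communities.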

  7. Definition(cont’d) • A k-whisker • A maximal k-whisker

  8. Definition (cont’d) • A whisker • A maximal whisker

  9. Definition (cont’d) • A core

  10. Lemmas Proof The only suitable cut has size 26 > |S ∪ T| = 25

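  11. Lemmas (cont’d) Proof • (1a) $e_{xr} + e_{xz} + e_{yr} + e_{yz} \le v_x + v_y$ • (1b) $e_{yr} + e_{xy} + e_{zr} + e_{xz} \le v_y + v_z$ • (1c) $e_{xr} + e_{yr} + e_{zr} > v_x + v_z$ • Adding (1a) and (1b) and then applying (1c) gives $e_{xr} + 2e_{yr} + e_{zr} + e_{xy} + e_{yz} + 2e_{xz} \le v_x + 2v_y + v_z < e_{xr} + e_{yr} + e_{zr} + v_y$ • Cancelling $e_{xr} + e_{yr} + e_{zr}$ and dropping the nonnegative term $2e_{xz}$ ⇒ $e_{yr} + e_{xy} + e_{yz} < v_y$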

  12. NP-Completeness • NAE-3-SAT: The problem of determining whether there exists a truth assignment for a 3-CNF Boolean formula such that each clause has at least one true literal and at least one false literal. Fact: NAE-3-SAT is NP-Complete [1] • WHISKER: Given an unweighted undirected graph, determine whether it contains a whisker. WHISKER is NP-Complete (by a reduction from NAE-3-SAT)
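
To make the not-all-equal condition concrete, here is a minimal brute-force sketch. It is exponential in the number of variables and is meant only to illustrate the definition, not the reduction. The signed-integer literal encoding (`3` for x3, `-3` for ¬x3) and the function names are assumptions made for this example.

```python
from itertools import product


def nae_satisfied(clause, assignment):
    """True if the clause has at least one true and at least one false literal."""
    values = [assignment[abs(lit)] if lit > 0 else not assignment[abs(lit)]
              for lit in clause]
    return any(values) and not all(values)


def solve_nae_3sat(n_vars, clauses):
    """Return a not-all-equal satisfying assignment as a dict, or None.

    Tries all 2**n_vars assignments, so it is only usable on tiny instances.
    """
    for bits in product([False, True], repeat=n_vars):
        assignment = {i + 1: bit for i, bit in enumerate(bits)}
        if all(nae_satisfied(clause, assignment) for clause in clauses):
            return assignment
    return None


# A single clause (x1 v x2 v x3) is NAE-satisfiable.
easy = solve_nae_3sat(3, [(1, 2, 3)])

# These four clauses cover every sign pattern of (x2, x3) next to x1,
# so whichever value x1 takes, one clause has all literals equal.
hard = solve_nae_3sat(3, [(1, 2, 3), (1, 2, -3), (1, -2, 3), (1, -2, -3)])
```

Note the symmetry that distinguishes NAE-3-SAT from plain 3-SAT: flipping every variable of a NAE-satisfying assignment gives another NAE-satisfying assignment.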

  13. WHISKER is NP-Complete • Road map • 1. Construct a special graph G of 2n vertices and show that G admits 2^n whiskers and no more. • 2. Construct a G-like graph for the 3-SAT problem. • 3. Reduce the NAE-3-SAT problem to WHISKER

  14. NP-Completeness • WHISKER is in NP • Reduction from NAE-3-SAT to WHISKER • Consider the following graph (constructed in polynomial time) • In each row, pick exactly one vertex (i.e., either x_i or ¬x_i) • The resulting subgraph of n vertices is a whisker • The total number of such whiskers is 2^n • And no more than that
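
The count of 2^n comes from making one independent binary choice per row. The sketch below only enumerates those row selections; it does not build the graph from the slide (whose figure is not in this transcript), and the vertex labels are hypothetical.

```python
from itertools import product


def row_selections(n):
    """All ways to pick exactly one of (x_i, ~x_i) from each of n rows."""
    rows = [(f"x{i}", f"~x{i}") for i in range(1, n + 1)]
    return list(product(*rows))


selections = row_selections(3)
# Each selection corresponds to one truth assignment of the n variables.
```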

  15. NP-Complete • 2^n whiskers and no more than that!!! Why??? • Suppose there is a whisker W of 2k+j vertices • Cut size of W • By definition of suitable cut size, we have which implies !!!!

  16. NP-Complete • NAE-3-SAT ≤_P WHISKER • Consider an instance of NAE-3-SAT with n variables and c clauses. • Construct G_1, G_2, …, G_c as follows

  17. NP-Complete • NAE-3-SAT ≤_P WHISKER • Now, combine all the G_i’s and add up all edge weights to get G’. • Next, update G and update G’, then combine them into G* • The 3-CNF formula has a satisfying assignment ⇔ G* contains a whisker

  18. NP-Complete • Update G ( ) • Update G’ • Scale all edge weights of G’ by a small factor δ, where cn²δ ≪ 1 • All whiskers in the new G are the same as in the old G.

  19. NP-Complete • G* = G + G’ • Goal: If the 3-CNF instance has a satisfying truth assignment, then selecting the true literal from each row of G* gives us a whisker of size n, and vice versa. • For any truth assignment of 3-SAT, rearrange the literals into TRUE and FALSE columns. • If there is a satisfying not-all-equal assignment for 3-SAT • Each clause must have at least one TRUE and at least one FALSE literal. • Not all the literals in a clause can be in the same column. • For the ith clause, G_i contains n²−2 edges connecting its two columns • The total cut size is required to satisfy

  20. NP-Complete • If there is NO satisfying not-all-equal assignment for 3-SAT • At least one clause i has all its literals located in the same column ⇒ n² edges between the two columns of G_i. • For the other (c−1) clauses, there are at most (n²−2) edges connecting their two columns. Total number of edges: (c−1)(n²−2) + n² = cn² − 2c + 2. • Of course, we do not want selecting the true literal in each row to give us a whisker, thus Combining the two inequalities, if ε and δ are chosen such that Then If the 3-CNF instance has a satisfying truth assignment, then selecting the true literal from each row of G* gives us a whisker of size n, and vice versa. • Hence, NAE-3-SAT ≤_P WHISKER □
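
The edge count on this slide can be checked by expanding the product. The second line below is an inference rather than something stated on the slides: it assumes that in the satisfiable case every one of the c clauses contributes exactly n²−2 crossing edges, as the previous slide suggests.

```latex
\begin{align*}
(c-1)(n^2-2) + n^2 &= cn^2 - 2c - n^2 + 2 + n^2 = cn^2 - 2c + 2, \\
c\,(n^2-2) &= cn^2 - 2c, \\
\bigl(cn^2 - 2c + 2\bigr) - \bigl(cn^2 - 2c\bigr) &= 2.
\end{align*}
```

Under that assumption the two cases differ by only 2 edges in G’, which is presumably why ε and δ have to be chosen carefully to separate them.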

  21. Heuristic Algorithms

  22. Results • On random graphs • Alg 2 finds an approximate core • Alg 3 fails to find an approximate core • The core size grows linearly with d = np (fixed n) and logarithmically with n (fixed d) • ??? G(n,p) displays core structure with high probability when p > 1/n ???

  23. Results • Textual graph • Vertices and edges: words and their semantic correlations • Data was crawled from 10K scientific papers of the KDD conference (1992-2003) • Pointwise mutual information • Total: 685 vertices and 6,432 edges
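
The slide names pointwise mutual information without saying how it was estimated or thresholded. As a sketch under stated assumptions, the code below uses the usual definition PMI(x, y) = log( p(x, y) / (p(x) p(y)) ) with probabilities estimated from document-level co-occurrence. The function name `pmi_scores` and the toy documents are hypothetical, not the KDD data.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_scores(documents):
    """PMI for every word pair that co-occurs in at least one document.

    Assumption: p(x) is the fraction of documents containing x, and
    p(x, y) is the fraction containing both x and y.
    """
    n_docs = len(documents)
    word_count = Counter()
    pair_count = Counter()
    for doc in documents:
        words = sorted(set(doc))
        word_count.update(words)
        pair_count.update(combinations(words, 2))  # pairs in sorted order
    scores = {}
    for (x, y), n_xy in pair_count.items():
        p_xy = n_xy / n_docs
        p_x = word_count[x] / n_docs
        p_y = word_count[y] / n_docs
        scores[(x, y)] = math.log(p_xy / (p_x * p_y))
    return scores


docs = [["graph", "cut"], ["graph", "cut"], ["graph", "tree"], ["data"]]
scores = pmi_scores(docs)
```

In a graph built this way, word pairs whose score exceeds some chosen threshold would become edges; the threshold used for the 685-vertex graph is not given in the slides.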

  24. Results • Both Alg 2 and Alg 3 successfully find approximate cores. • Higher values of λ indicate smaller core sizes. • In Fig. (b), the best community of the textual graph has a large conductance of 0.3 ⇒ the best community has as many internal edges as cut edges. • Alg 3 is believed to be more useful.

  25. Comment • Does a “whisker” make sense?

  26. Reference • [1] Schaefer, T. J. The complexity of satisfiability problems. In Proc. 10th Ann. ACM Symp. on Theory of Computing (1978), Association for Computing Machinery, pp. 216-226.
