Modularity and Community Structure in Networks*

lidia

Presentation Transcript


  1. Modularity and Community Structure in Networks* • Final project • *Based on a paper by M. E. J. Newman, PNAS 2006

  2. Introduction

  3. Networks • A network is represented by a graph G(V, E): V = nodes, E = edges (links between node pairs) • Examples of real-life networks: • social networks (V = people) • World Wide Web (V = webpages) • protein-protein interaction networks (V = proteins)
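
As a small illustration of the G(V, E) representation, the sketch below builds the adjacency matrix of an undirected graph from an edge list. The function name `adjacency_matrix` and the toy network (two triangles joined by one edge) are assumptions made for this example; they do not come from the slides.

```python
def adjacency_matrix(n, edges):
    """Return the n x n adjacency matrix of an undirected graph as nested lists."""
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        A[u][v] = 1
        A[v][u] = 1  # undirected graph, so the matrix is symmetric
    return A

# Hypothetical toy network: two triangles (0,1,2) and (3,4,5) joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = adjacency_matrix(6, edges)

# The degree of a node is the sum of its row.
degrees = [sum(row) for row in A]
print(degrees)  # [2, 2, 3, 3, 2, 2]
```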

  4. Protein-protein Interaction Networks • Nodes – proteins (6K), edges – interactions (15K). • Reflect the cell’s machinery and signaling pathways.

  5. Communities (clusters) in a network • A community (cluster) is a densely connected group of vertices, with only sparse connections to other groups.

  6. Searching for communities in a network • There are numerous algorithms with different "target functions": • "Homogeneity": densely connected clusters • "Separation": graph partitioning, min-cut approach • Clustering is important for understanding the structure of the network • Provides an overview of the network

  7. Distilling Modules from Networks • Motivation: identifying protein complexes responsible for certain functions in the cell

  8. Newman's network division algorithm http://www.pnas.org/content/103/23/8577.full

  9. Important features of Newman's clustering algorithm • The number and size of the clusters are determined by the algorithm • Attempts to find a division that maximizes a modularity score Q • Heuristic algorithm • Reports when the network is non-modular (indivisible)
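
To make the modularity score Q concrete, here is a minimal sketch that evaluates it for a given assignment of nodes to communities. It assumes the standard definition from the cited Newman paper, Q = (1/2m) Σ_ij [A_ij - k_i·k_j/(2m)] δ(c_i, c_j), where k_i is the degree of node i and m is the number of edges; the slides do not spell this formula out. The function name `modularity` and the toy network are illustrative.

```python
def modularity(A, labels):
    """Modularity Q of a partition; labels[i] is the community of node i."""
    n = len(A)
    k = [sum(row) for row in A]      # node degrees
    two_m = sum(k)                   # 2m = sum of degrees
    if two_m == 0:
        return 0.0                   # no edges: define Q as 0 (assumption)
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                # observed edge minus the expected number under random wiring
                q += A[i][j] - k[i] * k[j] / two_m
    return q / two_m

# Hypothetical toy network: two triangles joined by the single edge 2-3.
A = [[0, 1, 1, 0, 0, 0],
     [1, 0, 1, 0, 0, 0],
     [1, 1, 0, 1, 0, 0],
     [0, 0, 1, 0, 1, 1],
     [0, 0, 0, 1, 0, 1],
     [0, 0, 0, 1, 1, 0]]

q_split = modularity(A, [0, 0, 0, 1, 1, 1])   # one community per triangle
q_whole = modularity(A, [0, 0, 0, 0, 0, 0])   # everything in one community
print(round(q_split, 4), round(q_whole, 4))
```

On this toy network the split into the two triangles gives Q = 5/14, while putting all nodes into one community gives Q = 0, which matches the intuition that a positive Q indicates more within-group edges than expected by chance.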

  10. Overview of the algorithm

  11. Spectral 2-division algorithm • Input: adjacency matrix A (n vertices) • Output: a (±1)-vector s of size n representing the 2-division: the "-1" cluster (vertices whose corresponding entry is -1) and the "+1" cluster (vertices whose corresponding entry is +1) • Build the modularity matrix B from A • Compute the leading eigenpair (u1, β1) of B: u1 is the eigenvector (size n), β1 is the eigenvalue, B·u1 = β1·u1, and β1 is the largest eigenvalue • If β1 == 0, the network is indivisible • Else (heuristic...): • Transform u1 into a (±1)-vector s • Q = sᵀBs • If Q > 0 return s, else return (+1,....,+1)
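
The steps on this slide can be sketched in plain Python as follows. Several choices here are assumptions and not taken from the slides: B is built as B_ij = A_ij - k_i·k_j/(2m) as in the cited Newman paper; the leading eigenpair is found by power iteration on the shifted matrix B + c·I (c = 1-norm of B), which is one possible method; the (±1)-vector is obtained from the signs of the eigenvector entries; and a small tolerance `eps` stands in for the exact comparisons with zero. All function names are illustrative.

```python
import random

def modularity_matrix(A):
    """B[i][j] = A[i][j] - k_i * k_j / (2m); assumes the graph has at least one edge."""
    k = [sum(row) for row in A]
    two_m = sum(k)
    n = len(A)
    return [[A[i][j] - k[i] * k[j] / two_m for j in range(n)] for i in range(n)]

def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def leading_eigenpair(B, iters=1000, tol=1e-10, seed=0):
    """Power iteration on B + shift*I, so the largest (not largest-magnitude)
    eigenvalue of B dominates. Returns (eigenvector, eigenvalue of B)."""
    n = len(B)
    shift = max(sum(abs(B[i][j]) for i in range(n)) for j in range(n))  # 1-norm
    rng = random.Random(seed)
    v = [rng.random() for _ in range(n)]
    for _ in range(iters):
        w = [bv + shift * x for bv, x in zip(mat_vec(B, v), v)]
        norm = sum(x * x for x in w) ** 0.5
        w = [x / norm for x in w]
        converged = max(abs(a - b) for a, b in zip(w, v)) < tol
        v = w
        if converged:
            break
    # Rayleigh quotient gives the eigenvalue of B itself (without the shift).
    beta = sum(x * y for x, y in zip(v, mat_vec(B, v))) / sum(x * x for x in v)
    return v, beta

def spectral_bisect(A, eps=1e-8):
    """Return a (+1/-1)-vector s, or all +1 if no division is found."""
    n = len(A)
    B = modularity_matrix(A)
    u1, beta1 = leading_eigenpair(B)
    if beta1 <= eps:                       # leading eigenvalue is 0: indivisible
        return [1] * n
    s = [1 if x > 0 else -1 for x in u1]   # sign rule (assumption)
    q = sum(si * y for si, y in zip(s, mat_vec(B, s)))  # Q = s^T B s
    return s if q > eps else [1] * n

# Hypothetical toy network: two triangles joined by the single edge 2-3.
A = [[0, 1, 1, 0, 0, 0],
     [1, 0, 1, 0, 0, 0],
     [1, 1, 0, 1, 0, 0],
     [0, 0, 1, 0, 1, 1],
     [0, 0, 0, 1, 0, 1],
     [0, 0, 0, 1, 1, 0]]
s = spectral_bisect(A)
print(s)  # the two triangles receive opposite signs
```

Which triangle is labeled +1 depends on the sign of the computed eigenvector, so only the grouping is meaningful. Note that Q here follows the slide and omits the 1/(4m) factor used in the paper; the factor is positive, so the test Q > 0 is unaffected.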

  12. Dividing into more than 2 • How to divide into more than 2 groups? • Idea: apply the algorithm recursively* on every group. • The algorithm should be generalized to a 2-division of a group within the network
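
One way to generalize the 2-division to a group g, sketched below, is the group modularity matrix from the cited Newman paper: B^(g)_ij = B_ij - δ_ij Σ_{l∈g} B_il for i, j in g, where B_ij = A_ij - k_i·k_j/(2m) uses the degrees and edge count of the whole network. This formula comes from the paper, not from the slides, and the function name `group_modularity_matrix` is illustrative.

```python
def group_modularity_matrix(A, g):
    """Modularity matrix restricted to the node list g, with each diagonal
    entry reduced by its row sum over g (so every row sums to zero)."""
    k = [sum(row) for row in A]      # degrees in the FULL network
    two_m = sum(k)                   # assumes at least one edge
    Bg = [[A[i][j] - k[i] * k[j] / two_m for j in g] for i in g]
    for r, row in enumerate(Bg):
        row[r] -= sum(row)           # subtract the row sum from the diagonal
    return Bg

# Hypothetical toy network: two triangles joined by the single edge 2-3.
A = [[0, 1, 1, 0, 0, 0],
     [1, 0, 1, 0, 0, 0],
     [1, 1, 0, 1, 0, 0],
     [0, 0, 1, 0, 1, 1],
     [0, 0, 0, 1, 0, 1],
     [0, 0, 0, 1, 1, 0]]
Bg = group_modularity_matrix(A, [0, 1, 2])
for row in Bg:
    print([round(x, 3) for x in row])
```

When g is the whole network the correction vanishes, because every row of B already sums to zero, so the same 2-division routine can be reused for the first split and for every later one.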

  13. Newman's clustering algorithm • P* = {{1,....,n}} (*singleton nodes should be removed) • For each group g in P: • Remove g from P • Perform a spectral 2-division on g • If g is divisible, improve the 2-division with an additional heuristic • Add* each subgroup of the 2-division to P (*only if the subgroup has more than one element and is different from g)
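
The loop on this slide can be sketched end to end as below. This is a simplified illustration under stated assumptions, not the project's implementation: the improvement heuristic is omitted, finished groups are collected in a separate list `done` (the slide does not say where they go), the group matrix follows the cited Newman paper, the eigenpair is computed by shifted power iteration, and `eps` replaces exact comparisons with zero. All names are hypothetical.

```python
import random

def group_matrix(A, k, two_m, g):
    """B^(g): modularity matrix restricted to g, diagonal reduced by row sums."""
    Bg = [[A[i][j] - k[i] * k[j] / two_m for j in g] for i in g]
    for r, row in enumerate(Bg):
        row[r] -= sum(row)
    return Bg

def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def leading_eigenpair(B, iters=2000, tol=1e-10, seed=0):
    """Shifted power iteration; returns (eigenvector, largest eigenvalue of B)."""
    n = len(B)
    shift = max(sum(abs(B[i][j]) for i in range(n)) for j in range(n))
    if shift == 0:                       # zero matrix: every vector has eigenvalue 0
        return [1.0] * n, 0.0
    rng = random.Random(seed)
    v = [rng.random() for _ in range(n)]
    for _ in range(iters):
        w = [bv + shift * x for bv, x in zip(mat_vec(B, v), v)]
        norm = sum(x * x for x in w) ** 0.5
        w = [x / norm for x in w]
        converged = max(abs(a - b) for a, b in zip(w, v)) < tol
        v = w
        if converged:
            break
    beta = sum(x * y for x, y in zip(v, mat_vec(B, v))) / sum(x * x for x in v)
    return v, beta

def bisect(A, k, two_m, g, eps=1e-8):
    """Return [g] if g is indivisible, else its two subgroups."""
    Bg = group_matrix(A, k, two_m, g)
    u, beta = leading_eigenpair(Bg)
    if beta <= eps:
        return [g]
    s = [1 if x > 0 else -1 for x in u]
    q = sum(si * y for si, y in zip(s, mat_vec(Bg, s)))   # s^T B^(g) s
    if q <= eps:
        return [g]
    return [[v for v, si in zip(g, s) if si > 0],
            [v for v, si in zip(g, s) if si < 0]]

def communities(A):
    """Repeated spectral 2-division; assumes the graph has at least one edge."""
    k = [sum(row) for row in A]
    two_m = sum(k)
    todo = [list(range(len(A)))]         # P on the slide
    done = []                            # finished groups (assumption)
    while todo:
        g = todo.pop()
        parts = bisect(A, k, two_m, g)
        for part in parts:
            if len(parts) == 2 and len(part) > 1:
                todo.append(part)        # may be divisible further
            else:
                done.append(part)        # indivisible group or single node
    return sorted(sorted(p) for p in done)

# Hypothetical toy network: two triangles joined by the single edge 2-3.
A = [[0, 1, 1, 0, 0, 0],
     [1, 0, 1, 0, 0, 0],
     [1, 1, 0, 1, 0, 0],
     [0, 0, 1, 0, 1, 1],
     [0, 0, 0, 1, 0, 1],
     [0, 0, 0, 1, 1, 0]]
print(communities(A))  # [[0, 1, 2], [3, 4, 5]]
```

On the toy network the first division separates the two triangles, and each triangle is then reported as indivisible because the leading eigenvalue of its group matrix is 0.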
