ICDE 2014

LinkSCAN*: Overlapping Community Detection Using the Link-Space Transformation

Sungsu Lim †, Seungwoo Ryu ‡, Sejeong Kwon §, Kyomin Jung ¶, and Jae-Gil Lee †

† Dept. of Knowledge Service Engineering, KAIST

‡ Samsung Advanced Institute of Technology

§ Graduate School of Cultural Technology, KAIST

¶ Dept. of Electrical and Computer Engineering, SNU
Contents
  • Motivation
  • Link-Space Transformation
  • Proposed Algorithm: LinkSCAN*
  • Experiment Evaluation
  • Conclusions
Community Detection
  • Network communities
    • Sets of nodes where the nodes in the same set are similar (more internal links) and the nodes in different sets are dissimilar (fewer external links)
    • Also called communities, clusters, modules, groups, etc.
  • Non-overlapping community detection
    • Finding a good partition of nodes

[Figure caption: Clusters do NOT overlap]

Overlapping Community Detection
  • A person (node) can belong to multiple communities, e.g., family, friends, colleagues, etc.
  • Overlapping community detection allows a node to belong to several groups at once

[Figure labels: family, friends, colleagues]

Existing Methods
  • Node-based: A node overlaps if more than one of its belonging coefficients is larger than some threshold
    • Label Propagation (COPRA) [Gregory 2010, Subelj and Bajec 2011]
  • Structure-based: A node overlaps if it participates in multiple base structures with different memberships
    • Clique Percolation (CPM) [Palla et al. 2005, Derenyi et al. 2005]
    • Link Partition [Evans and Lambiotte 2009, Ahn et al. 2010]

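[Figure: one example per method, each drawn around a node i]
  • Node-based: f(i,c1)=0.35, f(i,c2)=0.05, f(i,c3)=0.4, …; threshold = 0.3; update rule $f(i,c) = \mathrm{mean}_{j \in \mathrm{nbr}(i)} f(j,c)$
  • CPM: base structure is cliques of size 4
  • Link partition: base structure is links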
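
The node-based rule on this slide can be sketched in Python as follows. This is only a minimal illustration and not COPRA itself: the function names `communities_above_threshold` and `mean_of_neighbors` are hypothetical, and only the three coefficients printed on the slide are used (the elided ones are left out).

```python
def communities_above_threshold(coeffs, threshold):
    """Return the communities whose belonging coefficient is larger than the threshold."""
    return sorted(c for c, f in coeffs.items() if f > threshold)


def mean_of_neighbors(i, nbr, f):
    """f(i,c) = mean of f(j,c) over the neighbors j of i (missing entries count as 0)."""
    neighbors = nbr[i]
    communities = {c for j in neighbors for c in f[j]}
    return {c: sum(f[j].get(c, 0.0) for j in neighbors) / len(neighbors)
            for c in communities}


# Coefficients of node i as printed on the slide, with threshold 0.3.
coeffs_i = {"c1": 0.35, "c2": 0.05, "c3": 0.4}
memberships = communities_above_threshold(coeffs_i, 0.3)
is_overlapping = len(memberships) > 1  # more than one coefficient passes

# Toy update with made-up values: node "i" has two neighbors, "a" and "b".
updated = mean_of_neighbors("i", {"i": ["a", "b"]},
                            {"a": {"c1": 1.0}, "b": {"c2": 1.0}})
```

With these numbers, c1 and c3 pass the threshold, so node i is reported as overlapping.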

Limitations of Existing Methods
  • The existing methods do not perform well for
    • 1. networks with many highly overlapping nodes,
    • 2. networks with various base structures, and
    • 3. networks with many weak ties

[Figure: three failure cases, each drawn around a node i; other labels: Weak-tie, c1, c2, c3, c4]
  • COPRA fails (i: overlapping): f(i,c1)=0.2, f(i,c2)=0.15, f(i,c3)=0.25, f(i,c4)=0.2, …; threshold = 0.3
  • Link partition fails (i: non-overlapping)
  • CPM fails (i: non-overlapping)

Our Solution
  • We propose a new framework called the link-space transformation that transforms a given graph into the link-space graph
  • We develop an algorithm that performs non-overlapping clustering on the link-space graph, which in turn lets us discover overlapping clusters in the original graph

[Figure: Original Graph → (Link-Space Transformation) → Link-Space Graph → (Non-overlapping Clustering) → Link Communities → (Membership Translation) → Overlapping Communities]

Overall Procedure
  • We propose an overlapping clustering algorithm using the link-space transformation

Link-Space Transformation
  • Topological structure
    • Each link of the original graph maps to a node of the link-space graph
    • Two nodes of the link-space graph are adjacent if the corresponding two links of the original graph are incident to a common node
  • Weights
    • Weights of the links of the link-space graph are calculated from the similarity of the corresponding links of the original graph

[Figure: an original graph with nodes i, j, k and 0 to 8, and its link-space graph with nodes i0, i1, i2, j1, j2, j3, j4, ik, jk, k5, k6, k7, k8]
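
The transformation described on this slide can be sketched in Python as below. The slides only say that weights come from "the similarity of corresponding links", so the similarity used here is an assumption: the Jaccard index of the inclusive neighborhoods of the two non-shared endpoints, in the style of the link similarity of Ahn et al. 2010. The function name `link_space_graph` is hypothetical, and the edge list is the seven-link example graph (links 03, 13, 34, 12, 45, 35, 23) that appears on the clustering slide.

```python
from fractions import Fraction
from itertools import combinations


def link_space_graph(edges):
    """Map each link to a link-space node and weight each pair of incident links.

    Assumes a simple undirected graph (no self-loops, no duplicate links).
    Returns a dict: frozenset({link1, link2}) -> similarity weight.
    """
    inclusive_nbrs = {}  # node -> its neighbors plus the node itself
    incident = {}        # node -> links touching that node
    for u, v in edges:
        link = (min(u, v), max(u, v))
        inclusive_nbrs.setdefault(u, {u}).add(v)
        inclusive_nbrs.setdefault(v, {v}).add(u)
        incident.setdefault(u, []).append(link)
        incident.setdefault(v, []).append(link)

    weights = {}
    for k, links in incident.items():
        # Every pair of links that share node k is adjacent in link space.
        for e1, e2 in combinations(links, 2):
            i = e1[0] if e1[1] == k else e1[1]  # endpoint of e1 other than k
            j = e2[0] if e2[1] == k else e2[1]  # endpoint of e2 other than k
            shared = inclusive_nbrs[i] & inclusive_nbrs[j]
            union = inclusive_nbrs[i] | inclusive_nbrs[j]
            weights[frozenset({e1, e2})] = Fraction(len(shared), len(union))
    return weights


edges = [(0, 3), (1, 3), (3, 4), (1, 2), (4, 5), (3, 5), (2, 3)]
weights = link_space_graph(edges)
strong = {pair: w for pair, w in weights.items() if w >= Fraction(1, 3)}
```

Under this assumed similarity, the pairs (13, 23) and (34, 35) get weight 1, four pairs get weight 1/2, and every other pair falls below 1/3, which agrees with the weights labeled on the clustering slide.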

Clustering on Link-Space Graph
  • Applying a non-overlapping clustering algorithm to the link-space graph
  • We use structural clustering, which can classify a node as a hub or an outlier (neutral membership)

[Figure: left, the original graph on nodes 0 to 5; right, non-overlapping clustering on the link-space graph with nodes 03, 12, 13, 23, 34, 35, 45. The labeled link weights are 1 and 1/2; all other weights are less than 1/3.]
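
A simplified, SCAN-style structural clustering of the link-space graph might look like the sketch below. It is an illustration under stated assumptions, not the authors' implementation: the parameter names `eps` and `mu`, the function name `structural_clusters`, and the rule "a core has at least `mu` neighbors with weight at least `eps`" are assumptions. The weights 1 and 1/2 follow the slide; the three weak weights are placeholder values chosen only to be below 1/3.

```python
from collections import deque
from fractions import Fraction


def structural_clusters(weights, eps, mu):
    """Cluster link-space nodes; nodes reached by no core stay neutral.

    weights: dict frozenset({a, b}) -> similarity between link-space nodes a and b.
    Returns (label, neutral): cluster id per clustered node, and the unclustered nodes.
    """
    nodes, strong = set(), {}
    for pair, w in weights.items():
        a, b = tuple(pair)
        nodes.update((a, b))
        if w >= eps:  # keep only sufficiently similar neighbors
            strong.setdefault(a, set()).add(b)
            strong.setdefault(b, set()).add(a)

    cores = {v for v in nodes if len(strong.get(v, ())) >= mu}
    label, next_id = {}, 0
    for seed in sorted(cores):
        if seed in label:
            continue
        label[seed] = next_id
        queue = deque([seed])  # the queue only ever holds core nodes
        while queue:
            v = queue.popleft()
            for u in strong[v]:
                if u not in label:
                    label[u] = next_id
                    if u in cores:  # only cores keep expanding the cluster
                        queue.append(u)
        next_id += 1
    return label, nodes - set(label)


half, one = Fraction(1, 2), Fraction(1)
weights = {
    frozenset({(1, 2), (1, 3)}): half, frozenset({(1, 2), (2, 3)}): half,
    frozenset({(1, 3), (2, 3)}): one, frozenset({(3, 4), (3, 5)}): one,
    frozenset({(3, 4), (4, 5)}): half, frozenset({(3, 5), (4, 5)}): half,
    # Placeholder weak weights (the slide only says "less than 1/3").
    frozenset({(0, 3), (1, 3)}): Fraction(1, 4),
    frozenset({(0, 3), (3, 4)}): Fraction(1, 4),
    frozenset({(2, 3), (3, 4)}): Fraction(1, 5),
}
label, neutral = structural_clusters(weights, eps=Fraction(1, 3), mu=2)
```

On this toy input the links 12, 13, 23 form one cluster, the links 34, 35, 45 form another, and link 03 is left neutral.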

Membership Translation
  • Memberships of nodes of the link-space graph map to the memberships of links of the original graph
  • Memberships of a node of the original graph are from the memberships of incident links of the node

[Figure: left, non-overlapping clustering on the link-space graph (nodes 03, 12, 13, 23, 34, 35, 45; labeled weights 1 and 1/2); right, membership translation onto the original graph with nodes 0 to 5]
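
The two translation rules on this slide can be sketched as follows. The function name `translate_memberships` is hypothetical, and treating a neutral link (label `None`) as contributing no membership is an assumption, since the slide does not say how neutral links are handled. The link labels "A" and "B" are illustrative names for the two link communities of the example graph.

```python
def translate_memberships(link_labels):
    """Give each node the set of communities of its incident links.

    link_labels: dict (u, v) -> community label, or None for a neutral link.
    """
    node_memberships = {}
    for (u, v), community in link_labels.items():
        for node in (u, v):
            memberships = node_memberships.setdefault(node, set())
            if community is not None:  # assumption: neutral links add nothing
                memberships.add(community)
    return node_memberships


link_labels = {
    (1, 2): "A", (1, 3): "A", (2, 3): "A",
    (3, 4): "B", (3, 5): "B", (4, 5): "B",
    (0, 3): None,
}
node_memberships = translate_memberships(link_labels)
overlapping_nodes = sorted(n for n, m in node_memberships.items() if len(m) > 1)
```

Node 3 touches links from both communities, so it is the only overlapping node in this example.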

Advantages of Link-Space Graph
  • Finding disjoint communities in the link-space graph lets us find overlapping communities in the original graph, and the original structure is preserved because the similarity weights properly reflect the structure of the original graph

[Figure: the link-space graph makes it easier to find overlapping communities while preserving the original structure]

LinkSCAN*
  • We propose an efficient overlapping clustering algorithm using the link-space transformation
  • For a massive graph, the link-space graph may be dense, so a link sampling step is added before structural clustering

[Figure: Original Graph → (Link-Space Transformation) → Link-Space Graph → (Link Sampling) → Sampled Graph → (Structural Clustering) → Link Communities → (Membership Translation) → Overlapping Communities]

Link Sampling
  • Sampling strategy: For each node $i$, we sample $n_i$ incident links of $i$, where $n_i$ is a function of $d_i$, the degree of $i$
  • Thm 1 guarantees that sampling errors are not significant even when $n_i$ is small
  • For real networks, the sampled graph and the link-space graph are close (NMI > 0.9) even though the sampling rate is small (~0.1)
  • Thm 1 (Error bound)
    • By the Chernoff bound, the estimation error in selecting core nodes decreases exponentially as the $n_i$'s increase
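
The per-node sampling strategy can be sketched as below. The formula for the per-node sample size is not given in this transcript, so it is passed in as a caller-supplied function; the constant cap used in the demo is purely hypothetical. The function name `sample_incident_links` and the rule that a link is kept when either endpoint samples it are also assumptions.

```python
import random


def sample_incident_links(adj, sample_size, seed=0):
    """For each node, sample some of its incident links uniformly at random.

    adj: dict node -> set of neighbors (of the graph being sampled).
    sample_size: function degree -> requested number of links (hypothetical rule).
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    sampled = set()
    for i in sorted(adj):
        neighbors = sorted(adj[i])
        degree = len(neighbors)
        # Never request more links than the node has, and never fewer than zero.
        n_i = max(0, min(degree, sample_size(degree)))
        for j in rng.sample(neighbors, n_i):
            sampled.add((min(i, j), max(i, j)))  # store links undirected
    return sampled


# Demo on a complete graph with 6 nodes (15 links, every degree is 5).
nodes = range(6)
adj = {i: {j for j in nodes if j != i} for i in nodes}
sampled = sample_incident_links(adj, sample_size=lambda d: 2)
```

In the pipeline shown in the slides, this step would run on the link-space graph, so `adj` would hold link-space adjacency rather than the original graph.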
Network Datasets
  • Synthetic network: LFR benchmark networks [Lancichinetti and Fortunato 2009]
  • Real network: Social and information networks [snap.stanford.edu/data/ and www.nd.edu/~networks/resources.htm]
Performance Evaluation
  • When ground-truth is known
    • NMI for overlapping clustering [Lancichinetti et al. 2009]
    • F-score (performance of identifying overlapping nodes)
  • When ground-truth is unknown
    • Quality (Mov): Modularity for overlapping clustering [Lazar et al. 2010]
    • Coverage (CC): Clustering coverage [Ahn et al. 2010]
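
The F-score for identifying overlapping nodes can be sketched as follows, assuming the standard definition (harmonic mean of precision and recall over the set of overlapping nodes). The slide does not spell out the formula, so this definition, the function name `overlap_f_score`, and the node ids in the demo are assumptions.

```python
def overlap_f_score(true_overlapping, predicted_overlapping):
    """F-score of the predicted set of overlapping nodes against the ground truth."""
    true_set, pred_set = set(true_overlapping), set(predicted_overlapping)
    hits = len(true_set & pred_set)  # correctly identified overlapping nodes
    if hits == 0:
        return 0.0
    precision = hits / len(pred_set)
    recall = hits / len(true_set)
    return 2 * precision * recall / (precision + recall)


# Illustrative node ids: 2 of 4 predictions are right, 2 of 3 true nodes are found.
score = overlap_f_score({3, 7, 9}, {3, 9, 11, 12})
```

Here precision is 1/2 and recall is 2/3, which gives an F-score of 4/7.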
Problem 1
  • For networks with many highly overlapping nodes, LinkSCAN* outperforms the existing methods.
Problem 2
  • For networks with various base-structures, our method performs well compared to the existing methods
Problem 3
  • For networks with many weak ties, the existing methods fail on the toy networks shown, whereas LinkSCAN* detects all the clusters well
Real Networks
  • For real network datasets, the normalized measure of (Quality + Coverage) indicates that LinkSCAN* is better than the existing methods.
Link Sampling
  • The comparisons between the use of the link-space graph (LinkSCAN) and the use of sampled graphs (LinkSCAN*) show that LinkSCAN* improves efficiency with small errors

Enron-email network: # nodes = 37K, # links = 184K

Scalability
  • The running time of LinkSCAN* on a set of LFR benchmark networks shows that LinkSCAN* has near-linear scalability

LFR benchmark networks: # nodes = 1K to 1M, # links = 10K to 10M

Conclusions
  • We propose the notion of the link-space transformation and develop a new overlapping clustering algorithm, LinkSCAN*, that satisfies membership neutrality
  • LinkSCAN* outperforms existing algorithms for the networks with many highly overlapping nodes and those with various base-structures
Acknowledgement
  • Coauthors
  • Funding Agencies
    • This research was supported by National Research Foundation of Korea