Discovering Overlapping Groups in Social Media
Xufei Wang, Lei Tang, Huiji Gao, and Huan Liu
[email protected]
Arizona State University
Social Media
Facebook: 500 million active users; 50% of users log on to Facebook every day
Twitter: 100 million users; 300,000 new users every day
Connect with others to form “Friends”
Interact with others (comment, discussion, messaging)
Bookmark websites/URLs (StumbleUpon, Delicious)
Join groups if they explicitly exist (Flickr, YouTube)
Share content (Flickr, YouTube, Delicious)
Given a User-Tag subscription matrix M and the number of clusters k, find k overlapping communities, each consisting of both users and tags.
In an Edge-centric view
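To make the edge-centric view concrete, here is a minimal sketch in Python. It assumes that every nonzero entry of M is treated as one user-tag edge and that each edge receives exactly one cluster label; a node then belongs to every cluster that one of its edges belongs to, which is how overlap arises. The toy matrix, the helper names, and the hand-assigned labels are hypothetical and only illustrate the idea.

```python
# Toy user-tag subscription matrix M: rows are users, columns are tags.
users = ["u1", "u2", "u3"]
tags = ["health", "diet", "cars"]
M = [
    [1, 1, 0],  # u1 uses "health" and "diet"
    [0, 1, 1],  # u2 uses "diet" and "cars"
    [0, 0, 1],  # u3 uses "cars"
]


def edges_from_matrix(M, users, tags):
    """Return one (user, tag) edge per nonzero entry of M."""
    return [
        (users[i], tags[j])
        for i, row in enumerate(M)
        for j, value in enumerate(row)
        if value
    ]


def communities_from_edge_labels(edges, labels):
    """Turn a disjoint labelling of edges into node communities.

    A node joins every community that one of its edges is assigned to,
    so the resulting node sets may overlap.
    """
    communities = {}
    for (user, tag), label in zip(edges, labels):
        communities.setdefault(label, set()).update([user, tag])
    return communities


edges = edges_from_matrix(M, users, tags)
# Hypothetical edge labels for k = 2, assigned by hand for illustration;
# in practice they would come from clustering the edges.
labels = [0, 0, 1, 1, 1]
communities = communities_from_edge_labels(edges, labels)
```

In this toy case the tag "diet" ends up in both communities, because its two edges carry different labels, while each edge still belongs to exactly one cluster.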
The similarity between two edges e and e’ can be defined by, but is not limited to,
Differentiate nodes with varying degrees by normalizing each node with its nodal degree
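As a minimal sketch of an edge similarity and its degree-normalized variant, assume two edges are similar when they share a user or a tag, and that a shared node contributes the reciprocal of its degree once normalization is applied. This is one simple choice made for illustration, not necessarily the definition used in the talk; the function names and the toy edge list are hypothetical.

```python
from collections import Counter


def shared_endpoint_similarity(e1, e2):
    """Count the endpoints two (user, tag) edges share: 0, 1, or 2."""
    (u1, t1), (u2, t2) = e1, e2
    return int(u1 == u2) + int(t1 == t2)


def node_degrees(edges):
    """Count how many edges touch each user and each tag."""
    degrees = Counter()
    for user, tag in edges:
        degrees[("user", user)] += 1
        degrees[("tag", tag)] += 1
    return degrees


def normalized_similarity(e1, e2, degrees):
    """Like shared_endpoint_similarity, but each shared node is
    down-weighted by its degree (an assumed normalization)."""
    (u1, t1), (u2, t2) = e1, e2
    similarity = 0.0
    if u1 == u2:
        similarity += 1.0 / degrees[("user", u1)]
    if t1 == t2:
        similarity += 1.0 / degrees[("tag", t1)]
    return similarity


edges = [
    ("u1", "health"), ("u2", "health"), ("u3", "health"),
    ("u4", "health"), ("u1", "yoga"),
]
degrees = node_degrees(edges)
via_popular_tag = normalized_similarity(edges[0], edges[1], degrees)
via_light_user = normalized_similarity(edges[0], edges[4], degrees)
```

Without normalization both pairs score 1. With it, sharing the popular tag "health" (degree 4) counts for less than sharing the user "u1" (degree 2), which is the effect the normalization is meant to achieve.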
u × t
u × k
Plugging in L, W, and Z, we obtain
We fix the number of users, tags, and density, but vary the number of clusters
We fix the number of users, tags, and clusters, but vary the inner-cluster density
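The two synthetic settings can be sketched as follows. The generator itself is an assumption: it plants k disjoint user-tag blocks and links a user to a tag of the same block with probability `density`, with no noise links between blocks. The function name and parameters are hypothetical, and the generator used for the reported experiments may differ.

```python
import random


def synthetic_user_tag_matrix(n_users, n_tags, k, density, seed=0):
    """Build a 0/1 user-tag matrix with k planted blocks.

    Users and tags are assigned to blocks round-robin; a user is linked
    to a tag of the same block with probability `density`.
    """
    rng = random.Random(seed)
    user_cluster = [i % k for i in range(n_users)]
    tag_cluster = [j % k for j in range(n_tags)]
    M = [[0] * n_tags for _ in range(n_users)]
    for i in range(n_users):
        for j in range(n_tags):
            if user_cluster[i] == tag_cluster[j] and rng.random() < density:
                M[i][j] = 1
    return M, user_cluster, tag_cluster


def count_links(M):
    """Total number of user-tag links in M."""
    return sum(sum(row) for row in M)


# Setting 1: fix users, tags, and density; vary the number of clusters.
by_clusters = {
    k: synthetic_user_tag_matrix(40, 20, k, density=0.5)[0]
    for k in (2, 4, 8)
}

# Setting 2: fix users, tags, and clusters; vary the inner-cluster density.
by_density = {
    d: synthetic_user_tag_matrix(40, 20, 4, density=d)[0]
    for d in (0.0, 0.5, 1.0)
}
```

With `density=1.0` every planted block is fully linked, and with `density=0.0` the matrix is empty, which gives two easy sanity checks before running a clustering method on the intermediate cases.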
Category information reveals personal interests. We view group affiliations as features and use them to infer personal interests, evaluated via cross-validation
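The evaluation idea can be sketched in a few lines. This is only an illustration under stated assumptions: each user is described by the set of groups they were assigned to, the category label stands for the user's interest, and a 1-nearest-neighbour rule on Jaccard overlap stands in for whatever classifier was actually used. The data and all function names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard overlap between two sets of group ids."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def predict(train, affiliations):
    """Return the category of the training user with the most similar
    group affiliations (1-nearest-neighbour, a stand-in classifier)."""
    best = max(train, key=lambda item: jaccard(item[0], affiliations))
    return best[1]


def cross_validate(data, n_folds):
    """Mean accuracy over n_folds folds.

    `data` is a list of (affiliation_set, category) pairs; fold f holds
    out every item whose index is congruent to f modulo n_folds.
    """
    accuracies = []
    for fold in range(n_folds):
        test = data[fold::n_folds]
        train = [item for idx, item in enumerate(data) if idx % n_folds != fold]
        correct = sum(predict(train, feats) == label for feats, label in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / n_folds


# Hypothetical users: group affiliations as features, category as label.
data = [
    ({0, 1}, "Health"), ({0}, "Health"), ({0, 1}, "Health"),
    ({2, 3}, "Autos"), ({2}, "Autos"), ({2, 3}, "Autos"),
]
accuracy = cross_validate(data, n_folds=3)
```

If the discovered groups are informative, users with similar affiliations share categories and the held-out accuracy is high; uninformative groups would push it toward chance.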
The correlation between the number of co-occurrences of two users in different affiliations and their connectivity in real networks.
The more often two users co-occur, the more likely they are to be connected
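One way to check this relationship is to group user pairs by how many affiliations they share and compute, for each group, the fraction of pairs that are actually linked. The sketch below assumes memberships are given as sets of group ids per user and friendships as undirected pairs; the function name and the toy data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def connection_rate_by_cooccurrence(memberships, friendships):
    """Map each co-occurrence count to the fraction of user pairs with
    that count that are connected in the friendship network."""
    friends = {frozenset(pair) for pair in friendships}
    totals = defaultdict(int)
    connected = defaultdict(int)
    for a, b in combinations(sorted(memberships), 2):
        shared = len(memberships[a] & memberships[b])
        totals[shared] += 1
        if frozenset((a, b)) in friends:
            connected[shared] += 1
    return {count: connected[count] / totals[count] for count in totals}


# Toy data: group ids per user and an undirected friendship list.
memberships = {"u1": {0, 1}, "u2": {0, 1}, "u3": {1}, "u4": {2}}
friendships = [("u1", "u2"), ("u2", "u3")]
rates = connection_rate_by_cooccurrence(memberships, friendships)
```

On this toy input the connection rate rises with the co-occurrence count (0.0, 0.5, and 1.0 for 0, 1, and 2 shared groups), which mirrors the trend stated above.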
Tag cloud for Category Health
Tag cloud for Cluster Health
Tag cloud for Cluster Nutrition