
Analysis of Social Media MLD 10-802, LTI 11-772

Presentation Transcript


  1. Analysis of Social Media (MLD 10-802, LTI 11-772). William Cohen, 2-15-11

  2. The “force” on nodes in a graph
  • Suppose every node has a value (IQ, income, ...) y(i)
  • Each node i has value yi and neighbors N(i), degree di
  • If i, j are connected, then j exerts a force -K[yi - yj] on i
  • Total force on i: Fi = -K Σj∈N(i) (yi - yj)
  • Matrix notation: F = -K(D-A)y, where D-A is the Laplacian (checked numerically below)
  • Interesting (?) goal: set y so (D-A)y = c·y, i.e., y is an eigenvector
  • Picture: neighbors pull i up or down, but the net force doesn’t change the relative positions of the nodes
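A quick numerical check of the matrix form, as a minimal numpy sketch; the 4-node path graph, the spring constant K, and the node values are illustrative assumptions, not from the slides:

  import numpy as np

  # Illustrative 4-node path graph; K and the node values y are
  # made-up numbers for the example, not from the slides.
  A = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)   # adjacency matrix
  D = np.diag(A.sum(axis=1))                  # diagonal degree matrix
  K = 1.0
  y = np.array([3.0, 1.0, 4.0, 2.0])          # one value per node

  F = -K * ((D - A) @ y)                      # matrix form: F = -K(D-A)y
  # Same thing, node by node: Fi = -K * sum over j in N(i) of (yi - yj)
  F_check = np.array([-K * sum(y[i] - y[j] for j in range(4) if A[i, j])
                      for i in range(4)])
  assert np.allclose(F, F_check)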

  3. Spectral Clustering: Graph = Matrix
  How do I pick y to be an eigenvector for a block-stochastic matrix?

  4. Spectral Clustering: Graph = Matrix
  W*v1 = v2 “propagates weights from neighbors”
  [Figure: points from three blocks (labeled x, y, z) plotted in eigenvector coordinates e1, e2, e3; points from the same block land near each other. Shi & Meila, 2002]

  5. Another way the Laplacian comes up: it defines a cost formula for y, where y assigns nodes to + or − classes so as to keep connected nodes in the same class:
  cost(y) = y^T (D-A) y = Σ(i,j)∈E (yi - yj)^2
  • Turns out: to minimize y^T X y / (y^T y), find the smallest eigenvector of X
  • But: this will not be +1/−1, so it’s a “relaxed” solution (sketched numerically below)
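A minimal numpy sketch of this cost and its relaxation; the 4-node graph and the +1/−1 labeling are illustrative assumptions:

  import numpy as np

  # Illustrative 4-node graph and a +1/-1 labeling (both assumptions).
  A = np.array([[0, 1, 1, 0],
                [1, 0, 1, 0],
                [1, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
  L = np.diag(A.sum(axis=1)) - A              # Laplacian D - A

  y = np.array([1.0, 1.0, -1.0, -1.0])        # assign nodes to + / - classes
  edges = [(i, j) for i in range(4) for j in range(i + 1, 4) if A[i, j]]
  # The Laplacian's quadratic form is exactly the sum-over-edges cost:
  assert np.isclose(y @ L @ y, sum((y[i] - y[j]) ** 2 for i, j in edges))

  # Relaxed solution: minimizing y^T L y / (y^T y) gives the smallest
  # eigenvector of L, whose entries are real, not +1/-1.  For a connected
  # graph that eigenvector is constant (eigenvalue 0), so the useful split
  # comes from the next-smallest (Fiedler) eigenvector.
  vals, vecs = np.linalg.eigh(L)              # eigenvalues in ascending order
  print(vals[0], vecs[:, 0])                  # 0.0 and a constant vector
  print(vals[1], vecs[:, 1])                  # Fiedler value / vector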

  6. Some more terms
  • If A is an adjacency matrix (maybe weighted) and D is a (diagonal) matrix giving the degree of each node,
  • then D-A is the (unnormalized) Laplacian,
  • W = AD^-1 is a probabilistic adjacency matrix,
  • I-W is the (normalized, or random-walk) Laplacian, etc.
  • The largest eigenvectors of W correspond to the smallest eigenvectors of I-W: if Wv = λv, then (I-W)v = (1-λ)v, with the same eigenvector v (see the check below)
  • So sometimes people talk about “bottom eigenvectors of the Laplacian”
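A small numpy sketch of these definitions and of the eigenvector correspondence; the triangle graph is an illustrative assumption:

  import numpy as np

  # Illustrative triangle graph (an assumption for the example).
  A = np.array([[0, 1, 1],
                [1, 0, 1],
                [1, 1, 0]], dtype=float)
  D = np.diag(A.sum(axis=1))
  W = A @ np.linalg.inv(D)                    # probabilistic adjacency A D^-1
  L_rw = np.eye(3) - W                        # random-walk Laplacian I - W

  # If W v = lam * v, then (I - W) v = (1 - lam) * v: the largest
  # eigenvalues of W pair with the smallest eigenvalues of I - W,
  # with the same eigenvectors.
  lam, V = np.linalg.eig(W)
  for i in range(3):
      assert np.allclose(L_rw @ V[:, i], (1 - lam[i]) * V[:, i])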

  7. [Figure: a k-NN graph (easy) with its adjacency matrix A and weight matrix W; a fully connected graph, weighted by distance, with its W]

  8. Spectral Clustering: Graph = Matrix
  W*v1 = v2 “propagates weights from neighbors”
  [Figure repeated from slide 4: points x, y, z plotted in eigenvector coordinates e1, e2, e3; Shi & Meila, 2002]

  9. Spectral Clustering: Graph = Matrix
  W*v1 = v2 “propagates weights from neighbors”
  • If W is connected but roughly block-diagonal with k blocks, then:
  • the top eigenvector is a constant vector
  • the next k eigenvectors are roughly piecewise constant, with “pieces” corresponding to blocks

  10. Spectral Clustering: Graph = Matrix
  W*v1 = v2 “propagates weights from neighbors”
  • If W is connected but roughly block-diagonal with k blocks, then:
  • the “top” eigenvector is a constant vector
  • the next k eigenvectors are roughly piecewise constant, with “pieces” corresponding to blocks
  • Spectral clustering:
  • Find the top k+1 eigenvectors v1, …, vk+1
  • Discard the “top” one
  • Replace every node a with the k-dimensional vector xa = <v2(a), …, vk+1(a)>
  • Cluster with k-means (see the sketch below)
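A minimal sketch of this recipe, assuming numpy and scikit-learn's KMeans; the function name spectral_clustering and the two-clique test graph are illustrative assumptions, not the lecture's code:

  import numpy as np
  from sklearn.cluster import KMeans

  def spectral_clustering(A, k):
      # Embed nodes in the top eigenvectors of W = A D^-1, dropping the
      # "top" one, then run k-means (the recipe on the slide).
      W = A @ np.diag(1.0 / A.sum(axis=1))
      lam, V = np.linalg.eig(W)
      order = np.argsort(-lam.real)           # eigenvalues, largest first
      X = V[:, order[1:k + 1]].real           # xa = <v2(a), …, vk+1(a)>
      return KMeans(n_clusters=k, n_init=10).fit_predict(X)

  # Two 3-node cliques joined by a single edge.
  A = np.zeros((6, 6))
  for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
      A[i, j] = A[j, i] = 1.0
  print(spectral_clustering(A, 2))

On this test graph the printed labels separate nodes 0-2 from nodes 3-5 (up to a permutation of cluster names), matching the block structure of W.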

  11. Experimental results: best-case assignment of class labels to clusters
  [Figure: clustering accuracy using eigenvectors of W vs. eigenvectors of a variant of W]
