
Clustering In Large Graphs And Matrices



Presentation Transcript


  1. Clustering In Large Graphs And Matrices Petros Drineas, Alan Frieze, Ravi Kannan, Santosh Vempala, V. Vinay Presented by Eric Anderson

  2. Outline • Clustering: discrete vs. continuous • Singular Value Decomposition (SVD) • Applying SVD to clustering • Algorithm • Analysis and results

  3. Clustering • Group m points in ℝⁿ by similarity, or equivalently, group similar rows of an m × n matrix A • m, n considered variable, k fixed • Many options for goals

  4. Discrete Clustering (DCP) • Minimizes the sum of squared distances from the m points to k cluster centers B = {b₁, …, bₖ}: fDCP(B) = Σᵢ d(Aᵢ, B)², where d(Aᵢ, B) is the distance from row Aᵢ to its nearest center • Cluster centers are the centroids of their clusters' points • Each point belongs to one, and only one, cluster • An exact algorithm based on Voronoi diagrams is supplied, but it is slow (polynomial only when k and the dimension are fixed)
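The DCP objective above is easy to evaluate directly. A minimal NumPy sketch (the helper name `dcp_cost` and the toy data are illustrative, not from the paper):

```python
import numpy as np

def dcp_cost(A, centers):
    """DCP objective: sum of squared Euclidean distances from each
    row of A to its nearest cluster center."""
    # d2[i, j] = squared distance from row i to center j
    d2 = ((A[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).sum()

# Two tight groups; centers placed at the group centroids give a
# small cost (each point is 0.05 away from its center).
A = np.array([[0.0, 0.0], [0.1, 0.0], [10.0, 10.0], [10.0, 10.1]])
centers = np.array([[0.05, 0.0], [10.0, 10.05]])
print(dcp_cost(A, centers))  # 4 * 0.05^2 = 0.01
```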

  5. Continuous Clustering (CCP) • Minimizes the sum of squared distances to some k-dimensional subspace V of ℝⁿ: fCCP(V) = Σᵢ d(Aᵢ, V)² • Gives a lower bound on the optimal value of DCP, e.g. with V = span(B) for the optimal DCP centers B • Result: each point belongs to each cluster with some intensity • Overlap is allowed, but the intensity vectors of the clusters must be orthogonal

  6. CCP Continued • ∀ i, let xᵢ be the ith cluster, an m-vector of intensities • The weight of a cluster x is w(x) = |xᵀA|²/|x|² • Require the xᵢ to be unit vectors: |xᵢ| = 1 • The optimal clustering of A is a set of orthonormal x₁, …, xₖ where xᵢ is a maximum-weight cluster of A subject to being orthogonal to x₁, …, xᵢ₋₁ • Orthogonality: xᵢᵀxⱼ = 0 for i ≠ j

  7. More CCP • Why orthogonality is needed: let v = λu + w, where u and w are orthogonal and u is the maximum-weight cluster. Then w(v) = (λ²|uᵀA|² + |wᵀA|²)/(λ² + |w|²) (the cross term vanishes because u is a singular vector of A), so λ should be 0 for v to be of maximum weight once u is removed

  8. Approximating DCP with CCP • Compute V from CCP • Project A onto V and solve DCP in k dimensions • Result is shown to be a 2-approximation for full DCP (optimal value is off by a factor of no more than 2)
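The pipeline above can be sketched in NumPy. Note one substitution: a few steps of Lloyd's iteration stand in here for the exact k-dimensional DCP solver the paper uses, so this is an illustration of the projection idea, not the paper's algorithm:

```python
import numpy as np

def svd_project(A, k):
    """Coordinates of the rows of A in the span of the top-k right
    singular vectors -- the subspace V solving CCP."""
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    return A @ Vt[:k].T          # m x k matrix

def lloyd(X, k, iters, rng):
    """Simple Lloyd's iteration, standing in for an exact
    k-dimensional DCP solver."""
    C = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        d2 = ((X[:, None] - C[None]) ** 2).sum(-1)
        lab = d2.argmin(1)
        # keep the old center if a cluster goes empty
        C = np.vstack([X[lab == j].mean(0) if np.any(lab == j) else C[j]
                       for j in range(k)])
    return lab

rng = np.random.default_rng(0)
# two well-separated blobs in R^10
A = np.vstack([rng.normal(0, 0.1, (20, 10)),
               rng.normal(5, 0.1, (20, 10))])
labels = lloyd(svd_project(A, 2), 2, 10, rng)
```

Clustering happens in only k = 2 dimensions, which is where the speedup over clustering in the original n dimensions comes from.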

  9. Frobenius Norm • Definition: ‖A‖F = (Σᵢ,ⱼ Aᵢⱼ²)^½ • Similar to the 2-norm for vectors • Not the same as the matrix 2-norm (which is the largest singular value)

  10. Singular Value Decomposition (SVD) • The SVD of a matrix A is A = UΣVᵀ = Σᵢ σᵢuᵢvᵢᵀ • Singular values: σ₁ ≥ σ₂ ≥ … ≥ 0, the diagonal entries of Σ • Singular vectors: the orthonormal columns uᵢ of U (left) and vᵢ of V (right) • Frobenius norm: ‖A‖F² = Σᵢ σᵢ²
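Both identities on this slide are easy to check numerically (a NumPy sketch; the random matrix is just an example):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(6, 4))

# A = U diag(s) V^T, with s sorted in decreasing order
U, s, Vt = np.linalg.svd(A, full_matrices=False)
assert np.allclose(U @ np.diag(s) @ Vt, A)
assert np.all(s[:-1] >= s[1:])

# Frobenius norm identity: ||A||_F^2 equals the sum of squared
# singular values
assert np.isclose(np.linalg.norm(A, 'fro') ** 2, np.sum(s ** 2))
```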

  11. Use of SVD • Minimizes error in rank-k approximations: ‖A − Dₖ‖F ≤ ‖A − B‖F for every matrix B of rank at most k, where Dₖ = Σᵢ≤ₖ σᵢuᵢvᵢᵀ • This solves CCP: ‖A − πV(A)‖F, where πV(A) is the projection of A onto V, is minimized by Dₖ (take V = span(v₁, …, vₖ)), since πV(A) is of rank at most k.

  12. Algorithm • The SVD is rather slow, especially for large matrices • Choose random columns of A for the SVD, forming A* • Want to find columns so that, with D* induced by the first k singular vectors of A*, ‖A − D*‖F² ≤ ‖A − Dₖ‖F² + ε‖A‖F² for some ε > 0.

  13. Algorithm Continued • Steps: 1. Choose c > 0, ε > 0, δ < 1. Let s = 4k/(εcδ). For each i, include column A⁽ⁱ⁾ in the matrix S with probability proportional to its squared length, |A⁽ⁱ⁾|²/‖A‖F². 2. Compute SᵀS. 3. Find the top k eigenvectors pᵢ of SᵀS, and for each i, return Spᵢ/|Spᵢ| as the clusters.
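These steps can be sketched in NumPy. The function name and the rescaling convention are illustrative, and the sketch computes the SVD of the sample S directly, which yields the same top-k left singular vectors as diagonalizing SᵀS:

```python
import numpy as np

def sampled_svd_clusters(A, s, k, rng):
    """Draw s columns of A with probability proportional to their
    squared lengths, rescale, and return the top-k left singular
    vectors of the sample as approximate cluster vectors."""
    col_sq = (A ** 2).sum(axis=0)
    p = col_sq / col_sq.sum()            # p_j = |A^(j)|^2 / ||A||_F^2
    idx = rng.choice(A.shape[1], size=s, p=p)
    # rescaling by 1/sqrt(s * p_j) keeps E[S S^T] = A A^T
    S = A[:, idx] / np.sqrt(s * p[idx])
    U, _, _ = np.linalg.svd(S, full_matrices=False)
    return U[:, :k]

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 500))
H = sampled_svd_clusters(A, s=100, k=3, rng=rng)  # 50 x 3, orthonormal
```

Only the s sampled columns are ever decomposed, which is the source of the speedup when s is much smaller than n.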

  14. Analysis of Algorithm • It is shown that, with probability at least 1 − δ, ‖A − D*‖F² ≤ ‖A − Dₖ‖F² + ε‖A‖F² • In practice, one can pick fewer columns • Actual method: estimate the error by randomly sampling elements of A − D*, and repeat with more columns if the result is not satisfactory • Running time: O(k³/ε⁶ + k²m/ε⁴)
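The "check the error by randomly sampling elements" idea can be sketched as an unbiased estimator of the squared Frobenius norm (a NumPy sketch; the function name, sample count, and test matrices are illustrative, and the zero matrix simply stands in for an approximation D*):

```python
import numpy as np

def sampled_sq_error(A, D, n_samples, rng):
    """Estimate ||A - D||_F^2 from uniformly sampled entries: the
    mean squared sampled entry of A - D, times the total number of
    entries, is an unbiased estimate of the squared Frobenius norm."""
    m, n = A.shape
    i = rng.integers(0, m, n_samples)
    j = rng.integers(0, n, n_samples)
    return m * n * np.mean((A[i, j] - D[i, j]) ** 2)

rng = np.random.default_rng(1)
A = rng.normal(size=(200, 100))
D = np.zeros_like(A)                   # stand-in approximation
est = sampled_sq_error(A, D, 5000, rng)
true = np.linalg.norm(A - D, 'fro') ** 2
```

Only 5000 entries are touched instead of all 20000 × m, which is what makes the check cheap enough to run after each sampling round.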

  15. Preliminary Results • Generated 1000 × 1000 random matrices with certain singular-value distributions • Distributions defined by q: the fraction of the Frobenius norm contained in the first k singular values • Checked the number of columns of A necessary to achieve a 3% error bound (ε = 0.03)

  16. Preliminary Results

  17. Conclusion • Useful new definition of clusters • Good (linear in m) running time to approximate CCP • Forms 2-approximation for DCP • A new use for the SVD
