
Lecture 3-4: Clustering (1hr), Gaussian Mixture and EM (1hr)

Tae-Kyun Kim



Presentation Transcript


  1. Lecture 3-4: Clustering (1hr), Gaussian Mixture and EM (1hr). Tae-Kyun Kim

  2. Vector Clustering. 2D data vectors (green) are grouped into two homogeneous clusters (blue and red). Clustering is achieved by an iterative algorithm (left to right). The cluster centres are marked x.

  3. Pixel Clustering (Image Quantisation). Image pixels are represented as 3D vectors of R, G, B values. The vectors are grouped into K = 10, 3, 2 clusters, and each pixel is represented by the mean value of its cluster.

  4. Patch Clustering (BoW in Lecture 9-10). Image patches are harvested around feature points in a large number of images. They are represented by finite-dimensional vectors and clustered to form a visual dictionary. (Figure: 20×20 patches described by SIFT or raw pixels, dimension D = 400, clustered into K codewords.)

  5. Image Clustering. Whole images are represented as finite-dimensional vectors. Homogeneous vectors are grouped together in Euclidean space.

  6. K-means vs GMM. Two representative techniques are K-means and the Gaussian Mixture Model (GMM). K-means assigns each data point to its nearest cluster, while a GMM assigns data to the Gaussian densities that best represent them. Hard clustering: a data point is assigned to exactly one cluster. Soft clustering: a data point is assigned to multiple Gaussians probabilistically.
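The hard/soft distinction can be illustrated with a small sketch. The toy data, centres, and Gaussian parameters below are illustrative assumptions (not from the lecture), and numpy is used in place of the Matlab toolbox referenced later:

```python
import numpy as np

# Toy 1-D data: two loose groups around 0 and 10 (hypothetical values).
x = np.array([0.0, 0.5, 1.0, 9.0, 9.5, 10.0])

# Hard clustering (K-means style): each point belongs to exactly one
# cluster, the one with the nearest centre.
centres = np.array([0.5, 9.5])
hard = np.argmin(np.abs(x[:, None] - centres[None, :]), axis=1)

# Soft clustering (GMM style): each point gets a responsibility for
# every Gaussian, proportional to pi_k * N(x | mu_k, sigma_k^2).
pi = np.array([0.5, 0.5])      # mixing weights
mu = np.array([0.5, 9.5])      # means
sigma = np.array([1.0, 1.0])   # standard deviations

dens = pi * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
resp = dens / dens.sum(axis=1, keepdims=True)  # responsibilities sum to 1 per point

print(hard)           # one hard label per point
print(resp.round(3))  # soft memberships per point
```

Points far from a cluster boundary get responsibilities near 0 or 1, so for well-separated data the soft assignments approach the hard ones.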

  7. Matrix and Vector Derivatives
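The slide's equations are not preserved in the transcript; the standard vector-derivative identities typically used for the K-means and maximum-likelihood derivations that follow are:

```latex
\frac{\partial}{\partial \mathbf{x}}\,\mathbf{a}^{\top}\mathbf{x} = \mathbf{a},
\qquad
\frac{\partial}{\partial \mathbf{x}}\,\mathbf{x}^{\top}\mathbf{A}\mathbf{x}
  = (\mathbf{A} + \mathbf{A}^{\top})\,\mathbf{x},
\qquad
\frac{\partial}{\partial \mathbf{x}}\,\|\mathbf{x} - \boldsymbol{\mu}\|^{2}
  = 2\,(\mathbf{x} - \boldsymbol{\mu})
```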

  8. K-means Clustering

  9. Alternate the two steps until convergence: the assignment step sets r_nk = 1 for the nearest cluster centre (0 otherwise), and the update step recomputes each centre μ_k as the mean of the points assigned to it.
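This assign-then-update loop can be sketched in numpy (a minimal sketch assuming Euclidean distance; the function name and initialisation scheme are illustrative, not from the lecture):

```python
import numpy as np

def kmeans(X, K, n_iter=100, seed=0):
    """Plain K-means: alternate assignment and mean-update until converged."""
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(len(X), K, replace=False)]  # centres from random points
    for _ in range(n_iter):
        # Assignment step: r_n = index of the nearest centre.
        d = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
        r = d.argmin(axis=1)
        # Update step: each centre becomes the mean of its assigned points
        # (an empty cluster keeps its previous centre).
        new_mu = np.array([X[r == k].mean(axis=0) if np.any(r == k) else mu[k]
                           for k in range(K)])
        if np.allclose(new_mu, mu):  # no centre moved: converged
            break
        mu = new_mu
    return mu, r
```

Calling `kmeans(X, K=2)` on an (N, D) array returns the K centres and one hard label per point.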

  10. (Figure: K-means iterations on 2D data with K = 2, showing the assignments r_nk and the two means μ_1, μ_2.)

  11. Convergence proof: yes, K-means is guaranteed to converge, since each step decreases the objective. Global minimum: no; the result depends on the initialisation.


  13. Statistical Pattern Recognition Toolbox for Matlab http://cmp.felk.cvut.cz/cmp/software/stprtool/ …\stprtool\probab\cmeans.m …\stprtool\probab\cmeans_tk.m

  14. Mixture of Gaussians
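The slide's formulas are not in the transcript; the standard form of the mixture density is:

```latex
p(\mathbf{x}) = \sum_{k=1}^{K} \pi_k \,
  \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k),
\qquad
\sum_{k=1}^{K} \pi_k = 1, \quad 0 \le \pi_k \le 1,
```

where π_k are the mixing weights and each component is a Gaussian with mean μ_k and covariance Σ_k.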

  15. Maximum Likelihood. Maximise the log-likelihood ln p(X) = Σ_n ln Σ_k π_k N(x_n | μ_k, Σ_k) with respect to {π_k, μ_k, Σ_k}, subject to Σ_k π_k = 1.

  16. Lagrange multipliers: for an objective function f(x) and a constraint g(x) = 0, the problem max f(x) s.t. g(x) = 0 is solved by maximising the Lagrangian f(x) + λ g(x). http://en.wikipedia.org/wiki/Lagrange_multiplier

  17. Repeat the E-step (compute the responsibilities γ_nk) and the M-step (re-estimate π_k, μ_k, Σ_k from the weighted data) until the log-likelihood converges.

  18. Statistical Pattern Recognition Toolbox for Matlab http://cmp.felk.cvut.cz/cmp/software/stprtool/ …\stprtool\visual\pgmm.m …\stprtool\demos\demo_emgmm.m

  19. Supplementary Material

  20. Information Theory (for Lecture 7-8)

  21. Advanced topic (optional) http://www.iis.ee.ic.ac.uk/~tkkim/mlcv/lecture_clustering_em.pdf

  22. EM Algorithm in General
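The slide's derivation is not in the transcript; the standard general form of EM, with latent variables Z and parameters θ, alternates:

```latex
\text{E-step:}\quad
Q(\boldsymbol{\theta}, \boldsymbol{\theta}^{\text{old}})
  = \sum_{\mathbf{Z}} p(\mathbf{Z} \mid \mathbf{X}, \boldsymbol{\theta}^{\text{old}})
    \ln p(\mathbf{X}, \mathbf{Z} \mid \boldsymbol{\theta}),
\qquad
\text{M-step:}\quad
\boldsymbol{\theta}^{\text{new}}
  = \arg\max_{\boldsymbol{\theta}} Q(\boldsymbol{\theta}, \boldsymbol{\theta}^{\text{old}})
```

Each iteration cannot decrease the log-likelihood ln p(X | θ); the GMM updates above are the special case where Z is the component label.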
