
KCK-means: A Clustering Method based on Kernel Canonical Correlation Analysis


Presentation Transcript


  1. KCK-means: A Clustering Method based on Kernel Canonical Correlation Analysis Dr. Yingjie Tian

  2. Outline • Motivation & Challenges • KCCA, Kernel Canonical Correlation Analysis • Our method: KCK-means • Experiments • Conclusions

  3. Outline • Motivation & Challenges • KCCA, Kernel Canonical Correlation Analysis • Our method: KCK-means • Experiments • Conclusions

  4. Motivation • Previous Similarity Metrics • Euclidean distance • Squared Mahalanobis distance • Mutual neighbor distance • … • These fail when there are non-linear correlations between attributes

  5. Motivation • In some interesting application domains, attributes can be naturally split into two subsets, either of which suffices for learning • Intuitively, there may be some projections that can reveal the ground truth in these two views • KCCA is a technique that can extract common features from a pair of multivariate data • It is the most promising candidate

  6. Outline • Motivation & Challenges • KCCA, Kernel Canonical Correlation Analysis • Our method: KCK-means • Experiments • Conclusions

  7. Canonical Correlation Analysis (1/2) • X = {x1, x2, …, xl} and Y = {y1, y2, …, yl} denote two views • CCA finds projection vectors wx and wy that maximize the correlation coefficient between $w_x^T X$ and $w_y^T Y$ • That is: $\rho = \max_{w_x, w_y} \frac{w_x^T C_{xy} w_y}{\sqrt{(w_x^T C_{xx} w_x)\,(w_y^T C_{yy} w_y)}}$ • Cxy is the between-sets covariance matrix of X and Y; Cxx and Cyy are the within-sets covariance matrices.

  8. Canonical Correlation Analysis (2/2) • If Cyy is invertible, the problem reduces to the generalized eigenproblem $C_{xy} C_{yy}^{-1} C_{yx} w_x = \lambda^2 C_{xx} w_x$ • Solving for the generalized eigenvectors gives the sequence of wx's • The corresponding wy's are then found by using $w_y = \frac{1}{\lambda} C_{yy}^{-1} C_{yx} w_x$
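A minimal NumPy sketch of slides 7-8, assuming centred data and invertible covariance matrices; the function and variable names are illustrative, not the paper's code.

```python
import numpy as np

def cca(X, Y, n_components=1):
    """Linear CCA for paired views X (l, dx) and Y (l, dy)."""
    Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
    l = X.shape[0]
    Cxx = Xc.T @ Xc / l        # within-set covariance of X
    Cyy = Yc.T @ Yc / l        # within-set covariance of Y
    Cxy = Xc.T @ Yc / l        # between-sets covariance
    # Reduce the generalized eigenproblem Cxy Cyy^{-1} Cyx wx = lambda^2 Cxx wx
    # to a standard one by multiplying through with Cxx^{-1}.
    M = np.linalg.solve(Cxx, Cxy @ np.linalg.solve(Cyy, Cxy.T))
    eigvals, eigvecs = np.linalg.eig(M)
    order = np.argsort(-eigvals.real)[:n_components]
    lams = np.sqrt(np.clip(eigvals.real[order], 1e-12, None))
    Wx = eigvecs.real[:, order]
    Wy = np.linalg.solve(Cyy, Cxy.T @ Wx) / lams   # wy = Cyy^{-1} Cyx wx / lambda
    return Wx, Wy, lams
```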

  9. Why Kernel CCA • Why use a kernel extension of CCA? • CCA may not extract useful descriptors of the data because of its linearity • The kernel extension finds nonlinearly correlated projections • With a feature map φ, KCCA maps xi and yi to φ(xi) and φ(yi), giving $S_x = \{\phi(x_1), \ldots, \phi(x_l)\}$ and $S_y = \{\phi(y_1), \ldots, \phi(y_l)\}$ • Then φ(xi) and φ(yi) are treated as instances on which to run the CCA routine.

  10. KCCA • Objective function: $\rho = \max_{\alpha, \beta} \frac{\alpha^T K_x K_y \beta}{\sqrt{(\alpha^T K_x^2 \alpha)\,(\beta^T K_y^2 \beta)}}$ where α and β are the two desired projection coefficient vectors, and $K_x = (k(x_i, x_j))_{l \times l}$ and $K_y = (k(y_i, y_j))_{l \times l}$ are the two kernel matrices • We use Partial Gram-Schmidt Orthogonalisation (PGSO) to approximate the kernel matrices

  11. How to solve KCCA • With a regularization constant κ, α can be solved from the eigenproblem $(K_x + \kappa I)^{-1} K_y (K_y + \kappa I)^{-1} K_x \alpha = \lambda^2 \alpha$, where κ is used for regularization • β can then be obtained from $\beta = \frac{1}{\lambda} (K_y + \kappa I)^{-1} K_x \alpha$ • A number of α's and β's (and corresponding λ's) can be found
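A companion sketch of the regularized KCCA solution on slides 10-11, using full kernel matrices instead of the PGSO approximation for clarity. The Gaussian kernel choice and the value of κ are assumptions; the eigenproblem follows the standard regularized formulation.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian kernel matrix; the kernel choice here is an assumption."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def kcca(Kx, Ky, kappa=0.1, n_components=2):
    """Solve regularized KCCA for kernel matrices Kx, Ky (both l x l)."""
    l = Kx.shape[0]
    I = np.eye(l)
    # alpha solves (Kx + kappa I)^{-1} Ky (Ky + kappa I)^{-1} Kx alpha = lambda^2 alpha
    M = np.linalg.solve(Kx + kappa * I, Ky @ np.linalg.solve(Ky + kappa * I, Kx))
    eigvals, eigvecs = np.linalg.eig(M)
    order = np.argsort(-eigvals.real)[:n_components]
    lams = np.sqrt(np.clip(eigvals.real[order], 1e-12, None))
    alphas = eigvecs.real[:, order]
    # beta = (Ky + kappa I)^{-1} Kx alpha / lambda  (slide 11)
    betas = np.linalg.solve(Ky + kappa * I, Kx @ alphas) / lams
    return alphas, betas, lams
```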

  12. Outline • Motivation & Challenges • KCCA, Kernel Canonical Correlation Analysis • Our method: KCK-means • Experiments • Conclusions

  13. Project into ground truth • Two kernel functions are defined as $K_x(x_i, x_j) = \langle \phi(x_i), \phi(x_j) \rangle$ and $K_y(y_i, y_j) = \langle \phi(y_i), \phi(y_j) \rangle$ • For any x* and y*, their projections can be obtained by $P(x^*) = K_x(x^*, X)\,\alpha$ and $P(y^*) = K_y(y^*, Y)\,\beta$ for the two views respectively
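Slide 13 in code form, reusing rbf_kernel and the alphas returned by the kcca sketch above: P(x*) is the row vector of kernel evaluations between x* and the training view, multiplied by the projection coefficients.

```python
import numpy as np

def project(x_new, X_train, alphas, gamma=1.0):
    """P(x*) = Kx(x*, X) alpha -- one coordinate per correlated projection pair."""
    k = rbf_kernel(np.atleast_2d(x_new), X_train, gamma)  # shape (1, l)
    return (k @ alphas).ravel()
```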

  14. Why use other pairs of projections? • According to (Zhou, Z.-H., et al.), if the two views are conditionally independent given the class label, the projections corresponding to the biggest α and β should be in accordance with the ground truth • However, in the real world such conditional independence rarely holds, and the information conveyed by the other pairs of correlated projections should not be omitted

  15. Similarity measure based on KCCA • The similarity measure f_sim combines the distance between the original instances with the distance between their KCCA projections • μ is a parameter which regulates the proportion of the distance between the original instances and the distance between their projections
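The transcript drops the f_sim formula itself, so what follows is only a guessed reconstruction: a μ-weighted combination of the original-space distance and the projection-space distance, consistent with slide 25's remark that μ = 0 (projections only) performs best. Treat the exact form as an assumption, not the paper's equation.

```python
import numpy as np

def f_sim(x_i, x_j, p_i, p_j, mu=0.0):
    """Assumed form: mu weighs the original-instance distance against the
    distance between the KCCA projections p_i, p_j of the two instances."""
    d_orig = np.linalg.norm(x_i - x_j)   # distance in the original space
    d_proj = np.linalg.norm(p_i - p_j)   # distance between KCCA projections
    return mu * d_orig + (1.0 - mu) * d_proj
```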

  16. KCK-means for 2-views • Our method is built on K-means • In fact, we simply extend K-means by adding the computation of f_sim
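A Lloyd-style sketch of the two-view extension on slide 16: ordinary K-means in which the point-to-centroid distance is the f_sim combination above, computed over instances concatenated with their KCCA projections. The concatenated representation is an illustrative choice, not necessarily the paper's exact construction.

```python
import numpy as np

def kck_means(X, P, k, mu=0.0, n_iter=100, seed=0):
    """X: original instances (l, d); P: their KCCA projections (l, p)."""
    rng = np.random.default_rng(seed)
    Z = np.hstack([X, P])                  # instance + projection, per row
    d = X.shape[1]
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(n_iter):
        # f_sim-style distance of every point to every centre
        d_orig = np.linalg.norm(Z[:, None, :d] - centers[None, :, :d], axis=-1)
        d_proj = np.linalg.norm(Z[:, None, d:] - centers[None, :, d:], axis=-1)
        labels = np.argmin(mu * d_orig + (1.0 - mu) * d_proj, axis=1)
        new_centers = np.array([Z[labels == c].mean(axis=0) if (labels == c).any()
                                else centers[c] for c in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels
```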

  17. KCK-means for 1-view • However, two-view data sets are rare in the real world • (Nigam, K., et al.) point out that if there is sufficient redundancy among the features, a fairly reasonable division of them can be identified • Similarly, we randomly split a 1-view data set into two parts and treat them as the two views of the original data set in order to perform KCK-means.
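A sketch of the random feature split slide 17 describes for single-view data; the half-and-half split is an assumption, since the slide only says "randomly split into two parts".

```python
import numpy as np

def random_split(X, seed=0):
    """Split the attribute set of X (l, d) into two disjoint pseudo-views."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(X.shape[1])
    half = X.shape[1] // 2
    return X[:, idx[:half]], X[:, idx[half:]]
```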

  18. Outline • Motivation & Challenges • KCCA, Kernel Canonical Correlation Analysis • Our method: KCK-means • Experiments • Conclusions

  19. Evaluation Metrics • Pair-Precision • Mutual Information • Intuitive-Precision

  20. Results on 2-views and 1-views

  21. Influence of η • There is a precision parameter (or stopping criterion) η in the PGSO algorithm • The dimensionality of the projections depends on η • We also investigate its influence on the performance of KCK-means
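PGSO amounts to a pivoted incomplete Cholesky factorisation K ≈ GGᵀ. A generic sketch follows, with η as the stopping criterion: the loop ends once the residual trace drops below η, so a smaller η yields more columns, i.e. higher-dimensional projections. This is a textbook version, not necessarily the exact routine the paper uses.

```python
import numpy as np

def pgso(K, eta=1e-3):
    """Pivoted incomplete Cholesky of a kernel matrix K, stopped by eta."""
    l = K.shape[0]
    G = np.zeros((l, l))
    d = np.diag(K).astype(float).copy()       # residual diagonal of K - G G^T
    for j in range(l):
        if d.sum() < eta:                     # precision / stopping criterion
            return G[:, :j]                   # j columns = projection dimension
        i = int(np.argmax(d))                 # pivot on the largest residual
        G[:, j] = (K[:, i] - G[:, :j] @ G[i, :j]) / np.sqrt(d[i])
        d = np.clip(d - G[:, j] ** 2, 0.0, None)
    return G
```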

  22. Influence of η(2-views)

  23. Influence of η(1-view)

  24. Outline • Motivation & Challenges • KCCA, Kernel Canonical Correlation Analysis • Our method: KCK-means • Experiments • Conclusions

  25. Conclusions (1/2) • The results show that KCK-means obtains clusters of much better quality than K-means and agglomerative hierarchical clustering • We also note that the performance of KCK-means is best when μ is set very small or even to zero • This means that, using the projections obtained from KCCA, the similarity between instances can already be measured well enough

  26. Conclusions (2/2) • However, when the number of dimensions of the projections obtained from KCCA is very small, the performance of KCK-means degrades sharply, even below that of the two traditional clustering algorithms • This means that, in real-world applications, the information conveyed by the other pairs of correlated projections should also be considered • All in all, the number of projection dimensions used in KCK-means must be large enough

  27. Thank You !
