Learning multiple nonredundant clusterings


### Learning multiple nonredundant clusterings

Presenter: Wei-Hao Huang

Authors: Ying Cui, Xiaoli Z. Fern, Jennifer G. Dy

TKDD, 2010

Outline
• Motivation
• Objectives
• Methodology
• Experiments
• Conclusions
Motivation
• Data often contain multiple groupings that are reasonable and interesting from different perspectives.
• Traditional clustering is restricted to finding only a single clustering.
Objectives
• To propose a new clustering paradigm for finding all non-redundant clustering solutions of the data.
Methodology
• Orthogonal clustering
• Cluster space
• Clustering in orthogonal subspaces
• Feature space
• Automatically finding the number of clusters
• Stopping criteria
Orthogonal clustering

• After clustering, project each point onto the residue space, i.e., the space orthogonal to its cluster centroid: x ← (I − μμT/(μTμ)) x
• Cluster again in the residue space to obtain the next, non-redundant clustering
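The residue-space idea above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name, the use of scikit-learn's KMeans, and the default parameters are all my assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def orthogonal_clustering(X, n_clusters, n_views=2, seed=0):
    """Sketch of Method 1 (orthogonal clustering): cluster, project each
    point onto the space orthogonal to its cluster centroid (the residue
    space), then cluster again there for the next view."""
    Xt = np.asarray(X, dtype=float).copy()
    views = []
    for _ in range(n_views):
        km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(Xt)
        views.append(km.labels_)
        # Residue projection: x <- (I - mu mu^T / mu^T mu) x, with mu the
        # centroid of the cluster that x belongs to
        for j, mu in enumerate(km.cluster_centers_):
            mask = km.labels_ == j
            denom = float(mu @ mu)
            if denom > 0.0:
                Xt[mask] = Xt[mask] - np.outer(Xt[mask] @ mu, mu) / denom
    return views
```

Each returned label vector is one clustering view; later views are found in data from which the earlier cluster structure has been projected out.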

Clustering in orthogonal subspaces

Projection Y = ATX, where the columns of A span the feature subspace capturing the current cluster structure

• Feature space can be found by
• linear discriminant analysis (LDA)
• singular value decomposition (SVD) of the cluster centroid matrix M = [μ1, …, μk]
• LDA vs. SVD: both identify the subspace spanned by the current clustering's discriminative directions
Clustering in orthogonal subspaces

A(t) = eigenvectors of M(t)M(t)T, where M(t) is the centroid matrix of the clustering at iteration t; the data are then projected onto the subspace orthogonal to these directions

Residue space

Comparing Method 1 and Method 2

• Residue space
• Method 1: project each point orthogonal to its own cluster centroid
• Method 2: project all points orthogonal to the span of all cluster centroids
• If M’ = M then P1 = P2, i.e., Method 1 is a special case of Method 2.
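Method 2's projection can be sketched with the standard orthogonal-projector formula P = I − M(MTM)⁻¹MT, applied to the whole data set after each clustering. As before, this is an illustrative sketch: the function name and the choice of scikit-learn's KMeans are my assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def orthogonal_subspace_clustering(X, n_clusters, n_views=2, seed=0):
    """Sketch of Method 2: after each clustering, project all points onto
    the subspace orthogonal to the span of the cluster centroids, using
    P = I - M (M^T M)^+ M^T with M = [mu_1 ... mu_k]."""
    Xt = np.asarray(X, dtype=float).copy()
    d = Xt.shape[1]
    views = []
    for _ in range(n_views):
        km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(Xt)
        views.append(km.labels_)
        M = km.cluster_centers_.T                          # d x k centroid matrix
        P = np.eye(d) - M @ np.linalg.pinv(M.T @ M) @ M.T  # orthogonal projector
        Xt = Xt @ P                                        # P is symmetric
    return views
```

Because the projector only needs cluster centroids, any clustering algorithm that produces centroids (and any dimensionality-reduction step) can be slotted in, which is the flexibility the conclusions attribute to Method 2.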
Experiments
• Use PCA to reduce dimensionality
• Clustering
• K-means clustering: keep the run with the smallest SSE
• Gaussian mixture model clustering (GMM): keep the run with the largest likelihood
• Dataset
• Synthetic
• Real-world
• Face, WebKB text, Vowel phoneme, Digit
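The best-of-several-restarts setup on this slide might look like the following sketch. The data here are synthetic stand-ins, not the paper's datasets, and the restart counts are arbitrary.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

# Illustrative two-cluster data (not the paper's Face/WebKB/Vowel/Digit sets)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (50, 4)),
               rng.normal(6.0, 1.0, (50, 4))])

# K-means: among several restarts, keep the run with the smallest SSE (inertia_)
km_runs = [KMeans(n_clusters=2, n_init=1, random_state=s).fit(X) for s in range(5)]
best_km = min(km_runs, key=lambda km: km.inertia_)

# GMM: among several restarts, keep the run with the largest mean log-likelihood
gmm_runs = [GaussianMixture(n_components=2, random_state=s).fit(X) for s in range(5)]
best_gmm = max(gmm_runs, key=lambda g: g.score(X))
```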
Experiments

Evaluation

Experiments

Synthetic

Experiments

Face dataset

Experiments

WebKB dataset

Vowel phoneme dataset

Experiments

Digit dataset

Experiments
• Finding the number of clusters
• K-means → Gap statistics
Experiments
• Finding the number of clusters
• GMMBIC
• Stopping Criteria
• SSE is less than 10% at first iteration
• Kopt=1
• Kopt> Kmax Select Kmax
• Gap statistics
• BIC Maximize value of BIC
Experiments

Synthetic dataset

Experiments

Face dataset

Experiments

WebKB dataset

Conclusions
• The proposed paradigm discovers multiple interesting and meaningful clustering solutions.
• Method 2 can be combined with any clustering and dimensionality-reduction algorithm.