A k-means type clustering algorithm for subspace clustering of mixed numeric and categorical datasets • Presenter: Keng-Yu Lin • Authors: Amir Ahmad, Lipika Dey • PRL, 2011
Outline • Motivation • Objectives • Methodology • Experiments • Conclusions • Comments
Motivation Almost all subspace clustering algorithms proposed so far are designed for numeric datasets.
Objectives • This paper presents a k-means type clustering algorithm that finds clusters in data subspaces of mixed numeric and categorical datasets.
Methodology • k-means clustering algorithm
1. Place K points into the space represented by the objects that are being clustered. These points represent the initial group centroids.
2. Assign each object to the group that has the closest centroid.
3. When all objects have been assigned, recalculate the positions of the K centroids.
4. Repeat Steps 2 and 3 until the centroids no longer move. This produces a separation of the objects into groups from which the metric to be minimized can be calculated.
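The four k-means steps above can be sketched in a few lines of Python. This is a minimal illustration of plain k-means on numeric tuples with squared Euclidean distance; it is not the subspace algorithm for mixed data presented in the paper. The function names (`kmeans`, `assign`, `squared_distance`), the `max_iter` and `seed` parameters, and the choice to initialise centroids by sampling K of the objects are all assumptions made for this example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two numeric tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Step 2: index of the closest centroid for every object."""
    return [
        min(range(len(centroids)), key=lambda j: squared_distance(p, centroids[j]))
        for p in points
    ]


def kmeans(points, k, max_iter=100, seed=0):
    """Plain k-means on a list of numeric tuples (illustrative sketch only)."""
    rng = random.Random(seed)
    # Step 1: place K initial centroids (assumption: sample K of the objects).
    centroids = [tuple(p) for p in rng.sample(points, k)]

    for _ in range(max_iter):
        labels = assign(points, centroids)

        # Step 3: recalculate each centroid as the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                # Assumption: an empty group keeps its previous centroid.
                new_centroids.append(centroids[j])

        # Step 4: stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids

    labels = assign(points, centroids)
    # The metric being minimized: within-group sum of squared distances.
    cost = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, cost


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    labels, centroids, cost = kmeans(data, k=2)
    print(labels, centroids, cost)
```

Handling categorical attributes and per-cluster attribute weights, as the paper's method requires, would need a different distance measure and centroid definition than the numeric mean used here.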
Experiments • Vote dataset • Error rate: 4.8% • Zaki et al. error rate: 3.8%
Experiments • Mushroom dataset • Error rate: 4.1% • Zaki et al. error rate: 0.3%
Experiments • DNA dataset • Error rate: 17%
Experiments • Australian credit data • Error rate: 13.9% • Huang et al. (2005) error rate: 15%
Conclusions This paper presented a k-means type algorithm for subspace clustering of mixed numeric and categorical data.
Comments • Advantage • Applications • Subspace clustering.