
K-means*: Clustering by Gradual Data Transformation

K-means*: Clustering by Gradual Data Transformation. Mikko Malinen and Pasi Fränti, Speech and Image Processing Unit, School of Computing, University of Eastern Finland. K-means* clusters by gradual transformation of the data: instead of fitting the model to the data, the data is fitted to a model (model → intermediate → final data).




Presentation Transcript


  1. K-means*: Clustering by Gradual Data Transformation. Mikko Malinen and Pasi Fränti, Speech and Image Processing Unit, School of Computing, University of Eastern Finland.

  2. K-means* clustering. Gradual transformation of data: instead of fitting the model to the data, fit the data to a model. [Diagram: model → intermediate → final data]

  3. K-means clustering. Iterate between two steps: 1. Assignment step: assign each point to its nearest centroid. 2. Update step: update each centroid to the mean of the points assigned to it.
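The two alternating steps above (standard Lloyd's algorithm) can be sketched in NumPy; the function and parameter names here are illustrative, not taken from the slides:

```python
import numpy as np

def kmeans(points, k, n_iter=100, seed=0):
    """Plain k-means (Lloyd's algorithm): alternate assignment and update."""
    rng = np.random.default_rng(seed)
    # Initialize centroids by picking k distinct data points at random.
    centroids = points[rng.choice(len(points), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # 1. Assignment step: assign each point to its nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # 2. Update step: move each centroid to the mean of its assigned points.
        # An empty cluster keeps its old centroid (a simple guard, see slide 17).
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break  # converged: the assignment no longer changes
        centroids = new_centroids
    return centroids, labels
```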

  4. K-means* clustering
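The gradual-transformation idea can be sketched as follows. This is a hedged illustration, not the paper's exact procedure: the initial artificial model (here, all points collapsed to the data mean) and the step schedule are assumptions, and the paper's actual model structure may differ.

```python
import numpy as np

def lloyd(points, k, centroids, n_iter=20):
    """One k-means run (assignment + update) from given initial centroids."""
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
    return centroids, labels

def kmeans_star(points, k, n_steps=10, seed=0):
    """Sketch of k-means*: move the data from a model toward its true
    locations in small steps, running k-means after each step and
    reusing the previous centroids as the starting point."""
    # Assumed artificial model: every point starts at the data mean.
    model = np.tile(points.mean(axis=0), (len(points), 1))
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)].astype(float)
    labels = None
    for step in range(1, n_steps + 1):
        t = step / n_steps                       # transformation progress: 0% -> 100%
        current = (1 - t) * model + t * points   # intermediate dataset
        centroids, labels = lloyd(current, k, centroids)
    return centroids, labels
```

Reusing the centroids between steps is what makes the gradual schedule useful: each k-means run starts from a solution that was already good for a slightly easier dataset.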

  5. Example of clustering (s2 dataset)

  6.–16. Gradual transformation on the s2 dataset, shown at 0%, 10%, 20%, …, 100% done (one plot per slide).

  17. Empty clusters problem
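Because the points move during the transformation, a centroid can end up with no points assigned to it. One common remedy, sketched below, is to relocate each empty cluster's centroid to the point farthest from its current centroid; this is a standard fix, not necessarily the solution used in the paper:

```python
import numpy as np

def fix_empty_clusters(points, centroids, labels):
    """Relocate each empty cluster's centroid to the point farthest
    from its assigned centroid, and reassign that point."""
    k = len(centroids)
    for j in range(k):
        if not np.any(labels == j):
            # Distance from every point to the centroid it is assigned to.
            dists = np.linalg.norm(points - centroids[labels], axis=1)
            far = dists.argmax()
            centroids[j] = points[far]
            labels[far] = j
    return centroids, labels
```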

  18. Time Complexity

  19. Time Complexity: fixed k-means

  20. Datasets

  Dataset   d    n      k
  s1        2    5000   15
  s2        2    5000   15
  s3        2    5000   15
  s4        2    5000   15
  bridge    16   4096   256
  missa     16   6480   256
  house     3    34000  256
  thyroid   5    215    2
  iris      4    150    2
  wine      13   178    3

  21. Mean square error
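The slides do not spell the error measure out, but the mean square error of a clustering is conventionally defined (a standard definition, assumed here) as

```latex
\mathrm{MSE} = \frac{1}{n\,d} \sum_{i=1}^{n} \lVert x_i - c_{p(i)} \rVert^2
```

where the $x_i$ are the $n$ data points of dimension $d$, the $c_j$ are the centroids, and $p(i)$ is the index of the centroid assigned to point $i$.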

  22.–28. Mean square error vs. number of steps (a series of result plots, one per slide).

  29.–32. Number of incorrect clusters

  Incorrect clusters   Proposed   K-means
  0 (all correct)      36%        14%
  1                    64%        38%
  2                     0%        34%
  3                     0%        10%

  33. Summary. We have presented a clustering method based on gradual transformation of the data and k-means. Instead of fitting the model to the data, we fit the data to a model. The proposed method achieves a lower mean square error than k-means.
