Manifold Clustering of Shapes
Dragomir Yankov, Eamonn Keogh
Dept. of Computer Science & Eng.
University of California, Riverside
Outline
• Problem formulation
• Shape space representation. Similarity metric.
• Manifold clustering of shapes
• Handling noisy and bridged clusters
• Experimental evaluation
Problem formulation
• Object recognition systems depend heavily on the accurate identification of shapes
• Learning the shapes without supervision is essential when large image collections are available
• In this work we propose a robust approach for clustering 2D shapes
*The malaria images are part of the Hoslink medical databank, and the diatom images are part of the collection used in the ADIAC project.
Data representation
• Requirements
  • invariant to basic geometric transformations
  • handle limited rotations
  • low dimensionality for meaningful clustering
• Centroid-based “time series” representation
• All extracted time series are further standardised and resampled to the same length
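The centroid-based representation above can be sketched as follows. This is a minimal illustration, not the authors' extraction code: the function name `centroid_series`, the use of the mean of the contour points as the centroid, and linear interpolation for resampling are all assumptions made for the example.

```python
import math

def centroid_series(contour, length=64):
    """Map a closed 2D contour (list of (x, y) points, in traversal order)
    to a fixed-length, standardised centroid-distance series.
    Hypothetical helper: the name and resampling scheme are assumptions."""
    n = len(contour)
    # Assumption: the centroid is the mean of the contour points.
    cx = sum(x for x, _ in contour) / n
    cy = sum(y for _, y in contour) / n
    # Distance from the centroid to each contour point.
    dist = [math.hypot(x - cx, y - cy) for x, y in contour]
    # Resample to `length` values by linear interpolation around the closed curve.
    resampled = []
    for i in range(length):
        pos = i * n / length
        lo = int(pos) % n
        hi = (lo + 1) % n
        frac = pos - int(pos)
        resampled.append((1 - frac) * dist[lo] + frac * dist[hi])
    # Standardise to zero mean and unit variance (a flat series maps to zeros).
    mean = sum(resampled) / length
    std = math.sqrt(sum((v - mean) ** 2 for v in resampled) / length)
    return [(v - mean) / std if std > 0 else 0.0 for v in resampled]
```

Measuring distances from the centroid removes the effect of translation, and standardising removes the effect of uniform scaling, so a shifted and enlarged copy of a contour should map to the same series. Rotation is not handled at this stage; it shows up as a circular shift of the series.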
Measuring shape similarity
• The Euclidean distance does not capture the real similarities
• Rotationally invariant distance rd
  • approximate rotations as:
  • and define:
• Metric properties of rd
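The formulas on this slide did not survive extraction, so the sketch below is only one plausible reading: it assumes that a rotation of the shape is approximated by a circular shift of its time series, and that rd is the smallest Euclidean distance over the allowed shifts. The `max_shift` parameter is a hypothetical way to model "limited rotations" and is not taken from the slides.

```python
import math

def euclidean(a, b):
    """Plain Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rd(a, b, max_shift=None):
    """Assumed form of a rotationally invariant distance:
    the minimum Euclidean distance between `a` and circular shifts of `b`.
    With max_shift=None every shift is tried; otherwise only shifts
    in [-max_shift, max_shift] are considered (hypothetical parameter)."""
    n = len(b)
    if len(a) != n:
        raise ValueError("series must have the same length")
    shifts = range(n) if max_shift is None else range(-max_shift, max_shift + 1)
    # b[s:] + b[:s] rotates the list; negative s rotates the other way.
    return min(euclidean(a, b[s:] + b[:s]) for s in shifts)
```

Under this reading, two series that differ only by a circular shift are at distance zero even though their Euclidean distance can be large, which is the failure of the Euclidean distance that the slide points to.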
Manifold clustering of shapes
• Vision data often reside on a nonlinear embedding that linear projections fail to reconstruct
• We apply Isomap to detect the intrinsic dimensionality of the shape data
• Isomap moves different clusters further apart, preserving their convexity
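To make the Isomap step more concrete, the sketch below covers the first two stages of the standard algorithm: building a k-nearest-neighbour graph from a pairwise distance matrix and replacing the distances with shortest-path (geodesic) lengths. The final multidimensional scaling step is omitted. The function name `geodesic_distances` is an assumption for the example, and the input matrix could come from any distance, including a rotationally invariant one.

```python
import math

def geodesic_distances(dist, k):
    """Approximate geodesic distances as in the first stages of Isomap.
    dist: symmetric n x n matrix of pairwise distances.
    Returns shortest-path lengths over the symmetrised k-NN graph;
    pairs in different connected components stay at math.inf."""
    n = len(dist)
    g = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]
    # Connect every point to its k nearest neighbours (edges are undirected).
    for i in range(n):
        nearest = sorted((j for j in range(n) if j != i),
                         key=lambda j: dist[i][j])[:k]
        for j in nearest:
            g[i][j] = g[j][i] = dist[i][j]
    # Floyd-Warshall all-pairs shortest paths, O(n^3).
    for m in range(n):
        for i in range(n):
            for j in range(n):
                if g[i][m] + g[m][j] < g[i][j]:
                    g[i][j] = g[i][m] + g[m][j]
    return g
```

For points sampled along a curved arc, the geodesic distance between the two ends follows the arc and is therefore longer than the straight-line distance, which is why a projection built on these distances can separate clusters that a linear projection would overlap.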
Handling noisy and bridged clusters
• Instability of the Isomap projection (figure labels: short circuits, disconnected components)
• The degree-k-bounded minimum spanning tree (k-MST) problem
• The b-Isomap algorithm
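The degree bound can be illustrated with a greedy, Kruskal-style heuristic that skips any edge whose endpoint already has the maximum allowed degree. This is a sketch under stated assumptions, not the authors' b-Isomap procedure: finding an optimal degree-bounded spanning tree is hard in general, the greedy result is not guaranteed to be minimal, and the name `degree_bounded_tree` and parameter `b` are chosen for the example.

```python
def degree_bounded_tree(dist, b):
    """Greedy spanning tree in which no vertex has more than `b` edges.
    dist: symmetric n x n distance matrix; returns a list of (i, j, weight).
    Heuristic only: the tree is not guaranteed to have minimum total weight."""
    n = len(dist)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Consider all edges of the complete graph, shortest first.
    edges = sorted((dist[i][j], i, j) for i in range(n) for j in range(i + 1, n))
    degree = [0] * n
    tree = []
    for w, i, j in edges:
        if degree[i] >= b or degree[j] >= b:
            continue  # adding this edge would exceed the degree bound
        ri, rj = find(i), find(j)
        if ri != rj:  # the edge joins two different components
            parent[ri] = rj
            degree[i] += 1
            degree[j] += 1
            tree.append((i, j, w))
    return tree
```

The intuition for the bound is that a noisy or bridging element can no longer act as a hub connecting many points from different clusters, since it is allowed at most `b` connections in the tree.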
Experimental evaluation
• Diatom dataset
  • 4 classes
  • 2 classes (Stauroneis and Fragilaria)
Experimental evaluation
• Marine creatures
• Arrowheads
Conclusions and future work
• We presented a method for clustering shape data that is invariant to basic geometric transformations
• We demonstrated that the Isomap projection built on top of a rotationally invariant distance metric can correctly reconstruct the intrinsic nonlinear embedding in which the shape examples reside
• The degree-bounded MST modification of the Isomap algorithm can decrease the effect of bridging elements and noise in the data
• Our future efforts are targeted towards an automatic adaptive approach for combining the features of Isomap and b-Isomap
Thank you!