
Dissimilarity representation



  1. Dissimilarity representation Veronika Cheplygina (with slides by Bob, Marco and David)

  2. Representation [Figure: the pattern recognition pipeline runs from sensor to representation to generalization; objects are represented by two features, A (area) and B (perimeter)]

  3. Examples

  4. Examples

  5. Examples: the Hausdorff distance. For point sets A and B, with a ∈ A, b ∈ B and d(a,b) the Euclidean distance: D(A,B) = max_a{min_b{d(a,b)}}, D(B,A) = max_b{min_a{d(b,a)}}. In general D(A,B) ≠ D(B,A). Hausdorff distance: D_H = max{D(A,B), D(B,A)}
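
A minimal numpy sketch of this computation (the point sets A and B below are made-up toy data; scipy.spatial.distance.directed_hausdorff offers a ready-made directed variant):

```python
import numpy as np

def directed_hausdorff(A, B):
    """D(A,B) = max over a in A of the distance from a to its closest b in B."""
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)  # pairwise distances
    return d.min(axis=1).max()

def hausdorff(A, B):
    """Symmetrized Hausdorff distance D_H = max{D(A,B), D(B,A)}."""
    return max(directed_hausdorff(A, B), directed_hausdorff(B, A))

A = np.array([[0.0, 0.0], [1.0, 0.0]])
B = np.array([[0.0, 1.0], [5.0, 0.0]])
print(directed_hausdorff(A, B), directed_hausdorff(B, A))  # differ: D(A,B) != D(B,A)
print(hausdorff(A, B))
```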

  6. Examples: strings (shapes, amino acid sequences, ...). Alignment of strings X and Y: D_E(X,Y) = number of edit operations (insertions, deletions, substitutions) turning X into Y. D_E(cat, scan) = 2: cat → can → scan
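
This edit distance is the standard Levenshtein dynamic program; a self-contained sketch (illustration only, not from the slides):

```python
def edit_distance(x, y):
    """Levenshtein distance: minimum number of insertions,
    deletions and substitutions turning x into y."""
    m, n = len(x), len(y)
    # dp[i][j] = edit distance between x[:i] and y[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i            # delete all of x[:i]
    for j in range(n + 1):
        dp[0][j] = j            # insert all of y[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if x[i - 1] == y[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[m][n]

print(edit_distance("cat", "scan"))  # 2: cat -> can -> scan
```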

  7. Examples: graphs. [Figure: two attributed graphs with nodes A-F] A graph is a triple (Nodes, Edges, Attributes); we need a Distance(Graph_1, Graph_2)
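
Graph distances are typically much more expensive to compute. As one concrete, illustrative option (not named in the slides), networkx ships an exact graph edit distance, practical only for small graphs:

```python
import networkx as nx

# Two small toy graphs: a path on 4 nodes vs. a path on 5 nodes
G1 = nx.Graph([("A", "B"), ("B", "C"), ("C", "D")])
G2 = nx.Graph([("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")])

# Minimum number of node/edge edit operations turning G1 into G2
print(nx.graph_edit_distance(G1, G2))  # 2.0: add node E and edge D-E
```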

  8. k-Nearest Neighbor. Given a training set (classes A and B), the dissimilarities d_ij between all training objects, and an unlabeled object x, the rule uses only the k smallest distances in d_x, the dissimilarities from x to the training objects.
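
A minimal sketch of this rule, assuming we are handed the vector d_x of dissimilarities from x to the training objects (the toy data below is made up):

```python
import numpy as np
from collections import Counter

def knn_from_dissimilarities(d_x, train_labels, k=3):
    """Classify an unlabeled object from its dissimilarities d_x
    to all training objects: majority vote over the k nearest."""
    nearest = np.argsort(d_x)[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

d_x = np.array([0.9, 0.2, 1.5, 0.4, 0.3])  # distances from x to 5 training objects
labels = np.array(["A", "B", "A", "B", "B"])
print(knn_from_dissimilarities(d_x, labels, k=3))  # 'B'
```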

  9. Nearest Neighbor. Note that we are not using D_T, the dissimilarities between the training objects themselves! Can we use this information to do better?

  10. Kernels? • K = n × n matrix of kernel values / similarities • Used with support vector machines, BUT the optimization is only convex for positive semi-definite K [Figure: the p.s.d. kernels are only a subset of the (dis)similarity functions used in practice]
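
Whether a given similarity matrix is p.s.d. can be checked from its eigenvalues; a small numpy sketch (the matrix K is a made-up example):

```python
import numpy as np

def is_psd(K, tol=1e-10):
    """Check whether a symmetric similarity matrix K is positive
    semi-definite, i.e. all eigenvalues are >= 0 (up to tolerance)."""
    eigvals = np.linalg.eigvalsh(K)  # eigenvalues of a symmetric matrix
    return bool(np.all(eigvals >= -tol))

K = np.array([[1.0, 0.9, 0.1],
              [0.9, 1.0, 0.0],
              [0.1, 0.0, 1.0]])
print(is_psd(K), np.linalg.eigvalsh(K))
```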

  11. Alternatives for the Nearest Neighbor Rule • Embedding • Dissimilarity space Pękalska, E. and Duin, R.P.W. The dissimilarity representation for pattern recognition. World Scientific, 2005.

  12. Embedding of Dissimilarities

  13. Embedding. Given an n × n dissimilarity matrix D, is there a feature matrix X for which Dist(X,X) = D?

  14. Example: similarity of campaigns, Kiescompas.nl

  15. Classical scaling • Inner product (Gram) matrix G = XX' • Distances in D can be expressed in terms of G: D_ij² = G_ii + G_jj − 2G_ij • We can go X → G → D • We want to go the other way: D → G → X

  16. Classical scaling • Rewrite G in terms of D (assume zero-mean data): G = −½ J D⁽²⁾ J, where D⁽²⁾ holds the squared dissimilarities and J = I − (1/n)11' is the centering matrix

  17. Classical scaling • Eigendecomposition of G: G = V Λ V' • Remember that G = XX' • Therefore X = V Λ^½ • Columns of V are eigenvectors, with the corresponding eigenvalues on the diagonal of Λ

  18. Classical scaling • X is originally an n x p matrix, but X’ is an n x n matrix • n eigenvalues of G • p non-zero eigenvalues corresponding to dimensions with largest variance • n-p eigenvalues close to 0

  19. PCA • Rotate the zero-mean n × p matrix X to its principal axes, giving Z • Distances in X are preserved in Z • The configuration in d (d < p) dimensions (with largest variance) minimizes the squared reconstruction error • The classical scaling solution for EuclDist(X,X) is Z

  20. Euclidean / Non-Euclidean / Non-Metric Representation

  21. Non-Euclidean distance

  22. Non-metric distance • Live example!

  23. Non-metric distance • Live example! Is Dutch similar to German?

  24. Non-metric distance

  25. Non-metric distances: single-linkage clustering, variants of the Hausdorff distance
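
A tiny numeric illustration (made-up points) of why the single-linkage set distance is non-metric: a set B straddling A and C breaks the triangle inequality:

```python
import numpy as np

def single_linkage(A, B):
    """Minimum distance between any point of A and any point of B."""
    return np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2).min()

A = np.array([[0.0, 0.0]])
C = np.array([[2.0, 0.0]])
B = np.array([[0.1, 0.0], [1.9, 0.0]])  # one point near A, one near C

print(single_linkage(A, B))  # 0.1
print(single_linkage(B, C))  # 0.1
print(single_linkage(A, C))  # 2.0 > 0.1 + 0.1: triangle inequality violated
```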

  26. Non-metric dissimilarities. We are only given (an estimate of) D. We want to find X such that Dist(X,X) = D. If Dist is Euclidean, we can find X (up to a rotation). What if D is not Euclidean (or not even metric)?

  27. Pseudo-Euclidean Embedding • Euclidean D → positive (or zero) eigenvalues • A more positive λ_i means larger variance along that eigenvector • Further apart along a dimension with λ_i > 0 = further apart in feature space • Non-Euclidean D → p positive and q negative eigenvalues • What does it mean to be further apart along a dimension with λ_i < 0?

  28. Pseudo-Euclidean Embedding • Non-Euclidean D → p positive and q negative eigenvalues • Ignore the negative ones? • Use their absolute values? • Either way, embedding and reconstructing gives D_1 → X → D_2 with D_1 ≠ D_2 psem.m
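
A sketch of the pseudo-Euclidean embedding, using |λ_i| as suggested above and recording the sign of each eigenvalue as the signature (the non-metric D below encodes the triangle-violating single-linkage example from earlier):

```python
import numpy as np

def pseudo_euclidean_embedding(D, tol=1e-9):
    """Embed an n x n dissimilarity matrix D that may be non-Euclidean.
    Dimensions with negative eigenvalues form the 'negative' part of
    the pseudo-Euclidean space; |lambda_i| is used for the scaling."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    G = -0.5 * J @ (D ** 2) @ J                # (pseudo-)Gram matrix
    eigvals, V = np.linalg.eigh(G)
    order = np.argsort(np.abs(eigvals))[::-1]  # largest magnitude first
    eigvals, V = eigvals[order], V[:, order]
    X = V @ np.diag(np.sqrt(np.abs(eigvals)))  # scale by |eigenvalues|
    signature = np.where(eigvals > tol, 1, np.where(eigvals < -tol, -1, 0))
    return X, signature

D = np.array([[0.0, 0.1, 2.0],
              [0.1, 0.0, 0.1],
              [2.0, 0.1, 0.0]])
X, signature = pseudo_euclidean_embedding(D)
print(signature)  # [ 1 -1  0]: a negative eigenvalue, so D is non-Euclidean
```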

  29. Summary: Embedding • Find a (possibly lower-dimensional) representation X from the n × n dissimilarity matrix D • Exact reconstruction (up to a rotation) with classical scaling for Euclidean D; closely linked to PCA • D can be non-Euclidean or non-metric; this may be informative, and how best to deal with it is an open question

  30. Dissimilarity Space

  31. Dissimilarity space. [Figure: a training set (classes A and B) with three objects r1, r2, r3 selected for representation; every object, including an unlabeled one, is represented by its dissimilarities to r1, r2 and r3, e.g. r1(d1), r2(d4), r3(d7)]

  32. Selecting the representation set R • Default: R = T • Selecting R ⊂ T: random, feature selection, clustering, sparse classifiers (Friday), prototype selection
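
A minimal end-to-end sketch of the dissimilarity space, assuming numpy and scikit-learn (the data, the random choice of R, and LogisticRegression as the classifier are all illustrative choices, not from the slides):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy training set: two Gaussian classes in 2D
X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(3, 1, (20, 2))])
y = np.array(["A"] * 20 + ["B"] * 20)

# Representation set R: here 3 randomly chosen training objects
R = X[rng.choice(len(X), size=3, replace=False)]

def dissimilarity_space(X, R):
    """Represent each object by its Euclidean distances to the prototypes."""
    return np.linalg.norm(X[:, None, :] - R[None, :, :], axis=2)

# Any ordinary classifier can now be trained in this 3D space
clf = LogisticRegression().fit(dissimilarity_space(X, R), y)
x_new = np.array([[2.8, 3.1]])
print(clf.predict(dissimilarity_space(x_new, R)))  # expected: ['B']
```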

  33. Example: NIST Digits 3 and 8 Pękalska, E., Duin, R.P.W. and Paclík, P. "Prototype selection for dissimilarity-based classifiers." Pattern Recognition 39.2 (2006): 189-208.

  34. Nearest neighbor vs dissimilarity space (knndc.m, knnc.m, clevald.m)

  35. Nearest neighbor vs dissimilarity space [Figure: scatter plot of objects in the dissimilarity space, axes d(blue) and d(red)]

  36. Conclusions

  37. Conclusions • The dissimilarity representation is an alternative to features • Classifiers can be built in: (pseudo-)Euclidean spaces, by embedding; the dissimilarity space, by selecting a representation set

  38. Conclusions • Expert knowledge → more information in the dissimilarity matrix → better performance • Fewer restrictions on the dissimilarity matrix (it may be non-Euclidean / non-metric) • Use your favorite classifier
