Computational Movement Analysis Lecture 2: Clustering Joachim Gudmundsson

Presentation Transcript


  1. Computational Movement Analysis. Lecture 2: Clustering. Joachim Gudmundsson

  2. Fundamental tools: clustering Clustering: Group similar objects into clusters.

  3. Fundamental tools: clustering Clustering: Group similar (sub)curves into clusters. Similarity measure: Fréchet distance. Question: Do we need any constraints on a cluster? Constraints on subcurves in a cluster?

  4. Aim: Cluster subcurves Cluster of subcurves

  5. Subtrajectory clustering

  9. Recall: Fréchet Distance Fréchet Distance measures the similarity of two curves. Dog walking example • Person is walking his dog (person on one curve and the dog on other) • Allowed to control their speeds but not allowed to go backwards! • Fréchet distance of the curves: minimal leash length necessary for both to walk the curves from beginning to end

  10. Recall: Fréchet Distance Input: Two polygonal chains P=p1, … , pn and Q=q1, … , qm in Rd. The Fréchet distance between P and Q is: $\delta_F(P,Q) = \inf_{\alpha,\beta} \max_{t \in [0,1]} \lVert \alpha(t) - \beta(t) \rVert$, where α: [0,1] → P and β: [0,1] → Q range over all continuous non-decreasing reparametrizations. Note that α(0)=p1, α(1)=pn, β(0)=q1 and β(1)=qm. Well-suited for the comparison of curves since it takes the continuity of the curves into account.
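The definition above is over continuous reparametrizations, which takes some machinery to compute exactly. As a rough illustration only, the sketch below computes the discrete Fréchet distance, a simpler variant in which the person and the dog jump from vertex to vertex. It is a stand-in, not the continuous measure used in the lecture, and it is never smaller than the continuous distance. The function name `discrete_frechet` and the sample curves are assumptions made for this example.

```python
from math import dist, inf

def discrete_frechet(P, Q):
    """Discrete Fréchet distance between two vertex sequences (lists of tuples)."""
    n, m = len(P), len(Q)
    # ca[i][j]: shortest leash that lets both walkers reach (P[i], Q[j])
    # while only ever moving forward along their own sequence.
    ca = [[inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(P[i], Q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
                continue
            best_prev = min(
                ca[i - 1][j] if i > 0 else inf,                # only P advanced
                ca[i][j - 1] if j > 0 else inf,                # only Q advanced
                ca[i - 1][j - 1] if i > 0 and j > 0 else inf,  # both advanced
            )
            ca[i][j] = max(best_prev, d)
    return ca[n - 1][m - 1]

# Two parallel horizontal segments one unit apart (hypothetical data).
P = [(0, 0), (1, 0), (2, 0)]
Q = [(0, 1), (1, 1), (2, 1)]
print(discrete_frechet(P, Q))  # 1.0
```

The table has one entry per vertex pair, so the running time is O(mn), the same order as the free space diagram on the next slide.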

  11. Decision algorithm: compute path Algorithm: 1. Compute the free space diagram: mn cells → O(mn) time. 2. Compute a path from (q1,p1) to (qm,pn) that is non-decreasing in both x and y: build the network in O(mn) time, then find a path in O(mn) time. [Figure: free space diagram of P and Q with corners (q1,p1) and (qm,pn)]
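The two steps can be sketched for the vertex-only (discrete) variant, where each cell of the diagram is simply free or blocked. This is a simplification: the continuous algorithm tracks free intervals on cell boundaries, which the sketch does not attempt. The names `frechet_at_most`, `free` and `reach` are assumptions for illustration.

```python
from math import dist

def frechet_at_most(P, Q, r):
    """Decide whether the discrete Fréchet distance of P and Q is at most r."""
    n, m = len(P), len(Q)
    # Step 1: free space on vertex pairs, one boolean per cell, O(mn) time.
    free = [[dist(p, q) <= r for q in Q] for p in P]
    # Step 2: reach[i][j] is True when a path that never decreases in i or j
    # leads from cell (0, 0) to cell (i, j) through free cells only.
    reach = [[False] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            if not free[i][j]:
                continue
            if i == 0 and j == 0:
                reach[i][j] = True
            else:
                reach[i][j] = (
                    (i > 0 and reach[i - 1][j])
                    or (j > 0 and reach[i][j - 1])
                    or (i > 0 and j > 0 and reach[i - 1][j - 1])
                )
    return reach[n - 1][m - 1]

# Hypothetical data: two parallel segments one unit apart.
P = [(0, 0), (1, 0), (2, 0)]
Q = [(0, 1), (1, 1), (2, 1)]
print(frechet_at_most(P, Q, 1.0))  # True
print(frechet_at_most(P, Q, 0.9))  # False
```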

  12. Cluster Input: A polygonal curve T, an integer m>1 and a distance d. Cluster: m subcurves T1, … , Tm of T with distance at most d between any two subcurves. Constraints?

  13. Cluster Input: A polygonal curve T, an integer m>1 and a distance d. Cluster: m subcurves T1, … , Tm of T with distance at most d between any two subcurves. Constraint 1: subcurves are pairwise disjoint.

  14. Cluster Input: A polygonal curve T, an integer m>1 and a distance d. Cluster: m subcurves T1, … , Tm of T with distance at most d between any two subcurves. Constraint 1: subcurves are pairwise disjoint. More constraints? [Figure label: d] → infinite number of clusters

  15. Cluster Input: A polygonal curve T, an integer m>1 and a distance d. Cluster: m subcurves T1, … , Tm of T with distance at most d between any two subcurves. Constraint 1: subcurves are pairwise disjoint. Constraint 2: cluster has to be of maximal “length”. [Figure label: d] → infinite number of clusters

  16. Decision Problem Given a curve T, a subcurve cluster SC(m,l,d) of T consists of at least m subcurves T1, … , Tm of T such that: the subcurves are pairwise disjoint, the distance between any two subcurves is at most d, and at least one subcurve has length l.

  18. Decision Problem Given a curve T, a subcurve cluster SC(m,l,d) of T consists of at least m subcurves T1, … , Tm of T such that: the subcurves are pairwise disjoint, the distance between any two subcurves is at most d, and at least one subcurve has length l. The length of a subcurve cluster is assumed to be maximal.

  20. Decision Problem Given a trajectory T, a subtrajectory cluster SC(m,l,d) of T consists of at least m subtrajectories T1, … , Tm of T such that: the subtrajectories are pairwise disjoint, the distance between any two subtrajectories is at most d, and at least one subtrajectory has length l. The length of a subtrajectory cluster is assumed to be maximal.
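To make the three conditions concrete, here is a minimal sketch that checks whether a proposed set of subtrajectories forms a cluster. It rests on several assumptions that are not in the slides: subtrajectories are given as inclusive vertex index ranges `(start, end)` into T, two ranges sharing a vertex count as overlapping, the distance is the discrete Fréchet distance rather than the continuous one, and "has length l" is read as "has length at least l". All function names are hypothetical.

```python
from itertools import combinations
from math import dist, inf

def discrete_frechet(P, Q):
    """Vertex-only stand-in for the Fréchet distance (an assumption of this sketch)."""
    n, m = len(P), len(Q)
    ca = [[inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(P[i], Q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
                continue
            prev = min(
                ca[i - 1][j] if i > 0 else inf,
                ca[i][j - 1] if j > 0 else inf,
                ca[i - 1][j - 1] if i > 0 and j > 0 else inf,
            )
            ca[i][j] = max(prev, d)
    return ca[n - 1][m - 1]

def curve_length(points):
    """Total Euclidean length of a polygonal curve."""
    return sum(dist(a, b) for a, b in zip(points, points[1:]))

def is_subtrajectory_cluster(T, intervals, m, l, d):
    """Check the SC(m,l,d) conditions for index ranges into T."""
    if len(intervals) < m:
        return False
    # Condition 1: pairwise disjoint index ranges.
    for (s1, e1), (s2, e2) in combinations(intervals, 2):
        if not (e1 < s2 or e2 < s1):
            return False
    subs = [T[s:e + 1] for s, e in intervals]
    # Condition 2: every pair of subtrajectories is within distance d.
    if any(discrete_frechet(A, B) > d for A, B in combinations(subs, 2)):
        return False
    # Condition 3: at least one subtrajectory is long enough.
    return any(curve_length(S) >= l for S in subs)

# Hypothetical trajectory that passes the same stretch twice, one unit apart.
T = [(0, 0), (1, 0), (2, 0), (3, 3), (0, 1), (1, 1), (2, 1)]
print(is_subtrajectory_cluster(T, [(0, 2), (4, 6)], m=2, l=2.0, d=1.0))  # True
print(is_subtrajectory_cluster(T, [(0, 2), (4, 6)], m=2, l=2.0, d=0.5))  # False
```

Checking a given candidate is cheap. The hardness results that follow concern finding such a cluster, not verifying one.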

  21. Problem Decision version: Subtrajectory cluster SC(m,l,d). Given a trajectory T, is there a subtrajectory cluster with parameters m, l and d? Optimisation versions: SC(m,max,d) – maximise length of cluster

  22. Hardness results Theorem 1: Finding any approximation of the SC(m,max,d) problem is 3SUM-hard. Theorem 2: The decision problem SC(m,l,d) is NP-complete. Theorem 3: The problem of computing a (2-ε)-distance approximation of the SC(m,max,d)-problem is NP-hard. [Gudmundsson & van Kreveld’08]

  23. Hardness results Theorem 2: The decision problem SC(m,l,d) is NP-complete. Reduction from MaxClique. MaxClique: Is there a clique of size k in a given graph G=(V,E)? [Figure: clique of size 4]
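For readers who want the clique question spelled out, the brute-force sketch below tries every vertex subset of size k. It only illustrates the problem statement and takes exponential time in general; it is not part of the reduction. The function name `has_clique` and the small example graph are made up for illustration and are not the graph from the slides.

```python
from itertools import combinations

def has_clique(vertices, edges, k):
    """Return True if some k vertices are pairwise adjacent (brute force)."""
    edge_set = {frozenset(e) for e in edges}
    return any(
        all(frozenset(pair) in edge_set for pair in combinations(group, 2))
        for group in combinations(vertices, k)
    )

# Hypothetical graph: a triangle a-b-c with a path c-d-e attached.
V = ["a", "b", "c", "d", "e"]
E = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")]
print(has_clique(V, E, 3))  # True
print(has_clique(V, E, 4))  # False
```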

  24. MaxClique. Longest subtrajectory cluster: NP-complete. Problem: SC(m,l=n,d). [Figure, built up over slides 24 to 31: reduction example on a graph with vertices a, b, c, d, e]

  32. MaxClique. Longest subtrajectory cluster: NP-complete. Problem: SC(m,l=n,d). [Figure: reduction example on a graph with vertices a, b, c, d, e] SC(m,l=n,d) ⇔ clique of size m in G. The problem is as hard as MaxClique!

  33. Hardness results Theorem 2: The decision problem SC(m,l,d) is NP-complete.

  35. Hardness results Theorem 3: The problem of computing a (2-ε)-distance approximation of the SC(m,max,d)-problem is NP-hard.

  36. Hardness results Theorem 3: The problem of computing a (2-ε)-distance approximation of the SC(m,max,d)-problem is NP-hard. Corollary 1: The problem of computing a (2-ε)-distance approximation of SC(max, l, r), for any constant 0 < ε < 1, is at least as hard as approximating MaxClique.

  37. Hardness results Theorem 3: The problem of computing a (2-ε)-distance approximation of the SC(m,max,d)-problem is NP-hard. Corollary 1: The problem of computing a (2-ε)-distance approximation of SC(max, l, r), for any constant 0 < ε < 1, is at least as hard as approximating MaxClique. Can we find a 2-distance approximation in polynomial time?

  38. Fréchet distance between m curves Input: Set of m polygonal curves F = {F1, …, Fm} with |Fi| = ni. The Fréchet distance of F can be computed by computing the Fréchet distance between every pair of curves. Time: $O(\sum_{i,j} n_i n_j \log(n_i n_j))$. If |Fi| = n/m for all i, each of the m² pairs costs $O((n/m)^2 \log(n/m))$, giving $O(n^2 \log(n/m))$ in total.

  39. Fréchet distance between m curves Input: Set of m polygonal curves F = {F1, …, Fm} with |Fi| = ni. Observation: Given F1, F2 and F3, we have: $\delta_F(F_1,F_3) \le \delta_F(F_1,F_2) + \delta_F(F_2,F_3)$. [Dumitrescu & Rote’04]

  40. Fréchet distance between m curves Input: Set of m polygonal curves F = {F1, …, Fm} with |Fi| = ni. Observation: Given F1, F2 and F3, we have: $\delta_F(F_1,F_3) \le \delta_F(F_1,F_2) + \delta_F(F_2,F_3)$. [Dumitrescu & Rote’04] [Figure: three curves with pairwise distances a, b and ≤ a+b] Can we use this observation to get an approximation?

  41. Fréchet distance between m curves Input: Set of m polygonal curves F = {F1, …, Fm} with |Fi| = ni. Idea: Select a representative curve F1 of F. Compute the maximum Fréchet distance D between F1 and all other curves in F.

  42. Fréchet distance between m curves Input: Set of m polygonal curves F = {F1, …, Fm} with |Fi| = ni. Idea: Select a representative curve F1 of F. Compute the maximum Fréchet distance D between F1 and all other curves in F. Then $D \le \delta_F(F) \le 2D$. Observation: Gives a 2-approximation

  43. Fréchet distance between m curves Input: Set of m polygonal curves F = {F1, …, Fm} with |Fi| = ni. Idea: Select a representative curve F1 of F. Compute the maximum Fréchet distance D between F1 and all other curves in F. Then $D \le \delta_F(F) \le 2D$. Observation: Gives a 2-approximation. Time: $O(\sum_i n_1 n_i \log(n_1 n_i))$
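The representative idea can be tried out on a small example. The sketch below compares the bound D from a single representative with the exact maximum over all pairs. As an assumption of this example, it uses the discrete (vertex-only) Fréchet distance, which also satisfies the triangle inequality, in place of the continuous distance. The names `representative_bound` and `exact_max_pairwise` and the three sample curves are hypothetical.

```python
from itertools import combinations
from math import dist, inf

def discrete_frechet(P, Q):
    """Vertex-only stand-in for the Fréchet distance (an assumption of this sketch)."""
    n, m = len(P), len(Q)
    ca = [[inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(P[i], Q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
                continue
            prev = min(
                ca[i - 1][j] if i > 0 else inf,
                ca[i][j - 1] if j > 0 else inf,
                ca[i - 1][j - 1] if i > 0 and j > 0 else inf,
            )
            ca[i][j] = max(prev, d)
    return ca[n - 1][m - 1]

def representative_bound(curves):
    """D: largest distance from the representative curves[0] to any other curve."""
    rep = curves[0]
    return max(discrete_frechet(rep, F) for F in curves[1:])

def exact_max_pairwise(curves):
    """Largest distance over all pairs: m(m-1)/2 distance computations."""
    return max(discrete_frechet(A, B) for A, B in combinations(curves, 2))

# Three hypothetical horizontal curves at heights 1, 0 and 3.
F = [
    [(0, 1), (1, 1), (2, 1)],  # representative F1
    [(0, 0), (1, 0), (2, 0)],
    [(0, 3), (1, 3), (2, 3)],
]
D = representative_bound(F)
exact = exact_max_pairwise(F)
print(D, exact)  # 2.0 3.0
```

Here D = 2 and the exact value is 3, which lies between D and 2D. The saving is that only m-1 distances are computed instead of one per pair.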

  44. Decision algorithm: compute path Recall: Whether the Fréchet distance between two curves P and Q is less than r can be decided in O(mn) time. The Fréchet distance between two polygonal curves P and Q can be computed in O(mn log mn) time using parametric search. [Figure: free space diagram of P and Q with corners (q1,p1) and (qm,pn)]

  45. Recall the problem Given a trajectory T, a subtrajectory cluster SC(m,l,d) of T consists of at least m subtrajectories T1, … , Tm of T such that: the subtrajectories are pairwise disjoint, the distance between any two subtrajectories is at most d, and at least one subtrajectory has length l.

  46. Recall the problem • Input: A trajectory T with n points, an integer m>1 and a real value d>0. • Output: SC(m,max,d) Constraint: For simplicity we will assume that all subtrajectories in a cluster have to start and end at a vertex. Idea: Create a free space diagram describing the distance between T and T.
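Under the simplifying constraint that subtrajectories start and end at vertices, a vertex-level version of this diagram is just a boolean matrix over pairs of vertices of T. The sketch below builds it. It is a coarse illustration, not the continuous free space diagram, and the name `self_free_space` and the sample trajectory are assumptions.

```python
from math import dist

def self_free_space(T, d):
    """free[i][j] is True when vertices T[i] and T[j] are within distance d."""
    return [[dist(p, q) <= d for q in T] for p in T]

# Hypothetical trajectory that passes the same stretch twice, one unit apart.
T = [(0, 0), (1, 0), (2, 0), (3, 3), (0, 1), (1, 1), (2, 1)]
free = self_free_space(T, 1.0)
for row in free:
    print("".join("#" if cell else "." for cell in row))
```

The matrix is symmetric and its main diagonal is always free, because every vertex is at distance 0 from itself. In this example the cells (0,4), (1,5) and (2,6) are also free. That run of free cells parallel to the diagonal is the kind of structure that indicates two disjoint parts of T moving close together.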

  47. Free space diagram of T [Figure: free space diagram with T on both axes]

  49. Free space diagram of T [Figure: free space diagram with T on both axes; subtrajectories A, B and C marked] $D(A,C) \le d$ and $D(B,C) \le d$, hence $D(A,B) \le 2d$

  50. Free space diagram of T C: representative trajectory. The length of the SC {A,B,C} is the length of the representative trajectory. [Figure: subtrajectories A, B and C in the free space diagram]
