Computational Movement Analysis Lecture 2: Clustering Joachim Gudmundsson

Fundamental tools: clustering

Clustering:

Group similar objects into clusters.

Fundamental tools: clustering

Clustering:

Group similar (sub)curves into clusters.

Similarity measure: Fréchet distance

Question:

Do we need any constraints on a cluster?

Constraints on subcurves in a cluster?

Aim: Cluster subcurves

Cluster of subcurves

Recall: Fréchet Distance

Fréchet Distance measures the similarity of two curves.

Dog walking example

  • A person is walking a dog (the person on one curve, the dog on the other)
  • Both may control their speed, but neither is allowed to go backwards!
  • Fréchet distance of the curves: minimal leash length necessary for both to walk the curves from beginning to end
Recall: Fréchet Distance

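Input: Two polygonal chains $P = \langle p_1, \dots, p_n \rangle$ and $Q = \langle q_1, \dots, q_m \rangle$ in $\mathbb{R}^d$.

The Fréchet distance between P and Q is:

$$\delta_F(P,Q) = \inf_{\alpha,\beta} \max_{t \in [0,1]} \lVert \alpha(t) - \beta(t) \rVert$$

where $\alpha : [0,1] \to P$ and $\beta : [0,1] \to Q$ range over all continuous non-decreasing reparametrizations.

Note that $\alpha(0)=p_1$, $\alpha(1)=p_n$, $\beta(0)=q_1$ and $\beta(1)=q_m$.

Well-suited for the comparison of curves since it takes the continuity of the curves into account.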
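As an illustration only (it is not taken from the slides), the leash idea can be sketched with the discrete Fréchet distance, a common simplification in which the person and the dog jump from vertex to vertex instead of moving continuously. It is an upper bound on the continuous distance defined above, not the same quantity. The function name `discrete_frechet` is an assumption made for this sketch.

```python
from math import dist, inf

def discrete_frechet(P, Q):
    """Discrete Frechet distance between two vertex sequences P and Q.

    ca[i][j] is the shortest leash that lets both walkers reach
    (P[i], Q[j]) when each step advances one walker or both by one vertex.
    """
    n, m = len(P), len(Q)
    ca = [[inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(P[i], Q[j])
            if i == 0 and j == 0:
                prev = 0.0  # both walkers start here
            else:
                prev = min(
                    ca[i - 1][j] if i > 0 else inf,       # only P advanced
                    ca[i][j - 1] if j > 0 else inf,       # only Q advanced
                    ca[i - 1][j - 1] if i > 0 and j > 0 else inf,  # both advanced
                )
            ca[i][j] = max(prev, d)
    return ca[n - 1][m - 1]
```

The table has nm entries and each is filled in constant time, so the sketch runs in O(nm) time.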

Decision algorithm: compute path

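Algorithm:

1. Compute the free space diagram: mn cells ⇒ O(mn) time.

2. Compute an xy-monotone path (non-decreasing in both coordinates) from (q1,p1) to (qm,pn): build the network in O(mn) time, then find a path in O(mn) time.

[Figure: free space diagram of P and Q, with corners (q1,p1) and (qm,pn).]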
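The two steps above can be sketched as follows. This is a hedged reconstruction of the standard free space decision procedure, not code from the lecture: `free_interval` computes the free part of one cell boundary, and `frechet_at_most` propagates the reachable intervals cell by cell. Both names are assumptions. The sketch takes P along the x-axis of the diagram, assumes each curve has at least two vertices, and uses exact comparisons that may misbehave when eps equals a critical value.

```python
from math import sqrt

def free_interval(p, a, b, eps):
    """Parameters t in [0,1] with |a + t(b-a) - p| <= eps, or None if empty."""
    d = [bi - ai for ai, bi in zip(a, b)]
    f = [ai - pi for ai, pi in zip(a, p)]
    A = sum(x * x for x in d)
    B = 2 * sum(x * y for x, y in zip(f, d))
    C = sum(x * x for x in f) - eps * eps
    if A == 0:  # degenerate segment (a == b)
        return (0.0, 1.0) if C <= 0 else None
    disc = B * B - 4 * A * C
    if disc < 0:
        return None
    r = sqrt(disc)
    lo, hi = max(0.0, (-B - r) / (2 * A)), min(1.0, (-B + r) / (2 * A))
    return (lo, hi) if lo <= hi else None

def _propagate(free, crossing, parallel):
    """Reachable part of an exit edge, given the two entry edges of a cell."""
    if free is None:
        return None
    if crossing is not None:   # entering from the perpendicular edge:
        return free            # every free point of the exit edge is reachable
    if parallel is None:
        return None
    lo = max(free[0], parallel[0])  # monotonicity: cannot go back down/left
    return (lo, free[1]) if lo <= free[1] else None

def frechet_at_most(P, Q, eps):
    """Decide whether the continuous Frechet distance of P and Q is <= eps."""
    n, m = len(P), len(Q)
    # LF[i][j]: free part of the vertical edge at P[i] against segment Q[j]Q[j+1]
    LF = [[free_interval(P[i], Q[j], Q[j + 1], eps) for j in range(m - 1)] for i in range(n)]
    # BF[i][j]: free part of the horizontal edge at Q[j] against segment P[i]P[i+1]
    BF = [[free_interval(Q[j], P[i], P[i + 1], eps) for j in range(m)] for i in range(n - 1)]
    LR = [[None] * (m - 1) for _ in range(n)]
    BR = [[None] * m for _ in range(n - 1)]
    for j in range(m - 1):      # walk up the left boundary while it stays free
        iv = LF[0][j]
        if iv is None or iv[0] > 0.0:
            break
        LR[0][j] = iv
        if iv[1] < 1.0:
            break
    for i in range(n - 1):      # walk along the bottom boundary while it stays free
        iv = BF[i][0]
        if iv is None or iv[0] > 0.0:
            break
        BR[i][0] = iv
        if iv[1] < 1.0:
            break
    for i in range(n - 1):
        for j in range(m - 1):
            left, bottom = LR[i][j], BR[i][j]
            LR[i + 1][j] = _propagate(LF[i + 1][j], bottom, left)
            BR[i][j + 1] = _propagate(BF[i][j + 1], left, bottom)
    ends = (LR[n - 1][m - 2], BR[n - 2][m - 1])
    return any(e is not None and e[1] >= 1.0 for e in ends)
```

Each of the (n-1)(m-1) cells is handled in constant time, which matches the O(mn) bound stated on the slide.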

Cluster

Input: A polygonal curve T, an integer m>1 and a distance d.

Cluster: m subcurves T1, … , Tm of T with distance at most d between any two subcurves.

Constraints?

Constraint 1: subcurves are pairwise disjoint

More constraints?

⇒ infinite number of clusters

[Figure: subcurves within distance d]

Constraint 2: cluster has to be of maximal “length”

Decision Problem

Given a curve T, a subcurve cluster SC(m,l,d) of T consists of at least m subcurves T1, … , Tm of T such that:

the subcurves are pairwise disjoint,

the distance between any two subcurves is at most d, and

at least one subcurve has length l.


The length of a subcurve cluster is assumed to be maximal.

In trajectory terms: given a trajectory T, a subtrajectory cluster SC(m,l,d) of T consists of at least m subtrajectories T1, … , Tm of T such that:

the subtrajectories are pairwise disjoint,

the distance between any two subtrajectories is at most d, and

at least one subtrajectory has length l.

The length of a subtrajectory cluster is assumed to be maximal.

Problem

Decision version: Subtrajectory cluster SC(m,l,d)

Given a trajectory T, is there a subtrajectory cluster with parameters m, l and d?

Optimisation versions:

SC(m,max,d) – maximise length of cluster

Hardness results

Theorem 1:

Finding any approximation of the SC(m,max,d) problem is 3SUM-hard.

Theorem 2:

The decision problem SC(m,l,d) is NP-complete.

Theorem 3:

The problem of computing a (2-)-distance approximation of the SC(m,max,d)-problem is NP-hard.

[Gudmundsson & van Kreveld’08]

Hardness results

Theorem 2:

The decision problem SC(m,l,d) is NP-complete.

Reduction from MaxClique

MaxClique:

Is there a clique of size k in a given graph G=(V,E)?

[Figure: a clique of size 4]

Longest subtrajectory cluster: NP-complete

Problem: SC(m,l=n,d).

[Figure: MaxClique instance on vertices a, b, c, d, e and the corresponding trajectory; labels in the figure: b,c,d; a,c,e; a,b; a,e; b,d; d,e; b,c; a,c; e; d.]

SC(m,l=n,d) ⇔ Clique of size m in G

Problem as hard as MaxClique!

Hardness results

Theorem 3:

The problem of computing a $(2-\varepsilon)$-distance approximation of the SC(m,max,d)-problem is NP-hard.

Corollary 1:

The problem of computing a $(2-\varepsilon)$-distance approximation of SC(max, l, r), for any constant $0 < \varepsilon < 1$, is at least as hard as approximating MaxClique.

Can we find a 2-distance approximation in polynomial time?

Fréchet distance between m curves

Input: Set of m polygonal curves F = {F1, …, Fm} with |Fi| = ni

The Fréchet distance of F can be computed by computing the Fréchet distance between every pair of curves.

Time: $O\big(\sum_{i,j} n_i n_j \log(n_i n_j)\big)$

If |Fi| = n/m then $O((n/m)^4 \log(n/m))$.

Observation: Given F1, F2 and F3, we have:

$\delta_F(F_1,F_3) \le \delta_F(F_1,F_2) + \delta_F(F_2,F_3)$.

[Dumitrescu & Rote’04]

[Figure: three curves with pairwise distances a, b and $\le a+b$.]

Can we use this observation to get an approximation?

Idea: Select a representative curve F1 of F.

Compute the maximum Fréchet distance D between F1 and all other curves in F.

⇒ $D \le \delta_F(F) \le 2D$

Observation: Gives a 2-approximation

Time: $O\big(\sum_i n_1 n_i \log(n_1 n_i)\big)$
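The bound follows directly from the triangle inequality: D is itself one of the pairwise distances, so it cannot exceed the maximum, and for any two curves Fi and Fj the distance is at most the distance from Fi to F1 plus the distance from F1 to Fj, hence at most 2D. A minimal sketch of the idea, assuming a caller-supplied metric `dist` on curves (for example a Fréchet distance routine); the function names are hypothetical:

```python
from itertools import combinations

def exact_set_distance(curves, dist):
    """Largest pairwise distance: one dist() call per pair of curves."""
    return max(dist(a, b) for a, b in combinations(curves, 2))

def approx_set_distance(curves, dist, rep=0):
    """Largest distance D from the representative curves[rep] to the others.

    If dist satisfies the triangle inequality, the exact value lies in [D, 2D].
    Only len(curves) - 1 distance computations are needed.
    """
    r = curves[rep]
    return max(dist(r, c) for k, c in enumerate(curves) if k != rep)
```

The choice of representative affects how tight the estimate is, but not the factor-2 guarantee.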

Decision algorithm: compute path

Recall:

Deciding whether the Fréchet distance between two curves P and Q is less than r can be done in O(mn) time.

The Fréchet distance between two polygonal curves P and Q can be computed in O(mn log mn) time using parametric search.

[Figure: curves P and Q and their free space diagram, from (q1,p1) to (qm,pn).]

Recall the problem

Given a trajectory T, a subtrajectory cluster SC(m,l,d) of T consists of at least m subtrajectories T1, … , Tm of T such that:

the subtrajectories are pairwise disjoint,

the distance between any two subtrajectories is at most d, and

at least one subtrajectory has length l.

  • Input: A trajectory T with n points, an integer m>1 and a real value d>0.
  • Output: SC(m,max,d)

Constraint: For simplicity we will assume that all subtrajectories in a cluster have to start and end at a vertex.

Idea: Create a free space diagram describing the distance between T and T.

Free space diagram of T

[Figure: free space diagram of T against itself, with subtrajectories A, B and C.]

$D(A,C) \le d$

$D(B,C) \le d$

$D(A,B) \le 2d$

C: representative trajectory

The length of the SC {A,B,C} is the length of the representative trajectory.

Approximation algorithm

Sweep the free space diagram from left to right with two vertical lines (L and R)

At each event point decide if there are m monotone curves between L and R:

a) If “yes” then move R to the right.

b) If “no” and R-L=1 then move R to the right.

c) If “no” and R-L>1 then move L to the right.

While sweeping maintain network of critical points.

[Figure: free space diagram with sweep lines L and R.]

Data structures

Number of event points? O(n)

Two types of events:

  • L moves to the right
  • R moves to the right

How to handle an event?

Decide if there are m non-overlapping xy-monotone paths between L and R

Handle event

Start with top-most corner u on R.

Find the top-most corner u’ on L that can be reached by an xy-monotone path P.

Observation: No point on R below u can reach a point on L above u’ with an xy-monotone path.

Next take the top-most corner v on R below u’. Find the top-most corner on L that can be reached by an xy-monotone path.

Continue until:

  • m curves found, or
  • no more corners on R.

[Figure: sweep lines L and R with path P from u’ to u and a second path from v’ to v.]
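The greedy procedure can be sketched on a discretised free space. The following is an illustration under stated assumptions rather than the lecture's data structure: the strip between L and R is given as a boolean grid `free[x][y]` (x = 0 is L, the last x is R, larger y is higher), a monotone path moves one step right, up, or diagonally through free cells, and both function names are hypothetical.

```python
def topmost_start(free, top):
    """Highest row on L (x = 0) from which a monotone path reaches (R, top).

    Searches backwards from the corner on R using left, down and
    diagonal steps through free cells. Returns None if L is unreachable.
    """
    last = len(free) - 1
    seen = {(last, top)}
    stack = [(last, top)]
    best = None
    while stack:
        x, y = stack.pop()
        if x == 0 and (best is None or y > best):
            best = y
        for nx, ny in ((x - 1, y), (x, y - 1), (x - 1, y - 1)):
            if nx >= 0 and ny >= 0 and free[nx][ny] and (nx, ny) not in seen:
                seen.add((nx, ny))
                stack.append((nx, ny))
    return best

def greedy_monotone_paths(free, m):
    """Collect up to m vertically non-overlapping paths as (row on L, row on R)."""
    last, top = len(free) - 1, len(free[0]) - 1
    found = []
    while top >= 0 and len(found) < m:
        start = topmost_start(free, top) if free[last][top] else None
        if start is None:
            top -= 1            # this corner on R leads nowhere; try the next one
        else:
            found.append((start, top))
            top = start - 1     # next corner on R strictly below u'
    return found
```

Each backward search visits every cell of the strip at most once, so one query costs time proportional to the size of the strip.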

Path Query in the Free Space diagram

In worst case the algorithm performs n path queries.

How do we perform a path query?

Recall querying for a path in lecture 1.

$O(n^2)$ time per query

O(n) events, n points on R

Total: $O(n^3 w)$ time and O(nw) space, where w = max (R-L)

Can it be improved?

$O(n^2 w)$ time and O(nw) space

Extension:

The algorithm can be modified to handle the case when only the “reference” trajectory needs to start and end at a vertex.

Approximation algorithm
  • Theorem: A 2-distance approximation of the SC(m,max,d) problem can be computed in $O(n^2 + nmw)$ time and O(nw) space using the discrete Fréchet distance.
  • Theorem: A 2-distance approximation of the SC(m,max,d) problem can be computed in time $O(n^2 w)$ using the continuous Fréchet distance if the reference trajectory starts and ends at a vertex.
  • Theorem: A 2-distance approximation of the SC(m,max,d) problem can be computed in time O(n3m 2(n/m)(log2 n+m)) using the continuous Fréchet distance.

[Joint work: Buchin, Buchin, Löffler and Luo’10]

Experimental Results (continuous!)

i5-200 CPU with an Nvidia GTX 580

Note: Continuous model ⇒ input data can be simplified!

[Joint work with Nacho Valladares’13]

Open Problems

Can we cluster faster?

Can a c-approximate Fréchet clustering be computed faster?

Can we cluster faster for special cases?

What should we report?

Cluster using other distance measures? For example using [Sankararaman et al. 2013]?

References
  • K. Buchin, M. Buchin, J. Gudmundsson, M. Löffler and J. Luo. Detecting Commuting Patterns by Clustering Subtrajectories. International Journal of Computational Geometry and Applications, 2011.
  • N. Valladares and J. Gudmundsson. A GPU approach to subtrajectory clustering using the Fréchet distance. ACM SIGSPATIAL 2012.
  • A. Dumitrescu and G. Rote. On the Fréchet distance of a set of curves, Proceedings of the Sixteenth Canadian Conference on Computational Geometry, 2004.
  • S. Sankararaman, P. K. Agarwal, T. Mølhave, J. Pan and A. P. Boedihardjo. Model-driven matching and segmentation of trajectories. ACM SIGSPATIAL, 2013