Mining trajectory profiles for discovering user communities


Mining Trajectory Profiles for Discovering User Communities. Chih-Chieh Hung, Chih-Wen Chang, Wen-Chih Peng. Speaker: Chih-Wen Chang, National Chiao Tung University, Taiwan. 2009.11.03.


Mining Trajectory Profiles for Discovering User Communities

Chih-Chieh Hung, Chih-Wen Chang, Wen-Chih Peng

Speaker : Chih-Wen Chang

National Chiao Tung University, Taiwan

2009.11.03


Outline

  • Motivation

  • Goal

  • Framework

    • Preprocess

    • Construct User’s Profiles

    • Formulate Distance function

    • Identify Community

  • Experiments

  • Conclusion


Motivation (1/2)

  • With the rapid development of positioning techniques, users can easily collect their trajectories

    • GPS Logger, smart phones and navigation devices


Motivation (2/2)

  • Many GPS community sites have been established

    • Users can share their own trajectories

    • Users can search (query) trajectories

    • Examples: Every Trail, My Tracks


Goal

  • Mine user communities from raw trajectories

    • User Communities

      • Sets of users who have similar moving behaviors

  • Applications

    • Find new friends

    • Recommendation

    • Ranking of trajectories


1. Construct User’s Profile (one profile per user)

2. Formulate distance function (measure distance between users’ profiles)

3. Identify user communities (e.g. Community 1, Community 2)



Framework

Preprocess

Construct User’s Profile

Measure Distance Between Users

Identify Community


Preprocessing

  • Step 1:

    • Find frequent regions

      • Input: all trajectories of users

      • Output: frequent regions

      • Density-based approach

  • Step 2:

    • Transform trajectories into sequences of frequent region IDs

      • T1 : <A, B, D>
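The two preprocessing steps can be sketched as follows. This is only an illustrative simplification, not the authors' method: it swaps the density-based approach for a plain grid count, where a grid cell holding at least `min_points` GPS points counts as a frequent region. The function names, `cell_size` and `min_points` are all hypothetical.

```python
from collections import Counter


def cell_of(point, cell_size):
    """Map an (x, y) point to the grid cell that contains it."""
    x, y = point
    return (int(x // cell_size), int(y // cell_size))


def find_frequent_regions(trajectories, cell_size, min_points):
    """Step 1 (simplified): a cell is frequent if it holds >= min_points points.

    Returns a dict mapping each frequent cell to a region id ('A', 'B', ...).
    The letter labels assume at most 26 frequent regions.
    """
    counts = Counter(cell_of(p, cell_size) for t in trajectories for p in t)
    dense = sorted(c for c, n in counts.items() if n >= min_points)
    return {cell: chr(ord("A") + i) for i, cell in enumerate(dense)}


def to_region_sequence(trajectory, regions, cell_size):
    """Step 2: replace points by region ids, dropping points outside frequent
    regions and collapsing consecutive repeats of the same region."""
    seq = []
    for p in trajectory:
        rid = regions.get(cell_of(p, cell_size))
        if rid is not None and (not seq or seq[-1] != rid):
            seq.append(rid)
    return seq


# Made-up sample data, for illustration only.
trajectories = [
    [(0.1, 0.2), (0.4, 0.3), (1.2, 0.1), (3.5, 3.5)],
    [(0.3, 0.3), (1.5, 0.5), (1.7, 0.2)],
    [(5.0, 5.0)],
]
regions = find_frequent_regions(trajectories, cell_size=1.0, min_points=3)
sequences = [to_region_sequence(t, regions, 1.0) for t in trajectories]
```

A real density-based approach would also merge neighbouring dense cells into one region. The sketch skips that to stay short.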



Construct User’s Profiles (1/2)

  • User’s Profile

    • Probabilistic Suffix Tree (abbreviated as PST)

      • Find and organize trajectory patterns

      • Record the probability of next movements

      • Each tree node: a frequently moving sequence

      • Each conditional table: the next possible movements


Construct User’s Profiles (2/2)

  • Construct PST

    • Level by level

    • Two operations:

      • Create a child node

        • When the normalized count of the Before symbol > MinSup

      • Add a symbol into the related conditional table

        • When the normalized count of the After symbol > MinSup

Example (MinSup = 0.2)

  • Trajectories: ABE, ABA, AC, B, ADF, H, JHI, EDH

  • Nodes under root: A: 0.5, B: 0.375

  • Before symbol of B: A, count 2, 2/3 × 0.375 = 0.25, so child node AB: 0.25 is created

  • After symbols: A, count 1, 1/2 = 0.5; E, count 1, 1/2 = 0.5
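The level-by-level construction can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: region ids are single characters, the probability of a node is the fraction of trajectories containing its sequence (which reproduces A: 0.5, B: 0.375 and AB: 0.25 from the example), and `support` and `build_pst` are hypothetical names.

```python
def support(pattern, sequences):
    """Fraction of sequences that contain `pattern` as a contiguous run."""
    return sum(pattern in s for s in sequences) / len(sequences)


def build_pst(sequences, min_sup):
    """Return {node_sequence: (probability, conditional_table)}.

    Children are created by prepending a Before symbol; conditional tables
    hold the After symbols whose conditional probability exceeds min_sup.
    """
    alphabet = sorted(set("".join(sequences)))
    nodes = {}
    # Level 1: single symbols that are frequent enough.
    frontier = [sym for sym in alphabet if support(sym, sequences) > min_sup]
    while frontier:
        next_frontier = []
        for pat in frontier:
            p = support(pat, sequences)
            # Operation 2: add frequent After symbols to the conditional table.
            table = {}
            for sym in alphabet:
                cond = support(pat + sym, sequences) / p
                if cond > min_sup:
                    table[sym] = cond
            nodes[pat] = (p, table)
            # Operation 1: create a child node for each frequent Before symbol.
            for sym in alphabet:
                child = sym + pat
                if support(child, sequences) > min_sup:
                    next_frontier.append(child)
        frontier = next_frontier
    return nodes


pst = build_pst(["ABE", "ABA", "AC", "B", "ADF", "H", "JHI", "EDH"], min_sup=0.2)
```

Under these assumptions the tree also contains level-1 nodes for D, E and H, since each appears in more than 20% of the trajectories.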



Formulate Distance function (1/3)

  • Determine distance of users

    • Transform the PST into Moving Sequence List

      Each element of the moving sequence list is a branch of the PST, with the probability of each node on that branch

L1 [1..2] = <[(A,0.5)],[(B,0.375)(AB,0.33)]>
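One way to picture the transformation is the sketch below. It assumes a PST given as a plain mapping from node sequence to probability, where the parent of a node is the node with its first symbol removed (for example, the parent of AB is B). The function name and this representation are hypothetical.

```python
def moving_sequence_list(node_probs):
    """Flatten a PST into a list of branches.

    Each branch is the path from a level-1 node down to a leaf, written as
    a list of (sequence, probability) pairs.
    """
    children = {p: [] for p in node_probs}
    for p in node_probs:
        if len(p) > 1:
            children[p[1:]].append(p)  # parent = node without its first symbol

    branches = []

    def walk(node, path):
        path = path + [(node, node_probs[node])]
        if not children[node]:
            branches.append(path)  # reached a leaf: one complete branch
        for child in sorted(children[node]):
            walk(child, path)

    for root_child in sorted(n for n in node_probs if len(n) == 1):
        walk(root_child, [])
    return branches


# Node probabilities taken from the L1 example above.
l1 = moving_sequence_list({"A": 0.5, "B": 0.375, "AB": 0.33})
```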


Formulate Distance function (2/3)

  • Define the distance between PSTs

    • Find the minimal dist(Li[1..m], Lj[1..n])

    • Use three editing operations

      • Insertion

        - Before: L1 = {m1:0.3, m2:0.2, m3:0.3}, L2 = {m1:0.3, m2:0.2}

        - After inserting m3 into L2: L1 = {m1:0.3, m2:0.2, m3:0.3}, L2 = {m1:0.3, m2:0.2, m3:0.3}

        - Cost = 0.3


Formulate Distance function (3/3)

  • Deletion

    - Before: L1 = {m1:0.2, m2:0.3}, L2 = {m1:0.2, m2:0.3, m3:0.3}

    - After deleting m3 from L2: L1 = {m1:0.2, m2:0.3}, L2 = {m1:0.2, m2:0.3}

    - Cost = 0.3

  • Replacement

    - Before: L1 = {m1:0.2, m2:0.2, m3:0.2}, L2 = {m1:0.2, m2:0.2, m4:0.3}

    - After replacing m4 with m3 in L2: L1 = {m1:0.2, m2:0.2, m3:0.2}, L2 = {m1:0.2, m2:0.2, m3:0.2}

    - Cost = 0.3 + 0.2 = 0.5
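The minimal editing cost between two moving sequence lists can be computed with a standard edit-distance dynamic program. The sketch below is an assumption-laden illustration, not the authors' exact definition: each element is a (label, probability) pair, inserting or deleting an element costs its probability, replacing costs the sum of both probabilities, and identical elements match at cost 0. The name `mls_distance` is hypothetical.

```python
def mls_distance(l1, l2):
    """Weighted edit distance between two lists of (label, probability) pairs."""
    m, n = len(l1), len(l2)
    # d[i][j] = minimal cost of turning l1[:i] into l2[:j]
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + l1[i - 1][1]      # delete everything in l1
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + l2[j - 1][1]      # insert everything in l2
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            a, b = l1[i - 1], l2[j - 1]
            replace_cost = 0.0 if a == b else a[1] + b[1]
            d[i][j] = min(
                d[i - 1][j] + a[1],               # deletion
                d[i][j - 1] + b[1],               # insertion
                d[i - 1][j - 1] + replace_cost,   # match or replacement
            )
    return d[m][n]


# The insertion and replacement examples from the slides.
insertion_cost = mls_distance(
    [("m1", 0.3), ("m2", 0.2), ("m3", 0.3)],
    [("m1", 0.3), ("m2", 0.2)],
)
replacement_cost = mls_distance(
    [("m1", 0.2), ("m2", 0.2), ("m3", 0.2)],
    [("m1", 0.2), ("m2", 0.2), ("m4", 0.3)],
)
```

With these costs a replacement is never cheaper than a deletion followed by an insertion, so the three operations stay consistent with each other.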



Identify Community (1/4)

  • User community

    • The same community: δMLS(Ti,Tj) < thresholdδ

    • The number of communities is minimal

  • Transform the relation between PSTs into a graph

    • A vertex represents a user

    • An edge exists between two vertices when

      δMLS(Ti,Tj) < thresholdδ

[Figure: example graph with vertices O1, O2, O3, O4, O5]


Identify Community (2/4)

  • Model as a minimum clique problem: partition the vertices into the fewest cliques

    • A clique is a set of pair-wise adjacent vertices

      [Figure: example of cliques in the graph with vertices O1, O2, O3, O4, O5]
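The graph construction and the grouping into cliques can be sketched as follows. Partitioning a graph into the minimum number of cliques is NP-hard in general, so this sketch swaps in a simple greedy heuristic that does not guarantee a minimal number of communities. It is not the algorithm used in the talk, and the function names and the sample distances are made up for illustration.

```python
def build_graph(distances, threshold):
    """Connect two users when their distance is below the threshold.

    `distances` maps (user_i, user_j) pairs to a distance value.
    """
    users = sorted({u for pair in distances for u in pair})
    adj = {u: set() for u in users}
    for (u, v), d in distances.items():
        if d < threshold:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def greedy_clique_partition(adj):
    """Greedy heuristic: grow a clique from the best-connected unassigned user."""
    unassigned = set(adj)
    communities = []
    while unassigned:
        seed = max(sorted(unassigned), key=lambda u: len(adj[u] & unassigned))
        clique = {seed}
        for v in sorted(adj[seed] & unassigned):
            if clique <= adj[v]:  # v is adjacent to every current member
                clique.add(v)
        communities.append(clique)
        unassigned -= clique
    return communities


# Hypothetical pairwise distances between five users.
distances = {
    ("O1", "O2"): 0.10, ("O1", "O3"): 0.20, ("O2", "O3"): 0.15,
    ("O4", "O5"): 0.10, ("O3", "O4"): 0.90,
}
communities = greedy_clique_partition(build_graph(distances, threshold=0.3))
```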


Identify Community (3/4)

  • Select a representative PST for each community

    • Represent all PSTs in the same community

    • Advantages

      • Reduce storage overhead

      • Speed up query processing

      • Identify the community of a new user (compare the new user with the representative PSTs to decide which community to add the user into)


Identify Community (4/4)

  • Two factors

    • Size of the representative PST

      • The number of tree nodes, denoted as N(Ti)

    • Distance between the selected PST and the others in the same community

      • The error sum, denoted as ES

        - Sum of the distances between the selected PST and the others

  • Representative PST

    • Minimize
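The selection can be illustrated with the sketch below. The objective being minimized is not reproduced in the text above, so the scoring rule here is purely an assumption: a weighted sum of the normalized tree size N(Ti) and the normalized error sum ES, with a hypothetical weight `alpha`. All names are illustrative.

```python
def select_representative(sizes, distances, alpha=0.5):
    """Pick the PST with the lowest assumed score within one community.

    sizes:     {pst_id: number of tree nodes, N(Ti)}
    distances: {frozenset({pst_a, pst_b}): distance between the two PSTs}
    alpha:     hypothetical weight between size (1.0) and error sum (0.0)
    """
    def error_sum(t):
        # ES: sum of the distances between the candidate and all other PSTs.
        return sum(distances[frozenset((t, o))] for o in sizes if o != t)

    es = {t: error_sum(t) for t in sizes}
    max_size = max(sizes.values())
    max_es = max(es.values()) or 1.0  # avoid dividing by zero for singletons

    def score(t):
        return alpha * sizes[t] / max_size + (1 - alpha) * es[t] / max_es

    return min(sorted(sizes), key=score)


# Made-up community of three PSTs.
sizes = {"T1": 10, "T2": 12, "T3": 8}
distances = {
    frozenset(("T1", "T2")): 0.2,
    frozenset(("T1", "T3")): 0.4,
    frozenset(("T2", "T3")): 0.3,
}
representative = select_representative(sizes, distances)
```

Setting `alpha` to 1.0 picks the smallest tree, and 0.0 picks the tree closest to the others. Values in between trade storage against representativeness.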



    Experiments (1/4)

    • Simulator Model

      • Use real trajectories from CarWeb to simulate the group mobility of users

        • Total : 2400 trajectories


    Experiments (2/4)

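    • Compare with the Generalized Sequential Pattern mining algorithm (GSP)

      • Set of sequential patterns, e.g. sp1, sp2, ..., spn

      • Trajectory profile of a user represented as a vector

      • Distance function between profiles

        • Cosine similarity measurement: $\mathrm{similarity}(V_i, V_j) = \frac{V_i \cdot V_j}{|V_i|\,|V_j|}$

        • Example: $\frac{\langle 1,1,0,0\rangle \cdot \langle 0,1,1,1\rangle}{|\langle 1,1,0,0\rangle|\,|\langle 0,1,1,1\rangle|} = \frac{1}{\sqrt{2}\,\sqrt{3}} \approx 0.41$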
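For reference, the cosine similarity used for the GSP baseline can be computed as in this short sketch. The function name is hypothetical, and treating each profile as a 0/1 vector over the mined patterns is an assumption based on the example vectors.

```python
import math


def cosine_similarity(vi, vj):
    """Dot product of the two vectors divided by the product of their norms."""
    dot = sum(a * b for a, b in zip(vi, vj))
    norm = math.sqrt(sum(a * a for a in vi)) * math.sqrt(sum(b * b for b in vj))
    return dot / norm if norm else 0.0  # all-zero profiles: define as 0


# The two example profiles from the slide.
similarity = cosine_similarity([1, 1, 0, 0], [0, 1, 1, 1])
```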


    Experiments (3/4)

    • Impact of Trajectory Profiles

      • GSP is always larger than PST, especially when MinSup is smaller than 0.15

      • Figure panels: Storage, Prediction


    Experiments (4/4)

    • Impact of the thresholdδ and MinSup

      • A smaller thresholdδ yields more communities

      • Figure panels: Storage, Prediction



    Conclusion

    • Explore the problem of mining communities from trajectories

      • Preprocess: find frequent regions; replace trajectories by region ids

      • Construct User’s Profile: build a probabilistic suffix tree (abbreviated as PST)

      • Measure Distance Between Users: formulate distance function

      • Identify Community: cluster users by distance function; select representative PSTs


