An efficient algorithm for mining time interval based patterns in large databases

An Efficient Algorithm for Mining Time Interval-based Patterns in Large Databases. Yi-Cheng Chen, Ji-Chiang Jiang, Wen-Chih Peng and Suh-Yin Lee. Department of Computer Science, National Chiao Tung University, Hsinchu, Taiwan 300.

Presentation Transcript


An Efficient Algorithm for Mining Time Interval-based Patterns in Large Databases

Yi-Cheng Chen, Ji-Chiang Jiang, Wen-Chih Peng and Suh-Yin Lee

Department of Computer Science

National Chiao Tung University

Hsinchu, Taiwan 300

CIKM, 2010


OUTLINE

  • 1.INTRODUCTION

  • 2.PROBLEM DEFINITION

  • 3.INCISION STRATEGY

  • 4.COINCIDENCE REPRESENTATION

  • 5.CTMiner ALGORITHM

  • 6.EXPERIMENTAL RESULTS

  • 7.CONCLUSION AND FUTURE WORK


1.INTRODUCTION

  • All related research in this domain is based on Allen's temporal logic.

  • Allen's logic defines 13 temporal relations between any two event intervals.
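
The 13 relations can be enumerated mechanically from the endpoints of two intervals. The following is a minimal sketch, not taken from the slides: the function name `allen_relation` and the relation labels are illustrative choices, and each interval is assumed to be a `(start, finish)` pair with start < finish.

```python
def allen_relation(a, b):
    """Return the Allen relation of interval a with respect to interval b.

    Each interval is a (start, finish) pair with start < finish.
    The relation names are the conventional ones, chosen for illustration.
    """
    (s1, f1), (s2, f2) = a, b
    if f1 < s2:
        return "before"
    if f2 < s1:
        return "after"
    if f1 == s2:
        return "meets"
    if f2 == s1:
        return "met-by"
    # From here on the two intervals share more than a single point.
    if s1 == s2 and f1 == f2:
        return "equal"
    if s1 == s2:
        return "starts" if f1 < f2 else "started-by"
    if f1 == f2:
        return "finishes" if s1 > s2 else "finished-by"
    if s2 < s1 and f1 < f2:
        return "during"
    if s1 < s2 and f2 < f1:
        return "contains"
    return "overlaps" if s1 < s2 else "overlapped-by"
```

Because every pair of intervals falls into exactly one of these cases, describing n intervals pairwise takes on the order of n² relation labels, which is the complexity the representations discussed later try to reduce.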


1.INTRODUCTION

Comparison with previous work:

  • Kam et al. - hierarchical representation.

  • Höppner - scans the database with a sliding window.

  • Papapetrou et al. - Hybrid-DFS algorithm.

  • Wu et al. - TPrefixSpan.

  • Patel et al. - augmented representation (with additional counting information) and IEMiner.


1.INTRODUCTION

This work proposes:

  • Incision strategy

  • Coincidence representation

  • CTMiner (Coincidence Temporal Miner)


2.PROBLEM DEFINITION

Event interval and event sequence

  • Let E = {e1, e2, …, ek} be the set of event symbols.

  • An event interval is a triplet (ei, si, fi), where ei ∈ E and si, fi are time points with si < fi.

  • Event start: ei.ts. Event finish: ei.tf.

  • An event sequence is a list {(e1, s1, f1), (e2, s2, f2), …, (en, sn, fn)}, where si ≤ si+1 and si < fi.
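
One possible way to hold these definitions in code is sketched below. The names `EventInterval` and `make_event_sequence` are hypothetical, and integer time points are an assumption made only for the example.

```python
from typing import NamedTuple


class EventInterval(NamedTuple):
    """An event interval (e_i, s_i, f_i); the field names are illustrative."""
    event: str   # event symbol e_i drawn from E
    start: int   # start time s_i
    finish: int  # finish time f_i


def make_event_sequence(intervals):
    """Check s_i < f_i for every interval and order them by start time."""
    for iv in intervals:
        if not iv.start < iv.finish:
            raise ValueError(f"start must precede finish: {iv}")
    # sorted() is stable, so intervals with equal start times keep input order.
    return sorted(intervals, key=lambda iv: iv.start)
```

Sorting by start time enforces the condition si ≤ si+1 from the definition, and the validation loop enforces si < fi.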


2.PROBLEM DEFINITION

Temporal database

  • A database D = {r1, r2, …, rm} consists of records ri, where 1 ≤ i ≤ m.

  • A record ri consists of a sequence-id and an event interval (with its start time and finish time).

  • Records in the database D with the same sequence-id are grouped together.

  • Database D can therefore be viewed as a collection of event sequences.
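
The grouping step can be sketched as follows, assuming each record is a flat tuple `(sequence_id, event, start, finish)`. That record layout and the function name `group_records` are assumptions for illustration, not something the slides specify.

```python
from collections import defaultdict


def group_records(records):
    """Group (sequence_id, event, start, finish) records into event sequences.

    Returns a dict mapping each sequence-id to a list of (event, start, finish)
    tuples ordered by start time.
    """
    sequences = defaultdict(list)
    for seq_id, event, start, finish in records:
        sequences[seq_id].append((event, start, finish))
    for seq in sequences.values():
        seq.sort(key=lambda iv: iv[1])  # order each sequence by start time
    return dict(sequences)
```

For example, with made-up records:

```python
records = [(1, "A", 2, 7), (2, "D", 8, 14), (2, "B", 1, 5), (1, "C", 5, 9)]
db = group_records(records)
# db[2] == [("B", 1, 5), ("D", 8, 14)]
```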


2.PROBLEM DEFINITION

Time set and time sequence

  • Let q = {(e1, s1, f1), (e2, s2, f2), …, (en, sn, fn)} be an event sequence.

  • The set T = {s1, f1, s2, f2, …, si, fi, …, sn, fn} is called the time set corresponding to sequence q.

  • Sorting the elements of T and eliminating duplicates yields the time sequence Ts = {t1, t2, t3, …, tk}, where ti ∈ T and ti < ti+1.
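
Deriving the time sequence from an event sequence amounts to collecting the endpoints, removing duplicates, and sorting. A short sketch, with the function name `time_sequence` and the tuple layout `(event, start, finish)` assumed for illustration:

```python
def time_sequence(event_sequence):
    """Return the sorted, duplicate-free endpoints of an event sequence."""
    # A set comprehension builds the time set T without duplicates.
    time_set = {t for _, start, finish in event_sequence for t in (start, finish)}
    return sorted(time_set)


q = [("B", 1, 5), ("D", 8, 14), ("E", 10, 13), ("F", 10, 13)]
print(time_sequence(q))  # [1, 5, 8, 10, 13, 14]
```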


2.PROBLEM DEFINITION

  • Event slice


2.PROBLEM DEFINITION

  • Event slice

(en, sn, fn)(B,1,5),(D,8,4),(E,10,13),(F,10,13)

4 event intervals in sequence 2

Corresponding time set T={1,5,8,14,10,13,10,13}{s1, f1, s2, f2, s3, f3, s4, f4 }

Time sequence Ts ={1,5,8,10,13,14}{t1, t2, t3, …, tk}


2.PROBLEM DEFINITION

Event slice

  • Let L = {+, -, *, Φ}, and let Q = {q1, q2, …, qi, …} be a set of event sequences, where qi = {(e1, s1, f1), …, (ej, sj, fj), …, (en, sn, fn)}.


2.PROBLEM DEFINITION

  • Event slice

Start slice: D+ = (D, 8, 10)

Intermediate slice: D* = (D, 10, 13)

Finish slice: D- = (D, 13, 14)

The event interval B has only one intact slice, B = (B, 1, 5).
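
The slicing shown in this example can be reproduced by cutting every interval at each time point of the time sequence that falls inside it. The sketch below assumes that reading of the definition. The function name `slice_intervals` is hypothetical, and an intact slice is written without a suffix as a stand-in for the label Φ.

```python
def slice_intervals(event_sequence):
    """Cut each (event, start, finish) interval at the sequence's time points.

    Returns (label, start, finish) slices, where the label is the event symbol
    followed by '+' (start), '*' (intermediate), '-' (finish), or nothing
    when the interval is left intact.
    """
    ts = sorted({t for _, s, f in event_sequence for t in (s, f)})
    slices = []
    for event, s, f in event_sequence:
        cuts = [t for t in ts if s <= t <= f]  # time points inside the interval
        pieces = len(cuts) - 1
        for i in range(pieces):
            if pieces == 1:
                suffix = ""    # intact slice
            elif i == 0:
                suffix = "+"   # start slice
            elif i == pieces - 1:
                suffix = "-"   # finish slice
            else:
                suffix = "*"   # intermediate slice
            slices.append((event + suffix, cuts[i], cuts[i + 1]))
    return slices
```

Applied to the intervals (B, 1, 5), (D, 8, 14), (E, 10, 13), (F, 10, 13), this yields the three slices of D listed above and leaves B, E, and F intact.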


3.INCISION STRATEGY


3.INCISION STRATEGY

  • Incision example


3.INCISION STRATEGY

  • Incision example

The incision strategy completely avoids generating intermediate slices. Even with the intermediate slices trimmed, the relationship between any two intervals can still be expressed correctly.
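
The slides do not spell out the incision procedure, so the following is only a sketch under one assumption: that incision keeps the first and last piece of every interval that is cut by another interval's endpoint and never materialises the pieces in between. The function name `incise` and the tuple layout are hypothetical.

```python
def incise(event_sequence):
    """Split (event, start, finish) intervals into start/finish slices only.

    Assumption: an interval containing no other endpoint stays intact, and any
    other interval contributes just its start slice ('+') and finish slice ('-').
    """
    ts = sorted({t for _, s, f in event_sequence for t in (s, f)})
    slices = []
    for event, s, f in event_sequence:
        cuts = [t for t in ts if s <= t <= f]  # always contains at least s and f
        if len(cuts) == 2:
            slices.append((event, s, f))                     # intact slice
        else:
            slices.append((event + "+", cuts[0], cuts[1]))   # start slice
            slices.append((event + "-", cuts[-2], cuts[-1])) # finish slice
    return slices


q = [("B", 1, 5), ("D", 8, 14), ("E", 10, 13), ("F", 10, 13)]
print(incise(q))
# [('B', 1, 5), ('D+', 8, 10), ('D-', 13, 14), ('E', 10, 13), ('F', 10, 13)]
```

In this output D still starts before E and F begin and finishes after they end, so the containment relation survives without the intermediate slice.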


4.COINCIDENCE REPRESENTATION

  • Slices that occur simultaneously are grouped together to form coincidences.

  • Concatenating all coincidences describes an event sequence effectively.

  • This simplifies the processing of the complex pairwise relationships between all intervals.
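
A rough sketch of the grouping idea follows. It assumes that "simultaneously occurring" means sharing the same time segment and that slices arrive as `(label, start, finish)` tuples. The function name `coincidences` and the sample slices are illustrative only and not taken from the paper.

```python
from itertools import groupby


def coincidences(slices):
    """Group (label, start, finish) slices that share a time segment.

    Returns a list of tuples of labels, one tuple per coincidence,
    ordered by time.
    """
    # groupby only merges adjacent items, so sort by segment first.
    ordered = sorted(slices, key=lambda s: (s[1], s[2], s[0]))
    return [
        tuple(label for label, _, _ in group)
        for _, group in groupby(ordered, key=lambda s: (s[1], s[2]))
    ]


slices = [("B", 1, 5), ("D+", 8, 10), ("D-", 13, 14), ("E", 10, 13), ("F", 10, 13)]
print(coincidences(slices))
# [('B',), ('D+',), ('E', 'F'), ('D-',)]
```

The result is a single ordered list whose length grows with the number of distinct time segments rather than with the number of interval pairs.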


4.COINCIDENCE REPRESENTATION


4.COINCIDENCE REPRESENTATION

  • Good scalability

  • Nonambiguity

  • Simple is good

  • Compact space usage


5.CTMiner ALGORITHM


5.CTMiner ALGORITHM

min_sup = 2
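
The slide's worked example is not preserved in this transcript. To illustrate only what a minimum support threshold means, and not the CTMiner algorithm itself, the sketch below counts in how many sequences each slice label appears and keeps those reaching `min_sup`. The database contents and the function name `frequent_slices` are made up for the example.

```python
from collections import Counter


def frequent_slices(database, min_sup):
    """Return slice labels whose support (number of sequences) >= min_sup.

    `database` is a list of sequences, and each sequence is a list of
    coincidences given as tuples of slice labels.
    """
    support = Counter()
    for sequence in database:
        # Count each label at most once per sequence.
        seen = {label for coincidence in sequence for label in coincidence}
        support.update(seen)
    return {label: count for label, count in support.items() if count >= min_sup}


# Hypothetical database of three coincidence sequences.
db = [
    [("A+",), ("B",), ("A-",)],
    [("B",), ("C",)],
    [("A+",), ("A-",)],
]
print(sorted(frequent_slices(db, min_sup=2).items()))
# [('A+', 2), ('A-', 2), ('B', 2)]
```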


5.CTMiner ALGORITHM


5.CTMiner ALGORITHM


6.EXPERIMENTAL RESULTS

  • Runtime performance on synthetic data sets


6.EXPERIMENTAL RESULTS

  • Real world dataset analysis


7.CONCLUSION AND FUTURE WORK

  • The coincidence representation is nonambiguous and has several advantages over existing representations.


7.CONCLUSION AND FUTURE WORK

  • Future work: mining closed and maximal temporal patterns, incremental temporal pattern mining, and methods for data streams.

