
### An Efficient Algorithm for Mining Time Interval-based Patterns in Large Databases

Yi-Cheng Chen, Ji-Chiang Jiang, Wen-Chih Peng and Suh-Yin Lee

Department of Computer Science

National Chiao Tung University

Hsinchu, Taiwan 300

{ejen.cs95g, perrys0620.cs96g}@nctu.edu.tw [email protected] [email protected]

CIKM, 2010

OUTLINE

- 1. INTRODUCTION
- 2. PROBLEM DEFINITION
- 3. INCISION STRATEGY
- 4. COINCIDENCE REPRESENTATION
- 5. CTMiner ALGORITHM
- 6. EXPERIMENTAL RESULTS
- 7. CONCLUSION AND FUTURE WORK

1. INTRODUCTION

- All related research in this domain is based on Allen's temporal logic.
- Allen's logic defines 13 temporal relations between any two event intervals.
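To make the 13 relations concrete, here is a minimal sketch that classifies a pair of intervals. The function name `allen_relation` and the relation labels are illustrative choices, not taken from the slides; each interval is assumed to be a `(start, finish)` pair with `start < finish`.

```python
def allen_relation(a, b):
    """Return the Allen relation of interval a relative to interval b.

    Each interval is a (start, finish) pair with start < finish.
    The label strings are illustrative names for the 13 relations.
    """
    (s1, f1), (s2, f2) = a, b
    if f1 < s2:
        return "before"
    if f2 < s1:
        return "after"
    if f1 == s2:
        return "meets"
    if f2 == s1:
        return "met-by"
    if s1 == s2 and f1 == f2:
        return "equal"
    if s1 == s2:
        return "starts" if f1 < f2 else "started-by"
    if f1 == f2:
        return "finishes" if s1 > s2 else "finished-by"
    if s2 < s1 and f1 < f2:
        return "during"
    if s1 < s2 and f2 < f1:
        return "contains"
    # Remaining case: the intervals partially overlap.
    return "overlaps" if s1 < s2 else "overlapped-by"


print(allen_relation((1, 5), (8, 14)))    # before
print(allen_relation((8, 14), (10, 13)))  # contains
```

Every pair of intervals falls into exactly one branch, which is why pairwise relation handling grows quickly as the number of intervals in a sequence increases.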

Comparison with previous work:

- Kam et al. - hierarchical representation.
- Höppner - scans the database with a sliding window.
- Papapetrou - Hybrid-DFS algorithm.
- Wu et al. - TPrefixSpan.
- Patel et al. - augmented representation (with additional counting information) and IEMiner.

This work proposes:

- Incision strategy
- Coincidence representation
- CTMiner (Coincidence Temporal Miner)

2. PROBLEM DEFINITION

Event interval and event sequence

- Let E = {e1, e2, …, ek} be the set of event symbols.
- An event interval is a triple (ei, si, fi), where ei ∈ E and si, fi are time points with si < fi.
- Event start time: ei.ts; event finish time: ei.tf.
- An event sequence is a list {(e1, s1, f1), (e2, s2, f2), …, (en, sn, fn)} where si ≤ si+1 and si < fi.
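The definitions above can be sketched as a small data structure. The names `EventInterval` and `make_event_sequence` are hypothetical, and the tie-breaking rule for equal start times is an assumption, since the slides only require si ≤ si+1.

```python
from typing import NamedTuple


class EventInterval(NamedTuple):
    event: str     # event symbol e_i
    start: float   # start time s_i
    finish: float  # finish time f_i


def make_event_sequence(intervals):
    """Check s_i < f_i for every interval and order them by start time."""
    for iv in intervals:
        if not iv.start < iv.finish:
            raise ValueError(f"invalid interval: {iv}")
    # Assumption: ties on start time are broken by finish time, then symbol.
    return sorted(intervals, key=lambda iv: (iv.start, iv.finish, iv.event))


seq = make_event_sequence([EventInterval("D", 8, 14), EventInterval("B", 1, 5)])
print([iv.event for iv in seq])  # ['B', 'D']
```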

Temporal database

- A database D = {r1, r2, …, rm} consists of records ri, where 1 ≤ i ≤ m.
- A record ri consists of a sequence-id and an event interval (with its start time and finish time).
- Records in the database D with the same sequence-id are grouped together.
- Database D can be viewed as a collection of event sequences.
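A minimal sketch of viewing the database as a collection of event sequences follows. The record layout `(sequence_id, event, start, finish)` and the function name `group_by_sequence_id` are assumptions for illustration; the record for sequence 1 is made up, while the records for sequence 2 follow the slide example.

```python
from collections import defaultdict


def group_by_sequence_id(records):
    """Group (sequence_id, event, start, finish) records into event sequences."""
    sequences = defaultdict(list)
    for sid, event, start, finish in records:
        sequences[sid].append((event, start, finish))
    # Order each sequence by start time, as the event sequence definition requires.
    return {sid: sorted(ivs, key=lambda iv: iv[1]) for sid, ivs in sequences.items()}


records = [
    (2, "D", 8, 14),
    (1, "A", 2, 7),   # hypothetical record, not from the slides
    (2, "B", 1, 5),
    (2, "E", 10, 13),
]
db = group_by_sequence_id(records)
print(db[2])  # [('B', 1, 5), ('D', 8, 14), ('E', 10, 13)]
```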

Time set and time sequence

- Let q = {(e1, s1, f1), (e2, s2, f2), …, (en, sn, fn)} be an event sequence.
- The set T = {s1, f1, s2, f2, …, si, fi, …, sn, fn} is called the time set corresponding to sequence q.
- Ordering all elements of T and eliminating duplicates yields the time sequence Ts = {t1, t2, t3, …, tk}, where ti ∈ T and ti < ti+1.

- Event slice

Sequence 2 contains 4 event intervals (en, sn, fn): (B,1,5), (D,8,14), (E,10,13), (F,10,13)

Corresponding time set T = {s1, f1, s2, f2, s3, f3, s4, f4} = {1, 5, 8, 14, 10, 13, 10, 13}

Time sequence Ts = {t1, t2, …, tk} = {1, 5, 8, 10, 13, 14}
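The time sequence for this example can be reproduced with a short sketch. The function name `time_sequence` is hypothetical, and interval D is taken as (D, 8, 14), which is what the listed time set implies.

```python
def time_sequence(event_sequence):
    """Collect all start/finish points, drop duplicates, and sort ascending."""
    time_set = [t for _, start, finish in event_sequence for t in (start, finish)]
    return sorted(set(time_set))


seq2 = [("B", 1, 5), ("D", 8, 14), ("E", 10, 13), ("F", 10, 13)]
print(time_sequence(seq2))  # [1, 5, 8, 10, 13, 14]
```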

- Let L = {+, -, *, Φ}, and let Q = {q1, q2, …, qi, …} be a set of event sequences, where qi = {(e1, s1, f1), …, (ej, sj, fj), …, (en, sn, fn)}.

Slices of event interval D:

- Start slice: D+ = (D, 8, 10)
- Intermediate slice: D* = (D, 10, 13)
- Finish slice: D- = (D, 13, 14)

The event interval B has only one intact slice, B = (B, 1, 5).
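The slicing in this example can be sketched as follows. This assumes an interval is cut at every point of the time sequence Ts that lies strictly inside it, which is inferred from the example rather than stated on the slides. The function name `slice_interval` and the `"intact"` label are hypothetical.

```python
def slice_interval(interval, ts):
    """Split one event interval at the points of the sorted time sequence ts.

    Returns (event, kind, start, finish) tuples, where kind is "+" (start
    slice), "*" (intermediate slice), "-" (finish slice), or "intact".
    """
    event, start, finish = interval
    cuts = [t for t in ts if start < t < finish]
    points = [start, *cuts, finish]
    if len(points) == 2:
        return [(event, "intact", start, finish)]
    slices = []
    last = len(points) - 2
    for i in range(len(points) - 1):
        kind = "+" if i == 0 else "-" if i == last else "*"
        slices.append((event, kind, points[i], points[i + 1]))
    return slices


ts = [1, 5, 8, 10, 13, 14]
print(slice_interval(("D", 8, 14), ts))
# [('D', '+', 8, 10), ('D', '*', 10, 13), ('D', '-', 13, 14)]
print(slice_interval(("B", 1, 5), ts))
# [('B', 'intact', 1, 5)]
```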

3. INCISION STRATEGY

- Incision example

The incision strategy completely avoids generating intermediate slices. Even with the intermediate slices trimmed, the relationship between any two intervals can still be expressed correctly.

4. COINCIDENCE REPRESENTATION

- Group simultaneously occurring slices together to form coincidences.
- Concatenating all coincidences describes an event sequence effectively.
- This simplifies the processing of complex pairwise relationships between all intervals.
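A rough sketch of the grouping idea follows, under the assumption that "simultaneously occurring" means slices covering the same time segment. The exact encoding used by CTMiner may differ. The function name `coincidences`, the tuple layout, and the `drop_intermediate` flag (which mimics trimming intermediate slices) are all hypothetical; the input slices come from the slide example.

```python
from collections import defaultdict


def coincidences(slices, drop_intermediate=False):
    """Group (event, kind, start, finish) slices that share a time segment.

    Returns the groups in time order; each group lists slice labels such
    as "D+" or "E" (intact slices carry no suffix).
    """
    groups = defaultdict(list)
    for event, kind, start, finish in slices:
        if drop_intermediate and kind == "*":
            continue  # assumption: trimming simply skips intermediate slices
        label = event if kind == "intact" else event + kind
        groups[(start, finish)].append(label)
    return [sorted(groups[segment]) for segment in sorted(groups)]


slices = [
    ("B", "intact", 1, 5),
    ("D", "+", 8, 10),
    ("D", "*", 10, 13),
    ("E", "intact", 10, 13),
    ("F", "intact", 10, 13),
    ("D", "-", 13, 14),
]
print(coincidences(slices))
# [['B'], ['D+'], ['D*', 'E', 'F'], ['D-']]
print(coincidences(slices, drop_intermediate=True))
# [['B'], ['D+'], ['E', 'F'], ['D-']]
```

In the trimmed output, D+ before and D- after the group containing E and F still shows that D contains both intervals, which illustrates why intermediate slices carry no extra relational information in this example.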

Advantages of the coincidence representation:

- Good scalability
- Nonambiguity
- Simple is good
- Compact space usage

5. CTMiner ALGORITHM

min_sup = 2

6. EXPERIMENTAL RESULTS

- Runtime performance on synthetic data sets
- Real-world dataset analysis

7. CONCLUSION AND FUTURE WORK

- The coincidence representation is nonambiguous and has several advantages over existing representations.
- Future work: mining closed and maximal temporal patterns, incremental temporal pattern mining, and methods for data streams.
