
Time Series Shapelets: A New Primitive for Data Mining

Lexiang Ye and Eamonn Keogh

University of California, Riverside

- Classification
- Huge interest in time series
- Extensive applications

- Nearest Neighbor
- Most accurate (in extensive empirical tests)
- Robust
- Simple

- Drawbacks of Nearest Neighbor
- High time and space complexity
- Results are not interpretable

- Shapelets
- Shapelets are time series subsequences that are maximally representative of a class
- Distinguishing substring selection
- Probe design (computational biology)

[Figure: Leaf Decision Tree. A yes/no test on shapelet I (Shapelet Dictionary entry I, value 5.1) leads to leaves 0 and 1, separating false nettles from stinging nettles.]

[Figure: Candidates Pool]

- Information gain measures the utility of a candidate

- Arrange the time series objects by their distance to the candidate
- Find the optimal split point
- Pick the candidate achieving the best utility as the shapelet
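The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names `entropy` and `best_split` are hypothetical, the distances are taken as already computed, and the split point is placed midway between two neighboring distances.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_split(distances, labels):
    """Return (information_gain, split_point) of the best distance threshold.

    distances[i] is the distance from time series i to one candidate;
    labels[i] is the class of time series i.
    """
    # Arrange the objects on a line by their distance to the candidate.
    order = sorted(zip(distances, labels), key=lambda pair: pair[0])
    n = len(order)
    base = entropy([lab for _, lab in order])
    best_gain, best_point = 0.0, None
    for i in range(1, n):
        d_left, d_right = order[i - 1][0], order[i][0]
        if d_left == d_right:
            continue  # no threshold can separate equal distances
        left = [lab for _, lab in order[:i]]
        right = [lab for _, lab in order[i:]]
        gain = base - (len(left) / n) * entropy(left) - (len(right) / n) * entropy(right)
        if gain > best_gain:
            best_gain, best_point = gain, (d_left + d_right) / 2
    return best_gain, best_point


# Hypothetical distances for four objects from two classes.
gain, split = best_split([0.5, 1.0, 4.0, 5.0], ["A", "A", "B", "B"])
```

Running `best_split` once per candidate and keeping the candidate with the largest gain corresponds to the last bullet.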

[Figure: a candidate drawn from the Candidates Pool, with the Split Point marked on a line starting at 0.]

- Total number of candidates
- Trace dataset
- 200 instances, each of length 275
- 7,480,200 shapelet candidates
- Brute-force search takes approximately three days
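The candidate count can be checked with a short calculation. The slide does not state the range of candidate lengths; the sketch below assumes every length from 3 up to the full series length of 275 is allowed, which is an assumption chosen because it reproduces the figure on the slide. The function name `count_candidates` is hypothetical.

```python
def count_candidates(num_series, series_len, min_len, max_len):
    """Count all subsequences with length in [min_len, max_len] in the dataset."""
    # A series of length n contains n - l + 1 subsequences of length l.
    per_series = sum(series_len - l + 1 for l in range(min_len, max_len + 1))
    return num_series * per_series


# Trace dataset from the slide; min_len=3 and max_len=275 are assumptions.
total = count_candidates(num_series=200, series_len=275, min_len=3, max_len=275)
print(total)  # 7480200
```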


- Distance calculations from time series objects to shapelet candidates are the most expensive part
- Reduce the time in two ways
- Distance Early Abandon (known idea)
- Admissible Entropy Pruning (novel idea)
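Distance early abandon can be illustrated as follows. This is a minimal sketch, assuming plain Euclidean distance with no normalization and taking the distance from a time series to a candidate as the minimum over all same-length windows; the function name `subsequence_dist` is hypothetical.

```python
import math


def subsequence_dist(series, candidate):
    """Minimum Euclidean distance between candidate and any window of series.

    A window is abandoned as soon as its running sum of squared
    differences reaches the best (smallest) sum found so far.
    """
    m = len(candidate)
    best_sq = math.inf
    for start in range(len(series) - m + 1):
        acc = 0.0
        for j in range(m):
            diff = series[start + j] - candidate[j]
            acc += diff * diff
            if acc >= best_sq:
                break  # early abandon: this window cannot beat the best
        else:
            best_sq = acc  # loop finished without abandoning
    return math.sqrt(best_sq)


# Hypothetical data: the candidate occurs exactly in the series.
d = subsequence_dist([0.0, 1.0, 2.0, 3.0, 4.0], [2.0, 3.0])
```

The result is the same as without early abandon; only the number of point-wise operations changes.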

- Information Gain
- Traditional evaluation measure in decision trees
- Easily generalized to the multi-class problem
- Reduce the number of distance calculations
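One way to read the pruning idea is as an optimistic bound on information gain: once some distances to a candidate are known, the remaining objects are placed where they would help the split most, and the candidate is dropped if even that bound cannot beat the best gain so far. The sketch below covers the two-class case only and is an interpretation, not the authors' implementation; all function names are hypothetical, and the remaining objects are placed at minus and plus infinity as stand-ins for the two ends of the distance line.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_gain(distances, labels):
    """Largest information gain over all distance thresholds."""
    order = sorted(zip(distances, labels), key=lambda pair: pair[0])
    n = len(order)
    base = entropy([lab for _, lab in order])
    best = 0.0
    for i in range(1, n):
        if order[i - 1][0] == order[i][0]:
            continue  # equal distances cannot be separated
        left = [lab for _, lab in order[:i]]
        right = [lab for _, lab in order[i:]]
        gain = base - (len(left) / n) * entropy(left) - (len(right) / n) * entropy(right)
        best = max(best, gain)
    return best


def optimistic_gain(known_dists, known_labels, remaining_labels):
    """Optimistic information gain when some distances are still unknown.

    Assumes exactly two classes. Each class of remaining objects is sent
    to one end of the line, and both assignments are tried.
    """
    a, b = sorted(set(known_labels) | set(remaining_labels))
    labels = list(known_labels) + list(remaining_labels)
    bound = 0.0
    for near in (a, b):
        virtual = [-math.inf if lab == near else math.inf for lab in remaining_labels]
        bound = max(bound, best_gain(list(known_dists) + virtual, labels))
    return bound


# Hypothetical state: three distances computed, two objects still unseen.
bound = optimistic_gain([1.0, 2.0, 3.0], ["A", "B", "A"], ["A", "B"])
best_so_far = 0.5
can_prune = bound < best_so_far  # skip the remaining distance calculations
```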

[Figure: stinging nettles and false nettles arranged on a line starting at 0, with information gain values I = 0.42 and I = 0.29.]

Classification

[Figure: Leaf Decision Tree and Shapelet Dictionary (shapelet I, value 5.1). A yes/no test on shapelet I leads to leaves 0 and 1, separating false nettles from stinging nettles.]
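Classification with a one-node shapelet tree such as the Leaf Decision Tree can be sketched as a single distance test. The code assumes the node asks whether the time series contains a subsequence within a threshold distance of the shapelet; the function names, the toy data, and the labels "class A" and "class B" are hypothetical, and the sketch does not say which branch corresponds to which nettle class.

```python
import math


def subsequence_dist(series, shapelet):
    """Minimum Euclidean distance between shapelet and any window of series."""
    m = len(shapelet)
    return min(
        math.sqrt(sum((series[s + j] - shapelet[j]) ** 2 for j in range(m)))
        for s in range(len(series) - m + 1)
    )


def classify(series, shapelet, threshold, label_if_near, label_if_far):
    """Single decision node: compare the distance to the shapelet with a threshold."""
    if subsequence_dist(series, shapelet) <= threshold:
        return label_if_near
    return label_if_far


# Toy example with a made-up shapelet and threshold.
shapelet = [0.0, 5.0, 0.0]
with_bump = classify([0.0, 0.0, 5.0, 0.0, 0.0], shapelet, 1.0, "class A", "class B")
flat = classify([0.0, 0.0, 0.0, 0.0, 0.0], shapelet, 1.0, "class A", "class B")
```

Only one shapelet and one threshold per node need to be stored, and a prediction costs one pass over the query series per node.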

[Figure: running time in seconds (Brute Force vs. Pruning, axis up to 5 × 10^5) and accuracy (axis from 0.80 to 1.00) plotted against |D|, the number of objects in the database. Currently best published accuracy: 91.1%.]

[Figure: Arrowhead Decision Tree with shapelet nodes I and II and leaves 0, 1 and 2; classes shown include Avonlea and Clovis. Shapelet Dictionary: I (Clovis), 11.24; II (Avonlea), 85.47.]

[Figure: Wheat Decision Tree with shapelet nodes I to VI and leaves 0 to 6, its Shapelet Dictionary (shapelets I to VI), and one sample from each class.]

[Figure: Gun Decision Tree with shapelet node I and leaves 0 and 1, for the Gun and No Gun classes. Shapelet Dictionary: I (No Gun), 38.94. The values 0.909, 0.902 and 0.860 also appear in the figure.]

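[Figure: Walk Decision Tree with shapelet node I and leaves 0 and 1, built on right toe and left toe data. Shapelet Dictionary: I (Normal Walk), 144.075. The value 0.535 also appears in the figure.]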

- Interpretable results
- More accurate and robust
- Significantly faster at classification

Thank You

Questions?

- All of the datasets are free to download at http://www.cs.ucr.edu/~lexiangy/shapelet.html
- Code available upon request