Time Series Sequence Matching

- By **elam**


Papers

- “Fast Subsequence Matching in Time-Series Databases”, Christos Faloutsos, M. Ranganathan, Yannis Manolopoulos
- “Skyline Index for Time Series Data”, Quanzhong Li, Ines Fernando Vega Lopez, Bongki Moon

Types of time series sequences

- Financial, marketing area
- Stock prices
- Sales numbers

- Scientific databases
- Weather data
- Environmental data

Categories for time series sequence matching

- Whole matching
- Data sequences and query sequence have the same length

- Subsequence matching
- Query sequence and data sequences have different lengths

Whole matching

- Given N sequences of the same length l
- Use a feature extraction function to convert each sequence into an n-dimensional point
- DFT
- n-dimensional point (Q1, Q2, …, Qn)
- Most of the energy is in the first few coefficients
- Keep only the first few coefficients
- This reduces the dimensionality of each sequence
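
The feature extraction step above can be sketched as follows. This is a minimal illustration, not the papers' code: the function name `dft_features`, the choice of `k`, and the sample sequence are assumptions made for the example. The DFT is scaled by 1/sqrt(n) so that it preserves Euclidean distances (Parseval's theorem), which is what makes dropping coefficients safe for filtering.

```python
import cmath
import math

def dft_features(seq, k=2):
    """Return the first k DFT coefficients of seq, scaled by 1/sqrt(n)."""
    n = len(seq)
    scale = 1.0 / math.sqrt(n)
    feats = []
    for f in range(k):
        # Coefficient f is the correlation of seq with a complex sinusoid.
        coeff = sum(x * cmath.exp(-2j * math.pi * f * t / n)
                    for t, x in enumerate(seq))
        feats.append(coeff * scale)
    return feats

# Hypothetical sequence of length 8, reduced to 2 complex coefficients.
seq = [1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0, 2.0]
features = dft_features(seq, k=2)
```

Coefficient 0 is the scaled sum of the sequence; each further coefficient adds one frequency component.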

Whole matching

- Map each sequence as an n-dimensional point into the feature space
- Only take the first 2 coefficients

- Organize these points into an R-tree
- Used for indexing and searching

Whole matching

- For an incoming query sequence
- Use the DFT to convert it to a feature point
- Map the query feature point into the feature space
- Find the points whose distance to the query point is within tolerance e
- Consider the corresponding sequences similar
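
A hedged sketch of this query procedure follows. A linear scan stands in for the R-tree search, and the names `range_query`, `euclidean`, and the sample data are assumptions for illustration. Because the truncated, 1/sqrt(n)-scaled DFT can only shrink Euclidean distances, the feature-space filter never discards a true match; candidates are then checked against the full sequences to remove false alarms.

```python
import cmath
import math

def dft_features(seq, k=2):
    """First k DFT coefficients, scaled by 1/sqrt(n) to preserve distances."""
    n = len(seq)
    return [sum(x * cmath.exp(-2j * math.pi * f * t / n)
                for t, x in enumerate(seq)) / math.sqrt(n)
            for f in range(k)]

def euclidean(a, b):
    """Euclidean distance; abs() handles both real and complex entries."""
    return math.sqrt(sum(abs(x - y) ** 2 for x, y in zip(a, b)))

def range_query(data, query, eps, k=2):
    qf = dft_features(query, k)
    # Filter step: feature distance is a lower bound on the true distance.
    candidates = [i for i, s in enumerate(data)
                  if euclidean(dft_features(s, k), qf) <= eps]
    # Refine step: verify candidates on the full sequences.
    return [i for i in candidates if euclidean(data[i], query) <= eps]

# Hypothetical data set of three sequences of equal length.
data = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.1, 2.9, 4.0],
    [4.0, 3.0, 2.0, 1.0],
]
matches = range_query(data, [1.0, 2.0, 3.0, 4.0], eps=0.5)
```

In a real index the filter step would be an R-tree range search rather than a scan over all feature points.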

Some pictures of time series data and DFT

- Discrete Fourier Transform (DFT)
- Keep the first few (2-3) coefficients
- The first few coefficients contain most of the energy of the sequence

Feature space

- TS1(0.05,3)
- TS2(0.01,12)
- ……

Feature space

- The distance e < minimum query distance

Subsequence matching

- A collection of N sequences, each of which may have a different length
- A query Q with tolerance e
- Find all sequences S_i (1 ≤ i ≤ N), along with the correct offsets k, such that the subsequence S_i[k : k+Len(Q)-1] matches the query sequence: D(Q, S_i[k : k+Len(Q)-1]) ≤ e
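
As a baseline for the problem statement above, a sequential scan can be sketched like this. It assumes D is the Euclidean distance and uses 0-based indices for both sequences and offsets; the function name and sample data are illustrative only.

```python
import math

def subsequence_matches(sequences, query, eps):
    """Return (sequence index, offset) pairs whose window is within eps of query."""
    m = len(query)
    hits = []
    for i, s in enumerate(sequences):
        # Try every offset at which a window of length m fits inside s.
        for k in range(len(s) - m + 1):
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(s[k:k + m], query)))
            if d <= eps:
                hits.append((i, k))
    return hits

# Hypothetical data sequences of different lengths.
sequences = [
    [0.0, 1.0, 2.0, 3.0, 2.0, 1.0],
    [5.0, 1.0, 2.0, 3.1],
]
hits = subsequence_matches(sequences, [1.0, 2.0, 3.0], eps=0.2)
```

This scan touches every offset of every sequence, which is the cost an index is meant to avoid.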

ST-index

- Assume a minimum query length w
- Slide a window of size w over each data sequence, placing it at every possible offset
- Extract the features of the window at each offset and map them as a point into the feature space
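
The sliding-window step might look like the sketch below, assuming DFT features as in whole matching. The names `trail` and `dft_features` and the sample sequence are assumptions for the example. A sequence of length Len(S) yields Len(S)-w+1 windows, hence that many feature points.

```python
import cmath
import math

def dft_features(window, k=2):
    """First k DFT coefficients of a window, scaled by 1/sqrt(n)."""
    n = len(window)
    return [sum(x * cmath.exp(-2j * math.pi * f * t / n)
                for t, x in enumerate(window)) / math.sqrt(n)
            for f in range(k)]

def trail(seq, w, k=2):
    """Map every length-w window of seq to a point in feature space."""
    return [dft_features(seq[off:off + w], k)
            for off in range(len(seq) - w + 1)]

# Hypothetical data sequence and window size.
seq = [1.0, 2.0, 4.0, 3.0, 5.0, 6.0, 4.0, 2.0]
points = trail(seq, w=4)
```

Consecutive windows share w-1 values, so consecutive feature points tend to lie close together.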

Figure

- Sliding window on the sequence at offsets 0 to Len(S)-w (Len(S)-w+1 positions in total)
- The length of window is w

Result

- The series of points in the feature space forms a curve (a trail)
- R-tree

MBRs

- Storing every point individually in the R-tree is inefficient
- Divide the trail into sub-trails, each represented by a minimum bounding rectangle (MBR)

MBRs in R-tree

- Combine small MBRs
- Get the index information

How to insert points into MBRs

- Group the points into MBRs with a fixed number of points per MBR
- Group the points into MBRs with a variable number of points per MBR

I-adaptive method

- A greedy algorithm
- Number of disk accesses
- Cost function
- Average cost function

Algorithm

- Assign the first point of the trail to a sub-trail
- For each successive point
- If it increases the average cost of the current sub-trail
- Then start another sub-trail
- Else include this point in the current sub-trail
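
The greedy rule above can be sketched as follows. The slides do not give the cost function, so this example assumes one: the estimated disk accesses for an MBR are taken as the product of (side length + 0.5) over all dimensions, and the average cost is that estimate divided by the number of points in the sub-trail. The function names and the 2-D sample trail are likewise hypothetical.

```python
def mbr_cost(points):
    """Assumed cost model: product of (MBR side length + 0.5) per dimension."""
    cost = 1.0
    for d in range(len(points[0])):
        vals = [p[d] for p in points]
        cost *= (max(vals) - min(vals)) + 0.5
    return cost

def split_trail(trail):
    """Greedily divide a trail of feature points into sub-trails."""
    sub_trails = [[trail[0]]]
    for p in trail[1:]:
        current = sub_trails[-1]
        old_avg = mbr_cost(current) / len(current)
        new_avg = mbr_cost(current + [p]) / (len(current) + 1)
        if new_avg > old_avg:
            # The point makes the current MBR costlier per point: start anew.
            sub_trails.append([p])
        else:
            current.append(p)
    return sub_trails

# Hypothetical trail: three nearby points, then a jump to two nearby points.
trail = [(0.0, 0.0), (0.1, 0.1), (0.2, 0.1), (5.0, 5.0), (5.1, 5.1)]
parts = split_trail(trail)
```

With this sample trail the jump to (5.0, 5.0) would inflate the first MBR, so the algorithm closes it and opens a second sub-trail.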

Skyline index for time series data

- “Skyline Index for Time Series Data”, Quanzhong Li, Ines Fernando Vega Lopez, Bongki Moon

Adaptive Piecewise Constant Approximation (APCA)

- What is APCA?

Adaptive Piecewise Constant Approximation (APCA)

- Limitation of APCA
- Internal overlap in MBRs

Skyline Bounding Region (SBR)

- SBR
- N time series data objects of length l
- Specify 2-dimensional regions by top and bottom skylines
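
One way to read the definition above: for a group of time series of equal length, the top skyline is the pointwise maximum and the bottom skyline is the pointwise minimum. The sketch below illustrates that reading; the function name and sample data are assumptions.

```python
def skyline_bounding_region(series):
    """Return (top, bottom) skylines enclosing all equal-length series."""
    # zip(*series) iterates over time positions, yielding all values at each.
    top = [max(vals) for vals in zip(*series)]
    bottom = [min(vals) for vals in zip(*series)]
    return top, bottom

# Hypothetical group of three time series of length 4.
series = [
    [1.0, 3.0, 2.0, 5.0],
    [2.0, 1.0, 4.0, 4.0],
    [0.0, 2.0, 3.0, 6.0],
]
top, bottom = skyline_bounding_region(series)
```

Every series in the group lies between `bottom` and `top` at every time position.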

Approximate SBR

- Many approaches
- Equal-length constant-valued segments
- Variable-length constant-valued segments

- The approximate SBR (ASBR) always covers the original SBR
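
A minimal sketch of the equal-length variant, assuming each segment of the top skyline is replaced by its maximum and each segment of the bottom skyline by its minimum. This choice guarantees the approximation covers the original region. The function name, segment count, and sample skylines are hypothetical.

```python
import math

def approximate_sbr(top, bottom, num_segments):
    """Approximate skylines with equal-length constant-valued segments."""
    n = len(top)
    seg_len = math.ceil(n / num_segments)
    approx_top, approx_bottom = [], []
    for start in range(0, n, seg_len):
        t_seg = top[start:start + seg_len]
        b_seg = bottom[start:start + seg_len]
        # Raise the top and lower the bottom so nothing is cut off.
        approx_top.extend([max(t_seg)] * len(t_seg))
        approx_bottom.extend([min(b_seg)] * len(b_seg))
    return approx_top, approx_bottom

# Hypothetical skylines of length 4, approximated with 2 segments.
top = [2.0, 3.0, 4.0, 6.0]
bottom = [0.0, 1.0, 2.0, 4.0]
approx_top, approx_bottom = approximate_sbr(top, bottom, num_segments=2)
```

Only one value per segment needs to be stored for each skyline, at the price of a looser bound.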

Indexing approximate SBRs

- R-tree based Skyline index
- Internal node
- Approximate SBR
- Pointer to child node

- Leaf node
- Pointer to time series data

The End

Thank You
