Discovering Deformable Motifs in Time Series Data

Presentation Transcript


  1. Discovering Deformable Motifs in Time Series Data Jin Chen CSE 891-001 2012 Fall

  2. Background • Time series data are collected in multiple domains, including surveillance, pose tracking, ICU patient monitoring and finance. • These time series often contain motifs - “segments that repeat within and across different series”. • Discovering these repeated segments can provide primitives that are useful for domain understanding and as higher-level, meaningful features for segmenting time series or discriminating among time series from different groups. • Therefore, we need a tool that explains an entire time series in terms of repeated, warped motifs interspersed with non-repeating segments.

  3. Motif Detection Approaches • A motif is defined via a pair of windows of the same length that are closely matched in terms of Euclidean distance. • Such pairs are found via a sliding-window approach followed by random projections, which flag highly similar pairs that have not been identified before. • However, this method is not geared towards finding motifs that can exhibit significant deformation. Mueen et al., SDM 2009

  4. Motif Detection Approaches • Discover regions of high density in the space of all subsequences via clustering. • These works define a motif as a vector of means and variances over the length of the window, a representation that is also not geared to capturing deformable motifs. • To address deformation, dynamic time warping is used to measure warped distance. However, motifs often exhibit structured transformations, where the warp changes gradually over time. Minnen et al., AAAI 2007

  5. Motif Detection Approaches • This line of work focuses on developing probabilistic models for aligning sequences that exhibit variability. • However, these methods rely on having a segmentation of the time series into corresponding motifs. This assumption allows relatively few constraints on the model, rendering them highly under-constrained in certain settings. Listgarten et al., NIPS 2005

  6. Motif Detection Approaches

  7. Continuous Shape Template Model (CSTM) for Deformable Motif Discovery • CSTM is a hidden, segmental Markov model, in which each state either generates a motif or samples from a non-repeating random walk. • A motif is represented by smooth continuous functions that are subject to non-linear warp and scale transformations. • CSTM learns both the motifs and their allowed warps in an unsupervised way from unsegmented time series data. • A demonstration on three distinct real-world domains shows that CSTM achieves considerably better performance than previous methods.

  8. CSTM – Generative Model • CSTM assumes that the observed time series is generated by switching between a state that generates non-repeating segments and states that generate repeating (structurally similar) segments. • Motifs are generated as samples from a shape template that can undergo nonlinear transformations such as shrinkage, amplification or local shifts. The transformations applied at each observed time t for a sequence are tracked via latent states, the distribution over which is inferred. • Simultaneously, the canonical shape template and the likelihood of possible transformations for each template are learned from the data. • The random-walk state generates trajectory data without long-term memory. Thus, these segments lack repetitive structural patterns.

  9. Canonical Shape Template (CST) • Each shape template, indexed by k, is represented as a continuous function sk(l) where l in (0, Lk] and Lk is the length of the kth template. • In many domains, motifs appear as smooth functions. A promising representation is piecewise Bézier splines. Shape templates of varying complexity are intuitively represented by using fewer or more pieces. • A third-order Bézier curve is parameterized by four points pi (i in {0,1,2,3}) over two dimensions, where the 1st dimension is the time t and the 2nd dimension is the signal value.
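As an illustration of this representation, here is a minimal Python sketch that evaluates one third-order Bézier piece from its four control points; the function name and the sample control points are illustrative, not values from the paper.

import numpy as np

def bezier_piece(p, u):
    # Evaluate one cubic Bezier piece at parameter u in [0, 1].
    # p has shape (4, 2): four control points, column 0 = time, column 1 = signal value.
    u = np.asarray(u)[..., None]
    return ((1 - u) ** 3 * p[0]
            + 3 * (1 - u) ** 2 * u * p[1]
            + 3 * (1 - u) * u ** 2 * p[2]
            + u ** 3 * p[3])

# One smooth piece of a hypothetical shape template s_k
control_points = np.array([[0.0, 0.0], [1.0, 2.0], [2.0, 2.0], [3.0, 0.0]])
curve = bezier_piece(control_points, np.linspace(0.0, 1.0, 50))  # (50, 2) points (t, value)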

  10. Shape Transformation Model - Warping • Motifs are generated by non-uniform sampling and scaling of sk. Temporal warp can be introduced by moving slowly or quickly through sk. • The allowable temporal warps are specified as an ordered set {w1, … , wn} of time increments that determines the rate at which we advance through sk. A template-specific warp transition matrix Mkw specifies the probability of transitions between warp states. • To generate an observation series y1, … , yT, let wt in {w1, … , wn} be the random variable tracking the warp and Pt be the position within the template sk at time t. Then, yt+1 is generated from the value sk(Pt+1), where Pt+1 = Pt + wt+1 and wt+1 ~ Mkw(wt).
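A minimal Python sketch of the warp mechanism, assuming an illustrative warp set and transition matrix (not the learned, template-specific values from the paper):

import numpy as np

rng = np.random.default_rng(0)

warps = np.array([1.0, 2.0, 3.0])    # allowable time increments w1..wn (illustrative)
M_w = np.array([[0.80, 0.15, 0.05],  # warp transition matrix M_k^w (illustrative)
                [0.10, 0.80, 0.10],
                [0.05, 0.15, 0.80]])

def warp_positions(L_k, w_idx=0, P=0.0):
    # Advance through template s_k: P_{t+1} = P_t + w_{t+1}, with w_{t+1} ~ M_k^w(w_t).
    positions = []
    while P < L_k:
        w_idx = rng.choice(len(warps), p=M_w[w_idx])  # sample the next warp state
        P = P + warps[w_idx]                          # move through the template
        positions.append(P)
    return positions  # positions at which s_k is evaluated; the motif ends once P reaches L_k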

  11. Shape Transformation Model - Rescale • The allowable scaling coefficients are maintained as the set {c1, …, cn}. Let ϕt+1 in {c1, …, cn} be the scale value at time t+1, sampled from the scale transition matrix Mkϕ. • Thus, the observation yt+1 is generated around the value ϕt+1 sk(Pt+1), a scaled version of the value of the motif at Pt+1, where ϕt+1 ~ Mkϕ(ϕt). • An additive noise value vt+1 ~ N(0, σ) models small shifts. • In summary, putting together all three possible deformations, we have that yt+1 = vt+1 + ϕt+1 sk(Pt+1).
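Putting the three deformations together, a single generative step can be sketched as follows (Python; the scale set, transition matrix and noise level are illustrative, and s_k stands for any callable shape template such as the Bézier piece above):

import numpy as np

rng = np.random.default_rng(1)

scales = np.array([0.8, 1.0, 1.2])     # allowable scaling coefficients c1..cn (illustrative)
M_phi = np.array([[0.70, 0.20, 0.10],  # scale transition matrix M_k^phi (illustrative)
                  [0.15, 0.70, 0.15],
                  [0.10, 0.20, 0.70]])
sigma = 0.05                           # standard deviation of the additive noise v_t (illustrative)

def emit(s_k, P_next, phi_idx):
    # One observation: y_{t+1} = v_{t+1} + phi_{t+1} * s_k(P_{t+1}), with phi_{t+1} ~ M_k^phi(phi_t).
    phi_idx = rng.choice(len(scales), p=M_phi[phi_idx])
    v = rng.normal(0.0, sigma)
    return v + scales[phi_idx] * s_k(P_next), phi_idx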

  12. Non-repeating Random Walk (NRW) • The NRW model is used to capture data not generated from the templates. • If this data has different noise characteristics, the task becomes simpler, as the noise characteristics can help disambiguate between motif-generated segments and NRW segments. • The generation of smooth series can be modeled using an autoregressive process.
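For example, a simple AR(1) process generates a smooth non-repeating segment; the coefficients below are illustrative only, not the NRW parameters learned by CSTM:

import numpy as np

rng = np.random.default_rng(2)

def nrw_segment(length, y0=0.0, a=0.95, noise_std=0.1):
    # Non-repeating random walk sketched as an AR(1) process: y_t = a * y_{t-1} + noise.
    y = np.empty(length)
    y[0] = y0
    for t in range(1, length):
        y[t] = a * y[t - 1] + rng.normal(0.0, noise_std)
    return y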

  13. Template Transitions • Transitions between generating NRW data and motifs from the CSTs are modeled via a transition matrix T of size (K+1)×(K+1), where K is the number of CSTs. • The random variable Kt tracks the template for an observed series. Transitions into and out of templates are only allowed at the start and end of the template, respectively. • Thus, when the position within the template is at the end, we have that Kt ~ T(Kt−1); otherwise Kt = Kt−1. • For T, we fix the self-transition parameter for the NRW state as λ, a pre-specified input. Different settings of λ allow control over the proportion of data assigned to motifs versus NRW.
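A minimal Python sketch of the template transition step; K, λ and the uniform rows of T below are illustrative choices, not learned values:

import numpy as np

rng = np.random.default_rng(3)

K = 3                         # number of shape templates (illustrative)
lam = 0.9                     # pre-specified NRW self-transition probability lambda
T = np.empty((K + 1, K + 1))  # states: 0 = NRW, 1..K = templates
T[0, 0] = lam
T[0, 1:] = (1.0 - lam) / K    # remaining mass spread over the templates
T[1:, :] = 1.0 / (K + 1)      # non-NRW rows kept uniform for this sketch

def next_template(k_prev, at_template_boundary):
    # K_t ~ T(K_{t-1}) only at the start/end of a template; otherwise K_t = K_{t-1}.
    if at_template_boundary:
        return rng.choice(K + 1, p=T[k_prev])
    return k_prev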

  14. Learning the Model • The canonical shape templates, their template-specific transformation models, the NRW model, the template transition matrix and the latent states for the observed series are all inferred from the data using hard EM. • Coordinate ascent is used to update model parameters in the M-step. • In the E-step, given the model parameters, Viterbi is used for inferring the latent trace.

  15. Learning the Model

  16. Results

  17. Results

  18. Results

  19. Results

  20. Conclusions • Warp-invariant signatures can be used for a forward lookup within beam pruning to significantly speed up inference when K, the number of templates, is large. • A Bayesian nonparametric prior is another approach that could be used to systematically control the number of classes based on model complexity. • A different extension could build a hierarchy of motifs, where larger motifs are comprised of multiple occurrences of smaller motifs, thereby possibly providing an understanding of the data at different time scales. • This work can serve as a basis for building nonparametric priors over deformable multivariate curves.
