
Infinite Hierarchical Hidden Markov Models


Presentation Transcript


1. Infinite Hierarchical Hidden Markov Models
AISTATS 2009. Katherine A. Heller, Yee Whye Teh and Dilan Görür
Presented by Lu Ren, ECE@Duke University, Nov 23, 2009

2. Outline
• Hierarchical structure learning for sequential data
• Hierarchical hidden Markov model (HHMM)
• Infinite hierarchical hidden Markov model (IHHMM)
• Inference and learning
• Experiment results and demonstrations
• Related work and extensions

3. Multi-scale Structure
[Figure: sequential data generated by the model, and the sampled "states" used to generate the data.]
• Goal: infer correlations among observations over long periods in the observation sequence.
• Potential applications: multi-resolution structure learning for language, video structure discovery, activity detection, etc.

4. Hierarchical HMM (HHMM)
• Hierarchical hidden Markov models (HHMMs) are multi-scale models of sequences in which each level of the model is a separate HMM emitting lower-level HMMs in a recursive manner.
• [Figure: the generative process of an example HHMM [2].]

5. Hierarchical HMM (HHMM)
2. The entire set of parameters
• With a fixed model structure, the model is characterized by the following parameters [1]: for each internal state, a horizontal transition matrix over its children (including a transition to an end state that returns control to the parent), a vertical distribution over which child state is entered first, and an output distribution at each production state.
3. Representing the HHMM as a DBN [2]
• Simply assume all production states are at the bottom, and let the state of the HMM at level l and time t be represented by q_t^l.
• The vector (q_t^1, ..., q_t^L) specifies the complete "path" from the root to the leaf state.
• An indicator variable f_t^l controls completion of the HHMM at level l and time t.
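To make the parameter set concrete, here is a rough sketch of a per-level parameter container for a fixed-depth HHMM with K states per level and D discrete observation symbols; the field names and shapes are assumptions for illustration, not the paper's code.

```python
import numpy as np

def init_hhmm_params(num_levels, num_states, num_symbols, rng=None):
    """Randomly initialise a fixed-depth HHMM parameter set (illustrative only).

    Per level we keep:
      - trans[l]: horizontal transition matrix between sibling states (K x K)
      - init[l]:  vertical (initial child) distribution (K,)
      - end[l]:   probability that each state ends its sub-HMM (K,)
    Emissions are attached to the bottom-level production states.
    """
    rng = rng or np.random.default_rng(0)
    normalize = lambda a, ax: a / a.sum(axis=ax, keepdims=True)
    return {
        "trans": [normalize(rng.random((num_states, num_states)), 1)
                  for _ in range(num_levels)],
        "init":  [normalize(rng.random(num_states), 0) for _ in range(num_levels)],
        "end":   [rng.uniform(0.1, 0.5, size=num_states) for _ in range(num_levels)],
        "emit":  normalize(rng.random((num_states, num_symbols)), 1),
    }

params = init_hhmm_params(num_levels=3, num_states=4, num_symbols=7)
print(params["trans"][0].shape)  # (4, 4)
```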

6. Hierarchical HMM (HHMM)
[Figure: an HHMM represented as a DBN [2].]

7. Infinite Hierarchical HMM (IHHMM)
IHHMM: allows the HHMM hierarchy to have a potentially infinite number of levels.
• Observation: y_t. State: s_t^l at level l and time t.
• A state transition indicator variable z_t^l is also introduced:
• z_t^l = 1 indicates a completion of the HHMM at level l right before time t, i.e. the presence of a state transition from s_{t-1}^l to s_t^l.
• The conditional probability of z_t^l is p(z_t^l = 1 | z_t^{l-1} = 1) = β_l and p(z_t^l = 1 | z_t^{l-1} = 0) = 0.
• There is an opportunity to transition at level l only if there was a transition at level l-1, the level below.
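A minimal sketch of how these transition indicators could be sampled, assuming a fixed truncation depth L, indicators z[t, l] in {0, 1} with the observation level always transitioning, and per-level transition probabilities beta[l]; the names are illustrative, not the paper's code.

```python
import numpy as np

def sample_transition_indicators(T, L, beta, rng=None):
    """Sample z[t, l]: 1 if the level-l chain transitions just before time t.

    Level 0 (the observation level) transitions at every step; level l can
    transition only if level l-1 did, and then only with probability beta[l].
    """
    rng = rng or np.random.default_rng(1)
    z = np.zeros((T, L + 1), dtype=int)
    z[:, 0] = 1                      # always a "transition" at the observation level
    z[0, :] = 1                      # convention: every level starts fresh at t = 0
    for t in range(1, T):
        for l in range(1, L + 1):
            if z[t, l - 1] == 1:     # opportunity to transition only if the level below did
                z[t, l] = rng.random() < beta[l]
            else:
                z[t, l] = 0
    return z

z = sample_transition_indicators(T=20, L=3, beta=[1.0, 0.5, 0.5, 0.5])
print(z.sum(axis=0))  # higher levels transition less often
```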

8. Infinite Hierarchical HMM (IHHMM)
Properties implied by the structure:
• The number of transitions at level l before a transition at level l+1 occurs is geometrically distributed with mean 1/β_{l+1}.
• This implies that the expected number of time steps for which a state at level l persists in its current value is ∏_{k≤l} 1/β_k.
• The states at higher levels persist longer.
• The first non-transitioning level at time t, L_t, has the distribution p(L_t = l) = (1 − β_l) ∏_{k<l} β_k; L_t is geometrically distributed with parameter 1 − β if all β_l = β.
• The IHHMM allows for a potentially infinite number of levels.
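A quick empirical check of the persistence property, reusing sample_transition_indicators from the sketch above and assuming all β_l equal a common value beta_val: a state at level l should then persist for about (1/beta_val)^l time steps on average.

```python
import numpy as np

# Assumes sample_transition_indicators from the previous sketch is in scope.
beta_val, L, T = 0.5, 4, 100_000
z = sample_transition_indicators(T=T, L=L, beta=[1.0] + [beta_val] * L)

for l in range(1, L + 1):
    mean_persistence = T / z[:, l].sum()   # average spacing between level-l transitions
    print(f"level {l}: empirical {mean_persistence:.1f}, "
          f"expected {(1 / beta_val) ** l:.1f}")
```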

9. Infinite Hierarchical HMM (IHHMM)
The generative process for s_t given z_t is similar to the HHMM:
• Levels at and above the first non-transitioning level L_t keep their previous state, s_t^l = s_{t-1}^l.
• For the levels from L_t − 1 down to the bottom, the state s_t^l is generated according to the corresponding transition matrix, conditioned on the level above.
• The emission matrix then generates the observation y_t from the bottom-level state.
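A minimal sketch of this state-generation step under the assumptions above: states are copied where z[t, l] = 0 and redrawn from a level-specific transition matrix where z[t, l] = 1. The names are illustrative, and the dependence of each level on its parent's state is deliberately omitted.

```python
import numpy as np

def sample_states(z, trans, num_states, rng=None):
    """Generate states s[t, l] given transition indicators z[t, l].

    If z[t, l] == 0 the level-l state is copied from time t-1; otherwise it is
    redrawn from the row of the level-l transition matrix indexed by s[t-1, l].
    (In the full IHHMM the new state also depends on the parent level's state;
    that coupling is left out here to keep the sketch short.)
    """
    rng = rng or np.random.default_rng(2)
    T, L = z.shape
    s = np.zeros((T, L), dtype=int)
    s[0] = rng.integers(num_states, size=L)
    for t in range(1, T):
        for l in range(L):
            if z[t, l]:
                s[t, l] = rng.choice(num_states, p=trans[l][s[t - 1, l]])
            else:
                s[t, l] = s[t - 1, l]
    return s

# Usage with the indicators from the earlier sketch and uniform transition rows:
# K = 3; trans = [np.full((K, K), 1.0 / K) for _ in range(z.shape[1])]
# s = sample_states(z, trans, num_states=K)
```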

10. Inference and Learning
• Inference in the IHHMM is performed using Gibbs sampling together with a modified forward-backtrack algorithm.
• It iterates between the following two steps:
1. Sampling the state values of each level, with the parameters and the other levels fixed:
• Compute forward messages from t = 1 to t = T.
• Resample the states s_t^l and indicators z_t^l along the backward pass from t = T to t = 1 (a single-level analogue is sketched below).
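To give a concrete feel for the forward-backtrack step, here is a standard forward-filtering, backward-sampling pass for a single-level discrete HMM. The IHHMM runs an analogous message recursion one level at a time with the other levels held fixed, so treat this only as a simplified single-level analogue, not the paper's algorithm.

```python
import numpy as np

def forward_backward_sample(obs, trans, emit, init, rng=None):
    """Forward filtering / backward sampling for a discrete HMM.

    obs:   length-T array of observation symbols
    trans: K x K transition matrix, emit: K x D emission matrix, init: length-K
    Returns one posterior sample of the state sequence.
    """
    rng = rng or np.random.default_rng(3)
    T, K = len(obs), len(init)
    alpha = np.zeros((T, K))
    alpha[0] = init * emit[:, obs[0]]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):                      # forward messages, t = 1 .. T-1
        alpha[t] = (alpha[t - 1] @ trans) * emit[:, obs[t]]
        alpha[t] /= alpha[t].sum()
    states = np.zeros(T, dtype=int)
    states[-1] = rng.choice(K, p=alpha[-1])
    for t in range(T - 2, -1, -1):             # backward sampling pass
        w = alpha[t] * trans[:, states[t + 1]]
        states[t] = rng.choice(K, p=w / w.sum())
    return states
```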

11. Inference and Learning
• When the top level is reached, a new level above it is created with all of its states set to 1;
• If the level below the current top level has no state transitions, it becomes the new top level.
2. Sampling the parameters given the current states:
• Parameters are initialized as draws from their Dirichlet priors;
• Posteriors are calculated from the counts of state transitions and emissions collected in the previous step (a minimal single-level sketch follows the prediction step below).
Predicting new observations given the current state of the IHHMM:
1. Assume the top level learned by the IHHMM is L; then calculate the prediction recursions from the top level down to the bottom level.

12. Inference and Learning
2. Compute the probability of observing the next data point y_{T+1} from these recursions.
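Slides 11 and 12 together amount to a conjugate Dirichlet update from transition and emission counts followed by a one-step predictive computation. Below is a minimal single-level analogue, assuming a discrete HMM with K states, D symbols, and a symmetric Dirichlet concentration alpha; all names and the toy data are illustrative, not from the paper.

```python
import numpy as np

def resample_hmm_params(states, obs, K, D, alpha=1.0, rng=None):
    """Draw HMM parameters from their Dirichlet posteriors given a sampled
    state path (conjugate update from transition and emission counts)."""
    rng = rng or np.random.default_rng(4)
    t_counts = np.zeros((K, K))
    e_counts = np.zeros((K, D))
    for a, b in zip(states[:-1], states[1:]):
        t_counts[a, b] += 1                    # count observed transitions a -> b
    for s, y in zip(states, obs):
        e_counts[s, y] += 1                    # count observed emissions s -> y
    trans = np.vstack([rng.dirichlet(alpha + t_counts[k]) for k in range(K)])
    emit = np.vstack([rng.dirichlet(alpha + e_counts[k]) for k in range(K)])
    return trans, emit

def predict_next_obs(alpha_T, trans, emit):
    """p(y_{T+1} | y_{1:T}): push the last forward message through one
    transition, then through the emission matrix."""
    return (alpha_T @ trans) @ emit

# Tiny usage example with hypothetical data:
states = np.array([0, 0, 1, 2, 2, 1, 0, 1])
obs = np.array([0, 1, 1, 2, 2, 1, 0, 1])
trans, emit = resample_hmm_params(states, obs, K=3, D=3)
alpha_T = np.ones(3) / 3                       # stand-in for the final forward message
print(predict_next_obs(alpha_T, trans, emit).sum())  # 1.0
```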

13. Experiment Results
1. Data generated from the model:
[Figure: samples of sequential data generated by the model, together with the sampled "states" used to generate the data.]

14. Experiment Results
2. Demonstrating that the model can capture hierarchical structure:
• The first data set consists of repeats of integers increasing from 1 to 7, followed by repeats of integers decreasing from 5 to 1, with the whole pattern repeated twice (reconstructed in the sketch below).
• The second data set is the first one concatenated with another series of repeated increasing and decreasing integer sequences.
• 7 states are used in the model at all levels.
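For reference, the first toy data set can be reconstructed along these lines; the slide does not say how many times each integer repeats, so n_rep below is an assumed value.

```python
import numpy as np

def make_toy_sequence(n_rep=3, n_cycles=2):
    """Build the toy sequence: each integer 1..7 repeated n_rep times (increasing),
    then 5..1 repeated n_rep times (decreasing), with the whole pattern repeated
    n_cycles times. n_rep is an assumed value, not from the paper."""
    up = np.repeat(np.arange(1, 8), n_rep)
    down = np.repeat(np.arange(5, 0, -1), n_rep)
    return np.tile(np.concatenate([up, down]), n_cycles)

print(make_toy_sequence())
```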

15. Experiment Results
The predictive log probability of the next integer is calculated:
• HMM: 0.25
• IHHMM: 0.31
• HHMM: 0.30 (for 2-4 levels)
3. Spectral data from Handel's Hallelujah chorus.

16. Experiment Results
4. Alice in Wonderland letters data set.
[Figures: the difference in log predictive likelihood between the IHHMM and a one-level HMM learned by Gibbs sampling, and between the IHHMM and an HMM learned by EM.]
• The mean differences in both plots are positive, demonstrating that the IHHMM gives superior performance on this data.
• The long tails signify that there are letters which are better predicted using the higher hierarchical levels.

17. Final Discussion
Relation to the HHMM:
• The IHHMM is a nonparametric extension of the HHMM with an unbounded hierarchy depth;
• The completion of an internal HMM is governed by an independent process.
Other related work:
• Probabilistic context-free grammars with multi-scale structure learning;
• The infinite HMM and the infinite factorial HMM.
Future work:
• Make the number of states at each level infinite, as in the infinite HMM;
• Higher-order Markov chains;
• More efficient inference algorithms.

18. Cited References
[1] S. Fine, Y. Singer, and N. Tishby. The hierarchical hidden Markov model: Analysis and applications. Machine Learning, 32:41-62, 1998.
[2] K. Murphy and M. A. Paskin. Linear time inference in hierarchical HMMs. In Neural Information Processing Systems, 2001.
