
Infinite Hierarchical Hidden Markov Models

AISTATS 2009

Katherine A. Heller, Yee Whye Teh and Dilan Görür

Lu Ren

[email protected], Duke University

Nov 23, 2009


Outline

  • Hierarchical structure learning for sequential data

  • Hierarchical hidden Markov model (HHMM)

  • Infinite hierarchical hidden Markov model (IHHMM)

  • Inference and learning

  • Experiment results and demonstrations

  • Related work and extensions


Multi-scale Structure

[Figure: the sequential data generated, and the sampled "states" used to generate them]

  • The goal is to infer correlations among observations over long time spans in the sequence.

  • Potential applications: multi-resolution structure learning in language, video structure discovery, activity detection, etc.


Hierarchical HMM (HHMM)

1. Hierarchical Hidden Markov Models (HHMM)

Multiscale models of sequences in which each level of the model is a separate HMM that emits lower-level HMMs in a recursive manner.

[Figure: the generative process of an example HHMM [2]]
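To make the recursive emission idea concrete, here is a minimal generative sketch of a toy two-level HHMM. All of the names (n_top, n_sub, p_end, ...) and the simplification that a sub-HMM completes with a fixed probability at every step are illustrative assumptions, not the parameterization of [1].

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-level HHMM: a top-level HMM over "regimes"; each regime owns a
# sub-HMM that emits observations until it completes and returns control
# to its parent (completion is modeled here with a fixed probability,
# an illustrative simplification).
n_top, n_sub, n_obs = 3, 4, 5
A_top = rng.dirichlet(np.ones(n_top), size=n_top)            # top-level transitions
A_sub = rng.dirichlet(np.ones(n_sub), size=(n_top, n_sub))   # per-regime sub-transitions
pi_sub = rng.dirichlet(np.ones(n_sub), size=n_top)           # vertical (initial) probabilities
B = rng.dirichlet(np.ones(n_obs), size=(n_top, n_sub))       # emissions at production states
p_end = 0.3                                                  # sub-HMM completion probability

def sample_hhmm(T):
    """Generate T observations from the toy two-level HHMM."""
    obs, regimes = [], []
    top = rng.choice(n_top)                      # start in a top-level regime
    sub = rng.choice(n_sub, p=pi_sub[top])       # vertical transition into its sub-HMM
    while len(obs) < T:
        obs.append(rng.choice(n_obs, p=B[top, sub]))
        regimes.append(top)
        if rng.random() < p_end:                   # sub-HMM completes: parent transitions,
            top = rng.choice(n_top, p=A_top[top])  # then re-enters a child vertically
            sub = rng.choice(n_sub, p=pi_sub[top])
        else:                                      # horizontal step inside the sub-HMM
            sub = rng.choice(n_sub, p=A_sub[top, sub])
    return np.array(obs), np.array(regimes)

y, top_states = sample_hhmm(50)
```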


Hierarchical HMM (HHMM)

2. The entire set of parameters

With a fixed model structure, the model is characterized by the following parameters [1]:

  • horizontal transition probabilities at each internal state $q^d$: $A^{q^d} = \{a^{q^d}_{ij}\}$, with $\sum_j a^{q^d}_{ij} = 1$;

  • vertical transition probabilities from an internal state into its substates: $\Pi^{q^d} = \{\pi^{q^d}(q^{d+1})\}$, with $\sum_{q^{d+1}} \pi^{q^d}(q^{d+1}) = 1$;

  • output (emission) probabilities at the production states: $B^{q^D} = \{b^{q^D}(k)\}$, with $\sum_k b^{q^D}(k) = 1$.

  • 3. Representing the HHMM as a DBN [2]

  • Assume all production states are at the bottom, and let $Q_t^d$ denote the state of the HMM at level $d$ and time $t$.

  • The vector $(Q_t^1, \dots, Q_t^D)$ specifies the complete "path" from the root to the leaf state.

  • An indicator variable $F_t^d$ controls completion of the HHMM at level $d$ and time $t$.
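A minimal way to picture the flattened DBN variables of [2] for a depth-$D$, length-$T$ sequence is as two $D \times T$ arrays, one for the states and one for the completion flags; the array names below are illustrative only.

```python
import numpy as np

D, T = 3, 10                       # hierarchy depth and sequence length
Q = np.zeros((D, T), dtype=int)    # Q[d, t]: state of the level-d HMM at time t
F = np.zeros((D, T), dtype=int)    # F[d, t] = 1 iff the level-d HMM completes at time t

# The column Q[:, t] is the complete "path" from the root (d = 0) down to the
# production state at the bottom (d = D - 1) for time step t.
path_at_t5 = Q[:, 5]
```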


Hierarchical HMM (HHMM)

[Figure: an HHMM represented as a DBN [2]]


Infinite Hierarchical HMM (IHHMM)

IHHMM: allows the HHMM hierarchy to have a potentially infinite number of levels.

  • Notation: observation $y_t$ at time $t$; state $s_t^\ell$ at level $\ell$ and time $t$.

  • A state-transition indicator variable $z_t^\ell$ is also introduced:

  • $z_t^\ell = 1$ indicates a completion of the HHMM at level $\ell$ right before time $t$,

  • i.e. the presence of a state transition from $s_{t-1}^\ell$ to $s_t^\ell$.

  • The conditional probability of $z_t^\ell$ is $p(z_t^\ell = 1 \mid z_t^{\ell-1}) = \gamma_\ell \, z_t^{\ell-1}$, with the convention $z_t^0 \equiv 1$:

  • there is an opportunity to transition at level $\ell$ only if there was a transition at level $\ell - 1$ (a small sampling sketch follows this list).
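The sketch below shows how the transition indicators cascade upward under the conditional reconstructed above; the per-level parameters `gamma[l]` and the always-transitioning "level 0" are part of that reconstruction rather than taken verbatim from the slides.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_transition_indicators(gamma, T):
    """Sample z[l, t] for levels l = 0..L-1 and time steps t = 1..T-1.

    A transition at level l is possible only if level l-1 transitioned at the
    same time step; the (conceptual) level below the hierarchy is taken to
    transition at every step, so the lowest modeled level transitions with
    probability gamma[0].
    """
    L = len(gamma)
    z = np.zeros((L, T), dtype=int)
    for t in range(1, T):
        below = 1                      # the level below the hierarchy always transitions
        for l in range(L):
            if below == 1 and rng.random() < gamma[l]:
                z[l, t] = 1
            below = z[l, t]            # the next level up sees this level's indicator
    return z

z = sample_transition_indicators(gamma=[0.5, 0.5, 0.5], T=20)
print(z.sum(axis=1))                   # transitions become rarer at higher levels
```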


Infinite Hierarchical HMM (IHHMM)

  • Properties implied by this structure:

  • The number of transitions at level $\ell$ before a transition at level $\ell + 1$ occurs is geometrically distributed, with mean $1/\gamma_{\ell+1}$.

  • This implies that the expected number of time steps for which a state at level $\ell$ persists in its current value is $\prod_{m \le \ell} \gamma_m^{-1}$ (a short derivation follows this list).

  • The states at higher levels persist longer.

  • The first non-transitioning level at time $t$, $\ell_t = \min\{\ell : z_t^\ell = 0\}$, has the distribution $p(\ell_t = \ell) = (1 - \gamma_\ell) \prod_{m < \ell} \gamma_m$;

  • $\ell_t$ is geometrically distributed with parameter $1 - \gamma$ if all $\gamma_\ell = \gamma$.

  • The IHHMM allows for a potentially infinite number of levels.
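Under the conditional reconstructed above, the persistence claim follows in one line; the derivation is ours and inherits the assumptions of that reconstruction.

```latex
P(\text{level-}\ell\text{ state changes at time } t)
  = P(z_t^1 = \dots = z_t^\ell = 1)
  = \prod_{m=1}^{\ell} \gamma_m ,
\qquad
\mathbb{E}[\text{persistence of a level-}\ell\text{ state}]
  = \Big(\prod_{m=1}^{\ell} \gamma_m\Big)^{-1}.
```

With all $\gamma_m = \gamma$, the mean persistence at level $\ell$ is $\gamma^{-\ell}$, so states at higher levels persist exponentially longer.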


Infinite Hierarchical HMM (IHHMM)

The generative process for the states $s_t^\ell$ given the indicators $z_t^\ell$ is similar to that of the HHMM:

For the levels $\ell_t - 1$ down to $1$ (those with $z_t^\ell = 1$), the state is generated according to that level's transition probabilities, top-down given the state at the level above; for the levels $\ell \ge \ell_t$ the states remain unchanged, $s_t^\ell = s_{t-1}^\ell$.

The observation $y_t$ is generated from the emission matrix given the bottom-level state (an end-to-end sketch follows).
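A rough end-to-end generative sketch combining the indicator cascade with state copying and bottom-level emission. The transition distribution here is indexed only by the previous state at the same level (a simplifying assumption for readability); the model, like the HHMM, also conditions on the state at the level above.

```python
import numpy as np

rng = np.random.default_rng(2)

def sample_states_and_obs(z, trans, emit):
    """Given indicators z (L x T), sample states s (L x T) and observations y (T,).

    Where z[l, t] = 0 the state is copied from time t - 1; where z[l, t] = 1 a
    new state is drawn from that level's transition matrix (indexed here by the
    previous state at the same level; a simplification, see the note above).
    The observation is emitted from the bottom-level state (level index 0).
    """
    L, T = z.shape
    n_states = trans[0].shape[0]
    n_obs = emit.shape[1]
    s = np.zeros((L, T), dtype=int)
    y = np.zeros(T, dtype=int)
    s[:, 0] = rng.choice(n_states, size=L)
    y[0] = rng.choice(n_obs, p=emit[s[0, 0]])
    for t in range(1, T):
        for l in range(L - 1, -1, -1):              # top level down to the bottom
            if z[l, t] == 1:
                s[l, t] = rng.choice(n_states, p=trans[l][s[l, t - 1]])
            else:
                s[l, t] = s[l, t - 1]
        y[t] = rng.choice(n_obs, p=emit[s[0, t]])
    return s, y

L, T, n_states, n_obs = 3, 20, 4, 5
z = np.zeros((L, T), dtype=int)
z[0, 1:] = 1                                        # bottom level transitions every step
z[1, 1::3] = 1                                      # higher levels transition less often
z[2, 1::9] = 1
trans = [rng.dirichlet(np.ones(n_states), size=n_states) for _ in range(L)]
emit = rng.dirichlet(np.ones(n_obs), size=n_states)
s, y = sample_states_and_obs(z, trans, emit)
```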


Inference and Learning

  • Inference in the IHHMM is performed using Gibbs sampling together with a modified forward-backtrack algorithm.

  • It iterates between the following two steps:

  • 1. Sampling state values for each level, with the parameters fixed:

  • Compute forward messages from $t = 1$ to $t = T$ (for levels above the bottom, the emission probability is replaced by the probability of generating the states at the level below);

  • Resample $s_t^\ell$ and $z_t^\ell$ along the backward pass from $t = T$ to $t = 1$ (a generic forward-filtering / backward-sampling sketch follows).
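A hedged sketch of the basic forward-filtering / backward-sampling step for a single chain. This is only the generic building block; the paper's modified forward-backtrack pass also resamples the indicators $z_t^\ell$ and couples adjacent levels, which is not shown here.

```python
import numpy as np

rng = np.random.default_rng(3)

def ffbs(pi0, A, lik):
    """Forward filtering, backward sampling for one chain.

    pi0: (K,) initial state distribution
    A:   (K, K) transition matrix, A[i, j] = p(s_t = j | s_{t-1} = i)
    lik: (T, K) likelihood of everything hanging below each state at each time
         (observation likelihoods at the bottom level; the probability of the
         lower level's states when a higher level is being resampled)
    Returns one sampled state sequence of length T.
    """
    T, K = lik.shape
    alpha = np.zeros((T, K))
    alpha[0] = pi0 * lik[0]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):                      # forward pass: filtered marginals
        alpha[t] = (alpha[t - 1] @ A) * lik[t]
        alpha[t] /= alpha[t].sum()
    s = np.zeros(T, dtype=int)                 # backward pass: joint sample
    s[T - 1] = rng.choice(K, p=alpha[T - 1])
    for t in range(T - 2, -1, -1):
        w = alpha[t] * A[:, s[t + 1]]
        s[t] = rng.choice(K, p=w / w.sum())
    return s

# Toy usage: 3 states, 30 time steps, likelihoods drawn at random.
K, T = 3, 30
A = rng.dirichlet(np.ones(K), size=K)
lik = rng.random((T, K))
states = ffbs(np.full(K, 1.0 / K), A, lik)
```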


Inference and Learning

  • When the top level is reached, a new level is created above it by setting all of its states to 1;

  • If the level below the current top level has no state transitions, it becomes the new top level.

2. Sampling parameters given the current state:

  • Parameters are initialized as draws from the Dirichlet priors;

  • Posteriors are computed from the counts of state transitions and emissions collected in the previous step (see the conjugate-update sketch below).
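Because the priors are Dirichlet and the state-transition and emission likelihoods are multinomial, the parameter updates are conjugate. A minimal sketch with illustrative names (the prior hyperparameter value below is a placeholder; the slides do not specify it):

```python
import numpy as np

rng = np.random.default_rng(4)

def resample_transition_matrix(transition_counts, prior=1.0):
    """Draw each row of a transition matrix from its Dirichlet posterior.

    transition_counts[i, j] = number of i -> j transitions in the currently
    sampled state sequences; `prior` is a symmetric Dirichlet hyperparameter.
    """
    return np.vstack([rng.dirichlet(prior + row) for row in transition_counts])

counts = np.array([[5.0, 1.0, 0.0], [2.0, 7.0, 1.0], [0.0, 3.0, 4.0]])
A_new = resample_transition_matrix(counts)     # each row sums to one
```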

Predicting new observations given the current state of the IHHMM:

1. Assume the top level learned by the IHHMM is $L$; calculate the prediction recursions from level $L$ down to level $1$.


Inference and Learning

2. Compute the probability of observing the next symbol $y_{T+1}$ from these recursions.


Experiment Results

1. Data generated:

[Figure: the sequential data generated, and the sampled "states" used to generate them]


Experiment Results

2. Demonstrate the model can capture the hierarchical structure

  • The first data set consists of repeats of integers increasing from 1 to 7, followed by repetitions of integers decreasing from 5 to 1, with the whole pattern repeated twice.

  • The second data set is the first one concatenated with another series of repeated increasing and decreasing integer sequences (an illustrative construction is sketched below).

  • Seven states are used in the model at all levels.
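One illustrative way to build such a toy data set; the exact run lengths and repetition counts used in the paper are not stated on the slides, so the numbers below are placeholders.

```python
def repeated_run(values, n_repeats):
    """Concatenate n_repeats copies of the integer run `values`."""
    return list(values) * n_repeats

# First data set: runs of 1..7, then runs of 5..1, the whole pattern twice.
block = repeated_run(range(1, 8), 3) + repeated_run(range(5, 0, -1), 3)
dataset1 = block * 2

# Second data set: the first one concatenated with further repeated
# increasing and decreasing integer sequences.
dataset2 = dataset1 + repeated_run(range(1, 6), 3) + repeated_run(range(7, 0, -1), 3)
```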



Experiment Results

The predictive log probability of the next integer is calculated:

HMM: 0.25, HHMM: 0.30 (for 2-4 levels), IHHMM: 0.31

3. Spectral data from Handel’s Hallelujah chorus


Experiment Results

4. Alice in Wonderland letters data set.

[Figure: the difference in log predictive likelihood between the IHHMM and a one-level HMM learned by Gibbs sampling]

[Figure: the difference in log predictive likelihood between the IHHMM and an HMM learned by EM]

  • The mean differences in both plots are positive, demonstrating that the IHHMM gives superior performance on this data.

  • The long tails signify that there are letters which are better predicted using the higher hierarchical levels.


Final discussions

Relation to the HHMM:

The IHHMM is a nonparametric extension of the HHMM with an unbounded hierarchy depth;

The completion of a sub-HMM at each level is governed by an independent process, rather than by the sub-HMM entering an end state as in the HHMM.

Other related work:

Probabilistic context free grammars with multi-scale structure learning;

Infinite HMM, infinite factorial HMM;

Future work:

Make the number of states at each level infinite as well, as in the infinite HMM;

Higher order Markov chains;

More efficient inference algorithms.


Cited References

[1] S. Fine, Y. Singer, and N. Tishby. The hierarchical hidden Markov model: Analysis and applications. Machine Learning, 32:41-62, 1998.

[2] K. Murphy and M. A. Paskin. Linear time inference in hierarchical HMMs. In Advances in Neural Information Processing Systems, 2001.

