Hierarchical Beta Process and the Indian Buffet Process

Download Presentation

Hierarchical Beta Process and the Indian Buffet Process

Loading in 2 Seconds...

- 224 Views
- Uploaded on
- Presentation posted in: General

Hierarchical Beta Process and the Indian Buffet Process

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Hierarchical Beta Process and the Indian Buffet Process

by R. Thibaux and M. I. Jordan

Discussion led by

Qi An

- Introduction
- Indian buffet process (IBP)
- Beta process (BP)
- Connections between IBP and BP
- Hierarchical beta process (hBP)
- Application to document classification
- Conclusions

Mixture models

Each data is drawn from one mixture component

Number of mixture components is not set a prior

Distribution over partitions

Factorial models

Each data is associated with a set of latent Bernoulli variables

Cardinality of the set of features can vary

A “featural” description of objects

A natural way to define interesting topologies on cluster

May be appropriate for large number of clusters

VS.

Beta process is a special case of independent increment process, or Levy process,

Levy process can be characterized by Levy measure. For beta process, it is

If we draw a set of points from a Poisson process with base measure v, then

As the representation shows, B is discrete with probability one.

When the base measure B0 is discrete: , then B has atoms at the same locations with

Here, Ω can be viewed as a set of potential features and the random measure B defines the probability that X can possess particular feature.

In Indian buffet process, X is the customer and its features are the dishes the customer taste.

It is proven that the observations from a beta process satisfy

Procedure:

The first customer will try Poi(γ) number of dishes (feature). After that , the new observation can taste previous dish j with probability and then try a number of new features

where is the total mass

As a result, beta process is a two-parameter (c, γ) generalization of the Indian buffet process.

IBP=BP(c=1, γ=α)

The total number of unique dishes can be roughly represented as

This quantity becomes Poi(γ) if c0 (all customers share the same dishes) or Poi(n γ) if c∞ (no sharing).

Authors propose to generate an approximation, , of B

Let For each step n≥1

Consider a document classification problem. We have a training data set X, which is a list of documents. Each document is classified by one of n topics. We model a document by the set of words it contains. We assume document Xi,j is generated by including each word w independently with a probability pjw specific to topic j. These probabilities form a discrete measure Aj over all word space Ω. We can put a beta process BP(cj,B) prior on Aj.

Since we want the sharing across different topics, B has to be discrete. We thus put a beta process prior BP(c0,B0) on B, which allows sharing the same atoms among topics.

The HBP model can be summarized as:

This model can be solved with Monte Carlo inference algorithm.

- Authors applied the hierarchical beta process to a document classification problem
- Compare it to the Naïve Bayes (with Laplace smoothing) results
- The hBP model can obtain 58% result while the best Naïve Bayes result is 50%

- The beta process is shown to be suitable for nonparametric Bayesian factorial modeling
- The beta process can be extended to a recursively-defined hierarchy of beta process
- Compared to the Dirichlet process, the beta process has the potential advantage of being an independent increments process
- More work on inference algorithm is necessary to fully exploit beta process models.