Inference in generative models of images and video
John Winn, MSR Cambridge, May 2004

Overview
  • Generative vs. conditional models
  • Combined approach
  • Inference in the flexible sprite model
  • Extending the model
Generative vs. conditional models

We have an image I and latent variables H which we wish to infer, e.g. object position, orientation, class.

There will also be other sources of variability, e.g. illumination, parameterised by θ.

Generative model: P(H, θ, I)

Conditional model: P(H, θ|I)

or P(H|I)

Conditional models use features

Features are functions of I which aim to be informative about H but invariant to θ.

Edge features

Corner features

Blob features
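
As a minimal illustration (not from the talk), such feature maps can be computed with off-the-shelf detectors; the scikit-image functions below are an assumed stand-in for whichever edge, corner and blob detectors one prefers:

```python
from skimage import feature, filters

def extract_features(image):
    """Simple edge, corner and blob features of a grey-scale image.

    The features depend only on local image structure, so they aim to be
    informative about H while discarding nuisance variation theta.
    """
    edges = filters.sobel(image)                      # edge-strength map
    corners = feature.corner_peaks(                   # corner locations
        feature.corner_harris(image), min_distance=5)
    blobs = feature.blob_log(image, max_sigma=10)     # blob centres and scales
    return edges, corners, blobs
```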

Conditional models

Using features f(I), train a conditional model, e.g. from labelled data.

Example: Viola & Jones face detection using rectangle features and AdaBoost
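
A hedged sketch of this idea, using scikit-learn's AdaBoost in place of the original cascade of rectangle-feature classifiers; the `patch_features.npy` and `patch_labels.npy` files are hypothetical precomputed training data:

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

# Hypothetical labelled data: X holds feature vectors f(I) for image patches,
# y holds labels (1 = face, 0 = non-face).
X = np.load("patch_features.npy")
y = np.load("patch_labels.npy")

# Boosted decision stumps: each weak learner thresholds a single feature,
# loosely analogous to the rectangle-feature weak classifiers of Viola & Jones.
clf = AdaBoostClassifier(n_estimators=200)
clf.fit(X, y)

# The trained classifier plays the role of the conditional model P(H | f(I)):
# here, the probability that a patch contains a face.
p_face = clf.predict_proba(X[:5])[:, 1]
```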

Conditional models

Advantages

  • Simple - only model variables of interest
  • Inference is fast - due to use of features and simple model

Disadvantages

  • Non-robust
  • Difficult to compare different models
  • Difficult to combine different models
Generative models

A generative model defines a process of generating the image pixels I from the latent variables H and θ, giving a joint distribution over all variables:

P(H, θ, I)

Learning and inference are carried out using standard machine learning techniques, e.g. Expectation Maximisation, MCMC, variational methods.

No features!
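
For concreteness, a toy illustration of the EM pattern on a two-component 1-D Gaussian mixture (the mixture model is this sketch's assumption, not a model from the talk): the E-step infers a posterior over the latent assignments, the M-step re-estimates the parameters.

```python
import numpy as np

def em_gmm_1d(x, n_iter=50):
    """EM for a toy two-component 1-D Gaussian mixture."""
    mu = np.array([x.min(), x.max()])            # crude initialisation
    var = np.array([x.var(), x.var()])
    weights = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibilities P(component | x, current parameters)
        lik = weights * np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) \
              / np.sqrt(2 * np.pi * var)
        r = lik / lik.sum(axis=1, keepdims=True)
        # M-step: re-estimate parameters from the weighted statistics
        nk = r.sum(axis=0)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        weights = nk / len(x)
    return mu, var, weights
```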

Generative models

Example: image modeled as layers of ‘flexible’ sprites.

Generative models

Advantages

  • Accurate – as the entire image is modeled
  • Can compare different models
  • Can combine different models
  • Can generate new images

Disadvantages

  • Inference is difficult due to local minima
  • Inference is slower due to complex model
  • Limitations on model complexity
Combined approach

Use a generative model, but speed up inference using proposal distributions given by a conditional model.

A proposal R(X) suggests a new distribution over some subset X of the latent variables H, θ.

Inference is extended to allow accepting or rejecting the proposal e.g. depending on whether it improves the model evidence.
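
A minimal sketch of how such a proposal can be folded into a Metropolis–Hastings step; the `log_joint` and `propose` callables are hypothetical placeholders for the model's log P(H, θ, I) and the conditional-model proposal R(X):

```python
import numpy as np

def mh_step(state, log_joint, propose, rng):
    """One Metropolis-Hastings step driven by a bottom-up proposal.

    state     : current setting of the latent variables (H, theta)
    log_joint : returns log P(H, theta, I) for a given state
    propose   : returns (candidate, log q(candidate|state), log q(state|candidate))
    """
    candidate, log_q_fwd, log_q_rev = propose(state, rng)
    log_alpha = (log_joint(candidate) - log_joint(state)
                 + log_q_rev - log_q_fwd)
    if np.log(rng.uniform()) < log_alpha:
        return candidate    # accept: the candidate improves the fit (or plausibly might)
    return state            # reject: keep the current state
```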

Using proposals in an MCMC framework

Generative model: textured regions combined with face and text models

Conditional model: face and text detector using AdaBoost (Viola & Jones)

Proposals for text and faces

Accepted proposals

From Tu et al., 2003

Using proposals in an MCMC framework

Generative model: textured regions combined with face and text models

Conditional model: face and text detector using AdaBoost (Viola & Jones)

Proposals for text and faces

Reconstructed image

From Tu et al., 2003

Flexible sprite model

Set of images x, e.g. frames from a video

Flexible sprite model

[Figure: sprite shape and appearance (π, f) generating the images x]

Flexible sprite model

[Figure: sprite shape and appearance (π, f), the discretised sprite transform T for this image, the transformed mask instance m for this image, and the image x]

Flexible sprite model

[Figure: full model with background b, sprite appearance f and shape π, transform T, mask m, and image x]

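Read generatively, the model composites a sprite over a background. The sketch below samples one image under one reading of the variables (π: per-pixel mask probabilities, f: sprite appearance, m: binary mask, T: translation, b: background, x: observed image); the translation-only transform and the Gaussian noise model are this sketch's assumptions.

```python
import numpy as np

def sample_image(f, pi, b, T, noise_std=0.05, rng=None):
    """Sample one image x from a single-sprite, translation-only model.

    T is an integer translation (dy, dx) applied to the sprite and its mask.
    """
    rng = rng or np.random.default_rng()
    m = rng.random(pi.shape) < pi            # mask m ~ Bernoulli(pi), per pixel
    m_T = np.roll(m, T, axis=(0, 1))         # transformed mask for this image
    f_T = np.roll(f, T, axis=(0, 1))         # transformed sprite appearance
    mean = np.where(m_T, f_T, b)             # sprite composited over background b
    return mean + noise_std * rng.standard_normal(b.shape)   # observed image x
```
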
Inference method & problems
  • Apply variational inference with a factorised Q distribution
  • Slow – since we have to search the entire discrete transform space (see the brute-force sketch below)
  • Limited size of transform space, e.g. translations only (160×120).
  • Many local minima.
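
To make the cost concrete, here is a hedged, brute-force stand-in for that search: every integer translation of a template is scored against the image and the scores normalised into a distribution over T. For a 160×120 frame that is 19,200 evaluations per image, per iteration.

```python
import numpy as np

def translation_posterior(image, template, noise_var=0.01):
    """Brute-force distribution over all integer translations T (illustrative)."""
    H, W = image.shape
    log_p = np.empty((H, W))
    for dy in range(H):                       # exhaustive search over shifts
        for dx in range(W):
            shifted = np.roll(template, (dy, dx), axis=(0, 1))
            log_p[dy, dx] = -np.sum((image - shifted) ** 2) / (2 * noise_var)
    log_p -= log_p.max()                      # subtract max for numerical stability
    p = np.exp(log_p)
    return p / p.sum()
```
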
Proposals in the flexible sprite model
  • We wish to create a proposal R(T).
  • Cannot use features of the image directly until the object appearance is found.
  • Use features of the inferred mask.

[Figure: model fragment with the proposal feeding into the transform T, alongside π and m]

Moment-based features

Use the first and second moments of the inferred mask as features. Learn a proposal distribution R(T).

[Figure: centre of gravity of the mask, the true location, and a contour of the proposal distribution over object location]

Can also use R to get a probabilistic bound on T.
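
A minimal sketch of the feature computation: the centre of gravity and second central moments of the (soft) inferred mask, turned here into a Gaussian proposal over object location. In the talk the mapping from moments to R(T) is learned; fixing it to a broadened Gaussian is this sketch's simplification.

```python
import numpy as np

def mask_moment_proposal(mask, inflate=2.0):
    """Mean and covariance of a Gaussian proposal R(T) from mask moments."""
    ys, xs = np.indices(mask.shape)
    w = mask / mask.sum()
    mean = np.array([(w * ys).sum(), (w * xs).sum()])     # centre of gravity
    dy, dx = ys - mean[0], xs - mean[1]
    cov = np.array([[(w * dy * dy).sum(), (w * dy * dx).sum()],
                    [(w * dx * dy).sum(), (w * dx * dx).sum()]])
    return mean, inflate * cov    # inflate so the proposal is deliberately broad
```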

results on scissors video
Results on scissors video.

[Video panels: Original, Reconstruction, Foreground only]

  • On average, ~1% of transform space searched.
  • Always converges, independent of initialisation.

Extended transform space

[Video panels: Original, Reconstruction]

Extended transform space

[Video panels: Original, Reconstruction]

Extended transform space

[Panels: Learned sprite appearance, Normalised video]

Corner features

[Panels: Learned sprite appearance, Masked normalised image]

Extensions to the generative model

Very wide range of possible extensions:

  • Local appearance model e.g. patch-based
  • Multiple layered objects
  • Object classes
  • Illumination modelling
  • Incorporation of object-specific models e.g. faces
  • Articulated models
Further investigation of using proposals

Investigate other bottom-up features, including:

  • Optical flow
  • Color/texture
  • Use of standard invariant features e.g. SIFT
  • Discriminative models for particular object classes e.g. faces, text
[Figure: graphical model of the flexible sprite model, with background b, sprite appearance f and shape π, transform T, mask m and image x, replicated over the N images]