Image Parsing: Unifying Segmentation and Detection. Z. Tu, X. Chen, A.L. Yuille and S-C. Zhu. ICCV 2003 (Marr Prize) & IJCV 2005. Sanketh Shetty. Outline: Why Image Parsing? Introduction to Concepts in DDMCMC. DDMCMC applied to Image Parsing.


Image Parsing: Unifying Segmentation and Detection

Z. Tu, X. Chen, A.L. Yuille and S-C. Zhu

ICCV 2003 (Marr Prize) & IJCV 2005

Sanketh Shetty


Outline

  • Why Image Parsing?

  • Introduction to Concepts in DDMCMC

  • DDMCMC applied to Image Parsing

  • Combining Discriminative and Generative Models for Parsing

  • Results

  • Comments


Image Parsing

Optimize p(W|I)

Image I

Parse Structure W
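The objective on this slide can be spelled out with Bayes' rule. This is a sketch using only the slide's own symbols (I for the image, W for the parse structure); the normalizing constant p(I) does not depend on W, so it drops out of the optimization:

```latex
W^{*} = \arg\max_{W} \, p(W \mid I)
      = \arg\max_{W} \, \frac{p(I \mid W)\, p(W)}{p(I)}
      = \arg\max_{W} \, p(I \mid W)\, p(W)
```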


Properties of Parse Structure

  • Dynamic and reconfigurable

    • Variable number of nodes and node types

  • Defined by a Markov Chain

    • Data Driven Markov Chain Monte Carlo (earlier work in segmentation, grouping and recognition)


Key Concepts

  • Joint model for Segmentation & Recognition

    • Combine different modules to obtain cues

  • Fully generative explanation for Image generation

    • Uses Generative and Discriminative Models + DDMCMC framework

    • Concurrent Top-Down & Bottom-Up Parsing


Pattern Classes

62 characters

Faces

Regions


MCMC: A Quick Tour

  • Key Concepts:

    • Markov Chains

    • Markov Chain Monte Carlo

      • Metropolis-Hastings [Metropolis 1953, Hastings 1970]

      • Reversible Jump [Green 1995]

    • Data Driven Markov Chain Monte Carlo


Markov Chains

Notes: Slides by Zhu, Dellaert and Tu at ICCV 2005
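The slide content for this topic did not survive extraction, so here is a minimal sketch of the basic idea: a Markov chain with a fixed transition matrix forgets its starting state and settles into a stationary distribution. The 3-state matrix `P` is a made-up example, not taken from the slides.

```python
def step(dist, P):
    """Advance a distribution one step: new[j] = sum_i dist[i] * P[i][j]."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

# Hypothetical 3-state transition matrix (assumption); each row sums to 1.
P = [
    [0.5, 0.4, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.3, 0.6],
]

# Start with all probability mass on state 0 and iterate the chain.
dist = [1.0, 0.0, 0.0]
for _ in range(200):
    dist = step(dist, P)

# After many steps the distribution no longer changes under P.
stationary = dist
```

Starting from a different initial distribution gives the same `stationary` vector, which is the property MCMC methods rely on.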


Markov Chain Monte Carlo

Notes: Slides by Zhu, Dellaert and Tu at ICCV 2005


Metropolis-Hastings Algorithm

Notes: Slides by Zhu, Dellaert and Tu at ICCV 2005


Metropolis-Hastings Algorithm

Invariant Distribution

Proposal Distribution

Notes: Slides by Zhu, Dellaert and Tu at ICCV 2005
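To make the roles of the invariant distribution and the proposal distribution concrete, here is a minimal Metropolis-Hastings sketch on a toy discrete state space. The target weights, the ring-shaped random-walk proposal, and all function names are illustrative assumptions, not material from the slides.

```python
import random

def metropolis_hastings(target, propose, proposal_prob, x0, n_samples, rng):
    """Generic Metropolis-Hastings sampler.

    target(x): unnormalized invariant (target) probability of x.
    propose(x, rng): draws a candidate x' from the proposal q(x' | x).
    proposal_prob(x_to, x_from): evaluates q(x_to | x_from).
    """
    x = x0
    samples = []
    for _ in range(n_samples):
        x_new = propose(x, rng)
        # Acceptance ratio: target ratio times reverse/forward proposal ratio.
        ratio = (target(x_new) * proposal_prob(x, x_new)) / (
            target(x) * proposal_prob(x_new, x))
        if rng.random() < min(1.0, ratio):
            x = x_new
        samples.append(x)
    return samples

# Hypothetical unnormalized target over states 0..4 (assumption).
WEIGHTS = [1.0, 2.0, 4.0, 2.0, 1.0]

def target(x):
    return WEIGHTS[x]

def propose(x, rng):
    # Symmetric random walk on a ring: move one state left or right.
    return (x + rng.choice([-1, 1])) % len(WEIGHTS)

def proposal_prob(x_to, x_from):
    n = len(WEIGHTS)
    return 0.5 if (x_to - x_from) % n in (1, n - 1) else 0.0

rng = random.Random(0)
samples = metropolis_hastings(target, propose, proposal_prob, 0, 100_000, rng)
freqs = [samples.count(s) / len(samples) for s in range(len(WEIGHTS))]
```

The empirical frequencies in `freqs` approach the normalized weights even though the sampler only ever evaluates ratios of the unnormalized target.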


Reversible Jump MCMC

  • Many competing models to explain data

    • Need to explore this complicated state space

Notes: Slides by Zhu, Dellaert and Tu at ICCV 2005


DDMCMC Motivation

Unifies

Notes: Slides by Zhu, Dellaert and Tu at ICCV 2005


DDMCMC Motivation

Generative Model

p(I|W)p(W)

State Space


DDMCMC Motivation

Generative Model

p(I|W)p(W)

State Space

Discriminative Model

q(w_j | I)

Dramatically reduces the search space by focusing sampling on highly probable states.
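The effect of a data-driven proposal can be illustrated with a toy independence Metropolis-Hastings sampler. This is only a sketch under stated assumptions: the 200-state space, the peaked "posterior" `p`, and the rough proposal `q_data` (a stand-in for a discriminative cue q(w_j | I)) are all invented for illustration and are not the proposals used in the paper.

```python
import random

def independence_mh(p, q, n_steps, rng):
    """Independence Metropolis-Hastings: candidates come from q, whatever
    the current state is. Returns (samples, acceptance_rate)."""
    states = list(range(len(p)))
    x = rng.choices(states, weights=q)[0]
    accepted = 0
    samples = []
    for _ in range(n_steps):
        x_new = rng.choices(states, weights=q)[0]
        ratio = (p[x_new] * q[x]) / (p[x] * q[x_new])
        if rng.random() < min(1.0, ratio):
            x = x_new
            accepted += 1
        samples.append(x)
    return samples, accepted / n_steps

N = 200
# Hypothetical unnormalized posterior: almost all mass on states 50..54.
p = [0.001] * N
for i in range(50, 55):
    p[i] = 1.0

# Blind proposal: every state is equally likely.
q_uniform = [1.0 / N] * N

# "Data-driven" proposal: an imprecise detector that fires on states 48..56,
# mixed with 10% uniform mass so every state stays reachable.
q_data = [0.1 / N + (0.9 / 9 if 48 <= i <= 56 else 0.0) for i in range(N)]

rng = random.Random(0)
_, acc_uniform = independence_mh(p, q_uniform, 5000, rng)
samples_data, acc_data = independence_mh(p, q_data, 5000, rng)
in_peak = sum(1 for s in samples_data if 50 <= s <= 54) / len(samples_data)
```

Both chains target the same distribution `p`; the informed proposal simply wastes far fewer candidates on improbable states, which shows up as a much higher acceptance rate.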


DDMCMC Framework

  • Moves:

    • Node Creation

    • Node Deletion

    • Change Node Attributes
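Node creation and deletion moves change the number of nodes, so the proposal probabilities in the two directions differ and must be corrected for. The following is a deliberately simplified sketch: nodes are drawn from a fixed set of candidates with made-up weights, the state is just the set of nodes present, and the attribute-change move is omitted. Because nothing here is continuous, no Jacobian term is needed, unlike general reversible jump MCMC.

```python
import random

def birth_death_sampler(weights, n_steps, rng):
    """Sample subsets S of candidate nodes with p(S) proportional to the
    product of weights[k] for k in S, using creation and deletion moves.
    Returns the fraction of steps in which each node was present."""
    n = len(weights)
    present = set()
    counts = [0] * n
    for _ in range(n_steps):
        absent = [k for k in range(n) if k not in present]
        if rng.random() < 0.5:
            # Node creation: pick an absent node uniformly at random.
            if absent:
                k = rng.choice(absent)
                # Target ratio times reverse (deletion) over forward proposal.
                ratio = weights[k] * len(absent) / (len(present) + 1)
                if rng.random() < min(1.0, ratio):
                    present.add(k)
        else:
            # Node deletion: pick a present node uniformly at random.
            if present:
                k = rng.choice(sorted(present))
                ratio = (1.0 / weights[k]) * len(present) / (len(absent) + 1)
                if rng.random() < min(1.0, ratio):
                    present.remove(k)
        for k in present:
            counts[k] += 1
    return [c / n_steps for c in counts]

# Hypothetical per-node weights (assumption). With this product-form target,
# the exact probability that node k is present is w / (1 + w).
weights = [0.25, 1.0, 4.0, 9.0]
rng = random.Random(0)
inclusion = birth_death_sampler(weights, 100_000, rng)
exact = [w / (1.0 + w) for w in weights]
```

Comparing `inclusion` with `exact` gives a quick check that the creation and deletion moves balance each other correctly.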


Transition Kernel

Satisfies the detailed balance equation

Full Transition Kernel
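The equations on this slide were lost in extraction. A sketch of what the two statements typically look like in a Metropolis-Hastings setting is given below; the symbols \rho (move-selection probability), K_a (sub-kernel for move type a) and Q_a (proposal for move type a) are notation assumed here, not recovered from the slide.

```latex
% Full kernel: a mixture of sub-kernels, one per move type a.
K(W' \mid W : I) = \sum_{a} \rho(a : I)\, K_a(W' \mid W : I),
\qquad \sum_{a} \rho(a : I) = 1

% Detailed balance of each sub-kernel with respect to the posterior.
p(W \mid I)\, K_a(W' \mid W : I) = p(W' \mid I)\, K_a(W \mid W' : I)

% Metropolis-Hastings form of a sub-kernel, for W' \neq W.
K_a(W' \mid W : I) = Q_a(W' \mid W : I)\,
\min\left\{ 1,\;
\frac{p(W' \mid I)\, Q_a(W \mid W' : I)}{p(W \mid I)\, Q_a(W' \mid W : I)}
\right\}
```

If every sub-kernel satisfies detailed balance with respect to p(W | I), so does their mixture, which makes p(W | I) an invariant distribution of the full kernel.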


Convergence to p(W|I)

Converges monotonically at a geometric rate
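The convergence behaviour can be observed numerically on a small chain. This sketch uses a made-up 3-state transition matrix and measures total variation distance to the stationary distribution; the choice of total variation is an assumption made for simplicity and may differ from the divergence measure used in the paper.

```python
def step(dist, P):
    """Advance a distribution one step: new[j] = sum_i dist[i] * P[i][j]."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

def total_variation(a, b):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(x - y) for x, y in zip(a, b))

# Hypothetical 3-state transition matrix (assumption); each row sums to 1.
P = [
    [0.5, 0.4, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.3, 0.6],
]

# Approximate the stationary distribution by iterating for a long time.
pi = [1.0 / 3.0] * 3
for _ in range(1000):
    pi = step(pi, P)

# Track the distance to pi over the first steps from a point-mass start.
dist = [1.0, 0.0, 0.0]
distances = []
for _ in range(15):
    distances.append(total_variation(dist, pi))
    dist = step(dist, P)

# Ratios of successive distances; values below 1 indicate shrinking error.
ratios = [distances[t + 1] / distances[t] for t in range(len(distances) - 1)]
```

For this toy chain the distance shrinks at every step and the ratios settle near a constant, which is what a geometric rate looks like in practice.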



Image Generation Model

Regions:

Constant Intensity

Textures

Shading

State of parse graph


62 characters

Faces

3 Regions


Uniform

Designed to penalize high model complexity


Shape Prior

Faces

3 Regions





Discriminative Cues Used

  • AdaBoost-Trained

    • Face Detector

    • Text Detector

  • Adaptive Binarization Cues

  • Edge Cues

    • Canny at 3 scales

  • Shape Affinity Cues

  • Region Affinity Cues



Possible Transitions

  • Birth/Death of a Face Node

  • Birth/Death of Text Node

  • Boundary Evolution

  • Split/Merge Region

  • Change node attributes







Comments

  • Well motivated but very complicated approach to THE HOLY GRAIL problem in vision

    • Good global convergence results for inference with very minor dependence on initial W.

    • Extensible to larger set of primitives and pattern types.

  • Many details of the algorithm are missing, and it is hard to understand the motivation for the chosen values of some parameters

  • Unclear if the p(W|I)’s for configurations with different class compositions are comparable.

  • Derek’s comment on AdaBoost false positives and the authors’ failure to report their exact improvement

  • No quantitative results/comparison to other algorithms and approaches

    • It should be possible to design a simple experiment to measure performance on recognition/detection/localization tasks.


