
Image Parsing : Unifying Segmentation, Detection, and Recognition


Presentation Transcript


  1. Image Parsing: Unifying Segmentation, Detection, and Recognition Shai Bagon Oren Boiman

  2. Image Understanding • A long standing goal of Computer Vision • Consists of understanding: • Objects and visual patterns • Context • State / Actions of objects • Relations between objects • Physical layout • Etc. A picture is worth a thousand words…

  3. Natural Language Understanding • Very far from being solved • Even NL parsing (syntax) is problematic • Ambiguities require high-level (semantic) knowledge

  4. Image Parsing • Decomposition to constituent visual patterns • Edge Detection • Segmentation • Object Recognition

  5. Image Parsing Framework • A generic framework mapping an image I to a parse S • High-Level Tasks: Object Recognition, Classification • Low-Level Tasks: Segmentation, Edge Detection

  6. Inference: Approach used in “Image Parsing” • Top-down (Generative): Constellation, Star-Model etc. (+ consistent solutions, - slow) • Bottom-up (Discriminative): SVM, Boosting, Neural Nets etc. (+ fast, - possibly inconsistent)

  7. Coming up next… • Define a (Monstrous) Generative model for Image Parsing • How to perform s-l-o-w inference on such models (MCMC) • How to accelerate inference using bottom-up cues (DDMCMC)

  8. Image Parsing Generative Model • The parse S generates the image I • No. of regions K • Region shapes Li and types ζi • Region parameters Θi
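One plausible way to write the prior over parses implied by this slide, assuming the regions are independent given their number (a sketch only; the exact factors are not spelled out here):

```latex
p(S) \;\propto\; p(K)\prod_{i=1}^{K} p(\zeta_i)\, p(L_i \mid \zeta_i)\, p(\Theta_i \mid \zeta_i, L_i)
```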

  9. Generic Regions • Gray level histogram • Constant up to Gaussian noise • Quadratic form

  10. Faces • Use a PCA model (Eigen-faces) • Estimate Cov. Σ and prin. comp.

  11. Text region shapes • Use Spline templates • Allow Affine transformations • Allow small deformations of control points • Shading intensity model

  12. Problem Formulation • Now we can compute the joint probability P(S,I) = P(I|S)P(S) • We’d like to optimize the posterior P(S|I) over the space of parse graphs

  13. Optimizing P(S|I) is not easy… • Hybrid state space: Continuous & Discrete (rules out gradient methods) • Enormous number of local maxima • Graphical model structure is not pre-determined (rules out Belief Propagation)

  14. Optimize by Sampling! • Monte Carlo Principle: use random samples to optimize! • Let’s say we’re given N samples from P(S|I): S1,…,SN • Given Si it is easy to compute P(Si|I) • Choose the best Si!
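A minimal sketch of this sample-score-select idea, assuming we already have a sampler for P(S|I) and a function evaluating the (possibly unnormalized) posterior of a sample; both names are hypothetical:

```python
import numpy as np

def best_sample(draw_sample, unnormalized_posterior, n_samples=1000):
    """Draw candidate parses and keep the one with the highest posterior score.

    draw_sample()             -> one sample S_i from P(S|I)   (assumed given)
    unnormalized_posterior(S) -> a score proportional to P(S|I)
    """
    samples = [draw_sample() for _ in range(n_samples)]
    scores = np.array([unnormalized_posterior(s) for s in samples])
    return samples[int(np.argmax(scores))]
```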

  15. Detour: Sampling methods • How to sample from a (very) complex probability space • A sampling algorithm • Why is Markov Chained in Monte Carlo?

  16. Example • Sample from

  17. Markov Chain • A sequence of random variables X1, X2, … • Markov property: given the present, the future is independent of the past • Transition kernel K(Xt+1 | Xt)

  18. Markov Chain – cont. • Under certain conditions the MC converges to a unique distribution • Stationary distribution – first eigen-vector of K
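A small numeric illustration of the eigenvector statement, using a made-up 3-state transition matrix:

```python
import numpy as np

# Toy column-stochastic transition matrix: K[i, j] = P(next = i | current = j)
K = np.array([[0.90, 0.20, 0.10],
              [0.05, 0.70, 0.30],
              [0.05, 0.10, 0.60]])

# Stationary distribution = eigenvector of K with eigenvalue 1 (its largest)
vals, vecs = np.linalg.eig(K)
stationary = np.real(vecs[:, np.argmax(np.real(vals))])
stationary /= stationary.sum()                 # normalize to a probability vector

# Running the chain long enough gives the same distribution from any start state
print(stationary)
print(np.linalg.matrix_power(K, 100)[:, 0])
```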

  19. Markov Chain Monte Carlo • Reminder: the MC converges to its stationary distribution • Had we wanted a sample from that distribution, we could take the value of Xt for large t • How to make our target p(x) the stationary distribution of the MC? • How to guarantee convergence?

  20. Markov Chain convergence • Irreducibility: the walk can reach any state starting at any state • Non-periodicity: the stationary distribution cannot depend on t

  21. How to make p(x) Stationary • Detailed Balance (with p(x) the stationary distribution): p(x)K(x*|x) = p(x*)K(x|x*), i.e. the forward step and the backward step carry equal probability under the same distribution p(.) • Summing over x: Σx p(x)K(x*|x) = p(x*) Σx K(x|x*) = p(x*), since p(x*) is independent of x and the kernel probabilities sum to 1 • Written as a matrix product: Kp = p • Detailed balance is a sufficient condition to converge to p(x)

  22. Kernel Selection • Detailed Balance puts requirements on the Kernel K(x*|x) • Metropolis-Hastings Kernel: K(x*|x) = q(x*|x)·A(x*|x) for x* ≠ x • Proposal q: where to go next • Acceptance A: should we go • The MH Kernel provides detailed balance • Among the ten most influential algorithms in science and engineering

  23. Metropolis Hastings • Sample x* ~ q(x*|xt) • Compute acceptance probability A = min(1, [p(x*)q(xt|x*)] / [p(xt)q(x*|xt)]) • If rand < A, accept: xt+1 = x* • Else, reject: xt+1 = xt
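A minimal random-walk Metropolis-Hastings sketch following these steps; the 1-D Gaussian-mixture target and the symmetric Gaussian proposal are assumptions chosen for the illustration (with a symmetric proposal the q terms cancel in the acceptance ratio):

```python
import numpy as np

def metropolis_hastings(log_p, x0, n_steps=10000, proposal_std=1.0):
    """Random-walk MH: symmetric proposal, so q cancels in the acceptance ratio."""
    x = x0
    samples = []
    for _ in range(n_steps):
        x_star = x + proposal_std * np.random.randn()    # sample x* ~ q(x*|xt)
        A = min(1.0, np.exp(log_p(x_star) - log_p(x)))   # acceptance probability
        if np.random.rand() < A:
            x = x_star                                   # accept
        samples.append(x)                                # else keep xt
    return np.array(samples)

# Example target: unnormalized mixture of two Gaussians
log_p = lambda x: np.log(np.exp(-0.5 * (x - 2) ** 2) + np.exp(-0.5 * (x + 2) ** 2))
chain = metropolis_hastings(log_p, x0=0.0)
```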

  24. Can we use any q(.) ? 1. Easy to sample from: • we sample from q(.) instead of p(.)

  25. Can we use any q(.) ? 2. Supports p(x): q(x) > 0 wherever p(x) > 0

  26. Can we use any q(.) ? 3. Explores p(x) wisely: • Too narrow q(.): q(x*|x) ~ N(x, 0.1) • Too wide q(.): q(x*|x) ~ N(0, 20)

  27. Can we use any q(.) ? • Easy to sample from: we sample from q(.) instead of p(.) • Supports p(x) • Explores p(x) wisely: • q(.) too narrow -> slow exploration • q(.) too wide -> low acceptance • The best q(.) is p(.) – but we can’t sample p(.) directly.

  28. Combining Kernels • Suppose we have kernels K1,…,Kn, each satisfying detailed balance with the same p(x) • Then the mixture K = Σi αi Ki (with αi ≥ 0, Σi αi = 1) also satisfies detailed balance.

  29. Combining MH Kernels • The same applies to Metropolis Hastings Kernels: • Combining MH Kernels with different proposals – the MC will converge to p(x)
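A tiny illustrative sketch of such a combination, assuming each sub-kernel is already implemented as a function x -> x' that performs one MH step preserving the same target p(x) (the function names and the uniform mixing weights are assumptions):

```python
import random

def combined_kernel_step(x, sub_kernels):
    """One step of the combined chain: pick a sub-kernel at random and apply it.

    If every sub-kernel satisfies detailed balance w.r.t. the same p(x),
    this mixture of kernels does too, so the chain still converges to p(x).
    """
    kernel = random.choice(sub_kernels)   # uniform mixing weights (an assumption)
    return kernel(x)

# Usage sketch: sub_kernels = [mh_step_split, mh_step_merge, mh_step_switch]
# for _ in range(n_steps): x = combined_kernel_step(x, sub_kernels)
```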

  30. Example Revisited • Proposal distribution q(x*|x) • Acceptance A = min(1, [p(x*)q(x|x*)] / [p(x)q(x*|x)]) • Given x it is easy to compute p(x) up to a constant: the normalization factor cancels out in the acceptance ratio

  31. Example – cont.

  32. MAP Estimation • The chain converges to samples from p(x); we want its maximum (MAP) • Simulated Annealing: sample from p(x)^(1/T) while lowering T • Explore less – exploit more! • As T → 0 the density p(x)^(1/T) is peaked at the global maxima

  33. Annealing - example • As T → 0 the density p(x)^(1/T) is peaked at the global maxima
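A minimal sketch of annealed sampling built on the MH step above: the target is raised to the power 1/T and T is lowered over the run (the geometric cooling schedule and its constants are assumptions, not from the slides):

```python
import numpy as np

def annealed_mh(log_p, x0, n_steps=10000, proposal_std=1.0, T0=10.0, cooling=0.999):
    """Metropolis-Hastings on p(x)^(1/T) with a slowly decreasing temperature T."""
    x, T = x0, T0
    best_x, best_logp = x0, log_p(x0)
    for _ in range(n_steps):
        x_star = x + proposal_std * np.random.randn()
        # Annealed acceptance ratio: (p(x*)/p(x))^(1/T)
        A = min(1.0, np.exp((log_p(x_star) - log_p(x)) / T))
        if np.random.rand() < A:
            x = x_star
        if log_p(x) > best_logp:
            best_x, best_logp = x, log_p(x)
        T = max(T * cooling, 1e-3)        # cool down: explore less, exploit more
    return best_x
```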

  34. Model Selection • Dimensionality variation in our space: varying number of regions, varying types of explanations per region • Cannot directly compare density of different states!

  35. Jump across dimensions • Pair-wise common measure

  36. Reversible Jumps • Common measure • Sample extensions u and u* s.t. dim(u)+dim(x) = dim(u*)+dim(x*) • Use the common dimension for comparison, via invertible deterministic functions h and h’ • Explicitly allow reversible jumps between x and x*
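For reference, a standard form of the reversible-jump acceptance probability (following Green's general recipe; move-selection probabilities are omitted, so this is a sketch rather than the exact expression used on the slide):

```latex
% (x^*, u^*) = h(x, u), with dim(x) + dim(u) = dim(x^*) + dim(u^*)
A\bigl((x,u) \to (x^*,u^*)\bigr)
  = \min\!\left(1,\;
    \frac{p(x^*)\, q^*(u^*)}{p(x)\, q(u)}
    \left|\det \frac{\partial (x^*, u^*)}{\partial (x, u)}\right|\right)
```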

  37. MCMC Summary • Sample p(x) using Markov Chain • Proposal q(x*|x) • Supports p(x) • Guides the sampling • Detailed balance • MH Kernel ensures convergence to p(x) • Reversible Jumps • Comparing across models and dimensions

  38. MCMC – Take home message If you want to make a new sample, You should first learn how to propose. Acceptance is random Eventually you’ll get trapped in endless chains until you become stationary. Some say it is better to do reversible jumps between models.

  39. Back to image parsing • A state is a parse tree • Moves between possible parses of the image • Varying number of regions • Different region types: Text, Face and Generic • Varying number of parameters

  40. MCMC Moves • Birth / Death of a Face / Text • Split / Merge of a generic region • Model switching for a region • Region boundary evolution

  41. Moves -> Kernel • MCMC Moves: • Birth / Death of a Face / Text • Split / Merge of a generic region • Model switching for a region • Region boundary evolution

  42. Moves -> Kernel • Dimensionality change: must allow reversible jumps • Text Sub-Kernel: Text Birth / Text Death • Face Sub-Kernel: Face Birth / Face Death • Generic Sub-Kernel: Split Region / Merge Region / Model Switching / Boundary Evolution

  43. Using bottom-up cues • So far we haven’t stated the proposal probabilities q(.) • If q(.) is uninformed of the image, convergence can be painfully slow • Solution: use the image to propose moves (e.g., the face birth kernel)

  44. Data Driven MCMC • Define proposal probabilities q(x*|x;I) • The proposal probabilities will depend on discriminative tests: • Face detection • Text detection • Edge detection • Parameter clustering • Generative model with Discriminative proposals

  45. Face/Text Detection • Bottom-up cues: AdaBoost • AdaBoost gives only a hard classification; estimate the posterior probability instead • Run on sliding windows at several scales
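The slides don't spell out how the AdaBoost score is turned into a posterior; a common choice (and an assumption here) is the logistic link implied by the exponential loss:

```python
import numpy as np

def adaboost_posterior(score):
    """Map a real-valued AdaBoost score H(x) to an approximate posterior P(face | window).

    Uses the logistic link implied by the AdaBoost exponential loss:
    P(y = +1 | x) ≈ 1 / (1 + exp(-2 H(x)))   (an assumed calibration, not from the slides).
    """
    return 1.0 / (1.0 + np.exp(-2.0 * np.asarray(score)))
```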

  46. Edge Map • Canny edge detection at several scales • Only these edges are used for split / merge proposals

  47. Parameter clustering • Estimate likely parameter settings in the image • Cluster using Mean-Shift

  48. How to propose? • q(S*|S,I) should approximate p(S*|I) • Choose one sub-kernel at random (e.g., create face) • Use bottom-up cues to generate proposals: S1,S2,… • Weight each proposal according to p(Si|I) • Sample from the resulting discrete distribution
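A small sketch of this weighted-proposal step, assuming the bottom-up detectors have already produced candidate parses and unnormalized scores approximating p(Si|I) (the names are hypothetical):

```python
import numpy as np

def propose_move(candidates, weights):
    """Pick one candidate move with probability proportional to its weight.

    candidates : list of candidate parses S1, S2, ...   (from bottom-up cues)
    weights    : unnormalized scores, e.g. approximations of p(Si | I)
    """
    w = np.asarray(weights, dtype=float)
    probs = w / w.sum()                        # normalize weights to probabilities
    idx = np.random.choice(len(candidates), p=probs)
    return candidates[idx], probs[idx]         # chosen move and its proposal probability
```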

  49. Generic region – split/merge • Split/merge according to the edge map • Dimensionality change – must be reversible (jump between S and S’)

  50. Generic region – split/merge • Splitting region k into i,j: Sk -> Sij • Proposals are weighted • Normalize weights to probabilities • Sample
