
Model Selection Problems in Image Analysis


Presentation Transcript


  1. Model Selection Problems in Image Analysis. Mário A. T. Figueiredo, Department of Electrical and Computer Engineering, Instituto Superior Técnico, Technical University of Lisbon, PORTUGAL. www.lx.it.pt/~mtf

  2. What is model selection? Some experimental data... to which we want to fit a polynomial. Question: which order? Too low an order gives "underfitting"; too high an order gives "overfitting". How to identify the underlying trend of the data, ignoring the noise?

  3. Outline: 1. What is model selection? 2. Introduction and some motivating examples. 3. Bayesian model selection. 4. Minimum description length (MDL). 5. Implicit model selection via sparseness priors. 6. Concluding remarks. Examples are shown along the way...

  4. Motivating example. Observed data: $y$. Goal: fit some function (e.g., a polynomial) with unknown parameters $\theta \in \Theta_k$, the parameter space of order $k$. For fixed $k$, any estimation criterion has the form $\hat\theta_k = \arg\min_{\theta \in \Theta_k} C(y, \theta)$, where $C$ is an objective function (e.g., mean squared error, negative log-likelihood, ...). In general, $k$ may not be the number of parameters, just some "model order".

  5. Motivating example. Can we use the minimized criterion $C(y, \hat\theta_k)$ for model selection? No, if the parameter spaces are nested: $\Theta_k \subset \Theta_{k+1}$ implies $C(y, \hat\theta_{k+1}) \le C(y, \hat\theta_k)$, so the minimized criterion never gets worse as $k$ grows, as the sketch below shows. Example: for every quadratic polynomial, there is an equivalent third-order polynomial; simply set the extra coefficient to zero. We may also have increasingly "complex" models, not necessarily nested.
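To make this point concrete, here is a minimal Python sketch (the data, noise level, and variable names are my choices, not from the talk): with nested polynomial models, the minimized training error can only decrease as the order k grows, so it cannot be used to select k.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 50)
# noisy samples of a cubic "truth"
y = 1 - 2 * x + 0.5 * x**3 + rng.normal(scale=0.2, size=x.size)

for k in range(8):
    coeffs = np.polyfit(x, y, deg=k)                 # ML fit for fixed order k
    mse = np.mean((np.polyval(coeffs, x) - y) ** 2)
    print(f"k={k}: training MSE = {mse:.4f}")        # monotonically non-increasing in k
```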

  6. Examples in signal/image/pattern analysis: contour estimation. From the observed image y, infer a spline representation of a contour, with k control points. Model selection question: how many control points? (How simple/smooth?)

  7. Examples in signal/image/pattern analysis: image segmentation. From the observed image y, infer a segmentation into k regions. Model selection question: how many regions?

  8. Probabilistic formulation. Observed data: $y$. Unknown quantities/parameters: $\theta \in \Theta_k$ and the model order $k$. Observation model (likelihood function): $p(y \mid \theta, k)$. Maximum likelihood (ML) estimation for fixed $k$, $\hat\theta_k = \arg\max_\theta p(y \mid \theta, k)$, is very well studied. Model selection means choosing $k$; by the nesting argument above, the maximized likelihood is useless for model selection.

  9. Ockham’s razor. When talking about model selection, “Ockham’s razor” comes to mind. “Ockham’s razor” principle (XIVth century): “a plurality should not be posited except where necessary”; that is, we should prefer “simpler” models. Model selection approaches: (a) Bayesian, where the preference for simpler models is a spin-off of a built-in Ockham’s razor; (b) information-theoretic (MDL, MML), where parsimony is built into the inference criteria (“short code-length”); (c) implicit, via sparsity penalties, where a preference for models with many zero parameters is imposed. Other methods (cross-validation, statistical tests, ...) are not considered here.

  10. Bayesian approach. Unknowns: $\theta$ and $k$. Likelihood function: $p(y \mid \theta, k)$. A priori knowledge (prior): $p(\theta \mid k)\, p(k)$. Bayes law gives the posterior: $p(\theta, k \mid y) \propto p(y \mid \theta, k)\, p(\theta \mid k)\, p(k)$. If the goal is only to select $k$ (we don’t care about $\theta$), the relevant posterior is $p(k \mid y) \propto p(k) \int p(y \mid \theta, k)\, p(\theta \mid k)\, d\theta$.

  11. Bayesian model selection. Maximum a posteriori (MAP) model selection: $\hat k = \arg\max_k p(k)\, p(y \mid k)$, where $p(y \mid k) = \int p(y \mid \theta, k)\, p(\theta \mid k)\, d\theta$ is the marginal likelihood, the key to the built-in parsimony of the Bayesian approach [MacKay, 1991]. If the model is too complex, $p(y \mid k)$ is large in a small region of data space but very small almost everywhere else; if the model is too simple, $p(y \mid k)$ is never very large. The marginal likelihood promotes a balance between data fit and complexity.
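Under conjugate Gaussian assumptions the marginal likelihood is available in closed form, which makes this Occam effect easy to see numerically. Here is a sketch for the polynomial example; the zero-mean Gaussian prior with variance tau2 on the coefficients and the known noise variance sigma2 are my assumptions, not the talk's.

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 50)
y = 1 - 2 * x + 0.5 * x**3 + rng.normal(scale=0.2, size=x.size)

def log_marginal(k, sigma2=0.04, tau2=1.0):
    """log p(y | k) for y = B theta + n, theta ~ N(0, tau2 I), n ~ N(0, sigma2 I)."""
    B = np.vander(x, k + 1)                           # degree-k polynomial design matrix
    cov = sigma2 * np.eye(x.size) + tau2 * (B @ B.T)  # marginal covariance of y
    return multivariate_normal.logpdf(y, mean=np.zeros(x.size), cov=cov)

for k in range(8):
    print(f"k={k}: log p(y|k) = {log_marginal(k):.2f}")  # should peak near the true order 3
```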

  12. Bayesian model selection: Laplace-type approximations. Under a smooth prior, a regular likelihood, and large $n$, a Laplace-type approximation gives $\log p(y \mid k) \approx \log p(y \mid \hat\theta_k, k) - \frac{k}{2}\log n$, with $\hat\theta_k$ the maximum likelihood estimate. This is the Bayesian information criterion (BIC) [Schwarz, 1978]: order-penalized maximum likelihood, where the $\frac{k}{2}\log n$ term penalizes larger $k$.

  13. BIC: Example. Back to the polynomial toy example. [Figure: the BIC-selected estimate versus the truth, with selected order 3.]
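A sketch of BIC on the same toy data (my implementation; the noise variance is profiled out, and its parameter count, being the same constant for every k, is dropped):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 50)
y = 1 - 2 * x + 0.5 * x**3 + rng.normal(scale=0.2, size=x.size)
n = x.size

scores = []
for k in range(8):
    coeffs = np.polyfit(x, y, deg=k)
    rss = np.sum((np.polyval(coeffs, x) - y) ** 2)
    loglik = -0.5 * n * np.log(rss / n)          # Gaussian log-likelihood, up to constants
    bic = -loglik + 0.5 * (k + 1) * np.log(n)    # (number of coefficients / 2) * log n
    scores.append((bic, k))
    print(f"k={k}: BIC = {bic:.2f}")
print("selected order:", min(scores)[1])
```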

  14. Penalized likelihood criteria. General form: $\hat k = \arg\min_k \{ -\log p(y \mid \hat\theta_k, k) + c(y, k) \}$, where $c(y, k)$ is a complexity penalty. Many instances: BIC, NIC, AIC, SIC, MDL, MML, NML, ... If $c(y, k) = c(k)$ only depends on the model dimension, we get structural risk minimization type criteria, i.e., choose the best model from each class ($k$), then select among these models. Examples: BIC, SRM, classical MDL, etc.

  15. The minimum description length (MDL) criterion. Introduction: the observed data goes through an encoder, producing compressed data, which the decoder must be able to recover. Rationale: a good model yields a short code, a bad model a long code; code length measures model adequacy. Several flavors: two-part codes [Rissanen 1978, 1987], normalized maximum likelihood [Rissanen 1996], MML [Wallace and Freeman, 1987].

  16. Two-part code MDL. The encoder runs an estimator on the observed data $y$ to extract $(\hat\theta_k, k)$, then codes $y$ given these estimates; the decoder inverts the process. Given $(\hat\theta_k, k)$, the shortest codelength for $y$ is $-\log p(y \mid \hat\theta_k, k)$ [Shannon, 1948]. However, both $k$ and $\hat\theta_k$ are unknown to the decoder, so they must be coded too. MDL criterion: $\hat k = \arg\min_k \{ -\log p(y \mid \hat\theta_k, k) + L(\hat\theta_k, k) \}$, a penalized maximum likelihood criterion.

  17. MDL criterion: coding the parameters. Real-valued parameters: how to code $\hat\theta_k$ with a finite number of bits? Truncate it to finite precision. Under regularity conditions, it can be shown that the optimal precision is of order $1/\sqrt{n}$, in which case $L(\hat\theta_k, k) \approx \frac{k}{2}\log n$ and MDL = BIC. This is the “standard” MDL; there are more recent/refined versions (more later).
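A toy illustration of this precision trade-off (my construction: a single Gaussian mean with unit noise variance, coded over a unit parameter range): coding theta more finely costs more nats but fits the data better, and the total is minimized near precision 1/sqrt(n).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
y = rng.normal(0.7, 1.0, size=n)        # data with one unknown mean parameter
theta_hat = y.mean()

for delta in [1.0, 0.3, 0.1, 0.03, 0.01, 0.003]:
    theta_q = np.round(theta_hat / delta) * delta    # truncate to precision delta
    nll = 0.5 * np.sum((y - theta_q) ** 2)           # -log p(y | theta_q), up to constants
    param_cost = np.log(1.0 / delta)                 # nats to send theta_q over a unit range
    print(f"precision {delta:6.3f}: total length = {nll + param_cost:.2f}")
# the minimum sits near delta ~ 1/sqrt(n), where the parameter cost is ~ (1/2) log n
```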

  18. Image analysis example: contour estimation. The observed image y results from an observation mechanism applied to a contour description plus other parameters: a statistical model for the inside of the contour, another one for the outside. Examples: Gaussians of different means and/or variances; Rayleighs of different variances (ultrasound images); different textures. [Figueiredo, Leitão, 1992], [Figueiredo, Leitão, Jain, 1997, 2000].

  19. Spline contour representation. The contour is $B\theta$, where $\theta$ holds the $k$ control points and $B$ is a matrix with periodic B-spline basis functions. Fewer control points means a simpler (smoother) shape. Model selection: $k$ = ? Approach: MDL/BIC.
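A sketch of this representation: the matrix B stacks periodic cubic B-spline basis functions, so k control points generate a closed smooth curve. The function names and sample data are mine.

```python
import numpy as np

def bspline3(u):
    """Uniform cubic B-spline basis on support [0, 4)."""
    u = np.asarray(u, float)
    b = np.zeros_like(u)
    for i, seg in enumerate([
        lambda t: t**3 / 6,
        lambda t: (-3*t**3 + 3*t**2 + 3*t + 1) / 6,
        lambda t: (3*t**3 - 6*t**2 + 4) / 6,
        lambda t: (1 - t)**3 / 6,
    ]):
        m = (u >= i) & (u < i + 1)
        b[m] = seg(u[m] - i)
    return b

def spline_matrix(k, n=200):
    """n x k matrix B of periodic B-spline basis functions (needs k >= 4)."""
    t = np.linspace(0.0, k, n, endpoint=False)
    return np.stack([bspline3((t - j) % k) for j in range(k)], axis=1)

# k control points -> closed contour; fewer points = smoother shape
control = np.random.default_rng(2).uniform(-1, 1, size=(8, 2))
contour = spline_matrix(8) @ control          # (n, 2) samples of the closed curve
```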

  20. Some results (same variance, different means): the description length is compared for k = 5, 7, 9, 11, 13. [Figueiredo, Leitão, Jain, 1997, 2000].

  21. The importance of the region-based model (same mean, different variances/textures; the figure shows the initialization and the final estimate). This contour could never be estimated with an edge-based (snake-type) approach.

  22. Example on a real (brain MR) image

  23. More examples on real medical images

  24. Poisson field segmentation. Data: a Poissonian image, or a sequence of Poisson counts (e.g., X-ray astronomical data). Model: k regions/segments of constant mean. Model selection question: k = ?

  25. MDL for Poisson segmentation. Elementary problem: segmenting a sequence into two parts. Competing models: model 0 (no segmentation) and model i (segment boundary at location i), for every i. [Figure: description length of model i versus i, compared with the no-segmentation baseline.] Fully Bayes / MDL-optimal criterion: no user-defined thresholds, no approximations. Poisson model: parameter estimates are rational numbers, so no quantization is needed.

  26. Non-incomplete MDL for Poisson segmentation. Example: [Figure: observed counts, true intensity function, and estimate.] [Figueiredo and Nowak, 1999, 2000]. What about multiple change points? Simply take each segment and re-apply the criterion (recursively), as in the sketch below.
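A minimal sketch of that recursion on a 1-D count sequence. For simplicity it uses a BIC-style penalty (half a log n for the extra rate, plus log n for the change-point location) rather than the exact criterion of [Figueiredo and Nowak, 1999, 2000]; all names are mine.

```python
import numpy as np

def seg_ll(c):
    """Poisson log-likelihood of counts c at the ML rate (the mean), dropping c! terms."""
    lam = c.mean()
    return c.sum() * np.log(lam) - c.size * lam if lam > 0 else 0.0

def segment(counts, lo, hi, penalty, cuts):
    """Test every change point in counts[lo:hi]; accept the best one only if it
    improves the penalized log-likelihood, then recurse on the two halves."""
    c = counts[lo:hi]
    if c.size < 2:
        return
    base = seg_ll(c)
    gains = [seg_ll(c[:i]) + seg_ll(c[i:]) - base for i in range(1, c.size)]
    i = int(np.argmax(gains)) + 1
    if gains[i - 1] > penalty:
        cuts.append(lo + i)
        segment(counts, lo, lo + i, penalty, cuts)
        segment(counts, lo + i, hi, penalty, cuts)

rng = np.random.default_rng(3)
counts = np.concatenate([rng.poisson(5, 200), rng.poisson(15, 150), rng.poisson(8, 250)])
cuts = []
segment(counts, 0, counts.size, penalty=1.5 * np.log(counts.size), cuts=cuts)
print(sorted(cuts))   # should find change points near 200 and 350
```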

  27. Non-incomplete MDL for Poisson image segmentation. In 2D, we look for the best (if any) segmentation of a rectangle into two/four rectangles; the competing models are no segmentation, all possible 2-segmentations, and all possible 4-segmentations (split positions kx, ky). Same multinomial-based criterion. To fully segment an image: apply the criterion recursively.
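The same idea in 2-D, sketched with horizontal/vertical 2-way splits only (the slide's 4-way split is approximated here by two successive 2-way splits, and the multinomial criterion is again replaced by a generic penalty; everything below is my construction):

```python
import numpy as np

def rect_ll(c):
    """Poisson log-likelihood of a block of counts at its ML rate (the mean)."""
    lam = c.mean()
    return c.sum() * np.log(lam) - c.size * lam if lam > 0 else 0.0

def segment2d(img, penalty, out, y0=0, x0=0):
    """Keep the best horizontal/vertical split of the current rectangle if it
    beats 'no segmentation' by more than `penalty`; recurse on the halves."""
    h, w = img.shape
    base = rect_ll(img)
    best_gain, best_split = 0.0, None
    for ky in range(1, h):                      # horizontal 2-splits
        g = rect_ll(img[:ky]) + rect_ll(img[ky:]) - base - penalty
        if g > best_gain:
            best_gain, best_split = g, ("h", ky)
    for kx in range(1, w):                      # vertical 2-splits
        g = rect_ll(img[:, :kx]) + rect_ll(img[:, kx:]) - base - penalty
        if g > best_gain:
            best_gain, best_split = g, ("v", kx)
    if best_split is None:
        out.append((y0, x0, h, w))              # finished rectangle
        return
    kind, k = best_split
    if kind == "h":
        segment2d(img[:k], penalty, out, y0, x0)
        segment2d(img[k:], penalty, out, y0 + k, x0)
    else:
        segment2d(img[:, :k], penalty, out, y0, x0)
        segment2d(img[:, k:], penalty, out, y0, x0 + k)

rng = np.random.default_rng(5)
img = rng.poisson(3, (64, 64))
img[20:, 32:] = rng.poisson(12, (44, 32))       # one brighter rectangle
rects = []
segment2d(img, 2.0 * np.log(img.size), rects)
print(rects)                                    # leaf rectangles of the partition
```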

  28. Segmenting Poisson images: synthetic example (250x250 pixels/bins). [Figure: true intensities, observed counts, intensity estimates, and segmentation.] [Figueiredo and Nowak, 1999, 2000]

  29. Segmenting Poisson images: real example. Multi-look SAR, approximated by a Poisson model via moment matching, segmented by adaptive recursive partitioning (ARP). A non-incomplete MDL criterion can also be developed for Gaussian data, but it needs an asymptotic approximation [Ndili, Figueiredo, Nowak, 2001].

  30. Implicit model selection via sparseness. Take the “full model” and encourage redundant parameters to go to zero; we drop explicit mention of $k$. Typical choice: $\hat\theta = \arg\min_\theta \{ -\log p(y \mid \theta) + \lambda \|\theta\|_1 \}$, which will favor sparse estimates (many zeros); “model selection” in that sense. This can be seen as a MAP estimate, with a Laplacian prior $p(\theta) \propto \exp(-\lambda \|\theta\|_1)$. LASSO regression [Tibshirani, 1996], basis pursuit [Chen, Donoho, Saunders, 1995], image denoising and restoration [Figueiredo, Nowak, 2001, 2003], sparse linear regression and probit regression [Figueiredo, 2001, 2003], logistic regression [Krishnapuram, Figueiredo, Carin, Hartemink, 2004].

  31. Sparseness-inducing nature of Laplacian-type priors. Compare the behavior at the origin. Gaussian: the “reward” increase slows down as the estimate approaches zero, so coefficients are shrunk but not zeroed. Laplacian: the “reward” keeps increasing at a constant rate all the way to zero, pushing small estimates exactly to zero.
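The contrast is visible in the MAP estimators themselves. For a single observation with unit-variance Gaussian noise, the Laplacian prior gives soft thresholding (exact zeros), while the Gaussian prior only shrinks linearly. A sketch; the function and parameter names are mine.

```python
import numpy as np

def map_laplacian(y, lam):
    """MAP under a Laplacian prior (unit-variance Gaussian noise): soft threshold."""
    return np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)

def map_gaussian(y, tau2):
    """MAP under a Gaussian prior with variance tau2: linear shrinkage, never exactly zero."""
    return y * tau2 / (tau2 + 1.0)

y = np.array([-2.0, -0.4, 0.1, 0.7, 3.0])
print(map_laplacian(y, 0.5))   # small entries mapped exactly to zero
print(map_gaussian(y, 1.0))    # all entries merely shrunk
```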

  32. Linear observation models: $y = H\theta + n$, with $n$ Gaussian noise. Many problems have this form; I’ll focus on: image denoising; image restoration (deblurring); image super-resolution.

  33. Linear observation models and sparseness prior. The $\ell_1$ problem: $\min_\theta \|y - H\theta\|^2 + \lambda \|\theta\|_1$ (least squares cost function plus $\ell_1$ penalty). A more natural “model selection” criterion is the $\ell_0$ problem: $\min_\theta \|y - H\theta\|^2 + \lambda \|\theta\|_0$, where $\|\theta\|_0$ is the number of non-zero elements in $\theta$. Very recent results show that the $\ell_1$ and $\ell_0$ problems are closely related (under conditions) [Tropp 2003], [Donoho, Elad, Temlyakov, 2003], [Donoho, 2004].

  34. Example: wavelet-based image denoising. Observation model: $y = W\theta + n$, a noisy image, where $H = W$ is a matrix with a (maybe redundant) wavelet basis and $W\theta$ is the original image. Natural images can be represented by a very sparse $\theta$ (many zeros). Closed-form (fast) solution with the Laplacian or the Jeffreys prior.
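A runnable sketch with PyWavelets. This uses a generic soft threshold (the universal rule) rather than the Jeffreys-prior rule from the slides, but the structure, transform then shrink coefficients then invert, is the same; the wavelet, level, and names are my choices.

```python
import numpy as np
import pywt

def denoise(noisy, wavelet="db4", level=4, sigma=None):
    """Soft-threshold the wavelet detail coefficients of a noisy image."""
    coeffs = pywt.wavedec2(noisy, wavelet, level=level)
    if sigma is None:
        # robust noise estimate from the finest diagonal subband
        sigma = np.median(np.abs(coeffs[-1][-1])) / 0.6745
    t = sigma * np.sqrt(2 * np.log(noisy.size))        # universal threshold
    new = [coeffs[0]] + [
        tuple(pywt.threshold(d, t, mode="soft") for d in detail)
        for detail in coeffs[1:]
    ]
    return pywt.waverec2(new, wavelet)

rng = np.random.default_rng(6)
clean = np.outer(np.hanning(256), np.hanning(256))
out = denoise(clean + 0.1 * rng.normal(size=clean.shape))
```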

  35. Example: wavelet-based image denoising. [Figure: noisy image vs. denoised image.] State-of-the-art results in 2001, using the Jeffreys prior [Figueiredo and Nowak, 2001].

  36. Example: wavelet-based image deblurring. Observation model: $y = BW\theta + n$, a noisy blurred image, where $B$ is the blur matrix and $W\theta$ the original image. In this case, with the sparseness prior, the estimate can’t be computed in closed form, due to the product $BW$. Solution: an EM algorithm, seeing $x$ as missing data [Figueiredo and Nowak, 2002, 2003].
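The resulting EM alternates a gradient (Landweber-type) update with a shrinkage step. Below is a generic sketch of that structure, written as l1 iterative shrinkage/thresholding with an explicit matrix A = BW; this is a stand-in under my assumptions, not a transcription of the Figueiredo-Nowak algorithm.

```python
import numpy as np

def ista(A, y, lam, n_iter=200):
    """Iterative shrinkage/thresholding for min ||y - A theta||^2 + lam ||theta||_1:
    gradient step on the quadratic term, then componentwise soft threshold."""
    L = np.linalg.norm(A, 2) ** 2            # Lipschitz constant of the gradient
    theta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ theta - y)
        z = theta - grad / L
        theta = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return theta

rng = np.random.default_rng(4)
A = rng.normal(size=(64, 256)) / 8.0         # toy "BW" matrix
theta_true = np.zeros(256)
theta_true[[10, 100, 200]] = [1.0, -2.0, 1.5]
y = A @ theta_true + 0.01 * rng.normal(size=64)
print(np.flatnonzero(np.abs(ista(A, y, lam=0.05)) > 1e-3))  # should recover {10, 100, 200}
```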

  37. Example: image deblurring. [Figure: blurred vs. restored; uniform 9x9 blur, BSNR = 40 dB.]

  38. Example: image deblurring. [Figure: blurred vs. restored; separable blur with kernel [1,4,6,4,1].]

  39. Deblurring aerial images. [Figure: restored image.] [Jalobeanu, Nowak, Zerubia, Figueiredo, 2002]

  40. Concluding remarks. Model selection permeates image analysis and processing. It has a natural formulation in a Bayesian framework; complexity penalties also arise in the MDL framework; order can also be penalized implicitly via sparseness priors. Examples: contour estimation, signal/image segmentation, wavelet-based image denoising, wavelet-based image deblurring.
