
A decision-theoretic view of image retrieval



  1. A decision-theoretic view of image retrieval Nuno Vasconcelos Compaq Computer Corporation Cambridge Research Lab http://www.media.mit.edu/~nuno

  2. Content-based retrieval [figure: query image of horses, with texture, color, and shape similarity] • allow users to express queries directly in the visual domain • the user provides a query image • the system extracts low-level features (texture, color, shape) • the query signature is compared with those extracted from the database • the top matches are returned
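The query-by-example loop above can be sketched end to end. This is a toy illustration only: the color-histogram signature and L1 ranking are stand-ins, not the system described in the talk.

```python
import numpy as np

def color_histogram(image, bins=8):
    """Toy feature extractor: joint RGB histogram, normalized to sum to 1."""
    hist, _ = np.histogramdd(image.reshape(-1, 3), bins=bins, range=[(0, 256)] * 3)
    return hist.ravel() / hist.sum()

def retrieve(query, database, top_k=3):
    """Rank database images by L1 distance between signatures (smaller = more similar)."""
    q = color_histogram(query)
    dists = [np.abs(q - color_histogram(img)).sum() for img in database]
    return np.argsort(dists)[:top_k]

rng = np.random.default_rng(0)
db = [rng.integers(0, 256, size=(16, 16, 3)) for _ in range(5)]
ranked = retrieve(db[2], db, top_k=3)   # querying with db[2] ranks it first (distance 0)
```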

  3. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + ? 
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + = + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Retrieval architecture • three main components • feature transformation • feature representation • similarity function • previous solutions have concentrated on some components • two main strategies: • texture: features • color: representation • need: criteria to guide the design of all components + + + + + Nuno Vasconcelos

  4. Decision-theoretic formulation • given: feature space X and set Y = {1,…,C} of classes • goal: design a map g: X → Y that minimizes the probability of retrieval error • the Bayes classifier g*(x) = argmax_i p(x|y=i)p(y=i) is optimal • establishes an optimal criterion for image similarity
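A minimal numeric check of the claim that the Bayes classifier is optimal, on a made-up discrete problem (all numbers are illustrative):

```python
import numpy as np

# Toy problem: 1-D discrete feature x in {0,1,2}, two classes, known likelihoods.
likelihoods = np.array([[0.7, 0.2, 0.1],    # p(x | y=0)
                        [0.1, 0.3, 0.6]])   # p(x | y=1)
priors = np.array([0.5, 0.5])

def bayes_classifier(x):
    """g*(x) = argmax_i p(x|y=i) p(y=i): the map minimizing probability of error."""
    return int(np.argmax(likelihoods[:, x] * priors))

def error_rate(rule):
    """Exact probability of error for a deterministic rule on this toy problem."""
    return sum(priors[y] * likelihoods[y, x]
               for y in range(2) for x in range(3) if rule(x) != y)

bayes_err = error_rate(bayes_classifier)
# Exhaustively check every deterministic rule: none beats the Bayes classifier.
all_rules = [lambda x, a=a: a[x]
             for a in [(i, j, k) for i in (0, 1) for j in (0, 1) for k in (0, 1)]]
best_err = min(error_rate(r) for r in all_rules)
```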

  5. A unified view of image similarity • the similarity functions in common use form a family derived from the Bayes criterion under successive assumptions: • Bayes → Bhattacharyya (two-way bound) → χ² (linearization) • Bayes → ML (equal priors) → Kullback-Leibler (large, iid query) → Quadratic (Gaussian) → Mahalanobis (Σ orthogonal, Σq = Σi) → Euclidean (Σi = I)
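The Gaussian reductions in this family can be verified numerically. A sketch using the standard closed forms (not code from the talk), showing that with identity covariances the Mahalanobis, Kullback-Leibler, and Bhattacharyya distances all collapse to scaled squared Euclidean distance:

```python
import numpy as np

def mahalanobis2(mu1, mu2, cov):
    """Squared Mahalanobis distance between two means under a shared covariance."""
    d = mu1 - mu2
    return float(d @ np.linalg.solve(cov, d))

def kl_gauss(mu1, cov1, mu2, cov2):
    """KL divergence between two multivariate Gaussians (closed form)."""
    k = len(mu1)
    inv2 = np.linalg.inv(cov2)
    d = mu2 - mu1
    return float(0.5 * (np.trace(inv2 @ cov1) + d @ inv2 @ d - k
                        + np.log(np.linalg.det(cov2) / np.linalg.det(cov1))))

def bhattacharyya_gauss(mu1, cov1, mu2, cov2):
    """Bhattacharyya distance between two Gaussians (closed form)."""
    cov = 0.5 * (cov1 + cov2)
    d = mu1 - mu2
    return float(0.125 * d @ np.linalg.solve(cov, d)
                 + 0.5 * np.log(np.linalg.det(cov)
                                / np.sqrt(np.linalg.det(cov1) * np.linalg.det(cov2))))

mu1, mu2, I = np.array([0.0, 0.0]), np.array([3.0, 4.0]), np.eye(2)
euclid2 = float((mu1 - mu2) @ (mu1 - mu2))   # squared Euclidean distance = 25
```

With Σ = I, Mahalanobis² equals the squared Euclidean distance, KL equals half of it, and Bhattacharyya equals one eighth of it.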

  6. Feature transformation • the probability of error is lower bounded by the Bayes error • Theorem: for a retrieval system with observation space Z and a feature transformation T: Z → X, the Bayes error on X can never be smaller than that on Z. Equality is achieved if and only if T is invertible. • suggests that emphasis on features is a bad idea
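The theorem can be illustrated with a tiny discrete example (numbers invented): merging observation cells is a non-invertible T, and the Bayes error can only go up.

```python
import numpy as np

# Discrete sketch: observation z in {0,1,2,3}, two classes, equal priors.
p_z = np.array([[0.4, 0.1, 0.4, 0.1],   # p(z | y=0)
                [0.1, 0.4, 0.1, 0.4]])  # p(z | y=1)

def bayes_error(p):
    """Bayes error for equal priors: 0.5 * sum over cells of min_i p(cell | y=i)."""
    return float(0.5 * p.min(axis=0).sum())

err_z = bayes_error(p_z)   # 0.2 on the original observation space

# A non-invertible transform T that merges cells {0,1} and {2,3}:
p_x = np.stack([p_z[:, :2].sum(axis=1), p_z[:, 2:].sum(axis=1)], axis=1)
err_x = bayes_error(p_x)   # 0.5: the merged cells are now indistinguishable
```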

  7. Feature representation • Theorem: for a retrieval system with class probabilities p(y=i), class-conditional likelihoods p(x|y=i), and a decision function g(x), the difference between the actual and the Bayes error is upper bounded by the L1 distance between the true and estimated probabilities
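One standard way to write the bound the slide refers to (a reconstruction in my own notation; the slide's exact formula is elided, and ĝ denotes the plug-in rule built from the estimates p̂):

```latex
P\bigl(\hat g(x) \neq y\bigr) - P\bigl(g^{*}(x) \neq y\bigr)
\;\leq\; \sum_{i=1}^{C} \int_{\mathcal{X}}
\bigl| \, p(x \mid y=i)\,p(y=i) \;-\; \hat p(x \mid y=i)\,\hat p(y=i) \, \bigr| \, dx
```

The right-hand side is the L1 distance between the true and estimated joint probabilities, so driving the density estimates toward the truth drives the retrieval error toward the Bayes error.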

  8. Feature representation • the distance between the actual and the ideal probability of error (the estimation error Δ) is upper bounded by a function of the quality of the density estimates • this means: • good estimation is a sufficient condition for accurate retrieval • from the theoretical viewpoint, there is no reason for features • caveat: estimation is difficult in high dimensions

  9. Color (estimation)-based retrieval • no features, emphasis on representation (histograms) • problem: low-order statistics are not sufficient • spatial neighborhoods ⇒ high dimensionality
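The dimensionality blow-up is easy to quantify. An illustrative count (bin and neighborhood sizes assumed): a joint histogram over a spatial neighborhood of n color pixels needs bins^(3n) cells, so even tiny neighborhoods are hopeless to estimate.

```python
# Cells in a joint histogram over n RGB pixels, with `bins` bins per channel.
bins = 8
cells = {n: bins ** (3 * n) for n in (1, 2, 4)}
# n=1 gives 512 cells (a plain color histogram); n=4 already gives 8**12
# cells, far more than the number of pixels in any realistic database.
```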

  10. Summary • low Bayes error: avoid features • good image discrimination: • requires high-dimensional spaces • estimation is difficult in high dimensions • can lead to large estimation error • fundamental trade-off of image retrieval: • a feature transformation will increase the Bayes error but can also reduce the estimation error • the two components have to be considered simultaneously! Bayes error ≤ error ≤ Bayes error + estimation error Δ

  11. Example: texture recognition • emphasis: discriminant features • simple representation (μ, Σ) and similarity function (Mahalanobis distance) • years of research on “good” features, e.g. MRSAR • problem: discriminant for texture, but not generic • can we get similar performance with a generic transform? • for Bayesian retrieval, the features are not so important

  12. Designing retrieval systems • the retrieval trade-off: • low Bayes error: invertible feature transformation • low estimation error Δ: expressive feature representation and a low-dimensional feature space • directive 1: get the most expressive representation you can afford! • directive 2: the role of the feature transform is dimensionality reduction • images live on a low-dimensional manifold embedded in a high-dimensional space • the feature transformation should eliminate the unnecessary dimensions • while staying as close to invertible as possible

  13. Feature representation • among the expressive models (kernel estimators), we like Gaussian mixtures because they are: • compact (computational efficiency) • able to capture the details of multi-modal densities (like the histogram) • computationally tractable in high dimensions (like the Gaussian)
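To make the representation concrete, here is a minimal EM fit of a Gaussian mixture. It is a 1-D sketch with invented data, not the talk's implementation:

```python
import numpy as np

def em_gmm_1d(x, k=2, iters=50):
    """Minimal 1-D Gaussian-mixture fit by EM (a sketch, not production code)."""
    mu = np.quantile(x, (np.arange(k) + 1) / (k + 1))   # spread initial means
    var = np.full(k, x.var())
    pi = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point
        dens = pi * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances
        n = r.sum(axis=0)
        pi, mu = n / len(x), (r * x[:, None]).sum(axis=0) / n
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / n
    return pi, mu, var

rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-5, 1, 500), rng.normal(5, 1, 500)])
pi, mu, var = em_gmm_1d(x)   # recovers the two modes near -5 and +5
```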

  14. Feature transformation [figure: transform T and its inverse T⁻¹] • dimensionality reduction has been thoroughly studied in the compression literature • “close to invertible” = minimum reconstruction error

  15. Optimal transformation • optimal solution (in the squared-error sense): principal component analysis • for T(x) = Φx, the optimum is Φ*k = [v1,…,vk], where vi is the i-th eigenvector of the covariance Σx, ordered by decreasing eigenvalue λ1 ≥ … ≥ λn • problems: • squared error is not Bayes error • PCA does not mimic early human vision well
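PCA's optimality in the squared-error sense is checkable numerically: after keeping the top k eigenvectors, the mean squared reconstruction error equals the sum of the discarded eigenvalues. A sketch on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5)) @ rng.normal(size=(5, 5))   # correlated data
X = X - X.mean(axis=0)

# PCA: eigenvectors of the sample covariance, sorted by decreasing eigenvalue
cov = X.T @ X / len(X)
evals, evecs = np.linalg.eigh(cov)          # eigh returns ascending order
order = np.argsort(evals)[::-1]
evals, evecs = evals[order], evecs[:, order]

k = 2
Phi = evecs[:, :k]                          # Phi*k = [v1, ..., vk]
X_hat = (X @ Phi) @ Phi.T                   # project, then reconstruct
mse = ((X - X_hat) ** 2).sum(axis=1).mean()
# mse equals the sum of the discarded eigenvalues evals[k:]
```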

  16. Alternative transformations • defining a sparse representation as one where the coefficients are close to zero most of the time (high kurtosis) • Olshausen and Field have shown that if we add a sparseness constraint to PCA, the resulting basis functions are remarkably similar to the receptive fields of the cells found in V1
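Kurtosis as a sparseness measure can be illustrated directly with synthetic samples; a Laplacian stands in for sparse coefficients (mostly near zero, occasional large values), while a Gaussian has zero excess kurtosis.

```python
import numpy as np

def excess_kurtosis(x):
    """Fourth standardized moment minus 3 (zero for a Gaussian)."""
    z = (x - x.mean()) / x.std()
    return float((z ** 4).mean() - 3.0)

rng = np.random.default_rng(0)
gauss = rng.normal(size=200_000)     # excess kurtosis near 0
sparse = rng.laplace(size=200_000)   # heavy-tailed: excess kurtosis near 3
```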

  17. Basis functions

  18. In practice • early stages of vision: dimensionality reduction, but subject to “efficiency” constraints • sparse representations are computationally intensive • can be reasonably approximated by wavelets • we have obtained good results even with the DCT • in summary, this indicates it is possible to have feature transformations that: • achieve a good balance between invertibility and dimensionality reduction • capture the most important aspects of early human vision • have reduced complexity • work is needed to find the best transformation
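A DCT feature transform in the spirit described here can be built from the orthonormal DCT-II matrix, which also makes the invertibility explicit. This is my own sketch, not the talk's code:

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis: C[k, i] = a_k * cos(pi * (2i + 1) * k / (2n))."""
    k, i = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    C = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    C[0] /= np.sqrt(2.0)   # DC row gets the smaller normalization a_0
    return C

C = dct_matrix(8)
rng = np.random.default_rng(0)
block = rng.normal(size=(8, 8))   # one 8x8 image block (stand-in values)
coeffs = C @ block @ C.T          # 2-D DCT of the block
recon = C.T @ coeffs @ C          # C is orthogonal, so the transform is exactly invertible
```

Dimensionality reduction then amounts to keeping only the low-frequency coefficients of `coeffs`, which discards the dimensions that matter least for reconstruction.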

  19. Invariance properties • Lemma: the restriction of a Gaussian mixture to a linear subspace is still a Gaussian mixture • a Gaussian mixture on a multi-resolution feature space gives: • a family of embedded densities over multiple image scales • each dimension adds higher-resolution information • DC coefficient only = histogram
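The lemma is easy to check numerically: restricting to a coordinate subspace just drops the corresponding rows/columns of each component's mean and covariance, and the result matches the true marginal density. All parameters below are invented for illustration.

```python
import numpy as np

# A 2-D Gaussian mixture.
weights = np.array([0.3, 0.7])
means = np.array([[-2.0, 1.0], [3.0, -1.0]])
covs = np.array([[[1.0, 0.5], [0.5, 2.0]],
                 [[1.5, -0.3], [-0.3, 1.0]]])

def gauss(x, mu, cov):
    """Multivariate Gaussian density."""
    d = x - mu
    k = len(mu)
    return float(np.exp(-0.5 * d @ np.linalg.solve(cov, d))
                 / np.sqrt((2 * np.pi) ** k * np.linalg.det(cov)))

def mixture(x, weights, means, covs):
    return sum(w * gauss(x, m, c) for w, m, c in zip(weights, means, covs))

def restricted(x1):
    """Embedded mixture on x1: same weights, sub-means, sub-covariances."""
    return mixture(np.array([x1]), weights, means[:, :1], covs[:, :1, :1])

# Compare against numerical marginalization of the full 2-D mixture over x2.
x1 = 0.7
grid = np.linspace(-15.0, 15.0, 4001)
vals = np.array([mixture(np.array([x1, x2]), weights, means, covs) for x2 in grid])
marginal = float(np.sum((vals[1:] + vals[:-1]) * 0.5 * np.diff(grid)))
```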

  20. Embedded multi-resolution mixture • explicit control over the trade-off between “invariant” and “invertible” (low Bayes error) [figure: spectrum from invariant to invertible]


  24. Impact on retrieval accuracy • overall, the EMM representation: • extends the histogram: accounts for spatial dependencies • extends the Gaussian: has the expressive power to capture density details • combines the good properties of the color- and texture-based approaches • precision: % of retrieved images that are relevant to the query • recall: % of relevant images that are retrieved
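The two metrics as used here, sketched on an invented ranking:

```python
def precision_recall_at_k(ranked_ids, relevant_ids, k):
    """precision: fraction of the top-k retrieved that are relevant;
    recall: fraction of all relevant images that appear in the top k."""
    retrieved = set(ranked_ids[:k])
    hits = len(retrieved & set(relevant_ids))
    return hits / k, hits / len(relevant_ids)

ranked = [4, 9, 1, 7, 2, 8]   # database ids, best match first (hypothetical)
relevant = {1, 2, 4}          # images from the query's class (hypothetical)
p, r = precision_recall_at_k(ranked, relevant, k=3)   # top 3 = {4, 9, 1}: 2 hits
```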

  25. Retrieval results • comparison: • Corel DB: 1500 images, 15 classes • methods: • MRSAR + Mahalanobis distance (texture) • histogram intersection (color) • color correlograms (both) • DCT + Gaussian mixtures + ML (the proposed approach) • Bayesian retrieval with embedded mixtures is clearly superior: up to 10% better than the next best method (correlogram)

  26. Conclusions • a probabilistic architecture for image similarity • decision-theoretic formulation • unifying view of similarity • optimal guidelines for feature transformation and representation • DCT + Gaussian mixtures • works well across various types of databases

  27. Object recognition [figure: retrieval results] • Bayesian + embedded multi-resolution mixture • color histograms + histogram intersection (Swain & Ballard)

  28. Texture recognition [figure: retrieval results] • Bayesian + embedded multi-resolution mixture • MRSAR model + Mahalanobis distance (Mao & Jain)
