
Part 1: Classical Image Classification Methods

Kai Yu, Dept. of Media Analytics, NEC Laboratories America. Andrew Ng, Computer Science Dept., Stanford University.


Presentation Transcript


  1. Part 1: Classical Image Classification Methods • Kai Yu, Dept. of Media Analytics, NEC Laboratories America • Andrew Ng, Computer Science Dept., Stanford University

  2. Outline of Part 2 • Local Features, Sampling, Visual Words • Discriminative Methods • Bag-of-Words (BoW) representation • Spatial pyramid matching (SPM) • Generative Methods • Part-based methods • Topic models

  3. Outline of Part 2 • Local Features, Sampling, Visual Words • Discriminative Methods • Bag-of-Words (BoW) representation • Spatial pyramid matching (SPM) • Generative Methods • Part-based methods • Topic models

  4. Local features • Distinctive descriptors of local image patches • Invariant to local translation, scale, … • and sometimes rotation or general affine transformations • The most famous choice is the SIFT feature

  5. Sampling local features from images • Each image is represented as a set of points (local feature descriptors) • Image credits: F-F. Li, E. Nowak, J. Sivic

  6. Visual words • Similar points are grouped into one visual word • Algorithms: k-means, agglomerative clustering, … • Points from different images are then more easily compared. Slide credit: Kristen Grauman

  7. Outline of Part 2 • Local Features, Sampling, Visual Words, … • Discriminative Methods • Bag-of-Words (BoW) representation • Spatial pyramid matching (SPM) • Generative Methods • Part-based methods • Topic models

  8. Bag-of-words (BoW) representation • Analogy to documents • Adapted from tutorial slides by Fei-Fei et al.

  9. BoW for object categorization • Works pretty well for whole-image classification: Csurka et al. (2004), Willamowski et al. (2005), Grauman & Darrell (2005), Sivic et al. (2003, 2005) • Slide credit: Svetlana Lazebnik

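  10. Unsupervised Dictionary Learning • Sample local features from the images in the database • Run k-means or another clustering algorithm in SIFT space to obtain the dictionary • The dictionary is also called a “codebook” • [Figure: image database mapped into SIFT space, which is partitioned into regions R1, R2, R3]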
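The dictionary-learning step on this slide can be sketched with plain k-means (Lloyd's algorithm). This is a minimal illustration rather than the authors' implementation: the function names `sq_dist` and `learn_codebook` and the toy 2-D "descriptors" are assumptions made for the example, and real SIFT descriptors are 128-dimensional. Only the Python standard library is used, to keep the sketch self-contained.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def learn_codebook(descriptors, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means: returns k centroids (the visual words)."""
    rng = random.Random(seed)
    # Initialise the centroids with k distinct descriptors chosen at random.
    centroids = [list(d) for d in rng.sample(descriptors, k)]
    for _ in range(n_iter):
        # Assignment step: group each descriptor with its nearest centroid.
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            nearest = min(range(k), key=lambda j: sq_dist(d, centroids[j]))
            clusters[nearest].append(d)
        # Update step: move each centroid to the mean of its cluster.
        for j, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

# Toy 2-D "descriptors" (hypothetical data) forming two well-separated groups.
toy = [(0.0, 0.1), (0.1, 0.0), (0.0, 0.0), (5.0, 5.1), (5.1, 5.0), (5.0, 5.0)]
codebook = learn_codebook(toy, k=2)
```

Each returned centroid plays the role of one visual word. In practice a library implementation of k-means would normally be used, with a much larger number of clusters.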

  11. Compute BoW histogram for each image • Assign SIFT features to clusters (regions R1, R2, R3 of the dictionary) • Compute the frequency of each cluster within an image • The result is the BoW histogram representation of the image
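The two steps on this slide (hard assignment of each descriptor to its nearest cluster, then counting) can be sketched as follows. The names `bow_histogram`, `codebook`, and `image_descriptors`, along with the toy values, are assumptions made for illustration. Normalising the counts to sum to one is a choice made here, not something the slide specifies.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def bow_histogram(descriptors, codebook, normalize=True):
    """Count how often each visual word is the nearest one to a descriptor."""
    counts = [0] * len(codebook)
    for d in descriptors:
        # Vector quantisation: index of the closest codebook entry.
        word = min(range(len(codebook)), key=lambda j: sq_dist(d, codebook[j]))
        counts[word] += 1
    if not normalize:
        return counts
    total = sum(counts)
    # Frequencies make images with different numbers of features comparable.
    return [c / total for c in counts] if total else [0.0] * len(counts)

# Hypothetical 3-word codebook and four descriptors from one image.
codebook = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
image_descriptors = [(0.2, 0.1), (4.8, 5.3), (5.1, 4.9), (9.7, 0.4)]
hist = bow_histogram(image_descriptors, codebook)
print(hist)  # [0.25, 0.5, 0.25]
```

The output length equals the codebook size regardless of how many descriptors the image contains.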

  12. Indication of BoW histogram • Summarizes the entire image by its distribution of visual word occurrences • Turns bags of different sizes into a fixed-length vector • Analogous to the bag-of-words representation commonly used for text categorization.

  13. Image classification based on BoW histogram • Learn a classification model to determine the decision boundary • Nonlinear SVMs are commonly applied. • [Figure: BoW histogram vector space with a decision boundary separating “bird” from “dog”]
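The slide does not say which kernel makes the SVM nonlinear. One kernel often used with histogram features is the histogram intersection kernel, sketched below as an assumption rather than as the authors' choice. The function names and the toy "bird" and "dog" histograms are hypothetical.

```python
def intersection_kernel(h1, h2):
    """Histogram intersection: sum of bin-wise minima."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def gram_matrix(histograms):
    """Pairwise kernel values between all histograms (symmetric matrix)."""
    return [[intersection_kernel(a, b) for b in histograms] for a in histograms]

# Hypothetical normalised BoW histograms for two training images.
bird = [0.5, 0.5, 0.0]
dog = [0.0, 0.5, 0.5]
K = gram_matrix([bird, dog])
print(K)  # [[1.0, 0.5], [0.5, 1.0]]
```

A Gram matrix like `K` can be handed to any kernel SVM solver that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`.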

  14. Issues • Sampling strategy • Learning the codebook: what size? supervised? … • Classification: which method? • Scalability: how to handle millions of data points? • How to use spatial information?

  15. Spatial information • The BoW representation discards spatial layout. • This increases invariance to scale, translation, and deformation, • but sacrifices discriminative power, especially when the spatial layout is important. • Slide adapted from Bill Freeman

  16. Spatial pyramid matching • Compute BoW histograms for image regions at different locations and at various scales • Figure credit: Svetlana Lazebnik
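A minimal sketch of the idea: at level l the image is split into a 2^l × 2^l grid, one BoW histogram is computed per cell, and all histograms are concatenated. The function name `spatial_pyramid`, the `(x, y, word)` input format with coordinates scaled to [0, 1), and the toy features are assumptions made for illustration. The per-level weights used in the published SPM kernel are omitted here.

```python
def spatial_pyramid(features, n_words, max_level=2):
    """Concatenate per-cell BoW counts over grids of 1x1, 2x2, 4x4, ... cells.

    features: list of (x, y, word) with x, y in [0, 1) and word in range(n_words).
    """
    pyramid = []
    for level in range(max_level + 1):
        cells = 2 ** level  # grid is cells x cells at this level
        hists = [[0] * n_words for _ in range(cells * cells)]
        for x, y, word in features:
            col = min(int(x * cells), cells - 1)
            row = min(int(y * cells), cells - 1)
            hists[row * cells + col][word] += 1
        for h in hists:
            pyramid.extend(h)
    return pyramid

# Three hypothetical quantised features: (x, y, visual word index).
features = [(0.1, 0.1, 0), (0.9, 0.1, 1), (0.9, 0.9, 1)]
vec = spatial_pyramid(features, n_words=2, max_level=1)
print(vec)  # [1, 2, 1, 0, 0, 1, 0, 0, 0, 1]
```

The first two entries are the whole-image histogram (level 0). The remaining eight are the four cell histograms of the 2 × 2 grid, so two images with the same global histogram can still differ in where each word occurs.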

  17. A common pipeline for discriminative image classification using BoW • Dictionary Learning: Dense/Sparse SIFT → K-means dictionary • Image Classification: Dense/Sparse SIFT → VQ Coding → Spatial Pyramid Pooling → Nonlinear SVM

  18. Combining multiple descriptors • Multiple Feature Detectors → Multiple Descriptors (SIFT, shape, color, …) → VQ Coding and Spatial Pooling → Nonlinear SVM • Diagram from SurreyUVA_SRKDA, the winning team in PASCAL VOC 2008

  19. Outline of Part 2 • Local Features, Sampling, Visual Words, … • Discriminative Methods • Bag-of-Words (BoW) representation • Spatial pyramid matching (SPM) • Generative Methods • Part-based methods • Topic models

  20. Topic models for images • Latent Dirichlet Allocation (LDA), Fei-Fei et al., ICCV 2005 • [Figure: plate diagram with nodes c, z, w and plates N, D; example label “beach”] • Slide credit: Fei-Fei Li
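As a rough illustration of the generative view behind LDA, the sketch below samples the visual words of one image: draw a topic mixture for the image, then for each patch draw a topic z and a visual word w from that topic. It is a plain LDA sketch under stated assumptions, not the model from the cited paper: it leaves out the node c shown in the slide's diagram, and the function names, `alpha`, and the `topic_word` table are hypothetical.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw from a Dirichlet distribution by normalising Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def sample_categorical(probs, rng):
    """Draw one index with probability proportional to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def generate_image_words(n_patches, alpha, topic_word, rng):
    """Generate the visual words of one image under plain LDA."""
    theta = sample_dirichlet(alpha, rng)        # per-image topic mixture
    words = []
    for _ in range(n_patches):
        z = sample_categorical(theta, rng)      # topic of this patch
        w = sample_categorical(topic_word[z], rng)  # visual word given topic
        words.append(w)
    return theta, words

rng = random.Random(0)
# Hypothetical topic-to-word distributions: 2 topics over 3 visual words.
topic_word = [[0.9, 0.1, 0.0], [0.0, 0.1, 0.9]]
theta, words = generate_image_words(50, alpha=[0.5, 0.5],
                                    topic_word=topic_word, rng=rng)
```

Learning runs this process in reverse: given only the observed words of many images, it infers the topic mixtures and the topic-to-word distributions.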

  21. Part-based Model • Fischler & Elschlager (1973) • Source: Rob Fergus, ICCV09 Tutorial

  22. For comprehensive coverage of object categorization models, please visit “Recognizing and Learning Object Categories” by Li Fei-Fei (Stanford), Rob Fergus (NYU), and Antonio Torralba (MIT): http://people.csail.mit.edu/torralba/shortCourseRLOC/
