
Presentation Transcript


  1. Agenda • Introduction • Bag-of-words models • Visual words with spatial location • Part-based models • Discriminative methods • Segmentation and recognition • Recognition-based image retrieval • Datasets & Conclusions

  2. Classifier based methods [Figure: a bag of image patches in some feature space, with a decision boundary separating “computer screen” from “background”.] Object detection and recognition is formulated as a classification problem: the image is partitioned into a set of overlapping windows … and a decision is made at each window about whether it contains the target object or not. Where are the screens?
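
To make the formulation concrete, here is a minimal sliding-window detection sketch in Python (not from the slides; the window size, stride, threshold, and the `classify_window` interface are illustrative assumptions):

import numpy as np

def detect(image, classify_window, win=64, stride=16, thresh=0.0):
    """Scan overlapping windows; return boxes the classifier accepts.

    classify_window: any trained binary classifier (e.g. a boosted
    classifier) returning a real-valued confidence score.
    """
    H, W = image.shape[:2]
    detections = []
    for y in range(0, H - win + 1, stride):
        for x in range(0, W - win + 1, stride):
            score = classify_window(image[y:y + win, x:x + win])
            if score > thresh:                    # decision per window
                detections.append((x, y, win, win, score))
    return detections

In practice the scan is repeated over an image pyramid, so that objects of different sizes fall into the fixed-size window.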

  3. Discriminative methods [Overview panels: nearest neighbor (10^6 examples), neural networks, Support Vector Machines and kernels, Conditional Random Fields.]

  4. Nearest Neighbors Difficult due to the high intrinsic dimensionality of images: lots of data needed (10^6 examples) and slow neighbor lookup. Shakhnarovich, Viola, Darrell 2003; Torralba, Fergus, Freeman 2008

  5. Neural networks: multi-layer Hubel-Wiesel architectures (biologically inspired). LeCun, Bottou, Bengio, Haffner 1998; Rowley, Baluja, Kanade 1998; Hinton & Salakhutdinov 2006; Ranzato, Huang, Boureau, LeCun 2007; Riesenhuber & Poggio 1999; Serre, Wolf, Poggio 2005; Mutch & Lowe 2006

  6. Support Vector Machines. Face detection: Heisele, Serre, Poggio 2001. Pyramid Match Kernel: Grauman & Darrell 2005; Lazebnik, Schmid, Ponce 2006. Combining multiple kernels: Varma & Ray 2007; Bosch, Munoz, Zisserman 2007

  7. Conditional Random Fields Kumar & Hebert 2003 Quattoni, Collins, Darrell 2004 More in Segmentation section

  8. Boosting • A simple algorithm for learning robust classifiers • Freund & Schapire, 1995 • Friedman, Hastie, Tibshirani, 1998 • Provides an efficient algorithm for sparse visual feature selection • Tieu & Viola, 2000 • Viola & Jones, 2003 • Easy to implement; does not require external optimization tools.

  9. A simple object detector with Boosting • Download: code and dataset • Matlab code: gentle boosting; an object detector using a part-based model; a toolbox for manipulating the dataset • Dataset with cars and computer monitors • http://people.csail.mit.edu/torralba/iccv2005/

  10. Boosting. Boosting fits the additive model $H(x) = \sum_{m=1}^{M} f_m(x)$ by minimizing the exponential loss over the training samples $\{(x_i, y_i)\}$, $y_i \in \{-1, +1\}$: $L = \sum_i \exp(-y_i H(x_i))$. The exponential loss is a differentiable upper bound on the misclassification error, since $\exp(-y H(x)) \ge [\,y \ne \mathrm{sign}(H(x))\,]$.
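
A minimal sketch of this stagewise fit in Python, in the spirit of gentle boosting (assumption: `fit_stump(X, y, w)` returns a weak learner fitted to the weighted samples, such as the regression stump sketched under the next slide):

import numpy as np

def gentle_boost(X, y, fit_stump, n_rounds=100):
    """Fit H(x) = sum_m f_m(x) by stagewise exponential-loss minimization."""
    w = np.ones(len(y)) / len(y)        # sample weights
    weak = []
    for _ in range(n_rounds):
        f = fit_stump(X, y, w)          # weak learner on weighted data
        weak.append(f)
        w *= np.exp(-y * f(X))          # upweight misclassified samples
        w /= w.sum()
    return lambda Xq: np.sum([f(Xq) for f in weak], axis=0)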

  11. Weak classifiers • The input is a set of weighted training samples $(x_i, y_i, w_i)$ • Regression stumps: simple, but commonly used in object detection: $f_m(x) = b\,[x_k > \theta] + a\,[x_k \le \theta]$, where $b = E_w(y \mid x_k > \theta)$ and $a = E_w(y \mid x_k \le \theta)$ are the weighted means of $y$ on each side of the threshold. Four parameters: the feature index $k$, the threshold $\theta$, and the two outputs $a$ and $b$.
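
A sketch of fitting such a stump by weighted least squares (illustrative Python, not the course's Matlab code): scan candidate thresholds on each feature and set the two outputs to the weighted means of y on each side.

import numpy as np

def fit_stump(X, y, w):
    """Pick (k, theta, a, b) minimizing the weighted squared error."""
    best_err, best = np.inf, None
    for k in range(X.shape[1]):
        for theta in np.unique(X[:, k])[:-1]:       # both sides stay non-empty
            hi = X[:, k] > theta
            b = np.average(y[hi], weights=w[hi])    # E_w(y | x_k > theta)
            a = np.average(y[~hi], weights=w[~hi])  # E_w(y | x_k <= theta)
            err = np.sum(w * (y - np.where(hi, b, a)) ** 2)
            if err < best_err:
                best_err, best = err, (k, theta, a, b)
    k, theta, a, b = best
    return lambda Xq: np.where(Xq[:, k] > theta, b, a)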

  12. From images to features: a myriad of weak detectors. We will now define a family of visual features that can be used as weak classifiers (“weak detectors”): each takes an image as input and outputs a binary response, and such a feature is a weak detector.

  13. A myriad of weak detectors • Yuille, Snow, Nitzberg, 1998 • Amit, Geman 1998 • Papageorgiou, Poggio, 2000 • Heisele, Serre, Poggio, 2001 • Agarwal, Awan, Roth, 2004 • Schneiderman, Kanade 2004 • Carmichael, Hebert 2004 • …

  14. Weak detectors Textures of textures Tieu and Viola, CVPR 2000. Every combination of three filters generates a different feature, giving thousands of features. Boosting selects a sparse subset, so computations at test time are very efficient. Boosting also avoids overfitting to some extent.

  15. Haar wavelets Haar filters and the integral image Viola and Jones, ICCV 2001. The average intensity in a block is computed from just four values of the integral image, independent of the block size.
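
A minimal Python sketch of the integral-image trick (the function names are mine, not Viola-Jones code):

import numpy as np

def integral_image(img):
    """Zero-padded 2-D prefix sums, so every corner lookup is valid."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1))
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)
    return ii

def box_sum(ii, y, x, h, w):
    """Sum of img[y:y+h, x:x+w] from exactly four integral-image values."""
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

# A two-rectangle Haar feature is then just a difference of two box sums:
def haar_two_rect(ii, y, x, h, w):
    return box_sum(ii, y, x, h, w // 2) - box_sum(ii, y, x + w // 2, h, w - w // 2)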

  16. Haar wavelets Papageorgiou & Poggio (2000) Polynomial SVM

  17. Edges and chamfer distance Gavrila, Philomin, ICCV 1999

  18. Edge fragments Opelt, Pinz, Zisserman, ECCV 2006. Weak detector = k edge fragments and a threshold; the chamfer distance is computed over 8 orientation planes.
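
A sketch of chamfer scoring with a distance transform (illustrative Python, SciPy assumed; real systems, as the slide notes, split edges into 8 orientation planes, while this sketch uses a single plane):

import numpy as np
from scipy.ndimage import distance_transform_edt

def chamfer_score(edge_map, template_points):
    """Mean distance from template edge points to the nearest image edge.

    edge_map: boolean image, True at edge pixels.
    template_points: (N, 2) integer (row, col) coordinates of the template's
    edge pixels, already shifted to a candidate location.
    Lower is better; thresholding the score gives a binary weak detector.
    """
    dist = distance_transform_edt(~edge_map)   # distance to nearest edge pixel
    return dist[template_points[:, 0], template_points[:, 1]].mean()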

  19. Histograms of oriented gradients • Shape context: Belongie, Malik, Puzicha, NIPS 2000 • SIFT: D. Lowe, ICCV 1999 • Dalal & Triggs, 2005

  20. Weak detectors Part based: similar to part-based generative models. We create weak detectors by using parts and voting for the object center location. [Figure: screen model and car model.] These features are used for the detector on the course web site.

  21. Weak detectors First we collect a set of part templates from a set of training objects. Vidal-Naquet, Ullman, Nature Neuroscience 2003 …

  22. Weak detectors We now define a family of “weak detectors” as: [Figure: part template * image = response map.] Better than chance.

  23. Weak detectors We can do a better job using filtered images: [Figure: filtered template * filtered image = response map.] Still a weak detector, but better than before.
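
A sketch of this template-voting weak detector in Python (the offset and threshold parameters are illustrative assumptions; the slides' version applies the template to filtered images):

import numpy as np
from scipy.signal import fftconvolve

def part_vote_detector(filtered_image, part_template, offset, theta):
    """Binary map of votes for the object center.

    offset: (dy, dx) from the part's location to the object center.
    theta: threshold turning the response into a binary weak detector.
    """
    # Cross-correlation = convolution with the flipped template.
    resp = fftconvolve(filtered_image, part_template[::-1, ::-1], mode='same')
    votes = np.roll(resp, shift=offset, axis=(0, 1))   # shift votes to center
    return votes > theta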

  24. Example: screen detection. [Panel: feature output.]

  25. Example: screen detection. [Panels: feature output, thresholded output.] A single weak ‘detector’ produces many false alarms.

  26. Example: screen detection. [Panels: feature output, thresholded output.] Strong classifier at iteration 1.

  27. Example: screen detection. [Panels: feature output, thresholded output.] The second weak ‘detector’ produces a different set of false alarms.

  28. Example: screen detection. [Panels: feature output, thresholded output.] Strong classifier at iteration 2 (sum of the two weak detectors).

  29. Example: screen detection. [Panels: feature output, thresholded output.] Strong classifier at iteration 10.

  30. Example: screen detection. [Panels: feature output, thresholded output.] Adding features: strong classifier at iteration 200; final classification.

  31. Cascade of classifiers. [Figure: precision vs. recall curves for boosted classifiers with 3, 30, and 100 features.] We want the complexity of the 3-feature classifier with the performance of the 100-feature classifier: Fleuret and Geman 2001, Viola and Jones 2001. Select a threshold with high recall for each stage; precision increases through the cascade.
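
A sketch of cascade evaluation (illustrative Python): each stage is a classifier plus a threshold tuned for high recall, and most background windows exit after the cheap early stages.

def cascade_classify(window, stages):
    """stages: list of (score_fn, threshold) pairs, cheapest first,
    e.g. boosted classifiers with 3, 30, and 100 features."""
    for score_fn, threshold in stages:
        if score_fn(window) < threshold:
            return False    # rejected early: no cost from later stages
    return True             # survived every stage: report a detection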

  32. Some goals for object recognition • Able to detect and recognize many object classes • Computationally efficient • Able to deal with data-starved situations: some training samples might be harder to collect than others, and we want on-line learning to be fast.

  33. Shared features • Is learning the 1000th object class easier than learning the first? • Can we transfer knowledge from one object to another? • Are the shared properties interesting by themselves? …

  34. Shared features. Independent binary classifiers (a screen detector, a car detector, a face detector, each with its own features) vs. binary classifiers that share features across the screen, car, and face detectors. Torralba, Murphy, Freeman. CVPR 2004. PAMI 2007
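
A sketch of the computational payoff (illustrative Python, not the JointBoost algorithm itself): the pooled feature responses are computed once per window and every class detector reuses them, whereas independent classifiers would each evaluate their own feature set, so cost would grow linearly with the number of classes.

import numpy as np

def multiclass_scores(window, shared_features, class_weights):
    """shared_features: list of feature functions, evaluated once per window.
    class_weights: dict mapping class name -> weight vector over features."""
    f = np.array([feat(window) for feat in shared_features])  # computed once
    return {c: w @ f for c, w in class_weights.items()}       # reused by all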

  35. Class-specific features vs. shared features. [Figure: 50 training samples per class, 29 object classes, 2000 entries in the dictionary; results averaged over 20 runs; error bars = 80% interval.] Krempp, Geman, & Amit, 2002; Torralba, Murphy, Freeman, CVPR 2004

  36. Generalization as a function of object similarities. [Figure: two panels, 12 viewpoints and 12 unrelated object classes (K = 2.1 and K = 4.8); each plots area under ROC against the number of training samples per class.] Torralba, Murphy, Freeman. CVPR 2004. PAMI 2007

  37. Sharing patches • Bart and Ullman, 2004. For a new class, use only features similar to features that were good for other classes. [Figure: proposed dog features.]

  38. Sharing transformations Miller, E., Matsakis, N., and Viola, P. (2000). Learning from one example through shared densities on transforms. In IEEE Computer Vision and Pattern Recognition. Transformations are shared and can be learnt from other tasks.

  39. Some references on multiclass • Caruana 1997 • Schapire, Singer, 2000 • Thrun, Pratt 1997 • Krempp, Geman, Amit, 2002 • E.L.Miller, Matsakis, Viola, 2000 • Mahamud, Hebert, Lafferty, 2001 • Fink 2004 • LeCun, Huang, Bottou, 2004 • Holub, Welling, Perona, 2005 • …
