
Object recognition



  1. Object recognition

  2. Object Classes

  3. Individual Recognition

  4. Is this a dog?

  5. Variability of Airplanes Detected

  6. Variability of Horses Detected

  7. Class Non-class

  8. Class Non-class

  9. Recognition with 3-D primitives: Geons

  10. Visual Class: Common Building Blocks

  11. Optimal Class Components?
      • Large features are too rare
      • Small features are found everywhere
      • Find features that carry the highest amount of information

  12. Entropy
      H(x) = -Σ p(x) log2 p(x)
      p(x=0)   p(x=1)   H
      0.5      0.5      1.0
      0.1      0.9      0.47
      0.01     0.99     0.08

  13. Mutual Information I(X;Y)
      X alone: p(x) = (0.5, 0.5), so H(X) = 1.0
      X given Y = 1: p(x) = (0.1, 0.9), H = 0.47
      X given Y = 0: p(x) = (0.8, 0.2), H = 0.72
      H(X|Y) = 0.5·0.72 + 0.5·0.47 = 0.595
      I(X;Y) = H(X) – H(X|Y) = 1 – 0.595 = 0.405
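As a quick sanity check of the numbers on slides 12-13, here is a minimal Python sketch (not part of the original presentation) that recomputes the binary entropies and the resulting I(X;Y):

```python
# Minimal sketch (not from the slides): recompute the entropy values of slide 12
# and the mutual-information example of slide 13.
import math

def binary_entropy(p):
    """Entropy in bits of a binary variable with P(x = 1) = p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# Slide 12: entropy of several distributions over x in {0, 1}
for p in (0.5, 0.1, 0.01):
    print(f"p = ({p}, {1 - p})  ->  H = {binary_entropy(p):.2f}")     # 1.00, 0.47, 0.08

# Slide 13: X alone vs. X given Y (Y = 0 and Y = 1 each occur with probability 0.5)
H_X       = binary_entropy(0.5)                                       # 1.0
H_X_given = 0.5 * binary_entropy(0.8) + 0.5 * binary_entropy(0.1)     # about 0.595
print(f"I(X;Y) = {H_X - H_X_given:.3f}")                              # about 0.405
```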

  14. Mutual Information
      (Diagram: H(C) is split by the feature value into H(C|F=0) and H(C|F=1))
      I(C;F) = H(C) – H(C|F)

  15. Mutual Information II

  16. Computing MI from Examples
      • Mutual information can be measured from examples: 100 Faces and 100 Non-faces; the feature is detected 44 times in the faces and 6 times in the non-faces.
      • H(C) = 1, H(C|F) = 0.8475, so the mutual information is I(C;F) = 0.1525.
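The same calculation can be done directly from the detection counts. The counts below are the ones given on this slide; the code itself is a sketch of mine, not from the lecture:

```python
# Sketch: mutual information of a binary fragment F with the class C,
# estimated from detection counts (100 faces, 100 non-faces; detected 44 / 6 times).
import math

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

n_face, n_nonface = 100, 100           # training examples per class
hits_face, hits_nonface = 44, 6        # images in which the fragment was detected

n = n_face + n_nonface
p_f1 = (hits_face + hits_nonface) / n                              # P(F = 1)
p_c_f1 = hits_face / (hits_face + hits_nonface)                    # P(face | F = 1)
p_c_f0 = (n_face - hits_face) / (n - hits_face - hits_nonface)     # P(face | F = 0)

H_C = entropy([0.5, 0.5])                                          # 1.0
H_C_given_F = (p_f1 * entropy([p_c_f1, 1 - p_c_f1])
               + (1 - p_f1) * entropy([p_c_f0, 1 - p_c_f0]))
print(f"H(C|F) = {H_C_given_F:.4f}, I(C;F) = {H_C - H_C_given_F:.4f}")   # ~0.8475, ~0.1525
```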

  17. Full Distribution, KL, and Classification Error
      (Diagram: the class C generates the feature F with p(C) and p(F|C); the classifier uses an approximation q(C|F))

  18. Optimal classification features
      • Theoretically: maximizing the delivered information minimizes the classification error.
      • In practice: informative object components can be identified in training images.

  19. Selecting Fragments

  20. Adding a New Fragment (max-min selection)
      Gain of a candidate fragment Fi relative to an already-selected fragment Fk:
      ΔMI(Fi, Fk) = MI(Fi, Fk ; class) – MI(Fk ; class)
      Select: maxi mink ΔMI(Fi, Fk)
      (min over the existing fragments, max over the entire candidate pool)
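One way this max-min selection could be implemented is the greedy sketch below. It is my own reading of the rule, assuming a binary detection matrix F (one row per training image, one column per candidate fragment) and binary class labels y; it is not the authors' code:

```python
# Greedy max-min fragment selection (sketch; names and structure are mine).
import numpy as np
from collections import Counter

def mutual_info(F_sub, y):
    """I(F_sub ; C) for a small set of binary columns F_sub (n x k) and labels y (n,)."""
    n = len(y)
    def H(counter):
        return -sum((c / n) * np.log2(c / n) for c in counter.values())
    keys = [tuple(row) for row in F_sub]
    return H(Counter(y)) + H(Counter(keys)) - H(Counter(zip(keys, y)))   # H(C)+H(F)-H(F,C)

def max_min_select(F, y, n_select):
    """Each step adds the candidate whose worst-case MI gain over the
    already-selected fragments is largest (min over selected, max over pool)."""
    pool = list(range(F.shape[1]))
    first = max(pool, key=lambda i: mutual_info(F[:, [i]], y))   # most informative alone
    selected = [first]
    pool.remove(first)
    while pool and len(selected) < n_select:
        def worst_gain(i):
            return min(mutual_info(F[:, [i, k]], y) - mutual_info(F[:, [k]], y)
                       for k in selected)
        best = max(pool, key=worst_gain)
        selected.append(best)
        pool.remove(best)
    return selected
```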

  21. Highly Informative Face Fragments

  22. Intermediate Complexity
      (Plots: merit / weight and relative mutual information as functions of relative object size and of relative resolution; the curves peak for fragments of intermediate size and resolution)

  23. Decision: combine all detected fragments Fk:  Σ wk Fk > θ

  24. Optimal Separation
      Σ wk Fk = θ defines a hyperplane; the weights can be learned with a Perceptron or an SVM.

  25. Combining fragments linearly
      Conditional independence: p(F1, F2 | C) = p(F1 | C) p(F2 | C)
      Likelihood-ratio test: p(F1, …, Fn | Class) / p(F1, …, Fn | Non-class) > θ
      Taking logs turns the product into a sum: Σ w(Fi) > θ, with w(Fi) = log [ p(Fi | Class) / p(Fi | Non-class) ]

  26. Σ w(Fi) > θ
      If Fi = 1, take w(Fi=1) = log [ p(Fi=1 | Class) / p(Fi=1 | Non-class) ]
      If Fi = 0, take w(Fi=0) = log [ p(Fi=0 | Class) / p(Fi=0 | Non-class) ]
      Instead: Σ wi > θ′ over the detected fragments only, with wi = w(Fi=1) – w(Fi=0)
      (the constant Σ w(Fi=0) is absorbed into the threshold θ′)
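These weights are straightforward to estimate from binary training detections. The sketch below is my own illustration (the Laplace smoothing term eps and all names are mine), not code from the lecture:

```python
# Naive-Bayes-style fragment weights and the reduced "detected fragments only" rule.
import numpy as np

def train_weights(F, y, eps=1.0):
    """F: (n, d) 0/1 fragment detections; y: (n,) labels with 1 = class, 0 = non-class.
    Returns per-fragment weights wi and a bias so the decision is wi . f + bias > theta."""
    pos, neg = F[y == 1], F[y == 0]
    p1 = (pos.sum(0) + eps) / (len(pos) + 2 * eps)   # p(Fi = 1 | Class), smoothed
    q1 = (neg.sum(0) + eps) / (len(neg) + 2 * eps)   # p(Fi = 1 | Non-class), smoothed
    w_on  = np.log2(p1 / q1)                         # w(Fi = 1)
    w_off = np.log2((1 - p1) / (1 - q1))             # w(Fi = 0)
    wi   = w_on - w_off           # weight applied only where a fragment is detected
    bias = w_off.sum()            # constant absorbed from the Fi = 0 terms
    return wi, bias

def score(f, wi, bias):
    """Score of a new binary detection vector f; compare against the threshold θ."""
    return wi @ f + bias
```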

  27. Class II

  28. Class Non-class

  29. Fragments with positions: Σ wk Fk > θ, summed over all fragments detected within their expected position regions

  30. Horse-class features

  31. Examples of Horses Detected

  32. Interest points (Harris) and SIFT Descriptors
      H = Σ [ Ix²    IxIy
              IxIy   Iy²  ]

  33. Harris Corner Operator
      H = [ <Ix²>    <IxIy>
            <IxIy>   <Iy²>  ]
      < · > denotes averages within a neighborhood.
      Corner: the two eigenvalues λ1, λ2 of H are both large.
      Indirectly: 'Corner' response = det(H) – k·trace²(H)
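A short sketch of this corner response using NumPy/SciPy; the window size and k = 0.04 are conventional choices of mine, not values given on the slide:

```python
# Harris corner response R = det(H) - k * trace(H)^2 at every pixel (sketch).
import numpy as np
from scipy.ndimage import sobel, uniform_filter

def harris_response(img, k=0.04, window=5):
    img = np.asarray(img, dtype=float)
    Ix = sobel(img, axis=1)                 # horizontal image gradient
    Iy = sobel(img, axis=0)                 # vertical image gradient
    # entries of H, averaged over a neighborhood (the < . > on the slide)
    Sxx = uniform_filter(Ix * Ix, window)
    Syy = uniform_filter(Iy * Iy, window)
    Sxy = uniform_filter(Ix * Iy, window)
    det   = Sxx * Syy - Sxy ** 2
    trace = Sxx + Syy
    return det - k * trace ** 2             # large positive values indicate corners
```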

  34. Harris Corner Examples

  35. SIFT descriptor
      Example: 4×4 sub-regions with a histogram of 8 orientations in each, giving V = 4×4×8 = 128 values: g1,1, …, g1,8, …, g16,1, …, g16,8
      David G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, 60, 2 (2004), pp. 91-110.
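For reference, SIFT keypoints and their 128-dimensional descriptors can be extracted with OpenCV roughly as follows (this assumes opencv-python 4.4 or newer, where SIFT is in the main module, and uses a placeholder image path; it is not code from the lecture):

```python
import cv2

img = cv2.imread("example.jpg", cv2.IMREAD_GRAYSCALE)   # placeholder image path
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(img, None)
# descriptors has shape (n_keypoints, 128): 4x4 sub-regions x 8 orientation bins
print(len(keypoints), descriptors.shape)
```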

  36. SIFT

  37. Constellation of Patches, using interest points
      Six-part motorcycle model, joint Gaussian; Fergus, Perona, Zisserman 2003.

  38. Bag of words and Unsupervised Classification

  39. Bag of visual words A large collection of image patches

  40. Each class has its own histogram of visual words

  41. pLSA (probabilistic Latent Semantic Analysis)
      • Classify documents automatically, find related documents, etc., based on word frequency.
      • Documents contain different 'topics' such as Economics, Sports, Politics, France… Each topic has its typical word frequencies; Economics will have a high occurrence of 'interest', 'bonds', 'inflation', etc.
      • We observe the probabilities p(wi | dn) of words in documents.
      • Each document contains several topics zk; a word has a different probability in each topic, p(wi | zk), and a given document has a mixture of topics, p(zk | dn).
      • The word-frequency model is: p(wi | dn) = Σk p(wi | zk) p(zk | dn)
      • pLSA was used to discover topics and arrange documents according to their topics.

  42. pLSA
      The word-frequency model is: p(wi | dn) = Σk p(wi | zk) p(zk | dn)
      We observe p(wi | dn) and find the best p(wi | zk) and p(zk | dn) to explain the data.
      pLSA was used to discover topics, and then arrange documents according to their topics.

  43. Discovering objects and their location in images
      Sivic, Russell, Efros, Freeman & Zisserman, ICCV 2005
      • Uses simple 'visual words' for classification.
      • Not the best classifier, but obtains unsupervised classification using pLSA.

  44. Visual words – unsupervised classification (codeword dictionary)
      • Four classes: faces, cars, airplanes, motorbikes, plus non-class images. Training images are mixed.
      • Allowed 7 topics: one per object class, plus 3 topics for the background.
      • Visual words: local patches (say, 10×10) described with SIFT descriptors.

  45. Learning
      • Data: the matrix Dij = p(wi | Ij)
      • During learning, discover 'topics' (classes + background): p(wi | Ij) = Σk p(wi | Tk) p(Tk | Ij)
      • Optimize over p(wi | Tk) and p(Tk | Ij).
      • The topics are expected to discover the classes; mainly one topic per class image was found.
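A minimal EM sketch of this pLSA fit, assuming a dense word-by-image count (or frequency) matrix D; this is my own illustration, kept dense for clarity, not the implementation used in the paper:

```python
# pLSA fit by EM (sketch): D[w, d] = count of visual word w in image d; K topics.
import numpy as np

def plsa(D, K, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    W, N = D.shape
    p_w_z = rng.random((W, K))          # p(w | z)
    p_z_d = rng.random((K, N))          # p(z | d)
    p_w_z /= p_w_z.sum(0)               # each topic's word distribution sums to 1
    p_z_d /= p_z_d.sum(0)               # each image's topic mixture sums to 1
    for _ in range(n_iter):
        # E-step: responsibilities p(z | w, d), dense array of shape (W, K, N)
        joint = p_w_z[:, :, None] * p_z_d[None, :, :]
        resp = joint / (joint.sum(1, keepdims=True) + 1e-12)
        # M-step: re-estimate both factors from the expected counts n(w, d) * p(z | w, d)
        weighted = D[:, None, :] * resp
        p_w_z = weighted.sum(2)
        p_z_d = weighted.sum(0)
        p_w_z /= p_w_z.sum(0) + 1e-12
        p_z_d /= p_z_d.sum(0) + 1e-12
    return p_w_z, p_z_d
```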

  46. Results of learning

  47. Classifying a new image
      • New image I: measure p(wi | I).
      • Find the topics for the new image: p(wi | I) = Σk p(wi | Tk) p(Tk | I)
      • Optimize over the topic mixture of the new image (the topic word distributions p(wi | Tk) stay fixed from training).
      • Classify by the largest (non-background) topic.

  48. Classifying a new image

  49. On general model learning
      • The goal is to classify C using a set of features F.
      • The features F have already been selected (they must have high mutual information I(C;F)).
      • The next goal is to use F to decide on the class C.
      • Probabilistic approach: use observations to learn the joint distribution p(C, F); in a new image, F is observed, and the most likely class is found: C* = argmaxC p(C, F)
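A toy sketch of this recipe with a small number of binary features, so that the joint table p(C, F) can be estimated directly by counting; with many fragments this table becomes infeasible to estimate, which is why the conditional-independence factorization of slide 25 is used. The code and names are my own illustration:

```python
# Estimate p(C, F) by counting, then classify a new feature vector by argmax_C p(C, F).
from collections import Counter

def fit_joint(feature_rows, labels):
    """feature_rows: list of tuples of 0/1 feature values; labels: class ids."""
    counts = Counter(zip(labels, feature_rows))
    total = len(labels)
    return {key: c / total for key, c in counts.items()}    # the table p(C, F)

def classify(joint, f, classes):
    # most likely class for the observed feature vector f (unseen combinations get 0)
    return max(classes, key=lambda c: joint.get((c, f), 0.0))

# Usage sketch: joint = fit_joint(rows, labels); classify(joint, (1, 0, 1), {0, 1})
```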
