
An opposition to Window-Scanning Approaches in Computer Vision

Presented by Tomasz Malisiewicz, March 6, 2006, Advanced Perception @ The Robotics Institute.


Presentation Transcript


  1. An opposition to Window-Scanning Approaches in Computer Vision • Presented by Tomasz Malisiewicz, March 6, 2006, Advanced Perception @ The Robotics Institute

  2. 2 Problems • Does scanning windows across an image work? • What types of objects does it work for?

  3. Context, aka Top-Down Processing • What are window-scanning approaches missing? *The following slides are borrowed from Derek Hoiem’s “Putting Context Into Vision” presentation

  4. Quick Question: What is this?

  5. What is context? • Any data or meta-data not directly produced by the presence of an object • Nearby image data

  6. What is context? • Any data or meta-data not directly produced by the presence of an object • Nearby image data • Scene information

  7. What is context? • Any data or meta-data not directly produced by the presence of an object • Nearby image data • Scene information • Presence, locations of other objects Tree

  8. Clues for Function • What is this?

  9. Clues for Function • What is this? • Now can you tell?

  10. Low-Res Scenes • What is this?

  11. Low-Res Scenes • What is this? • Now can you tell?

  12. More Low-Res • What are these blobs?

  13. More Low-Res • The same pixels! (a car)

  14. Why is context useful? • Objects defined at least partially by function • Trees grow in ground • Birds can fly (usually) • Door knobs help open doors

  15. Why is context useful? • Objects defined at least partially by function • Context gives clues about function • Not rooted into the ground → not tree • Object in sky → {cloud, bird, UFO, plane, superman} • Door knobs always on doors

  16. Why is context useful? • Objects defined at least partially by function • Context gives clues about function • Objects like some scenes better than others • Toilets like bathrooms • Fish like water

  17. Why is context useful? • Objects defined at least partially by function • Context gives clues about function • Objects like some scenes better than others • Many objects are used together and, thus, often appear together • Kettle and stove • Keyboard and monitor
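One way to read the "why is context useful" slides is as a prior over object labels that depends on the scene. The sketch below is not from the presentation: it is a minimal Bayes-rule illustration in which the function name `posterior_with_context` and all the probabilities are made-up assumptions. It shows how an appearance score that cannot separate two labels gets disambiguated once a scene prior is applied.

```python
def posterior_with_context(likelihoods, scene_prior):
    """Combine appearance likelihoods with a scene-dependent prior (Bayes rule).

    likelihoods: {label: P(appearance | label)}, e.g. from a window classifier
    scene_prior: {label: P(label | scene)}, a hypothetical context model
    Returns {label: P(label | appearance, scene)}.
    """
    # Labels missing from the prior are treated as impossible in this scene.
    unnormalized = {
        label: lik * scene_prior.get(label, 0.0)
        for label, lik in likelihoods.items()
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("no label is compatible with both appearance and scene")
    return {label: value / total for label, value in unnormalized.items()}


if __name__ == "__main__":
    # Illustrative numbers only: a low-resolution blob that looks equally
    # like a bird or a fish, observed in the sky.
    blob = {"bird": 0.5, "fish": 0.5}
    sky = {"bird": 0.9, "fish": 0.1}
    print(posterior_with_context(blob, sky))
```

With equal appearance likelihoods the posterior simply follows the scene prior, which mirrors the "same pixels" slides: the blob alone is ambiguous, and the surrounding scene decides.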

  18. The other* problem • What types of objects does it work for? *Assuming we can just directly avoid the first problem

  19. • “However, such approaches seem unlikely to scale up to the detection of hundreds or thousands of different object classes because each classifier is trained and run independently.” – Torralba, Murphy, and Freeman, “Sharing features: efficient boosting procedures for multiclass object detection” • “Our goal is to develop a system that detects and recognizes many kinds of objects in photographs and video including everyday office objects, text captions in video, and various structures in biomedical imagery.” – Schneiderman and Kanade, “Object Detection Using the Statistics of Parts” • How many different classifiers must one construct? A different classifier for each object? A different classifier for each pose of an object? How many poses do we need per object?

  20. Too many windows • Now imagine scanning a window and applying 100K independent classifiers at each window
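To put a rough number on "too many windows", the sketch below counts window positions over an image pyramid and multiplies by the number of independent classifiers. It is only an illustration: the window size, stride, and scale step are assumed values rather than figures from the presentation, and the function names are hypothetical.

```python
def count_windows(img_w, img_h, win=24, stride=4, scale_step=1.25):
    """Count square-window positions over a simple image pyramid.

    win, stride, and scale_step are assumed scan parameters. The image is
    shrunk by scale_step per level until the window no longer fits.
    """
    if scale_step <= 1.0:
        raise ValueError("scale_step must be greater than 1")
    total = 0
    scale = 1.0
    while True:
        w, h = int(img_w / scale), int(img_h / scale)
        if w < win or h < win:
            break
        # Number of window placements along each axis at this pyramid level.
        total += ((w - win) // stride + 1) * ((h - win) // stride + 1)
        scale *= scale_step
    return total


def classifier_evaluations(img_w, img_h, num_classifiers, **scan_params):
    """Total classifier runs when every classifier is applied to every window."""
    return count_windows(img_w, img_h, **scan_params) * num_classifiers


if __name__ == "__main__":
    windows = count_windows(640, 480)
    print("windows:", windows)
    print("evaluations with 100K classifiers:",
          classifier_evaluations(640, 480, num_classifiers=100_000))
```

The cost grows as the product of windows and classifiers, so adding one classifier per object and per pose multiplies the work for every window in every image.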

  21. Conclusion • Without context, we can’t find all the things we want to find. We need context to help constrain the search for objects. • With independent classifiers per object (and per pose), we can’t detect a large number of objects. Should a cow detector and a horse detector be built independently? Consider that horses and cows are both animals that often occur in similar contexts. • Remember that complex and deformable objects would require many poses if we are to adhere to the window-based classifier paradigm.

  22. Thank you. *Pascal 2006 Visual Challenge Image
