
Introduction



Presentation Transcript


  1. Introduction Michael Bleyer LVA Stereo Vision

  2. VU Stereo Vision (3.0 ECTS/2.0 WS)
  • Counts towards (Anrechenbarkeit):
  • Elective in the master's programme “Computergraphik & Digitale Bildverarbeitung” (Computer Graphics & Digital Image Processing)
  • Elective in the master's programme “Medieninformatik” (Media Informatics)
  • Course website:
  • http://www.ims.tuwien.ac.at/teaching_detail.php?ims_id=188.HQK

  3. VU Stereo Vision (3.0 ECTS/2.0 WS)
  • Lecture dates (9 sessions):
  • Fri 4 March (10.00-11.30)
  • Fri 11 March (10.00-11.30)
  • Fri 18 March (10.00-11.30)
  • Fri 25 March (10.00-11.30)
  • Fri 01 April (10.00-11.30)
  • Fri 08 April (10.00-11.30)
  • Fri 15 April (10.00-11.30)
  • Fri 06 May (10.00-11.30)
  • Fri 13 May (10.00-11.30)
  • Oral exam by arrangement
  • Location: Seminarraum 188/2

  4. Topics Covered in the Lecture (1)
  • Session 1 – Introduction: 3D perception, What is disparity?, Applications
  • Session 2 – Basics: 3D geometry, Challenges in stereo matching, Assumptions
  • Session 3 – Local methods: Principle, Adaptive windows
  • Session 4 – Global methods: Stereo as an energy minimization problem, Dynamic programming

  5. Topics Covered in the Lecture (2)
  • Session 5 – Graph Cuts: Alpha-expansions, Fusion moves
  • Session 6 – Smoothness Term: Belief propagation, Different smoothness terms
  • Session 7 – Data Term: Different match measures, Role of color
  • Session 8 – Segmentation-Based Stereo: Occlusion handling, An example algorithm, Stereo and matting
  • Session 9 – Surface Stereo: One of our recent papers, Demo of an autostereoscopic display

  6. Homework
  • Implement a block matching algorithm.
  • You will get more details in session 3.
  • The algorithm is very simple.
  • I do not care about programming languages.
  • You have to present your algorithm as part of the oral exam.
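The homework algorithm can be sketched in a few lines. Below is a minimal sum-of-absolute-differences (SAD) block matcher in Python, assuming rectified greyscale images as NumPy arrays; the function name and parameters are illustrative, not part of the course material, and session 3 covers the details properly.

```python
import numpy as np

def block_matching(left, right, max_disp=64, window=5):
    """Naive SAD block matching: for each pixel in the left image,
    search for the best-matching window along the same scanline
    in the right image and record the horizontal offset (disparity)."""
    h, w = left.shape
    half = window // 2
    disp = np.zeros((h, w), dtype=np.int32)
    for y in range(half, h - half):
        for x in range(half, w - half):
            patch = left[y - half:y + half + 1, x - half:x + half + 1].astype(np.float32)
            best_cost, best_d = np.inf, 0
            # A left pixel at x matches the right pixel at x - d.
            for d in range(0, min(max_disp, x - half) + 1):
                cand = right[y - half:y + half + 1,
                             x - d - half:x - d + half + 1].astype(np.float32)
                cost = np.abs(patch - cand).sum()  # sum of absolute differences
                if cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y, x] = best_d
    return disp
```

This is the brute-force version; it is slow but easy to verify, which is what matters for the exam presentation.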

  7. My Promise
  • After attending the lecture you should:
  • Know the basics of stereo vision
  • Be able to understand the current state of the art
  • Have understood several principles that you can also use for other vision problems: optical flow, segmentation, matting, inpainting, image restoration, …

  8. 3D Perception Michael Bleyer LVA Stereo Vision

  9. 3D Perception [Diagram: left and right 2D images, human eye separation (~6.5 cm), brain, 3D view]

  10. 3D Perception If we ensure that the left eye sees one 2D image and the right eye sees another, our brain will try to overlay the images to generate a 3D impression. How can we use this for watching 3D movies? [Diagram: left and right 2D images, human eye separation (~6.5 cm), brain, 3D view]

  11. Anaglyphs
  • Two images of complementary color are overlaid to generate one image.
  • Glasses required (e.g. red/green).
  • The red filter cancels out the red image component, the green filter the green component.
  • Each eye gets one image => 3D impression.
  • Current 3D cinemas use this principle; however, polarization filters are used instead of color filters.
  [Images: anaglyph image, red/green glasses]
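The overlay step is a simple channel recombination. A sketch in Python for the common red/cyan variant (assuming 8-bit RGB images as NumPy arrays; the function name is illustrative, not from the slides):

```python
import numpy as np

def make_anaglyph(left_rgb, right_rgb):
    """Red/cyan anaglyph: the red channel comes from the left image,
    the green and blue channels from the right image, so each filter
    passes only one of the two views to its eye."""
    out = right_rgb.copy()          # keep the right image's green and blue
    out[..., 0] = left_rgb[..., 0]  # overwrite red with the left image's red
    return out
```

For red/green glasses as mentioned on the slide, one would keep only the green channel from the right image instead of green and blue.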

  12. Shutter Glasses
  • Display flickers between left and right image (i.e. each even frame shows the left image, each odd frame the right image).
  • When the left frame is shown, the shutter glasses close the right eye and vice versa.
  • Requires new displays with a high frame rate (120 Hz).
  • Currently pushed by Nvidia to address the gaming market.
  [Images: shutter glasses and 120 Hz display, Nvidia artwork]

  13. Autostereoscopic Displays
  • No glasses required!
  • A matrix of many transparent lenses is put on the display.
  • The lenses distort pixels so that the left eye gets a left image and the right eye a right image (if you are standing in a sweet spot) => 3D impression.
  • Novel viewpoint capability: you can walk in front of the display and get a perspectively correct depth impression depending on your current viewpoint.
  • You will get a demo soon.
  [Image: Philips WOWvx display]

  14. Free Viewing (No glasses required, but some practice) • How you usually look at the display (no 3D):

  15. Free Viewing (No glasses required, but some practice) • Parallel viewing: [Figure: right image, left image]

  16. Free Viewing (No glasses required, but some practice) • Cross-eye viewing: • Most likely the simpler method. [Figure: left image, right image]

  17. Learning Cross-Eye Viewing
  • Take a pencil and hold it centered between your eyes and the images.
  • Look at the pencil and slowly change its distance to your eyes.
  • Once you have found the right distance, you see a third image in between the left and right images.
  • This third image is in 3D.
  • Practice, it is worth the effort.
  [Figure: left and right 2D images]

  18. 3D on YouTube

  19. Computational Stereo Michael Bleyer LVA Stereo Vision

  20. Computational Stereo Replace human eyes with a pair of slightly displaced cameras. [Diagram: left and right 2D images, brain, 3D view]

  21. Computational Stereo Replace human eyes with a pair of slightly displaced cameras. [Diagram: as before, with the displacement labelled “stereo baseline”]

  22. Computational Stereo [Diagram: two displaced cameras, brain, 3D view]

  23. Computational Stereo [Diagram: two displaced cameras, computer, 3D view]

  24. Computational Stereo [Diagram: two displaced cameras, computer, 3D view]

  25. Computational Stereo How can we accomplish a fully automatic 2D-to-3D conversion? [Diagram: two displaced cameras, computer, 3D view]

  26. What is Disparity? • The amount by which a single pixel is displaced between the two images is called its disparity. • A pixel’s disparity is inversely proportional to its depth in the scene.

  27. What is Disparity? [Figure: background disparity (small)] • The amount by which a single pixel is displaced between the two images is called its disparity. • A pixel’s disparity is inversely proportional to its depth in the scene.

  28. What is Disparity? [Figure: foreground disparity (large)] • The amount by which a single pixel is displaced between the two images is called its disparity. • A pixel’s disparity is inversely proportional to its depth in the scene.

  29. Disparity Encoding • The disparity of each pixel is encoded as a grey value. • High grey values represent high disparities (and low grey values small disparities). • The resulting image is called a disparity map.
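The encoding described above amounts to a linear rescaling. A sketch in Python (a hypothetical helper, assuming a NumPy disparity array and a known maximum disparity):

```python
import numpy as np

def disparity_to_gray(disp, d_max):
    """Scale disparities linearly to grey values in [0, 255]:
    large disparity -> bright pixel, small disparity -> dark pixel."""
    gray = np.clip(disp.astype(np.float32) / d_max, 0.0, 1.0) * 255.0
    return gray.astype(np.uint8)
```

Normalizing by a fixed d_max (rather than the per-image maximum) keeps grey values comparable across disparity maps of the same scene.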

  30. Disparity and Depth • The disparity map contains sufficient information for generating a 3D model.

  31. Disparity and Depth • The disparity map contains sufficient information for generating a 3D model. The challenging part is to compute the disparity map. This task is known as the stereo matching problem. Stereo matching will be the topic of this lecture!
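The inverse proportionality between disparity and depth mentioned earlier can be made concrete: for a rectified camera pair with focal length f (in pixels) and baseline B, the depth of a pixel with disparity d is Z = f · B / d. A minimal sketch (the numeric values in the usage note are illustrative, not from the lecture):

```python
def disparity_to_depth(d, focal_px, baseline_m):
    """Depth of a pixel from its disparity in a rectified stereo pair:
    Z = f * B / d, i.e. depth is inversely proportional to disparity."""
    if d <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / d
```

For example, with a human-eye-like baseline of 0.065 m and f = 1000 px, a disparity of 100 px corresponds to a depth of 0.65 m, while a disparity of 50 px corresponds to 1.3 m.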

  32. Applications • (just a few examples)

  33. 3D Reconstruction from aerial images • Stereo cameras are mounted on an airplane to obtain a terrain map. • Images taken from http://www.robotic.de/Heiko.Hirschmueller/

  34. 3D Reconstruction of Cities • City of Dubrovnik reconstructed from images taken from Flickr in a fully automatic way. • [S. Agarwal, N. Snavely, I. Simon, S. Seitz and R. Szeliski “Building Rome in a Day”, ICCV, 2009]

  35. Driver Assistance / Autonomous driving cars • For example, use stereo to measure distance to other cars. • DARPA Grand Challenge • Image taken from http://www.cs.auckland.ac.nz/~rklette/talks/08_AI.pdf

  36. The Mars Rover • Reconstruct the surface of Mars using stereo vision

  37. Human Motion Capture • Fit a 3D model of the human body to the computed point cloud. • [R. Plänkers and P. Fua, “Articulated Soft Objects for Multiview Shape and Motion Capture”, PAMI, 2003]

  38. Bilayer Segmentation – Z-Keying
  • Goal: divide the image into a foreground and a background region.
  • Simple background subtraction will fail if there is motion in the background.
  • Solution: compute a depth map; if the depth of a pixel is smaller than a predefined threshold (i.e. the pixel is close to the camera), the pixel belongs to the foreground.
  • [A. Criminisi, G. Cross, A. Blake and V. Kolmogorov, “Bilayer Segmentation of Live Video”, CVPR, 2006]
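The thresholding step can be sketched as follows, taking the foreground to be the nearer layer, i.e. pixels with smaller depth (assuming a NumPy depth map; the function name and threshold value are illustrative, not from the cited paper):

```python
import numpy as np

def zkey_foreground(depth, threshold):
    """Bilayer segmentation by depth: pixels nearer to the camera
    than the threshold are labelled foreground (True)."""
    return depth < threshold
```

In practice the depth map would come from stereo matching, so this simple comparison inherits any matching errors, which is why the cited paper combines depth with other cues.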

  39. Novel View Generation • Given a 3D model of the scene, one can use a virtual camera to record new views from arbitrary viewpoints. • For example: the freeze-frame effect from the movie The Matrix. • [L. Zitnick, S. Kang, M. Uyttendaele, S. Winder, and R. Szeliski, “High-quality video view interpolation using a layered representation”, SIGGRAPH, 2004] [Figure: left view (recorded), virtual interpolated view (not recorded), right view (recorded)]

  40. MS Kinect

  41. Understanding Human Vision • If we can teach the computer to see in 3D, we can also learn more about the way humans perceive depth.

  42. Summary • 3D Perception • Principle of human 3D vision • Ways for watching movies in 3D • Computational Stereo • Stereo Matching Problem • Applications
