  1. 4054 Machine Vision: Two or More Cameras Dr. Simon Prince, Dept. Computer Science, University College London http://www.cs.ucl.ac.uk/s.prince/4054.htm

  2. Two or More Cameras • Introduction to 3D vision • Stereo vision • Geometry of two cameras • Finding image keypoints • Finding correspondences between keypoints • Sparse stereo reconstruction • Dense stereo reconstruction • Shape from silhouette

  3. Two or More Cameras 1. Introduction to 3D vision

  4. Seeing the world Perspective projection converts from 3D to 2D

  5. 3D shape from 2D images • Single image cues • Perspective

  6. 3D shape from 2D images • Single image cues • Perspective • Contour

  7. 3D shape from 2D images • Single image cues • Perspective • Contour • Texture

  8. 3D shape from 2D images • Single image cues • Perspective • Contour • Texture • Aerial perspective

  9. 3D shape from 2D images • Single image cues • Perspective • Contour • Texture • Aerial perspective • Shading

  10. 3D shape from 2D images • Multiple image cues • space (stereo)

  11. 3D shape from 2D images • Multiple image cues • space (stereo) • time (motion)

  12. 3D shape from 2D images • Multiple image cues • space (stereo) • time (motion) • focus (depth from focus)

  13. 3D shape from 2D images • Multiple image cues • space (stereo) • time (motion) • focus (depth from focus) • silhouette

  14. Applications of 3D Models • Building models for computer graphics • Helping segmentation (breaking camouflage) • Cartography (satellite imagery) • Preserving ancient/cultural monuments • Rendering new views of objects • Measuring 3D face shape for biometrics [Figure: face images labelled wrong gaze / right gaze / wrong gaze]

  15. Two or More Cameras 2. Stereo Vision

  16. Stereo Vision • Stereo vision refers to the ability to infer information on the 3D structure of a scene from two or more images taken from different viewpoints. • This can be achieved by observing the difference in position of image points corresponding to the same scene point.
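
For the simplest case of a rectified pair, this positional difference (the disparity d) converts directly to depth as Z = f B / d; a minimal sketch, where the focal length and baseline values are purely illustrative:

    import numpy as np

    def depth_from_disparity(d, f=700.0, B=0.1):
        # Depth Z = f * B / d for a rectified stereo pair: f is the focal
        # length in pixels, B the baseline in metres (illustrative values).
        d = np.asarray(d, dtype=float)
        return f * B / np.maximum(d, 1e-6)  # guard against zero disparity

Larger disparity means a closer point, which is why stereo depth maps are often displayed with brighter = closer.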

  17. Binocular Stereo Common in biological vision… [Figure: a 3D scene viewed by binocular cameras 1 and 2, with a PC reconstruction shown as a depth map (brighter = closer)] …can we duplicate this ability with machine vision?

  18. The principle of stereo vision is TRIANGULATION. [Figure: rays from the two optical centres O and O' intersect at the scene point]

  19. Stereo Vision Problems CALIBRATION: Establish the geometric relationship between the cameras

  20. Stereo Vision Problems CORRESPONDENCE: Find pairs of matching points in the two images

  21. Stereo Vision Problems RECONSTRUCTION: Calculate the three-dimensional position of the point in the scene.
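
As a hedged illustration of this step (not necessarily the method used later in the course), the classic linear (DLT) triangulation computes the 3D point from two 3x4 projection matrices and a matched point pair:

    import numpy as np

    def triangulate(P1, P2, x1, x2):
        # Linear (DLT) triangulation. P1, P2: 3x4 projection matrices;
        # x1, x2: matched image points (x, y). Returns the 3D point.
        A = np.vstack([
            x1[0] * P1[2] - P1[0],
            x1[1] * P1[2] - P1[1],
            x2[0] * P2[2] - P2[0],
            x2[1] * P2[2] - P2[1],
        ])
        _, _, Vt = np.linalg.svd(A)
        X = Vt[-1]               # null vector of A: homogeneous 3D point
        return X[:3] / X[3]      # dehomogenize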

  22. Two or More Cameras 2.1 Geometry of Two Cameras

  23. Correspondence THE BAD NEWS: for a given point in the first image (centre of square) we aim to find the same point in the second image. There are numerous regions that look like possible matches, and a two-dimensional search of the image is very expensive.

  24. Epipolar Geometry THE GOOD NEWS: we do not need to perform a 2D search for correspondence if we know the geometry of the stereo cameras; the match is constrained to lie on a single epipolar line. [Figure: candidate matches ('?') along the epipolar line, with optical centres O and O']

  25. Epipoles The epipole is the image of the optical centre of the other camera. It is guaranteed to be on every epipolar line.

  26. Example: converging cameras

  27. Example: motion parallel with image plane

  28. Example: forward motion [Figure: epipoles e and e']

  29. Epipoles and Epipolar Lines

  30. The Essential Matrix In the co-ordinate system of the second camera, three vectors are coplanar (they all lie in the epipolar plane): the ray x' to the scene point, the translation t between the two optical centres, and the first camera's ray rotated into the second camera's frame, Rx. The condition for coplanarity of vectors a, b, c is a . (b x c) = 0

  31. The Essential Matrix The cross product can be expressed in matrix form as: b x c = [b]x c, where [b]x is the skew-symmetric matrix with rows (0, -b3, b2), (b3, 0, -b1), (-b2, b1, 0). Substituting into the coplanarity condition x' . (t x Rx) = 0 gives: x'^T [t]x R x = 0, which can also be expressed as: x'^T E x = 0, where E = [t]x R is known as the essential matrix.
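
A small numerical sketch of this construction; the rotation, translation, and scene point below are made up purely to verify that the relation x'^T E x = 0 holds:

    import numpy as np

    def skew(t):
        # Skew-symmetric matrix [t]x, so that skew(t) @ a == np.cross(t, a)
        return np.array([[0.0, -t[2], t[1]],
                         [t[2], 0.0, -t[0]],
                         [-t[1], t[0], 0.0]])

    # Made-up relative pose: small rotation about the y axis, unit translation
    theta = 0.1
    R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
                  [0.0, 1.0, 0.0],
                  [-np.sin(theta), 0.0, np.cos(theta)]])
    t = np.array([1.0, 0.0, 0.0])

    E = skew(t) @ R                 # essential matrix E = [t]x R

    X = np.array([0.5, 0.2, 4.0])   # scene point in camera-1 co-ordinates
    x = X / X[2]                    # camera-1 ray (normalized co-ordinates)
    Xp = R @ X + t                  # same point in camera-2 co-ordinates
    xp = Xp / Xp[2]                 # camera-2 ray

    print(xp @ E @ x)               # ~0, verifying x'^T E x = 0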

  32. Properties of the Essential Matrix The essential matrix relation: x'^T E x = 0. Define: l = E^T x'. Substituting in: l^T x = 0, i.e. l is the epipolar line in the 1st camera. The epipoles are on every epipolar line. Hence, for the epipole e in the first camera, the essential matrix relation is satisfied for every x'. This implies that e lies in the nullspace of E: E e = 0. Properties of the Essential Matrix • 3x3 matrix • Relates camera coords in 1st and 2nd images • 6 parameters (3 rotation, 3 translation), though E is only defined up to scale, leaving 5 degrees of freedom • Rank 2
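
As a hedged sketch (applicable to the E from the sketch above, or any essential matrix), the epipole can be recovered numerically as the null vector of E via the SVD, and epipolar lines follow directly:

    import numpy as np

    def epipole(E):
        # Right null vector of E: the epipole e in the first image, E e = 0.
        # Note: dividing by e[2] fails if the epipole is at infinity.
        _, _, Vt = np.linalg.svd(E)
        e = Vt[-1]
        return e / e[2]

    def epipolar_line(E, xp):
        # Epipolar line l in the first image for a point x' in the second:
        # l = E^T x'; a point x lies on the line when l . x = 0
        return E.T @ xp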

  33. The Fundamental Matrix Up until now, we have been working in camera co-ordinates. However, if we do not know the intrinsic matrices of the cameras, we have only image co-ordinates to play with. The relation between them is: w = Kx and w' = K'x', so x = K^-1 w and x' = K'^-1 w'. The essential matrix relationship was: x'^T E x = 0. Substituting in: w'^T K'^-T E K^-1 w = 0. Or: w'^T F w = 0, where F = K'^-T E K^-1 is known as the fundamental matrix.
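
In code the change of co-ordinates is one line; a minimal sketch, where the intrinsic values for K and K' (here Kp) are hypothetical:

    import numpy as np

    # Hypothetical intrinsics: 700-pixel focal length, principal point (320, 240)
    K  = np.array([[700.0, 0.0, 320.0],
                   [0.0, 700.0, 240.0],
                   [0.0, 0.0, 1.0]])
    Kp = K.copy()                   # assume both cameras are identical

    def fundamental_from_essential(E, K, Kp):
        # F = K'^-T E K^-1, so that w'^T F w = 0 holds in pixel co-ordinates
        return np.linalg.inv(Kp).T @ E @ np.linalg.inv(K)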

  34. Computing the Fundamental Matrix The fundamental matrix relation written out in full: (x', y', 1) F (x, y, 1)^T = 0. Expanding: f11 x'x + f12 x'y + f13 x' + f21 y'x + f22 y'y + f23 y' + f31 x + f32 y + f33 = 0. This can be written as a dot product between the vectorized entries of F and the co-ordinate positions: (x'x, x'y, x', y'x, y'y, y', x, y, 1) . f = 0. Given a series of n matching points, we can form the linear system A f = 0, with one such row per match, and solve for f up to scale.
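
A minimal sketch of forming and solving this system for n >= 8 matches; the least-squares solution is the right singular vector of A with the smallest singular value, and the normalization and rank-2 steps of the next slide are deliberately omitted here:

    import numpy as np

    def fundamental_linear(w, wp):
        # Unnormalized eight-point estimate of F from n >= 8 matches.
        # w, wp: (n, 2) arrays of matching points in images 1 and 2.
        x, y = w[:, 0], w[:, 1]
        xp, yp = wp[:, 0], wp[:, 1]
        # One row per match: (x'x, x'y, x', y'x, y'y, y', x, y, 1)
        A = np.column_stack([xp*x, xp*y, xp, yp*x, yp*y, yp,
                             x, y, np.ones(len(w))])
        _, _, Vt = np.linalg.svd(A)   # least-squares null vector of A
        return Vt[-1].reshape(3, 3)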

  35. Some complications • Must force the singularity (rank-2) constraint on the estimate • Scaling makes this numerically unpleasant: the entries in each row of A span orders of magnitude (around 10000 for the co-ordinate products, 100 for the raw co-ordinates, 1 for the constant), so the points should be normalized first • Algebraic minimization does not minimize the correct cost function; should minimize image distance to the epipolar line
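
Sketches of the two standard remedies (the sqrt(2) scaling target follows Hartley's normalization; treat the exact constants as conventional rather than prescribed by the lecture):

    import numpy as np

    def normalize_points(w):
        # Translate to zero mean, then scale so the mean distance from the
        # origin is sqrt(2); returns the points and the 3x3 transform T
        mean = w.mean(axis=0)
        scale = np.sqrt(2.0) / np.mean(np.linalg.norm(w - mean, axis=1))
        T = np.array([[scale, 0.0, -scale * mean[0]],
                      [0.0, scale, -scale * mean[1]],
                      [0.0, 0.0, 1.0]])
        w_h = np.column_stack([w, np.ones(len(w))])
        return (w_h @ T.T)[:, :2], T

    def enforce_rank2(F):
        # Project F onto the nearest rank-2 matrix by zeroing the smallest
        # singular value (the singularity constraint)
        U, s, Vt = np.linalg.svd(F)
        s[2] = 0.0
        return U @ np.diag(s) @ Vt

An estimate made from normalized points (T w in image 1, T' w' in image 2) is mapped back to pixel co-ordinates as F = T'^T F_norm T.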

  36. Degenerate Cases • Planar scene • Pure rotation • No unique solution: remaining DOF filled by noise • Use a simpler model (e.g. homography) • Model selection (Torr et al., ICCV '98; Kanatani; Akaike) • Compare H and F according to expected residual error (compensate for model complexity)

  37. A chicken and egg problem... • Given a set of n ≥ 8 matching points, we can calculate the fundamental matrix and hence determine the epipolar geometry • But matching points are hard to find without already knowing the epipolar geometry

  38. Correspondence • GOAL: To identify corresponding location of points in the right hand image to points in the left hand image • ASSUMPTIONS: • Most scene points visible from both viewpoints • Matching image points have similar pixel neighbourhoods • DECISIONS: • Which elements to match? • How to measure similarity? • THREE STAGES: • Finding robust image keypoints • Initial matching of these keypoints • Robust matching of keypoints
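
The "similar pixel neighbourhoods" assumption is usually turned into a numerical similarity score; a minimal sketch using normalized cross-correlation (one common choice among several, e.g. SSD), assuming equal-sized grey-level patches:

    import numpy as np

    def ncc(patch1, patch2):
        # Normalized cross-correlation of two equal-sized grey-level patches.
        # Returns a score in [-1, 1]; 1 means identical up to gain and offset,
        # which gives some tolerance to illumination change.
        a = patch1.astype(float).ravel()
        b = patch2.astype(float).ravel()
        a = a - a.mean()
        b = b - b.mean()
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        return float(a @ b / denom) if denom > 0 else 0.0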

  39. Two or More Cameras 2.2 Finding image keypoints

  40. Keypoints / Corners / Feature Points Select interest points in each image. What makes a point interesting?

  41. Desirable Properties of Keypoints • DISTINCTIVENESS: distinct from other points in the image to minimize chance of false match • EASE OF EXTRACTION: fast and simple to extract • INVARIANCE: tolerant to • image noise • changes in illumination • uniform scaling • rotation • minor changes in viewing direction

  42. Scale Invariant Feature Transform (SIFT) • AIMS: • To find ~1000 keypoints per image • Keypoints with the desirable properties listed above • THREE STAGES: • Identify candidate points and localise in position and scale • Reject unstable points • Associate an orientation with each point

  43. Keypoint Criterion: Scale-Space Extrema Repeatedly convolve (blur) the image I(x,y) with a Gaussian: L(x,y,σ) = G(x,y,σ) * I(x,y), where: G(x,y,σ) = (1/(2πσ²)) exp(-(x²+y²)/(2σ²)). This produces a stack of images, with the sharpest in the bottom layer and the most blurred at the top.
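
A minimal sketch of the blurred stack using SciPy's Gaussian filter; sigma0 = 1.6, k = sqrt(2) and the level count are illustrative choices, not the lecture's prescribed settings:

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def gaussian_stack(image, sigma0=1.6, k=np.sqrt(2.0), levels=5):
        # Stack of progressively blurred copies of the image:
        # index 0 is the sharpest layer, the top the most blurred
        return np.stack([gaussian_filter(image.astype(float), sigma0 * k**i)
                         for i in range(levels)])

The difference-of-Gaussian stack used on the following slides is then simply np.diff(stack, axis=0).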

  44. In practice subsample every octave

  45. Take differences between adjacent layers: D(x,y,σ) = L(x,y,kσ) - L(x,y,σ), the difference-of-Gaussian (DoG) response.

  46. Difference of Gaussian Scale Representation

  47. Scale Space Extrema Must be larger (or smaller) than each of its 26 neighbours in the image stack
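
A sketch of this test on a DoG stack D indexed as (scale, y, x); the point must be an interior sample so that the full 3x3x3 neighbourhood exists:

    import numpy as np

    def is_extremum(D, s, y, x):
        # True if D[s, y, x] is strictly larger (or strictly smaller) than
        # all 26 neighbours in its 3x3x3 block of the image stack
        block = D[s-1:s+2, y-1:y+2, x-1:x+2]
        centre = D[s, y, x]
        others = np.delete(block.ravel(), 13)   # drop the centre itself
        return bool(np.all(centre > others) or np.all(centre < others))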

  48. Localising Keypoints GOAL: To localise keypoints in space and scale to subpixel accuracy. METHOD: 1. Take a Taylor expansion around the current point: D(x̂) = D + (∂D/∂x)^T x̂ + (1/2) x̂^T (∂²D/∂x²) x̂, where D(x) is the 3D DoG scale function and x̂ is the offset from the estimated position. 2. Take the derivative and equate to zero, giving: (∂²D/∂x²) x̂ = -(∂D/∂x), or equivalently x̂ = -(∂²D/∂x²)^-1 (∂D/∂x). 3. Solve this 3x3 linear system for x̂.
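
A sketch of these three steps, with the gradient and Hessian of the DoG stack approximated by central finite differences at the integer sample (s, y, x):

    import numpy as np

    def refine_offset(D, s, y, x):
        # Solve x_hat = -H^-1 g for the subpixel/subscale offset at (s, y, x);
        # g and H are the finite-difference gradient and Hessian of D
        g = 0.5 * np.array([D[s+1, y, x] - D[s-1, y, x],
                            D[s, y+1, x] - D[s, y-1, x],
                            D[s, y, x+1] - D[s, y, x-1]])
        c = D[s, y, x]
        H = np.empty((3, 3))
        H[0, 0] = D[s+1, y, x] - 2*c + D[s-1, y, x]
        H[1, 1] = D[s, y+1, x] - 2*c + D[s, y-1, x]
        H[2, 2] = D[s, y, x+1] - 2*c + D[s, y, x-1]
        H[0, 1] = H[1, 0] = 0.25 * (D[s+1, y+1, x] - D[s+1, y-1, x]
                                    - D[s-1, y+1, x] + D[s-1, y-1, x])
        H[0, 2] = H[2, 0] = 0.25 * (D[s+1, y, x+1] - D[s+1, y, x-1]
                                    - D[s-1, y, x+1] + D[s-1, y, x-1])
        H[1, 2] = H[2, 1] = 0.25 * (D[s, y+1, x+1] - D[s, y+1, x-1]
                                    - D[s, y-1, x+1] + D[s, y-1, x-1])
        return np.linalg.solve(H, -g)   # offset in (scale, y, x)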

  49. Suppress Low Contrast Points Reject candidate keypoints where the DoG response at the refined position, |D(x̂)|, is below a threshold; such points are poorly localised and unstable to noise.

  50. Suppressing Edges What we really want is corner points (e.g. Harris & Stephens '88; Shi & Tomasi '94), as edges are ambiguous in one direction. [Figure: homogeneous, edge, and corner patches] This information is captured by the eigenvalues of the image structure tensor, H. Corner = both eigenvalues large. Edge = one large, one small. Homogeneous = both small.
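
A sketch of the structure-tensor test, with Sobel gradients and a Gaussian window (the window scale sigma is an illustrative choice); the two returned maps are the per-pixel eigenvalues used for the corner/edge/homogeneous classification:

    import numpy as np
    from scipy.ndimage import gaussian_filter, sobel

    def structure_tensor_eigenvalues(image, sigma=1.5):
        # Per-pixel eigenvalues of the 2x2 structure tensor H.
        # Both large: corner; one large, one small: edge; both small: homogeneous.
        I = image.astype(float)
        Ix = sobel(I, axis=1)
        Iy = sobel(I, axis=0)
        # Gaussian-weighted local sums of the gradient products
        Hxx = gaussian_filter(Ix * Ix, sigma)
        Hxy = gaussian_filter(Ix * Iy, sigma)
        Hyy = gaussian_filter(Iy * Iy, sigma)
        # Closed-form eigenvalues of a symmetric 2x2 matrix
        tr = 0.5 * (Hxx + Hyy)
        disc = np.sqrt(0.25 * (Hxx - Hyy)**2 + Hxy**2)
        return tr + disc, tr - disc     # lambda_max, lambda_min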
