Stereo and projective structure from motion

04/13/10. Stereo and Projective Structure from Motion. Computer Vision CS 543 / ECE 549 University of Illinois Derek Hoiem. Many slides adapted from Lana Lazebnik, Silvio Savarese, Steve Seitz. This class. Recap of epipolar geometry Recovering structure



04/13/10

Stereo and Projective Structure from Motion

Computer Vision

CS 543 / ECE 549

University of Illinois

Derek Hoiem

Many slides adapted from Lana Lazebnik, Silvio Savarese, Steve Seitz


This class

  • Recap of epipolar geometry

  • Recovering structure

    • Generally, how can we estimate 3D positions for matched points in two images? (triangulation)

    • If we have a moving camera, how can we recover 3D points? (projective structure from motion)

    • If we have a calibrated stereo pair, how can we get dense depth estimates? (stereo fusion)


Basic Questions

  • Why can’t we get depth if the camera doesn’t translate?

  • Why can’t we get a nice panorama if the camera does translate?


Recap: Epipoles

  • Point x in left image corresponds to epipolar line l’ in right image

  • Epipolar line passes through the epipole (the intersection of the cameras’ baseline with the image plane)


Recap: Fundamental Matrix

  • Fundamental matrix maps from a point in one image to a line in the other: $l' = F x$ and $l = F^\top x'$

  • If x and x’ correspond to the same 3D point X, then $x'^\top F x = 0$


Recap: Automatically Relating Projections

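Assume we have matched points x ↔ x’ with outliers

Homography (No Translation)

  • Correspondence relation: $x' = H x$, i.e. $x' \times H x = 0$

  • Normalize image coordinates: $\tilde{x} = T x$, $\tilde{x}' = T' x'$

  • RANSAC with 4 points

  • De-normalize: $H = T'^{-1} \tilde{H} T$

Fundamental Matrix (Translation)

  • Correspondence relation: $x'^\top F x = 0$

  • Normalize image coordinates: $\tilde{x} = T x$, $\tilde{x}' = T' x'$

  • RANSAC with 8 points

  • Enforce $\det(\tilde{F}) = 0$ (rank 2) by SVD

  • De-normalize: $F = T'^\top \tilde{F} T$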


Recap

  • We can get projection matrices P and P’ up to a projective ambiguity

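  • Code:

    function P = vgg_P_from_F(F)

    % Returns the second camera P' of a canonical pair (P = [I | 0], P') compatible with F.

    [U,S,V] = svd(F);

    e = U(:,3);                  % left null vector of F (e'*F = 0): the epipole in the second image

    P = [-vgg_contreps(e)*F e];  % skew-symmetric matrix built from e, times F, with e appended as the 4th column

See HZ p. 255-256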
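
As a rough Python counterpart to the MATLAB snippet, the sketch below builds a canonical camera pair from F using the textbook form P = [I | 0], P’ = [[e’]× F | e’]. It is an illustration under stated assumptions, not the VGG implementation: the function names (`left_epipole`, `canonical_cameras`, `compatibility_residual`) are made up for this example, and the epipole is found with cross products of the columns of F instead of an SVD, which only works when F has exact rank 2.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def skew(e):
    """Cross-product matrix [e]_x, so that skew(e) applied to v equals e x v."""
    return [[0.0, -e[2], e[1]],
            [e[2], 0.0, -e[0]],
            [-e[1], e[0], 0.0]]


def left_epipole(F):
    """Vector e' with e'^T F = 0, assuming F has exact rank 2.

    e' is orthogonal to every column of F, so the cross product of any two
    independent columns works; the longest candidate is kept for stability.
    """
    cols = [[F[r][c] for r in range(3)] for c in range(3)]
    candidates = [cross(cols[0], cols[1]),
                  cross(cols[0], cols[2]),
                  cross(cols[1], cols[2])]
    return max(candidates, key=lambda v: sum(t * t for t in v))


def canonical_cameras(F):
    """Return (P, P') with P = [I | 0] and P' = [[e']_x F | e']."""
    e = left_epipole(F)
    S = skew(e)
    M = [[sum(S[r][k] * F[k][c] for k in range(3)) for c in range(3)]
         for r in range(3)]
    P1 = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
    P2 = [M[r] + [e[r]] for r in range(3)]
    return P1, P2


def compatibility_residual(F, P1, P2):
    """Largest entry of G + G^T for G = P'^T F P.

    A camera pair is compatible with F exactly when G is skew-symmetric,
    so this value should be zero up to rounding.
    """
    G = [[sum(P2[r][a] * F[r][c] * P1[c][b]
              for r in range(3) for c in range(3))
          for b in range(4)] for a in range(4)]
    return max(abs(G[a][b] + G[b][a]) for a in range(4) for b in range(4))


# Toy rank-2 matrix for illustration: F = [t]_x with t = (1, 2, 3).
F = skew([1.0, 2.0, 3.0])
P1, P2 = canonical_cameras(F)
```

The skew-symmetry check is a convenient sanity test because it does not depend on the sign or scale chosen for the epipole.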


Recap

  • Fundamental matrix song


Triangulation: Linear Solution

  • Generally, rays Cx and C’x’ will not exactly intersect

  • Can solve via SVD, finding a least squares solution to a system of equations

[Figure: rays from the two camera centers through image points x and x’, passing near the 3D point X]

Further reading: HZ p. 312-313


Triangulation: Linear Solution

Given P, P’, x, x’

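  • Precondition points and projection matrices

  • Create matrix A with two rows per view: $u\,p_3^\top - p_1^\top$ and $v\,p_3^\top - p_2^\top$, where $(u, v)$ is the image point and $p_k^\top$ is the k-th row of that view's projection matrix

  • [U, S, V] = svd(A)

  • X = V(:, end)

Pros and Cons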

  • Works for any number of corresponding images

  • Not projectively invariant

Code: http://www.robots.ox.ac.uk/~vgg/hzbook/code/vgg_multiview/vgg_X_from_xP_lin.m
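
The linear system can be illustrated in plain Python. Note that this sketch swaps in the inhomogeneous variant of the method: it fixes the last homogeneous coordinate of X to 1 and solves the resulting least-squares problem through the normal equations, so that no SVD routine is needed. The slide's homogeneous solution, X = V(:, end), behaves better for points near infinity. The function names are assumptions made for this example, and the preconditioning step is omitted.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(N, c):
    """Solve the 3x3 system N x = c with Cramer's rule."""
    d = det3(N)
    solution = []
    for k in range(3):
        Nk = [[c[i] if j == k else N[i][j] for j in range(3)] for i in range(3)]
        solution.append(det3(Nk) / d)
    return solution


def triangulate_inhomogeneous(cameras, image_points):
    """Least-squares 3D point from two or more views.

    cameras: list of 3x4 projection matrices (nested lists).
    image_points: list of (u, v) observations, one per camera.
    """
    rows = []
    for P, (u, v) in zip(cameras, image_points):
        # Two rows of A per view: u * p3^T - p1^T and v * p3^T - p2^T.
        rows.append([u * P[2][k] - P[0][k] for k in range(4)])
        rows.append([v * P[2][k] - P[1][k] for k in range(4)])
    # Fix the homogeneous coordinate to 1: M X = b.
    M = [r[:3] for r in rows]
    b = [-r[3] for r in rows]
    # Normal equations (M^T M) X = M^T b.
    N = [[sum(m[i] * m[j] for m in M) for j in range(3)] for i in range(3)]
    c = [sum(m[i] * bi for m, bi in zip(M, b)) for i in range(3)]
    return solve3(N, c)


# Two toy cameras: one at the origin, one shifted by 1 along x.
P1 = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
P2 = [[1.0, 0.0, 0.0, -1.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
X = triangulate_inhomogeneous([P1, P2], [(0.125, 0.05), (-0.125, 0.05)])
```

Because each view simply contributes two more rows, the same function accepts any number of corresponding images, which matches the first item under Pros and Cons.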


Triangulation: Non-linear Solution

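  • Minimize projected error while satisfying $\hat{x}'^\top F \hat{x} = 0$: $\text{cost}(X) = d(x, \hat{x})^2 + d(x', \hat{x}')^2$

  • The solution is a root of a degree-6 polynomial in t (the parameter of the pencil of epipolar lines), minimizing $d(x, l(t))^2 + d(x', l'(t))^2$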

Further reading: HZ p. 318


Projective structure from motion

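  • Given: m images of n fixed 3D points

  • $x_{ij} = P_i X_j$, $i = 1, \dots, m$, $j = 1, \dots, n$

  • Problem: estimate m projection matrices $P_i$ and n 3D points $X_j$ from the mn corresponding points $x_{ij}$

[Figure: 3D point $X_j$ projected by cameras $P_1$, $P_2$, $P_3$ to image points $x_{1j}$, $x_{2j}$, $x_{3j}$]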

Slides from Lana Lazebnik


  • With no calibration info, cameras and points can only be recovered up to a 4x4 projective transformation Q:

    • $X \to QX$, $P \to PQ^{-1}$

  • We can solve for structure and motion when

    • $2mn \ge 11m + 3n - 15$

  • For two cameras, at least 7 points are needed
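
The counting argument can be checked with a few lines of Python. Each of the mn observations gives 2 equations, each camera has 11 degrees of freedom, each point has 3, and the 15 degrees of freedom of the 4x4 projective ambiguity Q are subtracted. The helper name `min_points` is an assumption for this sketch.

```python
def min_points(m):
    """Smallest n satisfying 2mn >= 11m + 3n - 15 for m >= 2 cameras.

    Rearranged: n * (2m - 3) >= 11m - 15, so n is a ceiling division.
    """
    if m < 2:
        raise ValueError("need at least two cameras")
    numerator = 11 * m - 15
    denominator = 2 * m - 3
    return -(-numerator // denominator)  # integer ceiling


for m in (2, 3, 4):
    print(m, "cameras need at least", min_points(m), "points")
```

For m = 2 the inequality becomes 4n ≥ 3n + 7, which gives the 7 points stated above.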


Sequential structure from motion

  • Initialize motion from two images using fundamental matrix

  • Initialize structure by triangulation

  • For each additional view:

    • Determine projection matrix of new camera using all the known 3D points that are visible in its image – calibration

    • Refine and extend structure: compute new 3D points, re-optimize existing points that are also seen by this camera – triangulation

  • Refine structure and motion: bundle adjustment

[Figure: diagram of points versus cameras]


Bundle adjustment

  • Non-linear method for refining structure and motion

  • Minimizing reprojection error

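$$E(P, X) = \sum_{i=1}^{m} \sum_{j=1}^{n} D\big(x_{ij},\, P_i X_j\big)^2$$

[Figure: 3D point $X_j$ reprojected by cameras $P_1$, $P_2$, $P_3$ to $P_1 X_j$, $P_2 X_j$, $P_3 X_j$, shown next to the observed image points $x_{1j}$, $x_{2j}$, $x_{3j}$]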
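
To make the objective concrete, the sketch below evaluates the total squared reprojection error, i.e. the sum over all observations of the squared image distance between the measured point and the projection of the current 3D estimate. It only computes the cost; a real bundle adjuster would hand this to a non-linear least-squares solver. The data layout (an `observations` dictionary keyed by camera and point index) is an assumption chosen for this example.

```python
def project(P, X):
    """Project a 3D point X = (X, Y, Z) with a 3x4 camera matrix P."""
    Xh = list(X) + [1.0]
    u, v, w = (sum(P[r][k] * Xh[k] for k in range(4)) for r in range(3))
    return u / w, v / w


def reprojection_error(cameras, points, observations):
    """Sum of squared reprojection errors.

    cameras: list of 3x4 matrices P_i.
    points: list of 3D points X_j.
    observations: dict mapping (i, j) to the measured image point (u, v);
        pairs that are missing are treated as "point j not visible in image i".
    """
    total = 0.0
    for (i, j), (u, v) in observations.items():
        pu, pv = project(cameras[i], points[j])
        total += (pu - u) ** 2 + (pv - v) ** 2
    return total


cameras = [
    [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]],
    [[1.0, 0.0, 0.0, -1.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]],
]
points = [(0.5, 0.2, 4.0)]
observations = {(0, 0): (0.125, 0.05), (1, 0): (-0.125, 0.05)}
print(reprojection_error(cameras, points, observations))
```

Keying the observations by (camera, point) pairs keeps the sum restricted to points that were actually seen in each image.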


Self-calibration

  • Self-calibration (auto-calibration) is the process of determining intrinsic camera parameters directly from uncalibrated images

  • For example, when the images are acquired by a single moving camera, we can use the constraint that the intrinsic parameter matrix remains fixed for all the images

    • Compute initial projective reconstruction and find 3D projective transformation matrix Q such that all camera matrices are in the form $P_i = K [R_i \mid t_i]$

  • Can use constraints on the form of the calibration matrix: zero skew


Summary so far

  • From two images, we can:

    • Recover fundamental matrix F

    • Recover canonical cameras P and P’ from F

    • Estimate 3d position values X for corresponding points x and x’

  • For a moving camera, we can:

    • Initialize by computing F, P, X for two images

    • Sequentially add new images, computing new P, refining X, and adding points

    • Auto-calibrate, assuming a fixed calibration matrix, to upgrade the reconstruction from a projective ambiguity to a similarity ambiguity


Photo synth

Noah Snavely, Steven M. Seitz, Richard Szeliski, "Photo tourism: Exploring photo collections in 3D," SIGGRAPH 2006

http://photosynth.net/


3D from multiple images

Building Rome in a Day: Agarwal et al. 2009


Plug: Steve Seitz Talk

  • Steve Seitz will talk about “Reconstructing the World from Photos on the Internet”

    • Monday, April 26th, 4pm in Siebel Center


Special case: Dense binocular stereo

  • Fuse a calibrated binocular stereo pair to produce a depth image

[Figure: image 1 and image 2 of a stereo pair, and the resulting dense depth map]

Many of these slides adapted from Steve Seitz and Lana Lazebnik


Basic stereo matching algorithm

  • For each pixel in the first image

    • Find corresponding epipolar line in the right image

    • Examine all pixels on the epipolar line and pick the best match

    • Triangulate the matches to get depth information

  • Simplest case: epipolar lines are scanlines

    • When does this happen?


Simplest Case: Parallel images

  • Image planes of cameras are parallel to each other and to the baseline

  • Camera centers are at same height

  • Focal lengths are the same

  • Then, epipolar lines fall along the horizontal scan lines of the images


Special case of fundamental matrix

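Epipolar constraint (in calibrated coordinates): $x'^\top E x = 0$, with $E = [t]_\times R$

For parallel cameras, $R = I$ and $t = (T, 0, 0)$, so

$$E = [t]_\times R = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & -T \\ 0 & T & 0 \end{bmatrix}$$

$$\begin{pmatrix} u' & v' & 1 \end{pmatrix} \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & -T \\ 0 & T & 0 \end{bmatrix} \begin{pmatrix} u \\ v \\ 1 \end{pmatrix} = 0 \quad\Rightarrow\quad T v = T v'$$

The y-coordinates of corresponding points are the same!

[Figure: parallel cameras related by translation t, with corresponding image points x and x’]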
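
A small numeric check of this special case, written as a sketch with made-up helper names (`skew`, `epipolar_line`, `epipolar_residual`): with R = I and t = (T, 0, 0) the matrix reduces to the cross-product matrix of t, and the epipolar line of any point (u, v, 1) is the horizontal line v’ = v.

```python
def skew(t):
    """Cross-product matrix [t]_x of a 3-vector t."""
    return [[0.0, -t[2], t[1]],
            [t[2], 0.0, -t[0]],
            [-t[1], t[0], 0.0]]


def epipolar_line(E, x):
    """Line l' = E x in the second image for a homogeneous point x."""
    return [sum(E[r][c] * x[c] for c in range(3)) for r in range(3)]


def epipolar_residual(E, x, x_prime):
    """Value of x'^T E x; zero when the pair satisfies the constraint."""
    line = epipolar_line(E, x)
    return sum(x_prime[r] * line[r] for r in range(3))


T = 2.0
E = skew([T, 0.0, 0.0])          # R = I, so E = [t]_x
x = [3.0, 4.0, 1.0]              # point (u, v) = (3, 4) in the first image
line = epipolar_line(E, x)       # (0, -T, T*v): the scanline v' = 4

same_row = epipolar_residual(E, x, [7.0, 4.0, 1.0])   # same y-coordinate
other_row = epipolar_residual(E, x, [7.0, 5.0, 1.0])  # different y-coordinate
```

The first coefficient of the line is zero, so the constraint places no restriction on the horizontal position u’, which is why the later search runs along the whole scanline.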


Depth from disparity

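[Figure: 3D point X at depth z seen by two parallel cameras with optical centers O and O’, focal length f, and baseline B, imaged at x and x’]

By similar triangles:

$$\text{disparity} = x - x' = \frac{B f}{z}$$

Disparity is inversely proportional to depth!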
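
The relation between disparity and depth for a rectified pair, z = f·B / (x − x’), can be written as a one-line function. The numbers in the example (focal length in pixels, baseline in meters) are illustrative assumptions, not values from the lecture.

```python
def depth_from_disparity(disparity, focal_length, baseline):
    """Depth z = f * B / d for a rectified stereo pair.

    disparity: x - x' in pixels (must be positive).
    focal_length: f in pixels.
    baseline: B, distance between the optical centers (sets the unit of z).
    """
    if disparity <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_length * baseline / disparity


# Doubling the disparity halves the depth.
near = depth_from_disparity(70, focal_length=700, baseline=0.5)
far = depth_from_disparity(35, focal_length=700, baseline=0.5)
```

Because depth varies as 1/d, a one-pixel disparity error matters far more for distant points (small d) than for nearby ones.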


Stereo image rectification

  • Reproject image planes onto a common plane parallel to the line between optical centers

  • Pixel motion is horizontal after this transformation

  • Two homographies (3x3 transform), one for each input image reprojection

  • C. Loop and Z. Zhang. Computing Rectifying Homographies for Stereo Vision. IEEE Conf. Computer Vision and Pattern Recognition, 1999.


Rectification example


Basic stereo matching algorithm

  • If necessary, rectify the two stereo images to transform epipolar lines into scanlines

  • For each pixel x in the first image

    • Find corresponding epipolar scanline in the right image

    • Examine all pixels on the scanline and pick the best match x’

    • Compute disparity x-x’ and set depth(x) = B f/(x-x’), where B is the baseline and f the focal length (depth is proportional to 1/(x-x’))


Correspondence search

  • Slide a window along the right scanline and compare contents of that window with the reference window in the left image

  • Matching cost: SSD or normalized correlation

[Figure: left and right images with the search scanline; matching cost plotted against disparity]
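
The window search can be sketched on a single pair of scanlines. This simplified version uses 1D windows and the SSD cost; real implementations use 2D windows over rectified images. The function names and the convention that the match for left pixel x lies at x − d in the right image are assumptions made for this illustration.

```python
def ssd(a, b):
    """Sum of squared differences between two equally long windows."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def best_disparity(left, right, x, half_window, max_disparity):
    """Disparity d in [0, max_disparity] with the lowest SSD cost.

    The reference window is centered at x on the left scanline; candidate
    windows are centered at x - d on the right scanline.
    """
    if x - half_window < 0 or x + half_window >= len(left):
        raise ValueError("reference window falls outside the scanline")
    reference = left[x - half_window: x + half_window + 1]
    best_d, best_cost = None, float("inf")
    for d in range(max_disparity + 1):
        xr = x - d
        if xr - half_window < 0:
            break  # candidate window would leave the image
        candidate = right[xr - half_window: xr + half_window + 1]
        cost = ssd(reference, candidate)
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d, best_cost


# Synthetic scanlines: the left one is the right one shifted by 3 pixels.
right = [0, 0, 1, 5, 9, 4, 2, 0, 0, 0, 0, 0]
left = [0, 0, 0] + right[:-3]
d, cost = best_disparity(left, right, x=7, half_window=1, max_disparity=5)
```

Each pixel is matched independently here, which is why flat or repeated regions produce ambiguous minima.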


Correspondence search

[Figure: left and right scanlines with the SSD matching cost]


Correspondence search

[Figure: left and right scanlines with the normalized correlation matching cost]


Effect of window size

  • Smaller window

    • More detail

    • More noise

  • Larger window

    • Smoother disparity maps

    • Less detail

[Figure: disparity maps computed with window sizes W = 3 and W = 20]


Failures of correspondence search

  • Occlusions, repetition

  • Textureless surfaces

  • Non-Lambertian surfaces, specularities


Results with window search

[Figure: input data, window-based matching result, and ground truth]


How can we improve window-based matching?

  • So far, matches are independent for each point

  • What constraints or priors can we add?


Stereo constraints/priors

  • Uniqueness

    • For any point in one image, there should be at most one matching point in the other image

  • Ordering

    • Corresponding points should be in the same order in both views

[Figure: a configuration in which the ordering constraint doesn’t hold]


Non-local constraints

  • Uniqueness

    • For any point in one image, there should be at most one matching point in the other image

  • Ordering

    • Corresponding points should be in the same order in both views

  • Smoothness

    • We expect disparity values to change slowly (for the most part)


Stereo matching as energy minimization

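[Figure: image $I_1$ with window $W_1(i)$, image $I_2$ with window $W_2(i + D(i))$, and the disparity map $D$ with value $D(i)$]

$$E(D) = \sum_i \big(W_1(i) - W_2(i + D(i))\big)^2 + \lambda \sum_{\text{neighbors } i, j} \rho\big(D(i) - D(j)\big)$$

  • Energy functions of this form can be minimized using graph cuts

  • Y. Boykov, O. Veksler, and R. Zabih, Fast Approximate Energy Minimization via Graph Cuts, PAMI 2001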
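
To show what such an energy measures, the sketch below evaluates a data term plus a smoothness term for a candidate disparity map on one scanline: the data term sums squared differences between W1(i) and W2(i + D(i)), and the smoothness term penalizes disparity changes between neighboring pixels. Several simplifications are assumptions of this example: windows are reduced to single pixels, the penalty ρ is the absolute difference, and the weight is a free parameter. The code only evaluates the energy; it does not perform the graph-cut minimization.

```python
def stereo_energy(I1, I2, D, smoothness_weight=1.0):
    """Energy of a disparity map D on one scanline.

    data term:   sum_i (I1[i] - I2[i + D[i]])^2
    smoothness:  sum_i |D[i] - D[i + 1]|  (assumed penalty rho)
    """
    data = sum((I1[i] - I2[i + D[i]]) ** 2 for i in range(len(I1)))
    smooth = sum(abs(D[i] - D[i + 1]) for i in range(len(D) - 1))
    return data + smoothness_weight * smooth


I1 = [1, 2, 3, 4]
I2 = [0, 1, 2, 3, 4]            # I1 shifted by one pixel

consistent = stereo_energy(I1, I2, [1, 1, 1, 1], smoothness_weight=0.5)
with_outlier = stereo_energy(I1, I2, [1, 1, 0, 1], smoothness_weight=0.5)
```

The disparity map with a single outlier pays twice: once in the data term for the mismatch and once in the smoothness term for the two jumps around it.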


Many of these constraints can be encoded in an energy function and solved using graph cuts

[Figure: ground-truth disparity map next to the graph-cuts result]

  • Y. Boykov, O. Veksler, and R. Zabih, Fast Approximate Energy Minimization via Graph Cuts, PAMI 2001

  • For the latest and greatest: http://www.middlebury.edu/stereo/


Summary

  • Recap of epipolar geometry

    • Epipoles are intersection of baseline with image planes

    • Matching point in second image is on a line passing through its epipole

    • Fundamental matrix maps from a point in one image to an epipolar line in the other

    • Can recover canonical camera matrices from F (with projective ambiguity)

  • Recovering structure

    • Triangulation to recover 3D position of two matched points in images with known projection matrices

    • Sequential algorithm to recover structure from a moving camera, followed by auto-calibration by assuming fixed K

    • Get depth from a stereo pair by rectifying the images via homographies and searching along scanlines for matches; depth is inversely proportional to disparity.


Next class

  • KLT tracking

  • Elegant SFM method using tracked points, assuming orthographic projection

  • Optical flow

