Stanford CS223B Computer Vision, Winter 2006 Lecture 8 Structure From Motion


Professor Sebastian Thrun. CAs: Dan Maynes-Aminzade, Mitul Saha, Greg Corrado. Slides by: Gary Bradski, Intel Research and Stanford SAIL.


### Stanford CS223B Computer Vision, Winter 2006 Lecture 8 Structure From Motion

Professor Sebastian Thrun

Slides by: Gary Bradski, Intel Research and Stanford SAIL

Structure From Motion

[Figure: a camera observing scene features]

Recover: structure (feature locations), motion (camera extrinsics)

Structure From Motion (1)

Structure From Motion (2)

Structure From Motion (3)

Structure From Motion
• Problem 1:
  • Given n points p_ij = (x_ij, y_ij) in m images
  • Reconstruct structure: 3-D locations P_j = (x_j, y_j, z_j)
  • Reconstruct camera positions (extrinsics) M_i = (A_i, b_i)
• Problem 2:
  • Establish correspondence: c(p_ij)
• SFM = Nonlinear Least Squares problem
  • Minimize through
    • Gauss-Newton
    • Levenberg-Marquardt (!)
  • Prone to local minima
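
The least-squares objective can be written out explicitly. This is a sketch in the slide's notation (p_ij, P_j, A_i, b_i); the projection symbol \pi and the focal length f are assumptions introduced here, with A_i taken to be a 3×3 rotation and b_i a 3-vector translation (6 parameters per camera).

```latex
\min_{\{A_i,\, b_i\},\ \{P_j\}} \;\; \sum_{i=1}^{m} \sum_{j=1}^{n}
  \left\| \, p_{ij} - \pi\!\left(A_i P_j + b_i\right) \right\|^2 ,
\qquad
\pi(X, Y, Z) = \left( \frac{f X}{Z},\; \frac{f Y}{Z} \right)
```

The division by Z inside \pi is what makes the problem nonlinear, which is why iterative solvers such as Gauss-Newton or Levenberg-Marquardt are used and why they can stall in local minima.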
Count # Constraints vs #Unknowns
• m camera poses
• n points
• 2mn point constraints
• 6m+3n unknowns
• Suggests: need 2mn ≥ 6m + 3n
• But: Can we really recover all parameters???
How Many Parameters Can’t We Recover?

We can recover all but…

• Can’t recover origin, orientation (6 params)
• Can’t recover scale (1 param)
• Thus, we need 2mn ≥ 6m + 3n − 7
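
The counting argument can be turned into a small helper that returns the fewest points satisfying 2mn ≥ 6m + 3n − 7 for a given number of views. This is a minimal sketch; the function name `min_points_euclidean` is hypothetical, and the count is only a necessary condition, not a guarantee that a reconstruction exists.

```python
def min_points_euclidean(m: int) -> int:
    """Smallest n with 2*m*n >= 6*m + 3*n - 7, for m >= 2 views.

    Rearranged, the inequality reads n * (2m - 3) >= 6m - 7.
    """
    if m < 2:
        raise ValueError("the counting argument is only meaningful for m >= 2")
    needed = 6 * m - 7          # right-hand side after moving 3n to the left
    per_point = 2 * m - 3       # net constraints gained per extra point
    return -(-needed // per_point)  # ceiling division on positive integers


two_views = min_points_euclidean(2)
three_views = min_points_euclidean(3)
```

With two views each extra point adds only one net constraint, so the required number of points is higher than with three views.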
Are we done?
• No, bundle adjustment has many local minima.
The “Trick Of The Day”
• Replace Perspective by Orthographic Geometry
• Replace Euclidean Geometry by Affine Geometry
• Solve SFM linearly (“closed” form, globally optimal)
• Post-Process to make solution Euclidean
• Post-Process to make solution perspective

Orthographic Camera Model

Extrinsic Parameters

Rotation

Orthographic Projection

Limit of Pinhole Model:

Orthographic Projection
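
The equations on these slides did not survive extraction. The standard form of the model is sketched below, assuming (X, Y, Z) are camera-frame coordinates, f is the focal length, and R_i is the rotation of camera i; these symbols are assumptions, not taken from the slides.

```latex
\text{Pinhole: } \; x = \frac{f X}{Z}, \quad y = \frac{f Y}{Z}
\qquad \xrightarrow{\;\; f \to \infty,\; Z \to \infty,\; f/Z \to 1 \;\;} \qquad
\text{Orthographic: } \; x = X, \quad y = Y

p_{ij} = A_i P_j + b_i, \qquad
A_i \in \mathbb{R}^{2 \times 3} \text{ (first two rows of } R_i\text{)}, \quad
b_i \in \mathbb{R}^{2}
```

Depth drops out of the projection entirely, so the image point becomes a linear function of the 3-D point.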

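The Affine SFM Problem

Minimize the reprojection error subject to the orthonormality constraints on each camera’s rotation rows; the affine problem drops the constraints.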
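
One plausible reading of this slide, written out under the assumption that each camera is p_ij = A_i P_j + b_i with A_i a 2×3 matrix and b_i a 2-vector (I_2 is the 2×2 identity):

```latex
\text{Orthographic SFM:} \quad
\min_{\{A_i,\, b_i\},\ \{P_j\}} \sum_{i=1}^{m} \sum_{j=1}^{n}
  \left\| p_{ij} - A_i P_j - b_i \right\|^2
\quad \text{subject to} \quad A_i A_i^{\top} = I_2

\text{Affine SFM:} \quad
\min_{\{A_i,\, b_i\},\ \{P_j\}} \sum_{i=1}^{m} \sum_{j=1}^{n}
  \left\| p_{ij} - A_i P_j - b_i \right\|^2
\quad \text{(no constraint on } A_i\text{)}
```

Without the constraint, each camera has 6 + 2 = 8 free parameters, which matches the 8m term in the unknown count.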

Count # Constraints vs #Unknowns
• m camera poses
• n points
• 2mn point constraints
• 8m+3n unknowns
• Suggests: need 2mn ≥ 8m + 3n
• But: Can we really recover all parameters???
How Many Parameters Can’t We Recover?

We can recover all but…

Points for Solving Affine SFM Problem
• m camera poses
• n points
• Need to have: 2mn ≥ 8m + 3n − 12
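
A short worked rearrangement of the inequality 2mn ≥ 8m + 3n − 12 shows how many points the count asks for (a necessary condition only):

```latex
2mn \ge 8m + 3n - 12
\;\;\Longleftrightarrow\;\;
n\,(2m - 3) \ge 8m - 12 = 4\,(2m - 3)
\;\;\Longleftrightarrow\;\;
n \ge 4 \quad \text{for every } m \ge 2
```

For example, m = 2 gives 4n ≥ 3n + 4 and m = 3 gives 6n ≥ 3n + 12, both of which reduce to n ≥ 4.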
Affine SFM

Fix coordinate system by making p_0 = origin

Rank Theorem: Q has rank 3

Proof:
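
A standard argument for the rank bound, sketched under the assumption that the affine camera is p_ij = A_i P_j + b_i with A_i a 2×3 matrix and that Q stacks the image coordinates measured relative to the reference point p_0 (the symbol q_ij is notation introduced here):

```latex
q_{ij} \;=\; p_{ij} - p_{i0} \;=\; A_i \,(P_j - P_0)

Q \;=\;
\begin{pmatrix}
q_{11} & \cdots & q_{1n} \\
\vdots &        & \vdots \\
q_{m1} & \cdots & q_{mn}
\end{pmatrix}
\;=\;
\underbrace{\begin{pmatrix} A_1 \\ \vdots \\ A_m \end{pmatrix}}_{2m \times 3}
\;
\underbrace{\begin{pmatrix} P_1 - P_0 & \cdots & P_n - P_0 \end{pmatrix}}_{3 \times n}
\quad \Longrightarrow \quad \operatorname{rank}(Q) \le 3
```

Subtracting the reference point cancels the translation b_i, and a product with inner dimension 3 cannot have rank above 3. The rank is exactly 3 when the points are not coplanar and the stacked camera rows span all three directions.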

The Rank Theorem

Q is a 2m × n matrix: 2m rows (two per image), n columns (one per point)

Affine Solution to Orthographic SFM

Also gives the optimal affine reconstruction under noise
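
A minimal NumPy sketch of the factorization step follows. It assumes a complete (2m, n) measurement matrix with rows 2i and 2i+1 holding the x and y coordinates of all points in image i; the name `affine_sfm` is hypothetical. It also swaps in the centroid of the points as the origin (the usual Tomasi-Kanade choice) instead of a single reference point p_0.

```python
import numpy as np


def affine_sfm(W):
    """Factor a (2m, n) measurement matrix into affine cameras and structure.

    Returns A (2m, 3), P (3, n), b (2m,) such that W ~= A @ P + b[:, None].
    The result is only defined up to an invertible 3x3 (affine) ambiguity.
    """
    b = W.mean(axis=1)                 # assumption: centroid used as origin
    Q = W - b[:, None]                 # centered measurement matrix
    U, s, Vt = np.linalg.svd(Q, full_matrices=False)
    A = U[:, :3] * s[:3]               # keep the three largest singular values
    P = Vt[:3, :]
    return A, P, b


# Synthetic, noise-free check with random affine cameras.
rng = np.random.default_rng(0)
m, n = 4, 10
A_true = rng.normal(size=(2 * m, 3))
b_true = rng.normal(size=2 * m)
P_true = rng.normal(size=(3, n))
W_demo = A_true @ P_true + b_true[:, None]

A_est, P_est, b_est = affine_sfm(W_demo)
residual = np.abs(A_est @ P_est + b_est[:, None] - W_demo).max()
```

With noisy data the truncated SVD is the best rank-3 approximation of Q in the least-squares sense, which is the sense in which the affine reconstruction is optimal.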

Back To Orthographic Projection

Find C and d for which constraints are met

Search in 12-dim space (instead of 8m + 3n-12)
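
One common way to find C is sketched below. It is the linear Tomasi-Kanade style upgrade, swapped in here for the 12-dimensional search named on the slide, and it ignores the translation d. The names `metric_upgrade` and `_quad_row` are hypothetical. The idea is to solve linearly for the symmetric matrix L = C Cᵀ from the orthonormality constraints on each camera's two rows, then recover C by Cholesky factorization (which assumes the estimated L is positive definite).

```python
import numpy as np


def _quad_row(u, v):
    """Coefficients of u^T L v in the 6 unique entries of symmetric L."""
    return np.array([
        u[0] * v[0],
        u[0] * v[1] + u[1] * v[0],
        u[0] * v[2] + u[2] * v[0],
        u[1] * v[1],
        u[1] * v[2] + u[2] * v[1],
        u[2] * v[2],
    ])


def metric_upgrade(A):
    """Given affine cameras A (2m, 3), return C so that A @ C has orthonormal row pairs."""
    rows, rhs = [], []
    for i in range(A.shape[0] // 2):
        a1, a2 = A[2 * i], A[2 * i + 1]
        rows += [_quad_row(a1, a1), _quad_row(a2, a2), _quad_row(a1, a2)]
        rhs += [1.0, 1.0, 0.0]          # unit length, unit length, orthogonal
    l, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    L = np.array([[l[0], l[1], l[2]],
                  [l[1], l[3], l[4]],
                  [l[2], l[4], l[5]]])
    return np.linalg.cholesky(L)        # lower-triangular C with C @ C.T == L


# Synthetic check: orthographic cameras distorted by an unknown affine map G.
rng_up = np.random.default_rng(1)
n_views = 5
A_ortho = np.vstack([np.linalg.qr(rng_up.normal(size=(3, 3)))[0][:2]
                     for _ in range(n_views)])
G = np.eye(3) + 0.3 * rng_up.normal(size=(3, 3))
A_affine = A_ortho @ G

C_est = metric_upgrade(A_affine)
A_metric = A_affine @ C_est
max_dev = max(np.abs(A_metric[2 * i:2 * i + 2] @ A_metric[2 * i:2 * i + 2].T
                     - np.eye(2)).max() for i in range(n_views))
```

The recovered C is determined only up to a rotation, and the structure is upgraded correspondingly by applying the inverse of C to the affine points.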

Back To Projective Geometry

Orthographic (in the limit)

Projective

Back To Projective Geometry

[Figure: pinhole geometry with labels O, X, -x, Z, f]

Optimize, using the orthographic solution as the starting point

The “Trick Of The Day”
• Replace Perspective by Orthographic Geometry
• Replace Euclidean Geometry by Affine Geometry
• Solve SFM linearly (“closed” form, globally optimal)
• Post-Process to make solution Euclidean
• Post-Process to make solution perspective

Structure From Motion
• Problem 1:
  • Given n points p_ij = (x_ij, y_ij) in m images
  • Reconstruct structure: 3-D locations P_j = (x_j, y_j, z_j)
  • Reconstruct camera positions (extrinsics) M_i = (A_i, b_i)
• Problem 2:
  • Establish correspondence: c(p_ij)
The Correspondence Problem

[Figure: View 1, View 2, View 3]

Correspondence: Solution 1
• Track features (e.g., optical flow)
• …but fails when images taken from widely different poses
Correspondence: Solution 2
• Compute soft correspondence: p(c|A,b,P)
• Plug soft correspondence into SFM
• Reiterate

See Dellaert/Seitz/Thorpe/Thrun, Machine Learning Journal, 2003

Correspondence: Alternative Approach
• RANSAC [Fischler/Bolles] = Random Sample Consensus
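
The sample-and-vote loop behind RANSAC can be illustrated on a toy problem. The sketch below fits a 2-D line to points contaminated by outliers; it is a generic illustration rather than the correspondence procedure from the lecture, and the name `ransac_line` and its parameters are hypothetical.

```python
import numpy as np


def ransac_line(points, n_iters=200, tol=0.05, seed=0):
    """Return a boolean inlier mask for the best line found by random sampling."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(n_iters):
        # 1. Sample a minimal set: two points define a line.
        i, j = rng.choice(len(points), size=2, replace=False)
        p, d = points[i], points[j] - points[i]
        length = np.hypot(d[0], d[1])
        if length == 0:
            continue                    # degenerate sample, skip it
        normal = np.array([-d[1], d[0]]) / length
        # 2. Score the hypothesis: count points within tol of the line.
        inliers = np.abs((points - p) @ normal) < tol
        # 3. Keep the hypothesis with the largest consensus set.
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return best_inliers


# 30 points on y = 2x + 1 plus 10 gross outliers far above the line.
rng_pts = np.random.default_rng(2)
xs = np.linspace(0.0, 1.0, 30)
on_line = np.column_stack([xs, 2.0 * xs + 1.0])
outliers = np.column_stack([rng_pts.uniform(0.0, 1.0, 10),
                            rng_pts.uniform(5.0, 10.0, 10)])
mask = ransac_line(np.vstack([on_line, outliers]))
```

In SFM the same loop is typically run with a geometric model between views in place of the line, so that mismatched features end up outside the consensus set.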

Summary SFM
• Problem
  • Determine feature locations (= structure)
  • Determine camera extrinsics (= motion)
• Two Principal Solutions
  • Bundle adjustment (nonlinear least squares, local minima)
  • SVD (through orthographic approximation, affine geometry)
• Correspondence
  • (RANSAC)
  • Expectation Maximization