Rigid Structure from Video


Pedro M. Q. Aguiar

Outline

  • Motivation
  • Segmentation of 2D rigid moving objects
  • Inference of 3D rigid structure
  • Other methods - limitations
  • Proposed approach
  • Problem formulation
  • Algorithms
  • Experiments
Content-based video representation

Applications: compression, non-linear editing, virtual reality, etc.

  • Video
  • Generative Video (GV) [Jasinschi & Moura, 95]
    • flat scenario
    • flat moving objects
  • PROBLEM: Segmentation of 2D rigid moving objects
  • 3D content-based representation
    • 3D rigid shape
    • 3D motion
  • PROBLEM: Inference of 3D rigid structure (shape and motion)
Motion segmentation in low texture

With low texture, segmentation fails!

  • Two-frame motion-based segmentation
    • No prior knowledge about shape, texture
  • [Diehl, 91]

time-consuming algorithms!

  • Possible solution - smoothing
    • Statistical regularization [Dubuisson & Jain, 95]
    • Combine motion with other attributes [Bouthemy & François, 93]
  • Proposed approach - exploit rigidity over a set of frames
    • Explicit modeling of occlusion
    • Feasible implementation of MLE
Observation model


Labeled terms of the observation model: camera window, camera position, object template (modeling of occlusion), object position, object texture.


Maximum Likelihood estimation
  • Given
    • set of F frames
  • Estimate
    • background texture
    • object texture
    • object template
    • camera motion
    • object motion
  • ML cost function

over all frames and pixels

  • ML estimate
Minimization procedure
  • ML estimation

quadratic in O and B

average of the observations, after registration

  • Object and background estimates

linear in T

average of the observations, in the regions not occluded by the object

nonlinear in T

  • Decouple the estimation of the position vectors
  • Motion is estimated on a frame by frame basis [Bergen et al, 92]
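
The background estimate described on this slide (the average of the registered observations, taken only over the regions not occluded by the object) can be illustrated with a minimal sketch. It is a 1D toy, not the author's implementation: integer camera positions, a known occlusion mask per frame, and the function name `estimate_background` are all assumptions made for illustration.

```python
import numpy as np

def estimate_background(frames, cam_shifts, occluded):
    """Average co-registered frames over the pixels not occluded by the object.

    frames     : list of 1D arrays (camera windows), all of the same length
    cam_shifts : integer position of each window in background coordinates
    occluded   : list of boolean arrays, True where the object hides the background
    """
    size = max(cam_shifts) + len(frames[0])
    total = np.zeros(size)
    count = np.zeros(size)
    for frame, shift, occ in zip(frames, cam_shifts, occluded):
        visible = ~occ
        idx = shift + np.flatnonzero(visible)   # registration: window -> background
        total[idx] += frame[visible]
        count[idx] += 1
    background = np.full(size, np.nan)          # NaN where the background was never seen
    seen = count > 0
    background[seen] = total[seen] / count[seen]
    return background

# Toy data: a panning camera, with an object hiding window pixels 2 and 3.
true_bg = np.arange(10.0)
cam_shifts = [0, 2, 4]
occ = np.array([False, False, True, True, False, False])
frames = []
for s in cam_shifts:
    frame = true_bg[s:s + 6].copy()
    frame[occ] = 99.0                           # object texture (arbitrary value)
    frames.append(frame)
background = estimate_background(frames, cam_shifts, [occ] * 3)
```

Because the camera moves, every background pixel is visible in at least one frame of this toy sequence, so the average recovers the full background even though each individual frame is partly occluded.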
Minimization procedure - two-step iterative method
  • Replacing the estimates of O and B in the ML cost function: nonlinear minimization!
  • Replacing only the estimate of B in the ML cost function, minimize using a two-step iterative method:
    • solve for O with fixed T (quadratic, closed-form solution)
    • solve for T with fixed O (linear, closed-form solution)

Minimization procedure - segmentation matrix



  • Template estimate
  • Replacing only the estimate of B in the ML cost function

Accumulated differences between each pair of co-registered frames

Accumulated differences between each frame and the background

  • regions where the test is inconclusive with the available F frames

linear in T!
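
Since the cost is linear in the binary template T, it can be minimized pixel by pixel with a simple comparison. The sketch below illustrates that idea in a simplified 1D setting with a static camera, where the object texture and the background are taken as known. The slide's actual test compares accumulated differences between co-registered frames with accumulated differences to the background; the simplification, the function name `estimate_template`, and the toy data are assumptions.

```python
import numpy as np

def estimate_template(frames, obj_pos, obj_texture, background):
    """Pixel-wise test: T(x) = 1 where the object texture explains the
    registered observations better than the background does."""
    n = len(obj_texture)
    cost_obj = np.zeros(n)   # accumulated squared differences to the object texture
    cost_bg = np.zeros(n)    # accumulated squared differences to the background
    for frame, q in zip(frames, obj_pos):
        patch = frame[q:q + n]                  # frame registered to object coordinates
        cost_obj += (patch - obj_texture) ** 2
        cost_bg += (patch - background[q:q + n]) ** 2
    # The cost sum_x T(x)*cost_obj(x) + (1 - T(x))*cost_bg(x) is linear in T,
    # so each pixel is decided independently. Ties are the inconclusive regions.
    return cost_obj < cost_bg

background = np.arange(12.0)
obj_texture = np.array([50.0, 60.0, 70.0, 80.0])
true_template = np.array([False, True, True, False])
obj_pos = [1, 4, 7]
frames = []
for q in obj_pos:
    frame = background.copy()
    window = frame[q:q + 4]                     # view into the frame
    window[true_template] = obj_texture[true_template]
    frames.append(frame)
estimated = estimate_template(frames, obj_pos, obj_texture, background)
```

A pixel where the two accumulated costs are equal, for instance where the object and the background look alike in every available frame, cannot be classified, which corresponds to the inconclusive regions mentioned on the slide.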


Figure (two-step method): three frames from the image sequence; moving object; background estimate; template estimate.

Figure: four frames from a video sequence; background estimate; moving objects.


3D structure from 2D video
  • Motivation: 3D content-based video representation (application areas go well beyond digital video)
  • Key step: recovery of 3D shape and 3D motion from an image sequence
  • Strongest cue: motion of the brightness pattern
  • Structure From Motion:
    • Step 1. Compute the 2D motion on the image plane
    • Step 2. Recover the 3D motion and the depth
Two-frame SFM - common problem
  • Two-frame SFM fails when the object is far from the camera
  • Solution: exploit rigidity - multi-frame SFM
  • Multi-frame Structure From Motion:
    • Step 1. Track feature points across a set of frames
    • Step 2. Recover relative depth and set of 3D positions
Factorization method

expedite method

  • Factorization [Tomasi & Kanade, 92]:
    • uses linear subspace constraints
    • 3D structure is estimated by factorizing a measurement matrix R whose entries are the trajectories of feature point projections
    • without noise, R is rank 3. An SVD is used to factorize matrix R
  • Multi-frame SFM - hard problem:
    • non-linear
    • large set of unknowns (due to the entire set of 3D positions)
  • Problems:
    • track a large set of features: computationally very heavy, if possible
    • cost of SVD: high for large number of features or frames
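
The rank-3 property that the factorization method relies on can be checked numerically. The sketch below builds a synthetic measurement matrix under orthographic projection and factorizes it with an SVD. The random shape, random orthonormal camera axes, and variable names are assumptions, and the metric constraints that Tomasi and Kanade use to resolve the remaining 3x3 ambiguity are not implemented here.

```python
import numpy as np

rng = np.random.default_rng(0)
n_frames, n_points = 8, 20
shape = rng.normal(size=(3, n_points))            # 3D feature points (columns)

rows = []
for f in range(n_frames):
    # Orthonormal axes from a QR decomposition; the first two rows
    # play the role of the orthographic camera axes for this frame.
    axes, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    t = rng.normal(size=(2, 1))                   # 2D translation
    rows.append(axes[:2] @ shape + t)
R = np.vstack(rows)                               # 2F x N measurement matrix

# Subtracting the mean of each row removes the translation.
R_centered = R - R.mean(axis=1, keepdims=True)
U, s, Vt = np.linalg.svd(R_centered, full_matrices=False)

# Rank-3 factorization, defined only up to an invertible 3x3 matrix.
motion = U[:, :3] * np.sqrt(s[:3])
structure = np.sqrt(s[:3])[:, None] * Vt[:3]
```

The SVD of the full 2F x N matrix is the step whose cost grows with the number of features and frames, which is the limitation listed above.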
Proposed approach: surface-based factorization
  • Describe the 3D shape by a local parameterization
    • induces a parametric description for the 2D motion in the image plane
  • Recover the 3D shape and 3D motion parameters from the 2D motion parameters by further exploiting linear subspace constraints:
    • surface-based factorization
    • rank 1 factorization: uses a fast algorithm to compute only the largest singular value
    • weighted factorization: computes the weighted estimate without additional computational cost
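
One standard way to compute only the largest singular value and its singular vectors, without a full SVD, is power iteration. The slides do not say which fast algorithm is used, so the sketch below is an assumption chosen for illustration. The function name `largest_singular_triplet` and the test matrix are also hypothetical.

```python
import numpy as np

def largest_singular_triplet(R, n_iter=200, tol=1e-12):
    """Power iteration on R^T R: returns (u, sigma, v) with R ~ sigma * u v^T."""
    rng = np.random.default_rng(0)
    v = rng.normal(size=R.shape[1])
    v /= np.linalg.norm(v)
    sigma = 0.0
    for _ in range(n_iter):
        u = R @ v
        new_sigma = np.linalg.norm(u)
        u /= new_sigma
        v = R.T @ u
        v /= np.linalg.norm(v)
        converged = abs(new_sigma - sigma) <= tol * new_sigma
        sigma = new_sigma
        if converged:
            break
    return u, sigma, v

# Nearly rank 1 matrix: one dominant component plus small noise.
rng = np.random.default_rng(3)
R = np.outer(rng.normal(size=12), rng.normal(size=40))
R += 1e-3 * rng.normal(size=R.shape)
u, sigma, v = largest_singular_triplet(R)
sigma_ref = np.linalg.svd(R, compute_uv=False)[0]
```

Each iteration costs two matrix-vector products, and convergence is fast when the second singular value is much smaller than the first, which is the situation a rank 1 matrix corrupted by small noise produces.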
Maximum Likelihood formulation

rather than the two components of the motion, local depth is a single unknown

  • Observations: the images in the sequence. Unknowns: object texture, 3D shape, 3D motion
  • Through ML, 3D structure is recovered:
    • Exploiting object rigidity over a set of frames
    • Directly from the image intensity values

So, where do SFM and factorization come from?

  • Minimization procedure:
    • Minimize with respect to the texture in terms of 3D shape and 3D motion
    • After replacing the texture estimate, the ML cost function depends on the 3D structure only through the 2D motion in the image plane
    • Estimate 3D motion by inferring SFM (factorization). Plug-in the 3D motion estimates
    • Minimize the ML cost function with respect to the relative depth
  • Local 2D motion estimation is ill-posed - aperture problem. Direct methods:
    • Infer 3D structure by using the brightness change constraint between two frames [Horn & Weldon, 88]
    • Kalman filter to update estimates over time [J. Heel, 90]
Observation model
  • Observation model



3D position

  • Unknowns:
Texture estimate
  • Texture estimate - weighted average
  • ML estimate
SFM as an approximation to MLE
  • The ML cost function depends on the 3D structure only through the 2D motion induced in the image plane (no approximations involved)
  • Insert the texture estimate into the cost function
  • 3D structure estimation:
    • 3D motion estimation:
      • Compute 2D motion
      • SFM: rank 1 surface-based factorization
    • 3D shape estimation:
      • Plug-in the 3D motion estimate into the ML cost function
      • Then, minimize with respect to the shape
  • (The estimates can be refined by minimizing the ML cost function in two alternate steps, but initialization is the key problem)
Feature-based SFM

Translation estimate:


Rank 1 factorization
  • Decomposition (minimize without constraints)


  • Normalization (computes by approximating the constraints)


Rank 1 factorization - experiment

three largest singular values of R

matrix is well described by its largest singular value

Rank 1 factorization - experiment

all trajectories have equal shape - it depends only on the 3D motion. The scaling factor depends on the 3D shape (relative depth)

3D shape and 3D motion are observed in a coupled way through the feature trajectories
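
The statement that all trajectories share one shape, scaled by the relative depth, can be checked on synthetic data. The sketch below is an illustration under assumptions, not the slides' derivation: it assumes orthographic projection, an object coordinate system aligned with a reference frame so that the x and y coordinates of the features are known, and translation already removed. Projecting out the part of the trajectories explained by x and y then leaves a rank 1 matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
n_frames, n_points = 10, 30
xyz = rng.normal(size=(3, n_points))
xyz -= xyz.mean(axis=1, keepdims=True)          # centered shape: translation drops out

def rotation(ax, ay):
    """Rotation about the x axis by ax followed by rotation about the y axis by ay."""
    cx, sx, cy, sy = np.cos(ax), np.sin(ax), np.cos(ay), np.sin(ay)
    rot_x = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    rot_y = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    return rot_y @ rot_x

angles = rng.uniform(-0.5, 0.5, size=(n_frames, 2))
M = np.vstack([rotation(ax, ay)[:2] for ax, ay in angles])   # 2F x 3 motion matrix
R = M @ xyz                                                  # 2F x N trajectories

A = xyz[:2].T                                   # N x 2: known x and y coordinates
P = np.eye(n_points) - A @ np.linalg.pinv(A)    # projector onto the complement of span{x, y}
R_tilde = R @ P

s = np.linalg.svd(R_tilde, compute_uv=False)
# R_tilde = (third column of M) * (projected depths)^T: one trajectory shape
# fixed by the 3D motion, scaled per feature by its (projected) relative depth.
rank1_model = np.outer(M[:, 2], P @ xyz[2])
```

In this toy setting the second singular value of `R_tilde` is zero up to rounding, so only the largest singular value and its vectors are needed to separate motion from depth.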

Surface-based factorization
  • Orthographic projection

(easily extended to scaled-orthographic and para-perspective projections)

  • 2D motion in the image plane is affine

Relation between the parameters:

  • Rank 1 factorization

Multi-frame SFM:

  • Piecewise planar 3D shapes
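
That the image motion of a planar patch is affine under orthographic projection follows from substituting the plane equation into the projection. The sketch below checks this numerically. The plane parameters `a`, `b`, `c`, the camera axes `Q`, and the translation `t` are hypothetical values, and the closed-form relation used for `D` and `d` is the one obtained from that substitution, not a formula copied from the slides.

```python
import numpy as np

rng = np.random.default_rng(2)
a, b, c = 0.3, -0.2, 1.5                        # planar patch: z = a*x + b*y + c
xy = rng.uniform(-1, 1, size=(2, 50))
z = a * xy[0] + b * xy[1] + c
points = np.vstack([xy, z])

Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))    # orthonormal camera axes
t = np.array([[0.4], [-0.1]])                   # 2D translation
uv = Q[:2] @ points + t                         # orthographic projection

# Substituting z = a*x + b*y + c gives uv = D @ xy + d, an affine map
# whose parameters depend on the 3D motion (Q, t) and on the plane (a, b, c).
D = Q[:2, :2] + Q[:2, 2:3] @ np.array([[a, b]])
d = c * Q[:2, 2:3] + t
uv_affine = D @ xy + d
```

For a piecewise planar shape, each patch has its own plane parameters and therefore its own set of affine motion parameters, all driven by the same 3D motion.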
Surface-based factorization - experiment

Figure panels: image sequence; smooth texture; image motion.

Weighted factorization

observation noise

  • rank 1 factorization
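
A weighted rank 1 factorization can be obtained at essentially the cost of the unweighted one when the observation noise level differs from feature to feature. The sketch below assumes independent noise with a known standard deviation per column of the matrix, which is an assumption rather than the slides' noise model. It scales the columns, factorizes, and undoes the scaling. The function name `weighted_rank1` is hypothetical, and a full SVD stands in for whatever rank 1 algorithm is actually used.

```python
import numpy as np

def weighted_rank1(R, sigma):
    """Minimize sum_n ||R[:, n] - u * v[n]||^2 / sigma[n]^2 over u and v."""
    Rw = R / sigma                               # scale column n by 1 / sigma[n]
    U, s, Vt = np.linalg.svd(Rw, full_matrices=False)
    u = U[:, 0] * s[0]
    v = Vt[0] * sigma                            # undo the column scaling
    return u, v

def weighted_cost(R, estimate, sigma):
    return np.sum(((R - estimate) / sigma) ** 2)

rng = np.random.default_rng(4)
u0, v0 = rng.normal(size=16), rng.normal(size=25)
sigma = rng.uniform(0.01, 0.5, size=25)          # noise level of each feature
clean = np.outer(u0, v0)
noisy = clean + sigma * rng.normal(size=clean.shape)

u_w, v_w = weighted_rank1(noisy, sigma)
weighted_estimate = np.outer(u_w, v_w)

U, s, Vt = np.linalg.svd(noisy, full_matrices=False)
plain_estimate = s[0] * np.outer(U[:, 0], Vt[0])  # non-weighted rank 1 estimate

u_c, v_c = weighted_rank1(clean, sigma)           # noise-free sanity check
```

The only extra work compared with the non-weighted estimate is one element-wise scaling before the factorization and one after it.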
Weighted factorization - experiment

Figure panels: feature trajectories; non-weighted estimates and weighted estimates of the two components of translation and of the six entries of the rotation matrix.

ML estimate of the 3D shape
  • Image motion:

known motion parameters

affine mapping that depends only on the 3D motion

  • Define a sequence:
  • Motion of the affine mapped sequence:

unknown relative depth

shape of the trajectory of s (known from 3D motion)

magnitude of the trajectory of s (unknown relative depth)

  • Plug-in the 3D motion estimate into the ML cost function
  • Estimating the relative depth after plugging-in the 3D motion is more constrained than estimating the image motion
  • Motivation for the minimization procedure
Minimization procedure - multiresolution
  • Multiresolution continuation-type method
    • coarse-to-fine as more images are being taken into account
    • each stage minimizes the ML cost function by using a Gauss-Newton method
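
The continuation idea, in which each stage is a Gauss-Newton minimization and later stages take more images into account, can be sketched on a 1D toy problem with a single region of constant relative depth z. Everything specific here is an assumption made for illustration: the synthetic texture, the known per-frame motion magnitudes, and a displacement model in which each frame is the reference image shifted by z times that magnitude.

```python
import numpy as np

x = np.arange(0.0, 20.0, 0.05)                   # pixel coordinates

def texture(p):
    return np.sin(p) + 0.5 * np.sin(0.5 * p + 1.0)

z_true = 0.8
motion = np.array([0.2, 0.4, 0.6, 0.8])          # known motion magnitude per frame
ref = texture(x)                                 # reference image
frames = [texture(x - z_true * m) for m in motion]
ref_grad = np.gradient(ref, x)                   # image gradient of the reference
interior = (x > 2.0) & (x < 18.0)                # ignore borders, where warping clamps

def gauss_newton_depth(z, n_used, n_iter=10):
    """Gauss-Newton on the sum of squared intensity differences over n_used frames."""
    for _ in range(n_iter):
        num = 0.0
        den = 0.0
        for frame, m in zip(frames[:n_used], motion[:n_used]):
            warped = np.interp(x - z * m, x, ref)
            grad = np.interp(x - z * m, x, ref_grad)
            r = (frame - warped)[interior]       # residual for the current depth
            J = (grad * m)[interior]             # derivative of the residual w.r.t. z
            num += J @ r
            den += J @ J
        z -= num / den                           # Gauss-Newton update for a scalar unknown
    return z

# Continuation: start with one frame, then add frames stage by stage,
# initializing each stage with the estimate from the previous one.
z_est = 0.0
for n_used in range(1, len(frames) + 1):
    z_est = gauss_newton_depth(z_est, n_used)
```

Early stages use frames with small displacements, where the linearization behind Gauss-Newton is most reliable, and later stages refine the estimate with frames that constrain the depth more strongly.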

components of the image gradient

  • Region R - constant relative depth z
  • Image sequence:
  • and motion:
  • Shape

Affine mapped image sequence:

  • Shape:

without smoothing

Multiresolution continuation-type method. Shape estimate:

  • Synthesizing different views:
Application - video compression


Compressed 317:1

Compressed 575:1

Texture patches JPEG compressed

Major contributions and extensions
  • Explicit modeling of occlusion
  • Multiframe motion segmentation algorithm (two-step)
    • extension: contour model
  • Surface-based factorization
  • Rank 1 factorization
  • Weighted factorization
    • extensions:
      • other projection models
      • multibody
      • occlusion
      • 3D deformable shape from a set of cameras
      • subspace constraints for image motion estimation
  • Multiresolution algorithm for direct inference of 3D shape
    • extension: parameterized surface model