Rigid Structure from Video

Pedro M. Q. Aguiar

Outline


  • Other methods - limitations

  • Proposed approach

  • Problem formulation

  • Algorithms

  • Experiments

  • Motivation

  • Segmentation of 2D rigid moving objects

  • Inference of 3D rigid structure

Motivation

Content-based video representation

Applications: compression, non-linear editing, virtual reality, etc.


  • Video

  • Generative Video (GV) [Jasinschi & Moura, 95]

    • flat scenario

    • flat moving objects

  • PROBLEM: Segmentation of 2D rigid moving objects

  • 3D content-based representation

    • 3D rigid shape

    • 3D motion

  • PROBLEM: Inference of 3D rigid structure (shape and motion)

Motion segmentation in low texture

With low texture, segmentation fails!

  • Two-frame motion-based segmentation

    • No prior knowledge about shape, texture

  • [Diehl, 91]

time-consuming algorithms!

  • Possible solution - smoothing

    • Statistical regularization [Dubuisson & Jain, 95]

    • Combine motion with other attributes [Bouthemy & François, 93]

  • Proposed approach - exploit rigidity over a set of frames

    • Explicit modeling of occlusion

    • Feasible implementation of MLE

Observation model


camera window

camera position

camera position

object template

(modeling of occlusion)

object position

object texture


Maximum Likelihood estimation

  • Given

    • set of F frames

  • Estimate

    • background texture

    • object texture

    • object template

    • camera motion

    • object motion

  • ML cost function

over all frames and pixels

  • ML estimate

Minimization procedure

  • ML estimation

quadratic in O and B

average of the observations,

after registration

  • Object and background estimates

linear in T

average of the observations, in the

regions not occluded by the object

nonlinear in T

  • Decouple the estimation of the position vectors

  • Motion is estimated on a frame by frame basis [Bergen et al, 92]
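The two averaging estimates on this slide can be illustrated on a toy 1D sequence. The sketch below assumes a static camera, known integer object positions, and a known binary template; the function names and the 1D setup are illustrative assumptions, not the presentation's implementation.

```python
def estimate_object(frames, positions, template):
    """Average the observations after registering them to the object."""
    estimate = []
    for k, inside in enumerate(template):
        if not inside:
            estimate.append(None)  # pixel k lies outside the object template
            continue
        values = [frame[q + k] for frame, q in zip(frames, positions)]
        estimate.append(sum(values) / len(values))
    return estimate


def estimate_background(frames, positions, template):
    """Average each background pixel over the frames where it is not occluded."""
    estimate = []
    for x in range(len(frames[0])):
        values = []
        for frame, q in zip(frames, positions):
            k = x - q  # pixel x expressed in object coordinates
            occluded = 0 <= k < len(template) and template[k] == 1
            if not occluded:
                values.append(frame[x])
        # None marks pixels that the object hides in every frame.
        estimate.append(sum(values) / len(values) if values else None)
    return estimate


# Toy data (assumed): a two-pixel object sliding over a six-pixel background.
background = [10, 20, 30, 40, 50, 60]
texture = [1, 2]
template = [1, 1]
positions = [0, 2, 4]
frames = []
for q in positions:
    frame = list(background)
    frame[q:q + len(texture)] = texture
    frames.append(frame)

object_estimate = estimate_object(frames, positions, template)
background_estimate = estimate_background(frames, positions, template)
```

In this noise-free toy case every background pixel is visible in two of the three frames, so both averages reproduce the true textures.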

Minimization procedure - two-step iterative method

  • Replacing the estimates of O and B in the ML cost function

nonlinear minimization in T!

  • Replacing only the estimate of O in the ML cost function

  • minimize using a two-step iterative method:

    • solve for B with fixed T (quadratic, closed-form solution)

    • solve for T with fixed B (linear, closed-form solution)

Minimization procedure - segmentation matrix



  • Template estimate

  • Replacing only the estimate of O in the ML cost function

Accumulated differences between each pair of co-registered frames

Accumulated differences between each frame and the background

  • regions where the test is inconclusive with the available F frames

linear in T !
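A minimal sketch of the two-step iteration on a toy 1D sequence follows. It assumes a static camera and known integer object positions, and it decides each template pixel independently by comparing the disagreement among co-registered frames with the disagreement between the frames and the background. The function names, the tie-breaking rule, and the toy data are assumptions made for illustration.

```python
def mean(values):
    return sum(values) / len(values)


def background_step(frames, positions, template):
    """Step 1: with T fixed, average each pixel over its unoccluded frames."""
    m = len(template)
    estimate = []
    for x in range(len(frames[0])):
        visible = [frame[x] for frame, q in zip(frames, positions)
                   if not (0 <= x - q < m and template[x - q] == 1)]
        # Fall back to the plain average if the pixel is always occluded.
        estimate.append(mean(visible) if visible
                        else mean([frame[x] for frame in frames]))
    return estimate


def template_step(frames, positions, background, m):
    """Step 2: with B fixed, test each template pixel independently."""
    template = []
    for k in range(m):
        registered = [frame[q + k] for frame, q in zip(frames, positions)]
        center = mean(registered)
        # Object hypothesis: spread of the co-registered frames. This equals
        # the accumulated squared pairwise differences divided by F.
        object_cost = sum((v - center) ** 2 for v in registered)
        # Background hypothesis: differences between frames and background.
        background_cost = sum((frame[q + k] - background[q + k]) ** 2
                              for frame, q in zip(frames, positions))
        # Ties (inconclusive test) are assigned to the background here.
        template.append(1 if object_cost < background_cost else 0)
    return template


def two_step(frames, positions, m, iterations=5):
    template = [0] * m  # start by assuming there is no object
    for _ in range(iterations):
        background = background_step(frames, positions, template)
        template = template_step(frames, positions, background, m)
    return background, template


# Toy data (assumed): the object covers the two middle pixels of a 4-pixel window.
true_background = [10, 20, 30, 40, 50, 60, 70, 80]
true_template = [0, 1, 1, 0]
texture = [0, 5, 5, 0]
positions = [0, 2, 4]
frames = []
for q in positions:
    frame = list(true_background)
    for k, inside in enumerate(true_template):
        if inside:
            frame[q + k] = texture[k]
    frames.append(frame)

background, template = two_step(frames, positions, m=len(true_template))
```

Starting from an empty template, the first background estimate is blurred by the moving object, yet the template test already separates the two middle pixels; the second pass then recovers the background in this noise-free example.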

Experiment


moving object

three frames from the image sequence


Experiment


background estimate

Two-step method

template estimate

Experiment


background estimate

moving objects

four frames from a video sequence

3D structure from 2D video

  • Motivation: 3D content-based video representation (application areas go well beyond digital video)

  • Key step: recovery of 3D shape and 3D motion from an image sequence

  • Strongest cue: motion of the brightness pattern

  • Structure From Motion:

    • Step 1. Compute the 2D motion on the image plane

    • Step 2. Recover the 3D motion and the depth

Two-frame SFM - common problem

  • Two-frame SFM fails when the object is far from the camera

  • Solution: exploit rigidity - multi-frame SFM

  • Multi-frame Structure From Motion:

    • step 1. track feature points across a set of frames

    • step 2. recover relative depth and set of 3D positions

Factorization method

expedite method

  • Factorization [Tomasi & Kanade, 92]:

    • uses linear subspace constraints

    • 3D structure is estimated by factorizing a measurement matrix R whose entries are the trajectories of feature point projections

    • without noise, R is rank 3. An SVD is used to factorize matrix R

  • Multi-frame SFM - hard problem:

    • non-linear

    • large set of unknowns (due to the entire set of 3D positions)

  • Problems:

    • track a large set of features: computationally very heavy, if possible

    • cost of SVD: high for large number of features or frames

Proposed approach: surface-based factorization

  • Describe the 3D shape by a local parameterization

  • This induces a parametric description for the 2D motion in the image plane

  • Recover the 3D shape and 3D motion parameters from the 2D motion parameters by further exploiting linear subspace constraints:

    • surface-based factorization

    • rank 1 factorization: uses a fast algorithm to compute only the largest singular value

    • weighted factorization: computes the weighted estimate without additional computational cost

Maximum Likelihood formulation

rather than the two components of the motion, local depth is a single unknown

  • Observations: the images in the sequence. Unknowns: object texture, 3D shape, 3D motion

  • Through ML, 3D structure is recovered:

    • Exploiting object rigidity over a set of frames

    • Directly from the image intensity values

So, where do SFM and factorization come from?

  • Minimization procedure:

    • Minimize with respect to the texture in terms of 3D shape and 3D motion

    • After replacing the texture estimate, the ML cost function depends on the 3D structure only through the 2D motion in the image plane

    • Estimate 3D motion by inferring SFM (factorization). Plug in the 3D motion estimates

    • Minimize the ML cost function with respect to the relative depth

  • Local 2D motion estimation is ill-posed - aperture problem. Direct methods:

    • Infer 3D structure by using the brightness change constraint between two frames [Horn & Weldon, 88]

    • Kalman filter to update estimates over time [J. Heel, 90]

Observation model

  • Observation model



3D position

  • Unknowns:

Texture estimate

  • Texture estimate - weighted average

  • ML estimate

SFM as an approximation to MLE

  • The ML cost function depends on the 3D structure only through the 2D motion induced in the image plane (no approximations involved)

  • Insert the texture estimate into the cost function

  • 3D structure estimation:

    • 3D motion estimation:

      • Compute 2D motion

      • SFM: rank 1 surface-based factorization

  • 3D shape estimation:

    • Plug the 3D motion estimate into the ML cost function

    • Then, minimize with respect to the shape

  • (The estimates can be refined by minimizing the ML cost function in two alternating steps, but initialization is the key problem)

Feature-based SFM

Translation estimate:
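The slide's equation is not preserved in this transcript. In the standard factorization setting [Tomasi & Kanade, 92], with orthographic projection and the object origin placed at the centroid of the feature points, the translation in each frame is the mean of the feature coordinates in that frame. The sketch below assumes that choice; the function names and the pure-translation test data are illustrative.

```python
def estimate_translation(tracks):
    """tracks[f][n] = (u, v) image position of feature n in frame f."""
    translations = []
    for frame in tracks:
        tu = sum(u for u, _ in frame) / len(frame)
        tv = sum(v for _, v in frame) / len(frame)
        translations.append((tu, tv))
    return translations


def remove_translation(tracks, translations):
    """Subtract the per-frame translation from every feature position."""
    return [[(u - tu, v - tv) for u, v in frame]
            for frame, (tu, tv) in zip(tracks, translations)]


# Three features whose centroid is the origin, shifted by a known translation
# (no rotation, to keep the example small).
shape = [(-1.0, 0.0), (1.0, -1.0), (0.0, 1.0)]
true_translations = [(0.0, 0.0), (2.0, 1.0), (4.0, 3.0)]
tracks = [[(x + tu, y + tv) for x, y in shape] for tu, tv in true_translations]

translations = estimate_translation(tracks)
centered = remove_translation(tracks, translations)
```

After the translation is removed, the remaining trajectories depend only on rotation and shape, which is the part that the factorization step works on.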


Rank 1 factorization

  • Decomposition (minimize without constraints)


  • Normalization (computes by approximating the constraints)
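One well-known way to obtain only the dominant singular value and its singular vectors is power iteration. The sketch below illustrates the decomposition step under that assumption; it may differ from the fast algorithm used in the presentation, and it does not cover the normalization step that imposes the motion constraints. The function name and test matrix are hypothetical.

```python
import math


def rank1_factorize(R, iterations=50):
    """Return (sigma, u, v) such that sigma * u v^T approximates R."""
    rows, cols = len(R), len(R[0])
    v = [1.0 / math.sqrt(cols)] * cols  # arbitrary unit-norm starting vector
    for _ in range(iterations):
        # u <- R v
        u = [sum(R[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        # v <- R^T u, then renormalize to unit length
        v = [sum(R[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    u = [sum(R[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    sigma = math.sqrt(sum(x * x for x in u))
    u = [x / sigma for x in u]
    return sigma, u, v


# Exactly rank 1 test matrix: the outer product of a and b.
a = [3.0, 4.0]       # norm 5
b = [1.0, 2.0, 2.0]  # norm 3
R = [[ai * bj for bj in b] for ai in a]

sigma, u, v = rank1_factorize(R)
```

Each iteration costs two matrix-vector products, so the cost grows linearly with the number of entries of R, in contrast to a full SVD. The sketch assumes the starting vector is not orthogonal to the dominant right singular vector.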


Rank 1 factorization - experiment

three largest singular values of R

the matrix is well described by its largest singular value

Rank 1 factorization - experiment

all trajectories have equal shape - it depends only on the 3D motion. The scaling factor depends on the 3D shape (relative depth)

3D shape and 3D motion are observed in a coupled way through the feature trajectories

Surface-based factorization

  • Orthographic projection

(easily extended to scaled-orthographic and para-perspective projections)

  • 2D motion in the image plane is affine

Relation between the parameters:

  • Rank 1 factorization

Multi-frame SFM:

  • Piecewise planar 3D shapes
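The claim that a planar patch under orthographic projection induces affine image motion can be checked numerically. The sketch below assumes a patch z = z0 + a*x + b*y in object coordinates, a rotation R, and an image-plane translation t; substituting z into the projection gives an affine map of (x, y). All names and numeric values are illustrative assumptions.

```python
import math


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation(alpha, beta):
    """Rotation about the x-axis by alpha followed by the y-axis by beta."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    Rx = [[1, 0, 0], [0, ca, -sa], [0, sa, ca]]
    Ry = [[cb, 0, sb], [0, 1, 0], [-sb, 0, cb]]
    return matmul(Ry, Rx)


def project(R, t, point):
    """Orthographic projection: first two coordinates of R p, plus t."""
    x, y, z = point
    u = R[0][0] * x + R[0][1] * y + R[0][2] * z + t[0]
    v = R[1][0] * x + R[1][1] * y + R[1][2] * z + t[1]
    return u, v


def affine_parameters(R, t, plane):
    """Affine motion (A, d) induced by the plane z = z0 + a*x + b*y."""
    z0, a, b = plane
    A = [[R[0][0] + R[0][2] * a, R[0][1] + R[0][2] * b],
         [R[1][0] + R[1][2] * a, R[1][1] + R[1][2] * b]]
    d = (R[0][2] * z0 + t[0], R[1][2] * z0 + t[1])
    return A, d


R = rotation(0.2, -0.3)
t = (1.0, -2.0)
plane = (5.0, 0.1, -0.2)
A, d = affine_parameters(R, t, plane)

# Compare the direct projection with the affine prediction on the patch.
max_error = 0.0
for x, y in [(0.0, 0.0), (1.0, 2.0), (-3.0, 0.5)]:
    z = plane[0] + plane[1] * x + plane[2] * y
    u, v = project(R, t, (x, y, z))
    ua = A[0][0] * x + A[0][1] * y + d[0]
    va = A[1][0] * x + A[1][1] * y + d[1]
    max_error = max(max_error, abs(u - ua), abs(v - va))
```

The six affine parameters per patch and frame mix the 3D motion (R, t) with the patch shape (z0, a, b), which is the kind of relation between parameters that the surface-based factorization exploits.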

Surface-based factorization - experiment

smooth texture

image motion


image sequence

Surface-based factorization - experiment



Weighted factorization

observation noise

  • rank 1 factorization

Weighted factorization - experiment

non-weighted estimates

weighted estimates

two components of translation

six entries of the rotation matrix

feature trajectories

Feature trajectories

Non-weighted factorization - reconstruction

Weighted factorization - reconstruction

ML estimate of the 3D shape

  • Image motion:

known motion parameters

affine mapping that depends only on the 3D motion

  • Define a sequence:

  • Motion of the affine mapped sequence:

unknown relative depth

shape of the trajectory of s (known from 3D motion)

magnitude of the trajectory of s (unknown relative depth)

  • Plug the 3D motion estimate into the ML cost function

  • Estimating the relative depth after plugging in the 3D motion is more constrained than estimating the image motion

  • Motivation for the minimization procedure

Minimization procedure - multiresolution

  • Multiresolution continuation-type method

    • coarse-to-fine as more images are being taken into account

    • each stage minimizes the ML cost function by using a Gauss-Newton method

components of the image gradient

  • Region R - constant relative depth z
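A single Gauss-Newton stage for one region of constant relative depth can be sketched in 1D. The sketch assumes that the 3D motion fixes a known displacement factor s, so that the region moves by z*s between two frames and z is the only unknown. It uses an analytic brightness pattern and its derivative in place of sampled images and image gradients, and it omits the multiresolution continuation. All names and values are illustrative assumptions.

```python
import math


def pattern(x):
    """Smooth 1D brightness pattern (stands in for the object texture)."""
    return math.sin(0.3 * x) + 0.5 * math.sin(0.11 * x)


def pattern_gradient(x):
    """Derivative of the pattern (stands in for the image gradient)."""
    return 0.3 * math.cos(0.3 * x) + 0.055 * math.cos(0.11 * x)


def gauss_newton_depth(observed, region, s, z=0.0, iterations=20):
    """Minimize sum over the region of (observed(x) - pattern(x - z*s))^2."""
    for _ in range(iterations):
        numerator = 0.0
        denominator = 0.0
        for x, value in zip(region, observed):
            residual = value - pattern(x - z * s)
            jacobian = s * pattern_gradient(x - z * s)  # d(residual)/dz
            numerator += jacobian * residual
            denominator += jacobian * jacobian
        z -= numerator / denominator  # Gauss-Newton update for a scalar unknown
    return z


true_z = 0.8
s = 1.5  # displacement per unit of relative depth, assumed known from 3D motion
region = list(range(40))
observed = [pattern(x - true_z * s) for x in region]

z_estimate = gauss_newton_depth(observed, region, s)
```

Because only one scalar is estimated per region, the gradient information from all pixels in the region is pooled into a single update, which illustrates why estimating depth is better constrained than estimating two motion components per pixel.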

Experiment


  • Image sequence:

  • and motion:

  • Shape

Experiment


Affine mapped image sequence:

  • Shape:

Experiment


without smoothing

Multiresolution continuation-type method. Shape estimate:

Experiment


  • Synthesizing different views:

Application - video compression


Compressed 317:1

Compressed 575:1

Texture patches JPEG compressed

Major contributions and extensions

  • Explicit modeling of occlusion

  • Multiframe motion segmentation algorithm (two-step)

    • extension: contour model

  • Surface-based factorization

  • Rank 1 factorization

  • Weighted factorization

    • extensions:

      • other projection models

      • multibody

      • occlusion

      • 3D deformable shape from a set of cameras

      • subspace constraints for image motion estimation

  • Multiresolution algorithm for direct inference of 3D shape

    • extension: parameterized surface model

Experiment


Multiresolution continuation-type method. Shape estimate:

Rank 1 factorization - computational cost
