
Rigid Structure from Video

Pedro M. Q. Aguiar

Outline

- Other methods - limitations
- Proposed approach
- Problem formulation
- Algorithms
- Experiments

- Motivation
- Segmentation of 2D rigid moving objects
- Inference of 3D rigid structure

Content-based video representation

Applications: compression, non-linear editing, virtual reality, etc.

- Video

- Generative Video (GV) [Jasinschi & Moura, 95]
- flat scenario
- flat moving objects

- PROBLEM: Segmentation of 2D rigid moving objects

- 3D content-based representation
- 3D rigid shape
- 3D motion

- PROBLEM: Inference of 3D rigid structure (shape and motion)

With low texture, segmentation fails!

- Two-frame motion-based segmentation
- No prior knowledge about shape, texture

- [Diehl, 91]

time-consuming algorithms!

- Possible solution - smoothing
- Statistical regularization [Dubuisson & Jain, 95]
- Combine motion with other attributes [Bouthemy & François, 93]

- Proposed approach - exploit rigidity over a set of frames
- Explicit modeling of occlusion
- Feasible implementation of MLE

Observation model, with annotated terms: background, camera window, camera position, object template (modeling of occlusion), object position, object texture, noise

- Given
- set of F frames

- Estimate
- background texture
- object texture
- object template
- camera motion
- object motion

- ML cost function, summed over all frames and pixels

- ML estimate

- ML estimation: the cost function is quadratic in O and B

- Object and background estimates
- object estimate: average of the observations, after registration (linear in T)
- background estimate: average of the observations, in the regions not occluded by the object (nonlinear in T)
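The two averaging estimates can be illustrated with a minimal sketch. This is not the author's implementation: it assumes a static camera, integer object translations (`shifts`), circular image borders (via `np.roll`), and a known binary template. All function names are hypothetical.

```python
import numpy as np

def shift(image, dy, dx):
    """Translate an image by integer (dy, dx); borders wrap around (assumption)."""
    return np.roll(image, (dy, dx), axis=(0, 1))

def estimate_object(frames, shifts, template):
    """Object texture: average of the frames after registering them to the object."""
    template = np.asarray(template, dtype=bool)
    registered = [shift(f, -dy, -dx) for f, (dy, dx) in zip(frames, shifts)]
    # The average is kept only inside the template, so it is linear in the template.
    return np.where(template, np.mean(registered, axis=0), 0.0)

def estimate_background(frames, shifts, template):
    """Background: per-pixel average over the frames where the object does not occlude it."""
    template = np.asarray(template, dtype=bool)
    frames = np.asarray(frames, dtype=float)
    visible = np.array([~shift(template, dy, dx) for dy, dx in shifts])
    counts = visible.sum(axis=0)
    sums = (frames * visible).sum(axis=0)
    # Pixels that are occluded in every frame cannot be estimated and are set to NaN.
    return np.where(counts > 0, sums / np.maximum(counts, 1), np.nan)
```

The background average divides by a count that itself depends on the template, which is one way to see why that estimate is nonlinear in T while the object estimate is linear in T.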

- Decouple the estimation of the position vectors
- Motion is estimated on a frame by frame basis [Bergen et al, 92]

- Replacing the estimates of O and B in the ML cost function: nonlinear minimization!

- Replacing only the estimate of O in the ML cost function

- minimize using a two-step iterative method:
- solve for B with T fixed (quadratic, closed-form solution)
- solve for T with B fixed (linear, closed-form solution)

Segmentation matrix

- Template estimate

- Replacing only the estimate of O in the ML cost function

Accumulated differences between each pair of co-registered frames

Accumulated differences between each frame and the background

- regions where the test is inconclusive with the available F frames

linear in T!

Figure: three frames from the image sequence, showing the moving object and the background.

Figure: two-step method, background estimate and template estimate.

Figure: four frames from a video sequence with moving objects, and the background estimate.

- Motivation: 3D content-based video representation (application areas go well beyond digital video)

- Key step: recovery of 3D shape and 3D motion from an image sequence

- Strongest cue: motion of the brightness pattern

- Structure From Motion:
- Step 1. Compute the 2D motion on the image plane
- Step 2. Recover the 3D motion and the depth

- step 1. track feature points across a set of frames

- step 2. recover relative depth and set of 3D positions

- Two-frame SFM fails when the object is far from the camera

- Solution: exploit rigidity - multi-frame SFM

- Multi-frame Structure From Motion:

expeditious method

- Factorization [Tomasi & Kanade, 92]:
- uses linear subspace constraints
- 3D structure is estimated by factorizing a measurement matrix R whose entries are the trajectories of feature point projections
- without noise, R is rank 3. An SVD is used to factorize matrix R
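The factorization step can be sketched as follows. This is a minimal illustration rather than the original algorithm in full: it assumes orthographic projection, a measurement matrix R of shape 2F x N (two rows per frame, one column per feature), and that translation is removed by subtracting each row's mean. The name `factorize_rank3` is hypothetical, and the metric normalization that resolves the remaining 3 x 3 ambiguity is omitted.

```python
import numpy as np

def factorize_rank3(R):
    """Split a 2F x N measurement matrix into motion (2F x 3) and shape (3 x N)."""
    R = np.asarray(R, dtype=float)
    # The mean of each row is the projection of the feature centroid (translation).
    translation = R.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(R - translation, full_matrices=False)
    # Keep the three largest singular values; without noise the rest are zero.
    root = np.sqrt(s[:3])
    motion = U[:, :3] * root
    shape = root[:, None] * Vt[:3]
    return motion, shape, translation
```

The product `motion @ shape + translation` reproduces a noise-free R, but `motion` and `shape` are only determined up to an invertible 3 x 3 matrix until the orthonormality constraints on the rotation rows are imposed.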

- Multi-frame SFM - hard problem:
- non-linear
- large set of unknowns (due to the entire set of 3D positions)

- Problems:
- track a large set of features: computationally very heavy, if possible
- cost of SVD: high for large number of features or frames

- Induces a parametric description for the 2D motion in the image plane

- Recover the 3D shape and 3D motion parameters from the 2D motion parameters by further exploiting linear subspace constraints:
- surface-based factorization
- rank 1 factorization: uses a fast algorithm to compute only the largest singular value
- weighted factorization: computes the weighted estimate without additional computational cost
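One standard way to obtain only the largest singular value, without a full SVD, is power iteration. The sketch below is an illustration under that assumption, not necessarily the algorithm used in the talk; the name `rank1_factorization` and its parameters are hypothetical, and R is assumed to be nonzero.

```python
import numpy as np

def rank1_factorization(R, iterations=100, tol=1e-12):
    """Approximate R by sigma * outer(u, v) using power iteration."""
    R = np.asarray(R, dtype=float)
    # Start from the row with the largest norm, so that R @ v is never zero.
    start = R[np.argmax(np.linalg.norm(R, axis=1))]
    v = start / np.linalg.norm(start)
    sigma = 0.0
    for _ in range(iterations):
        u = R @ v
        u /= np.linalg.norm(u)
        v = R.T @ u
        new_sigma = np.linalg.norm(v)
        v /= new_sigma
        converged = abs(new_sigma - sigma) <= tol * new_sigma
        sigma = new_sigma
        if converged:
            break
    return sigma, u, v
```

Each iteration costs two matrix-vector products, and convergence is fast when the largest singular value dominates the others, which is the situation the rank 1 formulation relies on.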

- Describe the 3D shape by a local parameterization

rather than the two components of the motion, local depth is a single unknown

- Observations: the images in the sequence. Unknowns: object texture, 3D shape, 3D motion

- Through ML, 3D structure is recovered:
- Exploiting object rigidity over a set of frames
- Directly from the image intensity values

So, where do SFM and factorization come from?

- Minimization procedure:
- Minimize with respect to the texture, which gives the texture estimate in terms of the 3D shape and 3D motion
- After replacing the texture estimate, the ML cost function depends on the 3D structure only through the 2D motion in the image plane
- Estimate the 3D motion via SFM (factorization). Plug in the 3D motion estimates
- Minimize the ML cost function with respect to the relative depth

- Local 2D motion estimation is ill-posed - aperture problem. Direct methods:
- Infer 3D structure by using the brightness change constraint between two frames [Horn & Weldon, 88]
- Kalman filter to update estimates over time [Heel, 90]
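The aperture problem can be made concrete with the standard brightness change constraint. The notation is introduced here for illustration only: I_x, I_y, I_t are the spatial and temporal image derivatives, and (u, v) is the local 2D motion.

```latex
I_x\,u + I_y\,v + I_t = 0
\qquad\Longrightarrow\qquad
\frac{\nabla I}{\lVert \nabla I \rVert} \cdot \begin{bmatrix} u \\ v \end{bmatrix}
= -\frac{I_t}{\lVert \nabla I \rVert}
```

This is a single equation in two unknowns per pixel, so only the motion component along the image gradient is determined locally.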

- Observation model (annotated terms: texture, shape, 3D position)

- Unknowns:

- Texture estimate - weighted average

- ML estimate

- The ML cost function depends on the 3D structure only through the 2D motion induced in the image plane (no approximations involved)

- Insert the texture estimate into the cost function

- 3D structure estimation:
- 3D motion estimation:
- Compute 2D motion
- SFM: rank 1 surface-based factorization

- 3D shape estimation:
- Plug the 3D motion estimate into the ML cost function
- Then, minimize with respect to the shape

- (The estimates can be refined by minimizing the ML cost function in two alternating steps, but initialization is the key problem)

Translation estimate:

Define:

- Decomposition (minimize without constraints)

Define:

- Normalization (computes by approximating the constraints)

Define:

three largest singular values of R

matrix is well described by its largest singular value

all trajectories have equal shape - it depends only on the 3D motion. The scaling factor depends on the 3D shape (relative depth)

3D shape and 3D motion are observed in a coupled way through the feature trajectories

- Orthographic projection

(easily extended to scaled-orthographic and para-perspective projections)

- 2D motion in the image plane is affine
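Why the image motion is affine can be seen from a short derivation, assuming a planar patch and orthographic projection. The symbols are introduced here for illustration: (x, y, z) are object coordinates of a point on the patch, (i_{xf}, i_{yf}, i_{zf}) and (j_{xf}, j_{yf}, j_{zf}) are the first two rows of the rotation matrix at frame f, (t_{uf}, t_{vf}) is the translation, and (z_0, a, b) describes the plane.

```latex
\begin{aligned}
u_f &= i_{xf}\,x + i_{yf}\,y + i_{zf}\,z + t_{uf},\\
v_f &= j_{xf}\,x + j_{yf}\,y + j_{zf}\,z + t_{vf},
\end{aligned}
\qquad z = z_0 + a\,x + b\,y
\;\Longrightarrow\;
\begin{bmatrix} u_f \\ v_f \end{bmatrix}
=
\begin{bmatrix}
i_{xf} + a\,i_{zf} & i_{yf} + b\,i_{zf} \\
j_{xf} + a\,j_{zf} & j_{yf} + b\,j_{zf}
\end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix}
+
\begin{bmatrix} t_{uf} + z_0\,i_{zf} \\ t_{vf} + z_0\,j_{zf} \end{bmatrix}
```

The six affine parameters per frame mix the 3D motion (rotation entries and translation) with the 3D shape of the patch (z_0, a, b).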

Relation between the parameters:

- Rank 1 factorization

Multi-frame SFM:

- Piecewise planar 3D shapes

Figure: image sequence (smooth texture), image motion parameters, motion, shape, observation noise.

Figure: rank 1 factorization, non-weighted estimates and weighted estimates; two components of translation; six entries of the rotation matrix; feature trajectories.

- Image motion:

known motion parameters

affine mapping that depends only on the 3D motion

- Define a sequence:

- Motion of the affine mapped sequence:

unknown relative depth

shape of the trajectory of s (known from 3D motion)

magnitude of the trajectory of s (unknown relative depth)

- Plug the 3D motion estimate into the ML cost function

- Estimating the relative depth after plugging in the 3D motion is more constrained than estimating the image motion

- Motivation for the minimization procedure

- Multiresolution continuation-type method
- coarse-to-fine as more images are being taken into account
- each stage minimizes the ML cost function by using a Gauss-Newton method
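A single Gauss-Newton stage can be illustrated on a toy problem. The sketch below is a strong simplification and not the method of the talk: it uses a 1D periodic signal instead of an image, two frames instead of a sequence, no multiresolution, and linear interpolation for warping. The displacement is modeled as z * d, with d known (standing in for the known 3D motion) and the scalar z unknown (standing in for the relative depth of a region). All names are hypothetical.

```python
import numpy as np

def gauss_newton_depth(I1, I2, x, d, period, z0=0.0, iterations=20):
    """Estimate z such that I2(x) is approximately I1(x - z * d)."""
    grad = np.gradient(I1, x)  # spatial derivative of the first signal
    z = z0
    for _ in range(iterations):
        warped_x = x - z * d
        # Residual between the warped first signal and the second signal.
        r = np.interp(warped_x, x, I1, period=period) - I2
        # Derivative of the residual with respect to z (chain rule).
        J = -d * np.interp(warped_x, x, grad, period=period)
        step = (J @ r) / (J @ J)
        z -= step
        if abs(step) < 1e-10:
            break
    return z
```

Because there is a single unknown per region, the normal equations reduce to a scalar division. In a coarse-to-fine scheme, the estimate from one stage would serve as `z0` for the next.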

components of the image gradient

- Region R - constant relative depth z

- Image sequence:

- and motion:

- Shape

Affine mapped image sequence:

- Shape:

without smoothing

Multiresolution continuation-type method. Shape estimate:

- Synthesizing different views:

Figure: original; compressed 317:1; compressed 575:1 (texture patches JPEG compressed).

- Explicit modeling of occlusion

- Multiframe motion segmentation algorithm (two-step)

- Surface-based factorization

- Rank 1 factorization

- Weighted factorization

- extension: contour model

- extensions:
- other projection models
- multibody
- occlusion
- 3D deformable shape from a set of cameras
- subspace constraints for image motion estimation

- Multiresolution algorithm for direct inference of 3D shape

- extension: parameterized surface model
