Rigid Structure from Video - PowerPoint PPT Presentation

Presentation Transcript

  1. Rigid Structure from Video Pedro M. Q. Aguiar

  2. Outline • Motivation • Segmentation of 2D rigid moving objects • Inference of 3D rigid structure • Other methods - limitations • Proposed approach • Problem formulation • Algorithms • Experiments

  3. Motivation • Content-based video representation (applications: compression, non-linear editing, virtual reality, etc.) • Video • Generative Video (GV) [Jasinschi & Moura, 95] • flat scenario • flat moving objects • PROBLEM: Segmentation of 2D rigid moving objects • 3D content-based representation • 3D rigid shape • 3D motion • PROBLEM: Inference of 3D rigid structure (shape and motion)

  4. Motion segmentation in low texture • Two-frame motion-based segmentation • No prior knowledge about shape, texture • [Diehl, 91] • With low texture, segmentation fails! • Time-consuming algorithms! • Possible solution - smoothing • Statistical regularization [Dubuisson & Jain, 95] • Combine motion with other attributes [Bouthemy & François, 93] • Proposed approach - exploit rigidity over a set of frames • Explicit modeling of occlusion • Feasible implementation of MLE

  5. Observation model • Labeled terms: background, camera window, camera position, object template (modeling of occlusion), object position, object texture, noise

  6. Maximum Likelihood estimation • Given • set of F frames • Estimate • background texture • object texture • object template • camera motion • object motion • ML cost function over all frames and pixels • ML estimate

  7. Minimization procedure • ML cost function: quadratic in O and B, linear in T • Object and background estimates: averages of the observations after registration, taken over the regions not occluded by the object (nonlinear in T) • Decouple the estimation of the position vectors • Motion is estimated on a frame-by-frame basis [Bergen et al., 92]
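
The averaging step on this slide can be illustrated with a minimal sketch. It assumes 1D frames, known integer camera shifts, and known occlusion masks; the function name `estimate_background` and the toy data are hypothetical and are not taken from the presentation.

```python
def estimate_background(frames, shifts, occluded, size):
    """Average co-registered observations over the pixels not occluded by the object."""
    total = [0.0] * size
    count = [0] * size
    for frame, shift, mask in zip(frames, shifts, occluded):
        for i, value in enumerate(frame):
            if mask[i]:
                continue  # pixel shows the object, not the background
            total[i + shift] += value  # register the pixel into background coordinates
            count[i + shift] += 1
    # Positions never observed stay unknown (None).
    return [t / c if c else None for t, c in zip(total, count)]


# Toy data: a 6-pixel background seen through a 4-pixel camera window.
background = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
shifts = [0, 1, 2]  # camera position in each frame
window = 4
frames, occluded = [], []
for shift in shifts:
    frame = background[shift:shift + window]
    mask = [False] * window
    mask[1] = True   # the object always covers window pixel 1
    frame[1] = 99.0  # object texture replaces the background there
    frames.append(frame)
    occluded.append(mask)

estimate = estimate_background(frames, shifts, occluded, len(background))
```

Because the camera moves while the object stays fixed in the window, every background pixel is visible in at least one frame, so the occluded regions are filled in from the other frames.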

  8. Minimization procedure - two-step iterative method • Replacing both the object and background estimates in the ML cost function: nonlinear minimization! • Replacing only the background estimate in the ML cost function • minimize using a two-step iterative method: • solve for O with fixed T (quadratic, closed-form solution) • solve for T with fixed O (linear, closed-form solution)

  9. Minimization procedure - segmentation matrix • Replacing only one estimate in the ML cost function: linear in T! • Segmentation matrix • Accumulated differences between each pair of co-registered frames • Accumulated differences between each frame and the background • Template estimate • regions where the test is inconclusive with the available F frames

  10. Experiment • Figure labels: moving object, background, three frames from the image sequence

  11. Experiment • Two-step method • Figure labels: background estimate, template estimate

  12. Experiment • Figure labels: four frames from a video sequence, background estimate, moving objects

  13. 3D structure from 2D video • Motivation: 3D content-based video representation (application areas go well beyond digital video) • Key step: recovery of 3D shape and 3D motion from an image sequence • Strongest cue: motion of the brightness pattern • Structure From Motion: • Step 1. Compute the 2D motion on the image plane • Step 2. Recover the 3D motion and the depth

  14. Two-frame SFM - common problem • Two-frame SFM fails when the object is far from the camera • Solution: exploit rigidity - multi-frame SFM • Multi-frame Structure From Motion: • step 1. track feature points across a set of frames • step 2. recover relative depth and set of 3D positions

  15. Factorization method • Multi-frame SFM - hard problem: • non-linear • large set of unknowns (due to the entire set of 3D positions) • Factorization [Tomasi & Kanade, 92] - an expedient method: • uses linear subspace constraints • 3D structure is estimated by factorizing a measurement matrix R whose entries are the trajectories of feature point projections • without noise, R is rank 3. An SVD is used to factorize matrix R • Problems: • track a large set of features: computationally very heavy, if possible • cost of SVD: high for large number of features or frames
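
The rank 3 claim can be made concrete with the standard orthographic model behind the factorization method. The notation below is an assumption, since the slide's own equations did not survive extraction: F frames, N features, (u_{fn}, v_{fn}) the projection of feature n in frame f, bars denoting the per-frame mean over features, and i_f, j_f the first two rows of the rotation in frame f.

```latex
% Registered measurement matrix (each row has its mean over features removed).
R =
\begin{bmatrix}
u_{11} - \bar{u}_1 & \cdots & u_{1N} - \bar{u}_1 \\
\vdots             &        & \vdots             \\
u_{F1} - \bar{u}_F & \cdots & u_{FN} - \bar{u}_F \\
v_{11} - \bar{v}_1 & \cdots & v_{1N} - \bar{v}_1 \\
\vdots             &        & \vdots             \\
v_{F1} - \bar{v}_F & \cdots & v_{FN} - \bar{v}_F
\end{bmatrix}
=
\underbrace{\begin{bmatrix}
i_1^T \\ \vdots \\ i_F^T \\ j_1^T \\ \vdots \\ j_F^T
\end{bmatrix}}_{M \,\in\, \mathbb{R}^{2F \times 3}}
\underbrace{\begin{bmatrix}
s_1 & \cdots & s_N
\end{bmatrix}}_{S \,\in\, \mathbb{R}^{3 \times N}}
\quad\Longrightarrow\quad \operatorname{rank}(R) \le 3 .
```

Here s_n is the 3D position of feature n relative to the centroid of the features. With noise, the three largest singular values of R give the closest rank 3 approximation, from which M (motion) and S (shape) are recovered up to an invertible 3x3 transformation that is fixed by the orthonormality constraints on i_f and j_f.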

  16. Proposed approach: surface-based factorization • Describe the 3D shape by a local parameterization • Induces a parametric description for the 2D motion in the image plane • Recover the 3D shape and 3D motion parameters from the 2D motion parameters by further exploiting linear subspace constraints: • surface-based factorization • rank 1 factorization: uses a fast algorithm to compute only the largest singular value • weighted factorization: computes the weighted estimate without additional computational cost

  17. Maximum Likelihood formulation • Local 2D motion estimation is ill-posed - aperture problem. Direct methods: • Infer 3D structure by using the brightness change constraint between two frames [Horn & Weldon, 88] • Kalman filter to update estimates over time [J. Heel, 90] • Rather than the two components of the motion, local depth is a single unknown • Observations: the images in the sequence. Unknowns: object texture, 3D shape, 3D motion • Through ML, 3D structure is recovered: • Exploiting object rigidity over a set of frames • Directly from the image intensity values • So, where do SFM and factorization come from? • Minimization procedure: • Minimize with respect to the texture in terms of 3D shape and 3D motion • After replacing the texture estimate, the ML cost function depends on the 3D structure only through the 2D motion in the image plane • Estimate the 3D motion via SFM (factorization) and plug in the 3D motion estimates • Minimize the ML cost function with respect to the relative depth

  18. Observation model • Observation model (labeled terms: texture, shape, 3D position) • Unknowns:

  19. Texture estimate • Texture estimate - weighted average • ML estimate

  20. SFM as an approximation to MLE • The ML cost function depends on the 3D structure only through the 2D motion induced in the image plane (no approximations involved) • Insert the texture estimate into the cost function • 3D structure estimation: • 3D motion estimation: • Compute 2D motion • SFM: rank 1 surface-based factorization • 3D shape estimation: • Plug the 3D motion estimate into the ML cost function • Then, minimize with respect to the shape • (The estimates can be refined by minimizing the ML cost function in two alternating steps, but initialization is the key problem)

  21. Feature-based SFM Translation estimate: Define:

  22. Rank 1 factorization • Decomposition (minimize without constraints) Define: • Normalization (computes by approximating the constraints) Define:

  23. Rank 1 factorization - experiment • Plot: the three largest singular values of R • The matrix is well described by its largest singular value

  24. Rank 1 factorization - experiment • All trajectories have equal shape, which depends only on the 3D motion • The scaling factor depends on the 3D shape (relative depth) • 3D shape and 3D motion are observed in a coupled way through the feature trajectories
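
If every trajectory is one common shape scaled by a per-feature factor, the trajectory matrix is an outer product, and only its largest singular triplet is needed. The sketch below uses plain power iteration as a stand-in for the fast algorithm the slides mention; the presentation does not say which algorithm was used, so this choice, the function name `largest_singular_triplet`, and the toy numbers are all assumptions.

```python
import math


def largest_singular_triplet(R, iterations=50):
    """Power iteration for the largest singular value of R and its singular vectors."""
    rows, cols = len(R), len(R[0])
    # Start from the row of R with the largest norm (deterministic, nonzero unless R == 0).
    start = max(R, key=lambda row: sum(x * x for x in row))
    norm = math.sqrt(sum(x * x for x in start))
    if norm == 0.0:
        return 0.0, [0.0] * rows, [0.0] * cols
    v = [x / norm for x in start]
    sigma, u = 0.0, [0.0] * rows
    for _ in range(iterations):
        u = [sum(R[i][j] * v[j] for j in range(cols)) for i in range(rows)]  # u = R v
        norm_u = math.sqrt(sum(x * x for x in u))
        u = [x / norm_u for x in u]
        w = [sum(R[i][j] * u[i] for i in range(rows)) for j in range(cols)]  # w = R^T u
        sigma = math.sqrt(sum(x * x for x in w))
        v = [x / sigma for x in w]
    return sigma, u, v


# Toy data: a common trajectory shape (motion) scaled by each feature's relative depth.
trajectory_shape = [1.0, 0.5, -0.5, -1.0]  # one entry per trajectory sample
relative_depth = [2.0, -1.0, 0.5]          # one entry per feature
R = [[a * b for b in relative_depth] for a in trajectory_shape]

sigma, u, v = largest_singular_triplet(R)
```

For this noise-free rank 1 matrix, `sigma * u[i] * v[j]` reproduces `R[i][j]`: `u` carries the motion-dependent trajectory shape and `v` the depth-dependent scaling, each up to sign and scale.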

  25. Surface-based factorization • Orthographic projection (easily extended to scaled-orthographic and para-perspective projections) • Piecewise planar 3D shapes • 2D motion in the image plane is affine • Relation between the parameters: • Multi-frame SFM: rank 1 factorization
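
A short derivation shows why a planar patch under orthographic projection induces affine image motion. The symbols are assumptions introduced for illustration, since the slide's equations were lost: (x, y, z) are object coordinates, i_{xf}, i_{yf}, i_{zf} and j_{xf}, j_{yf}, j_{zf} are the first two rows of the rotation in frame f, (t_{uf}, t_{vf}) is the translation, and the patch is z = z_0 + a x + b y.

```latex
% Orthographic projection of the point (x, y, z) in frame f:
u_f = i_{xf}\, x + i_{yf}\, y + i_{zf}\, z + t_{uf}, \qquad
v_f = j_{xf}\, x + j_{yf}\, y + j_{zf}\, z + t_{vf}.

% Substituting the planar patch z = z_0 + a x + b y:
\begin{bmatrix} u_f \\ v_f \end{bmatrix}
=
\begin{bmatrix}
i_{xf} + a\, i_{zf} & i_{yf} + b\, i_{zf} \\
j_{xf} + a\, j_{zf} & j_{yf} + b\, j_{zf}
\end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix}
+
\begin{bmatrix} i_{zf}\, z_0 + t_{uf} \\ j_{zf}\, z_0 + t_{vf} \end{bmatrix}.
```

Each affine motion parameter is a sum of a term that depends only on the 3D motion and a motion term scaled by a shape parameter (z_0, a, or b), which is the kind of coupling that a factorization over frames and patches can separate.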

  26. Surface-based factorization - experiment • Smooth texture • Figure labels: image sequence, image motion parameters

  27. Surface-based factorization - experiment • Figure labels: motion, shape

  28. Weighted factorization • observation noise • rank 1 factorization

  29. Weighted factorization - experiment • Figure labels: feature trajectories; non-weighted estimates and weighted estimates of the two components of translation and the six entries of the rotation matrix

  30. Feature trajectories

  31. Non-weighted factorization - reconstruction

  32. Weighted factorization - reconstruction

  33. ML estimate of the 3D shape • Motivation for the minimization procedure • Image motion: known motion parameters, unknown relative depth • Define a sequence: affine mapping that depends only on the 3D motion • Motion of the affine mapped sequence: • shape of the trajectory of s (known from 3D motion) • magnitude of the trajectory of s (unknown relative depth) • Plug the 3D motion estimate into the ML cost function • Estimating the relative depth after plugging in the 3D motion is more constrained than estimating the image motion

  34. Minimization procedure - multiresolution • Multiresolution continuation-type method • coarse-to-fine as more images are being taken into account • each stage minimizes the ML cost function by using a Gauss-Newton method • Region R - constant relative depth z • the update involves the components of the image gradient
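
A single Gauss-Newton stage for one region with constant relative depth can be sketched as follows. This is a deliberately simplified 1D, two-frame illustration, not the multiresolution method of the presentation: the images are analytic functions, the known trajectory shape `s` is a scalar, and the names `gauss_newton_depth`, `i1`, `i2`, and `d_i2` are hypothetical.

```python
import math


def gauss_newton_depth(i1, i2, d_i2, xs, s, z0=0.0, iterations=10):
    """Estimate the scalar z that minimizes sum over x of (i2(x + z*s) - i1(x))**2."""
    z = z0
    for _ in range(iterations):
        num = 0.0
        den = 0.0
        for x in xs:
            residual = i2(x + z * s) - i1(x)
            jac = d_i2(x + z * s) * s  # image gradient times the known trajectory shape
            num += residual * jac
            den += jac * jac
        if den == 0.0:
            break  # no gradient information in this region
        z -= num / den  # Gauss-Newton update for a single unknown
    return z


# Toy data: the second frame is the first one displaced by z_true * s.
z_true, s = 0.3, 1.0
xs = [2.0 * math.pi * k / 64 for k in range(64)]  # pixels of the region R


def i1(x):
    return math.sin(x)


def i2(x):
    return math.sin(x - z_true * s)


def d_i2(x):
    return math.cos(x - z_true * s)  # spatial derivative of the second frame


z_hat = gauss_newton_depth(i1, i2, d_i2, xs, s)
```

Only the magnitude `z` is unknown here because the direction of the displacement is fixed by the motion, which reflects the point that estimating relative depth is more constrained than estimating unconstrained image motion.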

  35. Experiment • Image sequence: • and motion: • Shape

  36. Experiment Affine mapped image sequence: • Shape:

  37. Experiment • Multiresolution continuation-type method, without smoothing. Shape estimate:

  38. Experiment

  39. Experiment • Synthesizing different views:

  40. Application - video compression • Figure labels: original, compressed 317:1, compressed 575:1 • Texture patches JPEG compressed

  41. Major contributions and extensions • Explicit modeling of occlusion • Multiframe motion segmentation algorithm (two-step) • Surface-based factorization • Rank 1 factorization • Weighted factorization • extension: contour model • extensions: • other projection models • multibody • occlusion • 3D deformable shape from a set of cameras • subspace constraints for image motion estimation • Multiresolution algorithm for direct inference of 3D shape • extension: parameterized surface model

  42. Experiment Multiresolution continuation-type method. Shape estimate:

  43. Experiment

  44. Experiment

  45. Experiment

  46. Rank 1 factorization - computational cost