1 / 23

Image Mosaicing with Motion Segmentation from Video

Image Mosaicing with Motion Segmentation from Video. Augusto Roman, Taly Gilat EE392J Final Project 03/20/01. Project Goal. To create a single panoramic image from a sequence of video frames Align frames using motion estimation. The Wang Adelson Algorithm.

Download Presentation

Image Mosaicing with Motion Segmentation from Video

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Image Mosaicing with Motion Segmentation from Video Augusto Roman, Taly Gilat EE392J Final Project 03/20/01

  2. Project Goal • To create a single panoramic image from a sequence of video frames • Align frames using motion estimation

  3. The Wang Adelson Algorithm • Dense MV is gold standard estimate for each pixel • Estimate region motion • Segment regions based on motions • Iterate

  4. Dense Motion Estimation • Need a motion vector for every pixel. • Full search block matching is far too slow (more than 2 hours per frame pair!) • Phase Correlation used to estimate motion • Results in a few possible motions for each block. • For each pixel in the center of the block, test with each of the possible motions and choose the motion with the smallest MSE. • Shift block by small amount and compute new phase correlation and repeat. • Much faster! ~30 sec per frame pair!

  5. Phase Correlation • Given a block B1,B2 from each image • Compute 2D FFT of each • Compute cross-power spectrum • Normalized value of: F{B1} F{B2}* • Take IFFT to get Phase Correlation Function • This is very similar to the correlation function between the two blocks

  6. Motion Vectors vs Motions Hypothesis • There are a number of different motion hypotheses available. • In theory, each of these hypothesis corresponds to a distinct motion in the video. • Each pixel is assigned to the motion hypothesis that most closely approximates its motion vector. • This segments the frame into distinct regions, one for each motion hypothesis.

  7. Motion Hypothesis Generation • For each region want motion hypothesis that best represents all pixel motions in that region. • Least squares fit to find best affine motion parameter in a region • First iteration initialized with blocks

  8. Motion Hypothesis Refinement • K-means used to cluster motion hypotheses • K unknown • Empty clusters removed • Large clusters split to maintain minimum k value

  9. Region Segmentation • For each pixel compare hypotheses to dense MV • Find closest hypothesis • Group all pixels represented by a motion hypothesis into a region • Pixels with large error unassigned • Hypotheses without membership removed

  10. Region Adjustment • Region Splitter • Assumes areas with same motion are connected • Disconnected areas within a region are split into separate regions • Increases number of hypotheses for k-means • Region Filter • Small regions give poor motion estimates • Remove all regions with area below threshold • Disconnected objects with same motion will be merged at next segmentation step

  11. Segmentation Results Frame 1 Frame 2

  12. Segmentation Results Initial Segmentation Iteration 1 Iteration 2 Iteration 3 Iteration 5 Iteration 7

  13. The Whole Enchilada • The dense motion estimate, region segmentation and motion estimation performed for all frames pairs. • For first pair, segmentation initialized to blocks and k-means initialized to lattice in 6D affine space. • Subsequent frame pairs initialized with final segmentation and motion hypotheses from previous frame pair.

  14. Layer Synthesis • Motion estimates relate each frame only to the previous frame • Frames are projected onto first video frame • Cumulative projection kept in 3x3 transformation matrix • Layers are not necessarily ordered similarly between frames • Assumed largest layer is background • Median taken of all values projected to each pixel in final image

  15. Implementation Difficulties • Parameter tweaking • Phase Correlation • Attempted to focus on textured areas of image • Initial test videos lacked texture • Registering layers across frames • Cumulative motion error • Resolution loss in final panoramic

  16. Results

  17. Results

  18. Results

  19. Results

  20. Results

  21. Results

  22. Results

  23. Conclusions • Nice panoramas • Moving objects removed • Future improvements • Sub-pixel dense motion estimation • More accurate segmentation • Region registration across frame

More Related