776 Computer Vision

Presentation Transcript


  1. 776 Computer Vision Jan-Michael Frahm Spring 2012

  2. Feature point extraction Image patches fall into three cases: homogeneous, edge, corner. Find points for which the smallest eigenvalue of the structure tensor M = Σ_W ∇I ∇Iᵀ (summed over a window W) is maximal, i.e. maximize λ_min(M); λ_min is large only at corners.
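To make the criterion concrete, here is a minimal NumPy sketch of the min-eigenvalue (Shi-Tomasi) response; the Sobel gradients and the 5x5 averaging window are assumptions, not values from the slides.

```python
import numpy as np
from scipy.ndimage import sobel, uniform_filter

def min_eigenvalue_response(img, win=5):
    """Smallest eigenvalue of the structure tensor M, accumulated
    over a win x win window around each pixel (Shi-Tomasi response)."""
    img = img.astype(np.float64)
    Ix = sobel(img, axis=1)                 # horizontal gradient
    Iy = sobel(img, axis=0)                 # vertical gradient
    # Window sums of the entries of M = [[Ix^2, IxIy], [IxIy, Iy^2]]
    Sxx = uniform_filter(Ix * Ix, win)
    Syy = uniform_filter(Iy * Iy, win)
    Sxy = uniform_filter(Ix * Iy, win)
    # Closed-form smallest eigenvalue of a symmetric 2x2 matrix
    half_trace = 0.5 * (Sxx + Syy)
    disc = np.sqrt((0.5 * (Sxx - Syy)) ** 2 + Sxy ** 2)
    return half_trace - disc                # large only at corners
```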

  3. Comparing image regions Compare intensities pixel-by-pixel between I(x,y) and I´(x,y). Dissimilarity measure • Sum of Squared Differences: SSD = Σ_{(x,y)∈W} (I´(x,y) − I(x,y))²
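As a short sketch (the float conversion to avoid integer overflow is the only addition):

```python
import numpy as np

def ssd(patch_a, patch_b):
    """Sum of squared differences between two equally sized patches;
    lower values mean more similar regions."""
    d = patch_a.astype(np.float64) - patch_b.astype(np.float64)
    return np.sum(d * d)
```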

  4. Comparing image regions Compare intensities pixel-by-pixel between I(x,y) and I´(x,y). Similarity measure • Zero-mean Normalized Cross Correlation: ZNCC = Σ (I − Ī)(I´ − Ī´) / √( Σ (I − Ī)² · Σ (I´ − Ī´)² ), invariant to gain and offset changes
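A minimal ZNCC sketch, assuming two equally sized patches; the epsilon guard against zero variance is an implementation choice, not from the slides:

```python
import numpy as np

def zncc(patch_a, patch_b, eps=1e-8):
    """Zero-mean normalized cross correlation in [-1, 1]; invariant
    to gain and offset changes between the two patches."""
    a = patch_a.astype(np.float64).ravel()
    b = patch_b.astype(np.float64).ravel()
    a -= a.mean()
    b -= b.mean()
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps)
```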

  5. Feature point extraction • Approximate the SSD for a small displacement Δ by linearizing the image: I(x + Δ) ≈ I(x) + ∇I(x)ᵀΔ • The squared image difference for a pixel becomes (∇I(x)ᵀΔ)² • Summing over a window, SSD(Δ) ≈ Δᵀ M Δ with the structure tensor M = Σ_W ∇I ∇Iᵀ

  6. Harris corner detector • Use a small local window • Maximize the "cornerness" det(M) − k · trace(M)² (commonly k ≈ 0.04) • Only use local maxima, with subpixel accuracy through second-order surface fitting • Select the strongest features over the whole image and over each tile (e.g. 1000/image, 2/tile), as in the sketch below
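A sketch of the selection step (local maxima, then the strongest features per tile). The per-tile count mirrors the slide's "2/tile"; the tile size and suppression radius are assumptions, and the subpixel surface fitting is omitted:

```python
import numpy as np
from scipy.ndimage import maximum_filter

def select_features(response, tile=64, per_tile=2, radius=3):
    """Keep local maxima of a cornerness map, then the strongest
    `per_tile` maxima inside every tile x tile block."""
    # Non-maximum suppression: a pixel survives only if it equals
    # the maximum of its neighborhood.
    is_max = response == maximum_filter(response, size=2 * radius + 1)
    ys, xs = np.nonzero(is_max)
    keep = []
    for ty in range(0, response.shape[0], tile):
        for tx in range(0, response.shape[1], tile):
            in_tile = np.nonzero((ys >= ty) & (ys < ty + tile) &
                                 (xs >= tx) & (xs < tx + tile))[0]
            order = np.argsort(response[ys[in_tile], xs[in_tile]])[::-1]
            for i in in_tile[order[:per_tile]]:
                keep.append((xs[i], ys[i]))
    return keep
```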

  7. Simple matching • For each corner in image 1, find the corner in image 2 that is most similar (using SSD or NCC), and vice versa • Only compare geometrically compatible points • Keep mutual best matches What transformations does this work for?
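The mutual-best-match rule in a few lines; here the descriptors are assumed to be zero-mean, unit-norm flattened patches, so a dot product equals their ZNCC score:

```python
import numpy as np

def mutual_best_matches(desc1, desc2):
    """Keep only pairs that are each other's best match."""
    scores = desc1 @ desc2.T            # scores[i, j]: corner i vs corner j
    best12 = np.argmax(scores, axis=1)  # best match in image 2 for each i
    best21 = np.argmax(scores, axis=0)  # best match in image 1 for each j
    return [(i, j) for i, j in enumerate(best12) if best21[j] == i]
```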

  8. Feature matching: example [Figure: five numbered corners matched between two views] What transformations does this work for? What level of transformation do we need?

  9. Feature tracking • Identify features and track them over video • Small differences between frames • Potentially large differences overall • Standard approach: KLT (Kanade-Lucas-Tomasi)
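A minimal KLT loop with OpenCV, assuming a placeholder video file name and illustrative detector/tracker parameters; cv2.goodFeaturesToTrack implements the min-eigenvalue criterion above, and cv2.calcOpticalFlowPyrLK is pyramidal Lucas-Kanade:

```python
import cv2

cap = cv2.VideoCapture("video.mp4")          # placeholder file name
ok, frame = cap.read()
prev = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
pts = cv2.goodFeaturesToTrack(prev, maxCorners=1000,
                              qualityLevel=0.01, minDistance=8)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    new_pts, status, err = cv2.calcOpticalFlowPyrLK(prev, gray, pts, None)
    pts = new_pts[status.ravel() == 1]       # drop lost tracks
    prev = gray
```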

  10. Feature Tracking • Establish correspondences between identical salient points across multiple images

  11. Good features to track • Use the same window for feature selection as for the tracking itself • Compute the motion assuming it is small: differentiate the brightness constancy equation and solve the resulting 2x2 linear system • Affine motion is also possible, but a bit harder (6x6 instead of 2x2)

  12. Example Simple displacement is sufficient between consecutive frames, but not for comparing against the reference template

  13. Example

  14. Synthetic example

  15. Good features to keep tracking • Perform an affine alignment between the first and the last frame • Stop tracking features with too large an error

  16. Optical flow • Brightness constancy assumption (small motion): I(x, t) = I(x + d, t + 1) • 1D example: linearizing gives d ≈ −I_t / I_x, with the possibility of iterative refinement
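The 1D case in code: a sketch of the Newton-style update d += (I_t(x) − I_{t+1}(x + d)) / I'_{t+1}(x + d), iterated a few times; rounding to the nearest sample instead of interpolating is a simplification:

```python
import numpy as np

def lk_1d(sig_t, sig_t1, x, iters=5):
    """Iteratively refine the 1-D displacement d so that
    sig_t[x] ~ sig_t1[x + d] (brightness constancy)."""
    d = 0.0
    grad = np.gradient(sig_t1.astype(np.float64))
    for _ in range(iters):
        xi = int(round(x + d))
        if abs(grad[xi]) < 1e-6:    # flat signal: no constraint
            break
        d += (sig_t[x] - sig_t1[xi]) / grad[xi]
    return d
```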

  17. Optical flow • Brightness constancy assumption (small motion) • 2D example: one constraint, ∇I · (u, v) + I_t = 0, but 2 unknowns (u, v): the "aperture" problem [Figure: isophotes I(t) = I and I(t+1) = I; only the motion component normal to the isophote is observable]

  18. The Aperture Problem Let M = Σ ∇I ∇Iᵀ and b = −Σ I_t ∇I • Algorithm: at each pixel, compute the displacement u by solving M u = b • M is singular if all gradient vectors point in the same direction • e.g., along an edge • Of course, trivially singular if the summation is over a single pixel or there is no texture • i.e., only normal flow is available (aperture problem) • Corners and textured areas are OK Slide credit: S. Seitz, R. Szeliski

  19. Optical flow • How to deal with the aperture problem? (3 constraints if the color gradients are different) • Assume neighbors have the same displacement Slide credit: S. Seitz, R. Szeliski

  20. SSD surface: textured area. The error surface has a single sharp minimum, so the displacement is well constrained. Slide credit: S. Seitz, R. Szeliski

  21. SSD surface: edge. The error surface has a valley along the edge direction; only the normal component of the motion is constrained. Slide credit: S. Seitz, R. Szeliski

  22. SSD surface: homogeneous area. The error surface is nearly flat, so the displacement is unconstrained. Slide credit: S. Seitz, R. Szeliski

  23. Lucas-Kanade Assume neighbors have the same displacement: stack one constraint ∇I(x_i) · (u, v) = −I_t(x_i) per pixel of the window and solve the over-determined system in the least-squares sense, (u, v)ᵀ = (AᵀA)⁻¹ Aᵀ b
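A per-pixel sketch of that solve, assuming precomputed gradient images Ix, Iy and a temporal difference It; the window half-size and the eigenvalue threshold guarding against the aperture problem are assumptions:

```python
import numpy as np

def lk_flow_at(Ix, Iy, It, x, y, half=7):
    """Solve  [sum Ix^2  sum IxIy] [u]   [-sum IxIt]
              [sum IxIy  sum Iy^2] [v] = [-sum IyIt]
    over a (2*half+1)^2 window centered at (x, y)."""
    win = (slice(y - half, y + half + 1), slice(x - half, x + half + 1))
    A = np.stack([Ix[win].ravel(), Iy[win].ravel()], axis=1)
    b = -It[win].ravel()
    M = A.T @ A
    if np.linalg.eigvalsh(M).min() < 1e-4:   # edge/homogeneous: give up
        return None
    return np.linalg.solve(M, A.T @ b)
```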

  24. Revisiting the small motion assumption • Is this motion small enough? • Probably not: it is much larger than one pixel (2nd-order terms dominate) • How might we solve this problem? * From Khurram Hassan-Shafique CAP5415 Computer Vision 2003

  25. Reduce the resolution! * From Khurram Hassan-Shafique CAP5415 Computer Vision 2003

  26. Coarse-to-fine optical flow estimation [Figure: Gaussian pyramids of images I_{t-1} and I; the apparent motion shrinks with resolution, from u = 10 pixels at full resolution to u = 5, u = 2.5, and u = 1.25 pixels at successively coarser levels] Slides from Bradski and Thrun

  27. Coarse-to-fine optical flow estimation [Figure: run iterative L-K at the coarsest level of the Gaussian pyramids of images I_{t-1} and I, then repeatedly warp & upsample and run iterative L-K again at each finer level] Slides from Bradski and Thrun
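The coarse-to-fine loop as a sketch; `iterative_lk` stands in for any single-level iterative Lucas-Kanade refiner (a hypothetical callback, not an OpenCV function), and the number of levels is an assumption:

```python
import cv2
import numpy as np

def coarse_to_fine_flow(I0, I1, iterative_lk, levels=4):
    """Estimate flow at the coarsest pyramid level, then repeatedly
    upsample it (x2 in size and magnitude) and refine at the next
    finer level with `iterative_lk(I0, I1, init_flow)`."""
    pyr0, pyr1 = [I0], [I1]
    for _ in range(levels - 1):
        pyr0.append(cv2.pyrDown(pyr0[-1]))
        pyr1.append(cv2.pyrDown(pyr1[-1]))
    flow = None
    for a, b in zip(reversed(pyr0), reversed(pyr1)):
        h, w = a.shape[:2]
        if flow is None:
            flow = np.zeros((h, w, 2), np.float32)
        else:
            flow = 2.0 * cv2.resize(flow, (w, h))  # upsample & rescale
        flow = iterative_lk(a, b, flow)            # refine at this level
    return flow
```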

  28. Gain-Adaptive KLT-Tracking [Videos: tracking with fixed gain vs. auto-gain] • Data-parallel implementation on the GPU [Sinha, Frahm, Pollefeys, Genc MVA'07] • Simultaneous tracking and radiometric calibration [Kim, Frahm, Pollefeys ICCV'07] • But: not data parallel, hence hard to accelerate on the GPU • Block-Jacobi iterations [Zach, Gallup, Frahm CVGPU'08] • Data parallel, very efficient on the GPU

  29. Gain Estimation [Plot: camera-reported gain (blue) and estimated gain (red)] [Zach, Gallup, Frahm CVGPU'08]

  30. Limits of the gradient method • Fails when the intensity structure in the window is poor • Fails when the displacement is large (the typical operating range is a motion of about 1 pixel): the linearization of brightness is suitable only for small displacements • Also, brightness is not strictly constant in images; actually less problematic than it appears, since we can pre-filter images to make them look similar Slide credit: S. Seitz, R. Szeliski

  31. Limitations of Yosemite [Figure: Yosemite images 7 and 8, flow color coding, and the ground-truth flow] • Only sequence used for quantitative evaluation • Limitations: • Very simple and synthetic • Small, rigid motion • Minimal motion discontinuities/occlusions Slide credit: S. Seitz, R. Szeliski

  32. Limitations of Yosemite [Figure: same Yosemite images, flow color coding, and ground-truth flow as above] • Only sequence used for quantitative evaluation • Current challenges: • Non-rigid motion • Real sensor noise • Complex natural scenes • Motion discontinuities • Need more challenging and more realistic benchmarks Slide credit: S. Seitz, R. Szeliski

  33. Realistic synthetic imagery [Figures: "Rock" and "Grove" scenes] • Randomly generate scenes with "trees" and "rocks" • Significant occlusions, motion, texture, and blur • Rendered using Mental Ray and a "lens shader" plugin Slide credit: S. Seitz, R. Szeliski

  34. Modified stereo imagery [Figures: "Venus" and "Moebius" datasets] • Re-crop and resample ground-truth stereo datasets to have appropriate motion for optical flow Slide credit: S. Seitz, R. Szeliski

  35. Dense flow with hidden texture [Figures: setup and lights; visible-light image, UV image, cropped view] • Paint the scene with textured fluorescent paint • Take 2 images: one in visible light, one in UV light • Move the scene in very small steps using a robot • Generate ground truth by tracking the UV images Slide credit: S. Seitz, R. Szeliski

  36. Experimental results • Algorithms: • Pyramid LK: OpenCV-based implementation of Lucas-Kanade on a Gaussian pyramid • Black and Anandan: author's implementation • Bruhn et al.: our implementation • MediaPlayer™: code used for video frame-rate upsampling in Microsoft MediaPlayer • Zitnick et al.: author's implementation Slide credit: S. Seitz, R. Szeliski

  37. Experimental results [Figure: quantitative comparison of the algorithms] Slide credit: S. Seitz, R. Szeliski

  38. Conclusions • Difficulty: Data substantially more challenging than Yosemite • Diversity: Substantial variation in difficulty across the various datasets • Motion GT vs Interpolation: Best algorithms for one are not the best for the other • Comparison with Stereo: Performance of existing flow algorithms appears weak Slide credit: S.Seitz, R. Szeliski

  39. Motion representations • How can we describe this scene? Slide credit: S.Seitz, R. Szeliski

  40. Block-based motion prediction • Break the image up into square blocks • Estimate a translation for each block • Use this to predict the next frame, and code the difference (MPEG-2) Slide credit: S. Seitz, R. Szeliski
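A brute-force block-matching sketch in the MPEG spirit; the block size, search range, and SSD criterion are illustrative choices, and there is no sub-pixel refinement:

```python
import numpy as np

def block_motion(prev, cur, block=16, search=8):
    """For every block x block tile of `cur`, find the SSD-best
    integer translation into `prev` within +/- `search` pixels."""
    H, W = cur.shape
    motion = {}
    for y in range(0, H - block + 1, block):
        for x in range(0, W - block + 1, block):
            tile = cur[y:y + block, x:x + block].astype(np.float64)
            best, best_d = np.inf, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy <= H - block and 0 <= xx <= W - block:
                        cand = prev[yy:yy + block, xx:xx + block]
                        e = np.sum((tile - cand) ** 2)
                        if e < best:
                            best, best_d = e, (dx, dy)
            motion[(x, y)] = best_d   # displacement for this block
    return motion
```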

  41. Layered motion • Break the image sequence up into "layers" [Figure: a frame decomposed into a composite of layers] • Describe each layer's motion Slide credit: S. Seitz, R. Szeliski

  42. Layered motion • Advantages: • can represent occlusions / disocclusions • each layer's motion can be smooth • video segmentation for semantic processing • Difficulties: • how do we determine the correct number of layers? • how do we assign pixels? • how do we model the motion? Slide credit: S. Seitz, R. Szeliski

  43. Layers for video summarization Slide credit: S. Seitz, R. Szeliski

  44. Background modeling (MPEG-4) • Convert masked images into a background sprite for layered video coding [Figure: several masked frames composited into one background sprite] Slide credit: S. Seitz, R. Szeliski

  45. What are layers? [Wang & Adelson, 1994]: each layer consists of • intensities • alphas • velocities Slide credit: S. Seitz, R. Szeliski

  46. How do we form them? Slide credit: S.Seitz, R. Szeliski

  47. How do we estimate the layers? • compute coarse-to-fine flow • estimate affine motion in blocks (regression) • cluster with k-means • assign pixels to the best-fitting affine region • re-estimate affine motions in each region, and iterate (see the sketch below) Slide credit: S. Seitz, R. Szeliski
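A sketch of the two middle steps (block-wise affine regression, then k-means on the six parameters); the block size and the number of clusters are assumptions, and the dense flow field is taken as given:

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def blockwise_affine(flow, block=32):
    """Fit u = a1*x + a2*y + a3, v = a4*x + a5*y + a6 to each
    block x block tile of a dense flow field (H, W, 2) by least squares."""
    H, W, _ = flow.shape
    params = []
    for y in range(0, H - block + 1, block):
        for x in range(0, W - block + 1, block):
            ys, xs = np.mgrid[y:y + block, x:x + block]
            A = np.stack([xs.ravel(), ys.ravel(),
                          np.ones(block * block)], axis=1)
            u = flow[y:y + block, x:x + block, 0].ravel()
            v = flow[y:y + block, x:x + block, 1].ravel()
            au = np.linalg.lstsq(A, u, rcond=None)[0]
            av = np.linalg.lstsq(A, v, rcond=None)[0]
            params.append(np.concatenate([au, av]))
    return np.array(params)

# Cluster the block motions into candidate layer motions:
# centroids, labels = kmeans2(blockwise_affine(flow), k=4, minit="++")
```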

  48. Layer synthesis • For each layer: • stabilize the sequence with the affine motion • compute the median value at each pixel • Determine occlusion relationships Slide credit: S. Seitz, R. Szeliski
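The median step in code, assuming each frame comes with a 2x3 affine matrix that maps it into the reference frame:

```python
import cv2
import numpy as np

def layer_median(frames, affines):
    """Warp every frame into the reference frame with its affine
    motion, then take the per-pixel median: pixels belonging to
    other layers become outliers and vanish."""
    h, w = frames[0].shape[:2]
    stabilized = [cv2.warpAffine(f, A, (w, h))
                  for f, A in zip(frames, affines)]
    return np.median(np.stack(stabilized), axis=0).astype(frames[0].dtype)
```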

  49. Results Slide credit: S.Seitz, R. Szeliski
