Automatic Matching of Multi-View Images

Ed Bremer, University of Rochester

Presentation Transcript


  1. Automatic Matching of Multi-View Images • Ed Bremer, University of Rochester

  2. References • [1] Mikolajczyk, K., Schmid, C., 2004. A performance evaluation of local descriptors. Submitted to PAMI, October 2004. http://lear.inrialpes.fr/pubs/2004/MS04a • [2] Mikolajczyk, K., Tuytelaars, T., Schmid, C., Zisserman, A., Matas, J., Schaffalitzky, F., Kadir, T., Van Gool, L., 2004. A comparison of affine region detectors. Submitted to International Journal of Computer Vision, August 2004. http://lear.inrialpes.fr/pubs/2004/MTSZMSKG04 • [3] Lowe, D., 2004. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2), pp. 91-110. • [4] Matas, J., Chum, O., Urban, M., Pajdla, T., 2002. Robust wide baseline stereo from maximally stable extremal regions. Proc. British Machine Vision Conference (BMVC 2002), pp. 384-393. • [5] Schaffalitzky, F., Zisserman, A., 2002. Multi-view matching for unordered image sets, or "How do I organize my holiday snaps?". Proc. 7th European Conference on Computer Vision, Copenhagen, Denmark, vol. 1, pp. 414-431. • [6] Baumberg, A., 2000. Reliable feature matching across widely separated views. Proc. CVPR, pp. 774-781. • [7] Mikolajczyk, K., Schmid, C., 2001. Indexing based on scale invariant interest points. Proc. 8th ICCV, pp. 525-531.

  3. Outline • Motivation • Applications • Process Components • Region Detectors • Descriptors • Matching Criteria • Performance Evaluation • Conclusion & Next Steps

  4. Motivation • Multi-view/multi-image matching: multiple images of a 3D scene taken by a single camera or by multiple cameras, with different rotation, scale, viewpoint and illumination

  5. Motivation • Applications: detecting matching regions is used in all of the following • Image registration • Super-resolution • Stereo vision • Object detection and recognition • Object and motion tracking • Indexing and retrieval of objects • 3D scene reconstruction • Scene recognition

  6. Examples of Multi-view Images [2]

  7. Process Components • Covariant region detection: detect image regions covariant to the class of transformations between the reference image and the transformed image • Invariant descriptor: compute invariant descriptors from the covariant regions • Descriptor matching: compute the distance between descriptors in the reference image and the transformed image [1]

  8. Region Detectors • Support regions for computation of descriptors • Determined independently in each image • Scale invariant or affine invariant • Can be points (feature points) or regions (covariant) • Provide dense (local) coverage, robust to occlusion • Need to be stable and repeatable • Five region detectors: • Harris points -> invariant to rotation • Harris-Laplace -> invariant to rotation and scale • Hessian-Laplace -> invariant to rotation and scale • Harris-Affine -> invariant to affine image transformations • Hessian-Affine -> invariant to affine image transformations [1]

  9. Region Detectors • Harris points: • Maxima of the Harris function are used to locate interest points • Support region is fixed in size, a 41x41 neighborhood centered at the interest point • Harris-Laplace regions: • Points are detected with the scale-adapted Harris function • Interest points are selected at local extrema (minima or maxima) of the Laplacian-of-Gaussian across scale-space [1]
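The Harris function on this slide is commonly written as det(M) - k * trace(M)^2, where M is the second moment matrix of the image derivatives. The following is a minimal sketch of that measure only; the function name, the argument layout and the value k = 0.04 are assumptions for illustration and do not come from the slides.

```python
def harris_response(a, b, c, k=0.04):
    """Harris corner measure for the symmetric second moment matrix
    M = [[a, b], [b, c]] (smoothed products of image derivatives).

    k is an empirical constant; 0.04 is an assumed, commonly used value.
    """
    det = a * c - b * b
    trace = a + c
    return det - k * trace * trace


# Hypothetical matrix entries, chosen only to show the sign of the response.
corner = harris_response(10.0, 0.0, 10.0)  # strong gradients in two directions
edge = harris_response(10.0, 0.0, 0.0)     # strong gradient in one direction
flat = harris_response(0.0, 0.0, 0.0)      # no gradient structure
```

The response is positive at corner-like points, negative along edges and zero in flat areas, which is why interest points are taken at its local maxima.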

  10. Region Detectors • Harris-Laplace performance: • Approximately 10% better than the Laplacian, Lowe (DoG) or gradient methods • The standard Harris detector performs very poorly under scale changes [7]

  11. Region Detectors • Hessian-Laplace regions: • Interest points are located at local maxima of the Hessian determinant • Location in scale-space is selected using maxima of the Laplacian-of-Gaussian (Difference-of-Gaussians can also be used) [1] [3]

  12. Region Detectors • Harris-Affine regions: • Find regions using the Harris-Laplace detector • Region is affine adapted based on the 2nd moment matrix • Hessian-Affine regions: • Find regions using the Hessian-Laplace detector • Region is affine adapted based on the 2nd moment matrix [2]

  13. Region Detectors • Regions produced by Harris-Affine and Hessian-Affine detectors [2]

  14. Region Detectors • Affine normalization using 2nd moment matrix for region L and R [2]

  15. Region Detectors • Region normalization • Detectors produce circular or elliptical regions • Size is dependent on detection scale • Map regions to a circular region with constant radius • Rotate regions in the direction of the dominant gradient orientation • Illumination normalization • Model the illumination change as an affine transformation -> aI(x) + b • Normalize using the mean and standard deviation of the pixel intensities [1]
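The illumination normalization step can be sketched as follows: subtracting the mean removes the offset b, and dividing by the standard deviation removes the gain a. This is a minimal illustration on a patch stored as a list of rows; the function name and the sample values are hypothetical.

```python
import math


def normalize_illumination(patch):
    """Return the patch with zero mean and unit standard deviation.

    patch is a list of rows of pixel intensities. For an affine
    illumination change a*I(x) + b with a > 0, the normalized patch
    is the same as for the original intensities I(x).
    """
    pixels = [p for row in patch for p in row]
    mean = sum(pixels) / len(pixels)
    variance = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    std = math.sqrt(variance)
    if std == 0:
        # Constant patch: there is no contrast to normalize.
        return [[0.0 for _ in row] for row in patch]
    return [[(p - mean) / std for p in row] for row in patch]


patch = [[10, 20], [30, 40]]
brighter = [[2 * p + 5 for p in row] for row in patch]  # a = 2, b = 5
n1 = normalize_illumination(patch)
n2 = normalize_illumination(brighter)
```

Both `n1` and `n2` hold the same values (up to floating point rounding), which is the invariance the slide refers to.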

  16. Descriptors • Descriptors -> feature vector • Invariant to changes in scale, rotation, affine transformation and affine illumination • Need to be distinct, stable and repeatable • Distribution (histogram) type or covariance type • Ten descriptor types • Scale-Invariant Feature Transform (SIFT) • Gradient Location and Orientation Histogram (GLOH) • Shape Context • Principal Component Analysis (PCA)-SIFT • Steerable Filters • Differential Invariants • Complex Filters • Moment Invariants • Cross-Correlation • Spin Image [1]

  17. Descriptors • SIFT and GLOH 3D Descriptors • SIFT -> 4 x 4 x 8 = 128 dimension descriptor • GLOH -> Log-polar [(2 x 8) + 1] x 16 = 272 dimension descriptor [1]

  18. Matching Criteria • Distance measure • Find putative matches between images • Mahalanobis distance: used for covariance-type descriptors • Euclidean distance: used for distribution (histogram) descriptors • Direct distance comparison is not suitable for indexing or database searching • Simple threshold • Descriptors match if the distance between them is below a threshold t • A descriptor in the reference image can have many matches to descriptors in the transformed image • Nearest Neighbor (NN) • Find the closest match between descriptors in the reference and transformed image • A descriptor in the reference image can have only 1 match to a descriptor in the transformed image
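The difference between the two matching criteria can be sketched with Euclidean distance on short toy descriptors. The function names and the descriptor values are hypothetical, and applying the threshold t to the nearest neighbor as well is an assumption (it follows the usual formulation in [1], not the slide text).

```python
import math


def euclidean(u, v):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def threshold_matches(ref, other, t):
    """All index pairs (i, j) whose distance is below t.

    One reference descriptor may match several descriptors.
    """
    return [(i, j)
            for i, u in enumerate(ref)
            for j, v in enumerate(other)
            if euclidean(u, v) < t]


def nearest_neighbor_matches(ref, other, t):
    """For each reference descriptor, keep only its closest descriptor
    in the other image, and only if that distance is below t."""
    if not other:
        return []
    matches = []
    for i, u in enumerate(ref):
        j, d = min(((j, euclidean(u, v)) for j, v in enumerate(other)),
                   key=lambda pair: pair[1])
        if d < t:
            matches.append((i, j))
    return matches


ref = [[0.0, 0.0], [5.0, 5.0]]
other = [[0.1, 0.0], [0.0, 0.3], [5.0, 5.2]]
by_threshold = threshold_matches(ref, other, t=0.5)
by_nn = nearest_neighbor_matches(ref, other, t=0.5)
```

With these toy values the first reference descriptor has two matches under the simple threshold but only one under nearest neighbor matching.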

  19. Performance Evaluation • Criterion basis • Recall rate = #correct matches / #correspondences • 1-precision = #false matches / [#correct matches + #false matches] • Ideal descriptor -> recall rate = 1 for any precision, given no overlap error [1]
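The two evaluation measures are simple ratios and can be written directly from the definitions above. The counts used here are made-up numbers for illustration, not results from [1].

```python
def recall(correct_matches, correspondences):
    """Fraction of the ground-truth correspondences that were matched correctly."""
    return correct_matches / correspondences


def one_minus_precision(correct_matches, false_matches):
    """Fraction of all returned matches that are false."""
    return false_matches / (correct_matches + false_matches)


# Hypothetical counts: 100 correspondences, 80 correct and 20 false matches.
r = recall(80, 100)
p = one_minus_precision(80, 20)
```

Sweeping the matching threshold t and plotting recall against 1-precision gives one curve per descriptor.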

  20. SIFT - Scale Invariant Feature Transform (Lowe [3]) • Features: • Invariant to image scale and rotation • Invariant to small changes in illumination and 3D camera viewpoint • Extracts a large number of highly distinctive features • Enables detection of small objects • Improved performance in cluttered scenes • Algorithms are efficient: complex operations are applied to local regions or features rather than the whole image • Procedure • Scale-space extrema detection • Keypoint localization • Orientation assignment • Keypoint vector (descriptor)

  21. SIFT - Scale Invariant Feature Transform [3] • Scale-Space Blob Detector - • Search for stable features over all scales and image locations • Scale-space kernel -> Gaussian function • Difference of Gaussian

  22. SIFT - Scale Invariant Feature Transform [3] • Difference of Gaussian (DoG) • Simple subtraction of adjacent blurred images L • Approximation to the scale-normalized Laplacian of Gaussian • Maxima or minima of the scale-normalized Laplacian produce the most stable image features compared to the gradient, Hessian, or Harris corner function (Mikolajczyk 2002)

  23. SIFT - Scale Invariant Feature Transform [3] • Scale-Space Image Set - • Divide each octave into s intervals • Compute s + 3 filtered (increasingly blurred) images, with k = 2^(1/s) • Example: s = 3, k = 1.26 -> 1st -> 1.00σ, 2nd -> 1.26σ, 3rd -> 1.59σ, 4th -> 2.00σ, 5th -> 2.52σ, 6th -> 3.17σ • Subtract adjacent images to produce DoG images • Repeat for the next octave using the image with 2.00σ (two images from the top of the stack), decimated by 2
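The blur levels listed on this slide follow from k = 2^(1/s). A minimal sketch that reproduces them (approximately, after rounding) is given here; the function name and the base value sigma0 = 1.0 are assumptions for illustration.

```python
def octave_sigmas(s, sigma0=1.0):
    """Blur levels of the s + 3 Gaussian images in one octave.

    Consecutive images differ by the constant factor k = 2^(1/s),
    so the blur doubles after s intervals.
    """
    k = 2.0 ** (1.0 / s)
    return [sigma0 * k ** i for i in range(s + 3)]


sigmas = octave_sigmas(3)
rounded = [round(x, 2) for x in sigmas]
# Subtracting adjacent images gives one DoG image fewer than Gaussian images.
num_dog_images = len(sigmas) - 1
```

For s = 3 this yields six Gaussian images and five DoG images per octave.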

  24. SIFT - Scale Invariant Feature Transform [3] • Scale-Space Pyramid - (from Lowe)

  25. SIFT - Scale Invariant Feature Transform [3] • Locating Scale-Space Extrema - • Detection of local maxima or minima of D(x, y, σ) • Compare each sample point to its 8 neighbors in the same scale image and to the 9 neighbors in each of the scale images above and below (26 neighbors in total) • Mark the sample if it is greater than or less than all of these neighbors • Extrema are searched in s DoG images per octave
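The 26-neighbor comparison can be sketched on a small stack of DoG images stored as nested lists. This is an illustration only: the function name and the data layout (dog[scale][row][column]) are assumptions, and border handling is left to the caller.

```python
def is_extremum(dog, s, y, x):
    """True if dog[s][y][x] is strictly greater than, or strictly less
    than, all 26 neighbors: 8 in the same DoG image and 9 in each of
    the DoG images above and below.

    dog is a list of equally sized 2D images (lists of rows); the
    point (s, y, x) must not lie on the border of the stack.
    """
    value = dog[s][y][x]
    neighbors = [dog[s + ds][y + dy][x + dx]
                 for ds in (-1, 0, 1)
                 for dy in (-1, 0, 1)
                 for dx in (-1, 0, 1)
                 if (ds, dy, dx) != (0, 0, 0)]
    return (all(value > n for n in neighbors)
            or all(value < n for n in neighbors))


# Toy 3x3x3 stacks: flat, one with a single peak, one with a single valley.
flat = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
peak = [[row[:] for row in image] for image in flat]
peak[1][1][1] = 1.0
valley = [[row[:] for row in image] for image in flat]
valley[1][1][1] = -1.0
```

Because most samples fail after the first few comparisons, a production version would typically exit early instead of building the full neighbor list.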

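  26. SIFT - Scale Invariant Feature Transform [3] • Improving Localization - • Reject points that have low contrast using: $|D(\hat{\mathbf{x}})| < \text{threshold}$ • Where -> $D(\hat{\mathbf{x}}) = D + \frac{1}{2} \frac{\partial D}{\partial \mathbf{x}}^{T} \hat{\mathbf{x}}$ • Gives offset of extremum -> $\hat{\mathbf{x}} = -\left(\frac{\partial^{2} D}{\partial \mathbf{x}^{2}}\right)^{-1} \frac{\partial D}{\partial \mathbf{x}}$ (expressions as given in Lowe [3]) • The Hessian and derivative of D(x, y, σ) are computed using differences of neighboring sample points; $\mathbf{x} = (x, y, \sigma)^{T}$ is the offset from the sample point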

  27. SIFT - Scale Invariant Feature Transform [3] • Edge Rejection - • Eliminate poorly defined peaks (edges) using the Hessian matrix • Verify that the ratio of principal curvatures is below a threshold, r < 10 • Efficient to compute -> less than 20 floating point operations
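The curvature-ratio check can be sketched without computing eigenvalues, using the trace and determinant of the 2x2 Hessian of D as described in Lowe [3]: the point is kept when Tr(H)^2 / Det(H) < (r + 1)^2 / r. The function name and the sample second-derivative values are hypothetical.

```python
def passes_edge_test(dxx, dyy, dxy, r=10.0):
    """Keep a keypoint only if the ratio of the principal curvatures
    of D is below r.

    dxx, dyy, dxy are second derivatives of D at the keypoint, i.e.
    the entries of the Hessian [[dxx, dxy], [dxy, dyy]].
    """
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:
        # Curvatures of opposite sign (or one is zero): reject.
        return False
    return trace * trace / det < (r + 1) ** 2 / r


blob = passes_edge_test(2.0, 2.0, 0.0)   # equal curvatures, ratio 1
edge = passes_edge_test(20.0, 1.0, 0.0)  # curvature ratio 20
```

Here `blob` is kept and `edge` is rejected, since a ratio of 20 exceeds the threshold of 10.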

  28. SIFT - Scale Invariant Feature Transform [3] • Results from Lowe [3] – 832 keypoints reduced to 536 (233x189 image)

  29. SIFT - Scale Invariant Feature Transform • Results from Lowe [3] – performance measures

  30. SIFT - Scale Invariant Feature Transform • Results from Lowe [3] – performance measures

  31. SIFT - Scale Invariant Feature Transform [3] • Orientation: rotational invariance • Use the scale of the keypoint to select the image L(x, y, σ) • Compute the gradient magnitude m(x, y) and orientation θ(x, y) at each image sample using pixel differences • Orientation histogram of sample points: entries are weighted by gradient magnitude and by a Gaussian window around the keypoint; bins cover the 360° range • Peaks in the histogram correspond to dominant directions of local gradients
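The gradient and histogram steps can be sketched on a smoothed image stored as a list of rows. This is a simplified illustration: the function names are hypothetical, the choice of 36 bins follows Lowe [3] rather than the slide, and the Gaussian window weighting is omitted for brevity.

```python
import math


def gradient(L, x, y):
    """Gradient magnitude and orientation (degrees in [0, 360)) at
    pixel (x, y) of the smoothed image L, from pixel differences."""
    dx = L[y][x + 1] - L[y][x - 1]
    dy = L[y + 1][x] - L[y - 1][x]
    magnitude = math.hypot(dx, dy)
    theta = math.degrees(math.atan2(dy, dx)) % 360.0
    return magnitude, theta


def orientation_histogram(L, bins=36):
    """Magnitude-weighted histogram of gradient orientations over the
    interior pixels of L (no Gaussian window in this sketch)."""
    hist = [0.0] * bins
    for y in range(1, len(L) - 1):
        for x in range(1, len(L[0]) - 1):
            magnitude, theta = gradient(L, x, y)
            hist[int(theta * bins / 360.0) % bins] += magnitude
    return hist


# Toy image: intensity increases from left to right only.
ramp = [[float(x) for x in range(5)] for _ in range(5)]
hist = orientation_histogram(ramp)
dominant_bin = max(range(len(hist)), key=hist.__getitem__)
```

For the horizontal ramp all gradients point along the x axis, so the whole histogram mass falls into the first bin.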

  32. SIFT - Scale Invariant Feature Transform [3] • Descriptor – the feature vector • 8x8 sub-region histograms allow shift in gradient positions • 128 element feature vector -> 4x4 array of 8 orientations (2x2x8 from Lowe is shown below) • Feature vectors matched by nearest neighbor (Euclidean distance)

  33. SIFT - Scale Invariant Feature Transform [3] • Results from Lowe [3]: • Two training objects recognized in a cluttered image • Small squares show point matches • Large rectangles show the border of the training image after affine transformation

  34. Conclusions • Conclusions • The Harris-Laplace region detector performs better than the Laplacian, DoG and gradient scale-space operators • Scale-space detectors provide invariance to rotation and scale, and to small changes in illumination and viewpoint • Affine adaptation provides invariance to affine transformations • GLOH and SIFT descriptors provide the best performance • Dense, localized descriptors perform well under occlusions • Next steps • Coding and testing of region detectors, descriptors and matching…
