Scale-Invariant Feature Transform (SIFT)

Jinxiang Chai


Review

Image Processing

- Median filtering

- Bilateral filtering

- Edge detection

- Corner detection


Review: Corner Detection

1. Compute image gradients

2. Construct the 2x2 matrix of gradient products from each pixel's gradient and its neighborhood values

3. Determine the 2 eigenvalues λ(i,j) = [λ1, λ2].

4. If both λ1 and λ2 are big, we have a corner
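
The four steps above can be sketched in plain Python. This is a minimal illustration, not the lecture's code: the function names, the central-difference gradients, the 3x3 summation window, and the threshold of 0.5 are all assumptions chosen for the toy images below.

```python
import math

def corner_eigenvalues(img, r, c, half=1):
    """Eigenvalues of the 2x2 gradient matrix summed over a window around (r, c)."""
    sxx = sxy = syy = 0.0
    for i in range(r - half, r + half + 1):
        for j in range(c - half, c + half + 1):
            # Step 1: image gradients by central differences (an assumption).
            gx = (img[i][j + 1] - img[i][j - 1]) / 2.0
            gy = (img[i + 1][j] - img[i - 1][j]) / 2.0
            # Step 2: accumulate the matrix [[sxx, sxy], [sxy, syy]].
            sxx += gx * gx
            sxy += gx * gy
            syy += gy * gy
    # Step 3: closed-form eigenvalues of a symmetric 2x2 matrix.
    mean = (sxx + syy) / 2.0
    spread = math.sqrt(((sxx - syy) / 2.0) ** 2 + sxy ** 2)
    return mean + spread, mean - spread

def is_corner(img, r, c, threshold=0.5):
    # Step 4: a corner needs BOTH eigenvalues to be big.
    lam1, lam2 = corner_eigenvalues(img, r, c)
    return min(lam1, lam2) > threshold

# Toy 7x7 images: a bright quadrant (corner at (3, 3)) and a vertical edge.
corner_img = [[1.0 if i >= 3 and j >= 3 else 0.0 for j in range(7)] for i in range(7)]
edge_img = [[1.0 if j >= 3 else 0.0 for j in range(7)] for i in range(7)]
print(is_corner(corner_img, 3, 3), is_corner(edge_img, 3, 3))
```

On the edge image one eigenvalue is large and the other is zero, so only the quadrant image passes the test.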


The Orientation Field

Corners are detected where both λ1 and λ2 are big


Good Image Features

  • What are we looking for?

    • Strong features

    • Invariant to changes (affine and perspective/occlusion)

    • Solve the problem of correspondence

      • Locate an object in multiple images (e.g. in video)

      • Track the path of the object, infer 3D structures, object and camera movement


Scale Invariant Feature Transform (SIFT)

  • Choosing features that are invariant to image scaling and rotation

  • Also, partially invariant to changes in illumination and 3D camera viewpoint


Invariance

  • Illumination

  • Scale

  • Rotation

  • Affine


Required Readings

  • David G. Lowe, "Object recognition from local scale-invariant features," ICCV 1999

  • David G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, 60, 2 (2004), pp. 91-110


Motivation for SIFT

  • Earlier Methods

    • Harris corner detector

      • Sensitive to changes in image scale

      • Finds locations in image with large gradients in two directions

    • No method was fully affine invariant

      • Although the SIFT approach is not fully affine invariant, it allows for considerable affine change

      • SIFT also allows for changes in 3D viewpoint


SIFT Algorithm Overview

  • Scale-space extrema detection

  • Keypoint localization

  • Orientation Assignment

  • Generation of keypoint descriptors.


Scale Space

  • Different scales are appropriate for describing different objects in the image, and we may not know the correct scale/size ahead of time.


Scale space (Cont.)

  • Looking for features (locations) that are stable (invariant) across all possible scale changes

    • use a continuous function of scale (scale space)

  • Which scale-space kernel will we use?

    • The Gaussian Function


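Scale-Space of Image

  • L(x, y, σ) = G(x, y, σ) * I(x, y), where * is convolution

    • G(x, y, σ): variable-scale Gaussian

    • I(x, y): input image

  • To detect stable keypoint locations, find the scale-space extrema in the difference-of-Gaussian function D(x, y, σ) = L(x, y, kσ) - L(x, y, σ)

  • Look familiar?

    - bandpass filter!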


Difference of Gaussian

  • A = Convolve image with vertical and horizontal 1D Gaussians, σ = sqrt(2)

  • B = Convolve A with vertical and horizontal 1D Gaussians, σ = sqrt(2)

  • DOG (Difference of Gaussian) = A – B

  • So how to deal with different scales?

  • Downsample B with bilinear interpolation with pixel spacing of 1.5 (each new pixel is a linear combination of 4 adjacent pixels), then repeat on the downsampled image
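
One level of this construction can be sketched in plain Python as follows. The slide does not say how the Gaussian is truncated or how image borders are handled, so the kernel radius of about 3σ and the edge replication are assumptions, as are all function names. Plain lists are used only to keep the sketch dependency-free.

```python
import math

def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel truncated at about 3 sigma (an assumption)."""
    radius = int(math.ceil(3 * sigma))
    weights = [math.exp(-(k * k) / (2 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def convolve_rows(img, kernel):
    """Convolve every row with a 1D kernel, replicating edge pixels."""
    radius = len(kernel) // 2
    width = len(img[0])
    return [[sum(kernel[k + radius] * row[min(max(j + k, 0), width - 1)]
                 for k in range(-radius, radius + 1))
             for j in range(width)]
            for row in img]

def transpose(img):
    return [list(col) for col in zip(*img)]

def blur(img, sigma=math.sqrt(2)):
    """Separable Gaussian blur: horizontal 1D pass, then vertical 1D pass."""
    kernel = gaussian_kernel(sigma)
    horizontal = convolve_rows(img, kernel)
    return transpose(convolve_rows(transpose(horizontal), kernel))

def difference_of_gaussian(img):
    a = blur(img)  # A: image smoothed with sigma = sqrt(2)
    b = blur(a)    # B: A smoothed again with sigma = sqrt(2)
    dog = [[pa - pb for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]
    return a, b, dog
```

A constant image gives a DOG of zero everywhere, while an isolated bright pixel gives a positive response at its center, which matches the bandpass behaviour of the filter.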


Difference of Gaussian Pyramid

[Figure: Input Image → Blur → A1 → Blur → B1, DOG1 = A1-B1; Downsample B1 → A2 → Blur → B2, DOG2 = A2-B2; Downsample B2 → A3 → Blur → B3, DOG3 = A3-B3]


Other issues

  • Initial smoothing ignores highest spatial frequencies of images

    - expand the input image by a factor of 2, using bilinear interpolation, prior to building the pyramid

  • How to do downsampling with bilinear interpolation?


Bilinear Filter

Weighted sum of four neighboring pixels

Sampling at S(x,y), which lies between pixels (i,j), (i,j+1), (i+1,j) and (i+1,j+1), with a and b the fractional offsets of the sample from (i,j) along the j and i directions (0 ≤ a, b ≤ 1):

S(x,y) = (1-a)*(1-b)*S(i,j) + a*(1-b)*S(i,j+1)

+ (1-a)*b*S(i+1,j) + a*b*S(i+1,j+1)

To optimize the above, do the following

Si = S(i,j) + a*(S(i,j+1)-S(i,j))

Sj = S(i+1,j) + a*(S(i+1,j+1)-S(i+1,j))

S(x,y) = Si+b*(Sj-Si)
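
The optimized three-step form, and the resampling with pixel spacing 1.5 mentioned earlier, can be sketched as below. The names `bilinear_sample` and `downsample`, the clamping at the image border, and the convention that `a` is the fractional offset along j and `b` the fractional offset along i are assumptions made for this sketch. Coordinates are assumed to be non-negative.

```python
def bilinear_sample(img, y, x):
    """Sample img at real-valued row y and column x from its 4 neighbors."""
    i, j = int(y), int(x)                 # top-left pixel (i, j)
    i1 = min(i + 1, len(img) - 1)         # clamp at the bottom border
    j1 = min(j + 1, len(img[0]) - 1)      # clamp at the right border
    a, b = x - j, y - i                   # fractional offsets along j and i
    si = img[i][j] + a * (img[i][j1] - img[i][j])     # blend along row i
    sj = img[i1][j] + a * (img[i1][j1] - img[i1][j])  # blend along row i+1
    return si + b * (sj - si)             # blend between the two rows

def downsample(img, spacing=1.5):
    """Resample img on a coarser grid with the given pixel spacing."""
    rows = int((len(img) - 1) / spacing) + 1
    cols = int((len(img[0]) - 1) / spacing) + 1
    return [[bilinear_sample(img, r * spacing, c * spacing) for c in range(cols)]
            for r in range(rows)]
```

The optimized form uses three multiplications per sample instead of the eight in the fully expanded weighted sum.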


Pyramid Example

[Figure: example images for three pyramid levels: A1, B1, DOG1; A2, B2, DOG2; A3, B3, DOG3]


Feature Detection

  • Find maxima and minima of scale space

  • For each point on a DOG level:

    • Compare to 8 neighbors at same level

    • If max/min, identify corresponding point at pyramid level below

    • Determine if the corresponding point is max/min of its 8 neighbors

    • If so, repeat at pyramid level above

  • Repeat for each DOG level

  • Those that remain are key points
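
A rough sketch of this search is given below. It makes several assumptions that the slide leaves open: `dogs[0]` is taken to be the finest level, each following level is coarser by the 1.5 resampling factor, the corresponding point in an adjacent level is found by rounding the scaled coordinates, ties are rejected, and points on the image border are skipped. The function names are hypothetical.

```python
def beats_block(value, level, r, c, want_max, include_center):
    """True if value is strictly above (or below) the 3x3 block around (r, c)."""
    rows, cols = len(level), len(level[0])
    for i in range(r - 1, r + 2):
        for j in range(c - 1, c + 2):
            if not (0 <= i < rows and 0 <= j < cols):
                continue  # ignore neighbors that fall outside the image
            if (i, j) == (r, c) and not include_center:
                continue  # do not compare a point with itself
            other = level[i][j]
            if (want_max and value <= other) or (not want_max and value >= other):
                return False
    return True

def find_keypoints(dogs, spacing=1.5):
    """Return (level, row, col) for points that are extrema across levels."""
    keys = []
    for l, level in enumerate(dogs):
        for r in range(1, len(level) - 1):
            for c in range(1, len(level[0]) - 1):
                v = level[r][c]
                for want_max in (True, False):
                    # Same level: compare to the 8 neighbors.
                    if not beats_block(v, level, r, c, want_max, False):
                        continue
                    ok = True
                    # Adjacent levels: closest pixel and its 8 neighbors.
                    for m, factor in ((l - 1, spacing), (l + 1, 1.0 / spacing)):
                        if 0 <= m < len(dogs):
                            rr, cc = round(r * factor), round(c * factor)
                            ok = ok and beats_block(v, dogs[m], rr, cc, want_max, True)
                    if ok:
                        keys.append((l, r, c))
    return keys
```

Most points fail the cheap same-level comparison, so the checks against the adjacent levels run only for a small fraction of pixels.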


Identifying Max/Min

[Figure: a candidate point at DOG level L compared with its neighbors at levels DOG L+1 and DOG L-1]


Refining Key List: Illumination

  • For all levels, use the “A” smoothed image to compute

    • Gradient Magnitude

  • Threshold gradient magnitudes:

    • Remove all key points with gradient magnitude M(i,j) less than 0.1 times the max gradient value

  • Motivation: Low-contrast points are generally less reliable as feature points than high-contrast ones
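
The thresholding step might look like the sketch below. The slide does not show the gradient formula, so simple differences between adjacent pixels are assumed. "The max gradient value" is read here as the largest magnitude found in the image, which is only one possible interpretation. The function names are hypothetical.

```python
import math

def gradient_magnitude(a, i, j):
    """Gradient magnitude M(i,j) of smoothed image A from pixel differences."""
    dy = a[i][j] - a[i + 1][j]
    dx = a[i][j] - a[i][j + 1]
    return math.sqrt(dx * dx + dy * dy)

def filter_low_contrast(a, keys, fraction=0.1):
    """Keep key points whose gradient magnitude is at least fraction * max."""
    rows, cols = len(a), len(a[0])
    # Assumption: the reference value is the largest magnitude in this image.
    max_mag = max(gradient_magnitude(a, i, j)
                  for i in range(rows - 1) for j in range(cols - 1))
    return [(i, j) for (i, j) in keys
            if gradient_magnitude(a, i, j) >= fraction * max_mag]
```

Key points are given as (row, column) pairs and must not lie on the last row or column, because the differences look one pixel ahead.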


Assigning Canonical Orientation

  • For each remaining key point:

    • Choose surrounding N x N window at DOG level it was detected

  • For all levels, use the “A” smoothed image to compute

    • Gradient Orientation

  • Gradient magnitude weighted by 2D Gaussian (Weighted Magnitude = Gradient Magnitude × 2D Gaussian)

  • Accumulate in histogram based on orientation

  • Histogram has 36 bins with 10° increments

  • Identify peak and assign orientation and sum of magnitude to key point

[Figure: histogram of the sum of weighted magnitudes against gradient orientation, with the peak marked]
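
The orientation assignment can be sketched as follows. The window half-width of 2 (a 5x5 window), the Gaussian σ of 1.5, and the forward-difference gradients are assumptions, since the slides give neither N nor σ. The function returns the lower edge of the peak bin rather than an interpolated angle.

```python
import math

def canonical_orientation(a, r, c, half=2, sigma=1.5, bins=36):
    """Peak of the Gaussian-weighted orientation histogram around (r, c)."""
    bin_width = 360.0 / bins              # 10 degrees for 36 bins
    hist = [0.0] * bins
    for i in range(r - half, r + half + 1):
        for j in range(c - half, c + half + 1):
            dx = a[i][j + 1] - a[i][j]
            dy = a[i + 1][j] - a[i][j]
            magnitude = math.hypot(dx, dy)
            angle = math.degrees(math.atan2(dy, dx)) % 360.0
            # 2D Gaussian weight centred on the key point.
            weight = math.exp(-((i - r) ** 2 + (j - c) ** 2) / (2 * sigma ** 2))
            # Accumulate the weighted magnitude in the bin for this orientation.
            hist[int(angle // bin_width) % bins] += weight * magnitude
    peak = max(range(bins), key=lambda k: hist[k])
    # Orientation (lower edge of the peak bin) and summed weighted magnitude.
    return peak * bin_width, hist[peak]
```

For an image whose intensity rises steadily along the columns, every gradient points at 0°, so the peak falls in the first bin.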


Eliminating edges

  • Difference-of-Gaussian function will be strong along edges

    • So how can we get rid of these edges?

    • Similar to Harris corner detector

    • We are not concerned about the actual values of the eigenvalues, just the ratio of the two
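
The slide does not give the actual test. The sketch below follows the criterion in the Lowe 2004 paper from the required readings, which checks the 2x2 Hessian of the DOG image: a key point is kept when trace² / determinant is below (r+1)² / r, with r = 10. The finite-difference derivatives and the function name are assumptions.

```python
def passes_edge_test(dog, r, c, ratio=10.0):
    """Keep a key point only if its two principal curvatures are comparable."""
    # Second derivatives of the DOG image by finite differences.
    dxx = dog[r][c + 1] - 2 * dog[r][c] + dog[r][c - 1]
    dyy = dog[r + 1][c] - 2 * dog[r][c] + dog[r - 1][c]
    dxy = (dog[r + 1][c + 1] - dog[r + 1][c - 1]
           - dog[r - 1][c + 1] + dog[r - 1][c - 1]) / 4.0
    trace = dxx + dyy            # sum of the two eigenvalues
    det = dxx * dyy - dxy * dxy  # product of the two eigenvalues
    if det <= 0:                 # flat or opposite-sign curvature: reject
        return False
    # The ratio test never needs the eigenvalues themselves.
    return trace * trace / det < (ratio + 1) ** 2 / ratio
```

Because only the trace and determinant are used, no eigenvalue decomposition is needed: the quantity trace² / determinant depends only on the ratio of the two eigenvalues.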


Local Image Description

  • Each SIFT key is assigned:

    • Location

    • Scale (analogous to the level at which it was detected)

    • Orientation (assigned in the previous canonical orientation steps)

  • Now: Describe the local image region in a way that is invariant to the above transformations


SIFT key example


Local Image Description

For each key point:

  • Identify 8x8 neighborhood (from DOG level it was detected)

  • Align orientation to x-axis

  • Calculate gradient magnitude and orientation map

  • Weight by Gaussian

  • Calculate histogram of each 4x4 region. 8 bins for gradient orientation. Tally weighted gradient magnitude.

  • This histogram array is the image descriptor. (The example here is a vector of length 8*4 = 32: four 4x4 regions with 8 bins each. Best suggestion: a 128-element vector for a 16x16 neighborhood, i.e. sixteen 4x4 regions with 8 bins each.)
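
A simplified version of the descriptor is sketched below. It is a rough illustration under stated assumptions, not a full SIFT descriptor: orientation alignment is approximated by subtracting the key point's orientation from each gradient angle without rotating the sampling grid, the Gaussian σ is set to half the window size, and there is no interpolation between bins and no normalization. The function name is hypothetical.

```python
import math

def sift_like_descriptor(a, r, c, orientation_deg=0.0, size=8, cell=4, bins=8):
    """Concatenated orientation histograms of cell x cell regions around (r, c)."""
    cells_per_side = size // cell
    desc = [0.0] * (cells_per_side * cells_per_side * bins)
    sigma = size / 2.0                      # assumed width of the Gaussian weight
    for u in range(size):
        for v in range(size):
            i, j = r - size // 2 + u, c - size // 2 + v
            dx = a[i][j + 1] - a[i][j]
            dy = a[i + 1][j] - a[i][j]
            # Gradient orientation relative to the key point's orientation.
            angle = (math.degrees(math.atan2(dy, dx)) - orientation_deg) % 360.0
            # Gaussian weight centred on the middle of the window.
            du, dv = u - (size - 1) / 2.0, v - (size - 1) / 2.0
            weight = math.exp(-(du * du + dv * dv) / (2 * sigma * sigma))
            region = (u // cell) * cells_per_side + (v // cell)
            b = int(angle // (360.0 / bins)) % bins
            # Tally the weighted gradient magnitude in this region's histogram.
            desc[region * bins + b] += weight * math.hypot(dx, dy)
    return desc
```

With the defaults this yields the 32-element example from the slide, and `size=16` yields the 128-element vector.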


Applications: Image Matching

  • Find all key points identified in source and target image

    • Each key point will have a 2D location, scale and orientation, as well as an invariant descriptor vector

  • For each key point in the source image, search for the corresponding SIFT features in the target image.

  • Find the transformation between the two images using epipolar geometry constraints or an affine transformation.


Image matching via SIFT features

  • Feature detection

  • Image matching via nearest neighbor search

    - if the ratio of the closest distance to the 2nd closest distance is greater than 0.8, then reject as a false match.

  • Remove outliers using epipolar line constraints.
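
The nearest neighbor search with the 0.8 ratio test can be sketched with a brute-force loop. Euclidean distance between descriptor vectors is assumed, and the function name is hypothetical. Outlier removal with epipolar constraints is not shown.

```python
import math

def match_features(source, target, max_ratio=0.8):
    """Match descriptors by nearest neighbor, rejecting ambiguous matches."""
    matches = []
    for s_idx, s in enumerate(source):
        # Distances from this source descriptor to every target descriptor.
        dists = sorted((math.dist(s, t), t_idx) for t_idx, t in enumerate(target))
        if len(dists) < 2:
            continue  # the ratio test needs a second-closest candidate
        (best, best_idx), (second, _) = dists[0], dists[1]
        # Reject when closest / 2nd closest distance is greater than max_ratio.
        if second > 0 and best <= max_ratio * second:
            matches.append((s_idx, best_idx))
    return matches
```

A source descriptor with two equally close candidates in the target has a ratio of 1 and is rejected, which is the purpose of the test.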


Summary

  • SIFT features are reasonably invariant to rotation, scaling, and illumination changes.

  • We can use them for image matching and object recognition among other things.

  • Efficient on-line matching and recognition can be performed in real time.

