

Presentation Transcript



Scale-Invariant Feature Transform (SIFT)

Jinxiang Chai



Review

Image Processing

- Median filtering

- Bilateral filtering

- Edge detection

- Corner detection



Review: Corner Detection

1. Compute image gradients

2. Construct the matrix from it and its neighborhood values

3. Determine the two eigenvalues λ(i,j) = [λ1, λ2]

4. If both λ1 and λ2 are large, we have a corner
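The four steps above can be sketched in Python; this is a minimal illustration (the function name, window size, and threshold are our own choices, not the detector from the slides):

```python
import numpy as np

def detect_corners(image, threshold=0.5):
    """Corner detection sketch: threshold both eigenvalues of the
    gradient structure matrix built over a 3x3 neighborhood."""
    # 1. Compute image gradients.
    gy, gx = np.gradient(image.astype(float))
    corners = []
    h, w = image.shape
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            # 2. Construct the 2x2 matrix from the point's neighborhood.
            wx = gx[i - 1:i + 2, j - 1:j + 2].ravel()
            wy = gy[i - 1:i + 2, j - 1:j + 2].ravel()
            M = np.array([[np.sum(wx * wx), np.sum(wx * wy)],
                          [np.sum(wx * wy), np.sum(wy * wy)]])
            # 3. Determine the two eigenvalues (ascending order).
            lam1, lam2 = np.linalg.eigvalsh(M)
            # 4. A corner requires both eigenvalues to be large.
            if lam1 > threshold and lam2 > threshold:
                corners.append((i, j))
    return corners
```

On a synthetic image containing one bright square, only the square's corner passes the two-eigenvalue test; pixels along an edge have one large and one near-zero eigenvalue.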



The Orientation Field

Corners are detected where both λ1 and λ2 are large



Good Image Features

  • What are we looking for?

    • Strong features

    • Invariant to changes (affine and perspective/occlusion)

    • Solve the problem of correspondence

      • Locate an object in multiple images (i.e. in video)

      • Track the path of the object, infer 3D structure, and recover object and camera movement



Scale-Invariant Feature Transform (SIFT)

  • Choosing features that are invariant to image scaling and rotation

  • Also, partially invariant to changes in illumination and 3D camera viewpoint



Invariance

  • Illumination

  • Scale

  • Rotation

  • Affine



Required Readings

  • David G. Lowe, "Object recognition from local scale-invariant features," ICCV 1999

  • David G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, 60, 2 (2004), pp. 91-110



Motivation for SIFT

  • Earlier Methods

    • Harris corner detector

      • Sensitive to changes in image scale

      • Finds locations in image with large gradients in two directions

    • No method was fully affine invariant

      • Although the SIFT approach is not fully affine invariant, it allows for considerable affine change

      • SIFT also allows for changes in 3D viewpoint



SIFT Algorithm Overview

  • Scale-space extrema detection

  • Keypoint localization

  • Orientation Assignment

  • Keypoint descriptor generation



Scale Space

  • Different scales are appropriate for describing different objects in the image, and we may not know the correct scale/size ahead of time.



Scale space (Cont.)

  • Looking for features (locations) that are stable (invariant) across all possible scale changes

    • use a continuous function of scale (scale space)

  • Which scale-space kernel will we use?

    • The Gaussian Function


Scale-Space of Image

  • The scale space of an image: L(x, y, σ) = G(x, y, σ) * I(x, y), where G(x, y, σ) is a variable-scale Gaussian and I(x, y) is the input image

  • To detect stable keypoint locations, find the scale-space extrema of the difference-of-Gaussian function: D(x, y, σ) = L(x, y, kσ) − L(x, y, σ)

  • Look familiar?

    - It's a bandpass filter!


    Difference of Gaussian

    • A = Convolve image with vertical and horizontal 1D Gaussians, σ = sqrt(2)

    • B = Convolve A with vertical and horizontal 1D Gaussians, σ = sqrt(2)

    • DoG (Difference of Gaussian) = A − B

    • So how do we deal with different scales? Downsample B with bilinear interpolation at a pixel spacing of 1.5 (a linear combination of 4 adjacent pixels), and repeat
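The blur / subtract / downsample recipe above can be sketched as follows. This is a simplified illustration, not the original implementation: the function name is our own, and we lean on scipy's `gaussian_filter` for the separable Gaussian blur:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def build_dog_pyramid(image, levels=3):
    """Repeatedly blur with sigma = sqrt(2), subtract, and downsample
    B at a pixel spacing of 1.5 with bilinear interpolation."""
    sigma = np.sqrt(2)
    pyramid = []
    current = image.astype(float)
    for _ in range(levels):
        a = gaussian_filter(current, sigma)   # A = blur(I)
        b = gaussian_filter(a, sigma)         # B = blur(A)
        pyramid.append(a - b)                 # DOG = A - B
        # Resample B on a grid with spacing 1.5, each output pixel a
        # bilinear combination of the 4 adjacent input pixels.
        h, w = b.shape
        ys = np.arange(0, h - 1, 1.5)
        xs = np.arange(0, w - 1, 1.5)
        i, j = ys.astype(int), xs.astype(int)
        u = (ys - i)[:, None]
        v = (xs - j)[None, :]
        current = ((1 - u) * (1 - v) * b[np.ix_(i, j)]
                   + (1 - u) * v * b[np.ix_(i, j + 1)]
                   + u * (1 - v) * b[np.ix_(i + 1, j)]
                   + u * v * b[np.ix_(i + 1, j + 1)])
    return pyramid
```

Each level shrinks by a factor of 1.5, so a 32x32 input yields DoG levels of roughly 32, 21, and 14 pixels on a side.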


    Difference of Gaussian Pyramid

    [Figure: the pyramid. Input Image → blur → A1 → blur → B1, with DOG1 = A1 − B1; downsample → A2 → blur → B2, with DOG2 = A2 − B2; downsample → A3 → blur → B3, with DOG3 = A3 − B3.]

    Other issues

    • Initial smoothing ignores the highest spatial frequencies of the image

      - Expand the input image by a factor of 2, using bilinear interpolation, prior to building the pyramid

    • How do we downsample with bilinear interpolation?


    Bilinear Filter

    • Weighted sum of four neighboring pixels

    [Figure: a sample point at fractional offsets (u, v) inside a pixel cell in (x, y) coordinates.]


    Bilinear Filter

    [Figure: a sample point inside the pixel cell with corners (i,j), (i,j+1), (i+1,j), (i+1,j+1).]

    Sampling at S(x,y):

    S(x,y) = a*b*S(i,j) + a*(1-b)*S(i+1,j) + (1-a)*b*S(i,j+1) + (1-a)*(1-b)*S(i+1,j+1)

    To compute this with fewer multiplications, interpolate along each row first, then across rows:

    Si = S(i,j) + (1-a)*(S(i,j+1) - S(i,j))

    Sj = S(i+1,j) + (1-a)*(S(i+1,j+1) - S(i+1,j))

    S(x,y) = Si + (1-b)*(Sj - Si)
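The bilinear sampling above translates directly into Python (a sketch; here a is the weight toward column j and b the weight toward row i, which is the convention that makes the two-step form agree with the expanded sum):

```python
import math

def sample_bilinear(S, x, y):
    """Bilinear sample of grid S at fractional (row, col) = (x, y),
    interpolating along each row first, then across rows."""
    i, j = int(math.floor(x)), int(math.floor(y))
    b = 1.0 - (x - i)  # weight toward row i
    a = 1.0 - (y - j)  # weight toward column j
    si = S[i][j] + (1 - a) * (S[i][j + 1] - S[i][j])
    sj = S[i + 1][j] + (1 - a) * (S[i + 1][j + 1] - S[i + 1][j])
    return si + (1 - b) * (sj - si)
```

The two-step form needs 3 multiplications instead of the 8 implied by the expanded sum.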




    Pyramid Example

    [Figure: three pyramid levels of an example image: A1, B1, DOG1; A2, B2, DOG2; A3, B3, DOG3.]



    Feature Detection

    • Find maxima and minima of scale space

    • For each point on a DOG level:

      • Compare to 8 neighbors at same level

      • If max/min, identify corresponding point at pyramid level below

      • Determine if the corresponding point is max/min of its 8 neighbors

      • If so, repeat at pyramid level above

    • Repeat for each DOG level

    • Those that remain are key points
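The max/min search above can be sketched as a comparison against all 26 neighbors across three DoG levels (a simplified illustration that assumes the three levels share one shape; in the real pyramid the corresponding point must be looked up across resolutions):

```python
import numpy as np

def find_extrema(dog_levels):
    """Keep points that are strict maxima or minima over the 3x3
    neighborhoods on their own DoG level and the levels below/above."""
    below, mid, above = dog_levels
    keys = []
    h, w = mid.shape
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            v = mid[i, j]
            neighbors = np.concatenate([
                mid[i - 1:i + 2, j - 1:j + 2].ravel(),
                below[i - 1:i + 2, j - 1:j + 2].ravel(),
                above[i - 1:i + 2, j - 1:j + 2].ravel()])
            others = np.delete(neighbors, 4)  # drop the center itself
            if v > others.max() or v < others.min():
                keys.append((i, j))
    return keys
```

A single spike on the middle level survives as the only key point; flat regions produce none.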


    Identifying Max/Min

    [Figure: a candidate point on DOG L compared against its 3x3 neighborhoods on DOG L-1, DOG L, and DOG L+1.]



    Refining Key List: Illumination

    • For all levels, use the “A” smoothed image to compute

      • Gradient Magnitude

    • Threshold gradient magnitudes:

      • Remove all key points whose gradient magnitude M(i,j) is less than 0.1 times the maximum gradient value

    • Motivation: low-contrast points are generally less reliable feature points than high-contrast ones



    Assigning Canonical Orientation

    • For each remaining key point:

      • Choose a surrounding N x N window at the DOG level where it was detected

    [Figure: an N x N window around a key point in the DOG image.]


    Assigning Canonical Orientation

    • For all levels, use the “A” smoothed image to compute

      • Gradient Orientation

    [Figure: gradient orientation and gradient magnitude computed from the Gaussian-smoothed image.]


    Assigning Canonical Orientation

    • Gradient magnitude weighted by a 2D Gaussian

    [Figure: gradient magnitude * 2D Gaussian = weighted magnitude.]


    Assigning Canonical Orientation

    • Accumulate the weighted magnitudes in a histogram based on gradient orientation

    • The histogram has 36 bins with 10° increments

    [Figure: weighted magnitudes binned by gradient orientation into the sum-of-weighted-magnitudes histogram.]


    Assigning Canonical Orientation

    • Identify the peak and assign its orientation and sum of magnitudes to the key point

    [Figure: the peak of the orientation histogram gives the key point's canonical orientation.]
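The orientation-assignment steps can be sketched as follows (a minimal illustration; the Gaussian width `sigma` and the patch-centered weighting are our own assumptions):

```python
import numpy as np

def canonical_orientation(patch, sigma=1.5):
    """Weight gradient magnitudes by a 2D Gaussian, accumulate them
    into a 36-bin (10 degree) orientation histogram, and return the
    peak bin's center angle plus the histogram."""
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ori = np.degrees(np.arctan2(gy, gx)) % 360.0
    # 2D Gaussian weighting centered on the patch.
    h, w = patch.shape
    yy, xx = np.mgrid[:h, :w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    weight = np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2 * sigma ** 2))
    hist = np.zeros(36)
    for m, o in zip((mag * weight).ravel(), ori.ravel()):
        hist[int(o // 10) % 36] += m
    peak = int(np.argmax(hist))
    return peak * 10.0 + 5.0, hist  # bin-center angle, histogram
```

A horizontal ramp puts all its mass in the first (0° to 10°) bin; a diagonal ramp peaks in the 40° to 50° bin.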


    Eliminating edges

    • The difference-of-Gaussian function will respond strongly along edges

      • So how can we get rid of these edge responses?

      • Similar to the Harris corner detector: we are not concerned with the actual eigenvalues, just the ratio of the two
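One standard way to test that eigenvalue ratio without computing the eigenvalues themselves comes from the required reading (Lowe 2004): compare Tr(H)²/Det(H) of the 2x2 Hessian of the DoG against (r+1)²/r. A sketch (r = 10 is the ratio Lowe suggests; the finite-difference Hessian is our own shorthand):

```python
import numpy as np

def is_edge_response(dog, i, j, r=10.0):
    """True if the point at (i, j) looks edge-like, i.e. the ratio
    of the principal curvatures of the DoG exceeds r."""
    # 2x2 Hessian from finite differences.
    dxx = dog[i, j + 1] + dog[i, j - 1] - 2 * dog[i, j]
    dyy = dog[i + 1, j] + dog[i - 1, j] - 2 * dog[i, j]
    dxy = (dog[i + 1, j + 1] - dog[i + 1, j - 1]
           - dog[i - 1, j + 1] + dog[i - 1, j - 1]) / 4.0
    tr = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:  # curvatures of opposite sign: reject
        return True
    return tr * tr / det >= (r + 1) ** 2 / r
```

A ridge (curvature in one direction only) is rejected as an edge, while an isolated blob (similar curvature in both directions) passes.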



    Local Image Description

    • SIFT keys each assigned:

      • Location

      • Scale (analogous to the level at which it was detected)

      • Orientation (assigned in previous canonical orientation steps)

    • Now: Describe local image region invariant to the above transformations


    SIFT Key Example

    [Figure: example SIFT keys detected in an image.]



    Local Image Description

    For each key point:

    • Identify the 8x8 neighborhood (from the DOG level where it was detected)

    • Align orientation to x-axis



    Local Image Description

    • Calculate gradient magnitude and orientation map

    • Weight by Gaussian



    Local Image Description

    • Calculate a histogram for each 4x4 region, with 8 bins for gradient orientation; tally the weighted gradient magnitudes



    Local Image Description

    • This histogram array is the image descriptor. (The example here is a vector of length 4*8 = 32. Lowe's recommendation: a 128-vector from a 16x16 neighborhood.)
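The descriptor construction can be sketched given precomputed (already Gaussian-weighted) gradient magnitudes and orientations for the neighborhood. This is a simplified illustration; the parameter names are our own:

```python
import numpy as np

def sift_descriptor(mag, ori, grid=2, cell=4, bins=8):
    """Concatenate per-cell orientation histograms: grid x grid cells
    of cell x cell pixels, each tallied into `bins` orientation bins.
    grid=2, cell=4 (an 8x8 window) gives the 32-vector example;
    grid=4 on a 16x16 neighborhood gives the recommended 128-vector."""
    desc = []
    for gi in range(grid):
        for gj in range(grid):
            hist = np.zeros(bins)
            m = mag[gi * cell:(gi + 1) * cell, gj * cell:(gj + 1) * cell]
            o = ori[gi * cell:(gi + 1) * cell, gj * cell:(gj + 1) * cell]
            for mm, oo in zip(m.ravel(), o.ravel()):
                hist[int(oo // (360.0 / bins)) % bins] += mm
            desc.extend(hist)
    return np.array(desc)
```

With uniform magnitudes and all orientations at 0°, every cell's first bin receives all 16 contributions, which makes the bookkeeping easy to check.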



    Applications: Image Matching

    • Find all key points identified in source and target image

      • Each key point will have a 2D location, scale, and orientation, as well as an invariant descriptor vector

    • For each key point in the source image, search for corresponding SIFT features in the target image.

    • Find the transformation between two images using epipolar geometry constraints or affine transformation.


    Image Matching via SIFT Features

    Feature detection


    Image Matching via SIFT Features

    • Image matching via nearest-neighbor search

      - If the ratio of the closest distance to the 2nd-closest distance is greater than 0.8, reject as a false match

    • Remove outliers using epipolar line constraints.
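The nearest-neighbor search with the ratio test above can be sketched as follows (a brute-force illustration; real systems use approximate nearest-neighbor structures, and the epipolar outlier removal is not shown):

```python
import numpy as np

def match_features(desc_src, desc_tgt, ratio=0.8):
    """For each source descriptor, accept its nearest target descriptor
    only if the closest distance is less than `ratio` times the
    2nd-closest; otherwise reject as a false match."""
    matches = []
    for i, d in enumerate(desc_src):
        dists = np.linalg.norm(desc_tgt - d, axis=1)
        order = np.argsort(dists)
        nearest, second = int(order[0]), int(order[1])
        if dists[nearest] < ratio * dists[second]:
            matches.append((i, nearest))
    return matches
```

A descriptor close to exactly one target passes; a descriptor equidistant from two targets is ambiguous, fails the ratio test, and is dropped.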


    Image Matching via SIFT Features

    [Figure: matched SIFT features between two images.]



    Summary

    • SIFT features are reasonably invariant to rotation, scaling, and illumination changes.

    • We can use them for image matching and object recognition among other things.

    • Efficient online matching and recognition can be performed in real time.

