
### Scale-Invariant Feature Transform (SIFT)

Jinxiang Chai

Review: Corner Detection

1. Compute image gradients

2. Construct the matrix from it and its neighborhood values

3. Determine the two eigenvalues λ(i,j) = [λ1, λ2].

4. If both λ1 and λ2 are large, we have a corner

The Orientation Field

Corners are detected where both λ1 and λ2 are large
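Steps 2-4 above can be sketched in a few lines. `corner_eigenvalues` is a hypothetical helper, not from the slides: it accumulates the 2x2 structure matrix from gradient samples over a neighborhood and returns its eigenvalues in closed form.

```python
import math

def corner_eigenvalues(gxs, gys):
    """Eigenvalues of the 2x2 structure matrix
    M = [[sum(gx^2), sum(gx*gy)], [sum(gx*gy), sum(gy^2)]]
    accumulated from gradient samples (gx, gy) over a neighborhood."""
    a = sum(gx * gx for gx in gxs)
    b = sum(gx * gy for gx, gy in zip(gxs, gys))
    c = sum(gy * gy for gy in gys)
    tr, det = a + c, a * c - b * b
    disc = math.sqrt(max(tr * tr / 4 - det, 0.0))
    return tr / 2 + disc, tr / 2 - disc  # lambda1 >= lambda2

# A corner: strong gradients in both directions -> both eigenvalues large.
l1, l2 = corner_eigenvalues([3, 0, 3, 0], [0, 3, 0, 3])  # (18.0, 18.0)
```

An edge gives one large and one near-zero eigenvalue, e.g. `corner_eigenvalues([2, 0], [0, 0])` returns `(4.0, 0.0)`.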

Good Image Features

- What are we looking for?
- Strong features
- Invariant to changes (affine and perspective/occlusion)
- Solve the problem of correspondence
- Locate an object in multiple images (i.e. in video)
- Track the path of the object, infer 3D structures, and estimate object and camera movement

Scale Invariant Feature Transform (SIFT)

- Choosing features that are invariant to image scaling and rotation
- Also, partially invariant to changes in illumination and 3D camera viewpoint

Invariance

- Illumination
- Scale
- Rotation
- Affine

Required Readings

- David G. Lowe, "Object recognition from local scale-invariant features," Proc. ICCV 1999
- David G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, 60, 2 (2004), pp. 91-110

Motivation for SIFT

- Earlier Methods
- Harris corner detector
- Sensitive to changes in image scale
- Finds locations in image with large gradients in two directions

- No method was fully affine invariant
- Although the SIFT approach is not fully invariant it allows for considerable affine change
- SIFT also allows for changes in 3D viewpoint


SIFT Algorithm Overview

- Scale-space extrema detection
- Keypoint localization
- Orientation Assignment
- Generation of keypoint descriptors.

Scale Space

- Different scales are appropriate for describing different objects in the image, and we may not know the correct scale/size ahead of time.

Scale space (Cont.)

- Looking for features (locations) that are stable (invariant) across all possible scale changes
- use a continuous function of scale (scale space)

- Which scale-space kernel will we use?
- The Gaussian Function

Scale-Space of Image

- The scale space of an image is L(x, y, σ) = G(x, y, σ) * I(x, y), the convolution of a variable-scale Gaussian G(x, y, σ) with the input image I(x, y)
- To detect stable keypoint locations, find the scale-space extrema of the difference-of-Gaussian function D(x, y, σ) = (G(x, y, kσ) − G(x, y, σ)) * I(x, y) = L(x, y, kσ) − L(x, y, σ)

Look familiar? The difference of Gaussians is a bandpass filter!

Difference of Gaussian

- A = Convolve image with vertical and horizontal 1D Gaussians, σ = sqrt(2)
- B = Convolve A with vertical and horizontal 1D Gaussians, σ = sqrt(2)
- DOG (Difference of Gaussian) = A − B
- So how to deal with different scales? Downsample B with bilinear interpolation at a pixel spacing of 1.5 (a linear combination of 4 adjacent pixels) and repeat the process
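The A/B/DOG construction can be sketched with a separable Gaussian blur. `blur_separable` and `dog_level` are hypothetical helper names, and the edge handling (clamping) is an assumption not specified in the slides:

```python
import math

def gaussian_kernel_1d(sigma):
    """Discrete 1D Gaussian, truncated at 3 sigma and normalized."""
    radius = max(1, int(3 * sigma))
    k = [math.exp(-(x * x) / (2 * sigma * sigma)) for x in range(-radius, radius + 1)]
    s = sum(k)
    return [v / s for v in k]

def blur_separable(img, sigma):
    """Convolve with horizontal then vertical 1D Gaussians (edge-clamped)."""
    k = gaussian_kernel_1d(sigma)
    r = len(k) // 2
    h, w = len(img), len(img[0])
    clamp = lambda v, hi: min(max(v, 0), hi - 1)
    tmp = [[sum(k[t + r] * img[y][clamp(x + t, w)] for t in range(-r, r + 1))
            for x in range(w)] for y in range(h)]
    return [[sum(k[t + r] * tmp[clamp(y + t, h)][x] for t in range(-r, r + 1))
             for x in range(w)] for y in range(h)]

def dog_level(img, sigma=math.sqrt(2)):
    """A = blur(img), B = blur(A), DOG = A - B, per the steps above."""
    A = blur_separable(img, sigma)
    B = blur_separable(A, sigma)
    dog = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]
    return A, B, dog
```

On a constant image the normalized, edge-clamped blur is the identity, so the DOG response is zero everywhere, as expected for a bandpass filter.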

Difference of Gaussian Pyramid

[Figure: the DoG pyramid. The input image is blurred to A1 and blurred again to B1, with DOG1 = A1 − B1; B1 is downsampled to start the next octave (A2, B2, DOG2 = A2 − B2), and likewise for A3, B3, DOG3.]

Other issues

- Initial smoothing ignores the highest spatial frequencies of the image
- Fix: expand the input image by a factor of 2, using bilinear interpolation, prior to building the pyramid
- How to do downsampling with bilinear interpolation?

Bilinear Filter

[Figure: the sample point S(x, y) lies inside the pixel square with corners (i, j), (i+1, j), (i, j+1), and (i+1, j+1); a and b are its fractional offsets along the two axes.]

Sampling at S(x, y), with a = x − i and b = y − j:

S(x,y) = (1-a)*(1-b)*S(i,j) + a*(1-b)*S(i+1,j) + (1-a)*b*S(i,j+1) + a*b*S(i+1,j+1)

To optimize the above, interpolate along one axis first and then blend along the other:

Si = S(i,j) + a*(S(i+1,j) - S(i,j))

Sj = S(i,j+1) + a*(S(i+1,j+1) - S(i,j+1))

S(x,y) = Si + b*(Sj - Si)
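The two-step form translates directly to code. A minimal sketch, assuming `img` is indexed as `img[i][j]` with 0 ≤ x < len(img)−1 and 0 ≤ y < len(img[0])−1:

```python
def bilinear_sample(img, x, y):
    """Sample img at fractional coordinates (x, y): interpolate along the
    i-axis at columns j and j+1, then blend the two results along j."""
    i, j = int(x), int(y)
    a, b = x - i, y - j  # fractional offsets
    si = img[i][j] + a * (img[i + 1][j] - img[i][j])
    sj = img[i][j + 1] + a * (img[i + 1][j + 1] - img[i][j + 1])
    return si + b * (sj - si)

# Center of a 2x2 patch is the average of its four corners.
val = bilinear_sample([[10.0, 20.0], [30.0, 40.0]], 0.5, 0.5)  # 25.0
```

Evaluating this at spacing 1.5 (e.g. x = 0, 1.5, 3, …) gives exactly the "linear combination of 4 adjacent pixels" used when downsampling B.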

Feature Detection

- Find maxima and minima of scale space
- For each point on a DOG level:
- Compare to 8 neighbors at same level
- If max/min, identify corresponding point at pyramid level below
- Determine if the corresponding point is max/min of its 8 neighbors
- If so, repeat at pyramid level above

- Repeat for each DOG level
- Those that remain are key points
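The per-level 8-neighbor comparison in the cascade above can be sketched as follows; `is_extremum_2d` is a hypothetical helper and assumes (r, c) is an interior pixel of the DOG level:

```python
def is_extremum_2d(dog, r, c):
    """Check whether dog[r][c] is a strict max or min of its 8 neighbors
    on the same DoG level (the first test in the cascade above)."""
    v = dog[r][c]
    neighbors = [dog[r + dr][c + dc]
                 for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                 if not (dr == 0 and dc == 0)]
    return all(v > n for n in neighbors) or all(v < n for n in neighbors)
```

Only points that pass this test at their own level, at the level below, and at the level above survive as key points.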

Refining Key List: Illumination

- For all levels, use the "A" smoothed image to compute the gradient magnitude
- M(i,j) = sqrt( (A(i,j) − A(i+1,j))² + (A(i,j) − A(i,j+1))² )

- Threshold gradient magnitudes:
- Remove all key points with M(i,j) less than 0.1 times the maximum gradient value

- Motivation: low contrast is generally less reliable than high contrast for feature points
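The magnitude computation and contrast threshold can be sketched as below; the pixel-difference formula follows Lowe's 1999 paper, and the helper names are hypothetical:

```python
import math

def gradient_magnitude(A, i, j):
    """M(i,j) from the smoothed image A, using pixel differences."""
    return math.sqrt((A[i][j] - A[i + 1][j]) ** 2 + (A[i][j] - A[i][j + 1]) ** 2)

def filter_low_contrast(keys, A, frac=0.1):
    """Drop key points whose magnitude is below frac * max magnitude."""
    mags = {k: gradient_magnitude(A, *k) for k in keys}
    cutoff = frac * max(mags.values())
    return [k for k in keys if mags[k] >= cutoff]
```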

Assigning Canonical Orientation

- For each remaining key point:
- Choose surrounding N x N window at DOG level it was detected

DOG image

Assigning Canonical Orientation

- For all levels, use the "A" smoothed image to compute the gradient orientation
- R(i,j) = atan2( A(i,j) − A(i+1,j), A(i,j+1) − A(i,j) )

[Figure: gradient orientation and gradient magnitude are computed from the Gaussian-smoothed image.]

Assigning Canonical Orientation

- Gradient magnitude weighted by a 2D Gaussian

[Figure: Weighted Magnitude = Gradient Magnitude × 2D Gaussian.]

Assigning Canonical Orientation

- Accumulate the weighted magnitudes in a histogram based on gradient orientation
- The histogram has 36 bins with 10° increments
- Identify the histogram peak and assign its orientation and the sum of magnitudes to the key point

[Figure: weighted magnitudes summed into a 36-bin orientation histogram; the peak bin gives the canonical orientation.]
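The 36-bin histogram and peak selection can be sketched as below. `canonical_orientation` is a hypothetical helper; `mags` are the Gaussian-weighted magnitudes and `thetas` the gradient orientations in radians over the key point's window:

```python
import math

def canonical_orientation(mags, thetas):
    """Accumulate weighted gradient magnitudes into a 36-bin (10-degree)
    orientation histogram; return the peak bin's center angle in degrees
    and the summed magnitude in that bin."""
    hist = [0.0] * 36
    for m, t in zip(mags, thetas):
        b = int((math.degrees(t) % 360) // 10) % 36
        hist[b] += m
    peak = max(range(36), key=lambda b: hist[b])
    return peak * 10 + 5, hist[peak]
```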

Eliminating edges

- Difference-of-Gaussian function will be strong along edges
- So how can we get rid of these edges?

Eliminating edges

- Difference-of-Gaussian function will be strong along edges
- Similar to the Harris corner detector, examine the eigenvalues of the 2x2 Hessian of the DOG function at the key point
- We are not concerned with the actual eigenvalues, just the ratio of the two: reject key points whose ratio of principal curvatures is too large (Lowe rejects when Tr(H)²/Det(H) ≥ (r+1)²/r, with r = 10)
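A sketch of the curvature-ratio test, following Lowe (2004); the second differences dxx, dyy, dxy of the DOG image at the key point are assumed precomputed:

```python
def passes_edge_test(dxx, dyy, dxy, r=10.0):
    """Hessian-ratio edge test: keep a key point only if the ratio of its
    principal curvatures is below r, i.e. Tr(H)^2 / Det(H) < (r+1)^2 / r."""
    tr = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:  # curvatures of opposite sign: discard as edge-like
        return False
    return tr * tr / det < (r + 1) ** 2 / r
```

Only the ratio matters, so this avoids computing the eigenvalues explicitly: the trace and determinant are enough.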

Local Image Description

- SIFT keys each assigned:
- Location
- Scale (analogous to level it was detected)
- Orientation (assigned in previous canonical orientation steps)

- Now: Describe local image region invariant to the above transformations

Local Image Description

For each key point:

- Identify 8x8 neighborhood (from DOG level it was detected)
- Align orientation to x-axis

Local Image Description

- Calculate gradient magnitude and orientation map
- Weight by Gaussian

Local Image Description

- Calculate histogram of each 4x4 region. 8 bins for gradient orientation. Tally weighted gradient magnitude.

Local Image Description

- This histogram array is the image descriptor. (The example here is a vector of length 4×8 = 32: four 4x4 regions with 8 orientation bins each. Lowe recommends a 16x16 neighborhood, i.e., sixteen 4x4 regions, giving a 128-element vector.)
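The 4x4-region histogram layout can be sketched as below; `sift_descriptor` is a hypothetical helper, and rotation alignment and Gaussian weighting are assumed already applied to `mag`/`theta`:

```python
import math

def sift_descriptor(mag, theta, cells=2, bins=8):
    """Split a (4*cells) x (4*cells) window of gradient magnitudes `mag` and
    orientations `theta` (radians) into 4x4 cells and build one 8-bin
    orientation histogram per cell. cells=2 gives the 32-vector example;
    cells=4 on a 16x16 window gives Lowe's 128-vector."""
    desc = []
    for cr in range(cells):
        for cc in range(cells):
            hist = [0.0] * bins
            for r in range(cr * 4, cr * 4 + 4):
                for c in range(cc * 4, cc * 4 + 4):
                    b = int((theta[r][c] % (2 * math.pi)) / (2 * math.pi) * bins) % bins
                    hist[b] += mag[r][c]
            desc.extend(hist)
    return desc
```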

Applications: Image Matching

- Find all key points identified in source and target image
- Each key point will have 2d location, scale and orientation, as well as invariant descriptor vector

- For each key point in source image, search corresponding SIFT features in target image.
- Find the transformation between two images using epipolar geometry constraints or affine transformation.

Image matching via SIFT features

Feature detection

Image matching via SIFT features

- Image matching via nearest-neighbor search on the descriptor vectors
- If the ratio of the closest distance to the 2nd-closest distance is greater than 0.8, reject the match as ambiguous
- Remove outliers using epipolar line constraints
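The nearest-neighbor search with the 0.8 ratio test can be sketched as a brute-force matcher; `match_features` is a hypothetical helper, and in practice a k-d tree (best-bin-first) search would replace the inner sort:

```python
def match_features(src, dst, ratio=0.8):
    """For each source descriptor, find its two nearest neighbors in dst
    (Euclidean distance) and keep the match only if it passes the ratio test."""
    matches = []
    for i, d in enumerate(src):
        dists = sorted((sum((a - b) ** 2 for a, b in zip(d, e)) ** 0.5, j)
                       for j, e in enumerate(dst))
        (d1, j1), (d2, _) = dists[0], dists[1]
        if d2 > 0 and d1 / d2 <= ratio:  # unambiguous nearest neighbor
            matches.append((i, j1))
    return matches
```

Rejecting matches whose two best candidates are nearly equidistant discards most false matches at the cost of a few correct ones, which is why the surviving matches are reliable enough for the epipolar-constraint step.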

Summary

- SIFT features are reasonably invariant to rotation, scaling, and illumination changes.
- We can use them for image matching and object recognition among other things.
- Efficient on-line matching and recognition can be performed in real time
