Feature Detection and Descriptors


Presentation Transcript


  1. Feature Detection and Descriptors Charles Hatt, Nisha Kiran, Lulu Zhang

  2. Overview • Background • Motivation • Timeline and related work • SIFT / SIFT Extensions • PCA-SIFT • GLOH • DAISY • Performance Evaluation

  3. Scope • We cover local descriptors • Basic procedure: find patches or key points, compute a descriptor, match to other points (see the sketch below) • Local vs. global: local descriptors are robust to occlusion and clutter and stable under image transforms
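
A minimal sketch of this find/describe/match procedure, assuming OpenCV is available (opencv-python 4.4 or later, where SIFT ships in the main package); the two image filenames are hypothetical placeholders:

```python
# Detect keypoints, compute descriptors, and match them between two images.
import cv2

img1 = cv2.imread("scene_a.png", cv2.IMREAD_GRAYSCALE)   # placeholder filenames
img2 = cv2.imread("scene_b.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, desc1 = sift.detectAndCompute(img1, None)   # keypoints + 128-D descriptors
kp2, desc2 = sift.detectAndCompute(img2, None)

# Match each descriptor to its two nearest neighbours and keep only matches
# that pass Lowe's ratio test (a simple distinctiveness check).
matcher = cv2.BFMatcher(cv2.NORM_L2)
good = [m for m, n in matcher.knnMatch(desc1, desc2, k=2)
        if m.distance < 0.75 * n.distance]
print(f"{len(good)} putative correspondences")
```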

  4. Color Histogram: A Global Descriptor

  5. Motivation

  6. Object Recognition

  7. Robot Self-Localization • In the DARPA Urban Challenge, cars can recognize four-way stops

  8. Image Retrieval

  9. Image Retrieval

  10. Tracking

  11. Things We Did in Class • Image stitching • Image alignment

  12. Good Descriptors are Invariant to

  13. Timeline • Cross correlation • Canny Edge Detector 1986 • Harris Corner Detector 1988 • Moment Invariants 1991 • SIFT 1999 • Shape Context 2002 • PCA-SIFT 2004 • Spin Images 2005 • GLOH 2005 • Daisy 2008

  14. Cross Correlation

  15. Cross Correlation

  16. Moment Invariants • a = degree • p + q = order • Id(x, y) = image gradient in direction d, where d is horizontal or vertical • Invariant to convolution (blurring) and affine transforms; moments of any order or degree can be computed • Higher-order moments are sensitive to small photometric distortions
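
The moment formula itself did not survive the transcript. A form consistent with the symbol definitions above (the generalized intensity moments used in descriptor-evaluation work such as Mikolajczyk and Schmid's) would be, over an image region Omega:

```latex
% Generalized moment of order p+q and degree a, computed from the gradient
% image I_d (d = horizontal or vertical). This is a reconstruction consistent
% with the slide's definitions, not a verbatim copy of the missing formula.
M^{a}_{pq} = \iint_{\Omega} x^{p}\, y^{q}\, \big[ I_d(x, y) \big]^{a} \, dx \, dy
```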

  17. Spin Images (Johnson 97)

  18. Spin Images (Johnson 97)

  19. Spin Images (Lazebnik 05) • The patch is intensity-normalized, so the descriptor is invariant to intensity changes; the construction is also rotation invariant • The histogram typically uses 10 bins for intensity and 5 bins for distance from the center • The descriptor therefore has 50 elements
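
A rough sketch of an intensity-domain spin image under those choices (5 distance bins x 10 intensity bins = 50 elements); the soft binning of the original formulation is omitted, and the 32x32 patch size is an arbitrary example:

```python
# Spin image: 2-D histogram over (distance from patch centre, normalized intensity).
import numpy as np

def spin_image(patch, n_dist=5, n_int=10):
    patch = patch.astype(np.float64)
    # Normalize intensities to [0, 1] for invariance to intensity changes.
    patch = (patch - patch.min()) / (np.ptp(patch) + 1e-9)
    h, w = patch.shape
    yy, xx = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    r = np.hypot(yy - cy, xx - cx)
    r = r / (r.max() + 1e-9)                      # distance from centre, in [0, 1]
    hist, _, _ = np.histogram2d(r.ravel(), patch.ravel(),
                                bins=[n_dist, n_int], range=[[0, 1], [0, 1]])
    hist /= hist.sum()                            # normalize to a distribution
    return hist.ravel()                           # 50-element descriptor

print(spin_image(np.random.rand(32, 32)).shape)   # (50,)
```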

  20. Shape Context

  21. Scale Invariant Features

  22. Characteristics of good features • Repeatability: the same feature can be found in several images despite geometric and photometric transformations • Saliency: each feature has a distinctive description • Compactness and efficiency: many fewer features than image pixels • Locality: features occupy a very small area of the image, so they are robust to clutter and occlusion

  23. Good features - Corners in the image • Harris corner detector • Key idea: in the region around a corner, the image gradient has two or more dominant directions • Invariant to rotation • Partially invariant to affine intensity change: I -> I + b is handled (only derivatives are used), but I -> a*I is not • Not invariant to scale
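
As a concrete illustration, a minimal sketch of the Harris response map: smooth the products of image gradients to form the second-moment matrix M, then evaluate R = det(M) - k*trace(M)^2. The smoothing scale and k are typical but arbitrary choices:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def harris_response(img, sigma=1.5, k=0.04):
    img = img.astype(np.float64)
    Ix = sobel(img, axis=1)                 # horizontal gradient
    Iy = sobel(img, axis=0)                 # vertical gradient
    # Entries of the second-moment matrix, averaged over a Gaussian window.
    Sxx = gaussian_filter(Ix * Ix, sigma)
    Syy = gaussian_filter(Iy * Iy, sigma)
    Sxy = gaussian_filter(Ix * Iy, sigma)
    det = Sxx * Syy - Sxy ** 2
    trace = Sxx + Syy
    return det - k * trace ** 2             # large positive values at corners
```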

  24. Not invariant to scale: at a fine scale, all points along the contour are classified as edges; only at a coarser scale is the corner detected

  25. Scale Invariant Detection • Consider regions (e.g. circles) of different sizes around a point • Regions of corresponding sizes will look the same in both images

  26. Scale invariant feature detection • Goal: independently detect corresponding regions in scaled versions of the same image • Need scale selection mechanism for finding characteristic region size that is covariant with the image transformation

  27. Recall: Edge detection • Convolution with derivative of Gaussian => Edge at maximum of derivative • Convolution with second derivative of Gaussian => Edge at zero crossing

  28. [Figure: a 1-D signal f with a step edge, the derivative of Gaussian dg/dx, and the response f*dg/dx; the edge lies at the maximum of f*dg/dx]

  29. [Figure: the same signal f and the second derivative of Gaussian (Laplacian); the edge lies at the zero crossing of the second-derivative response]
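
A small numerical check of those two facts, assuming SciPy is available (the step location and sigma are arbitrary examples):

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

f = np.zeros(100)
f[50:] = 1.0                                    # step edge at index 50

g1 = gaussian_filter1d(f, sigma=3, order=1)     # f * dg/dx
g2 = gaussian_filter1d(f, sigma=3, order=2)     # f * d^2g/dx^2

print("edge at", np.argmax(np.abs(g1)))         # maximum of |f*dg/dx|, near index 50
print(g2[47], g2[52])                           # opposite signs: zero crossing at the edge
```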

  30. Scale selection • Define the characteristic scale as the scale that produces the peak of the Laplacian response
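
A hedged sketch of that idea using SciPy's Laplacian-of-Gaussian filter: evaluate the scale-normalized response at a pixel over a range of sigmas and keep the sigma that maximizes its magnitude (the sigma range is an arbitrary example):

```python
import numpy as np
from scipy.ndimage import gaussian_laplace

def characteristic_scale(img, y, x, sigmas=np.linspace(1.0, 12.0, 23)):
    img = img.astype(np.float64)
    # sigma^2 * LoG is the scale-normalized Laplacian response.
    responses = [s ** 2 * gaussian_laplace(img, s)[y, x] for s in sigmas]
    return sigmas[int(np.argmax(np.abs(responses)))]
```

For a circular blob of radius r, the strongest normalized response is expected near sigma = r / sqrt(2), which is what makes the selected scale track the size of the underlying structure.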

  31. SIFT stages • Scale space extrema detection • Keypoint localization • Orientation assignment • Keypoint descriptor

  32. Scale space extrema detection • Approximate the Laplacian of Gaussian with the Difference of Gaussians (DoG) • Computationally less intensive • Invariant to scale • Images of the same size (stacked vertically in the pyramid) form an octave; each octave contains a fixed number of progressively blurred images

  33. Maxima/minima selection in DoG • SIFT: find the local extrema of the difference of Gaussian in both space and scale
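
A rough, unoptimized sketch of one octave of DoG construction and 26-neighbour extremum detection; the number of blur levels, the base sigma, and the contrast threshold are illustrative choices, and intensities are assumed to lie in [0, 1]:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def dog_extrema(img, n_levels=5, sigma0=1.6, k=2 ** 0.5, thresh=0.03):
    img = img.astype(np.float64)
    blurred = [gaussian_filter(img, sigma0 * k ** i) for i in range(n_levels)]
    dog = np.stack([b2 - b1 for b1, b2 in zip(blurred[:-1], blurred[1:])])
    keypoints = []
    # A sample is kept if it beats all 26 neighbours in space and scale.
    for s in range(1, dog.shape[0] - 1):
        for y in range(1, dog.shape[1] - 1):
            for x in range(1, dog.shape[2] - 1):
                v = dog[s, y, x]
                cube = dog[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]
                if abs(v) > thresh and (v >= cube.max() or v <= cube.min()):
                    keypoints.append((s, y, x))
    return dog, keypoints
```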

  34. Keypoint localization • Many keypoints are detected • Sub-pixel localization: accurate location of keypoints • Eliminating points with low contrast • Eliminating edge responses

  35. Sub-pixel localization
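
The slide only names this step; in Lowe's formulation the refinement fits a 3-D quadratic (Taylor expansion) to the DoG values around the detected extremum and solves for the offset that zeroes the derivative, offset = -H^{-1} * gradient. A finite-difference sketch, with D the DoG stack indexed as [scale, y, x]:

```python
import numpy as np

def refine_offset(D, s, y, x):
    # Finite-difference gradient of D at the extremum.
    g = 0.5 * np.array([D[s+1, y, x] - D[s-1, y, x],
                        D[s, y+1, x] - D[s, y-1, x],
                        D[s, y, x+1] - D[s, y, x-1]])
    # Finite-difference Hessian of D at the extremum.
    H = np.empty((3, 3))
    H[0, 0] = D[s+1, y, x] - 2 * D[s, y, x] + D[s-1, y, x]
    H[1, 1] = D[s, y+1, x] - 2 * D[s, y, x] + D[s, y-1, x]
    H[2, 2] = D[s, y, x+1] - 2 * D[s, y, x] + D[s, y, x-1]
    H[0, 1] = H[1, 0] = 0.25 * (D[s+1, y+1, x] - D[s+1, y-1, x]
                                - D[s-1, y+1, x] + D[s-1, y-1, x])
    H[0, 2] = H[2, 0] = 0.25 * (D[s+1, y, x+1] - D[s+1, y, x-1]
                                - D[s-1, y, x+1] + D[s-1, y, x-1])
    H[1, 2] = H[2, 1] = 0.25 * (D[s, y+1, x+1] - D[s, y+1, x-1]
                                - D[s, y-1, x+1] + D[s, y-1, x-1])
    return -np.linalg.solve(H, g)   # (ds, dy, dx); large offsets mean "move the keypoint"
```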

  36. Eliminating extra keypoints • If the magnitude of the DoG response at the candidate pixel (the one being checked for a maximum/minimum) is below a threshold, the keypoint is rejected as low contrast • Edge responses are removed with an idea similar to the Harris corner detector
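
A sketch of both rejection tests: a contrast threshold on |D| and an edge test on the ratio of principal curvatures of the 2x2 spatial Hessian (the Harris-like criterion). The thresholds follow the values suggested in Lowe's paper; D is again the DoG stack:

```python
import numpy as np

def keep_keypoint(D, s, y, x, contrast_thresh=0.03, r=10.0):
    if abs(D[s, y, x]) < contrast_thresh:
        return False                                   # low contrast
    # 2x2 spatial Hessian of the DoG at the keypoint (finite differences).
    dxx = D[s, y, x+1] - 2 * D[s, y, x] + D[s, y, x-1]
    dyy = D[s, y+1, x] - 2 * D[s, y, x] + D[s, y-1, x]
    dxy = 0.25 * (D[s, y+1, x+1] - D[s, y+1, x-1]
                  - D[s, y-1, x+1] + D[s, y-1, x-1])
    tr, det = dxx + dyy, dxx * dyy - dxy ** 2
    if det <= 0 or tr ** 2 / det >= (r + 1) ** 2 / r:
        return False                                   # edge-like: curvatures too unequal
    return True
```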

  37. Until now, we have seen scale invariance • Now, let’s make the keypoint rotation invariant

  38. Orientation assignment • Key idea: collect gradient directions and magnitudes around each keypoint, find the most prominent orientations in that region, and assign them to the keypoint • The size of the collection region depends on the scale: the bigger the scale, the bigger the region

  39. Compute the gradient magnitude and orientation for each pixel, then construct a histogram • The peak of the histogram is taken as the keypoint orientation
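
A minimal sketch of that histogram, assuming a 36-bin layout; SIFT additionally Gaussian-weights the contributions and interpolates the peak, which is omitted here:

```python
import numpy as np

def dominant_orientation(patch, n_bins=36):
    patch = patch.astype(np.float64)
    gy, gx = np.gradient(patch)
    mag = np.hypot(gx, gy)
    ang = np.degrees(np.arctan2(gy, gx)) % 360.0
    # Magnitude-weighted orientation histogram over the patch.
    hist, edges = np.histogram(ang, bins=n_bins, range=(0, 360), weights=mag)
    peak = int(np.argmax(hist))
    return 0.5 * (edges[peak] + edges[peak + 1])   # keypoint orientation in degrees
```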

  40. Keypoint descriptor

  41. Based on 16*16 patches • 4*4 subregions • 8 bins in each subregion • 4*4*8=128 dimensions in total
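
A simplified sketch of that 4*4*8 layout: split a 16x16 gradient patch into 4x4 cells and build a magnitude-weighted 8-bin orientation histogram in each. The Gaussian weighting, trilinear interpolation, and clip-and-renormalize steps of the full descriptor are omitted:

```python
import numpy as np

def sift_like_descriptor(patch16):
    assert patch16.shape == (16, 16)
    gy, gx = np.gradient(patch16.astype(np.float64))
    mag = np.hypot(gx, gy)
    ang = np.degrees(np.arctan2(gy, gx)) % 360.0
    desc = []
    for cy in range(0, 16, 4):               # 4x4 grid of 4x4-pixel subregions
        for cx in range(0, 16, 4):
            h, _ = np.histogram(ang[cy:cy+4, cx:cx+4], bins=8, range=(0, 360),
                                weights=mag[cy:cy+4, cx:cx+4])
            desc.append(h)
    desc = np.concatenate(desc)              # 16 cells * 8 bins = 128 dimensions
    return desc / (np.linalg.norm(desc) + 1e-9)

print(sift_like_descriptor(np.random.rand(16, 16)).shape)   # (128,)
```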

  42. PCA-SIFT • PCA-SIFT is a modification of SIFT that changes how the keypoint descriptors are constructed • Basic idea: use PCA (Principal Component Analysis) to represent the gradient patch around the keypoint • PCA-SIFT stages: computing the projection matrix, then constructing the PCA-SIFT descriptor

  43. Computing projection matrix • Select a representative set of pictures and detect all keypoints in these pictures • For each keypoint • Extract an image patch around it with size 41*41 pixels • Calculate horizontal and vertical gradients, resulting in a vector of size 39*39*2 = 3042 • Put all these vectors into a k*3042 matrix A where k is the number of keypoints detected • Calculate the covariance matrix of A

  44. Contd. • Compute the eigenvectors and eigenvalues of cov(A) • Select the first n eigenvectors; the projection matrix is an n*3042 matrix composed of these eigenvectors • The projection matrix is computed only once and saved
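
A NumPy sketch of the procedure described in the two slides above; the choice n = 20 is a value commonly reported for PCA-SIFT, not something fixed by the slides:

```python
import numpy as np

def compute_projection(A, n=20):
    # A: k x 3042 matrix, one gradient vector per training keypoint.
    mean = A.mean(axis=0)
    cov = np.cov(A - mean, rowvar=False)        # 3042 x 3042 covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    top = eigvecs[:, ::-1][:, :n]               # n leading eigenvectors
    return mean, top.T                          # projection matrix: n x 3042
```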

  45. Dimension reduction through PCA The image patches do not span the entire space of pixel values, nor even the smaller space of patches from natural images; they form a highly restricted set of patches that passed the first three stages of SIFT.

  46. Constructing the PCA-SIFT descriptor • Input: keypoint location, scale, and orientation • Extract a 41*41 patch around the keypoint at the given scale, rotated to its orientation • Calculate the 39*39 horizontal and vertical gradients, giving a vector of size 3042 • Multiply this vector by the precomputed n*3042 projection matrix • The result is a PCA-SIFT descriptor of size n
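
A sketch of that projection step, assuming the patch has already been extracted at the right scale and orientation, and that mean and proj come from the compute_projection sketch above:

```python
import numpy as np

def pca_sift_descriptor(patch41, mean, proj):
    p = patch41.astype(np.float64)
    gx = p[1:-1, 2:] - p[1:-1, :-2]     # horizontal gradients, 39 x 39
    gy = p[2:, 1:-1] - p[:-2, 1:-1]     # vertical gradients,   39 x 39
    v = np.concatenate([gx.ravel(), gy.ravel()])   # length 39*39*2 = 3042
    return proj @ (v - mean)            # n-dimensional PCA-SIFT descriptor
```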

  47. Eigenspace construction
