Visual object recognition
Visual Object Recognition

Bastian Leibe, Computer Vision Laboratory, ETH Zurich

Kristen Grauman, Department of Computer Sciences, University of Texas at Austin

Chicago, 14.07.2008


Outline

Detection with Global Appearance & Sliding Windows

Local Invariant Features: Detection & Description

Specific Object Recognition with Local Features

― Coffee Break ―

Visual Words: Indexing, Bags of Words Categorization

Matching Local Features

Part-Based Models for Categorization

Current Challenges and Research Directions

K. Grauman, B. Leibe


Recognition with Local Features

  • Image content is transformed into local features that are invariant to translation, rotation, and scale

  • Goal: Verify if they belong to a consistent configuration

Local Features, e.g. SIFT

Slide credit: David Lowe


Finding Consistent Configurations

  • Global spatial models

    • Generalized Hough Transform [Lowe99]

    • RANSAC [Obdrzalek02, Chum05, Nister06]

    • Basic assumption: object is planar

  • Assumption is often justified in practice

    • Valid for many structures on buildings

    • Sufficient for small viewpoint variations on 3D objects


Hough Transform

[Figure: line in the x-y plane, parameterized by ρ and θ]

  • Origin: Detection of straight lines in clutter

    • Basic idea: each candidate point votes for all lines that it is consistent with.

    • Votes are accumulated in quantized array

    • Local maxima correspond to candidate lines

  • Representation of a line

    • Usual form y = a x + b has a singularity around 90° (vertical lines need an infinite slope a).

    • Better parameterization: x cos(θ) + y sin(θ) = ρ

Hough Transform: Noisy Line

  • Problem: Finding the true maximum

[Figure: tokens (input points) and the votes they cast in (θ, ρ) space]

Slide credit: David Lowe


Hough Transform: Noisy Input

  • Problem: Lots of spurious maxima

[Figure: tokens (input points) and the votes they cast in (θ, ρ) space]

Slide credit: David Lowe


Generalized Hough Transform [Ballard81]

  • Generalization for an arbitrary contour or shape

    • Choose reference point for the contour (e.g. center)

    • For each point on the contour remember where it is located w.r.t. the reference point

    • Remember radius r and angle relative to the contour tangent

    • Recognition: whenever you find a contour point, calculate the tangent angle and ‘vote’ for all possible reference points

    • Instead of reference point, can also vote for transformation

    ⇒ The same idea can be used with local features!
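
A simplified sketch of this procedure, under stated assumptions: it handles translation only, it indexes the table by quantized tangent angle, and it stores the displacement (dx, dy) to the reference point instead of the radius and relative angle mentioned above. The names `quantize`, `build_r_table`, and `vote_reference` are hypothetical.

```python
import math
from collections import Counter, defaultdict

N_BINS = 36  # assumed orientation quantization (10-degree bins)

def quantize(angle):
    """Map an angle in radians to the nearest of N_BINS orientation bins."""
    return round(angle / (2 * math.pi) * N_BINS) % N_BINS

def build_r_table(contour, reference):
    """contour: list of (x, y, tangent_angle). Store, per orientation bin,
    the displacement from each contour point to the reference point."""
    table = defaultdict(list)
    rx, ry = reference
    for x, y, angle in contour:
        table[quantize(angle)].append((rx - x, ry - y))
    return table

def vote_reference(edge_points, table):
    """Each edge point votes for all reference points its orientation allows."""
    votes = Counter()
    for x, y, angle in edge_points:
        for dx, dy in table.get(quantize(angle), []):
            votes[(x + dx, y + dy)] += 1
    return votes

# Model: a 4x4 square outline with reference point at its center (2, 2).
half_pi = math.pi / 2
model = ([(x, 0, 0.0) for x in range(4)] +
         [(4, y, half_pi) for y in range(4)] +
         [(x, 4, math.pi) for x in range(1, 5)] +
         [(0, y, 3 * half_pi) for y in range(1, 5)])
table = build_r_table(model, (2, 2))

# Scene: the same square shifted by (10, 7), plus two clutter edge points.
scene = [(x + 10, y + 7, a) for x, y, a in model] + [(3, 3, 0.0), (20, 1, half_pi)]
best_ref, best_votes = vote_reference(scene, table).most_common(1)[0]
print(best_ref, best_votes)
```

Voting for a transformation instead of a reference point would add scale and rotation dimensions to the accumulator.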

Slide credit: Bernt Schiele


Gen. Hough Transform with Local Features

  • For every feature, store possible “occurrences”, each recording:

    • Object identity

    • Pose

    • Relative position

  • For a new image, let the matched features vote for possible object positions
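
The following sketch illustrates this voting with local features. All data is made up for the example: the feature labels, the object and pose names, and the `occurrences` table are hypothetical, and "relative position" is assumed to mean the offset from the feature to the object center. Scale and rotation are ignored to keep the sketch short.

```python
from collections import Counter

# Hypothetical training result: feature label -> stored occurrences,
# each as (object identity, pose, offset from feature to object center).
occurrences = {
    "f1": [("mug", "front", (10, 5))],
    "f2": [("mug", "front", (-8, 5)), ("box", "side", (3, 3))],
    "f3": [("mug", "front", (0, -12))],
}

def vote_for_objects(matches, occurrences, bin_size=4):
    """matches: list of (feature label, (x, y)) found in the new image.
    Every stored occurrence of a matched feature casts one vote for an
    (object, pose, quantized center position) hypothesis."""
    votes = Counter()
    for label, (x, y) in matches:
        for obj, pose, (dx, dy) in occurrences.get(label, []):
            cx, cy = x + dx, y + dy
            votes[(obj, pose, cx // bin_size, cy // bin_size)] += 1
    return votes

# Three matches agree on a mug centered near (50, 50); one is a stray match.
matches = [("f1", (40, 45)), ("f2", (58, 46)), ("f3", (51, 62)), ("f2", (5, 5))]
best_hypothesis, support = vote_for_objects(matches, occurrences).most_common(1)[0]
print(best_hypothesis, support)
```

Ambiguous features such as `f2` vote for several hypotheses; only the consistent configuration accumulates enough votes to form a peak.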

3D Object Recognition

  • Gen. HT for Recognition

    • Typically only 3 feature matches needed for recognition

    • Extra matches provide robustness

    • Affine model can be used for planar objects
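
To see why three matches can suffice: a 2D affine transform has six unknowns, and each point match contributes two equations. The sketch below solves for the transform from exactly three point correspondences using Cramer's rule. It uses only the point positions; the function names are hypothetical, and this is an illustration of the counting argument, not the procedure from [Lowe99].

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def solve3(m, rhs):
    """Solve the 3x3 linear system m * p = rhs with Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("degenerate sample: the three points are collinear")
    solution = []
    for col in range(3):
        replaced = [row[:col] + [rhs[r]] + row[col + 1:] for r, row in enumerate(m)]
        solution.append(det3(replaced) / d)
    return solution

def affine_from_3_matches(src, dst):
    """Return ((a, b, c), (d, e, f)) with x' = a x + b y + c, y' = d x + e y + f."""
    m = [[x, y, 1.0] for x, y in src]
    row_x = solve3(m, [u for u, _ in dst])
    row_y = solve3(m, [v for _, v in dst])
    return tuple(row_x), tuple(row_y)

def apply_affine(transform, point):
    (a, b, c), (d, e, f) = transform
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)

# Three model points and where they were matched in the image.
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
dst = [(3.0, -1.0), (5.0, 0.0), (2.0, -0.5)]
transform = affine_from_3_matches(src, dst)
predicted = apply_affine(transform, (2.0, 3.0))
print(transform, predicted)
```

With more than three matches the system is overdetermined and would be solved by least squares, which is where the extra robustness comes from.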

[Lowe99]

Slide credit: David Lowe


View Interpolation

  • Training

    • Training views from similar viewpoints are clustered based on feature matches.

    • Matching features between adjacent views are linked.

  • Recognition

    • Feature matches may be spread over several training viewpoints.

    ⇒ Use the known links to “transfer votes” to other viewpoints.

[Lowe01]

Slide credit: David Lowe


Recognition Using View Interpolation

[Lowe01]

Slide credit: David Lowe


Location Recognition

Training

[Lowe04]

Slide credit: David Lowe


Applications

  • Sony Aibo (Evolution Robotics)

  • SIFT usage

    • Recognize docking station

    • Communicate with visual cards

  • Other uses

    • Place recognition

    • Loop closure in SLAM

Slide credit: David Lowe


RANSAC (RANdom SAmple Consensus) [Fischler81]

  • Randomly choose a minimal subset of data points necessary to fit a model (a sample)

  • Points within some distance threshold t of model are a consensus set. Size of consensus set is model’s support.

  • Repeat for N samples; model with biggest support is most robust fit

    • Points within distance t of best model are inliers

    • Fit final model to all inliers
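
A minimal sketch of the sampling loop for the line-fitting case, where the minimal subset is two points. The function names, the test data, and the parameter values are assumptions made for illustration. The final refit over all inliers is left out here to keep the loop itself visible.

```python
import math
import random

def line_through(p, q):
    """Return (a, b, c) with a*x + b*y + c = 0 and a^2 + b^2 = 1, or None."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = math.hypot(a, b)
    if norm == 0:
        return None                      # identical points define no line
    a, b = a / norm, b / norm
    return a, b, -(a * x1 + b * y1)

def ransac_line(points, t, n_samples, rng):
    """Keep the 2-point line hypothesis with the largest consensus set."""
    best_model, best_support = None, []
    for _ in range(n_samples):
        model = line_through(*rng.sample(points, 2))   # minimal sample
        if model is None:
            continue
        a, b, c = model
        # Consensus set: points within distance t of the hypothesized line.
        consensus = [(x, y) for x, y in points if abs(a * x + b * y + c) <= t]
        if len(consensus) > len(best_support):
            best_model, best_support = model, consensus
    return best_model, best_support

# 20 points on y = 2x + 1 plus 10 gross outliers (made-up data).
line_points = [(x, 2 * x + 1) for x in range(20)]
outliers = [(3, 30), (7, -5), (12, 2), (15, 50), (1, 20),
            (18, 3), (9, 40), (5, -10), (14, 0), (2, 35)]
model, inliers = ransac_line(line_points + outliers, t=0.5,
                             n_samples=50, rng=random.Random(0))
print(len(inliers), "inliers")
```

The seeded `random.Random(0)` only makes the example repeatable; any source of random samples would do.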

Slide credit: David Lowe


RANSAC: How many samples?

  • How many samples are needed?

    • Suppose w is the fraction of inliers (points from the line).

    • n points are needed to define a hypothesis (2 for lines).

    • k samples are chosen.

  • Prob. that a single sample of n points is correct: w^n

  • Prob. that all k samples fail: (1 - w^n)^k

    ⇒ Choose k high enough to keep this below the desired failure rate.
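
Solving (1 - w^n)^k ≤ p for k gives k ≥ log(p) / log(1 - w^n), where p is the acceptable failure rate. A small helper makes this concrete; the function name and the default p = 0.01 are assumptions for the example.

```python
import math

def ransac_num_samples(w, n, max_failure=0.01):
    """Smallest k such that (1 - w**n)**k <= max_failure."""
    if not 0.0 < w <= 1.0:
        raise ValueError("w must be in (0, 1]")
    good = w ** n                 # prob. that one sample contains only inliers
    if good == 1.0:
        return 1                  # no outliers: a single sample is enough
    return math.ceil(math.log(max_failure) / math.log(1.0 - good))

# Line fitting (n = 2) with half of the points being inliers.
k = ransac_num_samples(w=0.5, n=2)
print(k)
```

Because the inlier fraction enters as w^n, the required k grows quickly with the size n of the minimal sample.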

Slide credit: David Lowe


After RANSAC

  • RANSAC divides data into inliers and outliers and yields estimate computed from minimal set of inliers

  • Improve this initial estimate with estimation over all inliers (e.g. with standard least-squares minimization)

  • But this may change inliers, so alternate fitting with re-classification as inlier/outlier
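
The alternation described above can be sketched as follows, again for a line. Assumptions for this example: the line is non-vertical and written as y = a x + b, residuals are measured vertically rather than perpendicular to the line, and the starting estimate (a, b) stands in for a RANSAC result. The function names are hypothetical.

```python
def fit_line_least_squares(points):
    """Ordinary least-squares fit of y = a*x + b.
    Raises ZeroDivisionError if all x values are identical (vertical line)."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

def refine_line(points, a, b, t, max_iters=10):
    """Alternate inlier classification and least-squares refitting
    until the inlier set stops changing."""
    inliers = []
    for _ in range(max_iters):
        new_inliers = [(x, y) for x, y in points if abs(y - (a * x + b)) <= t]
        if new_inliers == inliers or len(new_inliers) < 2:
            break
        inliers = new_inliers
        a, b = fit_line_least_squares(inliers)
    return (a, b), inliers

# Noisy points near y = 2x + 1, two gross outliers, and a slightly
# inaccurate initial estimate such as a minimal sample might produce.
points = [(x, 2 * x + 1 + (0.1 if x % 2 == 0 else -0.1)) for x in range(10)]
points += [(4, 30.0), (6, -10.0)]
(a, b), inliers = refine_line(points, a=2.15, b=0.6, t=0.9)
print(round(a, 3), round(b, 3), len(inliers))
```

In this example the initial estimate misses one true inlier; the first refit pulls it back inside the threshold, and the loop stops once the classification is stable.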

Slide credit: David Lowe


Example: Finding Feature Matches

  • Find best stereo match within a square search window (here 300 pixels²)

  • Global transformation model: epipolar geometry

[Figure: feature matches before RANSAC and after RANSAC, from Hartley & Zisserman]

Slide credit: David Lowe


Comparison

Gen. Hough Transform

  • Advantages

    • Very effective for recognizing arbitrary shapes or objects

    • Can handle high percentage of outliers (>95%)

    • Extracts groupings from clutter in linear time

  • Disadvantages

    • Quantization issues

    • Only practical for small number of dimensions (up to 4)

  • Improvements available

    • Probabilistic Extensions

    • Continuous Voting Space

RANSAC

  • Advantages

    • General method suited to large range of problems

    • Easy to implement

    • Independent of number of dimensions

  • Disadvantages

    • Only handles moderate number of outliers (<50%)

  • Many variants available, e.g.

    • PROSAC: Progressive RANSAC [Chum05]

    • Preemptive RANSAC [Nister05]

[Leibe08]


Example Applications

[Figure: Aachen Cathedral]

  • Mobile tourist guide

  • Self-localization

  • Object/building recognition

  • Photo/video augmentation

[Quack, Leibe, Van Gool, CIVR’08]

B. Leibe


Web Demo: Movie Poster Recognition

50’000 movie posters indexed

Query-by-image from mobile phone available in Switzerland

http://www.kooaba.com/en/products_engine.html#

Application: Large-Scale Retrieval

Query

Results from 5k Flickr images (demo available for 100k set)

[Philbin CVPR’07]

Application: Image Auto-Annotation

Moulin Rouge

Old Town Square (Prague)

Tour Montparnasse

Colosseum

Viktualienmarkt Maypole

Left: Wikipedia image. Right: closest match from Flickr

[Quack CIVR’08]

