### Visual Object Recognition

Bastian Leibe, Computer Vision Laboratory, ETH Zurich

Kristen Grauman, Department of Computer Sciences, University of Texas at Austin

Chicago, 14.07.2008

Outline

Detection with Global Appearance & Sliding Windows

Local Invariant Features: Detection & Description

Specific Object Recognition with Local Features

― Coffee Break ―

Visual Words: Indexing, Bags of Words Categorization

Matching Local Features

Part-Based Models for Categorization

Current Challenges and Research Directions


K. Grauman, B. Leibe

Recognition with Local Features

- Image content is transformed into local features that are invariant to translation, rotation, and scale
- Goal: Verify if they belong to a consistent configuration

Local Features, e.g. SIFT


Slide credit: David Lowe

Finding Consistent Configurations

- Global spatial models
- Generalized Hough Transform [Lowe99]
- RANSAC [Obdrzalek02, Chum05, Nister06]
- Basic assumption: object is planar

- Assumption is often justified in practice
- Valid for many structures on buildings
- Sufficient for small viewpoint variations on 3D objects


Hough Transform

- Origin: Detection of straight lines in clutter
- Basic idea: each candidate point votes for all lines that it is consistent with.
- Votes are accumulated in quantized array
- Local maxima correspond to candidate lines

- Representation of a line
- Usual form $y = a x + b$ has a singularity around 90° (vertical lines).
- Better parameterization: $x \cos\theta + y \sin\theta = \rho$
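
As a minimal sketch of the voting scheme above (not the presenters' implementation), the following Python code accumulates votes in a quantized (ρ, θ) array using the parameterization x cos θ + y sin θ = ρ. The function names `hough_lines` and `strongest_line`, the bin sizes, and the choice of centering the ρ bins on zero are assumptions made for illustration.

```python
import numpy as np

def hough_lines(points, n_theta=180, rho_res=1.0):
    """Vote in a quantized (rho, theta) array; returns (acc, thetas, offset)."""
    pts = np.asarray(points, dtype=float)
    # theta sampled uniformly in [0, pi)
    thetas = np.arange(n_theta) * (np.pi / n_theta)
    # |rho| can never exceed the largest point norm
    rho_max = np.hypot(np.abs(pts[:, 0]).max(), np.abs(pts[:, 1]).max())
    offset = int(np.ceil(rho_max / rho_res))  # row index of rho = 0
    acc = np.zeros((2 * offset + 1, n_theta), dtype=int)
    cols = np.arange(n_theta)
    for x, y in pts:
        # each point votes for every line (rho, theta) passing through it
        rho = x * np.cos(thetas) + y * np.sin(thetas)
        rows = np.round(rho / rho_res).astype(int) + offset
        acc[rows, cols] += 1
    return acc, thetas, offset

def strongest_line(acc, thetas, offset, rho_res=1.0):
    """Return (rho, theta, votes) of the global maximum of the accumulator."""
    r, t = np.unravel_index(np.argmax(acc), acc.shape)
    return (r - offset) * rho_res, thetas[t], acc[r, t]

# Usage: 20 points on the horizontal line y = 5
acc, thetas, offset = hough_lines([(x, 5) for x in range(20)])
rho, theta, votes = strongest_line(acc, thetas, offset)
```

Because of the quantization, neighboring θ bins can collect the same number of votes as the true line, which is one reason peak finding becomes harder with noisy input.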


Hough Transform: Noisy Line

- Problem: Finding the true maximum

[Figure: tokens and their votes in (θ, ρ) space]

Slide credit: David Lowe

Hough Transform: Noisy Input

- Problem: Lots of spurious maxima

[Figure: tokens and their votes in (θ, ρ) space]

Slide credit: David Lowe

Generalized Hough Transform [Ballard81]

- Generalization for an arbitrary contour or shape
- Choose reference point for the contour (e.g. center)
- For each point on the contour, remember where it is located w.r.t. the reference point
- Remember radius r and angle relative to the contour tangent
- Recognition: whenever you find a contour point, calculate the tangent angle and ‘vote’ for all possible reference points
- Instead of reference point, can also vote for transformation
The same idea can be used with local features!
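
The following Python sketch illustrates the idea under a simplifying assumption: it handles translation only, and stores the displacement vector to the reference point (equivalent to radius and angle) indexed by the quantized tangent angle. The names `build_r_table` and `ght_vote` and the bin count are hypothetical, chosen for illustration.

```python
import numpy as np
from collections import defaultdict

def angle_bin(phi, n_bins):
    """Quantize an angle in radians into one of n_bins bins."""
    return int((phi % (2 * np.pi)) / (2 * np.pi) * n_bins) % n_bins

def build_r_table(contour_pts, tangent_angles, ref_point, n_bins=36):
    """Training: for each contour point, store its offset to the reference point."""
    table = defaultdict(list)
    for (x, y), phi in zip(contour_pts, tangent_angles):
        table[angle_bin(phi, n_bins)].append((ref_point[0] - x, ref_point[1] - y))
    return table

def ght_vote(table, pts, tangent_angles, shape, n_bins=36):
    """Recognition: each point votes for all reference points consistent with it."""
    acc = np.zeros(shape, dtype=int)  # indexed as acc[y, x]
    for (x, y), phi in zip(pts, tangent_angles):
        for dx, dy in table.get(angle_bin(phi, n_bins), []):
            cx, cy = x + dx, y + dy
            if 0 <= cx < shape[1] and 0 <= cy < shape[0]:
                acc[cy, cx] += 1
    return acc

# Usage: a 4-point "contour" with reference point (2, 2), shifted by (10, 6)
model = [(0, 0), (4, 0), (4, 4), (0, 4)]
angles = [0.1, 1.7, 3.2, 4.8]
table = build_r_table(model, angles, ref_point=(2, 2))
scene = [(x + 10, y + 6) for x, y in model]
acc = ght_vote(table, scene, angles, shape=(20, 20))
cy, cx = np.unravel_index(np.argmax(acc), acc.shape)  # peak at the shifted reference point
```

Handling rotation or scale would add dimensions to the voting space, which is not shown here.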


Slide credit: Bernt Schiele

Gen. Hough Transform with Local Features

- For every feature, store possible “occurrences”

- For new image, let the matched features vote for possible object positions

- Object identity
- Pose
- Relative position


3D Object Recognition

- Gen. HT for Recognition
- Typically only 3 feature matches needed for recognition
- Extra matches provide robustness
- Affine model can be used for planar objects
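
To see why three matches suffice for an affine model: a 2D affine transformation has six unknowns, and each point match contributes two equations. The sketch below, with the hypothetical helper name `affine_from_matches`, solves for the transformation by least squares, so additional matches simply over-determine the system. It is an illustration, not the procedure from [Lowe99].

```python
import numpy as np

def affine_from_matches(src, dst):
    """Estimate the 2x3 affine matrix T with dst ~= T @ [x, y, 1] from >= 3 matches."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    # Each row [x, y, 1] yields one equation per output coordinate
    A = np.hstack([src, np.ones((len(src), 1))])
    params, *_ = np.linalg.lstsq(A, dst, rcond=None)  # shape (3, 2)
    return params.T  # shape (2, 3): [linear part | translation]

# Usage: recover a known transformation from exactly three matches
T_true = np.array([[2.0, 0.5, 3.0],
                   [-1.0, 1.5, -2.0]])
src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
dst = (T_true @ np.hstack([src, np.ones((3, 1))]).T).T
T_est = affine_from_matches(src, dst)
```

The three source points must not be collinear, otherwise the system is rank-deficient.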

[Lowe99]


Slide credit: David Lowe

View Interpolation

- Training
- Training views from similar viewpoints are clustered based on feature matches.
- Matching features between adjacent views are linked.

- Recognition
- Feature matches may be spread over several training viewpoints.
- Use the known links to “transfer votes” to other viewpoints.

[Lowe01]


Slide credit: David Lowe

Applications

- Sony Aibo (Evolution Robotics)
- SIFT usage
- Recognize docking station
- Communicate with visual cards

- Other uses
- Place recognition
- Loop closure in SLAM


Slide credit: David Lowe

RANSAC (RANdom SAmple Consensus) [Fischler81]

- Randomly choose a minimal subset of data points necessary to fit a model (a sample)
- Points within some distance threshold t of model are a consensus set. Size of consensus set is model’s support.
- Repeat for N samples; model with biggest support is most robust fit
- Points within distance t of best model are inliers
- Fit final model to all inliers
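
The steps above can be sketched for the simplest case, fitting a 2D line, where the minimal subset is two points. This is an illustrative sketch only: the function names, the default threshold `t`, the number of samples, and the use of a total-least-squares fit via SVD are assumptions, not details from the slides.

```python
import numpy as np

def fit_line(pts):
    """Total-least-squares line through pts: returns (normal, d) with normal @ p = d."""
    c = pts.mean(axis=0)
    _, _, vt = np.linalg.svd(pts - c)
    normal = vt[-1]  # direction of least variance = line normal
    return normal, normal @ c

def ransac_line(points, t=1.0, n_samples=100, seed=0):
    """Return (normal, d, inlier_mask) of the line with the largest support."""
    rng = np.random.default_rng(seed)
    pts = np.asarray(points, dtype=float)
    best = np.zeros(len(pts), dtype=bool)
    for _ in range(n_samples):
        # minimal subset: two distinct points define a line hypothesis
        i, j = rng.choice(len(pts), size=2, replace=False)
        normal, d = fit_line(pts[[i, j]])
        # consensus set: points within distance t of the hypothesis
        consensus = np.abs(pts @ normal - d) <= t
        if consensus.sum() > best.sum():
            best = consensus
    # final model: fit to all inliers of the best hypothesis
    normal, d = fit_line(pts[best])
    return normal, d, best

# Usage: 30 points on y = 2x + 1 plus 10 scattered outliers
inl = [(x, 2 * x + 1) for x in range(30)]
out = [(3 * i, -40 - (i * i % 7) * 5) for i in range(10)]
normal, d, mask = ransac_line(inl + out)
slope, intercept = -normal[0] / normal[1], d / normal[1]
```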


Slide credit: David Lowe

RANSAC: How many samples?

- How many samples are needed?
- Suppose $w$ is the fraction of inliers (points from the line).
- $n$ points are needed to define a hypothesis (2 for lines)
- $k$ samples are chosen.

- Prob. that a single sample of $n$ points is correct: $w^n$
- Prob. that all $k$ samples fail: $(1 - w^n)^k$
- Choose $k$ high enough to keep this below the desired failure rate.
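
As a worked example of this calculation: with inlier fraction w and sample size n, a single sample is all-inlier with probability w^n, so all k samples fail with probability (1 - w^n)^k. Requiring this to stay below a failure rate p gives k ≥ log(p) / log(1 - w^n). The helper name `ransac_num_samples` and the default failure rate of 1% are assumptions for illustration.

```python
import math

def ransac_num_samples(w, n, p_fail=0.01):
    """Smallest k such that (1 - w**n)**k <= p_fail."""
    if not 0.0 < w <= 1.0:
        raise ValueError("inlier fraction w must be in (0, 1]")
    p_good = w ** n  # probability that one sample contains only inliers
    if p_good >= 1.0:
        return 1
    return math.ceil(math.log(p_fail) / math.log(1.0 - p_good))

# Usage: line fitting (n = 2) with 50% inliers and a 1% failure rate
k = ransac_num_samples(w=0.5, n=2)  # 17 samples
```

Because w is raised to the power n, the required number of samples grows quickly for models with larger minimal sets or lower inlier fractions.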


Slide credit: David Lowe

After RANSAC

- RANSAC divides data into inliers and outliers and yields estimate computed from minimal set of inliers
- Improve this initial estimate with estimation over all inliers (e.g. with standard least-squares minimization)
- But this may change inliers, so alternate fitting with re-classification as inlier/outlier
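
A minimal sketch of this alternation for a 2D line, assuming an initial estimate given as a unit normal and offset (normal @ p = d): refit by least squares on the current inliers, re-classify all points, and stop once the inlier set no longer changes. The function name `refine_line`, the threshold, and the iteration cap are hypothetical.

```python
import numpy as np

def refine_line(points, normal, d, t=1.0, max_iter=20):
    """Alternate least-squares fitting and inlier re-classification."""
    pts = np.asarray(points, dtype=float)
    inliers = np.abs(pts @ normal - d) <= t
    for _ in range(max_iter):
        if inliers.sum() < 2:
            break  # not enough points to fit a line
        # least-squares (total least squares) fit to the current inliers
        c = pts[inliers].mean(axis=0)
        _, _, vt = np.linalg.svd(pts[inliers] - c)
        normal = vt[-1]
        d = normal @ c
        # re-classify every point against the refined model
        new_inliers = np.abs(pts @ normal - d) <= t
        if np.array_equal(new_inliers, inliers):
            break  # converged: the inlier set is stable
        inliers = new_inliers
    return normal, d, inliers

# Usage: 20 points on y = 0, three outliers, and a slightly tilted initial estimate
pts = [(x, 0.0) for x in range(20)] + [(5, 10.0), (10, -8.0), (15, 12.0)]
n0 = np.array([0.05, 1.0]) / np.hypot(0.05, 1.0)
normal, d, inliers = refine_line(pts, n0, 0.3, t=0.5)
```

In this example the tilted initial line misses the last few points; after one refit they are re-classified as inliers.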


Slide credit: David Lowe

Example: Finding Feature Matches

- Find best stereo match within a square search window (here 300 pixels²)
- Global transformation model: epipolar geometry

from Hartley & Zisserman


Slide credit: David Lowe

Example: Finding Feature Matches

- Find best stereo match within a square search window (here 300 pixels²)
- Global transformation model: epipolar geometry

before RANSAC

after RANSAC

from Hartley & Zisserman


Slide credit: David Lowe

Comparison [Leibe08]

Generalized Hough Transform

- Advantages
- Very effective for recognizing arbitrary shapes or objects
- Can handle high percentage of outliers (>95%)
- Extracts groupings from clutter in linear time

- Disadvantages
- Quantization issues
- Only practical for small number of dimensions (up to 4)

- Improvements available
- Probabilistic Extensions
- Continuous Voting Space

RANSAC

- Advantages
- General method suited to large range of problems
- Easy to implement
- Independent of number of dimensions

- Disadvantages
- Only handles moderate number of outliers (<50%)

- Many variants available, e.g.
- PROSAC: Progressive RANSAC [Chum05]
- Preemptive RANSAC [Nister05]

Example Applications

- Mobile tourist guide
- Self-localization
- Object/building recognition
- Photo/video augmentation

[Quack, Leibe, Van Gool, CIVR’08]

B. Leibe

Web Demo: Movie Poster Recognition

50,000 movie posters indexed

Query-by-image from mobile phone available in Switzerland

http://www.kooaba.com/en/products_engine.html#


Application: Large-Scale Retrieval

Query

Results from 5k Flickr images (demo available for 100k set)

[Philbin CVPR’07]


Application: Image Auto-Annotation

Moulin Rouge

Old Town Square (Prague)

Tour Montparnasse

Colosseum

Viktualienmarkt Maypole

Left: Wikipedia image. Right: closest match from Flickr

[Quack CIVR’08]


