A Study of Approaches for Object Recognition

Presented by Wyman Wong, 12/9/2005


Outline

  • Introduction

  • Model-Based Object Recognition

    • AAM

    • Inverse Composition AAM

  • View-Based Object Recognition

    • Recognition based on boundary fragments

    • Recognition based on SIFT

  • Proposed Research

  • Conclusion and Future Work

Introduction

  • Object Recognition

    • The task of finding 3D objects in 2D images (or even video) and classifying each of them as one of many known object types

    • Closely related to the success of many computer vision applications

      • robotics, surveillance, registration, etc.

    • A difficult problem: no general and comprehensive solution has been found yet

Introduction

  • Two main streams of approaches:

    • Model-Based Object Recognition

      • A 3D model of the object being recognized is available

      • Compare the 2D representation of the structure of an object with the 2D projection of the model

    • View-Based Object Recognition

      • 2D representations of the same object, viewed at different angles and distances, are available

      • Extract features (as the representation of the object) and compare them with the features in the feature database

Introduction

  • Pros and Cons of each main stream:

    • Model-Based Object Recognition

      • Model features can be predicted from just a few detected features based on the geometric constraints

      • Models sacrifice generality

    • View-Based Object Recognition

      • Greater generality and more easily trainable from visual data

      • Matching is done by comparing entire objects, so some methods may be sensitive to clutter and occlusion

Model-Based Object Recognition

  • Commonly used in face recognition

  • General Steps:

    • Locate the object

    • Locate and label its structure

    • Adjust the model's parameters until the model generates an image similar enough to the real object

  • Active Appearance Models (AAMs) have proved to be highly useful models for face recognition

Active Appearance Models

  • They model shape and appearance of objects separately

  • Shape: the vertex locations of a mesh

  • Appearance: the pixel values within the mesh

  • Both shape and appearance are modeled with PCA, so that the model generalizes from the training faces to a generic face

  • Fitting an AAM: a non-linear optimization is applied, which iteratively solves for incremental additive updates to the shape and appearance coefficients
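
As a minimal sketch of the linear model that PCA yields, a shape instance can be written as the mean shape plus a weighted sum of basis shapes. The names `synthesize_shape`, `mean_shape`, `basis` and `coeffs`, and the toy triangle, are illustrative assumptions and do not come from the presentation. The appearance model would have the same form, with pixel values in place of vertex coordinates.

```python
def synthesize_shape(mean_shape, basis, coeffs):
    """Return mean_shape + sum_i coeffs[i] * basis[i].

    mean_shape: flat list [x1, y1, x2, y2, ...] of mesh vertex coordinates.
    basis:      list of shape modes (e.g. PCA eigenvectors), same length each.
    coeffs:     one shape coefficient per mode.
    """
    if len(basis) != len(coeffs):
        raise ValueError("need exactly one coefficient per basis shape")
    shape = list(mean_shape)
    for weight, mode in zip(coeffs, basis):
        for k, value in enumerate(mode):
            shape[k] += weight * value
    return shape


# Toy triangle mesh with two hypothetical modes of variation.
mean_shape = [0.0, 0.0, 1.0, 0.0, 0.5, 1.0]
basis = [
    [0.0, 0.0, 1.0, 0.0, 0.5, 0.0],  # mode 1: stretch horizontally
    [0.0, 0.0, 0.0, 0.0, 0.0, 1.0],  # mode 2: raise the apex
]
stretched = synthesize_shape(mean_shape, basis, [0.5, 0.0])
```

Fitting then amounts to searching for the coefficients whose synthesized instance best explains the image.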

Inverse Compositional AAMs

  • The major difference between these models and AAMs is the fitting algorithm

  • AAM: additive incremental updates to the shape and appearance parameters

  • ICAAM: inverse compositional update, in which the algorithm updates the entire warp by composing the current warp with the computed incremental warp
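
The two update rules can be contrasted on a plain 2D affine warp. This is only a sketch: the six-parameter affine parameterisation and all function names are assumptions. A real AAM warps a mesh piecewise, and the inversion of the incremental warp follows the usual inverse compositional formulation rather than anything stated on the slide.

```python
def affine_matrix(p):
    """3x3 homogeneous matrix of an affine warp with parameters p1..p6.

    Assumed parameterisation: x' = (1 + p1) x + p3 y + p5
                              y' = p2 x + (1 + p4) y + p6
    """
    p1, p2, p3, p4, p5, p6 = p
    return [[1.0 + p1, p3, p5],
            [p2, 1.0 + p4, p6],
            [0.0, 0.0, 1.0]]


def compose(a, b):
    """Matrix product a @ b: apply warp b first, then warp a."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def invert_affine(m):
    """Inverse of a 3x3 affine matrix whose last row is [0, 0, 1]."""
    a, b, tx = m[0]
    c, d, ty = m[1]
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("warp is not invertible")
    ia, ib, ic, id_ = d / det, -b / det, -c / det, a / det
    return [[ia, ib, -(ia * tx + ib * ty)],
            [ic, id_, -(ic * tx + id_ * ty)],
            [0.0, 0.0, 1.0]]


def additive_update(p, dp):
    """Classic AAM step: add the increment to the parameter vector."""
    return [pi + di for pi, di in zip(p, dp)]


def inverse_compositional_update(p, dp):
    """ICAAM-style step: W(x; p) <- W(x; p) o W(x; dp)^-1, as a matrix."""
    return compose(affine_matrix(p), invert_affine(affine_matrix(dp)))
```

The additive rule changes numbers in a parameter vector, while the compositional rule changes the warp itself, which is why the slide describes it as updating "the entire warp".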

View-Based Object Recognition

  • Common approaches:

    • Correlation-based template matching (Li, W. et al. 95)

      • SEA, PDE, etc.

      • Not effective when any of the following happens:

        • Illumination of the environment changes

        • Pose or scale of the object changes

        • The object is occluded

    • Color Histogram (Swain, M.J. 90)

      • Construct a histogram for an object and match it over the image

      • It is robust to changes of viewpoint and to occlusion

      • But it requires good isolation and segmentation of objects
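
The color histogram approach can be sketched with a coarse RGB histogram and histogram intersection as the match score. The slide does not name the similarity measure, so intersection (normalised by the model histogram) is an assumption here, as are the function names and the choice of 4 bins per channel.

```python
from collections import Counter


def color_histogram(pixels, bins_per_channel=4):
    """Quantise (r, g, b) pixels with 0..255 channels into a coarse histogram."""
    step = 256 // bins_per_channel
    return Counter((r // step, g // step, b // step) for r, g, b in pixels)


def histogram_intersection(image_hist, model_hist):
    """Match score in [0, 1]: overlapping counts divided by the model total."""
    model_total = sum(model_hist.values())
    if model_total == 0:
        return 0.0
    # Counter returns 0 for colors that never occur in the image region.
    overlap = sum(min(image_hist[key], count)
                  for key, count in model_hist.items())
    return overlap / model_total


# Hypothetical data: a mostly red model and an image region that is half red.
model = color_histogram([(255, 0, 0)] * 3 + [(0, 0, 255)])
region = color_histogram([(250, 10, 5)] * 2 + [(0, 255, 0)] * 2)
score = histogram_intersection(region, model)
```

Because the score ignores where each color occurs, it tolerates viewpoint changes, but background pixels inside the region dilute it, which is why good segmentation matters.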

View-Based Object Recognition

  • Common approaches:

    • Feature based

      • Extract salient features from the image and match only against those features when searching all locations for matches

      • Feature types: groupings of edges, SIFT, etc.

      • Preferred feature properties:

        • View invariant

        • Detected frequently enough for reliable recognition

        • Distinctive

      • An image descriptor is created from the detected features to improve matching performance

      • Image descriptor = Key / Index to database of features

      • Preferred descriptor properties:

        • Invariant to scaling, rotation, illumination, affine transformation and noise

Nelson’s Approach

  • Recognition based on 2D Boundary Fragments

  • Prepare 53 clean images of each object and build a 3D recognition database



Nelson’s Approach

  • Test images used in Nelson’s experiment and their features

Nelson’s Approach

  • Nelson’s experiment showed that his approach has high accuracy

    • 97.0% success rate on a 24-object database

  • under the following conditions:

    • Large number of images

    • Clean images

    • Very different objects

    • No occlusion or clutter

Lowe’s Approach

  • Recognition based on Scale Invariant Feature Transform (SIFT)

    • SIFT generates distinctive invariant features

    • SIFT-based image descriptors are generally the most resistant to common image deformations (Mikolajczyk 2005)

    • SIFT – four steps:

      • Scale-space extrema detection

      • Keypoint localization

      • Orientation assignment

      • Keypoint descriptor computation

Scale-space extrema detection

  • DoG ≈ LoG: the difference of Gaussians approximates the Laplacian of Gaussian

  • Search over all sample points at all scales and find the extrema, i.e. the local maxima or minima in the Laplacian (DoG) scale space

Small keypoints → help solve the occlusion problem

Large keypoints → robust to noise and image blur
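
The search for scale-space extrema can be illustrated on a 1D signal. This is a deliberately simplified analogue: SIFT works on 2D images and compares each sample with 26 neighbours, whereas this sketch compares with 8 neighbours in position and scale. The function names and the sigma values are assumptions.

```python
import math


def gaussian_kernel(sigma):
    """Normalised 1D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur(signal, sigma):
    """Convolve the signal with a Gaussian, clamping indices at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    last = len(signal) - 1
    return [sum(w * signal[min(max(i + k - radius, 0), last)]
                for k, w in enumerate(kernel))
            for i in range(len(signal))]


def dog_extrema(signal, sigmas):
    """Return (position, dog_level, value) for every strict local extremum."""
    blurred = [blur(signal, s) for s in sigmas]
    # Difference of Gaussians between each pair of adjacent scales.
    dog = [[hi - lo for lo, hi in zip(a, b)]
           for a, b in zip(blurred, blurred[1:])]
    found = []
    for s in range(1, len(dog) - 1):
        for x in range(1, len(signal) - 1):
            value = dog[s][x]
            neighbours = [dog[s + ds][x + dx]
                          for ds in (-1, 0, 1) for dx in (-1, 0, 1)
                          if (ds, dx) != (0, 0)]
            if value > max(neighbours) or value < min(neighbours):
                found.append((x, s, value))
    return found


# A bright bump of width 5 centred at position 20.
signal = [1.0 if 18 <= x <= 22 else 0.0 for x in range(41)]
keypoints = dog_extrema(signal, sigmas=[1.0, 2.0, 4.0, 8.0])
```

Searching across scales as well as positions ties each keypoint to the size of the structure that produced it, which is what makes the detected features scale invariant.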

Keypoint localization

  • Reject keypoints with the following properties:

    • Low contrast (sensitive to noise)

    • Poorly localized along an edge (sliding effect)

  • Solution:

    • Discard points whose DoG value |D| is below 0.03

    • Apply a Hessian-based edge test (ratio of principal curvatures)
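
A minimal sketch of the two rejection tests follows. The 0.03 contrast threshold is from the slide and assumes pixel values scaled to [0, 1]. The curvature ratio `r = 10` is the value used in Lowe's paper rather than anything stated in this presentation, and the function names are hypothetical.

```python
def passes_contrast(d_value, threshold=0.03):
    """Keep only keypoints whose DoG response is strong enough."""
    return abs(d_value) >= threshold


def passes_edge_test(dxx, dyy, dxy, r=10.0):
    """Reject edge-like points using the 2x2 Hessian of the DoG function.

    A point on an edge has one large and one small principal curvature,
    which makes trace^2 / det large. Points with det <= 0 have curvatures
    of opposite sign and are rejected as well.
    """
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:
        return False
    return trace * trace / det < (r + 1.0) ** 2 / r


def keep_keypoint(d_value, dxx, dyy, dxy):
    """Apply both tests; the Hessian entries are second derivatives of D."""
    return passes_contrast(d_value) and passes_edge_test(dxx, dyy, dxy)
```

For example, a blob-like point with equal curvatures gives trace²/det = 4 and passes, while a point with curvatures 100 and 1 gives a ratio of about 102 and is rejected.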

Orientation assignment

  • Pre-compute the gradient magnitude and orientation at each sample point

  • Use them to assign a dominant orientation to each keypoint and, later, to construct the keypoint descriptor
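
The gradient computation and a simple orientation histogram can be sketched as below. The 36-bin histogram follows Lowe's paper and is an assumption relative to this slide. The sketch also omits the Gaussian weighting and peak interpolation of the full method, and the function names are hypothetical.

```python
import math


def gradient(image, x, y):
    """Pixel-difference gradient: (magnitude, orientation in [0, 2*pi))."""
    dx = image[y][x + 1] - image[y][x - 1]
    dy = image[y + 1][x] - image[y - 1][x]
    magnitude = math.hypot(dx, dy)
    orientation = math.atan2(dy, dx) % (2.0 * math.pi)
    return magnitude, orientation


def dominant_orientation(image, cx, cy, radius=3, bins=36):
    """Dominant gradient direction (degrees) in a window around (cx, cy)."""
    hist = [0.0] * bins
    for y in range(cy - radius, cy + radius + 1):
        for x in range(cx - radius, cx + radius + 1):
            magnitude, orientation = gradient(image, x, y)
            index = int(orientation / (2.0 * math.pi) * bins) % bins
            hist[index] += magnitude  # each sample votes with its magnitude
    peak = max(range(bins), key=hist.__getitem__)
    return peak * 360.0 / bins


# Image rows are lists; this ramp gets brighter from left to right.
ramp_x = [[float(x) for x in range(10)] for _ in range(10)]
angle = dominant_orientation(ramp_x, 5, 5)
```

Describing the keypoint relative to this dominant orientation is what makes the later descriptor invariant to image rotation.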

Keypoint descriptor computation

  • Create orientation histograms over 4x4 sample regions around the keypoint location

  • Each histogram contains 8 orientation bins

  • 4x4x8 = 128-element vector (a distinctive representation of the feature)
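
The 4x4x8 layout can be made concrete with a simplified descriptor builder. It assumes a 16x16 patch of precomputed gradient magnitudes and orientations (in radians) around the keypoint. It leaves out the Gaussian weighting, interpolation between bins and rotation to the keypoint orientation used by the full method, and the function name is hypothetical.

```python
import math


def sift_like_descriptor(magnitudes, orientations, cells=4, bins=8):
    """Build a cells*cells*bins vector of orientation histograms.

    magnitudes, orientations: square 2D lists of equal size, e.g. 16x16.
    """
    size = len(magnitudes)
    cell_size = size // cells
    descriptor = [0.0] * (cells * cells * bins)
    for y in range(size):
        for x in range(size):
            cell = (y // cell_size) * cells + (x // cell_size)
            angle = orientations[y][x] % (2.0 * math.pi)
            b = int(angle / (2.0 * math.pi) * bins) % bins
            descriptor[cell * bins + b] += magnitudes[y][x]
    # Normalise to unit length to reduce the effect of contrast changes.
    norm = math.sqrt(sum(v * v for v in descriptor))
    return [v / norm for v in descriptor] if norm > 0 else descriptor


# Hypothetical patch: constant gradient pointing along +x everywhere.
mags = [[1.0] * 16 for _ in range(16)]
oris = [[0.0] * 16 for _ in range(16)]
desc = sift_like_descriptor(mags, oris)
```

With this uniform patch only bin 0 of each of the 16 cells is filled, so the 128-element vector has exactly 16 non-zero entries.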

Object Recognition based on SIFT

  • Nearest-neighbor algorithm

  • Matching: assign features to objects

  • There can be many wrong matches

    • Solution

      • Identify clusters of features

      • Generalized Hough transform

  • Determine pose of object and then discard outliers
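
The matching and clustering steps can be sketched as follows. Two simplifications are assumed: matches are filtered with a nearest-to-second-nearest distance ratio (0.8, as in Lowe's paper, not stated on the slide), and the Hough voting covers translation only, whereas the full method also votes on scale and orientation. All names and data are hypothetical.

```python
import math
from collections import defaultdict


def match_features(query_descs, model_descs, max_ratio=0.8):
    """Nearest-neighbour matching that drops ambiguous matches."""
    matches = []
    for qi, q in enumerate(query_descs):
        ranked = sorted((math.dist(q, m), mi)
                        for mi, m in enumerate(model_descs))
        if len(ranked) >= 2 and ranked[0][0] < max_ratio * ranked[1][0]:
            matches.append((qi, ranked[0][1]))
    return matches


def largest_pose_cluster(matches, query_pos, model_pos, bin_size=10.0):
    """Hough-style voting: each match votes for a coarse translation bin."""
    votes = defaultdict(list)
    for qi, mi in matches:
        dx = query_pos[qi][0] - model_pos[mi][0]
        dy = query_pos[qi][1] - model_pos[mi][1]
        key = (math.floor(dx / bin_size), math.floor(dy / bin_size))
        votes[key].append((qi, mi))
    # Matches outside the winning bin are treated as outliers.
    return max(votes.values(), key=len) if votes else []


# Toy 2D "descriptors" and keypoint positions.
model_descs = [[0, 0], [10, 0], [0, 10], [10, 10]]
query_descs = [[0.5, 0], [10, 0.5], [0, 9.5], [9.5, 10], [5, 5]]
model_pos = [(0, 0), (20, 0), (0, 20), (20, 20)]
query_pos = [(52, 53), (74, 51), (53, 72), (5, 90), (0, 0)]

matches = match_features(query_descs, model_descs)
inliers = largest_pose_cluster(matches, query_pos, model_pos)
```

The fifth query feature is equally close to every model feature and is dropped by the ratio test, and the fourth agrees in appearance but votes for a different translation, so it is discarded as an outlier.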

Proposed Research

  • Personally, I think the model-based approach has better performance

  • Success of the model-based approach requires:

    • Models of all objects to be detected

    • Automatic construction of models

    • Automatic selection of the best model

  • How does the system know which 3D model to use for a specific image of an object?

    • By a view-based approach

    • A human looks at an image of an object for a moment and then realizes which model to use for that object

    • Then use the specific model to refine the identification of the specific object

Hybrid of bottom-up and top-down

  • The view-based approaches just presented are bottom-up approaches

    • Features: edges, extrema (Low Level)

    • Descriptors of features

    • Matching

    • Identification of object (High Level)

  • Could it work like this instead?

    • Features

    • Matching (Lower Level)

    • Guessing of object (Higher Level)

    • Matching (Lower Level)

    • Guessing of object (Higher Level)

    • Identification of object

Hierarchy of features

  • Lowe’s system

    • All features have equal weight when voting for an object during identification (to be verified by examining the released source code)

    • Special features do not have enough voting power to shift the result to the correct one

    • Consider the following scenario:

      • Two objects have many similar features: a1 to a100 are similar to b1 to b100, and the objects have just one very different feature, a* for object A and b* for object B

      • Many of a1 to a100 may be poorly captured by the imaging device and mismatched as b1 to b100; even if the feature a* is still recognized, the system may still conclude that the object is B

[Figure: Object A and Object B]
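
The scenario can be worked through numerically. The sketch below compares equal-weight voting with a hypothetical weighted scheme in which the distinctive feature a* counts for more. The number of mismatched features (60) and the weight (30) are made-up values for illustration only, and the weighted scheme is not part of Lowe's system.

```python
def vote(matches, weights=None):
    """Sum votes per object label; matches are (feature_name, label) pairs."""
    totals = {}
    for feature, label in matches:
        w = 1.0 if weights is None else weights.get(feature, 1.0)
        totals[label] = totals.get(label, 0.0) + w
    return totals


# Object A is in the image, but 60 of its 100 common features are
# mismatched to object B; the distinctive feature a* is matched correctly.
matches = [(f"a{i}", "B") for i in range(1, 61)]
matches += [(f"a{i}", "A") for i in range(61, 101)]
matches.append(("a*", "A"))

equal = vote(matches)                            # B: 60, A: 41
weighted = vote(matches, weights={"a*": 30.0})   # B: 60, A: 70

winner_equal = max(equal, key=equal.get)
winner_weighted = max(weighted, key=weighted.get)
```

With equal weights the system answers B, while giving a* more voting power flips the answer to A, which is the motivation for a hierarchy of features.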

Extension of SIFT

  • Color descriptors

  • Local texture measures incorporated into feature descriptors

  • Scale-invariant edge groupings

  • *Generic object class recognition

Conclusion and Future Work

  • Discussed the different approaches to object recognition

  • Discussed what SIFT is and how it works

  • Discussed the possible extensions to SIFT

  • Design a hybrid approach

  • Design extensions


Q & A

Thank you very much!

Things to be understood

  • Finding extrema within a single scale already works well, so why is it necessary to search across different scales?