Learning Local Affine Representations for Texture and Object Recognition

Svetlana Lazebnik

Beckman Institute, University of Illinois at Urbana-Champaign

(joint work with Cordelia Schmid, Jean Ponce)


Overview

  • Goal:

    • Recognition of 3D textured surfaces, object classes

  • Our contribution:

    • Texture and object representations based on local affine regions

  • Advantages of proposed approach:

    • Distinctive, repeatable primitives

    • Robustness to clutter and occlusion

    • Ability to approximate 3D geometric transformations


The Scope

  • Recognition of single-texture images (CVPR 2003)

  • Recognition of individual texture regions in multi-texture images (ICCV 2003)

  • Recognition of object classes (BMVC 2004, work in progress)


1. Recognition of Single-Texture Images


Affine Region Detectors

Harris detector (H)

Laplacian detector (L)

Mikolajczyk & Schmid (2002), Gårding & Lindeberg (1996)
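
The slide only names the two detectors, so here is a minimal sketch of the simpler one, a multi-scale Laplacian (blob) detector. The affine adaptation step of Mikolajczyk & Schmid is omitted, and the scales, threshold, and function name are illustrative assumptions, not values from the paper.

```python
# Minimal multi-scale Laplacian blob detector (illustrative; affine adaptation omitted).
import numpy as np
from scipy.ndimage import gaussian_laplace, maximum_filter

def laplacian_regions(image, scales=(2, 4, 8, 16), threshold=0.02):
    """Return (row, col, scale) triples at scale-normalized Laplacian maxima.

    Assumes `image` is a 2-D array with intensities roughly in [0, 1];
    the threshold is an arbitrary illustrative choice.
    """
    img = image.astype(float)
    # scale-normalized |Laplacian of Gaussian| responses, one slice per scale
    responses = np.stack([(s ** 2) * np.abs(gaussian_laplace(img, s)) for s in scales])
    # local maxima over space and scale, above the response threshold
    peaks = (responses == maximum_filter(responses, size=3)) & (responses > threshold)
    return [(r, c, scales[s]) for s, r, c in np.argwhere(peaks)]
```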


Affine Rectification Process

Patch 1, Patch 2 → rectified patches (rotational ambiguity)
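
A hedged sketch of the rectification step: an elliptical region with shape matrix M (boundary points x satisfy xᵀMx = 1) is mapped onto a circular patch by a square root of M, which is only defined up to rotation — hence the rotation-invariant descriptors on the next slides. The sampling details and names below are illustrative assumptions.

```python
# Rectify an elliptical affine region to a square patch (defined only up to rotation).
import numpy as np
from scipy.ndimage import map_coordinates

def rectify_patch(image, center, M, out_size=32):
    """center = (row, col); M = 2x2 positive-definite shape matrix of the ellipse."""
    # A maps the unit circle onto the ellipse: A^T M A = I  (Cholesky-based square root)
    A = np.linalg.inv(np.linalg.cholesky(M)).T
    t = np.linspace(-1.0, 1.0, out_size)
    xx, yy = np.meshgrid(t, t)                         # normalized patch coordinates
    pts = A @ np.vstack([xx.ravel(), yy.ravel()])      # back-project into image coordinates
    rows, cols = pts[1] + center[0], pts[0] + center[1]
    patch = map_coordinates(image.astype(float), [rows, cols], order=1)
    return patch.reshape(out_size, out_size)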


Rotation-Invariant Descriptors 1: Spin Images

  • Based on range spin images (Johnson & Hebert 1998)

  • Two-dimensional histogram: distance from center × intensity value
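
A minimal sketch of the descriptor just described, as a joint histogram over (distance from the patch center, intensity). The hard binning and bin counts here are simplifications; 8-bit intensities are assumed.

```python
# Spin image: 2-D histogram over (distance from center, intensity).
import numpy as np

def spin_image(patch, dist_bins=10, intensity_bins=10):
    h, w = patch.shape
    yy, xx = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    dist = np.hypot(yy - cy, xx - cx)
    radius = min(cy, cx)
    inside = dist <= radius                             # use only the inscribed disc
    hist, _, _ = np.histogram2d(dist[inside], patch[inside].astype(float),
                                bins=(dist_bins, intensity_bins),
                                range=((0, radius), (0, 255)))   # assumes 8-bit intensities
    return hist / hist.sum()                            # normalize to sum to 1
```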


Rotation-Invariant Descriptors 2: RIFT

  • Based on SIFT (Lowe 1999)

  • Two-dimensional histogram: distance from center × gradient orientation

  • Gradient orientation is measured w.r.t. the direction pointing outward from the center of the patch
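
A companion sketch for RIFT in the same style: a joint histogram over (distance from center, gradient orientation), with each pixel's gradient orientation taken relative to the outward radial direction, so the histogram does not change when the patch rotates. Weighting by gradient magnitude and the exact bin counts are illustrative choices.

```python
# RIFT: 2-D histogram over (distance from center, gradient orientation relative
# to the outward radial direction), weighted here by gradient magnitude.
import numpy as np

def rift(patch, dist_bins=4, orient_bins=8):
    h, w = patch.shape
    gy, gx = np.gradient(patch.astype(float))
    yy, xx = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    dist = np.hypot(yy - cy, xx - cx)
    radial = np.arctan2(yy - cy, xx - cx)               # direction pointing away from the center
    relative = (np.arctan2(gy, gx) - radial) % (2 * np.pi)
    radius = min(cy, cx)
    inside = dist <= radius
    hist, _, _ = np.histogram2d(dist[inside], relative[inside],
                                bins=(dist_bins, orient_bins),
                                range=((0, radius), (0, 2 * np.pi)),
                                weights=np.hypot(gx, gy)[inside])
    return hist / (hist.sum() + 1e-12)
```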


Signatures and EMD

  • Signatures: S = {(m1, w1), … , (mk, wk)}, where mi is a cluster center and wi its relative weight

  • Earth Mover’s Distance (Rubner et al. 1998)

    • Computed from ground distances d(mi, m'j)

    • Can compare signatures of different sizes

    • Insensitive to the number of clusters
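
A sketch of this pipeline under stated assumptions: an image's descriptors are clustered into a signature {(mi, wi)} with k-means, and two signatures are compared by solving the small transportation problem that defines EMD. The cluster count and the use of scikit-learn/SciPy are illustrative, not the authors' implementation.

```python
# Signature extraction + Earth Mover's Distance as a transportation LP (sketch).
import numpy as np
from scipy.optimize import linprog
from sklearn.cluster import KMeans

def signature(descriptors, k=40):
    """descriptors: (n, d) array -> (cluster centers, relative weights)."""
    km = KMeans(n_clusters=k, n_init=10).fit(descriptors)
    weights = np.bincount(km.labels_, minlength=k) / len(descriptors)
    return km.cluster_centers_, weights

def emd(sig1, sig2):
    (m1, w1), (m2, w2) = sig1, sig2
    ground = np.linalg.norm(m1[:, None, :] - m2[None, :, :], axis=2)   # d(mi, m'j)
    n, m = ground.shape
    # variables f_ij >= 0 (flattened row-major); minimize sum_ij f_ij * d_ij
    A_eq, b_eq = [], []
    for i in range(n):                       # row sums = w1_i
        row = np.zeros(n * m); row[i * m:(i + 1) * m] = 1
        A_eq.append(row); b_eq.append(w1[i])
    for j in range(m):                       # column sums = w2_j
        col = np.zeros(n * m); col[j::m] = 1
        A_eq.append(col); b_eq.append(w2[j])
    res = linprog(ground.ravel(), A_eq=np.array(A_eq), b_eq=np.array(b_eq),
                  bounds=(0, None), method="highs")
    return res.fun                           # total weights are 1, so this is the EMD
```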


Database: Textured Surfaces

25 textures, 40 sample images each (640x480)


Evaluation

    • Channels: HS, HR, LS, LR (detector H or L combined with descriptor S = spin image or R = RIFT)

    • Combined through addition of EMD matrices

  • Classification results

    • 10 training images per class, rates averaged over 200 random training subsets
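
To make the protocol concrete, here is a sketch assuming nearest-neighbor classification on the summed EMD matrices; the nearest-neighbor rule and all names are assumptions consistent with the slide, not a transcription of the authors' code.

```python
# Combine channels by adding EMD matrices, then nearest-neighbor classification
# averaged over random training subsets (sketch).
import numpy as np

def classify(emd_matrices, labels, n_train=10, n_trials=200, rng=np.random.default_rng(0)):
    D = sum(emd_matrices)                        # combine channels by addition
    labels = np.asarray(labels)
    rates = []
    for _ in range(n_trials):
        train = np.hstack([rng.choice(np.where(labels == c)[0], n_train, replace=False)
                           for c in np.unique(labels)])
        test = np.setdiff1d(np.arange(len(labels)), train)
        # each test image gets the label of its nearest training image
        pred = labels[train[np.argmin(D[np.ix_(test, train)], axis=1)]]
        rates.append(np.mean(pred == labels[test]))
    return np.mean(rates)
```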


Comparative Evaluation


Results of Evaluation: Classification rate vs. number of training samples

Methods compared: (H+L)(S+R), VZ-Joint, VZ-MRF

  • Conclusion: an intrinsically invariant representation is necessary to deal with intra-class variations when they are not adequately represented in the training set


Summary

  • A sparse texture representation based on local affine regions

  • Two novel descriptors (spin images, RIFT)

  • Successful recognition in the presence of viewpoint changes, non-rigidity, non-homogeneity

  • A flexible approach to invariance


2. Recognition of Individual Regions in Multi-Texture Images

  • A two-layer architecture:

    • Local appearance + neighborhood relations

  • Learning:

    • Represent the local appearance of each texture class using a mixture-of-Gaussians model

    • Compute co-occurrence statistics of sub-class labels over affinely adapted neighborhoods

  • Recognition:

    • Obtain initial class membership probabilities from the generative model

    • Use relaxation to refine these probabilities
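
A sketch of the generative first layer described above: each texture class's local appearance is modeled by a mixture of Gaussians over region descriptors, and at recognition time each region gets initial class membership probabilities from these densities. The component count and the use of scikit-learn are assumptions for illustration; the fitted mixture components can play the role of the sub-class labels whose co-occurrences are counted next.

```python
# Per-class Gaussian mixtures over region descriptors -> initial class posteriors.
import numpy as np
from scipy.special import softmax
from sklearn.mixture import GaussianMixture

def learn_appearance(descriptors_per_class, n_components=10):
    """descriptors_per_class: {class_name: (n_regions, descriptor_dim) array}."""
    return {c: GaussianMixture(n_components=n_components).fit(X)
            for c, X in descriptors_per_class.items()}

def initial_probabilities(models, descriptors):
    """p(c | x) for each region descriptor x, assuming equal class priors."""
    log_lik = np.column_stack([m.score_samples(descriptors) for m in models.values()])
    return softmax(log_lik, axis=1)
```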


Two Learning Scenarios

  • Fully supervised: every region in the training image is labeled with its texture class

  • Weakly supervised: each training image is labeled with the classes occurring in it

Example labels: "brick" for a fully supervised region vs. "brick, marble, carpet" for a weakly supervised image


Neighborhood Statistics

  • Over each region's affinely adapted neighborhood, estimate (see the sketch below):

    • co-occurrence probability p(c,c')

    • correlation r(c,c')
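
An illustrative sketch of these statistics: given a sub-class label for every region and a neighbor list for each region, accumulate the joint probability p(c,c') of labels co-occurring in a neighborhood and derive a correlation-style statistic r(c,c') from it. The exact neighborhood construction and the particular correlation formula used in the paper may differ; the formula below is a stand-in for illustration.

```python
# Co-occurrence probability p(c,c') and a correlation-style r(c,c') (sketch).
import numpy as np

def cooccurrence_stats(labels, neighbors, n_classes):
    """labels[i]: sub-class label of region i; neighbors[i]: indices of i's neighbors."""
    counts = np.zeros((n_classes, n_classes))
    for i, js in enumerate(neighbors):
        for j in js:
            counts[labels[i], labels[j]] += 1
    p_joint = counts / counts.sum()                    # p(c, c')
    p_marg = p_joint.sum(axis=1)                       # p(c)
    # positive where c and c' co-occur more often than chance, negative otherwise
    denom = np.sqrt(np.outer(p_marg * (1 - p_marg), p_marg * (1 - p_marg))) + 1e-12
    r = (p_joint - np.outer(p_marg, p_marg)) / denom
    return p_joint, r
```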


Relaxation (Rosenfeld et al. 1976)

  • Iterative process:

    • Initialized with posterior probabilities p(c|xi) obtained from the generative model

    • For each region i and each sub-class label c, update the probability pi(c) based on neighbor probabilities pj(c') and correlations r(c,c')

  • Shortcomings:

    • No formal guarantee of convergence

    • After the initialization, the updates to the probability values do not depend on the image data
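
A minimal sketch of the relaxation update described above, using the classic multiplicative form of Rosenfeld et al.: each region's label probabilities are repeatedly pushed toward labels that are compatible (r > 0) with its neighbors' current beliefs. The averaging over neighbors and the clipping are implementation assumptions.

```python
# Relaxation labeling (Rosenfeld-style multiplicative update), illustrative sketch.
import numpy as np

def relax(p, neighbors, r, n_iters=10):
    """p: (n_regions, n_labels) initial posteriors; neighbors[i]: indices of i's
    neighbors; r: (n_labels, n_labels) compatibility/correlation matrix."""
    p = p.copy()
    for _ in range(n_iters):
        q = np.zeros_like(p)
        for i, js in enumerate(neighbors):
            if len(js):
                # q_i(c) = average over neighbors j of sum_c' r(c, c') p_j(c')
                q[i] = (p[np.asarray(js)] @ r.T).mean(axis=0)
        updated = p * (1.0 + np.clip(q, -1.0, 1.0))    # keep the factor non-negative
        p = updated / updated.sum(axis=1, keepdims=True)
    return p
```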


Experiment 1: 3D Textured Surfaces

Texture classes: T1 (brick), T2 (carpet), T3 (chair), T4 (floor 1), T5 (floor 2), T6 (marble), T7 (wood); examples include single-texture and multi-texture images

10 single-texture training images per class, 13 two-texture training images, 45 multi-texture test images


Effect of Relaxation on Labeling

Original image

Top: before relaxation, bottom: after relaxation


Retrieval

(single-texture training images; retrieval results shown for each class T1–T7)


Successful Segmentation Examples


Unsuccessful Segmentation Examples


Experiment 2: Animals

  • No manual segmentation

  • Training data: 10 sample images per class

  • Test data: 20 samples per class + 20 negative images

Image-level labels: cheetah/background, zebra/background, giraffe/background


Cheetah Results


Zebra Results


Giraffe Results


Summary

  • A two-level representation (local appearance + neighborhood relations)

  • Weakly supervised learning of texture models

Future Work

  • Design an improved representation using a random field framework, e.g., conditional random fields (Lafferty 2001, Kumar & Hebert 2003)

  • Develop a procedure for weakly supervised learning of random field parameters

  • Apply the method to recognition of natural texture categories


3. Recognition of Object Classes

The approach:

  • Represent objects using multiple composite semi-local affine parts

    • More expressive than individual regions

    • Not globally rigid

  • Correspondence search is key to learning and detection


Correspondence Search

  • Basic operation: a two-image matching procedure for finding collections of affine regions that can be mapped onto each other using a single affine transformation

  • Implementation: greedy search based on geometric and photometric consistency constraints

    • Returns multiple correspondence hypotheses

    • Automatically determines number of regions in correspondence

    • Works on unsegmented, cluttered images (weakly supervised learning)
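
A hedged sketch of the geometric-consistency core of such a matching procedure: fit a single 2-D affine transform to a set of tentative region-center correspondences by least squares, and accept the hypothesis only if the residuals are small. The greedy expansion and photometric checks of the actual method are omitted, and the threshold is illustrative.

```python
# Fit one affine transform to tentative correspondences and test consistency (sketch).
import numpy as np

def fit_affine(src, dst):
    """src, dst: (n, 2) matched region centers; returns the 2x3 affine A
    minimizing ||A [src; 1] - dst|| in the least-squares sense (needs n >= 3)."""
    X = np.hstack([src, np.ones((len(src), 1))])        # (n, 3) homogeneous coordinates
    B, *_ = np.linalg.lstsq(X, dst, rcond=None)         # (3, 2) solution of X @ B = dst
    return B.T                                          # (2, 3)

def consistent(src, dst, threshold=5.0):
    """True if all correspondences agree with a single affine map within `threshold` pixels."""
    A = fit_affine(src, dst)
    pred = np.hstack([src, np.ones((len(src), 1))]) @ A.T
    return np.linalg.norm(pred - dst, axis=1).max() < threshold
```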


Matching: 3D Objects


Matching: 3D Objects (closeups)


Matching: Faces

(note the spurious match)


Finding Symmetries


Finding Repeated Patterns and Symmetries


Learning Object Models for Recognition

  • Match multiple pairs of training images to produce a set of candidate parts

  • Use additional validation images to evaluate repeatability of parts and individual regions

  • Retain a fixed number of parts having the best repeatability score
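
A compact sketch of the selection step just described: each candidate part is scored by how repeatably its regions are re-detected in validation images, and a fixed number of the best-scoring parts is kept. The measure here (re-detected regions divided by part size, averaged over validation images) mirrors the "relative repeatability" mentioned later in the talk, but the exact scoring in the paper may differ.

```python
# Rank candidate parts by repeatability on validation images and keep the best (sketch).
import numpy as np

def select_parts(part_sizes, validation_detections, n_keep=10):
    """part_sizes[p]: number of regions in candidate part p.
    validation_detections[p]: list of counts of p's regions re-detected per validation image."""
    scores = {p: np.mean(validation_detections[p]) / part_sizes[p] for p in part_sizes}
    return sorted(scores, key=scores.get, reverse=True)[:n_keep]
```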


Recognition Experiment: Butterflies

Classes: Admiral, Swallowtail, Machaon, Monarch 1, Monarch 2, Peacock, Zebra

  • 16 training images (8 pairs) per class

  • 10 validation images per class

  • 437 test images

  • 619 images total


Butterfly Parts


Recognition

  • Top 10 parts per class used for recognition

  • Relative repeatability score = (total number of regions detected) / (total part size)

  • Classification results (per-class rates; total part size, smallest/largest, also reported)


Classification Rate vs. Number of Parts


Detection Results (ROC Curves)

Circles: reference relative repeatability rates. Red square: ROC equal error rate (in parentheses)


Successful Detection Examples

Training images

Test images (blue: occluded regions)

All ellipses found in the test images


Unsuccessful Detection Examples

Training images

Test images (blue: occluded regions)

All ellipses found in the test image


Summary

  • Semi-local affine parts for describing structure of 3D objects

  • Finding a part vocabulary:

    • Correspondence search between pairs of images

    • Validation

  • Additional application:

    • Finding symmetry and repetition

Future Work

  • Find a better affine region detector

  • Represent, learn inter-part relations

  • Evaluation: CalTech database, harder classes, etc.


Birds

Egret

Puffin

Snowy Owl

Mandarin Duck

Wood Duck


Birds: Candidate Parts

Mandarin Duck

Puffin


Objects without Characteristic Texture

(LeCun’04)


Summary of Talk

  • Recognition of single-texture images

    • Distribution of local appearance descriptors

  • Recognition of individual regions in multi-texture images

    • Local appearance + loose statistical neighborhood relations

  • Recognition of object categories

    • Local appearance + strong geometric relations

      For more information: http://www-cvr.ai.uiuc.edu/ponce_grp


Issues, Extensions

  • Weakly supervised learning

    • Evaluation methods?

    • Learning from contaminated data?

  • Probabilistic vs. geometric approaches to invariance

  • EM vs. direct correspondence search

  • Training set size

  • Background modeling

  • Strengthening the representation

    • Heterogeneous local features

    • Automatic feature selection

    • Inter-part relations

