Epitomic location recognition

K. Ni, A. Kannan, A. Criminisi and J. Winn. Epitomic Location Recognition: a generative approach for location recognition. In Proc. CVPR 2008, Anchorage, Alaska.


Presentation Transcript



K. Ni, A. Kannan, A. Criminisi and J. Winn

Epitomic Location Recognition

A generative approach for location recognition

In proc. CVPR 2008. Anchorage, Alaska.


Epitomic location recognition

Goal

Introduction

Recognition

Enhancements

Evaluation



Location Recognition

  • Where am I?

    • Instance recognition

    • Category recognition (more difficult)

Lobby? Cubicle? Hallway? Kitchen?



Geometry Based Recognition

  • SLAM & structure from motion

    • Is metric reconstruction really necessary for recognition?

    • Geometry-based methods lose the flexibility to recognize location classes.

Training Images

Local Feature Database

Geometry & Labels

Testing Image

Features

F. Schaffalitzky and A. Zisserman

G. Schindler, M. Brown, R. Szeliski



Appearance Based Recognition

  • Capture global appearance information

    • Gaussian mixture model used by A. Torralba et al.

Preprocessing

Image

Vectors

Training

Training Images

Appearance Model

(e.g. PCA)

A. Torralba, K. Murphy, W. T. Freeman and M. A. Rubin

M. Cummins and P. Newman



Appearance or Geometry?

  • Can we do better by fusing both kinds of information?

A small example with 2 location labels: cubicle and corridor



The Simplest Model

  • Nearest neighbor classification

    • Naive but still effective with enough samples.

    • A small shift may disrupt the recognition.

    • Does not capture uncertainty.
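To make the shift-sensitivity point concrete, here is a minimal sketch of nearest-neighbor classification on raw pixels. The function names, the toy 8x8 random "images", and the use of squared Euclidean distance are assumptions for illustration only; the slides do not specify a distance measure.

```python
import numpy as np

def nn_distances(test_image, train_images):
    """Squared Euclidean distance from one image to every training image."""
    diffs = train_images - test_image[None, :, :]
    return np.sum(diffs ** 2, axis=(1, 2))

def nearest_neighbor_label(test_image, train_images, train_labels):
    """Return the label of the closest training image."""
    return train_labels[int(np.argmin(nn_distances(test_image, train_images)))]

# Toy data: two random 8x8 "images" standing in for the two locations.
rng = np.random.default_rng(0)
train_images = rng.random((2, 8, 8))
train_labels = ["cubicle", "corridor"]

# A slightly noisy copy of the cubicle image is still recognized ...
noisy = train_images[0] + 0.01 * rng.standard_normal((8, 8))
label_noisy = nearest_neighbor_label(noisy, train_images, train_labels)

# ... but a 3-pixel horizontal shift moves it far away from its own source.
shifted = np.roll(train_images[0], 3, axis=1)
dist_noisy = nn_distances(noisy, train_images)[0]
dist_shifted = nn_distances(shifted, train_images)[0]
```

The classifier returns a single label with no confidence attached, which is the "does not capture uncertainty" limitation noted above.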



How to Incorporate Translation Invariance?

  • We need something better than a “bag of frames” model

Training images

Testing image



Panorama

  • It models both appearance & geometry

    • Adapts to camera rotation and focal length change

  • Generative

    • An image is a patch “extracted” from the panorama

M. Brown and D. G. Lowe



Cons of Panoramas

  • Not easy to build a panorama due to parallax

  • Do not capture uncertainty

  • Only work for location instance recognition

  • No compact representation for repetitive scenes



Gaussian Mixture Model

  • Six mixture components, trained as in the paper by Torralba et al.

    • Handles uncertainty, but has no translation invariance

Remove boundaries

Much more blurred

Means

Variances



A Weak Panorama

  • 3D motions can be roughly modeled by 2D translation + scaling.

2D translation

Scaling



Epitome = Panorama + GMM

  • Epitome

    • Generative model for image patches / video frames

    • Captures repetitive patterns in the original image

    • Mapping = 2D translation + scaling

Epitome

A source image

Image patches

N. Jojic et al., ICCV 2003; N. Petrovic et al., CVPR 2006



Epitome as Probabilistic Panorama

  • Model 3D scenes rather than a single 2D image

Location Epitome

Means

Variances

Environment = Virtual panorama



Learning the Location Epitome

  • Initialize epitome randomly

  • EM Iterations

    • E-step: infer the posteriors over all mappings

    • M-step: use the posteriors as weights to update the mean and variance of epitome pixels
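The EM steps listed above can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes grayscale patches, translation-only mappings (no scaling), a wrap-around epitome, and a uniform prior over mappings. The function name `learn_epitome` and its parameters are hypothetical.

```python
import numpy as np

def learn_epitome(patches, epitome_shape, n_iters=5, min_var=1e-2, seed=0):
    """EM for a grayscale epitome with translation-only, wrap-around mappings.

    patches: array of shape (n, h, w).
    Returns (mean, var), each of shape epitome_shape.
    """
    rng = np.random.default_rng(seed)
    n, h, w = patches.shape
    He, We = epitome_shape
    assert h <= He and w <= We, "patches must fit inside the epitome"

    # Initialize the epitome randomly, around the overall data statistics.
    mean = patches.mean() + patches.std() * rng.standard_normal((He, We))
    var = np.full((He, We), patches.var() + min_var)

    rows, cols = np.arange(h)[:, None], np.arange(w)[None, :]
    offsets = [(i, j) for i in range(He) for j in range(We)]  # all mappings

    for _ in range(n_iters):
        sum_w = np.zeros((He, We))
        sum_x = np.zeros((He, We))
        sum_x2 = np.zeros((He, We))
        for x in patches:
            # E-step: posterior over all mappings (uniform prior assumed).
            loglik = np.empty(len(offsets))
            for k, (i, j) in enumerate(offsets):
                r, c = (rows + i) % He, (cols + j) % We
                m, v = mean[r, c], var[r, c]
                loglik[k] = -0.5 * np.sum(np.log(2 * np.pi * v) + (x - m) ** 2 / v)
            post = np.exp(loglik - loglik.max())
            post /= post.sum()
            # Accumulate posterior-weighted statistics per epitome pixel.
            for p, (i, j) in zip(post, offsets):
                r, c = (rows + i) % He, (cols + j) % We
                sum_w[r, c] += p
                sum_x[r, c] += p * x
                sum_x2[r, c] += p * x ** 2
        # M-step: posterior-weighted mean and variance of each epitome pixel.
        sum_w = np.maximum(sum_w, 1e-12)
        mean = sum_x / sum_w
        var = np.maximum(sum_x2 / sum_w - mean ** 2, min_var)
    return mean, var

# Toy usage: 3x3 patches cut from a 6x6 wrap-around "scene".
rng = np.random.default_rng(1)
scene = rng.random((6, 6))
patches = np.stack([np.roll(scene, (-i, -j), axis=(0, 1))[:3, :3]
                    for i in range(6) for j in range(0, 6, 2)])
mean, var = learn_epitome(patches, (6, 6))
```

The explicit loops keep the E-step and M-step readable; they are far too slow for real images, where the per-mapping likelihoods would be computed with correlations instead.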

Free energy

EM iterations



Model Comparison

  • The epitome is a mixture-of-Gaussians model whose components share parameters

    • For the same number of parameters, the epitome generalizes better
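A rough parameter count illustrates the sharing. The sketch below assumes a wrap-around epitome with translation-only mappings, diagonal Gaussians, and ignores mixing weights; the sizes are made up for illustration and do not come from the slides.

```python
def epitome_vs_gmm(epitome_shape, patch_shape):
    """Compare component counts of an epitome and a GMM with equal parameters."""
    He, We = epitome_shape
    h, w = patch_shape
    epitome_params = 2 * He * We          # one mean and one variance per epitome pixel
    epitome_components = He * We          # one component per translation (wrap-around)
    gmm_params_per_component = 2 * h * w  # a full patch-sized mean and variance each
    gmm_components = epitome_params // gmm_params_per_component
    return epitome_components, gmm_components

components = epitome_vs_gmm((60, 60), (20, 20))  # → (3600, 9)
```

With these hypothetical sizes, the same 7200 parameters give a conventional mixture 9 components, while the epitome offers 3600 overlapping ones.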



Build Label Maps

  • Each label map gives the posterior probability of a location label given the mapping
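One way label maps could be accumulated is sketched below. It assumes each training image has already been assigned a single best mapping (a hard assignment; a fuller version would use the mapping posterior as soft weights) and that the maps are stored per epitome pixel. The function `build_label_maps` and the smoothing constant `prior` are hypothetical.

```python
import numpy as np

def build_label_maps(offsets, labels, n_labels, epitome_shape, patch_shape, prior=1e-3):
    """Accumulate per-pixel label counts and normalize them across labels.

    offsets: list of (row, col) top-left positions of each training image in the epitome.
    labels:  integer location label of each training image.
    Returns an array of shape (n_labels, He, We) that sums to 1 over the label axis.
    """
    He, We = epitome_shape
    h, w = patch_shape
    counts = np.full((n_labels, He, We), prior)  # small prior avoids division by zero
    for (i, j), lab in zip(offsets, labels):
        counts[lab, i:i + h, j:j + w] += 1.0     # vote for this label where the image maps
    return counts / counts.sum(axis=0, keepdims=True)

# Toy usage: label 0 ("cubicle") maps to the left, label 1 ("corridor") to the right.
label_maps = build_label_maps(
    offsets=[(0, 0), (0, 0), (2, 4)], labels=[0, 0, 1],
    n_labels=2, epitome_shape=(4, 6), patch_shape=(2, 2))
```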

Cubicle label map

Corridor label map

Epitome

Label maps



Recognition from Location Epitomes

  • Fast correlation: infer the best mapping region

  • Sum the pixel-wise votes

  • Temporal smoothing using HMM
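The first two steps above can be sketched as follows. The slide mentions fast correlation; this sketch substitutes a plain exhaustive search over translations so the logic stays visible. It assumes a grayscale epitome without wrap-around and per-pixel label maps; `best_mapping` and `sum_votes` are hypothetical names.

```python
import numpy as np

def best_mapping(image, mean, var):
    """Return the top-left epitome offset with the highest Gaussian log-likelihood."""
    h, w = image.shape
    He, We = mean.shape
    best, best_ll = None, -np.inf
    for i in range(He - h + 1):
        for j in range(We - w + 1):
            m, v = mean[i:i + h, j:j + w], var[i:i + h, j:j + w]
            ll = -0.5 * np.sum(np.log(2 * np.pi * v) + (image - m) ** 2 / v)
            if ll > best_ll:
                best, best_ll = (i, j), ll
    return best

def sum_votes(label_maps, offset, patch_shape):
    """Sum the pixel-wise label votes inside the matched region and normalize."""
    (i, j), (h, w) = offset, patch_shape
    votes = label_maps[:, i:i + h, j:j + w].sum(axis=(1, 2))
    return votes / votes.sum()

# Toy usage: the test image is an exact crop of the epitome mean.
rng = np.random.default_rng(0)
mean, var = rng.random((6, 8)), np.ones((6, 8))
image = mean[2:5, 3:7].copy()
offset = best_mapping(image, mean, var)

# Label 0 dominates the right part of the epitome, label 1 the left part.
label_maps = np.empty((2, 6, 8))
label_maps[0] = np.where(np.arange(8) >= 3, 0.9, 0.1)
label_maps[1] = 1.0 - label_maps[0]
votes = sum_votes(label_maps, offset, image.shape)
```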

Best matching patch

Input testing image

Cubicle label map

Location epitome

Corridor label map
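The temporal smoothing step on this slide could look like the forward (filtering) pass of an HMM, sketched below. The transition model, in which the camera stays in the same location with probability `stay_prob` and otherwise moves to any other location with equal probability, is an assumption; the slides do not give the transition matrix.

```python
import numpy as np

def hmm_filter(frame_probs, stay_prob=0.9):
    """Forward algorithm over per-frame label scores.

    frame_probs: array (T, K) of per-frame label likelihoods (e.g. normalized votes).
    Returns an array (T, K) of filtered label posteriors.
    """
    T, K = frame_probs.shape
    trans = np.full((K, K), (1.0 - stay_prob) / (K - 1))
    np.fill_diagonal(trans, stay_prob)
    belief = np.full(K, 1.0 / K)                      # uniform initial belief
    filtered = np.empty((T, K))
    for t in range(T):
        belief = (trans.T @ belief) * frame_probs[t]  # predict, then weight by evidence
        belief /= belief.sum()
        filtered[t] = belief
    return filtered

# Toy usage: one ambiguous frame (index 3) in an otherwise consistent sequence.
frame_probs = np.array([[0.9, 0.1], [0.9, 0.1], [0.9, 0.1],
                        [0.4, 0.6], [0.9, 0.1], [0.9, 0.1]])
filtered = hmm_filter(frame_probs)
raw_labels = frame_probs.argmax(axis=1)
smooth_labels = filtered.argmax(axis=1)
```

In the toy sequence the raw per-frame decision flips at the ambiguous frame, while the filtered decision stays with the label supported by the neighboring frames.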



Color is not always the best feature

  • Other features besides RGB

    • For example, a stereo feature captures depth information.

    • High stereo accuracy is not needed (an efficient dynamic-programming method is used here)

Corridor

Cubicle

Kitchen



Integrating Multiple Features

  • Stack multiple feature “channels”

Stereo

R

G

B
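Stacking feature channels can be sketched as a simple concatenation along the channel axis. The per-channel standardization below is an added assumption (so that channels with different ranges contribute comparably); the slides do not say how channels are scaled, and `stack_features` is a hypothetical name.

```python
import numpy as np

def stack_features(rgb, disparity):
    """Stack an (H, W, 3) RGB image and an (H, W) stereo disparity map into (H, W, 4)."""
    feats = np.concatenate(
        [rgb.astype(float), disparity[..., None].astype(float)], axis=-1)
    # Standardize each channel so that no single feature dominates (assumption).
    mean = feats.mean(axis=(0, 1), keepdims=True)
    std = feats.std(axis=(0, 1), keepdims=True) + 1e-8
    return (feats - mean) / std

# Toy usage with random data in place of a real stereo pair.
rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(4, 5, 3))
disparity = rng.random((4, 5)) * 64.0
features = stack_features(rgb, disparity)
```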



Local Histograms

  • Enable better translation invariance and more generalization

    • Error rate: 0.49 → 0.36 on a 4-class test dataset

  • Improves efficiency dramatically: a 30-times speed-up
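A minimal sketch of local histograms for a single feature channel follows. It assumes values in [0, 1], non-overlapping cells, and an arbitrary bin count; the cell size and number of bins here are illustrative and not taken from the slides.

```python
import numpy as np

def local_histograms(channel, cell=(8, 8), n_bins=10):
    """Replace each non-overlapping cell of a channel by its normalized histogram.

    channel: array (H, W) with values in [0, 1].
    Returns an array (H // c1, W // c2, n_bins); each histogram sums to 1.
    """
    c1, c2 = cell
    H, W = channel.shape
    Hc, Wc = H // c1, W // c2
    # Quantize pixel values into bin indices (the value 1.0 goes to the last bin).
    bins = np.minimum((channel[:Hc * c1, :Wc * c2] * n_bins).astype(int), n_bins - 1)
    # Group the pixels of each cell along the last axis.
    blocks = bins.reshape(Hc, c1, Wc, c2).transpose(0, 2, 1, 3).reshape(Hc, Wc, c1 * c2)
    hist = np.zeros((Hc, Wc, n_bins))
    for b in range(n_bins):
        hist[..., b] = (blocks == b).sum(axis=-1)
    return hist / (c1 * c2)

# Toy usage: a 16x24 channel becomes a 2x3 grid of 10-bin histograms.
rng = np.random.default_rng(0)
channel = rng.random((16, 24))
hists = local_histograms(channel)

# Shifting content inside a single cell leaves that cell's histogram unchanged.
moved = channel.copy()
moved[:8, :8] = np.roll(channel[:8, :8], 2, axis=1)
hists_moved = local_histograms(moved)
```

Because a histogram discards pixel positions within its cell, small shifts change the representation little, and the spatial grid shrinks by the cell size in each dimension.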



Supervised Learning

  • Incorporates training image labels

  • Helps discriminate images with similar features but different location labels.

A microwave in the kitchen

An example epitome

A monitor in the cubicle

Discriminative features

An example label feature



MIT Image Database

  • Created by Antonio Torralba et al.

    • 17 sequences, 62 locations, 7 categories, 72077 images



Results on Recognizing Location Instances

  • Location epitome vs. GMM: 10% better on average



Results on Recognizing Location Classes

  • Location epitome vs. GMM: 10%-20% better



MSRC Data Set

  • Captured with a stereo camera

    • 5409 images captured at 4 fps

    • 11 sequences and 7 classes

corridor_visionlab

cubicle_mlp

kitchen-fl2-north

lectureroom-large

lectureroom-small

stairs-1st-to-2nd

stairs-2nd-to-1st



Integrate Depth Cues

corridor_visionlab

cubicle_mlp

kitchen-fl2-north

lectureroom-large

lectureroom-small

stairs-1st-to-2nd

stairs-2nd-to-1st



Instance Recognition with Multiple Features

  • RGB & stereo outperforms the other feature combinations

  • Learning: 5.7 fps

  • Recognition: 116 fps = 29 times the capture speed



Summary

  • A generative model for the recognition of both location instances and classes

    • Fast: capable of real-time applications

    • Flexible: capable of integrating various features

    • Probabilistic: capable of capturing uncertainties

  • Future applications

    • Navigation for visually impaired people

    • Appearance-based loop closing for SLAM problems



Thank you!




Local Histograms (2)

  • Improves efficiency (both training and testing)

    • The bottleneck: convolving the epitome with the images

    • Compression rate: $3\,(C_1 C_2)^2 / 50 = 2400$

  • Learning: 3 hours → 6 minutes, 30 times faster
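One reading of the compression-rate formula, assuming the correlation cost scales with image size times epitome size times feature dimension (an assumption; the slide gives only the final expression). Here $M \times N$ is the image size, $M_e \times N_e$ the epitome size, and $C_1$, $C_2$ the histogram cell sizes along the two axes:

```latex
\begin{align*}
\text{cost}_{\text{RGB}}  &\propto 3 \, M N \, M_e N_e, \\
\text{cost}_{\text{hist}} &\propto 50 \, \frac{M}{C_1}\,\frac{N}{C_2}\,\frac{M_e}{C_1}\,\frac{N_e}{C_2}, \\
\frac{\text{cost}_{\text{RGB}}}{\text{cost}_{\text{hist}}}
  &= \frac{3\,(C_1 C_2)^2}{50} = 2400
  \;\Longrightarrow\; C_1 C_2 = \sqrt{\frac{2400 \cdot 50}{3}} = 200 .
\end{align*}
```

The value $C_1 C_2 = 200$ is implied by the stated rate rather than given on the slide. The reported learning speed-up is consistent with the stated times: 180 minutes / 6 minutes = 30.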

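[Figure: convolving an $M_e \times N_e$ epitome with an $M \times N$ image using 3-dimensional RGB features, compared with convolving an $(M_e/C_1) \times (N_e/C_2)$ epitome with an $(M/C_1) \times (N/C_2)$ image using 50-dimensional local histograms.]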

