Visual Recognition With Humans in the Loop

ECCV 2010, Crete, Greece

Steve Branson

Catherine Wah

Florian Schroff

Boris Babenko

Serge Belongie

Peter Welinder

Pietro Perona

What type of bird is this?

Field Guide: …?

What type of bird is this?

Computer Vision: ? … Bird? Chair? Bottle?


Field guides difficult for average users

  • Computer vision doesn’t work perfectly (yet)
  • Research mostly on basic-level categories

Parakeet Auklet

Visual Recognition With Humans in the Loop

What kind of bird is this?

Parakeet Auklet

Levels of Categorization

Basic-Level Categories

  • Airplane? Chair?
  • Bottle? …

[Griffin et al. ‘07, Lazebnik et al. ‘06, Grauman et al. ‘06, Everingham et al. ‘06, Felzenszwalb et al. ‘08, Viola et al. ‘01, … ]

Levels of Categorization

Subordinate Categories

  • American Goldfinch?
  • Indigo Bunting? …

[Belhumeur et al. ‘08, Nilsback et al. ’08, …]

Levels of Categorization

Parts and Attributes

  • Yellow Belly?
  • Blue Belly?…

[Farhadi et al. ‘09, Lampert et al. ’09, Kumar et al. ‘09]

Visual 20 Questions Game
  • Blue Belly? no
  • Cone-shaped Beak? yes
  • Striped Wing? yes
  • American Goldfinch? yes

Hard classification problems can be turned into a sequence of easy ones
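To make this concrete, here is a minimal, self-contained sketch (not the authors' code; the classes, questions, and probabilities below are illustrative assumptions): each yes/no answer multiplies in a per-class likelihood and renormalizes, so a hard multi-way decision is reduced to a few easy questions.

```python
# p_yes[q][c] = probability a user answers "yes" to question q for class c
p_yes = {
    "cone-shaped beak?": {"American Goldfinch": 0.9, "Indigo Bunting": 0.8, "Arctic Tern": 0.1},
    "striped wing?":     {"American Goldfinch": 0.9, "Indigo Bunting": 0.2, "Arctic Tern": 0.2},
    "blue belly?":       {"American Goldfinch": 0.1, "Indigo Bunting": 0.9, "Arctic Tern": 0.1},
}
classes = ["American Goldfinch", "Indigo Bunting", "Arctic Tern"]
posterior = {c: 1.0 / len(classes) for c in classes}  # start from a uniform prior

# simulate the question-and-answer game from the slide
for q, answer in [("blue belly?", False), ("cone-shaped beak?", True), ("striped wing?", True)]:
    for c in classes:  # Bayes update with the user-response model
        likelihood = p_yes[q][c] if answer else 1.0 - p_yes[q][c]
        posterior[c] *= likelihood
    z = sum(posterior.values())  # renormalize
    posterior = {c: p / z for c, p in posterior.items()}

print(max(posterior, key=posterior.get))  # -> "American Goldfinch"
```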

Recognition With Humans in the Loop
  • Computers: reduce number of required questions
  • Humans: drive up accuracy of vision algorithms

[Diagram: computer vision proposes a question (“Cone-shaped beak?” yes), updates its estimate, and converges (“American Goldfinch?” yes)]
Research Agenda

  • 2010 (heavy reliance on human assistance): Blue belly? no; Cone-shaped beak? yes; Striped wing? yes; American Goldfinch? yes
  • 2015 (more automated, as computer vision improves): Striped wing? yes; American Goldfinch? yes
  • 2025 (fully automatic): American Goldfinch? yes

Field Guides

www.whatbird.com


Basic Algorithm

[Flow diagram: input image → computer vision → class probabilities; at each step, ask the question with max expected information gain, then update]

Question 1: Is the belly black? A: NO
Question 2: Is the bill hooked? A: YES

Without Computer Vision

[Flow diagram: same loop, but a class prior replaces the computer vision estimate]

Question 1: Is the belly black? A: NO
Question 2: Is the bill hooked? A: YES

Basic Algorithm

Select the next question that maximizes expected information gain:

j* = argmax_j I(c; u_j | x, u_1, …, u_t)

  • Easy to compute if we can estimate probabilities of the form p(c | x, u_1, …, u_t), where c is the object class, x the image, and u_1, …, u_t the sequence of user responses
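A sketch of the selection rule under these definitions (helper names are mine; answers are assumed binary): given the current posterior over classes and a per-class probability of a "yes" for each question, pick the question with the largest expected entropy reduction.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits; ignores zero-probability entries."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def expected_info_gain(posterior, p_yes_given_c):
    """posterior: p(c | x, u_1..t), shape (C,).
    p_yes_given_c: P("yes" | c) for one question, shape (C,)."""
    p_yes = float(posterior @ p_yes_given_c)      # predicted P(answer = yes)
    post_yes = posterior * p_yes_given_c / p_yes  # posterior after a "yes"
    post_no = posterior * (1 - p_yes_given_c) / (1 - p_yes)
    return entropy(posterior) - (p_yes * entropy(post_yes)
                                 + (1 - p_yes) * entropy(post_no))

def select_question(posterior, response_model):
    """response_model[q, c] = P("yes" to question q | class c), shape (Q, C)."""
    gains = [expected_info_gain(posterior, row) for row in response_model]
    return int(np.argmax(gains))

# Example: three classes, two candidate questions
posterior = np.array([0.5, 0.3, 0.2])
response_model = np.array([[0.9, 0.1, 0.1],   # question 0 splits class 0 from the rest
                           [0.5, 0.5, 0.5]])  # question 1 is uninformative
print(select_question(posterior, response_model))  # -> 0
```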

Basic Algorithm

p(c | x, u_1, …, u_t) = p(u_1, …, u_t | c) · p(c | x) / Z

(model of user responses × computer vision estimate, divided by a normalization factor Z)
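A small sketch of this decomposition (helper names are assumptions; responses are treated as binary and, as on the next slide, independent given the class):

```python
import numpy as np

def posterior_over_classes(p_c_given_x, responses, response_model):
    """p_c_given_x: computer vision estimate p(c | x), shape (C,).
    responses: list of (question_index, answered_yes) pairs u_1..t.
    response_model[q, c]: P("yes" to question q | class c)."""
    log_post = np.log(p_c_given_x)  # computer vision estimate
    for q, yes in responses:        # model of user responses
        p_yes = response_model[q]
        log_post += np.log(np.clip(p_yes if yes else 1.0 - p_yes, 1e-12, 1.0))
    post = np.exp(log_post - log_post.max())  # log space for numerical stability
    return post / post.sum()                  # divide by normalization factor Z
```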


Modeling User Responses
  • Assume user responses are independent given the true class: p(u_1, …, u_t | c) = ∏_i p(u_i | c)
  • Estimate p(u_i | c) using Mechanical Turk

[Figure: histograms of MTurk answers (grey, red, black, white, brown, blue) to “What is the color of the belly?” for the Pine Grosbeak, split by confidence level: Definitely, Probably, Guessing]
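One simple way to estimate such a response model from crowdsourced labels (a sketch, not the authors' estimator; the tuple layout is made up, and the confidence rating could instead be folded into the response value u):

```python
from collections import Counter, defaultdict

def estimate_response_model(annotations, possible_answers, alpha=1.0):
    """annotations: (class_name, question, answer, confidence) tuples
    collected on Mechanical Turk. Returns Laplace-smoothed estimates
    model[(class_name, question)][answer] ~ p(u | c)."""
    counts = defaultdict(Counter)
    for cls, question, answer, confidence in annotations:
        # simplest variant: pool over confidence levels; alternatively
        # treat (answer, confidence) jointly as the response u
        counts[(cls, question)][answer] += 1
    model = {}
    for key, ctr in counts.items():
        total = sum(ctr.values()) + alpha * len(possible_answers)
        model[key] = {a: (ctr[a] + alpha) / total for a in possible_answers}
    return model

# e.g. estimate_response_model(
#     [("Pine Grosbeak", "belly color?", "red", "Definitely"),
#      ("Pine Grosbeak", "belly color?", "grey", "Guessing")],
#     possible_answers=["grey", "red", "black", "white", "brown", "blue"])
```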

Incorporating Computer Vision
  • Use any recognition algorithm that can estimate: p(c|x)
  • We experimented with two simple methods:

1-vs-all SVM

Attribute-based classification

[Lampert et al. ’09, Farhadi et al. ‘09]
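For instance, a rough sketch of getting p(c|x) from a 1-vs-all linear SVM by squashing its decision scores through a softmax (one common heuristic; Platt-style calibration is another, and this is not necessarily what the authors used):

```python
import numpy as np
from sklearn.svm import LinearSVC  # trains one-vs-rest classifiers for multiclass labels

def fit_one_vs_all(train_features, train_labels):
    return LinearSVC().fit(train_features, train_labels)

def p_class_given_image(svm, features, temperature=1.0):
    """Softmax over 1-vs-all SVM scores as a rough estimate of p(c | x)."""
    scores = np.atleast_2d(svm.decision_function(features)) / temperature
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    exp_scores = np.exp(scores)
    return exp_scores / exp_scores.sum(axis=1, keepdims=True)
```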

Incorporating Computer Vision
  • Used VLFeat and MKL code + color features
  • Multiple kernels: color histograms, color layout, geometric blur, self-similarity, SIFT and color SIFT bags of words, spatial pyramid

[Vedaldi et al. ’08, Vedaldi et al. ’09]
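A sketch of the kernel-combination idea (fixed weights stand in for learned MKL weights; the matrix names are hypothetical):

```python
import numpy as np
from sklearn.svm import SVC

def combine_kernels(gram_matrices, weights=None):
    """gram_matrices: list of (n, n) kernels, one per feature channel
    (color histograms, SIFT bag of words, geometric blur, ...)."""
    if weights is None:
        weights = [1.0 / len(gram_matrices)] * len(gram_matrices)
    return sum(w * k for w, k in zip(weights, gram_matrices))

# K_train: combined (n_train, n_train) kernel over training images
# svm = SVC(kernel="precomputed").fit(K_train, train_labels)
# K_test: (n_test, n_train) kernel between test and training images
# predictions = svm.predict(K_test)
```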

Birds 200 Dataset
  • 200 classes, 6000+ images, 288 binary attributes
  • Why birds?

Parakeet Auklet

Field Sparrow

Vesper Sparrow

Black-footed Albatross

Groove-Billed Ani

Baird’s Sparrow

Henslow’s Sparrow

Arctic Tern

Forster’s Tern

Common Tern



Results: Without Computer Vision

Comparing Different User Models

Results: Without Computer Vision

Perfect Users: 100% accuracy in 8 ≈ log2(200) questions

…if users’ answers agree with field guides
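The 8 comes from a counting argument: with error-free binary answers, each question can at best halve the set of remaining classes, so separating 200 classes takes at least

```latex
\lceil \log_2 200 \rceil = \lceil 7.64\ldots \rceil = 8 \text{ questions.}
```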

Results: Without Computer Vision

Real users answer questions

MTurkers don’t always agree with field guides…

Results: Without Computer Vision

Probabilistic user model: tolerates imperfect user responses

Results: With Computer Vision

Users drive up performance: 19% → 68%

(computer vision alone: 19%)

Results: With Computer Vision

Computer vision reduces manual labor: 11.1 → 6.5 questions

Examples

Different Questions Asked w/ and w/out Computer Vision

Western Grebe

Without computer vision:

Q #1: Is the shape perching-like? no (Definitely)

With computer vision:

Q #1: Is the throat white? yes (Definitely)

Examples

User Input Helps Correct Computer Vision

Magnolia Warbler

Computer vision alone: Common Yellowthroat

Is the breast pattern solid? no (Definitely)

With the user’s answer: Magnolia Warbler

Recognition is Not Always Successful

[Failure cases, even with unlimited questions: Parakeet Auklet confused with Least Auklet, and Acadian Flycatcher confused with Least Flycatcher; e.g. “Is the belly multi-colored?” yes (Definitely)]

Summary
  • Recognition of fine-grained categories
  • Users drive up performance: 19% → 68%
  • Computer vision reduces manual labor: 11.1 → 6.5 questions
  • More reliable than field guides
Future Work
  • Extend to domains other than birds
  • Methodologies for generating questions
  • Improve computer vision
Questions?

Project page and datasets available at:

http://vision.caltech.edu/visipedia/

http://vision.ucsd.edu/project/visipedia/