16-721: Advanced Machine Perception

  • Staff:

    • Instructor: Alexei (Alyosha) Efros (efros@cs), 4207 NSH

    • TA: David Bradley (dbradley@cs), 2216 NSH

  • Web Page:

    • http://www.cs.cmu.edu/~efros/courses/AP06/


Today

  • Introduction

  • Why Perception?

  • Administrative stuff

  • Overview of the course

  • Image Datasets


A bit about me

  • Alexei (Alyosha) Efros

  • Relatively new faculty (RI/CSD)

  • Ph.D. 2003, from UC Berkeley (signed by Arnie!)

  • Research Fellow, University of Oxford, ’03-’04

  • Teaching

  • I am still learning…

  • The plan is to have fun and learn cool things, both you and me!

  • Social warning: I don’t see well

  • Research

  • Vision, Graphics, Data-driven “stuff”


PhD Thesis on Texture and Action Synthesis

Smart Erase button in Microsoft Digital Image Pro:

Antonio Criminisi’s son cannot walk but he can fly


The story begins…

  • “All happy families are alike; each unhappy family is unhappy in its own way.”

  • -- Lev Tolstoy, Anna Karenina

  • “What does it mean, to see? The plain man's answer (and Aristotle's, too) would be, to know what is where by looking.”

  • -- David Marr, Vision (1982)


depth map

Vision: a split personality

  • “What does it mean, to see? The plain man's answer (and Aristotle's, too) would be, to know what is where by looking. In other words, vision is the process of discovering from images what is present in the world, and where it is.”

  • Answer #1: pixel of brightness 243 at position (124,54)

  • …and depth .7 meters

  • Answer #2: looks like bottom edge of whiteboard showing at the top of the image

  • Is the difference just a matter of scale?
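
As a rough illustration of Answer #1, the measurement view treats an image as an array of numbers read off by position. The sketch below is a minimal example on synthetic data. The `measure` helper, the reading of a position as (column, row), and the tiny arrays are all assumptions made for illustration, not anything taken from the slides.

```python
# Hypothetical helper: the "measurement" answer to "what is where?"
# Assumes a position is (x, y) = (column, row) and images are row-major lists.

def measure(brightness, depth, x, y):
    """Return the raw brightness and depth recorded at pixel (x, y)."""
    return {
        "position": (x, y),
        "brightness": brightness[y][x],
        "depth_m": depth[y][x],
    }

# Synthetic 3x4 grayscale image (values 0-255), invented for this example.
brightness = [
    [12, 40, 243, 200],
    [15, 38, 240, 198],
    [10, 35, 90, 60],
]

# Matching synthetic depth map in meters, also invented.
depth = [
    [2.1, 2.0, 0.7, 0.7],
    [2.1, 2.0, 0.7, 0.8],
    [2.2, 2.1, 1.5, 1.6],
]

reading = measure(brightness, depth, x=2, y=0)
print(reading)
```

Nothing in such a reading says "bottom edge of whiteboard"; that is the gap Answer #2 points at.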


Measurement vs. Perception


Brightness: Measurement vs. Perception


Brightness: Measurement vs. Perception

Proof!


Lengths: Measurement vs. Perception

Müller-Lyer Illusion

http://www.michaelbach.de/ot/sze_muelue/index.html


Vision as Measurement Device

Real-time stereo on Mars

Physics-based Vision

Virtualized Reality

Structure from Motion


…but why?

  • Reason #1:

    • Semester too short, can’t cover everything

    • Other great classes offered at CMU, e.g.:

      • Appearance Modeling (Srinivas Narasimhan, every fall)

      • Medical Vision (Yanxi Liu)

      • Structure from Motion (Martial Hebert, sometime?)

  • “But what if I don’t care about this wishy-washy human perception stuff? I just want to make my robot go!”

  • Reason #2:

    • For measurement, other sensors are often better (in DARPA Grand Challenge, vision was barely used!)

  • Reason #3:

  • The goals of computer vision (what + where) are in terms of what humans care about.


So what do humans care about?

slide by Fei-Fei, Fergus & Torralba


Verification: is that a bus?

slide by Fei-Fei, Fergus & Torralba


Detection: are there cars?

slide by Fei-Fei, Fergus & Torralba


Identification: is that a picture of Mao?

slide by Fei-Fei, Fergus & Torralba


Object categorization

sky

building

flag

face

banner

wall

street lamp

bus

bus

cars

slide by Fei-Fei, Fergus & Torralba


Scene and context categorization

  • outdoor

  • city

  • traffic

slide by Fei-Fei, Fergus & Torralba


Rough 3D layout, depth ordering


Challenges 1: viewpoint variation

Michelangelo 1475-1564


Challenges 2: illumination

slide credit: S. Ullman


Challenges 3: occlusion

Magritte, 1957


Challenges 4: scale

slide by Fei-Fei, Fergus & Torralba


Challenges 5: deformation

Xu, Beihong 1943


Challenges 6: background clutter

Klimt, 1913


Challenges 7: object intra-class variation

slide by Fei-Fei, Fergus & Torralba


Challenges 8: local ambiguity

slide by Fei-Fei, Fergus & Torralba


Challenges 9: the world behind the image


In this course, we will:

Take a few baby steps…


Course Organization

  • Requirements:

    • Paper Presentations (50%)

      • Paper Advocate

      • Paper Demo Presenter

      • Paper Opponent

    • Class Participation (20%)

      • Keep annotated bibliography

      • Post questions / comments on QuickTopic

      • Ask questions / debate / fight / be involved!

    • Final Project (30%)

      • Do something with lots of data (at least 500 images)

      • Groups of 1, 2, or 3


Paper Advocate

  • Pick a paper from list

    • That you like and are willing to defend

    • Sometimes I will make you do two papers, or background

  • Meet with me before starting to talk about how to present the paper(s)

  • Prepare a good, conference-quality presentation (20-45 min, depending on difficulty of material)

  • Meet with me again 2 days before class to go over the presentation

    • Office hours at end of each class

  • Present and defend the paper in front of class


Paper Demo Presenter

  • For some papers, we will have separate demo presentations

  • Sign up for a paper you find interesting

  • Get the code online (or implement if easy)

  • Run it on a toy problem, play with parameters

  • Run it on a new dataset

  • Prepare short 5-10 min presentation detailing results

  • Can cooperate with Paper Advocate


Paper Opponent

  • Sign up for a paper you don’t like / are suspicious about

  • Prepare an argument (with or without slides) against the paper:

    • Paper weaknesses

    • Relevance to real problems

    • Existence of better alternative approaches

    • Etc.

  • Present in front of class (5-10 min)


Class Participation

  • Keep annotated bibliography of papers you read (always a good idea!). The format is up to you. At least, it needs to have:

    • Summary of key points

    • A few interesting insights, “aha moments”, keen observations, etc.

    • Weaknesses of approach. Unanswered questions. Areas of further investigation, improvement.

  • Submit your thoughts for current paper(s) before each class (printout)


Class Participation

  • In addition, submit interesting observations or questions to QuickTopic before class for public discussion.

  • Be active in class. Voice your ideas, concerns.

  • You need to participate: either in class or in QuickTopic every week!

  • Dave will be watching and keeping track!


Final Project

  • Can grow out of paper presentation, or your own research

  • But it needs to use large amounts of data!

  • 1-3 people per project.

  • Project proposals in a few weeks.

  • Project presentations at the end of semester.

  • Results presented as a CVPR-format paper.

  • Hopefully, a few papers may be submitted to conferences.


End of Semester Awards

  • We will vote for:

    • Best Paper Presenter

    • Best Paper Opponent

    • Best Demo

    • Best Project

  • Prize: dinner in a nice restaurant


Course Outline

  • Physiology of Vision (1 lecture)

  • Overview of Human Visual Perception (1 lecture)

    • Need presenter for Monday!

  • Part I: Low-level vision (images as texture)

    • Texture segmentation, image retrieval, scene models, “Bag of words” representations

  • Part II: Mid-level vision (segmentation)

    • Principles of grouping, Normalized Cuts, Mean-shift, DD-MCMC, Graph-cut, super-pixels

  • Part III: 2D Recognition

    • Window scanning (Schneiderman+Kanade, Viola+Jones)

    • Correspondence Matching (chamfer matching, Hausdorff distance, shape contexts, invariant features, active appearance models)

    • Recognition with Segmentation (top-down + bottom-up)

    • Words and Pictures
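
The correspondence-matching bullet above names Hausdorff distance as one way to compare shapes. As a minimal sketch (not taken from the course material), the standard definition for two finite point sets fits in a few lines. The function names and the toy point sets are invented for illustration, and the brute-force search is only meant for small inputs.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from any point in a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# Toy "edge point" sets, invented for this example.
template = [(0, 0), (1, 0), (2, 0)]
observed = [(0, 1), (1, 1), (2, 1), (5, 1)]

# Every template point has an observed point exactly 1 unit away.
print(directed_hausdorff(template, observed))

# The outlier (5, 1) is far from every template point, so it dominates
# the symmetric distance.
print(hausdorff(template, observed))
```

The asymmetry between the two directions is why a single stray point can dominate the symmetric score.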


Course Outline (cont.)

  • Part IV: Intrinsic Images

    • Shading vs. reflectance

    • Recovering surface orientations and depth

    • Style vs. content

  • Part V: Dealing with Data

    • Isomap, LLE, Non-negative Matrix Factorization

  • Part VI: Tracking and Motion Segmentation

    • Particle filtering, exemplar-based, layers

  • Sign up to present one paper on Wed on QuickTopic


Datasets

  • See web page

