Steve Branson 1 Catherine Wah 1 Boris Babenko 1 Florian Schroff 1 Peter Welinder 2 Pietro Perona 2 Serge Belo

Visipedia: Visual Recognition with Humans in the Loop

Presentation Transcript

  1. Visipedia: Visual Recognition with Humans in the Loop

Steve Branson1, Catherine Wah1, Boris Babenko1, Florian Schroff1, Peter Welinder2, Pietro Perona2, Serge Belongie1

1 Computer Science and Engineering, University of California, San Diego, {sbranson,cwah,bbabenko,gschroff,sjb}@cs.ucsd.edu
2 Electrical Engineering, California Institute of Technology, {welinder,perona}@caltech.edu

Abstract

We introduce Visipedia, a user-generated encyclopedia of visual knowledge that is intended to enrich the content of Wikipedia. Visual data is the predominant sensory input through which people observe the world; people are visual learners, and visual images are fundamentally important to the ways in which people encode knowledge and perceive the world. Unfortunately, the organization of visual content on the web is still very impoverished. This is in large part due to the raw size and complexity of images and the absence of scalable computer vision algorithms capable of automatically recognizing or organizing images on a semantic level. The shortcomings of computer vision algorithms can in part be explained by a shortage in the quantity and quality of the labeled visual images necessary for training machine learning algorithms. We propose a collaborative effort between computers and humans toward the development of Visipedia, where the initial user-generated population of Visipedia will help train machine learning algorithms, which will in turn help automate the process of building Visipedia. Toward this aim, we propose new paradigms for interactive algorithms combining computer vision with user input, richer representations of visual objects than are traditionally studied in computer vision, and learning algorithms that are more scalable to Internet-scale recognition.

What is Visipedia?
• Visual counterpart to Wikipedia
• User-generated encyclopedia of visual knowledge
• An effort to associate Wikipedia articles with large quantities of well-organized, intuitive visual concepts
• A paradigm for combining computer vision and machine learning with human annotation

Motivation
• Need for more training data
• Need for more realistic data
• Dealing with Many Related Classes

Interactive Object Recognition

Visual 20-Questions Game
• Choose question to maximize expected information gain

Attribute-based Classification
• Visual attributes from http://www.whatbird.com
• Attribute classification tasks might be easier
• Easier to incorporate human knowledge
• Attribute labeling: MTurk interface

Caltech Birds-200 Dataset
• Image harvesting: text search of species name on Flickr
• Data cleaning: identifying bird presence/absence with Amazon Mechanical Turk (“a marketplace for work that requires human intelligence” [http://www.mturk.com])

Modeling User Responses
• MTurker Label Certainty
• User Responses are Stochastic

Adding Computer Vision Helps
• Computer vision reduces manual labor
• Computer vision improves performance
• Different questions are asked with and without computer vision

Recognition is not Always Successful

MTurker Feedback
• “These hits were fun. Will you be posting more of them anytime soon? Thanks!”
• “These are Beautiful birds and I am enjoying this hit collection”
• “I really enjoy doing your hits, they are fun and interesting. Thanks.”
• “Love doing these because I'm a bird watcher.”
• “the birds are so cute..hope u can send more kind of birds”
• “I REALLY LOVE THE COLOR OF THE BIRDS.”
• “Thank you for providing this job. The fact that the images are beautiful to look at make it a lot more enjoyable to do!”
• Hourly Wage ≈ $1.25

Figure labels (images not included in the transcript)
• Species: Sayornis, Gray Kingbird, Parakeet Auklet, Western Grebe, Rose-breasted Grosbeak, Indigo Bunting, Blue Grosbeak, Yellow-headed Blackbird, Least Auklet
• Input Image; Computer Vision; Question 1: Is the belly black?; Question 2: Is the bill hooked?; A: YES; A: NO
• (A) Easy for Humans; (B) Hard for Humans; (C) Easy for Humans
• Only CV: Chair? Airplane? … Finch? Bunting? … Yellow Belly? Blue Belly? …
• Q: Is the belly multi-colored? yes (Def.)
• Yellow-headed Blackbird; w/ vision: Q #1: Is the throat white? yes (Def.); w/o vision: Q #1: Is the shape perching-like? no (Def.)
• Rose-breasted Grosbeak; Q: Is the belly red? yes (Def.); Q: Is the breast black? yes (Def.); Q: Is the primary color red? yes (Def.)
• CV + Q #1: Is the crown black? yes (Def.); Rose-breasted Grosbeak
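
The "Visual 20-Questions Game" item in the transcript says only that the next question is chosen to maximize expected information gain. The following is a minimal sketch of that idea, not the authors' implementation. It assumes yes/no questions, a prior over classes (which could, for instance, come from a computer vision classifier), and a table of P(yes | class) per question that absorbs the stochastic user responses the poster mentions. All function names, the question ids `q1` to `q3`, and every probability below are hypothetical and invented for illustration.

```python
import math


def entropy(dist):
    """Shannon entropy in bits of a {class: probability} distribution."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def answer_probability(prior, p_yes, question, answer):
    """P(answer) to a yes/no question, marginalized over classes."""
    return sum(
        pc * (p_yes[c][question] if answer else 1.0 - p_yes[c][question])
        for c, pc in prior.items()
    )


def posterior(prior, p_yes, question, answer):
    """Bayes update of the class distribution after one answer."""
    unnorm = {
        c: pc * (p_yes[c][question] if answer else 1.0 - p_yes[c][question])
        for c, pc in prior.items()
    }
    z = sum(unnorm.values())
    # If the answer has zero probability, leave the prior unchanged.
    return {c: v / z for c, v in unnorm.items()} if z > 0 else dict(prior)


def expected_information_gain(prior, p_yes, question):
    """Expected drop in class entropy from asking the question."""
    h_before = entropy(prior)
    gain = 0.0
    for answer in (True, False):
        p_ans = answer_probability(prior, p_yes, question, answer)
        if p_ans > 0:
            h_after = entropy(posterior(prior, p_yes, question, answer))
            gain += p_ans * (h_before - h_after)
    return gain


def choose_question(prior, p_yes, questions):
    """Pick the question with the largest expected information gain."""
    return max(questions, key=lambda q: expected_information_gain(prior, p_yes, q))


# Invented numbers: a prior over three species from the poster's figures.
prior = {"Indigo Bunting": 0.4, "Blue Grosbeak": 0.4, "Rose-breasted Grosbeak": 0.2}
# Invented P(user answers "yes" | class) for three hypothetical questions.
p_yes = {
    "Indigo Bunting": {"q1": 0.05, "q2": 0.1, "q3": 0.05},
    "Blue Grosbeak": {"q1": 0.05, "q2": 0.1, "q3": 0.95},
    "Rose-breasted Grosbeak": {"q1": 0.90, "q2": 0.1, "q3": 0.50},
}
best = choose_question(prior, p_yes, ["q1", "q2", "q3"])
```

In this toy setup `q2` has the same answer probability for every class, so asking it cannot change the class distribution and its expected gain is zero. Changing the prior changes which question ranks first, which is one way to read the poster's remark that different questions are asked with and without computer vision.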
