Data annotation with Amazon Mechanical Turk.

Data annotation with Amazon Mechanical Turk
[image] x 100 000 = $5000
Alexander Sorokin, David Forsyth
University of Illinois at Urbana-Champaign
http://vision.cs.uiuc.edu/annotation/

Motivation • Unlabeled data is free (47M Creative Commons-licensed images at Flickr) • Labels are useful • We need large volumes of labeled data • Different labeling needs: • Is there X in the image? • Outline X. • Where is part Y of X? • Of these 500 images, which belong to category X? • ... and many more

Amazon Mechanical Turk
[Diagram: Task -> Broker (www.mturk.com) -> Workers]
• Task: Is this a dog? o Yes o No
• Answer: Yes
• Pay: $0.01
• $0.01 x 100 000 = $1 000

Annotation protocols • Type keywords • Select relevant images • Click on landmarks • Outline something • Detect features • ... anything else

Type keywords $0.01 http://austinsmoke.com/turk/.

Select examples Joint work with Tamara and Alex Berg http://vision.cs.uiuc.edu/annotation/data/simpleevaluation/html/horse.html

Select examples $0.02 requester mtlabel

Click on landmarks $0.01 http://vision-app1.cs.uiuc.edu/mt/results/people14-batch11/p7/

Outline something $0.01 http://vision.cs.uiuc.edu/annotation/results/production-3-2/results_page_013.html Data from Ramanan NIPS06

Detect features Measuring molecules. Joint work with Rebecca Schulman (Caltech) ?? $0.1 http://vision.cs.uiuc.edu/annotation/all_examples.html

Ideal task properties
• Easy cognitive task
  Good: Where is the car? (bounding box)
  Good: How many cars are there? (3)
  Bad: How many cars are there? (132)
• Well-defined task
  Good: Locate the corners of the eyes.
  Bad: Label joint locations. (low-resolution or close-up images)
• Concise definition
  Good: 1-2 paragraphs, fixed for all tasks
  Good: 1-2 unique sentences per task
  Bad: 300-page annotation manual
• Low amount of input
  Good: a few clicks or a couple of words
  Bad: detailed outlines of all objects (100s of control points)

Ideal task properties (continued)
• High volume
  Good: 2-100K tasks
  Bad: <500 tasks (DIY)
• Data diversity
  Bad: Independently label consecutive video frames.
• Data is being used
  Good: Direct input into [active] learning.
  Bad: Let’s build a dataset for other people to use.
• Pay “well”
  Good: try to pay at the market rate, $0.03-$0.05/image
  Good: offer bonuses for good work
  Bad: $0.01 for detailed image segmentation

Price
• $0.01 per image (16 clicks): ~$1500 per 100 000 images, listing fee included
• >1000 images per day: 100 000 images in <4 months
• Amazon listing fee: 10%, $0.005 minimum
• Workers suggested $0.03 - $0.05 per image: $3500 - $5500 per 100 000 images
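The totals on this slide can be reproduced with a little arithmetic. The following is a minimal sketch, assuming the listing fee is charged per image as 10% of the reward with a $0.005 floor; the name `total_cost` is illustrative and is not part of any Amazon API.

```python
from decimal import Decimal

FEE_RATE = Decimal("0.10")   # listing fee: 10% of the reward
FEE_MIN = Decimal("0.005")   # minimum listing fee per task

def total_cost(reward, n_images):
    """Total price in dollars for n_images tasks at the given per-image reward."""
    reward = Decimal(reward)
    # Assumption: the fee is applied to each image separately.
    fee = max(reward * FEE_RATE, FEE_MIN)
    return (reward + fee) * n_images

for reward in ("0.01", "0.03", "0.05"):
    print(f"${reward}/image -> ${total_cost(reward, 100_000)} per 100 000 images")
```

Under these assumptions the minimum fee dominates at every reward level shown, which is why the totals come to $1500, $3500 and $5500 rather than 110% of the raw reward.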

Price-elastic throughput
• $0.01 / 40 clicks: 900 labels in 15 hours
• $0.01 / 14 clicks: 900 labels in 1.6 hours
• $0.01 / 16 clicks: 900 labels in 4 hours
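To compare the three runs directly, the times can be converted to labels per hour. This is a small sketch that assumes each run collected its 900 labels within the stated time; the helper name `labels_per_hour` is hypothetical.

```python
# (clicks per task, hours to finish, labels collected), as read from the slide
runs = [(40, 15.0, 900), (14, 1.6, 900), (16, 4.0, 900)]

def labels_per_hour(hours, labels):
    """Average throughput over the whole run."""
    return labels / hours

for clicks, hours, labels in runs:
    rate = labels_per_hour(hours, labels)
    print(f"$0.01 / {clicks} clicks: {rate:.1f} labels/hour")
```

At a fixed $0.01 reward, the effort asked per task acts as the effective price, so the rate is a rough indicator of how attractive workers find each task.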

Annotation quality
• Annotations agree within 5-10 pixels on a 500x500 screen
• There are bad ones
• Protocol: label people, 14pts; Volume: 305 images
[Figure panels: A, C, E, G]
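One way to check agreement of this kind is to compare two annotations of the same image landmark by landmark. The sketch below is an assumption about how such a check could be done, not the authors' evaluation code; the function names and the 10-pixel default tolerance are illustrative.

```python
import math

def point_distances(a, b):
    """Per-landmark Euclidean distance in pixels between two annotations."""
    if len(a) != len(b):
        raise ValueError("annotations must have the same number of points")
    return [math.hypot(xa - xb, ya - yb) for (xa, ya), (xb, yb) in zip(a, b)]

def agree(a, b, tol=10.0):
    """True if every landmark pair lies within tol pixels (assumed threshold)."""
    return all(d <= tol for d in point_distances(a, b))

# Two hypothetical annotations with two landmarks each (a real one has 14).
first = [(0, 0), (100, 100)]
second = [(3, 4), (100, 106)]
print(point_distances(first, second), agree(first, second))
```

Requiring every landmark to be within tolerance is a strict choice; a mean or median distance would be more forgiving of a single misplaced point.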

Submission breakdown Protocol: label people, box+14pts; Volume: 3078 HITs • We need to “manually” verify the work

Grading tasks
• Take 10 submitted results
• Create a new task to verify the results
• Verification is easy
• Pay the same or a slightly higher price
• Total overhead: 10% (work in progress)
http://vision-app1.cs.uiuc.edu/mt/grading/people14-batch11-small/p1/
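The grading scheme above can be sketched as batching submissions into groups of 10 and pricing one verification task per group. This is a minimal illustration under those assumptions; `grading_batches`, `grading_overhead` and the `price_ratio` parameter are hypothetical names, not part of the authors' tools.

```python
import math

def grading_batches(submissions, batch_size=10):
    """Split submitted results into groups, one grading task per group."""
    return [submissions[i:i + batch_size]
            for i in range(0, len(submissions), batch_size)]

def grading_overhead(n_submissions, batch_size=10, price_ratio=1.0):
    """Grading cost as a fraction of the original labeling cost.

    price_ratio is the grading reward divided by the labeling reward
    (1.0 means the grader is paid the same price per task).
    """
    n_grading_tasks = math.ceil(n_submissions / batch_size)
    return n_grading_tasks * price_ratio / n_submissions

batches = grading_batches(list(range(25)))
print(len(batches), [len(b) for b in batches])
print(grading_overhead(1000))
```

With 10 results per grading task and equal pay, the overhead comes to one tenth of the labeling cost, which matches the 10% figure on the slide; paying graders slightly more raises it proportionally.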

Annotation Method Comparison

How do I sign up? • Go to our web page: http://vision.cs.uiuc.edu/annotation/ • Send me an e-mail: sorokin2@uiuc.edu • Register at Amazon Mechanical Turk http://www.mturk.com

What are the next steps?
• Collecting more data
  • 100K labeled people at $5000
• Accurate models for 2.1D pose estimation
  • Complex models, high accuracy, real time
• Visualization and storage
  • If we all collect labels, how do we share?
• Active learning/Online classifiers
  • If we can ask for labels, why not automatically?
• Limited domain Human-Computer racing
  • Run learning until computer model beats humans

Open Issues
• What data to annotate?
  • Is image resolution important?
  • Images or videos?
  • Licensing?
• How to allocate resources?
  • Uniformly per object category
  • Non-uniformly and use transfer learning
• How much data do we need?
  • What is the value of labeled data?
  • Will 10 000 000 labeled images (for $1M) solve everything?

Acknowledgments Special thanks to: David Forsyth, Tamara Berg, Rebecca Schulman, David Martin, Kobus Barnard, Mert Dikmen, and all workers at Amazon Mechanical Turk. This work was supported in part by the National Science Foundation under IIS-0534837 and in part by the Office of Naval Research under N00014-01-1-0890 as part of the MURI program. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect those of the National Science Foundation or the Office of Naval Research.

Thank you X 100 000 = $5000

References
• Mechanical Turk web site: http://www.mturk.com
• Our project web site: http://vision.cs.uiuc.edu/annotation/
• LabelMe, open annotation tool: http://labelme.csail.mit.edu/
• Games with a purpose (ESP++): http://www.gwap.com/gwap/
• Lotus Hill Research Institute / image parsing: http://www.imageparsing.com/
• Tips on how to formulate a task: http://developer.amazonwebservices.com/connect/thread.jspa?threadID=17867

EXTRA SLIDES

Creative Commons Licenses
• Attribution. You must attribute the work in the manner specified by the author…
• Noncommercial. You may not use this work for commercial purposes.
• ShareAlike. You may distribute the modified work only under the same, similar or a compatible license.
• No Derivative Works. You may not alter, transform, or build upon this work.
Adapted from http://creativecommons.org/licenses/

Flickr images by license
BY          8,831,568
BY-SA       6,137,030
BY-NC-SA   21,678,154
BY-NC      10,724,800
Total:     47,371,552
http://flickr.com/creativecommons/, as of 07/20/08

Motivation X 100 000 = $5000 Custom annotations Large scale Low price

Mechanical Turk terminology • Requester • Worker • HIT (human intelligence task) • Reward • Bonus • Listing fee • Qualification

Commercial applications • Label objects on the highway (asset management) • Create transcripts of video and audio (text-based video search) • Outline a golf course and objects (property valuation) • Write and summarize product reviews

Scalability • My current throughput is 1000 HITs/day • There are 30K - 60K HITs at a time • Workers enjoy what they do • Popular HITs “disappear” very quickly • Scalability is Amazon’s job!

Why talk to us? • We can jump-start your annotation project • We discuss the annotation protocol • You give us sample data (e.g. 100 images) • We run it through MT • We give you detailed step-by-step instructions on how to run it • We can build new tools • All our tools are public • You can always do it yourself

Objective • To build • A simple tool • To obtain annotations • At large scale for • A specific research project • Very quickly • And at low cost

Projects in progress • People joint locations • 2380 images/ 2729 good annotations • Relevant images • Consistency at 20 annotations/set • Annotate molecules • 30% usable data at the first round
