
Anand Kulkarni Björn Hartmann University of California, Berkeley Matthew Can Stanford University

Turkomatic: Collaboratively Crowdsourcing Complex Work With Turkomatic


Presentation Transcript


  1. Turkomatic: Collaboratively Crowdsourcing Complex Work With Turkomatic. Anand Kulkarni, Björn Hartmann (University of California, Berkeley); Matthew Can (Stanford University)

  2. Microtask marketplaces excel at simple, repetitive work.

  3. Microtask marketplaces excel at simple, repetitive work. Transcribe a business card.

  4. Microtask marketplaces excel at simple, repetitive work. Transcribe a business card. Look up a fact online.

  5. Much of the work we do in our daily lives is not simple or repetitive. “Arrange my trip to Seattle.” “Create algebra problems for my mathematics exam.” “Write a research paper.” “Create a small piece of software.” “Write a blog about Mechanical Turk with a few good entries.”

  6. How do we crowdsource complex work?

  7. Complex work with crowds. Soylent: Editing word processing documents (Bernstein et al. ’10). VizWiz: Answering queries about visual scenes (Bigham et al. ’10). More complex applications: PlateMate [NHZG11], Adrenaline [BBMK11], CrowdForge [KSK11]…

  8. Workflows: Crowd Algorithms Divide complex tasks into a sequence of microtasks arranged in a workflow Soylent, Bernstein et al, UIST 2010

  9. Workflow design is labor-intensive 1. Design individual HITs 2. Implement parallelism to make sure tasks are done correctly 3. Write software to launch HITs and parse worker results 4. Test workflow by running program 5. Identify errors 6. Iterate from step 1

  10. Workflow design is labor-intensive Difficult and domain-specific: Workflow design requires extensive up-front iteration and experimentation and is specific to a given task domain. Inaccessible to non-experts: Few have the patience to implement this process in code

  11. What is Turkomatic? Turkomatic is a system for crowdsourcing high-level complex and creative work where the crowd designs the workflow.

  12. What is Turkomatic? Create a new blog about Mechanical Turk with two posts.

  13. Price-Divide-Solve (PDS) How do we induce the crowd to design a workflow?

  14. Price-Divide-Solve (PDS) PDS is a divide and conquer algorithm to create workflows. Price: Can this task be solved for 20 cents? If yes: Solve task and return the answer. If no: Divide task into multiple steps. For each step, recurse. Merge steps into solution.

  15. Price-Divide-Solve (PDS) PDS is a divide and conquer algorithm to create workflows. Price: Can this task be solved for 20 cents? If yes: Solve task and return the answer. If no: Divide task into multiple steps. For each step, recurse. Merge steps into solution.
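
The recursion on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Turkomatic implementation: the `Task` class, the four callbacks (`can_solve`, `divide`, `solve`, `merge`), and the `max_depth` guard are hypothetical names introduced here. In the real system each callback would stand for one or more HITs answered by crowd workers; here they are plain functions so the control flow can be run locally.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Task:
    """One node in the workflow tree that the crowd builds."""
    description: str
    subtasks: List["Task"] = field(default_factory=list)
    solution: str = ""


def price_divide_solve(
    task: Task,
    can_solve: Callable[[str], bool],        # Price: "solvable for 20 cents?"
    divide: Callable[[str], List[str]],      # Divide: split into steps
    solve: Callable[[str], str],             # Solve: answer a simple task
    merge: Callable[[str, List[str]], str],  # Merge: combine step answers
    max_depth: int = 5,                      # hypothetical guard, not in the slides
) -> str:
    # Price: if the task is simple enough (or the guard trips), solve it directly.
    if max_depth <= 0 or can_solve(task.description):
        task.solution = solve(task.description)
        return task.solution
    # Divide: record the proposed steps as child nodes.
    task.subtasks = [Task(step) for step in divide(task.description)]
    # Recurse on each step, then merge the partial answers.
    partials = [
        price_divide_solve(sub, can_solve, divide, solve, merge, max_depth - 1)
        for sub in task.subtasks
    ]
    task.solution = merge(task.description, partials)
    return task.solution


# Stand-ins for crowd answers, using the blog example from the talk.
BLOG = "Create a new blog about Mechanical Turk with two posts."
STEPS = {
    BLOG: [
        "Create a new blog on Wordpress.com.",
        "Write one entry for a blog.",
        "Write a second entry for a blog.",
    ]
}


def demo_can_solve(description: str) -> bool:
    return description not in STEPS


def demo_divide(description: str) -> List[str]:
    return STEPS[description]


def demo_solve(description: str) -> str:
    return f"[done: {description}]"


def demo_merge(description: str, partials: List[str]) -> str:
    return " ".join(partials)


root = Task(BLOG)
result = price_divide_solve(root, demo_can_solve, demo_divide, demo_solve, demo_merge)
print(result)
```

The depth guard is a design choice made for this sketch only: without some bound, a crowd that keeps answering "no" at the Price step would make the recursion run forever.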

  16. Price-Divide-Solve (PDS) Redundancy is used at each step to ensure quality. [Diagram: three redundant stages. Price check: a majority yields a consensus on price. Divide task: a vote selects the best subdivision. Solve task: a vote selects the best solution.]
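
The two aggregation rules named on this slide, a majority on the price check and a vote over candidate subdivisions or solutions, can be sketched as follows. The function names, the strict-majority threshold, and the tie-breaking rule are assumptions made for illustration; the slide does not say how many workers are asked or how ties are resolved.

```python
from collections import Counter
from typing import List, Optional


def majority(answers: List[str], threshold: float = 0.5) -> Optional[str]:
    """Return the answer given by more than `threshold` of workers, else None."""
    if not answers:
        return None
    winner, count = Counter(answers).most_common(1)[0]
    return winner if count / len(answers) > threshold else None


def best_by_votes(candidates: List[str], votes: List[int]) -> str:
    """Pick the candidate with the most votes.

    `votes` holds indices into `candidates`. Out-of-range indices are ignored.
    Ties go to the earlier candidate (an arbitrary choice for this sketch).
    Raises ValueError if `candidates` is empty.
    """
    tally = Counter(votes)
    best_index = max(range(len(candidates)), key=lambda i: (tally[i], -i))
    return candidates[best_index]


# Price check: three hypothetical workers answer "solvable for 20 cents?"
print(majority(["no", "no", "yes"]))

# Divide: three hypothetical subdivisions, five hypothetical votes.
subdivisions = ["split A", "split B", "split C"]
print(best_by_votes(subdivisions, [1, 1, 2, 0, 1]))
```

Returning `None` when no answer clears the threshold leaves the caller free to ask more workers rather than accept a split decision.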

  17. Price-Divide-Solve (PDS) Task: Create a new blog about Mechanical Turk with two posts. Price: Can we solve it for 20 cents?

  18. Price-Divide-Solve (PDS) Task: Create a new blog about Mechanical Turk with two posts. Price: Can we solve it for 20 cents? No.

  19. Price-Divide-Solve (PDS) Task: Create a new blog about Mechanical Turk with two posts. Price: Can we solve it for 20 cents? No. Divide: Divide it into two or more steps. Steps: Create a new blog on Wordpress.com. Write one entry for a blog. Write a second entry for a blog.

  20. Price-Divide-Solve (PDS) Task: Create a new blog about Mechanical Turk with two posts. Divide: Divide it into two or more steps. Steps: Create a new blog on Wordpress.com. Write one entry for a blog. Write a second entry for a blog.

  21. Price-Divide-Solve (PDS) Task: Create a new blog about Mechanical Turk with two posts. Steps: Create a new blog on Wordpress.com. Write one entry for a blog. Write a second entry for a blog. Price (for each step): Can we solve it for 20 cents?

  22. Price-Divide-Solve (PDS) Task: Create a new blog about Mechanical Turk with two posts. Steps: Create a new blog on Wordpress.com. Write one entry for a blog. Write a second entry for a blog. Price (for each step): Can we solve it for 20 cents? Yes. Yes. Yes.

  23. Price-Divide-Solve (PDS) Task: Create a new blog about Mechanical Turk with two posts. Solve: Create a new blog on Wordpress.com. Write one entry for a blog. Write a second entry for a blog.

  24. Price-Divide-Solve (PDS) Task: Create a new blog about Mechanical Turk with two posts. Solve: Create a new blog on Wordpress.com. Write one entry for a blog. Write a second entry for a blog. Worker output: “Welcome to my blog about Mechanical Turk! Here, I’ll be posting some of my favorite recipes for Mechanical Turk. You’ll be able to follow along at home and create delicious HITs. From the comfort of your own home! Stay tuned and i’ll show you some of the best strategies for keeping your Turk workers engaged.”

  25. Price-Divide-Solve (PDS) Task: Create a new blog about Mechanical Turk with two posts. Solve: Create a new blog on Wordpress.com. Write one entry for a blog. Write a second entry for a blog. Worker output: “You may be inclined to price your HITs at the lowest possible rate, but this isn’t always the best choice. Instead, you should base your pricing on: -How long will the HIT take? -Is the HIT similar to other HITs? If so, price it slightly less than theirs. -If the HIT involves a lot of qualifications, you may want to price it higher, to attract more qualified workers.”

  26. Price-Divide-Solve (PDS) Task: Create a new blog about Mechanical Turk with two posts. Steps: Create a new blog on Wordpress.com. Write one entry for a blog. Write a second entry for a blog. Merge: Combine the results of solved steps. Result: mtworker.wordpress.com

  27. mtworker.wordpress.com

  28. Can this task be solved for 20 cents? Task: Write a blog about Mechanical Turk. [Yes] [No] [Submit]

  29. Break down the following task. Task: Write a blog about Mechanical Turk. Step 1: Step 2: [Add Step] [Submit]

  30. Solve the following task. Task: Create a new blank blog on Wordpress. [Submit]

  31. Merge the following subtasks. Task: Write a blog about Mechanical Turk. Workers previously divided this task into simpler steps and solved each step. Combine their work into a complete solution. Step 1: Create a blank blog about Mechanical Turk [answer: www...] Step 2: Write a blog post about Mechanical Turk. [answer: This post is…] [Submit]

  32. Price-Divide-Solve (PDS) PDS guides the crowd to design workflows in a particular way. It can attempt to create a workflow for any task, but it can’t produce all workflows. Write a sentence. Improve the previous worker’s answer. Check that the previous answer was improved.

  33. System Recap [Diagram: requester interface, algorithm (Price, Divide, Solve), worker interface, system output.]

  34. Experiment 1: Can the crowd plan and execute workflows using PDS? Over 150 trials, including: • Java programming • Booking restaurants • Sorting and cleaning data • Blogging • Creating self-portraits • Solving an SAT • Logo design • Travel planning • Writing essays • Web research …

  35. Experiment 1: Can the crowd plan and execute workflows using PDS? Over 150 trials, including: • Java programming • Booking restaurants • Sorting and cleaning data • Blogging • Creating self-portraits • Solving an SAT • Logo design • Travel planning • Writing essays • Web research …

  36. Experiment 1: Success Modes Write a 3-paragraph essay about whether it’s ever OK to lie. Write one sentence to open the conclusion. Write 2-3 sentences in the middle of the conclusion. Write a concluding sentence. Write one paragraph arguing it’s OK to lie sometimes. Write one paragraph suggesting it’s never OK to lie. Write a conclusion reconciling the two.

  37. Experiment 1: Success Modes Data: • 6 subnodes were produced • 44 separate worker judgments were used • Task completed with a full essay

  38. Experiment 1: Success Modes “…although many people believe it is always essential to tell the truth, sometimes it may be better to lie. There is credibility in both views. And like many ethical decisions, sometimes the circumstances dictate. When you tell the truth you develop a stronger bond of trust with those around you. A relationship can not exist without trust. If you lie, you end up telling more lies to cover the first….”

  39. Experiment 1: Failure Modes There are two ways we found that the algorithm could fail: -Failing to terminate at all -Completing, but producing wrong answers

  40. Experiment 1: Failing to terminate Plan a trip from New York to S.F. that visits 5 interesting places. Think about where to go next in Ohio. Think about where to go next in Ohio.

  41. Experiment 1: Wrong answers List the department chairs of the top 20 US programs in CS. • aalto armchair • poang lounge chair • adirondack chair • aeron chair • balans chair • ball chair • ….

  42. Why does the crowd lose context? Turkomatic worker: “…I’ve taken a look at your instructions, and I understand them perfectly. However, this task seems to have been inadvertently sabotaged by other turkers who do not understand what you are asking them to do…”

  43. Long workflows involve increasing chains of trust. Each individual worker has a ~30% probability of failure [Chi/Kittur/Suh ’08, Bernstein et al. ’10]. Weakest link problem: If one worker early in the workflow design process makes mistakes, the subsequent decompositions will fail.
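
A rough calculation shows why long chains are fragile. The sketch below takes the ~30% per-worker failure rate from the slide and assumes, purely for illustration, that workers fail independently and that no redundancy or voting catches their mistakes. Under those assumptions the chance that every worker in a chain of n succeeds is 0.7 raised to the power n. The function name is hypothetical.

```python
def chain_success_probability(n_workers: int, p_fail: float = 0.30) -> float:
    """Probability that all n workers in a sequential chain succeed.

    Assumes independent failures and no error correction between steps.
    """
    if n_workers < 0:
        raise ValueError("n_workers must be non-negative")
    return (1.0 - p_fail) ** n_workers


for n in (1, 2, 5):
    print(n, round(chain_success_probability(n), 3))
# prints:
# 1 0.7
# 2 0.49
# 5 0.168
```

Even two unchecked steps in sequence drop the success rate below one half under these assumptions, which is one way to read the weakest link problem.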

  44. Including context doesn’t suffice

  45. One explanation What if we used more competent workers?

  46. Experiment 2: Can expert workers make Turkomatic work? Setup: We recruited five graduate students with experience as requesters on Mechanical Turk. We ran the PDS algorithm on three complex tasks with this crowd: online research, essay writing, and creating a blog.

  47. Experiment 2: Can expert workers make Turkomatic work? Results: Each of three tested tasks completed correctly when we used only expert workers!

  48. Experiment 2: Can expert workers make Turkomatic work? Results: Each of three tested tasks completed correctly when we used only expert workers! Conclusion: PDS works well with qualified crowds.

  49. How can we successfully run PDS with unskilled workers?
