


Crowdsourcing Complexity: Lessons for Software Engineering

Lydia Chilton

2 June 2014 ICSE Crowdsourcing Workshop


Clarification: Human Computation

Mechanical Turk

Microtasks:




Evolution of Complexity in Human Computation

Task Decomposition: Cascade & Frenzy



1. Collective Intelligence

1906: 787 aggregated votes averaged 1197 lbs.

Actual answer: 1198 lbs.


1. Collective Intelligence

Principles:

  • Small tasks

  • Intuitive tasks

  • Independent answers

  • Simple aggregation

    Application:

    - ESP Game
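
As a rough illustration of "independent answers, simple aggregation", the sketch below averages numeric guesses and takes a plurality vote over labels. The function names and the sample data are invented for illustration; they are not from the talk or from the 1906 data.

```python
from collections import Counter
from statistics import mean, median


def aggregate_estimates(guesses):
    """Combine independent numeric guesses with a simple mean and median."""
    return {"mean": mean(guesses), "median": median(guesses)}


def plurality_label(labels):
    """Return the label proposed most often (ties go to the first one seen)."""
    counts = Counter(labels)
    return counts.most_common(1)[0][0]


# Hypothetical weight guesses in lbs, made up for this example.
guesses = [1100, 1250, 1180, 1300, 1150]
print(aggregate_estimates(guesses))

# Hypothetical labels from independent workers describing one image.
print(plurality_label(["dog", "puppy", "dog", "animal", "dog"]))
```

Both aggregators need no coordination between workers, which is what keeps the tasks small and independent.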


2. Iterative Workflows

work

improve

vote

improve

vote

Collective Intelligence


2. Iterative Workflows

Principles:

  • Use fresh eyes

  • Vote to ensure improvement

    Application:

    - Bug finding

    “given enough eyeballs, all bugs are shallow”
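
The improve/vote loop can be sketched as below. This is a minimal illustration, assuming each "worker" is a function that revises the current best answer and the "vote" is a function that compares two versions; both are hypothetical stand-ins for human tasks, not an API from the talk.

```python
def iterate(initial, improve_steps, vote):
    """Run an improve-then-vote loop.

    improve_steps: list of callables, one per fresh worker, each mapping the
        current best text to a candidate revision.
    vote: callable(current, candidate) -> True if voters prefer the candidate.
    """
    best = initial
    for improve in improve_steps:
        candidate = improve(best)
        if vote(best, candidate):  # keep the revision only if it wins the vote
            best = candidate
    return best


# Stand-ins for human workers (hypothetical): two helpful edits, one harmful.
workers = [
    lambda text: text + " with a red roof",
    lambda text: text[:5],  # a bad edit that truncates the description
    lambda text: text + " by a lake",
]


def prefer_longer(current, candidate):
    """Stand-in for a crowd vote: prefer the longer description."""
    return len(candidate) > len(current)


print(iterate("A house", workers, prefer_longer))
```

The vote step is what guards against the harmful edit: the truncated candidate loses and the previous version is carried forward.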



3. Psychological Boundaries

Applications:

  • Manager / programmer

  • Writer / editor

  • Write code / test code

  • Addition / subtraction

    Principle:

    • Task switching is hard

    • Natural boundaries for tasks


4. Task Decomposition

Legion:Scribe: Real-Time Audio Captioning on MTurk


4. Task Decomposition

Principles:

  • Must be able to break apart tasks AND put them back together.

  • Complex aggregation

  • Hint: Solve backwards. Find what people can do, and build up from there.


5. Worker Choice

Mobi: Trip Planning on MTurk with an open UI.


5. Worker Choice

Applications:

  • Trip planning

  • Conference time table

  • Conference session-making

    Principles:

  • Giving workers freedom relieves requesters’ burden of task decomposition.

  • Workers feel more involved and empowered.

  • BUT complex interface that is difficult to scale.



6. Learning and Doing

Applications:

  • Peer assessment

  • Do grading assignments before you do your own assignment

  • Task Feedback

    Principles:

  • Teaching workers makes them better.

  • How long will they stay?


Lessons for Software Engineering

  • Propose and vote

  • Find natural psychological boundaries between tasks

  • Find the tasks people can do, then assemble them using complex aggregation techniques.

  • Teach.

221 + 473

-221 + 473


Evolution of Complexity in Human Computation

Task Decomposition: Cascade & Frenzy



Cascade: Crowdsourcing Taxonomy Creation

  • Lydia Chilton (UW), Greg Little (oDesk), Darren Edge (MSR Asia),

  • Dan Weld (UW), James Landay (UW)


Problem

  • 1000 eGovernment suggestions

  • 50 top product reviews

  • 100 employee feedback comments

  • 1000 answers to “Why did you decide to major in Computer Science?”

Machines can’t analyze it

People don’t have time to analyze it

  • time consuming

  • overwhelming

  • no right answer


Solution


Solution: Crowdsourced Taxonomies



Initial Prototypes


Iterative Improvement

Problems

The hierarchy grows and becomes overwhelming

Workers have to decide what to do

Lesson

Break up the task more


Initial Approach 2: Category Comparison

  • Problem

    • Without context, it’s hard to judge relationships

      • flying vs. flights

      • TSA liquids vs. removing liquids

      • Packing vs. what to bring

  • Lesson

    • Don’t compare abstractions to abstractions

    • Instead compare data to abstractions


Use Lesson #3

Find the tasks people can do.

Assemble them using complex aggregation techniques.

Categorize

Select Best Labels

Generate Labels


Cascade Algorithm

For a subset of items:

Generate Labels

Select Best Labels → {good labels}

For all items, for all good labels:

Categorize

Then recurse
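
The steps on this slide can be sketched roughly as follows. This is a simplified reading, not the authors' implementation: the three human tasks are passed in as plain functions (`generate`, `select_best`, `fits`), "recurse" is interpreted as repeating the pass on items that no label covers yet, and the travel tips and their hidden topics are invented test data.

```python
import random
from collections import Counter


def cascade(items, generate, select_best, fits,
            sample_size=2, max_rounds=5, seed=0):
    """Build {label: [items]} by repeated generate/select/categorize passes."""
    rng = random.Random(seed)
    categories = {}
    uncovered = list(items)
    for _ in range(max_rounds):
        if not uncovered:
            break
        # Generate Labels: only for a subset of the (still uncovered) items.
        subset = rng.sample(uncovered, min(sample_size, len(uncovered)))
        candidates = Counter(l for item in subset for l in generate(item))
        # Select Best Labels: a voting step yields the {good labels}.
        good_labels = [l for l in select_best(candidates) if l not in categories]
        if not good_labels:
            break
        # Categorize: for all items, for all good labels.
        for label in good_labels:
            categories[label] = [item for item in items if fits(item, label)]
        # "Recurse" (assumed meaning): repeat on items no label covers yet.
        covered = {item for members in categories.values() for item in members}
        uncovered = [item for item in items if item not in covered]
    return categories


# Invented data: each tip has a hidden topic that simulated workers recognize.
HIDDEN_TOPIC = {
    "Put liquids in a clear bag": "security",
    "Wear slip-on shoes": "security",
    "Roll clothes to save space": "packing",
    "Bring an empty water bottle": "packing",
    "Check in online the day before": "check-in",
}
tips = list(HIDDEN_TOPIC)
taxonomy = cascade(
    tips,
    generate=lambda item: [HIDDEN_TOPIC[item]],
    select_best=lambda counts: [label for label, _ in counts.most_common()],
    fits=lambda item, label: HIDDEN_TOPIC[item] == label,
)
print(taxonomy)
```

Note that the categorize step compares data (items) to abstractions (labels), never label to label, which matches the lesson from the category-comparison prototype.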


Aggregate Data into Taxonomy

redundant

  • Blue:

    • Light Blue:

  • Green:

  • Other:

nested

Green

Blue

Light Blue

singletons
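
One plausible way to read the three cases on this slide (redundant, nested, singletons) is in terms of the item sets behind each category. The sketch below is an assumption-laden illustration, not the method from the talk: categories with identical members are merged, a category becomes the child of its smallest strict superset, and items left only in singleton categories are returned separately (for example, to file under "Other"). The colour data is invented.

```python
def build_taxonomy(categories, min_size=2):
    """categories: {label: set of items}. Returns (parents, leftovers)."""
    # Singletons: drop categories with fewer than min_size items.
    kept = {label: frozenset(members) for label, members in categories.items()
            if len(members) >= min_size}

    # Redundant: identical item sets collapse to one label (first alphabetically).
    by_members = {}
    for label in sorted(kept):
        by_members.setdefault(kept[label], label)
    unique = {label: members for members, label in by_members.items()}

    # Nested: the parent is the smallest strict superset, if there is one.
    parents = {}
    for label, members in unique.items():
        supersets = [(len(m), other) for other, m in unique.items() if members < m]
        parents[label] = min(supersets)[1] if supersets else None

    # Items not covered by any remaining category.
    all_items = set().union(*categories.values())
    covered = set().union(*unique.values())
    return parents, sorted(all_items - covered)


# Invented example data.
colour_categories = {
    "Blue": {"sky", "navy", "baby blue", "powder blue"},
    "Bluish": {"sky", "navy", "baby blue", "powder blue"},  # redundant
    "Light Blue": {"baby blue", "powder blue"},             # nested in Blue
    "Green": {"lime", "moss"},
    "Teal": {"teal"},                                        # singleton
}
parents, leftovers = build_taxonomy(colour_categories)
print(parents)
print(leftovers)
```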




Propose, Vote, Test

Workers have good heuristics.

Let them propose categories.

Vote on categories to weed out bad ones.

Test the heuristics by verifying them on data.

Propose

Vote

Test


Lesson

Propose, Vote, Test.


Deploy Cascade to Real Needs

  • CHI 2013 Program Committee

    Organize 430 accepted papers to help session making

  • 40 CrowdCamp Hack-a-thon Participants

    Organize 100 hack-a-thon ideas to help organize teams


430 CHI Papers: Good Results, but…

Patina: Dynamic Heatmaps for Visualizing Application Usage

Effects of Visualization and Note-Taking on Sensemaking and Analysis

Contextifier: Automatic Generation of Messaged Visualizations

Interactive Horizon Graphs: Improving the Compact Visualization of Multiple Time Series

Quantity Estimation in Visualizations of Tagged Text

Motif Simplification: Improving Network Visualization Readability with Fan, Connector, and Clique Glyphs

Evaluation of Alternative Glyph Designs for Time Series Data in a Small Multiple Setting

Individual User Characteristics and Information Visualization: Connecting the Dots through Eye Tracking

"Without the Clutter of Unimportant Words": Descriptive Keyphrases for Text Visualization

Direct Space-Time Trajectory Control for Visual Media Editing

Your eyes will go out of the face: Adaptation for virtual eyes in video see-through HMDs

Swifter: Improved Online Video Scrubbing

Direct Manipulation Video Navigation in 3D

NoteVideo: Facilitating Navigation of Blackboard-style Lecture Videos

Ownership and Control of Point of View in Remote Assistance

EyeContext: Recognition of High-level Contextual Cues from Human Visual Behaviour

Your eyes will go out of the face: Adaptation for virtual eyes in video see-through HMDs

Still Looking: Investigating Seamless Gaze-supported Selection, Positioning, and Manipulation of Distant Targets

Individual User Characteristics and Information Visualization: Connecting the Dots through Eye Tracking

Quantity Estimation in Visualizations of Tagged Text

  • Visualization (19)

    • evaluating infovis(9)

      • text (2)

    • video (6)

    • visualizing time data (5)

    • gaze (4)

      • gaze tracking (3)

    • user requirements (3)

    • color schemes (2)


“Don’t treat me like a Turker.”

“I just want to see all the data”


Lesson

Authority and Responsibility should be aligned.


Frenzy: Collaborative Data Organization for Creating Conference Sessions

Lydia Chilton (UW), Juho Kim (MIT), Paul Andre (CMU),

Felicia Cordeiro (UW), James Landay (Cornell?), Dan Weld (UW),

Steven Dow (CMU), Rob Miller (MIT), Haoqi Zhang (NW)


Groupware

Creating conference sessions is a social process.

Grudin: Social processes are often guided by personalities, tradition, and convention.

Challenge: support the process without seeking to replace these behaviors.

Challenge: remain flexible and do not impose rigid structures.


DEMO


Light-weight contributions

Label

Vote

Categorize


2-Stage Workflow

Set-up

Stage 1: Collect Meta Data

  • 60 PC members

  • Low authority

  • Low responsibility

Stage 2: Session Making

  • 11 PC members

  • High authority

  • High responsibility


Goals

Collect data: labels, votes

Session-Making


Results

Sessions created in record-setting 88 minutes.


Lessons for Software Engineering

  • Propose and vote.

  • Find natural psychological boundaries between tasks.

  • Find the tasks people can do, then assemble them using complex aggregation techniques.

  • Teach.

  • Propose, vote, test.

  • Align authority and responsibility.
