
Assessing the pragmatics of experiments with crowdsourcing: The case of scalar implicature


Presentation Transcript


1. Assessing the pragmatics of experiments with crowdsourcing: The case of scalar implicature
Pranav Anand, Caroline Andrews, Matthew Wagers
University of California, Santa Cruz

2. Experiments & Pragmatic Processing
Case Study: (Embedded) Implicatures
• Each of the critics reviewed some of the movies. (some -> but not all?)
• Depending on the study: no evidence of embedded implicatures (EIs), or evidence for EIs with different response choices
• Worry: Are we adequately testing the influence of methodologies on our data?
• Worry: How much do methodologies themselves influence judgements?
• Previous limitation: lack of subjects and money. Crowdsourcing addresses both problems.

3. Pragmatics of Experimental Situations
• Evaluation Apprehension: subjects know they are being judged
• Teleological Curiosity: subjects hypothesize the "expected" behavior and try to match that ideal
• Worry: How much do methodologies themselves influence judgements? The experiment itself is part of the pragmatic context
• See Rosenthal & Rosnow (1975), The Volunteer Subject

4. Elements of Experimental Context
• Protocol: social context / task specification
• Response Structure: response choices available to the subject, e.g. True/False, Yes/No, a 1-7 scale
• Prompt: the question or directions paired with the Response Structure
• Immediate Linguistic/Visual Context
• Worry: How much do methodologies themselves influence judgements?
• Our goal: explore variations of these elements in a systematic way

5. Experimental Design
• Sample trial prompt: "Is this an accurate description?" Target: "Some of the spices have red lids."
• Linguistic Contexts: All Relevant, All Irrelevant, No Context
• Protocol: Experimental (normal experiment instructions) vs. Annotation (checking the work of unaffiliated annotators)
• Items: 4 implicature targets, 6 some/all controls, 20 fillers (a sketch of the resulting condition space follows below)
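To make the size of this design space concrete, here is a minimal sketch (our own illustration, not from the original slides) that crosses the manipulated elements in Python. The factor levels and item counts are taken from the slide above; the variable names are ours.

```python
from itertools import product

# Factor levels from the design slide; crossing them yields the condition cells.
contexts = ["All Relevant", "All Irrelevant", "No Context"]
protocols = ["Experimental", "Annotation"]

# Item composition per list, as reported on the slide.
items = {"implicature_target": 4, "some_all_control": 6, "filler": 20}

# Enumerate every cell of the 3 (context) x 2 (protocol) design.
for context, protocol in product(contexts, protocols):
    print(f"{protocol:12s} | {context:14s} | "
          f"{sum(items.values())} items per participant")
```

Enumerating the cells this way makes plain why crowdsourcing helps: each added methodological factor multiplies the number of cells that need participants.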

6. Experiment 1: Social Context
• Focus on Protocol: Annotation vs. Experiment
• Accuracy Prompt: "Is this an accurate description?"
• Response categories: Yes, No, Don't Know
• Population: undergraduates

7. Experiment 1: Social Context
• Finding: social context has an effect even when linguistic context does not
• Linguistic context: no effect

8. Experiment 1: Social Context
• Finding: social context has an effect even when linguistic context does not
• Lower SI rate for Annotation (p < 0.05)

9. Experiment 2: Prompt Type
• Accuracy Prompt: "Is this an accurate description?"
  Response categories: Yes, No, Don't Know
• Informativity Prompt: "How informative is this sentence?"
  Response categories: Not Informative Enough, Informative Enough, Too Much Information, False
• Population: Mechanical Turk workers
• Systematic debriefing survey
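As a concrete illustration of how the Prompt and Response Structure elements pair up in Experiment 2, here is a small sketch encoding the two prompt conditions. The wording comes from the slide; the dictionary layout and names are our own illustration, not the authors' materials.

```python
# The two prompt conditions of Experiment 2, each bundling a question
# (the Prompt) with its answer choices (the Response Structure).
PROMPTS = {
    "accuracy": {
        "question": "Is this an accurate description?",
        "responses": ["Yes", "No", "Don't Know"],
    },
    "informativity": {
        "question": "How informative is this sentence?",
        "responses": ["Not Informative Enough", "Informative Enough",
                      "Too Much Information", "False"],
    },
}
```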

10. Experiment 2: Prompt Type
• Effect for Prompt

11. Experiment 2: Prompt Type
• Effect for Prompt (p < 0.001)
• Effect for Context (p < 0.001)

12. Experiment 2: Prompt Type
• Effect for Prompt (p < 0.001)
• Effect for Context (p < 0.001)
• Weak interaction: Prompt x Context (p < 0.06)
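The slides report these effects without naming the test. A standard way to obtain such effects for binary SI responses would be a logistic regression with Prompt, Context, and their interaction as predictors; the sketch below uses statsmodels on simulated placeholder data (the data frame, variable names, and responses are ours, not the study's).

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: one binary SI response (1 = implicature computed)
# per trial, with the Experiment 2 factors attached.
rng = np.random.default_rng(0)
n = 600
df = pd.DataFrame({
    "prompt": rng.choice(["accuracy", "informativity"], size=n),
    "context": rng.choice(["All Relevant", "All Irrelevant", "No Context"],
                          size=n),
    "si": rng.integers(0, 2, size=n),  # placeholder responses
})

# Logistic regression: main effects of prompt and context plus their interaction.
model = smf.logit("si ~ C(prompt) * C(context)", data=df).fit(disp=False)
print(model.summary())
```

A fuller analysis would add random effects for participants and items (a mixed logit), but this plain logit shows how the reported main effects and interaction map onto model terms.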

13. Experiment 2: Prompt Type
• No effect for Protocol

14. Experiment 2: Prompt Type
• Low SI rates overall
• But the debriefing survey indicates that roughly 70% of participants were aware of the some/all contrast

15. Populations
• Turkers: more sensitive to linguistic context; less sensitive to changes in social context / evaluation apprehension
• Undergraduates: more sensitive to Protocol

16. Take Home Points
• Methodological variables should be explored alongside conventional linguistic variables
• Ideal: models of these processes (cf. Schütze 1996)
• Crowdsourcing allows for cheap/fast exploration of parameter spaces
• New normal: don't guess, test
• Controls, norming, confounding … all testable online

17. A potential check on exuberance
• Undergraduates may be WEIRD*, but crowdsourcing engenders its own weirdness:
  • High evaluation apprehension
  • Uncontrolled backgrounds, skillsets, focus levels
  • Unknown motivations
• Ignorance does not necessarily mean diversity
• This requires study if we rely on such participants more
* Henrich et al. (2010), "The Weirdest People in the World?" BBS

18. Acknowledgments
Thanks to Jaye Padgett and the attendees of two Semantics Lab presentations and the XPRAG conference for their comments, to the HUGRA committee for their generous award and support, and to Rosie Wilson-Briggs for stimuli construction.
