
An Examination of Different Delivery Modes for Interactive IR Studies


Presentation Transcript


  1. An Examination of Different Delivery Modes for Interactive IR Studies Diane Kelly School of Information and Library Science University of North Carolina Schloss Dagstuhl, IIR Seminar, March 03, 2009

  2. Different Types of IIR Studies • Standard Evaluation • Usability Study • Experiment • Requires manipulation of independent variable • Random assignment to condition • Lab and Field Experiments • Log-based Analysis • Information-Seeking (Online and Otherwise)

  3. Other types of IIR Studies • Infrastructure Development (NOT an IIR or “User” study) • “Users” made relevance assessments • “Users” label objects for training data

  4. Different Types of Online Studies • Web Experiment • Remote Usability Studies • Synchronous • Asynchronous • Surveys (Questionnaire Mode) • Correlation Designs • Often used to test psychometric properties of an instrument • Interviews and Focus Groups • Mechanical Turk and ESP

  5. Major Issues to Consider • Validity • Internal • External • Reliability • Sampling • Control • Sources of Variance

  6. “Some say that psychological science is based on research with rats, the mentally disturbed, and college students. We study rats because they can be controlled, the disturbed because they need help, and college students because they are available.” - Birnbaum, M. H. (1999). Testing critical properties of decision making on the Internet. Psychological Science, 10, 399-407, pg. 399.

  7. Some Good Things • Broader range of more diverse participants • Age • Education • Race • Culture • Geography • Sex • … • Targeted Recruitment • Large samples (increased statistical power) • Science becomes more accessible to more people
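The link between larger web samples and statistical power can be made concrete. A minimal sketch using only the Python standard library's `statistics.NormalDist` (the function name, effect size, and sample sizes are illustrative, not from the talk); it approximates the power of a two-sided, two-sample comparison with the normal approximation:

```python
from statistics import NormalDist

def two_sample_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample test for
    standardized effect size d (Cohen's d), via the normal
    approximation to the noncentral t distribution."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    ncp = d * (n_per_group / 2) ** 0.5  # noncentrality parameter
    return 1 - z.cdf(z_crit - ncp)

# Same medium effect (d = 0.5), growing per-group n:
for n in (20, 80, 320):
    print(n, round(two_sample_power(0.5, n), 2))
```

Holding the effect size fixed, quadrupling the per-group sample takes the design from badly underpowered to near-certain detection, which is the usual argument for the large samples web delivery makes feasible.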

  8. More Good Things • Experimental situation is less artificial (although not completely) • Familiarity and comfort with physical situation • No travel time • No coordination • No navigation

  9. And Even More Good Things • Volunteer Bias (?) • Freedom to Quit • In general (condition-independent drop-out ) • As an indicator (condition-dependent drop-out) • Computation of refusal rates • Demand Effects • Experimenter Effects (includes biases introduced during execution of the experiment, data transformation, analysis and interpretation)
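Condition-dependent drop-out, as opposed to general attrition, can be checked with a 2x2 chi-square on started-versus-finished counts per condition. A hedged sketch (the function name and the counts are hypothetical, not from the talk; it assumes every expected cell count is nonzero):

```python
def dropout_chi_square(started, finished):
    """Chi-square statistic (df = 1 for two conditions) for
    drop-out by condition. `started` and `finished` map
    condition name -> participant counts."""
    conds = list(started)
    dropped = {c: started[c] - finished[c] for c in conds}
    total = sum(started.values())
    total_dropped = sum(dropped.values())
    chi2 = 0.0
    for c in conds:
        for observed, col_total in ((dropped[c], total_dropped),
                                    (finished[c], total - total_dropped)):
            expected = started[c] * col_total / total  # under independence
            chi2 += (observed - expected) ** 2 / expected
    return chi2

# Hypothetical counts: 100 start per condition, unequal completion.
stat = dropout_chi_square({"A": 100, "B": 100}, {"A": 90, "B": 70})
# With df = 1, a statistic above 3.84 suggests condition-dependent
# drop-out at alpha = .05 -- exactly the "indicator" use above.
```

Refusal rates fall out of the same bookkeeping: finished over started, per condition and overall.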

  10. And a Few More Good Things • Lower costs • Openness • Replication

  11. Some Bad Things • Control Issues (Cheating and Fraud) • Multiple submissions • Faking data • Collaborating with others • Imitation of treatments • Control Issues (Experimental Control) • Do subjects understand what they are supposed to be doing? • Multi-tasking • Interruptions • Consulting other sources • EWI
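One of the control issues above, multiple submissions, is commonly screened with a coarse fingerprint. A sketch of one such heuristic (the fingerprinting scheme, function names, and record fields are assumptions, not from the talk; shared IPs cause false positives and a determined participant can evade it):

```python
import hashlib

def submission_key(ip, user_agent):
    """Coarse fingerprint for repeat-submission screening."""
    return hashlib.sha256(f"{ip}|{user_agent}".encode()).hexdigest()

def screen_duplicates(submissions):
    """Keep only the first submission per fingerprint.
    `submissions`: list of dicts with 'ip' and 'user_agent' keys
    (a hypothetical logging schema)."""
    seen, kept = set(), []
    for s in submissions:
        key = submission_key(s["ip"], s["user_agent"])
        if key not in seen:
            seen.add(key)
            kept.append(s)
    return kept
```

Because the screen is heuristic, how many records it removed belongs in the study report alongside refusal and attrition rates.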

  12. More Bad Things • So, time is not such a great measure anymore (but maybe it isn’t really a good measure anyway) • Self-selection Bias (Topical interests) • No control over recruitment • Attrition (too high is not good) • Technical variance • Communication challenges with subjects • Difficult to explain deception

  13. And a Few More … • Experiment “Marketplace” • Encourages researcher laziness and carelessness? • Requires more knowledge of experimental design and measurement • Measurement checks • Decisions to eliminate data • Bad designs waste people’s time

  14. More Thoughts and Questions • Some of the “bad” things add random error to the model (which exists even in lab experiments) • But you gain more participants, so if this error stays constant in proportion to your sample size, is it really an issue? • Random assignment to condition is CRITICAL • Ultimately, do you get what you pay for?
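Random assignment to condition can be kept balanced online, where recruitment may stop early, by randomizing within shuffled blocks. A minimal sketch (the function name is hypothetical; the technique is standard block randomization, not a method described in the talk):

```python
import random

def block_randomize(n_subjects, conditions, rng=None):
    """Assign subjects to conditions in shuffled blocks so group
    sizes stay near-balanced even if the study stops early."""
    rng = rng or random.Random()
    order = []
    while len(order) < n_subjects:
        block = list(conditions)  # one full block per pass
        rng.shuffle(block)        # random order within the block
        order.extend(block)
    return order[:n_subjects]

# e.g. block_randomize(12, ["A", "B", "C"]) yields 4 of each condition.
```

Group sizes can never differ by more than one block, while the order within each block stays random.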

  15. And Finally … • Many studies have found that results obtained from lab and web experiments are similar • A Web mode does not excuse poor experimental design or instrumentation • What are the implications with respect to reporting practices?

  16. For Your Reference • Birnbaum, M. H. (2000). Psychological Experiments on the Internet. London, UK: Academic Press. • Alonso, O., Rose, D. E., & Stewart, B. (2008). Crowdsourcing for relevance evaluation. SIGIR Forum, 42(2), 9-15. • http://psych.hanover.edu/research/exponnet.html
