
The Online Use of Randomized Response Measurements

The Online Use of Randomized Response Measurements. Chris Snijders Eindhoven University of Technology The Netherlands c.c.p.snijders@tue.nl Jeroen Weesie Utrecht University The Netherlands j.weesie@uu.nl. Questions in surveys.


Presentation Transcript


  1. The Online Use of Randomized Response Measurements
     Chris Snijders, Eindhoven University of Technology, The Netherlands, c.c.p.snijders@tue.nl
     Jeroen Weesie, Utrecht University, The Netherlands, j.weesie@uu.nl
     Online use of randomized response measurements – C.Snijders, J.Weesie. GOR, Hamburg, March 10-12, 2008

  2. Questions in surveys
     Surveys are one of the standard instruments of the social scientist. You ask about behavior, attitudes, characteristics, etc.
     Big problem: non-response (especially among firms), so you get selective responses (cf. Dutch elections).
     Many surveys are now conducted online, after an email invitation, via banners, etc.

  3. Internet surveys
     Seem to work relatively well, except for sensitive questions (which were problematic off-line as well).
     Social desirability bias: the tendency to report about oneself in a favourable manner or in accordance with local norms (Edwards, 1957).

  4. Getting rid of social desirability bias
     • Indirect questions
     • "Covariate technique" Marlowe-Crowne scale (MCSDS)
     • Lie-detector (!; this does not seem to work that well online)
     • Stress-reduction through question wording ("everybody does things they later regret ...")
     • Randomized response

  5. Sensitive questions in surveys
     For instance, questions about
     • criminal behavior
     • sexual preferences
     • monetary issues
     • ...
     Two major concerns
     • Survey drop-out
     • Useless answers (respondents do not admit to behavior that is likely to be considered inappropriate or weird)

  6. Throughout: ONLY BINARY RESPONSE VARIABLES (0/1). YES = ADMITTING TO THE SENSITIVE ISSUE.

  7. Possible solution: Randomized Response
     See, e.g., Warner, 1965; Kuk, 1990; Chaudhuri & Mukerjee, 1988; Fox, 1986.
     Basic idea (here you see the forced response method):
     Did you cheat on your tax-return last year?
     Respondent is instructed to roll two dice:
       if 2, 3 or 4 : reply YES
       if 11 or 12  : reply NO
       otherwise    : tell truth
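The dice rule on this slide fixes how often a respondent is forced to say YES, forced to say NO, or asked for the truth. A minimal sketch, assuming two fair six-sided dice whose sum is used (the function name is only illustrative):

```python
from fractions import Fraction
from itertools import product


def forced_response_probabilities(yes_sums=(2, 3, 4), no_sums=(11, 12)):
    """Return (P(forced YES), P(forced NO), P(truth)) for the sum of two fair dice."""
    # All 36 equally likely ordered rolls of two dice, reduced to their sums.
    sums = [a + b for a, b in product(range(1, 7), repeat=2)]
    total = len(sums)
    p_yes = Fraction(sum(s in yes_sums for s in sums), total)
    p_no = Fraction(sum(s in no_sums for s in sums), total)
    return p_yes, p_no, 1 - p_yes - p_no


p_yes, p_no, p_truth = forced_response_probabilities()
print(p_yes, p_no, p_truth)  # 1/6 1/12 3/4
```

So a true prevalence p shows up as an observed YES share of 1/6 + (3/4)·p, which is where the constants 1/6 and 3/4 used later in the deck come from.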

  8. Randomized Response
     • Allows estimation of group averages (if respondents follow the protocol).
     • Protocols other than using dice are possible, e.g., using a question such as:
         If your mother’s birthday is in Jan, Feb, Mar : YES
         If your mother’s birthday is in Nov, Dec      : NO
         Otherwise                                      : truth
     Did you cheat on your tax-return last year?
     Respondent is instructed to roll two dice:
       if 2, 3 or 4 : reply YES
       if 11 or 12  : reply NO
       otherwise    : tell truth
     Note: the resulting estimate might be negative!

  9. Other ways to do randomized response
     Roll the dice. If you roll 2 through 7, answer question number 1; otherwise answer question 2:
     1) I own an illegal copy of Microsoft Office. correct / not correct
     2) I do not own an illegal copy of Microsoft Office. correct / not correct
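This two-statement design is the one usually attributed to Warner (1965). A hedged sketch of how the prevalence could be recovered from the share of "correct" answers, assuming two fair dice (so question 1 is drawn with probability 21/36 = 7/12) and full compliance; the function name and the example share are illustrative, not taken from the study:

```python
from fractions import Fraction


def warner_estimate(share_correct, p_question_1=Fraction(7, 12)):
    """Estimate prevalence pi from the observed share of "correct" answers.

    Model (assumed): share_correct = p * pi + (1 - p) * (1 - pi),
    where p is the probability of being routed to question 1.
    """
    p = p_question_1
    if 2 * p == 1:
        raise ValueError("p = 1/2 makes the prevalence unidentifiable")
    return (share_correct - (1 - p)) / (2 * p - 1)


# Illustrative input: 45% of respondents answer "correct".
print(warner_estimate(Fraction(9, 20)))  # 1/5
```

With p = 7/12 the denominator 2p − 1 is only 1/6, so sampling noise in the observed share is multiplied by six in the estimate.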

  10. Or: How many of the following issues pertain to you?
      Version 1:
      - you own a laptop
      - you like country music
      - you own a motor-cycle
      - you play a musical instrument
      Version 2:
      - you own a laptop
      - you like country music
      - you own a motor-cycle
      - you play a musical instrument
      - you own an illegal copy of Microsoft Office
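With this design (often called the item count or list technique), the prevalence of the sensitive item can be estimated as the difference in mean counts between the two versions. A minimal sketch, assuming respondents are randomly assigned to a version; the counts below are made up for illustration:

```python
from statistics import mean


def item_count_estimate(counts_version_1, counts_version_2):
    """Difference in mean item counts: version 2 (with the sensitive item) minus version 1."""
    return mean(counts_version_2) - mean(counts_version_1)


# Hypothetical answers (number of items that apply), not data from the study.
version_1 = [1, 2, 1, 2]
version_2 = [2, 2, 1, 2, 3]
print(item_count_estimate(version_1, version_2))  # 0.5
```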

  11. Randomized Response
      • Main use: dichotomous variables (yes-no)
      • Two kinds of studies:
        • With an objective control
        • Without an objective control; here we assume that higher observed percentages are better measurements
      • RR improves results (in paper-and-pencil surveys; Edgell et al., 1982; Lensvelt-Mulders et al., 2005), but is still far from perfect

  12. Randomized Response
      But ... not often used, because:
      • Necessary sample size is larger (typically 750 or more [given prev=7%])
      • Wide-spread myth that analyses at the individual level are impossible
      And
      • Most of the evidence is based on off-line research
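One way to see why the necessary sample size grows is to compare the per-respondent variance of the forced-response estimate with that of a direct question. The sketch below assumes the two-dice protocol from the earlier slides (forced YES with probability 1/6, truth with probability 3/4) and fully honest answers in both modes. It does not try to reproduce the figure of 750, whose derivation is not given on the slide.

```python
def rr_variance_inflation(prevalence, p_forced_yes=1/6, p_truth=3/4):
    """Ratio of per-respondent variances: forced-response estimate vs. direct question."""
    observed = p_forced_yes + p_truth * prevalence          # expected share of YES answers
    var_rr = observed * (1 - observed) / p_truth ** 2       # variance after rescaling by 1/p_truth
    var_direct = prevalence * (1 - prevalence)              # plain binomial variance
    return var_rr / var_direct


print(round(rr_variance_inflation(0.07), 2))
```

At a prevalence of 7% the ratio is roughly 4.7, meaning that under these assumptions the randomized design needs several times as many respondents for the same precision.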

  13. Individual level (logistic) regressions
      ... are possible, but that is not common knowledge.
      1. Stata
      2. SPSS: www.randomizedresponse.nl, search for HLanalyse.pdf (in Dutch, unfortunately)

      Stata:
          * Likelihood evaluator (method lf) for a logit model under forced response
          capture program drop rrlogit_lf
          program rrlogit_lf
              args lnf lp
              tempvar p
              quietly {
                  * true probability of "yes" from the linear predictor
                  gen double `p' = exp(`lp')/(1 + exp(`lp'))
                  * observed probability: forced yes (1/6) plus truthful yes (3/4)
                  replace `p' = (1/6) + (3/4)*`p'
                  replace `lnf' = ($ML_y1==1)*log(`p') + ($ML_y1==0)*log(1-`p')
              }
          end
          ml clear
          ml model lf rrlogit_lf (y = x1 x2)
          ml max
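For readers without Stata, the same log-likelihood can be written in a few lines of Python. This is only a sketch of the likelihood itself, assuming the two-dice constants 1/6 and 3/4; the function name and the data layout (a list of (y, x) pairs with the constant included in x) are choices made here, and maximizing the function is left to whichever optimizer you prefer.

```python
import math


def rr_logit_loglik(beta, rows, p_forced_yes=1/6, p_truth=3/4):
    """Log-likelihood of a logit model observed through forced response.

    rows: iterable of (y, x) with y in {0, 1} and x a list of covariates
    (include a 1.0 for the constant). beta: coefficients, same length as x.
    """
    total = 0.0
    for y, x in rows:
        lp = sum(b * xi for b, xi in zip(beta, x))
        # Numerically stable logistic function.
        if lp >= 0:
            p_true = 1.0 / (1.0 + math.exp(-lp))
        else:
            e = math.exp(lp)
            p_true = e / (1.0 + e)
        # Observed YES probability: forced yes plus truthful yes.
        p_obs = p_forced_yes + p_truth * p_true
        total += math.log(p_obs) if y == 1 else math.log(1.0 - p_obs)
    return total


# Constant-only model with beta = 0: the true probability is 0.5.
print(rr_logit_loglik([0.0], [(1, [1.0]), (0, [1.0])]))
```

Because the observed probability always lies between 1/6 and 11/12, the logarithms are well defined for any value of the linear predictor.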

  14. Randomized Response online
      • Might work: replying online → you feel more anonymous → when combined with RR, you feel even more anonymous and hence do not mind answering sensitive questions
      • Might not work:
        • Implementation online is non-trivial
        • Since online already makes one feel anonymous, the loss in precision might not be compensated for
        • Respondents might “play it safe” and not follow protocol

  15. Design
      Population: Internet panel in the Netherlands - EuroClix
      Sensitive questions in four conditions (n=3,557):
      A: direct [control condition], n = 1,078 completed
      B: dice embedded in the survey, n = 910 completed
      C: “downloadable dice”, n = 679 completed
      D: optional rand. response (if yes → B), n = 890 completed
      Question ...

  16. Sensitive questions about three topics
      • Behavior in surveys
      • Traffic violations
      • Illegal copies of software / movies / music

  17. Behavior in surveys
      S1: At PanelClix I am registered under more than just one name.
      S2: I fill out the surveys without really reading what they ask me.
      S3: In the past two weeks, I filled out 4 or more surveys from PanelClix.
      S4: I sometimes fill out surveys under the id of another PanelClix member.
      S5: I sometimes let somebody else fill out surveys under my id.
      S6: I sometimes lie about personal characteristics in a PanelClix survey.
      S7: When I have to respond to large numbers of statements I sometimes just rush through the answers.
      S8: I am what you could call "a professional respondent".
      S9: Almost always I leave open questions blank.

  18. Traffic violations
      T1: Have you had a speeding ticket in the past 3 months?
      T2: Do you ever drive faster than 100 where only 50 is allowed?
      T3: Do you ever drive a car or motorcycle when you know you have had too much to drink?
      T4: Did you ignore a red traffic light in the past week (by car or motorcycle)?
      T5: On the highway I tend to drive closely behind the car in front of me, so that they will get out of my way ("bumperkleven").
      T6: Have you ever damaged the vehicle of somebody else without reporting it?
      T7: In the past two months, have you driven faster than 150 km/h with a car or motorcycle?
      T8: In the past two months, did you park in a place where you had to pay, but paid less than you had to, or nothing at all?

  19. Illegal music, movies and software
      IC1: I own copied music for which I have not paid although I should have.
      IC2: I own copied movies for which I have not paid although I should have.
      IC3: I own copied software for which I have not paid although I should have.
      IC4: I have passed on copied music, movies or software to others so that they do not have to pay for it, although I know they should have just bought it.
      IC5: I have an illegal copy of Microsoft Windows in my possession.
      IC6: Whenever possible, I try to get commercial software without having to pay for it.
      IC7: The largest part of my music collection is actually illegal.
      IC8: I have an illegal copy of Microsoft Office in my possession.

  20. Do you understand ...
      ... what you are supposed to do?
          No clue 4%, Not really 3%, I think I do 39%, Completely clear 55%
      ... what the purpose of the procedure is?

  21. Some data cleaning is necessary ...

  22. Completion rates per condition (given that respondent started survey; mean time for survey completion 15 minutes)
      A: direct 85.5%
      B: RR embedded 80.7%
      C: RR download 62.2%
      D: RR optional 78.5%
      So downloadable dice cost 15-20 percentage points of the completion rate (relative to the other RR conditions)

  23. Also: nine indirect questions
      Out of 100 people, how many ...
      (behavior in surveys) S6 S2 S4
      (traffic violations) T2 T3 T4
      (illegal copies) IC1 IC5 IC8
      And several covariates: - age - gender - computer literacy - ...
      [for those in the control condition] Indirect questions correlate with the direct question scores, but are not strong enough predictors to actually predict the behavioral data.

  24. Results: surveys
      NB: Estimate = 4/3 * (Obs - 1/6)
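The correction in the note can be applied per question as follows. A minimal sketch, assuming the two-dice forced-response protocol and full compliance; the standard error uses the ordinary binomial formula rescaled by 4/3, and the counts in the example are invented to show that the estimate can drop below zero when fewer than 1/6 of respondents say YES.

```python
import math


def forced_response_estimate(n_yes, n):
    """Return (estimate, standard error) for Estimate = 4/3 * (Obs - 1/6)."""
    obs = n_yes / n
    estimate = (obs - 1 / 6) / (3 / 4)
    std_error = math.sqrt(obs * (1 - obs) / n) / (3 / 4)
    return estimate, std_error


# Hypothetical counts: 10% observed YES is below the 1/6 forced-yes floor.
estimate, std_error = forced_response_estimate(n_yes=90, n=900)
print(estimate < 0, std_error > 0)  # True True
```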

  25. Results: traffic

  26. Results: illegal copies

  27. This does not seem to work that well...
      The group of people who are convinced by the Randomized Response method is not large enough.
      Or ... respondents are not following the protocol!

  28. Who are not following the protocol?
      7.0% of respondents have 2 “yes”-answers or fewer
      2.4% give no “yes”-answers at all
      Logistic regression on 2 “yes”-answers or fewer (sign of effect):
      Age                                          +
      Female                                       +
      Education                                    -
      Computer literacy1 (podcasts, RSS etc)       0
      Computer literacy2 (basic internet skills)   -
      Understand how                               0
      Understand why                               -
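Why very few "yes" answers are suspicious can be checked with a small binomial calculation. The sketch assumes that a respondent answers all 25 statements listed earlier (9 + 8 + 8) under the two-dice protocol with independent rolls, and takes the most favourable case in which every truthful answer is "no", so each question yields YES with probability 1/6. These are assumptions made for this illustration; the slide does not state how many questions each respondent answered under RR.

```python
from math import comb


def prob_at_most_k_yes(k, n_questions, p_yes=1/6):
    """P(at most k YES answers) for independent questions with YES probability p_yes."""
    return sum(
        comb(n_questions, j) * p_yes ** j * (1 - p_yes) ** (n_questions - j)
        for j in range(k + 1)
    )


print(prob_at_most_k_yes(0, 25))  # probability of no YES at all under full compliance
print(prob_at_most_k_yes(2, 25))  # probability of two or fewer YES answers
```

Under these assumptions the chance of giving no "yes" at all is about 1%, and it is lower still for anyone with at least one truthful "yes".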

  29. Randomized response: non-compliance
      Three types of respondents:
      1) Honest yes =
      2) Honest no =
      3) Cheater: has yes, but says no =
      And: they add up to 1, and we are interested in
      Assumption: if ordered to say no, all do so
      Then we have:
      Direct questions :
      Indirect questions :
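The formulas on this slide did not survive in the transcript, so the following is not the authors' notation. It is one hedged reading of the three-type setup for the two-dice forced-response protocol: honest-yes respondents say YES unless forced to say NO (probability 1/12), honest-no respondents say YES only when forced to (probability 1/6), and cheaters are assumed to always say NO. The parameter names are hypothetical.

```python
def expected_yes_share(honest_yes, honest_no, p_forced_yes=1/6, p_forced_no=1/12):
    """Expected share of YES answers under forced response with cheaters.

    honest_yes, honest_no: population shares of the two honest types;
    the remainder (1 - honest_yes - honest_no) are cheaters who always say NO.
    """
    cheaters = 1 - honest_yes - honest_no
    if honest_yes < 0 or honest_no < 0 or cheaters < -1e-12:
        raise ValueError("shares must be non-negative and sum to at most 1")
    # Cheaters contribute nothing to the YES share (assumption stated above).
    return honest_yes * (1 - p_forced_no) + honest_no * p_forced_yes


print(expected_yes_share(honest_yes=0.2, honest_no=0.7))
```

A single observed YES share gives one equation in two unknown shares, so extra information (for example a second condition with different probabilities) is needed before the shares can be estimated.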

  30. Note: the idea itself is old (and not mine) The idea for this is in Clarke 1998. Downloadable from http://chrissnijders.com/tempback/Clarke1998.pdf

  31. Taking non-compliance into account

  32. Conclusions – Rand. Response online
      • Not much support for Randomized Response (with the Forced Response method) for these particular topics if non-compliance is not taken into account. For illegal software we find small positive effects. Larger effects if non-compliance is taken into account.
      • Some indication that RR works better as the sensitivity of the topic increases

  33. Conclusions (2)
      • Quick and dirty result: compliance with the protocol is more common among younger, male, highly educated, computer-literate respondents (who understand what RR is for).
      • Allowing for optional Randomized Response does not seem to work very well; perhaps some support with the illegal copying topic
      • Downloadable dice – not a good idea?

  34. So there is good and bad news ...
      Internet → high openness, which makes RR less necessary.
      For really sensitive behavior, RR can be conducted online and analyzed relatively easily ...
      ... but compliance with the protocol is a major issue and has to be explicitly modeled → different kinds of non-compliance → analyses less straightforward

  35. Possible assignments
      First review the literature on Randomized Response measurement (given online). Then either:
      1) For the fanatics: Run a mini-survey on some topic that is sensitive but interests you (help will be provided). Try to come up with different ways to measure the topic of interest.
      2) Design an experiment to further test the use of randomized response measurement. For instance, compare different methods.
      3) Give a brief overview of randomized response measurement, and come up with a large set of questions that can be used as randomizers (such as "in which month was your mother born?")

  36. Results: traffic
