Explore the history, challenges, and future prospects of RDD sampling for telephone surveys. Discover how the method evolved, its advantages, and ongoing developments in survey research.
RDD Sampling for Telephone Surveys • Nick Moon, GfK NOP Social Research • GfK. Growth from Knowledge
Agenda • 1 Early History of RDD • 2 First successful method • 3 Becoming mainstream • 4 A new challenge • 5 Future challenges
The Theoretical Case • Telephone generally cheaper than face to face • Unclustered no dearer than clustered • Possible advantages for sensitive questions • More questions per minute • Little evidence of mode effects • Shortage of face to face fieldwork
Telephone Interviewing • Took off from early 1980s • Rapidly increased proportion of total research volume • But very rarely used for social research • Initially concerns about penetration • penetration >95% • Concerns also about sampling • Need for probability sampling
Probability Sampling • Needs a sample frame with 100% coverage • Only frame for telephone is the telephone directories • Considerably <100% coverage • High and growing ex-directory rate • UK 36% overall, London 50%+ (1999)
Ex-Directory Subscribers • Urban • Younger • Female • Lowest social class groups • Higher income • Smaller households
A theoretical alternative • While addresses or name lists are infinite, the number of telephone numbers is finite • It should therefore be possible to generate numbers entirely at random from the known list of possible numbers
But not so good in practice • A ten digit telephone numbering system allows 1 billion numbers • only 25 million households in the UK • many numbers are business only and not residential • vast majority of possible numbers are not in use • The cost of finding live numbers can be prohibitive • Foreman and Collins (1991) • Lack of information about the numbering system
The first compromise • RDD is not cost-effective • Directory sampling is biased • Directory plus 1 in theory compensates for this bias • Later developed into directory plus n
Problems of Directory plus n • Unlisted numbers tend to be clustered • If 10 consecutive numbers are unlisted then directory + any number from the first one will yield no listed number • the chance of any number being selected depends on the proportion of the previous nine numbers that are in the directory
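The dependence on the previous nine numbers can be made concrete with a minimal sketch. It assumes one particular version of the rule: a listed number is drawn uniformly and a random offset from 1 to 9 is added. The function name and the toy block of 20 numbers are illustrative and not from the talk.

```python
from fractions import Fraction


def plus_n_selection_prob(target, listed, max_offset=9):
    """Chance that `target` is dialled when a listed number is drawn
    uniformly and a random offset 1..max_offset is added to it."""
    listed = set(listed)
    # Count the listed numbers among the max_offset numbers just below target.
    feeders = sum(1 for k in range(1, max_offset + 1) if target - k in listed)
    return Fraction(feeders, len(listed) * max_offset)


# Toy block: numbers 0-9 are all listed, numbers 10-19 are all ex-directory.
listed = range(0, 10)
probs = {t: plus_n_selection_prob(t, listed) for t in (10, 15, 18, 19)}
```

Under these assumptions number 10 has nine listed numbers feeding it, number 18 has only one, and number 19, the last of a run of ten unlisted numbers, has none and can never be selected.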
Number Propagation • Developed by BMRB in 1991 • Collect telephone numbers from all respondents on their Omnibus • This includes ex-directory numbers (though some will still refuse) • Generate numbers from n-3 through to n+3 • Has advantages over directories, but still effectively plus n rather than RDD • BMRB now use RDD
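The n-3 to n+3 step can be sketched as below. This is an illustration only: the function name, the choice to leave out the seed number itself, and the example number are assumptions, not BMRB's actual procedure.

```python
def propagate(seed, spread=3, include_seed=False):
    """Return the neighbours of `seed` from seed-spread to seed+spread.

    Numbers are handled as strings so that a leading 0 survives.
    """
    width = len(seed)
    base = int(seed)
    neighbours = []
    for offset in range(-spread, spread + 1):
        if offset == 0 and not include_seed:
            continue  # assumption: the seed household is not re-sampled
        candidate = base + offset
        text = str(candidate).zfill(width)
        if candidate < 0 or len(text) != width:
            continue  # fell off either end of the number range
        neighbours.append(text)
    return neighbours


# Made-up seed number, used purely for illustration.
numbers = propagate("01727850000")
```

Each seed number collected from a respondent yields up to six neighbouring numbers, which may or may not be listed in the directory.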
Telephone Interviewing • The 01727 exchange allows 1 million numbers, for a town of 50,000 people • But all start with either 8 or 7 • This immediately reduces total possible to 200,000 • More information now available from Ofcom over and above exchange listings • Can identify blocks of 100,000 numbers that you know exist • Working blocks still contain huge numbers of non-allocated numbers that must be winnowed out
2 The First Success
The US example • Random Digit Dialling (RDD) used widely in the US • Initially Mitofsky-Waksberg • Subsequently list assisted
Mitofsky Waksberg 1 • All US numbers are in format 123-456-7890 • All possible combinations 123-456 known • Draw random sample of area code + local exchange + random number from 00 to 99 as primary sampling units • The “100 Block” • Generate at random a number from 00 to 99 to produce one full number per psu • Telephone each of these numbers
Mitofsky Waksberg 2 • If that number rings, assume the block is in use, and keep in sample • If that number doesn’t ring, assume the block is not in use, and reject from sample • Because the chance of one selected number ringing is higher the more working numbers there are, then this is effectively pps sampling
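The pps claim can be written out in one line. The symbol M_i, the unknown number of working numbers in 100 Block i, is introduced here for illustration and does not appear in the talk.

```latex
P(\text{block } i \text{ retained} \mid \text{block } i \text{ drawn}) = \frac{M_i}{100}
```

Retention is therefore proportional to the block's size in working numbers, even though M_i is never known in advance.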
Mitofsky Waksberg 3 • Since psus are selected pps, there should be a constant number of sample members per psu • Generate more random numbers from 00 to 99 to produce more numbers within the same 100 Block • Ring these numbers until the prescribed number of WORKING numbers has been reached • These are then treated in the same way as any other random sample, with call-backs, no replacement etc.
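The three Mitofsky Waksberg slides can be sketched as a short two-stage routine. This is a sketch under stated assumptions rather than a definitive implementation: `is_working` is a hypothetical stand-in for actually dialling a number, the first working number is assumed to count towards the per-block quota, and the frame of three blocks is invented.

```python
import random


def mitofsky_waksberg(prefixes, is_working, n_psus, k_per_psu, rng):
    """Two-stage Mitofsky-Waksberg sketch.

    prefixes   -- 8-digit stems (area code + exchange + 2 digits), one per 100 Block
    is_working -- callable(number) -> bool; stands in for dialling the number
    n_psus     -- number of 100 Blocks to retain
    k_per_psu  -- working numbers wanted per retained block
                  (assumption: the first working number counts towards this)
    """
    order = list(prefixes)
    rng.shuffle(order)  # random sample of blocks, without replacement
    sample = []
    retained = 0
    for prefix in order:
        if retained == n_psus:
            break
        suffixes = list(range(100))
        rng.shuffle(suffixes)  # random order of the 100 numbers in the block
        first = f"{prefix}{suffixes[0]:02d}"
        if not is_working(first):
            continue  # stage 1: first number dead, so reject the whole block
        retained += 1
        found = [first]
        for s in suffixes[1:]:  # stage 2: keep dialling within the block
            if len(found) == k_per_psu:
                break
            number = f"{prefix}{s:02d}"
            if is_working(number):
                found.append(number)
        sample.extend(found)
    return sample


# Hypothetical frame: one full block, one empty block, one half-full block.
working = {f"12345600{s:02d}" for s in range(100)}
working |= {f"12345602{s:02d}" for s in range(50)}
sample = mitofsky_waksberg(
    ["12345600", "12345601", "12345602"], working.__contains__,
    n_psus=2, k_per_psu=5, rng=random.Random(1),
)
```

The empty block can never be retained, the full block always is, and the half-full block is retained about half the time, which is the pps behaviour described above. A block with fewer than `k_per_psu` working numbers simply yields what it has once all 100 numbers are exhausted.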
A quick quiz Paper title: “Forty eight red, white and blue shoestrings” Why?
Forty eight red, white and blue shoestrings • 'Mac the Finger said to Louis the King, I got forty eight red, white and blue shoe strings, And a thousand telephones that will not ring, Tell me where I can get rid of these things' • Bob Dylan, Highway 61 Revisited
The Moon–Noble experiment • Translation of Mitofsky Waksberg to UK hampered by lack of transparency about UK numbering system • BT cited concerns about privacy • Also irregular number length (9 digit, 10 digit) • Empirical mapping of number system using large scale tele-marketing databases • 15 million numbers, so can map nearly the whole system • Identifies nearly all 100 banks currently in use • Random sample of 100 banks for Mitofsky Waksberg approach • Yielded 66% working lines at second stage
Moon Noble Experiment conclusion • Clearly a success • Showed true probability telephone sampling could be done cost-effectively in the UK • Won Best Technical Paper award at 1998 MRS Conference • Sat back and waited for glory
3 Becoming Mainstream
A Very Short-lived Glory • Growth of competition in the telephone supply market • Increased role for Oftel (now Ofcom) • Far greater transparency about the number system • And standardisation of length • Now possible for US techniques to be applied to the UK • Commercial suppliers now make RDD samples available to all • Specialist agencies now sell RDD samples • Epsem • List-assisted • Pre-dialled with auto-dialler to weed out non-working numbers
Two Different Approaches • True epsem • Does not take number of directoried lines into account • List Assisted • Glorified version of directory plus n • Directory minus n digits with last n digits randomised • List assisted will always be biased against blocks with large numbers of working numbers of which very few are directoried • This may or may not matter • Epsem will always produce far more non-working numbers • Use of pre-dialling or “pinging” to remove non-working lines
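The list-assisted rule, directory minus n digits with the last n digits randomised, can be sketched as follows. The function names and the three directory entries are made up for illustration, and commercial suppliers may well work differently.

```python
import random
from collections import Counter
from fractions import Fraction


def list_assisted_number(directory, n_digits, rng):
    """Draw a listed number, drop its last n_digits, and randomise them."""
    seed = rng.choice(directory)
    stem = seed[:-n_digits]
    tail = rng.randrange(10 ** n_digits)
    return f"{stem}{tail:0{n_digits}d}"


def block_probabilities(directory, n_digits):
    """Chance that each block (stem) is hit: proportional to its listed count."""
    counts = Counter(num[:-n_digits] for num in directory)
    return {stem: Fraction(c, len(directory)) for stem, c in counts.items()}


# Hypothetical directory: two listed numbers in one block, one in another.
directory = ["01727850012", "01727850077", "01727861234"]
number = list_assisted_number(directory, n_digits=2, rng=random.Random(0))
shares = block_probabilities(directory, n_digits=2)
```

Because a block's chance of selection follows its count of listed numbers rather than its count of working numbers, a block full of ex-directory lines is under-represented, and one with no listed numbers at all is never reached.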
Issues of Geography • Less than perfect match between telephone numbers and geography • Fine at exchange code level • Progressively less good at more detailed levels • No official record of geography of numbers • Postcodes no longer printed in telephone directories • Commercial suppliers will sell samples based on various geographical levels • Assign numbers to postcodes based on available databases • Reasonable at constituency/local authority level • Poor at ward level • Need to oversample for ineligibles and allow for screening • People don’t always know where they live
Implications of Geographical Issues • No real problem for national surveys • Doesn’t really matter if isolated case is in the “wrong” psu • Considerable implications for local studies • Lose some of the cost advantages of telephone interviewing • Especially problematic for clustered samples • Such as locating areas of high density of BME populations
Evidence from the Staffordshire SE experiment • Allocation of numbers not geographically rational • The vast majority of numbers were in the format EEEEE-Lxxxxx • Where EEEEE was the exchange and all started with the same sub-exchange number L • A small number were in the format EEEEE-Nxxxxx • Presumably the L sub-area was filling up and a new sub-area N came into use • One might expect these new numbers to run sequentially from EEEEE-N00000, EEEEE-N00001, EEEEE-N00002 etc. • In fact they were scattered over a huge range of numbers
Implications of this number scatter • The core of any RDD sampling approach is blocks of 100 or 1000 numbers • We have already examined the implications of different densities of directoried numbers per block • Relative density of working numbers per block is also important • Scatter of the Staffordshire kind will lead to a density distribution with a hugely long tail • In theory pps sampling should take care of this • A small proportion of very low-density blocks will still get selected • But most RDD systems rely on finding a fixed, and non-trivial, number of working numbers per block • Nicolaas (2001) suggests very low density blocks might be safely ignored
Sampling Individuals • A telephone is usually a household item • But we want a sample of individuals • Quota approach can take whoever answers the phone • Old social rules on who answers the phone have disappeared • New ones have appeared in their place • Especially in households with teenagers • Random selection is ideal
Random Selection of Individuals • Kish Grid is the gold standard • Enumerate household in set order • Take nth person listed according to a randomised procedure • Requires enumeration at very beginning of interview • Felt to cause potentially higher refusal rates • NatCen experiment suggests this may not be so • Next/last birthday method a simpler compromise • NatCen experiment shows similar profile to Kish grid • The Rizzo/Brick/Park variant even easier • How many adults in household? • If one, continue interview • If two, randomly select either the person who answered the phone or the other one • If three or more, go through next birthday method
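The Rizzo/Brick/Park variant, as described on this slide, reduces to a three-way branch. The sketch below follows the slide's wording only; the function name and return labels are invented for illustration.

```python
import random


def select_respondent(n_adults, rng):
    """Decide who to interview, following the slide's three cases."""
    if n_adults < 1:
        raise ValueError("a household needs at least one adult")
    if n_adults == 1:
        return "informant"  # only one adult: carry on with whoever answered
    if n_adults == 2:
        # Coin flip between the person on the phone and the other adult.
        return "informant" if rng.random() < 0.5 else "other adult"
    return "next birthday"  # three or more: fall back to the birthday method


rng = random.Random(2024)
choices = [select_respondent(n, rng) for n in (1, 2, 4)]
```

The appeal of this design is that only households with three or more adults need any further questions before the interview proper can start.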
Weighting the data • People in larger households have less of a chance of selection than those in smaller ones • Same principle as face to face surveys using PAF • Effective sample size only c. 80% of actual one • Households with >1 line have >1 chance of selection • Need question on number of lines • But if the line is used only for a modem/fax then it shouldn’t count • Need question on lines that ring • All adds to the cost
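The two corrections above combine into a single design weight, and the loss of precision can be estimated with Kish's approximation for effective sample size. The sketch below is illustrative: the five households are invented, and the weight formula (adults divided by lines that ring) is one straightforward reading of the slide.

```python
def design_weight(n_adults, n_ringing_lines):
    """Inverse of the selection chance: one adult is picked from n_adults,
    and the household can be reached on each line that rings."""
    if n_adults < 1 or n_ringing_lines < 1:
        raise ValueError("need at least one adult and one ringing line")
    return n_adults / n_ringing_lines


def effective_sample_size(weights):
    """Kish's approximation: (sum of weights)^2 / sum of squared weights."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


# Hypothetical respondents as (adults in household, lines that ring).
households = [(1, 1), (2, 1), (2, 1), (3, 1), (2, 2)]
weights = [design_weight(a, l) for a, l in households]
n_eff = effective_sample_size(weights)
```

For these five made-up households the weights are 1, 2, 2, 3 and 1, giving an effective sample size of 81/19, or about 4.26 interviews out of 5.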
A Cautionary Note • There’s more to random sampling than just sampling • Many agencies buy RDD samples but treat them as leads for a quota sample • If you get a response rate of 15% does it matter if the sample is random?
4 A New Challenge
2003 Communications Act • Gives Ofcom the power to deal with silent calls • Primarily directed at tele-marketers • Unexpected impact on RDD
Silent calls • Caused by use of auto-diallers • Machine dials number • When it is answered machine hands it over to any available interviewer/salesman • If all are busy respondent gets “silent call” • Not cut off but no-one there
Impact of Silent Calls • Silent call rate can be set to any level on dialler • Above that rate the system stops making calls • The higher the rate, the less interviewer/salesman dead time • But more pissed-off recipients of silent calls • Lots of tele-marketers don’t care, MR firms generally do • GfK NOP silent call rate always 1% or under
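The cap on the silent call rate can be sketched as a simple counter. The talk does not say how diallers measure the rate, so this assumes it is silent calls as a share of answered calls; the class and method names are hypothetical.

```python
class SilentCallMonitor:
    """Tracks answered calls and pauses dialling above a set silent rate."""

    def __init__(self, max_rate):
        self.max_rate = max_rate
        self.answered = 0
        self.silent = 0

    def record(self, agent_available):
        """Log one answered call; it is silent if no interviewer was free."""
        self.answered += 1
        if not agent_available:
            self.silent += 1

    @property
    def rate(self):
        # Assumption: rate = silent calls / answered calls.
        return self.silent / self.answered if self.answered else 0.0

    def may_dial(self):
        """Dialling continues only while the rate is at or below the cap."""
        return self.rate <= self.max_rate


monitor = SilentCallMonitor(max_rate=0.01)
for _ in range(99):
    monitor.record(agent_available=True)
monitor.record(agent_available=False)  # one silent call in 100 answered
```

With a 1% cap, one silent call in 100 answered calls still allows dialling, while a second silent call straight afterwards would push the rate over the cap and pause the dialler.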
Consequences for RDD • The only supplier of epsem samples temporarily ceased to do so • They were concerned pinging might be against the new law • Without pinging epsem is not cost-effective • Now supplying pinged sample again - BUT • The main clients for true random samples are government bodies • Concern about seeming to be against spirit of legislation
5 Future Challenges
Mobile Phones • Only 1% of households have no phones • But 8% have only mobiles and no landline • These are being missed from RDD surveys • Does it matter?
Problems of Mobile Phones • Mobiles are individual • Landlines are household • Mobiles may be provided by employer • Mobiles may be answered while driving • NB in the US many cell phone users pay to receive calls
Mobile-only households • Younger • 38% of those with a mobile but no landline are 18-24 • Down-market • 58% of those with a mobile but no landline are social class DE • More likely to be in full time education • More likely to be unemployed
Dealing with Mobiles • Mobile-only households will reach a level where they can’t be ignored • Mobile numbers identifiable • Could sample in the same way as landlines for RDD • Multiple chances of selection • Need to ask for number of landlines and number of mobiles • No geographical data
The Future’s Orange/Virgin/O2/Vodafone • Further Changes to the Numbering System • Particularly influenced by competition among suppliers • Ofcom keen to reduce barriers to competition • Number Portability • Must be same broad area BUT • VoIP • Mobile/landline hybrids • Wireless Office
Implications of future changes • The link between telephone number and geography may break down completely • National samples can still be drawn by simple random sampling, but with no benefits of stratification • Local sampling by telephone may become impossible • Cost may force local surveys onto postal or online methodologies
The Paradox • The demand for high quality, high volume social research is beginning to exceed the supply of suitable face to face fieldwork • This is diverting to telephone social research that previously would only ever have been conducted face to face • This research demands high quality epsem samples, at the very time that it is becoming more difficult to achieve them • What’s wrong with paper self-completion anyway?