Explore Statistics Canada's research on leveraging paradata for optimal data collection. Discover the challenges, sources, database, and research objectives, with a focus on improving survey operations. Learn about past studies and future plans.
 
                
Paradata Collection Research for Social Surveys at Statistics Canada • François Laflamme • International Total Survey Error Workshop (ITSEW), Quebec, June 2011 • Statistics Canada • Statistique Canada
Outline • Data collection organization • Data collection challenges • Paradata • Sources • Database • Paradata Research • Objectives • Scope • Past research • Current and future plans
Data Collection Organization • 3 regions and 8 locations across Canada • Head Office (HO) collects quarterly and annual business survey data • All other business surveys moved to the regions • Interviewers (~1,500) collect the following data • Face-to-face: CAPI/PAPI (~100,000 attempts per month) • Concurrent surveys • CATI call centres (5) (~900,000 calls per month) • Household, agriculture and business surveys • Concurrent surveys • Unionization and operational constraints
Data Collection Challenges • Handling the growing sophistication and volume of data requirements • Maintaining acceptable response rates • Ensuring the highest quality of data collected • Optimizing capacity • Balancing work within and between Regional Offices • Retention of workforce • Reducing / maintaining collection costs • Developing and deploying surveys consistently, cost-effectively and in a timely manner • Keeping abreast of evolving survey collection methodologies and technologies (e.g. multi-mode surveys) • Taking into account operational constraints
Paradata Sources • Paradata is information about the data collection process • Paradata sources • Call and contact information • Audit trail (interview keystrokes) • Interviewer administrative and payroll information • Interviewer notes and observations - not used extensively • Can be enhanced with • Sample design and sample unit information • Capacity and planning assumptions • Budget and target figures • Paradata from previous cycle or supplement surveys
Paradata Database • Paradata Database includes: • Call/attempt information for both • Computer-Assisted Telephone Interview (CATI) surveys • Computer-Assisted Personal Interview (CAPI) surveys • Interviewer payroll information • Processed and standardized information • Raw files always available • Historical information since 2003 • Updated on a daily basis • Prior to 2006, used for reporting purposes - not for research • Audit trail kept separately
Paradata Research • Paradata can be used for: • Operational research (including survey management) - essentially before and during data collection • Methodological research - historically, the focus is after data collection (e.g. non-response and measurement errors) • Often a ‘grey’ zone between the two types of research • Need to make the link between operational and methodological research
Paradata Research Objectives • Better understand the data collection process • Identify potential operational efficiencies • Evaluate new data collection initiatives • Provide timely feedback and information • Data collection survey management (Active Management) • Maintain and improve data quality • Improve the way surveys are conducted and managed
Paradata Research Scope • Initial focus on • CATI social surveys • RDD, cross-sectional, longitudinal surveys • Call and contact information • Extended to • CATI agriculture surveys • CAPI surveys • Payroll information • Audit trail • And more recently to • Business surveys
Past Research • Initial analysis • Effort spent: calls and system time • Reaching respondents: contact rate, sequence of calls, best time to call, contact versus interview, etc. • Active management • Customized reports • Dashboard of key survey performance indicators • Impact of cap on calls • On response rates, survey estimates and costs • Production and cost analysis • Relationship between production and cost • Productivity indicators and survey cost analysis
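The "contact rate" and "best time to call" measures listed above can be illustrated with a short Python sketch. It assumes each call record has been reduced to a time slot and a flag saying whether anyone was reached; the slot labels, the record layout and the sample records are hypothetical and are not taken from the Statistics Canada paradata database.

```python
from collections import Counter

def contact_rate_by_slot(calls):
    """Return {slot: contacts / attempts} for (slot, contacted) call records."""
    attempts = Counter()
    contacts = Counter()
    for slot, contacted in calls:
        attempts[slot] += 1
        if contacted:
            contacts[slot] += 1
    # Every slot in `attempts` has at least one call, so no division by zero.
    return {slot: contacts[slot] / attempts[slot] for slot in attempts}

# Hypothetical call records: (time slot, was a contact made?)
calls = [
    ("evening", True), ("evening", True), ("evening", False), ("evening", True),
    ("morning", False), ("morning", True), ("morning", False), ("morning", False),
]

rates = contact_rate_by_slot(calls)
best_slot = max(rates, key=rates.get)  # slot with the highest contact rate
print(rates, best_slot)
```

In practice the rate would probably also be broken down by call sequence number and by survey, since the slide lists "sequence of calls" alongside "best time to call".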
Past Research (2 of 3) • Pace of interview (PoINT) • CAPI surveys - initial investigations • Basic analysis: attempts, time spent, contact rates • Paradata quality and consistency • Productivity and cost relationship • Interaction between CAPI surveys • Responsive Collection Design for CATI surveys • Active management • Identify a series of new indicators to assess data collection quality and performance (e.g. representativity, productivity and cost, a measure of the responding potential of in-progress cases) • Implementation - two pilot surveys • Analysis
Past Research (3 of 3) • Many ad hoc research projects • Interviewer productivity by level of experience • Interaction between concurrent surveys • System time versus non-system time, etc. • Research increased knowledge about the data collection process and practices • Demonstrate potential benefits - based on facts (empirical data) • Investigate, test and implement new collection strategies and tools - think outside the box • Strike a balance between theory and practice • Focus on operationally viable projects • Communicate and share information • Documentation, papers, presentations, seminars, etc.
Distribution of Calls and Time by Collection Phase • More calls and system time spent after a first contact for both respondents and non-respondents
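The split of calls and system time around the first contact can be sketched in Python as follows. This is only an illustration under stated assumptions: the record layout (case identifier, call sequence number, system time in seconds, contact flag) is hypothetical, and counting the first-contact call itself in the "up to first contact" phase is a choice made here, not a rule stated in the presentation.

```python
from collections import defaultdict

def split_by_phase(calls):
    """Tally [number of calls, system time] up to and after each case's first contact.

    calls: iterable of (case_id, seq, system_time, contacted) tuples.
    """
    by_case = defaultdict(list)
    for case_id, seq, system_time, contacted in calls:
        by_case[case_id].append((seq, system_time, contacted))

    totals = {"up_to_first_contact": [0, 0.0], "after_first_contact": [0, 0.0]}
    for records in by_case.values():
        phase = "up_to_first_contact"
        for _seq, system_time, contacted in sorted(records):  # order by call sequence
            totals[phase][0] += 1
            totals[phase][1] += system_time
            if contacted:
                # Assumption: the first-contact call closes the first phase.
                phase = "after_first_contact"
    return totals

# Hypothetical call records for two cases.
calls = [
    ("A", 1, 20.0, False),
    ("A", 2, 60.0, True),    # first contact for case A
    ("A", 3, 25.0, False),
    ("A", 4, 700.0, True),
    ("B", 1, 15.0, False),
    ("B", 2, 15.0, False),   # case B is never contacted
]

totals = split_by_phase(calls)
print(totals)
```

Running the same tally separately for respondents and non-respondents would give the comparison the slide describes.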
Relationship between Production and Cost Throughout Survey Cycle • Strong relationship • Most distributions have the same shape • System time is a good predictor for payroll hours • Ratios of cost to production can be used to derive productivity indicators
Survey Productivity Indicators Based on Time • Completed Interview System Time / Total System Time Ratios • Productivity ratios decrease during collection period for CATI • Longitudinal CATI survey (SLID) shows larger decreases • Productivity for CAPI survey is higher and more stable • This ratio is affected by interview length and response rate
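The time-based ratio above can be computed directly from call records. The following Python sketch assumes that the numerator is the system time logged on calls that ended in a completed interview; the `Call` class, its field names, the `"complete"` outcome label and the sample figures are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Call:
    case_id: str
    system_time: float  # seconds spent in the collection system on this call
    outcome: str        # hypothetical outcome label, e.g. "complete"

def productivity_ratio(calls):
    """Completed-interview system time divided by total system time."""
    total = sum(c.system_time for c in calls)
    if total == 0:
        return 0.0  # no system time logged yet
    completed = sum(c.system_time for c in calls if c.outcome == "complete")
    return completed / total

# Hypothetical call records for one collection day.
calls = [
    Call("A", 30.0, "no_answer"),
    Call("A", 600.0, "complete"),
    Call("B", 45.0, "refusal"),
    Call("C", 525.0, "complete"),
]

ratio = productivity_ratio(calls)
print(f"{ratio:.4f}")
```

Computed per collection day or week, this ratio would be expected to fall over a CATI collection period as the remaining cases need more unproductive attempts, which matches the decrease noted on the slide.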
Current and Future Research Plans • Focus on “Strategies to improve the way data collection is conducted and managed” • Hence the research needs to • Be sound and operationally viable • Lead to more cost-effective collection and sample design strategies • Lead to data quality improvements
Current and Future Research Plans (2 of 3) • Responsive Collection Design (RCD) - ongoing • Full RCD for SLID 2011 (including embedded experiment for 1st call) • Improve current RCD strategy (e.g. propensity models, phase-in of RCD, new conditions for decision making, cost-efficiency objective) • RCD for CAPI surveys • Documentation • CATI cost-efficient framework (5 dimensions) • Metrics used for costing and budgeting • Optimal resource allocation within and between surveys (2) • Collection process and practices • Operational constraints • Investigate approaches and assumptions to plan data collection for multi-mode surveys
Current and Future Research Plans (3 of 3) • Paradata course • Describe the paradata (e.g. type, contents, quality, etc.) • Applications of paradata to plan, manage, monitor, assess and improve the survey process • Share experiences • Long and short versions • Other paradata research projects • Sample coordination for CAPI surveys • Consolidate and extend the use of audit trail • RCD - theoretical framework • Simulation and optimization projects • Ad hoc research
Potential Issues for Discussion • Are there important gaps in paradata research? If so • Which type of research needs to be done? • What are the research priorities? • Any specific research with respect to TSE? • Sharing information (communication) • Paradata working group, conferences/events (paradata sessions in many international events), international network… • Is it enough/too much? Is it efficient? • Potential collaboration between organizations - can it be improved? • What is the most efficient organizational structure for this type of research?
For more information, please contact François Laflamme: francois.laflamme@statcan.gc.ca