Using Paradata to Monitor and Improve the Collection Process in Annual Business Surveys
Sylvie DeBlois, Statistics Canada

Rose-Carline Evra, Statistics Canada

ICES-III, Montreal, June 19th, 2007



Outline

  • Introduction

  • Score Function

  • Paradata

  • Score Function Recent Update

  • Future Developments


Introduction

  • The Unified Enterprise Survey (UES) is an annual economic survey on financial and characteristic variables, which has been conducted by Statistics Canada since 1998. It combines many surveys.

  • Average collection period: February to early October

  • Collection Processing System: Blaise

  • More than 48,000 questionnaires each year.

UES Questionnaire

  • UES includes Services, Trades, Manufactures, Agriculture (aquaculture) and Transportation (couriers and taxi & limousine) surveys.

  • A questionnaire has about 7 to 10 sections (the number of sections varies depending on the survey):

    • Introduction (Stats Act - Confidentiality, Respondent info)

    • Revenue

    • Expenses

    • Events that may have affected business units

    • Comments


  • Collection Process:

    • Mail-out of questionnaires

    • Follow-up in case of non-response for some units / Mail-back of questionnaires

    • Verification of received questionnaires / Edits

    • Coding of questionnaires

    • Imaging & Data Capture

  • Sometimes during the collection period, follow-ups are required due to non-response. The score function is used to determine the priority of an enterprise in follow-up.


  • Collection follow-up tool: Score function (SF)

    • Annual Survey of Manufactures (ASM) score function

    • Non-ASM score function

  • Both score functions have their own ways of calculating scores, defining cells and priorities.

  • This presentation will focus mainly on the Non-ASM score function.

Score Function

  • Reduces collection costs yet retains data quality.

  • Similar to the collection goal of obtaining a high weighted coverage response rate.

  • PRIORITY 1: Extensive follow-up for the larger-revenue Collection Entities (CEs) in cases of non-response.

  • PRIORITY 0: Minimum follow-up for the smaller CEs in cases of non-response.

Useful definitions


Sampling Unit: part of the enterprise within the cell

NAICS: North American Industry Classification System (5-digit)

Method: Initial Scores

  • Within each cell, calculate the score for each UES sampling unit (SU).

  • Score = the sample-weighted revenue of the SU, expressed as a percentage of the cell’s total weighted revenue.

    • Sample weight: UES sampling weight

    • Revenue: Sampling Revenue
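The score definition above can be sketched in a few lines of Python. This is a minimal illustration, not the UES production code: the field names (`id`, `cell`, `weight`, `revenue`) and the sample values are hypothetical.

```python
from collections import defaultdict

def initial_scores(units):
    """Return {unit_id: score}, where score is the unit's weighted revenue
    as a percentage of its cell's total weighted revenue."""
    # Total weighted revenue per cell
    cell_totals = defaultdict(float)
    for u in units:
        cell_totals[u["cell"]] += u["weight"] * u["revenue"]

    scores = {}
    for u in units:
        total = cell_totals[u["cell"]]
        weighted = u["weight"] * u["revenue"]
        # Guard against an empty or zero-revenue cell
        scores[u["id"]] = 100.0 * weighted / total if total > 0 else 0.0
    return scores

# Hypothetical sampling units, all in one cell (illustrative values only)
units = [
    {"id": "A", "cell": "c1", "weight": 1.0, "revenue": 600.0},
    {"id": "B", "cell": "c1", "weight": 2.0, "revenue": 150.0},
    {"id": "C", "cell": "c1", "weight": 4.0, "revenue": 25.0},
]
scores = initial_scores(units)
```

With these made-up values the weighted revenues are 600, 300 and 100, so the scores within the cell sum to 100.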

Method: Initial Scores

  • Cell:

    • For Distributive Trades & Aquaculture:

      NAICS * Province

    • For Transportation:

      NAICS * Province * Stratum (Take-All / Take-Some)

    • For Services:

      NAICS * Province * Stratum (TA / TS) * Type of questionnaire (long / characteristic)

Method: Initial Scores

  • Within each cell

    • Sort SUs by descending score

    • Cumulate the scores until the survey’s target coverage threshold is reached: these SUs are Priority 1, and the rest are Priority 0.
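The sort-and-cumulate step can be sketched as follows. The function name and the sample scores are hypothetical. The slides do not say how the unit that crosses the threshold is treated, so this sketch assumes it is kept as Priority 1.

```python
def assign_priorities(cell_scores, threshold_pct):
    """cell_scores: {unit_id: score in percent} for ONE cell.
    Returns {unit_id: 1 or 0}."""
    priorities = {}
    cumulative = 0.0
    # Largest scores first
    ranked = sorted(cell_scores.items(), key=lambda kv: kv[1], reverse=True)
    for unit_id, score in ranked:
        if cumulative < threshold_pct:
            # Assumption: the unit that crosses the threshold is still Priority 1
            priorities[unit_id] = 1
            cumulative += score
        else:
            priorities[unit_id] = 0
    return priorities

# Hypothetical scores for one cell and a 65% coverage threshold
cell_scores = {"A": 60.0, "B": 30.0, "C": 10.0}
priorities = assign_priorities(cell_scores, threshold_pct=65.0)
```

Here unit A alone covers 60%, which is below the threshold, so B is also needed. Only C ends up as Priority 0.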

Method: Dynamic Scores

  • During the collection process, twice a week, we:

    • receive updated response codes;

    • recalculate the scores within the cell (i.e., make them dynamic) to update priorities;

    • update priorities in Blaise, the collection tool.

Method: Dynamic Scores

  • As collection proceeds:

    • Responding (received or completed) questionnaires count toward the cell threshold

    • Non-response questionnaires contribute nothing to the threshold

    • Out-of-scope (OOS) units are removed entirely from the cell (this reduces the cell’s revenue total)

    • In-Progress questionnaires are still being collected (this includes appointments)

During Collection

  • Compute the new total weighted revenue for the cell (excluding the out-of-scope units, OOS).

  • Priority 1 and Priority 0 units that are received or completed both contribute to reaching the cell threshold.

[Figure: example cell XXXXXXXX with a total of 475,000k and a threshold of 65% (308,750k). Received or completed units: 15% reached. In-progress units: 50% left to do, split between Priority 1 and Priority 0.]




Method: Dynamic Scores

  • Has the cell reached its threshold?

    • If yes, stop follow-up.

    • If no, recalculate scores using In-progress units and the remaining threshold.

      • Some cells must close due to lack of In-Progress questionnaires

      • Some In-progress Priority 0s may be promoted to Priority 1s.
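The twice-weekly update can be sketched as below. This is an illustration under stated assumptions, not the actual UES implementation: the status labels, field names and sample values are hypothetical, and re-ranking is done on weighted revenue (equivalent to re-ranking on score within a cell).

```python
RESPONDED = {"received", "completed"}

def update_cell(units, threshold_pct):
    """units: list of dicts with keys id, weighted_revenue, status (one cell).
    status is one of: received, completed, non_response, out_of_scope, in_progress.
    Returns (closed, priorities), where priorities covers in-progress units only."""
    # Out-of-scope units leave the cell and reduce its revenue total
    in_scope = [u for u in units if u["status"] != "out_of_scope"]
    total = sum(u["weighted_revenue"] for u in in_scope)
    if total <= 0:
        return True, {}

    target = threshold_pct / 100.0 * total
    # Only received or completed units count toward the threshold
    reached = sum(u["weighted_revenue"] for u in in_scope if u["status"] in RESPONDED)
    in_progress = sorted(
        (u for u in in_scope if u["status"] == "in_progress"),
        key=lambda u: u["weighted_revenue"],
        reverse=True,
    )

    # Stop follow-up if the threshold is reached or nothing is left to collect
    if reached >= target or not in_progress:
        return True, {u["id"]: 0 for u in in_progress}

    # Otherwise re-prioritize in-progress units against the remaining threshold
    remaining = target - reached
    priorities, cumulative = {}, 0.0
    for u in in_progress:
        if cumulative < remaining:
            priorities[u["id"]] = 1
            cumulative += u["weighted_revenue"]
        else:
            priorities[u["id"]] = 0
    return False, priorities

# Hypothetical cell: 1,000 in scope after removing F, threshold 65%
cell_units = [
    {"id": "A", "weighted_revenue": 300.0, "status": "completed"},
    {"id": "B", "weighted_revenue": 250.0, "status": "in_progress"},
    {"id": "C", "weighted_revenue": 200.0, "status": "in_progress"},
    {"id": "D", "weighted_revenue": 150.0, "status": "in_progress"},
    {"id": "E", "weighted_revenue": 100.0, "status": "non_response"},
    {"id": "F", "weighted_revenue": 500.0, "status": "out_of_scope"},
]
closed, priorities = update_cell(cell_units, threshold_pct=65.0)
```

In this made-up cell the target is 650, of which 300 is reached, so B and C are needed to cover the remaining 350 and D stays at Priority 0. The same arithmetic applied to the example on the slide gives 0.65 × 475,000k = 308,750k as the cell target.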


Paradata

  • Definition: All variables directly related to the data collection process

  • Currently used:

    • Response code

    • Appointment reason (edit – data collection)

    • Appointment date (recently added)

    • Currently used only by Annual Survey of Manufactures (ASM):

      • Number of attempts, commodity revenue and shipment revenue

  • Could possibly be used:

    • Type of contact with the respondent

    • Previous year’s response code

    • Type, date and number of reminders sent (mail, remail, …)

    • Others

Score Function Recent Update

  • Recently, a study was done on the impact of appointments on the response rate (for reference year 2003).

  • Following our findings, the “appointment date” was added to the score function as paradata.

Appointments: The Study

  • During the collection period, an appointment might be scheduled with the respondent.

  • “Does having an appointment affect the response rate?”

  • Note: When an appointment is made for a priority 1 questionnaire, it remains in the SF as priority 1 with the status “still in progress”. Therefore, no priority 0 unit is promoted to priority 1 in its place.

Response Rates: Appointment versus No Appointment

  • The response rate is significantly lower for the questionnaires with an appointment.

    RY2003 (Non-ASM surveys)

Response Rates: Scheduling of the appointment

  • The response rate is significantly lower for questionnaires when the appointment is made toward the end of the collection period.

Other Facts

  • The longer a questionnaire stays in appointment, the greater the probability that it ends up as a non-response at the end of the collection period.

  • 23.8% of the questionnaires with appointments were classified as non-respondents because their cases were still open at the end of the collection period.

Appointment: Conclusion

  • When possible, we should avoid making an appointment, especially toward the end of the collection period.

  • In cases of appointments, follow-up should occur soon after the appointment is made. An appointment is still a good way of improving the response rates.

  • The treatment of the appointments in the score function should be modified. Extra “In progress” units will be promoted to priority 1 in order to compensate for possible non-response.

Facts / Findings

  • A unit may not have an appointment date or may have one that is constantly changing.

  • Many appointment dates are within a few weeks.

  • It was decided to consider only units that have a late appointment date, and there are few of them.
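The slides do not specify how late appointments enter the score function, so the following is only one possible reading. It assumes that an in-progress unit with a late appointment keeps its priority but is not counted toward the remaining threshold, so extra units get promoted to compensate. The cutoff date, field names and sample values are all hypothetical.

```python
from datetime import date

def prioritize_in_progress(in_progress, remaining, late_cutoff):
    """in_progress: list of dicts with keys id, weighted_revenue and
    appointment_date (a datetime.date or None).
    remaining: weighted revenue still needed to reach the cell threshold.
    Returns {unit_id: 1 or 0}."""
    ranked = sorted(in_progress, key=lambda u: u["weighted_revenue"], reverse=True)
    priorities, cumulative = {}, 0.0
    for u in ranked:
        if cumulative < remaining:
            priorities[u["id"]] = 1
            appt = u.get("appointment_date")
            is_late = appt is not None and appt >= late_cutoff
            # Assumption: a late appointment is treated as a likely non-response,
            # so its revenue is not counted toward the remaining threshold
            if not is_late:
                cumulative += u["weighted_revenue"]
        else:
            priorities[u["id"]] = 0
    return priorities

# Hypothetical in-progress units; 350 of weighted revenue still needed
pending = [
    {"id": "B", "weighted_revenue": 250.0, "appointment_date": date(2007, 9, 15)},
    {"id": "C", "weighted_revenue": 200.0, "appointment_date": None},
    {"id": "D", "weighted_revenue": 150.0, "appointment_date": date(2007, 5, 1)},
    {"id": "E", "weighted_revenue": 100.0, "appointment_date": None},
]
result = prioritize_in_progress(pending, remaining=350.0, late_cutoff=date(2007, 9, 1))
```

Without the appointment rule, B and C would cover the remaining 350 and D would stay at Priority 0. Because B's appointment falls after the assumed cutoff, D is promoted as well.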

Facts / Findings

  • An appointment can mean many things.

  • Many unexpected factors made the changes less effective than initially expected.

Human Errors

  • The interviewer:

    • Enters the wrong value for a variable (for example, appointment reason)

    • Does not update a key variable (for example, appointment date)

System Problems

  • System Failures

    • As a result, some variables are affected, like the number of attempts.

  • Files not properly loaded

    • Missing values or variables

  • Some follow-up events occur outside of the system

Theoretical / Practical

  • The appointment date is also used to set the “remail” (re-mailing of the questionnaire) and fax dates.

  • Also, some appointment dates are default dates (these differ from survey to survey).

  • An appointment is also used as a reminder for the interviewer to call back a respondent who was unavailable at the time of the initial call.

Future Developments

  • Establish what really constitutes an appointment; conduct more studies on appointments.

  • Study more paradata to “quantify” the importance of each unit, assign priorities and improve the score function.

  • Introduce a cost function to help assign the priority and the type of follow-up.

  • Combine the ASM score function and the Non-ASM score function.

Thank You / Merci! Questions?

