
Use of Administrative Data in Statistics Canada’s Annual Survey of Manufactures

Steve Matthews and Wesley Yung. May 16, 2004. The United Nations Statistical Commission and Economic Commission for Europe, Conference of European Statisticians.


Presentation Transcript


  1. Use of Administrative Data in Statistics Canada’s Annual Survey of Manufactures • Steve Matthews and Wesley Yung • May 16, 2004 • The United Nations Statistical Commission and Economic Commission for Europe, Conference of European Statisticians

  2. Outline • Introduction • Tax data programs at Statistics Canada • The Annual Survey of Manufactures (ASM) • Overview • Strategy for use of tax data • Analytical studies • Conclusions and Future Work

  3. Introduction • Desire to increase use of tax data • Reduce respondent burden • Reduce survey costs • Can be used at many stages of survey process • Stratification • Survey data validation • Edit and imputation • Estimation

  4. Tax Data programs at Statistics Canada • Tax data available to Statistics Canada • Collected by Canada Revenue Agency (CRA) • Access via a data-sharing agreement • To be used only for statistical purposes • Two extensive tax data programs • Unincorporated businesses (T1) • Incorporated businesses (T2)

  5. Tax Data programs at Statistics Canada (cont’d) • T1 - Population • Unincorporated businesses • Account for small share of revenues • Administrative Data • Sample-based • Limited set of variables • Edit and imputation is applied • Weighted benchmarked estimates

  6. Tax Data programs at Statistics Canada (cont’d) • T2 - Population • Incorporated businesses • Account for large share of revenues • Administrative Data • Census-based • Extensive set of variables • Edit and imputation is applied • Micro-data is produced

  7. The Annual Survey of Manufactures • Manufacturing is an important sector of the Canadian economy (~17% of GDP) • Annual Survey of Manufactures • Take-none Portion and Survey Portion • Extensive questionnaire (financial and commodity) • Data requirements (pseudo-census)

  8. The Annual Survey of Manufactures (cont’d) • Target population • Drawn from Statistics Canada’s Business Register (BR) • All businesses classified to manufacturing • Sample design • Non-survey portion • Administrative data • Survey portion • Stratified SRS (Stratum = NAICS * Province * Size) • Small take-some / Large take-some / Take-all • Collected via mail-out / mail-back, follow-up via telephone
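The sample design on this slide can be sketched as follows. This is a minimal illustration of stratified simple random sampling with take-all strata, not the ASM production system; the function name `stratified_srs`, the record layout, and the sample sizes are assumptions made for the example.

```python
import random
from collections import defaultdict

def stratified_srs(frame, n_by_stratum, take_all, seed=0):
    """Select a stratified simple random sample (hypothetical helper).

    frame        : list of dicts with keys 'id' and 'stratum'
    n_by_stratum : sample size for each take-some stratum (assumed >= 1)
    take_all     : set of strata that are fully enumerated
    Returns a list of (unit, design_weight) pairs.
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for unit in frame:
        by_stratum[unit["stratum"]].append(unit)

    sample = []
    for stratum, units in by_stratum.items():
        if stratum in take_all:
            chosen, weight = units, 1.0          # every unit selected
        else:
            n = min(n_by_stratum[stratum], len(units))
            chosen = rng.sample(units, n)        # SRS without replacement
            weight = len(units) / n              # design weight N_h / n_h
        sample.extend((u, weight) for u in chosen)
    return sample
```

A stratum here would be a tuple such as `("311", "ON", "small")`, mirroring NAICS * Province * Size. Within each stratum the design weights sum to the stratum population count.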

  9. The Annual Survey of Manufactures (cont’d) • Edit and Imputation • Edits applied to ensure accuracy and coherence • Extensive imputation to produce ‘pseudo-census’ dataset • Historical imputation • Ratio imputation • Nearest-neighbour donor imputation
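Two of the imputation methods named on this slide can be illustrated with a short sketch. The slides do not give the ASM specifications, so the functions below show only the textbook forms of ratio imputation and nearest-neighbour donor imputation; the names, the single auxiliary variable `x`, and the absolute-distance measure are assumptions.

```python
def ratio_impute(x_missing, respondents):
    """Ratio imputation: y* = x * (sum of y / sum of x) over respondents.

    respondents is a list of (x, y) pairs from responding units in the
    same imputation class (hypothetical layout).
    """
    sum_x = sum(x for x, _ in respondents)
    sum_y = sum(y for _, y in respondents)
    return x_missing * sum_y / sum_x

def nearest_neighbour_impute(x_missing, respondents):
    """Donor imputation: copy y from the respondent with the closest x."""
    _, donor_y = min(respondents, key=lambda pair: abs(pair[0] - x_missing))
    return donor_y
```

For example, with respondents `[(100, 50), (200, 110), (300, 140)]`, the ratio method scales the auxiliary value by 300/600, while the donor method copies the value of the closest respondent unchanged.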

  10. The Annual Survey of Manufactures (cont’d) • Estimation • Non-survey portion (tax data) • Total Expenses only • T1: weighted domain estimates • T2: aggregates from administrative census dataset • Survey portion (survey data and imputed data) • Aggregates from pseudo-census dataset • Domains of interest: NAICS and Province
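As a rough illustration of the domain estimation described on this slide, the sketch below sums weighted values by domain. The record layout and the name `domain_totals` are assumptions; for a census or pseudo-census dataset every weight would simply be 1.

```python
from collections import defaultdict

def domain_totals(records, domain_key):
    """Sum weight * value within each domain (e.g. NAICS or province).

    records is a list of dicts with 'weight', 'value', and the domain
    key (hypothetical layout).
    """
    totals = defaultdict(float)
    for r in records:
        totals[r[domain_key]] += r["weight"] * r["value"]
    return dict(totals)
```

The same function serves both domains of interest by changing `domain_key`, for instance `domain_totals(records, "naics")` and `domain_totals(records, "province")`.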

  11. Analytical Studies • Motivation for two studies • Which variables should be ‘replaced’? • What are the effects of the strategy on final estimates for all variables? • Study 1 – Data comparison • Study 2 – Impact Analysis

  12. Analytical Study 1 • Study to select appropriate variables • Comparison of reported data collected via survey and tax • Simple businesses only • Assess suitability for substitution of survey data • Based on ~6,000 businesses

  13. Analytical Study 1 (cont’d) • Correlation Analysis • Wide range of correlations • Total Expenses: 0.9 • Total Energy Expenses: -0.10 • Reporting Patterns • Same pattern (zero or positive) for individual businesses • Total Expenses: 99% • Total Energy Expenses: 50%
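The two comparisons on this slide could be computed along the following lines. This is a sketch under stated assumptions: the slides do not say which correlation coefficient was used, so Pearson's is assumed, and "same pattern" is taken to mean that the survey and tax values for a business are both zero or both positive.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between paired survey and tax values."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def same_pattern_rate(xs, ys):
    """Share of businesses whose two values are both zero or both positive."""
    same = sum((x > 0) == (y > 0) for x, y in zip(xs, ys))
    return same / len(xs)
```

A variable like Total Expenses would score high on both measures, while a variable reported as zero in one source and positive in the other for half the businesses would show a pattern rate near 0.5.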

  14. Analytical Study 1 (cont’d) • Distribution of Ratios • Examined histograms, fraction between 0.9 and 1.1 • Total Expenses: 60% • Total Energy Expenses: 16% • Population Estimates • Relative difference between tax and survey-based estimates • Total Expenses: 3% • Total Energy Expenses: 28%
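The ratio and estimate comparisons on this slide can be sketched as below. The slides do not state which source is the numerator or the reference, so taking tax over survey and measuring the difference relative to the survey estimate are assumptions, as are the function names.

```python
def fraction_in_band(tax, survey, low=0.9, high=1.1):
    """Share of tax/survey ratios falling in [low, high].

    Units with a zero survey value are skipped because the ratio is
    undefined for them (an assumption of this sketch).
    """
    ratios = [t / s for t, s in zip(tax, survey) if s != 0]
    return sum(low <= r <= high for r in ratios) / len(ratios)

def relative_difference(tax_estimate, survey_estimate):
    """Difference between the estimates, relative to the survey estimate."""
    return (tax_estimate - survey_estimate) / survey_estimate
```

Under this convention, a tax-based total of 103 against a survey-based total of 100 gives a relative difference of 3%.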

  15. Analytical Study 1 (cont’d) • Selected several variables for direct substitution • Section totals and sub-totals • expenses, revenues, inventories, etc. • Remaining variables are imputed • Imputation => assign distribution of details within each total
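One way to read "assign distribution of details within each total" is that a substituted total is split across detail variables using an imputed set of shares. The sketch below takes the shares from a donor record; that choice, the name `distribute_total`, and the variable names are assumptions, since the slides do not describe the actual method.

```python
def distribute_total(total, donor_details):
    """Split a total across detail variables in the donor's proportions.

    donor_details maps detail names to the donor's reported values
    (hypothetical layout); their sum is assumed to be positive.
    """
    donor_total = sum(donor_details.values())
    return {name: total * value / donor_total
            for name, value in donor_details.items()}
```

The imputed details add up to the substituted total by construction, which keeps the section totals and their components coherent.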

  16. Analytical Study 1 - Conclusions • Distinctively different results for different variables • Direct substitution seems feasible for totals • Direct substitution not recommended for details • Use standard methods to impute other variables

  17. Analytical Study 2 • Analysis to evaluate impact of tax data strategy • Bias • Comparison of estimates from different scenarios • Variance • Shao-Steel approach for variance estimation • Reflects variance from sampling and imputation • Assume equal probability of response within imputation class

  18. Analytical Study 2 (cont’d) Scenarios

  19. Analytical Study 2 (cont’d) Comparison of resulting estimates for Total Expenses Relative Difference from “HT – No Tax” – Total Expenses * Median value for all such domains

  20. Analytical Study 2 (cont’d) Comparison of estimated CVs for Total Expenses Coefficient of Variation – Total Expenses * Median value for all such domains
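The slides do not write out the formula behind these comparisons. For reference, the conventional definition of the coefficient of variation of an estimated total is given below, where $\hat{Y}_d$ (the estimated total for domain $d$) and $\hat{V}(\hat{Y}_d)$ (its estimated variance, here covering both sampling and imputation) are notation assumed for this note.

```latex
\mathrm{CV}(\hat{Y}_d) = \frac{\sqrt{\hat{V}(\hat{Y}_d)}}{\hat{Y}_d}
```

A lower CV indicates a more precise estimate for the domain.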

  21. Analytical Study 2 (cont’d) Comparison of resulting estimates for Total Energy Expenses Relative Difference from “HT – No Tax” – Total Energy Expenses * Median value for all such domains

  22. Analytical Study 2 (cont’d) Comparison of estimated CVs for Total Energy Expenses Coefficient of Variation – Total Energy Expenses * Median value for all such domains

  23. Analytical Study 2 - Conclusions • Bias • Small relative difference between estimated totals from scenarios • Variance • Relatively low CV for all options • Tax substitution variables: Scenario 3 most efficient • Non-tax substitution variables: Scenario 1 most efficient • Analytical capabilities • Scenarios 2 and 3 provide most detail

  24. Conclusions • Results used to select 2004 strategy – “PC – Tax” • Meets needs of data users • Reduced cost and response burden • Maintain (improve) quality • Striving to further increase use of tax data • Increased portion of population • Increased number of variables

  25. Future Work • Editing of tax data • Similar approach to survey data approach • Potential to expand list of direct substitution variables • Indirect use of tax data • More adaptive models • Quality indicators • Account for increased variance and potential for bias due to imputation
