
Towards a Better Integration of Survey and Tax Data in the Unified Enterprise Survey

Claude Turmelle, Statistics Canada. ICES-III, Montréal, Québec, Canada, June 18-21, 2007.


Presentation Transcript


  1. Towards a Better Integration of Survey and Tax Data in the Unified Enterprise Survey • Claude Turmelle, Statistics Canada • ICES-III, Montréal, Québec, Canada, June 18-21, 2007

  2. Outline • Overview of the UES • Characteristics of the target population • Current use of tax data • At sampling • At imputation • At estimation • Issues and Challenges • Towards a better use of tax data • Conclusion

  3. Overview of the UES • The Unified Enterprise Survey (UES) started in 1997 • Objectives • Integrate all annual business surveys into one unified survey framework • Produce quality financial and commodity estimates • At national and sub-national levels • At industrial levels

  4. Overview of the UES • Target population • All Canadian businesses within the covered industries • The UES is an establishment-based survey • Coverage over time • 1997: Seven industries • 1998: Sixteen more (including Wholesale) • 1999: Four more (including Retail) • 2000: Four more (including Manufacturing) • … • 2007: Now covers over 60 major industries

  5. Characteristics of the Target Population • Divided into two main types of businesses: unincorporated (T1) and incorporated (T2) • General Index of Financial Information (GIFI) data are available electronically for the entire T2 population • T1 data are only available electronically for about half the T1s (e-filers)

  6. Characteristics of the Target Population • An enterprise is • Complex: Multi-provincial and/or Multi-industry and/or Multi-legal • Simple: none of the above (one province, one industry, one legal entity) • An enterprise is also • Single: Only one establishment • Multi: More than one establishment • Simple-Single enterprises represent about 95% of the population but account for only about 40% of the economy

  7. Current Use of Tax Data • Why would someone use tax data? • Improve efficiency of the sample design • Reduce the response burden • Reduce the collection cost • Improve quality of the estimates

  8. Current Use of Tax Data • At sampling • Some key variables taken from different tax files are put on the sampling frame • Total Revenue, Total Expenses from GIFI • Total Sales from Goods & Services Tax (GST) • Salaries & Wages, Number of Employees from Payroll Deductions (PD7) • Used to define a size measure (Total Revenue) for each establishment on the frame • Used to stratify the population by size and to define the Take-None (T-N) portion
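
As an illustration of the stratification step on this slide, the sketch below assigns an establishment to a size stratum from its Total Revenue. The boundary values and the function name `assign_stratum` are hypothetical, chosen only for the example; the slides do not give the actual thresholds or the number of strata.

```python
from bisect import bisect_right

# Hypothetical revenue boundaries (in dollars). The first one plays the role
# of the exclusion threshold: units below it fall in the Take-None portion.
BOUNDARIES = [30_000, 250_000, 5_000_000]

def assign_stratum(total_revenue, boundaries=BOUNDARIES):
    """Return 'take-none' below the first boundary, else a size stratum label."""
    index = bisect_right(boundaries, total_revenue)
    if index == 0:
        return "take-none"
    return f"size-{index}"

print(assign_stratum(10_000))     # take-none
print(assign_stratum(100_000))    # size-1
print(assign_stratum(9_000_000))  # size-3
```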

  9. Current Use of Tax Data • At imputation • Used to replace survey data (financial variables) for a predetermined sub-sample of selected Simple-Single units • Also used to replace survey data for some non-respondents • Used as auxiliary data during imputation

  10. Current Use of Tax Data • At estimation • GIFI data are used to produce estimates for all T2 units falling in the T-N portion • T1 e-filer data are used to produce estimates for all T1 units falling in the T-N portion

  11. UES Survey Design at a Glance • Diagram: the T1 and T2 populations, each split by the exclusion threshold • Above the threshold: main sample to be surveyed • Not eligible for tax: full questionnaire • Characteristic questionnaire (services surveys) or full questionnaire (other surveys) • Tax replaced • Below the threshold • T1 Take-None: sample of e-filers • T2 Take-None: census of GIFI • For variables available from tax: Total estimate = Survey estimate (T1, T2) + T2 Take-None + T1 Take-None e-filer estimate • For variables not available from tax (characteristics): Total estimate = Survey estimate (T1, T2)
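
The two total-estimate formulas on this slide can be written as a small function. This is only a sketch: the function and argument names are hypothetical, and each component estimate is assumed to have been computed beforehand.

```python
def total_estimate(survey_estimate, t2_take_none, t1_take_none_efiler,
                   available_from_tax):
    """Combine the component estimates as described on the slide.

    For variables available from tax, the Take-None components are added to
    the survey estimate. For characteristics (not available from tax), only
    the survey estimate is used, so the Take-None portion is not covered.
    """
    if available_from_tax:
        return survey_estimate + t2_take_none + t1_take_none_efiler
    return survey_estimate

# Made-up numbers, for illustration only.
print(total_estimate(900.0, 60.0, 40.0, available_from_tax=True))   # 1000.0
print(total_estimate(900.0, 60.0, 40.0, available_from_tax=False))  # 900.0
```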

  12. Issues and Challenges • At sampling • Sometimes we get inconsistent tax data • Ex.: GIFI Total Revenue = $2M, GST Total Sales = $25M • What do we do? • We use a conservative approach, i.e. we take the maximum • We manually verify and adjust the extreme cases (we’ll make use of survey data if available)
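
A minimal sketch of the conservative rule described on this slide, using the slide's own example values. The function names and the ratio used to flag "extreme" cases for manual review are assumptions made for illustration; the slides do not say how extreme cases are identified.

```python
from typing import Optional

def size_measure(gifi_revenue: Optional[float],
                 gst_sales: Optional[float]) -> Optional[float]:
    """Conservative size measure: the largest of the available values."""
    available = [v for v in (gifi_revenue, gst_sales) if v is not None]
    return max(available) if available else None

def needs_manual_review(gifi_revenue, gst_sales, ratio_threshold=5.0):
    """Flag units whose two sources disagree by a large factor.

    The threshold of 5 is hypothetical, not taken from the slides.
    """
    if not gifi_revenue or not gst_sales:
        return False  # nothing to compare when a source is missing or zero
    high, low = max(gifi_revenue, gst_sales), min(gifi_revenue, gst_sales)
    return high / low >= ratio_threshold

print(size_measure(2_000_000, 25_000_000))         # 25000000
print(needs_manual_review(2_000_000, 25_000_000))  # True
```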

  13. Issues and Challenges • At sampling (cont’d) • Sometimes all we get is Number of Employees or Salaries & Wages (Total Revenue is missing or $0) • What do we do? • We model Total Revenue using what’s available
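
The slides do not say which model is used to predict Total Revenue. As one possible illustration, the sketch below assumes a simple ratio model fitted on units that report both Salaries & Wages and Total Revenue; the function names and the data are hypothetical.

```python
def fit_ratio(pairs):
    """Estimate revenue per dollar of salaries from (salaries, revenue) pairs."""
    total_salaries = sum(salaries for salaries, _ in pairs)
    total_revenue = sum(revenue for _, revenue in pairs)
    return total_revenue / total_salaries

def predict_revenue(salaries, ratio):
    """Predicted Total Revenue for a unit that only has Salaries & Wages."""
    return ratio * salaries

# Made-up units with both variables available (e.g. from the same industry).
complete_units = [(100.0, 400.0), (200.0, 600.0)]
ratio = fit_ratio(complete_units)
print(round(predict_revenue(150.0, ratio), 2))
```

In practice such a ratio would presumably be fitted within homogeneous groups (for example by industry), but that choice is an assumption here.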

  14. Issues and Challenges • At imputation • Sometimes we can’t find the link to tax data (ex.: not-for-profit organizations) • Sometimes we link to 2 or more tax files • We currently use direct tax replacement (i.e. Ysurvey = Xtax). Should we instead use a modelling approach (i.e. Ysurvey = f(Xtax))? • Studies have shown that in some cases it might be more appropriate to use f(Xtax)
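
To make the contrast between direct replacement and a modelling approach concrete, the sketch below fits a straight line by ordinary least squares on units where both the survey value and the tax value are observed, then compares the two imputed values for a new unit. The linear form of f, the function names and the data are all assumptions for illustration; the slides do not specify f.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Made-up units with both tax (x) and survey (y) values available.
x_tax = [100.0, 200.0, 300.0]
y_survey = [90.0, 180.0, 270.0]
intercept, slope = fit_line(x_tax, y_survey)

new_x = 250.0
direct = new_x                          # direct replacement: Ysurvey = Xtax
modelled = intercept + slope * new_x    # modelling approach: Ysurvey = f(Xtax)
print(direct, round(modelled, 2))
```

In this made-up data the survey values are systematically 10% below the tax values, so direct replacement would overstate the imputed value relative to the fitted model.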

  15. Issues and Challenges • At estimation • Currently, we use the one-phase Horvitz-Thompson estimator • It’s a very simple and fairly efficient estimator • Unfortunately, it could be severely biased if the model y = x (survey value equals tax value) doesn’t hold
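
The Horvitz-Thompson estimator of a total weights each sampled value by the inverse of its inclusion probability. The sketch below, with made-up values and probabilities, shows how substituting a tax value that differs from the survey value shifts the estimate; it illustrates the concern on this slide and is not the UES production code.

```python
def ht_total(values, inclusion_probs):
    """Horvitz-Thompson estimate of a total: sum of value / inclusion probability."""
    return sum(v / p for v, p in zip(values, inclusion_probs))

probs = [0.5, 0.5, 0.25]

# What the survey would have collected for the three sampled units.
y_survey = [100.0, 200.0, 300.0]

# Same units, but the third one is tax replaced and its tax value is 330.
y_with_tax = [100.0, 200.0, 330.0]

print(ht_total(y_survey, probs))    # 1800.0
print(ht_total(y_with_tax, probs))  # 1920.0
```

The gap of 120 between the two estimates is the unit-level difference (330 - 300) multiplied by its weight (1 / 0.25). When tax values differ systematically from survey values, such gaps do not cancel out across units.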

  16. Issues and Challenges • At estimation (cont’d) • Estimates for variables not available from tax files (characteristics/commodity) do not cover the T-N portion • For some characteristics the T-N portion can account for a lot more than 10%

  17. Issues and Challenges • Data quality • Response rates (What is a respondent?) • Respond to tax but not to the characteristic questionnaire • Reported tax data vs imputed tax data • Planned tax replacement vs tax replacement for non-response • Variance & CV • A lot of imputation occurs in the current strategy (incl. tax replacement) • Shouldn’t we include the variance due to imputation?

  18. Towards a Better Use of Tax Data • Understand the particularities of the different tax data sources (ex.: GST vs T2 is currently under investigation) • Explore different administrative files to help with particular sub-populations (ex.: not-for-profit organizations)

  19. Towards a Better Use of Tax Data • Keep investigating why Ysurvey ≠ Xtax even when they should conceptually be equal • Explore the idea of using Ysurvey = f(Xtax) • Fine-tune our definition of who is eligible for tax replacement and who is not • Currently studying the possibility of using a more robust estimator to protect against the potential bias • Developing a strategy to cover the entire population for all variables of interest

  20. Towards a Better Use of Tax Data • Start taking into account the variability introduced by imputation when computing variances and CVs • A framework is under development to define response rates when both tax data and survey data are used for the same units • Explore the possibility of making use of all the GIFI data, not only for the T-N and the sample

  21. Towards a Better Use of Tax Data • Diagram labels: Eligible, Ineligible, T1, T2, exclusion threshold • Main sample to be surveyed • Not eligible for tax: full questionnaire • Characteristic questionnaire (services surveys) or full questionnaire (other surveys) • Tax replaced • T1 Take-None: sample of e-filers • T2 Take-None: census of GIFI • For variables available from tax: Total estimate = Survey estimate (T1, T2) + T2 Take-None + T1 Take-None e-filer estimate • For variables not available from tax (characteristics): Total estimate = Survey estimate (T1, T2)

  22. Conclusion • Since the introduction of the UES, the use of tax data has increased consistently • It has significantly reduced response burden and the cost of the survey • Unfortunately, this has sometimes come at the expense of reduced data interpretability • Fortunately, it was recently decided that we would take a few steps back to evaluate how we currently do things, and to determine how we could improve our strategy

  23. Claude Turmelle (613) 951-3327 claude.turmelle@statcan.ca
