
Methodological challenges in integrating data collections in business statistics


Presentation Transcript


1. Methodological challenges in integrating data collections in business statistics
   Paul Smith, Office for National Statistics

2. Outline
  • Data quality for different sources
    • quality measures for survey and administrative inputs
    • quality measures for outputs
  • Combinations of sources
    • familiar and more advanced situations
  • Mode effects
  • Models
  • Discussion

3. Statistical data collections - quality
  • Relevance
    • generally questions conform to desired concepts
    • may be tailoring for
      • practicality
      • consistency across collections even if concepts differ
  • Accuracy
    • affected by sampling
    • impacts from non-response, measurement error
  • Timeliness
    • generally relatively timely

4. Administrative data - quality
  • Relevance
    • questions conform to administrative (not statistical) concepts
    • few concessions to statistical needs
  • Accuracy
    • unaffected by sampling
    • processes to discourage non-response
    • treatment of measurement error differs by variable
  • Timeliness
    • generally slow

5. Differences between types of source
  • Sampling accuracy is measurable for surveys, not relevant for administrative data sources
    • confidence in quality reduced for admin data
    • balance of accuracy measures different
  • Building statistical requirements into administrative series
    • requires negotiation and agreement
    • VAT classification information in the UK
    • INSEE has statistical and accounting information well integrated

6. Questionnaire design
  • Questionnaire design principles mostly used in designing statistical collections
  • Administrative data seen as “forms” not “questionnaires”
    • less attention to question phrasing to obtain the required answer
    • more attention to statutory requirements

7. Output data quality
  • Data quality from combined outputs can be challenging to measure
    • a function of the qualities of the input sources, and the methods used to combine them (see the sketch below)
    • some well-known general approaches
    • development of measures needed for particular cases (eg from models)
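A minimal sketch of the general point, assuming the combined output is a weighted sum of two input estimates; the weights, variances and covariance used here are purely illustrative and not taken from the presentation.

def combined_variance(w1, w2, var1, var2, cov12=0.0):
    # Var(w1*X1 + w2*X2) = w1^2*Var(X1) + w2^2*Var(X2) + 2*w1*w2*Cov(X1, X2)
    return w1 ** 2 * var1 + w2 ** 2 * var2 + 2 * w1 * w2 * cov12

# Survey input (larger variance) combined with an admin-based input
# (smaller variance), assumed uncorrelated for this illustration.
print(combined_variance(w1=0.4, w2=0.6, var1=25.0, var2=4.0))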

8. Combinations of sources - 1
  • Frame and sample information
  • Sampling frames typically derived from administrative sources
  • Multiple uses of frame information
    • sample design
    • sample selection
    • validation and editing
    • estimation and variance estimation (see the sketch below)
  • Quality easily derived – standard situation
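As a hedged illustration of this standard situation, the sketch below draws a stratified sample from an administrative frame and produces an expansion estimate; the frame fields, strata and sample sizes are hypothetical.

import random

# Hypothetical administrative frame: unit id, size stratum, turnover variable
frame = [
    {"id": i, "stratum": "large" if i % 10 == 0 else "small", "turnover": 100.0 + i}
    for i in range(1, 301)
]
sample_sizes = {"small": 20, "large": 10}  # illustrative allocation

random.seed(1)
estimate = 0.0
for stratum, n_h in sample_sizes.items():
    units = [u for u in frame if u["stratum"] == stratum]
    N_h = len(units)
    sample = random.sample(units, n_h)
    # Expansion estimator contribution: (N_h / n_h) * sample total
    estimate += N_h / n_h * sum(u["turnover"] for u in sample)

print(f"Estimated population turnover: {estimate:.0f}")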

9. Combinations of sources - 2
  • Dual-frame surveys
    • more than one administrative source
    • Pension funds survey in the UK
  • Units
    • business register
    • challenges of population inflation if matching not perfect
  • Estimate probability that unit appears in sample from either source
    • use in appropriate weighting procedure (see the sketch below)
    • adjustment for P(in both surveys) depends on survey type
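A minimal sketch of the weighting idea, under the simplifying assumption that selection from the two sources is independent (as the slide notes, the actual adjustment for P(in both surveys) depends on the survey type); the inclusion probabilities are illustrative.

def dual_frame_weight(p_a, p_b=0.0):
    # Probability of appearing in the combined sample from either source:
    # P(selected) = pA + pB - pA*pB  (independence assumed)
    p_combined = p_a + p_b - p_a * p_b
    return 1.0 / p_combined

print(dual_frame_weight(p_a=0.10))             # unit on source A only
print(dual_frame_weight(p_a=0.10, p_b=0.05))   # unit matched on both sources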

10. Combinations of sources - 3
  • Multiple surveys
    • different periodicity
    • summary information monthly, detail annually
    • for example capital expenditure – quarterly breakdown, annual summary
  • Benchmarking
    • where short-period surveys small (and variable) and annual larger (and less variable) (see the sketch below)
  • Quality measures
    • account for sampling error in both sources
    • account for non-response and measurement errors in larger survey
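A minimal benchmarking sketch, assuming the simplest pro-rata adjustment of quarterly estimates to the annual total; methods such as Denton or Cholette-Dagum distribute the adjustment more smoothly. The figures are illustrative.

def prorata_benchmark(quarterly, annual_total):
    # Scale the short-period estimates so they sum to the annual benchmark
    scale = annual_total / sum(quarterly)
    return [q * scale for q in quarterly]

quarterly_estimates = [240.0, 260.0, 250.0, 270.0]  # small, variable survey
annual_benchmark = 1100.0                           # larger, less variable survey
print(prorata_benchmark(quarterly_estimates, annual_benchmark))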

11. Combinations of sources - 4
  • Auxiliary information
    • if administrative concept not close to statistical concept, data may still be useful
  • Auxiliary information in estimation
    • not required to be correct, only correlated with outcome
    • the better the correlation, the better the accuracy (see the sketch below)
  • Auxiliary information in validation
    • use tax data to improve validation follow-up activity
  • Data confrontation
    • use multiple sources to identify discrepancies
    • balancing
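A minimal sketch of auxiliary information in estimation, using a ratio estimator: the administrative variable need not match the statistical concept, only correlate with it. All data here are hypothetical.

def ratio_estimate(y_sample, x_sample, x_population_total):
    # Estimated total of y = (sample ratio of y to x) * known population total of x
    return sum(y_sample) / sum(x_sample) * x_population_total

survey_turnover = [120.0, 95.0, 210.0, 60.0]   # statistical concept (sampled units)
admin_turnover  = [110.0, 100.0, 200.0, 55.0]  # administrative concept, same units
admin_population_total = 46_500.0              # known from the admin source

print(ratio_estimate(survey_turnover, admin_turnover, admin_population_total))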

12. Mode effects
  • Mode effects manifest in several ways
    • differences in contact rate
    • differences in response rate given contact
    • differences in question replies given response
  • Test differences through a designed experiment (van den Brakel & Renssen 1998, 2005)
    • evaluates whole-process differences (not individual steps)
    • non-response adjustment if good predictors for response amongst auxiliary data (variance increases) (see the sketch below)
    • model-based adjustments for other changes
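A minimal sketch of a non-response weighting adjustment, assuming a single auxiliary predictor of response (a hypothetical register size band) and a weighting-class adjustment: within each class, respondent weights are inflated by the inverse response rate, which reduces bias at the cost of increased variance. The records are illustrative.

from collections import defaultdict

sample = [  # (size_band, responded, design_weight) - illustrative records
    ("small", True, 10.0), ("small", False, 10.0), ("small", True, 10.0),
    ("large", True, 2.0), ("large", True, 2.0), ("large", False, 2.0),
    ("large", True, 2.0),
]

# Observed response rate per weighting class
counts = defaultdict(lambda: [0, 0])  # band -> [respondents, sampled]
for band, responded, _ in sample:
    counts[band][1] += 1
    counts[band][0] += int(responded)

# Respondents keep design_weight / response_rate as their adjusted weight
adjusted = [
    (band, weight * counts[band][1] / counts[band][0])
    for band, responded, weight in sample if responded
]
print(adjusted)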

13. Temporal differences
  • Administrative data often have longer reference period than statistical requirement
  • Implies temporal disaggregation (model-based) – Dagum & Cholette 2006 (see the sketch below)
  • Quality implications
    • estimated data as inputs
    • sensitivity of model to interesting changes
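A minimal temporal disaggregation sketch, splitting an annual administrative value across quarters in proportion to a related quarterly indicator; Dagum & Cholette (2006) describe the fuller model-based methods. The figures are illustrative.

def disaggregate(annual_value, quarterly_indicator):
    # Pro-rata split: each quarter gets its share of the indicator total
    total = sum(quarterly_indicator)
    return [annual_value * x / total for x in quarterly_indicator]

annual_admin_value = 4000.0
quarterly_indicator = [0.9, 1.0, 1.1, 1.0]  # eg a related short-period survey
print(disaggregate(annual_admin_value, quarterly_indicator))
# -> approximately [900, 1000, 1100, 1000]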

14. Models for combining data
  • Full flexibility in combining data available through modelling approach
  • Models at boundary between statistical producer and user
  • Ideally statistical results insensitive to model assumptions
    • small area estimates (see the sketch below)
      • useful for social surveys
      • challenges for business surveys not yet resolved
    • modelling for unit structures - BRES
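A minimal sketch of the small area idea as a composite (shrinkage) estimator: the direct survey estimate is pulled towards a synthetic model-based estimate, more strongly where the direct estimate is imprecise. A production model (eg Fay-Herriot) would estimate the model variance from the data; the figures here are illustrative.

def composite_estimate(direct, var_direct, synthetic, model_variance):
    # Shrinkage weight: precise direct estimates keep more of their own value
    gamma = model_variance / (model_variance + var_direct)
    return gamma * direct + (1 - gamma) * synthetic

# Domain with a precise direct estimate: little shrinkage (result near 104)
print(composite_estimate(direct=105.0, var_direct=4.0,
                         synthetic=100.0, model_variance=16.0))
# Domain with an imprecise direct estimate: heavy shrinkage (result near 106)
print(composite_estimate(direct=130.0, var_direct=64.0,
                         synthetic=100.0, model_variance=16.0))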

15. Discussion
  • Aim: more from existing sources
    • often imperfect matches
    • modelling only appropriate approach
      • subjective
      • robust to assumptions
      • sensitivity analysis
  • Mixed mode collections
    • usability and low cost
    • data combination
    • quality components harder to measure

16. For more details see the paper, or contact paul.smith@ons.gov.uk
