Improved input data quality from administrative sources through the use of quality indicators. Use of Administrative Registers in Production of Statistics. Group work, Oslo, 14 - 17 October 2014. Coen Hendriks, Division for Statistical Populations, Statistics Norway.
Topics • The three C’s of register based statistics • Measure quality • Analyse quality
Co-operation, Communication and Co-ordination • The three C’s of register based statistics contribute strongly to the quality of register based statistics • How is this done in Statistics Norway?
Co-operation on registers within SN • SN has taken several measures to professionalise and develop the co-operation, including: • Improve the quality in the administrative sources • Develop ways to measure, document and communicate on quality • Professionalise the contact with the register owners
Rather than repairing, errors should be avoided in the source • Co-operation between SN and the register owners • SN reports errors • The register owner makes corrections in the source • Single source approach • Multiple source approach – agreements on data processing • Feedback at micro level • Errors within the source can be reported • SN makes a complaint about the data quality • Agreements on cooperation
Single source approach • Feedback at micro level • Errors within the source can be reported • SN makes a complaint about the data quality
Multiple source approach: Agreements on data processing • Errors which appear after linking two sources • General rule: aggregated reporting • An agreement on data processing allows reporting at micro level from SN to the register owner • Provided the register owner can use both registers for administrative reasons • E.g. population registration uses information from the Cadastre to improve quality in the CPR • SN can do these checks “in batch” on behalf of the register owner
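A cross-source check of this kind can be sketched as follows. It flags CPR addresses that are missing or not found in the Cadastre, then builds both an aggregated report (the general rule) and a micro-level list (only for use under a data processing agreement). This is a minimal Python sketch, not SN's implementation; all identifiers, field names and records are hypothetical.

```python
from collections import Counter

# Hypothetical extracts; real registers hold far more fields.
cpr = [
    {"person": "p1", "municipality": "1804", "address_id": "A1"},
    {"person": "p2", "municipality": "1804", "address_id": "A9"},
    {"person": "p3", "municipality": "0301", "address_id": None},
    {"person": "p4", "municipality": "0301", "address_id": "A2"},
]
cadastre_address_ids = {"A1", "A2"}


def address_errors(cpr_rows, known_addresses):
    """Return CPR rows whose address is missing or not found in the Cadastre."""
    return [row for row in cpr_rows if row["address_id"] not in known_addresses]


errors = address_errors(cpr, cadastre_address_ids)

# General rule: report counts only, here per municipality.
aggregated_report = Counter(row["municipality"] for row in errors)

# Micro-level list: assumed to be shared only under a data processing agreement.
micro_report = [row["person"] for row in errors]

print(dict(aggregated_report))
print(micro_report)
```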
Agreements on co-operation • Co-ordination in SN • Involves the Director of the Division for Data Capture, SN’s legal adviser, experts on quality (CoP, methodology), statistical departments and the Department for Statistical Populations • Drafted an agreement • Developed quality reports • SN invites major register owners into an agreement • Very positive reception from the register owners • A win-win situation • The agreement is supported by a quality report • Based on the quality indicators from the Blue-ets WP 4 • A descriptive approach, highlighting the problem areas
Managing statistical populations • Three administrative base registers and the statistical versions • The Central Coordinating Register for Legal Units (LU) - The Register for Businesses and Enterprises • Cadastre - The statistical Cadastre • Central Population Register (CPR) - The Statistical Population Register • Daily updates, integrated data in a common database • Other sources are integrated, new sources are being added • New information, new units, better coverage, more (actual) addresses, better contact information • Purpose: provide quality assured and updated registers with quality indicators, which cover all statistical populations
Quality indicators from Blue-ets WP 4 • The group leaders determined which units to measure for quality and operationalized the indicators • CPR: registered person, family and residential address • Cadastre: address, building, land property and functional unit in a building (dwelling) • LU: legal entities and local kind-of-activity units (LKAU) • The quality indicators were reviewed and coordinated • Programming in SAS • Counted up all the positive indicators (P) • Reporting (Q)
The indicator file • Number of positive indicators: P (the count of all positive indicators in the file) • A general quality indicator: Q = (P/(N*M))*1000 • Extracts: • Indicators with many occurrences (e.g. Ind7) • Units with many positive indicators (e.g. Unit1)
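The slide does not define N and M. The sketch below is in Python (the original work was programmed in SAS) and assumes that N is the number of units, M the number of indicators per unit, and that each indicator is a 0/1 flag where 1 means positive. The data and the names `indicator_file` and `quality_summary` are made up for illustration.

```python
# Hypothetical indicator file: one row per unit, one 0/1 flag per indicator.
indicator_file = {
    "Unit1": {"Ind1": 1, "Ind2": 1, "Ind7": 1},
    "Unit2": {"Ind1": 0, "Ind2": 0, "Ind7": 1},
    "Unit3": {"Ind1": 0, "Ind2": 0, "Ind7": 1},
    "Unit4": {"Ind1": 0, "Ind2": 0, "Ind7": 0},
}


def quality_summary(rows):
    """Compute P, Q and the two extracts, assuming N = units and M = indicators."""
    names = sorted({name for flags in rows.values() for name in flags})
    n_units, m_indicators = len(rows), len(names)
    if n_units == 0 or m_indicators == 0:
        raise ValueError("the indicator file needs at least one unit and one indicator")

    # P: total number of positive indicators in the file.
    p = sum(sum(flags.values()) for flags in rows.values())
    # Q: positive indicators per 1000 unit-indicator combinations.
    q = p / (n_units * m_indicators) * 1000

    per_indicator = {
        name: sum(flags.get(name, 0) for flags in rows.values()) for name in names
    }
    per_unit = {unit: sum(flags.values()) for unit, flags in rows.items()}
    return {"P": p, "Q": q, "per_indicator": per_indicator, "per_unit": per_unit}


summary = quality_summary(indicator_file)
print(summary["P"], round(summary["Q"], 1))
```

With these made-up flags, Ind7 is the indicator with the most occurrences and Unit1 the unit with the most positive indicators, which mirrors the two extracts named on the slide.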
The practical cooperation with the data owners (registered persons in the CPR) • Municipalities with highest values of Q, the major cities and Norway, 1 January 2014 • Analysis shows: - many inconsistent values (PIN of mother, father and/or spouse/partner is invalid) - many measurement errors (missing dwelling number, invalid address) - trouble in the county of Nordland (18xx) • Suspicious units are transferred to the CPR
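A ranking like the one on this slide can be sketched by computing Q separately per municipality. The sketch assumes that N is the number of registered persons in the municipality and M the number of indicators, which the slides do not state explicitly. The municipality codes, the counts and the value of M below are invented and are not SN's results.

```python
from collections import defaultdict

M_INDICATORS = 3  # hypothetical number of indicators per person

# Hypothetical input: (municipality code, positive indicators for one person).
persons = [
    ("1804", 2), ("1804", 1),
    ("0301", 0), ("0301", 1), ("0301", 0), ("0301", 0),
]


def q_by_municipality(rows, m_indicators):
    """Return (municipality, Q) pairs sorted from highest to lowest Q."""
    units = defaultdict(int)      # N per municipality
    positives = defaultdict(int)  # P per municipality
    for municipality, n_positive in rows:
        units[municipality] += 1
        positives[municipality] += n_positive
    q = {
        mun: positives[mun] / (units[mun] * m_indicators) * 1000
        for mun in units
    }
    return sorted(q.items(), key=lambda item: item[1], reverse=True)


ranking = q_by_municipality(persons, M_INDICATORS)
for municipality, q_value in ranking:
    print(municipality, round(q_value, 1))
```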
Other examples of analysis • What kind of positive indicators are found for newly registered persons? • Measurement errors (missing dwelling number, invalid addresses) • Dubious objects (too many registered persons in a dwelling) • Refer to Appendix, table 3 • Why do previously registered persons show an increase in the number of positive indicators? • Inconsistent units and values due to immigration • Measurement errors (missing dwelling number, invalid addresses) • Refer to Appendix, table 4
The principles for the practical cooperation with the data owners • Positive indicators are identified within a source: • SN complains about the quality of the deliverables • SN returns individual-level information with positive indicators • Positive indicators are found by matching to another source: • SN gives feedback at aggregate level – main rule • Assuming a data processing agreement: SN can supply individual data with positive indicators • The data owner has the legal authority to use the second source • The data owner has a copy of the other source available
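These principles can be read as a small decision rule. The Python sketch below is one interpretation, assuming that micro-level feedback after matching requires all three conditions listed on the slide (an agreement, legal authority, and a copy of the other source). The names `Feedback` and `feedback_level` are hypothetical.

```python
from enum import Enum


class Feedback(Enum):
    MICRO = "individual records with positive indicators"
    AGGREGATE = "aggregate figures only"


def feedback_level(
    found_by_matching,
    has_processing_agreement=False,
    owner_may_use_other_source=False,
    owner_holds_other_source=False,
):
    """Pick the feedback level SN may give to a data owner (assumed reading)."""
    # Indicators found within a single source can be returned at micro level.
    if not found_by_matching:
        return Feedback.MICRO
    # After matching, micro-level feedback is assumed to need all conditions.
    if (
        has_processing_agreement
        and owner_may_use_other_source
        and owner_holds_other_source
    ):
        return Feedback.MICRO
    # Main rule for indicators found by matching to another source.
    return Feedback.AGGREGATE


print(feedback_level(found_by_matching=True).value)
```

Treating the three conditions as jointly required is the conservative choice: if any one is missing, the function falls back to the main rule of aggregate feedback.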
Quality across registers • SN has a long record of matching sources for quality control and improvement • The approach works: «Improved quality» in the Cadastre gives «fewer mistakes» in the CPR • Indicators for quality across registers need to be developed. We are just starting • A cluster of employees in an enterprise with no business (LKAU) within reasonable distance might indicate undercoverage in the Business Register (missing LKAU)
Final remarks • There is a difference between good quality data from registers and good quality register based statistics • Statistical inference • Definition errors – changes in the register due to political decisions