Editing and Imputation Methods for Italian Censuses: An Overview

This article provides an overview of the editing and imputation methods used in the Italian censuses, including the strategies for the 2011 census and likely innovations. It discusses the impact on editing and validation procedures and highlights the use of the DIESIS system and data-driven and minimum change approaches. The article also addresses the identification of respondent paths, validation of person 1 in the household, and the importance of E&I for small but important groups in the population.


Presentation Transcript


  1. An Overview of Editing and Imputation Methods for the next Italian Censuses Gianpiero Bianchi, Antonia Manzari, Alessandra Reale UNECE-Eurostat Meeting on Population and Housing Censuses Geneva, 13-15 May, 2008

  2. Outline • Features of 2001 E&I strategy • E&I strategy for 2011 Census • Likely innovations for 2011 Census • Impact on editing and validation procedures • Conclusions

  3. Features of 2001 E&I strategy • Main E&I purpose: provide a complete and consistent set of data by performing plausible imputations and preserving the maximum amount of collected information • E&I strategy: divide the E&I problem into simpler sub-problems and find appropriate solutions for each of them • Overall E&I process composed of several (connected) procedures addressing specific problems and implementing suitable methods • Development and use of new techniques and software tools

  4. E&I strategy for 2011 Census • Built on the useful experience of the 2001 Census, taking account of: • The innovations in the survey design • Eurostat timeliness constraints • In particular: • Census variables split into topics processed in a pre-determined order (first demographic, then socio-economic) by appropriate procedures • Adaptation of 2001 procedures to the innovations and development of new procedures by means of highly efficient algorithms • Proper planning, implementation and management of the E&I procedures

  5. Main elements of the 2011 strategy • Use of the DIESIS* system developed in 2001 by ISTAT and academic researchers (Department of Computer and Systems Science of the University of Roma “La Sapienza”). Based on optimization techniques, it allows: • Treatment of qualitative and quantitative variables • Between-unit and within-unit edit rules • Joint use of data-driven and minimum change approaches • DIESIS will process 2011 demographic variables and, likely, some socio-economic variables • * Data Imputation and Edit System - Italian Software

  6. Main elements of the 2011 strategy • Joint use of data-driven and minimum change approaches by the DIESIS system • When the pool of donors is reduced, the data-driven approach can require imputing too many values • Minimum change approach used to minimize the number of values to be changed
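The interplay between the two approaches can be illustrated with a toy example. The sketch below is not the DIESIS algorithm (which relies on optimization techniques); it is a brute-force illustration, assuming hypothetical edit rules, field names and donors, of searching for the smallest set of fields that must be replaced with a donor's values for a record to pass all edits.

```python
from itertools import combinations

# Hypothetical edit rules: each returns True when the record is consistent.
EDITS = [
    lambda r: r["marital"] == "never_married" or r["age"] >= 16,
    lambda r: r["relation"] != "spouse" or r["marital"] == "married",
]

def passes(record, edits):
    """True when the record satisfies every edit rule."""
    return all(rule(record) for rule in edits)

def impute_min_change(record, donors, edits):
    """Change as few fields as possible, copying values from a single donor.

    Brute force over field subsets of increasing size; ties are broken by
    field order and donor order. Returns (imputed_record, changed_fields),
    or (None, None) when no donor yields a consistent record.
    """
    if passes(record, edits):
        return dict(record), ()
    fields = list(record)
    for k in range(1, len(fields) + 1):
        for subset in combinations(fields, k):
            for donor in donors:
                candidate = dict(record)
                for field in subset:
                    candidate[field] = donor[field]
                if passes(candidate, edits):
                    return candidate, subset
    return None, None

# An 8-year-old recorded as married fails the first edit.
record = {"marital": "married", "age": 8, "relation": "child"}
donors = [
    {"marital": "never_married", "age": 9, "relation": "child"},
    {"marital": "married", "age": 40, "relation": "spouse"},
]
imputed, changed = impute_min_change(record, donors, EDITS)
```

Here a single field (`marital`) is changed, whereas copying every differing value from either donor would change two fields. The search is exponential in the number of fields, so it only serves to illustrate the idea on small records.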

  7. Main elements of the 2011 strategy • Identification of the respondent path • Respondent paths used to: • Compute the Subset of Admissible Values (SAV) of Year of birth, a stratification variable for the imputation of demographic variables (the connection between the demographic and socio-economic steps) • Define strata for the imputation of socio-economic variables • Missing responses or errors can make the identification of the correct respondent path uncertain • Automatic procedure for the identification of the most likely path based on the analysis of the responses given to filter and dependent questions
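One simple way to picture the identification of the most likely path is to score each candidate path by how many observed responses contradict it. The sketch below is an assumption-laden illustration, not the procedure used by ISTAT: the filter question `works`, the dependent questions and the two paths are all hypothetical.

```python
# Hypothetical questionnaire: one filter question ("works") and three
# dependent questions, each belonging to exactly one respondent path.
DEPENDENT = ("occupation", "hours", "job_search")
PATHS = {
    "employed": {"filter": "yes", "expected": {"occupation", "hours"}},
    "not_employed": {"filter": "no", "expected": {"job_search"}},
}

def path_score(responses, path):
    """Count the responses that contradict the given path."""
    mismatches = 0
    if responses.get("works") != path["filter"]:
        mismatches += 1
    for question in DEPENDENT:
        answered = responses.get(question) is not None
        should_be_answered = question in path["expected"]
        if answered != should_be_answered:
            mismatches += 1
    return mismatches

def most_likely_path(responses):
    """Return the name of the path with the fewest contradictions."""
    return min(PATHS, key=lambda name: path_score(responses, PATHS[name]))

# The filter question is missing, but the dependent answers point to "employed".
responses = {"works": None, "occupation": "teacher", "hours": 36, "job_search": None}
chosen = most_likely_path(responses)
```

With these responses the "employed" path has one contradiction (the missing filter answer) and the "not_employed" path has four, so "employed" is selected. Ties are resolved by the order of `PATHS`, which a real procedure would have to handle more carefully.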

  8. Main elements of the 2011 strategy • Validation of Person 1 in the household • Based on optimization techniques implemented in the DIESIS system • The minimum change algorithm assigns the role of Person 1 to the person that minimizes the number of changes needed for the record to be consistent • Identification of potential couples • Components of couples having non-unique relationship to Person 1 identified prior to editing • Score based on the responses provided to the demographic variables
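The idea of assigning the role of Person 1 to the member that requires the fewest changes can be sketched on a toy household. This is only an illustration under strong assumptions: the edit rules, the minimum age of 15 and the field names are hypothetical, declared relationships are checked as they stand against each candidate, and every violated rule is counted as one value to change. The DIESIS system solves the problem with optimization techniques instead.

```python
MIN_P1_AGE = 15  # hypothetical minimum age for Person 1 and for spouses

def compatible(relation, age, p1_age):
    """Hypothetical edits linking a member's relation and age to Person 1's age."""
    if relation == "child":
        return age < p1_age
    if relation == "parent":
        return age > p1_age
    if relation == "spouse":
        return age >= MIN_P1_AGE and p1_age >= MIN_P1_AGE
    return True

def changes_needed(household, i):
    """Number of values to change if member i is given the role of Person 1."""
    candidate = household[i]
    changes = 0
    if candidate["relation"] != "person1":
        changes += 1  # the candidate's own relation must be rewritten
    if candidate["age"] < MIN_P1_AGE:
        changes += 1  # the candidate's age violates the Person 1 edit
    for j, member in enumerate(household):
        if j == i:
            continue
        if member["relation"] == "person1" or not compatible(
            member["relation"], member["age"], candidate["age"]
        ):
            changes += 1  # this member's relation (or age) must change
    return changes

def validate_person1(household):
    """Index of the member minimizing the number of required changes."""
    return min(range(len(household)), key=lambda i: changes_needed(household, i))

# A 6-year-old was recorded as Person 1; the 38-year-old was recorded as "child".
household = [
    {"age": 6, "relation": "person1"},
    {"age": 38, "relation": "child"},
    {"age": 36, "relation": "spouse"},
]
best = validate_person1(household)
```

In this example keeping the 6-year-old as Person 1 would require three changes, while assigning the role to the 38-year-old requires two, so the second member is selected.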

  9. Main elements of the 2011 strategy • Special care in E&I of small but important groups in the population, e.g. validation of centenarians • 2001 procedure: • Automatic match of individuals enumerated in 2001 with the same individuals enumerated in 1991 • Automatic check for internal consistency of unlinked records • Manual check for consistency with questionnaire images of some ambiguous cases • New procedure supported by availability of local population registers

  10. Likely innovations for 2011 • Short-long form questionnaires • Short: (mainly) demographic variables • Long: demographic and socio-economic variables • Availability of registers • Local population registers (residing individuals) • Integrative registers from auxiliary sources • Residential address lists • Use of multi-mode data collection • Enumerators, CATI, mail, web

  11. Impact on E&I and validation • Socio-economic characteristics collected on a sample basis (by long-form) • Two procedures for computing the SAV of Year of birth (one for short-form, one for long-form) • The reduced pool of donors for imputation of long-form variables requires careful management of the data collection and donor pool selection phases • Sampling weights required for data validation after E&I of long-form variables

  12. Impact on E&I and validation • Availability of registers: • Improvement of the quantitative control of the forms • Imputation of missing or inconsistent census values by matching census data and register data (record linkage procedure) • Availability of unique record identifiers • Same time reference as census data • Good quality of register data • Imputation of missing or inconsistent census values by adding register data to census data, enlarging the donor pool
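When unique record identifiers are available, imputation by record linkage reduces to an exact match on the identifier. The sketch below is a minimal illustration under that assumption; the identifier `person_id`, the field names and the sample values are hypothetical, and it only fills missing values (detecting inconsistent values would need the edit rules as well).

```python
def impute_from_register(census, register, fields):
    """Fill missing census values with the value of the linked register record.

    census   : list of dicts, each with a "person_id" key
    register : dict mapping person_id to a register record (dict)
    fields   : names of the variables that may be imputed
    Returns new records; the input records are not modified.
    """
    imputed = []
    for record in census:
        new_record = dict(record)
        linked = register.get(record["person_id"])
        if linked is not None:
            for field in fields:
                if new_record.get(field) is None and linked.get(field) is not None:
                    new_record[field] = linked[field]
        imputed.append(new_record)
    return imputed

census = [
    {"person_id": "A1", "year_of_birth": None, "sex": "F"},
    {"person_id": "B2", "year_of_birth": 1950, "sex": None},
    {"person_id": "C3", "year_of_birth": None, "sex": "M"},  # not in the register
]
register = {
    "A1": {"year_of_birth": 1972, "sex": "F"},
    "B2": {"year_of_birth": 1950, "sex": "M"},
}
result = impute_from_register(census, register, ["year_of_birth", "sex"])
```

Records without a register match (such as `C3`) keep their missing values and would still need donor-based imputation.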

  13. Impact on E&I and validation • Use of multi-mode data collection • Improvement of the collected data quality due to editing performed at data capture (CATI, web) • A procedure for detecting duplicate questionnaires is required
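With several collection modes, the same household may return more than one questionnaire. A minimal sketch of a duplicate check follows, assuming each questionnaire carries a hypothetical `household_id` and the `mode` through which it was received; a real procedure would also have to decide which copy to keep.

```python
from collections import defaultdict

def find_duplicates(questionnaires):
    """Map each household_id received more than once to its list of modes."""
    modes_by_household = defaultdict(list)
    for questionnaire in questionnaires:
        modes_by_household[questionnaire["household_id"]].append(questionnaire["mode"])
    return {
        household_id: modes
        for household_id, modes in modes_by_household.items()
        if len(modes) > 1
    }

received = [
    {"household_id": "H1", "mode": "web"},
    {"household_id": "H2", "mode": "mail"},
    {"household_id": "H3", "mode": "CATI"},
    {"household_id": "H2", "mode": "web"},
]
duplicates = find_duplicates(received)
```

Here household `H2` is flagged because it answered both by mail and on the web.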

  14. Conclusions • E&I strategy for 2011 Census based on 2001 experiences • The new survey design aims to reduce the respondent burden but requires careful monitoring during production and a more complex E&I process • Highly efficient procedures need to be developed in order to meet the timeliness requirement • E&I is an achievable but hard task
