
Data cleaning workshop Berlin, 8-10 June 2009

The Analysis of Interviewers' Remarks. Laura Crespo, Spanish team, CEMFI. Data cleaning workshop, Berlin, 8-10 June 2009. This is based on my presentation for Wave 2 in Frankfurt, 6 December 2007, which drew on the remarks and feedback from PL, NL, BE-fr, DK, GR and ES from Wave 2.


Presentation Transcript


  1. The Analysis of Interviewers' Remarks. Laura Crespo, Spanish team, CEMFI. Data cleaning workshop, Berlin, 8-10 June 2009

  2. This is based on my presentation for Wave 2 in Frankfurt, 6 December 2007:
     • Based on the remarks and feedback from PL, NL, BE-fr, DK, GR and ES from Wave 2!
     • Comments and suggestions from other countries' experiences are very welcome!
     SHARE Data Cleaning Workshop

  3. Reminder: when should a “remark” be recorded?
     • When a response (or non-response) needs a comment.
     • When Blaise does not accept the answer provided by the respondent.
     • When a response is difficult to code.
     • When a response needs to be clarified.

  4. Therefore:
     Good news:
     • They may contain very useful info for data cleaning (and also for SHARE users, working groups, country teams and even the survey agency).
     • They are an important source of info for detecting errors, missing info, clarifications and problems with questions. One of the first things to look into.
     Bad news:
     • Very interviewer-specific (large heterogeneity across interviewers ("iwers"), questions, and even countries).
     • They need a case-by-case analysis. Very time-consuming.
     • At some point, they will need translation into English.

  5. Dealing with iwer remarks: Steps
     Step 1: They will be provided by MEA in an Excel file with a particular format for categorization.
     Step 2: Have a look at them and try to define specific categories based on their content and potential use. Very often we will need to check the corresponding question to fully understand the remark.
     Categories:
     1. Remarks that should be investigated for data cleaning.
     2. Useful remarks for researchers, working groups or country teams.
     3. Both (useful for data cleaning and SHARE users).
     4. Other remarks that should be investigated.
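
The categorization in Step 2 can be pictured as a small data structure. The following is only a minimal sketch: the four categories come from the slide, but the field names (`respondent_id`, `question`, `text`), the identifiers and the example remarks are invented for illustration and do not reflect the actual format of the MEA Excel file.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    """The four categories listed on the slide."""
    DATA_CLEANING = "investigate for data cleaning"
    USERS = "useful for researchers, working groups or country teams"
    BOTH = "useful for data cleaning and SHARE users"
    OTHER = "other remarks to investigate"


@dataclass
class Remark:
    respondent_id: str   # hypothetical identifier
    question: str        # module or item the remark is attached to
    text: str            # remark as typed by the interviewer
    category: Category   # assigned by hand after reading the remark


def group_by_category(remarks):
    """Return a dict mapping each category to the remarks assigned to it."""
    grouped = defaultdict(list)
    for remark in remarks:
        grouped[remark.category].append(remark)
    return dict(grouped)


def needs_cleaning(remarks):
    """Remarks relevant for data cleaning: DATA_CLEANING and BOTH."""
    wanted = {Category.DATA_CLEANING, Category.BOTH}
    return [r for r in remarks if r.category in wanted]


# Invented example remarks, one per category.
remarks = [
    Remark("ES-0001", "EP", "Amount reported gross, not net", Category.DATA_CLEANING),
    Remark("ES-0002", "GS", "Test interrupted, respondent felt unsafe", Category.USERS),
    Remark("ES-0003", "DN", "Interview answered by a proxy", Category.BOTH),
    Remark("ES-0004", "HO", "Meaning unclear", Category.OTHER),
]

for r in needs_cleaning(remarks):
    print(r.respondent_id, r.question, r.text)
```

Keeping the category as a separate field, rather than editing the remark text, leaves the original remark intact for later translation or review.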

  6. Dealing with iwer remarks: Steps
     Step 3: Focus on those that may be useful for data cleaning and identify which correction should be made.
     Step 4: Write programs to correct data or flag cases following instructions (example do-files from Wave 1 and Wave 2 provided by MEA?)
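
Step 4 distinguishes between correcting a value and flagging a case. The slides refer to do-files for this; the sketch below uses Python instead, purely as an illustration of the idea. The record layout, the variable names (`income`, `year_started`), the identifiers and the instruction format are all assumptions, not the format used by MEA.

```python
import copy


def apply_instructions(records, instructions):
    """Apply 'correct' and 'flag' instructions to a copy of the records.

    records: dict mapping respondent id -> dict of variable -> value
    instructions: list of dicts with keys 'id', 'variable', 'action'
                  and, for corrections, 'new_value'
    Returns (cleaned_records, flags); the input records are not modified.
    """
    cleaned = copy.deepcopy(records)
    flags = []
    for ins in instructions:
        rid, var = ins["id"], ins["variable"]
        if rid not in cleaned or var not in cleaned[rid]:
            # Do not create values for cases that are not in the data.
            flags.append((rid, var, "not found"))
        elif ins["action"] == "correct":
            cleaned[rid][var] = ins["new_value"]
        elif ins["action"] == "flag":
            flags.append((rid, var, ins.get("reason", "")))
        else:
            raise ValueError(f"unknown action: {ins['action']!r}")
    return cleaned, flags


# Invented example data and instructions.
records = {"ES-0001": {"income": 1200.0}, "ES-0002": {"year_started": 1890}}
instructions = [
    {"id": "ES-0001", "variable": "income", "action": "correct", "new_value": 1000.0},
    {"id": "ES-0002", "variable": "year_started", "action": "flag",
     "reason": "implausible year, see remark"},
]
cleaned, flags = apply_instructions(records, instructions)
print(cleaned)
print(flags)
```

Working on a copy keeps the raw data available for comparison, and the list of flags can be handed on for cases where no correction can be made.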

  7. Step 2) Categories with different colours/columns
     1. Remarks that should be investigated for data cleaning:
        • Specific amounts, frequencies, years, time periods (time consistency along the calendar or life cycle).
        • Currencies (maybe less problematic than in Wave 2).
        • Gross terms instead of net terms or vice versa.

  8. Remarks for data cleaning
     • Answer category: information that may be recorded or imputed to one of the categories already defined (instead of the “Other” option) or should be back-checked with the reported answer:
       • (RC) Sources of income maternity leave.
       • (RP) Reasons for not living with a partner.
       • (AC) Type of residence.
       • (RE) Situation at 15 if no education, occupation (ISCO), economic activity (NACE), why worked part-time, reasons left job, title of the job.
       • (GS) Reasons for non-completion, positions during the tests.
       • (HS) Type of illness, reasons for no checks.
       • (IV) Location and type of house.

  9. Remarks for data cleaning
     • Answer category: information that may be recorded or imputed to one of the categories already defined (instead of the “Other” option) or should be back-checked with the reported answer:
       • (EP) Employment status, pensions, eligibility for pensions, occupation (ISCO), economic activity (NACE).
       • (HO) Housing status.
       • (HC) Health care payments.
       • (GS, WS, PT) Positions during the tests.
       • (CH) Age of children, education.
       • (DN) Marital status.
       • (PH) Illness and disorders, medication, surgery…
       • (IV) Location and type of house.

  10. Remarks for data cleaning
      • Mistakes by iwers when coding the answers or the proxy status.
      • The system does not accept a particular answer (e.g., years, dates, amounts).
      • Corrected information that is included by the iwer when the respondent realizes that he/she made a mistake or reported wrong info previously (especially when inconsistencies are detected along the calendar).

  11. Remarks for data cleaning
      2. Remarks useful for researchers, working groups or country teams:
         • Grip strength test not performed or interrupted due to illness, disabilities, fears, concerns, not safe.
         • Problems encountered during the physical test (due to distraction, lack of concentration or interest, nerves, specific physical impairments or conditions, ...). Presence of another person during the test.
         • Does not remember / does not know.
         • Does not know how to read or write.

  12. Remarks for data cleaning
      • Difficulties with Spanish (language problems).
      • Iwers' opinions about the reliability of the answers: contradictions, attitudes, random answers, reluctance…
      • Further clarifications or explanations of reported answers, e.g., help (or influence) provided by another person (spouse, children, others, …).
      • Problems or circumstances with the drop-offs (help provided by the iwer, by a relative, …).

  13. Remarks for data cleaning
      • More specific motives for non-response (private and sensitive information, does not understand the question): e.g., stillborn children, no available equipment to perform tests.
      • Complaints relating to the length of the questionnaire.

  14. Remarks for data cleaning
      3. Both (useful for data cleaning and SHARE users):
         • Use of proxies (need to be back-checked with SMS data and also useful for researchers).
      4. Other remarks that should be investigated:
         • Unclear meaning.
         • Phone numbers and addresses (may be important for contacts in next waves).
      Some examples.
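
The back-check of proxy remarks against SMS data can be sketched as a simple comparison. This is an illustration under assumptions only: the slides do not describe how proxy status is stored in the SMS data, so the identifiers and the `sms_proxy_flags` mapping below are hypothetical.

```python
def proxy_mismatches(remark_proxy_ids, sms_proxy_flags):
    """Return, sorted, the ids whose remark mentions a proxy although the
    SMS record has no proxy flag (or no record at all)."""
    return sorted(
        rid for rid in remark_proxy_ids
        if not sms_proxy_flags.get(rid, False)
    )


# Invented example: ids whose remarks mention a proxy, and the
# (hypothetical) proxy indicator taken from the SMS data.
remark_proxy_ids = {"ES-0003", "ES-0007"}
sms_proxy_flags = {"ES-0003": True, "ES-0007": False, "ES-0009": True}

to_check = proxy_mismatches(remark_proxy_ids, sms_proxy_flags)
print(to_check)
```

Cases returned by the function are candidates for a manual look, not automatic corrections, since either the remark or the recorded proxy status could be the one in error.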

  15. Remarks for data cleaning
      Step 3: Focus on remarks for data cleaning and identify the correction needed.
      Step 4: Corrections (do-files):
      • Instructions on this?
      • Even if a correction or imputation cannot be made, the remark could still be useful for SHARE users, working groups/country teams and CentERdata (revision of questionnaire for Wave 4).
      • Production of a specific file with translation for this purpose?
      • Translation of all remarks: probably not worthwhile!

  16. Remarks for data cleaning
      Thanks for your attention!
      Julie's instructions
      Open discussion…
