
Farm Household Surveys DATABASE ORGANISATION AND DATA CLEANING

Farm Household Surveys DATABASE ORGANISATION AND DATA CLEANING. Glwadys Aymone GBETIBOUO C4ECOSOLUTIONS, CAPE TOWN Economics analyses of climate change impacts workshop Accra, Ghana.


Presentation Transcript


  1. Farm Household Surveys DATABASE ORGANISATION AND DATA CLEANING Glwadys Aymone GBETIBOUO C4ECOSOLUTIONS, CAPE TOWN Economics analyses of climate change impacts workshop Accra, Ghana

  2. Database organisation and cleaning, or data management, is generally seen as a set of tasks belonging to the tabulation phase of the survey: activities carried out on computers, in clean offices, towards the end of the survey project.
     • Survey data management should instead begin concurrently with questionnaire design.
     Key points to consider:
     • Nature and identification of the statistical units observed
     • Built-in redundancies
     • Length and complexity of the questionnaire
     • Sample size and design
     • Survey timing and scheduling

  3. DATA ENTRY: “flat file”

  4. Codification of the statistical unit

  5. Household code: 8-digit code
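A minimal sketch, in Python, of how an 8-digit household code could be screened at data entry. The function name `check_household_codes` is hypothetical, and the transcript does not say how the eight digits are partitioned (region, village, household and so on), so only three things are checked here: length, digit-only content, and uniqueness.

```python
from collections import Counter

def check_household_codes(codes):
    """Return (malformed, duplicated) household codes.

    Assumption: a valid code is exactly 8 ASCII digits and
    identifies one household only.
    """
    as_text = [str(c).strip() for c in codes]
    malformed = [
        c for c in as_text
        if not (len(c) == 8 and c.isascii() and c.isdigit())
    ]
    counts = Counter(as_text)
    duplicated = sorted(c for c, n in counts.items() if n > 1)
    return malformed, duplicated

# 70013308 and 70013319 appear in this deck; the faulty codes are made up.
malformed, duplicated = check_household_codes(
    [70013308, 70013319, 7001331, "70O30507", 70013308]
)
print(malformed)   # codes that are too short or contain a non-digit
print(duplicated)  # codes entered more than once
```

Running such a check on the identifier column before any other cleaning step makes later merges across worksheets safer, since every other check reports problems by household code.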

  6. DATA ENTRY SYSTEM
     • A complex household survey typically contains hundreds of variables. For example, the household survey dataset from the 2003 GEF study contains 1342 variables.
     • After the survey instrument has been finalized, develop the data entry system and provide a protocol for data entry.
     • Coding questionnaire
     • Coding sheet
     • Household data: 12 worksheets
     • Climate data, soil data, runoff data

  7. DATA ENTRY

  8. Data cleaning
     • Data are generally subjected to three control mechanisms:
       • range checks
       • consistency checks
       • typographical checks

  9. Range checks
     Every variable in the survey should contain only values within a limited domain of valid values.

         tab farmtype, missing

            farmtype |      Freq.     Percent        Cum.
         ------------+-----------------------------------
                 -99 |          4        0.99        0.99
                   1 |        191       47.16       48.15
                   2 |         71       17.53       65.68
                   3 |        138       34.07       99.75
                   9 |          1        0.25      100.00
         ------------+-----------------------------------
               Total |        405      100.00

     The single observation coded 9 falls outside the domain and is flagged:

                hhcode   farmtype   remark
          39. 70013308          9   CHECK DATA FOR THIS OBS.
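The same range check can be sketched outside Stata. The Python below is illustrative only: the valid domain {1, 2, 3} and the missing code -99 are read off the frequency table above rather than stated by the presenter, and `range_check` is a hypothetical helper name.

```python
VALID_FARMTYPE = {1, 2, 3}   # assumed domain, inferred from the table
MISSING_CODE = -99           # assumed missing-value code

def range_check(rows, variable, valid, missing=MISSING_CODE):
    """Return (hhcode, value) pairs whose value lies outside the valid domain.

    Values equal to the missing code are skipped, not flagged.
    """
    return [
        (row["hhcode"], row[variable])
        for row in rows
        if row[variable] != missing and row[variable] not in valid
    ]

rows = [
    {"hhcode": 70013308, "farmtype": 9},    # the observation flagged on the slide
    {"hhcode": 70013319, "farmtype": 1},    # farmtype value is illustrative
    {"hhcode": 70030507, "farmtype": -99},  # farmtype value is illustrative
]
flagged = range_check(rows, "farmtype", VALID_FARMTYPE)
print(flagged)
```

Keeping the valid domain in one named constant per variable means the check can be rerun unchanged after each round of corrections.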

  10. Consistency check
      Values from one question should be consistent with values from another question.
      • Demographic consistency of the household
      • Consistency of age and other individual characteristics

          gen test=hhmales+hhfemales
          list hhcode hhsize hhmales hhfemales test remark if test!=hhsize

            hhcode   hhsize   hhmales   hhfemales   test   remark
          70013319       18         3           3      6   CHECK DATA FOR THIS OBS.
          70030507       14         4           4      8   CHECK DATA FOR THIS OBS.

          tab age5

            hhcode   age5   remark
          70041703    281   CHECK DATA FOR THIS OBS.
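A minimal Python sketch of the two consistency checks on this slide, using the flagged observations as test data. The helper names and the 120-year age ceiling are assumptions made for illustration, and the record for household 70013308 is invented to show a case that passes.

```python
def size_mismatches(households):
    """Household codes where males + females differs from household size."""
    return [
        h["hhcode"]
        for h in households
        if h["hhmales"] + h["hhfemales"] != h["hhsize"]
    ]

def implausible_ages(records, variable, max_age=120):
    """(hhcode, age) pairs with a negative age or one above max_age (assumed ceiling)."""
    return [
        (r["hhcode"], r[variable])
        for r in records
        if not 0 <= r[variable] <= max_age
    ]

households = [
    {"hhcode": 70013319, "hhsize": 18, "hhmales": 3, "hhfemales": 3},  # from the slide
    {"hhcode": 70030507, "hhsize": 14, "hhmales": 4, "hhfemales": 4},  # from the slide
    {"hhcode": 70013308, "hhsize": 5, "hhmales": 2, "hhfemales": 3},   # invented, consistent
]
ages = [
    {"hhcode": 70041703, "age5": 281},  # from the slide
    {"hhcode": 70013308, "age5": 28},   # invented, plausible
]

mismatched = size_mismatches(households)
bad_ages = implausible_ages(ages, "age5")
print(mismatched)
print(bad_ages)
```

A flagged record only says that two answers disagree; which of them is wrong still has to be settled by going back to the paper questionnaire.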

  11. Typographical checks
      • A typical typographical error is the transposition of digits, such as entering 41 rather than 14. This kind of error can be caught through double data entry of all questionnaires.
      • Another is entering -999 rather than -99 in a numerical input:

          foreach var of varlist _all {
              replace `var'=-99 if `var'==-999
              replace `var'=. if `var'==-99
          }

      Use the tab command to obtain frequency tables of the data.
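Double data entry amounts to comparing two independently keyed copies of the same questionnaires and listing every field where they disagree. The Python below is a sketch under assumed names (`compare_entries`, `recode_missing`) and invented values; it also mirrors the recoding of -999 and -99 to missing, using `None` in place of Stata's `.`.

```python
def compare_entries(first, second):
    """Return (hhcode, variable, value_1, value_2) for every disagreement.

    Each argument maps a household code to a dict of variable values.
    Only households and variables present in both copies are compared.
    """
    diffs = []
    for hhcode in sorted(first.keys() & second.keys()):
        a, b = first[hhcode], second[hhcode]
        for var in sorted(a.keys() & b.keys()):
            if a[var] != b[var]:
                diffs.append((hhcode, var, a[var], b[var]))
    return diffs

def recode_missing(record, typo=-999, missing=-99):
    """Replace the mistyped -999 and the missing code -99 with None."""
    return {k: (None if v in (typo, missing) else v) for k, v in record.items()}

# Invented values: the second operator transposed 14 into 41.
entry_1 = {70030507: {"hhsize": 14, "hhmales": 4}}
entry_2 = {70030507: {"hhsize": 41, "hhmales": 4}}

diffs = compare_entries(entry_1, entry_2)
cleaned = recode_missing({"hhsize": -999, "hhmales": 4, "hhfemales": -99})
print(diffs)
print(cleaned)
```

Comparing the two copies catches transpositions that a range check cannot, because both 14 and 41 may be valid values for the variable.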
