
Research Methodology



Presentation Transcript


  1. Research Methodology Lecture No : 21 Data Preparation and Data Entry

  2. Recap Lecture In the last few lectures we discussed: • Research Design • The purpose, investigation type, researcher interference, study setting, unit of analysis, time horizon, Measurement of variables • Sources of Data • Sampling • Experimental Design

  3. Lecture Objectives Getting the data ready for analysis • Data preparation • Coding, codebook, pre-coding, coding rules • Data entry • Editing data • Data transformation

  4. Data Preparation and Description • Data preparation includes editing, coding, and data entry. • It is the activity that ensures the accuracy of the data and their conversion from raw form to reduced and classified forms that are more appropriate for analysis. • Preparing a descriptive statistical summary is another preliminary step that allows data entry errors to be identified and corrected.
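
As an illustration of how a descriptive summary can expose data entry errors, the following minimal sketch summarizes one coded variable. The variable (a satisfaction item with valid codes 1 to 5) and its values are hypothetical and not taken from the lecture.

```python
from collections import Counter

# Hypothetical coded responses for a 5-point satisfaction item (valid codes 1-5).
satisfaction = [4, 5, 3, 44, 2, 5, 1, 4]

def summarize(values):
    """Return simple descriptive statistics for a list of numeric codes."""
    return {
        "n": len(values),
        "min": min(values),
        "max": max(values),
        "frequencies": dict(Counter(values)),
    }

summary = summarize(satisfaction)
# A maximum above 5 signals a probable keying error (here, 44 typed instead of 4).
print(summary["min"], summary["max"])
```

A minimum, maximum, or frequency count that falls outside the legal codes points directly at the records that need to be checked against the original questionnaires.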

  5. Getting the Data Ready for Analysis • After data are obtained through a questionnaire, they need to be coded, keyed in, and edited. • Outliers, inconsistencies, and blank responses, if any, have to be handled in some way.

  6. Coding • Data coding involves assigning a number to the participants' responses so that they can be entered into a database. • In coding, categories are the partitions of a data set of a given variable. For instance, if the variable is gender, the categories are male and female. • Categorization is the process of using rules to partition a body of data. • Both closed and open questions must be coded.

  7. Coding Cont. • Numeric coding simplifies the researcher’s task in converting a nominal variable like gender to a 1 or 2.
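
A minimal sketch of numeric coding for a nominal variable might look like this. The mapping name `GENDER_CODES` and the choice of which category receives 1 or 2 are assumptions made for illustration.

```python
# Hypothetical code assignment; which category gets 1 or 2 is arbitrary.
GENDER_CODES = {"male": 1, "female": 2}

def code_gender(response):
    """Map a text response to its numeric code, ignoring case and spacing."""
    return GENDER_CODES[response.strip().lower()]

coded = [code_gender(r) for r in ["Male", "female", " FEMALE "]]
print(coded)
```

Because the variable is nominal, the numbers are only labels: they carry no order or magnitude.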

  8. Code Construction There are two basic rules for code construction. • First, the coding categories should be exhaustive, meaning that a coding category should exist for all possible responses. • For example, household size might be coded 1, 2, 3, 4, and 5 or more. • The “5 or more” category assures all subjects of a place in a category.

  9. Code Construction Cont. • Second, the coding categories should be mutually exclusive and independent. • This means that there should be no overlap among the categories to ensure that a subject or response can be placed in only one category.
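
The two rules can be sketched with the household size example. The function name is hypothetical; the point is that every possible size receives exactly one code.

```python
def code_household_size(size):
    """Code household size as 1, 2, 3, 4, or 5, where 5 means '5 or more'."""
    if size < 1:
        raise ValueError("household size must be at least 1")
    # min() sends every size of 5 or above to the single top category,
    # so the categories are exhaustive and do not overlap.
    return min(size, 5)

codes = [code_household_size(n) for n in range(1, 13)]
print(codes)
```

Since the function returns a single value for each input, a response can never fall into two categories, and the open-ended top category guarantees that no response is left without one.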

  10. Code Construction Cont. • Missing data should also be represented with a code. • In the “good old days” of computer cards, a numeric value such as 9 or 99 was used to represent missing data. • Today, most software will understand that either a period or a blank response represents missing data.
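
As a rough sketch, missing data markers can be converted to an explicit missing value when raw entries are read. The function name and the idea of passing the markers as a parameter are assumptions; real statistical packages handle this through their own settings.

```python
def parse_code(raw, missing_markers=("", ".")):
    """Convert a raw keyed-in entry to an integer code, or None if missing."""
    text = raw.strip()
    if text in missing_markers:
        return None
    return int(text)

# Modern convention: a period or a blank means missing.
parsed = [parse_code(v) for v in ["2", ".", "", "4"]]

# Legacy convention: a reserved numeric value such as 99 means missing.
legacy = [parse_code(v, missing_markers=("99",)) for v in ["3", "99"]]
print(parsed, legacy)
```

With the legacy convention, the reserved value must lie outside the range of legitimate codes; otherwise real answers would be treated as missing.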

  11. Codebook • A codebook contains each variable in the study and specifies the application of coding rules to the variable. • It is used by the researcher or research staff to promote more accurate and more efficient data entry. • It is the definitive source for locating the positions of variables in the data file during analysis.
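
One way to picture a codebook is as a lookup table that records, for each variable, its position in the data file, its label, and its legal codes. The sketch below is illustrative only; the variable names, column positions, and codes are hypothetical and do not reproduce the sample codebook from the lecture.

```python
# Hypothetical codebook: one entry per variable in the study.
CODEBOOK = {
    "gender": {
        "column": 1,
        "label": "Gender of respondent",
        "codes": {1: "Male", 2: "Female"},
    },
    "hh_size": {
        "column": 2,
        "label": "Household size",
        "codes": {1: "1", 2: "2", 3: "3", 4: "4", 5: "5 or more"},
    },
}

def invalid_fields(record):
    """List variables whose value is not a legal code (None counts as missing)."""
    return [
        name
        for name, spec in CODEBOOK.items()
        if record.get(name) is not None and record[name] not in spec["codes"]
    ]

print(invalid_fields({"gender": 1, "hh_size": 7}))
```

Keeping the legal codes in one place lets the same codebook guide data entry and check the entered data afterwards.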

  12. Sample Codebook

  13. Pre-coding • Pre-coding means assigning codebook codes to variables in a study and recording them on the questionnaire. • Alternatively, the questionnaire can be designed so that the appropriate code appears next to each response option the respondent can choose. • With a pre-coded instrument, the codes for variable categories are accessible directly from the questionnaire.

  14. Sample Pre-coded Instrument

  15. Coding Open-Ended Questions • One of the primary reasons for using open-ended questions is that insufficient information or lack of a hypothesis may prohibit preparing response categories in advance. Researchers are forced to categorize responses after the data are collected.

  16. Coding Open-Ended Questions Cont. • In the Figure on the next slide, question 6 illustrates the use of an open-ended question. After preliminary evaluation, response categories were created for that item. They can be seen in the codebook.

  17. Coding Open-Ended Questions Cont.

  18. Coding Rules Categories should be: • Exhaustive • Appropriate to the research problem • Mutually exclusive • Derived from one classification principle

  19. Data Entry • After responses have been coded, they can be entered into a database. • Raw data can be entered through any software program. • For example: SPSS Data Editor.

  20. Data Entry Cont. • Keyboarding • Database programs • Digital/barcodes • Optical recognition • Voice recognition

  21. Editing Data • After the data are entered, blank responses, if any, have to be handled in some way, and inconsistent data have to be checked and followed up. • Data editing deals with detecting and correcting illogical, inconsistent, or illegal data and omissions in the information returned by the participants of the study.
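
The kinds of problems that editing looks for can be sketched as simple checks on a record. The fields (`age`, `years_employed`) and the legal age range of 18 to 99 are hypothetical assumptions chosen only to show one omission check, one illegal-value check, and one consistency check.

```python
def edit_checks(record):
    """Return a list of problems found in one respondent's record."""
    age = record.get("age")
    years = record.get("years_employed")
    if age is None or years is None:
        return ["omission"]  # a required answer was left blank
    problems = []
    if not 18 <= age <= 99:
        problems.append("illegal age")  # outside the assumed legal range
    if years > age:
        problems.append("inconsistent: years_employed exceeds age")
    return problems

print(edit_checks({"age": 25, "years_employed": 40}))
```

Checks like these only flag a record; deciding how to correct it still requires going back to the questionnaire or the respondent.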

  22. Editing Data Cont. Criteria: • Accurate • Consistent • Uniformly entered • Complete • Arranged for simplification

  23. Field Editing • Field Editing Review • Entry Gaps → Callback • Validates → Re-interviewing

  24. Field Editing Review • In large projects, field editing review is a responsibility of the field supervisor. • It should be done soon after the data have been collected. • During the stress of data collection, data collectors often use ad hoc abbreviations and special symbols.

  25. Field Editing Review Cont. • If the forms are not completed soon, the field interviewer may not recall what the respondent said. • Therefore, reporting forms should be reviewed regularly.

  26. Field Editing Cont. • Entry Gaps → Callback • When entry gaps are present, a callback should be made rather than guessing what the respondent probably said.

  27. Field Editing Cont. • Validates → Re-interviewing • The field supervisor also validates field results by re-interviewing some percentage of the respondents on some questions to verify that they have participated. • Ten percent is the typical amount used in data validation.
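
Drawing the respondents to re-interview could be sketched as a simple random sample. The function name, the respondent IDs, and the use of simple random sampling are assumptions; the lecture states only the typical ten percent figure.

```python
import random

def reinterview_sample(respondent_ids, fraction=0.10, seed=None):
    """Pick a random subset of respondents for validation re-interviews."""
    ids = list(respondent_ids)
    if not ids:
        return []
    rng = random.Random(seed)  # a fixed seed makes the selection reproducible
    k = min(len(ids), max(1, round(len(ids) * fraction)))
    return rng.sample(ids, k)

# 200 hypothetical respondents numbered 1 to 200; 10% gives 20 re-interviews.
selected = reinterview_sample(range(1, 201), seed=21)
print(len(selected))
```

Selecting at random, rather than letting interviewers choose, avoids validating only the interviews that are least likely to be problematic.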

  28. Central Editing • Scale of Study → Number of Editors • At this point, the data should get a thorough editing. • For a small study, a single editor will produce maximum consistency. • For large studies, editing tasks should be allocated by sections.

  29. Central Editing Cont. • Wrong Entry → Replacements • Sometimes it is obvious that an entry is incorrect and the editor may be able to detect the proper answer by reviewing other information in the data set. • This should only be done when the correct answer is obvious. • If an answer given is inappropriate, the editor can replace it with "no answer" or "unknown".

  30. Central Editing Cont. • Fakery → Open-ended Questions • The editor can also detect instances of armchair interviewing (fake interviews) during this phase. • This is easiest to spot with open-ended questions.

  31. Central Editing Cont. Guidelines for Editors • Be familiar with instructions given to interviewers and coders • Do not destroy the original entry • Make all editing entries identifiable and in standardized form • Initial all answers changed or supplied • Place initials and date of editing on each instrument completed

  32. Handling “Don’t Know” Responses • When the number of “don’t know” (DK) responses is low, it is not a problem. However, if there are many, it may mean that the question was poorly designed, too sensitive, or too challenging for the respondent. • The best way to deal with undesired DK answers is to design better questions at the beginning. • If a DK response is legitimate, it should be kept as a separate reply category.

  33. Data Transformation • Data transformation, a variation of data coding, is the process of changing the original numerical representation of a quantitative value to another value. • E.g., the data are given as consumption per year, but we need consumption per month. • Data are typically changed to avoid problems in the next stage of the data analysis process.
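
The consumption example can be written as a one-line transformation. The figures are hypothetical, and the sketch assumes consumption is spread evenly over the twelve months.

```python
def annual_to_monthly(annual_consumption):
    """Convert a per-year figure to an average per-month figure."""
    return annual_consumption / 12

annual = [1200, 600, 90]  # hypothetical consumption per year
monthly = [annual_to_monthly(x) for x in annual]
print(monthly)
```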

  34. Data Transformation Cont. • For example, economists often use a logarithmic transformation so that the data are more evenly distributed. • Data transformation is also necessary when several questions have been used to measure a single concept. • E.g., intention to leave is measured through 10 questions, whose answers need to be combined into a single value for each respondent.
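
Both transformations can be sketched briefly. The income figures and item responses are hypothetical, and averaging the ten items is only one common way of forming a single score; the lecture does not say which method is used.

```python
import math
from statistics import mean

# Logarithmic transformation: values spread over orders of magnitude
# become evenly spaced on the log scale.
incomes = [1000, 10000, 100000]  # hypothetical skewed values
log_incomes = [math.log10(x) for x in incomes]

def composite_score(item_responses, n_items=10):
    """Combine the answers to several items into one score by averaging."""
    if len(item_responses) != n_items:
        raise ValueError(f"expected {n_items} item responses")
    return mean(item_responses)

# One respondent's answers to 10 intention-to-leave items on a 1-5 scale.
answers = [4, 5, 3, 4, 4, 5, 2, 3, 4, 4]
score = composite_score(answers)
print(log_incomes, score)
```

If some items are worded in the opposite direction, they would need to be reverse-scored before averaging; that step is omitted here.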

  35. Recap • Questionnaire checking involves eliminating unacceptable questionnaires. • A questionnaire may be unacceptable because it is incomplete, instructions were not followed, pages are missing, it arrived past the cutoff date, or the respondent was not qualified. • Editing looks to correct illegible, incomplete, inconsistent, and ambiguous answers. • Coding typically assigns alpha or numeric codes to answers that do not already have them so that statistical techniques can be applied.

  36. Recap Cont. • Cleaning reviews data for consistency. Inconsistencies may arise from faulty logic or from out-of-range or extreme values. • Statistical adjustment applies to data that require weighting and scale transformations.
