
Planning how to create the variables you need from the variables you have



Presentation Transcript


  1. Planning how to create the variables you need from the variables you have. Jane E. Miller, PhD. The Chicago Guide to Writing about Numbers, 2nd edition.

  2. Overview • Why researchers sometimes need to create new variables to conduct their analysis • Why it is important to plan ahead for how to create those new variables • What information is required to identify the new variables needed for the research question • How to write clear instructions on how to get from the variables you have to the variables you need

  3. Why create new variables? • For many statistical analyses, variables available in the original data set are not yet in the form needed to address the research question of interest. • Examples: • You want to study total family income, but the data set has separate variables measuring income components such as earned income, government benefits, and alimony. • You want to compare outcomes for age groups (children, working-age adults, and the elderly), but the data set reports respondent’s age in single years.

  4. Conceptualizing the new variable should precede programming it • It is important to separate two tasks: • Researching and planning how those variables should be defined • Programming the new variable in an electronic database • Each of those tasks has its own challenging aspects and uses different skills and resources

  5. Some common patterns of creating new from existing variables • A categorical version of a continuous variable • A simplified (collapsed) categorical variable • A binary indicator from a continuous variable • A new continuous variable that combines 2+ continuous variables • A mathematical transformation of a continuous variable

  6. A categorical version of a continuous variable • Original variable • Age in years (continuous) • Needed variable • Age group (categorical)
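
A minimal Python sketch of this pattern. The slides name three age groups (children, working-age adults, the elderly) but give no boundaries, so the cutoffs of 18 and 65 and the function name `age_group` are assumptions made only for illustration.

```python
from typing import Optional

# Hypothetical cutoffs; the slides do not specify where each group begins.
ADULT_MIN_AGE = 18
ELDERLY_MIN_AGE = 65


def age_group(age_years: Optional[float]) -> Optional[str]:
    """Classify age in single years into one of three age groups."""
    if age_years is None:
        return None  # a missing age stays missing
    if age_years < 0:
        raise ValueError("age cannot be negative")
    if age_years < ADULT_MIN_AGE:
        return "child"
    if age_years < ELDERLY_MIN_AGE:
        return "working-age adult"
    return "elderly"


ages = [3, 17, 18, 64, 65, None]
groups = [age_group(a) for a in ages]
```

Writing the cutoffs as named constants keeps the planning decision (where each group starts) visible and separate from the classification logic.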

  7. A simplified (collapsed) categorical variable • Original variable • Ten-category ethnicity variable • Needed variable • Three-category ethnicity variable
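
One way to write the collapsing step is as an explicit lookup table. The slides do not list the ten original categories or say how they group, so the numeric codes 1 to 10, the three collapsed codes, and the names `COLLAPSE_MAP` and `collapse_category` below are placeholders, not the author's coding scheme.

```python
from typing import Optional

# Hypothetical mapping: original codes 1-10 collapsed into codes 1-3.
COLLAPSE_MAP = {
    1: 1, 2: 1, 3: 1, 4: 1,
    5: 2, 6: 2, 7: 2,
    8: 3, 9: 3, 10: 3,
}

# Confirm that every original code maps to exactly one collapsed code.
unmapped = set(range(1, 11)) - set(COLLAPSE_MAP)


def collapse_category(original_code: Optional[int]) -> Optional[int]:
    """Return the collapsed code, or None when the original is missing."""
    if original_code is None:
        return None
    return COLLAPSE_MAP[original_code]  # KeyError flags an unplanned code
```

A lookup table makes it easy to check that no original category was left out, and an unexpected code raises an error rather than being silently misclassified.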

  8. A binary indicator from a continuous variable • Original variable • Birth weight in grams (continuous) • Needed variable • Indicator of low birth weight status (yes or no)
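
A short sketch of the indicator, assuming the conventional definition of low birth weight as under 2,500 grams. That threshold is not stated on the slide, and the function name `low_birth_weight` is illustrative.

```python
from typing import Optional

# Conventional cutoff (under 2,500 g); an assumption, not given on the slide.
LBW_THRESHOLD_GRAMS = 2500


def low_birth_weight(birth_weight_grams: Optional[float]) -> Optional[int]:
    """Return 1 (yes) or 0 (no); None when birth weight is missing."""
    if birth_weight_grams is None:
        return None
    return 1 if birth_weight_grams < LBW_THRESHOLD_GRAMS else 0
```

Deciding in advance whether a value exactly at the threshold counts as "yes" or "no" is part of the planning step; here 2,500 grams is coded as 0.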

  9. A new continuous variable that aggregates 2+ continuous variables
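
The total family income example from slide 3 fits this pattern. The sketch below sums the three income components named there; the parameter names and the assumption that all components are in the same currency units are mine.

```python
from typing import Optional


def total_family_income(
    earned_income: Optional[float],
    government_benefits: Optional[float],
    alimony: Optional[float],
) -> Optional[float]:
    """Sum the income components; missing if any component is missing."""
    components = (earned_income, government_benefits, alimony)
    if any(c is None for c in components):
        return None  # do not treat a missing component as zero
    return sum(components)
```

Treating a missing component as zero would understate income for that family, which is why the sketch returns a missing total instead.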

  10. A new continuous variable calculated from 2+ continuous variables

  11. A mathematical transformation of a continuous variable
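
The slide does not say which transformation it illustrates. As one common possibility, the sketch below takes the natural logarithm of an income variable; the choice of transformation, the variable, and the name `log_income` are all assumptions.

```python
import math
from typing import Optional


def log_income(income: Optional[float]) -> Optional[float]:
    """Natural log of income; missing when the log cannot be computed."""
    if income is None or income <= 0:
        return None  # the log is undefined for zero or negative values
    return math.log(income)
```

How to handle values for which the transformation is undefined (here, zero or negative income) is a planning decision that should be written down before programming.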

  12. Planning steps for creating new variables • Finding relevant variables in the original data set • Becoming acquainted with the units and categories for available variables • Consulting the published literature on the topic to see how those concepts have been measured or classified by other researchers • Identifying pertinent formulas and thresholds • Writing out the logic or math needed to create the new variables from existing variables

  13. Steps toward creating a new variable • Identify the name(s) of the original variable(s) in the data set that contain the data needed to create the new variable. • For the new variable, devise • A name (acronym) that conveys the content (meaning) of the new variable and, if pertinent, the dates or survey rounds when the data were collected • A label (short descriptive phrase) for the new variable that mentions units, if pertinent

  14. For new continuous variables • Write the formula to calculate the value of the new variable from the original variables. • Specify the units of the original variable(s) and the new variable.

  15. Example: Calculating course grades from component test scores • For a hypothetical college course, the overall course grade is based on three exam scores • Two mid-term exams (EXAM1 and EXAM2), each scored from 0 to 25 points • A final exam (FINAL), scored from 0 to 50 points • For each student, the instructor wants to calculate • The percentage of questions s/he got correct on exam 1 • Total numeric course grade • Course letter grade, based on standard grade cutoffs

  16. Calculating percentage of exam questions correct from number of questions correct • Logic: From the information in the data set, how does one calculate the percentage of questions correct? • Concepts: Percentage of questions correct is number of questions correct divided by the total number of questions on the exam, multiplied by 100. • Formula: Replace concepts with names of variables: PCCOREX1 = (EXAM1/25) * 100 • STEP 1: Identify the existing variable (EXAM1), already in the data set, from which the new variable will be calculated. • STEP 2: Devise a name for the new variable (PCCOREX1), not yet in the data set. • STEP 3: Write the mathematical formula.
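
The formula PCCOREX1 = (EXAM1/25) * 100 can be sketched in Python as follows. The variable names come from the slide; the function wrapper, the range check, and the handling of missing scores are illustrative additions.

```python
from typing import Optional

EXAM1_MAX_POINTS = 25  # from the example: each mid-term is scored 0 to 25


def pccorex1(exam1: Optional[float]) -> Optional[float]:
    """Percentage correct on exam 1: (EXAM1 / 25) * 100."""
    if exam1 is None:
        return None  # a missing score yields a missing percentage
    if not 0 <= exam1 <= EXAM1_MAX_POINTS:
        raise ValueError("EXAM1 must be between 0 and 25 points")
    return (exam1 / EXAM1_MAX_POINTS) * 100
```

The units change from points (0 to 25) for EXAM1 to percent (0 to 100) for PCCOREX1, which is worth recording in the new variable's label.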

  17. Creating a variable for total numeric course grade from exam scores • Logic: From the information in the data set, how does one calculate total numeric course grade? • Concepts: Overall numeric course grade is the sum of the three exam scores. • Formula: Replace concepts with names of variables: TOTGRADE = EXAM1 + EXAM2 + FINAL • STEP 1: Identify the existing variables (EXAM1, EXAM2, FINAL), already in the data set, from which the new variable will be calculated. • STEP 2: Devise a name for the new variable (TOTGRADE), not yet in the data set. • STEP 3: Write the mathematical formula.
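
A sketch of TOTGRADE = EXAM1 + EXAM2 + FINAL. The variable names and point ranges are from the example; the function form and the treatment of missing scores are assumptions for illustration.

```python
from typing import Optional


def totgrade(
    exam1: Optional[float],
    exam2: Optional[float],
    final: Optional[float],
) -> Optional[float]:
    """Total numeric course grade: EXAM1 + EXAM2 + FINAL (0 to 100 points)."""
    scores = (exam1, exam2, final)
    if any(s is None for s in scores):
        return None  # any missing exam score makes the total missing
    return exam1 + exam2 + final
```

Because the two mid-terms are worth 25 points each and the final 50, the sum already lies on a 0 to 100 scale and needs no rescaling.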

  18. For new categorical variables • Write the logical steps to classify the values of the original variable into the values of the new variable. • Show how every possible value of the original variable maps into a value of the new variable. • List the • Value label (descriptive phrase) for each value (category) of the new variable; • Code (numeric value) that the new variable will take on for each value or set of values of the original variable.

  19. Classifying numeric course grades into letter grade ranges • STEP 1: Identify the existing variables from which the new variable will be created. • STEP 2: Devise a name for the new variable, not yet in the data set. • STEP 3: Write the logic for classifying the numeric scores into letter grade ranges, based on the university’s standard grade cutoffs. E.g., scores below 60 are classified as an “F.”
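
The classification logic might be written out as below. Only the rule that scores below 60 are an F comes from the slide; the cutoffs of 90, 80, and 70, the numeric codes 0 to 4, and the names `GRADE_BANDS` and `letter_grade` are assumptions standing in for the university's actual standards.

```python
from typing import Optional, Tuple

# (lower bound, numeric code, value label), ordered from highest to lowest.
# Only the F range (below 60) is from the slide; the rest are hypothetical.
GRADE_BANDS = [
    (90, 4, "A"),
    (80, 3, "B"),
    (70, 2, "C"),
    (60, 1, "D"),
    (0, 0, "F"),
]


def letter_grade(total: Optional[float]) -> Optional[Tuple[int, str]]:
    """Map a 0-100 total grade to (code, label); None when total is missing."""
    if total is None:
        return None
    if not 0 <= total <= 100:
        raise ValueError("total grade must be between 0 and 100")
    for lower_bound, code, label in GRADE_BANDS:
        if total >= lower_bound:
            return code, label
    raise AssertionError("unreachable: every valid total matches a band")
```

Listing the bands in one table documents both the numeric code and the value label for each category, and makes it straightforward to verify that every possible score from 0 to 100 falls into exactly one band.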

  20. Missing values for the new variable • Provide instructions to ensure that cases that have missing values on the original variables will also have missing values for new variables that are based on them. • These instructions are needed whether the new variable was created using a formula or classification instructions.
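
One way to apply this rule uniformly is a small wrapper that returns a missing value whenever any input is missing. The decorator `propagate_missing`, the use of `None` to represent missing data, and the hypothetical `passed` indicator are illustrative choices, not part of the original slides.

```python
from functools import wraps


def propagate_missing(func):
    """Make func return None (missing) whenever any argument is None."""
    @wraps(func)
    def wrapper(*args):
        if any(arg is None for arg in args):
            return None
        return func(*args)
    return wrapper


@propagate_missing
def total_grade(exam1, exam2, final):
    # Formula-based variable: sum of the three exam scores.
    return exam1 + exam2 + final


@propagate_missing
def passed(total):
    # Classification-based variable (hypothetical): 1 if not an F, else 0.
    return 1 if total >= 60 else 0


complete = total_grade(20, 22, 45)
incomplete = total_grade(20, 22, None)
```

The same wrapper covers both cases named on the slide: a variable built from a formula and one built from classification instructions. Missingness also carries through chains of derived variables, since `passed(incomplete)` is missing as well.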

  21. Summary • It is often necessary to create new variables to answer one’s research question. • Planning steps for creating new variables include • Identifying source variables available in a data set • Finding references about how such variables are conventionally analyzed • Becoming familiar with units or categories of the variables • Writing formulas or classification instructions to create the new variables from the original variables • Providing instructions about missing values for the original and new variables

  22. Summary, cont. • With the formulas and classification instructions for creating the new variables, one can then use a spreadsheet or statistical software to create those variables within an electronic data set. • Keep the researching and planning steps separate from the programming steps.

  23. Suggested resources • Miller, J. E. 2015. The Chicago Guide to Writing about Numbers, 2nd Edition. University of Chicago Press, chapter 10.

  24. Suggested practice exercises • Instructions and a planning template can be downloaded from the supplemental online materials at http://press.uchicago.edu/books/miller/numbers/index.htm

  25. Suggested online appendixes • How to Create the Variables You Need from the Variables You Have • Exercise includes • Step-by-step instructions • A template planning grid for a new categorical variable • Paper for instructors on how to teach the concepts and skills • Getting to Know Your Variables • Exercise to familiarize researchers with the concepts, units, and categories of variables in their data set • Paper for instructors on how to teach the concepts and skills

  26. Contact information • Jane E. Miller, PhD • jmiller@ifh.rutgers.edu • Online materials available at http://press.uchicago.edu/books/miller/numbers/index.html
