Chapter 8 A Statistics Primer

Presentation Transcript


  1. Chapter 8: A Statistics Primer. Winston Jackson and Norine Verberg, Methods: Doing Social Research, 4e

  2. Level of Measurement • Measures can be designed to have a higher, more complex level or a more basic, rudimentary level • Influenced by how the variable is conceptualized • Gender: can have only two categories (males and females) • Age: can be age (year of birth) or age groups (age categories, e.g., adolescent, young adult, etc.) • Influences choice of statistical analysis • Shown on Table 8.11 on page 222 • Related to measurement error (Chapter 13) © 2007 Pearson Education Canada

  3. Three Levels of Measurement • Nominal: involves no underlying continuum; assignment of numeric values is arbitrary. Examples: religious affiliation, gender, etc. • Ordinal: implies an underlying continuum; values are ordered but intervals are not equal. Examples: community size, Likert items, etc. • Ratio: involves an underlying continuum; numeric values assigned reflect equal intervals; zero point aligned with true zero. Examples: weight, age in years, % minority

  4. Examples of Nominal Level Measures • Do you have a valid driver’s licence? [ ] Yes [ ] No • Your sex (circle number of your answer): 1 Male, 2 Female

  5. Example of Ordinal Level Measure • The population of the place I considered my hometown when growing up was: Rural area (1); Town under 5,000 (2); 5,000 to 19,999 (3); 20,000 to 99,999 (4); 100,000 to 999,999 (5); 1,000,000 or over (6)

  6. Ordinal Level Measures (cont’d) In the following items, circle a number to indicate the extent to which you agree or disagree with each statement. • I would quit my present job if I won $1,000,000 through a lottery. Strongly Disagree 1 2 3 4 5 6 7 8 9 Strongly Agree • I would be satisfied if my child followed the same type of career as I have. Strongly Disagree (1); Somewhat Disagree (2); Neither Agree nor Disagree (3); Somewhat Agree (4); Strongly Agree (5)

  7. Examples of Ratio Level • What is your age? ___ • Body temperature _____ • Indexes composed of several ordinal variables are treated as ratio level variables in the social sciences

  8. Describing an Individual Variable • Statistics provide ways to describe and compare sets of observations (e.g., income levels, infant mortality, morbidity, crime, etc.) • Two common ways of describing a distribution (a set of scores in a data set) • Measures of central tendency • Measures of dispersion

  9. Measures of Central Tendency • A number that typifies the central scores of a set of values • Mean • Median • Mode

  10. Mean • The arithmetic average • Calculated by summing the values and dividing by the number of cases • Used to describe central tendency of ratio level data

  11. Median • The midpoint • Used to describe central tendency of ordinal level data • Calculated by ordering a set of values and then using the middlemost value (when there are two middle values, calculate the mean of the two). • Often used when a data set has extreme cases

  12. Table 8.7 Median for Extreme Values

  13. Mode • The most frequently occurring value • Used to describe central tendency of nominal level data (gender, religion, nationality)

      TABLE 8.8 DISTRIBUTION OF RESPONDENTS BY COUNTRY

      COUNTRY        NUMBER   PERCENT
      Canada             65      34.9   ← mode
      New Zealand        58      31.2
      Australia          63      33.9
      TOTAL             186     100.0
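
The three measures of central tendency can be sketched with Python's standard `statistics` module. This is only an illustration: the income values are hypothetical (they are not from Table 8.7), chosen so that one extreme case pulls the mean well away from the median. The country counts are the ones shown in Table 8.8.

```python
from statistics import mean, median, mode

# Hypothetical incomes (not from the source); the last value is an extreme case.
incomes = [30_000, 32_000, 35_000, 38_000, 400_000]

income_mean = mean(incomes)      # sum of the values divided by the number of cases
income_median = median(incomes)  # middlemost value after ordering

# Counts from Table 8.8, expanded to one country label per respondent.
counts = {"Canada": 65, "New Zealand": 58, "Australia": 63}
respondents = [country for country, n in counts.items() for _ in range(n)]
modal_country = mode(respondents)  # most frequently occurring value

print(income_mean, income_median, modal_country)
```

With an extreme case in the data, the median stays near the typical incomes while the mean does not, which is why the median is often preferred in that situation.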

  14. Measures of Dispersion • Indicate dispersion or variability of values • Are scores close together or spread out? • Three common measures of dispersion: • Range • Standard deviation • Variance

  15. Table 8.9 Two Grade Distributions

  16. Range • Gap between the lowest and highest value • Computed by subtracting the lowest from the highest

  17. Standard Deviation and Variance • The standard deviation measures the average amount of deviation from the mean value of the variable • The variance is the standard deviation squared

  18. Table 8.10 Computation of Standard Deviation, Beth’s Grades Note: The “N – 1” term is used when sampling procedures have been used. When population values are used the denominator is “N.” SPSS uses “N – 1” in calculating the standard deviation in the DESCRIPTIVES procedure.
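
The difference between the "N – 1" and "N" denominators can be sketched in Python. The grades below are hypothetical, since the values in Table 8.10 are not reproduced in this transcript. In the standard `statistics` module, `stdev` and `variance` use N – 1, while `pstdev` and `pvariance` use N.

```python
from statistics import pstdev, pvariance, stdev, variance

# Hypothetical grades (Table 8.10's actual values are not in this transcript).
grades = [66, 72, 78, 82, 92]

# Range: highest value minus lowest value.
grade_range = max(grades) - min(grades)

# Sample statistics: denominator is N - 1.
sample_var = variance(grades)
sample_sd = stdev(grades)

# Population statistics: denominator is N.
population_var = pvariance(grades)
population_sd = pstdev(grades)

print(grade_range, sample_var, round(sample_sd, 2))
print(population_var, round(population_sd, 2))
```

Because N – 1 is the smaller denominator, the sample standard deviation is always slightly larger than the population version computed from the same values.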

  19. Standardizing Data • Standardizing data facilitates making comparisons between units of different size • Also, can standardize data to create variables that have similar variability (Z scores) • Standardization of data is commonly done • Several methods of standardizing data: proportions, percentages, percentage change, rates, ratios

  20. Proportions • A proportion represents the part of 1 that some element represents. • $\text{Proportion female} = \frac{\text{Number female}}{\text{Total persons}} = \frac{31{,}216}{58{,}520} = .53$ • The females represent .53 of the population

  21. Percentage • A percentage represents how often something happens per 100 times • A proportion may be converted to a percentage by multiplying by 100 • Females constitute 53% of the population

  22. Percentage Change • Percentage change is a measure of how much something has changed over a given time period. • $\text{Percentage change} = \frac{\text{Time 2} - \text{Time 1}}{\text{Time 1}} \times 100$ • Example: percentage change in number of women in selected occupations (Table 8.13, p. 223)

  23. Rates • Rates represent the frequency of an event for a standard-sized unit. Divorce rates, suicide rates, crime rates are examples. • So if we had 104 suicides in a population of 757,465 the suicide rate per 100,000 would be calculated as follows: $SR = \frac{104}{757{,}465} \times 100{,}000 = 13.73$ • There are 13.73 suicides per 100,000

  24. Ratios • A ratio represents a comparison of one thing to another. • So if there are 200 burglaries per 100,000 in the U.S. and 57 per 100,000 in Canada, the U.S./Canadian burglary ratio is: $\frac{\text{U.S. burglary rate}}{\text{Canadian burglary rate}} = \frac{200}{57} = 3.51$
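
The standardizing methods on slides 20 to 24 can be sketched as small Python functions. The function names are illustrative, not from the textbook. The proportion, rate, and ratio inputs are the figures given on the slides. The percentage-change inputs are hypothetical, because Table 8.13 is not reproduced in this transcript.

```python
def proportion(part, total):
    """Part of 1 that some element represents."""
    return part / total


def percentage(part, total):
    """A proportion multiplied by 100."""
    return proportion(part, total) * 100


def percentage_change(time1, time2):
    """(Time 2 - Time 1) / Time 1 x 100."""
    return (time2 - time1) / time1 * 100


def rate(events, population, per=100_000):
    """Frequency of an event for a standard-sized unit."""
    return events / population * per


def ratio(first, second):
    """Comparison of one thing to another."""
    return first / second


prop_female = round(proportion(31_216, 58_520), 2)   # slide 20 figures
pct_female = round(percentage(31_216, 58_520))       # slide 21
suicide_rate = round(rate(104, 757_465), 2)          # slide 23 figures
burglary_ratio = round(ratio(200, 57), 2)            # slide 24 figures
change = percentage_change(200, 250)                 # hypothetical counts

print(prop_female, pct_female, suicide_rate, burglary_ratio, change)
```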

  25. Normal Distribution • Much data in the social and physical world are “normally distributed”; this means that there will be a few low values, many more clustered toward the middle, and a few high values. • Normal distributions: • symmetrical, bell-shaped curve • mean, mode, and median will be similar • about 68.27% of cases fall within ± 1 standard deviation of the mean • about 95.45% of cases fall within ± 2 standard deviations of the mean

  26. Figure 8.2 Normal Distribution Curve

  27. Z Scores • A Z score represents the distance from the mean, in standard deviation units, of any value in a distribution. • The Z score formula is as follows: $Z = \frac{X - \bar{X}}{SD}$, where $X$ is the value, $\bar{X}$ is the mean, and $SD$ is the standard deviation

  28. Areas Under the Normal Curve • Can determine what proportion of cases fall between two values or above/below a value • Steps: • Draw the normal curve, marking the mean and SD, and including lines to represent the problem • Calculate Z score(s) for the problem • Look up the value in Table 8.17, page 230 • Solve the problem. Recall that .5 of cases fall above the mean, and .5 below the mean • Convert the proportion to a percentage, if needed
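
The steps above can be sketched in Python. Two assumptions are made here: the Z score is taken to be the standard definition (value minus mean, divided by the standard deviation), and the table lookup is replaced by the cumulative distribution function of `statistics.NormalDist` (Python 3.8+) rather than Table 8.17. The test scores (mean 70, SD 10) are hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1


def z_score(value, mean, sd):
    """Distance from the mean in standard deviation units."""
    return (value - mean) / sd


def proportion_between(low, high, mean, sd):
    """Proportion of cases falling between two values."""
    z_low = z_score(low, mean, sd)
    z_high = z_score(high, mean, sd)
    return standard_normal.cdf(z_high) - standard_normal.cdf(z_low)


def proportion_above(value, mean, sd):
    """Proportion of cases falling above a value."""
    return 1 - standard_normal.cdf(z_score(value, mean, sd))


# Hypothetical distribution of test scores: mean 70, SD 10.
within_1sd = proportion_between(60, 80, mean=70, sd=10)
within_2sd = proportion_between(50, 90, mean=70, sd=10)
above_85 = proportion_above(85, mean=70, sd=10)

# Convert proportions to percentages, as in the last step.
print(round(within_1sd * 100, 2), round(within_2sd * 100, 2), round(above_85 * 100, 2))
```

A printed table gives the area between the mean and a Z score, so the hand method adds or subtracts from .5. The cumulative function used here gives the area below a Z score directly, which is why the arithmetic looks slightly different.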

  29. Other Distributions • Not all variables are normally distributed • Bimodal: two overlapping normally distributed plots • e.g., weight (females will have a lower average weight) • Leptokurtic: little variability • distribution appears tall and peaked • Platykurtic: great deal of variability • distribution appears flat and wide • Having a normal distribution is important for doing tests of statistical significance (Table 8.18)

  30. Figure 8.4 Other Distributions

  31. Describing Relationships Among Variables Involves three important steps: • Decide which variable is to be treated as the dependent variable and which as the independent variable • Decide on the appropriate procedure for examining the relationship • Perform the analysis

  32. Methods Selection of statistical method depends upon the level of measurement of the dependent and independent variables • Contingency tables: Crosstabs • Comparing means: means analysis • Correlational analysis: correlation

  33. Contingency Tables: Crosstabs • A contingency table cross-classifies cases on two or more variables to show the relation between an independent and dependent variable • Uses a nominal dependent variable and an ordinal or nominal independent variable A standard table looks like the one on the following slide.

  34. Table 8.19 Plans to Attend University by Size of Home Community If a test of significance is appropriate for the table, the raw Chi-Square value (which will be introduced in Chapter 9), the degrees of freedom, whether the test is one- or two-tailed, and the probability level should be indicated.

  35. Rules for Constructing a Contingency Table • In table titles, name the dependent variable first • Place dependent variable on vertical plane • Place independent variable on horizontal plane • Use variable labels that are clear • Run percentages toward the independent variable • Report percentages to one decimal place • Report statistical test results below table • Interpret the table by comparing categories of the independent variable • Minimize categories in control tables
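
Several of these rules can be sketched in Python. The counts are hypothetical, since the values in Table 8.19 are not reproduced in this transcript. The sketch also assumes that "run percentages toward the independent variable" means computing percentages within each category of the independent variable, so that each column totals 100%.

```python
# Hypothetical counts. Rows are the dependent variable (plans to attend
# university); columns are the independent variable (size of home community).
counts = {
    "Plans to attend": {"Rural": 24, "Town": 45, "City": 66},
    "No plans": {"Rural": 56, "Town": 55, "City": 44},
}
columns = ["Rural", "Town", "City"]

# Total number of cases in each category of the independent variable.
column_totals = {col: sum(row[col] for row in counts.values()) for col in columns}

# Percentages within each column, reported to one decimal place.
percentages = {
    label: {col: round(100 * row[col] / column_totals[col], 1) for col in columns}
    for label, row in counts.items()
}

# Print with the dependent variable down the side and the independent
# variable across the top.
print(f"{'':<18}" + "".join(f"{col:>8}" for col in columns))
for label, row in percentages.items():
    print(f"{label:<18}" + "".join(f"{row[col]:>8.1f}" for col in columns))
```

Reading across a row then compares categories of the independent variable, which is how the rules say the table should be interpreted.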

  36. Comparing Means: Means • Used when dependent variable is ratio • Compares means across categories of the independent variable (nominal or ordinal) • Both t-test and ANOVA may be used (Chapter 9) Presentation may be as shown on the following slide.

  37. Table 8.22 Mean Income by Gender If a test of significance is appropriate for the table, the t-test or F-test value, the degrees of freedom, whether the test is one- or two-tailed, and the probability level should be indicated.

  38. Correlational Analysis: Correlation • Correlational analysis is a procedure for measuring how closely two ratio level variables co-vary • Basis for more advanced procedures: partial correlations, multiple correlations, regression, factor analysis, path analysis and canonical analysis • Advantage: can analyze many variables (multivariate analysis) simultaneously • Relies on having ratio level measures

  39. Two Basic Concerns • What is the equation that describes the relation between two variables? • What is the strength of the relation between the two? Two visual estimation procedures • The linear equation: Y = a + bX • Correlation coefficient: r

  40. The Linear Equation • The linear equation, Y = a + bX, describes the relation between the two variables • Components: • Y: dependent variable (e.g., starting salary) • X: independent variable (e.g., years of post-secondary education) • a: the constant, which indicates where the regression line intersects the Y-axis • b: the slope of the regression line

  41. A. The Linear Equation: A Visual Estimation Procedure • Step 1: Plot the relation on a graph

      Table 8.24: Sample data set

      X   Y
      2   3
      3   4
      5   4
      7   6
      8   8

  42. A. The Linear Equation (cont’d) • Step 2: Insert a straight regression line • From the regression line one can estimate how much one has to change the independent variable in order to produce a unit change in the dependent variable

  43. A. The Linear Equation (cont’d) • Step 3: Observe where the regression line crosses the Y axis; this represents the constant in the equation (a = 1.33 on Figure 8.6) • Step 4: Draw a line parallel to the X axis and one parallel to the Y axis to form a right-angled triangle. Measure the lines; divide the horizontal distance into the vertical distance to compute the b value (72/91 = 0.79)

  44. A. The Linear Equation (cont’d) • Step 5: If the regression line is lower on the right-hand side, the b coefficient is negative, meaning the more X, the less Y. If the slope is negative, use a minus sign in your equation: Y = a – bX • Step 6: Write the equation: Y = 1.33 + 0.79(X) • The above formula is our visually estimated equation for the relation between the two variables • The equation is used to predict the value of the Y variable given a value of the X variable • This is done in regression analysis
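
The visual estimate can be compared with a computed line. The sketch below fits Y = a + bX by ordinary least squares, which is a swapped-in technique and not the visual procedure on the slides. It assumes the Table 8.24 values pair up row by row as (2, 3), (3, 4), (5, 4), (7, 6), (8, 8). The function names are illustrative.

```python
# Table 8.24 sample data, read row by row as (X, Y) pairs (an assumption).
xs = [2, 3, 5, 7, 8]
ys = [3, 4, 4, 6, 8]


def fit_line(xs, ys):
    """Least-squares constant (a) and slope (b) for Y = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: co-variation of X and Y divided by variation of X.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    # Constant: the fitted line passes through (mean of X, mean of Y).
    a = mean_y - b * mean_x
    return a, b


def predict(x, a, b):
    """Predicted Y for a given X."""
    return a + b * x


a, b = fit_line(xs, ys)

# Predictions at X = 6 from the visual estimate and from the fitted line.
visual_prediction = predict(6, a=1.33, b=0.79)
fitted_prediction = predict(6, a, b)

print(round(a, 2), round(b, 2))
print(round(visual_prediction, 2), round(fitted_prediction, 2))
```

The computed constant and slope land close to the visually estimated 1.33 and 0.79, which is about as much agreement as a by-eye regression line can be expected to give.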

  45. B. Correlation Coefficient: A Visual Estimation Procedure • Goal: to develop a sense of what correlations of different magnitudes look like • Correlation coefficient (r) is a measure of the strength of the association between two variables • Varies from +1 to –1 • Perfect correlations are rare • Usually presented as a decimal, as in .98, .56, –.32 • Negative correlations ~ negative slope

  46. Figure 8.9 Eight Linear Correlations

  47. Figure 8.9 Eight Linear Correlations (cont’d)

  48. B. Correlation Coefficient (cont’d) Figure 8.9 graphs 8 relationships • Graphing allows you to visually estimate the strength of the association • The closer the plotted points are to the regression line (e.g., Plots 1 and 2), the higher the correlation (.99 and .85) • Greater spread (e.g., Plots 3 and 4) ~ lower correlation (.53 and .36) • It would be difficult to draw a regression line if r < .36

  49. B. Correlation Coefficient (cont’d) • Plots 5 and 6: curvilinear: not linear, hence r = 0 • Procedure not appropriate for curvilinear relations • Plots 7 and 8: problem plots: deviant cases • This is one of the reasons it is important to plot relationships; extreme values indicate a non-linear relationship, therefore linear regression procedures are not appropriate for studying these relationships

  50. Calculating the Correlation Coefficient • The estimation of the correlation coefficient takes two kinds of variability into account: • Variations around the regression line • Variations around the mean of Y • $r^2 = 1 - \frac{\text{variations around regression}}{\text{variations around mean of } Y}$ • Can calculate (see p. 245); computer programs used by most researchers today
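
The r² formula can be sketched in Python using the Table 8.24 sample data, read as the pairs (2, 3), (3, 4), (5, 4), (7, 6), (8, 8). Two assumptions are made: each "variation" is measured as a sum of squared deviations, and the regression line is the least-squares line rather than the visually estimated one.

```python
from math import copysign, sqrt

# Table 8.24 sample data, read row by row as (X, Y) pairs (an assumption).
xs = [2, 3, 5, 7, 8]
ys = [3, 4, 4, 6, 8]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Least-squares regression line Y = intercept + slope * X.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Variations around the regression line: squared gaps between each
# observed Y and the Y predicted by the line.
around_regression = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Variations around the mean of Y.
around_mean = sum((y - mean_y) ** 2 for y in ys)

r_squared = 1 - around_regression / around_mean

# r takes the sign of the slope (negative slope, negative correlation).
r = copysign(sqrt(r_squared), slope)

print(round(r_squared, 3), round(r, 3))
```

When the points sit close to the regression line, the variation around it is small relative to the variation around the mean of Y, so r² approaches 1.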
