Statistics for the Social Sciences


Presentation Transcript


  1. Statistics for the Social Sciences Psychology 340 Fall 2013 Tuesday, November 19 Chi-Squared Test of Independence

  2. Homework #13 due 11/28: Ch 17 # 13, 14, 19, 20

  3. Last Time: • Clarification and review of some regression concepts • Multiple regression • Regression in SPSS

  4. This Time: • Review of multiple regression • New Topic: Chi-squared test of independence • Announcements: • Final project due date extended from Dec. 5 to Dec. 6. Must be turned in to psychology department by 4 p.m. • Extra credit due by the start of class (Dec. 5) to receive credit. Evidence of academic dishonesty regarding extra credit will be referred for disciplinary action. • Exam IV (emphasizing correlation, regression, and chi-squared test) is on Tuesday, December 3 • Final exam is on Tuesday, 12/10 at 7:50 a.m.

  5. Multiple Regression Typically researchers are interested in predicting with more than one explanatory variable. In multiple regression, an additional predictor variable (or set of variables) is used to predict the residuals left over from the first predictor.

  6. Multiple Regression • Bi-variate regression prediction model: Y = intercept + slope(X) + error

  7. Multiple Regression • Bi-variate regression prediction model: Y = intercept + slope(X) + error • Multiple regression prediction model: Y = intercept + slope1(X1) + slope2(X2) + … + error • The intercept-plus-slopes part is the “fit”; the error term is the “residual”
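A minimal sketch of these two prediction models using NumPy least squares. The data and variable names below are made-up placeholders, not the course data:

```python
# Fit a bi-variate model and a multiple regression model to synthetic data.
import numpy as np

rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)          # first predictor (e.g., study time)
x2 = rng.normal(size=n)          # second predictor (e.g., hours of sleep)
y = 2.0 + 1.5 * x1 + 0.8 * x2 + rng.normal(size=n)  # response

# Bi-variate model: Y = intercept + slope(X1) + error
X_biv = np.column_stack([np.ones(n), x1])
b_biv, *_ = np.linalg.lstsq(X_biv, y, rcond=None)

# Multiple regression model: Y = intercept + slope1(X1) + slope2(X2) + error
X_mult = np.column_stack([np.ones(n), x1, x2])
b_mult, *_ = np.linalg.lstsq(X_mult, y, rcond=None)

print("bi-variate coefficients:", b_biv)   # [intercept, slope]
print("multiple coefficients:  ", b_mult)  # [intercept, slope1, slope2]
```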

  8. Multiple Regression The multiple regression prediction model splits Y among a First Explanatory Variable, Second Explanatory Variable, Third Explanatory Variable, and Fourth Explanatory Variable, plus an error term for whatever variability is left over

  9. Multiple Regression Predict test performance based on four explanatory variables (plus whatever variability is left over): • Study time • Test time • What you eat for breakfast • Hours of sleep

  10. Multiple Regression Predict test performance based on: • Study time • Test time • What you eat for breakfast • Hours of sleep • Typically your analysis consists of testing multiple regression models against one another to see which “fits” best (comparing the R2s of the models) • For example:

  11. Multiple Regression Model #1: Total study time (r = .6) predicting the response variable, total variability in test performance • Some co-variance between the two variables • If we know the total study time, we can predict 36% of the variance in test performance • R2 for Model = .36; 64% of the variance is unexplained

  12. Multiple Regression Model #2: Add test time (r = .1) to the model • Little co-variance between test performance and test time • We can explain slightly more of the variance in test performance • R2 for Model = .37; 63% of the variance is unexplained

  13. Multiple Regression Model #3: Add breakfast food (r = .0) to the model • No co-variance between test performance and breakfast food • Not related, so we can NOT explain any more of the variance in test performance • R2 for Model = .37; 63% of the variance is still unexplained

  14. Multiple Regression Model #4: Add hours of sleep (r = .45) to the model • Some co-variance between test performance and hours of sleep • We can explain more of the variance in test performance • R2 for Model = .45; 55% of the variance is unexplained • But notice what happens with the overlap (covariation between explanatory variables): you can’t just add the r’s or r2’s
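The model-comparison idea can be sketched in a few lines of Python. The data here are synthetic, so the R2 values will not match the slide's .36/.37/.45, but the pattern (overlap between predictors means the increments don't simply add) is the same:

```python
# Compare R^2 across nested regression models on synthetic data.
import numpy as np

def r_squared(y, X):
    """R^2 for an OLS fit of y on the columns of X (intercept added here)."""
    X = np.column_stack([np.ones(len(y)), X])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    return 1 - resid.var() / y.var()

rng = np.random.default_rng(1)
n = 200
study = rng.normal(size=n)
sleep = 0.4 * study + rng.normal(size=n)      # overlaps with study time
perf = 0.6 * study + 0.4 * sleep + rng.normal(size=n)

r2_1 = r_squared(perf, study)                            # study time only
r2_2 = r_squared(perf, np.column_stack([study, sleep]))  # add hours of sleep
print(r2_1, r2_2)  # R^2 rises, but by less than r_sleep**2 because of overlap
```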

  15. Multiple Regression The “least squares” regression equation when there are multiple intercorrelated predictor (x) variables is found by calculating “partial regression coefficients” for each x A partial regression coefficient for x1 shows the relationship between y and x1 while statistically controlling for the other x variables (or holding the other x variables constant)

  16. Multiple Regression The formula for the partial regression coefficient is: b1 = [(rY1 − rY2·r12) / (1 − r12²)] · (sY / s1) where rY1 = correlation of x1 and y, rY2 = correlation of x2 and y, r12 = correlation of x1 and x2, sY = standard deviation of y, s1 = standard deviation of x1
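A worked check of this formula, with illustrative correlations and standard deviations (the numbers are assumptions, not course data):

```python
# Partial regression coefficient b1 = [(rY1 - rY2*r12) / (1 - r12**2)] * (sY / s1)
r_y1, r_y2, r_12 = 0.6, 0.4, 0.3   # hypothetical correlations
s_y, s_1 = 10.0, 2.0               # hypothetical standard deviations

b1 = ((r_y1 - r_y2 * r_12) / (1 - r_12**2)) * (s_y / s_1)
print(round(b1, 3))  # 2.637: slope for x1, holding x2 constant
```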

  17. Multiple Regression Multiple correlation coefficient (R) is an estimate of the relationship between the dependent variable (y) and the best linear combination of predictor variables (correlation of y and y-pred.) R2 tells you the amount of variance in y explained by the particular multiple regression model being tested.
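A sketch of that definition of R, computing it directly as the correlation between y and the model's predictions (synthetic data again):

```python
# R is corr(y, y_pred); R^2 is the variance in y explained by the model.
import numpy as np

rng = np.random.default_rng(2)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # intercept + 2 predictors
y = X @ np.array([1.0, 0.5, -0.3]) + rng.normal(size=n)

b, *_ = np.linalg.lstsq(X, y, rcond=None)
y_pred = X @ b
R = np.corrcoef(y, y_pred)[0, 1]
print(R, R**2)  # multiple correlation and variance explained
```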

  18. Multiple Regression in SPSS Setup as before: Variables (explanatory and response) are entered into columns • A couple of different ways to use SPSS to compare different models

  19. Regression in SPSS Analyze: Regression, Linear

  20. Multiple Regression in SPSS Method 1: enter all the explanatory variables together Enter: • Predicted (criterion) variable into Dependent Variable field • All of the predictor variables into the Independent Variable field

  21. Multiple Regression in SPSS The variables in the model • R for the entire model • R2 for the entire model • Unstandardized coefficients • Coefficient for var1 (var name) • Coefficient for var2 (var name)

  22. Multiple Regression in SPSS The variables in the model • R for the entire model • R2 for the entire model • Standardized coefficients • Coefficient for var1 (var name) • Coefficient for var2 (var name)

  23. Multiple Regression • Which coefficient to use, standardized or unstandardized? • Unstandardized b’s are easier to use if you want to predict a raw score based on raw scores (no z-scores needed). • Standardized β’s are nice to directly compare which variable is most “important” in the equation
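The two kinds of coefficients are linked by the standard conversion βj = bj·(sxj / sY); a quick sketch with made-up numbers:

```python
# Convert an unstandardized slope (b) to a standardized slope (beta).
b_study = 2.5      # hypothetical raw-score slope for study time
s_study = 1.2      # hypothetical SD of study time
s_perf = 6.0       # hypothetical SD of test performance

beta_study = b_study * (s_study / s_perf)
print(round(beta_study, 2))  # 0.5: directly comparable across predictors
```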

  24. Multiple Regression in SPSS • Method 2: enter the first model, then add another variable for the second model, etc. • Enter: • Predicted (criterion) variable into Dependent Variable field • First Predictor variable into the Independent Variable field • Click the Next button

  25. Multiple Regression in SPSS Method 2 cont: Enter: • Second Predictor variable into the Independent Variable field • Click Statistics

  26. Multiple Regression in SPSS • Click the ‘R squared change’ box

  27. Multiple Regression in SPSS • Shows the results of two models • The variables in the first model (math SAT) • The variables in the second model (math and verbal SAT)

  28. Multiple Regression in SPSS • Shows the results of two models • The variables in the first model (math SAT) • The variables in the second model (math and verbal SAT) • Model 1: r2 for the first model • Coefficients for var1 (var name)

  29. Multiple Regression in SPSS • Shows the results of two models • The variables in the first model (math SAT) • The variables in the second model (math and verbal SAT) • Model 2: r2 for the second model • Coefficients for var1 (var name) • Coefficients for var2 (var name)

  30. Multiple Regression in SPSS • Shows the results of two models • The variables in the first model (math SAT) • The variables in the second model (math and verbal SAT) • Change statistics: is the change in r2 from Model 1 to Model 2 statistically significant?
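SPSS's “R squared change” line is an F test on the R2 increment for nested models. A sketch under the usual formula, with made-up values:

```python
# F test for the change in R^2 between nested models:
# F = ((R2_2 - R2_1) / (k2 - k1)) / ((1 - R2_2) / (n - k2 - 1))
from scipy import stats

n = 100            # hypothetical sample size
k1, k2 = 1, 2      # predictors in Model 1 and Model 2
r2_1, r2_2 = 0.30, 0.37   # hypothetical R^2 values

F = ((r2_2 - r2_1) / (k2 - k1)) / ((1 - r2_2) / (n - k2 - 1))
p = stats.f.sf(F, k2 - k1, n - k2 - 1)
print(round(F, 2), round(p, 4))  # is the added predictor worth keeping?
```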

  31. Cautions in Multiple Regression • We can use as many predictors as we wish but we should be careful not to use more predictors than is warranted. • Simpler models are more likely to generalize to other samples. • If you use as many predictors as you have participants in your study, you can predict 100% of the variance. Although this may seem like a good thing, it is unlikely that your results would generalize to any other sample and thus they are not valid. • You probably should have at least 10 participants per predictor variable (and probably should aim for about 30).
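A quick demonstration of that overfitting warning: with as many estimated parameters as participants, regression fits even pure noise perfectly, which is exactly why it won't generalize.

```python
# With n participants and n parameters (intercept + n-1 predictors),
# least squares drives every residual to zero, even on random noise.
import numpy as np

rng = np.random.default_rng(3)
n = 10
X = np.column_stack([np.ones(n), rng.normal(size=(n, n - 1))])  # n columns
y = rng.normal(size=n)          # pure noise, unrelated to X

b, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ b
print(np.allclose(resid, 0))    # True: R^2 = 1 on noise
```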

  32. New (Final) Topic Chi-Squared Test of Independence

  33. Chi-Squared Test for Independence A manufacturer of watches takes a sample of 200 people. Each person is classified by age (young: under 30, vs. old: over 30) and watch type preference (digital vs. analog). The question: is there a relationship between age and watch preference?


  35. Statistical analysis follows design We have finished the top part of the chart! Focus on this section for the rest of the semester [Course flowchart shown on slide]


  37. Chi-Squared Test for Independence A manufacturer of watches takes a sample of 200 people. Each person is classified by age and watch type preference (digital vs. analog). The question: is there a relationship between age and watch preference? Step 1: State the hypotheses • H0: Preference is independent of age (“no relationship”) • HA: Preference is related to age (“there is a relationship”) [Table: observed scores]

  38. Chi-Squared Test for Independence Step 2: Compute your degrees of freedom & get the critical value • df = (#Columns − 1) × (#Rows − 1) = (3 − 1) × (2 − 1) = 2 • Go to the chi-square statistic table (B-8) and find the critical value • For this example, with df = 2 and α = 0.05, the critical chi-squared value is 5.99
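If you prefer to cross-check table B-8 in code, SciPy gives the same critical value:

```python
# Critical value of the chi-squared distribution at alpha = .05, df = 2.
from scipy import stats

alpha, df = 0.05, 2
crit = stats.chi2.ppf(1 - alpha, df)
print(round(crit, 2))  # 5.99, matching table B-8
```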

  39. Chi-Squared Test for Independence Step 3: Collect the data. Obtain row and column totals (sometimes called the marginals) and calculate the expected frequencies [Table: observed scores]

  40. Chi-Squared Test for Independence Step 3: Collect the data. Obtain row and column totals (sometimes called the marginals) and calculate the expected frequencies • Spot check: make sure the row totals and the column totals add up to the same thing (the total sample size, N = 200) [Table: observed scores with marginals]


  42. Chi-Squared Test for Independence Step 3: Collect the data. Obtain row and column totals (sometimes called the marginals) and calculate the expected frequencies for each cell: fe = (row total × column total) / N • Expected scores:

             Digital   Analog   Undecided
  Under 30      70        56        14
  Over 30       30        24         6

• “Expected frequencies”: if the null hypothesis is correct, then these are the frequencies that you would expect
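The expected-frequency rule is easy to verify in code; the marginals below (rows 140/60, columns 100/80/20) are read off the slide's expected table:

```python
# Expected frequencies from the marginals: fe = row total * column total / N.
import numpy as np

row_totals = np.array([140, 60])       # Under 30, Over 30
col_totals = np.array([100, 80, 20])   # Digital, Analog, Undecided
N = row_totals.sum()                   # 200

expected = np.outer(row_totals, col_totals) / N
print(expected)  # [[70, 56, 14], [30, 24, 6]] as on the slide
```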

  43. Chi-Squared Test for Independence Step 3: compute the χ2 • Find the residuals (fo - fe) for each cell


  45. Computing the Chi-square Step 3: compute the χ2 • Find the residuals (fo − fe) for each cell • Square these differences

  46. Computing the Chi-square Step 3: compute the χ2 • Find the residuals (fo − fe) for each cell • Square these differences • Divide the squared differences by fe

  47. Computing the Chi-square Step 3: compute the χ2 = Σ (fo − fe)² / fe • Find the residuals (fo − fe) for each cell • Square these differences • Divide the squared differences by fe • Sum the results
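Putting the four steps together in code. The observed table itself did not survive in this transcript, so the counts below are a reconstruction chosen to be consistent with the marginals above and the χ2 of 38.09 reported on the next slide; treat them as illustrative:

```python
# chi-squared = sum over cells of (fo - fe)**2 / fe
import numpy as np

observed = np.array([[90, 40, 10],    # Under 30: Digital, Analog, Undecided (reconstructed)
                     [10, 40, 10]])   # Over 30 (reconstructed)
expected = np.array([[70, 56, 14],
                     [30, 24, 6]])

chi2 = ((observed - expected) ** 2 / expected).sum()
print(round(chi2, 2))  # 38.1, matching the slide's reported 38.09
```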

  48. Chi-Squared, the final step A manufacturer of watches takes a sample of 200 people. Each person is classified by age and watch type preference (digital vs. analog). The question: is there a relationship between age and watch preference? Step 4: Compare the computed statistic (38.09) against the critical value (5.99) and make a decision about your hypotheses • Since 38.09 > 5.99, we reject H0 and conclude that there is a relationship between age and watch preference

  49. In SPSS Analyze => Descriptive Statistics => Crosstabs Select the two variables (usually they are nominal or ordinal) you want to examine and click the arrow to move one into the “rows” box and one into the “columns” box. Click on the “Statistics” button and check the “Chi-square” box. Click “Continue.” Click “OK.”

  50. SPSS Output Look at the “Chi-Square Tests” box. The top row of this box gives results for “Pearson’s Chi-Square” • “Value” is the value of the χ2 statistic • “df” is the degrees of freedom for the test • “Asymp. Sig. (2-sided)” is the probability (p-value) associated with the test • The chi-squared distribution, like the F-distribution, is “squared,” so a one-tailed test is not possible
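For comparison, the same test outside SPSS: scipy.stats.chi2_contingency reproduces the Pearson chi-square, df, and p-value (again using the reconstructed observed counts from above):

```python
# Cross-check of the SPSS crosstabs output with SciPy.
import numpy as np
from scipy import stats

observed = np.array([[90, 40, 10],    # illustrative reconstruction
                     [10, 40, 10]])
chi2, p, df, expected = stats.chi2_contingency(observed, correction=False)
print(round(chi2, 2), df, p)  # ~38.1 with df = 2, p well below .05
```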
