My last name starts with a letter somewhere between A. A – D B. E – L C. M – R D. S – Z

Please click in. Set your clicker to channel 41.

Introduction to Statistics for the Social Sciences
SBS200, COMM200, GEOG200, PA200, POL200, SOC200
Lecture Section 001, Fall 2011
Room 201 Physics-Atmospheric Sciences (PAS)
10:00 - 10:50 Mondays & Wednesdays + Lab Session
Welcome
Please double check: all cell phones and other electronic devices are turned off and stowed away
http://www.youtube.com/watch?v=oSQJP40PcGI

Homework #10 (Due October 31st)
Complete Homework: Hypothesis testing with z and t-scores
Available on class website
Homework #11 (Due November 2nd)
Complete Homework: Hypothesis testing 2-sample t-scores
Available on class website
Exam 3 - November 7th
Study guide is available online

Use this as your study guide
By the end of lecture today (10/26/11):
• Logic of hypothesis testing
• Steps for hypothesis testing for t-tests
• How are t-tests similar to z-tests?
• How are t-tests different from z-tests?
• Levels of significance (levels of alpha): What does alpha of .05 mean? What does p < 0.05 mean? What does alpha of .01 mean? What does p < 0.01 mean?
• How is a two-tailed t-test different from a one-tailed t-test?

Please read (before the next exam): Chapters 10 - 12 in Lind book and Chapters 2 - 4 in Plous book
Lind
Chapter 10: One sample Tests of Hypothesis
Chapter 11: Two sample Tests of Hypothesis
Chapter 12: Analysis of Variance
Plous
Chapter 2: Cognitive Dissonance
Chapter 3: Memory and Hindsight Bias
Chapter 4: Context Dependence
Note: Chapter 12 (Analysis of Variance) WILL NOT appear on Exam 3

A quick re-visit with the law of large numbers

Law of large numbers: As the number of measurements increases, the data become more stable and a better approximation of the true signal (e.g. mean)
As the number of observations (n) increases, or the number of times the experiment is performed, the signal becomes clearer (static cancels out)
With only a few people any little error is noticed (becomes exaggerated when we look at whole group)
With many people any little error is corrected (becomes minimized when we look at whole group)
http://www.youtube.com/watch?v=ne6tB2KiZuk
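The law of large numbers can be seen in a small simulation. The sketch below is illustrative only: the "true" mean of 100, the standard deviation of 15, the sample sizes, and the function name are all assumptions chosen for the demonstration, not values from the lecture. Any single small sample can land close to the true mean by luck, but large samples do so reliably.

```python
import random
import statistics


def simulated_sample_means(sample_sizes, true_mean=100.0, true_sd=15.0, seed=0):
    """Return {n: mean of n simulated measurements} for each n in sample_sizes."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    means = {}
    for n in sample_sizes:
        draws = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        means[n] = statistics.fmean(draws)
    return means


if __name__ == "__main__":
    # Hypothetical sample sizes: from "a few people" up to "many people"
    for n, m in simulated_sample_means([5, 50, 500, 50_000]).items():
        print(f"n = {n:>6}: sample mean = {m:7.2f}, distance from 100 = {abs(m - 100.0):.2f}")
```

With n = 50,000 the standard error is about 15 / √50,000 ≈ 0.07, so the sample mean should sit very near 100, while a sample of 5 can easily miss by several points.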

Five steps to hypothesis testing
Step 1: Identify the research problem (hypothesis)
Describe the null and alternative hypotheses
Step 2: Decision rule
• Alpha level? (α = .05 or .01)
• Critical statistic (e.g. critical z) value?
Step 3: Calculations
Step 4: Make decision whether or not to reject null hypothesis
If observed z is bigger than critical z then reject null (it is a “significant difference” and p < 0.05)
Step 5: Conclusion - tie findings back in to research problem

Five steps to hypothesis testing
How is a t score the same as a z score? How is a t score different from a z score?
Step 1: Identify the research problem (hypothesis)
Describe the null and alternative hypotheses
Step 2: Decision rule
• Alpha level? (α = .05 or .01)
• Critical statistic (e.g. z or t) value?
Step 3: Calculations (population versus sample standard deviation)
Step 4: Make decision whether or not to reject null hypothesis
If observed z (or t) is bigger than critical z (or t) then reject null
Step 5: Conclusion - tie findings back in to research problem

A note on z scores and t scores:
• Numerator is always the distance between means (how far apart the distributions are, or “effect size”)
• Denominator is always a measure of variability (how wide the distributions are, or how much they overlap)
z or t = (difference between means) / (variability of curve(s), i.e. within-group variability)

A note on variability versus effect size
(difference between means) / (variability of curve(s), i.e. within-group variability)


Effect size is considered relative to variability of distributions
1. Larger variance: harder to find a significant difference
2. Smaller variance: easier to find a significant difference
(Figure: pairs of distributions, each labeled “Treatment Effect”, for the larger-variance and smaller-variance cases)
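A short numeric sketch of this point: hold the difference between means fixed and change only the variability. The numbers here (a difference of 5, standard deviations of 20 and 5, n = 16) are hypothetical and chosen only to make the arithmetic easy; the function name is likewise an assumption.

```python
import math


def t_ratio(mean_difference, sd, n):
    """Difference between means divided by the standard error (sd / sqrt(n))."""
    return mean_difference / (sd / math.sqrt(n))


# Same treatment effect (difference of 5), different within-group variability
wide = t_ratio(mean_difference=5.0, sd=20.0, n=16)    # lots of overlap between curves
narrow = t_ratio(mean_difference=5.0, sd=5.0, n=16)   # little overlap between curves

print(f"larger variability:  t = {wide:.2f}")
print(f"smaller variability: t = {narrow:.2f}")
```

The same effect produces a much larger ratio when the curves are narrow, which is why smaller variance makes a significant difference easier to find.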

Effect size is considered relative to variability of distributions
(Figure labels: “Treatment Effect” marks the difference between means; the spread of each curve marks the variability of curve(s), i.e. within-group variability)

Comparing z score distributions with t-score distributions
Similarities include:
• Using bell-shaped distributions to make confidence interval estimations and decisions in hypothesis testing
• Using a table to find areas under the curve (a different table, though: areas often differ from z scores)
Summary of 2 main differences:
• We are now estimating standard deviation from the sample (we don’t know population standard deviation)
• We have to deal with degrees of freedom

Comparing z score distributions with t-score distributions
Differences include:
1) We use the t-distribution when we don’t know the standard deviation of the population, and have to estimate it from our sample
Critical t (just like critical z) separates common from rare scores
Critical t is used to define both common scores (“confidence interval”) and rare scores (“region of rejection”)

Comparing z score distributions with t-score distributions
Differences include:
1) We use the t-distribution when we don’t know the standard deviation of the population, and have to estimate it from our sample
2) The shape of the sampling distribution is very sensitive to small sample sizes (it actually changes shape depending on n)
Please notice: as sample sizes get smaller, the tails get thicker. As sample sizes get bigger, the tails get thinner and look more like the z-distribution

Comparing z score distributions with t-score distributions
Differences include:
1) We use the t-distribution when we don’t know the standard deviation of the population, and have to estimate it from our sample
2) The shape of the sampling distribution is very sensitive to small sample sizes (it actually changes shape depending on n)
3) Because the shape changes, the relationship between the scores and the proportions under the curve changes (so we would have a different table for every possible n, but just the important ones are summarized in our t-table)
Please note: Once sample sizes get big enough, the t distribution (curve) starts to look almost exactly like the z distribution (curve)

Interpreting t-table
We use degrees of freedom (df) to approximate sample size
Technically, we have a different t-distribution for each sample size
This t-table summarizes the most useful values for several distributions (organized by degrees of freedom)
Each curve is based on its own degrees of freedom (df), which depend on sample size, and has its own table tying together t-scores with area under the curve
(Figure: example curves for n = 17 and n = 5)
Remember these useful values for z-scores? 1.64, 1.96, 2.58


(t-table column headings: df; area between two scores; area beyond two scores (out in tails); area in each tail (out in tails))
Notice that with a large sample size the values are the same as the z-scores
Remember these useful values for z-scores? 1.64, 1.96, 2.58

Comparison of z and t
• For very small samples, t-values differ substantially from the normal.
• As degrees of freedom increase, the t-values approach the normal z-values.
• For example, for n = 31, the degrees of freedom are: n - 1 = 31 - 1 = 30 df
• What would the t-value be for a 90% confidence interval?
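The convergence of t toward z can be checked numerically. The sketch below is one possible way to do it with only the Python standard library: it integrates the t density with Simpson's rule and searches for the critical value by bisection. This is a swapped-in numerical method, not how the printed t-table was produced, and the function names and step counts are assumptions. The df values 4 and 16 correspond to the n = 5 and n = 17 curves, and df = 30 to the n = 31 example.

```python
import math
from statistics import NormalDist


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(df * math.pi)
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))


def upper_tail_area(t, df, steps=1000):
    """P(T > t) for t > 0: 0.5 minus the area from 0 to t (Simpson's rule, even steps)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def critical_t(df, alpha, tails=2):
    """Positive critical t that leaves alpha / tails in the upper tail."""
    target = alpha / tails
    lo, hi = 0.0, 100.0
    for _ in range(50):  # bisection: halve the search interval each pass
        mid = (lo + hi) / 2
        if upper_tail_area(mid, df) > target:
            lo = mid  # too much area still beyond mid, so move outward
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    # 90% confidence interval: 10% of the area split across the two tails
    for df in (4, 16, 30, 1000):
        print(f"df = {df:>4}: critical t = {critical_t(df, 0.10):.3f}")
    print(f"z value for comparison: {NormalDist().inv_cdf(0.95):.3f}")
```

For df = 30 the search should land on roughly 1.697, the standard table value for a 90% confidence interval, and the df = 1000 value should be close to the z value.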

Degrees of Freedom
Degrees of freedom (d.f.) is a parameter based on the sample size that is used to determine the value of the t statistic.
Degrees of freedom tell how many observations are used to calculate s, less the number of intermediate estimates used in the calculation.
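A minimal sketch of where the "n - 1" comes from, using four made-up scores (the data are hypothetical). Calculating s requires one intermediate estimate, the sample mean, and once the mean is fixed the deviations must sum to zero, so only n - 1 of them are free to vary.

```python
import statistics

scores = [2, 4, 6, 8]                      # hypothetical data, n = 4
mean = statistics.mean(scores)             # the one intermediate estimate
deviations = [x - mean for x in scores]    # these always sum to zero
sum_of_squares = sum(d * d for d in deviations)

df = len(scores) - 1                       # observations minus intermediate estimates
s = (sum_of_squares / df) ** 0.5           # sample standard deviation

print(f"sum of deviations = {sum(deviations)}")
print(f"df = {df}, s = {s:.3f}")
print(f"statistics.stdev gives the same value: {statistics.stdev(scores):.3f}")
```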

Five steps to hypothesis testing
Step 1: Identify the research problem (hypothesis)
Describe the null and alternative hypotheses
Step 2: Decision rule
• Alpha level? (α = .05 or .01)
• One or two tailed test?
• Balance between Type I versus Type II error
• Critical statistic (e.g. z or t or F or r) value?
Step 3: Calculations
Step 4: Make decision whether or not to reject null hypothesis
If observed z (or t) is bigger than critical z (or t) then reject null
Step 5: Conclusion - tie findings back in to research problem

Hypothesis testing: one sample t-test
Is the mean of my observed sample consistent with the known population mean or did it come from some other distribution?
We are given the following problem: 800 students took a chemistry exam. Accidentally, 25 students got an additional ten minutes. Did this extra time make a significant difference in the scores? The average number correct by the large class was 74. The scores for the sample of 25 were:
76, 72, 78, 80, 73
70, 81, 75, 79, 76
77, 79, 81, 74, 62
95, 81, 69, 84, 76
75, 77, 74, 72, 75
Please note: In this example we are comparing our sample mean with the population mean (one-sample t-test)

Hypothesis testing
Step 1: Identify the research problem / hypothesis
Did the extra time given to this sample of students affect their chemistry test scores?
Describe the null and alternative hypotheses
Ho: µ = 74
H1: µ ≠ 74
One tail or two tail test?

Hypothesis testing
Step 2: Decision rule
α = .05
n = 25
Degrees of freedom (df) = (n - 1) = (25 - 1) = 24
Two tail test

Two tail test, α = .05, df = 24
Critical t(24) = 2.064

Hypothesis testing
Step 3: Calculations
µ = 74, N = 25
  x    x - x̄                    (x - x̄)²
 76    76 - 76.44 =  -0.44       0.1936
 72    72 - 76.44 =  -4.44      19.7136
 78    78 - 76.44 =  +1.56       2.4336
 80    80 - 76.44 =  +3.56      12.6736
 73    73 - 76.44 =  -3.44      11.8336
 70    70 - 76.44 =  -6.44      41.4736
 81    81 - 76.44 =  +4.56      20.7936
 75    75 - 76.44 =  -1.44       2.0736
 79    79 - 76.44 =  +2.56       6.5536
 76    76 - 76.44 =  -0.44       0.1936
 77    77 - 76.44 =  +0.56       0.3136
 79    79 - 76.44 =  +2.56       6.5536
 81    81 - 76.44 =  +4.56      20.7936
 74    74 - 76.44 =  -2.44       5.9536
 62    62 - 76.44 = -14.44     208.5136
 95    95 - 76.44 = +18.56     344.4736
 81    81 - 76.44 =  +4.56      20.7936
 69    69 - 76.44 =  -7.44      55.3536
 84    84 - 76.44 =  +7.56      57.1536
 76    76 - 76.44 =  -0.44       0.1936
 75    75 - 76.44 =  -1.44       2.0736
 77    77 - 76.44 =  +0.56       0.3136
 74    74 - 76.44 =  -2.44       5.9536
 72    72 - 76.44 =  -4.44      19.7136
 75    75 - 76.44 =  -1.44       2.0736
Σx = 1911    Σ(x - x̄) = 0    Σ(x - x̄)² = 868.16
x̄ = Σx / N = 1911 / 25 = 76.44
s = √(Σ(x - x̄)² / (N - 1)) = √(868.16 / 24) = 6.01

Hypothesis testing
Step 3: Calculations
µ = 74, N = 25, x̄ = 76.44, s = 6.01
t = (x̄ - µ) / (s / √N) = (76.44 - 74) / (6.01 / √25) = 2.44 / 1.20 = 2.03
(Figure: t curve with critical t marked in both tails)
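The Step 3 arithmetic can be reproduced in a few lines of Python. This is a sketch using the standard library's `statistics` module; the 25 scores and µ = 74 come from the problem statement, while the variable names are simply illustrative.

```python
import math
import statistics

# The 25 scores from the chemistry exam example
scores = [76, 72, 78, 80, 73, 70, 81, 75, 79, 76, 77, 79, 81, 74, 62,
          95, 81, 69, 84, 76, 75, 77, 74, 72, 75]
mu = 74  # mean of the large class (population mean)

n = len(scores)
x_bar = statistics.mean(scores)        # sample mean
s = statistics.stdev(scores)           # sample standard deviation (divides by n - 1)
standard_error = s / math.sqrt(n)      # denominator of the t ratio
t_observed = (x_bar - mu) / standard_error

print(f"N = {n}, sum = {sum(scores)}, mean = {x_bar:.2f}")
print(f"s = {s:.2f}, standard error = {standard_error:.2f}")
print(f"observed t({n - 1}) = {t_observed:.2f}")
```

`statistics.stdev` divides by n - 1, matching the slide's use of 24 in the denominator; `statistics.pstdev` would divide by n and give a slightly smaller value.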

Hypothesis testing
Step 4: Make decision whether or not to reject null hypothesis
Observed t = 2.03
Critical t = 2.064
2.03 is not farther out on the curve than 2.064, so we do not reject the null hypothesis
Step 5: Conclusion: The extra time did not have a significant effect on the scores

What if we had chosen a one-tail test?
Step 1: Identify the research problem
Did the extra time given to this sample of students increase their chemistry test scores?
Describe the null and alternative hypotheses
Ho: µ ≤ 74
H1: µ > 74
One tail or two tail test? Prediction is uni-directional
Step 2: Decision rule
α = .05
Degrees of freedom (df) = (n - 1) = (25 - 1) = 24
α is all at one end so “critical t” changes
Critical t(24) = ?????

One tail test, α = .05, df = 24
Critical t(24) = 1.711

What if we had chosen a one-tail test?
Step 1: Identify the research problem
Did the extra time given to this sample of students increase their chemistry test scores?
Describe the null and alternative hypotheses
Ho: µ ≤ 74
H1: µ > 74
One tail or two tail test?
Step 2: Decision rule
α = .05
Degrees of freedom (df) = (n - 1) = (25 - 1) = 24
Critical t(24) = 1.711

Calculations (exactly the same as the two-tail test)
A one-tailed test has no effect on the calculations stage
Step 3: Calculations
µ = 74, N = 25
The x, (x - x̄), and (x - x̄)² columns are identical to those in the two-tail calculation
Σx = 1911    Σ(x - x̄) = 0    Σ(x - x̄)² = 868.16
x̄ = Σx / N = 1911 / 25 = 76.44
s = √(Σ(x - x̄)² / (N - 1)) = √(868.16 / 24) = 6.01

Calculations (exactly the same as the two-tail test)
A one-tailed test has no effect on the calculations stage
Step 3: Calculations
µ = 74, N = 25, x̄ = 76.44, s = 6.01
t = (x̄ - µ) / (s / √N) = (76.44 - 74) / (6.01 / √25) = 2.44 / 1.20 = 2.03

Hypothesis testing
Step 3: Calculations
µ = 74, N = 25, x̄ = 76.44, s = 6.01, t(24) = 2.03
Step 4: Make decision whether or not to reject null hypothesis
Observed t = 2.03
Critical t(24) = 1.711
2.03 is farther out on the curve than 1.711, so we do reject the null hypothesis
Step 5: Conclusion: The extra time did have a significant effect on the scores
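The Step 4 decision rule can be written as a small function, which makes it easy to see why the same observed t leads to different decisions. This is only a sketch: the function name and the `tail` labels are assumptions, and the critical values are the table values quoted in the slides (2.064 for two tails, 1.711 for one tail, df = 24, α = .05).

```python
def reject_null(t_observed, t_critical, tail="two"):
    """Return True if the observed t falls in the region of rejection.

    t_critical is the positive table value; tail is 'two', 'upper', or 'lower'.
    """
    if tail == "two":
        return abs(t_observed) > t_critical   # farther out in either tail
    if tail == "upper":
        return t_observed > t_critical        # H1 predicts an increase
    if tail == "lower":
        return t_observed < -t_critical       # H1 predicts a decrease
    raise ValueError("tail must be 'two', 'upper', or 'lower'")


t_observed = 2.03  # chemistry exam example, df = 24

print("two-tail test, reject null?", reject_null(t_observed, 2.064, tail="two"))
print("one-tail test, reject null?", reject_null(t_observed, 1.711, tail="upper"))
```

The calculations are identical in both cases; only the critical value changes, because a one-tail test puts all of α at one end of the curve.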


Hypothesis testing
Is this a single sample or two sample test? Is it a z or a t test or an ANOVA? Is it a one-tail test or a two tail test?
“As the new chief of police, I am going to reduce response times for traffic accidents. Before I started the average response time was 30 minutes”
Step 1: Identify the research problem
Did the sheriff keep her promise to reduce response times to less than 30 minutes?
Step 2: Describe the null and alternative hypotheses
Ho: The response times are not less than 30 minutes
H1: The response times are less than 30 minutes
• One tail test
• n = 10 (df = 9)
• Alpha = .05
• Decision rule: critical t = 1.83 (the prediction is a decrease, so the region of rejection is below -1.83)

Step 3: Calculations:
• Average time for response before: 30 minutes
• Average time for response after: 24 minutes
• Observed t = -1.71
Step 4: Make decision whether or not to reject null hypothesis
Observed t = -1.71
Critical t = -1.83
-1.71 is not farther out on the curve than -1.83, so we do not reject the null hypothesis
Step 5: Conclusion: There appears to be no significant difference between the sheriff’s times and 30 minutes
Or: The new response times are not significantly less than 30 minutes, so she did not keep her promise

Hypothesis testing: Did the sheriff keep her promise to reduce response times to less than 30 minutes?
• Start summary with two means (based on DV) for two levels of the IV (notice we are comparing a sample mean with a population mean: single sample t-test)
• Describe type of test (t-test versus ANOVA) with brief overview of results
• Finish with statistical summary: t(9) = -1.71; n.s.
• Or, if it had been different results that *were* significant: t(9) = -5.71; p < 0.05
Example: The mean response time following the sheriff’s new plan was 24 minutes, while the mean response time prior to the new plan was 30 minutes. A t-test was completed and there appears to be no significant difference in the response time following the implementation of the new plan, t(9) = -1.71; n.s.
In the statistical summary, “t(9)” gives the type of test with degrees of freedom, and “-1.71” is the value of the observed statistic
n.s. = “not significant”; p < 0.05 = “significant”
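The statistical summary follows a fixed pattern (type of test with degrees of freedom, observed value, then “n.s.” or the p level), so it can be generated mechanically. The helper below is a hypothetical illustration, not part of any statistics package; the lower-tail comparison against -1.83 uses the decision rule from the sheriff example.

```python
def format_t_result(df, t_observed, significant, alpha=0.05):
    """Build a summary string such as 't(9) = -1.71; n.s.'."""
    verdict = f"p < {alpha}" if significant else "n.s."
    return f"t({df}) = {t_observed:.2f}; {verdict}"


critical_t = -1.83  # one-tail (lower) critical value for df = 9, alpha = .05

# Actual result from the example, and the hypothetical significant result
for t_observed in (-1.71, -5.71):
    significant = t_observed < critical_t  # farther out in the lower tail?
    print(format_t_result(9, t_observed, significant))
```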

Thank you! See you next time!!
