
Session 8

Session 8. Tests of Hypotheses. Learning Objectives. By the end of this session, you will be able to set up, conduct and interpret results from a test of hypothesis concerning a population mean


Presentation Transcript


  1. Session 8 Tests of Hypotheses

  2. Learning Objectives By the end of this session, you will be able to • set up, conduct and interpret results from a test of hypothesis concerning a population mean • explain how means from two populations may be compared, and state assumptions associated with the independent samples t-test • interpret computer output from one or two-sample t-tests, present and write up conclusions resulting from such tests • explain the difference between statistical significance and an important result

  3. An illustrative example • Farmers growing maize in a certain area were getting average yields of 2900 kg/ha. • A “new” Integrated Pest Management (IPM) approach was attempted with 16 farmers. • Objective: To determine if the new approach results in an increase in maize yields. • Yields from these 16 farmers (after using IPM) gave mean = 3454 kg/ha with standard deviation = 672 kg/ha, hence s.e. = 672/√16 = 168 kg/ha. • Can we determine whether IPM has really increased maize yields?

  4. Is the yield increase real? • In the above example, the sample mean of 3454 kg/ha is clearly greater than 2900 kg/ha. • But the question of interest is: “Does this result indicate a significant increase in yield, or might it just be a result of the usual random variation in yield?” • Hypothesis testing seeks to answer such questions by looking at the observed change relative to the “noise”, i.e. the standard error of the sample estimate.

  5. Null H0 & Alternative H1 Null hypothesis H0: μ = 2900, where μ is the true mean yield of farmers in the area using the new approach. The promoters of the new approach are confident that yields with the new approach cannot possibly decrease. Hence the above null hypothesis needs to be tested against the alternative hypothesis H1: μ > 2900

  6. Testing the hypothesis Compute the t test statistic t = (x̄ − μ)/(s/√n) = (3454 − 2900)/168 = 3.30, where x̄ is the sample mean and μ takes its value under H0. Under H0 this statistic follows a t-distribution with n − 1 = 15 degrees of freedom. Use values of the t-distribution to find the probability of getting a result that is as extreme as, or more extreme than, the one observed (3.30), given that H0 is true. The smaller this probability, the greater the evidence against the null hypothesis. This probability is called the p-value or significance level of the test.
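The calculation on this slide can be checked with a short script. This is only a sketch using the Python standard library: the helper names `t_pdf` and `upper_tail` are illustrative, and the tail probability is obtained by numerical integration (Simpson's rule) rather than by whatever method Stata uses.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail(t, df, steps=2000):
    """P(T >= t) for t >= 0: 0.5 minus Simpson's-rule integral of the density on [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

# Figures from the maize example: 16 farmers, mean 3454, s.d. 672, H0 mean 2900
n, xbar, s, mu0 = 16, 3454, 672, 2900
se = s / math.sqrt(n)                    # standard error of the sample mean
t_stat = (xbar - mu0) / se               # one-sample t statistic
p_one_sided = upper_tail(t_stat, n - 1)  # H1 is mu > 2900, so use the upper tail

print(f"s.e. = {se:.0f}, t = {t_stat:.2f}, one-sided p = {p_one_sided:.4f}")
```

Only the upper tail is used because the alternative hypothesis is one-sided (μ > 2900).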

  7. Analysis in Stata Type db ttesti or look for the One-sample mean comparison calculator on the menu

  8. Results t-value t-probabilities from formulae or table Result from the one-sided test done here

  9. Interpretation and conclusions It is clear from t-tables that the p-value is smaller than 0.01. Using statistical software, we get the exact p-value as 0.0024. This p-value is so small, there is sufficient evidence to reject H0. Conclusion: Use of the new IPM technology has led to an increase in maize yields (p-value=0.0024)

  10. An example: Comparing 2 means As part of a health survey, cholesterol levels of men in a small rural area were measured, including those working in agriculture and those employed in non-agricultural work. Aim: To see if mean cholesterol levels were different between the two groups.

  11. Summary statistics Begin with summarising each column of data. There appears to be a substantial difference between the two means. Our question of interest is: Is this difference showing a real effect, or could it merely be a chance occurrence?

  12. Setting up the hypotheses To answer the question, we set up: Null hypothesis H0: no difference between the two groups (in terms of mean response), i.e. μ1 = μ2. Alternative hypothesis H1: there is a difference, i.e. μ1 ≠ μ2. The resulting test will be two-sided since the alternative is “not equal to”.

  13. Test for comparing means • Use a two-sample (unpaired) t-test • - appropriate with 2 independent samples • Assumptions • - normal distributions for each sample • - constant variance (so test uses a pooled estimate of variance) • - observations are independent • Procedure • - assess how large the difference in means is, relative to the noise in this difference, i.e. the std. error of the difference.

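  14. Test Statistic The test statistic is: t = (x̄1 − x̄2) / √( s² (1/n1 + 1/n2) ), where s², the pooled estimate of variance, is given by s² = [ (n1 − 1) s1² + (n2 − 1) s2² ] / (n1 + n2 − 2). Here x̄i, si and ni are the mean, standard deviation and size of sample i.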

  15. Numerical Results The pooled estimate of variance is s² = 1279.5. Hence the t-statistic is t = 41.7/√(2 × 1279.5/10) = 2.61, based on 18 d.f. Comparing with tables of t18, this result is significant at the 2% level, so reject H0. Note: The exact p-value = 0.018
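As a rough check, the pooled two-sample calculation can be reproduced from summary statistics alone. The sketch below assumes the group sizes, means and standard deviations that appear in the `ttesti` command on slide 17. Because those standard deviations are rounded, the pooled variance comes out near 1281.5 rather than the 1279.5 quoted on the slide (which was presumably computed from the raw data), and the t-statistic differs slightly in the third significant figure.

```python
import math

# Summary statistics as given in the ttesti command on slide 17 (rounded values)
n1, mean1, sd1 = 10, 203.9, 33.9
n2, mean2, sd2 = 10, 162.2, 37.6

# Pooled variance: weighted average of the two sample variances
pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)

# Standard error of the difference in means under the equal-variance assumption
se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))

diff = mean1 - mean2
t_stat = diff / se_diff
df = n1 + n2 - 2

print(f"pooled variance = {pooled_var:.1f}")
print(f"difference = {diff:.1f}, s.e. = {se_diff:.2f}, t = {t_stat:.2f} on {df} d.f.")
```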

  16. Results and conclusions Difference of means: 41.7. Standard error of difference: 15.99. 95% confidence interval for difference in means: (8.09, 75.3). Conclusions: There is some evidence (p=0.018) that the mean cholesterol levels differ between those working in agriculture and others. The difference in means is 42 mg/dL with 95% confidence interval (8.1, 75.3).
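The confidence interval follows from the difference and its standard error. The sketch below assumes the tabulated two-sided 5% point of the t-distribution with 18 d.f. (2.101, taken from standard t-tables rather than from the slides); small discrepancies from the slide's interval are due to rounding.

```python
# Figures quoted on the slide
diff = 41.7        # difference in means
se_diff = 15.99    # standard error of the difference

# Assumption: two-sided 5% critical value of t with 18 d.f., read from t-tables
t_crit = 2.101

half_width = t_crit * se_diff
ci = (diff - half_width, diff + half_width)

print(f"95% CI for the difference: ({ci[0]:.1f}, {ci[1]:.1f})")
```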

  17. Analysis in Stata Input the data and do a t-test Or complete the dialogue as shown below Or type ttesti 10 203.9 33.9 10 162.2 37.6

  18. Results This was a 2-sided test

  19. General reporting of results Take care to report results according to the size of the p-value. For example: • Evidence of an effect is almost conclusive if p-value < 0.001, and could be said to be strong if p-value < 0.01. • If 0.01 < p-value < 0.05, results indicate some evidence of an effect. • If p-value > 0.05, but close to 0.05, it may indicate something is going on, but further confirmatory study is needed.
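These reporting bands can be written as a small helper. This is only an illustrative sketch: the function name `describe_evidence` is hypothetical, and because the slide does not define how close to 0.05 counts as "close", the final branch does not attempt to draw that line.

```python
def describe_evidence(p):
    """Map a p-value to the wording suggested on the slide."""
    if p < 0.001:
        return "almost conclusive evidence of an effect"
    if p < 0.01:
        return "strong evidence of an effect"
    if p < 0.05:
        return "some evidence of an effect"
    # The slide gives no cut-off for "close to 0.05", so none is imposed here
    return "insufficient evidence at the 5% level; if p is close to 0.05, a confirmatory study may be worthwhile"

# The two p-values reported earlier in the session
for p in (0.0024, 0.018):
    print(p, "->", describe_evidence(p))
```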

  20. Significance: further comments e.g. Farmers report that using a fungicide increased crop yields by 2.7 kg ha⁻¹, s.e.m. = 0.41. This gave a t-statistic of 2.7/0.41 = 6.6 (p-value < 0.001). Recall that the p-value is the probability of getting a result as extreme as, or more extreme than, the one observed when the null hypothesis is true, i.e. a yield increase this large would very rarely arise by chance if the fungicide had no effect!

  21. How important are sig. tests? In relation to the example on the previous slide, we may find one of the following situations for different crops. Mean yields with and without fungicide: 589.9 vs 587.2 → Not an important finding! 9.9 vs 7.2 → Very important finding! It is likely that in the first of these results, either too much replication or the incorrect level of replication had been used (e.g. plant level variation, rather than plot level variation, used to compare means).
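One way to see why the same 2.7 difference matters so differently is to express it relative to the yield without fungicide. The sketch below uses only the four means on the slide; the labels "first crop" and "second crop" are illustrative.

```python
# (mean with fungicide, mean without fungicide) for the two situations on the slide
crops = {"first crop": (589.9, 587.2), "second crop": (9.9, 7.2)}

results = {}
for name, (with_f, without_f) in crops.items():
    diff = with_f - without_f              # absolute increase (2.7 in both cases)
    rel_pct = 100 * diff / without_f       # increase as a % of the untreated mean
    results[name] = (diff, rel_pct)
    print(f"{name}: increase = {diff:.1f}, relative increase = {rel_pct:.1f}%")
```

The absolute increase is identical, but it is under half a percent of the untreated yield in the first case and over a third in the second.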

  22. What does non-significance tell us? e.g. There was insufficient evidence in the data to demonstrate that using a fungicide had any effect on plant yields (p=0.128). Mean yields with and without fungicide: 157.2 vs 89.9. This difference may be an important finding, but the statistical analysis was unable to pick up this difference as being statistically significant. HOW CAN THIS HAPPEN? Too small a sample size? High variability in the experimental material? One or two outliers? All sources of variability not identified?
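The first possible cause, too small a sample, can be illustrated with a hypothetical variation on the maize example from slide 3 (the fungicide data themselves are not given). The sketch holds the sample mean and standard deviation fixed and changes only the number of farmers; the sample sizes 4 and 8 are invented for illustration, and the one-sided 5% critical values are assumed from standard t-tables.

```python
import math

# Maize example figures (slide 3); only n is varied, hypothetically
xbar, s, mu0 = 3454, 672, 2900

# Assumption: one-sided 5% critical values of t for n - 1 d.f., from t-tables
crit_5pct_one_sided = {4: 2.353, 8: 1.895, 16: 1.753}

results = {}
for n, crit in crit_5pct_one_sided.items():
    t_stat = (xbar - mu0) / (s / math.sqrt(n))   # same difference, larger s.e. for small n
    results[n] = (t_stat, t_stat > crit)
    verdict = "significant" if t_stat > crit else "not significant"
    print(f"n = {n:2d}: t = {t_stat:.2f} vs critical {crit} -> {verdict} at 5%")
```

With the same observed increase of 554 kg/ha, the hypothetical sample of 4 farmers would not reach significance at the 5% level, while 8 or 16 farmers would.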

  23. Significance – Key Points • Statistical significance alone is not enough. Consider whether the result is also scientifically meaningful and important. • When a significant result is found, report the finding in terms of the corresponding estimates, their standard errors and C.I.s (as is done by Stata)
