
Confidence Intervals


Presentation Transcript


  1. Confidence Intervals

  2. Confidence Intervals • Data can be described by point estimates • mean, standard deviation, etc. • Point estimates from a sample are not always equal to the population parameters • Data can also be described by interval estimates • an interval estimate shows the variability of the estimate • Using the standard error, we can gauge how much the estimate is likely to vary from the true value.

  3. Confidence Intervals • Interval estimates are called confidence intervals (CI). • A CI defines an upper limit and a lower limit associated with a known probability. • These limits are known as confidence limits. • The associated probability of the CI is most commonly 95%, but may be 99% or 90%.

  4. Confidence Intervals • Confidence limits set the boundaries that are likely to include the population mean. • Thus, we can conclude that in general, we are 95% confident that the true mean of the population is found within these limits.

  5. Standard Error • The standard error is defined as SE = s/√n • We expect the sample mean to be within one standard error of μ quite often. • SE is a measure of the precision of x̄ as an estimate of μ. • The smaller the SE, the more precise the estimate. • SE reflects the two factors that affect the precision of the measurement: n and the standard deviation.
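A minimal Python sketch of the definition above, using the sample standard deviation and sample size from the bone density example that appears later in the deck (the variable names are my own):

```python
import math

# Standard error of the mean: SE = s / sqrt(n)
s = 0.126   # sample standard deviation (from the later bone density example)
n = 94      # sample size
se = s / math.sqrt(n)
print(f"SE = {se:.4f}")  # ~0.0130; larger n or smaller s gives a more precise estimate of the mean
```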

  6. Standard Deviation vs Standard Error • Standard deviation describes the dispersion of the data. • The variability from one data point to the next • Standard error (SE) describes the uncertainty in the mean of the data that is a result of sampling error. • The variability associated with the sample mean

  7. Calculating Confidence Intervals • Recall that 95% of the area under a standard normal curve lies between z = −1.96 and z = +1.96. • [Figure: standard normal curve with the central 95% area shaded between −1.96 and 1.96]

  8. Calculating Confidence Intervals • The general formula is: • P = 0.95 • Lower limit = x̄ − 1.96(σ/√n) • Upper limit = x̄ + 1.96(σ/√n)
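A small sketch of the z-based interval above, assuming the population standard deviation is known; the function name and the numbers in the call are hypothetical, purely for illustration:

```python
import math

def normal_ci_95(xbar, sigma, n):
    """95% CI for the mean when the population standard deviation (sigma) is known."""
    se = sigma / math.sqrt(n)
    return xbar - 1.96 * se, xbar + 1.96 * se

# Hypothetical numbers, purely to exercise the formula:
lower, upper = normal_ci_95(xbar=50.0, sigma=10.0, n=100)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")  # (48.04, 51.96)
```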

  9. Calculating CI of two samples • We use the t-distribution. • The t distribution describes the distribution of the sample mean when the variance is also estimated from the sample data. • Thus, the formula for the CI in these cases is: x̄ ± t(α/2, n − 1) · (s/√n)

  10. Example: Problem • To assess the effectiveness of hormone replacement therapy on bone mineral density, 94 women between the ages of 45 and 64 were given estrogen medication. After taking the medication for 36 months, the bone mineral density was measured for each of the women in the study. The average density was 0.878 g/cm2 with a standard deviation of 0.126 g/cm2. Calculate a 95% CI for the mean bone mineral density of this population.

  11. Example: SE and t • Recall that SE is: SE = s/√n = 0.126/√94 ≈ 0.013 • t(α, df) = t(0.025, 93) = 1.990

  12. Example: Calculations • 95% CI = x̄ ± t(0.025, 93) × SE = 0.878 ± 1.990 × 0.013 = 0.878 ± 0.026 • Lower limit: 0.878 − 0.026 = 0.852 g/cm2 • Upper limit: 0.878 + 0.026 = 0.904 g/cm2

  13. Example: Conclusion • The 95% confidence limits are: • lower: 0.852 g/cm2; upper: 0.904 g/cm2 • We are 95% confident that the average bone density of all women age 45 to 64 who take this hormone replacement medication is between 0.852 g/cm2 and 0.904 g/cm2.
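The slides' arithmetic can be reproduced with SciPy's t distribution; this is a sketch, not the original computation, and small rounding differences are expected because the slide uses the table value t = 1.990:

```python
import math
from scipy import stats

# Bone mineral density example: n = 94, mean = 0.878 g/cm^2, sd = 0.126 g/cm^2
n, xbar, s = 94, 0.878, 0.126

se = s / math.sqrt(n)                    # standard error ~ 0.013
t_crit = stats.t.ppf(0.975, df=n - 1)    # two-sided 95% critical value, df = 93
lower, upper = xbar - t_crit * se, xbar + t_crit * se
print(f"t = {t_crit:.3f}, 95% CI: ({lower:.3f}, {upper:.3f}) g/cm^2")
# Roughly (0.852, 0.904), matching the slide; the slide's table value t = 1.990
# differs slightly from the exact quantile.
```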

  14. Example: Conclusions (cont’d) • For a 95% confidence interval: if we repeatedly drew samples from the population and computed an interval from each, about 95% of those intervals would contain the true population mean.

  15. Other Confidence Limits • For a 99% or 90% CI the calculations and interpretations are similar. • Which CI will give the widest interval, and which the narrowest? (A 99% CI is wider than a 95% CI; a 90% CI is narrower.) • CI can be established for any parameter • mean, proportion, relative risk, odds ratio, etc.

  16. Using CI to Test Hypotheses • Diastolic blood pressure of 12 people before and after administration of a new drug. • Paired t-test • Hypotheses: H0: μd ≥ 0; Ha: μd < 0 • x̄d = −3.1 • sd = 4.1

  17. Using CI to Test Hypotheses • 95% CI for μd: x̄d ± t(0.025, 11) × sd/√n = −3.1 ± 2.201 × (4.1/√12) = −3.1 ± 2.6 • Confidence limits: (−5.7, −0.5)

  18. Using CI to Test Hypotheses • Conclusion – since zero does not fall within the interval, we can conclude with 95% confidence that there is a significant decrease in blood pressure after taking the new drug. • If we did a paired t-test, the conclusions would be the same.
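A sketch of the interval behind this conclusion, computed from the summary statistics on slide 16 (assuming they are the mean and standard deviation of the 12 paired differences):

```python
import math
from scipy import stats

# Summary statistics for the paired blood pressure differences (slide 16):
# n = 12 subjects, mean difference = -3.1, sd of differences = 4.1
n, d_bar, s_d = 12, -3.1, 4.1

se = s_d / math.sqrt(n)
t_crit = stats.t.ppf(0.975, df=n - 1)    # t(0.025, 11) ~ 2.201
lower, upper = d_bar - t_crit * se, d_bar + t_crit * se
print(f"95% CI for the mean difference: ({lower:.2f}, {upper:.2f})")
# Zero is outside the interval, matching the slide's conclusion of a significant decrease.
```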

  19. Visual Representation of CI • [Figure: confidence intervals computed from many different samples (1–11) plotted against the true population mean (μ); most intervals contain μ, but a few do not.]

  20. ANOVA: Analysis of Variance Single Factor

  21. Objectives • Know the assumptions for an ANOVA • Know when ANOVA is used rather than a t-test • Set up ANOVA tables and understand the relationships between the values within the table • Compute the F-ratio and appropriate degrees of freedom • Know how and when to use a two factor ANOVA • Apply Tukey’s multiple comparison procedure

  22. ANOVA vs. t-test • A statistical method of comparing means of different groups. • A single factor ANOVA for two groups produces the same p-value as an independent t-test • The t-test is inappropriate for more than two groups – increases probability of a Type I error • Using a t-test to test the means of each pair leads to problems regarding the proper level of significance.

  23. ANOVA vs. t-test • ANOVA is not limited to two groups • Can appropriately handle comparisons of several means from several groups • Thus, ANOVA overcomes the difficulty of doing multiple t-tests • The sampling distribution used is the F distribution.

  24. ANOVA: assumptions • The observations are independent • one observation is not correlated with another observation. • The observations in each group are approximately normally distributed. • Variances of the various groups are homogeneous. • ANOVA is a robust test that is not overly sensitive to departures from normality and homogeneity, especially when sample sizes are large and nearly equal for each group.

  25. ANOVA: Characteristics • ANOVA analyzes the variance of the groups to evaluate differences in the means. • Within group • measures the variance of observations within each group • variance due to “chance” • Between groups • measures the variance between the groups • variance due to treatment or chance

  26. ANOVA: Characteristics • It can be shown that when the means of the groups are equal, the within-group and between-group variance estimates are about equal. • The F-statistic is the ratio of the between-group variance estimate to the within-group variance estimate.

  27. ANOVA: the F distribution • The ratio follows an F distribution. • The F statistic has two sets of degrees of freedom. • For between groups -- (I - 1); where I is the number of groups • For within groups -- I(J - 1); where J is the number of observations in each group

  28. ANOVA: single factor • Let I = the number of population samples (groups) • Let J = the number of observations in each sample • Thus the data consist of IJ observations • The overall or grand mean is the average of all IJ observations: x̿ = (1/IJ) Σi Σj xij

  29. ANOVA: single factor • Now it is necessary to compute the sums of squares for the treatment -- SSTr (between group), the error -- SSE (within group), and the total -- SST. • SSTr = J Σi (x̄i − x̿)², the sum of the squared deviations of the group means from the grand mean • SST = Σi Σj (xij − x̿)², the total sum of squares, which measures the amount of variation about the grand mean • With algebraic manipulation we find that: SST = SSTr + SSE

  30. ANOVA: sums of squares • When completing the ANOVA table, usually only SSTr and SST are calculated. SSE is found by SSE = SST − SSTr.

  31. ANOVA: mean sums of squares • After calculating the sums of squares, F is simply the ratio of the treatment mean square to the error mean square. • A mean square is the sum of squares divided by the appropriate degrees of freedom.

  32. ANOVA: single factor table
  Source of variation | Degrees of freedom | SS   | MS   | F
  Treatment           | I − 1              | SSTr | MSTr | MSTr / MSE
  Error               | I(J − 1)           | SSE  | MSE  |
  Total               | IJ − 1             | SST  |      |
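The quantities in this table can be computed directly from the formulas on the preceding slides. The helper below is a sketch assuming equal group sizes; the function name and the example data are hypothetical, not from the slides:

```python
import numpy as np

def one_way_anova(groups):
    """Single-factor ANOVA for I groups of J observations each; returns SSTr, SSE, SST, F."""
    data = np.asarray(groups, dtype=float)            # shape (I, J)
    I, J = data.shape
    grand_mean = data.mean()
    group_means = data.mean(axis=1)

    sstr = J * np.sum((group_means - grand_mean) ** 2)   # between-group sum of squares
    sst = np.sum((data - grand_mean) ** 2)               # total sum of squares
    sse = sst - sstr                                      # within-group sum of squares
    mstr, mse = sstr / (I - 1), sse / (I * (J - 1))       # mean squares
    return sstr, sse, sst, mstr / mse                     # F = MSTr / MSE

# Hypothetical data (3 groups of 5 observations), just to exercise the helper:
groups = [[23, 25, 21, 27, 24],
          [30, 33, 29, 35, 31],
          [22, 20, 25, 23, 21]]
print(one_way_anova(groups))
```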

  33. ANOVA: Example • An experiment was conducted to examine various modes of medication delivery. A total of 15 subjects diagnosed with the flu were enrolled and the length of time until alleviation of major symptoms was measured for three groups: • Group A received an inhaled version, • Group B received an injection, and • Group C received an oral dose.

  34. Single factor example • [Table: time to symptom alleviation for each subject in Groups A, B, and C, with the group means and the grand mean x̿]

  35. Single factor example: set up • H0: all three means are equal, i.e., μ1 = μ2 = μ3 • Ha: at least one mean is different • α = 0.05 • Critical value: F(α, df); given I − 1 = 2 and I(J − 1) = 12, F(0.05, 2, 12) = 3.89

  36. Single factor example: calculating sums of squares • SSTr = 2,190.3 • SST = 5,153.7 • SSE = SST − SSTr = 5,153.7 − 2,190.3 = 2,962.8

  37. Single factor example: completing the table
  Source of variation | Degrees of freedom | SS      | MS      | F
  Treatments          | 2                  | 2,190.3 | 1,095.5 | 4.44
  Error               | 12                 | 2,962.8 | 246.9   |
  Total               | 14                 | 5,153.7 |         |

  38. Single factor example: decision and conclusions • Compare Fstat to Fcrit: 4.44 > 3.89, therefore reject H0. • The evidence suggests that the time it takes to alleviate major flu symptoms differs significantly with the mode of medication delivery.
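As a cross-check of this decision step, the F statistic and critical value can be recomputed from the summary values in the table above (a sketch, not the slides' own work):

```python
from scipy import stats

# Summary values from the ANOVA table above
sstr, sse = 2190.3, 2962.8
df_tr, df_err = 2, 12

f_stat = (sstr / df_tr) / (sse / df_err)           # MSTr / MSE ~ 4.44
f_crit = stats.f.ppf(0.95, dfn=df_tr, dfd=df_err)  # F(0.05, 2, 12) ~ 3.89
print(f"F = {f_stat:.2f}, critical value = {f_crit:.2f}, reject H0: {f_stat > f_crit}")
```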

  39. Where is the Difference? • Recall the hypotheses of the ANOVA • H0 is that all the means are equal • Ha is that at least one is not. • If we fail to reject H0, the analysis is complete. • What does it mean when H0 is rejected? • at least one mean is different • Which μ's are different from one another? • With only two treatment levels, the answer is obvious. • With three or more treatment levels, further analysis is needed.

  40. Finding the difference • We must do a post hoc analysis • a test that is done after the ANOVA • The purpose is to determine the location of the difference. • Different post hoc tests are available and are discussed in the text. • These tests include Bonferroni, Scheffé, Student-Newman-Keuls, and Tukey’s HSD.

  41. Tukey’s HSD • w = Q(α, I, dferror) · √(MSE / J) • Where, α = significance level I = number of groups J = number of observations per treatment MSE = mean square error (or within-group MS)

  42. Using the Tukey’s HSD • All the information, except Q, needed to find w is located in the ANOVA table • Q is determined by using the studentized range distribution with α, I, and dfwithin. • Once w is determined, order all treatment level means in ascending order • Underline those values that differ by less than w. • Treatment means that do not share a common underline correspond to treatments that are significantly different.

  43. Example • Using the previous example, we now want to find which form(s) of medication really differ from the others. • To start, we will order the means.

  44. The ANOVA • Do these data indicate that the amount of time it takes a student to nod off depends on the statistical topic being studied? Since the computed F-statistic of 21.09 is greater than the critical value of F(0.05, 3, 16) = 3.24, we reject H0.

  45. Computing the Tukey’s • There are I = 3 treatments and the degrees of freedom for the error is 12; thus, from the table, Q(0.05, 3, 12) = 3.77. • Computing the Tukey value we get: w = 3.77 · √(246.9 / 5) ≈ 26.5
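A sketch of this computation; the values MSE = 246.9 and J = 5 (15 subjects in 3 equal groups) are taken from the earlier ANOVA example, and scipy.stats.studentized_range requires SciPy 1.7 or newer:

```python
import math
from scipy import stats

# Tukey's HSD for the flu medication example: I = 3 groups, J = 5 observations per
# group, MSE = 246.9 and error df = 12 from the ANOVA table.
I, J, mse, df_err = 3, 5, 246.9, 12

q = stats.studentized_range.ppf(0.95, I, df_err)   # Q(0.05, 3, 12) ~ 3.77
w = q * math.sqrt(mse / J)
print(f"Q = {q:.2f}, w = {w:.1f}")
# Pairs of treatment means differing by more than w are significantly different.
```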
