
  1. Chapter 5 Inferences Regarding Population Central Values

  2. Inferential Methods for Parameters • Parameter: Numeric Description of a Population • Statistic: Numeric Description of a Sample • Statistical Inference: Use of observed statistics to make statements regarding parameters • Estimation: Predicting the unknown parameter based on sample data. Can be either a single number (point estimate) or a range (interval estimate) • Testing: Using sample data to see whether we can rule out specific values of an unknown parameter with a certain level of confidence

  3. Estimating with Confidence • Goal: Estimate a population mean based on sample mean • Unknown: Parameter (μ) • Known: Approximate Sampling Distribution of Statistic • Recall: For a random variable that is normally distributed, the probability that it will fall within 2 standard deviations of its mean is approximately 0.95

  4. Estimating with Confidence • Although the parameter is unknown, it’s highly likely that our sample mean (estimate) will lie within 2 standard deviations (aka standard errors) of the population mean (parameter) • Margin of Error: Measure of the upper bound in sampling error with a fixed level (we will typically use 95%) of confidence. That will correspond to 2 standard errors: E = 2(σ/√n)

  5. Confidence Interval for a Mean μ • Confidence Coefficient (1-α): Probability (based on repeated samples and construction of intervals) that a confidence interval will contain the true mean μ • Common choices of 1-α and resulting intervals: 90%: ȳ ± 1.645(σ/√n), 95%: ȳ ± 1.96(σ/√n), 99%: ȳ ± 2.576(σ/√n)
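
A minimal Python sketch of this interval, using scipy.stats for the normal critical value. The sample mean, σ, and n below are made-up illustration values, not from any dataset in the slides:

```python
import math
from scipy.stats import norm

def z_confidence_interval(ybar, sigma, n, conf=0.95):
    """(1-alpha) CI for the mean when sigma is known: ybar +/- z_{alpha/2} * sigma/sqrt(n)."""
    z = norm.ppf(1 - (1 - conf) / 2)        # z_{alpha/2}
    moe = z * sigma / math.sqrt(n)          # margin of error
    return ybar - moe, ybar + moe

# Hypothetical numbers: ybar = 42.0, sigma = 8, n = 70
for conf in (0.90, 0.95, 0.99):
    lo, hi = z_confidence_interval(42.0, 8, 70, conf)
    print(f"{conf:.0%} CI: ({lo:.2f}, {hi:.2f})")
```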

  6. [Figure: standard normal curve centered at 0 with central area 1-α]

  7. [Figure: sampling distribution of the sample mean centered at μ with central area 1-α]

  8. Philadelphia Monthly Rainfall (1825-1869)

  9. 4 Random Samples of Size n=20, 95% CI’s

  10. Factors Affecting Confidence Interval Width • Goal: Have precise (narrow) confidence intervals • Confidence Level (1-α): Increasing 1-α increases the probability an interval contains the parameter, which widens the confidence interval. Reducing 1-α will shorten the interval (at a cost in confidence) • Sample size (n): Increasing n decreases standard error of estimate, margin of error, and width of interval (Quadrupling n cuts width in half) • Standard Deviation (σ): The more variable the individual measurements, the wider the interval. Potential ways to reduce σ are to focus on a more precise target population or use a more precise measuring instrument. Often nothing can be done, as nature determines σ
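
To see the sample-size effect numerically, the short sketch below (assuming σ = 8 and a 95% level, purely for illustration) shows the interval width halving each time n is quadrupled:

```python
import math
from scipy.stats import norm

sigma, conf = 8.0, 0.95
z = norm.ppf(1 - (1 - conf) / 2)
for n in (20, 80, 320):                      # each step quadruples n
    width = 2 * z * sigma / math.sqrt(n)     # full interval width
    print(f"n = {n:3d}: width = {width:.3f}")
# Width is cut in half each time n is quadrupled.
```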

  11. Precautions • Data should be a simple random sample from the population (or at least can be treated as independent observations) • More complex sampling designs have adjustments made to formulas (see texts such as Elementary Survey Sampling by Scheaffer, Mendenhall, Ott) • Biased sampling designs give meaningless results • Small sample sizes from nonnormal distributions will have coverage probabilities typically below the nominal level (1-α) • Typically σ is unknown. Replacing it with the sample standard deviation s works as a good approximation in large samples

  12. Selecting the Sample Size • Before collecting sample data, usually have a goal for how large the margin of error should be to have a useful estimate of the unknown parameter (particularly when comparing two populations) • Let E be the desired level of the margin of error and σ be the standard deviation of the population of measurements (typically will be unknown and must be estimated based on previous research or a pilot study) • The sample size giving this margin of error is: n = (z_{α/2}σ/E)², rounded up to the next integer
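
A sketch of this calculation in Python; the function name n_for_margin and the example values E = 2, σ = 8 are illustrative choices, not from the text:

```python
import math
from scipy.stats import norm

def n_for_margin(E, sigma, conf=0.95):
    """Smallest n with margin of error <= E: n = ceil((z_{alpha/2} * sigma / E)^2)."""
    z = norm.ppf(1 - (1 - conf) / 2)
    return math.ceil((z * sigma / E) ** 2)

print(n_for_margin(E=2.0, sigma=8.0))   # 62 subjects for 95% confidence
```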

  13. Hypothesis Tests • Method of using sample (observed) data to challenge a hypothesis regarding a state of nature (represented as particular parameter value(s)) • Begin by stating a research hypothesis that challenges a statement of “status quo” (or equality of 2 populations) • State the current state or “status quo” as a statement regarding population parameter(s) • Obtain sample data and see to what extent it agrees/disagrees with the “status quo” • Conclude that the “status quo” is not true if observed data are highly unlikely (low probability) if it were true

  14. Elements of a Hypothesis Test (I) • Null hypothesis (H0): Statement or theory being tested. Stated in terms of parameter(s) and contains an equality. Test is set up under the assumption of its truth. • Alternative Hypothesis (Ha): Statement contradicting H0. Stated in terms of parameter(s) and contains an inequality. Will only be accepted if strong evidence refutes H0 based on sample data. May be 1-sided or 2-sided, depending on theory being tested. • Test Statistic (T.S.): Quantity measuring discrepancy between sample statistic (estimate) and parameter value under H0 • Rejection Region (R.R.): Values of test statistic for which we reject H0 in favor of Ha • P-value: Probability (assuming H0 true) that we would observe sample data (test statistic) this extreme or more extreme in favor of the alternative hypothesis (Ha)

  15. Example: Interference Effect • Does the way items are presented affect task time? • Subjects shown list of color names in 2 colors: different/black • yi is the difference in times to read lists for subject i: diff-blk • H0: No interference effect: mean difference is 0 (μ = 0) • Ha: Interference effect exists: mean difference > 0 (μ > 0) • Assume standard deviation in differences is σ = 8 (unrealistic*) • Experiment to be based on n=70 subjects How likely to observe sample mean difference ≥ 2.39 if μ = 0?
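
Under the stated assumptions (μ = 0, σ = 8, n = 70), this probability can be sketched in Python as follows; 2.39 is the observed sample mean difference from the slide:

```python
import math
from scipy.stats import norm

mu0, sigma, n, ybar = 0.0, 8.0, 70, 2.39
se = sigma / math.sqrt(n)                 # standard error of the sample mean
z_obs = (ybar - mu0) / se
p_value = norm.sf(z_obs)                  # upper-tail area: P(Z >= z_obs)
print(f"z = {z_obs:.2f}, one-sided P-value = {p_value:.4f}")   # z ~ 2.50, P ~ 0.0062
```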

  16. [Figure: null sampling distribution of the sample mean centered at 0, with the P-value shaded as the area above 2.39]

  17. Elements of a Hypothesis Test (II) • Type I Error: Test resulting in rejection of H0 in favor of Ha when H0 is in fact true • P(Type I error) = α (typically .10, .05, or .01) • Type II Error: Test resulting in failure to reject H0 in favor of Ha when in fact Ha is true (H0 is false) • P(Type II error) = β (depends on true parameter value) • 1-Tailed Test: Test where the alternative hypothesis states specifically that the parameter is strictly above (below) the null value • 2-Tailed Test: Test where the alternative hypothesis is that the parameter is not equal to null value (simultaneously tests “greater than” and “less than”)

  18. Test Statistic • Parameter: Population mean (μ), which under H0 equals μ0 • Statistic (Estimator): Sample mean ȳ obtained from the sample measurements • Standard Error of Estimator: σ/√n • Sampling Distribution of Estimator: • Normal if shape of distribution of individual measurements is normal • Approximately normal regardless of shape for large samples • Test Statistic: z_obs = (ȳ - μ0)/(σ/√n) (labeled simply as z in text) Note: Typically σ is unknown and is replaced by s in large samples

  19. Decision Rules and Rejection Regions • Once a significance level (α) has been chosen, a decision rule can be stated, based on a critical value: • 2-sided tests: H0: μ = μ0 vs Ha: μ ≠ μ0 • If test statistic (z_obs) > z_{α/2}: Reject H0 and conclude μ > μ0 • If test statistic (z_obs) < -z_{α/2}: Reject H0 and conclude μ < μ0 • If -z_{α/2} ≤ z_obs ≤ z_{α/2}: Do not reject H0: μ = μ0 • 1-sided tests (Upper Tail): H0: μ ≤ μ0 vs Ha: μ > μ0 • If test statistic (z_obs) > z_α: Reject H0 and conclude μ > μ0 • If z_obs ≤ z_α: Do not reject H0: μ ≤ μ0 • 1-sided tests (Lower Tail): H0: μ ≥ μ0 vs Ha: μ < μ0 • If test statistic (z_obs) < -z_α: Reject H0 and conclude μ < μ0 • If z_obs ≥ -z_α: Do not reject H0: μ ≥ μ0

  20. Computing the P-Value • 2-sided Tests: How likely is it to observe a sample mean as far or farther from the value of the parameter under the null hypothesis? (H0: μ = μ0 vs Ha: μ ≠ μ0) After obtaining the sample data, compute the mean and convert it to a z-score (z_obs), then find the area above |z_obs| and below -|z_obs| from the standard normal (z) table • 1-sided Tests: Obtain the area above z_obs for upper tail tests (Ha: μ > μ0) or below z_obs for lower tail tests (Ha: μ < μ0)
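
A small helper sketching these three P-value computations with scipy.stats; the function name and the alternative labels are illustrative, not from the text:

```python
from scipy.stats import norm

def z_p_value(z_obs, alternative="two-sided"):
    """P-value for a large-sample z test of H0: mu = mu0."""
    if alternative == "two-sided":
        return 2 * norm.sf(abs(z_obs))    # area above |z_obs| plus area below -|z_obs|
    if alternative == "greater":
        return norm.sf(z_obs)             # upper-tail area
    return norm.cdf(z_obs)                # lower-tail area

print(z_p_value(2.50))             # ~0.0124
print(z_p_value(2.50, "greater"))  # ~0.0062
```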

  21. Interference Effect (1-sided Test) • Testing whether population mean time to read list of colors is higher when color is written in different color • Data: yi: difference score for subject i (Different-Black) • Null hypothesis (H0): No interference effect (H0: μ ≤ 0) • Alternative hypothesis (Ha): Interference effect (Ha: μ > 0) • n = 70 subjects in experiment, a reasonably large sample. Conclude there is evidence of an interference effect (μ > 0)

  22. Interference Effect (2-sided Test) • Testing whether population mean time to read list of colors is affected (higher or lower) when color is written in different color • Data: yi: difference score for subject i (Different-Black) • Null hypothesis (H0): No interference effect (H0: μ = 0) • Alternative hypothesis (Ha): Interference effect (+ or -) (Ha: μ ≠ 0) Again, evidence of an interference effect (μ > 0)

  23. Equivalence of 2-sided Tests and CI’s • For a given α, a 2-sided test conducted at significance level α will give equivalent results to a (1-α) level confidence interval: • If entire interval > μ0: P-value < α, z_obs > z_{α/2} (conclude μ > μ0) • If entire interval < μ0: P-value < α, z_obs < -z_{α/2} (conclude μ < μ0) • If interval contains μ0: P-value > α, -z_{α/2} < z_obs < z_{α/2} (don’t conclude μ ≠ μ0) • The confidence interval is the set of parameter values for which we would fail to reject the null hypothesis (based on a 2-sided test)

  24. Power of a Test • Power - Probability a test rejects H0 (depends on μ) • H0 True: Power = P(Type I error) = α • H0 False: Power = 1-P(Type II error) = 1-β • Example (Using context of interference data): • H0: μ = 0 HA: μ > 0 • σ² = 64, n = 16 • Decision Rule: Reject H0 (at α = 0.05 significance level) if: ȳ > z_{.05}(σ/√n) = 1.645(8/4) = 3.29

  25. Power of a Test • Now suppose in reality that μ = 3.0 (HA is true) • Power now refers to the probability we (correctly) reject the null hypothesis. Note that the sampling distribution of the sample mean is approximately normal, with mean 3.0 and standard deviation (standard error) 8/√16 = 2.0. • Decision Rule (from last slide): Conclude population mean interference effect is positive (greater than 0) if the sample mean difference score is above 3.29 • Power for this case can be computed as: P(ȳ > 3.29 | μ = 3.0) = P(Z > (3.29-3.0)/2.0) = P(Z > 0.145) ≈ .4424
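
A sketch reproducing this power computation in Python under the slide's assumptions (μ0 = 0, true μ = 3, σ = 8, n = 16, α = .05):

```python
import math
from scipy.stats import norm

mu0, mu_true, sigma, n, alpha = 0.0, 3.0, 8.0, 16, 0.05
se = sigma / math.sqrt(n)                       # standard error = 2.0
cutoff = mu0 + norm.ppf(1 - alpha) * se         # reject H0 if ybar > 3.29
power = norm.sf((cutoff - mu_true) / se)        # P(ybar > cutoff | mu = 3.0)
print(f"cutoff = {cutoff:.2f}, power = {power:.4f}")   # ~0.4424
```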

  26. Power of a Test • All else being equal: • As sample size increases, power increases • As population variance decreases, power increases • As the true mean gets further from μ0, power increases

  27. [Figure: sampling distributions under H0 and HA with the rejection cutoff at 3.29; under H0 the areas are .95 (fail to reject) and .05 (reject); under HA they are .5576 (fail to reject) and .4424 (reject, the power)]

  28. Power Curves for sample sizes of 16, 32, 64, 80 and varying true values μ from 0 to 5 with σ = 8. • For given μ, power increases with sample size • For given sample size, power increases with μ
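
The curves themselves were plotted on the slide; a rough tabular sketch of the same power values (upper-tail z test with σ = 8 and α = .05 assumed) can be generated as follows:

```python
import math
from scipy.stats import norm

sigma, alpha = 8.0, 0.05
for n in (16, 32, 64, 80):
    se = sigma / math.sqrt(n)
    cutoff = norm.ppf(1 - alpha) * se          # reject H0: mu <= 0 if ybar > cutoff
    powers = [norm.sf((cutoff - mu) / se) for mu in range(6)]   # mu = 0..5
    print(f"n={n:2d}:", " ".join(f"{p:.3f}" for p in powers))
```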

  29. Sample Size Calculations for Fixed Power • Goal - Choose sample size to have a favorable chance of detecting an important difference from μ0 in the 2-sided test: H0: μ = μ0 vs Ha: μ ≠ μ0 • Step 1 - Define an important difference to be detected (Δ): • Case 1: σ approximated from prior experience or pilot study - difference can be stated in units of the data • Case 2: σ unknown - difference must be stated in units of standard deviations of the data • Step 2 - Choose the desired power to detect the important difference (1-β, typically at least .80). For a 2-sided test: n = ((z_{α/2} + z_β)σ/Δ)², rounded up

  30. Example - Interference Data • 2-Sided Test: H0: μ = 0 vs Ha: μ ≠ 0 • Set α = P(Type I Error) = 0.05 • Choose important difference of |μ-μ0| = Δ = 2.0 • Choose Power = P(Reject H0 | Δ = 2.0) = .90 • Set β = P(Type II Error) = 1-Power = 1-.90 = .10 • From study, we know σ ≈ 8 • n = ((1.96 + 1.28)(8)/2)² ≈ (12.97)² ≈ 168.2 Would need 169 subjects to have a .90 probability of detecting the effect
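
A sketch of the calculation; the function name n_for_power is an illustrative choice:

```python
import math
from scipy.stats import norm

def n_for_power(delta, sigma, alpha=0.05, power=0.90):
    """2-sided test sample size: n = ceil(((z_{alpha/2} + z_beta) * sigma / delta)^2)."""
    z_a = norm.ppf(1 - alpha / 2)         # z_{alpha/2}
    z_b = norm.ppf(power)                 # z_beta, since beta = 1 - power
    return math.ceil(((z_a + z_b) * sigma / delta) ** 2)

print(n_for_power(delta=2.0, sigma=8.0))   # 169
```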

  31. Potential for Abuse of Tests • Should choose a significance (α) level in advance and report the test conclusion (significant/nonsignificant) as well as the P-value. A significance level of 0.05 is widely used in the academic literature • Very large sample sizes can detect very small differences for a parameter value. A clinically meaningful effect should be determined, and a confidence interval reported when possible • A nonsignificant test result does not imply no effect (that H0 is true) • Many studies test many variables simultaneously. This can increase overall Type I error rates

  32. Family of t-distributions • Symmetric, mound-shaped, centered at 0 (like the standard normal (z) distribution) • Indexed by degrees of freedom (df), the number of independent observations (deviations) comprising the estimated standard deviation. For one sample problems df = n-1 • Have heavier tails (more probability over extreme ranges) than the z-distribution • Converge to the z-distribution as df gets large • Tables of critical values for certain upper tail probabilities are available (Table 3, p. 679)
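
A quick numerical illustration of the convergence, using scipy.stats (the choice of the upper-tail probability .025 is arbitrary):

```python
from scipy.stats import norm, t

# Upper-tail .025 critical values: t_{.025, df} approaches z_{.025} = 1.96 as df grows
for df in (5, 10, 30, 100, 1000):
    print(f"df = {df:4d}: t = {t.ppf(0.975, df):.4f}")
print(f"         z = {norm.ppf(0.975):.4f}")
```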

  33. Inference for Population Mean • Practical Problem: Sample mean has sampling distribution that is Normal with mean μ and standard deviation σ/√n (when the data are normal, and approximately so for large samples). σ is unknown. • Have an estimate of σ, namely s, obtained from sample data. Estimated standard error of the sample mean is: SE(ȳ) = s/√n When the sample is an SRS from N(μ, σ), the t-statistic (same as z, but with estimated standard deviation) t = (ȳ - μ)/(s/√n) is distributed t with n-1 degrees of freedom

  34. [Table: t-distribution critical values, indexed by degrees of freedom and upper-tail probability]

  35. One-Sample Confidence Interval for μ • SRS from a population with mean μ is obtained • Sample mean and sample standard deviation are obtained • Degrees of freedom df = n-1 and confidence level (1-α) are selected • Level (1-α) confidence interval of form: ȳ ± t_{α/2,n-1}(s/√n) Procedure is theoretically derived based on normally distributed data, but has been found to work well regardless for large n
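
A minimal sketch of this interval in Python; the difference scores in the example list are hypothetical, not the interference data:

```python
import math
from scipy.stats import t

def t_confidence_interval(y, conf=0.95):
    """(1-alpha) CI for the mean with sigma unknown: ybar +/- t_{alpha/2, n-1} * s/sqrt(n)."""
    n = len(y)
    ybar = sum(y) / n
    s = math.sqrt(sum((v - ybar) ** 2 for v in y) / (n - 1))   # sample std deviation
    t_crit = t.ppf(1 - (1 - conf) / 2, df=n - 1)               # t_{alpha/2, n-1}
    moe = t_crit * s / math.sqrt(n)
    return ybar - moe, ybar + moe

# Hypothetical difference scores:
print(t_confidence_interval([2.1, -0.4, 3.5, 1.2, 0.8, 2.9, -1.1, 1.7]))
```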

  36. 1-Sample t-test (2-tailed alternative) • 2-sided Test: H0: μ = μ0 vs Ha: μ ≠ μ0 • Decision Rule (t_{α/2} such that P(t_{n-1} ≥ t_{α/2}) = α/2): • Conclude μ > μ0 if Test Statistic (t_obs) is greater than t_{α/2} • Conclude μ < μ0 if Test Statistic (t_obs) is less than -t_{α/2} • Do not conclude μ ≠ μ0 otherwise • P-value: 2P(t_{n-1} ≥ |t_obs|) • Test Statistic: t_obs = (ȳ - μ0)/(s/√n)
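
A sketch covering this test (and, via the alternative argument, the 1-tailed versions on the following slides); the helper name one_sample_t and the data are hypothetical. In practice, scipy.stats.ttest_1samp performs the 2-sided version directly:

```python
import math
from scipy.stats import t

def one_sample_t(y, mu0, alternative="two-sided"):
    """Returns (t_obs, p_value) for a test of H0: mu = mu0."""
    n = len(y)
    ybar = sum(y) / n
    s = math.sqrt(sum((v - ybar) ** 2 for v in y) / (n - 1))
    t_obs = (ybar - mu0) / (s / math.sqrt(n))
    if alternative == "two-sided":
        p = 2 * t.sf(abs(t_obs), df=n - 1)   # 2P(t_{n-1} >= |t_obs|)
    elif alternative == "greater":
        p = t.sf(t_obs, df=n - 1)            # P(t_{n-1} >= t_obs)
    else:
        p = t.cdf(t_obs, df=n - 1)           # P(t_{n-1} <= t_obs)
    return t_obs, p

# Hypothetical difference scores, testing H0: mu = 0 vs Ha: mu != 0
print(one_sample_t([2.1, -0.4, 3.5, 1.2, 0.8, 2.9, -1.1, 1.7], mu0=0))
```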

  37. [Figure: t distribution with the 2-tailed P-value shaded below -|t_obs| and above |t_obs|]

  38. 1-Sample t-test (1-tailed (upper) alternative) • 1-sided Test: H0: μ = μ0 vs Ha: μ > μ0 • Decision Rule (t_α such that P(t_{n-1} ≥ t_α) = α): • Conclude μ > μ0 if Test Statistic (t_obs) is greater than t_α • Do not conclude μ > μ0 otherwise • P-value: P(t_{n-1} ≥ t_obs) • Test Statistic: t_obs = (ȳ - μ0)/(s/√n)

  39. P-value (Upper Tail Test)

  40. 1-Sample t-test (1-tailed (lower) alternative) • 1-sided Test: H0: μ = μ0 vs Ha: μ < μ0 • Decision Rule (t_α obtained such that P(t_{n-1} ≥ t_α) = α): • Conclude μ < μ0 if Test Statistic (t_obs) is less than -t_α • Do not conclude μ < μ0 otherwise • P-value: P(t_{n-1} ≤ t_obs) • Test Statistic: t_obs = (ȳ - μ0)/(s/√n)

  41. P-value (Lower Tail Test)

  42. Example: Mean Flight Time ATL/Honolulu • Scheduled flight time: 580 minutes • Sample: n = 31 flights 10/2004 (treating as SRS from all possible flights) • Test whether population mean flight time differs from scheduled time • H0: μ = 580 Ha: μ ≠ 580 • Critical value (2-sided test, α = 0.05, n-1 = 30 df): t_{.025} = 2.042 • Sample data, Test Statistic, P-value:

  43. Inference on a Population Median • Median: “Middle” of a distribution (50th Percentile) • Equal to Mean for symmetric distribution • Below Mean for Right-skewed distribution • Above Mean for Left-skewed distribution • Confidence Interval for Population Median: • Sort observations from smallest to largest (y_(1)...y_(n)) • Obtain Lower (L_{α/2}) and Upper (U_{α/2}) “Bounds of Ranks”: L_{α/2} = C_{α(2),n} + 1 and U_{α/2} = n - C_{α(2),n} • Small Samples: Obtain C_{α(2),n} from Table 5 (p. 682) • Large Samples: C_{α(2),n} ≈ n/2 - z_{α/2}(√n/2), rounded down

  44. Example - ATL/HNL Flight Times • n = 31 • Small-Sample: C_{.05(2),31} = 9, so the 95% CI runs from y_(10) to y_(22) • Large-Sample: C ≈ 31/2 - 1.96(√31/2) = 15.5 - 5.46 ≈ 10
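
A sketch of the large-sample rank computation; note that this approximation can differ by one from the exact table value (C = 9 for n = 31), so the ranks it returns need not match the table-based interval exactly:

```python
import math
from scipy.stats import norm

def median_ci_ranks(n, conf=0.95):
    """Large-sample rank bounds for a CI on the median (order-statistic method):
    C ~ n/2 - z_{alpha/2} * sqrt(n)/2 (rounded down); CI is (y_(C+1), y_(n-C))."""
    z = norm.ppf(1 - (1 - conf) / 2)
    C = int(n / 2 - z * math.sqrt(n) / 2)    # round down to an integer
    return C + 1, n - C                      # ranks of the lower and upper endpoints

print(median_ci_ranks(31))   # approximate ranks; the exact table (C = 9) gives (10, 22)
```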
