
Today’s lesson Review of Examination 1



  1. Today’s lesson Review of Examination 1 • Basic summary statistics • Definition of alpha and beta • One sample test beta and sample size • Distribution of Sum of iid values • Two sample test Confidence interval for E(W-Y) • Computer output questions

  2. Basic summary statistics • Sample average well handled. • Sample variance needs some work. • P-value is crucial--don’t give away points.

  3. 2A. Find sample variance. • A random sample of four was taken from the random variable W. The four observed values were 280, 210, 200, and 190. The sample average was 220. What is the value of the unbiased estimate of the variance based on this sample?

  4. Solution to 2A. • Recognize that the problem has asked for the usual estimate of the variance. • Find the deviations from the mean: (280-220)=60, -10, -20, -30. • Check that the deviations from the mean sum to zero. • Square the deviations from the mean: 3600, 100, 400, 900.

  5. Solution to 2A. • Sum the squared deviations from the mean: 5000. • Divide by n-1 (here 3) to get correct answer: 5000/3=1666.67.
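The steps in the solution to 2A can be checked with a few lines of Python. This is only a sketch using the four values from the problem; the variable names are illustrative, not part of the course material.

```python
from statistics import mean, variance

# Observed values of W from Problem 2A.
values = [280, 210, 200, 190]

avg = mean(values)                          # sample average
deviations = [v - avg for v in values]      # deviations from the mean
sum_sq = sum(d ** 2 for d in deviations)    # sum of squared deviations
unbiased_var = sum_sq / (len(values) - 1)   # divide by n-1, not n

print(avg, deviations, sum_sq, round(unbiased_var, 2))

# statistics.variance also uses the n-1 divisor, so it should agree.
assert abs(unbiased_var - variance(values)) < 1e-9
```

Dividing by n (here 4) instead of n-1 would give 1250, which is the biased estimate and not the answer the problem asks for.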

  6. Problem 3A. • The observed significance level (p-value) reported in a computer printout for a statistical test was 0.086. Which of the following is a correct decision for this result? • Usual options.

  7. Solution to 3A. • Remember: when the p-value is less than or equal to alpha, reject at the alpha level of significance. • When the p-value is greater than alpha, accept at the alpha level of significance. • P-value of 0.086 is greater than 0.05 (hence accept at 0.05, also at 0.01) • P-value of 0.086 is less than 0.10 (hence reject at 0.10); option c is correct.
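The decision rule in the solution to 3A is short enough to write as a function. The sketch below uses the course's "accept"/"reject" wording; the function name `decide` is made up for illustration.

```python
def decide(p_value: float, alpha: float) -> str:
    """Reject when the p-value is less than or equal to alpha; otherwise accept."""
    return "reject" if p_value <= alpha else "accept"

p_value = 0.086  # from the computer printout in Problem 3A

# Apply the rule at the three usual levels of significance.
decisions = {alpha: decide(p_value, alpha) for alpha in (0.10, 0.05, 0.01)}
for alpha, decision in decisions.items():
    print(f"alpha={alpha}: {decision}")
```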

  8. Definition of alpha and beta. • The description states that the test statistic is E11, the number of errors on an eleven-question true-false test. The rejection region is E11 less than or equal to 3. The usual tables of cdf’s follow.

  9. Problem 4A. • What is alpha, the probability of a Type I error for the rejection rule E11 less than or equal to 3?

  10. Solution to Problem 4A. • Alpha=Pr0{Reject H0}=Pr0{E11 less than or equal to 3}=F0(3). • By table look-up, find F0(3)=0.113281. • This is alpha. The correct answer is 0.113281.

  11. Problem 5A. • What is beta, the probability of a Type II error, for the rejection rule E11 less than or equal to 3 when the examination is administered to a student who has a 0.05 probability of incorrectly answering each question?

  12. Solution to 5A. • Recognize that calculation of beta requires you to use the cdf of the alternative. • Beta=Pr1{Accept H0}=Pr1{E11 greater than or equal to 4}=1- Pr1{E11 less than or equal to 3}=1-F1(3). • Use the tables to find that F1(3)=0.9984. • Beta=1-0.9984=0.0016. • The correct answer is 0.0016.

  13. Comments on alpha and beta. • Alpha (0.1133) is very much larger than beta (0.0016). • This imbalance indicates that this rejection rule is not very wise for this problem. • Changing the rejection rule to E11 less than or equal to 2 would be a wise change (check out alpha and beta for this rule).
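The table look-ups for alpha and beta can be reproduced from the binomial cdf. The sketch below assumes that under the null hypothesis each answer is wrong with probability 0.5 (a random guesser on a true-false test) and uses the alternative's error probability of 0.05 from Problem 5A. The helper names are illustrative.

```python
from math import comb

def binom_cdf(k: int, n: int, p: float) -> float:
    """Pr{X <= k} for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))

N_QUESTIONS = 11
P_ERROR_NULL = 0.5   # assumption: random guessing on a true-false question
P_ERROR_ALT = 0.05   # error probability under the alternative (Problem 5A)

def alpha_beta(cutoff: int) -> tuple[float, float]:
    """Alpha and beta for the rule 'reject when E11 <= cutoff'."""
    alpha = binom_cdf(cutoff, N_QUESTIONS, P_ERROR_NULL)      # F0(cutoff)
    beta = 1 - binom_cdf(cutoff, N_QUESTIONS, P_ERROR_ALT)    # 1 - F1(cutoff)
    return alpha, beta

for cutoff in (3, 2):
    alpha, beta = alpha_beta(cutoff)
    print(f"reject when E11 <= {cutoff}: alpha={alpha:.6f}, beta={beta:.4f}")
```

Running the loop for the cutoff of 2 is one way to "check out alpha and beta" for the proposed rule: alpha falls and beta rises, so the two error probabilities are closer together.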

  14. Problem 6A. • In a test of the null hypothesis that a student is a random guesser against the alternative the student is better than a random guesser using the results of the eleven question examination, what is the observed significance level (that is, p-value) for a student who gives two incorrect answers to the eleven questions?

  15. Solution to Problem 6A. • Since the rejection rule is a left sided region (reject when E11 less than or equal to 3), the left sided p-value should be reported. • Left-sided p-value=Pr0{E11 less than or equal to 2}=F0(2) • The correct answer is F0(2)=0.032715. • The value 2 was given as the number of errors made in the statement of the problem.

  16. Story for Questions 7B-9B. • A research team will test the null hypothesis that E(Y)=2500 at the 0.01 level of significance against the alternative that E(Y)>2500. When the null hypothesis is true, Y has a normal distribution with standard deviation 800. They will take a random sample of 64 observations and use the sample mean to test the null hypothesis.

  17. Problem 7B. • What is the critical value for this test? • Solution: The critical value is E0 ± |zα|σ0/n^0.5, with the sign set by the direction of the test. • Here, E0=2500, the sign is positive because the test is right-sided, |zα|=2.326, and σ0/n^0.5=800/64^0.5=100. • That is, cv=2500+2.326*100=2732.6. • The correct answer is 2732.6.
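The critical value can be recomputed without a table by using the standard library's `statistics.NormalDist`. This is a sketch under the numbers given in the story for Questions 7B-9B; the constant names are illustrative.

```python
from statistics import NormalDist

E0 = 2500      # null value of E(Y)
SIGMA0 = 800   # standard deviation under the null
N = 64         # sample size
ALPHA = 0.01   # level of significance, right-sided test

z_alpha = NormalDist().inv_cdf(1 - ALPHA)   # about 2.326
se = SIGMA0 / N ** 0.5                      # standard error of the sample mean
critical_value = E0 + z_alpha * se          # plus sign: right-sided test

print(round(z_alpha, 3), se, round(critical_value, 1))
```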

  18. Problem 8B. • What is the probability of a Type II error when E(Y)=2800, σY=800, and α=0.01? • Solution: β=Pr1{Accept H0}=Pr1{Sample mean of 64 < 2732.6}. • This is a normal probability problem in which the sample mean of 64 is normal with mean 2800 and standard deviation 100.

  19. Problem 8B. • Find probability by standardizing: • Pr{[(mean-E(mean))/se(mean)]<[(2732.6-2800)/100]}=Pr{Z<-0.674}. • This is the cdf of the standard normal at the argument -0.674. Look up in table to find that the answer is 0.25. • The correct answer is 0.25.
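The standardizing step in Problem 8B can be checked numerically. The sketch below treats the sample mean as normal with mean 2800 and standard deviation 100, as stated in the solution, and evaluates its cdf at the critical value 2732.6.

```python
from statistics import NormalDist

CRITICAL_VALUE = 2732.6   # from Problem 7B
E1 = 2800                 # E(Y) under the alternative
SE = 800 / 64 ** 0.5      # standard error of the sample mean of 64

# Beta = Pr1{sample mean < critical value}.
z = (CRITICAL_VALUE - E1) / SE
beta = NormalDist().cdf(z)

# Same probability without standardizing by hand.
beta_direct = NormalDist(mu=E1, sigma=SE).cdf(CRITICAL_VALUE)

print(round(z, 3), round(beta, 2))
```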

  20. Problem 9B. • What is the smallest value of n, the sample size, so that the probability of a Type II error is no more than 0.01 when E(Y)=2800, σY=800, and α=0.01? • Solution: In the notation of the formula, E0=2500, E1=2800, σ0=σ1=800, |zα|=|zβ|=2.326. • Then, n^0.5=12.4, and the correct answer for n is 154.
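The slide refers to "the formula" without restating it. The sketch below assumes it is the usual one-sided sample size formula, n^0.5 = (|zα|σ0 + |zβ|σ1)/|E1 - E0|, which does reproduce the slide's 12.4 and 154. The function name `required_n` is illustrative.

```python
from math import ceil
from statistics import NormalDist

def required_n(e0, e1, sigma0, sigma1, alpha, beta):
    """Return (sqrt(n), smallest integer n) for a one-sided test.

    Assumes n^0.5 = (|z_alpha|*sigma0 + |z_beta|*sigma1) / |e1 - e0|.
    """
    std_normal = NormalDist()
    z_alpha = abs(std_normal.inv_cdf(alpha))
    z_beta = abs(std_normal.inv_cdf(beta))
    root_n = (z_alpha * sigma0 + z_beta * sigma1) / abs(e1 - e0)
    return root_n, ceil(root_n ** 2)   # round up so beta is no more than the target

root_n, n = required_n(e0=2500, e1=2800, sigma0=800, sigma1=800,
                       alpha=0.01, beta=0.01)
print(round(root_n, 1), n)
```

Rounding up matters here: n must be an integer, and rounding down would leave beta slightly above 0.01.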

  21. Story for Questions 10B-13B. • The winnings W in one play of a game of chance is a normally distributed random variable with expected value -$300 and standard deviation $2000.

  22. Question 10B. • What is the probability that a gambler will lose money in one play of this game of chance? • Solution: recognize that you are asked to calculate Pr{W<0}. • This is a normal probability; put the inequality in standard score form.

  23. Question 10B. • Pr{[(W-EW)/σW]<[(0-(-300))/2000]} • =Pr{Z<0.15}=Φ(0.15)=0.5596. • The correct answer is 0.5596.

  24. Question 11B. • What are the expected total winnings after 100 independent plays of this game of chance? • Solution: The basic principle is that E(Sn)=nE(W). • E(S100)=100(-$300)=-$30,000. • The correct answer is -$30,000.

  25. Question 12B. • What is the standard deviation of the total winnings after 100 independent plays of this game of chance? • Solution: Recall the basic fact that σ(Sn)=n^0.5 σW. • Here, σ(S100)=100^0.5($2000)=$20,000. • The correct answer is $20,000.

  26. Question 13B. • What is the probability that a gambler will have total winnings that are less than zero after 100 independent plays of this game of chance? • Solution: Recognize that you have to calculate the Pr{S100<0}. • Recognize that S100 is normally distributed with mean -$30,000 and standard deviation $20,000.

  27. Question 13B. • Calculate every normal probability by putting the inequality in standard score form: • That is, Pr{[(S100-E(S100))/σ(S100)]<[(0-(-30,000))/20,000]}. • This is Pr{Z<1.5}=Φ(1.5). • Do a table lookup to find that the correct answer is 0.9332.
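The four answers for this story (Questions 10B-13B) follow from two facts about a sum of n independent plays: E(Sn)=nE(W) and σ(Sn)=n^0.5 σW. The sketch below applies them with `statistics.NormalDist`; the function name `total_winnings` is illustrative.

```python
from statistics import NormalDist

MEAN_W = -300   # expected winnings in one play, dollars
SD_W = 2000     # standard deviation of winnings in one play, dollars

def total_winnings(n_plays: int) -> NormalDist:
    """Distribution of the total winnings after n independent plays."""
    mean_total = n_plays * MEAN_W       # E(Sn) = n E(W)
    sd_total = n_plays ** 0.5 * SD_W    # sigma(Sn) = n^0.5 sigma_W
    return NormalDist(mu=mean_total, sigma=sd_total)

one_play = total_winnings(1)
hundred_plays = total_winnings(100)

p_lose_one = one_play.cdf(0)            # Pr{W < 0}
p_lose_hundred = hundred_plays.cdf(0)   # Pr{S100 < 0}

print(hundred_plays.mean, hundred_plays.stdev)
print(round(p_lose_one, 4), round(p_lose_hundred, 4))
```

The probability of being behind rises sharply with the number of plays because the mean of the total grows like n while its standard deviation grows only like n^0.5.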

  28. Story for Questions 14C to 16C • Each patient in a study will take a specified medicine, and the patient’s response to that medicine will be measured. Forty patients will be randomly assigned to two groups of twenty each. Group 1 will receive an experimental medicine. The random variable X denotes a patient’s response to the experimental medicine and is normally

  29. Story for Questions 14C to 16C • distributed with unknown expected value E(X) and unknown standard deviation σ. Group 2 will receive the best available medicine. The random variable B denotes a patient’s response to this medicine and is normally distributed with unknown expected value E(B) and unknown standard deviation σ.

  30. Story for Questions 14C to 16C • The null hypothesis in this experiment is that E(X-B)=0, and the alternative hypothesis is that E(X-B)>0. • The experiment was run. The observed sample averages were 642.4 in the X group and 529.8 in the B group. The observed standard deviations were 233.7 for the X group, and 348.0 for the B group. The resulting pooled estimate of σ was 296.5.

  31. Question 14C. • What is the standard deviation of the difference of the two means? • Solution: Use the formula for the variance of the difference of two random variables. • variance(X mean of 20)=σ^2/20. • variance(B mean of 20)=σ^2/20. • Covariance(X mean, B mean)=0, since this is a randomized experiment.

  32. Question 14C. • Variance(X mean of 20 - B mean of 20) = var(X mean of 20)+var(B mean of 20) - 2covariance(X mean, B mean)= • (σ^2/20)+(σ^2/20)-2(0)=σ^2/10 • The standard deviation is the square root of the variance=(0.10)^0.5 σ=0.316σ • The correct answer is 0.316σ.

  33. Question 15C. • Which of the following is the correct decision for accepting or rejecting the null hypothesis based on the sample averages and standard deviations given in the common paragraph? • Usual options.

  34. Solution to 15C. • The test statistic is the x sample average-b sample average=642.4-529.8=112.6; this is positive and in the direction supportive of the alternative. • The estimated standard deviation (standard error) of the test statistic is 0.316*296.5=93.699. • The t-statistic is (112.6-0)/93.699=1.20.

  35. Solution to 15C. • Next, you have to stretch the normal theory critical values to account for the estimated standard deviation 296.5 having 38 degrees of freedom (40-2 df). • 2.326 is stretched to about 2.43; 1.645 to about 1.686; and 1.282 to 1.304. • The t-value of 1.20 is to the left of the 0.10 critical value of 1.304; hence accept at 0.10. • D is the correct answer.
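The arithmetic behind 14C and 15C can be laid out as follows. This is a sketch: the t critical values are copied from the slide and are not recomputed, and the standard error uses the unrounded square root of 0.10, so it differs from the slide's value in the first decimal place.

```python
from math import sqrt

N_PER_GROUP = 20
MEAN_X, MEAN_B = 642.4, 529.8
POOLED_SD = 296.5   # pooled estimate of sigma, as given in the story

# One-sided critical values for 38 df, as quoted on the slide.
T_CRITICAL = {0.10: 1.304, 0.05: 1.686, 0.01: 2.43}

se_factor = sqrt(1 / N_PER_GROUP + 1 / N_PER_GROUP)   # (0.10)^0.5, about 0.316
standard_error = se_factor * POOLED_SD
t_stat = (MEAN_X - MEAN_B - 0) / standard_error       # null value is 0

# Right-sided test: reject only when t is at or beyond the critical value.
decisions = {alpha: ("reject" if t_stat >= cv else "accept")
             for alpha, cv in T_CRITICAL.items()}

print(round(se_factor, 3), round(t_stat, 2), decisions)
```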

  36. Question 16C. • What is the 95 percent confidence interval for E(X-B)? • Solution: • Center the confidence interval at the x mean minus the b mean=112.6. • The sampling margin of error is the product of the stretch of 1.960 for 38 df (about 2.026) and the standard error (93.699). It equals 189.8.

  37. Solution to 16C Continued. • The 95 percent CI for E(X-B) is 112.6 plus and minus 189.8. • The correct answer is that the 95 percent confidence interval for E(X-B) ranges from -77.2 to 302.4.
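The interval can be rebuilt from the three numbers quoted in the solution. The sketch below takes the slide's multiplier (about 2.026) and standard error (93.699) as given and does not recompute either one.

```python
DIFF = 112.6           # x mean minus b mean
T_MULTIPLIER = 2.026   # stretched 1.960 for 38 df, as quoted on the slide
STANDARD_ERROR = 93.699

margin = T_MULTIPLIER * STANDARD_ERROR   # sampling margin of error
lower, upper = DIFF - margin, DIFF + margin

print(round(margin, 1), round(lower, 1), round(upper, 1))
```

Because the interval contains 0, it is consistent with accepting the null hypothesis E(X-B)=0 in a two-sided test at the 0.05 level.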

  38. Story for 17C to 19C. • I used the Explore command in SPSS to summarize 100 values of a variable L1MHLOD determined from a simulation study of a dominant trait that affected all families in a simulated genetic study. I reported descriptives output, histogram and box and whiskers plot. Use the output to answer the following three questions.

  39. Question 17C • Which of the following is a correct decision about the two tests of null hypotheses about E(L1MHLOD)? • I: Null: E(L1MHLOD)=1, alpha=0.05; Alt: E(L1MHLOD) not equal to 1. • II: Null: E(L1MHLOD)=2, alpha=0.05; Alt: E(L1MHLOD) not equal to 2. • Usual options.

  40. Solution to 17C. • Read the descriptives output to find that the 95 percent CI for the mean ranges from 1.5230 to 2.4318. • 1 is not in the 95% CI for the mean; hence reject I. • 2 is in the 95% CI for the mean; hence accept II. • The correct answer is C.
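The solution to 17C relies on the link between a 95 percent confidence interval and a two-sided test at alpha=0.05: a null value inside the interval is accepted, and one outside it is rejected. A small sketch, with an illustrative function name:

```python
def ci_decision(null_value: float, ci_low: float, ci_high: float) -> str:
    """Two-sided test at level alpha using the matching (1-alpha) CI."""
    return "accept" if ci_low <= null_value <= ci_high else "reject"

CI_95 = (1.5230, 2.4318)   # 95 percent CI for the mean from the descriptives output

decision_I = ci_decision(1, *CI_95)    # Null: E(L1MHLOD)=1
decision_II = ci_decision(2, *CI_95)   # Null: E(L1MHLOD)=2

print(decision_I, decision_II)
```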

  41. Question 18C. • Does the distribution of L1MHLOD appear to be normal? Support your answer with specific references to values of statistics and plots.

  42. Solution to 18C. • Examine the histogram of L1MHLOD; observe that it is very skewed with a number of outliers; hence it appears not to be normal. • Standardized skewness is (4.045-0)/0.241 (from descriptives output)=16.8, way out of range.

  43. Solution to 18C. • Standardized kurtosis is (18.172-0)/0.478=38.0, also way out of range. • Every indication is that the data are not normally distributed. • The correct answer is “NO”.
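The two standardized statistics are each a statistic divided by its standard error. The slides do not state what "in range" means; the sketch below assumes the common rule of thumb of roughly plus or minus 2, which is an assumption and not taken from the output.

```python
def standardized(statistic: float, std_error: float, null_value: float = 0.0) -> float:
    """(statistic - null value) / standard error."""
    return (statistic - null_value) / std_error

THRESHOLD = 2.0   # assumption: rough rule of thumb for "out of range"

z_skew = standardized(4.045, 0.241)    # skewness and its std. error from the output
z_kurt = standardized(18.172, 0.478)   # kurtosis and its std. error from the output

looks_normal = abs(z_skew) <= THRESHOLD and abs(z_kurt) <= THRESHOLD
print(round(z_skew, 1), round(z_kurt, 1), looks_normal)
```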

  44. Question 19C. • Are there outliers or other unusual patterns in the distribution of L1MHLOD? • Solution: Look at the box and whiskers plot and note that there are values indicated beyond the whiskers. • Correct answer: YES, there are outliers. • Also remark on the four apparently disconnected values in the histogram.

  45. Story for Questions 20D to 22D. • I used a paired t-test to compare L1MLOD to L1MKCEXP for 100 replicates of a study of a recessive genetic trait that affected all families in the simulated study. The objective of the analysis was to determine whether one of the two statistics came closer to the trait locus than the other. Computer output followed.

  46. Question 20D • Which of the following is a correct decision about the test of the following null hypothesis? The null is that E(L1MLOD)=E(L1MKCEXP), and the alternative is that E(L1MLOD) is not equal to E(L1MKCEXP). • Usual options.

  47. Solution to 20D. • Find the significance level (2-sided) in the paired samples test output. Here it is 0.000. • Use this as usual. • Do not use the sig of the paired samples correlation, 0.402. • The correct answer is A, reject at the 0.01 level of significance.

  48. Question 21D. • Which of the following is a correct decision about the following two tests about E(L1MLOD)-E(L1MKCEXP)? • I. Null: E(L1MLOD)-E(L1MKCEXP)=-1, alpha=0.05; Alt: E(L1MLOD)-E(L1MKCEXP) not equal to -1. • II. Null: E(L1MLOD)-E(L1MKCEXP)=0, alpha=0.05; Alt: E(L1MLOD)-E(L1MKCEXP) not equal to 0.

  49. Solution to 21D. • Find the 95% CI for the difference in the paired samples test output, 4.19 to 7.23. • Use the paired samples statistics to confirm that the difference in the output is for E(L1MKCEXP)-E(L1MLOD). • NOTE THAT THE QUESTION ASKS ABOUT THE REVERSE ORDER.

  50. Solution to 21D. • Check -1*(-1)=1 (reverse the sign of the null values!); 1 is not in the 95% CI for the expected difference; hence reject I. • Check -1*0=0; 0 is not in the 95% CI either; hence reject II. • The correct answer is D, Reject both null hypotheses.
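The sign reversal in 21D is easy to get wrong, so here is a sketch of it. The slide flips the sign of the null values; the code takes the equivalent route of flipping the interval itself, so that it refers to E(L1MLOD)-E(L1MKCEXP), and then checks the null values as stated in the question. The names are illustrative.

```python
# 95% CI from the paired samples test output, which the solution reads as
# the interval for E(L1MKCEXP) - E(L1MLOD).
reported_ci = (4.19, 7.23)

# Negating a difference negates both endpoints and swaps their order.
reversed_ci = (-reported_ci[1], -reported_ci[0])   # CI for E(L1MLOD) - E(L1MKCEXP)

def ci_decision(null_value: float, interval: tuple[float, float]) -> str:
    """Accept when the null value lies inside the 95% CI, otherwise reject."""
    low, high = interval
    return "accept" if low <= null_value <= high else "reject"

decisions = {null: ci_decision(null, reversed_ci) for null in (-1, 0)}
print(reversed_ci, decisions)
```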
