Stat 31, Section 1, Last Time
Statistical Inference
Confidence Intervals: Range of values to reflect uncertainty; bracket true value in 95% of repetitions
Choice of sample size: Choose n to get desired error
Hypothesis Testing: Yes-No questions, under uncertainty
Approximate Reading for Today’s Material:
Pages 400-416, 425-428
Approximate Reading for Next Class:
Pages 431-439, 450-471
E.g. A fast food chain currently brings in profits of $20,000 per store, per day. A new menu is proposed. Would it be more profitable?
Test: Have 10 stores (randomly selected!) try the new menu, let X̄ = average of their daily profits.
Note: Can never make a definite conclusion,
Instead measure strength of evidence.
Reason: have to deal with uncertainty
But: Can quantify uncertainty
Approach I: (note: different from text)
Choose among 3 Hypotheses:
H+: Strong evidence new menu is better
H0: Evidence is inconclusive
H-: Strong evidence new menu is worse
(If you pay careful attention)
Base decision on best guess: X̄
Will quantify strength of the evidence using probability distribution of X̄
E.g. Choose H+
How to draw line?
(There are many ways,
here is traditional approach)
Insist that H+ (or H-) show strong evidence
I.e. They get burden of proof
(Note: one way of solving
gray area problem)
Suppose observe: X̄ = $21,000 (and sample standard deviation s)
Note X̄ > $20,000, but is this conclusive?
or could this be due to natural sampling variation?
(i.e. do we risk losing money from new menu?)
Assess evidence for H+ by:
H+ P-value = Area above the observed X̄ (under the H0 distribution centered at $20,000)
Computation in EXCEL:
Class Example 22, Part 1:
P-value = 0.094
i.e. About 10%
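The same upper-tail computation can be sketched in Python. This is not the class spreadsheet: it assumes X̄ is approximately normal, and it assumes a sample standard deviation of s = $2,400, a value not stated in these notes but chosen because it reproduces the quoted P-value of 0.094. All variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 20_000   # current mean daily profit per store (center under H0)
xbar = 21_000  # observed average daily profit of the 10 test stores
s = 2_400      # ASSUMED sample standard deviation (not given in the notes)
n = 10         # number of stores trying the new menu

se = s / sqrt(n)                   # standard deviation of the average
z = (xbar - mu0) / se              # standardized distance above $20,000
p_plus = 1 - NormalDist().cdf(z)   # area to the right of the observed average

print(f"z = {z:.3f}, H+ P-value = {p_plus:.3f}")
```

Standardizing first makes the link to the N(0,1) table explicit; passing the mean and standard deviation straight to `NormalDist` would give the same area.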
Is this “small”?
(where do we draw the line?)
View 1: Even under H0, just by chance, see values like X̄ = $21,000 about 10% of the time,
But where is the boundary line?
View 2: Traditional (and even “legal”) cutoff, called here the yes-no cutoff:
Say evidence is strong,
when P-value < 0.05
E.g. your airplane safe to fly,
E.g. often called strongly significant
View 3: Personal idea about cutoff,
called gray level (vs. yes-no above)
P-value < 0.01: “quite strong evidence”
0.01 < P-value < 0.1: “weaker evidence, but stronger for smaller P-values”
0.1 < P-value: “very weak evidence, at most”
View 3: gray level (vs. yes-no above)
Note: only about interpretation of P-value
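The two ways of reading a P-value can be written as small helper functions. This is a sketch only: the function names are made up for illustration, and the treatment of P-values exactly equal to 0.01 or 0.1 is an assumption, since the notes give strict inequalities on both sides.

```python
def yes_no(p_value, cutoff=0.05):
    """Yes-no view: True when the evidence counts as strong (P-value < cutoff)."""
    return p_value < cutoff


def gray_level(p_value):
    """Gray-level view, using the three bands listed in the notes.

    Boundary handling (exactly 0.01 or exactly 0.1) is an assumption.
    """
    if p_value < 0.01:
        return "quite strong evidence"
    if p_value < 0.1:
        return "weaker evidence, stronger for smaller P-values"
    return "very weak evidence"


# P-values that appear in these notes
for p in (0.004, 0.031, 0.094, 0.906):
    print(f"P-value {p}: yes-no = {yes_no(p)}, gray level = {gray_level(p)}")
```

The yes-no rule returns a single True/False answer, while the gray-level rule keeps the information that 0.031 and 0.094 sit in the same band even though they fall on opposite sides of 0.05.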
E.g.: When P-value is given:
HW: 6.40 & (d) give gray level interp.
(no, no, relatively weak evidence)
6.41 & (d) give gray level interp.
(yes, not, moderately strong evidence)
P-value of 0.094 for H+,
Is “quite weak evidence for H+”,
i.e. “only a mild suggestion”
This happens sometimes: not enough information in data for firm conclusion
Flip side: could also look at “strength of evidence for H-”.
Expect: very weak, since saw X̄ > $20,000
H- P-value = Area below the observed $21,000 (under the distribution centered at $20,000)
Class Example 24, part 1
H- P-value = 0.906
>> ½, so no evidence at all for H-
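The H- P-value is the opposite tail of the same distribution, so it and the H+ P-value add to 1. A minimal sketch, assuming X̄ is approximately normal and assuming s = $2,400 (a value not stated in these notes, picked because it reproduces the quoted 0.906):

```python
from math import sqrt
from statistics import NormalDist

s = 2_400   # ASSUMED sample standard deviation (not given in the notes)
n = 10

# Distribution of the average daily profit if the mean is still $20,000
sampling_dist = NormalDist(mu=20_000, sigma=s / sqrt(n))

p_minus = sampling_dist.cdf(21_000)   # area below the observed $21,000
p_plus = 1 - p_minus                  # area above it

print(f"H- P-value = {p_minus:.3f}, H+ P-value = {p_plus:.3f}")
```

Because the two areas are complements, an observed average above $20,000 always gives an H- P-value above ½.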
A practical issue:
May want to gather more data…
Could prove new menu clearly better
(since more data means more information,
which could overcome uncertainty)
Suppose this was done, i.e. n = 10 is replaced by n = 40, and got the same X̄ and s:
Expect: 4 times the data gives ½ of the SD (of the average)
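The factor of ½ comes from the square-root law for the standard deviation of an average. As a short worked step, assuming the stores are independent with a common standard deviation σ (the symbol σ is introduced here for illustration):

```latex
\mathrm{SD}(\bar{X}) = \frac{\sigma}{\sqrt{n}}
\qquad\Longrightarrow\qquad
\frac{\sigma}{\sqrt{4n}} = \frac{1}{2}\cdot\frac{\sigma}{\sqrt{n}}
```

So going from n = 10 to n = 40 stores halves the spread of the average, not quarters it.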
Impact on P-value?
Class Example 24, Part 2
How did it get so small, with only ½ the SD?
Both cases: mean = $20,000, observed $21,000
n = 10: P-value = 0.094; n = 40: P-value = 0.004
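One way to see why the P-value drops so sharply: halving the SD doubles the standardized distance z, and normal tail areas shrink much faster than in proportion to z. The sketch below assumes X̄ is approximately normal and assumes s = $2,400 (not stated in these notes; it is the value that reproduces both quoted P-values). The helper name `upper_tail_p` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_p(xbar, mu0, s, n):
    """Return (z, upper-tail area) for an observed average xbar."""
    z = (xbar - mu0) / (s / sqrt(n))
    return z, 1 - NormalDist().cdf(z)


S_ASSUMED = 2_400  # ASSUMED sample standard deviation (not given in the notes)

z10, p10 = upper_tail_p(21_000, 20_000, S_ASSUMED, 10)
z40, p40 = upper_tail_p(21_000, 20_000, S_ASSUMED, 40)

print(f"n = 10: z = {z10:.3f}, P-value = {p10:.3f}")
print(f"n = 40: z = {z40:.3f}, P-value = {p40:.3f}")
```

The z value exactly doubles, but the tail area falls by a factor of more than 20, because the normal density decays very quickly away from its center.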
For each of the problems:
(a. 0.942, 0.058, b. 0.234, 0.766,
c. 0.234, 0.766, d. 0.015, 0.985)
(a. H- don’t dispute b. H- not safe
c. H- not safe d. H- safe)
(a. H- moderate evidence against
b. H- no strong evidence
c. H- seems to go other way
d. H- strong evidence, almost very strong)
An amazing movie clip:
Thanks to Trent Williamson
Hypo Testing Approach II:
(more conventional & is version in text)
Idea: only one of H+ and H- is usually relevant, so combine other with H0
Approach II: New Hypotheses
Null Hypothesis: H0 = “H0 or H-”
Alternate Hypothesis: HA = opposite of H0
Note: common notation for HA is H1
Gets “burden of proof”, I might accidentally put this
i.e. needs strong evidence to prove this
Weird terminology: Firm conclusion is called “rejecting the null hypothesis”
Basics of Test: P-value =
Note: same as H0 in H+, H0, H- case,
so really just same as above
Recall: New menu more profitable???
Hypo testing setup:
Same as before.
See: Class Example 24, part 3:
HW: 6.55, 6.61
Interpret with both yes-no and gray level
“Significant at the 5% level” = P-value < 0.05
“Test Statistic z” = N(0,1) cutoff
Hypo Testing Approach III:
Main idea: when either of H+ or H- is conclusive, then combine them
E.g. Is population mean equal to a given value, or different?
Note either bigger or smaller is strong evidence
“Alternative Hypothesis” is:
HA = “H+ or H-”
General form: H0: μ = Specified Value vs. HA: μ ≠ Specified Value
Note: “≠” always goes in HA, since cannot have “strong evidence of =”.
i.e. cannot be sure about difference between μ and μ + 0.000001
while can have convincing evidence for “≠”
(recall HA gets “burden of proof”)
Basis of test:
why this distribution
observed value of X̄
“more conclusive” is the two tailed area
Two Sided Viewpoint:
mutually exclusive “or” rule
See Class Example 24, part 4
So no strong evidence,
Either yes-no or gray-level
Shortcut: by symmetry
2 tailed Area = 2 x (1 tailed Area)
See Class Example 24, part 4
HW: 6.62 - interpret both yes-no & gray-level
(-2.20, 0.0278, rather strong evidence)
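The two-tailed area and the symmetry shortcut can be checked on the test statistic z = -2.20 quoted in the homework answer. A minimal sketch using the standard normal distribution; the variable names are illustrative.

```python
from statistics import NormalDist

z = -2.20            # test statistic from the homework answer
std = NormalDist()   # N(0,1)

lower = std.cdf(-abs(z))       # area below -|z|
upper = 1 - std.cdf(abs(z))    # area above +|z|

p_two_sided = lower + upper    # "or" rule for mutually exclusive tails
shortcut = 2 * lower           # by symmetry: 2 x (1 tailed area)

print(f"two tails added: {p_two_sided:.4f}, shortcut: {shortcut:.4f}")
```

Using `abs(z)` means the same code works whether the observed statistic lands in the lower or the upper tail.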
A “paradox” of 2-sided testing:
Can get strange conclusions
(why is gray level sensible?)
Fast food example: suppose gathered more data, so n = 20, and other results are the same
One-sided test of: H0: μ ≤ $20,000 vs. HA: μ > $20,000
P-value = … = 0.031
Part 5 of http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg24.xls
Two-sided test of: H0: μ = $20,000 vs. HA: μ ≠ $20,000
P-value = … = 0.062
Have strong evidence that μ > $20,000 (yes-no cutoff 0.05)
But no strong evidence that μ ≠ $20,000 !?!
(shouldn’t bigger imply different?)
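The paradox can be reproduced numerically. The sketch below assumes X̄ is approximately normal and assumes s = $2,400, which is not stated in these notes but reproduces both quoted P-values for n = 20. The names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, xbar = 20_000, 21_000
s = 2_400    # ASSUMED sample standard deviation (not given in the notes)
n = 20

z = (xbar - mu0) / (s / sqrt(n))
p_one_sided = 1 - NormalDist().cdf(z)   # evidence for "bigger"
p_two_sided = 2 * p_one_sided           # evidence for "different"

cutoff = 0.05
print(f"one-sided: {p_one_sided:.3f}, strong by yes-no? {p_one_sided < cutoff}")
print(f"two-sided: {p_two_sided:.3f}, strong by yes-no? {p_two_sided < cutoff}")
```

The two P-values differ only by a factor of 2, but they land on opposite sides of the 0.05 line, so the yes-no answers disagree while a gray-level reading puts both in the same band.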
(so be careful with it!)
(so near boundary could make
a difference, as happened here)
(1-sided has stronger evidence,
Lesson: 1-sided vs. 2-sided issues need careful handling:
(choice does affect answer)
(idea of being tested
depends on this choice)
Better from gray level viewpoint
CAUTION: Read problem carefully to distinguish between:
One-sided Hypotheses - like:
Two-sided Hypotheses - like:
E.g. Text book problem 6.34:
In each of the following situations, a significance test for a population mean, μ, is called for. State the null hypothesis, H0, and the alternative hypothesis, HA, in each case…
An experiment is designed to measure the effect of a high soy diet on bone density of rats.
= average bone density of high soy rats
= average bone density of ordinary rats
(since no question of “bigger” or “smaller”)
Student newspaper changed its format. In a random sample of readers, ask opinions on scale of -2 = “new format much worse”, -1 = “new format somewhat worse”, 0 = “about same”, +1 = “new format somewhat better”, +2 = “new format much better”.
μ = average opinion score
E.g. 6.34b (cont.)
No reason to choose one over other, so do two sided.
Note: Use one sided if question is of form: “is the new format better?”
The examinations in a large history class are scaled after grading so that the mean score is 75. A teaching assistant thinks that his students have a higher average score than the class as a whole. His students can be considered as a sample from the population of all students he might teach, so he compares their score with 75.
μ = average score for all students of this TA
E.g. Textbook problem 6.36
Translate each of the following research questions into appropriate H0 and HA.
Be sure to identify the parameters in each hypothesis (generally useful, so already did this above).
A researcher randomly divides 6th graders into 2 groups for PE class, and teaches volleyball skills to both. She encourages Group A, but acts cool towards Group B. She hopes that encouragement will result in a higher mean test score for Group A.
μA = mean test score for Group A
μB = mean test score for Group B
Recall: Set up point to be proven as HA
Researcher believes there is a positive correlation between GPA and esteem for students. To test this, she gathers GPA and esteem score data at a university.
ρ = correlation between GPA & esteem
A sociologist asks a sample of students which subject they like best. She suspects a higher percentage of females, than males, will name English.
= prop’n of Females preferring English
= prop’n of Males preferring English
HW on setting up hypotheses: