Be humble in our attribute, be loving and varying in our attitude, that is the way to live in heaven. Applied Statistics Using SAS and SPSS. Topic: One Way ANOVA By Prof Kelly Fan, Cal State Univ, East Bay. Statistical Tools vs. Variable Types. Example: Battery Lifetime.
Data: Y = LIFETIME (HOURS)

                                 BRAND
                  1     2     3     4     5     6     7     8
Replication 1    1.8   4.2   8.6   7.0   4.2   4.2   7.8   9.0
Replication 2    5.0   5.4   4.6   5.0   7.8   4.2   7.0   7.4
Replication 3    1.0   4.2   4.2   9.0   6.6   5.4   9.8   5.8
Brand mean       2.6   4.6   5.8   7.0   6.2   4.6   8.2   7.4

Grand mean: 5.8
3 replications per level
(Brand is, of course, represented as “categorical”)
Data layout: one column for each “LEVEL” OF BRAND (1, 2, ..., C) and one row for each replication (1, 2, ..., n). The entry for replication j at level i is $Y_{ij}$.

Model:
$Y_{ij} = \mu_i + \varepsilon_{ij}$
$i = 1, \dots, C$
$j = 1, \dots, n$
H0: Level of X has no impact on Y
H1: Level of X does have impact on Y

H0: $\mu_1 = \mu_2 = \dots = \mu_8$
H1: not all $\mu_i$ are equal
Analysis of Variance for life

Source   DF       SS     MS      F      P
brand     7    69.12   9.87   3.38  0.021
Error    16    46.72   2.92
Total    23   115.84

MS Error = 2.92 is the estimate of the common variance $\sigma^2$.
S = 1.709 R-Sq = 59.67% R-Sq(adj) = 42.02%
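The sums of squares, mean squares, F, S and R-Sq in this output can be checked by hand. The block below is a minimal sketch in plain Python (not the SAS or SPSS procedure); the names `lifetimes` and `one_way_anova` are made up for illustration. The P column is not reproduced because the F-distribution tail area is not available in the standard library.

```python
from math import sqrt

# Battery lifetimes in hours: one list per brand, 3 replications each.
lifetimes = {
    1: [1.8, 5.0, 1.0],
    2: [4.2, 5.4, 4.2],
    3: [8.6, 4.6, 4.2],
    4: [7.0, 5.0, 9.0],
    5: [4.2, 7.8, 6.6],
    6: [4.2, 4.2, 5.4],
    7: [7.8, 7.0, 9.8],
    8: [9.0, 7.4, 5.8],
}


def one_way_anova(groups):
    """Return the one-way ANOVA quantities for a list of samples."""
    values = [y for ys in groups for y in ys]
    grand_mean = sum(values) / len(values)
    means = [sum(ys) / len(ys) for ys in groups]
    # Between-level variation: level means around the grand mean.
    ss_between = sum(len(ys) * (m - grand_mean) ** 2
                     for ys, m in zip(groups, means))
    # Within-level (error) variation: observations around their level mean.
    ss_error = sum((y - m) ** 2
                   for ys, m in zip(groups, means) for y in ys)
    df_between = len(groups) - 1
    df_error = len(values) - len(groups)
    ms_between = ss_between / df_between
    ms_error = ss_error / df_error
    return {
        "df": (df_between, df_error),
        "ss": (ss_between, ss_error),
        "ms": (ms_between, ms_error),
        "f": ms_between / ms_error,
        "s": sqrt(ms_error),
        "r_sq": ss_between / (ss_between + ss_error),
    }


result = one_way_anova(list(lifetimes.values()))
for name, value in result.items():
    print(name, value)
```

S is the square root of MS Error, and R-Sq is SS(brand) divided by SS(Total).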
[Figure: normality plot (normal scores vs. residuals), from the battery lifetime data]
[Figure: residual plot (fitted values vs. residuals), from the battery lifetime data]
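The quantities behind both plots can be computed directly. This is a rough sketch in plain Python: the fitted value of each observation is its brand mean and the residual is the observation minus that mean. The normal scores use Blom's approximation, which is an assumption here; statistical packages may use a slightly different formula, and tied residuals are ranked in arbitrary order.

```python
from statistics import NormalDist, mean

# Battery lifetimes in hours: one list per brand, 3 replications each.
lifetimes = {
    1: [1.8, 5.0, 1.0],
    2: [4.2, 5.4, 4.2],
    3: [8.6, 4.6, 4.2],
    4: [7.0, 5.0, 9.0],
    5: [4.2, 7.8, 6.6],
    6: [4.2, 4.2, 5.4],
    7: [7.8, 7.0, 9.8],
    8: [9.0, 7.4, 5.8],
}

fitted = []
residuals = []
for brand, ys in lifetimes.items():
    brand_mean = mean(ys)
    for y in ys:
        fitted.append(brand_mean)         # fitted value = brand mean
        residuals.append(y - brand_mean)  # residual = observed - fitted

# Normal scores (ASSUMPTION: Blom's formula, (rank - 0.375) / (n + 0.25)).
n = len(residuals)
order = sorted(range(n), key=lambda k: residuals[k])
normal_scores = [0.0] * n
for rank, k in enumerate(order, start=1):
    normal_scores[k] = NormalDist().inv_cdf((rank - 0.375) / (n + 0.25))

# Pairs to plot: (normal score, residual) and (fitted value, residual).
for score, fit, res in zip(normal_scores, fitted, residuals):
    print(f"{score:7.3f} {fit:5.1f} {res:6.2f}")
```

A roughly straight normality plot and a residual plot with similar spread at every fitted value support the normality and common-variance assumptions.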
Multiple Comparison Procedures
Once we reject H0: $\mu_1 = \mu_2 = \dots = \mu_C$ in favor of H1: NOT all $\mu$’s are equal, we don’t yet know the way in which they’re not all equal, but simply that they’re not all the same. If there are 4 columns, are all 4 $\mu$’s different? Are 3 the same and one different? If so, which one? etc.
These “more detailed” inquiries into the process are called MULTIPLE COMPARISON PROCEDURES.
Errors (Type I):
We set up “$\alpha$” as the significance level for a hypothesis test. Suppose we test 3 independent hypotheses, each at $\alpha = .05$; each test has a type I error probability (rejecting H0 when it’s true) of .05. However,
P(at least one type I error in the 3 tests)
= 1 - P(accept all 3, given all 3 true) = $1 - (.95)^3 \approx .14$
In other words, the probability is .14 that at least one type I error is made. For 5 tests, the probability is .23.
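The arithmetic above is easy to check. A minimal sketch, assuming independent tests that all use the same significance level (the function name is made up for illustration):

```python
def prob_at_least_one_type_i_error(alpha, k):
    """P(at least one type I error) in k independent tests, each at level alpha."""
    # All k true hypotheses are accepted with probability (1 - alpha) ** k.
    return 1 - (1 - alpha) ** k


for k in (1, 3, 5):
    print(k, round(prob_at_least_one_type_i_error(0.05, k), 4))
```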
Question - Should we choose $\alpha = .05$, and suffer (for 5 tests) a .23 OVERALL error rate (or “$a$” or $a_{\text{experimentwise}}$)?
OR
Should we choose/control the overall error rate, “$a$”, to be .05, and find the individual test $\alpha$ by $1 - (1-\alpha)^5 = .05$ (which gives us $\alpha \approx .010$)?
The formula
$1 - (1-\alpha)^5 = .05$
would be valid only if the tests are independent; often they’re not.
[ e.g., (1) $\mu_1 = \mu_2$, (2) $\mu_2 = \mu_3$, (3) $\mu_1 = \mu_3$.
IF (1) is accepted & (2) is rejected, isn’t it more likely that (3) is rejected? ]
When the tests are not independent, it’s usually very difficult to arrive at the correct $\alpha$ for an individual test so that a specified value results for the overall error rate.
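For the independent case only, the formula can be solved for the individual-test level. A small sketch (the function name is made up for illustration; it does not apply to dependent tests):

```python
def per_test_alpha(overall, k):
    """Individual-test level that gives the stated overall error rate
    across k INDEPENDENT tests: solve 1 - (1 - alpha) ** k = overall."""
    return 1 - (1 - overall) ** (1 / k)


alpha = per_test_alpha(0.05, 5)
print(alpha)
# Round trip: plugging alpha back in recovers the overall rate.
print(1 - (1 - alpha) ** 5)
```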
Categories of multiple comparison tests
- “Planned”/ “a priori” comparisons (stated in advance, usually a linear combination of the column means equal to zero.)
- “Post hoc”/ “a posteriori” comparisons (decided after a look at the data - which comparisons “look interesting”)
- “Post hoc” multiple comparisons (every column mean compared with each other column mean)
A financial firm would like to determine whether the brokers it uses to execute trades differ in their ability to purchase stock for the firm at a low price per share. To measure cost, an index, Y, is used.
Y=1000(A-P)/A
where
P=per share price paid for the stock;
A=average of high price and low price per share, for the day.
“The higher Y is the better the trade is.”
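As a quick illustration of the index, here is a minimal sketch; the prices are hypothetical and are not taken from the study.

```python
def trade_index(price_paid, day_high, day_low):
    """Y = 1000 (A - P) / A, where A is the average of the day's high and low."""
    average = (day_high + day_low) / 2
    return 1000 * (average - price_paid) / average


# Hypothetical prices: the day's high is 51 and the low is 49, so A = 50.
print(trade_index(49.5, 51, 49))  # paid below the day's average: Y is positive
print(trade_index(50.5, 51, 49))  # paid above the day's average: Y is negative
```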
Col: broker

Broker        Trades (R = 6)              Mean
  1      12    3    5   -1   12    5        6
  2       7   17   13   11    7   17       12
  3       8    1    7    4    3    7        5
  4      21   10   15   12   20    6       14
  5      24   13   14   18   14   19       17

Five brokers were in the study, and six trades were randomly assigned to each broker.
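Before turning to the SPSS output, the broker means and the overall F ratio can be computed directly from the thirty index values. This is a sketch in plain Python with made-up variable names; it is not the SPSS procedure.

```python
# Index values Y: six trades for each of the five brokers.
trades = {
    1: [12, 3, 5, -1, 12, 5],
    2: [7, 17, 13, 11, 7, 17],
    3: [8, 1, 7, 4, 3, 7],
    4: [21, 10, 15, 12, 20, 6],
    5: [24, 13, 14, 18, 14, 19],
}

means = {b: sum(ys) / len(ys) for b, ys in trades.items()}
n_total = sum(len(ys) for ys in trades.values())
grand_mean = sum(sum(ys) for ys in trades.values()) / n_total

# Between-broker and within-broker (error) sums of squares.
ss_broker = sum(len(ys) * (means[b] - grand_mean) ** 2
                for b, ys in trades.items())
ss_error = sum((y - means[b]) ** 2 for b, ys in trades.items() for y in ys)

ms_broker = ss_broker / (len(trades) - 1)
ms_error = ss_error / (n_total - len(trades))
f_ratio = ms_broker / ms_error

print(means)
print(ss_broker, ss_error, ms_broker, ms_error, f_ratio)
```

MS Error from this calculation is the variance estimate that the multiple comparison procedures rely on.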
Analyze>>General Linear Model>>Univariate…
Homogeneous Subsets
Conclusion : 3, 1 2 4 5
???
Conclusion : 3, 1 2, 4, 5
Brokers 1 and 3 are not significantly different from each other, but they are significantly different from the other 3 brokers.
Brokers 2 and 4 are not significantly different, and brokers 4 and 5 are not significantly different, but broker 2 is significantly different from (smaller than) broker 5.
Conclusion : 3, 1 2 4 5
Comparisons to Control
Dunnett’s test
Designed specifically for (and incorporating the interdependencies of) comparing several “treatments” to a “control.”
Example (R = 6 observations per column):

Col      1 (CONTROL)    2     3     4     5
Mean     6             12     5    14    17

In our example:
- Cols 4 and 5 differ from the control [ 1 ].
- Cols 2 and 3 are not significantly different from the control.
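The comparison to control can be sketched as follows. Each treatment mean is compared with the control mean, and the difference is judged against a margin of (critical value) x sqrt(2 MSE / R). The critical value 2.61 is an assumption: it stands in for a two-sided 5% Dunnett table value for 4 treatments versus a control with 25 error degrees of freedom, and it is not given in the slides. Look up the value for your own design.

```python
from math import sqrt

# Index values Y: six trades for each of the five brokers (columns).
trades = {
    1: [12, 3, 5, -1, 12, 5],
    2: [7, 17, 13, 11, 7, 17],
    3: [8, 1, 7, 4, 3, 7],
    4: [21, 10, 15, 12, 20, 6],
    5: [24, 13, 14, 18, 14, 19],
}
control = 1
critical_value = 2.61  # ASSUMPTION: table lookup, not from the slides
r = 6                  # observations per column

means = {b: sum(ys) / len(ys) for b, ys in trades.items()}
error_df = sum(len(ys) for ys in trades.values()) - len(trades)
mse = sum((y - means[b]) ** 2
          for b, ys in trades.items() for y in ys) / error_df

# Smallest difference from the control mean that is declared significant.
margin = critical_value * sqrt(2 * mse / r)

differs = {b: abs(means[b] - means[control]) > margin
           for b in means if b != control}
print(round(margin, 2), differs)
```

With this assumed critical value, columns 4 and 5 exceed the margin and columns 2 and 3 do not.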
Post Hoc and A Priori Comparisons