Be humble in our attribute, be loving and varying in our attitude, that is the way to live in heaven...

Applied Statistics Using SAS and SPSS




Topic: One Way ANOVA

By Prof Kelly Fan, Cal State Univ, East Bay



Statistical Tools vs. Variable Types


Example: Battery Lifetime

  • Eight brands of battery are studied. We would like to find out whether the brand of a battery affects its lifetime and, if so, which brands last longer than the others.

  • Data collection: For each brand, 3 batteries are tested for their lifetime.

  • What is the Y variable? The X variable?


Data: Y = LIFETIME (HOURS), by BRAND; 3 replications per level

Brand    1     2     3     4     5     6     7     8
         1.8   4.2   8.6   7.0   4.2   4.2   7.8   9.0
         5.0   5.4   4.6   5.0   7.8   4.2   7.0   7.4
         1.0   4.2   4.2   9.0   6.6   5.4   9.8   5.8
Mean     2.6   4.6   5.8   7.0   6.2   4.6   8.2   7.4

Grand mean = 5.8


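Statistical Model

(Brand is, of course, represented as “categorical”.)

$Y_{ij} = \mu_i + \varepsilon_{ij}$

$i = 1, \dots, C$ (level of brand)

$j = 1, \dots, n$ (replication within a level)

Data layout: one column per level of brand, one row per replication.

“LEVEL” OF BRAND    1        2        ...   C
Replication 1       Y_{11}   Y_{21}   ...   Y_{C1}
Replication 2       Y_{12}   Y_{22}   ...   Y_{C2}
...
Replication n       Y_{1n}   Y_{2n}   ...   Y_{Cn}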


Hypotheses Setup

H0: The level of X has no impact on Y

H1: The level of X does have an impact on Y

H0: $\mu_1 = \mu_2 = \cdots = \mu_8$

H1: not all $\mu_i$ are equal


ONE WAY ANOVA

Analysis of Variance for life

Source   DF       SS     MS      F       P
brand     7    69.12   9.87   3.38   0.021
Error    16    46.72   2.92
Total    23   115.84

MS Error = 2.92 is the estimate of the common variance $\sigma^2$.

S = 1.709   R-Sq = 59.67%   R-Sq(adj) = 42.02%
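The sums of squares, mean squares and F statistic in the table can be reproduced by hand from the battery data. The following is a minimal sketch in plain Python; the function name `one_way_anova` and the dictionary layout are illustrative choices, not part of the slides, and the p-value is left out because it needs an F-distribution routine.

```python
# Battery lifetimes (hours) from the example: 8 brands, 3 batteries each.
lifetimes = {
    1: [1.8, 5.0, 1.0],
    2: [4.2, 5.4, 4.2],
    3: [8.6, 4.6, 4.2],
    4: [7.0, 5.0, 9.0],
    5: [4.2, 7.8, 6.6],
    6: [4.2, 4.2, 5.4],
    7: [7.8, 7.0, 9.8],
    8: [9.0, 7.4, 5.8],
}


def one_way_anova(groups):
    """Return the one-way ANOVA table entries (no p-value) as a dict."""
    values = [y for ys in groups.values() for y in ys]
    grand_mean = sum(values) / len(values)
    means = {g: sum(ys) / len(ys) for g, ys in groups.items()}

    # Between-group SS: group size times squared distance of each mean from the grand mean.
    ss_between = sum(len(ys) * (means[g] - grand_mean) ** 2 for g, ys in groups.items())
    # Error SS: squared distance of each observation from its own group mean.
    ss_error = sum((y - means[g]) ** 2 for g, ys in groups.items() for y in ys)

    df_between = len(groups) - 1
    df_error = len(values) - len(groups)
    ms_between = ss_between / df_between
    ms_error = ss_error / df_error
    return {
        "df_between": df_between,
        "df_error": df_error,
        "ss_between": ss_between,
        "ss_error": ss_error,
        "ss_total": ss_between + ss_error,
        "ms_between": ms_between,
        "ms_error": ms_error,
        "f": ms_between / ms_error,
        "r_sq": ss_between / (ss_between + ss_error),
    }


table = one_way_anova(lifetimes)
for name, value in table.items():
    print(f"{name:>10}: {value:.4g}")
```

The square root of `ms_error` corresponds to the S value reported under the table, and `r_sq` to R-Sq.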



Review

  • Fitted value = Predicted value

  • Residual = Observed value – fitted value
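In a one-way ANOVA the fitted value for an observation is the mean of its group, so the two definitions above can be illustrated with the three batteries of brand 1. This is a small sketch; the variable names are illustrative.

```python
# Lifetimes (hours) of the three brand-1 batteries from the example.
brand_1 = [1.8, 5.0, 1.0]

# Fitted (predicted) value: the group mean.
fitted = sum(brand_1) / len(brand_1)

# Residual = observed value - fitted value.
residuals = [y - fitted for y in brand_1]

print(f"fitted value: {fitted:.1f}")
print("residuals:", [round(r, 1) for r in residuals])
```

Within each group the residuals sum to zero, which is a quick check on the arithmetic.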


Diagnosis: Normality

  • The points on the normality plot must more or less follow a straight line for us to claim the residuals are “normally distributed”.

  • There are statistical tests to check this formally.

  • The ANOVA method we learn here is not sensitive to the normality assumption. That is, a mild departure from the normal distribution will not change our conclusions much.

Normality plot: normal scores vs. residuals
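The coordinates of such a plot can be computed without plotting software. The sketch below pairs the sorted residuals of the battery data with normal scores; the plotting positions (rank - 0.375)/(n + 0.25) are an assumed convention (packages differ slightly), and the correlation is only an informal summary of how straight the plot is, not one of the formal tests.

```python
from statistics import NormalDist

# Battery lifetimes (hours), one inner list per brand.
lifetimes = [
    [1.8, 5.0, 1.0], [4.2, 5.4, 4.2], [8.6, 4.6, 4.2], [7.0, 5.0, 9.0],
    [4.2, 7.8, 6.6], [4.2, 4.2, 5.4], [7.8, 7.0, 9.8], [9.0, 7.4, 5.8],
]

# Residual = observed value - group mean.
residuals = []
for ys in lifetimes:
    group_mean = sum(ys) / len(ys)
    residuals.extend(y - group_mean for y in ys)

n = len(residuals)
ordered = sorted(residuals)

# Normal scores: standard normal quantiles at assumed plotting positions.
scores = [NormalDist().inv_cdf((rank - 0.375) / (n + 0.25)) for rank in range(1, n + 1)]


def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


r = correlation(scores, ordered)
for score, resid in zip(scores, ordered):
    print(f"{score:6.2f}  {resid:5.1f}")
print(f"correlation between normal scores and residuals: {r:.3f}")
```

A correlation close to 1 means the points lie close to a straight line.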



From the Battery lifetime data:


Diagnosis: Equal Variances

  • The points on the residual plot must fall more or less within a horizontal band for us to claim “constant variance”.

  • There are statistical tests to check this formally.

  • The ANOVA method we learn here is not sensitive to the constant variances assumption. That is, slightly different variances within groups will not change our conclusions much.

Residual plot: fitted values vs. residuals
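As a numerical companion to the residual plot, the within-brand sample variances of the battery data can be listed directly. This is only a descriptive sketch, not one of the formal tests; with 3 observations per brand each sample variance is a very noisy estimate.

```python
from statistics import variance

# Battery lifetimes (hours) by brand.
lifetimes = {
    1: [1.8, 5.0, 1.0], 2: [4.2, 5.4, 4.2], 3: [8.6, 4.6, 4.2], 4: [7.0, 5.0, 9.0],
    5: [4.2, 7.8, 6.6], 6: [4.2, 4.2, 5.4], 7: [7.8, 7.0, 9.8], 8: [9.0, 7.4, 5.8],
}

# Sample variance (divisor n - 1) within each brand.
group_variances = {brand: variance(ys) for brand, ys in lifetimes.items()}

# With equal group sizes, the pooled variance is the plain average of the
# group variances; it equals the Error mean square of the ANOVA table.
pooled = sum(group_variances.values()) / len(group_variances)

for brand, var in group_variances.items():
    print(f"brand {brand}: variance = {var:.2f}")
print(f"pooled variance = {pooled:.2f}")
```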



From the Battery lifetime data:


Multiple Comparison Procedures

Once we reject $H_0: \mu_1 = \mu_2 = \cdots = \mu_C$ in favor of $H_1$: not all $\mu$’s are equal, we don’t yet know the way in which they’re not all equal, only that they’re not all the same. If there are 4 columns, are all 4 $\mu$’s different? Are 3 the same and one different? If so, which one? etc.



These “more detailed” inquiries into the process are called MULTIPLE COMPARISON PROCEDURES.

Errors (Type I):

We set up “$\alpha$” as the significance level for a hypothesis test. Suppose we test 3 independent hypotheses, each at $\alpha = .05$; each test has a type I error probability (rejecting $H_0$ when it is true) of .05. However,

$P(\text{at least one type I error in the 3 tests}) = 1 - P(\text{accept all 3, given all 3 true}) = 1 - (.95)^3 \approx .14$


In other words, the probability is .14 that at least one type I error is made. For 5 tests, the probability is .23.

Question: Should we choose $\alpha = .05$ and suffer (for 5 tests) a .23 overall error rate (“a”, or $a_{\text{experimentwise}}$)?

OR

Should we control the overall error rate, “a”, to be .05, and find the individual-test $\alpha$ from $1 - (1 - \alpha)^5 = .05$ (which gives $\alpha \approx .010$)?
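Both directions of this calculation are one-liners. The sketch below assumes independent tests, as the formula does; the function names are illustrative.

```python
def overall_error_rate(alpha, k):
    """P(at least one type I error) in k independent tests, each at level alpha."""
    return 1 - (1 - alpha) ** k


def per_test_alpha(overall, k):
    """Individual-test level that gives the requested overall rate for k independent tests."""
    return 1 - (1 - overall) ** (1 / k)


print(f"3 tests at .05: overall rate = {overall_error_rate(0.05, 3):.3f}")
print(f"5 tests at .05: overall rate = {overall_error_rate(0.05, 5):.3f}")
print(f"5 tests, overall .05: per-test alpha = {per_test_alpha(0.05, 5):.4f}")
```

Solving $1 - (1 - \alpha)^5 = .05$ this way gives a per-test level of about .0102.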


The formula

$1 - (1 - \alpha)^5 = .05$

would be valid only if the tests are independent; often they’re not.

[e.g., $\mu_1 = \mu_2$, $\mu_2 = \mu_3$, $\mu_1 = \mu_3$: if $\mu_1 = \mu_2$ is accepted and $\mu_2 = \mu_3$ is rejected, isn’t it more likely that $\mu_1 = \mu_3$ is rejected?]


When the tests are not independent, it’s usually very difficult to arrive at the correct $\alpha$ for an individual test so that a specified value results for the overall error rate.


Categories of multiple comparison tests

- “Planned”/“a priori” comparisons (stated in advance, usually as a linear combination of the column means set equal to zero)

- “Post hoc”/“a posteriori” comparisons (decided after a look at the data: which comparisons “look interesting”)

- “Post hoc” multiple comparisons (every column mean compared with each other column mean)



  • There are many multiple comparison procedures. We’ll cover only a few.

  • Post hoc multiple comparisons

  • Pairwise comparisons: Do a series of pairwise tests; Duncan and SNK tests

  • (Optional) Comparisons to control: Dunnett tests



Example: Broker Study

A financial firm would like to determine whether the brokers it uses to execute trades differ in their ability to buy stock for the firm at a low price per share. To measure cost, an index, Y, is used:

$Y = 1000(A - P)/A$

where

P = per-share price paid for the stock;

A = average of the high price and low price per share for the day.

“The higher Y is, the better the trade.”
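To make the index concrete, here is a minimal sketch of the formula. The function name and the prices are hypothetical, chosen only so the arithmetic is easy to follow; they are not data from the study.

```python
def trade_index(price_paid, day_high, day_low):
    """Y = 1000 (A - P) / A, where A is the average of the day's high and low."""
    average = (day_high + day_low) / 2
    return 1000 * (average - price_paid) / average


# Hypothetical prices: the day's range is 49.00 to 51.00, so A = 50.00.
good_trade = trade_index(price_paid=49.50, day_high=51.00, day_low=49.00)
poor_trade = trade_index(price_paid=50.50, day_high=51.00, day_low=49.00)

print(f"paid below the day's average: Y = {good_trade:.1f}")
print(f"paid above the day's average: Y = {poor_trade:.1f}")
```

Paying less than the day's average gives a positive Y; paying more gives a negative Y.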


Col: broker (R = 6 trades per broker)

Broker    Y for each trade             Mean
1         12    3    5   -1   12    5     6
2          7   17   13   11    7   17    12
3          8    1    7    4    3    7     5
4         21   10   15   12   20    6    14
5         24   13   14   18   14   19    17

Five brokers were in the study and six trades were randomly assigned to each broker.
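Pairwise procedures such as SNK and Duncan start from the broker means sorted in increasing order and from the standard error of a mean, $\sqrt{MSE/n}$. The sketch below computes those ingredients from the broker data; it stops short of the tests themselves, because the critical values come from studentized-range tables that are not reproduced here. Variable names are illustrative.

```python
from math import sqrt

# Index Y for the six trades assigned to each of the five brokers.
trades = {
    1: [12, 3, 5, -1, 12, 5],
    2: [7, 17, 13, 11, 7, 17],
    3: [8, 1, 7, 4, 3, 7],
    4: [21, 10, 15, 12, 20, 6],
    5: [24, 13, 14, 18, 14, 19],
}

means = {broker: sum(ys) / len(ys) for broker, ys in trades.items()}

# Error mean square: pooled within-broker variation.
n_total = sum(len(ys) for ys in trades.values())
ss_error = sum((y - means[b]) ** 2 for b, ys in trades.items() for y in ys)
ms_error = ss_error / (n_total - len(trades))

# Standard error of a single broker mean (6 trades per broker).
se_mean = sqrt(ms_error / 6)

# Brokers ordered from smallest to largest mean.
ordered = sorted(means, key=means.get)

print("brokers by increasing mean:", ordered)
print("means:", [means[b] for b in ordered])
print(f"MSE = {ms_error:.1f}, standard error of a mean = {se_mean:.2f}")
```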



SPSS Output

Analyze>>General Linear Model>>Univariate…



Homogeneous Subsets



Conclusion : 3, 1 2 4 5

???

Conclusion : 3, 1 2, 4, 5


Brokers 1 and 3 are not significantly different from each other, but both are significantly different from the other 3 brokers.

Brokers 2 and 4 are not significantly different, and brokers 4 and 5 are not significantly different, but broker 2 is significantly different from (smaller than) broker 5.

Conclusion: the homogeneous subsets are {3, 1}, {2, 4}, and {4, 5}.


Comparisons to Control

Dunnett’s test

Designed specifically for (and incorporating the interdependencies of) comparing several “treatments” to a “control.”

Example (R = 6 per column; column 1 is the CONTROL):

Col     1 (CONTROL)    2     3     4     5
Mean    6             12     5    14    17


In our example, with column 1 as the CONTROL and column means 6, 12, 5, 14, 17:

- Cols 4 and 5 differ from the control [ 1 ].

- Cols 2 and 3 are not significantly different from the control.



Exercise: Sales Data

Sales


Exercise.

  • Find the ANOVA table.

  • Perform SNK tests at a = 5% to group treatments.

  • Perform Duncan tests at a = 5% to group treatments.

  • Which treatment would you use?


Post Hoc and A Priori Comparisons

  • F test for a linear combination of column means (contrast)

  • Scheffé test: tests all linear combinations at once. Very conservative; not to be used for only a few comparisons.

