
### Doing ANOVA and t-tests

LISA short course by Ciro Velasco-Cruz

Example

In a study, 15 lobsters were randomly selected from recent catches along a certain region of the Maine shoreline. The lobsters were weighed to the nearest ounce, with the following results:

26 14 18 13 22 15 24 21 29 10 12 31 19 16 21

Suppose that, for research purposes, the mean lobster weight needs to equal 15 ounces, so we test the null hypothesis that the mean is 15 ounces. Lobster weight is known to be normally distributed, with both the mean and the standard deviation unknown.

SAS for coding

The data step

data lobsters_w;

input type weigth @@;

datalines;

1 26 1 14 1 18 1 13 1 22

1 15 1 24 1 21 1 29 1 10

1 12 1 31 1 19 1 16 1 21

;

SAS for coding

Exploratory data analysis:

proc means data=lobsters_w mean std max min median;

var weigth;

run;

proc boxplot data=lobsters_w;

title 'BoxPlot for one sample t-test example';

plot (weigth)*type/ cframe = vligb

cboxes = dagr

cboxfill = ywh;

inset mean max min /CFILL = WHITE

header = "Summary"

CTEXT = RED;

run;

SAS coding

Data analysis:

proc ttest data=lobsters_w h0=15;

title 'One sample t test example';

var weigth;

run;

SAS OUTPUT

Conclusion: Since the p-value is below 0.05, we reject the null hypothesis that the mean equals 15 at the 5% level of significance.
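As a rough check, the test statistic can be reproduced by hand from the 15 weights listed in the example. The sample mean (19.4) and standard deviation (about 6.27) used here are computed from those data; they are not copied from the SAS output, which the transcript does not include.

```latex
t = \frac{\bar{y} - \mu_0}{s / \sqrt{n}}
  = \frac{19.4 - 15}{6.27 / \sqrt{15}}
  \approx 2.72,
\qquad \text{df} = n - 1 = 14
```

With 14 degrees of freedom the two-sided 5% critical value is about 2.145, so |t| exceeds it, which agrees with rejecting the null hypothesis.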

Two Sample t-test example

An animal scientist is interested in comparing two different topical treatments (A, B) against osteoarthritis in the leg joints of horses. Seven horses with the illness are available at the animal clinic. For each horse it is randomly determined which of the front legs receives treatment A and which receives treatment B. After four weeks of treatment, the horses' mobility is measured.

If we assume, for now, that the measurements under A and B are two independent samples, we can perform a two-sample t-test.

SAS data step

data horses;

input trt horse mobility @@ ;

cards;

1 1 48.2 1 2 44.6 1 3 49.7 1 4 40.5

1 5 54.6 1 6 47.1 1 7 46.8 2 1 41.5

2 2 40.1 2 3 44.0 2 4 41.2 2 5 49.8

2 6 41.7 2 7 51.4

;

SAS E.D.A.

proc means data=horses mean std max min median;

class trt;

var mobility;

run;

proc boxplot data=horses;

title 'BoxPlot for two sample t-test example';

plot (mobility)*trt/ cframe = vligb

cboxes = dagr

cboxfill = ywh;

insetgroup mean max min q1 q2 q3/header = 'Summary by Treatment' ctext = red;

run;
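The transcript moves from the exploratory plots straight to the conclusion, so the analysis step itself is not shown. A minimal sketch of what it might look like, assuming the `horses` data set and variable names from the data step above (the title text is an illustrative choice, not taken from the slides):

```sas
/* Two-sample t-test comparing mean mobility between trt=1 and trt=2.   */
/* Treats the two groups as independent samples, as the example states. */
proc ttest data=horses;
  title 'Two sample t test example';
  class trt;       /* grouping variable with exactly two levels */
  var mobility;    /* response variable */
run;
```

With a CLASS statement, PROC TTEST reports the pooled and Satterthwaite t-tests together with an equality-of-variances (folded F) test, which correspond to the two statements in the conclusion.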

Conclusion

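- About variances: Since the p-value of the equality-of-variances test is larger than 0.05, we do not reject the hypothesis that the variances are equal, so the pooled t-test is appropriate.
- About means: Since the p-value for this test is also larger than 0.05, we do not reject the hypothesis that the two treatment means are equal.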

Paired t test example

- Consider the previous example again. Because treatments A and B were both measured on the same horse, the mobility measurements are not independent within horses, so the appropriate analysis is a paired t-test.
- Idea: for each horse i we look at the difference between the responses to treatments A and B, and test whether the mean difference is zero:

$D_i = Y_{iA} - Y_{iB}$
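A minimal sketch of the paired analysis in SAS, assuming the `horses` data set from the two-sample example and assuming that `trt=1` is treatment A and `trt=2` is treatment B (the slides do not state the coding). The data set `horses_wide` and the variables `mobA` and `mobB` are names introduced here for illustration; the PAIRED statement of PROC TTEST needs the two measurements as separate variables on one row per horse.

```sas
/* Reshape to one row per horse: mobA = mobility under trt 1, mobB = under trt 2. */
/* Within each treatment the rows are already ordered by horse, so the BY merge   */
/* needs no prior sort.                                                           */
data horses_wide;
  merge horses(where=(trt=1) rename=(mobility=mobA))
        horses(where=(trt=2) rename=(mobility=mobB));
  by horse;
  drop trt;
run;

/* Paired t-test of H0: mean(mobA - mobB) = 0 */
proc ttest data=horses_wide;
  paired mobA*mobB;
run;
```

The test is a one-sample t-test on the seven within-horse differences, so it has 6 degrees of freedom rather than the 12 of the pooled two-sample test.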

One Way Anova

An experiment was conducted to study the growth of plant tissue in the presence of hormone solutions containing various growth-inhibiting substances. For each solution, 10 independent tissue cultures were prepared and the growth of the plant tissue was recorded in mm.

This experiment has one factor with 5 levels, each with 10 replications.

SAS data step

data peasection;

input trtmnt growth @@;

/* key to the treatment codes, stored as the variable label */

label trtmnt = '1=Control, 2=Sol.1, 3=Sol.2, 4=Mixture, 5=Sol.3';

datalines;

1 7.84 1 8.69 1 8.11 1 8.35 1 7.74

1 7.69 1 7.98 1 7.64 1 8.57 1 8.32

2 6.78 2 6.69 2 6.95 2 6.64 2 6.41

2 6.69 2 6.72 2 6.57 2 6.67 2 7.07

3 6.79 3 6.79 3 6.79 3 6.61 3 6.43

3 6.69 3 6.57 3 6.49 3 7.05 3 6.72

4 6.64 4 6.57 4 6.78 4 6.48 4 6.54

4 6.36 4 6.67 4 6.26 4 6.67 4 6.68

5 7.31 5 7.65 5 7.26 5 7.39 5 6.98

5 7.46 5 7.32 5 7.13 5 7.07 5 7.25

;

SAS coding

proc boxplot data=peasection;

title 'BoxPlot for one-way ANOVA example';

plot growth*trtmnt/ cframe = vligb

cboxes = dagr

cboxfill = ywh;

insetgroup mean stddev q1 q2 q3/header = 'Summary by Treatment'

ctext = red;

run;

SAS glm anyway

proc glm data=peasection;

class trtmnt;

model growth=trtmnt;

lsmeans trtmnt /pdiff adjust=tukey ;

contrast 'our first contrast with contrast' trtmnt -1 0 -1 0 2;

estimate 'our first contrast with estimate' trtmnt -1 0 -1 0 2;

output out=residuals p=yhat r=res;

run;

SAS output

Note that: -(8.093+6.693)+2*7.282= -.222
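The arithmetic in this note is the contrast applied to the treatment means, with coefficients (-1, 0, -1, 0, 2) as implied by the three means that appear in it. A sketch, where $\bar{y}_i$ denotes the mean growth for treatment i (notation introduced here, not in the slides):

```latex
\hat{L} = \sum_{i=1}^{5} c_i \, \bar{y}_i
        = (-1)(8.093) + (0)\,\bar{y}_2 + (-1)(6.693) + (0)\,\bar{y}_4 + (2)(7.282)
        = -0.222
```

Treatments 2 and 4 have zero coefficients, so the contrast compares treatment 5 with the average of treatments 1 and 3 (scaled by 2).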

Remedies

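- Transform the response. Estimate q as the slope of the regression of log variance on log mean across the treatment groups:

$\log(\operatorname{Var}(y)) = C_0 + q \log(\text{mean})$

- $g(y) = y^{1 - q/2}$ if $q \neq 2$
- $g(y) = \log(y)$ if $q = 2$ and all $y > 0$
- $g(y) = \log(y + \text{shift})$ if $q = 2$ and some $y \le 0$
- Use an analysis for Gaussian data with unequal variances: Satterthwaite's approximation or Welch's test (for one-way ANOVA)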
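For the last remedy, a minimal sketch of how Welch's test might be requested in the one-way case, assuming the `peasection` data set from the one-way ANOVA example. The slides do not show this step; the WELCH and HOVTEST= options of the MEANS statement in PROC GLM apply to one-way models only.

```sas
/* One-way ANOVA on the untransformed response */
proc glm data=peasection;
  class trtmnt;
  model growth = trtmnt;
  /* Levene's test for homogeneity of variance, plus Welch's     */
  /* variance-weighted ANOVA, which does not assume equal        */
  /* variances across treatments                                 */
  means trtmnt / hovtest=levene welch;
run;
quit;
```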

SAS E.D.A.

proc means data=peasection noprint;

var growth;

by trtmnt;

output out=varmeans var= vargro mean=meangro;

run;

data varmeans;

set varmeans;

vargro=log(vargro);

meangro=log(meangro);

run;

proc gplot data=varmeans;

plot vargro*meangro;

run;

proc reg data=varmeans;

model vargro=meangro;

run;
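The slope of this regression is the q of the transformation rule $g(y) = y^{1 - q/2}$. The transcript does not include the regression output; the value below is inferred from the exponent -2.69881 used in the transformation code that follows, so treat it as an assumption rather than a reported estimate.

```latex
1 - \frac{q}{2} = -2.69881
\quad\Longrightarrow\quad
q = 2\,(1 + 2.69881) = 7.39762
```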

SAS transformation and analysis code

data trans;

set peasection;

yt=growth**(-2.69881);

run;

proc glm data=trans;

class trtmnt;

model yt=trtmnt;

means trtmnt /hovtest=levene(type=square);

output out=resi r=res;

run;

proc boxplot data=resi;

title 'BoxPlot for one-way ANOVA example';

plot res*trtmnt/ cframe = vligb

cboxes = dagr

cboxfill = ywh;

insetgroup mean stddev q1 q2 q3/header = 'Summary by Treatment'

ctext = red;

run;

Two-way ANOVA fixed factors

An educational researcher was interested in the factors noise and solitude as they affect study conditions. Each subject in an experiment was asked to study an essay on American history for 15 minutes and was then tested on a 25-item quiz, with the number of correct items as the score. The subjects differed, however, in the conditions under which they were allowed to study.

Factor Solitude with 2 levels: alone and not alone (with a stooge).

Factor Noise with 3 levels: no noise, soft background music, and loud rock and roll music.

There are 3 replications of each treatment combination.

SAS data step

data QuizScores;

input Solitude $ Noise $ Score @@;

datalines;

Alone None 10 Alone None 6 Alone None 14

Alone Soft 21 Alone Soft 21 Alone Soft 16

Alone Loud 5 Alone Loud 15 Alone Loud 7

Stooge None 6 Stooge None 11 Stooge None 1

Stooge Soft 6 Stooge Soft 17 Stooge Soft 13

Stooge Loud 1 Stooge Loud 2 Stooge Loud 6

;

SAS E.D.A

proc boxplot data=quizscores;

title 'BoxPlot for two-way ANOVA example';

plot score*noise(solitude)/ cframe = vligb

cboxes = dagr

cboxfill = ywh;

*inset mean max min/pos=tm header='The overall summary';

insetgroup mean stddev q1 q2 q3/header = 'Summary by Treatment' ctext = red;

run;

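/* cell means for each solitude-by-noise combination; CLASS does not require sorted data */

proc means data=quizscores noprint nway;

class solitude noise;

var score;

output out=meanquizscore mean=meanquiz;

run;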

symbol i=j;

symbol2 i=j;

proc gplot data=meanquizscore;

plot meanquiz*Noise=solitude;

plot meanquiz*solitude=noise;

run;
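The transcript does not show the model-fitting step for this example. A minimal sketch, assuming the `QuizScores` data set from the data step above and a model with both main effects and their interaction:

```sas
/* Two-way ANOVA with fixed factors Solitude and Noise */
proc glm data=quizscores;
  class solitude noise;
  model score = solitude noise solitude*noise;   /* main effects + interaction */
run;
quit;
```

The F test for `solitude*noise` in the output is the interaction test; the interaction plots above give a visual check of the same question, since roughly parallel lines suggest little interaction.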

Slices

- In this example the interaction was not significant. But what should we do if it were?

One way to handle this situation is to use slices.

Because main effects may or may not be significant in the presence of an interaction, we test the effect of one factor at each given level of the other factor.

In SAS, we use the following statement to obtain the slices, where interaction stands for the interaction term and treatment for one of its factors:

lsmeans interaction / slice=treatment;
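Applied to the quiz-score example, the statement might look like the sketch below. The data set and variable names come from the two-way ANOVA data step; requesting slices in both directions is a choice made here for illustration.

```sas
proc glm data=quizscores;
  class solitude noise;
  model score = solitude noise solitude*noise;
  /* effect of solitude within each level of noise */
  lsmeans solitude*noise / slice=noise;
  /* effect of noise within each level of solitude */
  lsmeans solitude*noise / slice=solitude;
run;
quit;
```

Each slice produces one F test per level of the slicing factor, so `slice=noise` yields three tests and `slice=solitude` yields two.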

SAS two way ANOVA random factor

An experiment was performed to examine the effect of aging time on the strength of cement. From a large number of mixes, three cement mixes were randomly selected and six specimens were produced from each mix. After two days, three randomly selected specimens from each mix were tested for strength with a load test, and the other three specimens were tested after seven days.

This is a two-way classification with factors Cement Mix (three levels) and Time (two levels). The levels of Time were predetermined, so Time is a fixed factor. The three cement mixes were randomly selected from a large number of mixes, so Cement Mix is a random factor.

SAS data input

data YieldLoads;

input Aging $ Mix Load @@;

datalines;

2-Days 1 574 2-Days 1 564 2-Days 1 550

2-Days 2 524 2-Days 2 573 2-Days 2 551

2-Days 3 576 2-Days 3 540 2-Days 3 592

7-Days 1 1092 7-Days 1 1086 7-Days 1 1065

7-Days 2 1028 7-Days 2 1073 7-Days 2 998

7-Days 3 1066 7-Days 3 1045 7-Days 3 1055

;

SAS code

proc glm data=yieldloads;

class aging mix;

model load = aging mix aging*mix;

random mix aging*mix /test;

run;

OR USING:

proc mixed data=yieldloads;

class aging mix;

model load= aging;

random mix mix*aging;

run;

MANOVA example

A researcher randomly assigns 33 subjects to one of three groups:

G1 receives technical dietary information interactively from an on-line website.

G2 receives the same information from a nurse practitioner.

G3 receives the information from a video tape made by the same nurse practitioner

The researcher looks at three different ratings of the presentation (difficulty, usefulness, and importance) to determine whether there is a difference among the modes of presentation. In particular, the researcher is interested in whether the interactive website is superior, because that is the most cost-effective way of delivering the information.

SAS code

proc glm data=manovaex;

class group;

model useful difficulty importance = group;

contrast '1 vs 2&3' group 2 -1 -1;

contrast '2 vs 3' group 0 1 -1;

manova h=_all_;

run;

Note: go to the manova.sas example
