Design of experiment I. Motivations Factorial (crossed) design Interactions Block design Nested. Examples.
If we have two samples then under mild conditions we can use t-test to test if difference between means is significant. When there are more than two sample then using t-test might become unreliable.
An example suppose that we want to test effect of various exercises on weight loss. We want to test 5 different exercises. We recruit 20 men and assign for each exercises four of them. After few weeks we record weight loss. Let us denote i=1,2,3,4,5 as exercise number and j=1,2,3,4 person’s number. Then Yij is weight loss for jth person on the ith exercise programme. It is one-way balanced design. One way because we have only one category (exercise programme). Balanced because we have exactly the same number of men on each exercise programme.
Another example: Now we want to subdivide each exercises into 4 subcategories. For each subcategory of the exercise we recruit four men. We measure weight loss after few weeks - Yijk. Where:
i – exercise category
j – exercise subcategory
k – kth men.
Then Yijkis weight loss for kth men in the jth subcategory of ith category. Number off observations is 5x4x4 = 80. It is two-fold nested design.
We want to test: a) There is no significant differences between categories; b) there is no significant difference between different subcategories. It is two-fold nested ANOVA
We have 5 categories of exercises and 4 categories of diets. We hire for each exercise and category 4 persons. There will be 5x4x4=80 men. It is two way crossed design. Two-way because we have categorised men in two ways: exercises and diets. This model is also balanced: we have exactly same number of men for each exercise-diet.
i – exercise number
j – diet number
k – kth person
Yijk – kth person in the ith exercise and jth diet.
In this case we can have two different types of hypothesis testing. Assume that mean for each exercise-diet combination is ij. If we assume that model is additive, i.e. effects of exercise and diet add up then we have: ij = i+j. iis the effect of ith exercise and j is the effect of diet. Then we want to test following hypotheses: a) ij does not depend on exercise and b) ij does not depend on diet.
Sometimes we do not want to assume additivity. Then we want to test one more hypothesis: model is additive. If model is not additive then there might be some problems of interpretations with other hypotheses. In this case it might be useful to use transformation to make the model additive.
Models used for ANOVA can be made more and more complicated. We can design three, four ways crossed models or nested models. We can combine nested and crossed models together. Number of possible ANOVA models is very large.
Factors: Variable that we want to study effect of is called factor. For example exercise is a factor, diet is another factor. In medical experiments where effect of some drugs are of interest then the drug is a factor.
Levels: Several values of a factor is usually selected and these values are called levels. For example different exercise types are levels of the exercise factor.
Response: Result of of experiment or observations
Randomisation: Individuals or subjects for experiments are chosen randomly. I.e. if we have n combination of factor levels and we need k replications of each combinations then we take from nk individuals randomly (it could be tricky in many situations) and each of them is assigned to the factor combination randomly with uniform distribution.
Orthogonality: If we take a level of one of the factors and sum over other factor levels if the effects of others sum to their average then this design is called orthogonal design. Orthogonal designs make analysis of the result simpler.
In general design of experiment should be done carefully. Usual considerations for design are:
Let us assume that we have two factors and for each of them we set three levels. For example three exercises and three diets. Exercise is one factor and different exercises are levels of this factor, diet is the second factor and different diets are levels of this factor. The number of all combinations is 3x3=9. Two way crossed or three level, two factor factorial design.
Test of hypothesis that different levels of factors have same effect can be done using linear model and anova. Let us consider model for this case
Where b=(m,b1,b2) and X is (for two factor two level experiments with two replications (if we use the formula ~f1+f2 in R where f1 and f2 are two factors)
1 0 0
1 0 0
1 1 0
1 1 0
1 0 1
1 0 1
1 1 1
1 1 1
It means for first two observations we have y = m, for third and fourth m+b1, for fifth and sixth m+b2 and for last two m+b1+ b2. If we use lm command of R then the result will be difference between mean values of responses for factor 1 level 1 and , differences between effect of level 1 of factor 2 and differences between levels two levels of factor 1 and finally differences between effect of level 2 of factor 2 and differences between levels two levels of factor 1
If we want to estimate mean values of effects of each level directly then we can use the formula ~f1+f2-1. The resultant matrix will look like:
1 0 0
1 0 0
0 1 0
0 1 0
1 0 1
1 0 1
0 1 1
0 1 1
Which means that effect of each factor level will be estimated directly
If we want analyse interactions between two factors then we can fit the model:
The matrix used in lm with formula ~f1+f2+f1*f2 will look like (when there are two levels):
1 0 0 0
1 0 0 0
1 1 0 0
1 1 0 0
1 0 1 0
1 0 1 0
1 1 1 1
1 1 1 1
Note that as expected the final column of the matrix is product of the second and third columns.
Usually factorial design is used for two level experiments. I.e. for each factor two levels are selected and then for all possible combination of the levels. For each combination experiment also is replicated (let us say nr times). Then for n factor two level factorial design we need nr 2n experiments in total. As soon as the number of factors become more than three or four the number of experiments needed for this type of design becomes extremely large.
If we increase the number of factors then the number of experiments for full factorial experiment increases dramatically. For example if we have 4 factors then for two levels for each factor we need 16=24 and if the number of factors is 8 then we need 256=28. We need to replicate each experiment and we need test runs. If we consider all sides then the number of experiments can become very large and resources may not be sufficient to perform all experiments. To avoid explosion of the number of experiments fraction of factorial design is used.
An example of fractional designs: Assume that we have three factors and for each of them we use two levels (denote levels -1 and 1). Then the number of experiments is 8. These are: (-1,-1,-1), (-1,-1,1), (-1,1,-1), (-1,1,1), (1,-1,-1), (1,-1,1), (1,1,-1), (1,1,1). If we take (-1,-1,-1), (-1,1,1), (1,1,-1), (1,-1,1) then for each level of each factor we will have exactly same number of experiments. If we use the same number of replication for each combination of factors then we will have orthogonal design also.
In many cases before designing the experiment we know that there are some parameters that affect the result of experiment in the same way. For example if we are interested in effect of some substance to tree growth then we know that such factors as time of year (e.g. summer and winter), location of where tree grows (e.g. England and France) will have some systematic effect. If we would plant trees randomly over time and location then variation due to time and location would mask out the effect of substances we want to study. In the previous lectures in case of tests of two means we mentioned that paired design (see shoes example) would increase signal substantially. Paired design is a special case of more general design of experiment technique - block design.
When we want block design then block becomes one or more additional factors. For example if we want to remove variation due to location we can repeat exactly same experiment on different location.
Now we we have one factor with 4 levels and one parameter for blocking with 3 levels then for randomised block design we would have 4x3 matrix for the experiment. Randomisation takes place within blocks
General rule: block whatever you can, randomise the rest.
Blocking should be done with care. If effects of block factors is not as expected then the results would be less reliable.
Latin square are kxk squares where on each column and each row numbers from 1 to k appear only once. For example 3x3 Latin square could look like.
Column, rows are levels of factors one and two and values cells are levels of the third factor.
There are higher level of extension of latin squares: for three dimension graeco-latin and for four dimension hyper-graeco-latin squares
If we are testing two or more factors then it is important to consider interaction terms first. So the first question we should ask if there is an interaction between factors. As it was mentioned above the model for two factor cases will be:
If we reject hypothesis that “there is no interaction” then interpretation of the effects of factors (main effects) could be unreliable. If it is possible interactions should be removed before continuing with analysis. One way of removing interactions is transforming data. For example if effects are multiplicative then log transformation could make them additive.
One way of designing transformation is using variance stabilising transformation of observations. Transformation should be applied with care. If transformation is found then it might be better to use generalised linear model with the corresponding link function.
Box and Cox transformation has a parametric form:
(y-1)/l if 0
Box and Cox have designed a likelihood for this transformation. This likelihood function is maximised with respect to . Typical plot for boxcox transformations look like as in the picture. It means that 1/y could be good variance stabilising transformation.
Sublevels of factor levels
Sometime it happens that levels of one factor has nothing to do with that of another factor. For example if we want to test performance of schools then we randomly select several schools and from each school we choose several classes and consider results in maths. Each class in each school was taught by some teacher. It is unlikely that the same teacher taught in several schools. So it would not be reasonable to consider effect of school and classes as additive (class from one school has nothing to do with that from another school). This type of experiments are called nested designs
In factorial designs (crossed designs) we analyse interactions between different factors first and try to remove them. In nested design we want to know interactions. The model (two fold nested design) is:
There is no interest on 2. They have no meaning.
There are basically two type of commands in R. First is to fit general linear model and second is analyse results.
Command to fit linear model is lm and is used
Formula defines design matrix. See help for formula. For example for PlantGrowth data (available in R) we can use
data(PlantGrowth) - load data into R from standard package
lmPlant = lm(PlantGrowth$weight~PlantGrowth$group)
Then linear model will be fitted into data and result will be stored in lmPlant
Now we can analyse them
anova(lmPlant) will give ANOVA table.
If there are more than one factor (category) then for two-way crossed we can use
lm(data~f1*f2) - It will fit complete model with interactions
lm(data~f1+f2) - It will fit only additive model
lm(data~f1+f1:f2) - It will fit f1 and interaction between f1 and f2. It is used for nested models.
Other useful commands for linear model and analysis are
summary(lmPlant) – give summary after fitting
plot(lmPlant) - plot several useful plots
Another useful command for ANOVA is
This command gives confidence intervals for some of the coefficients and therefore differences between effects of different factors.
To find confidence intervals between any two given effects one can use bootstrap.
If distribution of a data set is from one of the members of exponential family then the command is
Where family is one of the distributions from exponential family. These are
poisson, binomial, Gamma etc
And all other analysis are done using similar command like anova, summary etc.
Note that if you use glm then anova will give analysis of deviances instead of analysis of variances and will not print probabilities by default. You need to specify it. For example if you have used Poisson or binomial distributions then you can use