Statistics
CSE 807
Experimental Design and Analysis
How to:
Design a proper set of experiments for measurement or simulation.
Develop a model that best describes the data obtained.
Estimate the contribution of each alternative to the performance.
Isolate the measurement errors.
E.g., throughput, response time.
E.g., CPU type, memory size, number of disk drives, workload used, and user’s educational level.
Also called predictor variables or predictors.
E.g., the CPU type has three levels: 68000, 8080, or Z80.
# of disk drives has four levels.
Also called treatment.
E.g., CPU type, memory size only, and number of disk drives.
E.g., the workloads.
E.g., full factorial design with 5 replications:
3 x 3 x 4 x 3 x 3 = 324 experiments, each repeated five times.
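The count above can be checked by enumerating the design. This is a minimal sketch: the level counts come from the example, while the factor names are placeholders that do not appear in the source.

```python
from itertools import product

# Level counts from the example above: 3 x 3 x 4 x 3 x 3.
# The factor names are placeholders (assumption), not from the source.
levels_per_factor = {
    "factor_1": 3,
    "factor_2": 3,
    "factor_3": 4,
    "factor_4": 3,
    "factor_5": 3,
}
replications = 5

# In a full factorial design, every combination of levels is one experiment.
design = list(product(*(range(n) for n in levels_per_factor.values())))

num_experiments = len(design)
total_runs = num_experiments * replications
print(num_experiments, total_runs)  # 324 1620
```

With five replications, the 324 experiments amount to 1620 individual runs.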
E.g., users. Generally, no interest in comparing the units.
Goal - minimize the impact of variation among the units.
1. The variation due to experimental error is ignored.
2. Important parameters are not controlled.
3. Effects of different factors are not isolated.
4. Simple one-factor-at-a-time designs are used.
5. Interactions are ignored.
6. Too many experiments are conducted.
Better: two phases.
Not statistically efficient.
Wrong conclusions if the factors have interaction.
Can find the effect of all factors.
Too much time and money.
May try a 2^k design first.
May not get all interactions.
Not a problem if negligible interactions.
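To make the cost trade-off concrete, the sketch below compares how many experiments each kind of design needs. It assumes the usual count for a one-factor-at-a-time design (one baseline run plus one run for each additional level of each factor), which is a standard result rather than something stated in these notes. The level counts are reused from the earlier 3 x 3 x 4 x 3 x 3 example.

```python
from math import prod

def simple_design_runs(levels):
    # One baseline configuration, then vary one factor at a time:
    # each factor contributes (levels - 1) additional experiments.
    return 1 + sum(n - 1 for n in levels)

def full_factorial_runs(levels):
    # Every combination of levels is tried once.
    return prod(levels)

def two_level_runs(levels):
    # A 2^k design restricts every factor to two of its levels.
    return 2 ** len(levels)

levels = [3, 3, 4, 3, 3]
simple = simple_design_runs(levels)
full = full_factorial_runs(levels)
two_level = two_level_runs(levels)
print(simple, full, two_level)  # 12 324 32
```

The simple design is cheapest but cannot detect interactions; the 2^k design is a common middle ground before committing to the full factorial.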
a. CPU type: 68000, 8086, 80286
b. Operating System type: CPM, MS-DOS, UNIX
c. Disk drive type: A, B, C
How many experiments are required to analyze the performance if
a. There is significant interaction among factors.
b. There is no interaction among factors.
c. The interactions are small compared to main effects.
E.g., memory size, the number of disk drives
Performance in MIPS
x_A = -1 if 4M bytes memory
x_A = 1 if 16M bytes memory
x_B = -1 if 1M bytes cache
x_B = 1 if 2M bytes cache
y = q_0 + q_A x_A + q_B x_B + q_AB x_A x_B
15 = q_0 - q_A - q_B + q_AB
45 = q_0 + q_A - q_B - q_AB
25 = q_0 - q_A + q_B - q_AB
75 = q_0 + q_A + q_B + q_AB
y = 40 + 20 x_A + 10 x_B + 5 x_A x_B
Interpretation: Mean performance = 40 MIPS
Effect of memory = 20 MIPS
Effect of cache = 10 MIPS
Interaction between memory and cache = 5 MIPS
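As a quick check, the fitted model can be evaluated at the four design points to confirm that it reproduces the observed 15, 45, 25, and 75 MIPS. This is only an illustrative sketch; the function name `predict` is made up for the example.

```python
def predict(x_a, x_b):
    # Fitted model from the text: y = 40 + 20 x_A + 10 x_B + 5 x_A x_B
    return 40 + 20 * x_a + 10 * x_b + 5 * x_a * x_b

# Design points as (x_A, x_B), in the same order as the four equations.
design_points = [(-1, -1), (1, -1), (-1, 1), (1, 1)]
predictions = [predict(a, b) for a, b in design_points]
print(predictions)  # [15, 45, 25, 75]
```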
Model: y = q_0 + q_A x_A + q_B x_B + q_AB x_A x_B
y_1 = q_0 - q_A - q_B + q_AB
y_2 = q_0 + q_A - q_B - q_AB
y_3 = q_0 - q_A + q_B - q_AB
y_4 = q_0 + q_A + q_B + q_AB
q_0 = 1/4 (y_1 + y_2 + y_3 + y_4)
q_A = 1/4 (-y_1 + y_2 - y_3 + y_4)
q_B = 1/4 (-y_1 - y_2 + y_3 + y_4)
q_AB = 1/4 (y_1 - y_2 - y_3 + y_4)
Notice that effects are linear combinations of responses.
Sum of the coefficients is zero => contrasts.
Notice: q_A = (Column A x Column y) / 4
q_B = (Column B x Column y) / 4
q_AB = (Column A x Column B x Column y) / 4
Here each product is taken entry by entry and the results are summed.
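The column-product rule can be sketched as a small sign-table computation. The helper name `effects_2x2` is hypothetical; the responses are the four MIPS values from the memory/cache example.

```python
def effects_2x2(y):
    # Sign table for a 2^2 design, rows in the order y1, y2, y3, y4.
    col_i = [1, 1, 1, 1]
    col_a = [-1, 1, -1, 1]
    col_b = [-1, -1, 1, 1]
    col_ab = [a * b for a, b in zip(col_a, col_b)]

    n = len(y)

    def effect(column):
        # Multiply the column by y entry by entry, sum, and divide by 4.
        return sum(c * v for c, v in zip(column, y)) / n

    return {
        "q0": effect(col_i),
        "qA": effect(col_a),
        "qB": effect(col_b),
        "qAB": effect(col_ab),
    }

effects = effects_2x2([15, 45, 25, 75])
print(effects)  # {'q0': 40.0, 'qA': 20.0, 'qB': 10.0, 'qAB': 5.0}
```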
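Total variation of y = sum_i (y_i - ybar)^2 = sum of squares total (SST), where ybar is the mean response.
For a 2^2 design:
SST = 2^2 q_A^2 + 2^2 q_B^2 + 2^2 q_AB^2
Variation due to A = SSA = 2^2 q_A^2
Variation due to B = SSB = 2^2 q_B^2
Variation due to interaction = SSAB = 2^2 q_AB^2
SST = SSA + SSB + SSAB
Fraction of variation explained by A = SSA / SST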
y_i = q_0 + q_A x_Ai + q_B x_Bi + q_AB x_Ai x_Bi
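1. The sum of entries in each column is zero: sum_i x_Ai = sum_i x_Bi = sum_i x_Ai x_Bi = 0
2. The sum of the squares of entries in each column is 4: sum_i x_Ai^2 = sum_i x_Bi^2 = sum_i (x_Ai x_Bi)^2 = 4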
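These column properties are easy to verify numerically. The sketch below also checks that the columns are mutually orthogonal, a property not listed above but one that holds for the 2^2 sign table and is what makes the cross terms vanish when the total variation is expanded.

```python
from itertools import combinations

# Sign-table columns for a 2^2 design.
col_a = [-1, 1, -1, 1]
col_b = [-1, -1, 1, 1]
col_ab = [a * b for a, b in zip(col_a, col_b)]
columns = {"A": col_a, "B": col_b, "AB": col_ab}

# Property 1: each column sums to zero.
column_sums = {name: sum(col) for name, col in columns.items()}

# Property 2: the squares of the entries in each column sum to 4.
column_sum_squares = {name: sum(c * c for c in col) for name, col in columns.items()}

# Extra check (not listed in the text): distinct columns are orthogonal.
dot_products = {
    (n1, n2): sum(x * y for x, y in zip(columns[n1], columns[n2]))
    for n1, n2 in combinations(columns, 2)
}

print(column_sums)
print(column_sum_squares)
print(dot_products)
```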
Variation of y
Total variation = 2100
Variation due to memory = 1600 (76%)
Variation due to cache = 400 (19%)
Variation due to interaction = 100 (5%)
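The allocation of variation for the memory/cache example can be reproduced in a few lines. This sketch takes the responses and effects from the example and computes SST two ways: directly from the responses, and as the sum of the component variations.

```python
# Responses and effects from the memory/cache example.
y = [15, 45, 25, 75]
effects = {"memory": 20, "cache": 10, "interaction": 5}

n = len(y)  # 2^2 = 4 experiments
mean = sum(y) / n

# Total variation computed directly from the responses.
sst = sum((v - mean) ** 2 for v in y)

# Variation attributed to each effect: 2^2 * q^2.
variation = {name: n * q ** 2 for name, q in effects.items()}

# Percentage of the total variation explained by each effect.
percent = {name: round(100 * v / sst) for name, v in variation.items()}

print(sst)        # 2100.0
print(variation)  # {'memory': 1600, 'cache': 400, 'interaction': 100}
print(percent)    # {'memory': 76, 'cache': 19, 'interaction': 5}
```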
Memory interconnection networks: Omega and Crossbar.
Memory reference patterns: Random and Matrix.
1. Number of processors was fixed at 16.
2. Queued requests were not buffered but blocked.
3. Circuit switching instead of packet switching.
4. Random arbitration instead of round robin.
5. Infinite interleaving of memory => no memory bank contention.
Factors Used in the Interconnection Network Study
Omega networks = Average + 0.0595
Crossbar networks = Average - 0.0595
Difference between the two = 0.119
k factors at two levels each.
k main effects
C(k, 2) two-factor interactions
C(k, 3) three-factor interactions...
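The number of effects of each order in a 2^k design follows from counting the ways to choose which factors interact. The sketch below assumes the standard binomial count C(k, j) for j-factor interactions; the function name is illustrative.

```python
from math import comb

def effect_counts(k):
    # Order 1 = main effects, order 2 = two-factor interactions, and so on.
    return {order: comb(k, order) for order in range(1, k + 1)}

counts = effect_counts(3)
total_effects = sum(counts.values())
print(counts)         # {1: 3, 2: 3, 3: 1}
print(total_effects)  # 7
```

Together with the mean, the 2^k - 1 effects account for all 2^k observations, so the model fits the data exactly when there are no replications.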
Three factors in designing a machine:
Number of processors
Number of Processors (C) is the most important factor
Analyze the 2^3 design:
a. Quantify main effects and all interactions.
b. Quantify percentages of variation explained.
c. Sort the variables in the order of decreasing importance.
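The three steps can be sketched for any 2^k design with a generic sign-table routine. The function `analyze_2k` is a hypothetical helper, and it assumes the responses are listed in standard order (the first factor changes fastest). Because the 2^3 measurements are not reproduced here, the usage example runs on the 2^2 memory/cache data; for a 2^3 design one would pass three factor names and eight responses.

```python
from itertools import combinations, product
from math import prod

def analyze_2k(factor_names, y):
    k = len(factor_names)
    # Rows of the sign table in standard order: first factor changes fastest.
    rows = [tuple(reversed(r)) for r in product((-1, 1), repeat=k)]
    if len(y) != len(rows):
        raise ValueError("expected 2^k responses")

    n = len(rows)
    mean = sum(y) / n
    sst = sum((v - mean) ** 2 for v in y)

    results = []
    for order in range(1, k + 1):
        for idx in combinations(range(k), order):
            # Column for this effect: product of the chosen factor columns.
            column = [prod(row[i] for i in idx) for row in rows]
            q = sum(c * v for c, v in zip(column, y)) / n
            share = 100 * n * q ** 2 / sst if sst else 0.0
            name = "".join(factor_names[i] for i in idx)
            results.append((name, q, share))

    # Step c: most important effect first.
    results.sort(key=lambda r: r[2], reverse=True)
    return mean, results

mean, results = analyze_2k(["A", "B"], [15, 45, 25, 75])
for name, q, share in results:
    print(f"{name}: effect = {q}, variation explained = {share:.1f}%")
```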