## Outline

**More Experiment Design**
CS 239: Experimental Methodologies for System Software
Peter Reiher
May 8, 2007

**Outline**
• Multiplicative experiment design models
• $2^k r$ factorial experiment designs
• $2^{k-p}$ fractional factorial designs
• Confounding in fractional factorial designs

**Multiplicative Models for $2^2 r$ Experiments**
• Assumptions of additive models
• Example of a multiplicative situation
• Handling a multiplicative model
• When to choose a multiplicative model
• Multiplicative example

**Assumptions of Additive Models**
• Last time's analysis used an additive model:
  • $y_{ij} = q_0 + q_A x_A + q_B x_B + q_{AB} x_A x_B + e_{ij}$
• Assumes all effects are additive:
  • Factors
  • Interactions
  • Errors
• This assumption must be validated!

**Example of a Multiplicative Situation**
• Testing processors with different workloads
• Most common multiplicative case
• Consider 2 processors, 2 workloads
• Use a $2^2 r$ design
• Response is time to execute $w_j$ instructions on a processor that takes $v_i$ seconds/instruction
• $w_j$ and $v_i$ sound like good factors to test
• Without interactions, time is $y_{ij} = v_i w_j$

**Handling a Multiplicative Model**
• Take logarithm of both sides: $y_{ij} = v_i w_j$, so $\log(y_{ij}) = \log(v_i) + \log(w_j)$
• Use additive model on logarithms
• $x_A$ is $\log(v_i)$, $x_B$ is $\log(w_j)$
• Choose your high and low levels for each
• Resulting model is:
  • $\log(y_{ij}) = q_0 + q_A x_A + q_B x_B + q_{AB} x_A x_B + e_{ij}$
• But we care about $y_{ij}$, not $\log(y_{ij})$

**Converting Back to $y_{ij}$**
• Take antilog of both sides of equation: $y_{ij} = 10^{q_0} \cdot 10^{q_A x_A} \cdot 10^{q_B x_B} \cdot 10^{q_{AB} x_A x_B} \cdot 10^{e_{ij}}$
• $U_A = 10^{q_A}$
• $U_B = 10^{q_B}$
• $U_{AB} = 10^{q_{AB}}$

**Meaning of a Multiplicative Model**
• Model is $y_{ij} = 10^{q_0} \, U_A^{x_A} \, U_B^{x_B} \, U_{AB}^{x_A x_B} \, 10^{e_{ij}}$
• Here, $U_A = 10^{q_A}$ is ratio of MIPS ratings of processors, $U_B = 10^{q_B}$ is ratio of workload size
• Antilog of $q_0$ is geometric mean of responses: $10^{q_0} = (y_1 y_2 \cdots y_n)^{1/n}$, where $n = 2^2 r$

**When to Choose a Multiplicative Model?**
• Physical considerations (see previous slides)
• Range of y is large
  • Making arithmetic mean unreasonable
  • Calling for log transformation
• Plot of residuals shows large values and increasing spread
• Quantile-quantile plot doesn't look like normal distribution

**Multiplicative Example**
• Consider additive model of processors A1 and A2 running benchmarks B1 and B2: [table not preserved]
• Note large range of y values

**Multiplicative Model**
• Taking logs of everything, the model is: [table not preserved]

**Summary of the Two Models**
• [comparison table not preserved]
• Which suggests the time to run a benchmark depends only on the processor speed and benchmark size
• Sounds about right

**General $2^k r$ Factorial Design**
• Simple extension of $2^2 r$
• Just k factors, not 2
• See Box 18.1 for summary
• Always do visual tests
• Remember to consider multiplicative model as alternative

**Example of $2^k r$ Factorial Design**
• Consider a $2^3 3$ design
  • 3 factors
  • 2 levels for each
  • 3 replications of each combination
• There will be more factor interaction terms, of course

**Allocation of Variation for $2^3 3$ Design**
• Percent variation explained: [values not preserved]
• 90% confidence intervals: [values not preserved]

**Quantile-Quantile Plot for Means $2^3 3$**
• [Figure: quantile-quantile plot]
• $R^2$ for this one is 0.94

**Concerns With These Kinds of Designs**
• They don't test all possible levels
  • Only test two, in fact
  • Solved by full factorial designs
  • Which we'll cover later
• They are a lot of work
  • Especially if there are many factors
  • Solved by fractional factorial design

**Fractional Designs**
• What if there are many factors?
• You can't afford to test all combinations
• Well, then, test only some of them
• How should you determine which combinations to test?
• Losing the least information

**$2^{k-p}$ Fractional Factorial Designs**
• Introductory example of a $2^{k-p}$ design
• Preparing the sign table for a $2^{k-p}$ design
• Confounding
• Algebra of confounding
• Design resolution

**What Is a $2^{k-p}$ Fractional Factorial Design?**
• As before, test only two levels of each factor
• But instead of testing all $2^k$ combinations,
• Only test $2^{k-p}$ of them
• The larger p is, the fewer combinations tested
• E.g., for k = 5 and p = 2, reduces tests from 32 to 8

**Introductory Example of a $2^{k-p}$ Design**
• Exploring 7 factors in only 8 experiments
• k = 7, p = 4
• Full factorial design would take 128 experiments
• Won't we save time!
• Would be nice to know what price we paid
• We can't know everything
• But we can get some control

**Analysis of $2^{7-4}$ Design**
• Here $x_{ij}$ is the sign of column $j$ in experiment $i$
• Column sums are zero: $\sum_i x_{ij} = 0$
• Sum of 2-column product is zero: $\sum_i x_{ij} x_{il} = 0$ for $j \neq l$
• Sum of column squares is $2^{7-4} = 8$: $\sum_i x_{ij}^2 = 8$
• Orthogonality allows easy calculation of effects: $q_j = \frac{1}{8} \sum_i y_i x_{ij}$

**Effects and Confidence Intervals for $2^{k-p}$ Designs**
• Effects are as in $2^k$ designs: $q_j = \frac{1}{2^{k-p}} \sum_i y_i x_{ij}$
• % variation proportional to squared effects
• For standard deviations and confidence intervals:
  • Use formulas from full factorial designs
  • Replace $2^k$ with $2^{k-p}$

**Preparing the Sign Table for a $2^{k-p}$ Design**
• Start by preparing a sign table for k-p factors
• Assign first k-p factors as before
• Then assign remaining factors
  • In the place of some (or all) of the combined effects columns

**Sign Table for k-p Factors**
• Same as table for experiment with k-p factors
  • I.e., a $2^{k-p}$ table
  • $2^{k-p}$ rows and $2^{k-p}$ columns
• First column is I, contains all 1's
• Next k-p columns get k-p selected factors
• Rest are products of factors

**Assigning Remaining Factors**
• $2^{k-p} - (k-p) - 1$ product columns remain
• Choose any p columns
• Assign remaining p factors to them
• Any others stay as-is, measuring interactions

**An Example**
• Let's build a $2^{5-2}$ table
• So there are five factors A, B, C, D, and E
• But we only want to run 8 experiments
• p = 2
• k - p = 5 - 2 = 3

**Running Experiments With This Sign Table**
• Use it just as before
• Run the set of experiments the table indicates
• E.g., run A, B, C, D at low level, E at high level
• Then A and E at high, B, C, and D at low
• And so on

**Calculating Effects With the Sign Table**
• Just like before
• Multiply experiment results by columns
• Add up results
• Divide by number of experiments
• There are your q values

**What Have We Paid?**
• The fourth column shows the combined effects of A and B
• The fifth column shows the combined effects of A and C
• What about all the other effect combinations?

**Confounding**
• The other combined effects were confounded
• The confounding problem
• An example of confounding
• Confounding notation
• Choices in fractional factorial design

**The Confounding Problem**
• Fundamental to fractional factorial designs
• Some effects produce combined influences
• Limited experiments mean only some combinations can be calculated
• Problem of combined influence is confounding
• Inseparable effects are called confounded effects

**An Example of Confounding**
• Consider this $2^{3-1}$ table: [table not preserved]
• Extend it with an AB column: [table not preserved]

**Analyzing the Confounding Example**
• Effect of C is same as that of AB:
  • $q_C = (y_1 - y_2 - y_3 + y_4)/4$
  • $q_{AB} = (y_1 - y_2 - y_3 + y_4)/4$
• Formula for $q_C$ really gives combined effect:
  • $q_C + q_{AB} = (y_1 - y_2 - y_3 + y_4)/4$
• No way to separate $q_C$ from $q_{AB}$
• Not a problem if $q_{AB}$ is known to be small

**Let's Go Back to Our Example**
• Where are combined effects AD, AE, BC, BD, BE, CD, CE, and DE?
• Not to mention ABC, ABD, ABE, ACD, ACE, ADE, BCD, BDE, BCE, CDE, ABCD, ABCE, ACDE, ABDE, BCDE, and ABCDE?

**Confounding Notation**
• Previous $2^{3-1}$ confounding is denoted by equating confounded effects: C = AB
• Other effects are also confounded in this design: A = BC, B = AC, I = ABC
• Last entry indicates ABC is confounded with overall mean, or $q_0$

**What Does Confounding Really Mean?**
• Each effect is a combination of several effects from a full experiment
• Impossible to pull out one from the other
  • Unless you change design and make more runs
• Must be aware of what's getting confounded

**Getting Concrete on the Meaning of Confounding**
• Consider our generic $2^{3-1}$ fractional factorial experiment
• What if we're measuring computer performance?
• With three factors:
  • CPU speed (A)
  • Memory size (B)
  • Disk speed (C)

**Using Our Fractional Design,**
• We have combined the effect of disk speed with the interaction of CPU speed and memory size (C = AB)
• And the effect of CPU speed with the combined effects of disk speed and memory size (A = BC)
• Among several others
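
The confounding relations for the $2^{3-1}$ design (C = AB, A = BC, B = AC, I = ABC) can be checked numerically. The sketch below is a minimal illustration, not the table from the slides: it assumes the runs are listed in standard order (A varies fastest) and that C is assigned to the AB column, which agrees with the formula $q_C = (y_1 - y_2 - y_3 + y_4)/4$. The helper names `column` and `effect` and the response values in `y` are hypothetical, invented only for this example.

```python
from itertools import product

# Four runs of a 2^(3-1) design as (A, B, C) sign triples.
# Assumption: standard order with A varying fastest, and C assigned
# to the AB product column (generator C = AB).
runs = [(a, b, a * b) for b, a in product((-1, 1), repeat=2)]

FACTOR_INDEX = {"A": 0, "B": 1, "C": 2}


def column(name):
    """Sign column for an effect such as 'A', 'BC', or 'ABC'."""
    col = []
    for run in runs:
        sign = 1
        for factor in name:
            sign *= run[FACTOR_INDEX[factor]]
        col.append(sign)
    return col


def effect(name, y):
    """Multiply responses by the sign column, add up, divide by run count."""
    return sum(s * yi for s, yi in zip(column(name), y)) / len(y)


# Hypothetical responses, invented for illustration only.
y = [20.0, 35.0, 15.0, 50.0]

identity = [1] * len(runs)
print("C  == AB :", column("C") == column("AB"))
print("A  == BC :", column("A") == column("BC"))
print("B  == AC :", column("B") == column("AC"))
print("I  == ABC:", identity == column("ABC"))
print("q_C  =", effect("C", y))
print("q_AB =", effect("AB", y))
```

Because the C and AB columns are identical, `effect("C", y)` and `effect("AB", y)` always return the same number for any responses: the four runs estimate only the sum $q_C + q_{AB}$, never the two terms separately.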