
Statistics

CSE 807



Experimental Design and Analysis

How to:

  • Design a proper set of experiments for measurement or simulation.

  • Develop a model that best describes the data obtained.

  • Estimate the contribution of each alternative to the performance.

  • Isolate the measurement errors.

  • Estimate confidence intervals for model parameters.

  • Check if the alternatives are significantly different.

  • Check if the model is adequate.



Example

  • Personal workstation design.

  • Processor: 68000, Z80, or 8086.

  • Memory size: 512K, 2M, or 8M bytes.

  • Number of Disks: One, two, three, or four.

  • Workload: Secretarial, managerial, or scientific.

  • User education: High school, college, or Post-graduate level.



Terminology

  • Response Variable: Outcome.

    E.g., throughput, response time.

  • Factors: Variables that affect the response variable.

    E.g., CPU type, memory size, number of disk drives, workload used, and user’s educational level.

    Also called predictor variables or predictors.

  • Levels: The values that a factor can assume.

    E.g., the CPU type has three levels:

    68000, Z80, or 8086.

    # of disk drives has four levels.

    Also called treatments.



Terminology (cont’d)

  • Primary Factors: The factors whose effects need to be quantified.

    E.g., only CPU type, memory size, and number of disk drives.

  • Secondary Factors: Factors whose impact need not be quantified.

    E.g., the workloads.

  • Replication: Repetition of all or some experiments.



Terminology (cont’d)

  • Design: The number of experiments, the factor-level combination for each experiment, and the number of replications of each experiment.

    E.g., Full Factorial design with 5 replications:

    3 × 3 × 4 × 3 × 3 = 324 experiments, each repeated five times.

  • Experimental Unit: Any entity that is used for experiments.

    E.g., users. Generally, no interest in comparing the units.

    Goal - minimize the impact of variation among the units.
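
As a minimal sketch (not part of the original slides), the full factorial design for the workstation example can be enumerated in Python. The factor levels come from the Example slide; the dictionary layout and all variable names are assumptions made for illustration.

```python
from itertools import product

# Factors and levels from the workstation example (labels are illustrative).
factors = {
    "processor": ["68000", "Z80", "8086"],
    "memory": ["512K", "2M", "8M"],
    "disks": [1, 2, 3, 4],
    "workload": ["secretarial", "managerial", "scientific"],
    "education": ["high school", "college", "post-graduate"],
}
replications = 5

# A full factorial design uses every combination of factor levels.
experiments = list(product(*factors.values()))
num_experiments = len(experiments)            # 3 * 3 * 4 * 3 * 3
num_observations = num_experiments * replications

print(num_experiments, num_observations)
```

Each tuple in `experiments` is one factor-level combination; repeating each of them five times gives the total number of observations.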



Terminology (cont’d)

  • Interaction: Two factors interact if the effect of one depends upon the level of the other.

Non-interacting Factors

Interacting Factors
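
The two cases can be illustrated with a small sketch. The numbers below are hypothetical (the tables from the original slide did not survive in this transcript), and the function name `interaction` is an assumption. With no interaction, changing A from level 1 to level 2 shifts the response by the same amount at both levels of B; with interaction, the shift differs.

```python
def interaction(table):
    """Difference of differences for a 2x2 table [[a1b1, a2b1], [a1b2, a2b2]]."""
    (a1b1, a2b1), (a1b2, a2b2) = table
    effect_of_a_at_b1 = a2b1 - a1b1
    effect_of_a_at_b2 = a2b2 - a1b2
    return effect_of_a_at_b2 - effect_of_a_at_b1

# Hypothetical responses: rows are levels of B, columns are levels of A.
non_interacting = [[3, 5], [6, 8]]   # A adds 2 at both levels of B
interacting = [[3, 5], [6, 9]]       # A adds 2 at B1 but 3 at B2

print(interaction(non_interacting))  # 0
print(interaction(interacting))      # 1
```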



Common Mistakes in Experimentation

1. The variation due to experimental error is ignored.

2. Important parameters are not controlled.

3. Effects of different factors are not isolated.

4. Simple one-factor-at-a-time designs are used.

5. Interactions are ignored.

6. Too many experiments are conducted.

Better: two phases.



Types of Experimental Designs

  • Simple Designs: Vary one factor at a time

    • # of Experiments = $1 + \sum_{i=1}^{k} (n_i - 1)$, where factor $i$ has $n_i$ levels

      Not statistically efficient.

      Wrong conclusions if the factors have interaction.

      Not recommended.



Types of Experimental Designs (cont’d)

  • Full Factorial Design: All combinations.

    • # of Experiments = $\prod_{i=1}^{k} n_i$, where factor $i$ has $n_i$ levels

      Can find the effect of all factors.

      Too much time and money.

      May try a 2^k design first.
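
To compare the design types, here is a rough sketch that counts experiments for the workstation example (five factors with 3, 3, 4, 3, and 3 levels). It assumes the standard counts: one baseline run plus one run per additional level for a one-factor-at-a-time design, the product of the level counts for a full factorial design, and 2^k for a two-level design. Variable names are illustrative.

```python
from math import prod

levels = [3, 3, 4, 3, 3]   # levels per factor in the workstation example

# One-factor-at-a-time: a baseline run, then vary each factor on its own.
simple = 1 + sum(n - 1 for n in levels)

# Full factorial: every combination of levels.
full = prod(levels)

# 2^k design: each of the k factors restricted to two levels.
two_k = 2 ** len(levels)

print(simple, full, two_k)
```

The simple design is by far the cheapest, but it cannot detect interactions; the 2^k design is a middle ground for a first pass.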



Types of Experimental Designs (cont’d)

  • Fractional Factorial Designs: Save time and expense.

    Less information.

    May not get all interactions.

    Not a problem if negligible interactions.



A Sample Fractional Factorial Design.



Exercise

  • The performance of a system being designed depends upon the following three factors:

    a. CPU type: 68000, 8086, 80286

    b. Operating system type: CP/M, MS-DOS, UNIX

    c. Disk drive type: A, B, C

    How many experiments are required to analyze the performance if

    a. There is significant interaction among factors.

    b. There is no interaction among factors.

    c. The interactions are small compared to main effects.


2^k Factorial Designs

  • k factors, each at two levels.

  • Easy to analyze.

  • Helps in sorting out impact of factors.

  • Good at the beginning of study.

  • Valid only if the effect is unidirectional.

    E.g., memory size, the number of disk drives


2^2 Factorial Designs

  • Two factors, each at two levels.

    Performance in MIPS:

    Cache Size    Memory Size
                  4M Bytes    16M Bytes
    1K            15          45
    2K            25          75

    $x_A = -1$ if 4M bytes memory, $+1$ if 16M bytes memory

    $x_B = -1$ if 1K bytes cache, $+1$ if 2K bytes cache


Model

$y = q_0 + q_A x_A + q_B x_B + q_{AB} x_A x_B$

Substituting the four observations:

$15 = q_0 - q_A - q_B + q_{AB}$

$45 = q_0 + q_A - q_B - q_{AB}$

$25 = q_0 - q_A + q_B - q_{AB}$

$75 = q_0 + q_A + q_B + q_{AB}$

Solving the four equations:

$y = 40 + 20 x_A + 10 x_B + 5 x_A x_B$

Interpretation: Mean performance = 40 MIPS

Effect of memory = 20 MIPS

Effect of cache = 10 MIPS

Interaction between memory and cache = 5 MIPS
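
As a quick check, the fitted model can be evaluated at the four design points. This is a minimal sketch; the function name `model` and the coding of levels as -1/+1 tuples are assumptions for illustration.

```python
def model(x_a, x_b, q0=40, q_a=20, q_b=10, q_ab=5):
    """Fitted 2^2 model: y = q0 + qA*xA + qB*xB + qAB*xA*xB."""
    return q0 + q_a * x_a + q_b * x_b + q_ab * x_a * x_b

# Observed MIPS keyed by (xA, xB): xA codes memory size, xB codes cache size.
observations = {(-1, -1): 15, (1, -1): 45, (-1, 1): 25, (1, 1): 75}

predictions = {point: model(*point) for point in observations}
print(predictions == observations)
```

With four parameters and four observations the model reproduces the data exactly, so there is no residual left to estimate experimental error without replication.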


Computation of Effects

Model: $y = q_0 + q_A x_A + q_B x_B + q_{AB} x_A x_B$

Substitution:

$y_1 = q_0 - q_A - q_B + q_{AB}$

$y_2 = q_0 + q_A - q_B - q_{AB}$

$y_3 = q_0 - q_A + q_B - q_{AB}$

$y_4 = q_0 + q_A + q_B + q_{AB}$


Computation of Effects (cont’d)

Solution:

$q_0 = \frac{1}{4}(y_1 + y_2 + y_3 + y_4)$

$q_A = \frac{1}{4}(-y_1 + y_2 - y_3 + y_4)$

$q_B = \frac{1}{4}(-y_1 - y_2 + y_3 + y_4)$

$q_{AB} = \frac{1}{4}(y_1 - y_2 - y_3 + y_4)$

Notice that effects are linear combinations of responses.

For $q_A$, $q_B$, and $q_{AB}$ the coefficients sum to zero, so these effects are contrasts.

Notice: $q_A$ = (Column A × Column y) / 4

$q_B$ = (Column B × Column y) / 4

$q_{AB}$ = (Column A × Column B × Column y) / 4



Sign Table Method
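
The sign table itself is not preserved in this transcript. As a sketch of the method under that caveat, the code below builds the table for the memory-cache data (columns I, A, B, the derived column AB, and the response y), multiplies each sign column by y, sums, and divides by the number of experiments. The function name `effects_2x2` and the row layout are assumptions for illustration.

```python
# One row per experiment: (I, A, B, y). The AB column is the product A * B.
rows = [
    (1, -1, -1, 15),
    (1,  1, -1, 45),
    (1, -1,  1, 25),
    (1,  1,  1, 75),
]

def effects_2x2(rows):
    """Column-times-y totals divided by the number of experiments."""
    n = len(rows)
    totals = {"q0": 0, "qA": 0, "qB": 0, "qAB": 0}
    for i, a, b, y in rows:
        totals["q0"] += i * y
        totals["qA"] += a * y
        totals["qB"] += b * y
        totals["qAB"] += a * b * y
    return {name: total / n for name, total in totals.items()}

effects = effects_2x2(rows)
print(effects)
```

The column totals are 160, 80, 40, and 20; dividing each by 4 gives the effects 40, 20, 10, and 5.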


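Allocation of Variation

  • Importance of a factor = proportion of the variation explained.

  • Sample variance of y: $s_y^2 = \frac{\sum_{i=1}^{2^2} (y_i - \bar{y})^2}{2^2 - 1}$

  • Variation of y = numerator of the sample variance

    = sum of squares total (SST): $\mathrm{SST} = \sum_{i=1}^{2^2} (y_i - \bar{y})^2$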


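Allocation of Variation (cont’d)

For a 2^2 design: $\mathrm{SST} = 2^2 q_A^2 + 2^2 q_B^2 + 2^2 q_{AB}^2$

Variation due to A: $\mathrm{SSA} = 2^2 q_A^2$

Variation due to B: $\mathrm{SSB} = 2^2 q_B^2$

Variation due to interaction: $\mathrm{SSAB} = 2^2 q_{AB}^2$

SST = SSA + SSB + SSAB

Fraction explained by A = SSA / SST

Variation ≠ Variance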


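Derivation

Model:

$y_i = q_0 + q_A x_{Ai} + q_B x_{Bi} + q_{AB} x_{Ai} x_{Bi}$

Notice

1. The sum of entries in each column is zero: $\sum_i x_{Ai} = 0$, $\sum_i x_{Bi} = 0$, $\sum_i x_{Ai} x_{Bi} = 0$

2. The sum of the squares of entries in each column is 4: $\sum_i x_{Ai}^2 = 4$, $\sum_i x_{Bi}^2 = 4$, $\sum_i (x_{Ai} x_{Bi})^2 = 4$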


Derivation (cont’d)

3. The columns are orthogonal (inner product of any two columns is zero): $\sum_i x_{Ai} x_{Bi} = 0$, $\sum_i x_{Ai} (x_{Ai} x_{Bi}) = 0$, $\sum_i x_{Bi} (x_{Ai} x_{Bi}) = 0$
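
The column properties used in this derivation (zero column sums, sums of squares equal to 4, and pairwise orthogonality) can be checked numerically. The sketch below assumes the experiment ordering used in the substitution step (xA alternating fastest); the variable names are illustrative.

```python
from itertools import combinations

x_a = [-1, 1, -1, 1]
x_b = [-1, -1, 1, 1]
x_ab = [a * b for a, b in zip(x_a, x_b)]   # interaction column
columns = {"A": x_a, "B": x_b, "AB": x_ab}

# Property 1: each column sums to zero.
sums = {name: sum(col) for name, col in columns.items()}

# Property 2: the squares in each column sum to 4.
sums_of_squares = {name: sum(v * v for v in col) for name, col in columns.items()}

# Property 3: the inner product of any two distinct columns is zero.
inner_products = {
    (m, n): sum(u * v for u, v in zip(columns[m], columns[n]))
    for m, n in combinations(columns, 2)
}

print(sums, sums_of_squares, inner_products)
```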


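Derivation (cont’d)

Sample mean:

$\bar{y} = \frac{1}{4} \sum_i y_i = \frac{1}{4} \sum_i (q_0 + q_A x_{Ai} + q_B x_{Bi} + q_{AB} x_{Ai} x_{Bi}) = q_0$

since each sign column sums to zero.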


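Derivation (cont’d)

Variation of y:

$\sum_i (y_i - \bar{y})^2 = \sum_i (q_A x_{Ai} + q_B x_{Bi} + q_{AB} x_{Ai} x_{Bi})^2$

$= q_A^2 \sum_i x_{Ai}^2 + q_B^2 \sum_i x_{Bi}^2 + q_{AB}^2 \sum_i (x_{Ai} x_{Bi})^2 + \text{product terms}$

$= 4 q_A^2 + 4 q_B^2 + 4 q_{AB}^2$

Product terms vanish because the columns are orthogonal.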


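Example

Memory-cache study:

$\bar{y} = \frac{1}{4}(15 + 45 + 25 + 75) = 40$

Total variation $= \sum_i (y_i - \bar{y})^2 = 25^2 + 5^2 + 15^2 + 35^2 = 2100 = 4 \times 20^2 + 4 \times 10^2 + 4 \times 5^2$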

Variation due to memory = 1600 (76%)

Variation due to cache = 400 (19%)

Variation due to interaction = 100 (5%)
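
The allocation for the memory-cache study can be reproduced with a few lines of Python. This is a sketch under the 2^2 relation SS = 2^2 × (effect)^2 for each term; the dictionary keys and variable names are assumptions for illustration.

```python
y = [15, 45, 25, 75]                      # observed MIPS
q = {"memory (A)": 20, "cache (B)": 10, "interaction (AB)": 5}

mean = sum(y) / len(y)
sst = sum((yi - mean) ** 2 for yi in y)   # total variation

# Each effect contributes (number of experiments) * effect^2.
ss = {name: len(y) * effect ** 2 for name, effect in q.items()}
percent = {name: 100 * value / sst for name, value in ss.items()}

for name in q:
    print(f"{name}: {ss[name]} ({percent[name]:.0f}%)")
```

The three components add up to the total variation, which is why each percentage can be read as the share of variation explained by that term.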



Case Study: Interconnection Net

Memory interconnection networks:

Omega and Crossbar.

Memory reference patterns:

Random and Matrix.

Fixed factors:

1. Number of processors was fixed at 16.

2. Queued requests were not buffered but blocked.

3. Circuit switching instead of packet switching.

4. Random arbitration instead of round robin.

5. Infinite interleaving of memory => no memory bank contention.


2^2 Design for Interconnection Networks

[Table: Factors Used in the Interconnection Network Study, giving the two levels of each factor]

[Table: Response for each combination of factor levels]


Interconnection Network Study (cont’d)

                 Mean Estimate                  Variation Explained
    Parameter    T          N        R          T        N      R
    q0           0.5725     3.5      1.871
    qA           0.0595     -0.5     -0.145     17.2%    20%    10.9%
    qB           -0.1257    1.0      0.413      77.0%    80%    87.8%
    qAB          -0.0346    0.0      0.051      5.8%     0%     1.3%



Interpretation of Results

  • Average throughput = 0.5725

  • Most effective factor = B = reference pattern => The address patterns chosen are very different.

  • Reference pattern has an effect of -0.1257 (qB) and explains 77% of the variation.

  • Effect of network type = 0.0595

    Omega networks = Average + 0.0595

    Crossbar networks = Average - 0.0595

    Difference between the two = 0.119

  • Slight interaction (0.0346) between reference pattern and network type.
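
The "variation explained" percentages can be recomputed from the reported mean estimates, since for a 2^2 design each share is the squared effect divided by the sum of the squared effects (the common factor 2^2 cancels). This is a sketch using the rounded estimates from the study, so results may differ from the reported percentages in the last digit. The function name `percent_explained` is an assumption.

```python
# Rounded mean estimates reported for the three responses T, N, and R.
estimates = {
    "T": {"qA": 0.0595, "qB": -0.1257, "qAB": -0.0346},
    "N": {"qA": -0.5, "qB": 1.0, "qAB": 0.0},
    "R": {"qA": -0.145, "qB": 0.413, "qAB": 0.051},
}

def percent_explained(effects):
    """Share of total variation per effect, in percent (one decimal)."""
    total = sum(q * q for q in effects.values())
    return {name: round(100 * q * q / total, 1) for name, q in effects.items()}

shares = {resp: percent_explained(eff) for resp, eff in estimates.items()}
for resp, share in shares.items():
    print(resp, share)
```

For every response the reference pattern (qB) dominates, which matches the interpretation that the two address patterns are very different.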


General 2^k Factorial Designs

k factors at two levels each.

2^k experiments.

2^k - 1 effects:

k main effects

$\binom{k}{2}$ two-factor interactions

$\binom{k}{3}$ three-factor interactions, and so on.
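
A sign table for any k can be generated mechanically. The sketch below is one possible construction, not the slides' own: it names factors A, B, C, ..., treats every non-empty subset of factors as one effect, and fills each effect column with the product of the corresponding factor levels. The function name `sign_table` is an assumption.

```python
from itertools import combinations, product
from math import comb, prod

def sign_table(k):
    """Return (labels, rows) for a 2^k design with factors A, B, C, ..."""
    names = [chr(ord("A") + i) for i in range(k)]
    # Main effects first, then two-factor interactions, and so on.
    subsets = [c for r in range(1, k + 1) for c in combinations(range(k), r)]
    labels = ["".join(names[i] for i in s) for s in subsets]
    rows = []
    for levels in product((-1, 1), repeat=k):   # one row per experiment
        rows.append([prod(levels[i] for i in s) for s in subsets])
    return labels, rows

k = 3
labels, rows = sign_table(k)
effects_per_order = [comb(k, r) for r in range(1, k + 1)]

print(labels)
print(len(rows), effects_per_order)
```

For k = 3 this yields 8 experiments and 7 effect columns: 3 main effects, 3 two-factor interactions, and 1 three-factor interaction.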


2^k Design Example

Three factors in designing a machine:

Cache size

Memory size

Number of processors


2^k Design Example (cont’d)

                  4M Bytes            16M Bytes
    Cache Size    1 Proc    2 Proc    1 Proc    2 Proc
    1K Byte       14        46        22        58
    2K Byte       10        50        34        86


Analysis

Fraction of variation explained by A, B, C, AB, AC, BC, and ABC:

18% + 4% + 71% + 4% + 1% + 2% + 0% = 100%

Number of processors (C) is the most important factor.
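
The percentages in this analysis can be reproduced from the eight observations of the design example. The sketch below assumes A = memory size, B = cache size, and C = number of processors (the slides only name C explicitly; A and B are inferred from the order of the percentages), with the lower level of each factor coded as -1. Variable names are illustrative.

```python
from itertools import combinations
from math import prod

# Observations keyed by coded levels (A, B, C):
# A: -1 = 4M, +1 = 16M; B: -1 = 1K, +1 = 2K; C: -1 = 1 proc, +1 = 2 proc.
y = {
    (-1, -1, -1): 14, (1, -1, -1): 22, (-1, 1, -1): 10, (1, 1, -1): 34,
    (-1, -1, 1): 46, (1, -1, 1): 58, (-1, 1, 1): 50, (1, 1, 1): 86,
}
names = "ABC"

# Effect = (sign column times y) summed, divided by the number of experiments.
effects = {}
for r in range(1, 4):
    for subset in combinations(range(3), r):
        label = "".join(names[i] for i in subset)
        total = sum(prod(x[i] for i in subset) * obs for x, obs in y.items())
        effects[label] = total / len(y)

q0 = sum(y.values()) / len(y)
sst = len(y) * sum(q * q for q in effects.values())
percent = {label: round(100 * len(y) * q * q / sst) for label, q in effects.items()}

print(q0, effects)
print(sst, percent)
```

Under these assumptions the effect of C is 20, twice the next largest effect, which is why it accounts for about 71% of the variation.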


Exercise

Analyze the 2^3 design:

            A1             A2
            C1     C2      C1     C2
    B1      100    15      120    10
    B2      40     30      20     50

a. Quantify main effects and all interactions.

b. Quantify percentages of variation explained.

c. Sort the variables in the order of decreasing importance.

