Comparing alternative systems: Simultaneous output analyses for more than one system
Introduction • Previously, we considered only one system and decided the run length as well as the number of runs (replications) needed for statistical analyses. • Here, we discuss statistical analyses of several different simulation models that might represent competing system designs or alternative operating policies. • This is much more practical than considering just one system, because in reality an organization is usually evaluating several alternatives simultaneously. • As was shown in the previous section, taking just one replication leads to erroneous inferences about multiple systems as well.
Introduction • Considering multiple systems simultaneously also brings additional complexities. • Other issues, such as correlation between the outputs of these systems, must be considered. • Another issue is whether to calculate descriptive statistics for each system individually and then compare them, or to adopt a method that considers all the models together. • That is, whether to look for individual differences in each run or for a collective difference. • First we look at the comparison between two terminating simulation models, and then at the non-terminating case.
Comparing two systems • We consider the special case of comparing two systems on the basis of some performance measure (or expected system response). • We calculate a confidence interval for the difference in the two expectations. • A hypothesis-testing approach is not used because the confidence-interval approach gives us additional information: the interval tells us not only whether the means differ, but also by how much. • For i = 1, 2, let Xi1, Xi2, …, Xini be a sample of ni IID observations from system i. • Let µi = E[Xij] be the expected response of interest. • We want to build a confidence interval for the difference ζ = µ1 – µ2.
Comparing two systems: paired-t confidence interval • Let n1 = n2 = n. • This means that if n1 > n2, then to make the two samples of equal size, we lose some data from sample 1. • Let X1j be the performance measure from system 1 in the jth replication, and correspondingly let X2j be the measure from system 2. • Let Zj = X1j – X2j for j = 1, 2, …, n. • Then the Zj's are IID random variables with expected value E[Zj] = ζ. • And this is the quantity for which we want to build the confidence interval.
Comparing two systems: paired-t confidence interval • We get that:
$$\bar{Z}(n) = \frac{1}{n}\sum_{j=1}^{n} Z_j, \qquad \widehat{\mathrm{Var}}[\bar{Z}(n)] = \frac{\sum_{j=1}^{n}\left(Z_j - \bar{Z}(n)\right)^2}{n(n-1)}$$
• And using this, we form the 100(1 – α) percent confidence interval as:
$$\bar{Z}(n) \pm t_{n-1,\,1-\alpha/2}\,\sqrt{\widehat{\mathrm{Var}}[\bar{Z}(n)]}$$
Comparing two systems: paired-t confidence interval • If the Zj's are normally distributed, then this confidence interval is exact: it covers ζ with probability 1 – α. • Otherwise we rely on the central limit theorem, which implies that the coverage probability will be near 1 – α for large n. • We did not assume that X1j and X2j are independent. • In fact, positive correlation between X1j and X2j reduces the variance of the Zj's, and hence the width of the interval. • Nor did we assume equality of variances, Var(X1j) = Var(X2j). • Such a confidence interval is called a paired-t confidence interval.
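As a concrete illustration, here is a minimal sketch (not from the original slides) of the paired-t interval in Python, assuming x1 and x2 are equal-length arrays holding the n paired replication outputs from systems 1 and 2:

```python
import numpy as np
from scipy import stats

def paired_t_ci(x1, x2, alpha=0.10):
    """100(1 - alpha)% paired-t CI for zeta = mu1 - mu2."""
    z = np.asarray(x1) - np.asarray(x2)       # Z_j = X_1j - X_2j
    n = len(z)
    z_bar = z.mean()                          # point estimate of zeta
    var_z_bar = z.var(ddof=1) / n             # estimated Var[Z-bar(n)]
    half = stats.t.ppf(1 - alpha / 2, n - 1) * np.sqrt(var_z_bar)
    return z_bar - half, z_bar + half
```

If the resulting interval misses zero, we conclude (at the chosen confidence level) that the two systems differ.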
Modified two-sample-t confidence interval • One may argue that in the previous method we lost some information because we forced the two samples to be of the same size. • Some more information may also have been lost by pairing up the observations from the two systems right up front. • We now look at a modified method in which we need not pair up the observations. • However, this method does require that the Xij's be independent. • There are two versions of the two-sample-t confidence interval: classical and modified.
Modified two-sample-t confidence interval • The classical method requires the equal-variance condition Var(X1j) = Var(X2j). • If this condition is not satisfied, the coverage of the confidence interval may degrade seriously. • The degradation is not as severe when the sample sizes are equal. • However, the condition of equal variances is almost never satisfied in practice. • Hence we use the modified method, which does not require this condition.
Modified two-sample-t confidence interval • Let
$$\bar{X}_i(n_i) = \frac{1}{n_i}\sum_{j=1}^{n_i} X_{ij}, \qquad S_i^2(n_i) = \frac{\sum_{j=1}^{n_i}\left(X_{ij} - \bar{X}_i(n_i)\right)^2}{n_i - 1} \quad \text{for } i = 1, 2$$
• The estimated degrees of freedom is given by:
$$\hat{f} = \frac{\left[S_1^2(n_1)/n_1 + S_2^2(n_2)/n_2\right]^2}{\left[S_1^2(n_1)/n_1\right]^2/(n_1-1) + \left[S_2^2(n_2)/n_2\right]^2/(n_2-1)}$$
• And the 100(1 – α) percent confidence interval for ζ is given by:
$$\bar{X}_1(n_1) - \bar{X}_2(n_2) \pm t_{\hat{f},\,1-\alpha/2}\,\sqrt{\frac{S_1^2(n_1)}{n_1} + \frac{S_2^2(n_2)}{n_2}}$$
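A minimal sketch of this modified (Welch) interval, assuming independent samples x1 and x2 of possibly different sizes:

```python
import numpy as np
from scipy import stats

def welch_ci(x1, x2, alpha=0.10):
    """100(1 - alpha)% modified two-sample-t CI for mu1 - mu2."""
    x1, x2 = np.asarray(x1), np.asarray(x2)
    n1, n2 = len(x1), len(x2)
    v1, v2 = x1.var(ddof=1) / n1, x2.var(ddof=1) / n2
    # Estimated degrees of freedom f-hat (generally not an integer)
    f_hat = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    diff = x1.mean() - x2.mean()
    half = stats.t.ppf(1 - alpha / 2, f_hat) * np.sqrt(v1 + v2)
    return diff - half, diff + half
```

Note that scipy's t distribution accepts the non-integer degrees of freedom directly, so no rounding is needed.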
Comparison based on steady-state performance measures • Here, we cannot simply replicate the models, since initialization (warm-up) effects may bias the output. • Hence it is generally more difficult to compare two systems based on steady-state performance measures. • One way of proceeding is to define
$$X_{ij} = \frac{1}{m_i}\sum_{p=l_i+1}^{l_i+m_i} D_{ijp}$$
where D_ijp is the pth observation from the jth replication of system i, l_i is the warm-up period for system i, and m_i is the minimum number of D_ijp's in any replication.
Comparison based on steady-state performance measures • Then we can define the difference variables Zj = X1j – X2j for j = 1, 2, …, n. • Let νi be the steady-state mean of system i; the Zj variables then give an estimate of ν1 – ν2, and the paired-t interval above applies, as sketched below.
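A small sketch of forming the X_ij's by deleting a warm-up of l observations and averaging the next m, then reusing the hypothetical paired_t_ci above; the names are illustrative, not from the source:

```python
import numpy as np

def replication_mean(d, l, m):
    """Average observations l+1 .. l+m of one replication (warm-up deleted)."""
    d = np.asarray(d)
    return d[l:l + m].mean()

# reps1[j], reps2[j]: within-run observation sequences D_ijp for replication j
# x1 = [replication_mean(d, l1, m1) for d in reps1]
# x2 = [replication_mean(d, l2, m2) for d in reps2]
# lo, hi = paired_t_ci(x1, x2, alpha=0.10)   # estimates nu_1 - nu_2
```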
Comparison of more than two systems • While analyzing more than two systems, we have to make several confidence-interval statements simultaneously. • Hence the individual confidence levels must be raised so that the overall confidence level of all intervals covering their respective targets is at the desired level 1 – α. • We use the Bonferroni method. • Suppose that Is is a 100(1 – αs) percent confidence interval for the measure of performance µs (where s = 1, 2, …, k).
Comparison of more than two systems • Then the probability that all k confidence intervals simultaneously contain their respective true measures satisfies the Bonferroni inequality:
$$P(\mu_s \in I_s \text{ for all } s = 1, 2, \ldots, k) \ge 1 - \sum_{s=1}^{k} \alpha_s$$
• Suppose that one constructs 90% confidence intervals (that is, αs = 0.1 for each s) for 10 different systems. • Then the probability that all 10 confidence intervals contain their true measures can only be claimed to be greater than or equal to 1 – 10(0.1) = 0. • Thus one cannot have much overall confidence in drawing any conclusions from such a study.
Comparison of more than two systems • If we want to make some number c of confidence-interval statements, the trick is to make each separate interval at level 1 – α/c, so that the overall confidence level associated with all intervals covering their targets will be at least 1 – α. • For example, if we want to make c = 10 intervals and obtain an overall confidence level of 100(1 – α) percent = 90%, then we must construct each individual interval at the 99% level. • Clearly, for large c this means the individual intervals may become quite wide. A small sketch of the adjustment follows.
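A minimal sketch of the Bonferroni adjustment in Python:

```python
def bonferroni_alpha(alpha, c):
    """Per-interval alpha so that c intervals have overall level >= 1 - alpha."""
    return alpha / c

# Example from the slides: c = 10 intervals, overall level 90%
per_interval = bonferroni_alpha(0.10, 10)                       # 0.01
print(f"Build each interval at the {100 * (1 - per_interval):.0f}% level")  # 99%
```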
Comparison of more than two systems Comparing with the standard • Suppose that one of the model variants is a "standard," perhaps representing the existing system or policy. • Say the standard system is "system 1" and the other variants are systems 2, 3, …, k; the goal then is to construct k – 1 confidence intervals for the k – 1 differences µ2 – µ1, µ3 – µ1, …, µk – µ1 with overall confidence level 1 – α. • Thus we make c = k – 1 individual intervals, so each must be constructed at level 1 – α/(k – 1).
Comparison of more than two systems Comparing with the standard • Then we can say (with a confidence level of at least 1 – α) that for i = 2, 3, …, k, system i differs from the standard if the interval for µi – µ1 misses zero, and that system i is not significantly different from the standard if this interval contains zero. A sketch of this procedure follows.
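Here is a minimal sketch, reusing the hypothetical paired_t_ci defined earlier, for comparing k – 1 variants against the standard; samples is assumed to be a list of k equal-length arrays of replication outputs, with samples[0] the standard:

```python
def compare_with_standard(samples, alpha=0.10):
    """Bonferroni-adjusted paired-t intervals for mu_i - mu_1, i = 2..k."""
    k = len(samples)
    adj_alpha = alpha / (k - 1)          # each interval at level 1 - alpha/(k-1)
    results = {}
    for i in range(1, k):
        lo, hi = paired_t_ci(samples[i], samples[0], adj_alpha)
        differs = lo > 0 or hi < 0       # interval misses zero
        results[i + 1] = (lo, hi, differs)   # keyed by system number 2..k
    return results
```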
Comparison of more than two systems All pairwise comparisons • Sometimes we may want to compare each system with every other system to detect and quantify any significant pairwise differences. • There may not be an existing system: all k alternatives may represent possible implementations that should be treated in the same way. • One approach is to form confidence intervals for the differences µi2 – µi1 for all i1 and i2 between 1 and k with i1 < i2. • Hence there will be k(k – 1)/2 individual intervals. • Each must therefore be made at level 1 – α/[k(k – 1)/2] in order to have a confidence level of at least 1 – α for all the intervals together, as sketched below.
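The same pattern extends to all pairs; a brief sketch, again reusing the hypothetical paired_t_ci:

```python
from itertools import combinations

def all_pairwise(samples, alpha=0.10):
    """Bonferroni-adjusted paired-t intervals for every pair of systems."""
    k = len(samples)
    adj_alpha = alpha / (k * (k - 1) / 2)    # k(k-1)/2 intervals in total
    return {
        (i1 + 1, i2 + 1): paired_t_ci(samples[i2], samples[i1], adj_alpha)
        for i1, i2 in combinations(range(k), 2)   # CI for mu_{i2} - mu_{i1}
    }
```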
Comparison of more than two systems Selection of the best out of the k systems • Let Xij be the performance measure of interest from the jth replication of the ith system. • For all selection problems, we assume that the Xij's are independent of one another: replications for a given alternative are independent, and the runs for the different alternatives are also made independently. • For example, Xij could be the average total cost per month for the jth replication of policy i in the inventory model. • Let µ_il be the lth smallest of the population performance measures µi, so that µ_i1 ≤ µ_i2 ≤ … ≤ µ_ik. • Our goal is to select the system with the smallest expected response, µ_i1.
Selection of the best out of the k systems • Let "CS" denote the event of "correct selection." • The inherent randomness of the observed Xij's implies that we can never be absolutely sure of making the CS, but we would like to be able to pre-specify the probability of CS. • Also, if µ_i1 and µ_i2 are very close to each other, we might not care if we erroneously choose system i2 (the one with mean µ_i2). • So we want a method that avoids making a large number of replications to resolve this unimportant difference. • Problem statement: we want Pr{CS} ≥ P* provided µ_i2 – µ_i1 ≥ d*, where the minimum probability P* and the "indifference" tolerance d* are specified by the analyst.
Selection of the best out of the k systems • One might naturally ask: what if µ_i2 – µ_i1 < d*? • The method specified here guarantees that, with probability at least P*, the expected response of the selected system will be no larger than µ_i1 + d*. • Thus we are protected (with probability at least P*) against selecting a system whose mean is more than d* worse than that of the best system. • The proposed method involves two-stage sampling from each of the k systems.
Selection of the best out of the k systems • In the first stage we make a fixed number of replications of each system. • We then use the resulting variance estimates to determine how many more replications of each system are necessary in the second stage. • We assume that the Xij's are normally distributed; however, we need not assume equality of variances, nor that the population variances σi² = Var(Xij) are known. • In fact, the method is robust to violations of the normality assumption when the Xij's are themselves averages.
Selection of the best out of the k systems • In the first stage of sampling, we make n0 > 2 replications of each of the k systems and define the first-stage sample mean and variance:
$$\bar{X}_i^{(1)}(n_0) = \frac{1}{n_0}\sum_{j=1}^{n_0} X_{ij}, \qquad S_i^2(n_0) = \frac{\sum_{j=1}^{n_0}\left(X_{ij} - \bar{X}_i^{(1)}(n_0)\right)^2}{n_0 - 1}$$
• Then we compute the total sample size Ni needed for system i:
$$N_i = \max\left\{ n_0 + 1,\; \left\lceil \frac{h_1^2\, S_i^2(n_0)}{(d^*)^2} \right\rceil \right\}$$
Selection of the best out of the k systems • Here h1 (which depends on k, P*, and n0) is a constant that can be obtained from a standard table. • Next, we make Ni – n0 additional replications of system i and obtain the second-stage sample mean:
$$\bar{X}_i^{(2)}(N_i - n_0) = \frac{1}{N_i - n_0}\sum_{j=n_0+1}^{N_i} X_{ij}$$
• Then we define two weights, used to form a weighted average of the first-stage and second-stage sample means.
Selection of the best out of the k systems • First weight:
$$W_{i1} = \frac{n_0}{N_i}\left[1 + \sqrt{1 - \frac{N_i}{n_0}\left(1 - \frac{(N_i - n_0)(d^*)^2}{h_1^2\, S_i^2(n_0)}\right)}\right]$$
with the second weight W_i2 = 1 – W_i1. • Finally, define the weighted sample means:
$$\tilde{X}_i(N_i) = W_{i1}\,\bar{X}_i^{(1)}(n_0) + W_{i2}\,\bar{X}_i^{(2)}(N_i - n_0)$$
Selection of the best out of the k systems • Finally, we select the system with the smallest weighted sample mean. • The choice of P* and d* depends on the goals and the particular systems under study. • These should be chosen by trading off the computing cost of a large number of replications (driven by a large P* and a small d*) against the precision of the selection. A sketch of the whole two-stage procedure follows.
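Putting the pieces together, here is a minimal sketch of the two-stage procedure under stated assumptions: simulate(i) is a hypothetical function returning one replication's response for system i, and h1 must be looked up in a published table for the chosen k, P*, and n0 (it is passed in as a parameter here, not computed):

```python
import math
import numpy as np

def select_best(k, simulate, n0, d_star, h1):
    """Two-stage procedure: returns the index of the selected system."""
    weighted_means = []
    for i in range(k):
        # Stage 1: n0 replications; first-stage mean and variance
        stage1 = np.array([simulate(i) for _ in range(n0)])
        mean1, s2 = stage1.mean(), stage1.var(ddof=1)   # assumes s2 > 0
        # Total sample size N_i = max(n0 + 1, ceil(h1^2 * S_i^2 / d*^2))
        n_i = max(n0 + 1, math.ceil(h1 ** 2 * s2 / d_star ** 2))
        # Stage 2: N_i - n0 additional replications; second-stage mean
        mean2 = np.mean([simulate(i) for _ in range(n_i - n0)])
        # Weights combining the two stage means
        w1 = (n0 / n_i) * (1 + math.sqrt(
            1 - (n_i / n0) * (1 - (n_i - n0) * d_star ** 2 / (h1 ** 2 * s2))))
        weighted_means.append(w1 * mean1 + (1 - w1) * mean2)
    # Select the system with the smallest weighted sample mean
    return int(np.argmin(weighted_means))
```

Because N_i is at least h1²S_i²(n0)/(d*)², the expression under the square root is guaranteed to be non-negative.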