Example 10.1 Experimenting with a New Pizza Style at the Pepperoni Pizza Restaurant
Concepts in Hypothesis Testing
Background Information • The manager of Pepperoni Pizza Restaurant has recently begun experimenting with a new method of baking its pepperoni pizzas. • He believes that the new method produces a better-tasting pizza, but he would like to base a decision on whether to switch from the old method to the new method on customer reactions. • Therefore he performs an experiment.
The Experiment • For 100 randomly selected customers who order a pepperoni pizza for home delivery, he includes both an old style and a free new style pizza in the order. • All he asks is that these customers rate the difference between pizzas on a -10 to +10 scale, where -10 means they strongly favor the old style, +10 means they strongly favor the new style, and 0 means they are indifferent between the two styles. • Once he gets the ratings from the customers, how should he proceed?
Hypothesis Testing • This example’s goal is to explain hypothesis testing concepts. We are not implying that the manager would, or should, use a hypothesis testing procedure to decide whether to switch methods. • First, hypothesis testing does not take costs into account. In this example, if the new method is more costly, hypothesis testing would ignore that extra cost. • Second, even if the costs of the two pizza-making methods are equivalent, the manager might base his decision on a simple point estimate and possibly a confidence interval.
Null and Alternative Hypotheses • Usually, the null hypothesis is labeled H0 and the alternative hypothesis is labeled Ha. • The null and alternative hypotheses divide all possibilities into two nonoverlapping sets, exactly one of which must be true. • Traditionally, hypothesis testing has been phrased as a decision-making problem, where an analyst decides either to accept the null hypothesis or to reject it, based on the sample evidence.
One-Tailed Versus Two-Tailed Tests • The form of the alternative hypothesis can be either one-tailed or two-tailed, depending on what the analyst is trying to prove. • A one-tailed test is one where the only sample results that can lead to rejection of the null hypothesis are those in a particular direction; in this example, those where the sample mean rating is positive. • A two-tailed test is one where results in either of two directions can lead to rejection of the null hypothesis.
One-Tailed Versus Two-Tailed Tests -- continued • Once the hypotheses are set up, it is easy to detect whether the test is one-tailed or two-tailed. • One-tailed alternatives are phrased in terms of “>” or “<”, whereas two-tailed alternatives are phrased in terms of “≠”. • The real question is whether to set up hypotheses for a particular problem as one-tailed or two-tailed. • There is no statistical answer to this question. It depends entirely on what we are trying to prove.
Types of Errors • Whether one decides to accept or reject the null hypothesis, it might be the wrong decision. • One might incorrectly reject the null hypothesis when it is true, or incorrectly accept the null hypothesis when it is false. • These errors are called type I and type II errors. • In general, we commit a type I error when we incorrectly reject a null hypothesis that is true. We commit a type II error when we incorrectly accept a null hypothesis that is false.
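The two error definitions above can be summarized as a small lookup. The sketch below is only an illustration; the function name `classify_outcome` and its arguments are hypothetical and not part of the original example.

```python
def classify_outcome(null_is_true: bool, reject_null: bool) -> str:
    """Label a test decision given the (normally unknown) truth about H0.

    Hypothetical helper for illustration only.
    """
    if null_is_true and reject_null:
        # Rejected a null hypothesis that is actually true.
        return "type I error"
    if not null_is_true and not reject_null:
        # Accepted a null hypothesis that is actually false.
        return "type II error"
    # The remaining two combinations are correct decisions.
    return "correct decision"


# Enumerate all four combinations of truth and decision.
outcomes = {
    (truth, decision): classify_outcome(truth, decision)
    for truth in (True, False)
    for decision in (True, False)
}
```

In practice the analyst never knows whether the null hypothesis is true, so this table describes what can go wrong rather than something that can be computed from a sample.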
Types of Errors -- continued • These ideas appear graphically below. • While these errors seem to be equally serious, type I errors have traditionally been regarded as the more serious of the two. • Therefore, the hypothesis-testing procedure favors caution in terms of rejecting the null hypothesis.
Significance Level and Rejection Region • The real question is how strong the evidence in favor of the alternative hypothesis must be to reject the null hypothesis. • The analyst determines the probability of a type I error that he is willing to tolerate. The value is denoted by alpha and is most commonly equal to 0.05, although alpha=0.01 and alpha=0.10 are also frequently used. • The value of alpha is called the significance level of the test.
Significance Level and Rejection Region -- continued • Then, given the value of alpha, we use statistical theory to determine the rejection region. • If the sample falls into this region we reject the null hypothesis; otherwise, we accept it. • Sample evidence that falls into the rejection region is called statistically significant at the alpha level.
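As a rough illustration of the rejection-region approach, the sketch below runs a one-tailed test on the pizza ratings. Several things here are assumptions rather than part of the original example: the hypotheses are taken to be H0: mean rating <= 0 versus Ha: mean rating > 0, the ratings are invented data, the function name `one_tailed_z_test` is hypothetical, and a normal (z) approximation is used in place of a t distribution, which is a common simplification when the sample has 100 observations.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def one_tailed_z_test(ratings, alpha=0.05, mu0=0.0):
    """Right-tailed test of H0: mu <= mu0 versus Ha: mu > mu0.

    Uses a normal approximation (assumed here for simplicity).
    Returns the test statistic, the critical value, and the decision.
    """
    n = len(ratings)
    sample_mean = mean(ratings)
    standard_error = stdev(ratings) / sqrt(n)
    z = (sample_mean - mu0) / standard_error
    # The rejection region is every z above this cutoff.
    z_crit = NormalDist().inv_cdf(1 - alpha)
    return z, z_crit, z > z_crit


# Invented ratings on the -10..+10 scale, repeated to give 100 customers.
ratings = [2, -1, 4, 0, 3, 1, -2, 5, 2, 1] * 10
z, z_crit, reject = one_tailed_z_test(ratings, alpha=0.05)
print(f"z = {z:.2f}, critical value = {z_crit:.3f}, reject H0: {reject}")
```

Lowering `alpha` moves the cutoff further to the right, so stronger sample evidence is needed before the null hypothesis is rejected.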
Significance from p-values • This approach is currently more popular than the significance level and rejection region approach. • The idea is to avoid the use of the alpha level and instead simply report “how significant” the sample evidence is. • We do this by means of the p-value. The p-value is the probability of seeing a random sample at least as extreme as the observed sample, given that the null hypothesis is true.
Significance from p-values -- continued • Here “extreme” is relative to the null hypothesis. • In general smaller p-values indicate more evidence in support of the alternative hypothesis. If a p-value is sufficiently small, almost any decision maker will conclude that rejecting the null hypothesis is the more “reasonable” decision.
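The definition of the p-value can be made concrete with a short sketch. It assumes the test statistic is a z value that is approximately standard normal when the null hypothesis is true; the function name `p_value_from_z` and its `tail` argument are hypothetical names introduced only for this illustration.

```python
from statistics import NormalDist


def p_value_from_z(z, tail="right"):
    """Probability of a z value at least as extreme as the one observed,
    assuming the null hypothesis is true and z is standard normal.

    tail: "right" for Ha of the form ">", "left" for "<", "two" for "not equal".
    """
    std_normal = NormalDist()
    if tail == "right":
        return 1 - std_normal.cdf(z)
    if tail == "left":
        return std_normal.cdf(z)
    if tail == "two":
        # Extreme in either direction counts against the null hypothesis.
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'right', 'left', or 'two'")


# The same observed statistic gives different p-values under the two setups.
p_one_tailed = p_value_from_z(1.8, tail="right")
p_two_tailed = p_value_from_z(1.8, tail="two")
```

For a positive z value, the two-tailed p-value is twice the right-tailed one, which is one reason the choice between one-tailed and two-tailed hypotheses matters.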
Significance from p-values -- continued • How small is a “small” p-value? This is largely a matter of semantics, but a common reading is: • a p-value less than 0.01 provides “convincing” evidence that the alternative hypothesis is true; • a p-value between 0.01 and 0.05 provides “strong” evidence in favor of the alternative hypothesis; • a p-value between 0.05 and 0.10 is in a “gray area”; • a p-value greater than 0.10 provides weak or no evidence in support of the alternative.
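These verbal guidelines can be written as a simple lookup. The sketch below is illustrative only: the function name `describe_evidence` is hypothetical, and because the guidelines do not say which category the exact boundary values 0.01, 0.05, and 0.10 belong to, assigning each boundary to the weaker category is an assumption made here.

```python
def describe_evidence(p_value: float) -> str:
    """Map a p-value to the verbal label used in the guidelines above.

    Boundary values are assigned to the weaker category (an assumption).
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value < 0.01:
        return "convincing evidence for the alternative"
    if p_value < 0.05:
        return "strong evidence for the alternative"
    if p_value < 0.10:
        return "gray area"
    return "weak or no evidence for the alternative"


# A few hypothetical p-values and their labels.
labels = {p: describe_evidence(p) for p in (0.003, 0.02, 0.07, 0.35)}
```

The labels are conventions rather than sharp rules, so a p-value of 0.049 and one of 0.051 carry nearly the same evidence even though they fall into different categories.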