Statistical Analysis – Chapter 4 Normal Distribution


PowerPoint Slideshow about 'Statistical Analysis – Chapter 4 Normal Distribution' - shanta



What is the normal curve?
• In chapter 2 we talked about histograms and modes
• A normal distribution is one in which the values of a single variable, displayed in a histogram (or line graph), form one peak (mode) and a symmetric bell shape. Height is a common example.
Characteristics of the Normal Curve
• Bell shaped, fading at the tails. In other words, more values are in the middle, and odd or unusual values fall at the tails
• All (100%) of the data falls under the curve, with 50% below the mean and 50% above
• About 68% of the data falls within 1 standard deviation of the mean (between -1 and +1)
• About 95% of the data falls within 2 standard deviations of the mean (between -2 and +2)
• The percentage of data between any two points is equal to the probability of randomly selecting a value between the two points (remember classical probability from Ch. 3)
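The percentages above can be checked in software. This is a minimal sketch, assuming Python 3.8 or later (whose standard library includes `statistics.NormalDist`); the helper name `share_within` is ours for illustration and does not come from the chapter.

```python
from statistics import NormalDist

# The standard normal curve: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

def share_within(k: float) -> float:
    """Share of the data lying within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2):
    print(f"within {k} standard deviation(s) of the mean: {share_within(k):.2%}")

# Half of the data lies below the mean.
print(f"below the mean: {std_normal.cdf(0.0):.0%}")
```

The computed shares are slightly more precise than the rounded 68% and 95% used in class.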
Standard Deviations and Z-Score
• Z-score = the number of standard deviations away from the mean.
• z-score = (x - µ) / σ

(x = data for which we want to know the z-score)

• We use the characteristics of the normal curve, and the z-score, to find out the probability of a particular event or value occurring (remember classical probability from Chapter 3)
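The z-score calculation can be sketched as a small function. The height figures in the example call (mean 64 inches, standard deviation 2.5 inches) are hypothetical numbers chosen for illustration, not values from the chapter.

```python
def z_score(x: float, mu: float, sigma: float) -> float:
    """Number of standard deviations that x lies away from the mean mu."""
    if sigma <= 0:
        raise ValueError("the standard deviation must be positive")
    return (x - mu) / sigma

# Hypothetical example: a height of 69 inches when the mean is 64 inches
# and the standard deviation is 2.5 inches.
print(z_score(69.0, mu=64.0, sigma=2.5))
```

A positive result means the value is above the mean; a negative result means it is below.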
Solving Normal Curve Problems Using Z-Scores

(steps listed at bottom of p. 111)

• Draw a normal curve, showing z values from -2 through +2
• Shade the area in question
• Calculate the z scores and cutoffs (percentages asked for)
• Use the z-scores and cutoffs to solve the normal curve problem
Find Percentages on the Normal Curve Table

Let’s do these questions as a class…

• What is the percentage of data from z = 0 to z = 0.1?
• What is the percentage of data from z = 0 to z = 2.16?
• What is the percentage of data from z = -1.11 to z = 1.11?
• What is the percentage of data above z = 1.24?
• What is the percentage of data below z = -0.6?

Answers:
• .0398…3.98%
• .4846…48.46%
• .3665 + .3665 = .733…73.3%
• .50 - .3925 = .1075…10.75%
• .50 - .2257 = .2743…27.43%
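For readers without the printed table, the five class questions can be approximated in software. This is a sketch, assuming Python 3.8+ and its `statistics.NormalDist`; the function names (`table_area`, `area_between`, `area_above`, `area_below`) are illustrative choices, and `table_area` assumes the table lists the area between the mean and z.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def table_area(z: float) -> float:
    """Area between the mean (z = 0) and z, as a normal curve table lists it."""
    return abs(std_normal.cdf(z) - 0.5)

def area_between(z_low: float, z_high: float) -> float:
    """Area between two z-scores."""
    return std_normal.cdf(z_high) - std_normal.cdf(z_low)

def area_above(z: float) -> float:
    """Area to the right of z."""
    return 1.0 - std_normal.cdf(z)

def area_below(z: float) -> float:
    """Area to the left of z."""
    return std_normal.cdf(z)

print(f"z = 0 to z = 0.1:       {table_area(0.1):.4f}")
print(f"z = 0 to z = 2.16:      {table_area(2.16):.4f}")
print(f"z = -1.11 to z = 1.11:  {area_between(-1.11, 1.11):.4f}")
print(f"above z = 1.24:         {area_above(1.24):.4f}")
print(f"below z = -0.6:         {area_below(-0.6):.4f}")
```

Multiply each area by 100 to express it as a percentage of the data.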
Working backwards from percentages…
• When working backwards from percentages, we still use the normal table…but look for the percentage to give us the z-score…
• What is the z-score associated with 10.2% of the data (measured from the mean)?
• What is the z-score(s) for the middle 30% of the normal curve?
• What is the z-score of data in the upper 25% of the normal curve?

Answers:
• z = 0.26
• z = -.39 to z = .39
• z = 0.67
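Working backwards can also be sketched in software using the inverse of the normal curve. This assumes Python 3.8+ (`statistics.NormalDist.inv_cdf`); the three helper names are illustrative, and results are rounded to two decimals to mimic reading the table.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def z_for_area_from_mean(area: float) -> float:
    """z-score such that `area` of the data lies between the mean and z."""
    return std_normal.inv_cdf(0.5 + area)

def z_for_middle(share: float) -> tuple:
    """Pair of z-scores that enclose the middle `share` of the data."""
    z = std_normal.inv_cdf(0.5 + share / 2.0)
    return (-z, z)

def z_for_upper(share: float) -> float:
    """z-score that cuts off the upper `share` of the data."""
    return std_normal.inv_cdf(1.0 - share)

print(round(z_for_area_from_mean(0.102), 2))            # 10.2% measured from the mean
print(tuple(round(z, 2) for z in z_for_middle(0.30)))   # middle 30%
print(round(z_for_upper(0.25), 2))                      # upper 25%
```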
Let’s do Question 4.2

Use the normal curve table to determine the percentage of data in the normal curve

• Between z = 0 and z = .82
• Above z = 1.15
• Between z = -1.09 and z = .47
• Between z = 1.53 and z = 2.78

Work backward in the normal curve table to solve the following:

• 32% of the data in the normal curve can be found between z = 0 and z = ?
• Find the z score associated with the lower 5% of the data.
• Find the z scores associated with the middle 98% of the data.

Answers:
• 29.39%
• 12.51%
• 54.29%
• 6.03%
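• Between z = 0 and z = .92, or between z = 0 and z = -.92
• z = -1.645 (halfway between the table entries for -1.64 and -1.65)
• Between z = -2.33 and z = +2.33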
Question 4.7

Use the normal curve table to determine the percentage of data in the normal curve

• Between z = 0 and z = .38
• Above z = -1.45
• Above z = 1.45
• Between z = .77 and z = 1.92
• Between z = -.25 and z = 2.27
• Between z = -1.63 and z = -2.89

Work backward in the normal curve table to solve the following.

• 15% of the data in the normal curve can be found between z = 0 and z = ?
• Find the z score associated with the upper 73.57% of the data.
• Find the z scores associated with the middle 95% of the data.

Answers:
• 14.80%
• 92.65%
• 7.35%
• 19.32%
• 58.71%
• 4.97%
• z = .39 or -.39
• z = -.63
• Between z = -1.96 and z = +1.96
Binomial Distributions and Sampling

Binomial means two categories in a population…

• Males and females
• Sports game players vs. Non sports game players
• Incomes over 40,000 vs. incomes under 40,000

Quick note: Remember…for binomial distributions, we would visualize this data through a pie chart…because we do not have enough categories for a histogram…

Sampling from a Two-Category Population
• With two-category populations, we can describe the population by p – the percentage of values in one category
• This is the same p from the last chapter on probability (classical probability)…

P(event) ≈ s / n, where s = number of chances for success and n = total equally likely possibilities

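• We know (actually…statisticians know) that if we sample randomly from a population, then the proportion in the sample (ps) will be approximately equal to the proportion in the population (p):

ps ≈ p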
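A small simulation can illustrate ps ≈ p. This is a sketch only: the population proportion p = 0.30 is a hypothetical value chosen for the demonstration, the function name `sample_proportion` is ours, and the random seed is fixed just so the run is repeatable.

```python
import random

def sample_proportion(p: float, n: int, rng: random.Random) -> float:
    """Randomly sample n members from a two-category population in which a
    proportion p belongs to the category of interest; return ps."""
    successes = sum(1 for _ in range(n) if rng.random() < p)
    return successes / n

rng = random.Random(4)  # fixed seed (arbitrary choice) for repeatability
p = 0.30                # hypothetical population proportion

for n in (10, 100, 10_000):
    print(f"n = {n:6d}   ps = {sample_proportion(p, n, rng):.3f}")
```

Small samples can land noticeably away from p; larger samples tend to land closer to it.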

Sampling Distribution
• In order to know the odds of getting certain values from this particular binomial sample, we have to know the sampling distribution from this population.
• Under certain conditions, the sampling distribution for a binomial value is normal (i.e. the distribution follows the normal curve).
• When the sampling distribution is normal, then we can make predictions using our table and our z-scores
Sampling from a Binomial Distribution
• Suppose we define a population (full-time FIT students) and turn our measure of interest into a binomial distribution: those who shop at Hot Topic and those who do not.
• Suppose that over the last 10 years, marketers have surveyed the FIT population hundreds of times and found that the proportion of Hot Topic shoppers is p = .13 (so the proportion of non-Hot Topic shoppers is 1 – p = .87).
Sampling from a Binomial Distribution
• But suppose sometime later, your manager asks you to lead another study. But this time, you don’t have enough money to survey the whole population, and you have to get a sample.
• Because so many studies have been done in the past, we can assume that the true proportion of Hot Topic shoppers is p = .13. And because we know that ps ≈ p, your sample should have approximately the same proportion.
Sampling from a Binomial Distribution
• For each sample, we can use the number sampled, and the p value from the population to predict the total number of Hot Topic shoppers. This is called the expected value.
• Expected value = np
• Thus, if we collected a sample of 200 FIT students, how many students would we expect to be Hot Topic shoppers?

np = (200)(.13) = 26

• This expected value is the mean (µ) of the sampling distribution
Binomial Distribution and the Normal Curve
• Now, we need to decide if we can use the normal curve to solve problems…
• If np > 5 and n(1 – p) > 5…then the sampling distribution will be approximately normally distributed.
• So, our sample was 200 students.

Is (np) > 5?

Is n(1 – p)>5?

• Yes…and yes.

np = (200)(.13) = 26

n(1 – p) = (200)(1 - .13) = (200)(.87) = 174
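The expected value and the two "greater than 5" checks can be sketched as follows. The function names and the `cutoff` parameter are illustrative; the cutoff of 5 is the rule of thumb stated on the slide.

```python
def expected_value(n: int, p: float) -> float:
    """Expected number of sample members in the category of interest (np)."""
    return n * p

def normal_curve_applies(n: int, p: float, cutoff: float = 5.0) -> bool:
    """Slide's rule of thumb: both np and n(1 - p) must exceed the cutoff."""
    return n * p > cutoff and n * (1 - p) > cutoff

n, p = 200, 0.13  # the Hot Topic sample from the slides
print("np       =", round(expected_value(n, p), 2))
print("n(1 - p) =", round(n * (1 - p), 2))
print("use the normal curve?", normal_curve_applies(n, p))
```

With a much smaller sample, say 20 students, np would be only 2.6 and the check would fail.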

Binomial Distribution and the Normal Curve
• What do we mean that a sampling distribution is normal?
• Just as someone’s age is one value among many ages that we tally to make a histogram, we can take many samples, get the p value of each sample, and construct a histogram from these values.
• If we took, say, 1,000 samples and tallied the p values for Hot Topic shoppers, those values, when turned into a histogram, should form a normal curve, just as if we took the heights of 1,000 women and tallied those values to get a normal curve.
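The idea of tallying many samples can be tried out with a simulation. This is a rough sketch: it draws 1,000 simulated samples of 200 students with p = .13, uses a fixed random seed for repeatability, and prints a crude text histogram. The function name `shopper_counts` is hypothetical.

```python
import random
from collections import Counter
from statistics import mean, stdev

def shopper_counts(p: float, n: int, samples: int, seed: int = 0) -> list:
    """Simulate `samples` random samples of size n and return the number of
    Hot Topic shoppers found in each one."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [sum(rng.random() < p for _ in range(n)) for _ in range(samples)]

counts = shopper_counts(p=0.13, n=200, samples=1000)
proportions = [c / 200 for c in counts]  # the p value of each sample

# Crude text histogram: one row per shopper count, one '#' per 5 samples.
tally = Counter(counts)
for count in sorted(tally):
    print(f"{count:3d} {'#' * (tally[count] // 5)}")

print("mean of the sample p values:", round(mean(proportions), 3))
print("standard deviation of the sample p values:", round(stdev(proportions), 3))
```

The rows should pile up around 26 shoppers and thin out on both sides, which is the bell shape described above.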
How to use the Binomial Distribution and the Normal Curve
• Get the mean (µ)…the mean is the expected value (np)
• Get the standard deviation (σ) = √(np(1 – p))
• Draw a normal curve using mean and standard dev
• Use the “continuity correction factor,” and add or subtract half a unit from the value we want to solve for
• Get the z-scores = (x - µ) / σ

• Use the normal curve table to solve the problem
Why the “continuity correction factor”?
• This is only for discrete values (where values occupy only distinct points.) For example, in our study, there is no such thing as a “half” or “3/4” Hot Topic shopper. Either you are a shopper or not. Looking at how histograms are presented, you can see why we have to use the correction factor.
• For the probability of getting a value equal to or greater than x (≥), subtract a half unit from x.
• For the probability of getting a value equal to or less than x (≤), add a half unit to x.
• For the probability of getting exactly x, get the z-scores for a half unit below and a half unit above x.
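The three correction rules can be written as one small helper. This is only a sketch; the function name `corrected_bounds` and the labels "at_least", "at_most" and "exactly" are our own shorthand, not terms from the chapter.

```python
from math import inf
from typing import Tuple

def corrected_bounds(kind: str, x: int) -> Tuple[float, float]:
    """Apply the continuity correction to a whole-number value x.

    Returns the (low, high) interval on the continuous normal curve that
    stands in for the discrete event.
    """
    if kind == "at_least":   # x or more: subtract a half unit
        return (x - 0.5, inf)
    if kind == "at_most":    # x or fewer: add a half unit
        return (-inf, x + 0.5)
    if kind == "exactly":    # exactly x: a half unit on either side
        return (x - 0.5, x + 0.5)
    raise ValueError(f"unknown kind: {kind!r}")

print(corrected_bounds("at_least", 30))
print(corrected_bounds("at_most", 25))
print(corrected_bounds("exactly", 13))
```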
Now let’s answer a Hot Topic Question…

If you collected a sample of 200 FIT students…

• (a) What is the probability that exactly 13 will be Hot Topic shoppers?
• (b) What is the probability that you will have 30 or more Hot Topic shoppers?
• (c) What is the probability that you will have 25 or fewer Hot Topic shoppers?

• Get the mean (µ) = expected value = np = (200)(.13) = 26
• Get the standard deviation (σ) = √(np(1 – p)) = √(26(1 - .13)) = √(26(.87)) = √22.62 ≈ 4.76
• Draw a normal curve using mean and standard dev.
• Use the continuity correction factor to correct x. (a) 12.5 and 13.5, (b) 29.5, (c) 25.5
• Get the z-scores. (a) -2.83 and -2.62, (b) .735, (c) -.105
• Solve the problem… (a) .4977 - .4956 = .0021, or about 0.2%, (b) .50 - .2704 ≈ .23, or 23%, (c) .50 - .0438 = .4562, or about 46% (using the table entry for z = .11)
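The steps above can be sketched in software and compared against the exact binomial probabilities. This assumes Python 3.8+ (`statistics.NormalDist`, `math.comb`); the variable and function names are illustrative. Software does not round z to two decimals before finding the area, so results will not match table-based answers to every decimal place.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 200, 0.13
mu = n * p                       # expected value (mean)
sigma = sqrt(n * p * (1 - p))    # standard deviation
curve = NormalDist(mu, sigma)

# Normal-curve answers, with the continuity correction applied to x.
approx = {
    "exactly 13": curve.cdf(13.5) - curve.cdf(12.5),
    "30 or more": 1.0 - curve.cdf(29.5),
    "25 or fewer": curve.cdf(25.5),
}

def binomial_pmf(k: int) -> float:
    """Exact probability of finding exactly k shoppers in a sample of n."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

exact = {
    "exactly 13": binomial_pmf(13),
    "30 or more": sum(binomial_pmf(k) for k in range(30, n + 1)),
    "25 or fewer": sum(binomial_pmf(k) for k in range(0, 26)),
}

for question in approx:
    print(f"{question}: normal curve {approx[question]:.4f}, "
          f"exact binomial {exact[question]:.4f}")
```

Comparing the two columns gives a feel for how close the normal curve approximation is when np and n(1 – p) are both above 5.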
Now let’s do question 4.16 as a class…

In a marketing population of phone calls, 3% produced a sale. If this population proportion (p = 3%) can be applied to future phone calls, then out of 500 randomly monitored phone calls,

• How many would you expect to produce a sale?
• What is the probability of getting 11 to 14 sales?
• What is the probability of getting 12 or fewer sales?

Answers:
• 15
• 32.93%
• 25.46%
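Question 4.16 can be checked with a reusable helper that bundles the mean, standard deviation and continuity correction. This is a sketch under the same assumptions as the chapter's method; the function name `normal_approx` and its `low`/`high` parameters are hypothetical. Because software does not round z to two decimals, the results can differ from the table-based answers in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist
from typing import Optional

def normal_approx(n: int, p: float,
                  low: Optional[int] = None,
                  high: Optional[int] = None) -> float:
    """Approximate P(low <= X <= high) for a binomial count X using the
    normal curve with the continuity correction. None means open-ended."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    curve = NormalDist(mu, sigma)
    lower_area = 0.0 if low is None else curve.cdf(low - 0.5)
    upper_area = 1.0 if high is None else curve.cdf(high + 0.5)
    return upper_area - lower_area

n, p = 500, 0.03  # 500 monitored calls, 3% produce a sale
print("expected sales:", round(n * p, 2))
print(f"11 to 14 sales:   {normal_approx(n, p, low=11, high=14):.4f}")
print(f"12 or fewer sales: {normal_approx(n, p, high=12):.4f}")
```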