Loading in 5 sec....

Continuous Random VariablesPowerPoint Presentation

Continuous Random Variables

- By
**adli** - Follow User

- 92 Views
- Uploaded on

Download Presentation
## PowerPoint Slideshow about 'Continuous Random Variables' - adli

**An Image/Link below is provided (as is) to download presentation**

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript

Consider the following table of sales, divided into intervals of 1000 units each,

We’re going to divide the relative frequencies by the width of the cells (which here is 1000). This will make the graph have an area of 1.

Graph

f(x) = p(x)

0.00030

0.00025

0.00020

0.00015

0.00010

0.00005

0

0 1000 2000 3000 4000 5000 6000 7000 sales

The area of each bar is the frequency of the category, so the total area is 1.

0.00030

0.00025

0.00020

0.00015

0.00010

0.00005

0

0 1000 2000 3000 4000 5000 6000 7000 sales

GraphHere is the frequency polygon.

sales

If we make the intervals 500 units instead of 1000, the graph would probably look something like this:The height of the bars increases and decreases more gradually.

If we made the intervals infinitesimally small, the bars and the frequency polygon would become smooth, looking something like this:

This what the distribution of a continuous random variable looks like.

This curve is denoted f(x) or p(x) and is called the probability density function.

f(x) = p(x)

sales

pmf versus pdf the frequency polygon would become smooth, looking something like this:

- For a discrete random variable, we had a probability mass function (pmf).
- The pmf looked like a bunch of spikes, and probabilities were represented by the heights of the spikes.
- For a continuous random variable, we have a probability density function (pdf).
- The pdf looks like a curve, and probabilities are represented by areas under the curve.

f(x) = p(x) the frequency polygon would become smooth, looking something like this:

a b

sales

Pr(a < X < b)A continuous random variable has an infinite number of possible values & the probability of any one particular value is zero.

If X is a continuous random variable, possible values & the probability of any one particular value is zero.which of the following probabilities is largest?(Hint: This is a trick question.)

- 1. Pr(a < X < b)
- 2. Pr(a ≤ X < b)
- 3. Pr(a < X ≤ b)
- 4. Pr(a ≤ X ≤ b)

They’re all equal.

They differ only in whether they include the individual values a and b, and any one particular value has zero probability!

Properties of possible values & the probability of any one particular value is zero.probability density functions (pdfs)

- 1. f(x) ≥ 0 for values of x
- This means that when we draw the pdf curve, while it may be on the left side of the vertical axis (have negative values of x), it can not go below the horizontal axis, where f would be negative.
- Pr( - ∞ < X < ∞) = 1
- The total area under the pdf curve, which corresponds to the total probability, is 1.

Example possible values & the probability of any one particular value is zero.

- f(x) = 2 if 1 ≤ x ≤ 1.5 and
- f(x) = 0 otherwise

Example possible values & the probability of any one particular value is zero.

- f(x) = 2 if 1 ≤ x ≤ 1.5 and
- f(x) = 0 otherwise

This function satisfies both the properties of pdfs.

First, it’s never negative.

Second, the total area under the curve is (1/2) (2) = 1.

f(x)

2.0

0 1.0 1.5 x

Cumulative Distribution Function for a Continuous Random Variable

- F(x) = Pr(X ≤ x) = area under the f(x) curve up to where X=x.

f(x) Variable

2.0

0 1.0 1.5 x

Rectangle Example: What is F(1.2)?- F(1.2) = Pr(X ≤ 1.2)
- = the area under the pdf up to where x is 1.2.

f(x) Variable

2.0

0 1.0 1.2 1.5 x

Rectangle Example: What is F(1.2)?- F(1.2) = Pr(X ≤ 1.2)
- = the area under the pdf up to where x is 1.2.

f(x) Variable

2.0

0 1.0 1.2 1.5 x

Rectangle Example: What is F(1.2)?- F(1.2) = Pr(X ≤ 1.2)
- = the area under the pdf up to where x is 1.2.

= (0.2) (2.0)

= 0.4

The most famous distribution is the VariableNormal or Gaussian distribution.

- Its probability density function (pdf) is

m is the mean of the distribution, s is the standard deviation,

It is sometimes denoted N (m, s2), which means the normal distribution with a mean of m and a variance of s2.

m Variable1m2m3

If you have three normal distributions with the same standard deviation (same spread), but different means (different averages), they would look like this:

If you had the same mean but different standard deviations, it would look like this:

large standard deviation

If you had the same mean but different standard deviations, it would look like this:

medium standard deviation

large standard deviation

If you had the same mean but different standard deviations, it would look like this:

small standard deviation

middle standard deviation

largest standard deviation

Keep in mind that the areas are all the same, since they all equal 1.

Recall that if a random variable has mean it would look like this:m and standard deviation s, then (X-m)/s has mean 0 and standard deviation 1.

- If X is normally distributed, then (X-m)/s will be standard normal, N(0,1), normal with mean 0 and variance 1.

This theorem is extremely useful.

It means that we don’t need to use the messy normal formula.

We can standardize any normal distribution and look up probabilities in tables for the standard normal distribution.

Using the standard normal table is not difficult, but it takes practice to get accustomed to it.

- The table in your text book gives probabilities that the standard normal (often called the Z) is between zero and a positive number, that is, Pr(0 ≤ Z ≤ a).
- Some tables are set up differently, so you need to notice how a table is computed when you use it.

Z table: You get the integer part & the 1 takes practice to get accustomed to it.st decimal from the left column & the second decimal from the top row.

?

0.4957

Example: Pr(0 ≤ Z ≤ 2.63)

= 0.4957

0 2.63 Z

Do not memorize a lot of rules. takes practice to get accustomed to it. You just need to remember 2 easy facts.

- The graph is symmetric about 0.
- The total area under the curve is 1.

0 Z

Example takes practice to get accustomed to it.

- Pr(Z < 1.85) = 0.5+ 0.4678 = 0.9678

0.5

0.4678

0 1.85 Z

Example takes practice to get accustomed to it.

- Pr(Z > 1.85) = 0.5 - 0.4678 = 0.0322

0.4678

0.0322

0 1.85 Z

Example takes practice to get accustomed to it.

- Pr(Z < -1.85) = 0.5 - 0.4678 = 0.0322

0.4678

0.0322

-1.85 0 1.85 Z

Example takes practice to get accustomed to it.

- Pr(Z > -1.85) = 0.5 + 0.4678 = 0.9678

0.4678

-1.85 0 1.85 Z

Example takes practice to get accustomed to it.

- Pr(-1< Z < 2) = 0.3413 + 0.4772 = 0.8185

0.3413 0.4772

-1.00 0 2.00 Z

Example takes practice to get accustomed to it.

- Pr(1< Z < 2) = 0.4772 - 0.3413 = 0.1359

0.3413

0.4772

-1.00 0 2.00 Z

Example takes practice to get accustomed to it.

- What is the value of k such that Pr(0 < Z < k) = 0.4750 ?

0.4750

0 k Z

Example takes practice to get accustomed to it.

- What is the value of k such that Pr(0 < Z < k) = 0.4750 ?

0.4750

0 1.96 Z

Example takes practice to get accustomed to it.

- What is the value of a such that Pr(Z < a) = 0.9207 ?

0.9207

0 a Z

Example takes practice to get accustomed to it.

- What is the value of a such that Pr(Z < a) = 0.9207 ?

0.4207 = 0.9207 – 0.5000

0.9207

0 a Z

Example takes practice to get accustomed to it.

- What is the value of a such that Pr(Z < a) = 0.9207 ?

0.4207 = 0.9207 – 0.5000

0.9207

0 1.41 Z

Example takes practice to get accustomed to it.

- What is the value of b such that Pr(Z > b) = 0.0250 ?

0.0250

0 b Z

Example takes practice to get accustomed to it.

- What is the value of b such that Pr(Z > b) = 0.0250 ?

0.4750 = 0.5000 – 0.0250

0.0250

0 b Z

Example takes practice to get accustomed to it.

- What is the value of b such that Pr(Z > b) = 0.0250 ?

0.4750 = 0.5000 – 0.0250

0.0250

0 1.96 Z

0 1.00 Z takes practice to get accustomed to it.

Example: If X is N(2, 9), determine Pr(X ≤ 5).0.3413

Useful Fact takes practice to get accustomed to it.

- The distribution of the individual observation is the same as the distribution of the population from which it was drawn.
- For example, if the mean height of a population of men is 70 inches, then the expected value or mean of a randomly selected man will be 70 inches.
- Also, if 5% of the population of men is over 78 inches, then the probability that a randomly selected man will be over 78 inches tall is 5%.

-1.33 0 Z takes practice to get accustomed to it.

Example: Suppose that women’s heights are normally distributed with mean 64 inches & standard deviation 3 inches. What is the probability that a randomly selected woman is under five feet tall?

0.4082

Sample Distribution of takes practice to get accustomed to it.

the probability distribution of all possible values of that could occur when a sample of size n is taken from some specified population

Example: Suppose we have a population of chips, takes practice to get accustomed to it.1/3 of which have a 1 on them, 1/3 have a 2, & 1/3 have a 3.

- Show in table form the distribution of the sample mean (with n=2, sampled with replacement).
- Graph the distribution of the sample mean.
- Graph the distribution of the original population of chips.
- What are the mean & variance of the original population?
- What are the mean & variance of the sample mean?

Show in table form the distribution of the sample mean (with n=2, sampled with replacement).

- samplesample mean
- 1,1 1.0
- 1,2 1.5
- 2,1 1.5
- 1,3 2.0
- 3,1 2.0
- 2,2 2.0
- 2,3 2.5
- 3,2 2.5
- 3,3 3.0

sample meanprobability

1.0 1/9

1.5 2/9

2.0 3/9

2.5 2/9

3.0 1/9

Probability n=2, sampled with replacement).

3/9

2/9

1/9

0 0.5 1.0 1.5 2.0 2.5 3.0

sample mean

Graph the distribution of the sample mean.sample meanprobability

1.0 1/9

1.5 2/9

2.0 3/9

2.5 2/9

3.0 1/9

Probability n=2, sampled with replacement).

1/3

0 1 2 3 x

Graph the distribution of the original population of chips.Since there are three equally likely values (1, 2, and 3), the distribution looks like this.

What are the mean & variance n=2, sampled with replacement). of the original population?

- xp(x)xp(x)
- 1 1/3 1/3
- 2 1/3 2/3
- 3 1/3 3/3
- m=6/3=2

What are the mean & variance n=2, sampled with replacement). of the original population?

- x2x2p(x)
- 1 1/3
- 4 4/3
- 9 9/3
- E(X2) =14/3

xp(x)xp(x)

1 1/3 1/3

2 1/3 2/3

3 1/3 3/3

m=6/3=2

V(X) = E(X2)- [E(X)]2

= 14/3 – 22

= 2/3

What are the mean & variance of the sample mean ? n=2, sampled with replacement).

- 1.0 1/9 1/9
- 1.5 2/9 3/9
- 2.0 3/9 6/9
- 2.5 2/9 5/9
- 3.0 1/9 3/9

What are the mean & variance of the sample mean ? n=2, sampled with replacement).

- 1.0 1/9 1/9
- 1.5 2/9 3/9
- 2.0 3/9 6/9
- 2.5 2/9 5/9
- 3.0 1/9 3/9

1.00 0.111

2.25 0.500

4.00 1.333

6.25 1.389

9.00 1.000

In our chip example, we found that n=2, sampled with replacement).

The expected value (or mean) of the original population and the sample mean are the same.

However, the variance of the sample mean is smaller than the variance of the original population.

In general, n=2, sampled with replacement).

The mean of the original population and the mean of the sample are the same.

However, as long as the sample size is more than one, the variance of the sample mean is smaller than the variance of the original population.

Intuition: Suppose that each person in the class randomly sampled 50 men from a population of men whose average height is 70 inches. Each student then calculated his/her sample mean.

- If you averaged together all the sample means, you’d expect to get something very close to 70 inches.
- The expected value or mean of the original population and the expected value of the sample mean are the same.

Why does the sample mean have a smaller variance than the original population?

- Suppose 5% of the population is taller than 78 inches (6’6”).
- Then there’s a 0.05 probability of selecting at random an individual with a height over 6’6” inches.
- It is much less likely that you will select at random 50 men whose average is over 6’6”.
- You get a mixture of tall guys & short guys, so your average tends to be pretty close to the average for the population.
- So the distribution of the sample mean clusters more tightly about the average than does the distribution of the original population.
- That is, the variance of the sample mean is smaller than the variance of the original population.

Central Limit Theorem original population?

- As the sample size n increases, the distribution of the sample mean of a random sample from a population (not necessarily normal) with mean m and variance s2 approaches normal with mean m and variance s2/n.

Example: Suppose the grades of a large class have a mean of 72 and a standard deviation of 9.

- a. What is the probability that the average grade of a random sample of 25 students will be above 77?
- If the population is normal, what is the probability that an individual student drawn at random will have a grade over 77?
- Notice that in part b, we need to assume normality but we didn’t in part a.
- This is because the central limit theorem assures us of an approximately normal distribution for the sample mean of a reasonably large sample, but not for the distribution of a single observation.
- To be assured of that, we need to know that the distribution of the original population was normal.

0 2.78 Z 72 and a standard deviation of 9.

What is the probability that the average grade of a random sample of 25 students will be above 77?= 0.5 - 0.4973

= 0.0027

0.4973

0 0.56 Z 72 and a standard deviation of 9.

If the population is normal, what is the probability that an individual student drawn at random will have a grade over 77?= 0.5 - 0.2123

= 0.2877

0.2123

Notice 72 and a standard deviation of 9.

- The probability of selecting an individual at random who has a grade five points above the class mean is much greater than the probability of randomly selecting a sample of 25 students who have an average that is five points above the class mean. (0.2877 versus 0.0027)

Problem 72 and a standard deviation of 9.

- Suppose we are sampling (without replacement) and we sample the entire population.
- Then our sample mean will always be the same as the population mean. No spread. Zero variance.
- If we don’t sample the entire population but do sample a large part of it, our variance will not be zero but it will be very small.
- Thus, as the sample size n approaches the population size N, the variance of the sample mean approaches zero.
- Our formula for the variance of the sample mean was s2/n.
- s2/n approaches zero as n approaches infinity, not as n approaches N.
- So we need to make some adjustment when we sample a large part of our population.

Finite Population Correction Factor 72 and a standard deviation of 9.

When we sample a substantial part of our population (more than 5%), we need to multiply the standard deviation of our sample mean by this factor:

So the standard deviation of the sample mean which was:

becomes:

The standard deviation of the sample mean adjusted for a large sample

Notice that if you sample just one observation (n=1), the square root part becomes 1, and the formula becomes the old formula:

If you sample the entire population (n=N), the formula has a value of 0.

So the adjustment works the way it should.

-2.02 0 Z large sample

Example: What is the probability that the mean of a sample of 36 observations, from a population of 300, will be less than 14, if the population mean & population standard deviation are 15.30 & 4.10 respectively?

- If we sample more than 5% of our population, we need to use the finite population correction factor, and here we are sampling 12%.

= 0.5 – 0.4783

= 0.0217

0.4783

Up until now, as long as our sample was not too large, we have used this standardization formula:

However, what if we don’t know what the population standard deviation s is?

The next best thing is the sample standard deviation:

But when we use s instead of s, the result is not a normally distributed variable.

It has what is called a t or student’s t distribution.

Our t distribution looks very similar to the Z distribution.

Both are symmetric, bell-shaped, & have a mean of zero.

But while the Z has a variance of 1, the t has a variance of (n-1)/(n-3), which is greater than one. So the t is wider than the Z.

So, the numbers are different & we need to use a different table.

The t distribution is tabulated based on the number of “degrees of freedom.”

n

The number of degrees of freedom, is denoted by dof, df,or the Greek letter nu:

In this context, the number of degrees of freedom is n-1, where n is the number of observations.

For each number of degrees of freedom, there is a different t distribution.

The degrees of freedom are often indicated as a subscript on the t.

For example, t15 is a t distribution with 15 degrees of freedom.

When the sample size is large, the Z and t distributions are virtually indistinguishable.

When we have at least 100 observations, we will be able to use values from the Z table for our t.

Some people use the Z to approximate the t when there are 30 or more observations. When we’re working with s, I prefer to use the t until 100 observations.

When you take Intermediate Statistics (EC252), make sure you know what your instructor uses.

The t distribution is set up differently from the Z table. virtually indistinguishable.

- The table in your text book gives probabilities that the t is greater than a specified positive number, that is, Pr(t ≥ a).
- Some tables are set up differently, so you need to notice how a table is computed when you use it.

0 3.365 t virtually indistinguishable.5

Example: For 6 observations or 5 df, Pr(t5 > 3.365) = 0.01

-2.064 0 t virtually indistinguishable.24

Example: What is the probability that the sample mean is less than 800, if the population mean is 843.344 and the sample standard deviation is 105, based on a sample of 25.

Using Excel to solve t distribution problems virtually indistinguishable.

- On an Excel spreadsheet, you can get the t distribution as follows:
- click insert, and then click function
- select statistical as the category of function, scroll down to the tdist or tinv function, and click on it
- fill in the information in the dialog box .

Using Excel to solve t distribution problems virtually indistinguishable.

- tdist asks for the number x on the horizontal axis, the number of degrees of freedom, & whether you want the area of 1 or 2 tails. It then gives you that area or probability.
- tinv asks for the number of degrees of freedom and the probability or area that you want in 2 tails. It then gives you the number on the horizontal axis.

-2.064 0 2.064 t virtually indistinguishable.24

Example: Suppose that you wanted to use Excel to find the probability that a t with 24 degrees of freedom is less than -2.064.- Note that the area to the left of -2.064 is the same as the area to the right of +2.064.
- Following the procedure described above, select tdist. Specify 2.064 as x, 24 for degrees of freedom, and 1 as the number of tails.
- Excel will provide you the probability value of 0.025.

-k 0 k t virtually indistinguishable.24

Example: Suppose instead you wanted to use Excel to find out for what value of k Pr(t24 > k) is equal to 0.025.- Notice that if the area to the right of k is 0.025, the total area cut off by k and –k is 0.05.
- Following the procedure described above, select tinv. Specify 0.05 for probability for 2 tails, and 24 for degrees of freedom.
- Excel will provide you the value on the horizontal axis of 2.064.

Download Presentation

Connecting to Server..