Probability Models for Distributions of Discrete Variables

PowerPoint Slideshow about 'Probability Models for Distributions of Discrete Variables' - aysel


Presentation Transcript
slide2

Randomly select a college student. Determine x, the number of credit cards the student has.

x = # of cards

p(x) = probability of x occurring

slide3

A population is a collection of all units of interest.

Example: All college students

A sample is a collection of units drawn from the population.

Example: Any subcollection of college students.

Probabilities go with populations.

Scientific studies randomly sample from the entire population.

Each unit in the sample is chosen randomly.

The entire sample is random as well.

Populations / Samples

slide4

For discrete data, a population and a sample are summarized the same way (for instance, as a table of values and accompanying relative frequencies).

A probability distribution (or model) for a discrete variable is a description of values, with each value accompanied by a probability.

Probability Models and Populations

slide5

Definitions of Probability

2. The probability of an event is the long-term (technically, forever) relative frequency of occurrence of the event when the experiment is performed repeatedly under identical starting conditions.

3. The probability of an event is the relative frequency of units in the population for which the event applies.

To aggregate these meanings:

The probability associated with an event is its relative frequency of occurrence over all possible ways the phenomenon can take place.
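The long-run relative frequency idea can be illustrated with a small simulation. This is only a sketch: the event probability of 0.30, the helper name `relative_frequency`, and the fixed seed are assumptions chosen for illustration, not part of the slides.

```python
import random

def relative_frequency(p_event, trials, seed=0):
    """Repeat an experiment `trials` times and return how often the event occurred."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(rng.random() < p_event for _ in range(trials))
    return hits / trials

# Hypothetical event with probability 0.30 (assumed for illustration).
for trials in (100, 10_000, 1_000_000):
    print(trials, relative_frequency(0.30, trials))
```

As the number of trials grows, the printed relative frequency tends to settle near 0.30, which is the sense in which probability is a "forever" relative frequency.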

Probability Models and Populations

“All models are wrong. Some are useful.”

George Box

-industrial statistician

Probability Models

slide8
A probability distribution for a discrete variable is tabulated with a set of values, x, and probabilities, p(x).
  • Probabilities
    • Must be nonnegative.
    • Must sum to 1.
    • Within rounding error.
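The two rules above can be checked mechanically. The following is a minimal sketch; the function name `is_valid_distribution` and the rounding tolerance of 0.001 are assumptions made for illustration.

```python
def is_valid_distribution(probs, tol=1e-3):
    """Check the two rules for a discrete probability table."""
    nonnegative = all(p >= 0 for p in probs)
    sums_to_one = abs(sum(probs) - 1.0) <= tol  # "within rounding error" (tol is assumed)
    return nonnegative and sums_to_one

# Example tables, for illustration only.
print(is_valid_distribution([0.20, 0.30, 0.20, 0.15, 0.10, 0.05]))  # True
print(is_valid_distribution([0.50, 0.60, -0.10]))                   # False: negative entry
print(is_valid_distribution([0.40, 0.40]))                          # False: sums to 0.8
```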
slide9
The mean μ of a probability distribution is the mean value observed for all possible outcomes of the phenomenon.
slide11

Idealized data set n = 100

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5

Mean = 1.80 SD = 1.44

slide13

Idealized data set n = 1000

0 0 0 0 0 0 0 … 0 (200)

1 1 1 1 1 1 1 1 1 1 … 1 (300)

2 2 2 2 2 2 … 2 (200)

3 3 3 3 … 3 (150)

4 4 … 4 (100)

5 … 5 (50)

Mean = 1.80 SD = 1.44

slide14

Values for the mean and standard deviation don’t depend on the number of data values; they depend instead on the relative location of the data values – they depend on the distribution in relative frequency terms.
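A quick way to see this is to build both idealized data sets and summarize them. The sketch below assumes the population form of the standard deviation (dividing by n), which is what makes the result depend only on relative frequencies; the helper name `expand` is hypothetical.

```python
from statistics import mean, pstdev

# Counts for the idealized data set with n = 100 (from the slides).
counts_100 = {0: 20, 1: 30, 2: 20, 3: 15, 4: 10, 5: 5}
# Same relative frequencies, ten times as many values (n = 1000).
counts_1000 = {x: 10 * c for x, c in counts_100.items()}

def expand(counts):
    """Turn a value -> count table into the full list of data values."""
    return [x for x, c in counts.items() for _ in range(c)]

for counts in (counts_100, counts_1000):
    data = expand(counts)
    # pstdev divides by n (population form), an assumption made here.
    print(len(data), round(mean(data), 2), round(pstdev(data), 2))
```

Both lines report a mean of 1.8 and an SD of 1.44, even though one data set is ten times larger than the other.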

slide15

The mean μ of a probability distribution is the mean value observed for all possible outcomes of the phenomenon.

Formula: $\mu = \sum x \, p(x)$

μ is synonymous with “population mean.”

Σ is the summation symbol.

μ is the Greek letter mu (pronounced “myou”).

slide16

Multiply each value by its probability

Sum the products

Mean = 1.80

slide17

The standard deviation σ of a probability distribution is the standard deviation of the values observed for all possible outcomes of the phenomenon.

Formula: $\sigma = \sqrt{\sum (x - \mu)^2 \, p(x)}$

σ denotes “population standard deviation.”

σ is the Greek letter sigma.
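As a worked illustration of both formulas, here is a short sketch applied to the credit card distribution. The probabilities are the relative frequencies of the idealized data sets above; the variable names are arbitrary.

```python
from math import sqrt

# x = number of credit cards, p(x) = relative frequency from the idealized data.
dist = {0: 0.20, 1: 0.30, 2: 0.20, 3: 0.15, 4: 0.10, 5: 0.05}

# Mean: multiply each value by its probability, then sum the products.
mu = sum(x * p for x, p in dist.items())

# Variance: probability-weighted squared deviations from the mean.
variance = sum((x - mu) ** 2 * p for x, p in dist.items())

# Standard deviation: square root of the variance.
sigma = sqrt(variance)

print(round(mu, 2), round(sigma, 2))  # 1.8 1.44
```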

slide19

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5

Mean = 1.80 SD = 1.44

Mean – SD = 0.36

Mean + SD = 3.24

65 of the 100 values (the 1s, 2s, and 3s) fall between these: 65 / 100 = 65%

slide20

Mean = 1.80 SD = 1.44

Mean – SD = 0.36

Mean + SD = 3.24

p(1) + p(2) + p(3) = 0.30 + 0.20 + 0.15 = 0.65
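The same "within 1 SD" calculation can be sketched in code, assuming the credit card distribution and the rounded mean and SD quoted on the slide:

```python
# x = number of credit cards, with the probabilities from the idealized data.
dist = {0: 0.20, 1: 0.30, 2: 0.20, 3: 0.15, 4: 0.10, 5: 0.05}
mean, sd = 1.80, 1.44  # rounded values quoted on the slide

low, high = mean - sd, mean + sd

# Keep only the values that fall between mean - SD and mean + SD.
inside = [x for x in dist if low <= x <= high]
prob = sum(dist[x] for x in inside)

print(inside, round(prob, 2))  # [1, 2, 3] 0.65
```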

slide22
x = # of children in a randomly selected college student’s family.

p(1) = 0.2194: 21.94% of all college students come from a 1-child family.

Guess at mean? Above 2

(right skew → mean > mode).

slide24

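To determine the mean, multiply values by probabilities, x p(x), and sum these.

55/10 = 5.50 is not the mean (that is the unweighted average of the ten x values).

1.000/10 = 0.10 is not the mean (that is the average of the ten probabilities).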

slide25
To determine the variance, multiply squared deviations from the mean by probabilities, (x – μ)² p(x), and sum these.

slide26

The standard deviation is the square root of the variance.

Examining the data set consisting of # of children in the family recorded for all students: The mean is 2.743; the standard deviation is 1.468.
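The calculation can be sketched as follows. The probability table itself did not survive in this transcript, so the values below are pieced together from the sums quoted on other slides, and p(5) is not stated anywhere: it is inferred here as whatever is left over so that the probabilities sum to 1. Treat the table as an assumption.

```python
from math import sqrt

# Probabilities recovered from sums quoted on the slides; p(5) is NOT given
# in the transcript and is inferred below as the remainder (an assumption).
p = {1: 0.2194, 2: 0.2806, 3: 0.2329, 4: 0.1442,
     6: 0.0317, 7: 0.0124, 8: 0.0043, 9: 0.0005, 10: 0.0003}
p[5] = 1 - sum(p.values())

mu = sum(x * px for x, px in p.items())                     # sum of x * p(x)
variance = sum((x - mu) ** 2 * px for x, px in p.items())   # sum of (x - mu)^2 * p(x)
sigma = sqrt(variance)

# Pitfall: the unweighted average of the x values (55 / 10) ignores p(x).
naive = sum(p) / len(p)

print(f"mean = {mu:.3f}, SD = {sigma:.3f}, naive average = {naive}")
```

Under these assumptions the weighted mean and SD land within rounding of the 2.743 and 1.468 quoted above, while the naive average of 5.5 is far off.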

slide33

Determine the probability a student is from a family with more than 5 children.

P(x > 5) = 0.0317 + 0.0124 + 0.0043 + 0.0005 + 0.0003 = 0.0492

slide34

Determine the probability a student is from a family with more than 5 children.

P(x > 5) = 0.0492

4.92% of all college students come from families with more than 5 children (they have 5 or more brothers and sisters).

slide35
Determine the probability a student is from a family with at most 3 children.

P(x ≤ 3) = 0.2194 + 0.2806 + 0.2329 = 0.7329

slide36

Determine the probability a student is from a family with at least 7 children.

P(x ≥ 7) = 0.0124 + 0.0043 + 0.0005 + 0.0003 = 0.0175

Good idea: Take the reciprocal of a small probability…

1/0.0175 = 57.1 → about 1 in 57 students

slide37

Determine the probability a student is from a family with fewer than 5 children.

P(x < 5) = 0.2194 + 0.2806 + 0.2329 + 0.1442 = 0.8771
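Each of these event probabilities is a sum of p(x) over the values that satisfy a condition. A minimal sketch, assuming the table below: all entries except p(5) are quoted in the sums above, p(5) = 0.0737 is inferred so the table sums to 1, and the helper name `prob` is hypothetical.

```python
# x = number of children in the family; p(5) is inferred, not quoted (assumption).
p = {1: 0.2194, 2: 0.2806, 3: 0.2329, 4: 0.1442, 5: 0.0737,
     6: 0.0317, 7: 0.0124, 8: 0.0043, 9: 0.0005, 10: 0.0003}

def prob(event):
    """Sum p(x) over every value x for which event(x) is true."""
    return sum(px for x, px in p.items() if event(x))

print(round(prob(lambda x: x > 5), 4))    # more than 5  -> 0.0492
print(round(prob(lambda x: x <= 3), 4))   # at most 3    -> 0.7329
print(round(prob(lambda x: x >= 7), 4))   # at least 7   -> 0.0175
print(round(prob(lambda x: x < 5), 4))    # fewer than 5 -> 0.8771

# Reciprocal of a small probability: "about 1 in 57 students".
print(round(1 / prob(lambda x: x >= 7), 1))  # 57.1
```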

slide38

at most 3; less than or equal to 3; no more than 3: x ≤ 3

at least 7; greater than or equal to 7; no fewer/less than 7: x ≥ 7

slide39
Determine the probability that the number of children in a student’s family falls within 1 standard deviation of the mean.

Guess? 0.68

slide40

Determine the probability that the number of children in a student’s family falls within 1 standard deviation of the mean.

Mean = 2.743

SD = 1.468

1 SD below the mean: 2.743 – 1.468 = 1.275

1 SD above the mean: 2.743 + 1.468 = 4.211

slide43

Determine the probability that the number of children in a student’s family falls within 1 standard deviation of the mean.

1 SD below the mean = 1.275

1 SD above the mean = 4.211

Values are within 1 SD of the mean if they are between these.

The probability of being between these (x = 2, 3, or 4):

0.2806 + 0.2329 + 0.1442 = 0.6577

slide44

Determine the probability that the number of children in a student’s family falls within 2 standard deviations of the mean.

Guess? 0.95

2 SD below the mean: 1.275 – 1.468 = -0.193

2 SD above the mean: 4.211 + 1.468 = 5.679

Between -0.193 and 5.679.

slide47

Determine the probability that the number of children in a student’s family falls within 2 standard deviations of the mean.

Between -0.193 and 5.679.

(Equivalent to 5 or fewer.)

We know an outcome more than 5 has probability 0.0492.

The probability of an outcome at most 5 is 1 – 0.0492 = 0.9508.

slide48

Determine the probability that the number of children in a student’s family falls within 2 standard deviations of the mean.

Between -0.193 and 5.679: 0.9508.
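The 1 SD and 2 SD calculations follow the same pattern, so they can be sketched with one helper. As before, the table is an assumption pieced together from the sums on the slides, with p(5) = 0.0737 inferred rather than quoted; `prob_within` is a hypothetical name.

```python
# x = number of children in the family; p(5) is inferred, not quoted (assumption).
p = {1: 0.2194, 2: 0.2806, 3: 0.2329, 4: 0.1442, 5: 0.0737,
     6: 0.0317, 7: 0.0124, 8: 0.0043, 9: 0.0005, 10: 0.0003}
mean, sd = 2.743, 1.468  # values quoted on the slides

def prob_within(k):
    """Probability that x lies within k standard deviations of the mean."""
    low, high = mean - k * sd, mean + k * sd
    return sum(px for x, px in p.items() if low <= x <= high)

print(round(prob_within(1), 4))  # 0.6577 (x = 2, 3, 4)
print(round(prob_within(2), 4))  # 0.9508 (x = 1, ..., 5)
```

These are close to, but not exactly, the 0.68 and 0.95 guesses, which is expected for a right-skewed discrete distribution.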

slide49

A company monitors pollutants downstream of discharge into a stream.

Data were collected on 200 days from a point 1 mile downstream of the plant on Stream A.

Data were collected on 100 days from a point 1 mile downstream of the plant on Stream B.

Pollutant Particles in Streamwater

How do means compare?

(What are the means?)

How do SDs compare?

(What are the SDs?)

Similar Means.

Similar Standard Deviations.

(Similar everything except the sample sizes, n.)

Stream B

Mean = 1.775

SD = 1.242

Stream A

Mean = 1.770

SD = 1.340

Here is the probability distribution for the number of diners seated at a table in a small café.

a) Fill in the blank

slide56

Here is the probability distribution for the number of diners seated at a table in a small café.

  • b) Determine the mean μ
    • Start by computing xp(x) for each row.
slide62

Here is the probability distribution for the number of diners seated at a table in a small café.

  • b) Determine the mean μ
    • Sum these.
  • μ = 3.00
slide63

Here is the probability distribution for the number of diners seated at a table in a small café.

  • c) Determine the standard deviation σ
    • Start by computing
    • (x – μ)² p(x)
    • for each row.
    • With μ = 3, this is (x – 3)² p(x).
slide70

Here is the probability distribution for the number of diners seated at a table in a small café.

  • c) Determine the standard deviation σ
    • Sum these.
    • Variance = 1.00
    • SD: σ = √1.00 = 1.00
slide71
This framework makes it possible to obtain fairly good approximations to means and standard deviations from a histogram of continuous data.

[Optional] Application

Here are waiting times between student arrivals in a class. There are 21 students (20 waits).

Example

Approximate the mean and median. How do they compare?

For each class, determine its frequency and corresponding midpoint.

Example: Mean

Frequency = 10

Midpoint = 5

Find the value with 50% below and 50% above.

Example: Median

10 of the 20 waits (50%) fall below 10

Median ≈ 10.00

Mean ≈ 14.00

Range ≈ 44

S.D. ≈ 11
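The midpoint method can be sketched as follows. Only the first class (frequency 10, midpoint 5) is stated in the transcript; the remaining classes are an assumption, tallied from the 20 raw waiting times listed in this transcript using classes of width 10 starting at 0. The histogram itself did not survive, so treat the class table as a reconstruction.

```python
from math import sqrt

# (midpoint, frequency) per class. Only (5, 10) is stated on the slide; the
# rest are tallied from the raw data under an assumed class width of 10.
classes = [(5, 10), (15, 5), (25, 3), (35, 1), (45, 1)]

n = sum(f for _, f in classes)

# Treat every value in a class as if it sat at the class midpoint.
approx_mean = sum(m * f for m, f in classes) / n
approx_var = sum((m - approx_mean) ** 2 * f for m, f in classes) / n
approx_sd = sqrt(approx_var)

print(n, approx_mean, round(approx_sd, 1))  # 20 14.0 11.4
```

This is the same "value times relative frequency" framework used for discrete distributions, with midpoints standing in for the values.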

slide85

1.3 1.9 1.9 2.5 2.6 3.0 3.6 3.7 5.9 9.7 10.4 10.6 11.2 13.5 15.9 21.4 27.5 29.8 33.6 43.5

Approximations vs. actual values:

Median ≈ 10.0 (actual: 10.05)

Mean ≈ 14.0 (actual: 12.68)

Range ≈ 44 (actual: 42.2)

SD ≈ 11 (actual: 12.31)

Example: Data / Exact Values
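The exact values can be checked directly from the 20 waiting times. This sketch uses Python's standard `statistics` module and assumes the slide's SD is the sample standard deviation (dividing by n – 1), which is the form that agrees with 12.31.

```python
from statistics import mean, median, stdev

# The 20 waiting times listed on the slide.
waits = [1.3, 1.9, 1.9, 2.5, 2.6, 3.0, 3.6, 3.7, 5.9, 9.7,
         10.4, 10.6, 11.2, 13.5, 15.9, 21.4, 27.5, 29.8, 33.6, 43.5]

exact = {
    "median": median(waits),           # average of the 10th and 11th values
    "mean": mean(waits),
    "range": max(waits) - min(waits),
    "sd": stdev(waits),                # sample SD (divides by n - 1)
}

for name, value in exact.items():
    print(f"{name}: {value:.2f}")
```

The results should agree with the actual values on the slide to within rounding, and they show that the histogram-based approximations are reasonably close.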