Random Variables Intro to discrete random variables statistical processes
Random Variables
• “A random variable is a numerical valued function defined over a sample space”
• What does this mean in English?
• If Y is a rv, then Y can take on more than one numerical value
• The sample space is the set of possible values of Y
• What are examples of random variables?
• Let Y = face showing on a die; Y takes values in {1, 2, …, 6}
Variables: A Simple Taxonomy
• Variables are but models
• Variables
  • Deterministic variables
  • Random variables
    • Continuous random variables
    • Discrete random variables
Random Variables: A Simple Example
• Variables model physical processes
• Let S = sales, C = costs, P = profit, so P = S - C
• Suppose all variables are deterministic
  • S = 25 and C = 15, so P = 10
• Suppose S is a rv with possible values {25, 30}
  • What is P?
• RVs may be used just as deterministic variables are
• How shall we describe the behavior of a rv?
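A minimal Python sketch of the profit example, assuming C stays fixed at 15 while S varies. The names `sales_values`, `cost`, and `profit_values` are illustrative and not from the slides. No probabilities are attached because the slide gives none.

```python
# Hypothetical names; the numbers come from the slide (S in {25, 30}, C = 15).
sales_values = [25, 30]
cost = 15

# P = S - C is evaluated once per possible value of S,
# so P inherits its randomness from S and has one value per value of S.
profit_values = [s - cost for s in sales_values]
print(profit_values)
```

The arithmetic is the same as in the deterministic case; the only change is that it is applied to each possible value of S.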
Developing RV Standard Models: Distribution Functions
• Distribution functions assign probability to every real-numbered value of a rv
• A Probability Mass Function (PMF) assigns a probability to each value of a discrete rv
• A Probability Density Function (PDF) is a mathematical function that describes the distribution of a continuous rv
• Standard models are convenient for describing physical processes
• Example of a PMF: let T = project duration (a rv)
  • t1 = 4 weeks; p(T = t1) = p(t1) = 0.2
  • t2 = 5 weeks; p(T = t2) = p(t2) = 0.3
  • t3 = 6 weeks; p(T = t3) = p(t3) = 0.5
• Note conventions!
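One possible way to hold the project-duration PMF in code is a dictionary from value to probability. The sketch below is illustrative only: the names `pmf_T`, `is_valid_pmf`, and `p_at_most_5` are assumptions, not part of the slides. It checks the two properties any PMF must satisfy and sums entries to get P(T ≤ 5).

```python
import math

# PMF from the slide: project duration T in weeks -> probability.
pmf_T = {4: 0.2, 5: 0.3, 6: 0.5}

def is_valid_pmf(pmf):
    """Return True if every probability lies in [0, 1] and they sum to 1."""
    in_range = all(0.0 <= p <= 1.0 for p in pmf.values())
    return in_range and math.isclose(sum(pmf.values()), 1.0)

# Cumulative probability P(T <= 5), built by summing PMF entries.
p_at_most_5 = sum(p for t, p in pmf_T.items() if t <= 5)
print(is_valid_pmf(pmf_T), p_at_most_5)
```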
Characteristic Measures for PMFs: Central Tendency
• Central tendency of a pmf
  • Mean or average: $\mu = E(y) = \sum_y y\,p(y)$
• What is E(T) for the project duration example?
  • E(T) = 4(0.2) + 5(0.3) + 6(0.5) = 5.3 weeks
• What if C = f(T), where C = costs?
  • Is C a random variable?
  • What is E(C)?
Mean of a Discrete RV: Interesting Characteristics
• Expected value of a function of y, a discrete rv
  • Let g(y) be a function of y; then $E[g(y)] = \sum_y g(y)\,p(y)$
  • Suppose C = g(T) = 5T + 3; find E(C)
  • E(C) = [5(4)+3](0.2) + [5(5)+3](0.3) + [5(6)+3](0.5) = 29.5
• Let d = constant
  • E(d) = d
  • E(dy) = dE(y)
• E() is a linear operator
  • E(X + Y) = E(X) + E(Y), where X and Y are rvs
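The expected-value calculations for T and for C = 5T + 3 can be sketched in Python as follows. The function name `expected_value` and the dictionary representation of the PMF are assumptions made for illustration. The final check uses linearity: E(5T + 3) should equal 5E(T) + 3.

```python
import math

# PMF of project duration T (weeks) from the slides.
pmf_T = {4: 0.2, 5: 0.3, 6: 0.5}

def expected_value(pmf, g=lambda y: y):
    """E[g(Y)]: sum of g(y) * p(y) over all values y of a discrete rv."""
    return sum(g(y) * p for y, p in pmf.items())

mean_T = expected_value(pmf_T)                       # E(T)
mean_C = expected_value(pmf_T, lambda t: 5 * t + 3)  # E(C) with C = 5T + 3

# Linearity of E(): E(5T + 3) equals 5 * E(T) + 3 up to rounding error.
linear_ok = math.isclose(mean_C, 5 * mean_T + 3)
print(round(mean_T, 4), round(mean_C, 4), linear_ok)
```

Up to floating-point rounding, the two means agree with the hand calculations on the slides (5.3 weeks and 29.5).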
Random Variable: Variance - A Measure of Dispersion
• Variance of a discrete rv: $\sigma^2 = E[(y - \mu)^2] = \sum_y (y - \mu)^2\,p(y)$
• Previously defined variance for population and sample
Mean and Variance: Interpretation
• Mean
  • Expected value of the random variable
• Variance
  • Expected value of the squared distance from the mean
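To make the "expected squared distance from the mean" interpretation concrete, here is a small sketch applied to the project-duration PMF from the earlier slide. The helper names `mean` and `variance` are illustrative assumptions.

```python
import math

# PMF of project duration T (weeks) from the earlier slide.
pmf_T = {4: 0.2, 5: 0.3, 6: 0.5}

def mean(pmf):
    """E(Y) for a discrete rv given as {value: probability}."""
    return sum(y * p for y, p in pmf.items())

def variance(pmf):
    """Var(Y) = E[(Y - mu)^2], the expected squared distance from the mean."""
    mu = mean(pmf)
    return sum((y - mu) ** 2 * p for y, p in pmf.items())

var_T = variance(pmf_T)
std_T = math.sqrt(var_T)  # standard deviation, in the same units as T (weeks)
print(round(var_T, 4), round(std_T, 4))
```

By hand, the variance is (1.3)²(0.2) + (0.3)²(0.3) + (0.7)²(0.5) = 0.61 weeks².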
Discrete Random Variables: Useful Models
• Examine frequently encountered models
• Be sure to understand
  • Process being modeled by the random variable
  • Derivation of the pmf
• Use of Excel
  • Calculating the pmf
  • Graphing the pmf
Binomial Distribution Function: Setting the Stage
• Bernoulli rv
  • Models a process in which an outcome either happens or does not
  • A binary outcome
  • What are examples?
• Formal description
  • A trial results in 1 of 2 mutually exclusive outcomes
  • Outcomes are exhaustive
  • P(S) = p; P(F) = q; p + q = 1.0
Probability Mass Function: Bernoulli RV
• $p(y) = p^y q^{1-y}$ for y = 0, 1 (that is, p(1) = p and p(0) = q)
• $\mu = p$ and $\sigma^2 = pq$
• How can we derive these?
Deriving the mean and variance of a Bernoulli Random Variable
• Deriving the mean of a rv:
  • We know that $p(1) = p$ and $p(0) = q$
  • We also know that $E(y) = \sum_y y\,p(y)$
  • So it follows that $E(y) = 0 \cdot q + 1 \cdot p = p$
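Deriving the variance of a random variable
• For a Bernoulli rv with mean $\mu = p$: $\sigma^2 = E[(y - \mu)^2] = (0 - p)^2 q + (1 - p)^2 p = p^2 q + q^2 p = pq(p + q) = pq$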
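The Bernoulli mean and variance can be checked numerically straight from the definitions. In this sketch the success probability p = 0.3 is a hypothetical value chosen for illustration, and the function names are assumptions; the standard results are mean p and variance pq.

```python
import math

def bernoulli_pmf(p):
    """PMF of a Bernoulli rv: 1 (success) with probability p, 0 (failure) with q = 1 - p."""
    return {0: 1.0 - p, 1: p}

def mean(pmf):
    return sum(y * prob for y, prob in pmf.items())

def variance(pmf):
    mu = mean(pmf)
    return sum((y - mu) ** 2 * prob for y, prob in pmf.items())

p = 0.3  # hypothetical success probability, not from the slides
pmf = bernoulli_pmf(p)
bern_mean = mean(pmf)     # should match p
bern_var = variance(pmf)  # should match p * q
print(bern_mean, round(bern_var, 4))
```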
Binomial Distribution: Problem Description
• Problem: given n trials of a Bernoulli rv, what is the probability of y successes?
• Why is y a discrete rv?
• Simple example: toss a coin 3 times, find P(2 heads)
  • n = 3; y = 2
  • P(H, H, T) = (0.5)(0.5)(0.5) = 0.125
  • Could also be (H, T, H) or (T, H, H)
  • P(2 heads) = 0.125 + 0.125 + 0.125 = 0.375
Binomial Distribution Function: Generalizing From Simple Example
• Recall 2 heads in three tosses
• How many different ways is this possible?
  • Combination of three things taken two at a time: $\binom{3}{2} = \frac{3!}{2!\,1!} = 3$
• Returning to P(2 heads): $P(2\ \text{heads}) = \binom{3}{2}(0.5)^2(0.5)^1 = 0.375$
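The counting argument for three tosses can be verified by brute force. This sketch enumerates every sequence of three fair-coin tosses and counts those with exactly two heads; the variable names are illustrative assumptions.

```python
from itertools import product
from math import comb

# All 2^3 = 8 equally likely sequences of three fair-coin tosses.
outcomes = list(product("HT", repeat=3))

# Keep only the sequences with exactly two heads.
two_heads = [o for o in outcomes if o.count("H") == 2]

# With a fair coin every sequence has probability 1/8, so counting suffices.
prob_two_heads = len(two_heads) / len(outcomes)
print(two_heads, prob_two_heads, comb(3, 2))
```

The enumeration finds the same three orderings listed on the slide, and `comb(3, 2)` gives that count directly without listing them.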
Binomial Distribution Function: Creating the Model
• $p(y) = \binom{n}{y} p^y q^{n-y}$, for y = 0, 1, …, n
  • $\binom{n}{y}$: # of combinations
  • $p^y$: probability of y successes
  • $q^{n-y}$: probability of n - y failures
• Key assumption
  • Each trial is an independent, identical Bernoulli variable
• E(y) = np
• Var(y) = npq
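A short sketch of the binomial model, assuming the standard pmf p(y) = C(n, y) p^y q^(n-y). The function name `binomial_pmf` is an assumption. It reuses the three-toss coin example and checks E(y) = np and Var(y) = npq from the definitions.

```python
from math import comb, isclose

def binomial_pmf(y, n, p):
    """P(Y = y): (# of combinations) * (prob. of y successes) * (prob. of n - y failures)."""
    return comb(n, y) * p**y * (1 - p) ** (n - y)

n, p = 3, 0.5  # the three-toss fair-coin example
pmf = {y: binomial_pmf(y, n, p) for y in range(n + 1)}

# Mean and variance computed from the pmf, for comparison with np and npq.
mean_y = sum(y * py for y, py in pmf.items())
var_y = sum((y - mean_y) ** 2 * py for y, py in pmf.items())
print(pmf[2], mean_y, var_y)
```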
Binomial Distribution Function: Simple Problem
• Have 20 coin tosses
• Find the probability of 10 or more heads
• Set up the problem, then solve
• Let
  • n = 20
  • y = # of heads
  • p = q = 0.50
• Want p(y ≥ 10)
• Will solve manually and using Excel
Binomial Example: Manual solution
• $p(10) = \binom{20}{10}(0.5)^{10}(0.5)^{10} = \frac{184756}{1048576} \approx 0.1762$
• But remember! This is just for y = 10. We must do this for y = 11, 12, …, 20 as well and then sum all the values!
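The repeated manual calculation is easy to automate. The sketch below, with an assumed helper name `binomial_pmf`, computes p(10) and then sums the pmf over y = 10, 11, …, 20 for n = 20 and p = 0.5.

```python
from math import comb

def binomial_pmf(y, n, p):
    """P(Y = y) for a binomial rv with n trials and success probability p."""
    return comb(n, y) * p**y * (1 - p) ** (n - y)

n, p = 20, 0.5

# The single term worked out by hand on the slide.
p_exactly_10 = binomial_pmf(10, n, p)

# The full answer: add the terms for y = 10, 11, ..., 20.
p_at_least_10 = sum(binomial_pmf(y, n, p) for y in range(10, n + 1))
print(round(p_exactly_10, 4), round(p_at_least_10, 4))
```

In Excel, the same tail probability could be obtained as 1 - BINOMDIST(9, 20, 0.5, TRUE), since the cumulative form returns p(y ≤ 9).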
Multinomial Distribution: Generalizing the Binomial Distribution
• Problem: events E1, …, Ek occur with probabilities p1, p2, …, pk. Given n independent trials, find the probability that E1 occurs y1 times, …, Ek occurs yk times.
• Why is this a more general case than the Binomial?
• Can you describe an example?
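Formula for Multinomial: Understand Relationship to Binomial
• $p(y_1, \ldots, y_k) = \frac{n!}{y_1!\,y_2! \cdots y_k!}\, p_1^{y_1} p_2^{y_2} \cdots p_k^{y_k}$, where $\sum_j y_j = n$ and $\sum_j p_j = 1$
• Need to understand the convention
• Note there are k random variables
• This is called a joint distribution
• $\mu_j = n p_j$
• $\sigma_j^2 = n p_j q_j = n p_j (1 - p_j)$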
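A sketch of the multinomial pmf, assuming the standard form n!/(y1! … yk!) · p1^y1 … pk^yk. The function name and the three-outcome probabilities (0.5, 0.3, 0.2) are hypothetical illustrations, not from the slides. The last lines show that with k = 2 the formula collapses to the binomial.

```python
from math import comb, factorial, isclose, prod

def multinomial_pmf(counts, probs):
    """Joint probability that outcome j occurs counts[j] times in n = sum(counts) trials."""
    n = sum(counts)
    coeff = factorial(n)
    for y in counts:
        coeff //= factorial(y)  # exact integer division at every step
    return coeff * prod(p**y for p, y in zip(probs, counts))

# Hypothetical example: 3 outcomes, 4 trials, observed counts (2, 1, 1).
p_joint = multinomial_pmf((2, 1, 1), (0.5, 0.3, 0.2))

# With k = 2 outcomes the multinomial reduces to the binomial:
# 2 heads and 1 tail in 3 fair-coin tosses.
p_two_outcomes = multinomial_pmf((2, 1), (0.5, 0.5))
p_binomial = comb(3, 2) * 0.5**2 * 0.5**1
print(round(p_joint, 4), p_two_outcomes, p_binomial)
```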
Extending the Binomial: Two Special Cases
• Recall Binomial distribution
  • What problem does it model?
  • Given n independent trials, p = p(success)
• Geometric distribution
  • Define y as the rv representing the trial on which the first success occurs
• Negative Binomial
  • Define y as the rv representing the trial on which the rth success occurs
Geometric Distribution
• Recall problem statement for geometric
• Suppose p = 0.2, what is p(Y = 3)?
  • Only possible order is FFS
  • p(Y = 3) = (0.8)(0.8)(0.2) = 0.128
• Generalizing the simple example
  • $p(y) = p q^{y-1}$; $\mu = 1/p$; $\sigma^2 = q/p^2$
• What is the implicit assumption about the largest value of y?
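A sketch of the geometric pmf p(y) = p·q^(y-1), using the slide's p = 0.2. Because y has no largest value, the mean is approximated here by truncating the sum at 1000 terms. That cutoff and the function name are assumptions made for illustration.

```python
def geometric_pmf(y, p):
    """P(first success occurs on trial y): y - 1 failures followed by one success."""
    return p * (1 - p) ** (y - 1)

p = 0.2
p_y3 = geometric_pmf(3, p)  # the FFS case from the slide

# y is unbounded, so approximate E(Y) with a long truncated sum.
# The ignored tail is negligible for p = 0.2.
approx_mean = sum(y * geometric_pmf(y, p) for y in range(1, 1001))
print(round(p_y3, 4), round(approx_mean, 4), 1 / p)
```

The truncated sum should land very close to the closed-form mean 1/p = 5.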
Negative Binomial Distribution
• Problem: have a series of Bernoulli trials, want the probability of waiting until the yth trial to get the rth success
• $p(y) = \binom{y-1}{r-1} p^r q^{y-r}$
• $\mu = \frac{r}{p}$; $\sigma^2 = \frac{rq}{p^2}$
• Let p = 0.5 and r = 2; do we get reasonable results?
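The p = 0.5, r = 2 question can be explored numerically. This sketch assumes the standard negative binomial pmf p(y) = C(y-1, r-1) p^r q^(y-r) and truncates the infinite sums at y = 200. Both the cutoff and the names are illustrative assumptions.

```python
from math import comb

def neg_binomial_pmf(y, r, p):
    """P(the r-th success occurs on trial y); zero when y < r."""
    if y < r:
        return 0.0
    return comb(y - 1, r - 1) * p**r * (1 - p) ** (y - r)

p, r = 0.5, 2
# Truncate at y = 200; the remaining tail is negligible for p = 0.5.
probs = {y: neg_binomial_pmf(y, r, p) for y in range(r, 201)}

approx_mean = sum(y * py for y, py in probs.items())
approx_var = sum((y - approx_mean) ** 2 * py for y, py in probs.items())
print(probs[2], round(approx_mean, 4), round(approx_var, 4))
```

With these inputs the closed forms give a mean of r/p = 4 trials and a variance of rq/p² = 4, which seems reasonable: each success takes 2 tosses of a fair coin on average.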
Hypergeometric: An Extension to the Binomial
• Suppose we have 10 transformers and know 1 is defective
  • p(defective) = 0.1
• Let y = # of defectives in a sample of n
• Suppose we pick 3 transformers; find p(y = 2)
• Can I use the Binomial distribution?
• Does p stay constant through all trials?
Transformer Example
• What do you note about the example?
  • p(defective) changed during the sampling process
  • # of trials n is large with respect to N
• What if N >> n?
  • Would p(defective) change during the sampling process?
• Process is called sampling without replacement
• Binomial assumes an infinite population OR sampling with replacement. Why?
• If we cannot use the Binomial, then what?
  • Hypergeometric Probability Distribution
Hypergeometric Distribution
• $p(y) = \frac{\binom{r}{y}\binom{N-r}{n-y}}{\binom{N}{n}}$
• Where
  • N = # in population
  • n = # in sample
  • r = # of successes in population
  • y = # of successes in sample
1) Why is y a rv?
2) What do we mean by p(y)?
3) What is r/N?
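The transformer example shows why the binomial is the wrong model here. The sketch below assumes the standard hypergeometric pmf C(r, y)·C(N-r, n-y)/C(N, n), with illustrative function names. It compares that pmf with a binomial that wrongly treats p = r/N as constant across draws.

```python
from math import comb

def hypergeometric_pmf(y, N, n, r):
    """P(y successes in a sample of n drawn without replacement from N items, r of them successes)."""
    return comb(r, y) * comb(N - r, n - y) / comb(N, n)

def binomial_pmf(y, n, p):
    return comb(n, y) * p**y * (1 - p) ** (n - y)

N, n, r = 10, 3, 1  # 10 transformers, sample of 3, 1 defective in the population
hyper = {y: hypergeometric_pmf(y, N, n, r) for y in range(n + 1)}

# The binomial treats p = r/N = 0.1 as constant on every draw.
binom_p2 = binomial_pmf(2, n, r / N)
print(hyper, round(binom_p2, 4))
```

With only one defective unit in the population, two defectives in the sample is impossible, and the hypergeometric pmf returns 0 for y = 2. The binomial still assigns that outcome a positive probability.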
Poisson Process: A Useful Model
• In a Poisson process
  • Events occur purely randomly
  • Over the long term, the rate is constant
• What is the implication of the above?
  • Memoryless process
• What are some processes modeled as Poisson processes?
A Poisson Process is a Rate
• # of defects in an 8x8 sheet of plywood
• # of cars passing a fixed point in one minute
Poisson Probability Distribution
• $p(y) = \frac{\lambda^y e^{-\lambda}}{y!}$, for y = 0, 1, 2, …
• Where
  • y = # of occurrences in a given unit
  • $\lambda$ = mean # of occurrences in a given unit
  • e = 2.71828…
  • Note: must be for the same unit of measure!
• Why does this make sense?
• Note particularly interesting relationship: $\mu = \sigma^2 = \lambda$
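A sketch of the Poisson pmf, assuming the standard form p(y) = λ^y e^(-λ) / y!. The rate λ = 2 occurrences per unit is a hypothetical value, and the sums are truncated at y = 100. The names are illustrative. The check is that the mean and the variance both come out equal to λ.

```python
from math import exp, factorial

def poisson_pmf(y, lam):
    """P(y occurrences in one unit) when the mean number of occurrences per unit is lam."""
    return lam**y * exp(-lam) / factorial(y)

lam = 2.0  # hypothetical rate, e.g. 2 cars per minute; not from the slides
# y is unbounded; truncating at 100 leaves a negligible tail for lam = 2.
probs = {y: poisson_pmf(y, lam) for y in range(0, 101)}

total = sum(probs.values())
approx_mean = sum(y * py for y, py in probs.items())
approx_var = sum((y - approx_mean) ** 2 * py for y, py in probs.items())
print(round(total, 6), round(approx_mean, 6), round(approx_var, 6))
```

If the rate is quoted for a different unit (say, per hour instead of per minute), λ must be rescaled before it is used.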
Discrete Random Variables: Excel Special Functions
• Special Functions
  • HYPGEOMDIST
  • BINOMDIST
  • NEGBINOMDIST
  • POISSON
• Are there others?
Class 3 Readings & Problems
• Reading assignment
  • M & S Chapter 4, Sections 4.1 - 4.10
• Recommended problems
  • M & S Chapter 4: 59, 69, 84, 87, 88, 90, 96, 98, 100