## Unit 2 : Random Variables and their Distributions


**Unit 2 : Random Variables and their Distributions**

Wenyaw Chan, Division of Biostatistics, School of Public Health, University of Texas - Health Science Center at Houston

**Random Variable**

• Random Variable: A numeric function that assigns probabilities to different events in a sample space.
• Discrete Random Variable: A random variable that assumes only a finite or denumerable number of values.
  • The probability mass function of a discrete random variable $X$ that assumes values $x_1, x_2, \ldots$ is $p(x_1), p(x_2), \ldots$, where $p(x_i) = \Pr[X = x_i]$.
• Continuous Random Variable: A random variable whose possible values cannot be enumerated.

**Example: Flip a coin 3 times**

• Random Variable: $X$ = number of heads in the 3 coin tosses
• Probability Mass Function:
  • $P(X=3) = P\{HHH\} = 1/8$
  • $P(X=2) = P\{HHT, HTH, THH\} = 3/8$
  • $P(X=1) = P\{HTT, THT, TTH\} = 3/8$
  • $P(X=0) = P\{TTT\} = 1/8$
• $X$ is a discrete random variable with probability (mass) function $p(0) = 1/8$, $p(1) = 3/8$, $p(2) = 3/8$, $p(3) = 1/8$.

**Random Variable**

• Expected value of $X$: $E(X) = \mu = \sum_i x_i \, p(x_i)$
• Variance of $X$: $\mathrm{Var}(X) = \sigma^2 = \sum_i (x_i - \mu)^2 \, p(x_i)$
• Standard deviation of $X$: $\sigma = \sqrt{\mathrm{Var}(X)}$

**Random Variable**

• Note: $\mathrm{Var}(X) = E(X^2) - (E X)^2$
• Cumulative distribution function of $X$: $F(x) = \Pr(X \le x)$

**Binomial Distribution**

• Examples of the binomial distribution have a common structure:
  • $n$ independent trials
  • each trial has only two possible outcomes, called "success" and "failure"
  • $\Pr(\text{success}) = p$ for all trials

**Binomial Distribution**

• If $X$ = number of successful trials in these $n$ trials, then $X$ has a binomial distribution:
  $\Pr(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$, $k = 0, 1, 2, \ldots, n$, where $\binom{n}{k} = \dfrac{n!}{k!\,(n-k)!}$.
• Example: Flip a coin 10 times

**Properties of Binomial Distribution**

• If $X \sim \text{Binomial}(n, p)$, then $E(X) = np$ and $\mathrm{Var}(X) = np(1-p)$.

**Poisson Distribution**

$\Pr(X = k) = \dfrac{e^{-\lambda} \lambda^k}{k!}$, $k = 0, 1, 2, \ldots$
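The coin-flip example and the binomial mean and variance stated above can be checked numerically. The following is a minimal Python sketch, not part of the original slides: the function names `coin_flip_pmf`, `binomial_pmf`, and `mean_and_variance` are hypothetical helpers introduced here for illustration, and a fair coin (p = 1/2) is assumed.

```python
from fractions import Fraction
from itertools import product
from math import comb


def coin_flip_pmf(n_flips):
    """Enumerate all equally likely H/T sequences and tally the number of heads."""
    total = 2 ** n_flips
    counts = {}
    for outcome in product("HT", repeat=n_flips):
        heads = outcome.count("H")
        counts[heads] = counts.get(heads, 0) + 1
    return {k: Fraction(c, total) for k, c in sorted(counts.items())}


def binomial_pmf(n, p):
    """Pr(X = k) = C(n, k) * p^k * (1 - p)^(n - k) for k = 0, ..., n."""
    return {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}


def mean_and_variance(pmf):
    """E(X) = sum of x * p(x); Var(X) = sum of (x - mean)^2 * p(x)."""
    mean = sum(x * px for x, px in pmf.items())
    var = sum((x - mean) ** 2 * px for x, px in pmf.items())
    return mean, var


# Assumption: a fair coin, so p = 1/2 (kept as an exact fraction).
pmf = coin_flip_pmf(3)
mean, var = mean_and_variance(pmf)
print(mean, var)  # prints "3/2 3/4", i.e. np and np(1 - p) for n = 3, p = 1/2
```

Using exact fractions rather than floats makes the comparison with the hand-computed values 1/8, 3/8, 3/8, 1/8 an equality check instead of an approximate one.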
If $X \sim \text{Poisson}(\lambda)$, then $E(X) = \lambda$ and $\mathrm{Var}(X) = \lambda$.

**Poisson Process**

• Assumption 1:
  • $\Pr\{\text{1 event occurs in a very small time interval } [0, \Delta t)\} \approx \lambda \Delta t$
  • $\Pr\{\text{0 events occur in a very small time interval } [0, \Delta t)\} \approx 1 - \lambda \Delta t$
  • $\Pr\{\text{more than one event occurs in a very small time interval } [0, \Delta t)\} \approx 0$
• Assumption 2:
  • The probability of a given number of events per unit time is the same throughout the entire time interval.
• Assumption 3:
  • $\Pr\{\text{one event in } [t_1, t_2) \mid \text{one event in } [t_0, t_1)\} = \Pr\{\text{one event in } [t_1, t_2)\}$

**Poisson Distribution**

• Let $X$ = the number of events occurring in a time period of length $t$ for the above process with parameter $\lambda$. Then $X$ has mean $\mu = \lambda t$ and
  $\Pr(X = k) = \dfrac{e^{-\mu} \mu^k}{k!}$, where $k = 0, 1, 2, \ldots$ and $e \approx 2.71828$.
• $E(X) = \mathrm{Var}(X) = \lambda t$

**Poisson approximation to Binomial**

• If $X \sim \text{Binomial}(n, p)$, $n$ is large and $p$ is small, then $X$ is approximately Poisson with mean $np$: $\Pr(X = k) \approx \dfrac{e^{-np} (np)^k}{k!}$.

**Continuous Probability Distributions**

• Probability density function (p.d.f.) of a random variable: a curve $f(x)$ such that the area under the curve between any two points $a$ and $b$ equals
  $\Pr[a \le X \le b] = \int_a^b f(x)\,dx$.

**Continuous Probability Distributions**

• Cumulative distribution function: $F(a) = \Pr(X \le a)$

**Continuous Probability Distributions**

• The expected value of a continuous random variable $X$ is $E(X) = \int x f(x)\,dx$, where $f(x)$ is the p.d.f. of $X$.
• The definition of the variance of a continuous random variable is the same as that of a discrete random variable, i.e. $\mathrm{Var}(X) = E(X^2) - (E X)^2 = \int (x - \mu)^2 f(x)\,dx$, where $\mu = E(X)$.

**The Normal Distribution (The Gaussian distribution)**

• The p.d.f. of a normal distribution:
  $f(x) = \dfrac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{ -\dfrac{(x - \mu)^2}{2\sigma^2} \right\}$, $-\infty < x < \infty$

**The Normal Distribution**

• Figure: a bell-shaped curve symmetric about $\mu$
• Notation: $X \sim N(\mu, \sigma^2)$, where $\mu$ is the mean and $\sigma^2$ is the variance

**The Normal Distribution**

• $N(0,1)$ is the standard normal distribution.
• If $X \sim N(0,1)$, then $\Pr(X \le x) = \Phi(x)$.
  • $\sim$ : "is distributed as"
  • $\Phi$ : c.d.f. for the standard normal r.v.
• Note: A point of inflection is a point where the curve changes from concave to convex or vice versa, i.e. where the slope stops increasing and starts decreasing. For the standard normal density these points are at $x = \pm 1$.

**Properties of the N(0,1)**

• 1. $\Phi(-x) = 1 - \Phi(x)$
• 2. About 68% of the area under the standard normal curve lies between $-1$ and $1$.
• About 95% of the area under the standard normal curve lies between $-2$ and $2$.
• About 99% of the area under the standard normal curve lies between $-2.5$ and $2.5$.

**Properties of the N(0,1)**

• If $X \sim N(0,1)$ and $P(X < z_u) = u$, $0 \le u \le 1$, then $z_u$ is called the $100u$th percentile of the standard normal distribution.
• 95th percentile = 1.645, 97.5th percentile = 1.96, 99th percentile = 2.33

**Properties of the N(0,1)**

• If $X \sim N(\mu, \sigma^2)$, then $Z = \dfrac{X - \mu}{\sigma} \sim N(0,1)$.
• This property allows us to calculate probabilities for a non-standard normal random variable.

**Other Distributions--- t distribution**

• Let $X_1, \ldots, X_n$ be a random sample from a normal population $N(\mu, \sigma^2)$, with sample mean $\bar{X}$ and sample standard deviation $S$. Then $\dfrac{\bar{X} - \mu}{S / \sqrt{n}}$ has a t distribution with $n-1$ degrees of freedom (df).

**Other Distributions---Chi-square distribution**

• Let $X_1, \ldots, X_n$ be a random sample from a normal population $N(0, 1)$. Then $\sum_{i=1}^{n} X_i^2$ has a chi-square distribution with $n$ degrees of freedom (df).

**Other Distributions---F distribution**

• Let $U$ and $V$ be independent random variables having chi-square distributions with $p$ and $q$ degrees of freedom, respectively. Then $\dfrac{U/p}{V/q}$ has an F distribution with $p$ and $q$ degrees of freedom (df).

**Covariance and Correlation**

• The covariance between two random variables is defined by $\mathrm{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]$.
• The correlation coefficient between two random variables is defined by $\rho = \mathrm{Corr}(X, Y) = \mathrm{Cov}(X, Y) / (\sigma_X \sigma_Y)$.

**Variance of a Linear Combination**

• $\mathrm{Var}(c_1 X_1 + c_2 X_2)$