## Random Variable

**Random Variable**

• A random variable $X$ is a function that assigns a real number, $X(\zeta)$, to each outcome $\zeta$ in the sample space of a random experiment.
• Domain of the random variable: $S$
• Range of the random variable: $S_X$
• Example 1: Suppose that a coin is tossed 3 times and the sequence of heads and tails is noted. The sample space is $S = \{HHH, HHT, HTH, HTT, THH, THT, TTH, TTT\}$. Let $X$ be the number of heads in three coin tosses.

| $\zeta$ | HHH | HHT | HTH | THH | HTT | THT | TTH | TTT |
|---|---|---|---|---|---|---|---|---|
| $X(\zeta)$ | 3 | 2 | 2 | 2 | 1 | 1 | 1 | 0 |

Hence $S_X = \{0, 1, 2, 3\}$.

**Probability of random variable**

• Example 2: The event $\{X = k\}$ = {$k$ heads in three coin tosses} occurs when the outcome of the coin-tossing experiment contains $k$ heads.

$$P[X=0] = P[\{TTT\}] = 1/8$$
$$P[X=1] = P[\{HTT\}] + P[\{THT\}] + P[\{TTH\}] = 3/8$$
$$P[X=2] = P[\{HHT\}] + P[\{HTH\}] + P[\{THH\}] = 3/8$$
$$P[X=3] = P[\{HHH\}] = 1/8$$

• Conclusion: For $B \subset S_X$, let $A = \{\zeta : X(\zeta) \in B\}$. Then $P[B] = P[A] = P[\{\zeta : X(\zeta) \in B\}]$. Events $A$ and $B$ are referred to as equivalent events. All numerical events of practical interest involve $\{X = x\}$ or $\{X \in I\}$.

**Events Defined by Random Variable**

• If $X$ is a r.v. and $x$ is a fixed real number, we can define the event $(X = x)$ as

$$(X = x) = \{\zeta : X(\zeta) = x\}$$
$$(x_1 < X \le x_2) = \{\zeta : x_1 < X(\zeta) \le x_2\}$$

These events have probabilities that are denoted by

$$P[X = x] = P\{\zeta : X(\zeta) = x\}$$
$$P[x_1 < X \le x_2] = P\{\zeta : x_1 < X(\zeta) \le x_2\}$$

**Distribution Function**

The cumulative distribution function (cdf) of a random variable $X$ is defined as the probability of the event $\{X \le x\}$:

$$F_X(x) = P[X \le x] \quad \text{for } -\infty < x < +\infty$$

In terms of the underlying sample space, the cdf is the probability of the event $\{\zeta : X(\zeta) \le x\}$.

• Properties:
  • $0 \le F_X(x) \le 1$
  • $F_X(x_1) \le F_X(x_2)$ if $x_1 < x_2$ (nondecreasing)
  • $\lim_{x \to +\infty} F_X(x) = 1$ and $\lim_{x \to -\infty} F_X(x) = 0$
  • $F_X(x)$ is continuous from the right
  • $P[a < X \le b] = F_X(b) - F_X(a)$

**A typical example of cdf**

• Tossing a coin 3 times and counting the number of heads. Using the probabilities from Example 2:

$$F_X(x) = \begin{cases} 0, & x < 0 \\ 1/8, & 0 \le x < 1 \\ 1/2, & 1 \le x < 2 \\ 7/8, & 2 \le x < 3 \\ 1, & x \ge 3 \end{cases}$$

**Two types of random variables**

• A discrete random variable has a countable number of possible values. Example: $X$ is the number of heads in 5 coin tosses. The values are countable.
• A continuous random variable takes all values in an interval of numbers. Example: $X$ is the time it takes for a bulb to burn out. The values are not countable.

**Example of cdf for discrete random variables**

• Consider the r.v. $X$ defined in Example 2. Its cdf is a staircase function with jumps of 1/8, 3/8, 3/8, and 1/8 at $x = 0, 1, 2, 3$.

**Discrete Random Variable And Probability Mass Function**

• Let $X$ be a r.v. with cdf $F_X(x)$. If $F_X(x)$ changes value only in jumps and is constant between jumps, i.e. $F_X(x)$ is a staircase function, then $X$ is called a discrete random variable.
• Suppose $x_i < x_j$ if $i < j$. Then

$$P(X = x_i) = P(X \le x_i) - P(X \le x_{i-1}) = F_X(x_i) - F_X(x_{i-1})$$

Let $p_X(x) = P(X = x)$. The function $p_X(x)$ is called the probability mass function (pmf) of the discrete r.v. $X$.

• Properties of $p_X(x)$:
  • $0 \le p_X(x_k) \le 1$
  • $p_X(x) = 0$ if $x \ne x_k$ for every $k$
  • $\sum_k p_X(x_k) = 1$
  • $F_X(x) = \sum_{x_k \le x} p_X(x_k)$

**Example of pmf for discrete r.v.**

• Consider the r.v. $X$ defined in Example 2: $p_X(0) = 1/8$, $p_X(1) = 3/8$, $p_X(2) = 3/8$, $p_X(3) = 1/8$.

**Continuous Random variable and Probability Density function**

• Let $X$ be a r.v. with cdf $F_X(x)$. If $F_X(x)$ is continuous and also has a derivative $dF_X(x)/dx$ which exists everywhere except possibly at a finite number of points and is piecewise continuous, then $X$ is called a continuous random variable.
• Let

$$f_X(x) = \frac{dF_X(x)}{dx}$$

• The function $f_X(x)$ is called the probability density function (pdf) of the continuous r.v. $X$. $f_X(x)$ is piecewise continuous.
• Properties:
  • $f_X(x) \ge 0$
  • $\int_{-\infty}^{\infty} f_X(x)\,dx = 1$
  • $P(a < X \le b) = \int_a^b f_X(x)\,dx$
  • $F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$

**Conditional distribution**

• The conditional probability of an event $A$ given event $B$ is defined as

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}, \quad P(B) > 0$$

• The conditional cdf $F_X(x \mid B)$ of a r.v. $X$ given event $B$ is defined as

$$F_X(x \mid B) = P(X \le x \mid B) = \frac{P(\{X \le x\} \cap B)}{P(B)}$$

• If $X$ is discrete, then the conditional pmf $p_X(x \mid B)$ is defined by

$$p_X(x_k \mid B) = P(X = x_k \mid B) = \frac{P(\{X = x_k\} \cap B)}{P(B)}$$

• If $X$ is a continuous r.v., then the conditional pdf $f_X(x \mid B)$ is defined by

$$f_X(x \mid B) = \frac{dF_X(x \mid B)}{dx}$$

**Mean and variance**

• Mean: The mean (or expected value) of a r.v. $X$, denoted by $\mu_X$ or $E(X)$, is defined by

$$\mu_X = E(X) = \begin{cases} \sum_k x_k\, p_X(x_k), & X \text{ discrete} \\ \int_{-\infty}^{\infty} x\, f_X(x)\,dx, & X \text{ continuous} \end{cases}$$

• Moment: The $n$th moment of a r.v. $X$ is defined by

$$E(X^n) = \begin{cases} \sum_k x_k^n\, p_X(x_k), & X \text{ discrete} \\ \int_{-\infty}^{\infty} x^n f_X(x)\,dx, & X \text{ continuous} \end{cases}$$

• Variance: The variance of a r.v. $X$, denoted by $\sigma_X^2$ or $\mathrm{Var}(X)$, is defined by

$$\sigma_X^2 = \mathrm{Var}(X) = E\left[(X - \mu_X)^2\right]$$

**Expectation of a Function of a Random variable**

• Given a r.v. $X$ and its probability distribution (pmf in the discrete case and pdf in the continuous case), how do we calculate the expected value of some function of $X$, $E(g(X))$?
• Proposition:
  (a) If $X$ is a discrete r.v. with pmf $p_X(x)$, then for any real-valued function $g$,

$$E[g(X)] = \sum_k g(x_k)\, p_X(x_k)$$

  (b) If $X$ is a continuous r.v. with pdf $f_X(x)$, then for any real-valued function $g$,

$$E[g(X)] = \int_{-\infty}^{\infty} g(x)\, f_X(x)\,dx$$

**Limit Theorem**

• Markov's Inequality: If $X$ is a r.v. that takes only nonnegative values, then for any value $a > 0$,

$$P\{X \ge a\} \le \frac{E[X]}{a}$$

• Chebyshev's Inequality: If $X$ is a random variable with mean $\mu$ and variance $\sigma^2$, then for any value $k > 0$,

$$P\{|X - \mu| \ge k\} \le \frac{\sigma^2}{k^2}$$

**Application of Limit theorem**

• Suppose we know that the number of items produced in a factory during a week is a random variable with mean 500.
• (a) What can be said about the probability that this week's production will be at least 1000?
• (b) If the variance of a week's production is known to equal 100, then what can be said about the probability that this week's production will be between 400 and 600?
• Solution: Let $X$ be the number of items that will be produced in a week.
  (a) By Markov's inequality, $P\{X \ge 1000\} \le E[X]/1000 = 500/1000 = 0.5$.
  (b) By Chebyshev's inequality, $P\{|X - 500| \ge 100\} \le \sigma^2/(100)^2 = 100/10000 = 0.01$, so $P\{|X - 500| < 100\} \ge 1 - 0.01 = 0.99$.

**Some Special Distribution**

• Bernoulli Distribution
• Binomial Distribution
• Poisson Distribution
• Uniform Distribution
• Exponential Distribution
• Normal (or Gaussian) Distribution
• Conditional Distribution
• ……

**Bernoulli Random Variable**

An experiment whose outcome is either a "success" or a "failure" is performed. Let $X = 1$ if the outcome is a "success" and $X = 0$ if it is a "failure". If the pmf is given as follows, such experiments are called Bernoulli trials and $X$ is said to be a Bernoulli random variable:

$$p_X(1) = P(X = 1) = p, \qquad p_X(0) = P(X = 0) = 1 - p$$

Note: $0 \le p \le 1$.

Example: Tossing a coin once. The head and tail are equally likely to occur, thus $p = 0.5$: $p_X(1) = P(H) = 0.5$, $p_X(0) = P(T) = 0.5$.

**Binomial Random Variable**

• Suppose $n$ independent Bernoulli trials, each of which results in a "success" with probability $p$ and in a "failure" with probability $1 - p$, are to be performed. Let $X$ represent the number of successes that occur in the $n$ trials. Then $X$ is said to be a binomial random variable with parameters $(n, p)$, and its pmf is

$$p_X(k) = \binom{n}{k} p^k (1 - p)^{n-k}, \quad k = 0, 1, \ldots, n$$

Example: Toss a coin 3 times, $X$ = number of heads. Here $n = 3$ and $p = 0.5$, which reproduces the probabilities 1/8, 3/8, 3/8, 1/8 of Example 2.

**Geometric Random Variable**

• Suppose that independent trials, each having probability $p$ of being a success, are performed until a success occurs. Let $X$ be the number of trials required until the first success occurs. Then $X$ is said to be a geometric random variable with parameter $p$.

Example: Consider an experiment of rolling a fair die. With $p = 1/6$, the average number of rolls required in order to obtain a 6 is $E[X] = 1/p = 6$.

**Poisson Random Variable**

• A r.v. $X$ is called a Poisson random variable with parameter $\lambda$ ($> 0$) if its pmf is given by

$$p_X(k) = P(X = k) = e^{-\lambda} \frac{\lambda^k}{k!}, \quad k = 0, 1, 2, \ldots$$

An important property of the Poisson r.v. is that it may be used to approximate a binomial r.v. when the binomial parameter $n$ is large and $p$ is small. Let $\lambda = np$.

**Uniform Random Variable**

A uniform r.v. $X$ is often used when we have no prior knowledge of the actual pdf and all continuous values in some range seem equally likely.

**Exponential Random Variable**

The most interesting property of the exponential r.v. is that it is "memoryless": $P(X > s + t \mid X > s) = P(X > t)$ for $s, t \ge 0$. $X$ can be the lifetime of a component.

**Gaussian (Normal) Random Variable**

An important fact about a normal r.v. is that if $X$ is normally distributed with parameters $\mu$ and $\sigma^2$, then $Y = aX + b$ is normally distributed with parameters $a\mu + b$ and $a^2\sigma^2$.

Application: central limit theorem. The sum of a large number of independent r.v.'s can, under certain conditions, be approximated by a normal r.v., denoted by $N(\mu; \sigma^2)$.

**The Moment Generating Function**

The moment generating function of a r.v. $X$ is $M_X(t) = E[e^{tX}]$. The important property: all of the moments of $X$ can be obtained by successive differentiation, $E[X^n] = M_X^{(n)}(0)$.

**Application of Moment Generating Function**

• The Binomial Distribution $(n, p)$: $M_X(t) = (pe^t + 1 - p)^n$, so $E[X] = M_X'(0) = np$.

**Entropy**

• Entropy is a measure of the uncertainty in a random experiment.
• Let $X$ be a discrete r.v. with $S_X = \{x_1, x_2, \ldots, x_K\}$ and pmf $p_k = P[X = x_k]$. Let $A_k$ denote the event $\{X = x_k\}$.

Intuitive facts: the uncertainty of $A_k$ is low if $p_k$ is close to one, and it is high if $p_k$ is close to zero.

Measure of uncertainty:

$$I(X = x_k) = \log \frac{1}{p_k} = -\log p_k$$

**Entropy of a random variable**

• The entropy of a r.v. $X$ is defined as the expected value of the uncertainty of its outcomes:

$$H_X = E[I(X)] = -\sum_{k=1}^{K} p_k \log p_k$$

The entropy is in units of "bits" when the logarithm is base 2. Independent fair coin flips have an entropy of 1 bit per flip. A source that always generates a long string of A's has an entropy of 0, since the next character will always be an 'A'.

**Entropy of Binary Random Variable**

• Suppose a r.v. $X$ has $S_X = \{0, 1\}$ and $p = P[X = 0] = 1 - P[X = 1]$ (flipping a coin). Then

$$H_X = h(p) = -p \log_2 p - (1 - p) \log_2 (1 - p)$$

• $H_X = h(p)$ is symmetric about $p = 0.5$ and achieves its maximum at $p = 0.5$.
• The uncertainties of the events $(X = 0)$ and $(X = 1)$ vary together in a complementary manner.
• The highest average uncertainty occurs when $p(0) = p(1) = 0.5$.

**Reduction of Entropy Through Partial Information**

• Entropy quantifies uncertainty by the amount of information required to specify the outcome of a random experiment.
• Example: If the r.v. $X$ is equally likely to take on the values from the set $\{000, 001, 010, \ldots, 111\}$ (flipping a coin 3 times), and we are given the event $A$ = {$X$ begins with a 1} = $\{100, 101, 110, 111\}$, what is the change of entropy of the r.v. $X$?

Without the information, $H_X = \log_2 8 = 3$ bits. Given $A$, the four remaining outcomes are equally likely, so $H_{X \mid A} = \log_2 4 = 2$ bits. The partial information reduces the entropy by 1 bit.

**Extending discrete entropy to the continuous case: differential entropy**

• Quantization method: Let $X$ be a continuous r.v. that takes on values in the interval $[a, b]$. Divide $[a, b]$ into a large number $K$ of subintervals of length $\Delta$. Let $Q(X)$ be the midpoint of the subinterval that contains $X$. Find the entropy of $Q$.
• Let $x_k$ be the midpoint of the $k$th subinterval. Then

$$P[Q = x_k] = P[X \text{ is in the } k\text{th subinterval}] = P[x_k - \Delta/2 < X < x_k + \Delta/2] \approx f_X(x_k)\,\Delta$$

so that

$$H_Q \approx -\sum_{k=1}^{K} f_X(x_k)\,\Delta\, \log\big(f_X(x_k)\,\Delta\big) \approx -\log \Delta - \sum_{k=1}^{K} f_X(x_k) \log f_X(x_k)\,\Delta$$

Trade-off: as $\Delta \to 0$, $H_Q \to \infty$.

Differential entropy is defined as

$$H_X = -\int_{-\infty}^{\infty} f_X(x) \log f_X(x)\,dx = E[-\log f_X(X)]$$

**The Method of Maximum Entropy**

The maximum entropy method is a procedure for estimating the pmf or pdf of a random variable when only partial information about $X$, in the form of expected values of functions of $X$, is available.

Discrete case: Let $X$ be a r.v. with $S_X = \{x_1, x_2, \ldots, x_K\}$ and unknown pmf $p_X(x_k)$. Given the expected value of some function $g(X)$ of $X$:
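
Returning to the coin-tossing r.v. of Examples 1 and 2, the following is a minimal Python sketch that builds the pmf by enumerating the sample space and then evaluates the cdf, mean, variance, and entropy from the definitions given earlier. It assumes a fair coin (all eight outcomes equally likely, as the 1/8 probabilities in Example 2 imply); the names `sample_space`, `X`, `pmf`, `cdf`, and `entropy_bits` are illustrative and do not come from the slides.

```python
from fractions import Fraction
from itertools import product
from math import log2

# Sample space S of three coin tosses: HHH, HHT, ..., TTT.
# Assumption: fair coin, so each outcome has probability 1/8.
sample_space = ["".join(t) for t in product("HT", repeat=3)]

def X(outcome):
    """Random variable: number of heads in the outcome."""
    return outcome.count("H")

# pmf p_X(k) = P[X = k], obtained by summing the probabilities of the
# outcomes in the equivalent event {zeta : X(zeta) = k}.
pmf = {}
for zeta in sample_space:
    k = X(zeta)
    pmf[k] = pmf.get(k, Fraction(0)) + Fraction(1, len(sample_space))

def cdf(x):
    """cdf F_X(x) = P[X <= x], a staircase function of x."""
    return sum(p for k, p in pmf.items() if k <= x)

# Mean, variance, and entropy (base-2 logarithm, so the unit is bits).
mean = sum(k * p for k, p in pmf.items())
variance = sum((k - mean) ** 2 * p for k, p in pmf.items())
entropy_bits = -sum(float(p) * log2(float(p)) for p in pmf.values())

for k in sorted(pmf):
    print(f"P[X={k}] = {pmf[k]},  F_X({k}) = {cdf(k)}")
print("E[X] =", mean, " Var(X) =", variance)
print("H_X  =", round(entropy_bits, 4), "bits")
```

Exact fractions are used so that the pmf values can be compared directly with the hand calculation in Example 2. Note that the entropy of $X$ (the head count) is smaller than the 3 bits of the full toss sequence, because several sequences map to the same value of $X$.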