STAT 552 PROBABILITY AND STATISTICS II


### STAT 552 PROBABILITY AND STATISTICS II

INTRODUCTION

Short review of S551

WHAT IS STATISTICS?
• Statistics is a science of collecting data, organizing and describing it and drawing conclusions from it. That is, statistics is a way to get information from data. It is the science of uncertainty.
BASIC DEFINITIONS
• POPULATION: The collection of all items of interest in a particular study.
• SAMPLE: A set of data drawn from the population; a subset of the population available for observation.
• PARAMETER: A descriptive measure of the population, e.g., the mean.
• STATISTIC: A descriptive measure of a sample.
• VARIABLE: A characteristic of interest about each element of a population or sample.

STATISTIC
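• A statistic (or estimator) is any function of the r.v.s in a random sample (r.s.) that does not contain any unknown quantity. E.g., for a r.s. $X_1, \dots, X_n$:
• $\sum_{i=1}^{n} X_i$, $\bar{X}$, and $\max_i X_i$ are statistics.
• $\bar{X} - \mu$ and $\sum_{i=1}^{n} X_i / \sigma$, with $\mu$ and $\sigma$ unknown, are NOT.
• Any observed or particular value of an estimator is an estimate.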
Sample Space
• The set of all possible outcomes of an experiment is called a sample space and is denoted by S.
• Determining the outcomes.
• Build an exhaustive list of all possible outcomes.
• Make sure the listed outcomes are mutually exclusive.
RANDOM VARIABLES
• Variables whose observed value is determined by chance
• A r.v. is a function defined on the sample space S that associates a real number with each outcome in S.
• R.v.s are denoted by uppercase letters, and their observed values by lowercase letters.
DESCRIPTIVE STATISTICS
• Descriptive statistics involves the arrangement, summary, and presentation of data, to enable meaningful interpretation, and to support decision making.
• Descriptive statistics methods make use of
• graphical techniques
• numerical descriptive measures.

[Figure: diagram linking POPULATION and SAMPLE through PROBABILITY and STATISTICAL INFERENCE]

• PROBABILITY: A numerical value expressing the degree of uncertainty regarding the occurrence of an event. A measure of uncertainty.
• STATISTICAL INFERENCE: The science of drawing inferences about the population based only on a part of the population, the sample.
Probability

P : S → [0,1]

Here P is the probability function, its domain is the collection of events in the sample space S, and its range is [0,1].

THE CALCULUS OF PROBABILITIES
• If P is a probability function and A is any event, then

a. $P(\emptyset) = 0$

b. $P(A) \le 1$

c. $P(A^C) = 1 - P(A)$

ODDS
• The odds of an event A is defined by $\text{odds}(A) = \dfrac{P(A)}{P(A^C)} = \dfrac{P(A)}{1 - P(A)}$.
• It tells us how much more likely we are to see the occurrence of event A than its nonoccurrence.

ODDS RATIO
• OR is the ratio of two odds.
• Useful for comparing the odds under two different conditions or for two different groups, e.g. odds for males versus females.
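As a small illustration of odds and the odds ratio, here is a minimal Python sketch. The function names and the two group probabilities are hypothetical values chosen only for the example.

```python
def odds(p: float) -> float:
    """Odds of an event with probability p: P(A) / (1 - P(A))."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1 - p)


def odds_ratio(p1: float, p2: float) -> float:
    """Ratio of the odds under condition 1 to the odds under condition 2."""
    return odds(p1) / odds(p2)


# Hypothetical probabilities of the same event in two groups.
p_group_1 = 0.75
p_group_2 = 0.50

print(odds(p_group_1))                   # 3.0: the event is 3 times as likely as not
print(odds(p_group_2))                   # 1.0: the event is as likely as not
print(odds_ratio(p_group_1, p_group_2))  # 3.0
```

An odds ratio of 1 would mean the odds are the same in both groups.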
CONDITIONAL PROBABILITY
• (Marginal) Probability: P(A): How likely is it that an event A will occur when an experiment is performed?
• Conditional Probability: P(A|B): How will the probability of event A be affected by the knowledge of the occurrence or nonoccurrence of event B?
• If two events are independent, then P(A|B)=P(A)
BAYES THEOREM
• Suppose you have P(B|A), but need P(A|B). Bayes' theorem gives $P(A|B) = \dfrac{P(B|A)\,P(A)}{P(B)}$, provided $P(B) > 0$.
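A minimal sketch of going from P(B|A) to P(A|B) in Python. The function name and the input probabilities are hypothetical; P(B) is obtained from the law of total probability over A and its complement, which is one common way to get the denominator.

```python
from fractions import Fraction


def bayes_posterior(prior_a, p_b_given_a, p_b_given_not_a):
    """Return P(A|B) = P(B|A) P(A) / P(B), with P(B) from total probability."""
    p_b = p_b_given_a * prior_a + p_b_given_not_a * (1 - prior_a)
    return p_b_given_a * prior_a / p_b


# Hypothetical inputs: P(A) = 1/100, P(B|A) = 9/10, P(B|A^C) = 1/20.
posterior = bayes_posterior(Fraction(1, 100), Fraction(9, 10), Fraction(1, 20))
print(posterior)         # 2/13
print(float(posterior))  # roughly 0.154
```

Even with P(B|A) = 0.9, the posterior P(A|B) stays small here because the prior P(A) is small.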
Independence
• A and B are independent iff
• P(A|B)=P(A) or P(B|A)=P(B)
• P(A∩B)=P(A)P(B)
• A1, A2, …, An are mutually independent iff $P\left(\bigcap_{i \in j} A_i\right) = \prod_{i \in j} P(A_i)$ for every subset j of {1,2,…,n}.

E.g. for n=3, A1, A2, A3 are mutually independent iff P(A1∩A2∩A3)=P(A1)P(A2)P(A3) and P(A1∩A2)=P(A1)P(A2) and P(A1∩A3)=P(A1)P(A3) and P(A2∩A3)=P(A2)P(A3).
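The requirement that the product rule hold for every subset matters: pairwise independence alone is not enough. The sketch below checks this on an assumed example (two fair coin tosses, with A3 = "both tosses agree"); the events and helper names are illustrative, not taken from the slides.

```python
from fractions import Fraction
from itertools import combinations, product

# Sample space: two fair coin tosses, four equally likely outcomes.
outcomes = list(product("HT", repeat=2))


def prob(event):
    """Probability of an event (a set of outcomes) under equal likelihood."""
    return Fraction(len(event), len(outcomes))


A1 = {w for w in outcomes if w[0] == "H"}   # first toss is heads
A2 = {w for w in outcomes if w[1] == "H"}   # second toss is heads
A3 = {w for w in outcomes if w[0] == w[1]}  # both tosses agree


def mutually_independent(events):
    """Check the product rule for every subset of size 2 or more."""
    for size in range(2, len(events) + 1):
        for subset in combinations(events, size):
            intersection = set.intersection(*subset)
            product_of_probs = Fraction(1)
            for event in subset:
                product_of_probs *= prob(event)
            if prob(intersection) != product_of_probs:
                return False
    return True


print(mutually_independent([A1, A2]))      # True
print(mutually_independent([A1, A3]))      # True
print(mutually_independent([A2, A3]))      # True
print(mutually_independent([A1, A2, A3]))  # False: P(A1∩A2∩A3) = 1/4, not 1/8
```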

DISCRETE RANDOM VARIABLES
• If the set of all possible values of a r.v. X is a countable set, then X is called a discrete r.v.
• The function f(x)=P(X=x) for x=x1,x2, … that assigns the probability to each value x is called the probability density function (p.d.f.) or probability mass function (p.m.f.).
Example
• Discrete Uniform distribution: $f(x) = \dfrac{1}{N}$ for $x = 1, 2, \dots, N$.
• Example: throw a fair die. P(X=1)=…=P(X=6)=1/6
CONTINUOUS RANDOM VARIABLES
• Used when the sample space is uncountable (continuous).
• Example: Continuous Uniform(a,b): $f(x) = \dfrac{1}{b-a}$ for $a \le x \le b$.
CUMULATIVE DISTRIBUTION FUNCTION (C.D.F.)
• The CDF of a r.v. X is defined as F(x)=P(X≤x).
JOINT DISCRETE DISTRIBUTIONS
• A function f(x1, x2,…, xk) is the joint pmf for some vector valued rv X=(X1, X2,…,Xk) iff the following properties are satisfied:

$f(x_1, x_2, \dots, x_k) \ge 0$ for all $(x_1, x_2, \dots, x_k)$

and

$\sum_{x_1} \cdots \sum_{x_k} f(x_1, x_2, \dots, x_k) = 1$.

MARGINAL DISCRETE DISTRIBUTIONS
• If the pair (X1,X2) of discrete random variables has the joint pmf f(x1,x2), then the marginal pmfs of X1 and X2 are $f_1(x_1) = \sum_{x_2} f(x_1, x_2)$ and $f_2(x_2) = \sum_{x_1} f(x_1, x_2)$.
CONDITIONAL DISTRIBUTIONS
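• If X1 and X2 are discrete or continuous random variables with joint pdf f(x1,x2), then the conditional pdf of X2 given X1=x1 is defined by $f(x_2 \mid x_1) = \dfrac{f(x_1, x_2)}{f_1(x_1)}$ for values $x_1$ such that $f_1(x_1) > 0$, where $f_1$ is the marginal pdf of X1.
• For independent rvs, $f(x_2 \mid x_1) = f_2(x_2)$ and $f(x_1 \mid x_2) = f_1(x_1)$, where $f_2$ is the marginal pdf of X2.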
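To make the joint, marginal, and conditional pmfs concrete, here is a short Python sketch on a made-up joint pmf for (X1, X2). The table values and helper names are assumptions for illustration only.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical joint pmf f(x1, x2); keys are (x1, x2) pairs.
joint = {
    (0, 0): Fraction(1, 8),
    (0, 1): Fraction(3, 8),
    (1, 0): Fraction(2, 8),
    (1, 1): Fraction(2, 8),
}


def marginal(joint_pmf, index):
    """Marginal pmf of coordinate `index`, summing out the other coordinate."""
    result = defaultdict(Fraction)
    for values, p in joint_pmf.items():
        result[values[index]] += p
    return dict(result)


def conditional_x2_given_x1(joint_pmf, x1):
    """Conditional pmf f(x2 | x1) = f(x1, x2) / f1(x1), assuming f1(x1) > 0."""
    f1_at_x1 = marginal(joint_pmf, 0)[x1]
    return {v[1]: p / f1_at_x1 for v, p in joint_pmf.items() if v[0] == x1}


f1 = marginal(joint, 0)
f2 = marginal(joint, 1)

print(sum(joint.values()) == 1)           # True: a valid joint pmf sums to 1
print(f1)                                 # marginal of X1: 1/2 and 1/2
print(f2)                                 # marginal of X2: 3/8 and 5/8
print(conditional_x2_given_x1(joint, 0))  # f(x2 | x1=0): 1/4 and 3/4

# Independence would require f(x1, x2) = f1(x1) f2(x2) for every pair.
independent = all(p == f1[x1] * f2[x2] for (x1, x2), p in joint.items())
print(independent)                        # False for this table
```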
EXPECTED VALUES

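Let X be a rv with pdf fX(x) and g(X) be a function of X. Then, the expected value (or the mean or the mathematical expectation) of g(X) is

$E[g(X)] = \sum_{x} g(x) f_X(x)$ if X is discrete, and $E[g(X)] = \int_{-\infty}^{\infty} g(x) f_X(x)\,dx$ if X is continuous,

provided the sum or the integral exists, i.e.,

$-\infty < E[g(X)] < \infty$.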

EXPECTED VALUES
• E[g(X)] is finite if E[|g(X)|] is finite.
Laws of Expected Value and Variance

Let X be a rv and c be a constant.

Laws of Expected Value
• E(c) = c
• E(X + c) = E(X) + c
• E(cX) = cE(X)

Laws of Variance
• V(c) = 0
• V(X + c) = V(X)
• V(cX) = c²V(X)
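The laws of expected value and variance can be checked exactly on a small example. The sketch below uses a fair die as an assumed distribution for X and c = 3 as an arbitrary constant; the helper names are illustrative.

```python
from fractions import Fraction

# Assumed example: X is the face shown by a fair die.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}


def expectation(pmf, g=lambda x: x):
    """E[g(X)] = sum of g(x) f(x) over the support."""
    return sum(g(x) * p for x, p in pmf.items())


def variance(pmf, g=lambda x: x):
    """V[g(X)] = E[(g(X) - E[g(X)])^2]."""
    mean = expectation(pmf, g)
    return expectation(pmf, lambda x: (g(x) - mean) ** 2)


c = 3
print(expectation(pmf))  # 7/2
print(variance(pmf))     # 35/12

# Laws of expected value
print(expectation(pmf, lambda x: c) == c)                         # E(c) = c
print(expectation(pmf, lambda x: x + c) == expectation(pmf) + c)  # E(X + c) = E(X) + c
print(expectation(pmf, lambda x: c * x) == c * expectation(pmf))  # E(cX) = cE(X)

# Laws of variance
print(variance(pmf, lambda x: c) == 0)                            # V(c) = 0
print(variance(pmf, lambda x: x + c) == variance(pmf))            # V(X + c) = V(X)
print(variance(pmf, lambda x: c * x) == c ** 2 * variance(pmf))   # V(cX) = c^2 V(X)
```

Each comparison prints True; exact fractions avoid floating-point rounding in the checks.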

EXPECTED VALUE

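If X and Y are independent, $E[g(X)h(Y)] = E[g(X)]\,E[h(Y)]$.

The covariance of X and Y is defined as $\mathrm{Cov}(X,Y) = E[(X - E(X))(Y - E(Y))] = E(XY) - E(X)E(Y)$.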

EXPECTED VALUE

If X and Y are independent, $\mathrm{Cov}(X,Y) = 0$.

The reverse is usually not correct! It does hold when X and Y are jointly normal: if (X,Y)~Normal, then X and Y are independent iff Cov(X,Y)=0.
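A standard counterexample shows why zero covariance does not imply independence in general. The sketch below assumes X is uniform on {-1, 0, 1} and Y = X², so Y is completely determined by X; this example is an illustration and is not taken from the slides.

```python
from fractions import Fraction

# Assumed example: X uniform on {-1, 0, 1}, Y = X^2.
joint = {(x, x * x): Fraction(1, 3) for x in (-1, 0, 1)}


def expect(g):
    """E[g(X, Y)] under the joint pmf."""
    return sum(g(x, y) * p for (x, y), p in joint.items())


cov = expect(lambda x, y: x * y) - expect(lambda x, y: x) * expect(lambda x, y: y)
print(cov)  # 0: X and Y are uncorrelated

# Independence would need P(X=0, Y=0) = P(X=0) P(Y=0).
p_x0 = sum(p for (x, y), p in joint.items() if x == 0)
p_y0 = sum(p for (x, y), p in joint.items() if y == 0)
p_x0_y0 = joint.get((0, 0), Fraction(0))
print(p_x0_y0, p_x0 * p_y0)  # 1/3 versus 1/9, so X and Y are dependent
```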

EXPECTED VALUE

If X1 and X2 are independent, $\mathrm{Var}(X_1 \pm X_2) = \mathrm{Var}(X_1) + \mathrm{Var}(X_2)$.

CONDITIONAL EXPECTATION AND VARIANCE

$E[E(X \mid Y)] = E(X)$

$\mathrm{Var}(X) = E[\mathrm{Var}(X \mid Y)] + \mathrm{Var}[E(X \mid Y)]$ (EVVE rule)

Proofs available in Casella & Berger (1990), pgs. 154 & 158
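The two identities usually stated under this heading, E[E(X|Y)] = E(X) and Var(X) = E[Var(X|Y)] + Var[E(X|Y)], can be verified exactly on a small table. The joint pmf and variable names below are hypothetical and serve only as a numerical check, not as a proof.

```python
from fractions import Fraction

# Hypothetical joint pmf of (X, Y); keys are (x, y) pairs.
joint = {
    (1, 0): Fraction(1, 4),
    (2, 0): Fraction(1, 4),
    (1, 1): Fraction(1, 8),
    (3, 1): Fraction(3, 8),
}


def mean_and_var(pmf):
    """Mean and variance of a univariate pmf given as {value: probability}."""
    mean = sum(x * p for x, p in pmf.items())
    var = sum((x - mean) ** 2 * p for x, p in pmf.items())
    return mean, var


# Marginal pmfs of X and Y.
f_x, f_y = {}, {}
for (x, y), p in joint.items():
    f_x[x] = f_x.get(x, Fraction(0)) + p
    f_y[y] = f_y.get(y, Fraction(0)) + p

# Conditional mean and variance of X for each value of Y.
cond_mean, cond_var = {}, {}
for y, p_y in f_y.items():
    cond_pmf = {x: p / p_y for (x, yy), p in joint.items() if yy == y}
    cond_mean[y], cond_var[y] = mean_and_var(cond_pmf)

mean_x, var_x = mean_and_var(f_x)
e_of_cond_mean = sum(cond_mean[y] * p_y for y, p_y in f_y.items())
e_of_cond_var = sum(cond_var[y] * p_y for y, p_y in f_y.items())
var_of_cond_mean = sum((cond_mean[y] - e_of_cond_mean) ** 2 * p_y for y, p_y in f_y.items())

print(mean_x, e_of_cond_mean)                   # 2 and 2
print(var_x, e_of_cond_var + var_of_cond_mean)  # 3/4 and 3/4
```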

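SOME MATHEMATICAL EXPECTATIONS
• Population Mean: $\mu = E(X)$
• Population Variance: $\sigma^2 = E[(X - \mu)^2] = E(X^2) - \mu^2$ (measure of the deviation from the population mean)
• Population Standard Deviation: $\sigma = \sqrt{\sigma^2}$
• Moments: $\mu_k' = E(X^k)$ is the kth moment, and $\mu_k = E[(X - \mu)^k]$ is the kth central moment.
The Variance

This measure reflects the dispersion of all the observations.

• The variance of a population of size N, x1, x2,…,xN, whose mean is μ is defined as $\sigma^2 = \dfrac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2$.
• The variance of a sample of n observations x1, x2, …,xn whose mean is $\bar{x}$ is defined as $s^2 = \dfrac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$.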
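The population and sample variances differ only in the divisor (N versus n − 1). A minimal Python sketch on made-up data, cross-checked against the standard library's `statistics.pvariance` and `statistics.variance`:

```python
import statistics


def population_variance(xs):
    """Divide the sum of squared deviations by N."""
    n = len(xs)
    mu = sum(xs) / n
    return sum((x - mu) ** 2 for x in xs) / n


def sample_variance(xs):
    """Divide the sum of squared deviations by n - 1."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)


# Made-up observations; mean is 5 and the squared deviations sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(population_variance(data))  # 4.0, i.e. 32 / 8
print(sample_variance(data))      # about 4.571, i.e. 32 / 7

# The standard library implements the same two definitions.
print(statistics.pvariance(data), statistics.variance(data))
```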
MOMENT GENERATING FUNCTION

The m.g.f. of random variable X is defined as $M_X(t) = E(e^{tX})$, provided the expectation exists for t ∈ (-h,h) for some h>0.

Properties of m.g.f.
• M(0)=E[1]=1
• If a r.v. X has m.g.f. M(t), then Y=aX+b has m.g.f. $M_Y(t) = e^{bt} M(at)$.
• The m.g.f. does not always exist (e.g. Cauchy distribution).
CHARACTERISTIC FUNCTION

The c.h.f. of random variable X is defined as $\phi_X(t) = E(e^{itX})$ for all real numbers t.

The c.h.f. always exists.

Uniqueness

Theorem:

• If two r.v.s have m.g.f.s that exist and are equal on an open interval around 0, then they have the same distribution.
• If two r.v.s have the same distribution, then they have the same m.g.f. (if it exists).

Similar statements are true for c.h.f.

SOME DISCRETE PROBABILITY DISTRIBUTIONS
• Please review: Degenerate, Uniform, Bernoulli, Binomial, Poisson, Negative Binomial, Geometric, Hypergeometric, Extended Hypergeometric, Multinomial
SOME CONTINUOUS PROBABILITY DISTRIBUTIONS
• Please review: Uniform, Normal (Gaussian), Exponential, Gamma, Chi-Square, Beta, Weibull, Cauchy, Log-Normal, t, F Distributions
TRANSFORMATION OF RANDOM VARIABLES
• If X is an rv with pdf f(x), then Y=g(X) is also an rv. What is the pdf of Y?
• If X is a discrete rv, replace x by $g^{-1}(y)$ wherever x appears in the pmf f(x), using the relation $x = g^{-1}(y)$ (for a 1-to-1 g).
• If X is a continuous rv, then do the same thing, but now multiply by the absolute value of the Jacobian: $f_Y(y) = f_X(g^{-1}(y)) \left| \dfrac{d}{dy} g^{-1}(y) \right|$.
• If it is not a 1-to-1 transformation, divide the region into sub-regions on which the transformation is 1-to-1.
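The change-of-variable recipe for a continuous rv can be sketched in Python. The example assumes X ~ Uniform(0,1) and Y = X², which is 1-to-1 on (0,1); this choice and all function names are illustrative. The resulting pdf is f_Y(y) = 1/(2√y) on (0,1), with cdf F_Y(y) = √y, and a seeded simulation is used as a sanity check.

```python
import math
import random


def pdf_x(x):
    """pdf of the assumed X ~ Uniform(0, 1)."""
    return 1.0 if 0 < x < 1 else 0.0


def g_inverse(y):
    """Inverse of y = x^2 on (0, 1)."""
    return math.sqrt(y)


def jacobian(y):
    """dx/dy for x = sqrt(y)."""
    return 1.0 / (2.0 * math.sqrt(y))


def pdf_y(y):
    """f_Y(y) = f_X(g^{-1}(y)) * |dx/dy| on the support of Y."""
    if not 0 < y < 1:
        return 0.0
    return pdf_x(g_inverse(y)) * abs(jacobian(y))


def cdf_y(y):
    """F_Y(y) = P(X^2 <= y) = sqrt(y) for 0 < y < 1."""
    return math.sqrt(min(max(y, 0.0), 1.0))


# Simulation check with a fixed seed: P(Y <= 0.25) should be near 0.5.
rng = random.Random(552)
samples = [rng.random() ** 2 for _ in range(100_000)]
empirical = sum(s <= 0.25 for s in samples) / len(samples)

print(pdf_y(0.25))             # 1.0
print(cdf_y(0.25), empirical)  # 0.5 and a value close to 0.5
```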
CDF method
• Example: Let

Consider . What is the p.d.f. of Y?

• Solution:
M.G.F. Method
• If X1,X2,…,Xn are independent random variables with MGFs $M_{X_i}(t)$, then the MGF of $Y = \sum_{i=1}^{n} X_i$ is $M_Y(t) = \prod_{i=1}^{n} M_{X_i}(t)$.
THE PROBABILITY INTEGRAL TRANSFORMATION
• Let X have continuous cdf FX(x) and define the rv Y as Y=FX(X). Then,

Y ~ Uniform(0,1), that is,

P(Y ≤ y) = y, 0<y<1.

• This is very commonly used, especially in random number generation procedures.
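A minimal sketch of both directions of the probability integral transformation, assuming an Exponential distribution with rate 2 as the example (not from the slides). Applying the cdf to exponential draws should give approximately Uniform(0,1) values, and applying the inverse cdf to uniform draws generates exponential values.

```python
import math
import random


def exponential_cdf(x, rate):
    """F(x) = 1 - exp(-rate * x) for x >= 0."""
    return 1.0 - math.exp(-rate * x) if x >= 0 else 0.0


def exponential_inverse_cdf(u, rate):
    """F^{-1}(u) = -ln(1 - u) / rate for 0 <= u < 1."""
    return -math.log(1.0 - u) / rate


rng = random.Random(552)  # fixed seed so the run is reproducible
rate = 2.0
n = 100_000

# Forward direction: Y = F(X) should behave like Uniform(0, 1).
xs = [rng.expovariate(rate) for _ in range(n)]
ys = [exponential_cdf(x, rate) for x in xs]
share_below = sum(y <= 0.3 for y in ys) / n
print(share_below)     # close to 0.3, since P(Y <= y) = y

# Reverse direction (random number generation): X = F^{-1}(U).
generated = [exponential_inverse_cdf(rng.random(), rate) for _ in range(n)]
generated_mean = sum(generated) / n
print(generated_mean)  # close to 1 / rate = 0.5
```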
SAMPLING DISTRIBUTION
• A statistic is also a random variable. Its distribution depends on the distribution of the random sample and the form of the function Y=T(X1, X2,…,Xn). The probability distribution of a statistic Y is called the sampling distribution of Y.
SAMPLING FROM THE NORMAL DISTRIBUTION

Properties of the Sample Mean and Sample Variance

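• Let X1, X2,…,Xn be a r.s. of size n from a $N(\mu, \sigma^2)$ distribution. Then,
• $\bar{X}$ and $S^2$ are independent rvs,
• $\bar{X} \sim N(\mu, \sigma^2/n)$,
• $(n-1)S^2/\sigma^2 \sim \chi^2_{n-1}$.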
SAMPLING FROM THE NORMAL DISTRIBUTION

If the population variance is unknown, we use the sample variance: $\dfrac{\bar{X} - \mu}{S/\sqrt{n}} \sim t_{n-1}$.

SAMPLING FROM THE NORMAL DISTRIBUTION
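• The F distribution allows us to compare the variances by giving the distribution of $\dfrac{S_X^2/\sigma_X^2}{S_Y^2/\sigma_Y^2} \sim F_{n-1,\,m-1}$, where $S_X^2$ and $S_Y^2$ are the sample variances of independent normal samples of sizes n and m.
• If $X \sim F_{p,q}$, then $1/X \sim F_{q,p}$.
• If $X \sim t_q$, then $X^2 \sim F_{1,q}$.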

[Figure: the distribution of the random variable X (population) and the distribution of the sample mean]

CENTRAL LIMIT THEOREM

If a random sample is drawn from any population with finite variance, the sampling distribution of the sample mean $\bar{X}$ is approximately normal for a sufficiently large sample size. The larger the sample size, the more closely the sampling distribution of $\bar{X}$ will resemble a normal distribution.

Random Sample

(X1, X2, X3, …,Xn)

Sampling Distribution of the Sample Mean

If X is normal, $\bar{X}$ is normal.

If X is non-normal, $\bar{X}$ is approximately normally distributed for sample sizes greater than or equal to 30 (a common rule of thumb).
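The behaviour of the sample mean for a non-normal population can be explored by simulation. The sketch below assumes an Exponential(1) population (mean 1, standard deviation 1), a sample size of 30, and 20,000 replications; these choices, the seed, and the function name are arbitrary illustration values.

```python
import math
import random
import statistics


def simulate_sample_means(n, replications, rng):
    """Draw `replications` samples of size n from Exponential(1); return their means."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(replications)
    ]


rng = random.Random(552)  # fixed seed for reproducibility
n = 30
means = simulate_sample_means(n, 20_000, rng)

standard_error = 1.0 / math.sqrt(n)  # sigma / sqrt(n) with sigma = 1
mean_of_means = statistics.fmean(means)
sd_of_means = statistics.stdev(means)

# Under a normal approximation, about 95% of sample means should fall
# within 1.96 standard errors of the population mean.
share_within = sum(abs(m - 1.0) <= 1.96 * standard_error for m in means) / len(means)

print(mean_of_means)                # close to the population mean 1
print(sd_of_means, standard_error)  # both close to 0.18
print(share_within)                 # close to 0.95
```

The agreement is approximate: the exponential population is skewed, so the sample mean at n = 30 is still slightly skewed as well.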