

STAT 552 PROBABILITY AND STATISTICS II


INTRODUCTION

Short review of S551

WHAT IS STATISTICS?

- Statistics is the science of collecting, organizing, and describing data and drawing conclusions from it. That is, statistics is a way to get information from data. It is the science of uncertainty.

- POPULATION: The collection of all items of interest in a particular study.

- SAMPLE: A set of data drawn from the population; a subset of the population available for observation.

- PARAMETER: A descriptive measure of the population, e.g., the mean.

- STATISTIC: A descriptive measure of a sample.

- VARIABLE: A characteristic of interest about each element of a population or sample.

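- A statistic (or estimator) is any function of the r.v.s of a random sample (r.s.) that does not contain any unknown quantity. E.g.
- $\sum_{i=1}^{n} X_i$, $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$, and $\min_i X_i$ are statistics.
- $\sum_{i=1}^{n} (X_i - \mu)$ and $\bar{X}/\sigma$, with $\mu$ and $\sigma$ unknown, are NOT.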

- Any observed or particular value of an estimator is an estimate.

- The set of all possible outcomes of an experiment is called the sample space and is denoted by S.
- Determining the outcomes.
- Build an exhaustive list of all possible outcomes.
- Make sure the listed outcomes are mutually exclusive.

- Random variables (r.v.s) are variables whose observed value is determined by chance.
- A r.v. is a function defined on the sample space S that associates a real number with each outcome in S.
- R.v.s are denoted by uppercase letters, and their observed values by lowercase letters.

- Descriptive statistics involves the arrangement, summary, and presentation of data, to enable meaningful interpretation, and to support decision making.
- Descriptive statistics methods make use of
- graphical techniques
- numerical descriptive measures.
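As a rough illustration of numerical descriptive measures, the sketch below summarizes a small data set with Python's standard `statistics` module. The `scores` list is made-up data used only for illustration.

```python
import statistics

# Hypothetical sample of exam scores (illustrative data only).
scores = [62, 71, 71, 78, 85, 90, 94]

summary = {
    "mean": statistics.mean(scores),
    "median": statistics.median(scores),
    "mode": statistics.mode(scores),
    "range": max(scores) - min(scores),
    "sample_sd": statistics.stdev(scores),  # uses the n - 1 divisor
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Measures of location (mean, median, mode) and of spread (range, standard deviation) are reported side by side because neither kind alone describes the data well.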

[Diagram: probability reasons from the POPULATION to the SAMPLE; statistical inference reasons from the SAMPLE back to the POPULATION.]

- PROBABILITY: A numerical value expressing the degree of uncertainty regarding the occurrence of an event. A measure of uncertainty.
- STATISTICAL INFERENCE: The science of drawing inferences about the population based only on a part of the population, sample.

$P : S \to [0,1]$, where P is the probability function, S is its domain, and [0,1] is its range.

- If P is a probability function and A is any event in S, then

a. $P(\emptyset) = 0$

b. $P(A) \le 1$

c. $P(A^C) = 1 - P(A)$
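These three properties can be checked on a small finite sample space. The sketch below assumes one roll of a fair die with equally likely outcomes; the helper `prob` and the event `A` are illustrative names, not part of the course material.

```python
from fractions import Fraction

# Sample space for one roll of a fair die; each outcome is equally likely.
S = frozenset(range(1, 7))

def prob(event):
    """P(event) under equally likely outcomes; outcomes outside S are ignored."""
    return Fraction(len(event & S), len(S))

A = frozenset({2, 4, 6})   # the event "the roll is even"
A_complement = S - A

print(prob(frozenset()))                  # probability of the empty set
print(prob(A), prob(A_complement))        # P(A) and P(A^C)
print(prob(A_complement) == 1 - prob(A))  # complement rule
```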

- The odds of an event A is defined by $\text{odds}(A) = \dfrac{P(A)}{1 - P(A)} = \dfrac{P(A)}{P(A^C)}$.

- It tells us how much more likely it is that event A occurs than that it does not occur.

- The odds ratio (OR) is the ratio of two odds.
- Useful for comparing the odds under two different conditions or for two different groups, e.g. odds for males versus females.
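A minimal numeric sketch of odds and the odds ratio, taking the odds of an event with probability p to be p/(1-p). The two group probabilities are hypothetical values chosen only to make the arithmetic easy to follow.

```python
from fractions import Fraction

def odds(p):
    """Odds of an event with probability p (requires p < 1)."""
    return p / (1 - p)

# Hypothetical probabilities of the same event in two groups.
p_group1 = Fraction(3, 4)
p_group2 = Fraction(1, 2)

odds1 = odds(p_group1)       # (3/4) / (1/4)
odds2 = odds(p_group2)       # (1/2) / (1/2)
odds_ratio = odds1 / odds2   # ratio of the two odds

print(odds1, odds2, odds_ratio)
```

An odds ratio of 1 would mean the odds are the same in both groups; here the odds in the first group are three times those in the second.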

- (Marginal) Probability: P(A): How likely is it that an event A will occur when an experiment is performed?
- Conditional Probability: P(A|B): How will the probability of event A be affected by the knowledge of the occurrence or nonoccurrence of event B?
- If two events are independent, then P(A|B)=P(A)

- Bayes' rule: Suppose you have P(B|A), but need P(A|B). Then $P(A|B) = \dfrac{P(B|A)\,P(A)}{P(B)}$, provided $P(B) > 0$.
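Reversing a conditional probability uses Bayes' rule, P(A|B) = P(B|A)P(A)/P(B), with P(B) obtained from the law of total probability. The sketch below is one way to compute it; the function name `bayes` and the screening-test numbers are assumptions made up for illustration.

```python
from fractions import Fraction

def bayes(p_b_given_a, p_a, p_b_given_not_a):
    """Return P(A|B) from P(B|A), P(A), and P(B|not A)."""
    # Law of total probability: P(B) = P(B|A)P(A) + P(B|A^C)P(A^C).
    p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)
    return p_b_given_a * p_a / p_b

# Hypothetical screening test: A = "has the condition", B = "test is positive".
p_a = Fraction(1, 100)               # assumed prevalence
p_b_given_a = Fraction(95, 100)      # assumed sensitivity
p_b_given_not_a = Fraction(5, 100)   # assumed false-positive rate

posterior = bayes(p_b_given_a, p_a, p_b_given_not_a)
print(posterior, float(posterior))
```

Even with a fairly accurate test, the posterior stays well below P(B|A) because the event A is rare.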

- A and B are independent iff
- P(A|B)=P(A) or P(B|A)=P(B)
- P(AB)=P(A)P(B)

- A1, A2, …, An are mutually independent iff $P\left(\bigcap_{i \in J} A_i\right) = \prod_{i \in J} P(A_i)$ for every subset J of {1,2,…,n}.

E.g. for n=3, A1, A2, A3 are mutually independent iff P(A1A2A3)=P(A1)P(A2)P(A3) and P(A1A2)=P(A1)P(A2) and P(A1A3)=P(A1)P(A3) and P(A2A3)=P(A2)P(A3)
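Pairwise independence does not imply mutual independence, which is why the product condition must hold for every subset. The sketch below checks this on a standard counterexample, assuming two tosses of a fair coin; the helper names are illustrative.

```python
from fractions import Fraction
from itertools import combinations, product

# Sample space: two tosses of a fair coin, all four outcomes equally likely.
S = list(product("HT", repeat=2))

def prob(event):
    """Probability of an event given as a predicate on outcomes."""
    return Fraction(sum(1 for w in S if event(w)), len(S))

events = [
    lambda w: w[0] == "H",    # A1: first toss is heads
    lambda w: w[1] == "H",    # A2: second toss is heads
    lambda w: w[0] == w[1],   # A3: both tosses agree
]

def factorizes(subset):
    """True if P(intersection of subset) equals the product of the P(A_i)."""
    joint = prob(lambda w: all(e(w) for e in subset))
    product_of_marginals = Fraction(1)
    for e in subset:
        product_of_marginals *= prob(e)
    return joint == product_of_marginals

pairwise = all(factorizes(pair) for pair in combinations(events, 2))
mutual = pairwise and factorizes(events)
print(pairwise, mutual)
```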

- If the set of all possible values of a r.v. X is a countable set, then X is called a discrete r.v.
- The function f(x)=P(X=x) for x=x1,x2,… that assigns a probability to each value x is called the probability density function (p.d.f.) or probability mass function (p.m.f.).

- Discrete Uniform distribution: $f(x) = \dfrac{1}{N}$ for $x = 1, 2, \dots, N$.
- Example: throw a fair die. P(X=1)=…=P(X=6)=1/6

- When the sample space is uncountable (continuous), X is a continuous r.v.
- Example: Continuous Uniform(a,b), with $f(x) = \dfrac{1}{b-a}$ for $a \le x \le b$.

- CDF of a r.v. X is defined as F(x)=P(X≤x).
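For a discrete r.v. the cdf is a running sum of the pmf. A small sketch, assuming the fair-die example, with exact fractions:

```python
from fractions import Fraction

# pmf of a fair die: P(X = x) = 1/6 for x = 1, ..., 6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def cdf(x):
    """F(x) = P(X <= x): add the pmf over all values not exceeding x."""
    return sum(p for value, p in pmf.items() if value <= x)

for x in (0, 1, 2.5, 3, 6):
    print(x, cdf(x))
```

The cdf is a step function: it is 0 below the smallest value, jumps by 1/6 at each face, and equals 1 from 6 onward.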

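- A function $f(x_1, x_2, \dots, x_k)$ is the joint pmf for some vector-valued rv $X = (X_1, X_2, \dots, X_k)$ iff the following properties are satisfied:

$f(x_1, x_2, \dots, x_k) \ge 0$ for all $(x_1, x_2, \dots, x_k)$

and

$\sum_{x_1} \cdots \sum_{x_k} f(x_1, x_2, \dots, x_k) = 1$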

- If the pair (X1,X2) of discrete random variables has the joint pmf f(x1,x2), then the marginal pmfs of X1 and X2 are $f_1(x_1) = \sum_{x_2} f(x_1, x_2)$ and $f_2(x_2) = \sum_{x_1} f(x_1, x_2)$.

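- If X1 and X2 are discrete or continuous random variables with joint pdf f(x1,x2), then the conditional pdf of X2 given X1=x1 is defined by $f(x_2 | x_1) = \dfrac{f(x_1, x_2)}{f_1(x_1)}$ for $f_1(x_1) > 0$.
- For independent rvs, $f(x_2 | x_1) = f_2(x_2)$, i.e., $f(x_1, x_2) = f_1(x_1) f_2(x_2)$.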
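The sketch below works through marginal and conditional pmfs for a small joint pmf: each marginal sums the joint pmf over the other variable, and the conditional pmf divides the joint pmf by the marginal of the conditioning variable. The table `joint` is hypothetical, and the helper names are illustrative.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical joint pmf f(x1, x2) on {0, 1} x {0, 1}.
joint = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
    (1, 0): Fraction(2, 8), (1, 1): Fraction(2, 8),
}

def marginal(joint_pmf, axis):
    """Sum the joint pmf over the other coordinate."""
    out = defaultdict(Fraction)   # missing keys start at Fraction(0)
    for point, p in joint_pmf.items():
        out[point[axis]] += p
    return dict(out)

f1 = marginal(joint, 0)   # marginal pmf of X1
f2 = marginal(joint, 1)   # marginal pmf of X2

def conditional_x2_given_x1(x2, x1):
    """f(x2 | x1) = f(x1, x2) / f1(x1), defined when f1(x1) > 0."""
    return joint.get((x1, x2), Fraction(0)) / f1[x1]

# Independence would require f(x1, x2) = f1(x1) f2(x2) at every point.
independent = all(joint[(a, b)] == f1[a] * f2[b] for (a, b) in joint)

print(f1, f2)
print(conditional_x2_given_x1(1, 0), independent)
```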

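Let X be a rv with pdf $f_X(x)$ and g(X) be a function of X. Then, the expected value (or the mean or the mathematical expectation) of g(X) is

$E[g(X)] = \sum_{x} g(x) f_X(x)$ if X is discrete, and $E[g(X)] = \int_{-\infty}^{\infty} g(x) f_X(x)\,dx$ if X is continuous,

provided the sum or the integral exists, i.e.,

$-\infty < E[g(X)] < \infty$.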

- E[g(X)] is finite if E[|g(X)|] is finite.

Let X be a rv and c be a constant.

Laws of Expected Value

$E(c) = c$

$E(X + c) = E(X) + c$

$E(cX) = cE(X)$

Laws of Variance

$V(c) = 0$

$V(X + c) = V(X)$

$V(cX) = c^2 V(X)$
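The laws of expected value and variance can be verified exactly on a small pmf. The sketch below assumes X is the outcome of a fair die and c = 3; the helpers `expect` and `var` are illustrative names.

```python
from fractions import Fraction

# pmf of a fair die.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def expect(g):
    """E[g(X)] = sum of g(x) f(x) over the support."""
    return sum(g(x) * p for x, p in pmf.items())

def var(g):
    """V[g(X)] = E[(g(X) - E[g(X)])^2]."""
    mean = expect(g)
    return expect(lambda x: (g(x) - mean) ** 2)

c = 3
mean_x = expect(lambda x: x)
var_x = var(lambda x: x)

print(mean_x, var_x)
print(expect(lambda x: x + c) == mean_x + c)   # E(X + c) = E(X) + c
print(expect(lambda x: c * x) == c * mean_x)   # E(cX) = cE(X)
print(var(lambda x: x + c) == var_x)           # V(X + c) = V(X)
print(var(lambda x: c * x) == c**2 * var_x)    # V(cX) = c^2 V(X)
```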

If X and Y are independent, $E[g(X)h(Y)] = E[g(X)]\,E[h(Y)]$; in particular, $E(XY) = E(X)E(Y)$.

The covariance of X and Y is defined as $\mathrm{Cov}(X,Y) = E[(X - E(X))(Y - E(Y))] = E(XY) - E(X)E(Y)$.

If X and Y are independent, $\mathrm{Cov}(X,Y) = 0$.

The converse is not true in general: zero covariance does not imply independence. It does hold when (X,Y) is jointly normal.

If (X,Y)~Normal, then X and Y are independent iff Cov(X,Y)=0.
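A standard counterexample shows that zero covariance does not imply independence: take X uniform on {-1, 0, 1} and Y = X^2. The sketch below checks both facts exactly; the choice of distribution is an assumption made for illustration.

```python
from fractions import Fraction

# X is uniform on {-1, 0, 1} and Y = X^2, so Y is completely determined by X.
joint = {(x, x * x): Fraction(1, 3) for x in (-1, 0, 1)}

def expect(g):
    """E[g(X, Y)] under the joint pmf."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

cov = expect(lambda x, y: x * y) - expect(lambda x, y: x) * expect(lambda x, y: y)

# Independence would require P(X=0, Y=0) = P(X=0) P(Y=0).
p_x0 = sum(p for (x, _), p in joint.items() if x == 0)
p_y0 = sum(p for (_, y), p in joint.items() if y == 0)
p_x0_y0 = joint.get((0, 0), Fraction(0))
independent = p_x0_y0 == p_x0 * p_y0

print(cov, independent)
```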

If X1 and X2 are independent, $E(X_2 | X_1 = x_1) = E(X_2)$ and $\mathrm{Var}(X_2 | X_1 = x_1) = \mathrm{Var}(X_2)$.

$E(X) = E[E(X|Y)]$

$\mathrm{Var}(X) = E[\mathrm{Var}(X|Y)] + \mathrm{Var}(E[X|Y])$ (EVVE rule)

Proofs available in Casella & Berger (1990), pgs. 154 & 158
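The EVVE rule, Var(X) = E[Var(X|Y)] + Var(E[X|Y]), can be checked exactly on a small joint pmf. The table `joint` below is hypothetical: given Y=0, X is uniform on {0, 1}; given Y=1, X is uniform on {1, 2, 3}; and each value of Y has probability 1/2.

```python
from fractions import Fraction

# Hypothetical joint pmf of (X, Y), keyed by (x, y).
joint = {
    (0, 0): Fraction(1, 4), (1, 0): Fraction(1, 4),
    (1, 1): Fraction(1, 6), (2, 1): Fraction(1, 6), (3, 1): Fraction(1, 6),
}

def mean_var(pmf):
    """Mean and variance of a pmf given as {value: probability}."""
    m = sum(v * p for v, p in pmf.items())
    return m, sum((v - m) ** 2 * p for v, p in pmf.items())

# Marginal pmfs of X and Y.
p_x, p_y = {}, {}
for (x, y), p in joint.items():
    p_x[x] = p_x.get(x, 0) + p
    p_y[y] = p_y.get(y, 0) + p

# Conditional mean and variance of X for each value y of Y.
cond = {
    y: mean_var({x: p / p_y[y] for (x, yy), p in joint.items() if yy == y})
    for y in p_y
}

total_var = mean_var(p_x)[1]                                  # Var(X)
expected_cond_var = sum(cond[y][1] * p_y[y] for y in p_y)     # E[Var(X|Y)]
mean_of_cond_means = sum(cond[y][0] * p_y[y] for y in p_y)    # E[E(X|Y)]
var_cond_mean = sum(
    (cond[y][0] - mean_of_cond_means) ** 2 * p_y[y] for y in p_y
)                                                             # Var(E[X|Y])

print(total_var, expected_cond_var, var_cond_mean)
```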

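- Population Mean: $\mu = E(X)$
- Population Variance: $\sigma^2 = E[(X - \mu)^2] = E(X^2) - \mu^2$

(measure of the deviation from the population mean)

- Population Standard Deviation: $\sigma = \sqrt{\sigma^2}$

- Moments: $\mu_k' = E(X^k)$ is the kth moment and $\mu_k = E[(X - \mu)^k]$ is the kth central moment.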

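- This measure reflects the dispersion of all the observations.
- The variance of a population of size N, $x_1, x_2, \dots, x_N$, whose mean is $\mu$ is defined as $\sigma^2 = \dfrac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2$.
- The variance of a sample of n observations $x_1, x_2, \dots, x_n$ whose mean is $\bar{x}$ is defined as $s^2 = \dfrac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$.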
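The population and sample variances differ only in the divisor (N for a population, n - 1 for a sample). The sketch below implements both directly and compares them with `statistics.pvariance` and `statistics.variance` from the standard library; the `data` list is made up for illustration.

```python
import math
import statistics

def population_variance(xs):
    """Average squared deviation from the mean, with divisor N."""
    mu = sum(xs) / len(xs)
    return sum((x - mu) ** 2 for x in xs) / len(xs)

def sample_variance(xs):
    """Squared deviations from the sample mean, with divisor n - 1."""
    xbar = sum(xs) / len(xs)
    return sum((x - xbar) ** 2 for x in xs) / (len(xs) - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative data with mean 5

print(population_variance(data), sample_variance(data))
print(math.isclose(population_variance(data), statistics.pvariance(data)))
print(math.isclose(sample_variance(data), statistics.variance(data)))
```

Treating the same eight numbers as a sample rather than a whole population gives a slightly larger variance because of the smaller divisor.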

The m.g.f. of a random variable X is defined as

$M_X(t) = E\left[e^{tX}\right]$

for $t \in (-h, h)$ for some $h > 0$.

- $M(0) = E[1] = 1$
- If a r.v. X has m.g.f. $M_X(t)$, then $Y = aX + b$ has m.g.f. $M_Y(t) = e^{bt} M_X(at)$.
- The m.g.f. does not always exist (e.g. Cauchy distribution).
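The m.g.f. of a linear transformation follows in one line from the definition M_X(t) = E[e^{tX}]. A short derivation, assuming M_X(at) exists at the point of interest:

```latex
M_Y(t) = E\left[e^{tY}\right]
       = E\left[e^{t(aX + b)}\right]
       = e^{bt}\, E\left[e^{(at)X}\right]
       = e^{bt}\, M_X(at).
```

The factor $e^{bt}$ is a constant with respect to the expectation, which is why it can be pulled out.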

The c.h.f. of a random variable X is defined as

$\varphi_X(t) = E\left[e^{itX}\right]$

for all real numbers t.

The c.h.f. always exists.

Theorem:

- If two r.v.s have m.g.f.s that exist and are equal, then they have the same distribution.
- If two r.v.s have the same distribution, then they have the same m.g.f. (if it exists).
Similar statements are true for the c.h.f.

- Please review the discrete distributions: Degenerate, Uniform, Bernoulli, Binomial, Poisson, Negative Binomial, Geometric, Hypergeometric, Extended Hypergeometric, Multinomial

- Please review the continuous distributions: Uniform, Normal (Gaussian), Exponential, Gamma, Chi-Square, Beta, Weibull, Cauchy, Log-Normal, t, F

- If X is an rv with pdf f(x), then Y=g(X) is also an rv. What is the pdf of Y?
- If X is a discrete rv and g is 1-to-1, replace x wherever it appears in the pmf f(x) using the relation $x = g^{-1}(y)$, so that $f_Y(y) = f_X(g^{-1}(y))$.
- If X is a continuous rv, do the same thing, but now multiply by the Jacobian: $f_Y(y) = f_X(g^{-1}(y)) \left| \dfrac{d}{dy} g^{-1}(y) \right|$.
- If the transformation is not 1-to-1, divide the region into sub-regions on which the transformation is 1-to-1, and add up the contributions from the sub-regions.
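As a separate illustration of the continuous case (not the example from the slides), assume $X \sim \text{Uniform}(0,1)$ and $Y = -\ln X$. The transformation is 1-to-1 on (0,1), so the change-of-variable formula with the Jacobian applies directly:

```latex
x = g^{-1}(y) = e^{-y}, \qquad
\left| \frac{d}{dy}\, g^{-1}(y) \right| = e^{-y}, \qquad
f_Y(y) = f_X\!\left(e^{-y}\right) \cdot e^{-y} = 1 \cdot e^{-y} = e^{-y}, \quad y > 0.
```

Under these assumptions Y has the Exponential(1) distribution.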

- Example: Let
Consider . What is the p.d.f. of Y?

- Solution:

- If $X_1, X_2, \dots, X_n$ are independent random variables with MGFs $M_{X_i}(t)$, then the MGF of $Y = \sum_{i=1}^{n} X_i$ is $M_Y(t) = \prod_{i=1}^{n} M_{X_i}(t)$.
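For independent r.v.s, the MGF of the sum is the product of the individual MGFs. A small numeric check, assuming n independent Bernoulli(p) variables, whose sum is Binomial(n, p): the product of the Bernoulli MGFs should match the Binomial MGF computed directly from its pmf. The values of n, p, and t are arbitrary choices.

```python
import math

def mgf_bernoulli(t, p):
    """E[exp(tX)] for X ~ Bernoulli(p)."""
    return 1 - p + p * math.exp(t)

def mgf_binomial_direct(t, n, p):
    """E[exp(tY)] for Y ~ Binomial(n, p), summed directly over the pmf."""
    return sum(
        math.comb(n, k) * p**k * (1 - p) ** (n - k) * math.exp(t * k)
        for k in range(n + 1)
    )

n, p, t = 5, 0.3, 0.7   # arbitrary illustrative values

product_of_mgfs = mgf_bernoulli(t, p) ** n   # product of n identical MGFs
direct = mgf_binomial_direct(t, n, p)

print(product_of_mgfs, direct, math.isclose(product_of_mgfs, direct))
```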

- Let X have continuous cdf $F_X(x)$ and define the rv Y as $Y = F_X(X)$. Then, Y ~ Uniform(0,1), that is,

$P(Y \le y) = y$, $0 < y < 1$.

- This is very commonly used, especially in random number generation procedures.
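One common use in random number generation is inverse-transform sampling: if U ~ Uniform(0,1) and F is a continuous, strictly increasing cdf, then F^{-1}(U) has cdf F. The sketch below assumes an Exponential distribution with a chosen `rate`, whose cdf 1 - exp(-rate * x) inverts in closed form; the function name and the seed are illustrative.

```python
import math
import random

def sample_exponential(rate, rng):
    """Draw one Exponential(rate) value by inverting its cdf."""
    u = rng.random()                     # U in [0, 1)
    return -math.log(1.0 - u) / rate     # solves 1 - exp(-rate * x) = u

rng = random.Random(0)   # fixed seed so the run is reproducible
rate = 2.0
draws = [sample_exponential(rate, rng) for _ in range(100_000)]

sample_mean = sum(draws) / len(draws)    # should be near 1 / rate
# Empirical P(X <= 0.5) versus the exact cdf value at 0.5.
empirical_cdf = sum(d <= 0.5 for d in draws) / len(draws)
exact_cdf = 1 - math.exp(-rate * 0.5)

print(sample_mean, empirical_cdf, exact_cdf)
```

Using `1.0 - u` rather than `u` keeps the argument of the logarithm strictly positive, since `rng.random()` can return 0.0 but never 1.0.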

- A statistic is also a random variable. Its distribution depends on the distribution of the random sample and the form of the function Y=T(X1, X2,…,Xn). The probability distribution of a statistic Y is called the sampling distribution of Y.

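Properties of the Sample Mean and Sample Variance

- Let $X_1, X_2, \dots, X_n$ be a r.s. of size n from a $N(\mu, \sigma^2)$ distribution. Then,

a. $\bar{X}$ and $S^2$ are independent r.v.s.

b. $\bar{X} \sim N(\mu, \sigma^2/n)$.

c. $\dfrac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}$.

If the population variance is unknown, we use the sample variance: $\dfrac{\bar{X} - \mu}{S/\sqrt{n}} \sim t_{n-1}$.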

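- The F distribution allows us to compare the variances of two independent normal samples by giving the distribution of $\dfrac{S_X^2/\sigma_X^2}{S_Y^2/\sigma_Y^2} \sim F_{n-1,\,m-1}$, where n and m are the two sample sizes.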

- If $X \sim F_{p,q}$, then $1/X \sim F_{q,p}$.

- If $X \sim t_q$, then $X^2 \sim F_{1,q}$.

[Figure: the population distribution of the random variable X compared with the sampling distribution of the sample mean $\bar{X}$ computed from a random sample $(X_1, X_2, X_3, \dots, X_n)$.]

If a random sample is drawn from any population with finite variance, the sampling distribution of the sample mean is approximately normal for a sufficiently large sample size. The larger the sample size, the more closely the sampling distribution of $\bar{X}$ will resemble a normal distribution.

If X is normal, $\bar{X}$ is normal.

If X is non-normal, $\bar{X}$ is approximately normally distributed for a sample size greater than or equal to 30 (a common rule of thumb).
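A small simulation can illustrate this behavior of the sample mean. The sketch below assumes a skewed Exponential(1) population (mean 1, standard deviation 1), samples of size n = 30, and a fixed seed; these choices are illustrative, not part of the course material.

```python
import math
import random
import statistics

rng = random.Random(42)   # fixed seed so the run is reproducible
n = 30                    # size of each random sample
reps = 5000               # number of samples drawn

# Each entry is the mean of one sample of size n from Exponential(1).
sample_means = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n
    for _ in range(reps)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
theoretical_se = 1 / math.sqrt(n)   # sigma / sqrt(n) with sigma = 1

# Share of sample means within 1.96 standard errors of the population mean;
# a normal approximation would put this near 0.95.
within = sum(
    abs(m - 1.0) <= 1.96 * theoretical_se for m in sample_means
) / reps

print(mean_of_means, sd_of_means, theoretical_se, within)
```

The sample means center on the population mean with spread close to sigma/sqrt(n), even though the individual observations are far from normal.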