Stat 552 probability and statistics ii

STAT 552 PROBABILITY AND STATISTICS II

INTRODUCTION

Short review of S551


WHAT IS STATISTICS?

  • Statistics is a science of collecting data, organizing and describing it and drawing conclusions from it. That is, statistics is a way to get information from data. It is the science of uncertainty.


BASIC DEFINITIONS

  • POPULATION: The collection of all items of interest in a particular study.

  • SAMPLE: A set of data drawn from the population;

    a subset of the population available for observation

  • PARAMETER: A descriptive measure of the

    population, e.g., mean

  • STATISTIC: A descriptive measure of a sample

  • VARIABLE: A characteristic of interest about each

    element of a population or sample.


STATISTIC

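  • A statistic (or estimator) is any function of the r.v.s in a random sample (r.s.) that does not contain any unknown quantity. E.g., for a r.s. X1, X2, …, Xn from a population with unknown mean μ and unknown standard deviation σ:

    • the sample mean X̄ = (X1 + … + Xn)/n and the sample maximum max(X1, …, Xn) are statistics.

    • X̄ − μ and (X1 + … + Xn)/σ are NOT.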

  • Any observed or particular value of an estimator is an estimate.


Sample Space

  • The set of all possible outcomes of an experiment is called a sample space and denoted by S.

  • Determining the outcomes.

    • Build an exhaustive list of all possible outcomes.

    • Make sure the listed outcomes are mutually exclusive.


RANDOM VARIABLES

  • Variables whose observed value is determined by chance

  • A r.v. is a function defined on the sample space S that associates a real number with each outcome in S.

  • Rvs are denoted by uppercase letters, and their observed values by lowercase letters.
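
A minimal sketch of a r.v. as a function on S. The two-coin experiment and the name `num_heads` are assumptions chosen for illustration, not taken from the slides.

```python
from fractions import Fraction
from itertools import product

# Sample space S for tossing a fair coin twice (hypothetical experiment).
S = ["".join(outcome) for outcome in product("HT", repeat=2)]

def num_heads(outcome):
    """The r.v. X: maps each outcome in S to a real number (count of heads)."""
    return outcome.count("H")

# Distribution of X induced by equally likely outcomes in S.
pmf = {}
for s in S:
    x = num_heads(s)  # observed value x of the r.v. X
    pmf[x] = pmf.get(x, Fraction(0)) + Fraction(1, len(S))

print(S)       # ['HH', 'HT', 'TH', 'TT']
print(pmf[1])  # 1/2
```

Here the uppercase X corresponds to the function `num_heads`, and the lowercase x to the value it returns for one outcome.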


DESCRIPTIVE STATISTICS

  • Descriptive statistics involves the arrangement, summary, and presentation of data, to enable meaningful interpretation, and to support decision making.

  • Descriptive statistics methods make use of

    • graphical techniques

    • numerical descriptive measures.


Types of data – examples


[Diagram: PROBABILITY reasons from the POPULATION to the SAMPLE; STATISTICAL INFERENCE reasons from the SAMPLE back to the POPULATION.]


  • PROBABILITY: A numerical value expressing the degree of uncertainty regarding the occurrence of an event. A measure of uncertainty.

  • STATISTICAL INFERENCE: The science of drawing inferences about the population based only on a part of the population, the sample.


Probability

P : S → [0,1]

Here P is the probability function, S is its domain, and [0,1] is its range.


THE CALCULUS OF PROBABILITIES

  • If P is a probability function and A is any set, then

    a. P(∅) = 0

    b. P(A) ≤ 1

    c. P(A^C) = 1 − P(A)
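
These three properties can be checked on a small finite sample space. The sketch below assumes one roll of a fair die with equally likely outcomes; the function name `P` and the event `A` are illustrative.

```python
from fractions import Fraction

S = frozenset(range(1, 7))  # sample space of one roll of a fair die (assumed)

def P(event):
    """Probability of an event (a subset of S) under equally likely outcomes."""
    return Fraction(len(event & S), len(S))

A = frozenset({2, 4, 6})  # the event "roll is even"
A_complement = S - A

print(P(frozenset()))               # 0
print(P(A) <= 1)                    # True
print(P(A_complement) == 1 - P(A))  # True
```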


ODDS

  • The odds of an event A is defined by

    $\text{odds}(A) = \dfrac{P(A)}{P(A^C)} = \dfrac{P(A)}{1 - P(A)}$

  • It tells us how much more likely we are to see the occurrence of event A than its nonoccurrence.


ODDS RATIO

  • OR is the ratio of two odds.

  • Useful for comparing the odds under two different conditions or for two different groups, e.g. odds for males versus females.
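
A minimal sketch of both quantities, assuming the usual definition odds = p / (1 − p). The two group probabilities are hypothetical numbers.

```python
def odds(p):
    """Odds of an event with probability p, i.e. p / (1 - p)."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1 - p)

def odds_ratio(p1, p2):
    """Ratio of the odds under condition 1 to the odds under condition 2."""
    return odds(p1) / odds(p2)

# Hypothetical probabilities of the same event in two groups.
p_group1, p_group2 = 0.75, 0.5
print(odds(p_group1))                  # 3.0
print(odds_ratio(p_group1, p_group2))  # 3.0
```

An odds ratio of 1 would mean the odds are the same in both groups.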


CONDITIONAL PROBABILITY

  • (Marginal) Probability: P(A): How likely is it that an event A will occur when an experiment is performed?

  • Conditional Probability: P(A|B): How will the probability of event A be affected by the knowledge of the occurrence or nonoccurrence of event B?

  • If two events are independent, then P(A|B)=P(A)


CONDITIONAL PROBABILITY

  • For events A and B with P(B) > 0, $P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$


BAYES THEOREM

  • Suppose you have P(B|A), but need P(A|B). Bayes' theorem gives $P(A \mid B) = \dfrac{P(B \mid A)\,P(A)}{P(B)}$, provided P(B) > 0.
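
A short sketch of turning P(B|A) into P(A|B), with P(B) obtained from the law of total probability. The function name `bayes` and all numbers are hypothetical.

```python
def bayes(p_b_given_a, p_a, p_b_given_not_a):
    """Return P(A|B) from P(B|A), P(A) and P(B|A^C)."""
    # Law of total probability: P(B) = P(B|A)P(A) + P(B|A^C)P(A^C)
    p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)
    return p_b_given_a * p_a / p_b

# Hypothetical numbers: A = "has the condition", B = "test is positive".
posterior = bayes(p_b_given_a=0.95, p_a=0.01, p_b_given_not_a=0.05)
print(round(posterior, 3))
```

With these assumed inputs P(A|B) stays well below P(B|A), because the event A is rare to begin with.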


Independence

  • A and B are independent iff

    • P(A|B)=P(A) or P(B|A)=P(B)

    • P(A∩B)=P(A)P(B)

  • A1, A2, …, An are mutually independent iff

    $P\left(\bigcap_{i \in J} A_i\right) = \prod_{i \in J} P(A_i)$

    for every subset J of {1,2,…,n}

    E.g. for n=3, A1, A2, A3 are mutually independent iff P(A1∩A2∩A3)=P(A1)P(A2)P(A3) and P(A1∩A2)=P(A1)P(A2) and P(A1∩A3)=P(A1)P(A3) and P(A2∩A3)=P(A2)P(A3)
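
The n=3 conditions are not redundant: pairwise independence does not imply mutual independence. A minimal sketch, assuming two fair coin tosses and three illustrative events:

```python
from fractions import Fraction
from itertools import product

S = list(product("HT", repeat=2))  # two fair coin tosses, equally likely

def P(event):
    """Probability of an event given as a predicate on outcomes."""
    return Fraction(sum(1 for s in S if event(s)), len(S))

A1 = lambda s: s[0] == "H"   # first toss is heads
A2 = lambda s: s[1] == "H"   # second toss is heads
A3 = lambda s: s[0] == s[1]  # both tosses agree

def both(e, f):
    """Intersection of two events."""
    return lambda s: e(s) and f(s)

pairwise = all(
    P(both(e, f)) == P(e) * P(f)
    for e, f in [(A1, A2), (A1, A3), (A2, A3)]
)
triple = P(lambda s: A1(s) and A2(s) and A3(s)) == P(A1) * P(A2) * P(A3)

print(pairwise)  # True
print(triple)    # False
```

Every pair satisfies the product rule, yet the triple intersection has probability 1/4 rather than 1/8, so the three events are not mutually independent.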


DISCRETE RANDOM VARIABLES

  • If the set of all possible values of a r.v. X is a countable set, then X is called a discrete r.v.

  • The function f(x)=P(X=x) for x=x1,x2, … that assigns the probability to each value x is called the probability density function (p.d.f.) or probability mass function (p.m.f.)


Example

  • Discrete Uniform distribution: $f(x) = \dfrac{1}{N}$ for x = 1, 2, …, N

  • Example: throw a fair die. P(X=1)=…=P(X=6)=1/6


CONTINUOUS RANDOM VARIABLES

  • When the sample space is uncountable (continuous)

  • Example: Continuous Uniform(a,b): $f(x) = \dfrac{1}{b-a}$ for a ≤ x ≤ b


CUMULATIVE DISTRIBUTION FUNCTION (C.D.F.)

  • CDF of a r.v. X is defined as F(x)=P(X≤x).
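
For a discrete r.v. the cdf is a running sum of the pmf. The sketch below assumes a fair die (discrete uniform on 1, …, 6); the function names are illustrative.

```python
from fractions import Fraction

def pmf(x, n=6):
    """Discrete uniform pmf on {1, ..., n}; n=6 models a fair die."""
    return Fraction(1, n) if x in range(1, n + 1) else Fraction(0)

def cdf(x, n=6):
    """F(x) = P(X <= x), obtained by summing the pmf over values <= x."""
    return sum(pmf(k, n) for k in range(1, n + 1) if k <= x)

print(cdf(3))    # 1/2
print(cdf(3.5))  # 1/2
print(cdf(6))    # 1
```

F is a step function here: it is constant between the possible values of X and jumps by f(x) at each of them.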


JOINT DISCRETE DISTRIBUTIONS

  • A function f(x1, x2,…, xk) is the joint pmf for some vector valued rv X=(X1, X2,…,Xk) iff the following properties are satisfied:

    f(x1, x2,…, xk) ≥ 0 for all (x1, x2,…, xk)

    and

    $\sum_{x_1} \cdots \sum_{x_k} f(x_1, x_2, \ldots, x_k) = 1$


MARGINAL DISCRETE DISTRIBUTIONS

  • If the pair (X1,X2) of discrete random variables has the joint pmf f(x1,x2), then the marginal pmfs of X1 and X2 are

    $f_1(x_1) = \sum_{x_2} f(x_1, x_2)$ and $f_2(x_2) = \sum_{x_1} f(x_1, x_2)$


CONDITIONAL DISTRIBUTIONS

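  • If X1 and X2 are discrete or continuous random variables with joint pdf f(x1,x2), then the conditional pdf of X2 given X1=x1 is defined by

    $f(x_2 \mid x_1) = \dfrac{f(x_1, x_2)}{f_1(x_1)}$ for values x1 such that f1(x1) > 0

  • For independent rvs, $f(x_2 \mid x_1) = f_2(x_2)$ and $f(x_1 \mid x_2) = f_1(x_1)$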
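
A small sketch of marginal and conditional pmfs computed from a joint pmf table. The table values are hypothetical, and the helper names `marginal` and `conditional_x2_given_x1` are illustrative.

```python
from fractions import Fraction

# Hypothetical joint pmf f(x1, x2) on {0,1} x {0,1}.
joint = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
    (1, 0): Fraction(2, 8), (1, 1): Fraction(2, 8),
}

def marginal(joint, axis):
    """Sum the joint pmf over the other coordinate."""
    out = {}
    for point, p in joint.items():
        out[point[axis]] = out.get(point[axis], Fraction(0)) + p
    return out

f1 = marginal(joint, 0)  # marginal pmf of X1
f2 = marginal(joint, 1)  # marginal pmf of X2

def conditional_x2_given_x1(x1):
    """f(x2 | x1) = f(x1, x2) / f1(x1), defined when f1(x1) > 0."""
    return {x2: p / f1[x1] for (a, x2), p in joint.items() if a == x1}

print(f1[0], f2[1])                # 1/2 5/8
print(conditional_x2_given_x1(0))  # conditional pmf of X2 given X1 = 0
```

In this table f(x2 | x1 = 0) differs from f2(x2), so X1 and X2 are not independent.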


EXPECTED VALUES

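Let X be a rv with pdf fX(x) and g(X) be a function of X. Then, the expected value (or the mean or the mathematical expectation) of g(X) is

$E[g(X)] = \sum_x g(x)\, f_X(x)$ if X is discrete, and $E[g(X)] = \int_{-\infty}^{\infty} g(x)\, f_X(x)\, dx$ if X is continuous,

provided the sum or the integral exists, i.e.,

−∞ < E[g(X)] < ∞.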


EXPECTED VALUES

  • E[g(X)] is finite if E[|g(X)|] is finite.


Laws of Expected Value and Variance

Let X be a rv and c be a constant.

Laws of Expected Value

E(c) = c

E(X + c) = E(X) + c

E(cX) = cE(X)

Laws of Variance

V(c) = 0

V(X + c) = V(X)

V(cX) = c²V(X)
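
The laws can be verified exactly on a small discrete distribution. This sketch assumes a fair die for X and c = 3; the helper names `E` and `V` are illustrative.

```python
from fractions import Fraction

die = {x: Fraction(1, 6) for x in range(1, 7)}  # pmf of a fair die (assumed)

def E(g, pmf=die):
    """Expected value of g(X) for a discrete pmf."""
    return sum(g(x) * p for x, p in pmf.items())

def V(g, pmf=die):
    """Variance of g(X): E[(g(X) - E[g(X)])^2]."""
    mean = E(g, pmf)
    return E(lambda x: (g(x) - mean) ** 2, pmf)

c = 3
X = lambda x: x

print(E(lambda x: x + c) == E(X) + c)       # True
print(E(lambda x: c * x) == c * E(X))       # True
print(V(lambda x: x + c) == V(X))           # True
print(V(lambda x: c * x) == c ** 2 * V(X))  # True
```

Shifting by c moves the mean but leaves the spread unchanged, while scaling by c multiplies the variance by c squared.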


EXPECTED VALUE

If X and Y are independent, $E(XY) = E(X)\,E(Y)$.

The covariance of X and Y is defined as

$\mathrm{Cov}(X, Y) = E[(X - E(X))(Y - E(Y))] = E(XY) - E(X)E(Y)$


EXPECTED VALUE

If X and Y are independent, Cov(X,Y)=0.

The reverse is usually not correct! It does hold when X and Y are jointly normally distributed.

If (X,Y) is jointly (bivariate) normal, then X and Y are independent iff

Cov(X,Y)=0
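
A standard counterexample shows zero covariance without independence. The sketch assumes X uniform on {−1, 0, 1} and Y = X², a hypothetical non-normal pair.

```python
from fractions import Fraction

# X uniform on {-1, 0, 1} and Y = X^2: Y is completely determined by X.
points = {(x, x * x): Fraction(1, 3) for x in (-1, 0, 1)}

def E(g):
    """Expected value of g(X, Y) under the joint pmf."""
    return sum(g(x, y) * p for (x, y), p in points.items())

cov = E(lambda x, y: x * y) - E(lambda x, y: x) * E(lambda x, y: y)

# Independence would require P(X=0, Y=0) == P(X=0) * P(Y=0).
p_joint = points[(0, 0)]
p_x0 = sum(p for (x, _), p in points.items() if x == 0)
p_y0 = sum(p for (_, y), p in points.items() if y == 0)

print(cov)                     # 0
print(p_joint == p_x0 * p_y0)  # False
```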


EXPECTED VALUE

If X1 and X2 are independent, $\mathrm{Var}(X_1 + X_2) = \mathrm{Var}(X_1) + \mathrm{Var}(X_2)$.


CONDITIONAL EXPECTATION AND VARIANCE

$E(X) = E[E(X \mid Y)]$

$\mathrm{Var}(X) = E[\mathrm{Var}(X \mid Y)] + \mathrm{Var}[E(X \mid Y)]$ (EVVE rule)

Proofs available in Casella & Berger (1990), pgs. 154 & 158
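
Both identities, E(X) = E[E(X|Y)] and Var(X) = E[Var(X|Y)] + Var[E(X|Y)], can be checked exactly on a small table. The joint pmf below is hypothetical and the helper `mean_var` is illustrative.

```python
from fractions import Fraction

# Hypothetical joint pmf of (X, Y).
joint = {
    (0, 0): Fraction(1, 4), (1, 0): Fraction(1, 4),
    (0, 1): Fraction(1, 8), (2, 1): Fraction(3, 8),
}

def mean_var(pmf):
    """Mean and variance of a discrete pmf given as {value: probability}."""
    m = sum(x * p for x, p in pmf.items())
    v = sum((x - m) ** 2 * p for x, p in pmf.items())
    return m, v

# Marginal pmfs of X and Y.
fX, fY = {}, {}
for (x, y), p in joint.items():
    fX[x] = fX.get(x, 0) + p
    fY[y] = fY.get(y, 0) + p

# Conditional mean and variance of X for each value of Y.
cond = {}
for y0, py in fY.items():
    cond_pmf = {x: p / py for (x, y), p in joint.items() if y == y0}
    cond[y0] = mean_var(cond_pmf)

mean_X, var_X = mean_var(fX)
E_cond_mean = sum(cond[y][0] * fY[y] for y in fY)    # E[E(X|Y)]
E_cond_var = sum(cond[y][1] * fY[y] for y in fY)     # E[Var(X|Y)]
Var_cond_mean = sum((cond[y][0] - E_cond_mean) ** 2 * fY[y] for y in fY)

print(mean_X == E_cond_mean)                # True
print(var_X == E_cond_var + Var_cond_mean)  # True
```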


SOME MATHEMATICAL EXPECTATIONS

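  • Population Mean: μ = E(X)

  • Population Variance: $\sigma^2 = E[(X - \mu)^2] = E(X^2) - \mu^2$

(measure of the deviation from the population mean)

  • Population Standard Deviation: $\sigma = \sqrt{\sigma^2}$

  • Moments: $\mu_k' = E(X^k)$ is the kth moment; $\mu_k = E[(X - \mu)^k]$ is the kth central moment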


The Variance

  • This measure reflects the dispersion of all the observations

  • The variance of a population of size N, x1, x2,…,xN, whose mean is μ is defined as $\sigma^2 = \dfrac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2$

  • The variance of a sample of n observations x1, x2, …,xn whose mean is x̄ is defined as $s^2 = \dfrac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$
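
A minimal sketch contrasting the two divisors, N for a population and n − 1 for a sample, checked against Python's `statistics` module. The data values are hypothetical.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations

def population_variance(xs):
    """Average squared deviation from the mean, dividing by N."""
    mu = sum(xs) / len(xs)
    return sum((x - mu) ** 2 for x in xs) / len(xs)

def sample_variance(xs):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    xbar = sum(xs) / len(xs)
    return sum((x - xbar) ** 2 for x in xs) / (len(xs) - 1)

print(population_variance(data))  # 4.0
print(sample_variance(data))      # 32/7, slightly larger than 4.0
```

The same list gives a larger value under the n − 1 divisor, which is the one `statistics.variance` uses; `statistics.pvariance` uses N.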


MOMENT GENERATING FUNCTION

The m.g.f. of random variable X is defined as

$M_X(t) = E(e^{tX})$

for t ∈ (−h,h) for some h>0.


Properties of m.g.f.

  • M(0)=E[1]=1

  • If a r.v. X has m.g.f. M(t), then Y=aX+b has a m.g.f. $M_Y(t) = e^{bt} M(at)$

  • M.g.f. does not always exist (e.g. Cauchy distribution)
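
A numerical sketch of the first two properties, assuming X ~ Bernoulli(p) with m.g.f. (1 − p) + p e^t and the linear-transformation rule M_Y(t) = e^{bt} M_X(at). The values of p, a and b are hypothetical.

```python
import math

p = 0.3          # hypothetical Bernoulli success probability
a, b = 2.0, 1.0  # hypothetical constants in Y = aX + b

def M_X(t):
    """m.g.f. of X ~ Bernoulli(p): E[e^{tX}] = (1 - p) + p e^t."""
    return (1 - p) + p * math.exp(t)

def M_Y_direct(t):
    """E[e^{tY}] for Y = aX + b, from the two values Y can take (b and a + b)."""
    return (1 - p) * math.exp(t * b) + p * math.exp(t * (a + b))

def M_Y_property(t):
    """The same m.g.f. via the rule M_Y(t) = e^{bt} M_X(at)."""
    return math.exp(b * t) * M_X(a * t)

print(abs(M_X(0.0) - 1.0) < 1e-12)                       # True
print(abs(M_Y_direct(0.5) - M_Y_property(0.5)) < 1e-12)  # True
```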


CHARACTERISTIC FUNCTION

The c.h.f. of random variable X is defined as

$\varphi_X(t) = E(e^{itX})$

for all real numbers t.

C.h.f. always exists.


Uniqueness

Theorem:

  • If two r.v.s have m.g.f.s that exist and are equal, then they have the same distribution.

  • If two r.v.s have the same distribution, then they have the same m.g.f. (if they exist)

    Similar statements are true for c.h.f.


SOME DISCRETE PROBABILITY DISTRIBUTIONS

  • Please review: Degenerate, Uniform, Bernoulli, Binomial, Poisson, Negative Binomial, Geometric, Hypergeometric, Extended Hypergeometric, Multinomial


SOME CONTINUOUS PROBABILITY DISTRIBUTIONS

  • Please review: Uniform, Normal (Gaussian), Exponential, Gamma, Chi-Square, Beta, Weibull, Cauchy, Log-Normal, t, F Distributions


TRANSFORMATION OF RANDOM VARIABLES

  • If X is an rv with pdf f(x), then Y=g(X) is also an rv. What is the pdf of Y?

  • If X is a discrete rv, replace x by $g^{-1}(y)$ wherever you see x in the pmf f(x), using the relation $x = g^{-1}(y)$ (for a 1-to-1 transformation g).

  • If X is a continuous rv, then do the same thing, but now multiply by the absolute value of the Jacobian, $\left| \dfrac{dx}{dy} \right|$.

  • If it is not a 1-to-1 transformation, divide the region into sub-regions on which the transformation is 1-to-1.


CDF method

  • Example: Let

    Consider . What is the p.d.f. of Y?

  • Solution:
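
A minimal sketch of the CDF method, assuming for illustration X ~ Uniform(0,1) and Y = −ln X. This pair is a hypothetical choice and not necessarily the example intended above. The method computes F_Y(y) = P(Y ≤ y) in terms of X and then differentiates.

```python
import math
import random

def F_Y(y):
    """cdf from the CDF method: P(-ln X <= y) = P(X >= e^{-y}) = 1 - e^{-y}."""
    return 1 - math.exp(-y) if y > 0 else 0.0

def f_Y(y):
    """pdf obtained by differentiating F_Y: e^{-y} for y > 0."""
    return math.exp(-y) if y > 0 else 0.0

rng = random.Random(0)
# 1 - rng.random() lies in (0, 1], which avoids log(0).
sample = [-math.log(1 - rng.random()) for _ in range(100_000)]
empirical = sum(y <= 1.0 for y in sample) / len(sample)

print(round(F_Y(1.0), 4))  # 0.6321
print(empirical)           # simulation estimate of P(Y <= 1)
```

Under these assumptions Y follows an Exponential distribution with mean 1, and the simulated proportion should be close to F_Y(1).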


M.G.F. Method

  • If X1,X2,…,Xn are independent random variables with MGFs $M_{X_i}(t)$, then the MGF of $Y = \sum_{i=1}^{n} X_i$ is $M_Y(t) = \prod_{i=1}^{n} M_{X_i}(t)$


THE PROBABILITY INTEGRAL TRANSFORMATION

  • Let X have continuous cdf FX(x) and define the rv Y as Y=FX(X). Then,

    Y ~ Uniform(0,1), that is,

    P(Y ≤ y) = y, 0<y<1.

  • This is very commonly used, especially in random number generation procedures.
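
A simulation sketch of both directions, assuming an Exponential distribution with a hypothetical rate of 2: applying F to draws of X should give uniform values, and applying the inverse cdf to uniform draws should give draws of X.

```python
import math
import random

rate = 2.0  # hypothetical rate of an Exponential distribution

def F(x):
    """cdf of Exponential(rate) for x >= 0."""
    return 1 - math.exp(-rate * x)

def F_inv(u):
    """Inverse cdf, valid for 0 <= u < 1."""
    return -math.log(1 - u) / rate

rng = random.Random(1)

# Forward direction: Y = F_X(X) should behave like Uniform(0,1).
ys = [F(rng.expovariate(rate)) for _ in range(100_000)]
frac_below = sum(y <= 0.25 for y in ys) / len(ys)

# Reverse direction (random number generation): X = F^{-1}(U).
xs = [F_inv(rng.random()) for _ in range(100_000)]
mean_x = sum(xs) / len(xs)

print(frac_below)  # close to 0.25
print(mean_x)      # close to 1 / rate = 0.5
```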


SAMPLING DISTRIBUTION

  • A statistic is also a random variable. Its distribution depends on the distribution of the random sample and the form of the function Y=T(X1, X2,…,Xn). The probability distribution of a statistic Y is called the sampling distribution of Y.


SAMPLING FROM THE NORMAL DISTRIBUTION

Properties of the Sample Mean and Sample Variance

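  • Let X1, X2,…,Xn be a r.s. of size n from a N(μ,σ²) distribution. Then,

    • X̄ and S² are independent rvs.

    • $\bar{X} \sim N(\mu, \sigma^2/n)$

    • $\dfrac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}$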


SAMPLING FROM THE NORMAL DISTRIBUTION

If population variance is unknown, we use sample variance:

$\dfrac{\bar{X} - \mu}{S/\sqrt{n}} \sim t_{n-1}$


SAMPLING FROM THE NORMAL DISTRIBUTION

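  • The F distribution allows us to compare the variances by giving the distribution of

    $\dfrac{S_X^2/\sigma_X^2}{S_Y^2/\sigma_Y^2} \sim F_{n-1,\,m-1}$

    where $S_X^2$ and $S_Y^2$ are the sample variances of independent random samples of sizes n and m from normal populations with variances $\sigma_X^2$ and $\sigma_Y^2$.

  • If $X \sim F_{p,q}$, then $1/X \sim F_{q,p}$.

  • If $X \sim t_q$, then $X^2 \sim F_{1,q}$.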


CENTRAL LIMIT THEOREM

If a random sample is drawn from any population with finite variance, the sampling distribution of the sample mean is approximately normal for a sufficiently large sample size. The larger the sample size, the more closely the sampling distribution of X̄ will resemble a normal distribution.

[Diagram: the Random Variable (Population) Distribution of X; a Random Sample (X1, X2, X3, …,Xn) drawn from it; the resulting Sample Mean Distribution of X̄.]


Sampling Distribution of the Sample Mean

If X is normal, X̄ is normal.

If X is non-normal, X̄ is approximately normally distributed for sample size greater than or equal to 30 (a common rule of thumb).
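
A simulation sketch of the sampling distribution of the sample mean, assuming a skewed Exponential(1) population (mean 1, standard deviation 1), sample size n = 30, and 5,000 repetitions. All of these choices are hypothetical.

```python
import math
import random
import statistics

rng = random.Random(42)
n, reps = 30, 5_000

# Draw many samples of size n and record each sample mean.
means = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n
    for _ in range(reps)
]

center = statistics.mean(means)   # should be near the population mean, 1
spread = statistics.stdev(means)  # should be near sigma / sqrt(n)

print(center)
print(spread, 1 / math.sqrt(n))
```

The sample means cluster around the population mean with spread close to σ/√n, even though the population itself is far from normal.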

