
Brief Review

Probability and Statistics



Probability distributions

Continuous distributions


Defn (density function)

Let x denote a continuous random variable. Then f(x) is called the density function of x if:

1) f(x) ≥ 0

2) ∫ f(x) dx = 1, integrating over −∞ < x < ∞

3) P[a ≤ x ≤ b] = ∫ f(x) dx, integrating over a ≤ x ≤ b
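As a quick illustration, the sketch below checks these three conditions numerically for the exponential density f(x) = λe^(−λx) on [0, ∞); the exponential example and the use of scipy are choices made here, not part of the slides.

```python
# A minimal numerical check of the three density conditions, using the exponential
# density f(x) = lam * exp(-lam * x) on [0, inf) as an illustrative (assumed) example.
import numpy as np
from scipy.integrate import quad

lam = 2.0
f = lambda x: lam * np.exp(-lam * x)

# 1) f(x) >= 0 at a grid of test points
assert np.all(f(np.linspace(0, 20, 1001)) >= 0)

# 2) f integrates to 1 over its support
total, _ = quad(f, 0, np.inf)
print("integral of f:", total)            # ~1.0

# 3) P[a <= x <= b] is the integral of f from a to b
a, b = 0.5, 1.5
prob, _ = quad(f, a, b)
print("P[0.5 <= x <= 1.5] =", prob)       # exp(-1) - exp(-3) ~ 0.318
```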


Defn (Joint density function)

Let x = (x1 ,x2 ,x3 , ... , xn) denote a vector of continuous random variables. Then

f(x) = f(x1 ,x2 ,x3 , ... , xn)

is called the joint density function of x = (x1 ,x2 ,x3 , ... , xn)

if

1) f(x) ≥ 0

2) ∫ ... ∫ f(x1 ,x2 , ... , xn) dx1 ... dxn = 1

3) P[(x1 , ... , xn) ∈ A] = ∫ ... ∫ f(x1 , ... , xn) dx1 ... dxn, integrating over the region A



Defn (Marginal density function)

The marginal density of x1 = (x1 ,x2 ,x3 , ... , xp) (p < n) is defined by:

f1(x1) = ∫ ... ∫ f(x1 , x2) dx2 = ∫ ... ∫ f(x1 , ... , xn) dxp+1 ... dxn

where x2 = (xp+1 ,xp+2 ,xp+3 , ... , xn)

The marginal density of x2 = (xp+1 ,xp+2 ,xp+3 , ... , xn) is defined by:

f2(x2) = ∫ ... ∫ f(x1 , x2) dx1 = ∫ ... ∫ f(x1 , ... , xn) dx1 ... dxp

where x1 = (x1 ,x2 ,x3 , ... , xp)


Defn (Conditional density function)

The conditional density of x1 given x2 (defined in previous slide) (p < n) is defined by:

f1|2(x1 |x2) = f(x1 , x2) / f2(x2)

The conditional density of x2 given x1 is defined by:

f2|1(x2 |x1) = f(x1 , x2) / f1(x1)


Marginal densities describe how the subvector xi behaves ignoring xj

Conditional densities describe how the subvector xi behaves when the subvector xj is held fixed


Defn (Independence)

The two sub-vectors (x1 and x2) are called independent if:

f(x) = f(x1, x2) = f1(x1)f2(x2)

= product of marginals

or

the conditional density of xi given xj :

fi|j(xi |xj) = fi(xi) = marginal density of xi


Example (p-variate Normal)

The random vector x (p × 1) is said to have the

p-variate Normal distribution with

mean vector μ (p × 1) and

covariance matrix Σ (p × p)

(written x ~ Np(μ, Σ)) if:

f(x) = (2π)^(−p/2) |Σ|^(−1/2) exp[ −(1/2)(x − μ)′Σ^(−1)(x − μ) ]
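The sketch below evaluates this p-variate Normal density directly from the formula and compares it with scipy's built-in implementation; the particular μ and Σ are made-up example values.

```python
# Evaluating the p-variate Normal density from the formula above and checking it
# against scipy; mu and Sigma are arbitrary example values, not from the slides.
import numpy as np
from scipy.stats import multivariate_normal

mu = np.array([1.0, -2.0, 0.5])                      # mean vector (p = 3)
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])                  # covariance matrix
x = np.array([0.8, -1.5, 0.0])

p = len(mu)
quad_form = (x - mu) @ np.linalg.inv(Sigma) @ (x - mu)
dens = (2 * np.pi) ** (-p / 2) * np.linalg.det(Sigma) ** (-0.5) * np.exp(-0.5 * quad_form)

print(dens)
print(multivariate_normal(mean=mu, cov=Sigma).pdf(x))   # should agree
```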


Example (bivariate Normal)

The random vector x = (x1 , x2)′ is said to have the bivariate

Normal distribution with mean vector μ = (μ1 , μ2)′

and

covariance matrix Σ, where Var(x1) = σ1², Var(x2) = σ2² and Cov(x1 , x2) = ρσ1σ2.


Theorem (Transformations)

Let x = (x1 ,x2 ,x3 , ... , xn) denote a vector of continuous random variables with joint density function f(x1 ,x2 ,x3 , ... , xn) = f(x). Let

y1 = f1(x1 ,x2 ,x3 , ... , xn)

y2 = f2(x1 ,x2 ,x3 , ... , xn)

...

yn = fn(x1 ,x2 ,x3 , ... , xn)

define a 1-1 transformation of x into y.


Then the joint density of y is g(y) given by:

g(y) = f(x)|J| where

J = det[ ∂xi/∂yj ] = the Jacobian of the transformation (x expressed as a function of y)
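A one-dimensional sanity check of the theorem, using the hypothetical transformation y = e^x with x ~ N(0, 1): then x = ln y, the Jacobian is 1/y, and g(y) = f(ln y)/y should match the standard lognormal density.

```python
# Checking the change-of-variables formula for y = exp(x), x ~ N(0, 1); this example
# (and the use of scipy) is an assumption added here for illustration.
import numpy as np
from scipy.stats import norm, lognorm

y = np.linspace(0.1, 5.0, 50)
g_by_theorem = norm.pdf(np.log(y)) * (1.0 / y)   # f(x(y)) * |J|
g_reference = lognorm.pdf(y, s=1.0)              # scipy's lognormal density

print(np.max(np.abs(g_by_theorem - g_reference)))   # ~0: the two agree
```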


Corollary (Linear Transformations)

Let x = (x1 ,x2 ,x3 , ... , xn) denote a vector of continuous random variables with joint density function f(x1 ,x2 ,x3 , ... , xn) = f(x). Let

y1 = a11x1 + a12x2 + a13x3 + ... + a1nxn

y2 = a21x1 + a22x2 + a23x3 + ... + a2nxn

...

yn = an1x1 + an2x2 + an3x3 + ... + annxn

define a 1-1 transformation of x into y.


Then the joint density of y is g(y) given by:

g(y) = f(A^(−1)y) / |det(A)|, where A is the n × n matrix of coefficients (aij), assumed nonsingular.

Corollary (Linear Transformations for Normal Random variables)

Let x = (x1 ,x2 ,x3 , ... , xn) denote a vector of continuous random variables having an n-variate Normal distribution with mean vector μ and covariance matrix Σ.

i.e. x ~ Nn(μ, Σ)

Let

y1 = a11x1 + a12x2 + a13x3 + ... + a1nxn

y2 = a21x1 + a22x2 + a23x3 + ... + a2nxn

...

yn = an1x1 + an2x2 + an3x3 + ... + annxn

define a 1-1 transformation of x into y.

Then y = (y1 ,y2 ,y3 , ... , yn) ~ Nn(Aμ, AΣA′), where A is the matrix of coefficients (aij).
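A Monte Carlo sketch of this corollary, using made-up μ, Σ and A: the sample mean and covariance of y = Ax should be close to Aμ and AΣA′.

```python
# Simulation check: if x ~ N_n(mu, Sigma) and y = A x, then y ~ N_n(A mu, A Sigma A').
# The values of mu, Sigma and A below are arbitrary examples, not from the slides.
import numpy as np

rng = np.random.default_rng(0)
mu = np.array([1.0, 2.0])
Sigma = np.array([[1.0, 0.4],
                  [0.4, 2.0]])
A = np.array([[2.0, -1.0],
              [0.5,  1.0]])

x = rng.multivariate_normal(mu, Sigma, size=200_000)   # rows are draws of x
y = x @ A.T                                            # y = A x for each draw

print("A mu        :", A @ mu)
print("sample mean :", y.mean(axis=0))
print("A Sigma A'  :\n", A @ Sigma @ A.T)
print("sample cov  :\n", np.cov(y, rowvar=False))
```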


Defn (Expectation)

Let x = (x1 ,x2 ,x3 , ... , xn) denote a vector of continuous random variables with joint density function

f(x) = f(x1 ,x2 ,x3 , ... , xn).

Let U = h(x)= h(x1 ,x2 ,x3 , ... , xn)

Then the expectation of U is defined by:

E[U] = E[h(x)] = ∫ ... ∫ h(x1 ,x2 , ... , xn) f(x1 ,x2 , ... , xn) dx1 ... dxn


Defn (Conditional Expectation)

Let x = (x1 ,x2 ,x3 , ... , xn) = (x1 , x2 ) denote a vector of continuous random variables with joint density function

f(x) = f(x1 ,x2 ,x3 , ... , xn) = f(x1 , x2 ).

Let U = h(x1)= h(x1 ,x2 ,x3 , ... , xp)

Then the conditional expectation of U given x2 is defined by:

E[U|x2] = E[h(x1)|x2] = ∫ ... ∫ h(x1 ,x2 , ... , xp) f1|2(x1 |x2) dx1 ... dxp


Defn (Variance)

Let x = (x1 ,x2 ,x3 , ... , xn) denote a vector of continuous random variables with joint density function

f(x) = f(x1 ,x2 ,x3 , ... , xn).

Let U = h(x)= h(x1 ,x2 ,x3 , ... , xn)

Then the variance of U is defined by:

Var[U] = E[(U − E[U])²] = E[U²] − (E[U])²


Defn (Conditional Variance)

Let x = (x1 ,x2 ,x3 , ... , xn) = (x1 , x2 ) denote a vector of continuous random variables with joint density function

f(x) = f(x1 ,x2 ,x3 , ... , xn) = f(x1 , x2 ).

Let U = h(x1)= h(x1 ,x2 ,x3 , ... , xp)

Then the conditional variance of U given x2 is defined by:

Var[U|x2] = E[(U − E[U|x2])² | x2]


Defn (Covariance, Correlation)

Let x = (x1 ,x2 ,x3 , ... , xn) denote a vector of continuous random variables with joint density function

f(x) = f(x1 ,x2 ,x3 , ... , xn).

Let U = h(x)= h(x1 ,x2 ,x3 , ... , xn) and

V = g(x)=g(x1 ,x2 ,x3 , ... , xn)

Then the covariance of U and V is defined by:

Cov[U, V] = E[(U − E[U])(V − E[V])]

and the correlation of U and V is defined by:

ρUV = Cov[U, V] / √(Var[U] Var[V])


Properties

  • Expectation

  • Variance

  • Covariance

  • Correlation


  • E[a1x1 + a2x2 + a3x3 + ... + anxn]

    = a1E[x1] + a2E[x2] + a3E[x3] + ... + anE[xn]

    or E[a'x] = a'E[x]


  • E[UV] = E[h(x1)g(x2)]

    = E[U]E[V] = E[h(x1)]E[g(x2)]

    if x1 and x2 are independent


  • Var[a1x1 + a2x2 + a3x3 + ... + anxn] = Σi Σj ai aj Cov[xi , xj]

    or Var[a'x] = a′Σa


  • Cov[a1x1 + a2x2 + ... + anxn ,

    b1x1 + b2x2 + ... + bnxn] = Σi Σj ai bj Cov[xi , xj]

    or Cov[a'x, b'x] = a′Σb
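These linear-combination properties can be checked by simulation; the sketch below uses arbitrary example values of μ, Σ, a and b (assumptions made here for illustration).

```python
# Simulation check of E[a'x] = a'E[x], Var[a'x] = a' Sigma a and Cov[a'x, b'x] = a' Sigma b.
import numpy as np

rng = np.random.default_rng(1)
mu = np.array([1.0, -1.0, 0.5])
Sigma = np.array([[2.0, 0.3, 0.1],
                  [0.3, 1.0, 0.4],
                  [0.1, 0.4, 1.5]])
a = np.array([1.0, 2.0, -1.0])
b = np.array([0.5, -1.0, 1.0])

x = rng.multivariate_normal(mu, Sigma, size=500_000)
ax, bx = x @ a, x @ b

print(a @ mu, ax.mean())                     # E[a'x] vs sample mean of a'x
print(a @ Sigma @ a, ax.var())               # Var[a'x] vs sample variance
print(a @ Sigma @ b, np.cov(ax, bx)[0, 1])   # Cov[a'x, b'x] vs sample covariance
```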



Statistical Inference

Making decisions from data


There are two main areas of Statistical Inference:

  • Estimation – deciding on the value of a parameter

    • Point estimation

    • Confidence Interval, Confidence region Estimation

  • Hypothesis testing

    • Deciding if a statement (hypothesis) about a parameter is True or False



The General Statistical Model

Most data fits this situation.


Defn (The Classical Statistical Model)

The data vector

x = (x1 ,x2 ,x3 , ... , xn)

The model

Let f(x|θ) = f(x1 ,x2 , ... , xn|θ1 , θ2 ,... , θp) denote the joint density of the data vector x = (x1 ,x2 ,x3 , ... , xn) of observations, where the unknown parameter vector θ ∈ Ω (a subset of p-dimensional space).


An Example

The data vector

x = (x1 ,x2 ,x3 , ... , xn), a sample from the normal distribution with mean μ and variance σ²

The model

Then f(x|μ , σ²) = f(x1 ,x2 , ... , xn|μ , σ²), the joint density of x = (x1 ,x2 ,x3 , ... , xn), takes on the form:

f(x|μ , σ²) = (2πσ²)^(−n/2) exp[ −(1/(2σ²)) Σi (xi − μ)² ]

where the unknown parameter vector θ = (μ , σ²) ∈ Ω = {(x,y)| −∞ < x < ∞ , 0 ≤ y < ∞}.


Defn (Sufficient Statistics)

Let x have joint density f(x|θ) where the unknown parameter vector θ ∈ Ω.

Then S = (S1(x) ,S2(x) ,S3(x) , ... , Sk(x)) is called a set of sufficient statistics for the parameter vector θ if the conditional distribution of x given S = (S1(x) ,S2(x) ,S3(x) , ... , Sk(x)) is not functionally dependent on the parameter vector θ.

A set of sufficient statistics contains all of the information concerning the unknown parameter vector


A Simple Example illustrating Sufficiency

Suppose that we observe a Success-Failure experiment n = 3 times. Let θ denote the probability of Success. Suppose that the data collected are x1, x2, x3, where xi takes on the value 1 if the ith trial is a Success and 0 if the ith trial is a Failure.


The following table gives the possible values of (x1, x2, x3) together with S = x1 + x2 + x3, the number of Successes.

The data can be generated in two equivalent ways:

  • Generating (x1, x2, x3) directly from f(x1, x2, x3|θ), or

  • Generating S from g(S|θ), then generating (x1, x2, x3) from f(x1, x2, x3|S). Since the second step does not involve θ, no additional information about θ will be obtained by knowing (x1, x2, x3) once S is determined (a small simulation sketch follows below).
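The sketch below simulates the n = 3 Success-Failure experiment for two different values of θ and estimates P[(x1, x2, x3) = (1, 0, 1) | S = 2]; both answers come out near 1/3, illustrating that the conditional distribution given S carries no information about θ. The simulation set-up is an illustration added here, not part of the slides.

```python
# Sufficiency demo: the conditional distribution of (x1, x2, x3) given S does not depend on theta.
import numpy as np

rng = np.random.default_rng(2)

def cond_prob(theta, n_sim=200_000):
    x = (rng.random((n_sim, 3)) < theta).astype(int)   # n_sim replicates of (x1, x2, x3)
    s = x.sum(axis=1)
    given_s2 = x[s == 2]                               # keep replicates with S = 2
    return np.mean(np.all(given_s2 == [1, 0, 1], axis=1))

print(cond_prob(0.3))   # ~0.333
print(cond_prob(0.8))   # ~0.333 -- the same, so no extra information about theta
```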


The Sufficiency Principle

Any decision regarding the parameter θ should be based on a set of Sufficient statistics S1(x), S2(x), ... , Sk(x) and not otherwise on the value of x.


A useful approach in developing a statistical procedure:

  • Find sufficient statistics

  • Develop estimators, tests of hypotheses, etc. using only these statistics


Defn (Minimal Sufficient Statistics)

Let x have joint density f(x|θ) where the unknown parameter vector θ ∈ Ω.

Then S = (S1(x) ,S2(x) ,S3(x) , ... , Sk(x)) is a set of Minimal Sufficient statistics for the parameter vector θ if S = (S1(x) ,S2(x) ,S3(x) , ... , Sk(x)) is a set of Sufficient statistics and can be calculated from any other set of Sufficient statistics.


Theorem (The Factorization Criterion)

Let x have joint density f(x|θ) where the unknown parameter vector θ ∈ Ω.

Then S = (S1(x) ,S2(x) ,S3(x) , ... , Sk(x)) is a set of Sufficient statistics for the parameter vector θ if

f(x|θ) = h(x)g(S,θ)

= h(x)g(S1(x) ,S2(x) ,S3(x) , ... , Sk(x),θ).

This is useful for finding Sufficient statistics

i.e. If you can factor out the θ-dependence with a set of statistics, then these statistics are a set of Sufficient statistics


Defn (Completeness)

Let x have joint density f(x|θ) where the unknown parameter vector θ ∈ Ω.

Then S = (S1(x) ,S2(x) ,S3(x) , ... , Sk(x)) is a set of Complete Sufficient statistics for the parameter vector θ if S = (S1(x) ,S2(x) ,S3(x) , ... , Sk(x)) is a set of Sufficient statistics and whenever

E[φ(S1(x) ,S2(x) ,S3(x) , ... , Sk(x))] = 0 for all θ ∈ Ω

then

P[φ(S1(x) ,S2(x) ,S3(x) , ... , Sk(x)) = 0] = 1


Defn (The Exponential Family)

Let x have joint density f(x|θ) where the unknown parameter vector θ ∈ Ω. Then f(x|θ) is said to be a member of the exponential family of distributions if:

f(x|θ) = h(x) g(θ) exp[ p1(θ)S1(x) + p2(θ)S2(x) + ... + pk(θ)Sk(x) ], ai < xi < bi, θ ∈ Ω, where


1) −∞ < ai < bi < ∞ are not dependent on θ.

2) Ω contains a nondegenerate k-dimensional rectangle.

3) g(θ), ai, bi and pi(θ) are not dependent on x.

4) h(x), ai, bi and Si(x) are not dependent on θ.


If, in addition:

5) The Si(x) are functionally independent for i = 1, 2,..., k.

6) [Si(x)]/ xj exists and is continuous for all i = 1, 2,..., k j = 1, 2,..., n.

7) pi(q) is a continuous function of qfor all i = 1, 2,..., k.

8) R = {[p1(q),p2(q), ...,pK(q)] | qW,} contains nondegenerate k-dimensional rectangle.

Then

the set of statistics S1(x), S2(x), ...,Sk(x) form a Minimal Complete set of Sufficient statistics.


Defn (The Likelihood function)

Let x have joint density f(x|θ) where the unknown parameter vector θ ∈ Ω. Then for a

given value of the observation vector x, the Likelihood function, Lx(θ), is defined by:

Lx(θ) = f(x|θ) with θ ∈ Ω

The log-likelihood function lx(θ) is defined by:

lx(θ) = ln Lx(θ) = ln f(x|θ) with θ ∈ Ω


The Likelihood Principle

Any decision regarding the parameter θ should be based on the likelihood function Lx(θ) and not otherwise on the value of x.

If two data sets result in the same likelihood function the decision regarding q should be the same.


Some statisticians find it useful to plot the likelihood function Lx(θ) given the value of x.

It summarizes the information contained in x regarding the parameter vector θ.


An Example

The data vector

x = (x1 ,x2 ,x3 , ... , xn), a sample from the normal distribution with mean μ and variance σ²

The joint distribution of x

Then f(x|μ , σ²) = f(x1 ,x2 , ... , xn|μ , σ²), the joint density of x = (x1 ,x2 ,x3 , ... , xn), takes on the form:

f(x|μ , σ²) = (2πσ²)^(−n/2) exp[ −(1/(2σ²)) Σi (xi − μ)² ]

where the unknown parameter vector θ = (μ , σ²) ∈ Ω = {(x,y)| −∞ < x < ∞ , 0 ≤ y < ∞}.


The Likelihood function

Assume the data vector is known:

x = (x1 ,x2 ,x3 , ... , xn)

The Likelihood function

Then L(μ , σ) = f(x|μ , σ²) = f(x1 ,x2 , ... , xn|μ , σ²):

L(μ , σ) = (2πσ²)^(−n/2) exp[ −(1/(2σ²)) Σi (xi − μ)² ]


or, equivalently,

L(μ , σ) = (2πσ²)^(−n/2) exp[ −(1/(2σ²)) ( Σi xi² − 2μ Σi xi + nμ² ) ]


Hence

Now consider the following data: (n = 10)


[Surface and contour plots of the likelihood function L(μ, σ) for this data set, with μ and σ on the axes.]
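For readers who want to reproduce plots of this kind, the sketch below draws likelihood contours for a Normal sample over a (μ, σ) grid; the ten data values are invented placeholders, since the original data table was not recoverable from the slides.

```python
# Contour plot of the Normal likelihood L(mu, sigma) for a sample of size n = 10.
# The data values are hypothetical stand-ins for the original (lost) data.
import numpy as np
import matplotlib.pyplot as plt

x = np.array([28.0, 31.5, 35.2, 29.8, 33.1, 30.4, 36.0, 27.5, 32.2, 34.3])
n = len(x)

mu_grid = np.linspace(20, 45, 200)
sigma_grid = np.linspace(1, 15, 200)
MU, SIGMA = np.meshgrid(mu_grid, sigma_grid)

# log-likelihood of the Normal sample on the (mu, sigma) grid
loglik = (-n / 2) * np.log(2 * np.pi * SIGMA**2) \
         - ((x[None, None, :] - MU[..., None]) ** 2).sum(axis=-1) / (2 * SIGMA**2)

plt.contour(MU, SIGMA, np.exp(loglik - loglik.max()), levels=10)
plt.xlabel("mu"); plt.ylabel("sigma"); plt.title("Likelihood L(mu, sigma)")
plt.show()
```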


The Sufficiency Principle

Any decision regarding the parameter θ should be based on a set of Sufficient statistics S1(x), S2(x), ... , Sk(x) and not otherwise on the value of x.

If two data sets result in the same values for the set of Sufficient statistics the decision regarding q should be the same.


Theorem (Birnbaum - Equivalency of the Likelihood Principle and Sufficiency Principle)

Lx1(q) Lx2(q)

if and only if

S1(x1) = S1(x2),..., and Sk(x1) = Sk(x2)


The following table gives the possible values of (x1, x2, x3) and the corresponding values of

The Likelihood function



Estimation Theory

Point Estimation


Defn (Estimator)

Let x = (x1 ,x2 ,x3 , ... , xn) denote the vector of observations having joint density f(x|θ) where the unknown parameter vector θ ∈ Ω.

Then an estimator of the parameter φ(θ) = φ(θ1 ,θ2 , ... , θk) is any function T(x) = T(x1 ,x2 ,x3 , ... , xn) of the observation vector.


Defn (Mean Square Error)

Let x = (x1 ,x2 ,x3 , ... , xn) denote the vector of observations having joint density f(x|θ) where the unknown parameter vector θ ∈ Ω.

Let T(x) be an estimator of the parameter

φ(θ). Then the Mean Square Error of T(x) is defined to be:

MSEθ[T(x)] = E[(T(x) − φ(θ))²]


Defn (Uniformly Better)

Let x = (x1 ,x2 ,x3 , ... , xn) denote the vector of observations having joint density f(x|θ) where the unknown parameter vector θ ∈ Ω.

Let T(x) and T*(x) be estimators of the parameter φ(θ). Then T(x) is said to be uniformly better than T*(x) if:

MSEθ[T(x)] ≤ MSEθ[T*(x)] for all θ ∈ Ω


Defn (Unbiased)

Let x = (x1 ,x2 ,x3 , ... , xn) denote the vector of observations having joint density f(x|θ) where the unknown parameter vector θ ∈ Ω.

Let T(x) be an estimator of the parameter φ(θ). Then T(x) is said to be an unbiased estimator of the parameter φ(θ) if:

E[T(x)] = φ(θ) for all θ ∈ Ω


Theorem (Cramér-Rao Lower Bound)

Let x = (x1 ,x2 ,x3 , ... , xn) denote the vector of observations having joint density f(x|θ) where the unknown parameter vector θ ∈ Ω. Suppose that:

i) ∂ ln f(x|θ)/∂θi exists for all x and for all θ ∈ Ω.

ii)

iii)

iv)


Let M denote the p × p matrix with ijth element

Mij = E[ (∂ ln f(x|θ)/∂θi)(∂ ln f(x|θ)/∂θj) ].

Then V = M⁻¹ is the lower bound for the covariance matrix of unbiased estimators of θ.

That is, var(c′T(x)) = c′var(T(x))c ≥ c′M⁻¹c = c′Vc, where T(x) is a vector of unbiased estimators of θ.
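As a concrete illustration (an example chosen here, not taken from the slides): for a Normal sample with known σ, the information for μ is n/σ², so the Cramér-Rao bound for unbiased estimators of μ is σ²/n; the simulation below shows the sample mean attaining it.

```python
# Monte Carlo check of the Cramer-Rao bound for the mean of a Normal sample with known sigma.
import numpy as np

rng = np.random.default_rng(3)
mu, sigma, n = 5.0, 2.0, 25

samples = rng.normal(mu, sigma, size=(100_000, n))
xbar = samples.mean(axis=1)                      # sample mean is unbiased for mu

print("Cramer-Rao bound sigma^2/n :", sigma**2 / n)
print("variance of the sample mean:", xbar.var())   # ~ equal: the bound is attained
```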


Defn (Uniformly Minimum Variance Unbiased Estimator)

Let x = (x1 ,x2 ,x3 , ... , xn) denote the vector of observations having joint density f(x|θ) where the unknown parameter vector θ ∈ Ω. Then T*(x) is said to be the UMVU (Uniformly Minimum Variance Unbiased) estimator of φ(θ) if:

1) E[T*(x)] = φ(θ) for all θ ∈ Ω.

2) Var[T*(x)] ≤ Var[T(x)] for all θ ∈ Ω whenever E[T(x)] = φ(θ).


Theorem (Rao-Blackwell)

Let x = (x1 ,x2 ,x3 , ... , xn) denote the vector of observations having joint density f(x|θ) where the unknown parameter vector θ ∈ Ω.

Let S1(x), S2(x), ... , Sk(x) denote a set of sufficient statistics.

Let T(x) be any unbiased estimator of φ(θ).

Then T*[S1(x), S2(x), ... , Sk(x)] = E[T(x)|S1(x), S2(x), ... , Sk(x)] is an unbiased estimator of φ(θ) such that:

Var[T*(S1(x), S2(x), ... , Sk(x))] ≤ Var[T(x)]

for all θ ∈ Ω.


Theorem (Lehmann-Scheffé)

Let x = (x1 ,x2 ,x3 , ... , xn) denote the vector of observations having joint density f(x|θ) where the unknown parameter vector θ ∈ Ω.

Let S1(x), S2(x), ... , Sk(x) denote a set of complete sufficient statistics.

Let T*[S1(x), S2(x), ... , Sk(x)] be an unbiased estimator of φ(θ). Then:

T*(S1(x), S2(x), ... , Sk(x)) is the UMVU estimator of φ(θ).


Defn (Consistency)

Let x = (x1 ,x2 ,x3 , ... , xn) denote the vector of observations having joint density f(x|θ) where the unknown parameter vector θ ∈ Ω. Let Tn(x) be an estimator of φ(θ). Then Tn(x) is called a consistent estimator of φ(θ) if for any ε > 0:

lim n→∞ P[|Tn(x) − φ(θ)| > ε] = 0


Defn (M.S.E. Consistency)

Let x = (x1 ,x2 ,x3 , ... , xn) denote the vector of observations having joint density f(x|θ) where the unknown parameter vector θ ∈ Ω. Let Tn(x) be an estimator of φ(θ). Then Tn(x) is called a M.S.E. consistent estimator of φ(θ) if:

lim n→∞ E[(Tn(x) − φ(θ))²] = 0



Methods for Finding Estimators

The Method of Moments

Maximum Likelihood Estimation


Methods for Finding Estimators

  • Method of Moments

  • Maximum Likelihood Estimation


Method of Moments

Let x1, … , xn denote a sample from the density function

f(x; θ1, … , θp) = f(x; θ)

The kth moment of the distribution being sampled is defined to be:

μk = μk(θ1, … , θp) = E[x^k] = ∫ x^k f(x; θ) dx

The kth sample moment is defined to be:

mk = (1/n) Σi xi^k

To find the method of moments estimators of θ1, … , θp we set up the equations:

μk(θ1, … , θp) = mk,  k = 1, 2, … , p


We then solve the equations

for θ1, … , θp.

The solutions

are called the method of moments estimators
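A small worked sketch of the method of moments for the Normal(μ, σ²) model, used here purely as an example: the first two theoretical moments are E[x] = μ and E[x²] = σ² + μ², so equating them to the sample moments gives the estimates m1 for μ and m2 − m1² for σ².

```python
# Method of moments for a Normal(mu, sigma^2) sample (illustrative example, simulated data).
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(loc=10.0, scale=3.0, size=1_000)   # data with known true mu = 10, sigma^2 = 9

m1 = np.mean(x)          # first sample moment
m2 = np.mean(x**2)       # second sample moment

mu_hat = m1              # solve mu = m1
sigma2_hat = m2 - m1**2  # solve sigma^2 + mu^2 = m2

print(mu_hat, sigma2_hat)   # should be near 10 and 9
```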


The Method of Maximum Likelihood

Suppose that the data x1, … , xn has joint density function

f(x1, … , xn; θ1, … , θp)

where θ = (θ1, … , θp) are unknown parameters assumed to lie in Ω (a subset of p-dimensional space).

We want to estimate the parameters θ1, … , θp


Definition: Maximum Likelihood Estimation

Suppose that the data x1, … , xn has joint density function

f(x1, … , xn; θ1, … , θp)

Then the Likelihood function is defined to be

L(θ) = L(θ1, … , θp)

= f(x1, … , xn; θ1, … , θp)

the Maximum Likelihood estimators of the parameters θ1, … , θp are the values that maximize

L(θ) = L(θ1, … , θp)


the Maximum Likelihood estimators of the parameters θ1, … , θp are the values

such that L takes on its maximum value over θ ∈ Ω.

Note: maximizing L(θ1, … , θp)

is equivalent to maximizing

the log-likelihood function

l(θ1, … , θp) = ln L(θ1, … , θp)
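The sketch below maximizes the Normal log-likelihood numerically (with simulated data, chosen here only for illustration) and confirms that the answer matches the closed-form maximum likelihood estimates, the sample mean and (1/n)Σ(xi − x̄)².

```python
# Numerical maximization of the Normal log-likelihood vs. the closed-form MLEs.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(5)
x = rng.normal(loc=4.0, scale=1.5, size=500)
n = len(x)

def neg_loglik(params):
    mu, log_sigma = params                 # parameterize sigma on the log scale
    sigma = np.exp(log_sigma)
    return n * np.log(sigma) + np.sum((x - mu)**2) / (2 * sigma**2)

res = minimize(neg_loglik, x0=[0.0, 0.0])
mu_hat, sigma_hat = res.x[0], np.exp(res.x[1])

print(mu_hat, sigma_hat**2)                      # numerical MLEs
print(x.mean(), ((x - x.mean())**2).mean())      # closed-form MLEs agree
```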



Application

The General Linear Model


Consider the random variable Y with

1. E[Y] = g(U1 ,U2 , ... , Uk)

= β1f1(U1 ,U2 , ... , Uk) + β2f2(U1 ,U2 , ... , Uk) + ... + βpfp(U1 ,U2 , ... , Uk)

= Σj βj fj(U1 ,U2 , ... , Uk)

and

2. var(Y) = σ²

  • where β1, β2 , ... , βp are unknown parameters

  • and f1 ,f2 , ... , fp are known functions of the nonrandom variables U1 ,U2 , ... , Uk.

  • Assume further that Y is normally distributed.


Thus the density of Y is:

f(Y|β1, β2 , ... , βp, σ²) = f(Y|β, σ²)

= (2πσ²)^(−1/2) exp[ −(1/(2σ²)) ( Y − Σj βj fj(U1 ,U2 , ... , Uk) )² ]


Now suppose that n independent observations of Y,

(y1, y2, ..., yn) are made

corresponding to n sets of values of (U1 ,U2 , ... , Uk) - (u11 ,u12 , ... , u1k),

(u21 ,u22 , ... , u2k),

...

(un1 ,un2 , ... , unk).

Let xij = fj(ui1 ,ui2 , ... , uik) j =1, 2, ..., p; i =1, 2, ..., n.

Then the joint density of y = (y1, y2, ... yn) is:

f(y1, y2, ..., yn|β1, β2 , ... , βp, σ²) = f(y|β, σ²)

= (2πσ²)^(−n/2) exp[ −(1/(2σ²)) ( y′y − 2β′X′y + β′X′Xβ ) ]

where X is the n × p matrix with entries xij.


Thus f(y|β, σ²) is a member of the exponential family of distributions

and S = (y'y, X'y) is a Minimal Complete set of Sufficient Statistics.
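A short sketch of this model in practice, with simulated data: the least squares / maximum likelihood estimate of β is computed from X′y alone (X being fixed and known), and the error-variance estimate additionally uses y′y, so the estimators depend on the responses only through the sufficient statistics S = (y′y, X′y). The specific X, β and noise level are made up for illustration.

```python
# Estimation in the general linear model using only the sufficient statistics y'y and X'y.
import numpy as np

rng = np.random.default_rng(6)
n, p = 50, 3
X = rng.normal(size=(n, p))                     # xij = fj(ui1, ..., uik), treated as fixed
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + rng.normal(scale=0.8, size=n)

Xty = X.T @ y                                   # sufficient statistic X'y
yty = y @ y                                     # sufficient statistic y'y

beta_hat = np.linalg.solve(X.T @ X, Xty)        # depends on y only through X'y
sigma2_hat = (yty - 2 * beta_hat @ Xty + beta_hat @ X.T @ X @ beta_hat) / n

print(beta_hat, sigma2_hat)
```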



Hypothesis Testing


Defn (Test of size α)

Let x = (x1 ,x2 ,x3 , ... , xn) denote the vector of observations having joint density f(x|θ) where the unknown parameter vector θ ∈ Ω.

Let ω be any subset of Ω.

Consider testing the Null Hypothesis

H0: θ ∈ ω

against the alternative hypothesis

H1: θ ∉ ω.


Let A denote the acceptance region for the test (all values x = (x1 ,x2 ,x3 , ... , xn) such that the decision to accept H0 is made)

and let C denote the critical region for the test (all values x = (x1 ,x2 ,x3 , ... , xn) such that the decision to reject H0 is made).

Then the test is said to be of size α if

max over θ ∈ ω of P[x ∈ C | θ] = α


Defn (Power)

Let x = (x1 ,x2 ,x3 , ... , xn) denote the vector of observations having joint density f(x|θ) where the unknown parameter vector θ ∈ Ω.

Consider testing the Null Hypothesis

H0: θ ∈ ω

against the alternative hypothesis

H1: θ ∉ ω,

where ω is any subset of Ω. Then the Power of the test for θ ∉ ω is defined to be:

Power(θ) = P[x ∈ C | θ]


Defn (Uniformly Most Powerful (UMP) test of size α)

Let x = (x1 ,x2 ,x3 , ... , xn) denote the vector of observations having joint density f(x|θ) where the unknown parameter vector θ ∈ Ω.

Consider testing the Null Hypothesis

H0: θ ∈ ω

against the alternative hypothesis

H1: θ ∉ ω,

where ω is any subset of Ω.

Let C denote the critical region for the test. Then the test is called the UMP test of size α if:

max over θ ∈ ω of P[x ∈ C | θ] = α




and for any other critical region C* such that

max over θ ∈ ω of P[x ∈ C* | θ] ≤ α,

then

P[x ∈ C | θ] ≥ P[x ∈ C* | θ] for all θ ∉ ω.


Theorem (Neyman-Pearson Lemma)

Let x = (x1 ,x2 ,x3 , ... , xn) denote the vector of observations having joint density f(x|θ) where the unknown parameter vector θ ∈ Ω = {θ0, θ1}.

Consider testing the Null Hypothesis

H0: θ = θ0

against the alternative hypothesis

H1: θ = θ1.

Then the UMP test of size α has critical region:

C = { x : f(x|θ1) / f(x|θ0) ≥ K }

where K is chosen so that

P[x ∈ C | θ0] = α
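A standard worked case of the lemma (the particular numbers are made up here): for a Normal sample with known σ, testing H0: μ = μ0 against H1: μ = μ1 with μ1 > μ0, the likelihood ratio is increasing in the sample mean, so the UMP size-α test rejects when the sample mean exceeds a cutoff chosen to give P[reject | μ0] = α.

```python
# Neyman-Pearson test for a Normal mean with known sigma (illustrative example).
import numpy as np
from scipy.stats import norm

mu0, mu1, sigma, n, alpha = 0.0, 1.0, 2.0, 16, 0.05

cutoff = mu0 + norm.ppf(1 - alpha) * sigma / np.sqrt(n)   # reject H0 when xbar > cutoff
print("reject H0 when xbar >", cutoff)

# size check by simulation under H0
rng = np.random.default_rng(7)
xbar = rng.normal(mu0, sigma, size=(100_000, n)).mean(axis=1)
print("simulated size:", np.mean(xbar > cutoff))          # ~0.05
```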


Defn (Likelihood Ratio Test of size α)

Let x = (x1 ,x2 ,x3 , ... , xn) denote the vector of observations having joint density f(x|θ) where the unknown parameter vector θ ∈ Ω.

Consider testing the Null Hypothesis

H0: θ ∈ ω

against the alternative hypothesis

H1: θ ∉ ω,

where ω is any subset of Ω.

Then the Likelihood Ratio (LR) test of size α has critical region:

C = { x : λ(x) = [max over θ ∈ ω of Lx(θ)] / [max over θ ∈ Ω of Lx(θ)] ≤ K }

where K is chosen so that

max over θ ∈ ω of P[x ∈ C | θ] = α


Theorem (Asymptotic distribution of Likelihood ratio test criterion)

Let x = (x1 ,x2 ,x3 , ... , xn) denote the vector of observations having joint density f(x|θ) where the unknown parameter vector θ ∈ Ω.

Consider testing the Null Hypothesis

H0: θ ∈ ω

against the alternative hypothesis

H1: θ ∉ ω,

where ω is any subset of Ω.

Then, under proper regularity conditions, U = −2 ln λ(x) possesses an asymptotic Chi-square distribution with degrees of freedom equal to the difference between the number of independent parameters in Ω and ω.
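A sketch of this asymptotic result for one familiar case (chosen here as an illustration): testing H0: μ = μ0 in the Normal model with σ unknown, −2 ln λ(x) reduces to n times the log of the ratio of the restricted to the unrestricted variance estimates, and it is compared with a Chi-square critical value with 1 degree of freedom (2 free parameters in Ω, 1 in ω).

```python
# Asymptotic likelihood ratio test of H0: mu = mu0 in a Normal model with unknown sigma.
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(8)
x = rng.normal(loc=0.3, scale=1.0, size=40)   # simulated data for illustration
n, mu0 = len(x), 0.0

# variance MLEs under omega (mu fixed at mu0) and under Omega (mu free)
s2_omega = np.mean((x - mu0) ** 2)
s2_Omega = np.mean((x - x.mean()) ** 2)

U = n * np.log(s2_omega / s2_Omega)           # -2 ln lambda for this model
print("U =", U, " chi-square(1) 5% critical value =", chi2.ppf(0.95, df=1))
```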

