Parametric Inference


### Parametric Inference

Sample distance between θ and θ* (average log-likelihood ratio):

M_n(θ) = (1/n) Σ_i log [ f(X_i; θ) / f(X_i; θ*) ]

True distance between θ and θ* (negative KLD):

M(θ) = -D(θ*, θ), where D(θ*, θ) = ∫ f(x; θ*) log [ f(x; θ*) / f(x; θ) ] dx

Properties of MLE
• Consistency

True parameter: θ*

MLE using n samples: θ̂_n = argmax_θ Σ_i log f(X_i; θ)

Define M_n(θ) = (1/n) Σ_i log [ f(X_i; θ) / f(X_i; θ*) ] and M(θ) = -D(θ*, θ)

Condition 1: sup_θ | M_n(θ) - M(θ) | → 0 in probability

Condition 2: for every ε > 0, sup_{θ: |θ - θ*| ≥ ε} M(θ) < M(θ*)

Under these two conditions, θ̂_n → θ* in probability.

Condition 1: asymptotic convergence of the sample distance to the true distance, uniformly over parameter values

Condition 2: the model is identifiable (θ* is a well-separated maximizer of M)
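The consistency property above can be checked numerically. A minimal sketch, assuming a Bernoulli(p*) model, where the MLE is the sample mean; the names `p_star` and `mle_bernoulli` are illustrative, not from the slides:

```python
import random

random.seed(0)
p_star = 0.3  # true parameter theta*

def mle_bernoulli(xs):
    # For IID Bernoulli data the MLE of p is the sample mean.
    return sum(xs) / len(xs)

# As n grows, the MLE should concentrate around the true parameter,
# illustrating consistency (convergence in probability to theta*).
for n in [10, 100, 10_000]:
    xs = [1 if random.random() < p_star else 0 for _ in range(n)]
    print(n, abs(mle_bernoulli(xs) - p_star))
```

The absolute error typically shrinks at the usual 1/sqrt(n) rate.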

Properties of MLE
• Equivariance
• If θ̂_n is the MLE of θ and η = g(θ), then g(θ̂_n) is the MLE of η.

Condition: g is invertible (see proof)

- g is one-to-one and onto

Properties of MLE
• Asymptotic normality
• (θ̂_n - θ) / se ⇝ N(0, 1), i.e. θ̂_n ≈ N(θ, se²)

True standard error: se = sqrt( 1 / I_n(θ) )

Approximate standard error: ŝe = sqrt( 1 / I_n(θ̂_n) )

I_n(θ): Fisher information at the true parameter value θ

I_n(θ̂_n): Fisher information at the MLE parameter value θ̂_n

Fisher information

Define the score function: s(X; θ) = ∂/∂θ log f(X; θ)

Rate of change of the log likelihood of X w.r.t. the parameter θ

Fisher information at θ: I_n(θ) = Var_θ [ Σ_i s(X_i; θ) ] = n I(θ), where I(θ) = E_θ[ s(X; θ)² ] = -E_θ[ ∂²/∂θ² log f(X; θ) ]

A measure of the information carried by n IID data points X1, X2, …, Xn about the model parameter θ

Fact (Cramér-Rao bound): if θ̂ is any unbiased estimator of θ, then Var(θ̂) ≥ 1 / I_n(θ)

A lower bound on the variance of any unbiased estimator of θ
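A quick simulation can illustrate the Cramér-Rao bound. A minimal sketch, assuming a Poisson(λ) model, for which the per-observation Fisher information is I(λ) = 1/λ, so the bound for n IID draws is λ/n; the sample mean is unbiased and attains it. The sampler `poisson_draw` (Knuth's method) and all names are illustrative:

```python
import random
import math

random.seed(1)

def poisson_draw(lam):
    # Knuth's multiplication algorithm for a Poisson draw (fine for small lam).
    L = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        k += 1
        p *= random.random()
        if p <= L:
            return k - 1

lam, n, reps = 4.0, 50, 2000
# MLE of lam is the sample mean; simulate its sampling distribution.
mles = [sum(poisson_draw(lam) for _ in range(n)) / n for _ in range(reps)]
mean = sum(mles) / reps
var = sum((m - mean) ** 2 for m in mles) / reps

print("empirical variance of the MLE:", var)
print("Cramér-Rao bound lam / n:     ", lam / n)
```

The two printed numbers should be close, since the sample mean achieves the bound for this model.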

Parametric Bootstrap

If τ is any statistic of X1, X2, …, Xn, estimate its variability from B bootstrap replicates τ_1, …, τ_B.

Nonparametric bootstrap

Each τ_b is computed using a sample X_{b,1}, X_{b,2}, …, X_{b,n} ~ F̂_n (the empirical distribution)

Parametric bootstrap

Each τ_b is computed using a sample X_{b,1}, X_{b,2}, …, X_{b,n} ~ f(x; θ̂) (the parametric distribution fitted by MLE or the method of moments)
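The parametric bootstrap above can be sketched in a few lines. A minimal sketch, assuming an Exponential(rate) model whose MLE is n / Σx_i; each τ_b is computed from a fresh sample drawn from the *fitted* parametric distribution (the nonparametric bootstrap would instead resample the observed data with replacement). All names are illustrative:

```python
import random
import math

random.seed(2)

def tau(xs):
    # Statistic of interest: the MLE of the exponential rate.
    return len(xs) / sum(xs)

# Observed data from Exponential(rate = 2.0).
data = [random.expovariate(2.0) for _ in range(200)]
rate_hat = tau(data)  # fitted parameter theta-hat

B = 1000
taus = []
for _ in range(B):
    # Parametric bootstrap sample: draw from the fitted model f(x; theta-hat).
    xs = [random.expovariate(rate_hat) for _ in range(len(data))]
    taus.append(tau(xs))

mean_tau = sum(taus) / B
se_boot = math.sqrt(sum((t - mean_tau) ** 2 for t in taus) / (B - 1))
print("bootstrap standard error:", se_boot)
```

For this model the bootstrap standard error should come out near rate_hat / sqrt(n), matching the asymptotic formula from the Fisher information.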

Sufficient statistic
• Any function of the data xn = (X1, X2, …, Xn): T(X1, X2, …, Xn) is a statistic
• Definition 1:
• T is sufficient for θ: T(xn) = T(yn) implies L(θ; xn) ∝ L(θ; yn)

The likelihood functions for data sets xn and yn have the same shape (they differ only by a constant factor)

Recall that a likelihood function is specific to an observed data set xn!

Sufficient statistic
• Intuitively, T is the connecting link between the data and the likelihood
• A sufficient statistic is not unique
• For example, xn itself and T(xn) are both sufficient statistics
Sufficient statistic
• Definition 2:
• T is sufficient for θ: f(xn | T(xn) = t; θ) does not depend on θ
• Factorization theorem
• T is sufficient for θ if and only if f(xn; θ) = g(T(xn), θ) h(xn)

The distribution of xn is conditionally independent of θ given T

This implies the first definition of sufficient statistic

Sufficient statistic
• Minimal sufficient:
• a sufficient statistic
• a function of every other sufficient statistic
• T is minimal sufficient if T(xn) = T(yn) ⇔ L(θ; xn) ∝ L(θ; yn)
• Recall T is sufficient if T(xn) = T(yn) ⇒ L(θ; xn) ∝ L(θ; yn)
Sufficient statistic
• Rao-Blackwell theorem
• An estimator of θ should depend only on the sufficient statistic T; otherwise it can be improved by conditioning on T.
• Exponential family of distributions

one parameter θ: f(x; θ) = h(x) exp{ η(θ) T(x) - B(θ) }

multiple parameters: f(x; θ) = h(x) exp{ Σ_j η_j(θ) T_j(x) - B(θ) }

Sufficient statistic
• Exponential family
• n IID random variables X1, X2, …, Xn have joint distribution f(xn; θ) = [ Π_i h(x_i) ] exp{ η(θ) Σ_i T(x_i) - n B(θ) }
• Examples include the Normal, Binomial, and Poisson families.

The joint distribution is also exponential family.

T(xn) = Σ_i T(x_i) is a sufficient statistic (by the factorization theorem)
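The "same shape" characterization of sufficiency is easy to verify numerically. A minimal sketch, assuming IID Poisson data, where the factorization theorem gives f(xn; λ) = exp(-nλ) λ^Σx / Π x_i!, so the likelihood depends on the data only through T = Σx_i; two data sets with the same sum should have proportional likelihoods (the helper name is illustrative):

```python
import math

def poisson_loglik(lam, xs):
    # Log likelihood of IID Poisson(lam) data.
    return sum(-lam + x * math.log(lam) - math.lgamma(x + 1) for x in xs)

xn = [1, 2, 3]   # T(xn) = 6
yn = [0, 2, 4]   # T(yn) = 6 as well

# If T is sufficient, the log-likelihood difference is constant in lam:
# the two likelihood functions differ only by a factor free of the parameter.
diffs = [poisson_loglik(l, xn) - poisson_loglik(l, yn) for l in (0.5, 1.0, 3.0)]
print(diffs)
```

All three differences coincide, confirming the likelihoods have the same shape.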

Iterative MLE
• Start with an initial guess for the parameter(s). Obtain improved estimates in subsequent iterations until convergence.
• The initial parameter value could come from the method of moments estimator.
• Newton-Raphson
• An iterative technique to find a local root of a function.
• Finding the MLE is equivalent to finding a root of the derivative of the log likelihood function.

Each update takes the estimate closer to the MLE.

Newton-Raphson
• Taylor series expansion of ℓ′(θ) around the current parameter estimate θ̂_j:

ℓ′(θ) ≈ ℓ′(θ̂_j) + ℓ″(θ̂_j)(θ - θ̂_j)

For the MLE, ℓ′(θ̂) = 0.

Solving for θ, θ̂_{j+1} = θ̂_j - ℓ′(θ̂_j) / ℓ″(θ̂_j)

Multi-parameter case: θ̂_{j+1} = θ̂_j - H⁻¹ ∇ℓ(θ̂_j)

where ∇ℓ is the gradient and H is the Hessian matrix of the log likelihood at θ̂_j
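The update rule above can be sketched directly. A minimal sketch, assuming an Exponential(λ) model, where ℓ′(λ) = n/λ - Σx and ℓ″(λ) = -n/λ²; since the exact MLE is n / Σx, we can check that the iteration converges to it. The starting value and stopping rule are illustrative:

```python
import random

random.seed(3)
data = [random.expovariate(2.0) for _ in range(500)]
n, S = len(data), sum(data)

lam = 0.5  # initial guess for the rate parameter
for _ in range(50):
    score = n / lam - S       # l'(lam), the score
    hess = -n / lam ** 2      # l''(lam)
    step = score / hess
    lam -= step               # Newton update: lam_{j+1} = lam_j - l'/l''
    if abs(step) < 1e-12:     # stop once the update is negligible
        break

print("Newton-Raphson MLE:", lam)
print("closed-form MLE:   ", n / S)
```

The two values agree to machine precision, and convergence is quadratic near the root.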

Newton-Raphson

[Figure: ℓ′(θ) plotted against θ; the slope of the tangent at each current estimate determines the next step, which moves toward the MLE, where ℓ′(θ) = 0]

Expectation Maximization
• An iterative MLE technique used in missing data problems.
• Sometimes introducing missing data simplifies maximization of the log likelihood.
• Two log likelihoods (complete data and incomplete data)
• Two main steps:
• Compute the expectation of the complete-data log likelihood using the current parameters.
• Maximize the above over the parameter space to obtain new parameters.
Expectation Maximization

Incomplete data log likelihood: ℓ(θ; x) = log f(x; θ)

Complete data log likelihood: ℓ(θ; x, z) = log f(x, z; θ), where z is the missing data

Expected (complete data) log likelihood: Q(θ | θ̂_j) = E[ log f(x, Z; θ) | x, θ̂_j ], where the current estimate θ̂_j is held constant and θ is the variable being optimized

Expectation Maximization

Algorithm

Start with an initial guess θ̂_0 of the parameter value(s). Repeat steps 1 and 2 below for j = 0, 1, 2, ….

1. Expectation: Compute Q(θ | θ̂_j) = E[ log f(x, Z; θ) | x, θ̂_j ]

2. Maximization: Update the parameters by maximizing the above expectation over the parameter space: θ̂_{j+1} = argmax_θ Q(θ | θ̂_j)
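The two steps above can be sketched for the classic missing-data example. A minimal sketch, assuming a two-component Gaussian mixture with known unit variances, estimating the two means and the mixing weight; the component labels are the missing data, the E-step computes their posterior probabilities (responsibilities), and the M-step maximizes the expected complete-data log likelihood in closed form. All names and starting values are illustrative:

```python
import math
import random

random.seed(4)

def phi(x, mu):
    # Normal density with mean mu and sigma = 1.
    return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2 * math.pi)

# Simulate data from the mixture 0.4 * N(-2, 1) + 0.6 * N(3, 1).
data = [random.gauss(-2, 1) if random.random() < 0.4 else random.gauss(3, 1)
        for _ in range(1000)]

pi_, mu1, mu2 = 0.5, -1.0, 1.0  # initial guesses theta-hat_0
for _ in range(100):
    # E-step: responsibility of component 1 for each point,
    # i.e. E[label | data, current parameters].
    r = [pi_ * phi(x, mu1) / (pi_ * phi(x, mu1) + (1 - pi_) * phi(x, mu2))
         for x in data]
    # M-step: closed-form maximizers of the expected log likelihood.
    R = sum(r)
    pi_ = R / len(data)
    mu1 = sum(ri * x for ri, x in zip(r, data)) / R
    mu2 = sum((1 - ri) * x for ri, x in zip(r, data)) / (len(data) - R)

print(round(pi_, 2), round(mu1, 2), round(mu2, 2))  # near 0.4, -2, 3
```

Each sweep performs one E-step and one M-step, and the incomplete-data log likelihood is non-decreasing across sweeps, as the Fact below states.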

Expectation Maximization
• Fact: Q(θ̂_{j+1} | θ̂_j) ≥ Q(θ̂_j | θ̂_j)

OR, equivalently, ℓ(θ̂_{j+1}; x) ≥ ℓ(θ̂_j; x)

The incomplete data log likelihood increases at every iteration!

A local maximum of the likelihood (often the MLE) can be reached after a sufficient number of iterations.