
Chap 8: Estimation of Parameters & Fitting of Probability Distributions

Section 8.1: Introduction

The values of unknown parameter(s) must be estimated before fitting probability laws to data.


Section 8.2: Fitting the Poisson Distribution to Emissions of Alpha Particles (classical example)

Recall: the probability mass function of a Poisson random variable $X$ with parameter $\lambda$ is

$$P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}, \qquad k = 0, 1, 2, \ldots$$

From the observed data, we must estimate a value for the parameter $\lambda$.

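For the Poisson model, both the method of moments and maximum likelihood lead to the same estimate, the sample mean. A minimal sketch, using made-up interval counts rather than the alpha-particle data from the text:

```python
import numpy as np

# Hypothetical counts of emissions per 10-second interval (illustrative only,
# not the textbook's alpha-particle data).
counts = np.array([0, 2, 3, 1, 4, 2, 2, 5, 3, 1, 0, 2, 3, 4, 2])

# For a Poisson model, both MOM and MLE give lambda-hat = sample mean.
lambda_hat = counts.mean()
print(f"estimated Poisson rate: {lambda_hat:.3f}")
```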

What if the experiment is repeated?

The estimate of $\lambda$ will be viewed as a random variable which has a probability dist'n referred to as its sampling distribution.

The spread of the sampling distribution reflects the variability of the estimate.
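A minimal simulation sketch of this idea (the true rate, sample size, and number of repetitions below are assumed values for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
true_lambda, n, reps = 2.5, 100, 5000   # assumed values for illustration

# Repeat the "experiment" many times: draw n Poisson counts, estimate lambda.
estimates = np.array([rng.poisson(true_lambda, n).mean() for _ in range(reps)])

print(f"mean of the estimates:   {estimates.mean():.3f}")   # close to true_lambda
print(f"spread (standard error): {estimates.std(ddof=1):.3f}")
```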

Chap 8 is about fitting the model to data.

Chap 9 will deal with testing such a fit.


Assessing Goodness of Fit (GOF):

Example: fit a Poisson dist'n to counts (p. 240).

Informally, GOF is assessed by comparing the observed (O) and expected (E) counts, grouped (at least 5 in each) into 16 cells.

Formally, a measure of discrepancy such as Pearson's chi-square statistic,

$$X^2 = \sum_{\text{cells}} \frac{(O - E)^2}{E},$$

is used to quantify the comparison of the O and E counts. In this example, the statistic is computed over the 16 grouped cells.

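A minimal sketch of the computation with hypothetical grouped counts (not the textbook data on p. 240); the expected counts come from the fitted Poisson model, with the last cell collecting the upper tail:

```python
import numpy as np
from scipy.stats import poisson

# Hypothetical grouped counts for k = 0, 1, ..., 14 and a final ">= 15" cell.
observed = np.array([2, 7, 21, 41, 61, 72, 70, 59, 43, 27, 16, 9, 5, 3, 2, 2])
n = observed.sum()
lam = (np.arange(16) * observed).sum() / n   # rough lambda-hat from grouped data

# Expected counts: Poisson probabilities for 0..14, tail probability for ">= 15".
probs = poisson.pmf(np.arange(15), lam)
probs = np.append(probs, 1.0 - probs.sum())
expected = n * probs

chi_sq = ((observed - expected) ** 2 / expected).sum()
print(f"Pearson chi-square: {chi_sq:.2f}")
```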

Null dist’n:

$X^2$ is a random variable (a function of the random counts) whose probability dist'n is called its null distribution. It can be shown that the null dist'n of $X^2$ is approximately the chi-square dist'n with degrees of freedom df = (no. of cells) − (no. of independent parameters fitted) − 1.

Here: df = 16 (cells) − 1 (the parameter $\lambda$) − 1 = 14.

The larger the value of $X^2$, the worse the fit.


p-value:

Figure 8.1 on page 242 gives a good intuition for what a p-value is. The p-value measures the degree of evidence against the statement "the model fits the data well", i.e. that the Poisson is the true model.

The smaller the p-value, the worse the fit, i.e. the stronger the evidence against the model.

A small p-value therefore means rejecting the null, i.e. concluding that the model does NOT fit the data well.

How small is small?

Reject when p-value ≤ α,

where α is the significance level of the test.

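A minimal sketch of the p-value calculation, using df = 14 from above and a placeholder value for the statistic:

```python
from scipy.stats import chi2

df = 16 - 1 - 1        # 16 cells, one fitted parameter (lambda), minus 1
chi_sq = 10.42         # placeholder value of the statistic, for illustration

p_value = chi2.sf(chi_sq, df)    # upper-tail probability under the null dist'n
print(f"p-value: {p_value:.3f}")

alpha = 0.05
print("reject the null" if p_value <= alpha else "fail to reject the null")
```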

8.3: Parameter Estimation: MOM & MLE

Let the observed data be a random sample, i.e. a sequence $X_1, \ldots, X_n$ of i.i.d. random variables whose joint distribution depends on an unknown parameter $\theta$ (scalar or vector).

An estimate $\hat\theta$ of $\theta$ will be a random variable (a function of the $X_i$) whose dist'n is known as its sampling dist'n.

The standard deviation of the sampling dist'n is termed the standard error of the estimate.


8.4: The Method of Moments

Definition: the $k$-th (pop'n) moment of a random variable $X$ is denoted by $\mu_k = E(X^k)$, and the $k$-th (sample) moment by $\hat\mu_k = \frac{1}{n}\sum_{i=1}^{n} X_i^k$.

$\hat\mu_k$ is viewed as an estimate of $\mu_k$.

Algorithm: MOM estimates parameter(s) by finding expressions for them in terms of the lowest possible (pop'n) moments and then substituting the (sample) moments into those expressions.

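A minimal sketch of the algorithm for a gamma model (a standard textbook example), using simulated data; the shape and scale values below are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.gamma(shape=2.0, scale=1.5, size=500)   # simulated data, assumed gamma

# Gamma(alpha, lambda) has mean alpha/lambda and variance alpha/lambda^2.
# Express the parameters via the first two moments, then plug in sample moments.
m1 = x.mean()
m2 = (x ** 2).mean()
var_hat = m2 - m1 ** 2

lambda_hat = m1 / var_hat
alpha_hat = m1 ** 2 / var_hat
print(f"alpha-hat = {alpha_hat:.3f}, lambda-hat = {lambda_hat:.3f}")
```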

8.5: The Method of Maximum Likelihood

Algorithm: Let $X_1, \ldots, X_n$ be a sequence of i.i.d. random variables with density (or PMF) $f(x \mid \theta)$.

  • The likelihood function is $\mathrm{lik}(\theta) = \prod_{i=1}^{n} f(X_i \mid \theta)$.

  • The MLE of $\theta$ is the value of $\theta$ that maximizes the likelihood function, or equivalently its natural logarithm (since the logarithm is a monotonic function).

  • The log-likelihood function $\ell(\theta) = \sum_{i=1}^{n} \log f(X_i \mid \theta)$ is then maximized to get the MLE.

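A minimal sketch of the recipe, maximizing a log-likelihood numerically for an exponential model with simulated data (the closed-form MLE $1/\bar{x}$ is printed for comparison; the rate and sample size are assumptions for illustration):

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(2)
x = rng.exponential(scale=2.0, size=200)   # simulated data, true rate = 0.5

# Negative log-likelihood of an exponential(rate) model:
# l(rate) = n * log(rate) - rate * sum(x).
def neg_log_lik(rate):
    return -(len(x) * np.log(rate) - rate * x.sum())

res = minimize_scalar(neg_log_lik, bounds=(1e-6, 10.0), method="bounded")
print(f"numerical MLE: {res.x:.4f},  closed form 1/mean: {1 / x.mean():.4f}")
```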

8.5.1: MLEs of Multinomial Cell Probabilities

Suppose that $X_1, \ldots, X_m$, the counts in cells $1, \ldots, m$, follow a multinomial distribution with total count $n$ and cell probabilities $p_1, \ldots, p_m$.

Caution: the marginal dist'n of each $X_i$ is binomial$(n, p_i)$,

BUT the $X_1, \ldots, X_m$ are NOT independent, i.e. their joint PMF is not the product of the marginal PMFs. The good news is that maximum likelihood still applies.

Problem: estimate the $p_i$'s from the observed counts $x_i$.


8.5.1a: MLEs of Multinomial Cell Probabilities (cont’d)

To answer the question, we assume $n$ is given and we wish to estimate $p_1, \ldots, p_m$.

From the joint PMF $P(x_1, \ldots, x_m) = \binom{n}{x_1 \cdots x_m} \prod_{i=1}^{m} p_i^{x_i}$, the log-likelihood becomes

$$\ell(p_1, \ldots, p_m) = \log n! - \sum_{i=1}^{m} \log x_i! + \sum_{i=1}^{m} x_i \log p_i.$$

Maximizing this log-likelihood subject to the constraint $\sum_{i=1}^{m} p_i = 1$ with a Lagrange multiplier gives

$$\hat p_i = \frac{x_i}{n}, \qquad i = 1, \ldots, m.$$

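A sketch of the Lagrange-multiplier step, spelled out for completeness (the multiplier $\lambda$ here is unrelated to the Poisson rate used earlier):

```latex
% Maximize \ell(p) subject to \sum_i p_i = 1:
\[
\frac{\partial}{\partial p_i}\Bigl[\ell(p) + \lambda\Bigl(1 - \sum_{j=1}^{m} p_j\Bigr)\Bigr]
  = \frac{x_i}{p_i} - \lambda = 0
  \quad\Longrightarrow\quad p_i = \frac{x_i}{\lambda}.
\]
% Summing over i and imposing \sum_i p_i = 1 forces \lambda = n, hence
\[
\hat p_i = \frac{x_i}{n}, \qquad i = 1, \dots, m.
\]
```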

8.5.1b: MLEs of Multinomial Cell Probabilities (cont’d)

Deja vu: note that the sampling dist'n of each $\hat p_i$ is determined by the binomial dist'n of the corresponding $X_i$.

Hardy-Weinberg Equilibrium (genetics): here the multinomial cell probabilities are functions of another unknown parameter $\theta$; that is, $p_i = p_i(\theta)$.

Read Example A on pages 260-261.

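A minimal sketch under the usual Hardy-Weinberg parametrization, in which the three genotype probabilities are $(1-\theta)^2$, $2\theta(1-\theta)$, and $\theta^2$; the counts below are made up, not the data of Example A:

```python
import numpy as np

# Hypothetical genotype counts (0, 1, and 2 copies of the allele).
x1, x2, x3 = 342, 500, 187
n = x1 + x2 + x3

# Under the Hardy-Weinberg model the MLE has a closed form
# (allele counting): theta-hat = (x2 + 2*x3) / (2n).
theta_hat = (x2 + 2 * x3) / (2 * n)
print(f"theta-hat = {theta_hat:.4f}")

# Fitted cell probabilities and expected counts under the model.
p_hat = np.array([(1 - theta_hat) ** 2,
                  2 * theta_hat * (1 - theta_hat),
                  theta_hat ** 2])
print("expected counts:", np.round(n * p_hat, 1))
```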

8.5.2: Large Sample Theory for MLEs

Let $\hat\theta$ be an estimate of a parameter $\theta$ based on a sample $X_1, \ldots, X_n$.

The variance of the sampling dist'n of many estimators decreases as the sample size $n$ increases.

An estimate $\hat\theta$ is said to be a consistent estimate of a parameter $\theta$ if $\hat\theta$ approaches $\theta$ (in probability) as the sample size $n$ approaches infinity.

Consistency is a limiting property; it says nothing about the behavior of the estimator for any finite sample size.


8.5.2: Large Sample Theory for MLEs (cont’d)

Theorem: Under appropriate smoothness conditions on $f$, the MLE $\hat\theta$ from an i.i.d. sample is consistent, and the probability dist'n of $\sqrt{n I(\theta_0)}\,(\hat\theta - \theta_0)$ tends to $N(0, 1)$. In other words, the large-sample distribution of the MLE is approximately normal with mean $\theta_0$ (that is, the MLE is asymptotically unbiased) and asymptotic variance $1 / \bigl(n I(\theta_0)\bigr)$,

where the (Fisher) information about the parameter is

$$I(\theta) = E\!\left[\frac{\partial}{\partial\theta} \log f(X \mid \theta)\right]^2 = -\,E\!\left[\frac{\partial^2}{\partial\theta^2} \log f(X \mid \theta)\right].$$

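A worked instance of the theorem for the Poisson model of Section 8.2 (an added illustration, not on the slide):

```latex
\[
\log f(x \mid \lambda) = x \log\lambda - \lambda - \log x!,
\qquad
\frac{\partial^2}{\partial\lambda^2} \log f(x \mid \lambda) = -\frac{x}{\lambda^2},
\]
\[
I(\lambda) = -E\!\left[-\frac{X}{\lambda^2}\right] = \frac{1}{\lambda},
\qquad
\operatorname{Var}(\hat\lambda) \approx \frac{1}{n I(\lambda)} = \frac{\lambda}{n}.
\]
```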

8.5.3: Confidence Intervals for MLEs

Recall that a confidence interval (as seen in Chap.7) is a random interval containing the parameter of interest with some specific probability.

Three (3) methods to get CI for MLEs are:

  • Exact CIs

  • Approximate CIs based on the large-sample theory of Section 8.5.2 (a sketch follows this list)

  • Bootstrap CIs

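A minimal sketch of the second method for the Poisson rate, using the asymptotic variance $\lambda / n$ derived above (simulated data; the true rate, sample size, and confidence level are assumptions for illustration):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
x = rng.poisson(2.5, size=200)      # simulated Poisson counts, assumed model

lam_hat = x.mean()                  # MLE of lambda
se = np.sqrt(lam_hat / len(x))      # estimated sqrt(1 / (n * I(lambda)))

z = norm.ppf(0.975)                 # 95% two-sided critical value
lower, upper = lam_hat - z * se, lam_hat + z * se
print(f"approximate 95% CI for lambda: ({lower:.3f}, {upper:.3f})")
```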

8.6: Efficiency & Cramer-Rao Lower Bound

Problem: Given a variety of possible estimates, the best one to choose should have its sampling distribution highly concentrated about the true parameter.

Because of its analytic simplicity, the mean squared error, $\mathrm{MSE}(\hat\theta) = E\bigl[(\hat\theta - \theta_0)^2\bigr] = \operatorname{Var}(\hat\theta) + \bigl(E[\hat\theta] - \theta_0\bigr)^2$, will be used as a measure of such concentration.


8.6: Efficiency & Cramer-Rao Lower Bound (cont’d)

Unbiasedness means $E(\hat\theta) = \theta_0$, in which case the MSE reduces to the variance.

Definition: Given two estimates, $\hat\theta$ and $\tilde\theta$, of a parameter $\theta$, the efficiency of $\hat\theta$ relative to $\tilde\theta$ is defined to be

$$\mathrm{eff}(\hat\theta, \tilde\theta) = \frac{\operatorname{Var}(\tilde\theta)}{\operatorname{Var}(\hat\theta)}.$$

Theorem (Cramer-Rao Inequality): Under smoothness assumptions on the density $f(x \mid \theta)$ of the i.i.d. sequence $X_1, \ldots, X_n$, when $T = t(X_1, \ldots, X_n)$ is an unbiased estimate of $\theta$, we get the lower bound

$$\operatorname{Var}(T) \ge \frac{1}{n\, I(\theta)}.$$

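An added illustration of the bound being attained, again in the Poisson case:

```latex
% With I(\lambda) = 1/\lambda and the unbiased MLE \hat\lambda = \bar X,
\[
\operatorname{Var}(\bar X) = \frac{\lambda}{n} = \frac{1}{n\, I(\lambda)},
\]
% so \bar X attains the Cramer-Rao lower bound and is an efficient estimator of \lambda.
```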

8.7: Sufficiency

Is there a function $T(X_1, \ldots, X_n)$ containing all the information in the sample about the parameter $\theta$?

If so, the original data may be reduced to this statistic $T$ without loss of information.

Definition: a statistic $T$ is said to be sufficient for $\theta$ if the conditional dist'n of $X_1, \ldots, X_n$, given $T = t$, does not depend on $\theta$ for any value of $t$.

In other words, given the value of a sufficient statistic $T$, no more knowledge about the parameter can be gained by further scrutiny of the sample.


8.7.1: A Factorization Theorem

How to get a sufficient statistic?

Theorem A: a necessary and sufficient condition for $T(X_1, \ldots, X_n)$ to be sufficient for a parameter $\theta$ is that the joint PDF or PMF factors in the form

$$f(x_1, \ldots, x_n \mid \theta) = g\bigl(T(x_1, \ldots, x_n), \theta\bigr)\, h(x_1, \ldots, x_n).$$

Corollary A: if $T$ is sufficient for $\theta$, then the MLE is a function of $T$.

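An added illustration of the factorization for i.i.d. Poisson data, tying back to Section 8.2:

```latex
\[
f(x_1, \dots, x_n \mid \lambda)
  = \prod_{i=1}^{n} \frac{\lambda^{x_i} e^{-\lambda}}{x_i!}
  = \underbrace{\lambda^{\sum_i x_i}\, e^{-n\lambda}}_{g\left(\sum_i x_i,\ \lambda\right)}
    \;\underbrace{\prod_{i=1}^{n} \frac{1}{x_i!}}_{h(x_1, \dots, x_n)},
\]
% so T = \sum_i X_i is sufficient for \lambda, and the MLE \bar X = T / n is a function of T.
```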

8.7.2: The Rao-Blackwell Theorem

The following theorem gives a quantitative rationale for basing an estimator of a parameter on an existing sufficient statistic.

Theorem (Rao-Blackwell): Let $\hat\theta$ be an estimator of $\theta$ with $E(\hat\theta^2) < \infty$ for all $\theta$. Suppose that $T$ is sufficient for $\theta$, and let $\tilde\theta = E(\hat\theta \mid T)$. Then, for all $\theta$,

$$E\bigl[(\tilde\theta - \theta)^2\bigr] \le E\bigl[(\hat\theta - \theta)^2\bigr].$$

The inequality is strict unless $\hat\theta = \tilde\theta$, i.e. unless $\hat\theta$ is already a function of $T$.


8.8: Conclusion

Some key ideas from Chap. 7, such as sampling distributions and confidence intervals, were revisited.

MOM and MLE were developed, together with large-sample approximations to their sampling distributions.

The theoretical concepts of efficiency, the Cramer-Rao lower bound, and sufficiency were discussed.

Finally, some light was shed on parametric bootstrapping.

