Research Method


Lecture 13

(Greene Ch 16)

Maximum Likelihood Estimation (MLE)

Basic idea
  • Maximum likelihood estimation (MLE) is a method to find the most likely density function that would have generated the data.
  • Thus, MLE requires you to make a distributional assumption first.
  • This handout builds intuition for MLE through examples.
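Example 1
  • Let me explain the basic idea of MLE using a small data set: five observations of a variable X, taking the values 1, 4, 5, 6 and 9.
  • Let us assume that X follows a normal distribution.
  • Recall that the density function of a normal distribution with mean μ and variance σ² is given by:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$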

The data are plotted on the horizontal line.

  • Now, ask yourself the following question.

“Which distribution, A or B, is more likely to have generated the data?”

[Figure: two candidate density curves, A and B, drawn over the data values 1, 4, 5, 6 and 9 on the horizontal axis.]


The answer is A, because the data are clustered around the center of distribution A but not around the center of distribution B.

  • This example illustrates that, by looking at the data, it is possible to find the distribution that is most likely to have generated them.
  • Next, I will explain exactly how to find that distribution in practice.
The illustration of the estimation procedure
  • MLE starts with computing the likelihood contribution of each observation.
  • The likelihood contribution is the height of the density function at the observed data value. We use L_i to denote the likelihood contribution of the ith observation.
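
Graphical illustration of the likelihood contribution

The likelihood contribution of the first observation is the height of the density at that observation's data value:

$$L_1 = f(x_1) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x_1-\mu)^2}{2\sigma^2}\right)$$

[Figure: density curve A over the data values 1, 4, 5, 6 and 9, with the height of the curve at the first observation's data value marked.]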


Then, you multiply the likelihood contributions of all the observations. This product is called the likelihood function. We use the notation L.

$$L = \prod_{i=1}^{n} L_i$$

This notation means you multiply from i=1 through n.

  • In our example, n=5.

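
In our example, the likelihood function looks like:

$$L(\mu,\sigma) = \prod_{i=1}^{5} \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x_i-\mu)^2}{2\sigma^2}\right)$$

  • I wrote L(μ,σ) to emphasize that the likelihood function depends on these parameters.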
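
To make the likelihood function concrete, here is a minimal Python sketch that evaluates L(μ,σ) for the five data values shown on the slides. The two parameter settings are hypothetical stand-ins for curves A and B; the slides do not state the actual means or variances of those curves.

```python
import math

# Data values read off the horizontal axis of the Example 1 slides.
data = [1, 4, 5, 6, 9]

def normal_density(x, mu, sigma):
    """Height at x of the normal density with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def likelihood(mu, sigma, xs=data):
    """Product of the likelihood contributions L_i = f(x_i)."""
    result = 1.0
    for x in xs:
        result *= normal_density(x, mu, sigma)
    return result

# Hypothetical parameter values standing in for curves A and B.
like_a = likelihood(mu=5.0, sigma=2.5)
like_b = likelihood(mu=9.0, sigma=2.5)
print(f"L at (mu=5, sigma=2.5): {like_a:.3e}")
print(f"L at (mu=9, sigma=2.5): {like_b:.3e}")
```

With these assumed values, the setting centered on the data gives the larger likelihood, which is the formal version of the "A or B" comparison.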

Then you find the values of μ and σ that maximize the likelihood function.

  • The values of μ and σ obtained this way are called the maximum likelihood estimators of μ and σ.
  • Most MLE problems cannot be solved ‘by hand’. Thus, you need to run an iterative numerical procedure on a computer.
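
As a rough illustration of what such an iterative procedure can look like, the sketch below maximizes the Example 1 likelihood with a simple compass search. This is not the algorithm used by any particular package; the function name `maximize`, the starting point and the tolerances are all assumptions made for illustration.

```python
import math

data = [1, 4, 5, 6, 9]

def likelihood(mu, sigma, xs=data):
    """Normal likelihood L(mu, sigma); treated as 0 when sigma is not positive."""
    if sigma <= 0:
        return 0.0
    result = 1.0
    for x in xs:
        result *= math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))
    return result

def maximize(f, start, step=1.0, tol=1e-8):
    """Compass search: try a step up and down in each parameter, accept any
    move that increases f, and halve the step when no move helps."""
    point = list(start)
    best = f(*point)
    while step > tol:
        improved = False
        for k in range(len(point)):
            for delta in (step, -step):
                trial = list(point)
                trial[k] += delta
                value = f(*trial)
                if value > best:
                    point, best, improved = trial, value, True
        if not improved:
            step /= 2
    return point, best

# Hypothetical starting values for (mu, sigma).
(mu_hat, sigma_hat), max_like = maximize(likelihood, start=(4.0, 2.0))
print(mu_hat, sigma_hat, max_like)
```

For this data the search should settle near μ = 5 and σ ≈ 2.61. In the normal case these can also be obtained by hand as the sample mean and the square root of the average squared deviation, which makes Example 1 a convenient check on the numerical routine.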

Fortunately, there are many optimization programs that can do this.

  • A common program among economists is GQOPT. It runs on FORTRAN, so you need to write a FORTRAN program to use it.
  • Even more fortunately, many of the models that require MLE (like Probit or Logit models) can be estimated automatically in STATA.
  • However, you need to understand the basic idea of MLE in order to understand what STATA does.
Example 2
  • Example 1 was the simplest case.
  • We are usually interested in estimating a model like y=β0+β1x+u.
  • Estimating such a model can be done using MLE.

Suppose that you have data on y and x for five observations, and you are interested in estimating the model: y=β0+β1x+u

  • Let us assume that u follows the normal distribution with mean 0 and variance σ².

You can write the model as:

u=y-(β0+β1x)

  • This means that y-(β0+β1x) follows the normal distribution with mean 0 and variance σ².
  • The likelihood contribution of each person is the height of the density function at the data point y-β0-β1x.
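
  • For example, the likelihood contribution of the 2nd observation is given by

$$L_2 = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(y_2-\beta_0-\beta_1 x_2)^2}{2\sigma^2}\right)$$

[Figure: normal density with mean 0 and variance σ², with the height of the curve at the 2nd observation's data point marked as its likelihood contribution. The data points shown are 2-β0-β1, 15-β0-9β1, 6-β0-4β1, 7-β0-5β1 and 9-β0-6β1.]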

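
Then the likelihood function is given by

$$L(\beta_0,\beta_1,\sigma) = \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(y_i-\beta_0-\beta_1 x_i)^2}{2\sigma^2}\right)$$

  • The likelihood function is a function of β0, β1, and σ.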

You choose the values of β0, β1, and σ that maximize the likelihood function. These are the maximum likelihood estimators of β0, β1, and σ.

  • Again, the maximization can be done using GQOPT or any other software that has optimization routines (like Matlab).
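
The following Python sketch evaluates the Example 2 likelihood on the (x, y) pairs implied by the data points listed on the slide (for instance, 15-β0-9β1 gives x=9, y=15). It relies on a standard result that is not stated in the slides: under the normality assumption, the likelihood is maximized by the least-squares coefficients together with σ² equal to the average squared residual. The check at the end simply confirms that nudging any parameter lowers the likelihood.

```python
import math

# (x, y) pairs implied by the data points listed on the slide.
xs = [1, 9, 4, 5, 6]
ys = [2, 15, 6, 7, 9]

def likelihood(b0, b1, sigma):
    """Product of normal densities evaluated at the residuals y - b0 - b1*x."""
    result = 1.0
    for x, y in zip(xs, ys):
        u = y - b0 - b1 * x
        result *= math.exp(-u ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))
    return result

# Closed-form maximizer under normal errors (assumption: least-squares formulas).
n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
b1_hat = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
          / sum((x - x_bar) ** 2 for x in xs))
b0_hat = y_bar - b1_hat * x_bar
sigma_hat = math.sqrt(sum((y - b0_hat - b1_hat * x) ** 2 for x, y in zip(xs, ys)) / n)

best = likelihood(b0_hat, b1_hat, sigma_hat)

# Nudge each parameter up and down; every nudge should reduce the likelihood.
nudges = [(0.1, 0, 0), (-0.1, 0, 0), (0, 0.05, 0), (0, -0.05, 0), (0, 0, 0.1), (0, 0, -0.1)]
all_lower = all(
    likelihood(b0_hat + d0, b1_hat + d1, sigma_hat + ds) < best
    for d0, d1, ds in nudges
)
print(b0_hat, b1_hat, sigma_hat, all_lower)
```

For models without such a closed form, the same `likelihood` function would instead be handed to a numerical optimizer.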
Example 3
  • Consider the following model.
  • y*=β0+β1x+u
  • Sometimes, we only know whether y*≥0 or not.
The data contain a variable Y which is either 0 or 1.

If Y=1, it means that y*≥0

If Y=0, it means that y*<0


Then, what is the likelihood contribution of each observation? In this case, we only know whether y*≥0 or y*<0. We do not know the exact value of y*.

  • In such a case, we use the probability that y*≥0 or y*<0 as the likelihood contribution.
  • Now, let’s assume that u follows the standard normal distribution (the normal distribution with mean 0 and variance 1).
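
Take the 2nd observation as an example. Since Y=0 for this observation, we know y*<0.

  • Thus, the likelihood contribution is

$$L_2 = P(y^* < 0) = P(u < -\beta_0 - \beta_1 x_2) = \Phi(-\beta_0 - \beta_1 x_2)$$

where Φ is the cumulative distribution function of the standard normal distribution.

[Figure: standard normal density with L2 marked as the area to the left of the 2nd observation's threshold. The thresholds shown are -β0-β1, -β0-9β1, -β0-4β1, -β0-5β1 and -β0-6β1.]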

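
Now, take the 3rd observation as an example. Since Y=1 for this observation, we know y*≥0.

  • Thus, the likelihood contribution is

$$L_3 = P(y^* \ge 0) = P(u \ge -\beta_0 - \beta_1 x_3) = 1 - \Phi(-\beta_0 - \beta_1 x_3)$$

(Φ denotes the standard normal cumulative distribution function.)

[Figure: standard normal density with L3 marked as the area to the right of the 3rd observation's threshold. The thresholds shown are -β0-β1, -β0-9β1, -β0-4β1, -β0-5β1 and -β0-6β1.]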


You choose the values of β0 and β1 that maximize the likelihood function. These are the maximum likelihood estimators of β0 and β1.
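
Here the likelihood function is again the product of the contributions over all observations. A minimal Python sketch of Example 3 follows. The x values are taken from the thresholds listed on the slides, assuming they appear in observation order. The Y values are hypothetical except for the 2nd (Y=0) and 3rd (Y=1) observations, which are the only ones the slides state. The helper names are also assumptions.

```python
import math

def std_normal_cdf(z):
    """Standard normal cumulative distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def contribution(y, x, b0, b1):
    """P(y* >= 0) if y == 1 and P(y* < 0) if y == 0, where y* = b0 + b1*x + u."""
    p_negative = std_normal_cdf(-b0 - b1 * x)  # P(u < -b0 - b1*x)
    return 1.0 - p_negative if y == 1 else p_negative

def likelihood(b0, b1, xs, ys):
    """Product of the likelihood contributions of all observations."""
    result = 1.0
    for x, y in zip(xs, ys):
        result *= contribution(y, x, b0, b1)
    return result

# x values from the slide's thresholds (assumed to be in observation order).
xs = [1, 9, 4, 5, 6]
# Hypothetical outcomes, except ys[1] = 0 and ys[2] = 1, which the slides state.
ys = [1, 0, 1, 0, 1]

# With b0 = b1 = 0 every contribution is 0.5, so the likelihood is 0.5 ** 5.
at_zero = likelihood(0.0, 0.0, xs, ys)
# An arbitrary second candidate (hypothetical values) for comparison.
at_other = likelihood(1.0, -0.2, xs, ys)
print(at_zero, at_other)
```

An optimizer would search over β0 and β1 for the pair with the largest value of `likelihood`. This is the Probit model that packages such as STATA estimate automatically.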

Procedure of the MLE
  • Compute the likelihood contribution of each observation: L_i for i=1…n.
  • Multiply all the likelihood contributions to form the likelihood function L.
  • Maximize L by choosing the values of the parameters. The values of the parameters that maximize L are the maximum likelihood estimators of the parameters.
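The log likelihood function
  • It is usually easier to maximize the natural log of the likelihood function than the likelihood function itself. Because the log is an increasing function, both are maximized by the same parameter values, and taking the log turns the product of likelihood contributions into a sum.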
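
As a worked illustration, here is the log likelihood for Example 1 and its maximization by hand, using the data values 1, 4, 5, 6 and 9 from the slides. The derivation is the standard one for the normal model and is not reproduced from the slides themselves.

$$
\begin{aligned}
\log L(\mu,\sigma) &= \sum_{i=1}^{n} \log L_i = -\frac{n}{2}\log(2\pi) - n\log\sigma - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i-\mu)^2 \\
\frac{\partial \log L}{\partial \mu} &= \frac{1}{\sigma^2}\sum_{i=1}^{n}(x_i-\mu) = 0 \;\Rightarrow\; \hat\mu = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{25}{5} = 5 \\
\frac{\partial \log L}{\partial \sigma} &= -\frac{n}{\sigma} + \frac{1}{\sigma^3}\sum_{i=1}^{n}(x_i-\mu)^2 = 0 \;\Rightarrow\; \hat\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i-\hat\mu)^2 = \frac{34}{5} = 6.8
\end{aligned}
$$

The sum is much easier to differentiate than the original product, which is the practical reason for working with the log.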
The standard errors in MLE
  • This is usually an advanced topic. However, it is useful to know how the standard errors are computed in MLE, since we use them for t-tests.

The score vector is the first derivative of the log likelihood function with respect to the parameters.

  • Let θ be a column vector of the parameters. In Example 2, θ=(β0,β1,σ)’.
  • Then the score vector q is given by

$$q = \frac{\partial \log L(\theta)}{\partial \theta}$$
Then, the standard errors of the parameters are given by the square root of the diagonal elements of the following matrix.
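
As a hedged illustration, the sketch below computes standard errors for Example 1 under the assumption that the matrix in question is the inverse of the negative Hessian of the log likelihood, evaluated at the estimates. This is one standard choice; the inverse of the summed outer products of the per-observation scores is another, and the lecture may have used either. The second-derivative formulas are for the normal model of Example 1 and are not taken from the slides.

```python
import math

data = [1, 4, 5, 6, 9]
n = len(data)

# Maximum likelihood estimates for the normal model (closed form).
mu_hat = sum(data) / n
ss = sum((x - mu_hat) ** 2 for x in data)   # sum of squared deviations
sigma_hat = math.sqrt(ss / n)

# Second derivatives of log L(mu, sigma), evaluated at the estimates.
h_mu_mu = -n / sigma_hat ** 2
h_sigma_sigma = n / sigma_hat ** 2 - 3 * ss / sigma_hat ** 4
h_mu_sigma = -2 * sum(x - mu_hat for x in data) / sigma_hat ** 3  # zero at the estimates

# Invert the 2x2 negative Hessian [[a, b], [b, d]] by hand.
a, b, d = -h_mu_mu, -h_mu_sigma, -h_sigma_sigma
det = a * d - b * b
var_mu = d / det
var_sigma = a / det

# Standard errors are the square roots of the diagonal elements.
se_mu = math.sqrt(var_mu)
se_sigma = math.sqrt(var_sigma)
print(se_mu, se_sigma)
```

Under this assumption the results reduce to the textbook expressions σ̂/√n for μ̂ and σ̂/√(2n) for σ̂. A t-statistic is then the estimate divided by its standard error.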
