4. Maximum Likelihood

Prof. A.L. Yuille

Stat 231. Fall 2004.

Learning Probability Distributions.
  • Learn the likelihood functions and priors from datasets.
  • Two main strategies: Parametric and Non-Parametric.
  • This lecture and the next will concentrate on parametric methods.

(This assumes a parametric form for the distributions.)

Maximum Likelihood Estimation.
  • Assume the distribution is of the form $p(x|\theta)$, with unknown parameters $\theta$.
  • Independent Identically Distributed (i.i.d.) samples $x_1, \dots, x_N$, so that $p(x_1, \dots, x_N|\theta) = \prod_{i=1}^{N} p(x_i|\theta)$.
  • Choose $\theta^* = \arg\max_{\theta} \prod_{i=1}^{N} p(x_i|\theta)$, the parameters that make the observed data most probable (see the sketch below).
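
As an illustration (not in the original slides), here is a minimal numerical sketch of this recipe: write down the log-likelihood and maximize it with a generic optimizer. The exponential model, the dataset, and all names are assumptions made for the example; the known exponential MLE $\theta^* = 1/\bar{x}$ is used as a check.

```python
# Hypothetical sketch: MLE by numerically maximizing the log-likelihood.
# Assumed model: exponential distribution p(x|theta) = theta * exp(-theta * x).
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)
data = rng.exponential(scale=2.0, size=500)  # true rate theta = 1/2

def neg_log_likelihood(theta):
    # -L(theta) = -[ N*log(theta) - theta * sum_i x_i ]
    return -(len(data) * np.log(theta) - theta * data.sum())

res = minimize_scalar(neg_log_likelihood, bounds=(1e-6, 10.0), method="bounded")
print(res.x)              # numerical MLE
print(1.0 / data.mean())  # analytic exponential MLE: theta* = 1/mean(x)
```
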
Supervised versus Unsupervised Learning.
  • Supervised Learning assumes that we know the class label for each datapoint.
  • I.e. we are given pairs $\{(x_i, y_i)\}$, where $x_i$ is the datapoint and $y_i$ is the class label.
  • Unsupervised Learning does not assume that the class labels are specified. This is a harder task.
  • But “unsupervised methods” can also be used for supervised data if the goal is to determine structure in the data (e.g. mixture of Gaussians).
  • Stat 231 is almost entirely concerned with supervised learning.
Example of MLE.
  • One-Dimensional Gaussian Distribution: $p(x|\mu,\sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma}\, \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$.
  • Solve for $\mu, \sigma^2$ by differentiation: $\hat{\mu} = \frac{1}{N}\sum_{i=1}^{N} x_i$, $\hat{\sigma}^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \hat{\mu})^2$ (derivation below).
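
The differentiation step the slide refers to, reconstructed (this is the standard derivation; the slide's own equations did not survive extraction):

```latex
\begin{align}
L(\mu,\sigma^2) &= \sum_{i=1}^{N}\log p(x_i\mid\mu,\sigma^2)
  = -\frac{N}{2}\log(2\pi\sigma^2)-\frac{1}{2\sigma^2}\sum_{i=1}^{N}(x_i-\mu)^2, \\
\frac{\partial L}{\partial\mu} &= \frac{1}{\sigma^2}\sum_{i=1}^{N}(x_i-\mu)=0
  \;\Longrightarrow\; \hat{\mu}=\frac{1}{N}\sum_{i=1}^{N}x_i, \\
\frac{\partial L}{\partial\sigma^2} &= -\frac{N}{2\sigma^2}
  +\frac{1}{2\sigma^4}\sum_{i=1}^{N}(x_i-\mu)^2=0
  \;\Longrightarrow\; \hat{\sigma}^2=\frac{1}{N}\sum_{i=1}^{N}(x_i-\hat{\mu})^2.
\end{align}
```
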
MLE
  • The Gaussian is unusual because the maximum-likelihood parameters can be written as an analytic expression of the data.
  • More usually, iterative algorithms are required.
  • Modeling problem: for complicated patterns (the shape of a fish, natural language, etc.) it requires considerable work to find a suitable parametric form for the probability distributions.
MLE and Kullback-Leibler
  • What happens if the data is not generated by the model that we assume?
  • Suppose the true distribution is $f(x)$ and our models are of form $p(x|\theta)$.
  • The Kullback-Leibler divergence is $D(f \,\|\, p_\theta) = \int f(x)\, \log \frac{f(x)}{p(x|\theta)}\, dx$.
  • This is $\geq 0$, with equality if and only if $f(x) = p(x|\theta)$.
  • K-L is a measure of the difference between $f(x)$ and $p(x|\theta)$ (a Monte Carlo check follows below).
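
A small sketch (an addition, not from the slides) of how $D(f\|p_\theta)$ can be estimated by Monte Carlo when we can sample from $f$ and evaluate both densities. The two Gaussians and all parameter values are assumptions for the example; the two-Gaussian closed form is used as a check.

```python
# Hypothetical sketch: Monte Carlo estimate of D(f || p_theta).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
f = norm(loc=0.0, scale=1.0)   # "true" distribution f(x)
p = norm(loc=0.5, scale=1.5)   # model p(x|theta)

x = f.rvs(size=200_000, random_state=rng)
kl_mc = np.mean(f.logpdf(x) - p.logpdf(x))  # (1/N) sum_i log f(x_i)/p(x_i)

# Closed form for two 1-D Gaussians, for comparison:
mu0, s0, mu1, s1 = 0.0, 1.0, 0.5, 1.5
kl_exact = np.log(s1 / s0) + (s0**2 + (mu0 - mu1) ** 2) / (2 * s1**2) - 0.5
print(kl_mc, kl_exact)
```
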
MLE and Kullback-Leibler
  • Samples $x_1, \dots, x_N$ drawn from the true distribution $f(x)$.
  • Approximate the expectation over $f$ by the sample average.
  • By the empirical KL: $\hat{D}(\theta) = \frac{1}{N}\sum_{i=1}^{N} \log \frac{f(x_i)}{p(x_i|\theta)}$.
  • Minimizing the empirical KL is equivalent to MLE (see the derivation below).
  • We find the distribution of form $p(x|\theta)$ that is closest, in the KL sense, to the true distribution $f(x)$.
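
The equivalence in one line (reconstructed; this is the standard argument): the $f$-dependent term of the empirical KL is a constant in $\theta$, so minimizing $\hat{D}$ and maximizing the log-likelihood pick out the same $\theta^*$.

```latex
\begin{align}
\hat{D}(\theta) &= \underbrace{\frac{1}{N}\sum_{i=1}^{N}\log f(x_i)}_{\text{independent of }\theta}
  \;-\;\frac{1}{N}\sum_{i=1}^{N}\log p(x_i\mid\theta), \\
\arg\min_{\theta}\hat{D}(\theta) &= \arg\max_{\theta}\sum_{i=1}^{N}\log p(x_i\mid\theta)
  \;=\;\theta^{*}_{\mathrm{MLE}}.
\end{align}
```
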
MLE example

We denote the log-likelihood, as a function of $\theta$, by $L(\theta) = \log \prod_{i=1}^{N} p(x_i|\theta) = \sum_{i=1}^{N} \log p(x_i|\theta)$.

$\theta^*$ is computed by solving the equations $\frac{\partial L}{\partial \theta} = 0$.

For example, the Gaussian family gives a closed-form solution (checked in the sketch below).
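
A short check (an illustration, not from the slides) that the closed-form Gaussian solution really maximizes $L(\theta)$; the dataset and the perturbation sizes are arbitrary choices for the example.

```python
# Hypothetical sketch: closed-form Gaussian MLE (mu*, sigma^2*) and a check
# that it maximizes L(theta) = sum_i log p(x_i | theta).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
data = rng.normal(loc=3.0, scale=2.0, size=1000)

mu_hat = data.mean()                     # mu* = (1/N) sum_i x_i
var_hat = np.mean((data - mu_hat) ** 2)  # sigma^2* = (1/N) sum_i (x_i - mu*)^2

def log_likelihood(mu, var):
    return norm(loc=mu, scale=np.sqrt(var)).logpdf(data).sum()

# Perturbing either parameter should not increase the log-likelihood:
assert log_likelihood(mu_hat, var_hat) >= log_likelihood(mu_hat + 0.1, var_hat)
assert log_likelihood(mu_hat, var_hat) >= log_likelihood(mu_hat, var_hat * 1.1)
print(mu_hat, var_hat)
```
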

Learning with a Prior.
  • We can put a prior $p(\theta)$ on the parameter values and maximize the posterior: $\theta^* = \arg\max_{\theta}\, p(x_1, \dots, x_N|\theta)\, p(\theta)$.
  • We can estimate this recursively (if samples are i.i.d.): $p(\theta|x_1, \dots, x_n) \propto p(x_n|\theta)\, p(\theta|x_1, \dots, x_{n-1})$.
  • Bayes Learning: estimate a full probability distribution on $\theta$, rather than a single best value (a grid-based sketch follows below).
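
A minimal grid-based sketch of the recursive update above (an illustration, not the course's algorithm): a Gaussian likelihood with known variance, a Gaussian prior on the mean $\theta$, and one Bayes update per datapoint. All distributions and values here are assumptions for the example.

```python
# Hypothetical sketch: recursive Bayes learning of p(theta | x_1..x_n)
# on a parameter grid, for a Gaussian likelihood with known variance.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
data = rng.normal(loc=1.0, scale=1.0, size=50)

theta_grid = np.linspace(-5, 5, 1001)
posterior = norm(loc=0.0, scale=2.0).pdf(theta_grid)  # prior p(theta)
posterior /= posterior.sum()

for x_n in data:
    # p(theta | x_1..x_n)  ∝  p(x_n | theta) * p(theta | x_1..x_{n-1})
    posterior *= norm(loc=theta_grid, scale=1.0).pdf(x_n)
    posterior /= posterior.sum()  # renormalize after each update

print(theta_grid[posterior.argmax()])  # MAP estimate; true mean is 1.0
```
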