240-650 Principles of Pattern Recognition

Montri Karnjanadecha

[email protected]

http://fivedots.coe.psu.ac.th/~montri

Chapter 3

Maximum-Likelihood and Bayesian Parameter Estimation

Introduction
  • We could design an optimum classifier if we knew P(ωi) and p(x|ωi)
  • We rarely have such complete knowledge about the probabilistic structure of the problem
  • We often estimate P(ωi) and p(x|ωi) from training data or design samples

Maximum-Likelihood Estimation
  • ML estimation
  • Nearly always has good convergence properties as the number of training samples increases
  • Simpler than alternative methods

The General Principle
  • Suppose we separate a collection of samples according to class, so that we have c data sets D1, …, Dc, with the samples in Dj drawn independently according to the probability law p(x|ωj)
  • We say such samples are i.i.d. – independent and identically distributed random variables
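As a small illustration of this per-class separation (not part of the original slides; the data, labels, and variable names below are assumptions made only for the example), a Python sketch:

```python
import numpy as np

# Illustrative sketch: split a labeled training set into c per-class data sets
# D_1, ..., D_c, assuming the samples within each class are i.i.d. draws from p(x|w_j).
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 2))          # 12 two-dimensional samples (made up for the example)
y = rng.integers(0, 3, size=12)       # class labels 0, 1, 2

# D[j] collects the samples attributed to class j
D = {j: X[y == j] for j in np.unique(y)}
for j, Dj in D.items():
    print(f"class {j}: {len(Dj)} samples")
```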

The General Principle
  • We assume that p(x|ωj) has a known parametric form and is determined uniquely by the value of a parameter vector θj
  • For example, p(x|ωj) ~ N(μj, Σj), where θj consists of the components of μj and Σj
  • To show the dependence on θj explicitly, we write p(x|ωj) as p(x|ωj, θj)

Problem Statement
  • To use the information provided by the training samples to obtain good estimates for the unknown parameter vectors θ1, …, θc associated with each category

Simplified Problem Statement
  • Samples in Di are assumed to give no information about θj if i ≠ j
  • We then have c separate problems of the following form:

To use a set D of training samples drawn independently from the probability density p(x|θ) to estimate the unknown parameter vector θ.

Suppose that D contains n samples, x1,…,xn.
  • Since the samples were drawn independently, we have p(D|θ) = ∏k=1..n p(xk|θ)
  • The maximum-likelihood estimate of θ is the value θ̂ that maximizes p(D|θ)

p(D|θ), viewed as a function of θ, is called the likelihood of θ with respect to the set of samples
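A minimal numerical sketch of this definition (not from the slides): for a univariate Gaussian with known σ = 1 and unknown mean θ, evaluate the likelihood p(D|θ) over a grid of candidate values and keep the maximizer. The data set, grid, and use of scipy.stats.norm are assumptions made for the example:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
D = rng.normal(loc=2.0, scale=1.0, size=20)     # n = 20 i.i.d. samples (illustrative)

thetas = np.linspace(0.0, 4.0, 401)             # candidate values of theta
# p(D|theta) = prod_k p(x_k|theta) because the samples are drawn independently
likelihood = np.array([np.prod(norm.pdf(D, loc=t, scale=1.0)) for t in thetas])

theta_hat = thetas[np.argmax(likelihood)]
print(theta_hat, D.mean())                      # the maximizer lands near the sample mean
```

In practice the log-likelihood introduced on the following slides is maximized instead, since the raw product underflows for large n.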

Let θ = (θ1, …, θp)^t
  • Let ∇θ = (∂/∂θ1, …, ∂/∂θp)^t be the gradient operator

Log-Likelihood Function
  • We define l(θ) = ln p(D|θ) as the log-likelihood function
  • We can then write our solution as θ̂ = arg maxθ l(θ)

MLE
  • From l(θ) = ln p(D|θ)
  • We have l(θ) = ∑k=1..n ln p(xk|θ)
  • And ∇θ l = ∑k=1..n ∇θ ln p(xk|θ)
  • Necessary condition for the MLE: ∇θ l = 0
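A hedged sketch of this necessary condition: for a univariate Gaussian with known variance, the numerical gradient of l(θ) evaluated at the closed-form estimate (the sample mean, derived on the following slides) is essentially zero. The data, the assumed σ, and the finite-difference step are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(2)
D = rng.normal(loc=-1.0, scale=2.0, size=500)   # illustrative sample

def log_likelihood(mu, sigma=2.0):
    # l(mu) = sum_k ln p(x_k | mu) for a Gaussian with known sigma
    return np.sum(-0.5 * np.log(2 * np.pi * sigma**2) - (D - mu) ** 2 / (2 * sigma**2))

mu_hat = D.mean()                               # closed-form MLE of the mean
eps = 1e-5
grad = (log_likelihood(mu_hat + eps) - log_likelihood(mu_hat - eps)) / (2 * eps)
print(grad)                                     # approximately zero at the MLE
```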

The Gaussian Case: Unknown μ
  • Suppose that the samples are drawn from a multivariate normal population with mean μ and covariance matrix Σ
  • Let μ be the only unknown
  • Consider a sample point xk and find ln p(xk|μ) = −(1/2) ln[(2π)^d |Σ|] − (1/2)(xk − μ)^t Σ^-1 (xk − μ)
  • and ∇μ ln p(xk|μ) = Σ^-1 (xk − μ)

The MLE of μ must satisfy ∑k=1..n Σ^-1 (xk − μ̂) = 0
  • After rearranging: μ̂ = (1/n) ∑k=1..n xk

Sample Mean
  • The MLE for the unknown population mean is just the arithmetic average of the training samples (the sample mean)
  • If we think of the n samples as a cloud of points, then the sample mean is the centroid of the cloud

The Gaussian Case: Unknown μ and Σ
  • This is the more typical case, in which both the mean and the covariance matrix are unknown
  • Consider the univariate case with θ1 = μ and θ2 = σ^2, so that ln p(xk|θ) = −(1/2) ln(2πθ2) − (1/(2θ2))(xk − θ1)^2

And its derivative is ∇θ ln p(xk|θ) = [ (xk − θ1)/θ2 ,  −1/(2θ2) + (xk − θ1)^2/(2θ2^2) ]^t
  • Setting ∑k=1..n ∇θ ln p(xk|θ) = 0 gives
  • ∑k=1..n (xk − θ̂1)/θ̂2 = 0  and  −∑k=1..n 1/θ̂2 + ∑k=1..n (xk − θ̂1)^2/θ̂2^2 = 0

With a little rearranging, we have μ̂ = (1/n) ∑k=1..n xk and σ̂^2 = (1/n) ∑k=1..n (xk − μ̂)^2
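These two formulas translate directly into code; a short sketch with illustrative data (np.mean and np.var, which uses divisor n by default, serve only as a cross-check):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(loc=5.0, scale=3.0, size=1000)   # illustrative univariate sample
n = len(x)

mu_hat = x.sum() / n                            # mu_hat = (1/n) sum_k x_k
sigma2_hat = ((x - mu_hat) ** 2).sum() / n      # sigma2_hat = (1/n) sum_k (x_k - mu_hat)^2

print(mu_hat, sigma2_hat)
print(np.mean(x), np.var(x))                    # np.var defaults to divisor n, so it matches
```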

MLE for the Multivariate Case
  • For the multivariate normal case the maximum-likelihood estimates are μ̂ = (1/n) ∑k=1..n xk and Σ̂ = (1/n) ∑k=1..n (xk − μ̂)(xk − μ̂)^t
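A sketch of the multivariate estimates (the true mean, covariance, and sample size below are assumptions chosen for the example):

```python
import numpy as np

rng = np.random.default_rng(4)
true_cov = np.array([[2.0, 0.5],
                     [0.5, 1.0]])
X = rng.multivariate_normal(mean=[1.0, -2.0], cov=true_cov, size=2000)
n, d = X.shape

mu_hat = X.mean(axis=0)                         # (1/n) sum_k x_k
dev = X - mu_hat
Sigma_hat = dev.T @ dev / n                     # (1/n) sum_k (x_k - mu_hat)(x_k - mu_hat)^t

print(mu_hat)
print(Sigma_hat)                                # close to true_cov for large n
```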

Bias
  • The MLE for the variance σ^2 is biased
  • The expected value, over all data sets of size n, of the sample variance is not equal to the true variance: E[(1/n) ∑i=1..n (xi − x̄)^2] = ((n − 1)/n) σ^2 ≠ σ^2
  • An unbiased estimator for Σ is given by C = (1/(n − 1)) ∑k=1..n (xk − μ̂)(xk − μ̂)^t
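A small simulation (not from the slides; the sample size, number of trials, and true variance are illustrative) showing the bias of the divisor-n estimate and the behaviour of the divisor-(n − 1) estimator:

```python
import numpy as np

rng = np.random.default_rng(5)
n, trials, true_var = 5, 100_000, 4.0

samples = rng.normal(loc=0.0, scale=np.sqrt(true_var), size=(trials, n))
mle_var = samples.var(axis=1, ddof=0)           # divisor n      (the MLE)
unbiased_var = samples.var(axis=1, ddof=1)      # divisor n - 1  (unbiased)

print(mle_var.mean())                           # about (n-1)/n * true_var = 3.2
print(unbiased_var.mean())                      # about true_var = 4.0
```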
