
240-650 Principles of Pattern Recognition

Montri Karnjanadecha

[email protected]

http://fivedots.coe.psu.ac.th/~montri




Chapter 3

Maximum-Likelihood and Bayesian Parameter Estimation



Introduction

  • We could design an optimal classifier if we knew the prior probabilities P(ωi) and the class-conditional densities p(x|ωi)

  • We rarely have complete knowledge of the probabilistic structure of the problem

  • We therefore estimate P(ωi) and p(x|ωi) from training data (design samples)



Maximum-Likelihood Estimation

  • ML estimation:

  • Has good convergence properties as the number of training samples increases

  • Is often simpler than alternative estimation methods



The General Principle

  • Suppose we separate a collection of samples according to class, so that we have c data sets D1, …, Dc, with the samples in Dj drawn independently according to the probability law p(x|ωj)

  • We say such samples are i.i.d. – independent and identically distributed random variables



The General Principle

  • We assume that p(x|ωj) has a known parametric form and is determined uniquely by the value of a parameter vector θj

  • For example, p(x|ωj) might be a multivariate normal density whose parameters are the mean and covariance, as sketched below

  • To show this dependence explicitly, we write p(x|ωj) as p(x|ωj, θj)
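
A minimal sketch of this standard example, assuming the usual multivariate normal form (the equation image on the original slide is not preserved):

    p(x \mid \omega_j) \sim N(\mu_j, \Sigma_j), \qquad \theta_j = (\mu_j, \Sigma_j)

Here θj collects the unknown mean vector and covariance matrix of class ωj.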



Problem Statement

  • The goal is to use the information provided by the training samples to obtain good estimates of the unknown parameter vectors θ1, …, θc associated with each category



Simplified Problem Statement

  • If we assume that samples in Di give no information about θj when i ≠ j

  • then we have c separate problems of the following form:

    Use a set D of training samples, drawn independently from the probability density p(x|θ), to estimate the unknown parameter vector θ.



  • Suppose that D contains n samples, x1, …, xn

  • Because the samples were drawn independently, the likelihood factors over the samples, as sketched below

  • The maximum-likelihood estimate of θ is the value θ̂ that maximizes p(D|θ)

Viewed as a function of θ, p(D|θ) is called the likelihood of θ with respect to the set of samples
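
A reconstruction of the factorization referred to above, assuming the usual i.i.d. form (the equation image is not preserved):

    p(D \mid \theta) = \prod_{k=1}^{n} p(x_k \mid \theta)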




  • Let θ = (θ1, …, θp)^t be the parameter vector

  • Let ∇θ be the gradient operator with respect to θ, as sketched below
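
A sketch of these two definitions in the standard notation (the original equation images are not preserved):

    \theta = (\theta_1, \ldots, \theta_p)^t, \qquad
    \nabla_{\theta} = \left[ \frac{\partial}{\partial \theta_1}, \ldots, \frac{\partial}{\partial \theta_p} \right]^t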



Log-Likelihood Function

  • We define l(θ) ≡ ln p(D|θ) as the log-likelihood function

  • We can then write our solution as the value of θ that maximizes l(θ), as sketched below
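
A reconstruction of the two expressions referred to above, following the standard definitions:

    l(\theta) \equiv \ln p(D \mid \theta), \qquad
    \hat{\theta} = \arg\max_{\theta} \, l(\theta)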



MLE

  • Starting from the factored form of p(D|θ)

  • we obtain the log-likelihood as a sum over the samples

  • and its gradient as the corresponding sum of per-sample gradients

  • A necessary condition for the MLE is that this gradient vanish (see the reconstruction below)
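
A reconstruction of the chain of equations these bullets refer to, following the standard development (the equation images are not preserved):

    l(\theta) = \sum_{k=1}^{n} \ln p(x_k \mid \theta)

    \nabla_{\theta}\, l = \sum_{k=1}^{n} \nabla_{\theta} \ln p(x_k \mid \theta)

    \nabla_{\theta}\, l = 0 \qquad \text{(necessary condition for the MLE)}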



The Gaussian Case: Unknown μ

  • Suppose that the samples are drawn from a multivariate normal population with mean μ and covariance matrix Σ

  • Suppose that μ is the only unknown

  • Consider a sample point xk and find ln p(xk|μ) and its gradient with respect to μ, as sketched below
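
A reconstruction of the two quantities referred to above for the multivariate normal density (the equation images are not preserved):

    \ln p(x_k \mid \mu) = -\tfrac{1}{2} \ln\left[(2\pi)^d \lvert \Sigma \rvert\right] - \tfrac{1}{2}(x_k - \mu)^t \Sigma^{-1} (x_k - \mu)

    \nabla_{\mu} \ln p(x_k \mid \mu) = \Sigma^{-1}(x_k - \mu)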

Sample Mean

  • Setting the gradient to zero shows that the MLE for the unknown population mean is just the arithmetic average of the training samples, i.e. the sample mean (see below)

  • If we think of the n samples as a cloud of points, the sample mean is the centroid of that cloud
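
The standard closed-form result, reconstructed here since the slide equation is not preserved:

    \hat{\mu} = \frac{1}{n} \sum_{k=1}^{n} x_k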



The Gaussian Case: Unknown μ and Σ

  • This is the more typical case, in which both the mean and the covariance matrix are unknown

  • Consider first the univariate case, with θ1 = μ and θ2 = σ², as sketched below
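
A reconstruction of the univariate development carried by the slides that did not survive extraction, following the standard derivation:

    \ln p(x_k \mid \theta) = -\tfrac{1}{2} \ln 2\pi\theta_2 - \frac{(x_k - \theta_1)^2}{2\theta_2}

    \hat{\mu} = \frac{1}{n} \sum_{k=1}^{n} x_k, \qquad
    \hat{\sigma}^2 = \frac{1}{n} \sum_{k=1}^{n} (x_k - \hat{\mu})^2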



MLE for the Multivariate Case
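
The body of this slide is not preserved; these are the standard maximum-likelihood results for the multivariate normal case, given here as a reconstruction:

    \hat{\mu} = \frac{1}{n} \sum_{k=1}^{n} x_k, \qquad
    \hat{\Sigma} = \frac{1}{n} \sum_{k=1}^{n} (x_k - \hat{\mu})(x_k - \hat{\mu})^t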



Bias

  • The MLE for the variance σ² is biased

  • The expected value, over all data sets of size n, of the sample variance is not equal to the true variance (see below)

  • An unbiased estimator for Σ is given by the sample covariance with the 1/(n−1) normalization shown below
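
A reconstruction of the standard expressions behind these bullets (the slide equations are not preserved):

    E\left[\frac{1}{n} \sum_{k=1}^{n} (x_k - \bar{x})^2\right] = \frac{n-1}{n}\,\sigma^2 \neq \sigma^2

    C = \frac{1}{n-1} \sum_{k=1}^{n} (x_k - \hat{\mu})(x_k - \hat{\mu})^t

The same distinction appears in numerical libraries; a minimal NumPy sketch, illustrative only and not taken from the slides:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(loc=0.0, scale=2.0, size=(50, 3))         # 50 samples, 3 dimensions

    mu_hat = X.mean(axis=0)                                   # sample mean (MLE of the mean)
    sigma_ml = np.cov(X, rowvar=False, bias=True)             # ML estimate: divides by n (biased)
    sigma_unbiased = np.cov(X, rowvar=False, bias=False)      # divides by n - 1 (unbiased)

    # The two estimates differ exactly by the factor (n - 1)/n
    n = len(X)
    print(np.allclose(sigma_ml, sigma_unbiased * (n - 1) / n))   # True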


