
### 240-650 Principles of Pattern Recognition

### Chapter 3

Montri Karnjanadecha

http://fivedots.coe.psu.ac.th/~montri

240-650: Chapter 3: Maximum-Likelihood and Bayesian Parameter Estimation

Maximum-Likelihood and Bayesian Parameter Estimation


Introduction

- We could design an optimal classifier if we knew the prior probabilities P(ωi) and the class-conditional densities p(x|ωi)
- We rarely have such complete knowledge of the probabilistic structure of the problem
- Instead, we estimate P(ωi) and p(x|ωi) from training data (design samples)


Maximum-Likelihood Estimation

- ML estimation
- Has good convergence properties as the number of training samples increases
- Simpler than alternative methods (e.g., Bayesian estimation)


The General Principle

- Suppose we separate a collection of samples according to class, so that we have c data sets D1, …, Dc, with the samples in Dj drawn independently according to the probability law p(x|ωj)
- We say that such samples are i.i.d. – independent and identically distributed random variables


The General Principle

- We assume that p(x|ωj) has a known parametric form and is determined uniquely by the value of a parameter vector θj
- For example, we might have p(x|ωj) ~ N(μj, Σj), where θj consists of the components of μj and Σj
- To show the dependence of p(x|ωj) on θj explicitly, we write it as p(x|ωj, θj)


Problem Statement

- To use the information provided by the training samples to obtain good estimates of the unknown parameter vectors θ1, …, θc associated with each category


Simplified Problem Statement

- Assume that samples in Di give no information about θj if i ≠ j
- We then have c separate problems of the following form:

To use a set D of training samples drawn independently from the probability density p(x|θ) to estimate the unknown parameter vector θ.


Suppose that D contains n samples, x1, …, xn.

- Since the samples were drawn independently, we have

$$p(D|\theta) = \prod_{k=1}^{n} p(x_k|\theta)$$

- Viewed as a function of θ, p(D|θ) is called the likelihood of θ with respect to the set of samples
- The maximum-likelihood estimate of θ is the value θ̂ that maximizes p(D|θ)
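As a practical aside (a minimal numpy sketch; the 1-D unit-variance Gaussian data and seed are assumptions for illustration), evaluating the likelihood as a literal product of per-sample densities underflows for even moderately large n, which is one practical motivation for the log-likelihood used later in the chapter:

```python
import numpy as np

# Synthetic data (an assumption for illustration): n samples from N(0, 1)
rng = np.random.default_rng(4)
x = rng.normal(loc=0.0, scale=1.0, size=2000)
theta = 0.0  # candidate value of the unknown mean

# Per-sample densities p(x_k | theta) for a unit-variance Gaussian
dens = np.exp(-0.5 * (x - theta) ** 2) / np.sqrt(2.0 * np.pi)

product = np.prod(dens)                # p(D|theta) as a literal product: underflows to 0.0
log_likelihood = np.sum(np.log(dens))  # the numerically stable alternative
```

With n = 2000 the product is exactly 0.0 in double precision, while the sum of log-densities remains a perfectly usable (large negative) number.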


Let θ = (θ1, …, θp)t

- Let ∇θ denote the gradient operator

$$\nabla_\theta = \left( \frac{\partial}{\partial\theta_1}, \ldots, \frac{\partial}{\partial\theta_p} \right)^t$$


Log-Likelihood Function

- We define l(θ) as the log-likelihood function:

$$l(\theta) = \ln p(D|\theta)$$

- We can then write our solution as

$$\hat{\theta} = \arg\max_{\theta}\, l(\theta)$$


MLE

- From

$$p(D|\theta) = \prod_{k=1}^{n} p(x_k|\theta)$$

- we have

$$l(\theta) = \sum_{k=1}^{n} \ln p(x_k|\theta)$$

- and

$$\nabla_\theta l = \sum_{k=1}^{n} \nabla_\theta \ln p(x_k|\theta)$$

- A necessary condition for the MLE is

$$\nabla_\theta l = 0$$
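The necessary condition can be checked numerically. Below is a hypothetical grid-search sketch (numpy; the synthetic data, seed, and known unit variance are assumptions), showing that the θ maximizing the log-likelihood of a Gaussian sample coincides with the value the gradient condition predicts:

```python
import numpy as np

# Synthetic data (assumption): 1000 samples from N(2, 1); only the mean is unknown
rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.0, size=1000)

# Candidate values for theta and the log-likelihood l(theta), up to an additive
# constant: for unit variance, l(theta) = -0.5 * sum_k (x_k - theta)^2
thetas = np.linspace(0.0, 4.0, 4001)
loglik = np.array([-0.5 * np.sum((x - t) ** 2) for t in thetas])

theta_hat = thetas[np.argmax(loglik)]  # grid maximizer of l(theta)
# Setting the gradient to zero gives sum_k (x_k - theta) = 0,
# i.e. theta equals the sample mean -- which the grid search recovers.
```

The grid maximizer agrees with the sample mean `x.mean()` to within the grid spacing, as the zero-gradient condition requires.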


The Gaussian Case: Unknown μ

- Suppose that the samples are drawn from a multivariate normal population with mean μ and covariance matrix Σ
- Suppose μ is the only unknown
- Consider a sample point xk:

$$\ln p(x_k|\mu) = -\frac{1}{2}\ln\left[(2\pi)^d |\Sigma|\right] - \frac{1}{2}(x_k-\mu)^t \Sigma^{-1} (x_k-\mu)$$

- and

$$\nabla_\mu \ln p(x_k|\mu) = \Sigma^{-1}(x_k-\mu)$$


The MLE of μ must satisfy

$$\sum_{k=1}^{n} \Sigma^{-1}(x_k-\hat{\mu}) = 0$$

- Multiplying by Σ and rearranging:

$$\hat{\mu} = \frac{1}{n}\sum_{k=1}^{n} x_k$$


Sample Mean

- The MLE for the unknown population mean is just the arithmetic average of the training samples (the sample mean)
- If we think of the n samples as a cloud of points, then the sample mean is the centroid of the cloud
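A quick numerical check (a numpy sketch; the 2-D data and seed are assumptions for illustration) that the ML estimate of the mean is simply the centroid of the sample cloud:

```python
import numpy as np

# Synthetic cloud of points (assumption): 500 draws from a 2-D Gaussian
rng = np.random.default_rng(1)
X = rng.normal(loc=[1.0, -2.0], scale=1.0, size=(500, 2))

# Sample mean = centroid of the cloud = MLE of mu
mu_hat = X.mean(axis=0)
```

With enough samples, `mu_hat` lands close to the true mean `[1.0, -2.0]` used to generate the data.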


The Gaussian Case: Unknown μ and Σ

- This is the more typical case, in which both the mean and the covariance matrix are unknown
- Consider the univariate case with θ1 = μ and θ2 = σ². The log-likelihood of a single point is

$$\ln p(x_k|\theta) = -\frac{1}{2}\ln 2\pi\theta_2 - \frac{1}{2\theta_2}(x_k-\theta_1)^2$$


And its derivative is

$$\nabla_\theta \ln p(x_k|\theta) = \begin{pmatrix} \dfrac{1}{\theta_2}(x_k-\theta_1) \\[2ex] -\dfrac{1}{2\theta_2} + \dfrac{(x_k-\theta_1)^2}{2\theta_2^2} \end{pmatrix}$$

- Setting the full gradient to 0 gives

$$\sum_{k=1}^{n} \frac{1}{\hat{\theta}_2}(x_k-\hat{\theta}_1) = 0$$

- and

$$-\sum_{k=1}^{n} \frac{1}{\hat{\theta}_2} + \sum_{k=1}^{n} \frac{(x_k-\hat{\theta}_1)^2}{\hat{\theta}_2^2} = 0$$


With a little rearranging, we have

$$\hat{\mu} = \frac{1}{n}\sum_{k=1}^{n} x_k \qquad \hat{\sigma}^2 = \frac{1}{n}\sum_{k=1}^{n}(x_k-\hat{\mu})^2$$


MLE for the multivariate case:

$$\hat{\mu} = \frac{1}{n}\sum_{k=1}^{n} x_k \qquad \hat{\Sigma} = \frac{1}{n}\sum_{k=1}^{n}(x_k-\hat{\mu})(x_k-\hat{\mu})^t$$
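The multivariate estimates are straightforward to compute. A minimal numpy sketch (the 2-D parameters, seed, and sample size are assumptions for illustration); note the 1/n factor, which matches `np.cov(..., bias=True)`:

```python
import numpy as np

# Synthetic 2-D Gaussian data (assumed parameters, for illustration only)
rng = np.random.default_rng(2)
true_mu = np.array([0.0, 3.0])
true_cov = np.array([[2.0, 0.5],
                     [0.5, 1.0]])
X = rng.multivariate_normal(true_mu, true_cov, size=2000)

n = X.shape[0]
mu_hat = X.mean(axis=0)                # (1/n) sum_k x_k
centered = X - mu_hat
sigma_hat = centered.T @ centered / n  # (1/n) sum_k (x_k - mu_hat)(x_k - mu_hat)^t
```

`sigma_hat` is symmetric by construction and coincides with numpy's biased covariance estimate; dividing by n (rather than n - 1) is exactly what makes it the MLE, as the bias discussion below the slide makes precise.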


Bias

- The MLE for the variance σ² is biased: the expected value over all data sets of size n of the sample variance is not equal to the true variance:

$$\mathcal{E}\left[\frac{1}{n}\sum_{k=1}^{n}(x_k-\bar{x})^2\right] = \frac{n-1}{n}\sigma^2 \neq \sigma^2$$

- An unbiased estimator for Σ is given by

$$C = \frac{1}{n-1}\sum_{k=1}^{n}(x_k-\hat{\mu})(x_k-\hat{\mu})^t$$
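The (n−1)/n bias factor is easy to see empirically. A Monte Carlo sketch (the sample size n = 5, trial count, and true variance are assumptions for illustration), averaging both estimators over many small data sets:

```python
import numpy as np

# Monte Carlo setup (assumed): many data sets of size n from a Gaussian
# with true variance sigma^2 = 4
rng = np.random.default_rng(3)
n, trials, true_var = 5, 200_000, 4.0
samples = rng.normal(0.0, np.sqrt(true_var), size=(trials, n))

# ddof=0 divides by n (the biased MLE); ddof=1 divides by n-1 (unbiased)
mean_mle_var = samples.var(axis=1, ddof=0).mean()
mean_unbiased_var = samples.var(axis=1, ddof=1).mean()
# mean_mle_var hovers near (n-1)/n * sigma^2 = 3.2,
# while mean_unbiased_var hovers near sigma^2 = 4.0
```

With n = 5 the MLE systematically underestimates the variance by a factor of 4/5, which is substantial; as n grows, (n−1)/n → 1 and the two estimators become indistinguishable.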
