
Bayesian Learning VC Dimension


Jahwan Kim, AIPR Lab., Dept. of CS, KAIST (2000. 5. 24)



Contents

- Bayesian learning
- General idea, & an example

- Parametric vs. nonparametric statistical inference
- Model capacity and generalizability
- Further readings

Jahwan Kim, Dept. of CS, KAIST

Bayesian learning

- Conclusions are drawn from hypotheses constructed from the given data.
- Predictions are made from the hypotheses, weighted by their posterior probabilities.
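As a toy illustration of posterior-weighted prediction, the sketch below uses a hypothetical discrete set of coin-bias hypotheses (the example and all numbers are my own assumptions, not from the slides):

```python
# Hypothetical hypothesis space: the coin's bias P(heads) is one of 9 values.
hypotheses = [i / 10 for i in range(1, 10)]
prior = {h: 1 / len(hypotheses) for h in hypotheses}  # uniform prior P(H)

# Given data D: 8 heads and 2 tails.
data = [1] * 8 + [0] * 2

def likelihood(h, data):
    """P(D|H) for i.i.d. coin flips under bias h."""
    p = 1.0
    for x in data:
        p *= h if x == 1 else 1 - h
    return p

# Posterior P(H|D) is proportional to P(D|H) P(H), normalized over all H.
unnorm = {h: likelihood(h, data) * prior[h] for h in hypotheses}
z = sum(unnorm.values())
posterior = {h: u / z for h, u in unnorm.items()}

# Prediction for the next flip, weighted by the posterior:
# P(heads|D) = sum over H of P(heads|H) P(H|D)
p_heads = sum(h * posterior[h] for h in hypotheses)
```

With 8 heads in 10 flips, the posterior-weighted prediction lands near 0.75 rather than committing to any single hypothesis.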

Bayesian learning: Formulation

- The prediction is P(X|D) = Σ_H P(X|H) P(H|D), where X is the prediction, the H’s are the hypotheses, and D is the given data.
- This requires calculating P(H|D) for all H’s, which is intractable in many cases.

Bayesian learning: Maximum a posteriori hypothesis

- Take the H that maximizes the a posteriori probability P(H|D).
- How do we find such an H? Use Bayes’ rule:
  P(H|D) = P(D|H) P(H) / P(D)

Bayesian learning, continued

- P(D) remains fixed for all H, so it can be ignored when maximizing.
- P(D|H) is the likelihood of observing the given data under H.
- P(H), the prior probability, has been the source of debate.
- If the prior is too biased, we get underfitting.
- Sometimes a uniform prior is appropriate. In that case, MAP reduces to choosing the maximum likelihood hypothesis.
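A minimal sketch of the distinction, using a made-up two-hypothesis coin problem (all values are assumptions): with a strongly biased prior, the MAP hypothesis can differ from the maximum likelihood one, while under a uniform prior they coincide.

```python
import math

hypotheses = [0.5, 0.9]  # hypothetical coin biases: fair vs. heavily biased

def log_likelihood(h, n_heads, n_tails):
    """log P(D|H) for i.i.d. flips."""
    return n_heads * math.log(h) + n_tails * math.log(1 - h)

n_heads, n_tails = 9, 1  # observed data

def map_hypothesis(prior):
    """argmax over H of P(D|H) P(H), computed in the log domain."""
    return max(prior, key=lambda h: log_likelihood(h, n_heads, n_tails)
                                    + math.log(prior[h]))

biased_prior = {0.5: 0.99, 0.9: 0.01}   # strong prior belief in fairness
uniform_prior = {0.5: 0.5, 0.9: 0.5}

ml = max(hypotheses, key=lambda h: log_likelihood(h, n_heads, n_tails))
# Here ml is 0.9, map_hypothesis(biased_prior) is 0.5, and
# map_hypothesis(uniform_prior) equals ml, as the slide notes.
```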

Bayesian learning: Parameter estimation

- Problem: Find p(x|D) when
- We know the form of the pdf, i.e., the pdf is parametrized by θ, written as p(x|θ).
- The a priori pdf p(θ) is known.
- Data D is given.

- We only have to find p(θ|D), since then we may use
  p(x|D) = ∫ p(x|θ) p(θ|D) dθ

Parameter estimation, continued

- By Bayes’ rule,
  p(θ|D) = p(D|θ) p(θ) / ∫ p(D|θ) p(θ) dθ
- Assume also that each sample in D is drawn independently with identical pdf, i.e., the samples are i.i.d. Then
  p(D|θ) = ∏_k p(x_k|θ)
- This gives the formal solution to the problem.
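Numerically, this formal solution can be approximated on a grid of θ values; the sketch below does so for a normal likelihood with known spread (all concrete values are assumptions for illustration):

```python
import math

def normal_pdf(x, mean, std):
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

data = [2.1, 1.9, 2.3, 2.0, 2.2]  # hypothetical i.i.d. samples
sigma = 0.5                        # known spread of p(x|theta)
step = 0.01
grid = [i * step for i in range(401)]  # theta ranges over [0, 4]

# p(D|theta) = product over k of p(x_k|theta), by the i.i.d. assumption.
def lik(t):
    p = 1.0
    for x in data:
        p *= normal_pdf(x, t, sigma)
    return p

prior = [normal_pdf(t, 0.0, 10.0) for t in grid]  # broad prior p(theta)
unnorm = [lik(t) * p0 for t, p0 in zip(grid, prior)]
z = sum(unnorm) * step             # approximates the normalizing integral
posterior = [u / z for u in unnorm]

# The posterior peaks near the sample mean (2.1 for this data).
theta_map = grid[max(range(len(grid)), key=lambda i: posterior[i])]
```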

Parameter estimation: Example

- One-dimensional normal distribution, with mean μ and variance σ².
- Two parameters, μ and σ.
- Assume that p(μ) is normal with known mean m and variance s.
- Assume also that σ is known.
- Then
  p(μ|D) ∝ ∏_k p(x_k|μ) p(μ)

Example, continued

- A squared term in μ appears in the exponent of the expression (complete the square, or compute it directly).
- Namely, p(μ|D) is also normal.
- Its mean and variance are given by
  μ_n = (n s x̄ + σ² m) / (n s + σ²),   s_n = s σ² / (n s + σ²),
  where x̄ is the sample mean.
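The update can be checked directly in code; this sketch treats s as the prior variance and σ² as the known data variance, with made-up numbers:

```python
def posterior_mean_var(data, m, s, sigma2):
    """Closed-form posterior N(mu_n, s_n) for the mean of a normal,
    given prior p(mu) = N(m, s) and known data variance sigma2."""
    n = len(data)
    xbar = sum(data) / n
    mu_n = (n * s * xbar + sigma2 * m) / (n * s + sigma2)
    s_n = s * sigma2 / (n * s + sigma2)
    return mu_n, s_n

# Hypothetical data and prior: the posterior mean is pulled from the
# prior mean m = 0 toward the sample mean, and the variance shrinks
# below both the prior variance and the data variance.
data = [1.2, 0.8, 1.0, 1.1]
mu_n, s_n = posterior_mean_var(data, m=0.0, s=1.0, sigma2=0.25)
```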


Estimation of mean

- As n goes to infinity, p(μ|D) approaches the Dirac delta function centered at the sample mean.
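Using the closed-form posterior variance s_n = sσ²/(ns + σ²) with illustrative values, the concentration is easy to see numerically:

```python
# Posterior variance s_n = s*sigma2 / (n*s + sigma2) for growing n,
# with assumed prior variance s = 1.0 and data variance sigma2 = 0.25.
s, sigma2 = 1.0, 0.25
variances = [s * sigma2 / (n * s + sigma2) for n in (1, 10, 100, 1000)]
# The variance shrinks toward 0, i.e., p(mu|D) concentrates at the
# sample mean, consistent with the Dirac-delta limit above.
```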

Two main approaches to (statistical) inference

- Parametric inference
- The investigator should know the problem well.
- The model contains a finite number of unknown parameters.

- Nonparametric inference
- There is no reliable a priori information about the problem.
- The number of samples required is typically large.
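A toy contrast between the two approaches (entirely illustrative, not from the slides): a parametric fit compresses the data into a fixed number of parameters, while a nonparametric estimate such as a histogram assumes no model form but keeps a data-dependent summary.

```python
import random

random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(500)]  # synthetic sample

# Parametric: assume a normal model; only two unknown parameters.
mean = sum(data) / len(data)
var = sum((x - mean) ** 2 for x in data) / len(data)

# Nonparametric: a histogram density estimate; no model form is assumed,
# but many more cells (and samples) are needed for a good estimate.
def histogram(data, n_bins=20):
    lo, hi = min(data), max(data)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in data:
        counts[min(int((x - lo) / width), n_bins - 1)] += 1
    return counts

counts = histogram(data)
```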

Capacity of models

- Well-known fact:
- If a model is too complicated, it doesn’t generalize well;
- if too simple, it doesn’t represent the data well.

- How do we measure model capacity?
- In classical statistics, by the number of parameters, or degrees of freedom.
- In the (new) statistical learning theory, by the VC dimension.

VC dimension

- The Vapnik-Chervonenkis (VC) dimension is a measure of the capacity of a model: the largest number of points the model can shatter, i.e., separate into all possible labelings.

VC dimension: Examples

- It’s not always equal to the number of parameters:
- The family of lines {sgn(ax+by+c)} in the 2D plane has VC dimension 3, but

- the one-parameter family {sgn(sin ax)} (in one dimension) has infinite VC dimension!
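A brute-force sketch of the first claim (the points and parameter grid are my own choices): every labeling of 3 points in general position is realized by some line sgn(ax+by+c), while the XOR labeling of 4 points is the classic unrealizable case.

```python
from itertools import product

def realizable(points, labels, grid):
    """Does some (a, b, c) in the grid give sgn(a*x + b*y + c) == labels?"""
    for a, b, c in grid:
        if all((a * x + b * y + c > 0) == (lab == 1)
               for (x, y), lab in zip(points, labels)):
            return True
    return False

# A small exact search grid; enough to shatter these 3 points.
grid = list(product((-2, 0, 2), (-2, 0, 2), (-1, 1)))

three = [(0, 0), (1, 0), (0, 1)]
shattered = all(realizable(three, labels, grid)
                for labels in product((1, -1), repeat=3))  # all 8 labelings

# XOR labeling of 4 corners: known to be linearly inseparable, so no
# choice of (a, b, c) -- in this grid or anywhere -- can realize it.
four = [(0, 0), (1, 1), (1, 0), (0, 1)]
xor_ok = realizable(four, (1, 1, -1, -1), grid)
```

The failing grid search does not by itself prove the 4-point case; the impossibility of XOR is the standard argument that the VC dimension of lines in the plane is exactly 3.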

Theorem from STL on VC dimension and generalizability

- Roughly: for a model class of VC dimension h trained on n samples, with probability at least 1−η the true risk R is bounded by the empirical risk R_emp plus a confidence term that grows with h and shrinks with n:
  R ≤ R_emp + √( (h(ln(2n/h) + 1) − ln(η/4)) / n )

Further readings

- Vapnik, Statistical Learning Theory, Ch. 0 & sections 1.1-1.3
- Haykin, Neural Networks, sections 2.13-2.14
- Duda & Hart, Pattern Classification and Scene Analysis, sections 3.3-3.5
