
580.704 Mathematical Foundations of BME Reza Shadmehr



Presentation Transcript


  1. 580.704 Mathematical Foundations of BME, Reza Shadmehr: logistic regression, iteratively re-weighted least squares

  2. Logistic regression • In the last lecture we classified by computing a posterior probability. The posterior was calculated by modeling the likelihood and prior for each class. • To compute the posterior, we modeled the right side of Bayes' rule (shown below) by assuming the class-conditional densities were Gaussian and estimating their parameters, or by using a kernel estimate of the density. • In logistic regression, we instead model the posterior directly as a function of the variable x. • In practice, when there are k classes to classify, we model the posterior of each class directly; the soft-max form of this model is derived at the end of the lecture.
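The equation the slide refers to did not survive the transcript; it is presumably Bayes' rule, reproduced here:

```latex
% Bayes' rule: posterior from class-conditional likelihood and prior
p(y \mid x) \;=\; \frac{p(x \mid y)\, p(y)}{p(x)}
```

Logistic regression skips modeling p(x|y) and p(y) and parameterizes p(y|x) itself.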

  3. Classification by maximizing the posterior distribution. In this example we assume that the distributions of the two classes have equal variance. Suppose we want to classify a person as male or female based on height. Height is normally distributed in the population of men and in the population of women, with different means but similar variances. Let y be an indicator variable for being female; what we have are the class-conditional densities and the prior, and what we want is the posterior. The conditional distribution of x (the height) then becomes:
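The slide's equations were images and are missing from the transcript; in notation assumed here for illustration (means μ_m and μ_f, shared variance σ², prior π), they are presumably:

```latex
% What we have: class-conditional densities and a prior
x \mid y=0 \sim \mathcal{N}(\mu_m, \sigma^2), \qquad
x \mid y=1 \sim \mathcal{N}(\mu_f, \sigma^2), \qquad
p(y=1) = \pi
% What we want: the posterior  p(y=1 \mid x)
```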

  4. Posterior probability for classification when we have two classes:
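The two-class posterior the slide displays follows from Bayes' rule with p(x) expanded over both classes:

```latex
p(y=1 \mid x)
  = \frac{p(x \mid y=1)\, p(y=1)}
         {p(x \mid y=1)\, p(y=1) + p(x \mid y=0)\, p(y=0)}
```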

  5. Computing the probability that the subject is female, given that we observed height x. [Figure: the posterior p(y=1|x) plotted against height x from 120 to 220 cm, rising from 0 to 1 as a logistic function.] In the denominator, x appears linearly inside the exponential. So if we assume that the class-membership densities p(x|y) are normal with equal variance, the posterior probability is a logistic function. Posterior:
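Substituting the equal-variance Gaussians into the posterior above and dividing numerator and denominator by the numerator gives the logistic form the slide plots (a reconstruction consistent with the surrounding text, in the notation assumed earlier):

```latex
p(y=1 \mid x) = \frac{1}{1 + e^{-(w x + b)}}, \qquad
w = \frac{\mu_f - \mu_m}{\sigma^2}, \qquad
b = \frac{\mu_m^2 - \mu_f^2}{2\sigma^2} + \ln\frac{\pi}{1-\pi}
```

The quadratic terms in x cancel because the variances are equal, which is why x appears only linearly inside the exponential.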

  6. Logistic regression with the assumption of equal variance among the class densities implies a linear decision boundary. [Figure: two equal-variance Gaussian clusters in the plane, one labeled Class 0, separated by a straight boundary line.] The boundary is the set of points where the posterior equals 1/2, i.e., where the linear argument of the logistic function is zero.

  7. Logistic regression: problem statement. Under the assumption of equal variance among the clusters, the goal is to find the parameters w that maximize the log-likelihood of the observed class labels.
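With μ_i denoting the model's predicted probability for example i, the log-likelihood being maximized is the standard Bernoulli one:

```latex
\log L(w) = \sum_{i=1}^{n} \Big[\, y_i \log \mu_i + (1 - y_i) \log(1 - \mu_i) \,\Big],
\qquad \mu_i = \frac{1}{1 + e^{-w^{T} x_i}}
```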

  8. Some useful properties of the logistic function
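The properties on the slide are almost certainly the standard ones, which drive the gradient computations in the next slides:

```latex
\sigma(z) = \frac{1}{1 + e^{-z}}, \qquad
1 - \sigma(z) = \sigma(-z), \qquad
\frac{d\sigma}{dz} = \sigma(z)\,\big(1 - \sigma(z)\big)
```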

  9. Online algorithm for logistic regression
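The slide's update rule is not in the transcript; below is a minimal sketch of the standard per-sample gradient-ascent update for the Bernoulli log-likelihood. The learning rate eta and the epoch count are illustrative choices, not the slide's.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def online_logistic(X, y, eta=0.1, epochs=50):
    """Online logistic regression: one gradient-ascent step per sample.

    X : (n, d) design matrix (include a column of ones for the bias term).
    y : (n,) labels in {0, 1}.
    The per-sample gradient of the log-likelihood is (y_i - mu_i) * x_i.
    """
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            mu = sigmoid(w @ xi)        # current predicted probability
            w += eta * (yi - mu) * xi   # ascend the log-likelihood
    return w
```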

  10. Batch algorithm: Iteratively Re-weighted Least Squares
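A minimal sketch of the IRLS (Newton-Raphson) iteration in its usual formulation, where each step solves a weighted least-squares problem with weights R = diag(μ_i(1 - μ_i)); the iteration count and tolerance are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def irls_fit(X, y, n_iter=25, tol=1e-8):
    """Batch logistic regression via Iteratively Re-weighted Least Squares.

    Each Newton step solves (X^T R X) step = X^T (y - mu), where
    R = diag(mu * (1 - mu)) comes from the Hessian of the log-likelihood.
    """
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = sigmoid(X @ w)               # predicted probabilities
        r = mu * (1.0 - mu)               # IRLS weights, diagonal of R
        grad = X.T @ (y - mu)             # gradient of the log-likelihood
        H = X.T @ (X * r[:, None])        # X^T R X, the negative Hessian
        step = np.linalg.solve(H, grad)   # Newton direction
        w = w + step
        if np.linalg.norm(step) < tol:    # stop when the update is tiny
            break
    return w
```

Each iteration is exactly a weighted least-squares solve, which is what gives the algorithm its name.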

  11. Iteratively Re-weighted Least Squares: sensitivity to error. [Figure: sensitivity to error plotted against the predicted probability, labeled "certain" near the two extremes and "uncertain" in the middle.]

  12. Iteratively Re-weighted Least Squares: example. [Figure: two-class data plotted in the (x1, x2) plane, with the fitted posterior probability (0 to 1) shown over the same plane.]

  13. Modeling the posterior when the densities have unequal variance (univariate case with two classes).
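Reconstructing the algebra the slide presumably carries out: repeating the derivation of slide 5 with unequal variances, the quadratic terms no longer cancel, so the log-odds a(x) are quadratic in x (notation as assumed earlier):

```latex
p(y=1 \mid x) = \frac{1}{1 + e^{-a(x)}}, \qquad
a(x) = \ln\frac{\sigma_m}{\sigma_f}
     + \frac{(x - \mu_m)^2}{2\sigma_m^2}
     - \frac{(x - \mu_f)^2}{2\sigma_f^2}
     + \ln\frac{\pi}{1 - \pi}
```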

  14. Logistic regression with basis functions. [Figure: two-class data on a line, with the estimated posterior probability plotted over x.] By using non-linear bases, we can deal with clusters having unequal variance.
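A minimal sketch of the basis-function idea, reusing the sigmoid and irls_fit functions from the IRLS sketch above; the quadratic basis is one illustrative choice, matching the quadratic log-odds of the unequal-variance case:

```python
import numpy as np

def quadratic_basis(x):
    """Map scalar inputs x to phi(x) = (1, x, x^2)."""
    x = np.asarray(x, dtype=float)
    return np.column_stack([np.ones_like(x), x, x**2])

# Usage (names from the earlier sketch):
#   w = irls_fit(quadratic_basis(x_train), y_train)
#   posterior = sigmoid(quadratic_basis(x_new) @ w)
```

The model stays linear in w, so the same IRLS machinery applies unchanged; only the feature map is nonlinear.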

  15. Logistic function for multiple classes with equal variance. Rather than modeling the posterior directly, we pick the posterior of one class as a reference and model the ratio of every other class's posterior to that reference. Suppose we have k classes:
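The ratio model the slide sets up is presumably the standard one, taking class k as the reference:

```latex
\ln \frac{p(y=i \mid x)}{p(y=k \mid x)} = w_i^{T} x, \qquad i = 1, \dots, k-1
```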

  16. Logistic function for multiple classes with equal variance: soft-max. Normalizing the resulting exponentials yields a "soft-max" function:
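Solving the ratio model for the posteriors (with w_k = 0 for the reference class) gives p(y=i|x) = exp(w_i^T x) / Σ_j exp(w_j^T x). A minimal, numerically stable sketch:

```python
import numpy as np

def softmax(A):
    """Row-wise soft-max, where A[i, j] = w_j^T x_i.

    Subtracting each row's max before exponentiating avoids overflow
    and leaves the result unchanged.
    """
    A = A - A.max(axis=1, keepdims=True)
    expA = np.exp(A)
    return expA / expA.sum(axis=1, keepdims=True)
```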

  17. Classification of multiple classes with equal variance. [Figure: class-conditional densities of height (x from 160 to 220 cm) for several classes, the corresponding discriminant functions, and the resulting posterior probabilities, each approaching 1 over the region where its class dominates.]
