Linear Methods for Classification



  1. Linear Methods for Classification 20.04.2015: Presentation for the MA seminar in statistics, Eli Dahan

  2. Outline • Introduction – problem and solution • LDA – Linear Discriminant Analysis • LR – Logistic Regression (and Linear Regression) • LDA vs. LR • In a word – Separating Hyperplanes

  3. Introduction – the problem Given an observation X, do we assign it to group k or to group l? The assignment is based on the posteriori probabilities Pj = P(G = j | X = x). *We can think of G as a “group label”

  4. Introduction – the solution Decision rule: if pk > pl choose k; if pl > pk choose l. The (linear) decision boundary is the set of x where pk = pl.
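As an illustration (not from the original slides), a minimal Python sketch of this rule; the posterior values below are made up:

```python
import numpy as np

# Hypothetical posteriors P(G = j | X = x) for two groups k and l,
# one row per observation; the numbers are purely illustrative.
posteriors = np.array([
    [0.7, 0.3],   # pk > pl  -> choose k
    [0.5, 0.5],   # pk = pl  -> x lies on the decision boundary
    [0.2, 0.8],   # pl > pk  -> choose l
])

labels = np.array(["k", "l"])
# Assign each observation to the group with the larger posterior.
print(labels[np.argmax(posteriors, axis=1)])  # ['k' 'k' 'l'] (tie goes to k)
```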

  5. Linear Discriminant Analysis • Let P(G = k) = πk and P(X = x | G = k) = fk(x) • Then by Bayes rule: P(G = k | X = x) = fk(x) πk / Σl fl(x) πl • Decision boundary between classes k and l: the set of x where P(G = k | X = x) = P(G = l | X = x)
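A sketch of this computation, assuming nothing beyond the definitions above; the Gaussian densities and priors are illustrative stand-ins for fk and πk:

```python
import numpy as np

def gauss_pdf(x, mu, sigma):
    """Univariate Gaussian density (used here as a stand-in for f_k)."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def posteriors(x, priors, densities):
    """Bayes rule: P(G = k | X = x) = f_k(x) pi_k / sum_l f_l(x) pi_l."""
    unnorm = np.array([f(x) * pi for f, pi in zip(densities, priors)])
    return unnorm / unnorm.sum()

# Illustrative two-class example: equal priors, unit-variance Gaussians.
dens = [lambda x: gauss_pdf(x, -1.0, 1.0), lambda x: gauss_pdf(x, +1.0, 1.0)]
print(posteriors(0.2, [0.5, 0.5], dens))  # slightly favors the mean-(+1) class
```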

  6. Linear Discriminant Analysis • Assuming fk(x) ~ Gauss(μk, Σk) and Σ1 = Σ2 = … = ΣK = Σ (a common covariance) • We get a linear (in x) decision boundary • For a non-common Σ we get QDA (with RDA as a regularized compromise between the two)
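Under these assumptions the classifier can be written through the linear discriminant scores δk(x) = xᵀΣ⁻¹μk − ½ μkᵀΣ⁻¹μk + log πk: because Σ is shared, the quadratic term in x cancels from every pairwise comparison, which is exactly why the boundaries are linear. A sketch, with made-up parameters:

```python
import numpy as np

def lda_scores(X, means, cov, priors):
    """Linear discriminant scores delta_k(x) = x'S^-1 mu_k - 0.5 mu_k'S^-1 mu_k + log pi_k.

    With a common covariance S, the decision boundary {delta_k = delta_l}
    is linear in x.
    """
    cov_inv = np.linalg.inv(cov)
    scores = []
    for mu, pi in zip(means, priors):
        w = cov_inv @ mu                            # linear coefficient
        b = -0.5 * mu @ cov_inv @ mu + np.log(pi)   # intercept
        scores.append(X @ w + b)
    return np.column_stack(scores)

# Illustrative 2-class example in R^2 with a shared covariance.
means = [np.array([0.0, 0.0]), np.array([2.0, 1.0])]
cov = np.array([[1.0, 0.3], [0.3, 1.0]])
X = np.array([[0.5, 0.2], [1.8, 1.1]])
print(np.argmax(lda_scores(X, means, cov, [0.5, 0.5]), axis=1))  # [0 1]
```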

  7. Linear Discriminant Analysis • Using empirical estimation methods for πk, μk and Σ: • A top classifier in the STATLOG study (Michie et al., 1994) – real data often supports linear boundaries, and the fitted boundaries are stable
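The usual plug-in estimates behind this are the class frequencies, the class means, and the pooled within-class covariance; a sketch (the function name is mine):

```python
import numpy as np

def estimate_lda_params(X, y):
    """Plug-in estimates used by LDA:
    pi_k  = N_k / N                  (class frequencies)
    mu_k  = mean of the x_i in class k
    Sigma = pooled within-class covariance, common to all classes
    """
    classes = np.unique(y)
    N, p = X.shape
    priors, means = [], []
    pooled = np.zeros((p, p))
    for k in classes:
        Xk = X[y == k]
        priors.append(len(Xk) / N)
        mu = Xk.mean(axis=0)
        means.append(mu)
        pooled += (Xk - mu).T @ (Xk - mu)
    pooled /= N - len(classes)       # unbiased pooled estimate
    return np.array(priors), np.array(means), pooled
```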

  8. Logistic Regression • Models the posterior probabilities of the K classes so that they sum to one and remain in [0, 1]: log(P(G = k | X = x) / P(G = K | X = x)) = βk0 + βkᵀx for k = 1, …, K−1 • The decision boundaries are linear in x: the set where P(G = k | X = x) = P(G = l | X = x)
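A sketch of this standard multiclass parameterization, with class K as the reference; by construction the K posteriors are in [0, 1] and sum to one (the coefficients below are made up):

```python
import numpy as np

def logistic_posteriors(x, betas):
    """Multiclass logistic regression posteriors.

    With class K as the reference (its coefficients fixed at zero),
    P(G = k | X = x) = exp(b_k0 + b_k'x) / (1 + sum_{l<K} exp(b_l0 + b_l'x)).
    """
    # betas: (K-1, p+1) array of [intercept, coefficients] per non-reference class
    logits = betas[:, 0] + betas[:, 1:] @ x
    logits = np.append(logits, 0.0)        # reference class has logit 0
    expd = np.exp(logits - logits.max())   # subtract max for numerical stability
    return expd / expd.sum()

# Illustrative: 3 classes, 2 features, made-up coefficients.
betas = np.array([[0.2, 1.0, -0.5],
                  [-0.1, 0.3, 0.8]])
p = logistic_posteriors(np.array([0.4, 1.2]), betas)
print(p, p.sum())  # three probabilities in [0, 1] that sum to 1.0
```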

  9. Logistic Regression • Model fit: the parameters are chosen to maximize the conditional log-likelihood • The maximization has no closed form, so the Newton-Raphson algorithm is used
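For the two-class case, a minimal sketch of the Newton-Raphson updates (often called iteratively reweighted least squares, IRLS); this is an illustrative implementation, not the one behind the slides:

```python
import numpy as np

def fit_logistic_newton(X, y, n_iter=20):
    """Binary logistic regression by Newton-Raphson (IRLS).

    Each step solves  beta <- beta + (X'WX)^-1 X'(y - p),
    where p are the current fitted probabilities and W = diag(p(1 - p)).
    """
    X1 = np.column_stack([np.ones(len(X)), X])   # prepend intercept column
    beta = np.zeros(X1.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X1 @ beta))     # current probabilities
        W = p * (1 - p)                          # Newton weights (diagonal of W)
        H = X1.T @ (X1 * W[:, None])             # Hessian X'WX
        beta += np.linalg.solve(H, X1.T @ (y - p))
    return beta

# Illustrative fit on toy data: class 1 tends to have larger x.
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = (x + 0.5 * rng.normal(size=100) > 0).astype(float)
print(fit_logistic_newton(x[:, None], y))  # positive slope expected
```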

  10. Linear Regression • Recall the usual assumptions of multivariate regression (lack of multicollinearity, etc.) • Here: assuming N instances (an N×p observation matrix X), Y is an N×K indicator response matrix (K classes); a sketch of this setup follows below.
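The sketch referenced above: regress the indicator matrix Y on X by least squares and classify a new x to the class with the largest fitted value (the function names are mine):

```python
import numpy as np

def fit_indicator_regression(X, y, K):
    """Linear regression on an N x K indicator response matrix.

    Y[i, k] = 1 iff observation i belongs to class k.
    """
    N = len(X)
    Y = np.zeros((N, K))
    Y[np.arange(N), y] = 1.0                    # indicator response matrix
    X1 = np.column_stack([np.ones(N), X])       # add intercept column
    B, *_ = np.linalg.lstsq(X1, Y, rcond=None)  # least-squares coefficients
    return B

def predict_indicator(B, X):
    X1 = np.column_stack([np.ones(len(X)), X])
    return np.argmax(X1 @ B, axis=1)            # largest fitted value wins
```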

  11. Linear Regression

  12. Linear Regression

  13. LDA vs. LR • Similar results; LDA slightly better (56% error rate vs. 67% for LR) • Presumably they behave so similarly because both produce decision boundaries that are linear in x (we return to this).

  14. LDA vs. LR • LDA: parameters are fit by maximizing the full log-likelihood based on the joint density, which assumes Gaussian class densities (Efron 1975: in the worst case, ignoring Gaussianity costs about a 30% reduction in efficiency). Linearity is derived. • LR: P(X) is left arbitrary (an advantage in model selection and in the ability to absorb extreme X values); the parameters of P(G | X) are fit by maximizing the conditional likelihood. Linearity is assumed.
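To put the two methods side by side in practice, a sketch using scikit-learn on synthetic Gaussian data with a shared covariance (the setting where LDA's model holds exactly; this does not reproduce the error rates quoted on slide 13):

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Two Gaussian classes with a common (identity) covariance.
rng = np.random.default_rng(0)
n = 500
X = np.vstack([rng.multivariate_normal([0.0, 0.0], np.eye(2), n),
               rng.multivariate_normal([1.5, 1.0], np.eye(2), n)])
y = np.repeat([0, 1], n)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for model in (LinearDiscriminantAnalysis(), LogisticRegression()):
    err = 1 - model.fit(X_tr, y_tr).score(X_te, y_te)
    print(type(model).__name__, f"test error = {err:.3f}")
```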

  15. In a word – separating hyperplanes
