Linear Discriminant Analysis (LDA)



Presentation Transcript


  1. Linear Discriminant Analysis (LDA)

  2. Goal • To classify observations into two or more groups using discriminant functions (for k classes, at most k − 1 discriminant functions are needed; the dependent variable Y is categorical with k classes). • Assumptions • Multivariate normal distribution: the variables are normally distributed within each class/group. • Similar group covariances: the correlations between variables and the variances within each group should be similar across groups.
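The two-class case described above can be sketched in plain Python: estimate each group's mean, pool the within-group covariance (this is where the "similar group covariances" assumption matters), and classify by which side of the midpoint the discriminant score falls on. All data values below are made up for illustration.

```python
# Minimal two-class, two-feature LDA sketch; training data are illustrative.
group0 = [(2.0, 3.0), (2.5, 3.5), (3.0, 2.5), (2.2, 3.1)]
group1 = [(6.0, 7.0), (6.5, 6.5), (7.0, 7.5), (6.2, 6.8)]

def mean(points):
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(2))

def pooled_cov(g0, g1, m0, m1):
    # Pooled within-group covariance: one covariance matrix shared by
    # both classes, per the similar-covariances assumption.
    s = [[0.0, 0.0], [0.0, 0.0]]
    for pts, m in ((g0, m0), (g1, m1)):
        for p in pts:
            d = (p[0] - m[0], p[1] - m[1])
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    n = len(g0) + len(g1) - 2
    return [[s[i][j] / n for j in range(2)] for i in range(2)]

m0, m1 = mean(group0), mean(group1)
S = pooled_cov(group0, group1, m0, m1)

# Invert the 2x2 pooled covariance directly.
det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
Sinv = [[S[1][1] / det, -S[0][1] / det],
        [-S[1][0] / det, S[0][0] / det]]

# Discriminant direction w = S^-1 (m1 - m0); classify by comparing the
# score w.x against the score of the midpoint between the group means.
dm = (m1[0] - m0[0], m1[1] - m0[1])
w = tuple(Sinv[i][0] * dm[0] + Sinv[i][1] * dm[1] for i in range(2))
mid = tuple((m0[i] + m1[i]) / 2 for i in range(2))
threshold = w[0] * mid[0] + w[1] * mid[1]

def classify(x):
    return 1 if w[0] * x[0] + w[1] * x[1] > threshold else 0
```

With more than two classes, the same idea generalizes to several discriminant functions rather than a single direction.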

  3. Dependent Variable • Must be categorical with 2 or more classes (groups). • If there are only 2 classes, the discriminant analysis procedure gives essentially the same classification as a multiple regression on the binary dependent variable.

  4. Independent Variables • Continuous or categorical independent variables. • If categorical, they are converted into binary (dummy) variables, as in multiple linear regression.
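The dummy-coding step can be sketched as follows; the `region` labels are made up, and one level is dropped as the reference category, as in multiple linear regression.

```python
# Sketch of binary (dummy) coding for a categorical predictor.
def to_dummies(values, drop_first=True):
    """Convert a list of category labels into 0/1 indicator columns.
    One level is dropped as the reference category to avoid collinearity."""
    levels = sorted(set(values))
    kept = levels[1:] if drop_first else levels
    return {lvl: [1 if v == lvl else 0 for v in values] for lvl in kept}

region = ["North", "South", "East", "South", "North"]
dummies = to_dummies(region)
# With "East" as the reference level, each observation is represented
# by two 0/1 columns ("North" and "South").
```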

  5. Output Example: assume the dependent variable has 3 classes (y = 1, 2, 3).

  6. Binary Dependent - Regression If the dependent variable has only 2 classes, multiple regression can be used (sample data were shown on the slide).
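The slide's data are not reproduced in the transcript, so the sketch below uses a made-up single predictor with a 0/1 dependent and fits ordinary least squares with the closed-form simple-regression formulas.

```python
# Hedged sketch: OLS on a binary (0/1) dependent with one predictor;
# all data values are illustrative.
x = [1.0, 2.0, 2.5, 3.0, 6.0, 7.0, 7.5, 8.0]
y = [0,   0,   0,   0,   1,   1,   1,   1]

n = len(x)
mx = sum(x) / n
my = sum(y) / n

# Closed-form slope and intercept for simple linear regression.
b1 = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
      / sum((xi - mx) ** 2 for xi in x))
b0 = my - b1 * mx

# Predicted Y values, to be thresholded for classification.
pred = [b0 + b1 * xi for xi in x]
```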

  7. Regression Output

  8. Classification Classification rule in this case: if predicted Y > 0.5, then class = 1; else class = 0. This model yielded 2 misclassifications out of 24. How good is R-square?
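The classification rule above is a one-liner in code; the predicted and actual values below are made up (they are not the 24-observation dataset from the slides).

```python
# The slide's rule: class 1 if predicted Y exceeds the cutoff, else 0,
# then count misclassifications against the actual labels.
def classify(pred_y, cutoff=0.5):
    return [1 if p > cutoff else 0 for p in pred_y]

pred_y = [0.9, 0.1, 0.6, 0.4, 0.55, 0.48]   # illustrative predicted Y values
actual = [1,   0,   1,   1,   0,    0]      # illustrative actual classes
misclassified = sum(c != a for c, a in zip(classify(pred_y), actual))
```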

  9. Crosstab of Pred. Y and Y For large datasets, one can bin the predicted Y variable and create a crosstab with Y to see how accurately the model classifies the data (fictitious results shown here). The Good and Bad columns give the counts of actual Y values in each class.
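A crosstab of this kind can be built by binning the predicted Y into score bands and counting actual Good and Bad outcomes per band. The band edges, the Good = 0 / Bad = 1 coding, and the data below are all assumptions for illustration.

```python
# Sketch: bin predicted Y into bands and count actual outcomes per band
# (assuming Good is coded Y = 0 and Bad is coded Y = 1).
def crosstab(pred_y, actual, edges):
    rows = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_band = [a for p, a in zip(pred_y, actual) if lo <= p < hi]
        good = sum(1 for a in in_band if a == 0)
        bad = sum(1 for a in in_band if a == 1)
        rows.append(((lo, hi), good, bad))
    return rows

rows = crosstab([0.1, 0.2, 0.6, 0.7, 0.9],   # illustrative predicted Y
                [0,   0,   1,   0,   1],     # illustrative actual Y
                edges=[0.0, 0.5, 1.01])
```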

  10. Kolmogorov-Smirnov Test Use the crosstab from the last slide to conduct the KS test to determine • the cutoff score, • classification accuracy, and • forecasts of model performance.
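The KS statistic from a Good/Bad crosstab is the maximum gap between the cumulative proportions of Goods and Bads across score bands, and the band where that gap peaks suggests the cutoff score. The per-band counts below are fictitious.

```python
# Sketch of the KS statistic computed from Good/Bad counts per score band.
goods = [50, 30, 15, 5]   # illustrative counts of actual Goods per band
bads  = [5, 10, 25, 60]   # illustrative counts of actual Bads per band

tg, tb = sum(goods), sum(bads)
cum_g = cum_b = 0.0
gaps = []
for g, b in zip(goods, bads):
    cum_g += g / tg            # cumulative proportion of Goods so far
    cum_b += b / tb            # cumulative proportion of Bads so far
    gaps.append(abs(cum_g - cum_b))

ks = max(gaps)                 # KS statistic: maximum separation
cutoff_band = gaps.index(ks)   # band where the separation peaks
```

A larger KS indicates the score separates the two groups more cleanly, which is why it doubles as a measure of classification accuracy.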
