
Discrimination amongst k populations



  1. Discrimination amongst k populations

  2. We want to determine which of the k populations π_1, π_2, …, π_k an observation vector x comes from. For this purpose we need to partition p-dimensional space into k regions C_1, C_2, …, C_k.

  3. We will make the decision to classify the observation vector x into population π_j if x falls in the region C_j.
Misclassification probabilities:
P[j|i] = P[classify the case into π_j when the case is from π_i] = ∫_{C_j} f_i(x) dx,
where f_i(x) is the density of population π_i.
Cost of misclassification:
c(j|i) = cost of classifying the case into π_j when the case is from π_i, with c(i|i) = 0.

  4. Prior probabilities of inclusion:
p_i = P[the case is from π_i initially], i = 1, …, k.
Expected cost of misclassification of a case from population i (given that we know the case came from π_i):
ECM(i) = Σ_{j≠i} c(j|i) P[j|i].

  5. Total expected cost of misclassification:
ECM = Σ_i p_i ECM(i) = Σ_i p_i Σ_{j≠i} c(j|i) P[j|i] = Σ_j ∫_{C_j} [ Σ_{i≠j} p_i c(j|i) f_i(x) ] dx.

  6. Optimal classification rule. The optimal classification rule will find the regions C_j that minimize
ECM = Σ_j ∫_{C_j} [ Σ_{i≠j} p_i c(j|i) f_i(x) ] dx.
Each x contributes to the ECM only through the region it is assigned to, so the ECM is minimized by placing x in the region C_j whose integrand Σ_{i≠j} p_i c(j|i) f_i(x) is smallest. When the misclassification costs are all equal, this sum is proportional to Σ_i p_i f_i(x) − p_j f_j(x), so the ECM is minimized if C_j is chosen where the term that is omitted from the sum, p_j f_j(x), is the largest.
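
This rule translates directly into code. Below is a minimal sketch, not from the deck: the function name ecm_classify, the SciPy toy densities, and all numbers are illustrative. It assigns a case to the population that minimizes the expected cost above.

```python
import numpy as np
from scipy.stats import norm

def ecm_classify(x, densities, priors, costs):
    """Assign x to the population j that minimizes
    sum_{i != j} priors[i] * costs[j, i] * densities[i](x),
    where costs[j, i] = c(j|i) with a zero diagonal."""
    k = len(priors)
    f = [densities[i](x) for i in range(k)]
    scores = [sum(priors[i] * costs[j][i] * f[i] for i in range(k) if i != j)
              for j in range(k)]
    return int(np.argmin(scores))

# Toy example: two univariate normal populations (illustrative numbers).
densities = [norm(0, 1).pdf, norm(3, 1).pdf]
priors = [0.7, 0.3]
costs = np.array([[0.0, 5.0],    # costs[0][1] = c(1|2): calling a pop-2 case pop-1 costs 5
                  [1.0, 0.0]])   # costs[1][0] = c(2|1): calling a pop-1 case pop-2 costs 1
print(ecm_classify(1.0, densities, priors, costs))
```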

  7. Optimal regions when misclassification costs are equal:
C_j = { x : p_j f_j(x) ≥ p_i f_i(x) for all i = 1, …, k },
i.e., classify x into the population with the largest value of p_i f_i(x).

  8. Optimal regions when misclassification costs are equal and the distributions are p-variate normal with common covariance matrix Σ. In the case of normality,
f_i(x) = (2π)^{−p/2} |Σ|^{−1/2} exp[ −½ (x − μ_i)′ Σ^{−1} (x − μ_i) ],
so maximizing p_i f_i(x) is equivalent to maximizing ln p_i − ½ (x − μ_i)′ Σ^{−1} (x − μ_i); the common normalizing factor and the quadratic term x′ Σ^{−1} x are the same for every population and cancel.

  9. Summarizing, we will classify the observation vector x into population π_j if the linear discriminant score
d_j(x) = μ_j′ Σ^{−1} x − ½ μ_j′ Σ^{−1} μ_j + ln p_j
is the largest of d_1(x), …, d_k(x).
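
As a concrete illustration (not part of the deck), here is a short NumPy sketch of these scores, assuming the population means and the common covariance matrix have already been estimated; all names and numbers are illustrative.

```python
import numpy as np

def lda_scores(x, means, pooled_cov, priors):
    """Scores d_j(x) = mu_j' S^{-1} x - 0.5 mu_j' S^{-1} mu_j + ln p_j;
    classify x into the population with the largest score."""
    S_inv = np.linalg.inv(pooled_cov)
    scores = []
    for mu, p in zip(means, priors):
        w = S_inv @ mu                      # S^{-1} mu_j
        scores.append(w @ x - 0.5 * mu @ w + np.log(p))
    return np.array(scores)

# Example with two bivariate populations (illustrative numbers).
means = [np.array([0.0, 0.0]), np.array([2.0, 1.0])]
S = np.array([[1.0, 0.3], [0.3, 1.0]])
priors = [0.5, 0.5]
x = np.array([1.5, 0.5])
print(np.argmax(lda_scores(x, means, S, priors)))  # index of chosen population
```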

  10. k-means clustering. A non-hierarchical clustering scheme: we want to subdivide the data set into k groups.

  11. The k-means algorithm (a code sketch follows this list):
• Initially subdivide the complete data set into k groups.
• Compute the centroid (mean vector) of each group.
• Sequentially go through the data, reassigning each case to the group with the closest centroid.
• After reassigning a case to a new group, recalculate the centroids of both the original group and the new group of which it is now a member.
• Continue until there are no new reassignments of cases.
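
A minimal NumPy sketch of the sequential variant described above. Assumed details not in the slide: the k groups are seeded by choosing k distinct cases as starting centroids, and the function name and seeding are illustrative.

```python
import numpy as np

def k_means(X, k, seed=0):
    """Sequential k-means as described above: reassign one case at a time
    and immediately update the centroids of the two affected groups."""
    rng = np.random.default_rng(seed)
    n = len(X)
    # Steps 1-2: seed the k groups with k distinct cases as starting centroids.
    centroids = X[rng.choice(n, size=k, replace=False)].copy()
    labels = np.full(n, -1)
    changed = True
    while changed:                               # step 5: repeat until stable
        changed = False
        for i in range(n):                       # step 3: pass through the data
            j = int(np.argmin(np.linalg.norm(X[i] - centroids, axis=1)))
            if j != labels[i]:
                old, labels[i] = labels[i], j
                for g in (old, j):               # step 4: update both centroids
                    if g >= 0 and np.any(labels == g):
                        centroids[g] = X[labels == g].mean(axis=0)
                changed = True
    return labels, centroids

# Example matching slide 12: n = 60 cases with two variables (x, y).
X = np.random.default_rng(1).normal(size=(60, 2))
labels, centroids = k_means(X, k=3)
```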

  12. Example: n = 60 cases with two variables (x,y) measured

  13. Graph: Scattergram of data

  14. Graph: Initial Clustering

  15. Graph: Final Clustering

  16. Graph: True subpopulations

  17. An example: cluster analysis, discriminant analysis, MANOVA. A survey was given to 132 students (35 male, 97 female). They rated, on a Likert scale from 1 to 5, their agreement with each of 40 statements. All statements are related to the Meaning of Life.

  18. Questions and Statements

  19. Statements - continued

  20. The analysis. The first step in the analysis is to perform a cluster analysis to see if there are any subgroups of interest. Both hierarchical and partitioning (k-means) approaches were used for the cluster analysis.

  21. The analysis. From the previous figure it follows, by cutting across the dendrogram branches at a linkage distance between 30 and 75, that 2 or 3 clusters describe the data best. The k-means method was then used (with k = 2 and k = 3) to identify the members of these clusters. Using the k-means procedure, the two- and three-cluster models similarly fit the data best (attempts to use higher values of k resulted in clusters with only one case). A sketch of this two-stage analysis follows.
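
A hedged sketch of this two-stage analysis using SciPy and scikit-learn. The survey data are not available here, so a simulated Likert matrix stands in; the linkage method (Ward) is an assumption, while the cluster counts follow the slide.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.cluster import KMeans

# Stand-in for the 132 x 40 matrix of Likert ratings (simulated; see above).
X = np.random.default_rng(0).integers(1, 6, size=(132, 40)).astype(float)

Z = linkage(X, method="ward")                    # hierarchical clustering
two = fcluster(Z, t=2, criterion="maxclust")     # cut dendrogram into 2 clusters
three = fcluster(Z, t=3, criterion="maxclust")   # ... and into 3 clusters

# Partitioning approach: k-means with k = 2 and k = 3.
km2 = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
km3 = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(np.bincount(km3.labels_))  # a near-empty cluster suggests k is too large
```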

  22. One-way MANOVA was then used to test for significant differences between the clusters. It was also used to identify the statements on which the differences between the two clusters were most significant.

  23. A stepwise discriminant function analysis was done to predict cluster membership and to attempt to identify the minimal set of survey statements that separates the clusters for the 128 participants in the study.

  24. Cluster labels: Optimistic / Pessimistic; Religious / Non-religious.

  25. Discrimination performance:
• 96% of the cluster 1 respondents were correctly classified,
• 88% of the cluster 2 respondents were correctly classified, and
• 84% of the cluster 3 respondents were correctly classified.

  26. Techniques for studying correlation and covariance structure: Principal Components Analysis (PCA) and Factor Analysis.

  27. Principal Component Analysis

  28. Let x have a p-variate normal distribution with mean vector μ and covariance matrix Σ. Definition: the linear combination y_1 = a_1′ x is called the first principal component if a_1 is chosen to maximize Var(a_1′ x) = a_1′ Σ a_1 subject to a_1′ a_1 = 1.

  29. Consider maximizing a′ Σ a subject to a′ a = 1. Using the Lagrange multiplier technique, let
L(a, λ) = a′ Σ a − λ (a′ a − 1).

  30. Now ∂L/∂a = 2 Σ a − 2 λ a = 0 gives Σ a = λ a, so a must be an eigenvector of Σ with eigenvalue λ, and ∂L/∂λ = −(a′ a − 1) = 0 recovers the constraint a′ a = 1. Moreover, Var(a′ x) = a′ Σ a = λ a′ a = λ, so the variance is maximized by choosing the eigenvector belonging to the largest eigenvalue.

  31. Summary: y_1 = a_1′ x is the first principal component if a_1 is the eigenvector (of length 1) of Σ associated with the largest eigenvalue λ_1 of Σ.
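
A small NumPy sketch confirming this summary numerically: the first principal component is the unit eigenvector of the sample covariance matrix belonging to its largest eigenvalue. The function name and the simulated data are illustrative.

```python
import numpy as np

def first_principal_component(X):
    """Return (a1, lam1): the unit eigenvector of the sample covariance
    matrix S paired with its largest eigenvalue lam1, so that y1 = a1'x
    has maximal variance among unit-length linear combinations."""
    S = np.cov(X, rowvar=False)              # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(S)     # ascending order for symmetric S
    return eigvecs[:, -1], eigvals[-1]

# Illustrative check on simulated bivariate normal data.
X = np.random.default_rng(1).multivariate_normal(
    mean=[0.0, 0.0], cov=[[3.0, 1.0], [1.0, 2.0]], size=500)
a1, lam1 = first_principal_component(X)
print(a1, lam1, np.var(X @ a1, ddof=1))  # sample variance of y1 equals lam1
```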
