
COMPE 467 - Pattern Recognition


Presentation Transcript


  1. Nearest and k-nearest Neighbor Classification COMPE 467 - Pattern Recognition

  2. Nearest Neighbor (NN) Rule & k-Nearest Neighbor (k-NN) Rule. Non-parametric: • Can be used with arbitrary distributions • No need to assume that the form of the underlying densities is known. NN Rule: a direct classification using the learning samples. Assume we have a set of labeled learning samples and a distance measure between samples.

  3. A general distance metric should obey the following rules (non-negativity, reflexivity, symmetry, triangle inequality; see slide 14). • Most standard: the Euclidean distance d(x, y) = √Σ_k (x_k − y_k)².
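
As a concrete illustration of the Euclidean distance above, a minimal sketch (the function name euclidean_distance is illustrative, not from the slides):

```python
import numpy as np

def euclidean_distance(x, y):
    """d(x, y) = sqrt(sum_k (x_k - y_k)^2) between two feature vectors."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return np.sqrt(np.sum((x - y) ** 2))

# Example: two 2-D samples at distance 5
print(euclidean_distance([1.0, 2.0], [4.0, 6.0]))  # 5.0
```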

  4. 1-NN Rule: Given an unknown sample X, decide X ∈ ω_i if the nearest learning sample X' to X (i.e., d(X, X') = min_k d(X, X_k)) belongs to category ω_i. That is, assign X to category ω_i if the closest neighbor of X is from category ω_i.
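
A minimal sketch of the 1-NN rule above, assuming the learning samples are stored in an array X_train with labels y_train (all names are illustrative):

```python
import numpy as np

def classify_1nn(x, X_train, y_train):
    """Assign x to the category of its single closest learning sample."""
    dists = np.linalg.norm(X_train - x, axis=1)  # Euclidean distance to every sample
    return y_train[np.argmin(dists)]             # label of the nearest neighbor

# Toy example with two categories in 2-D
X_train = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]])
y_train = np.array([0, 0, 1, 1])
print(classify_1nn(np.array([0.5, 0.5]), X_train, y_train))  # -> 0
print(classify_1nn(np.array([5.5, 4.8]), X_train, y_train))  # -> 1
```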

  5. Example: find the decision boundary for the two-category samples shown in the figure. Decision boundary: piecewise linear.

  6. Voronoi Diagrams: the decision region of a learning sample X_i is a polygon (its Voronoi cell) such that any point that falls in it is closer to X_i than to any other sample X_j. The 1-NN decision boundary is formed by the cell edges that separate samples of different categories.
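
The Voronoi cells of a set of learning samples can be visualized directly with SciPy; a small sketch (the data here are random, purely for illustration):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial import Voronoi, voronoi_plot_2d

# Each Voronoi cell is the region of points closer to that sample
# than to any other sample; 1-NN assigns the cell owner's label.
X_train = np.random.rand(15, 2)
vor = Voronoi(X_train)
voronoi_plot_2d(vor)
plt.title("Voronoi cells of the learning samples")
plt.show()
```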

  7. k-NN rule: instead of looking at the closest sample, we look at the k nearest neighbors of X and take a vote; the category with the largest vote wins. k is usually taken as an odd number so that no ties occur (in the two-category case).
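
A sketch of the voting step, reusing the X_train / y_train arrangement from the 1-NN sketch above (names are illustrative):

```python
import numpy as np
from collections import Counter

def classify_knn(x, X_train, y_train, k=3):
    """Majority vote among the k nearest learning samples."""
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]               # indices of the k closest samples
    votes = Counter(y_train[i] for i in nearest)  # count the labels among them
    return votes.most_common(1)[0][0]             # category with the largest vote
```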

  8. The k-NN rule is shown to approach the optimal (Bayes) classification when k becomes very large (with k/n remaining small). (k, l)-NN (NN with a reject option): decide ω_i only if the majority vote is over a given threshold l; otherwise reject. Example: with k = 5 and threshold l = 4, the sample in the example is rejected (left unclassified) because the largest vote falls below 4.
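
A sketch of the (k, l) variant with a reject option; returning None stands for "reject" (this convention and the names are illustrative):

```python
import numpy as np
from collections import Counter

def classify_knn_reject(x, X_train, y_train, k=5, l=4):
    """k-NN with a reject option: decide only if the winning vote reaches l."""
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]
    label, count = Counter(y_train[i] for i in nearest).most_common(1)[0]
    return label if count >= l else None          # None means "reject"
```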

  9. Analysis of the NN rule is possible when n → ∞, and it was shown that its error rate is no worse than twice the minimum-error (Bayes) classification. EDITING AND CONDENSING: the NN rule is very attractive because of its simplicity and yet good performance, so it becomes important to reduce the computational costs involved. Do an intelligent elimination of the samples: remove samples that do not contribute to the decision boundary.

  10. So the editing rule consists of throwing away all samples whose Voronoi polygon does not share a common boundary with the Voronoi polygon of a sample from another category. Setup: let's assume we have n random samples that are labeled as belonging to c different categories. 1-NN Rule: assume a distance measure d between two samples with the properties listed on slide 14.

  11. Advantage of the NN rule: no learning and no density estimation, so it is easy to implement. Disadvantage: classification is more expensive, so people found ways to reduce the cost of the NN rule. Analysis of the NN and k-NN rules when n → ∞: • When n → ∞, X (the unknown sample) and X' (its nearest neighbor) get very close, so P(ω_i | X') → P(ω_i | X). • That is, we are selecting category ω_i with probability P(ω_i | X) (its a-posteriori probability).

  12. Error Bounds and Relation with the Bayes Rule: let P* be the error rate of the Bayes rule, P the (asymptotic) error rate of 1-NN, and P_k that of k-NN. It can be shown that for 2 categories P* ≤ P ≤ P*(2 − 2P*), and for c categories P* ≤ P ≤ P*(2 − (c/(c−1)) P*) ≤ 2P*. The 1-NN error is never worse than twice the Bayes error rate!
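
As a quick numeric check of the bound (the value of P* is chosen purely for illustration):

```latex
P^* \le P \le P^*\left(2 - \tfrac{c}{c-1}\,P^*\right) \le 2P^*
% Example for c = 2: if the Bayes error is P^* = 0.1, then
% P \le 0.1\,(2 - 2 \cdot 0.1) = 0.18, which is below 2P^* = 0.2.
```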

  13. The highest error occurs when all a-posteriori probabilities are equal, P(ω_i | X) = 1/c; then P = P* = (c − 1)/c, i.e., the two bounds coincide.

  14. Distance Measures (Metrics): • Non-negativity • Reflexivity: d(x, y) = 0 only when x = y • Symmetry • Triangle inequality. The Euclidean distance satisfies all of these, but it is not always a meaningful measure: consider 2 features with different units (a scaling problem). If the scale of one feature is changed to half, the nearest neighbor of a sample can change.
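
Stated formally, the four properties listed above are:

```latex
\begin{aligned}
&\text{Non-negativity:}      && d(\mathbf{x},\mathbf{y}) \ge 0\\
&\text{Reflexivity:}         && d(\mathbf{x},\mathbf{y}) = 0 \iff \mathbf{x} = \mathbf{y}\\
&\text{Symmetry:}            && d(\mathbf{x},\mathbf{y}) = d(\mathbf{y},\mathbf{x})\\
&\text{Triangle inequality:} && d(\mathbf{x},\mathbf{z}) \le d(\mathbf{x},\mathbf{y}) + d(\mathbf{y},\mathbf{z})
\end{aligned}
```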

  15. Minkowski Metric: a general definition, the L_p norm of the difference vector, d_p(x, y) = (Σ_k |x_k − y_k|^p)^(1/p). The L1 norm gives the city-block (Manhattan) distance; the L2 norm gives the Euclidean distance. Scaling: normalize the data by re-scaling each feature to the 0-1 range (equivalent to changing the metric).
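
A sketch of the Minkowski (L_p) distance and of min-max rescaling to the 0-1 range (function names are illustrative):

```python
import numpy as np

def minkowski_distance(x, y, p=2):
    """L_p norm of (x - y): p=1 is the city-block distance, p=2 the Euclidean."""
    return np.sum(np.abs(np.asarray(x) - np.asarray(y)) ** p) ** (1.0 / p)

def rescale_01(X):
    """Min-max normalization: map each feature (column) to the 0-1 range."""
    X = np.asarray(X, dtype=float)
    return (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

print(minkowski_distance([0, 0], [3, 4], p=1))  # 7.0 (city block)
print(minkowski_distance([0, 0], [3, 4], p=2))  # 5.0 (Euclidean)
```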

  16. Computational Complexity: consider n samples in d dimensions. In the crude 1-NN rule we measure the distance from X to all n samples, which costs O(dn) per classification.

  17. To reduce the costs: - Partial distance: calculate the distance using some subset r of the full d dimensions, and if this partial distance is already too great, do not compute the remaining dimensions for that sample.
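
A sketch of the idea: accumulate the squared distance one dimension at a time and abandon a candidate as soon as its partial distance already exceeds the best full distance found so far (names are illustrative):

```python
import numpy as np

def nn_partial_distance(x, X_train, y_train):
    """1-NN search that abandons a candidate once its partial distance is too great."""
    best_d2, best_label = np.inf, None
    for xi, yi in zip(X_train, y_train):
        d2 = 0.0
        for r in range(len(x)):        # accumulate over the d dimensions
            d2 += (x[r] - xi[r]) ** 2
            if d2 >= best_d2:          # partial distance already too great
                break                  # xi cannot be the nearest neighbor
        else:                          # loop finished without breaking
            best_d2, best_label = d2, yi
    return best_label
```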

  18. To reduce the costs: - Editing (condensing): eliminate prototypes that are surrounded by training points of the same category, i.e., whose Voronoi neighbors all have the same label (in the figure: Voronoi neighborhoods where all neighbors are black, or all are red).

  19. NN Editing Algorithm • Consider the Voronoi diagram of all samples • For each sample X’, find its Voronoi neighbors • If any neighbor is from another category, keep X’; else remove it from the set • Construct the Voronoi diagram with the remaining samples and use it for classification. (In the figure: samples whose neighbors are all black, or all red, are the ones removed.) A sketch of this procedure is given below.
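
A sketch of the editing step, using the fact that two samples are Voronoi neighbors exactly when they are joined by a Delaunay edge (so the neighbor lists can be read from scipy.spatial.Delaunay); the function name and the use of SciPy are assumptions, not part of the course material:

```python
import numpy as np
from scipy.spatial import Delaunay

def voronoi_edit(X_train, y_train):
    """Keep only samples with at least one Voronoi neighbor from another category."""
    tri = Delaunay(X_train)
    indptr, indices = tri.vertex_neighbor_vertices    # CSR-style Delaunay neighbor lists
    keep = []
    for i in range(len(X_train)):
        neighbors = indices[indptr[i]:indptr[i + 1]]
        if np.any(y_train[neighbors] != y_train[i]):  # a neighbor from another category
            keep.append(i)                            # this sample touches the boundary
    return X_train[keep], y_train[keep]
```

The retained samples are the ones that contribute to the decision boundary, so classification with the edited set is cheaper while the 1-NN boundary is preserved.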

  20. References • R.O. Duda, P.E. Hart, and D.G. Stork, Pattern Classification, New York: John Wiley, 2001. • N. Yalabık, “Pattern Classification with Biomedical Applications Course Material”, 2010.
