Machine learning
Download
1 / 14

Machine Learning - PowerPoint PPT Presentation


  • 77 Views
  • Uploaded on

Machine Learning. Lecture 6 K-Nearest Neighbor Classifier. x 1. 2. 3. 1. 4. 5. 6. 7. 10. 9. 11. 12. 13. 14. 8. x 2. 15. 16. Elliptical blobs (objects). Objects, Feature Vectors, Points. X(15). X(1). X(7). X(16). X(3). X(8). X(25). X(12). X(13). X(6). X(9). X(10).

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Machine Learning' - aren


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
Machine learning

Machine Learning

Lecture 6

K-Nearest Neighbor Classifier

G53MLE Machine Learning

Dr Guoping Qiu


Objects feature vectors points

x1

2

3

1

4

5

6

7

10

9

11

12

13

14

8

x2

15

16

Elliptical blobs (objects)

Objects, Feature Vectors, Points

X(15)

X(1)

X(7)

X(16)

X(3)

X(8)

X(25)

X(12)

X(13)

X(6)

X(9)

X(10)

X(4)

X(11)

X(14)


Nearest neighbours

x1

x2

Nearest Neighbours

X(j)=(x1(j), x2(j), …,xn(j))

X(i)=(x1(i), x2(i), …,xn(i))

G53MLE Machine Learning Dr Guoping Qiu


Nearest neighbour algorithm
Nearest Neighbour Algorithm

  • Given training data (X(1),D(1)), (X(2),D(2)), …, (X(N),D(N))

  • Define a distance metric between points in inputs space. Common measures are:

    Euclidean Distance

G53MLE Machine Learning Dr Guoping Qiu


K nearest neighbour model
K-Nearest Neighbour Model

Given test point X

  • Find the K nearest training inputs to X

  • Denote these points as

    (X(1),D(1)), (X(2), D(2)), …, (X(k), D(k))

x

G53MLE Machine Learning Dr Guoping Qiu


K nearest neighbour model1
K-Nearest Neighbour Model

Classification

  • The class identification of X

    Y = most common class in set {D(1), D(2), …, D(k)}

x

x

G53MLE Machine Learning Dr Guoping Qiu


K nearest neighbour model2
K-Nearest Neighbour Model

  • Example : Classify whether a customer will respond to a survey question using a 3-Nearest Neighbor classifier

G53MLE Machine Learning Dr Guoping Qiu


K nearest neighbour model3
K-Nearest Neighbour Model

  • Example : 3-Nearest Neighbors

15.16

15

152.23

122

15.74

G53MLE Machine Learning Dr Guoping Qiu


K nearest neighbour model4
K-Nearest Neighbour Model

  • Example : 3-Nearest Neighbors

15.16

15

152.23

122

15.74

Three nearest ones to David are: No, Yes, Yes

G53MLE Machine Learning Dr Guoping Qiu


K nearest neighbour model5
K-Nearest Neighbour Model

  • Example : 3-Nearest Neighbors

15.16

15

152.23

122

15.74

Yes

Three nearest ones to David are: No, Yes, Yes

G53MLE Machine Learning Dr Guoping Qiu


K nearest neighbour model6
K-Nearest Neighbour Model

  • Picking K

    • Use N fold cross validation – Pick K to minimize the cross validation error

    • For each of N training example

      • Find its K nearest neighbours

      • Make a classification based on these K neighbours

      • Calculate classification error

      • Output average error over all examples

    • Use the K that gives lowest average error over the N training examples

G53MLE Machine Learning Dr Guoping Qiu


K nearest neighbour model7
K-Nearest Neighbour Model

  • Example: For the example we saw earlier, pick the best K from the set {1, 2, 3} to build a K-NN classifier

G53MLE Machine Learning Dr Guoping Qiu


Further readings
Further Readings

  • T. M. Mitchell, Machine Learning, McGraw-Hill International Edition, 1997

    Chapter 8

G53MLE Machine Learning Dr Guoping Qiu


Tutorial exercise questions
Tutorial/Exercise Questions

  • K nearest neighbor classifier has to store all training data creating high requirement on storage. Can you think of ways to reduce the storage requirement without affecting the performance? (hint: search the Internet, you will find many approximation methods).

G53MLE Machine Learning Dr Guoping Qiu