
Instance Based Learning


Presentation Transcript


  1. Instance Based Learning

  2. Nearest Neighbor • Remember all your data • When someone asks a question • Find the nearest old data point • Return the answer associated with it • In order to say what point is nearest, we have to define what we mean by "near". • Typically, we use Euclidean distance between two points. Nominal attributes: distance is set to 1 if values are different, 0 if they are equal
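
As a concrete illustration of the procedure above, here is a minimal nearest-neighbor sketch in Python (not from the slides). It assumes each instance is a list of attribute values; attributes whose indices appear in `nominal` use the 0/1 mismatch distance, and the rest contribute their squared numeric difference. The tiny training set at the bottom is made up.

```python
import math

def distance(x, y, nominal=frozenset()):
    """Mixed distance: squared difference for numeric attributes,
    0/1 mismatch for nominal ones, combined under one square root."""
    total = 0.0
    for i, (a, b) in enumerate(zip(x, y)):
        if i in nominal:
            total += 0.0 if a == b else 1.0
        else:
            total += (a - b) ** 2
    return math.sqrt(total)

def nearest_neighbor(query, data, nominal=frozenset()):
    """Return the label of the training point closest to the query.
    `data` is a list of (features, label) pairs."""
    best_label, best_dist = None, float("inf")
    for features, label in data:
        d = distance(query, features, nominal)
        if d < best_dist:
            best_dist, best_label = d, label
    return best_label

# Illustrative usage with made-up points
train = [([0.2, 1.0], "yes"), ([0.7, 3.0], "no")]
print(nearest_neighbor([0.3, 2.0], train))   # -> "yes"
```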

  3. Predicting Bankruptcy

  4. Predicting Bankruptcy • Now, let's say we have a new person with R equal to 0.3 and L equal to 2. • What y value should we predict? • The nearest training point to (0.3, 2) is a "no", so our answer would be "no".

  5. Scaling • The naïve Euclidean distance isn't always appropriate. • Consider the case where we have two features describing a car. • f1 = weight in pounds • f2 = number of cylinders. • Any effect of f2 will be completely lost because of the relative scales. • So, rescale the inputs to put all of the features on roughly equal footing, for example as sketched below.
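
The rescaling formula itself did not survive in this transcript; one common choice (an assumption here, not necessarily the slide's formula) is min-max scaling, which maps every feature into the [0, 1] range. The car data below is invented for illustration; standardization (subtract the mean, divide by the standard deviation) would work equally well.

```python
def min_max_scale(column):
    """Rescale one feature column to the [0, 1] range (min-max scaling).
    Note: this is one common choice; the slide's exact formula is not
    shown in the transcript."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]      # constant feature: nothing to scale
    return [(v - lo) / (hi - lo) for v in column]

weights   = [1800, 2400, 3500, 4200]      # f1: weight in pounds (made up)
cylinders = [4, 4, 6, 8]                  # f2: number of cylinders (made up)
print(min_max_scale(weights))             # [0.0, 0.25, 0.708..., 1.0]
print(min_max_scale(cylinders))           # [0.0, 0.0, 0.5, 1.0]
```

After rescaling, a one-unit difference in either feature carries roughly the same weight in the Euclidean distance.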

  6. Time and Space • Learning is fast • We just have to remember the training data. • Space is n. • What takes longer is answering a query. • If we do it naively, then for each of the n points in the training set we compute the distance to the query point, which takes about m operations, since there are m features to compare. • So, overall, a query takes about m * n time.

  7. Noise Someone with an apparently healthy financial record goes bankrupt.

  8. Remedy: K-Nearest Neighbors • k-nearest neighbor algorithm: • Just like the old algorithm, except that when we get a query, we'll search for the k closest points to the query point. • Output what the majority says. • In this case, we've chosen k to be 3. • The three closest points consist of two "no"s and a "yes", so our answer would be "no". • Find the optimal k using cross-validation.
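
A compact k-nearest-neighbors sketch with majority voting, assuming numeric features and Python 3.8+ for `math.dist`; the training points are invented for illustration, and k would normally be chosen by cross-validation as noted above.

```python
from collections import Counter
import math

def k_nearest_neighbors(query, data, k=3):
    """Classify `query` by majority vote among the k training points
    closest to it. `data` is a list of (features, label) pairs."""
    by_distance = sorted(data, key=lambda item: math.dist(query, item[0]))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Illustrative: two "no" neighbours outvote one "yes"
train = [([0.2, 1.9], "no"), ([0.35, 2.2], "no"),
         ([0.28, 2.1], "yes"), ([0.9, 5.0], "yes")]
print(k_nearest_neighbors([0.3, 2.0], train, k=3))   # -> "no"
```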

  9. Other Variants • IB2: save memory, speed up classification • Work incrementally • Only incorporate misclassified instances • Problem: noisy data gets incorporated • IB3: deal with noise • Discard instances that don't perform well • Keep a record of the number of correct and incorrect classification decisions that each exemplar makes. • Two predetermined thresholds are set on the success ratio. • If the performance of an exemplar falls below the lower threshold, it is deleted. • If the performance exceeds the upper threshold, it is used for prediction.

  10. Instance-based learning: IB2 • IB2: save memory, speed up classification • Work incrementally • Only incorporate misclassified instances • Problem: noisy data gets incorporated Data: “Who buys gold jewelry” (25,60,no) (45,60,no) (50,75,no) (50,100,no) (50,120,no) (70,110,yes) (85,140,yes) (30,260,yes) (25,400,yes) (45,350,yes) (50,275,yes) (60,260,yes)

  11. Instance-based learning: IB2 • Data: • (25,60,no) • (85,140,yes) • (45,60,no) • (30,260,yes) • (50,75,no) • (50,120,no) • (70,110,yes) • (25,400,yes) • (50,100,no) • (45,350,yes) • (50,275,yes) • (60,260,yes) This is the final answer, i.e. we memorize only 5 of these points. However, let's build the classifier step by step.

  12. Instance-based learning: IB2 • Data: • (25,60,no) The first instance is always memorized, since the model starts with an empty memory.

  13. Instance-based learning: IB2 • Data: • (25,60,no) • (85,140,yes) Since so far the model has only the first instance memorized, this second instance gets wrongly classified. So, we memorize it as well.

  14. Instance-based learning: IB2 • Data: • (25,60,no) • (85,140,yes) • (45,60,no) So far the model has the first two instances memorized. The third instance gets correctly classified, since it happens to be closer to the first (a "no"). So, we don't memorize it.

  15. Instance-based learning: IB2 • Data: • (25,60,no) • (85,140,yes) • (45,60,no) • (30,260,yes) So far the model has the first two instances memorized. The fourth instance gets correctly classified, since it happens to be closer to the second (a "yes"). So, we don't memorize it.

  16. Instance-based learning: IB2 • Data: • (25,60,no) • (85,140,yes) • (45,60,no) • (30,260,yes) • (50,75,no) So far the model has the first two instances memorized. The fifth instance gets correctly classified, since it happens to be closer to the first (a "no"). So, we don't memorize it.

  17. Instance-based learning: IB2 • Data: • (25,60,no) • (85,140,yes) • (45,60,no) • (30,260,yes) • (50,75,no) • (50,120,no) So far the model has the first two instances memorized. The sixth instance gets wrongly classified, since it happens to be closer to the second (a "yes"). So, we memorize it.

  18. Instance-based learning: IB2 • Continuing in a similar way, we finally get the figure on the right. • The colored points are the ones that get memorized. This is the final answer, i.e. we memorize only these 5 points. A code sketch of the whole procedure follows.
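
Here is a sketch of the incremental IB2 loop walked through on the preceding slides, run on the same data in the same order. Ties and small distance differences can make the memorized set differ slightly from the figure, so treat the printed result as illustrative rather than as a reproduction of the slide.

```python
import math

def ib2(stream):
    """IB2 sketch: process instances one by one and memorize only those
    that the current memory misclassifies with 1-NN. The first instance
    is always stored. Noisy points tend to be stored too, which IB3
    addresses."""
    memory = []
    for features, label in stream:
        if memory:
            _, predicted = min(memory, key=lambda m: math.dist(features, m[0]))
            if predicted == label:
                continue                      # correctly classified: skip it
        memory.append((features, label))      # first or misclassified: store it
    return memory

# The "who buys gold jewelry" data in the order used on the slides
data = [((25, 60), "no"),  ((85, 140), "yes"), ((45, 60), "no"),
        ((30, 260), "yes"), ((50, 75), "no"),  ((50, 120), "no"),
        ((70, 110), "yes"), ((25, 400), "yes"), ((50, 100), "no"),
        ((45, 350), "yes"), ((50, 275), "yes"), ((60, 260), "yes")]
print(ib2(data))
```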

  19. Instance-based learning: IB3 • IB3: deal with noise • Discard instances that don't perform well • Keep a record of the number of correct and incorrect classification decisions that each exemplar makes. • Two predetermined thresholds are set on the success ratio. • An instance is used for training: • If the number of incorrect classifications is ≤ the first (lower) threshold, and • If the number of correct classifications is ≥ the second (upper) threshold.

  20. Instance-based learning: IB3 • Suppose the lower threshold is 0, and upper threshold is 1. • Shuffle the data first • (25,60,no) • (85,140,yes) • (45,60,no) • (30,260,yes) • (50,75,no) • (50,120,no) • (70,110,yes) • (25,400,yes) • (50,100,no) • (45,350,yes) • (50,275,yes) • (60,260,yes)

  21. Instance-based learning: IB3 • Suppose the lower threshold is 0, and the upper threshold is 1. • Shuffle the data first • The pair after each instance records its [incorrect, correct] classification counts. • (25,60,no) [1,1] • (85,140,yes) [1,1] • (45,60,no) [0,1] • (30,260,yes) [0,2] • (50,75,no) [0,1] • (50,120,no) [0,1] • (70,110,yes) [0,0] • (25,400,yes) [0,1] • (50,100,no) [0,0] • (45,350,yes) [0,0] • (50,275,yes) [0,1] • (60,260,yes) [0,0]

  22. Instance-based learning: IB3 • The points that will be used in classification are: • (45,60,no) [0,1] • (30,260,yes) [0,2] • (50,75,no) [0,1] • (50,120,no) [0,1] • (25,400,yes) [0,1] • (50,275,yes) [0,1]
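
A thresholded sketch of the IB3 idea as presented on these slides: each incoming instance is classified by its nearest earlier instance, that neighbour's [incorrect, correct] record is updated, and at the end only exemplars meeting both thresholds are kept. The published IB3 uses statistical confidence tests rather than fixed thresholds, and the exact counts depend on processing order and on which exemplars are consulted, so this code will not necessarily reproduce the numbers above.

```python
import math

def ib3_sketch(stream, lower=0, upper=1):
    """Keep only exemplars whose record satisfies
    incorrect <= lower and correct >= upper (thresholded sketch)."""
    seen = []                                  # (features, label, [incorrect, correct])
    for features, label in stream:
        if seen:
            nearest = min(seen, key=lambda s: math.dist(features, s[0]))
            record = nearest[2]
            if nearest[1] == label:
                record[1] += 1                 # neighbour classified this one correctly
            else:
                record[0] += 1                 # neighbour classified this one incorrectly
        seen.append((features, label, [0, 0]))
    return [(f, lab) for f, lab, rec in seen
            if rec[0] <= lower and rec[1] >= upper]
```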

  23. Rectangular generalizations • When a new exemplar is classified correctly, it is generalized by simply merging it with the nearest exemplar. • The nearest exemplar may be either a single instance or a hyper-rectangle.
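
A minimal sketch of the merge step, assuming a hyper-rectangle is stored as a (lower corner, upper corner) pair; a single stored instance is just a degenerate rectangle whose two corners coincide. The example values come from the jewelry data, but the representation is an illustrative choice, not from the slides.

```python
def merge(rect, point):
    """Grow an axis-aligned hyper-rectangle just enough to cover `point`."""
    lower, upper = rect
    new_lower = tuple(min(l, p) for l, p in zip(lower, point))
    new_upper = tuple(max(u, p) for u, p in zip(upper, point))
    return (new_lower, new_upper)

# A correctly classified "yes" at (70, 110) merged with the exemplar (85, 140)
r = ((85, 140), (85, 140))          # degenerate rectangle: a single instance
print(merge(r, (70, 110)))          # -> ((70, 110), (85, 140))
```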

  24. Rectangular generalizations • Data: • (25,60,no) • (85,140,yes) • (45,60,no) • (30,260,yes) • (50,75,no) • (50,120,no) • (70,110,yes) • (25,400,yes) • (50,100,no) • (45,350,yes) • (50,275,yes) • (60,260,yes)

  25. Classification • If the new instance lies within a rectangle, then output that rectangle's class. • If the new instance lies in the overlap of several rectangles, then output the class of the rectangle whose center is closest to the new data instance. • If the new instance lies outside all of the rectangles, output the class of the rectangle closest to the data instance. • The distance of a point from a rectangle is: • If the instance lies within the rectangle, d = 0 • If outside, d = the distance from the closest part of the rectangle, i.e. the distance to some point on the rectangle boundary. [Figure: rectangles of Class 1 and Class 2 with a separation line]
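
The rules above translate directly into code. The sketch below assumes axis-aligned rectangles stored as (lower corner, upper corner) pairs with a class label; `rect_distance` computes the point-to-rectangle distance just described (0 inside, otherwise the distance to the closest boundary point). The rectangles in the usage line are hypothetical.

```python
import math

def rect_distance(point, rect):
    """0 if the point is inside the rectangle, otherwise the Euclidean
    distance to the closest point on its boundary."""
    lower, upper = rect
    gaps = [max(l - p, 0, p - u) for p, l, u in zip(point, lower, upper)]
    return math.sqrt(sum(g * g for g in gaps))

def rect_center(rect):
    lower, upper = rect
    return [(l + u) / 2 for l, u in zip(lower, upper)]

def classify(point, rectangles):
    """`rectangles` is a list of ((lower, upper), label). Containing
    rectangles win, with ties among them broken by distance to their
    centers; otherwise the closest rectangle wins."""
    containing = [(r, lab) for r, lab in rectangles
                  if rect_distance(point, r) == 0]
    if containing:
        return min(containing,
                   key=lambda rl: math.dist(point, rect_center(rl[0])))[1]
    return min(rectangles, key=lambda rl: rect_distance(point, rl[0]))[1]

# Hypothetical rectangles for the two classes
rects = [(((25, 60), (50, 120)), "no"), (((60, 110), (85, 400)), "yes")]
print(classify((58, 115), rects))   # -> "yes" (closer to the "yes" rectangle)
```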
