
Chapter Eight: Instance-Based Learning

  1. Chapter Eight: Instance-Based Learning • Machine Learning, Tom M. Mitchell

  2. Outline • Introduction • K-Nearest Neighbor • Locally Weighted Regression • Radial Basis Functions • Case-Based Reasoning • Lazy and Eager Learning

  3. What Is Instance-Based Learning? • Contrast with the learning algorithms covered earlier (decision trees, neural networks, …), which construct an explicit general description of the target function from the training data. • Key characteristic: simply store the training examples and delay all processing until a new instance must be classified ("lazy" learning). • Advantage: a different local approximation of the target function can be formed for each query instance. • Disadvantage: the cost of classifying a new instance can be high, and irrelevant attributes can mislead the distance measure.

  4. Instance-Based Learning • Introduction • K-Nearest Neighbor • Locally Weighted Regression • Radial Basis Functions • Case-Based Reasoning • Lazy and Eager Learning

  5. K-Nearest Neighbor • Each instance is represented by its attribute vector x = <a1(x), a2(x), …, an(x)> • Distance between instances is the standard Euclidean distance: d(xi, xj) = sqrt( Σ_{r=1..n} (ar(xi) − ar(xj))² ) • The target function may be either discrete-valued or real-valued

  6. KNN Algorithm for a Discrete-Valued Target Function • Training algorithm: store each training example <x, f(x)> • Classification algorithm: given a query instance xq, find the k nearest neighbors x1 … xk of xq and return the most common value of f among them: f̂(xq) ← argmax_{v∈V} Σ_{i=1..k} δ(v, f(xi)), where δ(a, b) = 1 if a = b and 0 otherwise
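
A minimal sketch of this classification rule in Python; the function and variable names (knn_classify, X_train, y_train) are illustrative, not from the text:

```python
import numpy as np
from collections import Counter

def knn_classify(X_train, y_train, xq, k=5):
    """Majority vote among the k training examples nearest to xq."""
    dists = np.linalg.norm(X_train - xq, axis=1)   # Euclidean d(xq, xi)
    nearest = np.argsort(dists)[:k]                # indices of the k nearest
    # argmax_v sum_i delta(v, f(xi)): the most common neighbor label
    return Counter(y_train[i] for i in nearest).most_common(1)[0][0]
```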

  7. An Example • A figure (omitted here) shows the 1-nearest-neighbor and 5-nearest-neighbor algorithms producing different results for the same query point: with k = 1 the query takes the label of its single closest neighbor, while with k = 5 it takes the majority label among its five closest neighbors.

  8. Voronoi Diagram • The decision surface induced by 1-nearest neighbor over the entire instance space. • The convex polygon surrounding each training example indicates the region of instance space closest to that example; every query falling inside it receives that example's classification.

  9. KNN Algorithm for a Real-Valued Target Function • To approximate a real-valued target function f : ℝⁿ → ℝ, just replace the voting formula above with the mean value of the k nearest neighbors: f̂(xq) ← (1/k) Σ_{i=1..k} f(xi)
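
The real-valued variant is the same neighbor search with the voting step replaced by an average; again a short sketch with illustrative names:

```python
import numpy as np

def knn_regress(X_train, y_train, xq, k=5):
    """Estimate f(xq) as the mean f-value of the k nearest neighbors."""
    dists = np.linalg.norm(X_train - xq, axis=1)
    nearest = np.argsort(dists)[:k]
    return y_train[nearest].mean()                 # (1/k) * sum_i f(xi)
```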

  10. Distance-Weighted KNN • We might want to weight nearer neighbors more heavily. • For discrete-valued target functions: f̂(xq) ← argmax_{v∈V} Σ_{i=1..k} wi δ(v, f(xi)), with wi = 1 / d(xq, xi)², where d(xq, xi) is the distance between xq and xi (if xq exactly matches some xi, return f(xi) directly). • For real-valued target functions (Shepard's method): f̂(xq) ← Σ_{i=1..k} wi f(xi) / Σ_{i=1..k} wi • Note that with distance weighting it now makes sense to use all training examples instead of just the k nearest, giving a global rather than local method.
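
A sketch of the real-valued distance-weighted rule (Shepard's method); passing k=None uses every training example, the global variant the slide mentions:

```python
import numpy as np

def distance_weighted_knn(X_train, y_train, xq, k=None):
    """Shepard's method: weight each neighbor by wi = 1 / d(xq, xi)^2.
    k=None uses all training examples (the global variant)."""
    dists = np.linalg.norm(X_train - xq, axis=1)
    order = np.argsort(dists) if k is None else np.argsort(dists)[:k]
    d = dists[order]
    if d[0] == 0.0:                   # xq exactly matches a stored xi
        return y_train[order[0]]
    w = 1.0 / d ** 2                  # wi = 1 / d(xq, xi)^2
    return np.sum(w * y_train[order]) / np.sum(w)
```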

  11. Remarks on the KNN Algorithm • Inductive bias: the assumption that the classification of an instance xq will be similar to the classification of other instances that are nearby in Euclidean distance. • Curse of dimensionality: the similarity metric is easily misled by irrelevant attributes, because all n attributes contribute to the distance. • Solutions: weight each attribute differently, i.e., stretch each axis j by some factor zj; use cross-validation to choose the weights automatically, e.g., leave-one-out (Moore & Lee, 1994; a sketch follows below); setting zj to 0 eliminates attribute j entirely, suppressing the most irrelevant attributes. • Efficient memory indexing: kd-trees.
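
A hedged sketch of the cross-validation idea above: score candidate stretch factors z by leave-one-out error and keep the best. The candidate grid, helper names, and exhaustive search are illustrative assumptions, not from the text (the grid is exponential in the number of attributes, so this only scales to a few of them):

```python
import numpy as np
from itertools import product

def loo_error(X, y, z, k=3):
    """Leave-one-out error rate of k-NN with axes stretched by factors z."""
    Xz = X * z                                  # stretch axis j by z_j
    errors = 0
    for i in range(len(Xz)):
        dists = np.linalg.norm(Xz - Xz[i], axis=1)
        dists[i] = np.inf                       # hold out example i
        nearest = np.argsort(dists)[:k]
        votes, counts = np.unique(y[nearest], return_counts=True)
        errors += votes[np.argmax(counts)] != y[i]
    return errors / len(Xz)

def choose_stretch_factors(X, y, candidates=(0.0, 0.5, 1.0, 2.0)):
    """Grid search over per-attribute factors; z_j = 0 drops attribute j."""
    grid = product(candidates, repeat=X.shape[1])
    return min(grid, key=lambda z: loo_error(X, y, np.array(z)))
```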

  12. Kd-tree (k-dimensional tree) • A binary search tree over points in k-dimensional space. • Each split partitions the remaining set of points into equal halves. • Nodes partition first on dimension d1, then d2, …, dk, before cycling back to d1.
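
A compact sketch of building such a tree and answering a 1-nearest-neighbor query with branch-and-bound pruning; the class and function names are illustrative:

```python
import numpy as np

class KDNode:
    def __init__(self, point, label, axis, left, right):
        self.point, self.label, self.axis = point, label, axis
        self.left, self.right = left, right

def build_kdtree(points, labels, depth=0):
    """Split on dimension d1, then d2, ..., dk, cycling back to d1;
    the median point goes in the node, so each split halves the set."""
    if len(points) == 0:
        return None
    axis = depth % points.shape[1]          # cycle through the k dimensions
    order = np.argsort(points[:, axis])
    points, labels = points[order], labels[order]
    m = len(points) // 2                    # median along this axis
    return KDNode(points[m], labels[m], axis,
                  build_kdtree(points[:m], labels[:m], depth + 1),
                  build_kdtree(points[m + 1:], labels[m + 1:], depth + 1))

def nn_search(node, xq, best=None):
    """Branch-and-bound 1-nearest-neighbor query; returns (distance, label)."""
    if node is None:
        return best
    d = np.linalg.norm(node.point - xq)
    if best is None or d < best[0]:
        best = (d, node.label)
    near, far = ((node.left, node.right)
                 if xq[node.axis] < node.point[node.axis]
                 else (node.right, node.left))
    best = nn_search(near, xq, best)        # descend into xq's half first
    # cross the splitting plane only if a closer point could lie beyond it
    if abs(xq[node.axis] - node.point[node.axis]) < best[0]:
        best = nn_search(far, xq, best)
    return best
```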

  13. Instance-Based Learning • Introduction • K-Nearest Neighbor • Locally Weighted Regression • Radial Basis Functions • Case-Based Reasoning • Lazy and Eager Learning

  14. Some Terminology • Regression: approximating a real-valued target function. • Residual: the error f̂(x) − f(x) in approximating the target function. • Kernel function: the function K of distance that determines the weight of each training example, wi = K(d(xi, xq)).

  15. Locally Weighted Regression • Form an explicit approximation f̂ for the region surrounding the query point xq: fit a linear function to the k nearest neighbors, or fit a quadratic, etc. • General approach: construct an approximation f̂ that fits the training examples in the neighborhood surrounding xq, use it to calculate f̂(xq), then discard it. • A different local approximation is therefore computed for each query instance.

  16. Locally Weighted Linear Regression • Use a linear function to approximate f near xq: f̂(x) = w0 + w1 a1(x) + … + wn an(x) • Recall chapter 4: gradient descent minimizes the squared error E = ½ Σ_{x∈D} (f(x) − f̂(x))² using the training rule Δwj = η Σ_{x∈D} (f(x) − f̂(x)) aj(x)

  17. Locally Weighted Linear Regression (cont.) • Three possible error criteria: 1. minimize the squared error over just the k nearest neighbors: E1(xq) = ½ Σ_{x ∈ k nearest nbrs of xq} (f(x) − f̂(x))²; 2. minimize the squared error over the entire set D, weighting each example's error by its distance from xq: E2(xq) = ½ Σ_{x∈D} (f(x) − f̂(x))² K(d(xq, x)); 3. combine 1 and 2: E3(xq) = ½ Σ_{x ∈ k nearest nbrs of xq} (f(x) − f̂(x))² K(d(xq, x)). • Choosing criterion 3 gives the gradient descent training rule: Δwj = η Σ_{x ∈ k nearest nbrs of xq} K(d(xq, x)) (f(x) − f̂(x)) aj(x) • Other methods solve for w0 … wn directly: Atkeson et al. (1997), Bishop (1995).
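
A sketch of criterion 3 that, in the spirit of the direct methods just cited, solves the weighted least-squares problem in closed form instead of running the gradient descent rule; the Gaussian kernel and the default k and sigma values are assumptions for illustration:

```python
import numpy as np

def lwlr(X_train, y_train, xq, k=10, sigma=1.0):
    """Locally weighted linear regression at query xq (error criterion 3):
    fit w to the k nearest neighbors, each weighted by the Gaussian kernel
    K(d) = exp(-d^2 / (2 sigma^2)), then evaluate the fit at xq."""
    dists = np.linalg.norm(X_train - xq, axis=1)
    nearest = np.argsort(dists)[:k]
    K = np.exp(-dists[nearest] ** 2 / (2 * sigma ** 2))   # kernel weights
    A = np.hstack([np.ones((len(nearest), 1)), X_train[nearest]])  # w0 column
    sw = np.sqrt(K)                   # weighted least squares via sqrt weights
    w, *_ = np.linalg.lstsq(A * sw[:, None], y_train[nearest] * sw, rcond=None)
    return w[0] + w[1:] @ xq          # f_hat(xq)
```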

  18. Remarks on Locally Weighted Regression • In most cases, the target function is approximated by a constant, linear, or quadratic function. • More complex functional forms are not often used, because the cost of fitting them for each query instance is high, and because simple approximations model the target function quite well over a sufficiently small subregion of the instance space.

  19. Instance-Based Learning • Introduction • K-Nearest Neighbor • Locally Weighted Regression • Radial Basis Functions • Case-Based Reasoning • Lazy and Eager Learning

  20. Radial Basis Functions • Function to be learned: f̂(x) = w0 + Σ_{u=1..k} wu Ku(d(xu, x)) • One common choice for Ku is a Gaussian centered at xu: Ku(d(xu, x)) = e^(−d²(xu, x) / 2σu²) • This gives a global approximation to the target function, expressed as a linear combination of local kernel approximations. • The approach is "eager" instead of "lazy": the approximation is built once, before any query arrives.

  21. Radial Basis Function Networks • ai(x) are the attributes describing instance x. • The first layer computes the values of the various Ku(d(xu, x)). • The second layer computes a linear combination of these first-layer unit values. • A hidden unit's activation is close to 0 unless the input x is near its center xu.

  22. Training RBF Networks • Stage one: define the hidden units by choosing k, the centers xu, and the widths σu. Options: allocate one Gaussian kernel function for each training example <xi, f(xi)>, or choose a set of kernel functions smaller than the number of training examples, with centers scattered uniformly over the instance space, or nonuniformly (e.g., placed by clustering the training instances). • Stage two: train the weights wu, e.g., by gradient descent on a global error criterion.
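
A sketch of the "one kernel per training example" variant; for simplicity it fits the weights wu by a direct least-squares solve rather than the gradient descent mentioned above, and the single shared width sigma is an assumed simplification:

```python
import numpy as np

def train_rbf(X_train, y_train, sigma=1.0):
    """Stage one: one Gaussian kernel per training example (centers xu = xi).
    Stage two: fit the weights wu (plus bias w0) by linear least squares."""
    d2 = ((X_train[:, None, :] - X_train[None, :, :]) ** 2).sum(-1)
    Phi = np.exp(-d2 / (2 * sigma ** 2))     # Phi[i, u] = Ku(d(xu, xi))
    Phi = np.hstack([np.ones((len(X_train), 1)), Phi])   # bias column for w0
    w, *_ = np.linalg.lstsq(Phi, y_train, rcond=None)
    return w

def rbf_predict(X_train, w, xq, sigma=1.0):
    """f_hat(x) = w0 + sum_u wu * exp(-d(xu, x)^2 / (2 sigma^2))."""
    d2 = ((X_train - xq) ** 2).sum(-1)
    return w[0] + w[1:] @ np.exp(-d2 / (2 * sigma ** 2))
```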

  23. Instance-Based Learning • Introduction • K-Nearest Neighbor • Locally Weighted Regression • Radial Basis Functions • Case-Based Reasoning • Lazy and Eager Learning

  24. Case-Based Reasoning • Key properties shared with KNN and locally weighted regression: lazy learning, and classifying new queries by analyzing similar stored instances. • Unlike them, CBR does not represent instances as points in Euclidean space: it uses richer symbolic descriptions, and therefore needs a correspondingly different "distance" metric. • Applications: mechanical device design, reasoning about legal cases, …

  25. CADET System • What is CADET? A system that employs CBR to assist in the design of simple mechanical devices, with a library of 75 stored examples of mechanical devices. • Training example: <qualitative function, mechanical structure> • New query: a desired function • Target value: a mechanical structure that implements this function

  26. CADET System (cont.) • (Figure omitted: example function graphs from the CADET library.)

  27. Case-Based Reasoning in CADET • Given the function specification for a new design, CADET searches its library for an exact match; if one is found, that case is returned. • If not, it finds cases matching subgraphs of the specification (i.e., subgraph isomorphism search), then pieces them together. • It can also elaborate the original function graph so that it matches more cases, e.g., using the rewrite rule that A -+-> B may be rewritten as A -+-> x -+-> B (if A positively influences B, some intermediate quantity x may mediate that influence).

  28. Correspondence between CADET and Instance-Based Methods • Instance space X: the space of all function graphs. • Target function f: maps a function graph to a mechanical structure. • Training example <x, f(x)>: describes some function graph x and the structure f(x) that implements it.

  29. Several Properties of CBR • Instances are represented by rich structural descriptions (as in CADET), rather than points in ℝⁿ. • Multiple cases may be retrieved and combined to form the solution to a new problem, whereas KNN combines its neighbors only by voting or averaging. • Tight coupling between case retrieval, knowledge-based reasoning, and problem solving.

  30. Instance-Based Learning • Introduction • K-Nearest Neighbor • Locally Weighted Regression • Radial Basis Functions • Case-Based Reasoning • Lazy and Eager Learning

  31. Lazy and Eager Learning • Lazy: wait for a query before generalizing; e.g., KNN, locally weighted regression, CBR. • Eager: generalize before seeing the query; e.g., RBF networks. • Differences: computation time (lazy methods train quickly but answer queries slowly), and global versus local approximation: eager methods commit to a single global approximation of the target function, while lazy methods build a different local approximation for each query. • Given the same hypothesis space H, a lazy learner can therefore effectively represent more complex functions (e.g., with H = the set of linear functions, a lazy learner implicitly represents the target through many local linear fits, while an eager learner must commit to a single linear function).

  32. Summary • Each method has distinct strengths. • KNN: the most basic instance-based method. • Locally weighted regression: a generalization of KNN that fits an explicit local approximation around each query. • RBF networks: a blend of instance-based and neural network methods. • Case-based reasoning: instance-based learning applied to rich, symbolic instance descriptions.

  33. Thank you!
