
Learning: Nearest Neighbor



  1. Learning: Nearest Neighbor Artificial Intelligence CMSC 25000 January 31, 2002

  2. Agenda • Machine learning: Introduction • Nearest neighbor techniques • Applications: Robotic motion, Credit rating • Efficient implementations: • k-d trees, parallelism • Extensions: K-nearest neighbor • Limitations: • Distance, dimensions, & irrelevant attributes

  3. Machine Learning • Learning: Acquiring a function from inputs to values, based on past examples of inputs and their values • Learn concepts, classifications, values • Identify regularities in data

  4. Machine Learning Examples • Pronunciation: • Spelling of word => sounds • Speech recognition: • Acoustic signals => sentences • Robot arm manipulation: • Target => torques • Credit rating: • Financial data => loan qualification

  5. Machine Learning Characterization • Distinctions: • Are output values known for any inputs? • Supervised vs unsupervised learning • Supervised: training consists of inputs + true output value • E.g. letters+pronunciation • Unsupervised: training consists only of inputs • E.g. letters only • Course studies supervised methods

  6. Machine Learning Characterization • Distinctions: • Are output values discrete or continuous? • Discrete: “Classification” • E.g. Qualified/Unqualified for a loan application • Continuous: “Regression” • E.g. Torques for robot arm motion • Characteristic of task

  7. Machine Learning Characterization • Distinctions: • What form of function is learned? • Also called “inductive bias” • Graphically, the decision boundary • E.g. Single linear separator • Rectangular boundaries - ID trees • Voronoi regions, etc. [Figure: a single linear separator dividing + points from - points]

  8. Machine Learning Functions • Problem: Can the representation effectively model the class to be learned? • Motivates selection of learning algorithm • For this function, a linear discriminant is GREAT! Rectangular boundaries (e.g. ID trees) are TERRIBLE! • Pick the right representation! [Figure: + and - points separated by a diagonal boundary]

  9. Machine Learning Features • Inputs: • E.g. words, acoustic measurements, financial data • Vectors of features: • E.g. word: letters • ‘cat’: L1 = c; L2 = a; L3 = t • Financial data: F1 = # late payments/yr : Integer • F2 = Ratio of income to expense: Real

  10. Machine Learning Features • Question: • Which features should be used? • How should they relate to each other? • Issue 1: How do we define distances in feature space if features have different scales? • Solution: Scaling/normalization • Issue 2: Which ones are important? • If two instances differ only in an irrelevant feature, that difference should be ignored

  11. Complexity & Generalization • Goal: Predict values accurately on new inputs • Problem: • Train on sample data • Can make an arbitrarily complex model to fit it • BUT, it will probably perform badly on NEW data • Strategy: • Limit complexity of the model (e.g. degree of the equation) • Split training and validation sets • Hold out data to check for overfitting (a minimal split sketch follows)
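A minimal sketch of the hold-out idea above, in Python; the function name and the 80/20 split are illustrative choices, not from the slides:

```python
import random

def split_train_validation(data, frac_train=0.8, seed=0):
    """Shuffle and hold out part of the data: fit the model on the
    training split, then check its error on the held-out split to
    detect overfitting before trusting it on new inputs."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * frac_train)
    return shuffled[:cut], shuffled[cut:]
```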

  12. Nearest Neighbor • Memory- or case-based learning • Supervised method: Training • Record labeled instances and their feature-value vectors • For each new, unlabeled instance: • Identify the “nearest” labeled instance • Assign the same label • Consistency heuristic: Assume that a property is the same as that of the nearest reference case (a minimal sketch follows)
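The training and prediction loop just described is short enough to sketch directly; this is an illustrative Python version (the helper names and example data are mine, not the slides'):

```python
import math

def euclidean(x, y):
    """Distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def train(instances):
    """'Training' just records the labeled (feature_vector, label) pairs."""
    return list(instances)

def predict(memory, query):
    """Return the label of the stored instance nearest to the query."""
    _, label = min(memory, key=lambda pair: euclidean(pair[0], query))
    return label

# Illustrative usage with two features, (L, R) from the credit example:
memory = train([((0.0, 1.2), "G"), ((25.0, 0.4), "P"), ((5.0, 0.7), "G")])
print(predict(memory, (3.0, 1.0)))  # -> "G" (nearest stored case is (5.0, 0.7))
```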

  13. Nearest Neighbor Example • Problem: Robot arm motion • Difficult to model analytically • Kinematic equations • Relate joint angles and manipulator positions • Dynamics equations • Relate motor torques to joint angles • Difficult to achieve good results modeling robotic or human arms • Many factors & measurements

  14. Nearest Neighbor Example • Solution: • Move robot arm around • Record parameters and trajectory segment • Table: torques, positions, velocities, squared velocities, velocity products, accelerations • To follow a new path: • Break into segments • Find closest segments in table • Get those torques (interpolate as necessary)

  15. Nearest Neighbor Example • Issue: Big table • First time with a new trajectory • “Closest” isn’t close • Table is sparse - few entries • Solution: Practice • As the arm attempts the trajectory, fill in more of the table • After a few attempts, very close

  16. Nearest Neighbor Example II • Credit Rating: • Classifier: Good / Poor • Features: • L = # late payments/yr • R = Income/Expenses

  Name  L   R     G/P
  A      0  1.20  G
  B     25  0.40  P
  C      5  0.70  G
  D     20  0.80  P
  E     30  0.85  P
  F     11  1.20  G
  G      7  1.15  G
  H     15  0.80  P

  17. Nearest Neighbor Example II [Figure: the eight cases from the table above plotted in feature space, with L (late payments, ticks at 10, 20, 30) on the horizontal axis and R (income/expenses, up to about 1.2) on the vertical axis; each point is labeled with its case name A-H]

  18. Nearest Neighbor Example II • New instances, placed among the stored cases:

  Name  L   R     G/P
  H      6  1.15  G
  I     22  0.45  P
  J     15  1.20  ??

  • Distance measure: sqrt((L1 - L2)^2 + (sqrt(10) * (R1 - R2))^2) - a scaled distance [Figure: the new points plotted in the same L-R feature space as the original cases]
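To make the scaled distance concrete, here is a small check in Python; the point coordinates are read off the reconstructed tables above, so treat them as assumptions:

```python
import math

def scaled_distance(p, q, weight_r=math.sqrt(10)):
    """The slide's scaled Euclidean distance:
    sqrt((L1 - L2)^2 + (sqrt(10) * (R1 - R2))^2).
    The sqrt(10) factor rescales R so both features matter comparably."""
    (l1, r1), (l2, r2) = p, q
    return math.sqrt((l1 - l2) ** 2 + (weight_r * (r1 - r2)) ** 2)

# Distance from the unlabeled point J = (15, 1.2) to two stored cases:
print(scaled_distance((15, 1.2), (11, 1.2)))  # F: sqrt(4^2) = 4.0
print(scaled_distance((15, 1.2), (20, 0.8)))  # D: sqrt(25 + 10 * 0.16) ~ 5.16
```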

  19. Efficient Implementations • Classification cost: • Find nearest neighbor: O(n) • Compute distance between the unknown and all instances • Compare distances • Problematic for large data sets • Alternative: • Use binary search over feature splits (k-d trees, next slide) to reduce to O(log n)

  20. Efficient Implementation: K-D Trees • Divide instances into sets based on features • Binary branching: e.g. feature > value • With d splits per path there are 2^d leaves; 2^d = n gives d = O(log n) • To split cases into sets: • If there is one element in the set, stop • Otherwise pick a feature to split on • Find the average position of the two middle objects on that dimension • Split the remaining objects based on that average position • Recursively split the subsets (a build sketch follows)
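A sketch of the construction recipe above. The slide does not pin down how the split feature is chosen, so this version simply cycles through features by depth; the node layout and the name build_kd are my choices:

```python
def build_kd(points, depth=0):
    """Recursive k-d tree build following the slide's recipe: pick a
    feature, split at the average of the two middle values on that
    feature, recurse on each half.
    points: list of (feature_vector, label) pairs."""
    if len(points) <= 1:
        return points[0] if points else None   # leaf (or empty input)
    dim = depth % len(points[0][0])            # cycle through features
    points = sorted(points, key=lambda p: p[0][dim])
    mid = len(points) // 2
    threshold = (points[mid - 1][0][dim] + points[mid][0][dim]) / 2
    return {
        "dim": dim,
        "threshold": threshold,
        "left": build_kd(points[:mid], depth + 1),   # lower half
        "right": build_kd(points[mid:], depth + 1),  # upper half
    }
```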

  21. K-D Trees: Classification [Figure: k-d tree for the credit data - the root asks R > 0.825?; the second level splits on L (L > 17.5?, L > 9?); the third level splits on R again (R > 0.6?, R > 0.75?, R > 1.175?, R > 1.025?); each leaf assigns Good or Poor]
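A matching descend-to-leaf classifier, assuming the node layout from the build sketch above. Note that a true nearest-neighbor search would also backtrack to check neighboring regions; the slide's decision-tree view (and this sketch) skips that step:

```python
def classify_kd(node, query):
    """Walk from the root to a leaf by answering each split question,
    then return the label stored there (a leaf is (features, label))."""
    while isinstance(node, dict):
        side = "left" if query[node["dim"]] <= node["threshold"] else "right"
        node = node[side]
    features, label = node
    return label
```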

  22. Efficient Implementation: Parallel Hardware • Classification cost: • # distance computations • Constant time with O(n) processors • Cost of finding the closest: • Compute pairwise minima, successively • O(log n) time (simulated below)
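The O(log n) pairwise-minimum reduction can be simulated sequentially; each pass of the loop below stands in for one constant-time parallel round (the function name is illustrative):

```python
def pairwise_min(distances):
    """Tree reduction: each round halves the list by taking pairwise
    minima. With O(n) processors each round is constant time, so
    log2(n) rounds find the overall minimum distance."""
    while len(distances) > 1:
        reduced = [min(a, b) for a, b in zip(distances[::2], distances[1::2])]
        if len(distances) % 2:          # carry an unpaired odd element
            reduced.append(distances[-1])
        distances = reduced
    return distances[0]
```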

  23. Nearest Neighbor: Issues • Prediction can be expensive if there are many features • Affected by classification noise and feature noise • One entry can change the prediction • Definition of the distance metric • How to combine different features • Different types, ranges of values • Sensitive to feature selection

  24. Nearest Neighbor Analysis • Problem: • Ambiguous labeling, training noise • Solution: • K-nearest neighbors • Not just the single nearest instance • Compare to the K nearest neighbors • Label according to the majority of the K • What should K be? • Often 3; K can also be tuned on held-out data (see the sketch below)
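A sketch of the majority-vote rule, reusing the memory-of-pairs representation from the earlier sketch; knn_predict and the default squared-Euclidean distance are my choices, not the slides':

```python
from collections import Counter

def knn_predict(memory, query, k=3, distance=None):
    """Label the query by majority vote among the k nearest stored
    instances - more robust to a single noisy or mislabeled neighbor
    than plain 1-NN. Ties are broken arbitrarily."""
    distance = distance or (lambda x, y: sum((a - b) ** 2 for a, b in zip(x, y)))
    nearest = sorted(memory, key=lambda pair: distance(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```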

  25. Nearest Neighbor: Analysis • Issue: • What is a good distance metric? • How should features be combined? • Strategy: • (Typically weighted) Euclidean distance • Feature scaling: Normalization • Good starting point: • (Feature - Feature_mean) / Feature_standard_deviation • Rescales each feature to be centered on 0 with standard deviation 1 (sketch below)
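The (x - mean) / std recipe above, applied column-wise; this sketch assumes at least two vectors, and falls back to a divisor of 1 when a feature has zero spread:

```python
import statistics

def zscore_columns(vectors):
    """Rescale each feature to mean 0 and standard deviation 1, so no
    feature dominates the distance just because of its units."""
    cols = list(zip(*vectors))
    means = [statistics.mean(c) for c in cols]
    stdevs = [statistics.stdev(c) or 1.0 for c in cols]  # guard zero spread
    return [
        tuple((v - m) / s for v, m, s in zip(vec, means, stdevs))
        for vec in vectors
    ]
```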

  26. Nearest Neighbor: Analysis • Issue: • What features should we use? • E.g. Credit rating: Many possible features • Tax bracket, debt burden, retirement savings, etc. • Nearest neighbor uses ALL of them • Irrelevant features can mislead the distance • A fundamental problem with nearest neighbor

  27. Nearest Neighbor: Advantages • Fast training: • Just record the (feature vector, output value) pairs • Can model a wide variety of functions • Complex decision boundaries • Weak inductive bias • Very generally applicable

  28. Summary • Machine learning: • Acquire function from input features to value • Based on prior training instances • Supervised vs Unsupervised learning • Classification and Regression • Inductive bias: • Representation of function to learn • Complexity, Generalization, & Validation

  29. Summary: Nearest Neighbor • Nearest neighbor: • Training: record input vectors + output value • Prediction: closest training instance to new data • Efficient implementations • Pros: fast training, very general, little bias • Cons: distance metric (scaling), sensitivity to noise & extraneous features
