
Instance Based Approach

Instance Based Approach. KNN Classifier. A simple classification technique: handed an instance you wish to classify, look around the nearby region to see what other classes are present, and predict whichever class is most common. This is the K-nearest neighbor (KNN) approach.


Presentation Transcript


  1. Instance Based Approach: KNN Classifier

  2. Simple classification technique
  • Handed an instance you wish to classify
  • Look around the nearby region to see what other classes are around
  • Whichever class is most common, make that the prediction

  3. K-nearest neighbor (KNN classifier)
  • Assign the most common class among the K nearest neighbors (like a vote)

  4. How do you train? You don't.

  5. Let's get specific
  • Train: load the training data
  • Classify:
    • Read in an instance
    • Find its K nearest neighbors in the training data
    • Assign the most common class among the K nearest neighbors (like a vote)
  Euclidean distance: $d(x_i, x_j) = \sqrt{\sum_a \left(a(x_i) - a(x_j)\right)^2}$, where $a$ ranges over the attributes (dimensions)
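
A minimal sketch of the Euclidean distance above in Python; the function name and the list-of-numbers representation of an instance are assumptions for illustration:

```python
import math

def euclidean_distance(xi, xj):
    """Distance between two instances, each a list of numeric attribute
    values (one entry per dimension a)."""
    return math.sqrt(sum((a_i - a_j) ** 2 for a_i, a_j in zip(xi, xj)))
```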

  6. How to find the nearest neighbors
  • Naïve approach: exhaustive
  • For the instance to be classified:
    • Visit every training sample and calculate its distance
    • Sort
    • Take the first K in the list
  Voting formula: $\hat{f}(x_q) = \arg\max_v \sum_{i=1}^{K} \delta(v, f(x_i))$, where $f(x_i)$ is $x_i$'s class and $\delta(v, f(x_i)) = 1$ if $v = f(x_i)$; 0 otherwise
  Euclidean distance as on the previous slide, with $a$ an attribute (dimension)
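
A sketch of the exhaustive search and majority vote described above; math.dist computes the same Euclidean distance as the helper sketched earlier, and the (instance, class) pair representation of the training data is an assumption for illustration:

```python
import math
from collections import Counter

def knn_classify(query, training_data, k=3):
    """Exhaustive K-nearest-neighbor classification.
    training_data: list of (instance, class_label) pairs."""
    # Visit every training sample and calculate its distance to the query
    distances = [(math.dist(query, x), label) for x, label in training_data]
    # Sort by distance and keep the first K
    neighbors = sorted(distances, key=lambda pair: pair[0])[:k]
    # Vote: predict the most common class among the K nearest neighbors
    return Counter(label for _, label in neighbors).most_common(1)[0][0]

# Illustrative usage
train = [([1.0, 1.0], "a"), ([1.2, 0.9], "a"), ([5.0, 5.1], "b"), ([4.8, 5.3], "b")]
print(knn_classify([1.1, 1.0], train, k=3))   # expected: "a"
```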

  7. Classifying is a lot of work
  • The work that must be performed:
    • Visit every training sample and calculate its distance
    • Sort
  • Lots of floating point calculations
  • The classifier puts off its work until it is time to classify

  8. Lazy
  • This is known as a "lazy" learning method
  • If most of the work is done during the training stage, the method is known as "eager"
  • Our next classifier, Naïve Bayes, will be eager: training takes a while, but it can classify fast
  • Which do you think is better?
  Lazy vs. eager: where does the work happen, training or classifying?

  9. The book mentions the KD-tree. From Wikipedia: a kd-tree is a space-partitioning data structure for organizing points in a k-dimensional space. kd-trees are a useful data structure for several applications, such as searches involving a multidimensional search key (e.g. range searches and nearest neighbor searches). kd-trees are a special case of BSP trees.
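
A minimal sketch of nearest-neighbor lookup with a kd-tree, using scipy's KDTree as one readily available implementation; the random sample points and k=3 are made up for illustration:

```python
import numpy as np
from scipy.spatial import KDTree

# "Training": build the tree once over the stored instances
rng = np.random.default_rng(0)
training_points = rng.random((1000, 4))    # 1000 instances, 4 attributes
tree = KDTree(training_points)

# Classification-time lookup: the K nearest neighbors of a query point
query = np.array([0.5, 0.5, 0.5, 0.5])
distances, indices = tree.query(query, k=3)
print(indices, distances)                  # positions and distances of the 3 nearest
```

Building the tree is the up-front cost the next slide alludes to; queries then avoid visiting every stored instance.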

  10. If we use some data structure like this …
  • It speeds up classification
  • It probably slows down "training"

  11. How to choose K?
  • Choosing K can be a bit of an art
  • What if you could include all data points (K = n)? How might you do such a thing?
  • What if we weighted the vote of each training sample by its distance from the point being classified?
  Weighted voting formula: $\hat{f}(x_q) = \arg\max_v \sum_{i=1}^{K} w_i \, \delta(v, f(x_i))$, where $w_i = \frac{1}{d(x_q, x_i)^2}$ and $\delta(v, f(x_i))$ is 1 if $x_i$ is a member of class $v$ (i.e. where $f$ returns the class of $x_i$), 0 otherwise
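
A sketch of the distance-weighted vote over all training points, using the 1/d² weighting the next slide describes; the function and variable names are illustrative:

```python
import math
from collections import defaultdict

def weighted_knn_classify(query, training_data):
    """Distance-weighted vote over all training points (K = n).
    training_data: list of (instance, class_label) pairs."""
    scores = defaultdict(float)
    for x, label in training_data:
        d = math.dist(query, x)
        if d == 0.0:
            return label                 # exact match: take its class directly
        scores[label] += 1.0 / (d * d)   # far-away points contribute very little
    return max(scores, key=scores.get)
```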

  12. Weight curve
  • 1 over distance squared
  • Could get less fancy and go linear
  • But then training data that is very far away would still have a strong influence

  13. Could go more fancy
  • Other radial basis functions
  • Sometimes known as a kernel function
  • One of the more common is the Gaussian
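
A sketch of a Gaussian radial-basis (kernel) weight that could stand in for the 1/d² weight in the earlier sketch; the bandwidth parameter sigma is an assumed knob, not something fixed by the slides:

```python
import math

def gaussian_weight(distance, sigma=1.0):
    """Kernel weight: near points get a weight close to 1, far points decay
    smoothly toward 0; sigma controls how quickly the influence falls off."""
    return math.exp(-(distance ** 2) / (2.0 * sigma ** 2))
```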

  14. Issues
  • The work is back-loaded, and it gets worse the bigger the training data
  • Can alleviate this with data structures
  • What else? Other issues?
  What if only some dimensions contribute to the ability to classify? Differences in the other dimensions would still put distance between a point and the target.

  15. Curse of dimensionality
  • The book calls this the curse of dimensionality
  • More is not always better
  • Points might be identical in the important dimensions but distant in others
  From Wikipedia: In applied mathematics, the curse of dimensionality (a term coined by Richard E. Bellman),[1][2] also known as the Hughes effect[3] or Hughes phenomenon[4] (named after Gordon F. Hughes),[5][6] refers to the problem caused by the exponential increase in volume associated with adding extra dimensions to a mathematical space. For example, 100 evenly spaced sample points suffice to sample a unit interval with no more than 0.01 distance between points; an equivalent sampling of a 10-dimensional unit hypercube with a lattice spacing of 0.01 between adjacent points would require 10^20 sample points. Thus, in some sense, the 10-dimensional hypercube can be said to be a factor of 10^18 "larger" than the unit interval. (Adapted from an example by R. E. Bellman.)

  16. Gene expression data
  • Thousands of genes
  • Relatively few patients
  • Is there a curse?

  17. Can it classify discrete data?
  • The Bayesian classifier could
  • Think of discrete data as being pre-binned
  • Remember the RNA classification: the data in each dimension was A, C, U, or G
  Representation becomes all-important. How do we measure distance? If the data could be arranged appropriately we could use techniques like Hamming distance, but A might be closer to G than to C or U (A and G are both purines, while C and U are pyrimidines). Dimensional distance becomes domain specific.
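
A sketch of one way the per-dimension distance could be made domain specific for nucleotide data; the particular values (0 for a match, 0.5 within the same chemical family, 1 otherwise) are made-up illustrations, not values from the slides:

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "U"}

def base_distance(b1, b2):
    """Per-dimension distance between two RNA bases (illustrative values)."""
    if b1 == b2:
        return 0.0
    same_family = ((b1 in PURINES and b2 in PURINES) or
                   (b1 in PYRIMIDINES and b2 in PYRIMIDINES))
    return 0.5 if same_family else 1.0   # plain Hamming would give 1 for any mismatch

def sequence_distance(s1, s2):
    """Sum the per-position distances between two aligned sequences."""
    return sum(base_distance(a, b) for a, b in zip(s1, s2))

print(sequence_distance("ACGU", "GCGU"))   # 0.5: A vs. G, same family
print(sequence_distance("ACGU", "CCGU"))   # 1.0: A vs. C, different families
```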

  18. First few records in the training data (shown as a table on the original slide)
  • See any issues? Hint: think of how Euclidean distance is calculated
  • Another issue: we should really normalize the data
  • For each entry in a dimension, rescale it relative to that dimension's range, e.g. $x' = \frac{x - \min}{\max - \min}$
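
A minimal sketch of per-dimension min-max rescaling; the choice of min-max over, say, z-scoring is an assumption here:

```python
def min_max_normalize(instances):
    """Rescale every dimension of a list of numeric instances into [0, 1]."""
    n_dims = len(instances[0])
    mins = [min(x[d] for x in instances) for d in range(n_dims)]
    maxs = [max(x[d] for x in instances) for d in range(n_dims)]
    return [
        [(x[d] - mins[d]) / (maxs[d] - mins[d]) if maxs[d] > mins[d] else 0.0
         for d in range(n_dims)]
        for x in instances
    ]

# A dimension with a large range (e.g. income) no longer swamps a small one (e.g. age)
print(min_max_normalize([[25, 30000.0], [40, 90000.0], [60, 60000.0]]))
```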

  19. Other uses of instance based approaches
  • Function approximation
  • Real-valued prediction: take the average of the nearest K neighbors
  • If we don't know the function, and/or it is too complex to "learn", just plug in a new value and the KNN approach can "learn" the predicted value on the fly by averaging the nearest neighbors
  Why average?
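
A sketch of KNN used for real-valued prediction by averaging, reusing the exhaustive-search idea from earlier; the names and the (instance, value) pair representation are illustrative:

```python
import math

def knn_predict(query, training_data, k=3):
    """Real-valued prediction: average the target values of the K nearest
    training points. training_data: list of (instance, y_value) pairs."""
    neighbors = sorted(training_data, key=lambda pair: math.dist(query, pair[0]))[:k]
    return sum(y for _, y in neighbors) / len(neighbors)
```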

  20. Regression
  • Choose an m and b that minimize the squared error
  • But again, computationally, how do we find the m and b that minimize $E = \sum_i \big(y_i - (m x_i + b)\big)^2$?

  21. Other things that can be learned
  • If we want to learn an instantaneous slope, we can do local regression
  • Get the slope of a line that fits just the local data

  22. The how: big picture
  • For each training datum we know what Y should be
  • If we have a randomly generated m and b, these, along with X, will tell us a predicted Y
  • So we know whether the current m and b yield too large or too small a prediction
  • We can nudge m and b in an appropriate direction (+ or -)
  • Sum these proposed nudges across all training data
  (Figure on the slide: the line represents the output, i.e. the predicted Y; an annotated point marks a target Y that is too low.)

  23. Gradient descent
  • Which way should m go to reduce the error?
  • For the squared error above, the nudge at each training point comes from the difference between the actual y and the predicted y: $\frac{\partial E}{\partial m} = -2\sum_i x_i\big(y_i - (m x_i + b)\big)$
  • Could average these nudges, then do the same for b using $\frac{\partial E}{\partial b} = -2\sum_i \big(y_i - (m x_i + b)\big)$, then do it all again (iterate)
  (Figure on the slide: a rise-over-run sketch comparing the actual y values to the current line.)
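
A sketch of gradient descent for the line y = m·x + b under the squared error from slide 20; the learning rate, step count, and example data are assumed for illustration:

```python
def fit_line_gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Repeatedly nudge m and b downhill on sum_i (y_i - (m*x_i + b))^2."""
    m, b = 0.0, 0.0                      # could also start from random values
    n = len(xs)
    for _ in range(steps):
        # Average the proposed nudges (gradient terms) across all training data
        grad_m = sum(-2 * x * (y - (m * x + b)) for x, y in zip(xs, ys)) / n
        grad_b = sum(-2 * (y - (m * x + b)) for x, y in zip(xs, ys)) / n
        m -= learning_rate * grad_m      # step against the gradient
        b -= learning_rate * grad_b
    return m, b

# Points scattered around y = 2x + 1 should give m close to 2 and b close to 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]
print(fit_line_gradient_descent(xs, ys))
```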

  24. Back to why we went down this road
  • Locally weighted linear regression
  • Would still perform gradient descent
  • Becomes a global function approximation
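
A sketch of locally weighted linear regression at a single query point, combining a Gaussian-style distance weight with a weighted least-squares fit; solving via numpy's lstsq rather than gradient descent, and the sigma bandwidth, are simplifications assumed here:

```python
import numpy as np

def locally_weighted_predict(x_query, xs, ys, sigma=1.0):
    """Fit y = m*x + b around x_query, weighting each training point by a
    Gaussian of its distance to the query, then predict at x_query."""
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    weights = np.exp(-((xs - x_query) ** 2) / (2.0 * sigma ** 2))
    design = np.column_stack([xs, np.ones_like(xs)])   # columns for m and b
    sqrt_w = np.sqrt(weights)
    # Weighted least squares: scale rows of the design matrix and targets by sqrt(w)
    coeffs, *_ = np.linalg.lstsq(design * sqrt_w[:, None], ys * sqrt_w, rcond=None)
    m, b = coeffs
    return m * x_query + b
```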

  25. Summary
  • KNN is highly effective for many practical problems, given sufficient training data
  • Robust to noisy training data
  • The work is back-loaded
  • Susceptible to the curse of dimensionality

