
Introduction to Pattern Recognition Chapter 1 ( Duda et al.)


Presentation Transcript


  1. Introduction to Pattern Recognition, Chapter 1 (Duda et al.). CS479/679 Pattern Recognition, Dr. George Bebis

  2. Pattern Recognition/Classification • Assign an unknown pattern to one of several known categories (or classes).

  3. What is a Pattern? • A pattern could be an object or event. • Typically represented by a vector x of numbers. (Figure: examples of biometric patterns and hand gesture patterns.)

  4. What is a Pattern? (cont’d) • Loan/credit card applications • Income, # of dependents, mortgage amount → credit-worthiness classification • Dating services • Age, hobbies, income → “desirability” classification • Web documents • Key-word based descriptions (e.g., documents containing “football”, “NFL”) → document classification

  5. Pattern Class • A collection of “similar” objects. (Figure: a male class and a female class.)

  6. Main Objectives • Find a “way” to separate the data belonging to different classes. • Given “new” data, assign them to the closest category. (Example: gender classification.)

  7. Main Approaches (x: input vector/pattern, y: class label/class) • Generative • Model the joint probability p(x, y) • Make predictions by using Bayes rule to calculate p(y|x) • Pick the most likely class label y • Discriminative • Estimate p(y|x) directly (e.g., learn a direct mapping from inputs x to the class labels y) • Pick the most likely class label y
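A minimal sketch of the generative route, using a made-up discrete toy problem (all probabilities below are invented for illustration, not taken from the slides):

```python
import numpy as np

# Toy generative model: discrete feature x in {0, 1, 2}, classes y in {0, 1}.
# All probabilities are invented for illustration.
p_y = np.array([0.5, 0.5])                   # priors p(y)
p_x_given_y = np.array([[0.7, 0.2, 0.1],     # p(x | y=0)
                        [0.1, 0.3, 0.6]])    # p(x | y=1)

def predict_generative(x):
    # Bayes rule: p(y|x) is proportional to p(x|y) * p(y).
    joint = p_x_given_y[:, x] * p_y          # p(x, y) for each y
    posterior = joint / joint.sum()          # normalize to get p(y|x)
    return int(posterior.argmax()), posterior

label, post = predict_generative(2)
print(label, post)   # -> class 1, since x=2 is much more likely under it
```

A discriminative method would instead estimate p(y|x) directly (e.g., logistic regression) without ever modeling the joint p(x, y).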

  8. How do we model p(x,y)? • Typically, using a statistical model • e.g., a probability density function such as a Gaussian. (Figure: gender classification, with one density for the male class and one for the female class.)
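As a concrete sketch of such a statistical model, assuming a single hypothetical 1-D feature (height in cm) and made-up sample values, one could fit a Gaussian per class and apply Bayes rule:

```python
import numpy as np
from scipy.stats import norm

# Hypothetical 1-D feature (height in cm) for the two classes.
male   = np.array([175., 180., 178., 182., 177.])
female = np.array([162., 165., 160., 168., 163.])

# Model each class-conditional density p(x|y) as a Gaussian.
mu_m, sd_m = male.mean(), male.std(ddof=1)
mu_f, sd_f = female.mean(), female.std(ddof=1)

def posterior_male(x, prior_male=0.5):
    # Bayes rule with Gaussian class-conditionals.
    pm = norm.pdf(x, mu_m, sd_m) * prior_male
    pf = norm.pdf(x, mu_f, sd_f) * (1.0 - prior_male)
    return pm / (pm + pf)

print(posterior_male(172.0))   # a height between the two class means
```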

  9. How do we model p(x,y)? (cont’d) • Key challenges: • Intra-class variability (e.g., the letter “T” in different typefaces) • Inter-class variability (e.g., letters/numbers that look similar)

  10. Generative Approach:Main Objectives • Hypothesize the models that describe each pattern class (e.g., recover the process that generated the patterns). • Given a novel pattern, choose the best-fitting model for it and then assign it to the pattern class associated with the model.

  11. Classification vs. Clustering • Classification (known categories; supervised classification, i.e., recognition) • Clustering (unknown categories; unsupervised classification) (Figure: category “A” vs. category “B”.)
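A short sketch of the distinction, assuming scikit-learn is available; the synthetic 2-D blobs stand in for categories “A” and “B”:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import NearestCentroid

rng = np.random.default_rng(0)
# Two synthetic 2-D blobs standing in for categories "A" and "B".
A = rng.normal([0.0, 0.0], 0.5, size=(50, 2))
B = rng.normal([3.0, 3.0], 0.5, size=(50, 2))
X = np.vstack([A, B])

# Classification (supervised): the labels are known at training time.
y = np.array([0] * 50 + [1] * 50)
clf = NearestCentroid().fit(X, y)
print(clf.predict([[0.2, 0.1], [2.9, 3.2]]))   # assigns known categories

# Clustering (unsupervised): no labels; k-means discovers the groups.
clusters = KMeans(n_clusters=2, n_init=10).fit_predict(X)
print(np.bincount(clusters))                   # roughly 50/50
```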

  12. Pattern Recognition Applications

  13. Handwriting Recognition

  14. Handwriting Recognition (cont’d)

  15. License Plate Recognition

  16. Biometric Recognition

  17. Fingerprint Classification

  18. Face Detection

  19. Autonomous Systems

  20. Medical Applications Breast Cancer Detection Skin Cancer Detection

  21. Land Classification(from aerial or satellite images)

  22. “Hot” Applications • Recommendation systems • Amazon, Netflix • Targeted advertising • Spam filters • Loan/Credit Card Applications • Malicious website detection

  23. Complexity of PR – An Example. Problem: sorting incoming fish on a conveyor belt, viewed by a camera. Assumption: two kinds of fish: (1) sea bass, (2) salmon.

  24. Training/Test Phases (Figure: the training phase and the test phase of the system.)

  25. Training/Test data • How do we know that we have collected an adequately large and representative set of examples for training/testing the system? How should the available data be split between a training set and a test set?

  26. Pre-processing Step Example (1) Image enhancement (2) Separate touching or occluding fish (3) Find the boundary of each fish

  27. Sensors & Preprocessing • Sensing: • Use a sensor (camera or microphone) for data capture. • PR depends on bandwidth, resolution, sensitivity, distortion of the sensor. • Pre-processing: • Removal of noise in data. • Segmentation (i.e., isolation of patterns of interest from background).

  28. Feature Extraction • Assume a fisherman told us that a sea bass is generally longer than a salmon. • We can use length as a feature and decide between sea bass and salmon according to a threshold on length. • How should we choose the threshold?
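A minimal sketch of such a threshold rule; the threshold value and the fish lengths below are hypothetical:

```python
# The threshold value and the fish lengths below are hypothetical.
THRESHOLD = 40.0   # cm; how to choose this is exactly the open question

def classify(length_cm):
    # Fisherman's rule of thumb: longer fish -> sea bass.
    return "sea bass" if length_cm > THRESHOLD else "salmon"

for length in (25.0, 38.0, 55.0):
    print(length, "->", classify(length))
```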

  29. “Length” Histograms • Even though sea bass are longer than salmon on average, there are many fish for which this observation does not hold. (Figure: length histograms for the two classes, with candidate threshold l*.)

  30. “Average Lightness” Histograms • Consider a different feature, such as “average lightness”. • It seems easier to choose the threshold x*, but we still cannot make a perfect decision. (Figure: lightness histograms for the two classes, with threshold x*.)
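One simple way to pick x* is to scan candidate thresholds on labeled training data and keep the one that makes the fewest mistakes; a sketch with invented lightness values:

```python
import numpy as np

# Hypothetical "average lightness" values for labeled training fish;
# note the deliberate overlap between the two classes.
salmon  = np.array([3.1, 4.0, 4.5, 5.2, 5.8])
seabass = np.array([5.5, 6.3, 7.0, 7.8, 8.4])

def training_error(t):
    # Rule: predict sea bass when lightness > t, salmon otherwise.
    mistakes = np.sum(salmon > t) + np.sum(seabass <= t)
    return mistakes / (salmon.size + seabass.size)

# Scan candidate thresholds and keep the one with the fewest mistakes.
candidates = np.linspace(3.0, 9.0, 121)
x_star = min(candidates, key=training_error)
print(x_star, training_error(x_star))   # error > 0: no perfect decision
```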

  31. Multiple Features • To improve recognition accuracy, we might have to use more than one feature. • A single feature might not yield the best performance. • Using combinations of features might yield better performance. • How many features should we choose?

  32. How Many Features? • Does adding more features always improve performance? • It might be difficult and computationally expensive to extract certain features. • Correlated features might not improve performance (i.e., redundancy). • The “curse” of dimensionality.

  33. Curse of Dimensionality • Adding too many features can, paradoxically, lead to a worsening of performance. • Divide each input feature into a number of intervals, so that the value of a feature can be specified approximately by saying in which interval it lies. • If each input feature is divided into M intervals, then the total number of cells is M^d (d: the number of features). • Since each cell must contain at least one point, the amount of training data needed grows exponentially with d.
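A quick illustration of how fast M^d grows:

```python
# Number of cells when each of d features is split into M intervals: M ** d.
M = 10
for d in (1, 2, 3, 5, 10):
    print(f"d={d:2d}: {M**d:,} cells, each needing at least one sample")
```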

  34. Missing Features • Certain features might be missing (e.g., due to occlusion). • How should we train the classifier with missing features ? • How should the classifier make the best decision with missing features ?

  35. “Quality” of Features • How to choose a good set of features? • Discriminative features • Invariant features (e.g., invariant to geometric transformations such as translation, rotation and scale) • Are there ways to automatically learn which features are best ?

  36. Classification • Partition the feature space into two regions by finding the decision boundary that minimizes the error. • How should we find the optimal decision boundary?

  37. Complexity of Model • We can get perfect classification performance on the training data by choosing a more complex model. • Complex models are tuned to the particular training samples rather than to the characteristics of the true model: overfitting. How well can the model generalize to unknown samples?

  38. Generalization • Generalization is defined as the ability of a classifier to produce correct results on novel patterns. • How can we improve generalization performance? • More training examples (i.e., better model estimates). • Simpler models usually yield better performance. (Figure: a simpler model vs. a complex model.)

  39. Understanding model complexity: function approximation • Approximate a function from a set of samples • The green curve is the true function • Ten sample points, with added noise, are shown as blue circles

  40. Understanding model complexity: function approximation (cont’d) Polynomial curve fitting: polynomials of various orders, shown as red curves, fitted to the set of 10 sample points.
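A sketch of this experiment using NumPy’s polyfit, with sin(2πx) standing in for the true function (as in Bishop’s classic example; the exact curve behind the slide figure may differ):

```python
import numpy as np

rng = np.random.default_rng(1)
# Ten noisy samples of a "true" function (sin(2*pi*x) here as a
# stand-in; the exact curve behind the slide figure may differ).
x = np.linspace(0.0, 1.0, 10)
t = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, x.size)

x_test = np.linspace(0.0, 1.0, 100)
t_test = np.sin(2 * np.pi * x_test)

for order in (1, 3, 9):
    coeffs = np.polyfit(x, t, order)     # least-squares polynomial fit
    pred = np.polyval(coeffs, x_test)
    rmse = np.sqrt(np.mean((pred - t_test) ** 2))
    print(f"order {order}: test RMSE = {rmse:.3f}")
# The 9th-order fit passes through all 10 training points yet
# typically has the worst test error: overfitting in miniature.
```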

  41. Understanding model complexity: function approximation (cont’d) Polynomial curve fitting: 9th-order polynomials fitted to 15 and 100 sample points.

  42. Improve Classification Performance through Post-processing • Consider the problem of character recognition • Exploit context to improve performance. How m ch info mation are y u mi sing?

  43. Improve Classification Performance through Ensembles of Classifiers • Performance can be improved using a "pool" of classifiers. • How should we build and combine different classifiers ?
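One common combination scheme is a simple majority vote over the pool; a sketch with made-up 0/1 predictions from three classifiers:

```python
import numpy as np

# Hypothetical predictions (0/1 labels) from three classifiers on
# five test patterns; a majority vote combines the "pool".
preds = np.array([[0, 1, 1, 0, 1],
                  [0, 1, 0, 0, 1],
                  [1, 1, 1, 0, 0]])
majority = (preds.sum(axis=0) >= 2).astype(int)
print(majority)   # -> [0 1 1 0 1]
```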

  44. Cost of misclassifications • Fish classification: two possible classification errors: (1) Deciding the fish was a sea bass when it was a salmon. (2) Deciding the fish was a salmon when it was a sea bass. • Are both errors equally important?

  45. Cost of misclassifications (cont’d) • Suppose that: • Customers who buy salmon will object vigorously if they see sea bass in their cans. • Customers who buy sea bass will not be unhappy if they occasionally see some expensive salmon in their cans. • How does this knowledge affect our decision?
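This knowledge can be encoded as an asymmetric loss matrix, after which we choose the decision with minimum expected loss; a sketch with made-up costs:

```python
import numpy as np

# Hypothetical loss matrix: rows = true class, cols = decision.
# Putting sea bass in a "salmon" can upsets customers more, so
# that mistake is assigned a higher cost.
#                decide: salmon  sea bass
LOSS = np.array([[0.0,    1.0],    # truth: salmon
                 [5.0,    0.0]])   # truth: sea bass

def decide(posterior):
    # posterior = [p(salmon|x), p(sea bass|x)]
    # Choose the action with minimum expected loss (Bayes risk).
    expected_loss = posterior @ LOSS
    return ["salmon", "sea bass"][int(expected_loss.argmin())]

# Even when salmon is more likely, the costly error shifts the decision.
print(decide(np.array([0.7, 0.3])))   # -> "sea bass"
```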

  46. Computational Complexity • How does an algorithm scale with the number of: • features • patterns • categories • Need to consider tradeoffs between computational complexity and performance.

  47. Would it be possible to build a “general purpose” PR system? • It would be very difficult to design a system that is capable of performing a variety of classification tasks. • Different problems require different features. • Different features might yield different solutions. • Different tradeoffs exist for different problems.
