Outline
This presentation is the property of its rightful owner.
Sponsored Links
1 / 21

Outline PowerPoint PPT Presentation


  • 105 Views
  • Uploaded on
  • Presentation posted in: General

Outline. K-Nearest Neighbor algorithm Fuzzy Set theory Classifier Accuracy Measures. Eager Learners : when given a set of training tuples, will construct a generalization model before receiving new tuples to classify Classification by decision tree induction Rule-based classification

Download Presentation

Outline

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Outline

Outline

  • K-Nearest Neighbor algorithm

  • Fuzzy Set theory

  • Classifier Accuracy Measures


Chapter 6 classification and prediction

Eager Learners: when given a set of training tuples, will construct a generalization model before receiving new tuples to classify

Classification by decision tree induction

Rule-based classification

Classification by back propagation

Support Vector Machines (SVM)

Associative classification

Chapter 6. Classification and Prediction


Lazy vs eager learning

Lazy vs. Eager Learning

  • Lazy vs. eager learning

    • Lazy learning (e.g., instance-based learning): Simply stores training data (or only minor processing) and waits until it is given a test tuple

    • Eager learning (the above discussed methods): Given a set of training set, constructs a classification model before receiving new (e.g., test) data to classify

  • Lazy: less time in training but more time in predicting


Lazy learner instance based methods

Lazy Learner: Instance-Based Methods

  • Typical approaches

    • k-nearest neighbor approach

      • Instances represented as points in a Euclidean space.


The k nearest neighbor algorithm

The k-Nearest Neighbor Algorithm

  • All instances correspond to points in the n-D space

  • The nearest neighbor are defined in terms of Euclidean distance, dist(X1, X2)

  • Target function could be discrete- or real- valued

  • For discrete-valued, k-NN returns the most common value among the k training examples nearest to xq

_

_

_

.

_

+

+

_

+

xq

_

+


The k nearest neighbor algorithm1

The k-Nearest Neighbor Algorithm

  • k-NN for real-valued prediction for a given unknown tuple

    • Returns the mean values of the k nearest neighbors

  • Distance-weighted nearest neighbor algorithm

    • Weight the contribution of each of the k neighbors according to their distance to the query xq

    • Give greater weight to closer neighbors

  • Robust to noisy data by averaging k-nearest neighbors


The k nearest neighbor algorithm2

The k-Nearest Neighbor Algorithm

  • How can I determine the value of k, the number of neighbors?

    • In general, the larger the number of training tuples is, the larger the value of k is

  • Nearest-neighbor classifiers can be extremely slow when classifying test tuples O(n)

  • By simple presorting and arranging the stored tuples into search tree, the number of comparisons can be reduced to O(logN)


The k nearest neighbor algorithm3

The k-Nearest Neighbor Algorithm

  • Example:

    K=5


Outline1

Outline

  • K-Nearest Neighbor algorithm

  • Fuzzy Set theory

  • Classifier Accuracy Measures


Fuzzy set approaches

Fuzzy Set Approaches

  • Rule-based systems for classification have the disadvantage that they involve sharp cutoffs for continuous attributes

    • For example:

      IF (years_employed>2) AND (income>50K)

      THEN credit_card=approved

      What if a customer has 10 years employed and income is 49K?


Fuzzy set approaches1

Fuzzy Set Approaches

  • Instead, we can discretize income into categories such as {low,medium,high}, and then apply fuzzy logic to allow “fuzzy” threshold for each category


Fuzzy set approaches2

Fuzzy Set Approaches

  • Fuzzy theory is also known as possibility theory, it was proposed by Lotif Zadeh in 1965

  • Unlike the notion of traditional “crisp” sets where an element either belongs to a set S, in fuzzy theory, elements can belong to more than one fuzzy set


Fuzzy set approaches3

Fuzzy Set Approaches

  • For example, the income value $49K belongs to both the medium and high fuzzy sets:

    Mmedium($49K)=0.15 and

    Mhigh($49K)=0.96


Fuzzy set approaches4

Fuzzy Set Approaches

Another example for temperature


Fuzzy set applications

Fuzzy Set Applications

  • http://www.dementia.org/~julied/logic/applications.html


Outline2

Outline

  • K-Nearest Neighbor algorithm

  • Fuzzy Set theory

  • Classifier Accuracy Measures


Classifier accuracy measures

Classifier Accuracy Measures


Classifier accuracy measures1

Classifier Accuracy Measures

  • Alternative accuracy measures (e.g., for cancer diagnosis)

    sensitivity = t-pos/pos

    specificity = t-neg/neg

    precision = t-pos/(t-pos + f-pos)

    accuracy =


  • Login