CS 782 Machine Learning



CS 782 Machine Learning

9 Instance-Based Learning

Prof. Gheorghe Tecuci

Learning Agents Laboratory

Computer Science Department

George Mason University



Overview

Exemplar-based representation of concepts

The k-nearest neighbor algorithm

Discussion

Lazy Learning versus Eager Learning

Recommended reading



Concept representation

Let us consider a set of concepts C = {c1, c2, ... , cn}, covering a universe of instances I.

Each concept ci represents a subset of I.

How is a concept usually represented?

How does one test whether an object ‘a’ is an instance of a concept ci?



Intensional representation of concepts

How is a concept usually represented?

Usually, a concept is represented intensionally, by a description that covers the positive examples of the concept and does not cover the negative examples.

How does one test whether an object ‘a’ is an instance of a concept “ci”?

The set of instances represented by a concept ci is the set of instances covered by the description of ci. Therefore, testing whether an object ‘a’ is an instance of a concept ci reduces to testing whether the description of ci is more general than the description of ‘a’.
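To make this generality test concrete, here is a minimal sketch (not from the slides): it assumes the concept description and the object are both attribute-value structures, with the description mapping each constrained attribute to its set of allowed values.

```python
# Hypothetical representation: a description maps attributes to sets of allowed
# values; an object maps attributes to single values. The description is more
# general than the object if every constraint it imposes is satisfied.

def more_general_than(description, obj):
    """Return True if `description` covers `obj`."""
    return all(obj.get(attr) in allowed for attr, allowed in description.items())

# Hypothetical data: a concept description and a candidate object.
ci_description = {"shape": {"circle", "oval"}, "color": {"red"}}
a = {"shape": "circle", "color": "red", "size": "small"}

print(more_general_than(ci_description, a))   # True: 'a' is an instance of ci
```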

How could we represent a concept extensionally?

How could we represent a concept extensionally, without specifying all its instances?



Exemplar-based representation of concepts

  • A concept ci may be represented extensionally by:

    • a collection of examples ci = {ei1, ei2, ...},

    • a similarity estimation function f, and

    • a threshold value q.

An instance ‘a’ belongs to the concept ci if ‘a’ is similar to some exemplar eij of ci and this similarity exceeds the threshold, that is, f(a, eij) > q.
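A minimal sketch of this membership test follows, under two assumptions not stated on the slide: exemplars are numeric feature vectors, and the similarity function f is negated Euclidean distance (so larger values mean more similar).

```python
import math

def f(x, e):
    # Similarity as negated Euclidean distance (an assumption; any similarity works).
    return -math.sqrt(sum((xi - ei) ** 2 for xi, ei in zip(x, e)))

def belongs_to(a, exemplars, q):
    # 'a' belongs to the concept if it is similar enough to at least one exemplar.
    return any(f(a, e) > q for e in exemplars)

ci = [(1.0, 1.0), (1.2, 0.8)]                # exemplars of concept ci (made up)
print(belongs_to((1.1, 0.9), ci, q=-0.5))    # True: within distance 0.5 of an exemplar
```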

How could a concept ci be generalized in this representation?



Generalization in exemplar-based representations

How could a concept ci be generalized in this representation?

  • Generalizing the concept ci may be achieved by:

    • adding a new exemplar;

    • decreasing q.

Why are these generalization operations?

Is there an alternative to considering the threshold value q for classification of an instance?



Prediction with exemplar-based representations

Let us consider a set of concepts C = {c1, c2, ... , cn}, covering a universe of instances I.

Each concept ci is represented extensionally as a collection of examples ci = {ei1, ei2, ...}.

Let ‘a’ be an instance to classify.

How does one decide to which concept ‘a’ belongs?

Different answers to this question lead to different learning methods.



Prediction (cont)

Let ‘a’ be an instance to classify in one of the classes

{c1, c2, ... , cn}.

How does one decide to which concept it belongs?

Method 1

‘a’ belongs to the concept ci if the exemplar most similar to ‘a’ is an exemplar eij of ci, that is, the similarity between ‘a’ and eij is greater than the similarity between ‘a’ and any exemplar of any other concept (1-nearest neighbor).

What is a potential problem with 1-nearest neighbor?

Hint: Think of an exemplar which is not typical.



Prediction (cont)

How could the problem with method 1 be alleviated?

Use more than one example.

Method 2

Consider the k most similar exemplars.

‘a’ belongs to the concept ci that contains most of the k exemplars (k-nearest neighbor).

What is a potential problem with k-nearest neighbor?

Hint: Think of the intuition behind instance-based learning.



Prediction (cont)

How could the problem with method 2 be alleviated?

Weight the exemplars.

Method 3

Consider the k most similar exemplars, but weight their contribution to the class of ‘a’ by their distance to ‘a’, giving greater weight to the closest neighbors (distance-weighted nearest neighbor).
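A minimal sketch of this distance-weighted variant, assuming numeric feature vectors and inverse-distance weights (one common choice; any decreasing function of distance would serve):

```python
import math
from collections import defaultdict

def distance(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def distance_weighted_knn(query, examples, k=5):
    """examples: list of (feature_vector, class_label) pairs."""
    nearest = sorted(examples, key=lambda ex: distance(query, ex[0]))[:k]
    votes = defaultdict(float)
    for vec, label in nearest:
        d = distance(query, vec)
        if d == 0.0:
            return label              # an exact match dominates the vote
        votes[label] += 1.0 / d       # closer neighbors get greater weight
    return max(votes, key=votes.get)

examples = [((0.0, 0.0), "-"), ((0.2, 0.1), "-"), ((1.0, 1.0), "+")]
print(distance_weighted_knn((0.9, 0.9), examples, k=3))   # "+"
```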



Overview

Exemplar-based representation of concepts

The k-nearest neighbor algorithms

Discussion

Lazy Learning versus Eager Learning

Recommended reading



The k-nearest neighbor algorithm

Each example is represented using the feature-vector representation:

ei = (a1=vi1, a2=vi2, … , an=vin)

The distance between two examples ei and ej is the Euclidean distance:

d(ei, ej) = √( Σk=1..n (vik − vjk)² )

Training algorithm

Each example is represented as a feature-value vector.

For each training example (eik, ci), add eik to the exemplars of ci.

Classification algorithm

Let ‘a’ be an instance to classify.

Find the k most similar exemplars.

Assign ‘a’ to the concept that contains most of the k exemplars.
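A minimal sketch of the two steps above, assuming numeric feature-value vectors and simple majority voting among the k nearest exemplars:

```python
import math
from collections import Counter

def euclidean(ei, ej):
    return math.sqrt(sum((vik - vjk) ** 2 for vik, vjk in zip(ei, ej)))

def train(training_examples):
    # Training just stores the examples; no generalization is performed.
    return list(training_examples)

def classify(a, exemplars, k=3):
    # Find the k most similar exemplars and take the majority concept.
    nearest = sorted(exemplars, key=lambda ex: euclidean(a, ex[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

exemplars = train([((0.0, 0.0), "c1"), ((0.1, 0.2), "c1"), ((1.0, 1.0), "c2")])
print(classify((0.2, 0.1), exemplars, k=3))   # "c1"
```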



Nearest neighbor algorithms: illustration

[Figure: a query instance q1 plotted among positive (+) and negative (−) exemplars; its single nearest exemplar, e1, is positive, while most of its five nearest exemplars are negative.]

1-nearest neighbor: q1 is classified into the concept represented by e1 (positive).

5-nearest neighbors: q1 is classified as negative.



Overview

Exemplar-based representation of concepts

The k-nearest neighbor algorithms

Discussion

Lazy Learning versus Eager Learning

Recommended reading



Nearest neighbor algorithms: inductive bias

What is the inductive bias of the k-nearest neighbor algorithm?

The assumption that the classification of an instance ‘a’ will be most similar to the classification of other instances that are nearby in the Euclidean space.



Application issues

What are some practical issues in applying the k-nearest neighbor algorithms?

Because the distance between instances is computed over all the attributes, less relevant attributes, and even irrelevant ones, influence the classification of a new instance.

Because the algorithm delays all processing until a new classification/prediction is required, significant processing is needed to make the prediction.

Because the algorithm is based on a distance function, the attribute values must be such that a distance can be computed.

How to alleviate these problems?



Application issue: the use of the attributes

The classification of an example is based on all the attributes, independent of their relevance. Even the irrelevant attributes are used.

How to alleviate this problem?

Weight the contribution of each attribute, based on its relevance.

How to determine the relevance of an attribute?

Use an approach similar to cross-validation.

How?
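One possible reading of this idea, sketched below (not from the slides): hold each training example out in turn, classify it with its nearest neighbor under a candidate weighting of the attributes, and keep the weight vector that yields the best leave-one-out accuracy. A weight of zero effectively drops an attribute.

```python
import math
from itertools import product

def weighted_distance(x, y, w):
    return math.sqrt(sum(wk * (a - b) ** 2 for wk, a, b in zip(w, x, y)))

def loo_accuracy(examples, w):
    # Leave-one-out: classify each example by its nearest neighbor among the rest.
    correct = 0
    for i, (xi, yi) in enumerate(examples):
        rest = examples[:i] + examples[i + 1:]
        nearest = min(rest, key=lambda ex: weighted_distance(xi, ex[0], w))
        correct += (nearest[1] == yi)
    return correct / len(examples)

def tune_weights(examples, candidates=(0.0, 0.5, 1.0)):
    # Coarse grid search over per-attribute weights (a simple stand-in for more
    # refined weight-tuning schemes).
    n = len(examples[0][0])
    return max(product(candidates, repeat=n), key=lambda w: loo_accuracy(examples, w))
```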



Application issue: processing for classification

Because the algorithm delays all processing until a new classification/prediction is required, significant processing is needed to make the prediction.

How to alleviate this problem?

Use complex indexing techniques to facilitate the identification of the nearest neighbors at some additional cost in memory.

How?

Trees where the leaves are exemplars, nearby exemplars are stored at nearby nodes, and internal nodes sort the query toward the relevant leaf by testing selected attributes (e.g., kd-trees).
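For instance, a k-d tree over the stored exemplars lets the nearest neighbors be retrieved without scanning every example; a minimal sketch using SciPy (the exemplars and labels are made up):

```python
import numpy as np
from scipy.spatial import cKDTree

exemplars = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
labels = ["neg", "neg", "pos", "pos"]

tree = cKDTree(exemplars)                  # built once, when the exemplars are stored
dist, idx = tree.query([0.2, 0.1], k=3)    # the query is sorted toward the relevant leaves
print([labels[i] for i in idx])            # ['neg', 'neg', 'pos']
```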



Instance-based learning: discussion

What are the advantages of instance-based learning algorithms?

What are the disadvantages of instance-based learning algorithms?



Instance-based learning: advantages

Model complex concept descriptions using simpler example descriptions.

Information present in the training examples is never lost, because the examples themselves are stored explicitly.



Instance-based learning: disadvantages

Classifying new instances is costly, because all processing is deferred to prediction time.

It is difficult to determine an appropriate distance function, especially when examples are represented as complex symbolic expressions.

Irrelevant features have a negative impact on the distance metric.



Lazy Learning versus Eager Learning

Lazy learning

Defer the decision of how to generalize beyond the training data until each new query instance is encountered.

Eager learning

Generalize beyond the training data before observing a new query, committing at training time to the learned concept.

How do the two types of learning compare in terms of computation time?

Lazy learners require less computation time for training and more for prediction.



Exercise

Suggest a lazy version of the eager decision tree learning algorithm ID3.

What are the advantages and disadvantages of your lazy algorithm compared to the original eager algorithm?



Recommended reading

Mitchell T.M., Machine Learning, Chapter 8: Instance-Based Learning, pp. 230-248, McGraw Hill, 1997.

Kibler D., Aha D., Learning Representative Exemplars of Concepts: An Initial Case Study, in J.W. Shavlik and T.G. Dietterich (eds.), Readings in Machine Learning, Morgan Kaufmann, 1990.

