
Learning Decision Trees for Knowledge Infusion

This article explores learning decision trees as a more efficient way of providing agents with knowledge than manually extracting it and instilling it into the agent. It discusses the benefits of learning, such as improved performance and autonomy, and introduces the elements and components of a learning problem. The article concludes with a worked example of constructing a decision tree for categorization.



  1. Learning Decision Trees

  2. Bridge-In • “knowledge infusion” is not always the best way of providing an agent with knowledge • it can be impractical and tedious • and leave the knowledge base incomplete, imprecise, possibly incorrect • learning, by contrast, offers • adaptivity • an agent can expand and modify its knowledge base to reflect changes • improved performance • through learning the agent can make better decisions • autonomy • without learning, an agent can hardly be considered autonomous

  3. Learning • learning is important for agents to deal with • unknown environments • changes • in many cases it is more efficient to train an agent via examples than to “manually” extract knowledge from the examples and “instill” it into the agent

  4. Introduction to Learning • Learning is the ability to improve behavior based on experience. • Components of a Learning Problem • task - the behavior being improved • data - the experiences used to improve performance • measure of improvement - e.g., accuracy in prediction, speed, coverage • background knowledge - a bias on what can be learned

  5. Elements of Learning • [Diagram: experiences (data), background knowledge (bias), and the task feed an internal representation, which a reasoning procedure uses to produce performance]

  6. Feedback to the Learner • Supervised learning: Learner told immediately whether response behavior was appropriate (training set) - correct answers for each example. • Unsupervised learning: No classifications are given; the learner has to discover regularities and categories in the data for itself - correct answers not given.

  7. Measuring Success • Training set, test set • The measure of success is not how well the agent performs on the training examples, but how well it performs for new examples.
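
A minimal sketch of this discipline in Python (scikit-learn and the iris data are illustrative assumptions; the slides name no particular library or dataset):

```python
# Hedged sketch: any classifier and labeled dataset would do.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Hold out a test set: success is judged on examples the learner never saw.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

tree = DecisionTreeClassifier().fit(X_train, y_train)
print("training accuracy:", tree.score(X_train, y_train))  # often 1.0 (memorized)
print("test accuracy:", tree.score(X_test, y_test))        # the honest measure
```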

  8. Learning from observation • Learning Agents • Inductive Learning • Learning Decision Trees

  9. Learning agents

  10. Inductive Learning • Simplest form: learn a function from examples • tries to find a function h (the hypothesis) that approximates a set of samples defining a function f • the samples are usually provided as input-output pairs (x, f(x)) • relies on inductive inference, or induction
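
As a minimal illustration (the sample pairs below are made up): the learner never sees f itself, only the pairs (x, f(x)), and judges a candidate hypothesis h by how well it reproduces them.

```python
# Made-up samples of an unknown function f, given as (x, f(x)) pairs.
samples = [(0, 1), (1, 3), (2, 5), (3, 7)]

def h(x):
    return 2 * x + 1   # one candidate hypothesis

# Induction: prefer hypotheses with small error on the observed pairs.
error = sum((h(x) - fx) ** 2 for x, fx in samples)
print(error)           # 0: this h is consistent with every sample
```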

  11. Example Inductive Learning 1 • [Plot: sample pairs shown as points in the (x, f(x)) plane] • input-output pairs displayed as points in a plane • the task is to find a hypothesis (a function) that connects the points • either all of them, or most of them • various performance measures • number of points connected • minimal surface • lowest tension

  12. Example Inductive Learning 2 • [Plot: piecewise-linear hypothesis passing through every sample point] • hypothesis is a function consisting of linear segments • fully incorporates all sample pairs • goes through all points • very easy to calculate

  13. Example Inductive Learning 3 • [Plot: polynomial hypothesis passing through every sample point] • hypothesis expressed as a polynomial function • incorporates all samples • more complicated to calculate than linear segments • better predictive power

  14. Example Inductive Learning 4 • [Plot: single straight-line hypothesis missing some sample points] • hypothesis is a linear function • does not incorporate all samples • extremely easy to compute • low predictive power
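
The four plots themselves do not survive in the transcript, but the three fitting styles they contrast can be sketched with NumPy on illustrative data (any handful of points would do):

```python
import numpy as np

# Illustrative sample pairs (x, f(x)); the slides' actual points are unknown.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 2.5, 2.0, 4.0, 3.5])

# Example 2: piecewise-linear segments through every point.
h_segments = lambda q: np.interp(q, x, y)

# Example 3: a degree-4 polynomial also passes through all five points.
h_poly = np.poly1d(np.polyfit(x, y, deg=len(x) - 1))

# Example 4: a single straight line fit by least squares; it misses points.
h_line = np.poly1d(np.polyfit(x, y, deg=1))

q = 2.5  # a new input: the hypotheses disagree on unseen x
print(h_segments(q), h_poly(q), h_line(q))
```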

  15. Decision Trees - Introduction • Goal: Categorization • Given an event, predict its category. • Who won a given ball game? • How should we file a given e-mail? • What word sense was intended for a given occurrence of a word? • Event = list of features. • Ball game: Which players were on offense? • E-mail: who sent the message? • Disambiguation: what was the preceding word?

  16. Introduction, cont. • Use a decision tree to predict categories for new events. • Use training data to build the decision tree. • [Diagram: training events and categories build the decision tree; a new event is run through the tree to yield its category]

  17. Terminology • example or sample • describes the values of the attributes and that of the goal predicate • a positive sample has the value true for the goal predicate, a negative sample false • the training set consists of the samples used for constructing the decision tree • the test set is used to determine whether the decision tree performs correctly • ideally, the test set is different from the training set

  18. Decision Trees • A decision tree is a tree where • each node is labeled with a feature • each arc of an interior node is labeled with a value for that feature • each leaf is labeled with a category • [Figure: example tree for the ball-game domain, with interior nodes such as location, goalie, weather, and time, arc values such as away/home, Jane/Bob, dry/wet, and 3pm/4pm/5pm, and win/lose leaves]
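
A minimal Python sketch of this structure; the Node class and the small ball-game tree are illustrative assumptions, since the slide's exact figure is not recoverable from the transcript.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class Node:
    feature: Optional[str] = None    # interior nodes carry a feature; None = leaf
    category: Optional[str] = None   # set only on leaves
    children: Dict[str, "Node"] = field(default_factory=dict)  # arc value -> subtree

def classify(node: Node, event: Dict[str, str]) -> str:
    """Follow the arc matching the event's value for each feature to a leaf."""
    while node.feature is not None:
        node = node.children[event[node.feature]]
    return node.category

# Hypothetical fragment of the ball-game tree on the slide.
tree = Node("location", children={
    "home": Node(category="win"),
    "away": Node("weather", children={
        "dry": Node(category="win"),
        "wet": Node(category="lose"),
    }),
})
print(classify(tree, {"location": "away", "weather": "wet"}))  # lose
```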

  19. Learning decision trees Problem: decide whether to wait for a table at a restaurant, based on the following attributes: • Alternate: is there an alternative restaurant nearby? • Bar: is there a comfortable bar area to wait in? • Fri/Sat: is today Friday or Saturday? • Hungry: are we hungry? • Patrons: number of people in the restaurant (None, Some, Full) • Price: price range ($, $$, $$$) • Raining: is it raining outside? • Reservation: have we made a reservation? • Type: kind of restaurant (French, Italian, Thai, Burger) • WaitEstimate: estimated waiting time (0-10, 10-30, 30-60, >60)

  20. Restaurant Sample Set
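
The sample-set table on this slide is an image and does not survive in the transcript. The sketch below reconstructs only the attribute values that slides 23-30 reveal for the twelve examples X1-X12; every value they do not reveal is left as None rather than guessed.

```python
# Partial reconstruction of the restaurant sample set.
# Columns: (Patrons, Hungry, Type, Friday, WillWait); None = not revealed.
samples = {
    "X1":  ("Some", None,  None,      None,  True),
    "X2":  ("Full", "Yes", "Thai",    "No",  False),
    "X3":  ("Some", None,  None,      None,  True),
    "X4":  ("Full", "Yes", "Thai",    "Yes", True),
    "X5":  ("Full", "No",  None,      None,  False),
    "X6":  ("Some", None,  None,      None,  True),
    "X7":  ("None", None,  None,      None,  False),
    "X8":  ("Some", None,  None,      None,  True),
    "X9":  ("Full", "No",  None,      None,  False),
    "X10": ("Full", "Yes", "Italian", None,  False),
    "X11": ("None", None,  None,      None,  False),
    "X12": ("Full", "Yes", "Burger",  None,  True),
}
```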

  21. Constructing Decision Trees • in general, constructing the smallest possible decision tree is a difficult problem • basic idea: test the most important attribute first • attribute that makes the most difference for the classification of an example • can be determined through information theory • hopefully will yield the correct classification with few tests
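
A sketch of the information-theoretic criterion alluded to here: choose the attribute with the highest information gain, i.e. the expected drop in entropy from splitting on it. Applied to Patrons over the twelve restaurant samples (None = 0 positive / 2 negative, Some = 4/0, Full = 2/4), it yields the gain that makes Patrons the best first test.

```python
import math

def entropy(pos, neg):
    """Entropy in bits of a set with pos positive and neg negative samples."""
    total = pos + neg
    return -sum(c / total * math.log2(c / total) for c in (pos, neg) if c)

def information_gain(pos, neg, splits):
    """Gain of an attribute whose values split the set into (pos_i, neg_i) parts."""
    total = pos + neg
    remainder = sum((p + n) / total * entropy(p, n) for p, n in splits)
    return entropy(pos, neg) - remainder

# Patrons on the 12 samples: None=(0,2), Some=(4,0), Full=(2,4).
print(information_gain(6, 6, [(0, 2), (4, 0), (2, 4)]))  # ~0.541 bits
```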

  22. Decision Tree Algorithm • recursive formulation • select the best attribute to split positive and negative examples • if only positive or only negative examples are left, we are done • if no examples are left, no such examples were observed • return a default value calculated from the majority classification at the node’s parent • if we have positive and negative examples left, but no attributes to split them, we are in trouble • samples have the same description, but different classifications • may be caused by incorrect data (noise), by a lack of information, or by a truly non-deterministic domain
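
A Python sketch of this recursive formulation. The names and data shapes are assumptions, and choose_attribute below uses a simple majority-vote impurity as a stand-in for the information-gain criterion of the previous slide.

```python
from collections import Counter

def choose_attribute(samples, attributes):
    """Stand-in for 'most important attribute': pick the split whose
    subsets a majority vote would misclassify least."""
    def misclassified(attr):
        cost = 0
        for value in {feats[attr] for feats, _ in samples}:
            classes = [c for feats, c in samples if feats[attr] == value]
            cost += len(classes) - Counter(classes).most_common(1)[0][1]
        return cost
    return min(attributes, key=misclassified)

def learn_tree(samples, attributes, default):
    """samples: list of (feature_dict, classification) pairs."""
    if not samples:                        # no such examples were observed:
        return default                     # use the parent's majority class
    classes = [c for _, c in samples]
    if len(set(classes)) == 1:             # only positive or only negative left
        return classes[0]
    majority = Counter(classes).most_common(1)[0][0]
    if not attributes:                     # same descriptions, different classes:
        return majority                    # noise, missing info, or nondeterminism
    best = choose_attribute(samples, attributes)
    branches = {}
    for value in {feats[best] for feats, _ in samples}:
        subset = [(f, c) for f, c in samples if f[best] == value]
        rest = [a for a in attributes if a != best]
        branches[value] = learn_tree(subset, rest, majority)
    return {best: branches}
```

Run on a complete sample set with a gain-based choose_attribute, this recursion would reproduce the attribute-by-attribute construction traced in the following slides.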

  23. Restaurant Sample Set • select the best attribute • candidate: Patrons - its Some and None values are in full agreement with the goal predicate

  24. Partial Decision Tree • Patrons needs further discrimination only for the Full value • None and Some agree with the WillWait goal predicate • the next step will be performed on the remaining samples for the Full value of Patrons • [Partial tree: positives X1, X3, X4, X6, X8, X12; negatives X2, X5, X7, X9, X10, X11. Patrons? None → X7, X11: No | Some → X1, X3, X6, X8: Yes | Full → X4, X12 (positive), X2, X5, X9, X10 (negative): still mixed]

  25. Restaurant Sample Set • select the next best attribute • candidate: Hungry - its No value is in agreement with the goal

  26. Partial Decision Tree • Hungry needs further discrimination only for the Yes value • No agrees with the WillWait goal predicate • the next step will be performed on the remaining samples for the Yes value of Hungry • [Partial tree: Patrons? None → No | Some → Yes | Full → Hungry? No → X5, X9: No | Yes → X4, X12 (positive), X2, X10 (negative): still mixed]

  27. Restaurant Sample Set • select the next best attribute • candidate: Type - its Italian and Burger values are in agreement with the goal

  28. Partial Decision Tree • Type needs further discrimination only for the Thai value • Italian and Burger agree with the WillWait goal predicate • the next step will be performed on the remaining samples for the Thai value of Type • [Partial tree: Patrons? None → No | Some → Yes | Full → Hungry? No → No | Yes → Type? French → No (no samples) | Italian → X10: No | Burger → X12: Yes | Thai → X4, X2: still mixed]

  29. Restaurant Sample Set • select the next best attribute • candidate: Friday - its Yes and No values are in agreement with the goal

  30. Decision Tree • the two remaining samples can be made consistent by selecting Friday as the next predicate • no more samples are left • [Final tree: Patrons? None → No | Some → Yes | Full → Hungry? No → No | Yes → Type? French → No | Italian → No | Burger → Yes | Thai → Friday? No → X2: No | Yes → X4: Yes]
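
The finished tree can also be written down directly; the nested-dict encoding and predict helper below are illustrative (matching the output shape of the learn_tree sketch earlier), not code from the slides.

```python
learned_tree = {"Patrons": {
    "None": "No",
    "Some": "Yes",
    "Full": {"Hungry": {
        "No": "No",
        "Yes": {"Type": {
            "French": "No",
            "Italian": "No",
            "Burger": "Yes",
            "Thai": {"Friday": {"No": "No", "Yes": "Yes"}},
        }},
    }},
}}

def predict(tree, event):
    """Descend through attribute tests until a leaf label is reached."""
    while isinstance(tree, dict):
        attribute, branches = next(iter(tree.items()))
        tree = branches[event[attribute]]
    return tree

print(predict(learned_tree, {"Patrons": "Full", "Hungry": "Yes",
                             "Type": "Thai", "Friday": "Yes"}))  # Yes (X4's path)
```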
