
Learning Decision Trees for Knowledge Infusion

This article explores learning decision trees as a more efficient way of providing agents with knowledge than manually extracting it and instilling it into the agent. It discusses the benefits of learning, such as improved performance and autonomy, and introduces the elements and components of a learning problem. The article concludes with a worked example of constructing a decision tree for categorization.



  1. Learning Decision Trees

  2. Bridge-In • “knowledge infusion” is not always the best way of providing an agent with knowledge • it can be impractical and tedious • and leave the knowledge base incomplete, imprecise, possibly incorrect • learning, by contrast, offers • adaptivity • an agent can expand and modify its knowledge base to reflect changes • improved performance • through learning the agent can make better decisions • autonomy • without learning, an agent can hardly be considered autonomous

  3. Learning • learning is important for agents to deal with • unknown environments • changes • in many cases it is more efficient to train an agent via examples than to “manually” extract knowledge from the examples and “instill” it into the agent

  4. Introduction to Learning • Learning is the ability to improve behavior based on experience. • Components of a Learning Problem • task - the behavior being improved • data - the experiences used to improve performance • measure of improvement - e.g., accuracy in prediction, speed, coverage • background knowledge - a bias on what can be learned

  5. Elements of Learning • [Diagram: experiences (data), background knowledge (bias), and the task feed an internal representation, which a reasoning procedure uses to produce performance]

  6. Feedback to the Learner • Supervised learning: Learner told immediately whether response behavior was appropriate (training set) - correct answers for each example. • Unsupervised learning: No classifications are given; the learner has to discover regularities and categories in the data for itself - correct answers not given.

  7. Measuring Success • Training set, test set • The measure of success is not how well the agent performs on the training examples, but how well it performs for new examples.
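
A minimal sketch of this discipline in Python (scikit-learn and the iris data are illustrative assumptions; the slides name no particular library or dataset):

```python
# Hedged sketch: any classifier and labeled dataset would do.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Hold out a test set: success is judged on examples the learner never saw.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

tree = DecisionTreeClassifier().fit(X_train, y_train)
print("training accuracy:", tree.score(X_train, y_train))  # often 1.0 (memorized)
print("test accuracy:", tree.score(X_test, y_test))        # the honest measure
```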

  8. Learning from observation • Learning Agents • Inductive Learning • Learning Decision Trees

  9. Learning agents

  10. Inductive Learning • Simplest form: learn a function from examples • tries to find a function h (the hypothesis) that approximates a set of samples defining a function f • the samples are usually provided as input-output pairs (x, f(x)) • relies on inductive inference, or induction
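
As a minimal illustration (the sample pairs below are made up): the learner never sees f itself, only the pairs (x, f(x)), and judges a candidate hypothesis h by how well it reproduces them.

```python
# Made-up samples of an unknown function f, given as (x, f(x)) pairs.
samples = [(0, 1), (1, 3), (2, 5), (3, 7)]

def h(x):
    return 2 * x + 1   # one candidate hypothesis

# Induction: prefer hypotheses with small error on the observed pairs.
error = sum((h(x) - fx) ** 2 for x, fx in samples)
print(error)           # 0: this h is consistent with every sample
```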

  11. Example Inductive Learning 1 • [Plot: sample pairs shown as points in the (x, f(x)) plane] • input-output pairs displayed as points in a plane • the task is to find a hypothesis (a function) that connects the points • either all of them, or most of them • various performance measures • number of points connected • minimal surface • lowest tension

  12. Example Inductive Learning 2 • [Plot: piecewise-linear hypothesis passing through every sample point] • hypothesis is a function consisting of linear segments • fully incorporates all sample pairs • goes through all points • very easy to calculate

  13. Example Inductive Learning 3 • [Plot: polynomial hypothesis passing through every sample point] • hypothesis expressed as a polynomial function • incorporates all samples • more complicated to calculate than linear segments • better predictive power

  14. Example Inductive Learning 4 • [Plot: single straight-line hypothesis missing some sample points] • hypothesis is a linear function • does not incorporate all samples • extremely easy to compute • low predictive power
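
The four plots themselves do not survive in the transcript, but the three fitting styles they contrast can be sketched with NumPy on illustrative data (any handful of points would do):

```python
import numpy as np

# Illustrative sample pairs (x, f(x)); the slides' actual points are unknown.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 2.5, 2.0, 4.0, 3.5])

# Example 2: piecewise-linear segments through every point.
h_segments = lambda q: np.interp(q, x, y)

# Example 3: a degree-4 polynomial also passes through all five points.
h_poly = np.poly1d(np.polyfit(x, y, deg=len(x) - 1))

# Example 4: a single straight line fit by least squares; it misses points.
h_line = np.poly1d(np.polyfit(x, y, deg=1))

q = 2.5  # a new input: the hypotheses disagree on unseen x
print(h_segments(q), h_poly(q), h_line(q))
```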

  15. Decision Trees - Introduction • Goal: Categorization • Given an event, predict its category. • Who won a given ball game? • How should we file a given e-mail? • What word sense was intended for a given occurrence of a word? • Event = list of features. • Ball game: Which players were on offense? • E-mail: who sent the message? • Disambiguation: what was the preceding word?

  16. Introduction, cont. • Use a decision tree to predict categories for new events. • Use training data to build the decision tree. • [Diagram: training events and categories build the decision tree; a new event is run through the tree to yield its category]

  17. Terminology • example or sample • describes the values of the attributes and that of the goal predicate • a positive sample has the value true for the goal predicate, a negative sample false • the training set consists of the samples used for constructing the decision tree • the test set is used to determine whether the decision tree performs correctly • ideally, the test set is different from the training set

  18. Decision Trees • A decision tree is a tree where • each node is labeled with a feature • each arc of an interior node is labeled with a value for that feature • each leaf is labeled with a category • [Figure: example tree for the ball-game domain, with interior nodes such as location, goalie, weather, and time, arc values such as away/home, Jane/Bob, dry/wet, and 3pm/4pm/5pm, and win/lose leaves]
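
A minimal Python sketch of this structure; the Node class and the small ball-game tree are illustrative assumptions, since the slide's exact figure is not recoverable from the transcript.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class Node:
    feature: Optional[str] = None    # interior nodes carry a feature; None = leaf
    category: Optional[str] = None   # set only on leaves
    children: Dict[str, "Node"] = field(default_factory=dict)  # arc value -> subtree

def classify(node: Node, event: Dict[str, str]) -> str:
    """Follow the arc matching the event's value for each feature to a leaf."""
    while node.feature is not None:
        node = node.children[event[node.feature]]
    return node.category

# Hypothetical fragment of the ball-game tree on the slide.
tree = Node("location", children={
    "home": Node(category="win"),
    "away": Node("weather", children={
        "dry": Node(category="win"),
        "wet": Node(category="lose"),
    }),
})
print(classify(tree, {"location": "away", "weather": "wet"}))  # lose
```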

  19. Learning decision trees Problem: decide whether to wait for a table at a restaurant, based on the following attributes: • Alternate: is there an alternative restaurant nearby? • Bar: is there a comfortable bar area to wait in? • Fri/Sat: is today Friday or Saturday? • Hungry: are we hungry? • Patrons: number of people in the restaurant (None, Some, Full) • Price: price range ($, $$, $$$) • Raining: is it raining outside? • Reservation: have we made a reservation? • Type: kind of restaurant (French, Italian, Thai, Burger) • WaitEstimate: estimated waiting time (0-10, 10-30, 30-60, >60)

  20. Restaurant Sample Set
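
The sample-set table on this slide is an image and does not survive in the transcript. The sketch below reconstructs only the attribute values that slides 23-30 reveal for the twelve examples X1-X12; every value they do not reveal is left as None rather than guessed.

```python
# Partial reconstruction of the restaurant sample set.
# Columns: (Patrons, Hungry, Type, Friday, WillWait); None = not revealed.
samples = {
    "X1":  ("Some", None,  None,      None,  True),
    "X2":  ("Full", "Yes", "Thai",    "No",  False),
    "X3":  ("Some", None,  None,      None,  True),
    "X4":  ("Full", "Yes", "Thai",    "Yes", True),
    "X5":  ("Full", "No",  None,      None,  False),
    "X6":  ("Some", None,  None,      None,  True),
    "X7":  ("None", None,  None,      None,  False),
    "X8":  ("Some", None,  None,      None,  True),
    "X9":  ("Full", "No",  None,      None,  False),
    "X10": ("Full", "Yes", "Italian", None,  False),
    "X11": ("None", None,  None,      None,  False),
    "X12": ("Full", "Yes", "Burger",  None,  True),
}
```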

  21. Constructing Decision Trees • in general, constructing the smallest possible decision tree is a difficult problem • basic idea: test the most important attribute first • attribute that makes the most difference for the classification of an example • can be determined through information theory • hopefully will yield the correct classification with few tests
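
A sketch of the information-theoretic criterion alluded to here: choose the attribute with the highest information gain, i.e. the expected drop in entropy from splitting on it. Applied to Patrons over the twelve restaurant samples (None = 0 positive / 2 negative, Some = 4/0, Full = 2/4), it yields the gain that makes Patrons the best first test.

```python
import math

def entropy(pos, neg):
    """Entropy in bits of a set with pos positive and neg negative samples."""
    total = pos + neg
    return -sum(c / total * math.log2(c / total) for c in (pos, neg) if c)

def information_gain(pos, neg, splits):
    """Gain of an attribute whose values split the set into (pos_i, neg_i) parts."""
    total = pos + neg
    remainder = sum((p + n) / total * entropy(p, n) for p, n in splits)
    return entropy(pos, neg) - remainder

# Patrons on the 12 samples: None=(0,2), Some=(4,0), Full=(2,4).
print(information_gain(6, 6, [(0, 2), (4, 0), (2, 4)]))  # ~0.541 bits
```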

  22. Decision Tree Algorithm • recursive formulation • select the best attribute to split positive and negative examples • if only positive or only negative examples are left, we are done • if no examples are left, no such examples were observed • return a default value calculated from the majority classification at the node’s parent • if we have positive and negative examples left, but no attributes to split them, we are in trouble • samples have the same description, but different classifications • may be caused by incorrect data (noise), by a lack of information, or by a truly non-deterministic domain
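
A Python sketch of this recursive formulation. The names and data shapes are assumptions, and choose_attribute below uses a simple majority-vote impurity as a stand-in for the information-gain criterion of the previous slide.

```python
from collections import Counter

def choose_attribute(samples, attributes):
    """Stand-in for 'most important attribute': pick the split whose
    subsets a majority vote would misclassify least."""
    def misclassified(attr):
        cost = 0
        for value in {feats[attr] for feats, _ in samples}:
            classes = [c for feats, c in samples if feats[attr] == value]
            cost += len(classes) - Counter(classes).most_common(1)[0][1]
        return cost
    return min(attributes, key=misclassified)

def learn_tree(samples, attributes, default):
    """samples: list of (feature_dict, classification) pairs."""
    if not samples:                        # no such examples were observed:
        return default                     # use the parent's majority class
    classes = [c for _, c in samples]
    if len(set(classes)) == 1:             # only positive or only negative left
        return classes[0]
    majority = Counter(classes).most_common(1)[0][0]
    if not attributes:                     # same descriptions, different classes:
        return majority                    # noise, missing info, or nondeterminism
    best = choose_attribute(samples, attributes)
    branches = {}
    for value in {feats[best] for feats, _ in samples}:
        subset = [(f, c) for f, c in samples if f[best] == value]
        rest = [a for a in attributes if a != best]
        branches[value] = learn_tree(subset, rest, majority)
    return {best: branches}
```

Run on a complete sample set with a gain-based choose_attribute, this recursion would reproduce the attribute-by-attribute construction traced in the following slides.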

  23. Restaurant Sample Set • select the best attribute • candidate: Patrons - its Some and None values are in full agreement with the goal predicate

  24. Partial Decision Tree • Patrons needs further discrimination only for the Full value • None and Some agree with the WillWait goal predicate • the next step will be performed on the remaining samples for the Full value of Patrons • [Partial tree: positives X1, X3, X4, X6, X8, X12; negatives X2, X5, X7, X9, X10, X11. Patrons? None → X7, X11: No | Some → X1, X3, X6, X8: Yes | Full → X4, X12 (positive), X2, X5, X9, X10 (negative): still mixed]

  25. Restaurant Sample Set • select the next best attribute • candidate: Hungry - its No value is in agreement with the goal

  26. Partial Decision Tree • Hungry needs further discrimination only for the Yes value • No agrees with the WillWait goal predicate • the next step will be performed on the remaining samples for the Yes value of Hungry • [Partial tree: Patrons? None → No | Some → Yes | Full → Hungry? No → X5, X9: No | Yes → X4, X12 (positive), X2, X10 (negative): still mixed]

  27. Restaurant Sample Set • select the next best attribute • candidate: Type - its Italian and Burger values are in agreement with the goal

  28. Partial Decision Tree • Type needs further discrimination only for the Thai value • Italian and Burger agree with the WillWait goal predicate • the next step will be performed on the remaining samples for the Thai value of Type • [Partial tree: Patrons? None → No | Some → Yes | Full → Hungry? No → No | Yes → Type? French → No (no samples) | Italian → X10: No | Burger → X12: Yes | Thai → X4, X2: still mixed]

  29. Restaurant Sample Set • select the next best attribute • candidate: Friday - its Yes and No values are in agreement with the goal

  30. Decision Tree • the two remaining samples can be made consistent by selecting Friday as the next predicate • no more samples are left • [Final tree: Patrons? None → No | Some → Yes | Full → Hungry? No → No | Yes → Type? French → No | Italian → No | Burger → Yes | Thai → Friday? No → X2: No | Yes → X4: Yes]
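
The finished tree can also be written down directly; the nested-dict encoding and predict helper below are illustrative (matching the output shape of the learn_tree sketch earlier), not code from the slides.

```python
learned_tree = {"Patrons": {
    "None": "No",
    "Some": "Yes",
    "Full": {"Hungry": {
        "No": "No",
        "Yes": {"Type": {
            "French": "No",
            "Italian": "No",
            "Burger": "Yes",
            "Thai": {"Friday": {"No": "No", "Yes": "Yes"}},
        }},
    }},
}}

def predict(tree, event):
    """Descend through attribute tests until a leaf label is reached."""
    while isinstance(tree, dict):
        attribute, branches = next(iter(tree.items()))
        tree = branches[event[attribute]]
    return tree

print(predict(learned_tree, {"Patrons": "Full", "Hungry": "Yes",
                             "Type": "Thai", "Friday": "Yes"}))  # Yes (X4's path)
```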
