Explore the significance of decision trees in various industries through case-based reasoning technology. Learn how decision trees aid in making informed choices, evaluate data sources, discuss induction, understand expected value and information gain formulas, and dive into the deployment and maintenance processes. Discover real-world examples of decision trees transforming businesses like American Express UK, Hartford Steam Boiler, and more. Gain insights into the advantages and limitations of using decision trees for industrial applications.
Induction: Discussion
Sources:
• Chapter 3, Lenz et al. book: Case-Based Reasoning Technology
• www.aic.nrl.navy.mil/~aha/research/applications.html
Information Gain Formula
Patrons?
  full: x4(+), x12(+), x2(-), x5(-), x9(-), x10(-)
  none: x7(-), x11(-)
  some: x1(+), x3(+), x6(+), x8(+)
Gain(A) = I(p/(p+n), n/(p+n)) - Remainder(A)
The standard expected value formula:
Remainder(A) = p(A,1) I(p1/(p1+n1), n1/(p1+n1)) + p(A,2) I(p2/(p2+n2), n2/(p2+n2)) + p(A,3) I(p3/(p3+n3), n3/(p3+n3))
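The information measure I is not spelled out in the slide text. Assuming it is the usual two-class entropy in bits, and that p(A,i) is the fraction of examples taking the i-th value of A (which matches the weights 2/12, 4/12, 6/12 in the worked example), the definitions can be sketched as:

```latex
\begin{align*}
I(q_1, q_2) &= -q_1 \log_2 q_1 - q_2 \log_2 q_2, \qquad 0 \log_2 0 := 0 \\
p(A,i) &= \frac{p_i + n_i}{p + n} \\
\mathrm{Remainder}(A) &= \sum_{i} p(A,i)\, I\!\left(\frac{p_i}{p_i+n_i}, \frac{n_i}{p_i+n_i}\right) \\
\mathrm{Gain}(A) &= I\!\left(\frac{p}{p+n}, \frac{n}{p+n}\right) - \mathrm{Remainder}(A)
\end{align*}
```

Here p and n are the counts of positive and negative examples at the node, and p_i, n_i are the counts in the branch for the i-th value of A.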
The IDT Example
Patrons?
  full: x4(+), x12(+), x2(-), x5(-), x9(-), x10(-)
  none: x7(-), x11(-)
  some: x1(+), x3(+), x6(+), x8(+)
Gain(Patrons) = 1 - ((2/12) I(0,1) + (4/12) I(1,0) + (6/12) I(2/6,4/6)) = 0.541
The IDT Example (II)
Type?
  burger: x3(+), x12(+), x7(-), x9(-)
  italian: x6(+), x10(-)
  french: x1(+), x5(-)
  thai: x4(+), x8(+), x2(-), x11(-)
Gain(Type) = 1 - ((2/12) I(1/2,1/2) + (2/12) I(1/2,1/2) + (4/12) I(2/4,2/4) + (4/12) I(2/4,2/4)) = 0
Thus Patrons is a better choice than Type.
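The two gain values can be checked numerically. The following is a minimal sketch, assuming I is the two-class entropy in bits; the function names `information` and `gain` are illustrative and do not come from the slides. Each branch is given as a (positives, negatives) pair read off the example trees.

```python
from math import log2


def information(*probs):
    """Entropy in bits of a discrete distribution; terms with p = 0 contribute 0."""
    return sum(-p * log2(p) for p in probs if p > 0)


def gain(branches):
    """Information gain of an attribute.

    branches: list of (positives, negatives) counts, one pair per attribute value.
    """
    p = sum(pi for pi, _ in branches)
    n = sum(ni for _, ni in branches)
    total = p + n
    # Remainder: entropy of each branch, weighted by the share of examples in it.
    remainder = sum(
        (pi + ni) / total * information(pi / (pi + ni), ni / (pi + ni))
        for pi, ni in branches
    )
    return information(p / total, n / total) - remainder


# Counts per branch, in the order the slides list them.
patrons = [(2, 4), (0, 2), (4, 0)]             # full, none, some
type_ = [(2, 2), (1, 1), (1, 1), (2, 2)]       # burger, italian, french, thai

print(f"Gain(Patrons) = {gain(patrons):.3f}")  # slide value: 0.541
print(f"Gain(Type)    = {abs(gain(type_)):.3f}")  # slide value: 0
```

Because every Type branch has an even split of positive and negative examples, each branch keeps the full 1 bit of uncertainty and the attribute yields no gain.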
Induction: Fielded Applications
• Westinghouse: Transforming uranium gas
• Hartford Steam Boiler: Transformer diagnosis
• Steel Works Jesenice: Oil/lubricant properties
• American Express UK: Credit card applications
• Siemens (BMT): Equipment configuration
• USAF school: Thallium diagnosis
• Boeing (Gold-digger): Manufacturing flaws
• R.R. Donnelley and Sons (APOS): Banding
• Enichem (Enigma): Troubleshooting motor pumps
• Palomar Observatory (SKICAT): Astronomical cataloging
• Continuum (Shopping): WWW shopping
• …
Classifying Credit Card Applications (from (Aha, 1996))
[Diagram labels: Borderline? (no / yes, 10% of 104); Induced Rule System; Accept?]
Credit card application
• American Express UK
• Problem: Expert accuracy was below average (48%)
• Evaluation: system was iteratively refined with experts
• 18 attributes (age, years of residence, etc.)
• Improved accuracy: 75%+
Reduce Process Delays of Rotogravure Printers
• Problem: Banding often appears on chrome cylinders, causing a shutdown or costly replacement of cylinders
• Cause unknown
• Use of inductive process to predict setting of control parameters (e.g., ink viscosity)
• Rules were posted on shop floor
• Gain: less downtime and lower replacement costs
Developing Cycle of IDT Applications (Adapted from (Langley, 1995))
• Problem formulation
• Data collection
• Induction of decision trees/rules
• Evaluation of DT/rules
• Fielding and acceptance
• Maintenance
When to Consider Decision Trees
• Examples describable by attribute-value pairs
• Target function is discrete valued
• Disjunctive hypothesis might be required
• Possible noise in data
Some functions cannot be represented compactly with decision trees:
• Parity function (returns true if the input has an even number of 1's)
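One way to see why the parity function is awkward for tree induction: no single attribute carries any information about the label, so the gain criterion has nothing to choose between and the tree ends up testing every bit. The sketch below illustrates this on all 3-bit inputs; the helper names `entropy` and `gain` and the choice of 3 bits are assumptions made for the example.

```python
from itertools import product
from math import log2


def entropy(labels):
    """Entropy in bits of a list of class labels."""
    total = len(labels)
    result = 0.0
    for value in set(labels):
        frac = labels.count(value) / total
        result -= frac * log2(frac)
    return result


def gain(examples, attr):
    """Information gain of splitting (bits, label) examples on bit position attr."""
    labels = [label for _, label in examples]
    remainder = 0.0
    for value in (0, 1):
        subset = [label for bits, label in examples if bits[attr] == value]
        if subset:
            remainder += len(subset) / len(examples) * entropy(subset)
    return entropy(labels) - remainder


# Label is True when the input has an even number of 1's, as on the slide.
examples = [(bits, sum(bits) % 2 == 0) for bits in product((0, 1), repeat=3)]
gains = [gain(examples, i) for i in range(3)]
print(gains)
```

Every split leaves each branch with an even mix of true and false labels, so all three gains are zero and a tree for n-bit parity needs a leaf for each of the 2^n inputs.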
Induction: Advantages
• Building a decision tree is a straightforward process
• The information gain measure is built on a sound basis
• During consultation, only a few tests are necessary before a classification is obtained
• For industrial applications, the consultation system can be delivered in a runtime system
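The point about consultation needing only a few tests can be sketched as a walk from the root to a leaf, where the number of tests equals the depth of the path taken rather than the number of attributes. The tree below is hypothetical: only the Patrons root comes from the slides, while the Hungry test, the leaf labels, and the `consult` function are assumptions made for illustration.

```python
# Hypothetical tree: only the "Patrons" root appears in the slides;
# the "Hungry" test and the leaf labels are made up for this example.
tree = {
    "attribute": "Patrons",
    "branches": {
        "none": "No",
        "some": "Yes",
        "full": {
            "attribute": "Hungry",
            "branches": {"yes": "Yes", "no": "No"},
        },
    },
}


def consult(node, case):
    """Walk the tree for one case; return (classification, number of tests made)."""
    tests = 0
    while isinstance(node, dict):
        value = case[node["attribute"]]   # one test per internal node visited
        node = node["branches"][value]
        tests += 1
    return node, tests


print(consult(tree, {"Patrons": "some", "Hungry": "no", "Type": "thai"}))
print(consult(tree, {"Patrons": "full", "Hungry": "no", "Type": "thai"}))
```

The first case is classified after a single test even though it has three attributes; the second needs two tests. A case that lacks a value for a tested attribute would raise a `KeyError` in this sketch.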
Induction: Limitations
• DTs are not incremental: they cannot be modified at runtime
• Consultation system is static
• Handling of unknown values for attributes is problematic
• The inductive approach cannot distinguish between various classes of users (e.g., experts vs. non-experts)