
Generation of Attribute Value Taxonomies from Data for Data-Driven Construction of Accurate and Compact Classifiers

Dae-Ki Kang,Adrian Silvescu, Jun Zhang and Vasant Honavar

Artificial Intelligence Research Laboratory

Iowa State University, USA

This research is sponsored in part by grants from the National Science Foundation (IIS 0219699) and National Institutes of Health (GM066387)



Paper Highlights

  • AVT-Learner, an algorithm for automated construction of attribute value taxonomies from data

  • Evaluation of the AVTs generated by AVT-Learner using an AVT-aware learning algorithm on benchmark data sets



Overview

  • Background and Motivation

  • AVT-Learner Algorithm

  • Experimental Results

  • Summary



Attribute Value Taxonomies (AVTs) -- ISA Hierarchies

[Figure: a human-supplied Attribute Value Taxonomy (AVT) for student status, illustrating the ISA relationship between abstract values and primitive values, and a cut through the hierarchy.]


Motivations for learning AVTs from data

  • Learning from AVTs and data has several advantages (Zhang & Honavar, 2003, 2004)

    • Preference for simple, comprehensible, yet accurate and robust classifiers

    • When data are limited, statistics estimated from abstract values are often more reliable than statistics estimated from primitive values

  • However, in most domains, AVTs are usually unavailable, and manual AVT generation is tedious

  • Need to generate AVTs that are useful for classification tasks



Learning Scenario with AVT-Learner



Overview

  • Background and Motivation

  • AVT-Learner Algorithm

  • Experimental Results

  • Summary



AVT-Learner

  • An algorithm for automated construction of AVTs from a data set of instances, wherein each instance is described by an ordered tuple of n nominal attribute values and a class label

    • Hierarchical agglomerative clustering (HAC) of the attribute values according to the distribution of classes that co-occur with them

    • Using the pairwise divergence between the distributions of class labels associated with the corresponding attribute values as a measure of the dissimilarity between the attribute values.



Problem Definitions

  • A={A1,A2,…,An} – nominal attributes

  • Vi – the set of primitive values of attribute Ai

  • C={C1,C2,…,Ck} – mutually disjoint class labels

  • Data set D ⊆ V1 × V2 × … × Vn × C

  • T={T1,T2,…,Tn} – a set of AVTs such that Ti is the AVT associated with attribute Ai

  • Learning AVTs from data – given a data set D and a pairwise divergence measure DM(P(x)||Q(x)), output a set of AVTs T={T1,T2,…,Tn} such that each Ti corresponds to a hierarchical grouping of values in Vi based on the specified measure



Major steps of AVT Learner

  • Initialize the cut L = {vi1, …, vij, …, vil}

  • Compare and choose

    • For each value lj in L and class label ck, estimate the class-conditional probability distribution P(C|lj)

    • Find (x,y) = argmin DM(P(C|lx) || P(C|ly)), x ≠ y

  • Merge and update

    • lxy ← lx ∪ ly

    • L ← (L \ {lx, ly}) ∪ {lxy}

  • Repeat while |L| > 1
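The merge loop above can be sketched in Python. This is a minimal illustration, not the authors' implementation: the names avt_learner, class_distribution, and js_divergence are our hypothetical helpers, Jensen-Shannon divergence stands in for DM, and the learned binary AVT is represented as nested tuples with primitive values at the leaves.

```python
from collections import Counter
from itertools import combinations
import math

def class_distribution(value_set, data, classes):
    """Estimate P(C | l) for a group l of attribute values,
    from (attribute value, class label) pairs."""
    counts = Counter(c for v, c in data if v in value_set)
    total = sum(counts.values()) or 1
    return [counts[c] / total for c in classes]

def js_divergence(p, q):
    """Pairwise Jensen-Shannon divergence between two distributions."""
    def kl(a, b):  # Kullback-Leibler divergence KL(a || b)
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    m = [(ai + bi) / 2 for ai, bi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def avt_learner(values, data, classes):
    """Build a binary AVT for one attribute by agglomerative clustering.
    Leaves are 1-tuples of primitive values; internal nodes are pairs."""
    cut = {frozenset([v]): (v,) for v in values}  # the current cut L
    while len(cut) > 1:
        dist = {s: class_distribution(s, data, classes) for s in cut}
        # choose the pair (lx, ly) minimizing JS(P(C|lx) || P(C|ly))
        lx, ly = min(combinations(cut, 2),
                     key=lambda pair: js_divergence(dist[pair[0]], dist[pair[1]]))
        # merge and update: L <- (L \ {lx, ly}) U {lx U ly}
        cut[lx | ly] = (cut.pop(lx), cut.pop(ly))
    (root,) = cut.values()
    return root
```

Because each merge records the pair it joins, the nested-tuple result encodes the full binary AVT from primitive leaves up to the root.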



AVT Construction for Odor attribute

[Figure: binary AVT constructed for the Odor attribute (Mushroom data). At each step the two most similar values are merged, e.g. {s,y}, {c,p}, {a,l}; further merges yield abstract values such as {s,y,f}, {s,y,f,c,p}, {m,s,y,f,c,p}, and {a,l,n}, until a single root covering all Odor values remains.]



More about AVT Learner

  • Similarity measure – Pairwise Jensen-Shannon Divergence

  • For continuous-valued attributes, define intervals based on observed values for the attribute in the data set
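As a standalone illustration (not the authors' code), the pairwise Jensen-Shannon divergence between two class-probability vectors can be computed directly:

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions,
    with base-2 logs so the result lies in [0, 1]. Symmetric in p and q."""
    def kl(a, b):  # Kullback-Leibler divergence KL(a || b)
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    m = [(ai + bi) / 2 for ai, bi in zip(p, q)]  # the mixture (p + q) / 2
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

Unlike raw KL divergence, JS divergence is symmetric and always finite, which makes it a convenient pairwise dissimilarity for clustering attribute values.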



Evaluation of AVTs

  • We use the Attribute Value Taxonomy-guided Naïve Bayes Learner (AVT-NBL) (Zhang & Honavar, 2004)

  • Why AVT-NBL?

    • AVT-NBL offers an effective approach to learning compact (hence more comprehensible) accurate classifiers from AVTs and data



AVT-NBL algorithm

  • Find the most accurate Naïve Bayes classifier using the most abstract attribute values

    • Same assumption as NBL that each attribute is independent of the other attributes given the class

    • Starting with the NBL that is based on the most abstract value of each attribute and successively refining the classifier (hypothesis)

    • Using a tradeoff criterion between the accuracy and complexity of the resulting classifier
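The control structure of this refinement can be sketched abstractly. This is our illustration only: refine_cut and score are hypothetical stand-ins, and score abbreviates the paper's accuracy/complexity tradeoff (a conditional MDL score in Zhang & Honavar, 2004), which is not reproduced here.

```python
def refine_cut(tree, cut, score):
    """Greedy top-down refinement of a cut through a nested-tuple AVT.
    A node in the cut is replaced by its children whenever doing so
    improves the (accuracy vs. complexity) score of the classifier."""
    improved = True
    while improved:
        improved = False
        for node in cut:
            if isinstance(node, tuple):  # internal node: candidate refinement
                candidate = [n for n in cut if n is not node] + list(node)
                if score(candidate) > score(cut):
                    cut = candidate
                    improved = True
                    break  # restart the scan over the refined cut
    return cut
```

Starting from the cut containing only the root, the loop descends the AVT just far enough to pay for each refinement in score, which is how AVT-NBL trades accuracy against classifier size.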



Overview

  • Background and Motivation

  • AVT-Learner Algorithm

  • Experimental Results

  • Summary



Experimental Settings

  • Settings

    • AVT-NBL with AVT generated by AVT-Learner

    • AVT-NBL with human-supplied taxonomy

    • Naïve Bayes Learner (NBL)

  • Data sets:

    • 37 benchmark datasets from UCI Machine Learning Repository

    • Simulated missing attribute values

  • Use stratified 10-fold cross validation



Performance comparisons

  • AVT-Learner generated AVTs vs. human-supplied AVTs vs. no AVTs (standard NBL)

    • AVT-Learner generated AVTs vs. no AVTs

  • Binary AVTs vs. k-ary AVTs



Comparison with human-supplied AVTs

  • Compare performance between human-supplied AVTs and AVT-Learner generated AVTs on the Mushroom and Nursery datasets from the UCI Repository

  • Explore performance at different percentages (0%–50%) of simulated missing attribute values

  • Assume the missing values are uniformly distributed across the nominal attributes



Figure 1(a). Error rate comparison of classifiers generated by NBL, AVT-NBL JS (Jensen-Shannon divergence, AVT-Learner generated), and AVT-NBL HT (human-supplied AVTs) on Mushroom data



Figure 1(b). Size comparison of classifiers generated by NBL, AVT-NBL JS (Jensen-Shannon divergence, AVT-Learner generated), and AVT-NBL HT (human-supplied AVTs) on Mushroom data



Result Shown from Figure 1(a) & 1(b)

  • In terms of the error rates and the size of the resulting classifiers, AVTs generated by AVT-Learner are competitive with human-supplied AVTs when used by AVT-NBL



Further experiments

  • For most data sets, there are no human-supplied AVTs available

  • Compare performance between standard NBL and AVT-NBL with AVT-Learner generated AVTs on 37 data sets from UCI



Table 1. Comparison of accuracy and size of classifiers generated by standard NBL and AVT-NBL with AVT-Learner



Results Shown from Table 1

  • AVT-Learner can generate useful AVTs when no human-supplied AVTs are available (which is common in most application domains)

    • AVTs generated by AVT-Learner, when used by AVT-NBL, yield substantially more compact Naive Bayes Classifiers than those produced by NBL



Binary vs. k-ary

  • The AVTs generated by AVT-Learner are binary trees by construction

  • Do k-ary AVTs yield better results when used with AVT-NBL?

  • We obtain k-ary clusterings by merging internal nodes (parent-child pairs) of the AVT generated by binary clustering
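On the nested-tuple AVT representation, the parent-child merge that raises a node's arity is a one-line operation. merge_child is our hypothetical helper, not the authors' code:

```python
def merge_child(tree, index):
    """Collapse the internal child at position `index` into its parent,
    raising the parent's arity by one. `tree` is a nested-tuple AVT node.
    E.g. ((a, b), c) with index 0 becomes the 3-ary node (a, b, c)."""
    child = tree[index]
    assert isinstance(child, tuple), "only internal nodes can be collapsed"
    return tree[:index] + child + tree[index + 1:]
```

Repeatedly collapsing the most similar parent-child pair in this way turns a binary AVT into a 3-ary, 4-ary, … AVT.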



Merging internal nodes for 4-ary clustering

[Figure: 4-ary clustering for the Odor attribute, obtained by merging the most similar parent-child pairs of internal nodes in the binary AVT built over the primitive values {m, s, y, f, c, p, a, l, n}.]



Table 2. Accuracy comparison of classifiers generated by AVT-NBL used with (a) 2-ary AVTs, (b) 3-ary AVTs, and (c) 4-ary AVTs



Results Shown from Table 2

  • AVT-NBL usually works best when binary AVTs are used

  • Merging internal nodes of an AVT reduces the space of cuts that AVT-NBL can search, which can lead to a less compact classifier



Summary

  • Human-supplied AVTs are unavailable in many application domains, so we introduced AVT-Learner, a simple algorithm for automated construction of AVTs from data

  • The AVTs generated by AVT-Learner are competitive with human-supplied AVTs in terms of both the error rate and size of the resulting classifiers.

  • AVT-Learner is effective in generating AVTs that, when used by AVT-NBL, result in classifiers that are substantially more compact (and often more accurate) than those obtained by the standard Naïve Bayes Learner (which does not use AVTs) in domains where human-supplied AVTs are unavailable.



Future Work

  • Extending AVT-Learner to learn AVTs that correspond to tangled hierarchies (which can be represented by directed acyclic graphs)

  • Learning AVTs from data for a broad range of real-world applications

  • Developing algorithms for learning hierarchical ontologies based on part-whole and other relations as opposed to ISA relations captured by an AVT



Thank You! Questions?

