Trade-offs in Explanatory Model Learning

Presentation Transcript



1

22nd of February 2012

Trade-offs in Explanatory Model Learning 

DCAP Meeting

Madalina Fiterau



2

Outline

  • Motivation: need for interpretable models

  • Overview of data analysis tools

  • Model evaluation – accuracy vs complexity

  • Model evaluation – understandability

  • Example applications

  • Summary


3

Example Application: Nuclear Threat Detection

  • Border control: vehicles are scanned

  • Human in the loop interpreting results

[Diagram: vehicle scan → prediction → feedback]



4

Boosted Decision Stumps

  • Accurate, but hard to interpret

How is the prediction derived from the input?
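To make the opacity concrete, here is a minimal sketch of a boosted-stumps model in scikit-learn; the dataset and parameters are illustrative assumptions, not the presentation's setup. AdaBoostClassifier's default base learner is a depth-1 decision tree, i.e. a stump:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

# Illustrative synthetic data, standing in for the scan features.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)

model = AdaBoostClassifier(n_estimators=200, random_state=0)  # 200 stumps
model.fit(X, y)

# Each prediction is a weighted vote over 200 thresholded features:
# accurate, but no single rule explains any individual decision.
print(model.predict(X[:5]))
```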



5

Decision Tree – More Interpretable

[Decision tree diagram: yes/no tests on "Radiation > x%", "Payload type = ceramics", and "Uranium level > max. admissible for ceramics"; one branch considers the balance of Th232, Ra226 and Co60; the leaves are "Threat" and "Clear"]
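A tree like this reads naturally as nested rules. The sketch below is one plausible wiring of the slide's diagram; the threshold and the exact branch order are hypothetical, not the actual model:

```python
RADIATION_THRESHOLD = 5.0   # hypothetical stand-in for the slide's "x%"

def assess_scan(radiation_pct, payload_type, uranium_level,
                max_admissible_for_ceramics, isotope_balance_ok):
    """One plausible wiring of the slide's tree; the threshold and the
    exact branch order are hypothetical, not the presentation's model."""
    if radiation_pct <= RADIATION_THRESHOLD:
        return "Clear"                      # low radiation: wave through
    if payload_type != "ceramics":
        # the slide: consider the balance of Th232, Ra226 and Co60
        return "Clear" if isotope_balance_ok else "Threat"
    if uranium_level > max_admissible_for_ceramics:
        return "Threat"                     # too much uranium for ceramics
    return "Clear"
```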



6

Motivation

Many users are willing to trade accuracy for a better understanding of the system-yielded results.

Need: a simple, interpretable model

Need: an explanatory prediction process



7

Analysis Tools – Black-box



8

Analysis Tools – White-box



9

Explanation-Oriented Partitioning

[Figure: synthetic data drawn from 2 Gaussians and a uniform cube, shown in an (X,Y) plot]


EOP Execution Example – 3D data

Step 1: Select a projection – (X1, X2)

Step 2: Choose a good classifier – call it h1

Step 3: Estimate the accuracy of h1 at each point (OK / NOT OK)

Step 4: Identify high-accuracy regions

Step 5: Remove the covered training points from consideration

After the first iteration, repeat the steps on the remaining data; iterate until all data is accounted for or the error cannot be decreased.
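Taken together, the five steps suggest the following training loop. This is a minimal sketch under assumptions of mine: candidate projections are all feature pairs, the base classifier is an SVM, per-point accuracy is estimated from the k nearest neighbours in the projection, and a high-accuracy region is approximated by a bounding box. The presentation's actual projection search and region-finding procedures may differ.

```python
# Minimal sketch of the EOP training loop; simplified stand-ins throughout.
import itertools
import numpy as np
from sklearn.svm import SVC

def train_eop(X, y, k=10, acc_threshold=0.9, max_iters=5):
    remaining = np.arange(len(X))          # indices still unaccounted for
    model = []                             # (feature pair, classifier, box)
    for _ in range(max_iters):
        Xr, yr = X[remaining], y[remaining]
        if len(remaining) < k or len(np.unique(yr)) < 2:
            break
        # Steps 1-2: pick the 2-D projection whose classifier fits best.
        i, j, h = max(
            ((a, b, SVC().fit(Xr[:, [a, b]], yr))
             for a, b in itertools.combinations(range(X.shape[1]), 2)),
            key=lambda t: t[2].score(Xr[:, [t[0], t[1]]], yr))
        P = Xr[:, [i, j]]
        # Step 3: estimate h's accuracy at each point from its k nearest
        # neighbours in the projection (OK / NOT OK smoothed into a rate).
        correct = (h.predict(P) == yr).astype(float)
        acc = np.array(
            [correct[np.argsort(((P - p) ** 2).sum(axis=1))[:k]].mean()
             for p in P])
        # Step 4: a crude high-accuracy region: the bounding box of the
        # points where the estimated accuracy is high.
        hi = acc >= acc_threshold
        if not hi.any():
            break
        lo_corner, hi_corner = P[hi].min(axis=0), P[hi].max(axis=0)
        model.append(((i, j), h, (lo_corner, hi_corner)))
        # Step 5: remove the covered training points from consideration.
        inside = ((P >= lo_corner) & (P <= hi_corner)).all(axis=1)
        remaining = remaining[~inside]
    return model
```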


Learned Model – Processing a query [x1, x2, x3]

If [x1, x2] is in R1 → return h1(x1, x2)

Else, if [x2, x3] is in R2 → return h2(x2, x3)

Else, if [x1, x3] is in R3 → return h3(x1, x3)

Else → return the default value
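Query processing is then a straight transcription of this decision list. The sketch below assumes the region-list representation produced by the training sketch above (a hypothetical format, not the presentation's):

```python
import numpy as np

def predict_eop(model, x, default=0):
    # Walk the region list; the first region whose projection contains
    # the query answers it, otherwise fall back to the default value.
    for (i, j), h, (lo_corner, hi_corner) in model:
        p = np.asarray(x)[[i, j]]
        if (p >= lo_corner).all() and (p <= hi_corner).all():
            return h.predict(p.reshape(1, -1))[0]
    return default
```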


Parametric / Nonparametric Regions

[Figure: parametric and nonparametric region membership for a query point p, illustrated with nearby training points n1–n5 marked as correctly or incorrectly classified, and the decision each region type yields]
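One way to read the nonparametric variant, an interpretation of mine rather than something stated on the slide, is that region membership is decided by the nearest training points (n1–n5 in the figure) instead of a closed-form boundary:

```python
import numpy as np

def in_nonparametric_region(P_train, correct, p, k=5, threshold=0.8):
    """Hypothetical nonparametric membership test: the query point p
    belongs to the region when enough of its nearest training points
    were classified correctly by the region's classifier."""
    d = ((P_train - p) ** 2).sum(axis=1)        # squared distances to p
    nbrs = np.argsort(d)[:k]                    # indices of the k nearest
    return correct[nbrs].mean() >= threshold    # fraction correct near p
```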



25

EOP in context

Similarities and differences with related methods:

  • CART – decision structure; default classifiers in the leaves

  • Feating – local models; models trained on all features

  • Subspacing – low-d projection; keeps all data points

  • Multiboosting – ensemble learner; committee decision

  • Boosting – gradually deals with difficult data



26

Outline

  • Motivation: need for interpretable models

  • Overview of data analysis tools

  • Model evaluation – accuracy vs complexity

  • Model evaluation – understandability

  • Example applications

  • Summary


27

Overview of Datasets

  • Real valued features, binary output

  • Artificial data – 10 features

    • Low-d Gaussians/uniform cubes

  • UCI repository

  • Application-related datasets

  • Results by k-fold cross validation

    • Accuracy

    • Complexity = expected number of vector operations performed for a classification task

    • Understandability = assessed using acknowledged explainability metrics
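The complexity measure can be made concrete with a small sketch. Assume, purely for illustration, that each region-membership test costs one vector operation and each base classifier has a known cost; the expectation is then weighted by the fraction of queries resolved at each stage of the decision list:

```python
def expected_vector_ops(stage_probs, classifier_costs, test_cost=1):
    """Expected number of vector operations per query for a decision
    list; stage_probs[i] is the fraction of queries resolved by region
    i. All costs here are illustrative assumptions."""
    total, tests_so_far = 0.0, 0
    for p, cost in zip(stage_probs, classifier_costs):
        tests_so_far += test_cost           # one more region test reached
        total += p * (tests_so_far + cost)  # tests so far + classifier cost
    return total

# Three regions resolving 50%, 30% and 20% of queries, each with a
# base classifier costing 2 vector operations:
print(expected_vector_ops([0.5, 0.3, 0.2], [2, 2, 2]))  # ~3.7
```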


EOP vs AdaBoost – SVM base classifiers

EOP is often less accurate, but not significantly so; the reduction in complexity is statistically significant.

[Charts: accuracy and complexity of Boosting vs EOP (nonparametric)]

  • mean difference in accuracy: 0.5% (paired t-test p-value: 0.832)

  • mean difference in complexity: 85 (paired t-test p-value: 0.003)
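For reference, a paired t-test like the ones quoted above could be run as in the sketch below; the accuracy arrays are hypothetical placeholders, not the presentation's results:

```python
from scipy import stats

# Hypothetical per-dataset accuracies for the two methods (placeholders).
acc_boosting = [0.91, 0.88, 0.93, 0.90, 0.87]
acc_eop      = [0.90, 0.88, 0.92, 0.90, 0.86]

t, p = stats.ttest_rel(acc_boosting, acc_eop)   # paired t-test
print(f"p-value: {p:.3f}")  # a large p-value: the gap is not significant
```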


EOP (stumps as base classifiers) vs CART on data from the UCI repository

[Charts: accuracy and complexity of parametric EOP (EOP P.), nonparametric EOP (EOP N.) and CART]

  • Parametric EOP yields the simplest models

  • CART is the most accurate


Why are EOP models less complex?

Typical XOR dataset

  • CART
    • is accurate
    • takes many iterations
    • does not uncover or leverage the structure of the data

  • EOP
    • is equally accurate
    • uncovers the structure

[Figure: on the XOR dataset, EOP separates the '+' and 'o' clusters in two iterations, one region per iteration]

Error Variation With Model Complexity for EOP and CART

At low complexities, EOP is typically more accurate.

[Plot: EOP_Error − CART_Error against the depth of the decision tree/list]



35

UCI data – Accuracy

White-box models – including EOP – lose only a little accuracy.



36

UCI data – Model complexity

White-box models – especially EOP – are less complex.

The complexity of Random Forests is huge: thousands of nodes.


Robustness

  • Accuracy-targeting EOP (AT-EOP)

    • identifies which portions of the data can be confidently classified at a given rate

    • is allowed to set aside the noisy part of the data

[Plot: accuracy of AT-EOP on 3D data against the maximum allowed expected error]



38

Outline

  • Motivation: need for interpretable models

  • Overview of data analysis tools

  • Model evaluation – accuracy vs complexity

  • Model evaluation – understandability

  • Example applications

  • Summary



Metrics of Explainability*

* L. Geng and H. J. Hamilton - 'Interestingness measures for data mining: A survey'


Evaluation with Usefulness Metrics

For 3 out of 4 metrics, EOP beats CART.

BF = Bayes Factor, L = Lift, J = J-score, NMI = Normalized Mutual Information

  • Higher values are better
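For reference, the sketch below computes common formulations of the four metrics from a rule's 2×2 contingency table (rule fires = X, positive class = Y). Definitions vary across the interestingness-measures literature surveyed by Geng & Hamilton, so these formulas are one standard reading, not necessarily the presentation's exact ones:

```python
import numpy as np

def usefulness_metrics(tp, fp, fn, tn):
    """BF, Lift, J-score and NMI from a rule's confusion counts.
    One common formulation of each; assumes all counts are positive."""
    N = tp + fp + fn + tn
    p_x, p_y = (tp + fp) / N, (tp + fn) / N     # P(rule fires), P(class)
    p_y_given_x = tp / (tp + fp)                # rule confidence
    lift = p_y_given_x / p_y
    bf = (tp / (tp + fn)) / (fp / (fp + tn))    # P(X|Y) / P(X|not Y)
    j = p_x * (p_y_given_x * np.log(p_y_given_x / p_y) +
               (1 - p_y_given_x) * np.log((1 - p_y_given_x) / (1 - p_y)))
    joint = np.array([[tp, fp], [fn, tn]]) / N  # rows: X, columns: Y
    px, py = joint.sum(axis=1), joint.sum(axis=0)
    mi = sum(joint[a, b] * np.log(joint[a, b] / (px[a] * py[b]))
             for a in range(2) for b in range(2) if joint[a, b] > 0)
    entropy = lambda p: -sum(q * np.log(q) for q in p if q > 0)
    nmi = mi / np.sqrt(entropy(px) * entropy(py))
    return {"BF": bf, "L": lift, "J": j, "NMI": nmi}

print(usefulness_metrics(tp=40, fp=10, fn=15, tn=35))
```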



41

Outline

  • Motivation: need for interpretable models

  • Overview of data analysis tools

  • Model evaluation – accuracy vs complexity

  • Model evaluation – understandability

  • Example applications

  • Summary



42

Spam Detection (UCI ‘SPAMBASE’)

  • 10 features: frequencies of misc. words in e-mails

  • Output: spam or not

[Chart: model complexity]


43

Spam Detection – Iteration 1

  • The classifier labels everything as spam

  • High-confidence regions do enclose mostly spam, and within them:

    • the incidence of the word 'your' is low

    • the length of text in capital letters is high



44

Spam Detection – Iteration 2

  • The required incidence of capitals is increased

  • The square region on the left also encloses examples that will be marked as 'not spam'


45

Spam Detection – Iteration 3

  • The classifier marks everything as spam

  • The frequencies of the words 'our' and 'hi' determine the regions


Example Applications

  • Stem Cells – explain relationships between treatment and cell evolution

  • ICU Medication – identify combinations of drugs that correlate with readmissions
    (challenges: sparse features, class imbalance)

  • Nuclear Threats – support interpretation of radioactivity detected in cargo containers
    (challenges: high-d data, train/test sets from different distributions)

  • Fuel Consumption – identify causes of excess fuel consumption among driving behaviors
    (an adapted regression problem)



47

Effects of Cell Treatment

  • Monitored population of cells

  • 7 features: cycle time, area, perimeter ...

  • Task: determine which cells were treated

[Chart: model complexity]



48


49

MIMIC Medication Data

  • Information about administered medication

  • Features: dosage for each drug

  • Task: predict patient return to ICU

[Chart: model complexity]



50



51

Predicting Fuel Consumption

  • 10 features: vehicle and driving style characteristics

  • Output: fuel consumption level (high/low)

[Chart: model complexity]



52


53

Nuclear Threat Detection Data

  • 325 features

  • Random Forests accuracy: 0.94

  • Rectangular EOP accuracy: 0.881

    … but the regions found are inconsistent across folds.

    Regions found in 1st iteration for Fold 0:

    • incident.riidFeatures.SNR in [2.90, 9.2]

    • incident.riidFeatures.gammaDose in [0, 1.86]×10^-8

    Regions found in 1st iteration for Fold 1:

    • incident.rpmFeatures.gamma.sigma in [2.5, 17.381]

    • incident.rpmFeatures.gammaStatistics.skewdose in [1.31, …]

No match between the folds.



55

Summary

  • In most cases EOP:

    • maintains accuracy

    • reduces complexity

    • identifies useful aspects of the data

  • EOP typically wins in terms of expressiveness

  • Open questions:

    • What if no good low-dimensional projections exist?

    • What to do about inconsistent models across folds of cross-validation?

    • What is the best metric of explainability?

