Considering Cost Asymmetry in Learning Classifiers

by Bach, Heckerman and Horvitz

Presented by Chunping Wang

Machine Learning Group, Duke University

May 21, 2007

Outline

- Introduction
- SVM with Asymmetric Cost
- SVM Regularization Path (Hastie et al., 2005)
- Path with Cost Asymmetry
- Results
- Conclusions

Introduction (1)

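Binary classification

real-valued predictors $x \in \mathbb{R}^d$

binary response $y \in \{-1, +1\}$

A classifier could be defined as $\operatorname{sign}(f(x))$,

based on a linear decision function $f(x) = w^\top x + b$

Parameters: the weight vector $w$ and the intercept $b$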

Introduction (2)

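- Two types of misclassification:
- false negative: cost $C_+$
- false positive: cost $C_-$

Expected cost, where $f$ is the linear decision function:

$$R(C_+, C_-, f) = C_+ \, P\{y = +1,\ y f(x) < 0\} + C_- \, P\{y = -1,\ y f(x) < 0\}$$

In terms of the 0-1 loss function $\phi_{0\text{-}1}(u) = 1_{u < 0}$:

$$R(C_+, C_-, f) = C_+ \, E\{1_{y = +1}\, \phi_{0\text{-}1}(y f(x))\} + C_- \, E\{1_{y = -1}\, \phi_{0\text{-}1}(y f(x))\}$$

The 0-1 loss is the true loss of interest, but it is non-convex and non-differentiable.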

Introduction (3)

Convex loss functions – surrogates for the 0-1 loss function (used for training purposes)
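The transcript does not preserve which surrogates the slide plotted, so the following is only a minimal sketch using three common choices (hinge, square, and a rescaled logistic loss). The function names and the rescaling of the logistic loss are assumptions made for illustration. Each loss takes the margin u = y·f(x), and the check confirms that each surrogate upper-bounds the 0-1 loss on a grid of margins.

```python
import math

def zero_one_loss(u):
    """0-1 loss on the margin u = y * f(x): 1 if misclassified, else 0."""
    return 1.0 if u < 0 else 0.0

def hinge_loss(u):
    """Hinge loss, the surrogate used by the SVM."""
    return max(0.0, 1.0 - u)

def square_loss(u):
    """Square loss on the margin."""
    return (1.0 - u) ** 2

def logistic_loss(u):
    """Logistic loss, divided by log(2) so that it equals 1 at u = 0
    (a scaling chosen here for illustration)."""
    return math.log1p(math.exp(-u)) / math.log(2.0)

SURROGATES = {"hinge": hinge_loss, "square": square_loss, "logistic": logistic_loss}

def dominates_zero_one(loss, margins):
    """True if the surrogate is at least as large as the 0-1 loss on every margin."""
    return all(loss(u) >= zero_one_loss(u) for u in margins)

margins = [k / 10.0 for k in range(-30, 31)]
bounds_hold = {name: dominates_zero_one(fn, margins) for name, fn in SURROGATES.items()}
```

All three surrogates are convex in u, which is what makes training tractable; the price is that minimizing them is not the same as minimizing the 0-1 cost.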

Introduction (4)

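Empirical cost given n labeled data points: the expected cost with the 0-1 loss replaced by a convex surrogate and the expectation replaced by a sum over the training data.

Objective function: the empirical cost plus a regularization term. It is governed by two quantities, the cost asymmetry and the amount of regularization.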

Since convex surrogates of the 0-1 loss function are used for training, the cost asymmetries for training and testing are mismatched.

Motivation: efficiently look at many training asymmetries even if the testing asymmetry is given.
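As a rough illustration of an asymmetric training objective, the sketch below evaluates a hinge-loss empirical cost plus a squared-norm regularizer for a linear decision function. The exact scaling used on the slide is not preserved in the transcript, so the form `0.5 * ||w||^2 + C_+ * (hinge on positives) + C_- * (hinge on negatives)` and the names `asymmetric_objective`, `c_plus`, and `c_minus` are assumptions.

```python
def asymmetric_objective(w, b, X, y, c_plus, c_minus):
    """Regularized empirical cost with a hinge surrogate (assumed form).

    w: list of weights, b: intercept, X: list of feature lists,
    y: labels in {-1, +1}. c_plus weights errors on positive examples
    (false negatives), c_minus weights errors on negative examples
    (false positives).
    """
    total = 0.5 * sum(wj * wj for wj in w)  # regularization term
    for xi, yi in zip(X, y):
        f = sum(wj * xj for wj, xj in zip(w, xi)) + b  # linear decision function
        cost = c_plus if yi == 1 else c_minus
        total += cost * max(0.0, 1.0 - yi * f)  # weighted hinge loss
    return total

# Toy data: the third point is a negative example on the wrong side.
X = [[2.0], [-2.0], [0.5]]
y = [1, -1, -1]
value_a = asymmetric_objective([1.0], 0.0, X, y, c_plus=3.0, c_minus=1.0)
value_b = asymmetric_objective([1.0], 0.0, X, y, c_plus=1.0, c_minus=3.0)
```

Swapping the two costs changes the objective value even though the classifier is the same, which is why different training asymmetries lead to different trained classifiers.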

SVM with Asymmetric Cost (3)

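The dual problem

$$\max_{\alpha} \ \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j x_i^\top x_j \quad \text{subject to} \quad 0 \le \alpha_i \le C_i, \quad \sum_{i=1}^{n} \alpha_i y_i = 0$$

where $C_i = C_+$ if $y_i = +1$ and $C_i = C_-$ if $y_i = -1$.

For a given cost structure $(C_+, C_-)$, this is a quadratic optimization problem.

Solving it separately at every point of the whole $(C_+, C_-)$ space would be computationally intractable.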

Following the SVM regularization path algorithm (Hastie et al., 2005), the authors deal with (1)-(3) and KKT conditions instead of the dual problem.

SVM Regularization Path (1)

- Define active sets of data points:
- Margin: $M = \{i : y_i f(x_i) = 1\}$
- Left of margin: $L = \{i : y_i f(x_i) < 1\}$
- Right of margin: $R = \{i : y_i f(x_i) > 1\}$
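The three active sets can be computed directly from the margins. The sketch below assumes the usual definitions from the regularization-path literature (a point is on the margin when y·f(x) = 1, left of it when y·f(x) < 1, and right of it when y·f(x) > 1); the function name and the numerical tolerance `tol` are illustrative assumptions.

```python
def partition_active_sets(margins, tol=1e-8):
    """Split point indices into (M, L, R) from margins[i] = y_i * f(x_i).

    M: on the margin (within tol of 1), L: left of the margin (< 1),
    R: right of the margin (> 1).
    """
    M, L, R = [], [], []
    for i, m in enumerate(margins):
        if abs(m - 1.0) <= tol:
            M.append(i)
        elif m < 1.0:
            L.append(i)
        else:
            R.append(i)
    return M, L, R

M, L, R = partition_active_sets([1.0, 0.3, 2.5, -1.0, 1.0 + 1e-10])
```

A path-following algorithm only needs to track when a point moves between these sets, since the solution changes smoothly between such events.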

KKT conditions

SVM regularization path

The cost is symmetric, so the search runs along a single axis (the regularization parameter).

SVM Regularization Path (2)

Initialization ( )

For sufficiently large $\lambda$ (that is, very small $C$), all the points are in L

with

As $\lambda$ decreases, all points remain in L until one or more positive and negative examples hit the margin simultaneously.

SVM Regularization Path (3)

Initialization ( )

Define

The critical condition for first two points hitting the margin

For , this initial condition keeps the same except the definition of .

SVM Regularization Path (4)

- The path: as $\lambda$ decreases, $\alpha_i$ changes only for the points on the margin ($i \in M$), until one of the following events happens:
- A point from L or R has entered M;
- A point in M has left the set to join either R or L.

consider only the points on the margin

where is some function of ,

Therefore, the $\alpha_i$ for points on the margin proceed linearly in $\lambda$; the function $f(x)$ changes in a piecewise-inverse manner in $\lambda$.

SVM Regularization Path (5)

- Update regularization
- Update active sets and solutions
- Stopping condition
- In the separable case, we terminate when L becomes empty;
- In the non-separable case, we terminate when

for all the possible events

Path with Cost Asymmetry (1)

Exploration in the 2-d space

Path initialization: start from a point where all data points are in L

Follow the updating procedure in the 1-d case along the line

Along such a line, the regularization changes while the cost asymmetry stays fixed.

Among all the classifiers, find the best one given the user’s cost function

Paths starting from

Path with Cost Asymmetry (2)

Produce ROC

Collecting R lines in the direction of , we can build three ROC curves
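For readers unfamiliar with how an ROC curve is assembled from a scoring function, here is a minimal sketch that sweeps the intercept (equivalently, a threshold on the scores) and records the resulting (false positive rate, true positive rate) pairs. It is a generic construction, not the authors' path-based procedure, and the function name `roc_points` is an assumption.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs obtained by thresholding the scores.

    scores: decision values w^T x_i (without intercept);
    labels: values in {-1, +1}. A point is predicted positive when
    its score is at least the threshold.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == -1)
        points.append((fp / n_neg, tp / n_pos))
    return points

curve = roc_points([0.9, 0.8, 0.4, 0.3], [1, -1, 1, -1])
```

Each classifier produced along a path contributes points of this kind; pooling the points from many training asymmetries gives a richer set of operating points than a single classifier can.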

Results (1)

- For 1000 testing asymmetries, three methods are compared:
- “one” – take the testing asymmetry as the training cost asymmetry;
- “int” – vary the intercept of “one” and build an ROC, then select the optimal classifier;
- “all” – select the optimal classifier from the ROC obtained by varying both the training asymmetry and the intercept.
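Selecting "the optimal classifier from the ROC" can be sketched as picking the operating point with the lowest expected cost under the testing asymmetry. The code below assumes the expected cost `C_+ * P(y=+1) * (1 - TPR) + C_- * P(y=-1) * FPR`; the function names and the class prior argument `p_pos` are illustrative assumptions, not taken from the slides.

```python
def expected_cost(fpr, tpr, c_plus, c_minus, p_pos):
    """Expected misclassification cost of one ROC operating point.

    c_plus is the cost of a false negative, c_minus the cost of a
    false positive, p_pos the prior probability of the positive class.
    """
    return c_plus * p_pos * (1.0 - tpr) + c_minus * (1.0 - p_pos) * fpr

def select_optimal(points, c_plus, c_minus, p_pos):
    """Return the (FPR, TPR) point with the lowest expected cost."""
    return min(points, key=lambda pt: expected_cost(pt[0], pt[1], c_plus, c_minus, p_pos))

points = [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
best_when_fn_costly = select_optimal(points, c_plus=3.0, c_minus=1.0, p_pos=0.5)
best_when_fp_costly = select_optimal(points, c_plus=1.0, c_minus=3.0, p_pos=0.5)
```

Under this reading, "int" would run the selection over points from a single trained classifier, while "all" would run it over the pooled points from every training asymmetry.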

- Use a nested cross-validation:
- The outer cross-validation: produce overall accuracy estimates for the classifier;
- The inner cross-validation: select optimal classifier parameters (training asymmetry and/or intercept).
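The nested procedure can be sketched with plain index bookkeeping. In the code below, `fit_and_score` is a hypothetical callback (train on some indices, evaluate on others, return a cost where lower is better) and `candidates` stands for the parameter values being selected, such as training asymmetries; neither name comes from the slides, and contiguous folds are used only to keep the example short.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds; return (train, test) index lists."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    splits, start = [], 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        splits.append((train, test))
        start += size
    return splits

def nested_cv(n, outer_k, inner_k, candidates, fit_and_score):
    """Average outer-fold cost, with parameters chosen by inner cross-validation."""
    outer_costs = []
    for outer_train, outer_test in k_fold_indices(n, outer_k):

        def inner_cost(param):
            # Mean validation cost of `param`, using only the outer training data.
            total = 0.0
            for tr, va in k_fold_indices(len(outer_train), inner_k):
                total += fit_and_score([outer_train[i] for i in tr],
                                       [outer_train[i] for i in va], param)
            return total / inner_k

        best = min(candidates, key=inner_cost)  # inner loop: parameter selection
        outer_costs.append(fit_and_score(outer_train, outer_test, best))  # outer loop: estimate
    return sum(outer_costs) / len(outer_costs)
```

Because the outer test fold is never seen during parameter selection, the outer estimate is not biased by the selection step.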

Conclusions

- An efficient algorithm is presented to build ROC curves by varying the training cost asymmetries for SVMs.
- The main contribution is generalizing the SVM regularization path (Hastie et al., 2005) from a 1-d axis to a 2-d plane.
- Because a convex surrogate is used, training with the testing asymmetry can lead to a non-optimal classifier.
- Results show advantages of considering more training asymmetries.
