Margin Trees for High-dimensional Classification


### Margin Trees for High-dimensional Classification

Tibshirani and Hastie

Errata (confirmed by Tibshirani)

- Section 2 (a), on the property of "single linkage": M should be M0.
- Section 2.1, near the last line of the second paragraph: "at least" should be "at most".
- The statements about complete/single linkage are misleading. In fact, they use the standard definitions of complete/single linkage, except that the distance metric is replaced with the margin between pairs of classes. (I traced their code to confirm this.)

Targeted Problem

- Multi-class
  - #classes >> 2
- High-dimensional, few samples
  - #features >> #data → linearly separable
  - already good accuracy; need an interpretable model
- Ex. microarray data
  - feature: gene expression measurement
  - class: type of cancer
  - instance: patient

$\mathrm{Sign}(\bar{\beta}^{T}\bar{x} + \beta_0)$

Learn a Highly Interpretable Structure for Domain Experts

- Check certain genes
- Help create the link from gene to cancer

Higher Interpretability

- Multi-class problems reduce to binary
  - 1vs1 voting is not meaningful
  - tree representation
- Non-linearly-separable data
  - a single non-linear classifier
  - organized teams of linear classifiers
- Solution:
  - Margin tree = hierarchical tree (interpretation) + max-margin classifier (minimize risk) + feature selection (limited #features per split)

Using a Margin Tree

- Construct the tree structure
- Train a max-margin classifier at each split

Testing

- Start from the root node
- Go down the tree, following the predictions of the classifiers at the splitting points
- Ex. Right, Right → class 3

[Figure: example tree; the root splits {1} vs {2,3}, and the right child splits {2} vs {3}]
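The testing procedure can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the `Leaf` and `Split` classes, the `linear` helper, and the weights are all hypothetical, and the convention that a non-negative score means "go right" is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Union


@dataclass
class Leaf:
    label: int


@dataclass
class Split:
    # Assumed convention: score >= 0 sends the sample to the right child.
    classifier: Callable[[Sequence[float]], float]
    left: Union["Split", Leaf]
    right: Union["Split", Leaf]


def linear(beta, beta0):
    """Build a linear decision function x -> beta . x + beta0."""
    return lambda x: sum(b * xi for b, xi in zip(beta, x)) + beta0


def predict(node, x):
    """Walk from the root to a leaf, following each split's classifier."""
    while isinstance(node, Split):
        node = node.right if node.classifier(x) >= 0 else node.left
    return node.label


# Hypothetical tree: root separates {1} from {2,3}; right child separates {2} from {3}.
tree = Split(
    linear([1.0, 0.0], 0.0),
    Leaf(1),
    Split(linear([0.0, 1.0], 0.0), Leaf(2), Leaf(3)),
)

print(predict(tree, [2.0, 3.0]))  # goes Right, Right
```

With these made-up weights, the sample `[2.0, 3.0]` goes right at both splits and lands in class 3, mirroring the "Right, Right" example.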

Tree Structure(1/2)

- Top-down Construction
- Greedy

Greedy (1/3)

[Figure: root node {1,2,3}]

- Starting from the root with all classes {1,2,3}
  - find the maximum margin among all partitions: {1} vs {2,3}; {2} vs {1,3}; {3} vs {1,2}
  - $2^{n-1} - 1$ partitions for n classes!

Greedy (2/3)

[Figure: tree with root {1,2,3} and child node {2,3}]

- Done!
- Warning: greedy does not necessarily lead to the global optimum
  - i.e. it may not find the globally maximal margin
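A minimal sketch of the greedy top-down construction, assuming a user-supplied `margin(left, right)` function. In practice that margin would come from a max-margin classifier trained on the two groups; here a hypothetical 1-D toy stands in for it, and `bipartitions`, `greedy_tree`, and `centers` are illustrative names.

```python
from itertools import combinations


def bipartitions(classes):
    """Yield every split of `classes` into two non-empty groups, once each."""
    classes = list(classes)
    first, rest = classes[0], classes[1:]
    # Keep `first` on the left side so mirrored splits are not repeated.
    for k in range(len(rest) + 1):
        for extra in combinations(rest, k):
            left = (first, *extra)
            right = tuple(c for c in rest if c not in extra)
            if right:
                yield left, right


def greedy_tree(classes, margin):
    """Recursively pick the partition with the largest margin at each node."""
    classes = tuple(classes)
    if len(classes) == 1:
        return classes[0]
    left, right = max(bipartitions(classes), key=lambda p: margin(*p))
    return (greedy_tree(left, margin), greedy_tree(right, margin))


# Hypothetical toy data: one point per class on a line.
centers = {1: 0.0, 2: 5.0, 3: 6.0}


def toy_margin(left, right):
    """Gap between the two groups if a threshold separates them, else 0."""
    lo_l, hi_l = min(centers[c] for c in left), max(centers[c] for c in left)
    lo_r, hi_r = min(centers[c] for c in right), max(centers[c] for c in right)
    return max(lo_r - hi_l, lo_l - hi_r, 0.0)


print(len(list(bipartitions([1, 2, 3]))))  # number of candidate splits at the root
print(greedy_tree([1, 2, 3], toy_margin))
```

For three classes the enumeration yields the three partitions listed on the slide. The exhaustive loop is what makes this approach expensive as the number of classes grows.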

Tree Structure(2/2)

- Bottom-up tree: iteratively merge the closest groups.
  - Single linkage: distance = nearest pair.
  - Complete linkage: distance = farthest pair.
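As the errata note, the linkage definitions are the standard ones, with the margin between a pair of classes used as the distance. A small sketch under that reading follows; the `agglomerate` function and the pairwise margins are hypothetical, and real margins would come from pairwise max-margin classifiers.

```python
def agglomerate(classes, margin, linkage=max):
    """Bottom-up merging of classes.

    `margin` maps frozenset({i, j}) to the margin between classes i and j.
    `linkage` is max for complete linkage and min for single linkage.
    Returns the merges in order as (left members, right members, height).
    """
    clusters = [frozenset([c]) for c in classes]
    merges = []

    def dist(a, b):
        return linkage(margin[frozenset((i, j))] for i in a for j in b)

    while len(clusters) > 1:
        # Find the closest pair of current clusters.
        d, i, j = min(
            (dist(a, b), i, j)
            for i, a in enumerate(clusters)
            for j, b in enumerate(clusters)
            if i < j
        )
        merges.append((sorted(clusters[i]), sorted(clusters[j]), d))
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


# Hypothetical pairwise margins between three classes.
margins = {
    frozenset((1, 2)): 4.0,
    frozenset((1, 3)): 5.0,
    frozenset((2, 3)): 1.0,
}

print(agglomerate([1, 2, 3], margins, linkage=max))  # complete linkage
print(agglomerate([1, 2, 3], margins, linkage=min))  # single linkage
```

Both variants merge classes 2 and 3 first; they differ only in the height at which class 1 joins (farthest pair versus nearest pair).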

Complete Linkage Tree

Height(subtree) = distance(the farthest pair of classes) ≥ Margin(cutting through the subtree)

When looking for a margin > Height(subtree), never break up the classes in the subtree

Efficient Greedy Tree Construction

- Construct a complete linkage tree T
- Estimate the current lower bound of the maximal margin: M0 = max Margin(individual class, rest)
- To find a margin ≥ M0, we only need to consider partitions between {5,4,6}, {1}, {2,3}
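The pruning step can be sketched as cutting the complete linkage tree at height M0: any subtree whose height is below M0 cannot be split by a margin of at least M0, so its classes are kept together. The merge list and the value of `m0` below are made up to reproduce the groups on the slide, and `atomic_groups` is a hypothetical helper name.

```python
def atomic_groups(classes, merges, m0):
    """Group classes that must stay together when searching for a margin >= m0.

    `merges` lists (left members, right members, height) in merge order.
    A merge with height < m0 is never undone, because any cut through that
    subtree has margin at most its height.
    """
    group = {c: frozenset([c]) for c in classes}
    for left, right, height in merges:
        if height < m0:
            merged = frozenset(left) | frozenset(right)
            for c in merged:
                group[c] = merged
    return sorted({tuple(sorted(g)) for g in group.values()})


# Hypothetical complete linkage merges over six classes.
merges = [
    ([4], [5], 1.0),
    ([2], [3], 1.5),
    ([4, 5], [6], 2.0),
    ([1], [2, 3], 4.0),
    ([1, 2, 3], [4, 5, 6], 6.0),
]

print(atomic_groups([1, 2, 3, 4, 5, 6], merges, m0=3.0))
```

With six classes there would be 31 candidate partitions at the root; after grouping into three atomic blocks only 3 remain.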


- Comparable testing performance (also to 1vs1 voting)
- The complete linkage tree is more balanced, so more interpretable

Recall the cutting plane $\bar{\beta}^{T}\bar{x} + \beta_0 = 0$, with decision $\mathrm{Sign}(\bar{\beta}^{T}\bar{x} + \beta_0)$: $\bar{\beta}$ is the weight of the features in the decision function.

Feature Selection

- Hard-thresholding at each split
  - Discard the n features with the lowest abs(βi) by setting βi = 0
  - Proportional to the margin: n = α|Margin|
  - α chosen by cross-validation error
- β is unavailable when using a non-linear kernel
- Alternative methods
  - L1-norm SVM forces some βi to zero
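Hard-thresholding at a single split can be sketched as follows. The weights, `alpha`, and margin values are hypothetical, and rounding n down to an integer is an assumption, since the slide does not say how n = α|Margin| is rounded.

```python
def n_to_discard(alpha, margin):
    """Number of features to drop at a split: n = alpha * |margin| (rounded down)."""
    return int(alpha * abs(margin))


def hard_threshold(beta, n):
    """Return a copy of `beta` with the n smallest-magnitude weights set to 0."""
    n = max(0, min(n, len(beta)))
    drop = sorted(range(len(beta)), key=lambda i: abs(beta[i]))[:n]
    out = list(beta)
    for i in drop:
        out[i] = 0.0
    return out


# Hypothetical weights for one split.
beta = [0.5, -0.1, 2.0, 0.05, -1.2]
n = n_to_discard(alpha=1.5, margin=2.0)
print(n, hard_threshold(beta, n))
```

In the full method α would be tuned by cross-validation error rather than fixed by hand.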

Discussion

- Good for multi-class, high-dimensional data
- Bad for non-linearly-separable data
  - Each node will contain impure data → impure β
- Testing performance comparable to traditional multi-class max-margin classifiers (SVMs)
