Margin Trees for High-dimensional Classification

Tibshirani and Hastie


Errata (confirmed by Tibshirani)

  • Section 2(a), on the property of "single linkage": M should be M0.

  • Section 2.1, near the last line of the second paragraph: "at least" should be "at most".

  • The statements about complete/single linkage are misleading. In fact, the authors use the standard definitions of complete and single linkage, except that the distance metric is replaced by the margin between each pair of classes. (I traced their code to confirm this.)


Targeted Problem

  • Multi-class

    • #classes >> 2

  • High-dimensional, few samples

    • #features >> #samples → linearly separable

    • accuracy is already good; an interpretable model is needed

  • Example: microarray data

    • Features: gene expression measurements

    • Classes: types of cancer

    • Instances: patients


Learn a Highly Interpretable Structure for Domain Experts

Classifier at each split: $\operatorname{Sign}(\beta^{T} x + \beta_0)$

Check certain genes

Help establish the link between genes and cancer


Higher Interpretability

  • Multi-class problems → reduce to binary

    • 1-vs-1 voting → not meaningful

    • tree representation

  • Non-linearly separable data

    • single non-linear classifier

    • organized teams of linear classifiers

  • Solution:

    • Margin tree = hierarchical tree (interpretation) + max-margin classifier (minimize risk) + feature selection (limited #features per split)


Using Margin Tree

Training

  • Construct the tree structure

  • Train a max-margin classifier at each split

Testing

  • Start from the root node

  • Go down the tree, following the prediction of the classifier at each split

  • Ex. Right, Right → class 3

Tree in the figure: root split {1} vs {2,3}; child split {2} vs {3}
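
The testing procedure can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the `Split` class, the `predict` function, the convention that a non-negative score means "go right", and the toy weights are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Sequence, Union


@dataclass
class Split:
    """Internal node holding a linear classifier Sign(beta . x + beta0)."""
    beta: Sequence[float]
    beta0: float
    left: Union["Split", int]    # followed when the score is negative
    right: Union["Split", int]   # followed when the score is non-negative


def predict(node, x):
    """Walk from the root to a leaf; a leaf is simply a class label (int)."""
    while isinstance(node, Split):
        score = sum(b * xi for b, xi in zip(node.beta, x)) + node.beta0
        node = node.right if score >= 0 else node.left
    return node


# Hypothetical toy tree: the root splits {1} vs {2,3}, the child splits {2} vs {3}.
child = Split(beta=[0.0, 1.0], beta0=0.0, left=2, right=3)
root = Split(beta=[1.0, 0.0], beta0=0.0, left=1, right=child)

print(predict(root, [2.0, 1.5]))   # Right, Right -> 3
```

Each prediction touches only the classifiers on one root-to-leaf path, which is part of what makes the tree readable for a domain expert.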


Tree Structure (1/2)

  • Top-down Construction

    • Greedy


Greedy (1/3)

  • Start from the root, which contains all classes {1,2,3}

  • Find the maximum margin among all binary partitions: {1} vs {2,3}; {2} vs {1,3}; {3} vs {1,2}. For n classes there are $2^{n-1}-1$ such partitions!
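
To make the size of this search concrete, the sketch below enumerates every split of a class set into two non-empty groups, counting each split once, which gives 2^(n-1) - 1 candidates for n classes. The function name `binary_partitions` is hypothetical and chosen only for this example.

```python
from itertools import combinations


def binary_partitions(classes):
    """Yield each split of `classes` into two non-empty groups exactly once."""
    classes = list(classes)
    first, rest = classes[0], classes[1:]
    # Keep the first class on the left side so mirror-image splits are not repeated.
    for k in range(len(rest) + 1):
        for extra in combinations(rest, k):
            left = (first,) + extra
            right = tuple(c for c in rest if c not in extra)
            if right:  # skip the split whose right side would be empty
                yield left, right


parts = list(binary_partitions([1, 2, 3]))
print(parts)        # [((1,), (2, 3)), ((1, 2), (3,)), ((1, 3), (2,))]
print(len(parts))   # 3
```

With 6 classes the same function yields 31 candidate splits, and the count roughly doubles with every additional class.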


Greedy (2/3)

Nodes so far: {1,2,3} (root), {2,3} (child)

  • Repeat in child nodes.


Greedy (3/3)

Nodes: {1,2,3} (root), {2,3} (child)

  • Done!

  • Warning: the greedy approach does not necessarily lead to the global optimum

  • i.e., it may not find the globally maximal margin


Tree Structure (2/2)

  • Bottom-up tree: iteratively merge the closest groups.

    • Single linkage: distance between groups = distance of the nearest pair.

    • Complete linkage: distance between groups = distance of the farthest pair.
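
A minimal sketch of the bottom-up construction, assuming (as noted in the errata) that the distance between two classes is their pairwise margin. The function `linkage_tree`, its return format, and the toy margins are hypothetical; in practice the margins would come from fitting a max-margin classifier to each pair of classes.

```python
def linkage_tree(classes, dist, mode="complete"):
    """Agglomerative clustering of classes.

    `dist` maps frozenset({a, b}) to the distance between classes a and b
    (here: their pairwise margin). Returns the merges as a list of
    (group_a, group_b, height) tuples, in the order they happen.
    """
    agg = max if mode == "complete" else min   # farthest pair vs nearest pair
    groups = [frozenset([c]) for c in classes]
    merges = []

    def group_dist(g, h):
        return agg(dist[frozenset((a, b))] for a in g for b in h)

    while len(groups) > 1:
        # Find the closest pair of groups under the chosen linkage.
        d, i, j = min((group_dist(g, h), i, j)
                      for i, g in enumerate(groups)
                      for j, h in enumerate(groups) if i < j)
        merges.append((set(groups[i]), set(groups[j]), d))
        merged = groups[i] | groups[j]
        groups = [g for k, g in enumerate(groups) if k not in (i, j)] + [merged]
    return merges


# Hypothetical pairwise margins for three classes.
margins = {frozenset((1, 2)): 5.0, frozenset((1, 3)): 4.0, frozenset((2, 3)): 1.0}

print(linkage_tree([1, 2, 3], margins, "complete"))  # [({2}, {3}, 1.0), ({1}, {2, 3}, 5.0)]
print(linkage_tree([1, 2, 3], margins, "single"))    # [({2}, {3}, 1.0), ({1}, {2, 3}, 4.0)]
```

Classes 2 and 3 have the smallest margin, so they are merged first under both linkages; only the height of the final merge differs.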


Complete Tree

Height(subtree) = distance(the farthest pair of classes) ≥ Margin(cutting through the subtree)

When looking for a margin > Height(subtree), never break up the classes in the subtree
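
The inequality above can be justified with a short argument, sketched here under the assumption that the distance between two classes is their pairwise margin. The symbols $S$, $A$, $B$, $m(\cdot,\cdot)$ and $h(S)$ are notation introduced only for this sketch.

```latex
% S: the classes in a subtree; (A, B): a partition that cuts through S,
% so there are classes a in A and b in B that both belong to S.
% m(.,.) denotes the maximum margin between two groups of classes.
\begin{align*}
m(A, B) &\le m(\{a\}, \{b\})
        && \text{a hyperplane separating $A$ from $B$ also separates $a$ from $b$} \\
        &\le \max_{c,\, c' \in S} m(\{c\}, \{c'\})
        && \text{since $a, b \in S$} \\
        &= h(S)
        && \text{complete-linkage height of $S$}
\end{align*}
```

So any partition that breaks up $S$ has margin at most $h(S)$, and a search for a larger margin can treat $S$ as a single unit.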


Efficient Greedy Tree Construction

  • Construct a complete-linkage tree T

  • Estimate the current lower bound on the maximal margin: M0 = max Margin(individual class, rest)

  • To find a margin ≥ M0, we only need to consider partitions between {5,4,6}, {1}, {2,3}
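
The pruning step can be sketched as complete-linkage merging that stops once the merge height reaches M0: every group formed below M0 must stay intact, so only partitions between the remaining groups need to be examined. The function `intact_groups`, the toy margins, and the value of `m0` are assumptions made for illustration; they are chosen so that the result matches the groups named on the slide.

```python
from itertools import combinations


def intact_groups(classes, margin, m0):
    """Merge classes by complete linkage while the merge height stays below m0."""
    groups = [frozenset([c]) for c in classes]

    def height(g, h):
        # Complete linkage: the farthest (largest-margin) pair across the two groups.
        return max(margin[frozenset((a, b))] for a in g for b in h)

    while len(groups) > 1:
        d, i, j = min((height(g, h), i, j)
                      for i, g in enumerate(groups)
                      for j, h in enumerate(groups) if i < j)
        if d >= m0:
            break  # cutting any remaining merge could still reach margin m0
        merged = groups[i] | groups[j]
        groups = [g for k, g in enumerate(groups) if k not in (i, j)] + [merged]
    return sorted(sorted(g) for g in groups)


# Hypothetical margins: classes inside {2,3} and {4,5,6} are close, all others far apart.
close = [{2, 3}, {4, 5, 6}]
margin = {frozenset(p): (1.0 if any(set(p) <= c for c in close) else 3.0)
          for p in combinations(range(1, 7), 2)}

print(intact_groups(range(1, 7), margin, m0=2.0))   # [[1], [2, 3], [4, 5, 6]]
```

With three intact groups there are only 3 candidate splits to evaluate, instead of 31 for six separate classes.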


Margin trees for high dimensional classification

  • Comparable test performance (1-vs-1 voting included)

  • The complete-linkage tree is more balanced → more interpretable


Recall the cutting plane

Decision = $\operatorname{Sign}(\beta^{T} x + \beta_0)$

Cutting plane: $\beta^{T} x + \beta_0 = 0$

β is the weight of the features in the decision function


Feature Selection

  • Hard-thresholding at each split

    • Discard the n features with the smallest abs(βi) by setting βi=0

    • Proportional to margin: n = α|Margin|

    • α is chosen by cross-validation error

  • β is unavailable when using a non-linear kernel

  • Alternative methods

    • L1-norm SVM → forces βi to zero
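
A minimal sketch of the hard-thresholding step at a single split. The function `hard_threshold` and the example numbers are hypothetical, and rounding n = α|Margin| down to an integer is an assumption of this sketch rather than something stated on the slide.

```python
def hard_threshold(beta, n):
    """Return a copy of beta with the n smallest-|beta_i| entries set to 0."""
    n = max(0, min(n, len(beta)))
    order = sorted(range(len(beta)), key=lambda i: abs(beta[i]))
    drop = set(order[:n])                 # indices of the weakest features
    return [0.0 if i in drop else b for i, b in enumerate(beta)]


beta = [0.9, -0.05, 0.3, -1.2, 0.01]      # hypothetical feature weights
alpha, margin = 1.0, 2.0                  # hypothetical alpha and margin
n = int(alpha * abs(margin))              # n = alpha * |Margin|, rounded down (assumption)

print(hard_threshold(beta, n))            # [0.9, 0.0, 0.3, -1.2, 0.0]
```

In the slides α is chosen by cross-validation error, so in practice this function would be called once per candidate α and the resulting classifiers compared.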


Setting βi=0

Decision = $\operatorname{Sign}(\beta^{T} x + \beta_0)$

Cutting plane: $\beta^{T} x + \beta_0 = 0$


Feature Selection Result


Discussion

  • Good for multi-class, high-dimensional data

  • Bad for non-linearly separable data

    • Each node will contain impure data → impure β

  • Test performance is comparable to traditional multi-class max-margin classifiers (SVMs)

