
Tighter and Convex Maximum Margin Clustering

Yu-Feng Li (LAMDA, Nanjing University, China) (liyf@lamda.nju.edu.cn)

Ivor W. Tsang (NTU, Singapore) (IvorTsang@ntu.edu.sg)

James T. Kwok (HKUST, Hong Kong) (jamesk@cse.ust.hk)

Zhi-Hua Zhou (LAMDA, Nanjing University, China) (zhouzh@lamda.nju.edu.cn)

Summary
  • Maximum Margin Clustering (MMC) [Xu et al., NIPS05]
    • inspired by the success of the large margin criterion in SVMs
    • achieves state-of-the-art performance in many clustering problems
  • The problem with existing methods
    • SDP relaxation: global but not scalable
    • Local search: efficient but non-convex
  • We propose LG-MMC, a convex method that also scales to large datasets via a label-generation strategy.
Outline
  • Introduction
  • The Proposed LG-MMC Method
  • Experimental Results
  • Conclusion
Maximum Margin Clustering [Xu et al., NIPS05]
  • Perform clustering (i.e., determine the unknown labels y) by simultaneously finding the maximum margin hyperplane in the data
  • Setting
    • Given a set of unlabeled patterns x_1, ..., x_n
  • Goal
    • Learn a decision function f(x) = w'φ(x) + b and a label vector y ∈ {±1}^n:

      min_{y} min_{w,b,ξ} (1/2)||w||^2 + C Σ_i ξ_i

      s.t. y_i (w'φ(x_i) + b) ≥ 1 − ξ_i, ξ_i ≥ 0, ∀i (Margin, Error)

           −ℓ ≤ Σ_i y_i ≤ ℓ (Balance Constraint)

Maximum Margin Clustering [Xu et al., NIPS05]
  • The dual problem:

      min_{y ∈ {±1}^n} max_{α ∈ A} 1'α − (1/2)(α ∘ y)' K (α ∘ y)

    where A is the feasible set of the SVM dual variables, K is the kernel matrix, and ∘ is the elementwise product
  • A mixed integer program, intractable for large-scale datasets
  • Key
    • Some kind of relaxation may be helpful

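For a fixed candidate labeling y, the inner maximization is just a standard SVM dual, so the objective is cheap to evaluate. A minimal numpy sketch (the helper name, the fixed alpha, and the toy data are ours, not from the slides):

```python
import numpy as np

def svm_dual_objective(alpha, y, K):
    # 1'alpha - (1/2)(alpha ∘ y)' K (alpha ∘ y), with ∘ the elementwise product
    ay = alpha * y
    return alpha.sum() - 0.5 * ay @ K @ ay

# Toy data: two well-separated groups on the real line, linear kernel.
X = np.array([[0.0], [0.1], [2.0], [2.1]])
K = X @ X.T
alpha = np.full(4, 0.5)                      # some fixed dual variables

y_good = np.array([-1.0, -1.0, 1.0, 1.0])    # splits the two groups
y_bad = np.array([-1.0, 1.0, -1.0, 1.0])     # mixes the two groups
# For this alpha, the outer minimization over y favors y_good:
# svm_dual_objective(alpha, y_good, K) < svm_dual_objective(alpha, y_bad, K)
```

What makes MMC a mixed integer program is precisely the outer minimization over all 2^n labelings; each individual evaluation like the one above is easy.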
Related Work
  • MMC with SDP relaxation [Xu et al., NIPS05]
    • convex, state-of-the-art performance
    • expensive: worst-case O(n^6.5) time
  • Generalized MMC (GMMC) [Valizadegan & Jin, NIPS07]
    • a smaller SDP problem that speeds up MMC by about 100 times
    • still expensive: cannot handle even medium-sized datasets
  • Some efficient algorithms [Zhang et al., ICML07][Zhao et al., SDM08]
    • much more scalable than the global methods
    • non-convex: may get stuck in local minima

Goal: to investigate a convex method that is also scalable to large datasets

Outline
  • Introduction
  • The Proposed LG-MMC Method
  • Experimental Results
  • Conclusion
Intuition

[Schematic figure omitted] Solving an SVM for one fixed labeling is efficient; searching over all possible labelings directly is hard. Instead, consider a combination of several candidate labelings: each candidate y contributes a "label-kernel" yy', and learning their combination becomes a multiple label-kernel learning problem, which can again be solved efficiently.
Flow Chart of LG-MMC
  • LG-MMC: transform the MMC problem into multiple label-kernel learning via a minmax relaxation
  • Cutting plane algorithm
    • multiple label-kernel learning
    • finding the most violated y
  • LG-MMC achieves a tighter relaxation than the SDP relaxation [Xu et al., NIPS05]
LG-MMC: Minmax Relaxation of the MMC Problem
  • Consider interchanging the order of min_y and max_α, leading to:

      max_{α ∈ A} min_{y ∈ {±1}^n} 1'α − (1/2)(α ∘ y)' K (α ∘ y)

  • By the minimax inequality (max min ≤ min max), the optimal objective of LG-MMC is a lower bound of that of the MMC problem.
LG-MMC: Multiple Label-Kernel Learning
  • Firstly, LG-MMC can be rewritten as:

      max_{α ∈ A, θ} θ   s.t.   θ ≤ 1'α − (1/2)(α ∘ y_t)' K (α ∘ y_t) for every feasible labeling y_t

  • For the inner optimization subproblem, let μ_t ≥ 0 be the dual variable for each constraint. Its Lagrangian can be obtained as:

      θ + Σ_t μ_t (1'α − (1/2)(α ∘ y_t)' K (α ∘ y_t) − θ)

LG-MMC: Multiple Label-Kernel Learning (cont.)
  • Setting its derivative w.r.t. θ to zero, we have Σ_t μ_t = 1
  • Let M be the simplex {μ | Σ_t μ_t = 1, μ_t ≥ 0}
  • Replacing the inner subproblem with its dual, one can have:

      max_{α ∈ A} min_{μ ∈ M} 1'α − (1/2) α' (Σ_t μ_t K ∘ y_t y_t') α

  • Similar to single label learning, each K ∘ y_t y_t' acts as a base kernel, so the above formulation can be regarded as multiple label-kernel learning.
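The combined kernel in the multiple label-kernel learning view is just a convex combination of label-modulated copies of K. A small numpy sketch (the function name and toy values are ours):

```python
import numpy as np

def composite_label_kernel(K, Y, mu):
    # sum_t mu_t * (K ∘ y_t y_t'): one base kernel per candidate labeling y_t
    return sum(m * (K * np.outer(y, y)) for m, y in zip(mu, Y))

K = np.eye(3) + 0.5                       # a toy PSD kernel matrix
Y = np.array([[1.0, 1.0, -1.0],           # two candidate labelings
              [1.0, -1.0, -1.0]])
mu = np.array([0.7, 0.3])                 # weights on the simplex M
Kc = composite_label_kernel(K, Y, mu)
# Each K ∘ yy' is PSD whenever K is, and a convex combination of PSD
# matrices stays PSD, so Kc is itself a valid kernel matrix.
```

This is why the relaxation stays convex: the optimization over μ ranges over a simplex, and every point of that simplex yields a legitimate kernel.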
Cutting Plane Algorithm
  • Problem: exponential number of possible label assignments
    • the set of base kernels is also exponential in size
    • direct multiple kernel learning (MKL) is computationally intractable
  • Observation
    • only a subset of these constraints is active at optimality
    • use the cutting-plane method
Cutting Plane Algorithm

1. Initialize α. Find the most violated y and set C = {y, −y} (C is the working set of constraints).

2. Run MKL for the subset of kernel matrices selected in C.

3. Find the most violated y and set C = C ∪ {y}.

4. Repeat steps 2-3 until convergence.

The next two slides explain how steps 2 and 3 are carried out.
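The loop above can be sketched as a small driver that takes the two subroutines as arguments. The toy stand-ins below are placeholders of ours so the driver runs end to end; they are not the paper's actual MKL or constraint-generation solvers:

```python
import numpy as np

def lg_mmc_cutting_plane(K, solve_mkl, most_violated, max_iter=20):
    # Steps 1-4: grow a working set C of labelings until no new
    # violated labeling is found (or max_iter is reached).
    n = K.shape[0]
    alpha = np.full(n, 1.0 / n)                 # step 1: initialize alpha
    y0 = most_violated(K, [], alpha)
    C = [y0, -y0]                               # step 1: C = {y, -y}
    for _ in range(max_iter):
        alpha, mu = solve_mkl(K, C)             # step 2: MKL on working set
        y = most_violated(K, C, alpha)          # step 3: most violated y
        if any(np.array_equal(y, yc) for yc in C):
            break                               # step 4: no new labeling, stop
        C.append(y)
    return alpha, C

# Toy stand-ins (ours, purely illustrative):
def toy_mkl(K, C):
    n = K.shape[0]
    return np.full(n, 0.5), np.full(len(C), 1.0 / len(C))

def toy_violated(K, C, alpha):
    y = np.sign(K[0]).astype(float)   # crude: cluster by sign of similarity
    y[y == 0] = 1.0                   # to the first point
    return y

K = np.array([[2.0, 1.0, -1.0], [1.0, 2.0, -1.0], [-1.0, -1.0, 2.0]])
alpha, C = lg_mmc_cutting_plane(K, toy_mkl, toy_violated)
```

Because only a few constraints are active at optimality, the working set C typically stays small, which is what makes the overall method scalable.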
Cutting Plane Algorithm Step 2: Multiple Label-Kernel Learning
    • Suppose that the current working set is C = {y_1, ..., y_T}
    • The feature map for the base kernel matrix K ∘ y_t y_t' is ψ_t(x_i) = y_{ti} φ(x_i)
  • SimpleMKL

1. Fix μ and solve the SVM dual over the combined kernel Σ_t μ_t K ∘ y_t y_t'

2. Fix α and use a gradient method to update μ

3. Iterate until convergence
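The feature-map view can be checked numerically: scaling each feature vector by its label reproduces K ∘ yy' exactly (the explicit random features below are ours, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)
Phi = rng.standard_normal((4, 3))     # explicit features phi(x_i), one per row
y = np.array([1.0, -1.0, -1.0, 1.0])  # one candidate labeling

K = Phi @ Phi.T                       # original kernel matrix
Psi = y[:, None] * Phi                # label-scaled features psi(x_i) = y_i phi(x_i)
K_label = Psi @ Psi.T                 # kernel of the scaled features
# K_label coincides with K ∘ yy' entrywise, so each base kernel in the
# working set is a genuine kernel with feature map psi_t.
```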
Cutting Plane Algorithm Step 3: Finding the Most Violated y
  • Find the most violated y:

      max_{y} (α ∘ y)' K (α ∘ y) = max_{y} || Σ_i y_i α_i φ(x_i) ||_2^2

  • Problem: a concave QP, intractable over binary labels
  • Observation:
    • the cutting plane algorithm only requires the addition of a violated constraint at each iteration
    • replace the L2-norm above with the infinity-norm

Cutting Plane Algorithm Step 3: Finding the Most Violated y (cont.)
  • With the infinity-norm, each resulting subproblem is of the form:

      max_{y} | Σ_j y_j α_j k(x_i, x_j) |

    and can be solved exactly by
    • sorting the coefficients α_j k(x_i, x_j)
    • assigning labels subject to the balance constraint
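Each subproblem maximizes |y'c| for a fixed coefficient vector c subject to the balance constraint, which sorting solves exactly. A sketch of this step (the function name, the ell parameter, and the toy data are ours):

```python
import numpy as np

def most_violated_labeling(K, alpha, ell):
    # For each row i, maximize |sum_j y_j alpha_j K_ij| subject to
    # |sum_j y_j| <= ell, by sorting the coefficients; keep the best i.
    n = K.shape[0]
    best_val, best_y = -np.inf, None
    for i in range(n):
        c = alpha * K[i]                      # coefficients alpha_j k(x_i, x_j)
        order = np.argsort(-c)                # indices by decreasing c_j
        for k in range(n + 1):                # k = number of +1 labels
            if abs(2 * k - n) > ell:          # balance: sum(y) = 2k - n
                continue
            y = -np.ones(n)
            y[order[:k]] = 1.0                # +1 on the k largest c_j
            val = abs(y @ c)
            if val > best_val:
                best_val, best_y = val, y
    return best_y

# Toy data: linear kernel, perfectly balanced clusters required (ell = 0).
X = np.array([[0.0], [0.1], [2.0], [2.1]])
K = X @ X.T
y = most_violated_labeling(K, np.full(4, 0.5), ell=0)
```

Each of the n subproblems costs O(n log n) for the sort, so this step is cheap compared with solving the original concave QP.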
LG-MMC Achieves Tighter Relaxation
  • Consider the set of all feasible label matrices

      M_0 = { yy' | y is a feasible label vector }

    and two convex relaxations of it: the convex hull M_1 = conv(M_0), and the feasible set M_2 of the SDP relaxation

LG-MMC Achieves Tighter Relaxation (cont.)
  • One can find that
    • maximum margin clustering optimizes over M_0
    • the LG-MMC problem optimizes over M_1
    • the SDP-based MMC problem optimizes over M_2

LG-MMC Achieves Tighter Relaxation (cont.)
  • M_1 is the convex hull of M_0, which is the smallest convex set containing M_0
    • LG-MMC gives the tightest convex relaxation
  • It can be shown that M_2 is more relaxed than M_1 (M_1 ⊆ M_2)
    • SDP-based MMC is a looser relaxation than the proposed formulation
Outline
  • Introduction
  • The Proposed LG-MMC Method
  • Experimental Results
  • Conclusion
Experiments
  • Data sets
    • 17 UCI datasets
    • MNIST dataset
  • Implementation
    • Matlab 7.6
  • Evaluation
    • misclassification error

Compared Methods
  • k-means
    • one of the most mature baseline methods
  • Normalized Cut [Shi & Malik, PAMI00]
    • a well-known spectral clustering method
  • GMMC [Valizadegan & Jin, NIPS07]
    • one of the most efficient global methods for MMC
  • IterSVR [Zhang et al., ICML07]
    • an efficient algorithm for MMC
  • CPMMC [Zhao et al., SDM08]
    • another state-of-the-art efficient method for MMC
Win-Tie-Loss
  • Global methods vs. local methods
    • the global methods are better than the local methods
  • LG-MMC vs. GMMC
    • LG-MMC is competitive with the GMMC method
Speed

LG-MMC is about 10 times faster than GMMC.

However, in general, the local methods are still faster than the global methods.

Outline
  • Introduction
  • The Proposed LG-MMC Method
  • Experimental Results
  • Conclusion
Conclusion
  • Main contribution
    • In this paper, we propose a scalable and global optimization method for maximum margin clustering
    • To the best of our knowledge, this is the first time a label-generation strategy has been used for clustering; it might also be useful in other domains
  • Future work
    • We will extend the proposed approach to semi-supervised learning

Thank you