Max-margin Clustering: Detecting Margins from Projections of Points on Lines


Raghuraman Gopalan¹ and Jagan Sankaranarayanan²

¹Center for Automation Research, University of Maryland, College Park, MD, USA

²NEC Labs, Cupertino, CA, USA

E-mail: {raghuram,jagan}@umiacs.umd.edu

Problem Statement

- Given an unlabelled set of points forming k clusters, find a grouping with maximum separating margin among the clusters
- Prior work: mostly establishes feedback between different label proposals and runs a supervised classifier on them
- Goal: To understand the relation between data points and margin regions by analyzing projections of data on lines
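The two quantities the rest of the slides rely on, the location of a projected point along a line and its distance of projection, can be illustrated with a minimal sketch. The function name `project_on_line` and the convention that location is 0 at `a` and 1 at `b` are assumptions made for this example, not notation from the slides.

```python
import math

def project_on_line(point, a, b):
    """Project `point` onto the infinite line through `a` and `b`.

    Returns (t, dist):
      t    -- location of the projection along the line (0 at a, 1 at b)
      dist -- distance of projection (perpendicular distance to the line)
    """
    d = [bi - ai for ai, bi in zip(a, b)]          # direction of the line
    dd = sum(x * x for x in d)                     # squared length of a->b
    if dd == 0:
        raise ValueError("a and b must be distinct")
    v = [pi - ai for ai, pi in zip(a, point)]      # point relative to a
    t = sum(x * y for x, y in zip(v, d)) / dd
    foot = [ai + t * di for ai, di in zip(a, d)]   # foot of the perpendicular
    return t, math.dist(point, foot)

t, dist = project_on_line((1.0, 2.0), (0.0, 0.0), (2.0, 0.0))
```

Here the point (1, 2) projects to the midpoint of the segment from (0, 0) to (2, 0), at a distance of 2 from the line.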

Two-cluster Problem

- Assumptions
  - Linearly separable clusters
    - Kernel trick for non-linear case
  - No outliers in data (max margin exists only between clusters)
  - Enforce global cluster balance
- Proposition 1
  - SI* exists ONLY on line segments in margin region that are perpendicular to the separating hyperplane
  - Such line segments directly provide cluster groupings
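How a line segment can "directly provide" a grouping in the two-cluster case can be sketched as follows. The slides do not define SI here, so the sketch assumes, purely for illustration, that the relevant interval is the widest stretch of the line containing no projected points. The helper name `split_by_widest_gap` is hypothetical.

```python
import math

def split_by_widest_gap(points, a, b):
    """Split points into two groups at the widest empty interval on line a-b.

    Assumption for illustration: the interval of interest is the widest gap
    between consecutive projected locations. Returns (labels, gap_width).
    """
    if len(points) < 2:
        raise ValueError("need at least two points")
    d = [bi - ai for ai, bi in zip(a, b)]
    dd = sum(x * x for x in d)
    if dd == 0:
        raise ValueError("a and b must be distinct")
    # Location of each projected point along the line (0 at a, 1 at b).
    locs = [sum((pi - ai) * di for pi, ai, di in zip(p, a, d)) / dd
            for p in points]
    order = sorted(range(len(points)), key=lambda i: locs[i])
    # Gaps between consecutive projected locations, in sorted order.
    gaps = [locs[order[k + 1]] - locs[order[k]] for k in range(len(order) - 1)]
    k = max(range(len(gaps)), key=lambda i: gaps[i])
    labels = [0] * len(points)
    for i in order[k + 1:]:          # everything past the widest gap
        labels[i] = 1
    return labels, gaps[k] * math.sqrt(dd)   # gap width in data units

pts = [(0, 0), (0, 1), (1, 0), (5, 0), (6, 1), (5, 1)]
labels, width = split_by_widest_gap(pts, (0, 0), (1, 0))
```

The line is chosen here by hand along the x-axis, which happens to be perpendicular to a separating hyperplane for this toy data. Finding such lines without labels is the problem the remaining slides address.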

Multi-cluster Problem

SI* doesn’t exist

Location information of projected points (SI) alone is insufficient to detect margins

The Role of Distance of Projection

Proposition 2

For line intervals in margin region, perpendicular to the separating hyperplane,

Proposition 3

For line intervals inside a cluster of length more than Mm,

Proposition 4

An interval with SI having no projected points with distance of projection less than Dmin*, can lie only outside a cluster; where

[Figure: clusters CL1, CL2, CL3 with γ1, γ2, γ3 marked]

Defn: Dmin of a line interval is the minimum distance of projection of points in that interval.

No outlier assumption: Max margin between points within a cluster
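The definition of Dmin can be made concrete with a short sketch. The function name `d_min`, the parametrisation of the interval as `[lo, hi]` in units where 0 is `a` and 1 is `b`, and the choice to return infinity for an interval with no projected points are all assumptions of this example.

```python
import math

def d_min(points, a, b, lo=0.0, hi=1.0):
    """Minimum distance of projection among points projecting into [lo, hi].

    The interval lies on the line through a and b, with location 0 at a and
    1 at b. Returns math.inf if no point projects into the interval
    (a convention chosen for this sketch).
    """
    d = [bi - ai for ai, bi in zip(a, b)]
    dd = sum(x * x for x in d)
    if dd == 0:
        raise ValueError("a and b must be distinct")
    best = math.inf
    for p in points:
        t = sum((pi - ai) * di for pi, ai, di in zip(p, a, d)) / dd
        if lo <= t <= hi:                               # projects into interval
            foot = [ai + t * di for ai, di in zip(a, d)]
            best = min(best, math.dist(p, foot))        # distance of projection
    return best

pts = [(0.5, 3.0), (0.25, 1.0), (2.0, 0.1)]
value = d_min(pts, (0.0, 0.0), (1.0, 0.0))
```

In this usage only the first two points project into the interval, so the third point's small distance of projection does not count.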

A Pair-wise Similarity Measure for Clustering

- f(xi,xj)=1, iff xi=xj
- f(xi,xj)<<1, iff xi and xj are from different clusters, and Intij is perpendicular to their separating hyperplane

Max-margin Clustering Algorithm

- Draw lines between all pairs of points
- Estimate the probability of presence of margins between a pair of points xi and xj by computing f(xi,xj)
- Perform global clustering using f between all point-pairs
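The three steps above can be sketched end to end. The slides do not give the formula for f, so the `similarity` function below is a hypothetical stand-in. It keeps points that project onto the segment within a distance of projection `radius`, takes the widest empty gap along the segment, and maps it through `exp(-gap / scale)`, so that f = 1 only for identical points and f is small across a wide gap. Likewise, thresholding f and taking connected components is only one simple choice for the global clustering step. The parameters `radius`, `scale` and `threshold` are all assumptions of this sketch.

```python
import math

def similarity(points, i, j, radius=1.0, scale=1.0):
    """Hypothetical stand-in for f(xi, xj): exp(-widest empty gap / scale)."""
    a, b = points[i], points[j]
    d = [bi - ai for ai, bi in zip(a, b)]
    dd = sum(x * x for x in d)
    if dd == 0:
        return 1.0                       # identical points: f = 1
    locs = []
    for p in points:
        t = sum((pi - ai) * di for pi, ai, di in zip(p, a, d)) / dd
        foot = [ai + t * di for ai, di in zip(a, d)]
        # Keep points that project onto the segment and lie close to the line.
        if 0.0 <= t <= 1.0 and math.dist(p, foot) <= radius:
            locs.append(t)
    locs.sort()                          # always contains the endpoints 0 and 1
    widest = max(t2 - t1 for t1, t2 in zip(locs, locs[1:]))
    return math.exp(-widest * math.sqrt(dd) / scale)

def cluster(points, threshold=0.5, radius=1.0, scale=1.0):
    """Link pairs with f > threshold and return connected-component labels."""
    n = len(points)
    parent = list(range(n))

    def find(x):                         # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):                   # lines between all pairs of points
        for j in range(i + 1, n):
            if similarity(points, i, j, radius, scale) > threshold:
                parent[find(i)] = find(j)
    roots = {}                           # relabel components as 0, 1, 2, ...
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]

pts = [(0, 0), (0.5, 0), (0, 0.5), (0.5, 0.5),
       (5, 0), (5.5, 0), (5, 0.5), (5.5, 0.5)]
labels = cluster(pts)
```

On this toy data the two squares of points end up in separate components. A real implementation would need a principled choice of f and of the global clustering step. This sketch evaluates every pair against every point, so it costs O(n³) projections.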

Results

[Figures: results on 3D and 2D data]

Summary

Clustering: Detecting margin regions

- Obtaining statistics of location and distance of projection of points that are specific to line segments in margin regions (Prop. 1 to 4)
- A pair-wise similarity measure to perform clustering, which avoids some optimization-related challenges prevalent in most existing methods