
### Integrating Constraints and Metric Learning in Semi-Supervised Clustering

Mikhail Bilenko, Sugato Basu, Raymond J. Mooney

ICML 2004

Presented by Xin Li

Semi-Supervised Clustering

[Figure: clustering example, K=4]
How to exploit supervision in clustering

- Incorporate supervision as constraints
- Learn a distance metric using supervision
- Integration of these two approaches

K-means Clustering

X = {x1,x2,…}

L = {l1,l2,…,lk}

Euclidean Distance:

Minimizing:
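
The formulas on this slide did not survive extraction. Assuming that each l_i is the cluster label assigned to x_i and that \mu_h is the centroid of cluster h (notation not present in the transcript), the standard K-means forms would read roughly as follows:

```latex
% Euclidean distance between a point and a cluster centroid
d(x_i, \mu_h) = \lVert x_i - \mu_h \rVert = \sqrt{(x_i - \mu_h)^{\top} (x_i - \mu_h)}

% K-means objective: total squared distance from each point to its assigned centroid
J_{\text{kmeans}} = \sum_{x_i \in X} \lVert x_i - \mu_{l_i} \rVert^{2}
```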

Clustering with constraints

Pairwise constraints:

- M: must-link pairs
  - (xi, xj) should be in the same cluster
- C: cannot-link pairs
  - (xi, xj) should be in different clusters
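
To make the two constraint types concrete, here is a minimal sketch that counts how many constraints a given cluster assignment violates. The function name `count_violations` and the index-pair representation are assumptions made for illustration, not something taken from the slides.

```python
from typing import List, Sequence, Tuple

Pair = Tuple[int, int]  # indices (i, j) of two points


def count_violations(
    labels: Sequence[int],
    must_link: List[Pair],
    cannot_link: List[Pair],
) -> Tuple[int, int]:
    """Return (violated must-links, violated cannot-links) for an assignment."""
    # A must-link pair is violated when the two points get different labels.
    ml = sum(1 for i, j in must_link if labels[i] != labels[j])
    # A cannot-link pair is violated when the two points share a label.
    cl = sum(1 for i, j in cannot_link if labels[i] == labels[j])
    return ml, cl


labels = [0, 0, 1, 1]            # cluster label of each of four points
must_link = [(0, 1), (1, 2)]     # (1, 2) is violated: labels 0 and 1
cannot_link = [(0, 3), (2, 3)]   # (2, 3) is violated: both labelled 1
print(count_violations(labels, must_link, cannot_link))
```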

Learning a pairwise distance metric

Binary classification: (xi, xj) → 0/1

- M: positive examples
  - (xi, xj) are in the same cluster
- C: negative examples
  - (xi, xj) are in different clusters
- Apply the learned distance metric in clustering.
- Metric learning and clustering are disjoint: the metric is learned first and clustering is run afterwards.
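
The pairwise view above can be sketched as turning the constraint sets into a labelled training set for a binary classifier. The feature used here (the per-coordinate absolute difference between the two points) and the name `pair_examples` are assumptions chosen for illustration; the slides do not say which pair representation is used.

```python
from typing import List, Sequence, Tuple

Pair = Tuple[int, int]


def pair_examples(
    points: Sequence[Sequence[float]],
    must_link: List[Pair],
    cannot_link: List[Pair],
) -> List[Tuple[List[float], int]]:
    """Build (feature vector, label) examples from pairwise constraints.

    Must-link pairs become positive examples (label 1) and cannot-link
    pairs become negative examples (label 0).
    """
    examples = []
    for label, pairs in ((1, must_link), (0, cannot_link)):
        for i, j in pairs:
            # Assumed pair feature: absolute difference in each coordinate.
            diff = [abs(a - b) for a, b in zip(points[i], points[j])]
            examples.append((diff, label))
    return examples


points = [[0.0, 0.0], [1.0, 2.0], [4.0, 0.0]]
training_set = pair_examples(points, must_link=[(0, 1)], cannot_link=[(0, 2)])
```

Any binary classifier trained on such examples yields a pairwise similarity score, which is then handed to a separate clustering step.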

Maximizing the complete data log-likelihood under generalized K-means

Unsupervised Clustering with Metric Learning

- Learn a distance metric that optimizes a quality function.

Integrating Constraints and Metric Learning

Combining the previous two equations gives an objective function whose minimization reduces cluster dispersion under the learned metrics while also reducing constraint violations.
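
The equation itself is missing from the transcript. One way to write an objective of this kind is sketched below, under assumed notation: l_i is the cluster label of x_i, \mu_h and A_h are the centroid and metric matrix of cluster h, w_{ij} and \bar{w}_{ij} are constraint weights, and f_M and f_C are penalty functions for must-link and cannot-link violations. None of these symbols is spelled out on the slide, so treat this as an illustration rather than the exact formula shown in the presentation.

```latex
% Squared distance under a cluster-specific metric A_h
\lVert x - y \rVert_{A_h}^{2} = (x - y)^{\top} A_h \, (x - y)

% Dispersion under the learned metrics plus penalties for violated constraints
J = \sum_{x_i \in X} \left( \lVert x_i - \mu_{l_i} \rVert_{A_{l_i}}^{2} - \log \det A_{l_i} \right)
  + \sum_{(x_i, x_j) \in M} w_{ij} \, f_M(x_i, x_j) \, \mathbb{1}[l_i \neq l_j]
  + \sum_{(x_i, x_j) \in C} \bar{w}_{ij} \, f_C(x_i, x_j) \, \mathbb{1}[l_i = l_j]
```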

Penalty for violating constraints

- The penalty for violating a must-link constraint between distant points should be higher than that between nearby points.
- The penalty for violating a cannot-link constraint between nearby points should be higher than that between distant points.
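
The two rules above can be sketched with distance-scaled penalties. This is a minimal illustration assuming a single diagonal metric (a list of per-dimension weights `a`) and a precomputed largest squared distance in the data set, `max_sq_dist`; the function names and these simplifications are assumptions, not the presentation's definitions.

```python
from typing import Sequence


def sq_dist(x: Sequence[float], y: Sequence[float], a: Sequence[float]) -> float:
    """Squared distance under a diagonal metric with weights a."""
    return sum(w * (p - q) ** 2 for w, p, q in zip(a, x, y))


def must_link_penalty(x, y, a) -> float:
    """Grows with distance: separating far-apart must-link points costs more."""
    return sq_dist(x, y, a)


def cannot_link_penalty(x, y, a, max_sq_dist: float) -> float:
    """Shrinks with distance: merging nearby cannot-link points costs more."""
    return max_sq_dist - sq_dist(x, y, a)


a = [1.0, 1.0]                      # assumed metric weights (plain Euclidean)
origin, near, far = [0.0, 0.0], [1.0, 0.0], [3.0, 4.0]
max_sq_dist = sq_dist(origin, far, a)   # 25.0, the most separated pair here

print(must_link_penalty(origin, near, a), must_link_penalty(origin, far, a))
print(cannot_link_penalty(origin, near, a, max_sq_dist),
      cannot_link_penalty(origin, far, a, max_sq_dist))
```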

MPCK-MEANS Algorithm

- Constraints are utilized during cluster initialization and when assigning points to clusters.
- The distance metric is adapted by re-estimating the weights in matrices Ah.

Initialization

- An initial guess of the clusters.
- Assign each point x to one of K clusters in a way that satisfies the constraints.
- Compute the centroid of each cluster.
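
One ingredient of a constraint-respecting initialization can be sketched as follows: points connected by must-link pairs are merged into groups (a transitive closure), and a centroid is computed per group. The slide does not describe how the initial assignment is actually built, so the union-find approach and the names `must_link_groups` and `centroid` are assumptions for illustration only.

```python
from typing import Dict, List, Sequence, Tuple


def must_link_groups(n_points: int, must_link: List[Tuple[int, int]]) -> List[List[int]]:
    """Group point indices that are transitively connected by must-links."""
    parent = list(range(n_points))

    def find(i: int) -> int:
        # Follow parent pointers to the group root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in must_link:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri  # merge the two groups

    groups: Dict[int, List[int]] = {}
    for i in range(n_points):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def centroid(points: Sequence[Sequence[float]], members: List[int]) -> List[float]:
    """Mean of the member points, coordinate by coordinate."""
    dim = len(points[members[0]])
    return [sum(points[m][d] for m in members) / len(members) for d in range(dim)]


points = [[0.0, 0.0], [2.0, 0.0], [4.0, 0.0], [10.0, 10.0], [20.0, 20.0]]
groups = must_link_groups(len(points), [(0, 1), (1, 2)])
centroids = [centroid(points, g) for g in groups]
```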

E-step

- Every point x is assigned to the cluster that minimizes the sum of the distance of x to the cluster centroid according to the local metric and the cost of any constraint violations incurred by the cluster assignment.
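
The assignment rule above can be sketched as a small function. This is a simplified illustration that assumes a single diagonal metric `a` shared by all clusters and a constant penalty `w` per violated constraint, instead of cluster-specific metrics and distance-scaled penalties; the name `assign_point` and its parameters are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Pair = Tuple[int, int]


def sq_dist(x: Sequence[float], y: Sequence[float], a: Sequence[float]) -> float:
    """Squared distance under a diagonal metric with weights a."""
    return sum(wt * (p - q) ** 2 for wt, p, q in zip(a, x, y))


def assign_point(
    i: int,
    points: Sequence[Sequence[float]],
    centroids: Sequence[Sequence[float]],
    labels: Sequence[Optional[int]],   # current labels; None if unassigned
    must_link: List[Pair],
    cannot_link: List[Pair],
    a: Sequence[float],
    w: float = 1.0,                    # assumed constant violation penalty
) -> int:
    """Pick the cluster minimizing distance plus constraint-violation cost."""
    best_h, best_cost = -1, float("inf")
    for h, mu in enumerate(centroids):
        cost = sq_dist(points[i], mu, a)
        for p, q in must_link:
            if i in (p, q):
                other = q if p == i else p
                # Violated if the partner already sits in a different cluster.
                if labels[other] is not None and labels[other] != h:
                    cost += w
        for p, q in cannot_link:
            if i in (p, q):
                other = q if p == i else p
                # Violated if the partner already sits in this cluster.
                if labels[other] == h:
                    cost += w
        if cost < best_cost:
            best_h, best_cost = h, cost
    return best_h


points = [[0.0, 0.0], [0.9, 0.0], [2.0, 0.0]]
centroids = [[0.0, 0.0], [2.0, 0.0]]
labels = [0, None, 1]
a = [1.0, 1.0]
# Point 1 is closer to centroid 0, but a cannot-link with point 0 pushes it away.
print(assign_point(1, points, centroids, labels, [], [], a))
print(assign_point(1, points, centroids, labels, [], [(0, 1)], a))
```

With no constraints the point goes to the nearer centroid; adding the cannot-link constraint makes the farther cluster the cheaper choice because the penalty outweighs the extra distance.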

Experimental Setting

Single Metric, Diagonal Matrix A

Multiple Metrics, Full Matrix A

Conclusion and Discussion

- This paper has presented MPCK-MEANS, a new approach to semi-supervised clustering.
- Supervision and metric learning are both helpful in clustering, and multiple distance metrics are not necessary in most cases.
- Question 1: If we have supervision in clustering, why not use it in the same way as in a typical classification task?
- Question 2: If there is an infinite number of classes, can we gain from supervision on only some of them?
