
# Segmentation Techniques

Luis E. Tirado, PhD qualifying exam presentation, Northeastern University



### Segmentation

• Spectral Clustering

• Graph-cut

• Normalized graph-cut

• Expectation Maximization (EM) clustering



### Graph Theory Terminology

(Figure: an example weighted graph with vertices labeled A and B.)

• Graph $G(V, E)$

• A set of vertices $V$ and edges $E$

• Numbers on the edges represent weights

• Graphs for clustering

• Data points are vertices

• Edge weights decrease with the distance between points

• Segmentation: look for a minimum cut in the graph (a minimal affinity-matrix sketch follows)
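The "weights decrease with distance" idea is easy to make concrete. Below is a minimal sketch, assuming a Gaussian kernel (a common choice, though the slide does not commit to one); `affinity_matrix` and `sigma` are illustrative names, not from the original deck.

```python
import numpy as np

def affinity_matrix(points, sigma=1.0):
    """Fully connected graph as a matrix: each point is a vertex, and the
    edge weight A_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)) shrinks with
    the Euclidean distance between points i and j."""
    diffs = points[:, None, :] - points[None, :, :]  # pairwise differences
    sq_dists = np.sum(diffs ** 2, axis=-1)           # squared distances
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

# Two tight 2-D blobs: within-blob weights stay near 1,
# between-blob weights fall to roughly 0.
pts = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
print(np.round(affinity_matrix(pts, sigma=0.5), 3))
```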


### Spectral Clustering

(Figure: an example weighted graph with integer edge weights, from Forsyth & Ponce.)

• Graph-cut

• Undirected, weighted graph $G = (V, E)$ represented as an affinity matrix $A$

• Use the eigenvectors of $A$ for segmentation

• Assume $k$ elements and $c$ clusters

• Represent cluster $n$ with a vector $w_n$ of $k$ components

• The values represent cluster association; normalize so that $w_n^T w_n = 1$

• Extract good clusters

• Select the $w_n$ which maximizes $w_n^T A w_n$

• The solution satisfies $A w_n = \lambda w_n$

• So $w_n$ is an eigenvector of $A$; select the eigenvector with the largest eigenvalue (a numerical sketch follows)
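A minimal numerical sketch of this eigenvector extraction; the toy affinity matrix and the 0.3 threshold are illustrative choices of mine, not from the slides.

```python
import numpy as np

# Toy affinity matrix: a "dominant" 3-point cluster weakly coupled
# to a 2-point cluster.
A = np.array([
    [1.0, 0.9, 0.8, 0.1, 0.0],
    [0.9, 1.0, 0.9, 0.0, 0.1],
    [0.8, 0.9, 1.0, 0.1, 0.0],
    [0.1, 0.0, 0.1, 1.0, 0.9],
    [0.0, 0.1, 0.0, 0.9, 1.0],
])

# Maximizing w^T A w subject to w^T w = 1 is solved by the eigenvector
# of A with the largest eigenvalue.
eigvals, eigvecs = np.linalg.eigh(A)  # ascending eigenvalues; A is symmetric
w = eigvecs[:, -1]                    # eigenvector of the largest eigenvalue
print(np.round(w, 3))
# Large-magnitude components mark the dominant cluster.
print(np.abs(w) > 0.3)                # -> the first three points
```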


### Spectral Clustering

• Normalized Cut

• Addresses the drawbacks of the plain graph-cut

• Define the association between a vertex subset $A$ and the full vertex set $V$ as: $\text{assoc}(A, V) = \sum_{u \in A,\, t \in V} w(u, t)$

• Previously we maximized assoc(A, A); now we also normalize the cut by the total association. For a partition of $V$ into $A$ and $B$, with $\text{cut}(A, B) = \sum_{u \in A,\, v \in B} w(u, v)$, define the normalized cut as (see the sketch below):

$$\text{Ncut}(A, B) = \frac{\text{cut}(A, B)}{\text{assoc}(A, V)} + \frac{\text{cut}(A, B)}{\text{assoc}(B, V)}$$
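As a sketch, the definition translates directly into code; the helper name `ncut_value` and the toy matrix are mine.

```python
import numpy as np

def ncut_value(A, mask):
    """Ncut(A, B) for a bipartition of the graph with affinity matrix A.
    `mask` selects subset A; its complement is B.
    cut(A, B) sums the weights crossing the partition;
    assoc(A, V) sums all weights leaving subset A."""
    cut = A[mask][:, ~mask].sum()
    return cut / A[mask].sum() + cut / A[~mask].sum()

A = np.array([
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.0],
    [0.1, 0.0, 1.0],
])
mask = np.array([True, True, False])
print(ncut_value(A, mask))  # small value: a good (low-cut) partition
```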


### Spectral Clustering

• Normalized Cuts Algorithm

• Define the diagonal degree matrix $D$, $D_{ii} = \sum_j A_{ij}$, where $A$ is the affinity matrix.

• Define a vector $x$ depicting cluster membership

• $x_i = 1$ if point $i$ is in $A$, and $-1$ otherwise

• Define a real-valued approximation $y$ to $x$

• We now wish to minimize the objective function: $\text{Ncut}(x) = \dfrac{y^T (D - A)\, y}{y^T D\, y}$

• This constitutes solving the generalized eigensystem $(D - A)\, y = \lambda D\, y$

• The solution is the eigenvector with the second smallest eigenvalue

• If the Ncut of the resulting partition is over some threshold, recursively re-partition the graph (a sketch of one bipartition step follows).
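A sketch of a single bipartition step under these definitions. Thresholding the eigenvector at 0 is the simplest splitting rule; Shi & Malik also consider searching over splitting points.

```python
import numpy as np
from scipy.linalg import eigh

def ncut_bipartition(A):
    """One level of the normalized-cut bipartition (not the full
    recursive algorithm). A is a symmetric affinity matrix."""
    D = np.diag(A.sum(axis=1))         # degree matrix, D_ii = sum_j A_ij
    # Generalized eigensystem (D - A) y = lambda D y, ascending eigenvalues.
    eigvals, eigvecs = eigh(D - A, D)
    y = eigvecs[:, 1]                  # second smallest eigenvalue
    return y > 0                       # split V by the sign of y

A = np.array([
    [1.0, 0.9, 0.8, 0.1, 0.0],
    [0.9, 1.0, 0.9, 0.0, 0.1],
    [0.8, 0.9, 1.0, 0.1, 0.0],
    [0.1, 0.0, 0.1, 1.0, 0.9],
    [0.0, 0.1, 0.0, 0.9, 1.0],
])
print(ncut_bipartition(A))  # separates {0, 1, 2} from {3, 4}
```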


### Probabilistic Mixture Resolving Approach to Clustering

• Expectation Maximization (EM) Algorithm

• Density estimation of data points in an unsupervised setting

• Finds maximum-likelihood (ML) estimates when the data depend on latent variables

• E step: compute the expected log-likelihood, treating the latent variables as if they were observed

• M step: compute ML estimates of the parameters by maximizing the expectation from the E step

• Segmentation: reformulate as a missing-data problem

• The latent variable $Z$ provides the labeling

• Bivariate Gaussian PDF (evaluated in the sketch below): $p(\mathbf{x} \mid \boldsymbol\mu, \Sigma) = \dfrac{1}{2\pi\,|\Sigma|^{1/2}} \exp\!\left(-\tfrac{1}{2}(\mathbf{x} - \boldsymbol\mu)^T \Sigma^{-1} (\mathbf{x} - \boldsymbol\mu)\right)$
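Evaluating that density is a one-liner in code; a minimal sketch, with a function name of my choosing.

```python
import numpy as np

def gaussian_pdf(x, mu, Sigma):
    """Gaussian density p(x | mu, Sigma); bivariate here, but the
    formula generalizes to any dimension d."""
    d = len(mu)
    diff = x - mu
    norm = 1.0 / np.sqrt((2.0 * np.pi) ** d * np.linalg.det(Sigma))
    return norm * np.exp(-0.5 * diff @ np.linalg.inv(Sigma) @ diff)

# Standard bivariate normal at the origin: 1 / (2 pi) ~ 0.1592
print(gaussian_pdf(np.zeros(2), np.zeros(2), np.eye(2)))
```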


### Probabilistic Mixture Resolving Approach to Clustering

• EM Process

• Maximize the log-likelihood function: $\log L(\Theta) = \sum_{i=1}^{k} \log \sum_{n=1}^{c} \pi_n\, p(x_i \mid \theta_n)$

• This is not trivial (a log of a sum); introduce $Z$ and denote the complete data $Y = [X^T\, Z^T]^T$: $\log L_c(\Theta) = \sum_{i=1}^{k} \sum_{n=1}^{c} z_{in} \log\left(\pi_n\, p(x_i \mid \theta_n)\right)$

• If we knew the above complete data, ML estimation would be easy: the log acts directly on each component density, so each parameter is estimated from its own labeled points


• EM steps (iterated until convergence; a full sketch follows):

• E step: compute the responsibilities $\gamma_{in} = \dfrac{\pi_n\, p(x_i \mid \theta_n)}{\sum_{m=1}^{c} \pi_m\, p(x_i \mid \theta_m)}$

• M step: re-estimate the parameters, $\pi_n = \frac{1}{k} \sum_i \gamma_{in}$, $\mu_n = \dfrac{\sum_i \gamma_{in}\, x_i}{\sum_i \gamma_{in}}$, $\Sigma_n = \dfrac{\sum_i \gamma_{in} (x_i - \mu_n)(x_i - \mu_n)^T}{\sum_i \gamma_{in}}$

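Putting the E and M steps together gives the following sketch of EM for a Gaussian mixture: a bare-bones implementation under my own initialization choices, with no safeguards against singular covariances or empty components.

```python
import numpy as np

rng = np.random.default_rng(0)

def em_gmm(X, c, n_iter=50):
    """Bare-bones EM for a mixture of c Gaussians."""
    N, d = X.shape
    pi = np.full(c, 1.0 / c)                             # mixing weights
    mu = X[rng.choice(N, size=c, replace=False)].copy()  # means from data
    Sigma = np.stack([np.eye(d)] * c)                    # identity covariances
    gamma = np.empty((N, c))                             # responsibilities
    for _ in range(n_iter):
        # E step: gamma_in proportional to pi_n * N(x_i | mu_n, Sigma_n)
        for n in range(c):
            diff = X - mu[n]
            inv = np.linalg.inv(Sigma[n])
            coef = pi[n] / np.sqrt((2 * np.pi) ** d * np.linalg.det(Sigma[n]))
            gamma[:, n] = coef * np.exp(-0.5 * np.sum(diff @ inv * diff, axis=1))
        gamma /= gamma.sum(axis=1, keepdims=True)
        # M step: ML updates of pi, mu, Sigma from the responsibilities
        Nk = gamma.sum(axis=0)
        pi = Nk / N
        mu = (gamma.T @ X) / Nk[:, None]
        for n in range(c):
            diff = X - mu[n]
            Sigma[n] = (gamma[:, n, None] * diff).T @ diff / Nk[n]
    return pi, mu, Sigma, gamma

# Two bivariate Gaussian blobs; EM recovers means near (0, 0) and (5, 5).
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(5, 1, (100, 2))])
pi, mu, Sigma, gamma = em_gmm(X, c=2)
print(np.round(mu, 2))
labels = gamma.argmax(axis=1)  # hard segmentation from soft responsibilities
```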

### Conclusions

• For a simple case like the example of four Gaussians, both algorithms perform well, as can be seen from the results

• From the literature (k = number of clusters):

• EM is good for small k; it gives only a coarse segmentation for large k

• It needs to know the number of components to cluster

• Initial conditions are essential; prior knowledge helps accelerate convergence and reach a local/global maximum of the likelihood

• Ncut gives good results for large k

• For a fully connected graph, it has intensive space and computation-time requirements

• Graph cut's first-eigenvector approach finds points in the 'dominant' cluster

• It is not very consistent; the literature advocates the normalized approach

• In the end, the choice is a tradeoff depending on the source data

### References (for slide images)

• J. Shi & J. Malik, "Normalized Cuts and Image Segmentation": http://www.cs.berkeley.edu/~malik/papers/SM-ncut.pdf

• C. Bishop, "Latent Variables, Mixture Models and EM": http://cmp.felk.cvut.cz/cmp/courses/recognition/Resources/_EM/Bishop-EM.ppt

• R. Nugent & L. Stanberry, "Spectral Clustering": http://www.stat.washington.edu/wxs/Stat593-s03/Student-presentations/SpectralClustering2.ppt

• S. Candemir, "Graph-based Algorithms for Segmentation": http://www.bilmuh.gyte.edu.tr/BIL629/special%20section-%20graphs/GraphBasedAlgorithmsForComputerVision.ppt

• W. H. Liao, "Segmentation: Graph-Theoretic Clustering": http://www.cs.nccu.edu.tw/~whliao/acv2008/segmentation_by_graph.ppt

• D. Forsyth & J. Ponce, "Computer Vision: A Modern Approach"

### K-means (used by some clustering algorithms)

• Determine the Euclidean distance of each object in the data set to the (randomly picked) center points

• Construct K clusters by assigning every point to its closest center

• Move the center points to the actual centroids of the resulting clusters, and repeat until the assignments no longer change (a sketch follows)
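These three steps, iterated, are the whole algorithm. A minimal sketch; random initialization and a fixed iteration count are my simplifications.

```python
import numpy as np

rng = np.random.default_rng(1)

def kmeans(X, K, n_iter=20):
    """Plain K-means following the three steps above."""
    centers = X[rng.choice(len(X), size=K, replace=False)].copy()
    for _ in range(n_iter):
        # Euclidean distance from every object to every center
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)       # assign points to closest cluster
        for k in range(K):                  # move centers to cluster means
            if np.any(labels == k):
                centers[k] = X[labels == k].mean(axis=0)
    return centers, labels

X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(4, 0.5, (50, 2))])
centers, labels = kmeans(X, K=2)
print(np.round(centers, 2))  # close to (0, 0) and (4, 4)
```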

### Responsibilities

• Responsibilities $r_{nk}$ assign data points to clusters such that $\sum_k r_{nk} = 1$

• Example: 5 data points and 3 clusters yield a 5 × 3 responsibility matrix linking the data to the prototypes (see the example below)
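For concreteness, one possible responsibility matrix for the 5-point, 3-cluster example; the values below are illustrative (the original slide's numbers are not preserved), with each row summing to 1:

```latex
R = \begin{pmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1 \\
1 & 0 & 0 \\
0 & 1 & 0
\end{pmatrix},
\qquad r_{nk} \in \{0, 1\}, \quad \sum_{k=1}^{3} r_{nk} = 1
```

With the hard assignments of K-means each row is one-hot, as above; under EM's soft assignments the entries become probabilities that still sum to 1 across each row.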

### Minimizing the Cost Function

• E-step: minimize $J = \sum_{n}\sum_{k} r_{nk}\, \lVert x_n - \mu_k \rVert^2$ w.r.t. the responsibilities $r_{nk}$

• This assigns each data point to its nearest prototype

• M-step: minimize $J$ w.r.t. the prototypes $\mu_k$

• This gives $\mu_k = \dfrac{\sum_n r_{nk}\, x_n}{\sum_n r_{nk}}$

• Each prototype is set to the mean of the points in that cluster

• Convergence is guaranteed since there is only a finite number of possible settings for the responsibilities

### Limitations of K-means

• Hard assignments of data points to clusters: a small shift of a data point can flip it to a different cluster

• It is not clear how to choose the value of K, and the value must be chosen beforehand

• Solution: replace the 'hard' clustering of K-means with the 'soft' probabilistic assignments of EM

• Not robust to outliers: data points far from a centroid may pull it away from the true cluster center

### EM Algorithm – Informal Derivation

• Let us proceed by simply differentiating the log-likelihood

• Setting the derivative with respect to $\mu_k$ equal to zero gives $0 = \sum_n \gamma(z_{nk})\, \Sigma_k^{-1} (x_n - \mu_k)$, giving $\mu_k = \frac{1}{N_k} \sum_n \gamma(z_{nk})\, x_n$ with $N_k = \sum_n \gamma(z_{nk})$, which is simply the weighted mean of the data

### Ng, Jordan, Weiss Algorithm

• Form the matrix $L = D^{-1/2} A D^{-1/2}$, where $A$ is the affinity matrix and $D$ the diagonal degree matrix

• Find $x_1, \dots, x_k$, the $k$ largest eigenvectors of $L$

• These form the columns of the new matrix $X$

• Note: we have reduced the dimension from $n \times n$ to $n \times k$

### Ng, Jordan, Weiss Algorithm

• Form the matrix $Y$ by renormalizing each of $X$'s rows to have unit length: $Y_{ij} = X_{ij} \big/ \left(\sum_j X_{ij}^2\right)^{1/2}$

• Treat each row of $Y$ as a point in $\mathbb{R}^k$

• Cluster the rows into $k$ clusters via K-means

• Final cluster assignment: assign point $s_i$ to cluster $j$ iff row $i$ of $Y$ was assigned to cluster $j$ (a sketch of the full pipeline follows)
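Collecting both slides, the full pipeline might look like this sketch; I omit the paper's zeroing of A's diagonal for brevity, and `scipy.cluster.vq.kmeans2` stands in for the K-means step.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def njw_spectral_clustering(A, k):
    """Ng, Jordan & Weiss spectral clustering, as outlined above."""
    d = A.sum(axis=1)
    L = A / np.sqrt(np.outer(d, d))        # L = D^{-1/2} A D^{-1/2}
    eigvals, eigvecs = np.linalg.eigh(L)   # eigenvalues in ascending order
    X = eigvecs[:, -k:]                    # k largest eigenvectors as columns
    Y = X / np.linalg.norm(X, axis=1, keepdims=True)  # unit-length rows
    _, labels = kmeans2(Y, k, minit='++')  # cluster the rows of Y with K-means
    return labels

# Two well-separated blobs; the rows of Y for each blob collapse to nearly
# identical points on the unit sphere, so K-means separates them easily.
rng = np.random.default_rng(0)
pts = np.vstack([rng.normal(0, 0.3, (10, 2)), rng.normal(4, 0.3, (10, 2))])
sq = ((pts[:, None] - pts[None]) ** 2).sum(-1)
A = np.exp(-sq / (2 * 0.5 ** 2))
print(njw_spectral_clustering(A, k=2))
```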

### Reasoning for the Ng, Jordan, Weiss Approach

• If we eventually use K-means, why not just apply K-means to the original data?

• This method allows us to cluster non-convex regions

### User’s Prerogative

• Choice of $k$, the number of clusters

• Choice of the scaling factor $\sigma$ in the Gaussian affinity $A_{ij} = \exp(-\lVert s_i - s_j \rVert^2 / 2\sigma^2)$

• Realistically, search over $\sigma$ and pick the value that gives the tightest clusters (see the sketch below)

• Choice of clustering method
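A sketch of that search, using the within-cluster scatter of the embedded points as the "tightness" measure; the measure and the candidate σ values are my choices.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

rng = np.random.default_rng(4)
pts = np.vstack([rng.normal(0, 0.3, (15, 2)), rng.normal(3, 0.3, (15, 2))])
sq = ((pts[:, None] - pts[None]) ** 2).sum(-1)  # pairwise squared distances

def tightness(sigma, k=2):
    """Embed with scale sigma (NJW pipeline), cluster, and score the result
    by the within-cluster scatter of the embedded rows (smaller = tighter)."""
    A = np.exp(-sq / (2.0 * sigma ** 2))
    d = A.sum(axis=1)
    L = A / np.sqrt(np.outer(d, d))             # D^{-1/2} A D^{-1/2}
    X = np.linalg.eigh(L)[1][:, -k:]            # k largest eigenvectors
    Y = X / np.linalg.norm(X, axis=1, keepdims=True)
    centers, labels = kmeans2(Y, k, minit='++')
    return ((Y - centers[labels]) ** 2).sum()

sigmas = [0.1, 0.3, 1.0, 3.0]
print(min(sigmas, key=tightness))  # sigma giving the tightest clusters
```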

### Comparison of Methods

• Perona & Freeman

• For block diagonal affinity matrices, the first eigenvector finds points in the “dominant” cluster; not very consistent

• Shi & Malik

• The 2nd generalized eigenvector minimizes the affinity between groups divided by the affinity within each group; there is no guarantee of the optimal partition, and the relaxation imposes constraints

• Ng, Jordan, Weiss

• Again depends on choice of k

• Claim: effectively handles clusters whose overlap or connectedness varies across clusters

(Figures: three example affinity matrices, each shown with the Perona/Freeman first eigenvector and the Shi/Malik second generalized eigenvector.)