
Segmentation Techniques Luis E. Tirado PhD qualifying exam presentation Northeastern University



- Spectral Clustering
- Graph-cut
- Normalized graph-cut

- Expectation Maximization (EM) clustering


9/15/2014


- Graph G(V,E)
- Set of vertices and edges
- Numbers represent weights

- Graphs for Clustering
- Points are vertices
- Weights reduced with distance
- Segmentation: look for minimum cut in graph
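The slides only say that weights are reduced with distance. One common way to realize this, shown here as a minimal sketch, is a Gaussian kernel on Euclidean distance. The function name `gaussian_affinity`, the kernel form and the `sigma` parameter are assumptions for illustration, not taken from the presentation.

```python
import numpy as np

def gaussian_affinity(points, sigma=1.0):
    """Affinity A[i, j] = exp(-||p_i - p_j||^2 / (2 sigma^2)).

    Assumed kernel: the slides only state that weights fall off with distance.
    """
    points = np.asarray(points, dtype=float)
    diff = points[:, None, :] - points[None, :, :]   # pairwise differences
    sq_dist = np.sum(diff ** 2, axis=-1)             # squared Euclidean distances
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

# Two tight pairs of points that are far from each other
points = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
A = gaussian_affinity(points, sigma=1.0)
```

Nearby points get weights close to 1 and distant points get weights close to 0, so a minimum cut separates the two pairs.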


[Figure: example weighted graph with numeric edge weights, from Forsyth & Ponce]

- Graph-cut
- Undirected, weighted graph G = (V,E) as affinity matrix A
- Use eigenvectors for segmentation
- Assume k elements and c clusters
- Represent cluster n with vector $w_n$ of k components
- Values represent cluster association; normalize so that $w_n^T w_n = 1$

- Extract good clusters
- Select $w_n$ which maximizes $w_n^T A w_n$
- Solution is $A w_n = \lambda w_n$
- $w_n$ is an eigenvector of A; select eigenvector with largest eigenvalue
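As a rough illustration of the eigenvector step, the sketch below takes the eigenvector of a small affinity matrix with the largest eigenvalue and thresholds its entries to read off the dominant cluster. The function name `dominant_cluster`, the threshold rule and the toy matrix are assumptions made for this example.

```python
import numpy as np

def dominant_cluster(A, threshold=0.5):
    """Return the unit eigenvector of A with the largest eigenvalue and a
    boolean mask of elements strongly associated with it."""
    eigvals, eigvecs = np.linalg.eigh(A)   # symmetric A, eigenvalues ascending
    w = eigvecs[:, -1]                     # eigenvector of the largest eigenvalue
    # The sign of an eigenvector is arbitrary; make the largest entry positive.
    if w[np.argmax(np.abs(w))] < 0:
        w = -w
    # Assumed rule: keep elements whose weight exceeds a fraction of the maximum.
    members = w > threshold * w.max()
    return w, members

# Toy block-diagonal affinity: a tight group of three and a looser pair
A = np.array([
    [1.0, 0.9, 0.9, 0.0, 0.0],
    [0.9, 1.0, 0.9, 0.0, 0.0],
    [0.9, 0.9, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.5],
    [0.0, 0.0, 0.0, 0.5, 1.0],
])
w, members = dominant_cluster(A)
```

Here the group of three has the larger eigenvalue (2.8 versus 1.5), so only its elements receive non-zero weight in `w`.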


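- Normalized Cut
- Address drawbacks of graph-cut
- Define association between vertex subset A and full set V as: $assoc(A,V) = \sum_{u \in A,\, t \in V} w(u,t)$, where $w(u,t)$ is the weight of the edge between u and t
- Previously maximized assoc(A,A); now also wish to keep the cut small relative to assoc(A,V). Define normalized cut as: $Ncut(A,B) = \frac{cut(A,B)}{assoc(A,V)} + \frac{cut(A,B)}{assoc(B,V)}$, where $cut(A,B) = \sum_{u \in A,\, t \in B} w(u,t)$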


- Normalized Cuts Algorithm
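- Define $D$ as the diagonal matrix with $d(i,i) = \sum_j A(i,j)$, where A is the affinity matrix.
- Define vector x depicting cluster membership
- $x_i = 1$ if point i is in A, and -1 otherwise

- Define real approximation to x: $y$
- We now wish to minimize objective function: $\frac{y^T (D - A)\, y}{y^T D\, y}$
- This constitutes solving: $(D - A)\, y = \lambda D\, y$
- Solution is the generalized eigenvector with second smallest eigenvalue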
- If normcut is over some threshold, re-partition graph.
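The steps above can be sketched with NumPy alone by rewriting the generalized problem in symmetric form. The function names `ncut_bipartition` and `ncut_value`, the split of the real-valued solution at zero, and the toy graph are assumptions for illustration; other splitting points are possible.

```python
import numpy as np

def ncut_bipartition(A):
    """Two-way split from the generalized eigenvector of (D - A) y = lambda D y
    with the second smallest eigenvalue."""
    d = A.sum(axis=1)                       # degrees d(i, i)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Substituting z = D^{1/2} y gives the symmetric problem
    # D^{-1/2} (D - A) D^{-1/2} z = lambda z.
    L_sym = np.eye(len(d)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L_sym)   # eigenvalues ascending
    y = d_inv_sqrt * eigvecs[:, 1]             # map back: y = D^{-1/2} z
    return y, y > 0                            # assumed split point: zero

def ncut_value(A, in_a):
    """Ncut(A, B) = cut/assoc(A, V) + cut/assoc(B, V) for a boolean mask."""
    in_a = np.asarray(in_a, dtype=bool)
    cut = A[in_a][:, ~in_a].sum()
    return cut / A[in_a].sum() + cut / A[~in_a].sum()

# Two triangles joined by a single weak edge (weight 0.1)
A = np.array([
    [0.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 0.0],
])
y, in_a = ncut_bipartition(A)
value = ncut_value(A, in_a)
```

The sign of `y` is arbitrary, so which triangle ends up as `in_a` can differ between runs of different linear algebra backends; the partition itself is the same.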


- Expectation Maximization (EM) Algorithm
- Density estimation of data points in unsupervised setting
- Finds ML estimates when data depends on latent variables
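- E step – computes the expectation of the log-likelihood, treating the latent variables as if they were observed
- M step – computes ML estimates of parameters by maximizing the above expectation

- Start with Gaussian Mixture Model: $p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k)$
- Segmentation: reformulate as missing data problem
- Latent variable Z provides labeling

- Gaussian bivariate PDF: $\mathcal{N}(x \mid \mu, \Sigma) = \frac{1}{2\pi |\Sigma|^{1/2}} \exp\left(-\frac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu)\right)$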


- EM Process
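- Maximize log-likelihood function: $\ln p(X \mid \pi, \mu, \Sigma) = \sum_{n=1}^{N} \ln \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k)$
- Not trivial; introduce Z, and denote complete data $Y = [X^T Z^T]^T$:
- If the complete data were known, ML would be easy: $\ln p(X, Z \mid \pi, \mu, \Sigma) = \sum_{n=1}^{N} \sum_{k=1}^{K} z_{nk} \left[ \ln \pi_k + \ln \mathcal{N}(x_n \mid \mu_k, \Sigma_k) \right]$, where $z_{nk} = 1$ if point n belongs to component k and 0 otherwise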


- EM steps
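The equations of this slide did not survive extraction. As a hedged sketch of the standard E and M updates for a Gaussian mixture, the code below alternates between computing responsibilities and re-estimating weights, means and covariances. The names `gaussian_pdf` and `em_gmm`, the identity initial covariances, the small regularization term and the synthetic two-blob data are all assumptions for illustration.

```python
import numpy as np

def gaussian_pdf(X, mu, cov):
    """Multivariate Gaussian density evaluated at each row of X."""
    d = X.shape[1]
    diff = X - mu
    solved = np.linalg.solve(cov, diff.T).T            # cov^{-1} (x - mu)
    expo = -0.5 * np.sum(diff * solved, axis=1)
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(expo) / norm

def em_gmm(X, mu_init, n_iter=50):
    """EM for a Gaussian mixture; mu_init supplies one starting mean per component."""
    n, d = X.shape
    mu = np.array(mu_init, dtype=float)
    k = len(mu)
    cov = np.array([np.eye(d) for _ in range(k)])       # assumed initial covariances
    pi = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E step: responsibilities gamma[n, j] of component j for point n
        dens = np.stack(
            [pi[j] * gaussian_pdf(X, mu[j], cov[j]) for j in range(k)], axis=1)
        gamma = dens / dens.sum(axis=1, keepdims=True)
        # M step: re-estimate parameters from the weighted data
        Nk = gamma.sum(axis=0)
        mu = (gamma.T @ X) / Nk[:, None]
        for j in range(k):
            diff = X - mu[j]
            cov[j] = (gamma[:, j, None] * diff).T @ diff / Nk[j] + 1e-6 * np.eye(d)
        pi = Nk / n
    return pi, mu, cov, gamma

# Synthetic data: two well-separated blobs of 100 points each
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, size=(100, 2)),
               rng.normal(8.0, 0.5, size=(100, 2))])
pi, mu, cov, gamma = em_gmm(X, mu_init=[[1.0, 1.0], [6.0, 6.0]])
labels = gamma.argmax(axis=1)   # hard segmentation from soft responsibilities
```

Passing rough starting means reflects the point that initial conditions matter for EM; with poorly chosen starts the same code can converge slowly or to a poor local maximum.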


- For simple case like example of four Gaussians, both algorithms perform well, as can be seen from results
- From literature: (k = # of clusters)
- EM is good for small k; coarse segmentation for large k
- Needs to know number of components to cluster
- Initial conditions are essential; prior knowledge helps accelerate convergence and reach a local/global maximum of the likelihood

- Ncut gives good results for large k
- For fully connected graph, intensive space & computation time requirements

- Graph cut’s first eigenvector approach finds points in the ‘dominant’ cluster
- Not very consistent; literature advocates for normalized approach

- In the end, the choice is a tradeoff that depends on the source data


- J. Shi & J. Malik “Normalized Cuts and Image Segmentation”
- http://www.cs.berkeley.edu/~malik/papers/SM-ncut.pdf

- C. Bishop “Latent Variables, Mixture Models and EM”
- http://cmp.felk.cvut.cz/cmp/courses/recognition/Resources/_EM/Bishop-EM.ppt

- R. Nugent & L. Stanberry “Spectral Clustering”
- http://www.stat.washington.edu/wxs/Stat593-s03/Student-presentations/SpectralClustering2.ppt

- S. Candemir “Graph-based Algorithms for Segmentation”
- http://www.bilmuh.gyte.edu.tr/BIL629/special%20section-%20graphs/GraphBasedAlgorithmsForComputerVision.ppt

- W. H. Liao “Segmentation: Graph-Theoretic Clustering”
- http://www.cs.nccu.edu.tw/~whliao/acv2008/segmentation_by_graph.ppt

- D. Forsyth & J. Ponce “Computer Vision: A Modern Approach”

- Determine Euclidean distance of each object in data set to (randomly picked) center points
- Construct K clusters by assigning all points to closest cluster
- Move the center points to the real centers of the resulting clusters

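- Responsibilities assign data points to clusters: $r_{nk} \in \{0, 1\}$ such that $\sum_k r_{nk} = 1$
- Example: 5 data points and 3 clusters

- Cost function: $J = \sum_{n=1}^{N} \sum_{k=1}^{K} r_{nk} \, \| x_n - \mu_k \|^2$, with data $x_n$, prototypes $\mu_k$ and responsibilities $r_{nk}$

- E-step: minimize $J$ w.r.t. $r_{nk}$
- assigns each data point to nearest prototype

- M-step: minimize $J$ w.r.t. $\mu_k$
- gives $\mu_k = \frac{\sum_n r_{nk} \, x_n}{\sum_n r_{nk}}$
- each prototype set to the mean of points in that cluster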

- Convergence guaranteed since there is a finite number of possible settings for the responsibilities
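A minimal sketch of the two alternating K-means steps, using 5 data points and 3 clusters as in the example mentioned above. The function name `kmeans`, the specific coordinates and the choice of initial prototypes are made up for illustration.

```python
import numpy as np

def kmeans(X, mu_init, max_iter=100):
    """Alternate assignment (E-step) and mean update (M-step) until the
    assignments stop changing."""
    mu = np.array(mu_init, dtype=float)
    labels = None
    for _ in range(max_iter):
        # E-step: assign each data point to its nearest prototype
        sq_dist = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
        new_labels = sq_dist.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break                      # assignments unchanged: converged
        labels = new_labels
        # M-step: move each prototype to the mean of its assigned points
        for k in range(len(mu)):
            if np.any(labels == k):    # leave empty clusters where they are
                mu[k] = X[labels == k].mean(axis=0)
    J = ((X - mu[labels]) ** 2).sum()  # sum of squared distances to prototypes
    return labels, mu, J

# 5 data points, 3 clusters; prototypes start at three of the points
X = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0], [10.0, 0.0]])
labels, mu, J = kmeans(X, mu_init=X[[0, 2, 4]])
# labels -> [0 0 1 1 2], J -> 1.0
```

Because each step can only lower the cost and there are finitely many assignments, the loop stops after the second pass on this data.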

- Hard assignments of data points to clusters – small shift of a data point can flip it to a different cluster
- Not clear how to choose the value of K – and value must be chosen beforehand.
- Solution: replace ‘hard’ clustering of K-means with ‘soft’ probabilistic assignments of EM

- Not robust to outliers – data far from a centroid may pull it away from the true center.

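- Let us proceed by simply differentiating the log likelihood
- Setting derivative with respect to $\mu_k$ equal to zero gives $0 = \sum_{n} \gamma(z_{nk}) \, \Sigma_k^{-1} (x_n - \mu_k)$, where $\gamma(z_{nk}) = \frac{\pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k)}{\sum_j \pi_j \, \mathcal{N}(x_n \mid \mu_j, \Sigma_j)}$, giving $\mu_k = \frac{1}{N_k} \sum_{n} \gamma(z_{nk}) \, x_n$ with $N_k = \sum_n \gamma(z_{nk})$, which is simply the weighted mean of the data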

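- Form the matrix $L = D^{-1/2} A D^{-1/2}$, where A is the affinity matrix and D is the diagonal matrix with $D_{ii} = \sum_j A_{ij}$
- Find $x_1, x_2, \ldots, x_k$, the eigenvectors of L with the k largest eigenvalues
- These form the columns of the new matrix X
- Note: have reduced dimension from n×n to n×k

- Form the matrix Y
- Renormalize each of X’s rows to have unit length
- $Y_{ij} = X_{ij} / \left( \sum_j X_{ij}^2 \right)^{1/2}$

- Treat each row of Y as a point in $\mathbb{R}^k$
- Cluster into k clusters via K-means
- Final Cluster Assignment
- Assign point $s_i$ to cluster j iff row i of Y was assigned to cluster j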
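The procedure above can be sketched end to end on two concentric rings, a case where the clusters are not convex. The function names, the farthest-point initialization of K-means, the Gaussian affinity with zero diagonal and the value of `sigma` are assumptions chosen for this example.

```python
import numpy as np

def spectral_embedding(A, k):
    """Rows of Y: unit-normalized rows of the top-k eigenvectors of D^{-1/2} A D^{-1/2}."""
    d = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L)     # eigenvalues ascending
    X = eigvecs[:, -k:]                      # eigenvectors of the k largest eigenvalues
    return X / np.linalg.norm(X, axis=1, keepdims=True)

def kmeans_rows(Y, k, n_iter=20):
    """Plain K-means on the rows of Y with farthest-point initialization (assumed)."""
    centers = [Y[0]]
    for _ in range(1, k):
        dist = np.min([((Y - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(Y[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        sq = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = sq.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Y[labels == j].mean(axis=0)
    return labels

def ring(radius, n):
    """n evenly spaced points on a circle of the given radius."""
    t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    return np.column_stack([radius * np.cos(t), radius * np.sin(t)])

# Inner ring (40 points) and outer ring (80 points)
pts = np.vstack([ring(1.0, 40), ring(4.0, 80)])
sq = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(axis=2)
A = np.exp(-sq / (2.0 * 0.5 ** 2))           # assumed Gaussian affinity, sigma = 0.5
np.fill_diagonal(A, 0.0)

Y = spectral_embedding(A, k=2)
labels = kmeans_rows(Y, k=2)                  # point i gets the label of row i of Y
```

In the embedded space each ring collapses to roughly a single point, which is why K-means succeeds on Y even though the rings themselves cannot be separated by distance to a centroid.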

- If we eventually use K-means, why not just apply K-means to the original data?
- This method allows us to cluster non-convex regions

- Choice of k, the number of clusters
- Choice of scaling factor $\sigma^2$
- Realistically, search over $\sigma^2$ and pick value that gives the tightest clusters

- Choice of clustering method

- Perona & Freeman
- For block diagonal affinity matrices, the first eigenvector finds points in the “dominant” cluster; not very consistent

- Shi & Malik
- 2nd generalized eigenvector minimizes affinity between groups relative to affinity within each group; no guarantee, constraints

- Ng, Jordan, Weiss
- Again depends on choice of k
- Claim: effectively handles clusters whose overlap or connectedness varies across clusters

[Figures: three examples, each with panels showing the affinity matrix, the Perona/Freeman 1st eigenvector, and the Shi/Malik 2nd generalized eigenvector]