# Clustering appearance and shape by learning jigsaws Anitha Kannan, John Winn, Carsten Rother

##### Presentation Transcript

1. Clustering appearance and shape by learning jigsaws Anitha Kannan, John Winn, Carsten Rother

2. Models for Appearance and Shape • Histograms • discard spatial info • Templates • struggle with articulation, deformation, variation • Patch-based approaches • a happy medium • but size/shape of the patches is fixed

3. Jigsaw • Intended as a replacement for the fixed-patch model • Learn a jigsaw image such that: • Pieces are similar in appearance and shape to multiple regions in training image(s) • All training images can be approximately reconstructed using only pieces from the jigsaw • Pieces are as large as possible for a particular reconstruction accuracy

4. Jigsaw Model • $\mu(z)$ – intensity value at jigsaw pixel z • $\lambda^{-1}(z)$ – variance at z • $l(i)$ – offset between image pixel i and the corresponding jigsaw pixel

5. Generative Model

6. Generative Model • Each offset map entry is a 2D offset mapping point i in the image to point $z = (i - l(i)) \bmod |J|$ in the jigsaw, where $|J|$ = (jigsaw width, jigsaw height) • Product is over image pixels
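The offset lookup on this slide can be sketched in a few lines of NumPy. This is only an illustration: the function names, the `(row, col)` ordering of offsets, and the toy jigsaw are assumptions, not taken from the presentation.

```python
import numpy as np

def jigsaw_coords(offset_map, jigsaw_shape):
    """Map every image pixel i to its jigsaw pixel z = (i - l(i)) mod |J|.

    offset_map: int array of shape (H, W, 2) holding (row, col) offsets l(i).
    jigsaw_shape: (jigsaw_height, jigsaw_width); (row, col) order is an assumption.
    """
    h, w, _ = offset_map.shape
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    pixel_coords = np.stack([rows, cols], axis=-1)  # i for every image pixel
    return (pixel_coords - offset_map) % np.array(jigsaw_shape)

def reconstruct(jigsaw_mean, offset_map):
    """Paint each image pixel with the jigsaw mean it is mapped to."""
    z = jigsaw_coords(offset_map, jigsaw_mean.shape[:2])
    return jigsaw_mean[z[..., 0], z[..., 1]]

# Toy usage: a 3x4 jigsaw tiled over a 5x6 image with all-zero offsets.
jigsaw_mean = np.arange(12, dtype=float).reshape(3, 4)
offsets = np.zeros((5, 6, 2), dtype=int)
recon = reconstruct(jigsaw_mean, offsets)
```

The modulo makes the jigsaw wrap around, so an offset that points past one edge lands on the opposite side.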

7. Generative Model • E is the set of edges in a 4-connected grid, with nodes representing offset map values • γ influences the typical jigsaw piece size; set to 5 per channel • δ( true ) = 1, δ( false ) = 0
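The formula for this slide did not survive in the transcript. Assuming it is the usual Potts-style cost, where every pair of 4-connected neighbours with different offsets pays γ, the cost of an offset map could be computed as below. The function name and the toy offset map are hypothetical.

```python
import numpy as np

def potts_energy(offset_map, gamma=5.0):
    """Sum gamma * delta(l(i) != l(j)) over 4-connected neighbour pairs.

    Assumed Potts form; gamma=5.0 mirrors the per-channel value on the slide.
    """
    differs_vertical = np.any(offset_map[1:] != offset_map[:-1], axis=-1)
    differs_horizontal = np.any(offset_map[:, 1:] != offset_map[:, :-1], axis=-1)
    return gamma * float(differs_vertical.sum() + differs_horizontal.sum())

# Toy usage: a 4x4 image split into a left and a right piece.
offsets = np.zeros((4, 4, 2), dtype=int)
offsets[:, 2:] = (2, 1)
energy_two_pieces = potts_energy(offsets)                          # 4 boundary edges
energy_one_piece = potts_energy(np.zeros((4, 4, 2), dtype=int))    # no boundary
```

A larger γ makes every piece boundary more expensive, which is why it pushes the model toward larger pieces.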

8. Generative Model • $\mu_0 = 0.5$, $\beta = 1$, $b$ = 3 times the data precision, $a = b^2$ • Normal-Gamma prior allows for unused portions of the jigsaw to be well-defined

9. MAP Learning • Image set is known • Find J, Ls to maximize joint probability • Initialize jigsaw • Set precisions λ to expected value under the prior • Set means μ to Gaussian noise with same mean and variance as the data

10. MAP Learning • Iteration step 1: • Given J and $I_{1..N}$, update $L_{1..N}$ using the α-expansion graph-cut algorithm • Iteration step 2: • Given $L_{1..N}$ and $I_{1..N}$, update J • Repeat until convergence

11. α-expansion Graph-Cut • Start with arbitrary labeling f • Loop: • For each label α: • Find f' = arg min E(f') among f' within one α-expansion of f • If E(f') < E(f), set f := f' • If no label lowered E(f) during the full pass, return f; otherwise repeat the loop
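The loop above can be illustrated on a toy problem. In this sketch the graph cut that normally finds the best expansion move is replaced by exhaustive search, which is only feasible for a handful of nodes; the energy (unary costs plus a Potts term) and all names are assumptions made for the example.

```python
from itertools import product

def energy(labels, unary, edges, gamma):
    """Unary cost of each node's label plus gamma per edge with unequal labels."""
    data = sum(unary[i][labels[i]] for i in range(len(labels)))
    smooth = sum(gamma for i, j in edges if labels[i] != labels[j])
    return data + smooth

def best_expansion(labels, alpha, unary, edges, gamma):
    """Brute-force stand-in for the graph cut: try every alpha-expansion of labels."""
    best, best_e = list(labels), energy(labels, unary, edges, gamma)
    for switch in product([False, True], repeat=len(labels)):
        cand = [alpha if s else l for s, l in zip(switch, labels)]
        e = energy(cand, unary, edges, gamma)
        if e < best_e:
            best, best_e = cand, e
    return best, best_e

def alpha_expansion(labels, n_labels, unary, edges, gamma):
    labels = list(labels)
    current = energy(labels, unary, edges, gamma)
    while True:
        improved = False
        for alpha in range(n_labels):
            cand, e = best_expansion(labels, alpha, unary, edges, gamma)
            if e < current:
                labels, current, improved = cand, e, True
        if not improved:  # a full pass over all labels changed nothing
            return labels, current

# Toy usage: a 4-node chain where nodes 0-1 prefer label 0 and nodes 2-3 prefer label 1.
unary = [[0, 2], [0, 2], [2, 0], [2, 0]]
edges = [(0, 1), (1, 2), (2, 3)]
final_labels, final_energy = alpha_expansion([0, 0, 0, 0], 2, unary, edges, gamma=1)
```

In the jigsaw setting each label would be a candidate offset, so the number of labels equals the number of jigsaw pixels.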

12. Determining Jigsaw Pieces • For each image, define region boundaries as the places where the offset map changes value. • Each region thus maps to a contiguous area of the jigsaw. • Cluster regions based on overlap: • Ratio of intersection to union of the jigsaw pixels mapped to by the two regions • Each cluster corresponds to a jigsaw piece.
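As a rough sketch of the overlap measure, each region can be represented by the set of jigsaw pixels it maps to. The greedy grouping and the 0.5 threshold below are assumptions for illustration; the slide does not say which clustering procedure is used.

```python
def overlap_ratio(region_a, region_b):
    """Intersection over union of the jigsaw pixels covered by two regions."""
    a, b = set(region_a), set(region_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_regions(regions, threshold=0.5):
    """Greedy grouping (hypothetical): join the first cluster whose first member overlaps enough."""
    clusters = []  # each cluster is a list of region indices
    for idx, region in enumerate(regions):
        for cluster in clusters:
            if overlap_ratio(region, regions[cluster[0]]) >= threshold:
                cluster.append(idx)
                break
        else:
            clusters.append([idx])
    return clusters

# Toy usage: two regions landing on nearly the same jigsaw area, one elsewhere.
regions = [
    {(0, 0), (0, 1), (1, 0), (1, 1)},
    {(0, 0), (0, 1), (1, 0)},
    {(5, 5), (5, 6)},
]
pieces = cluster_regions(regions)
```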

13. Toy Example

14. Epitome • Another unfixed patch-based generative model • Patches have fixed size and shape, but not location • Patches can be subdivided (24x24, 12x12, 8x8) • Patches can overlap (average value taken) • Cannot capture occlusion w/o a shape model

15. Jigsaw vs. Epitome

16. Jigsaw for Multiple Images

17. Unsupervised Part Learning

18. The Good • Jigsaw allows automatically sized patches • Occlusion is modeled implicitly, i.e. patch shape is variable • Image segmentation is automatic • Unsupervised part learning an easy next step • Jigsaw reconstructions more accurate and better looking than equivalently sized Epitome model reconstructions

19. The Bad • At each iteration, must solve a binary graph cut for each jigsaw pixel • 30 minutes to learn 36x36 jigsaw from 150x150 toy image • No patch transformation • Can add specific transformations with linear cost increase • Can favor “similar” neighboring offsets in addition to identical ones

20. The Questions?

21. Normalized Cuts and Image Segmentation Jianbo Shi and Jitendra Malik

22. Recursive Partitioning • Segmentation/partitioning inherently hierarchical • Image segmentation from low-level cues should sequentially build hierarchical partitions • Partitioning done big-picture downward • Mid- and high-level knowledge can confirm groups or identify repartitioning candidates

23. Graph Theoretic Approach • Set of points represented as a weighted undirected graph G = (V,E) • Each point is a node; G is fully connected • w(i,j) is a function of the similarity between i and j • Find a partition of vertices into disjoint sets where by some measure in-set similarity is high, but cross-set similarity is low.

24. Minimum Graph Cut • Dissimilarity between two disjoint sets of vertices can be measured as the total weight of the edges removed: $\mathrm{cut}(A,B) = \sum_{u \in A,\, v \in B} w(u,v)$ • The minimum cut defines an optimal bipartitioning • Can use minimum cut for point clustering

25. Minimum Cut Bias • Minimum cut favors small partitions • cut(A,B) increases with the number of edges between A and B • With w(i,j) inversely proportional to dist(i,j), the cut that isolates the single node n1 (B = {n1}) is the minimum cut.

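26. Normalized Cut • Measure cut cost as a fraction of total edge connections to all nodes: $\mathrm{Ncut}(A,B) = \frac{\mathrm{cut}(A,B)}{\mathrm{assoc}(A,V)} + \frac{\mathrm{cut}(A,B)}{\mathrm{assoc}(B,V)}$, where $\mathrm{assoc}(A,V) = \sum_{u \in A,\, t \in V} w(u,t)$ • Any cut that partitions out small isolated points will have cut(A,B) close to assoc(A,V) for the small set A, so its Ncut value stays large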
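The bias of the plain minimum cut, and how normalization removes it, can be checked by brute force on a tiny point set. The example below is an assumption made for illustration: eight points on a line in two groups of four, with weights 1/distance, and every bipartition enumerated exhaustively.

```python
from itertools import product
import numpy as np

# Two groups of four points on a line (hypothetical data), weights = 1 / distance.
positions = np.array([0, 1, 2, 3, 6, 7, 8, 9], dtype=float)
dist = np.abs(positions[:, None] - positions[None, :])
W = np.zeros_like(dist)
W[dist > 0] = 1.0 / dist[dist > 0]

def cut(W, in_a):
    """Total weight of edges between A (mask True) and B (mask False)."""
    return W[np.ix_(in_a, ~in_a)].sum()

def ncut(W, in_a):
    """cut(A,B)/assoc(A,V) + cut(A,B)/assoc(B,V)."""
    degree = W.sum(axis=1)
    c = cut(W, in_a)
    return c / degree[in_a].sum() + c / degree[~in_a].sum()

def all_bipartitions(n):
    """Every split into two non-empty sets; node 0 is always placed in A."""
    for rest in product([False, True], repeat=n - 1):
        in_a = np.array((True,) + rest)
        if not in_a.all():
            yield in_a

min_cut_split = min(all_bipartitions(len(W)), key=lambda m: cut(W, m))
min_ncut_split = min(all_bipartitions(len(W)), key=lambda m: ncut(W, m))
```

On this data the minimum cut peels off a single end point, while the minimum Ncut separates the two groups of four.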

27. Normalized Association • Can also use assoc to measure similarity within groups • Minimizing Ncut equivalent to maximizing Nassoc • Makes minimizing Ncut a very good partitioning criterion
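The equivalence between minimizing Ncut and maximizing Nassoc follows in a few lines. The derivation below assumes the standard definition $\mathrm{Nassoc}(A,B) = \frac{\mathrm{assoc}(A,A)}{\mathrm{assoc}(A,V)} + \frac{\mathrm{assoc}(B,B)}{\mathrm{assoc}(B,V)}$, which is not spelled out in the transcript.

```latex
\begin{aligned}
\mathrm{cut}(A,B) &= \mathrm{assoc}(A,V) - \mathrm{assoc}(A,A)
                   = \mathrm{assoc}(B,V) - \mathrm{assoc}(B,B) \\
\mathrm{Ncut}(A,B) &= \frac{\mathrm{cut}(A,B)}{\mathrm{assoc}(A,V)}
                    + \frac{\mathrm{cut}(A,B)}{\mathrm{assoc}(B,V)} \\
                   &= \frac{\mathrm{assoc}(A,V) - \mathrm{assoc}(A,A)}{\mathrm{assoc}(A,V)}
                    + \frac{\mathrm{assoc}(B,V) - \mathrm{assoc}(B,B)}{\mathrm{assoc}(B,V)} \\
                   &= 2 - \mathrm{Nassoc}(A,B)
\end{aligned}
```

Since the two quantities differ only by a constant and a sign, a partition that minimizes one maximizes the other.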

28. Minimizing Ncut is NP-Complete • Reformulate problem: • For i in V, $x_i = 1$ if i is in A, $-1$ otherwise • $d_i = \sum_j w(i,j)$

29. Reformulation (cont.) • Let D be an N×N diagonal matrix with d on the diagonal • Let W be an N×N symmetric matrix with $W(i,j) = w_{ij}$ • Let $\mathbf{1}$ be an N×1 vector of ones • $k = \sum_{x_i > 0} d_i / \sum_i d_i$ • $b = k/(1-k)$ • $y = (\mathbf{1} + x) - b(\mathbf{1} - x)$

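30. Reformulation (cont.) • With these definitions, $\min_x \mathrm{Ncut}(x) = \min_y \frac{y^T (D - W) y}{y^T D y}$, subject to $y_i \in \{2, -2b\}$ (equivalently $\{1, -b\}$ up to scale) and $y^T D \mathbf{1} = 0$ • This is a Rayleigh quotient • By allowing y to take on real values, can minimize this by solving the generalized eigenvalue system $(D - W)y = \lambda D y$. • But what about the two constraints on y?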

31. First Constraint • Transform the generalized eigensystem into a standard one: $D^{-1/2}(D - W)D^{-1/2} z = \lambda z$, where $z = D^{1/2} y$ • $z_0 = D^{1/2}\mathbf{1}$ is an eigenvector with eigenvalue 0. Since $D^{-1/2}(D - W)D^{-1/2}$ is symmetric positive semidefinite, $z_0$ is the eigenvector with the smallest eigenvalue, and all eigenvectors are perpendicular to each other.

32. First Constraint (cont.) • Translating this back to the general eigensystem: • $y_0 = \mathbf{1}$ is the eigenvector with the smallest eigenvalue, 0 • $0 = z_1^T z_0 = y_1^T D \mathbf{1}$, where $y_1$ is the eigenvector with the second smallest eigenvalue
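These properties are easy to verify numerically. The sketch below assumes a small dense weight matrix (two tightly linked pairs, weakly linked to each other) and uses `numpy.linalg.eigh` on the symmetric standard form; the matrix and function name are illustrative only.

```python
import numpy as np

def generalized_eigenpairs(W):
    """Solve (D - W) y = lambda D y via the standard form D^-1/2 (D - W) D^-1/2 z = lambda z."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # D^-1/2 (D - W) D^-1/2 simplifies to I - D^-1/2 W D^-1/2.
    standard = np.eye(len(d)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    eigvals, z = np.linalg.eigh(standard)  # eigenvalues in ascending order
    y = d_inv_sqrt[:, None] * z            # y = D^-1/2 z, one eigenvector per column
    return eigvals, y

# Hypothetical 4-node graph: nodes {0,1} and {2,3} are tightly linked pairs.
W = np.array([
    [0.0, 1.0, 0.1, 0.1],
    [1.0, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.0, 1.0],
    [0.1, 0.1, 1.0, 0.0],
])
eigvals, y = generalized_eigenpairs(W)
d = W.sum(axis=1)
y0, y1 = y[:, 0], y[:, 1]
```

Here `y0` is constant, `y1` is orthogonal to `D1`, and the signs of `y1` already separate the two pairs.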

33. First Constraint (cont.) • Since we are minimizing a Rayleigh quotient with a symmetric matrix, we use the following property: under the constraint that x is orthogonal to the j-1 smallest eigenvectors $x_1, \ldots, x_{j-1}$, the quotient is minimized by $x_j$, with the eigenvalue $\lambda_j$ being the minimum value.

34. Real-valued Solution • $y_1$ is thus the real-valued solution for a minimal Ncut. • We cannot force a discrete solution – relaxing the second constraint makes this problem tractable. • Can transform $y_1$ into a discrete solution by finding the splitting point such that the resulting partition has the best Ncut(A,B) value.

35. Lanczos Method • Graphs are often only locally connected, so the resulting eigensystems are very sparse • Only the top few eigenvectors are needed for graph partitioning • Need very little precision in resulting eigenvectors • These properties are exploited by using the Lanczos method; running time approximately $O(n^{3/2})$

36. Recursive Partitioning redux • After partitioning, the algorithm can be run recursively on each partitioned part • Recursion stops once the Ncut value exceeds a certain limit, or result is “unstable” • When subdividing an image with no clear way of breaking it, eigenvector will resemble a continuous function • Construct a histogram of eigenvector values – if the ratio of minimum to maximum bin size exceeds 0.06, reject partitioning
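The histogram test could look roughly like the following. The number of bins is an assumption (the slide gives only the 0.06 ratio), and the two example vectors are synthetic stand-ins for a clean split and a smooth, ambiguous one.

```python
import numpy as np

def is_unstable(eigvec, n_bins=10, max_ratio=0.06):
    """Reject a cut when the eigenvector's histogram is too flat.

    n_bins is a hypothetical choice; max_ratio=0.06 follows the slide.
    """
    counts, _ = np.histogram(eigvec, bins=n_bins)
    return bool(counts.min() / counts.max() > max_ratio)

# A clean two-cluster eigenvector: values pile up at the two ends.
bimodal = np.concatenate([np.full(50, -1.0), np.full(50, 1.0)])
# An eigenvector resembling a continuous function: values spread evenly.
ramp = np.linspace(-1.0, 1.0, 100)

accept_bimodal = not is_unstable(bimodal)
reject_ramp = is_unstable(ramp)
```

A flat histogram means there is no natural gap at which to place the splitting point, so many thresholds give similar Ncut values.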

37. Simultaneous K-Way Cut • Since all eigenvectors will be perpendicular, can use third, fourth, etc. smallest to immediately subdivide partitions • Some such eigenvectors would have failed the stability criteria • Can use top n eigenvectors to partition, then iteratively merge segments • Mentioned by the paper, but no experimental results presented

38. Recursive Two-Way Ncut Algorithm • Given a set of features, construct weighted graph G, summarize information into W and D • Solve (D – W)x = λDx for the eigenvectors with the smallest eigenvalues • Find the splitting point in $x_1$, the eigenvector with the second smallest eigenvalue, and bipartition the graph • Check the stability of the cut and the value of Ncut • Recursively repartition segmented parts if necessary
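Put together, the steps above might be sketched as follows. This is a simplified illustration, not the authors' implementation: it uses a dense eigensolver instead of Lanczos, tries every gap between eigenvector values as a splitting point, omits the stability check, and runs on hypothetical 1D data with an arbitrary Ncut limit of 0.6.

```python
import numpy as np

def ncut_value(W, in_a):
    d = W.sum(axis=1)
    cut = W[np.ix_(in_a, ~in_a)].sum()
    return cut / d[in_a].sum() + cut / d[~in_a].sum()

def two_way_ncut(W):
    """Return (mask, Ncut) for the best threshold split of the second eigenvector."""
    d = W.sum(axis=1)
    s = 1.0 / np.sqrt(d)
    _, vecs = np.linalg.eigh(np.eye(len(d)) - s[:, None] * W * s[None, :])
    y1 = s * vecs[:, 1]  # second smallest generalized eigenvector
    best_mask, best_val = None, np.inf
    for t in np.unique(y1)[:-1]:  # every threshold that leaves both sides non-empty
        mask = y1 > t
        val = ncut_value(W, mask)
        if val < best_val:
            best_mask, best_val = mask, val
    return best_mask, best_val

def recursive_ncut(W, nodes, max_ncut):
    """Split `nodes` recursively until the best cut is no longer cheap enough."""
    nodes = np.asarray(nodes)
    if len(nodes) < 2:
        return [[int(n) for n in nodes]]
    mask, val = two_way_ncut(W[np.ix_(nodes, nodes)])
    if mask is None or val > max_ncut:
        return [[int(n) for n in nodes]]
    return (recursive_ncut(W, nodes[mask], max_ncut)
            + recursive_ncut(W, nodes[~mask], max_ncut))

# Toy usage: two groups of points on a line, weights = 1 / distance.
positions = np.array([0, 1, 2, 3, 6, 7, 8, 9], dtype=float)
dist = np.abs(positions[:, None] - positions[None, :])
W = np.zeros_like(dist)
W[dist > 0] = 1.0 / dist[dist > 0]
segments = recursive_ncut(W, np.arange(len(W)), max_ncut=0.6)
```

The first split separates the two groups; further splits inside each group are rejected because their Ncut values exceed the limit.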

39. Weighting Schemes • X(i) is the spatial location of node i • F(i) is a feature vector defined as • F(i) = 1, for point sets • F(i) = I(i), the intensity value, for brightness • F(i) = [v, v*s*sin(h), v*s*cos(h)](i), where h, s, v are the HSV values, for color segmentation • F(i) = $[|I * f_1|, \ldots, |I * f_n|](i)$, where the $f_i$ are DOOG filters, in the case of texture segmentation
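The weight formula itself is missing from the transcript. A common form, assumed here, multiplies a Gaussian on feature difference by a Gaussian on spatial distance and zeroes the weight beyond a radius r. The sketch below applies that assumed form to brightness features; the parameter values and function name are hypothetical.

```python
import numpy as np

def brightness_weights(image, sigma_i=0.1, sigma_x=4.0, r=5.0):
    """Assumed form: w_ij = exp(-|F_i - F_j|^2 / sigma_i^2) * exp(-|X_i - X_j|^2 / sigma_x^2)
    when |X_i - X_j| < r, and 0 otherwise. F is pixel intensity, X is pixel position."""
    h, w = image.shape
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    X = np.stack([rows.ravel(), cols.ravel()], axis=1).astype(float)
    F = image.ravel().astype(float)
    dist2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    feat2 = (F[:, None] - F[None, :]) ** 2
    W = np.exp(-feat2 / sigma_i**2) * np.exp(-dist2 / sigma_x**2)
    W[dist2 >= r**2] = 0.0  # only nearby pixels are connected
    return W

# Toy usage: a 4x4 image that is dark on the left and bright on the right.
image = np.zeros((4, 4))
image[:, 2:] = 1.0
W = brightness_weights(image, r=2.0)
same_side = W[0, 1]      # pixels (0,0) and (0,1), equal brightness
across_edge = W[1, 2]    # pixels (0,1) and (0,2), different brightness
```

The radius cutoff is what makes W sparse for real images, which the Lanczos step relies on.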

40. Brightness Segmentation • Image sized 80x100, intensity normalized to lie in [0,1]. Partitions with Ncut value less than 0.04.

41. Brightness Segmentation • 126x106 weather radar image. Ncut value less than 0.08.

42. Color Segmentation • 77x107 color image (reproduced in grayscale in the paper). Ncut value less than 0.04.

43. Texture Segmentation • Texture features correspond to DOOG filters at six orientations and five scales.

44. Motion Segmentation • Treat the image sequence as a spatiotemporal data set. • Weighted graph is constructed by taking all pixels as nodes and connecting spatiotemporal neighbors. • d(i,j) represents “motion distance” between pixels i and j.

45. Motion Distance • Defined as one minus the cross correlation of motion profiles, where the motion profile estimates the probability distribution of image velocity at each pixel.
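As a minimal sketch, a motion profile can be stored as a probability vector over candidate displacements, and the cross correlation taken as the sum of elementwise products. The three-bin profiles below are made up for illustration.

```python
import numpy as np

def motion_distance(profile_i, profile_j):
    """One minus the cross correlation of two motion profiles.

    Each profile is assumed to be a probability distribution over the same
    set of candidate displacements.
    """
    return 1.0 - float(np.sum(np.asarray(profile_i) * np.asarray(profile_j)))

# Hypothetical profiles over three candidate displacements.
moving_left = np.array([1.0, 0.0, 0.0])
moving_right = np.array([0.0, 0.0, 1.0])
uncertain = np.array([1 / 3, 1 / 3, 1 / 3])  # e.g. a textureless region

same_motion = motion_distance(moving_left, moving_left)
opposite_motion = motion_distance(moving_left, moving_right)
two_uncertain = motion_distance(uncertain, uncertain)
```

Two pixels with identical but flat profiles still end up far apart, so confident motion estimates carry more weight in the grouping than ambiguous ones.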

46. Motion Segmentation Results • Above: two consecutive frames • The head and body have similar motion but dissimilar motion profiles due to 2D textures.

47. Questions?