Order Structure, Correspondence, and Shape Based Categories

By Stefan Carlsson. Presented by Piotr Dollar, October 24, 2002

Visual Correspondence • No “correct” answer • Humans Exceptionally Good • Tell & Carlsson (later work) • Object Recognition

Basic Idea • What are the two groups here? • What makes these Similar?

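Basic Idea (cont.) • Select points • Take the tangent lines at those points • Use the structure and location of the points and lines • This structure is preserved by a class of transformations more general than the affine transformations and the projective transformations that preserve the side from which the image is taken.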

Order Types Fundamentally different Structures

Interlude: Homogeneous Coordinates • Points and lines have the same representation (a 3-element vector) • For a point: [x y 1] (x and y coordinates) • For a line: l = [u v -d], the set of points x s.t. lT.x = 0, i.e. ux + vy - d = 0. This line is orthogonal to the vector [u v] • Every point can be thought of as a line and vice versa (normalize lines to [u v 1])
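A minimal Python sketch of the incidence test on this slide, assuming points and lines are stored as plain 3-tuples. The helper names (`point`, `line`, `dot`) and the example line x + y = 2 are illustrative choices, not taken from the talk.

```python
def point(x, y):
    """Homogeneous representation [x, y, 1] of a 2D point."""
    return (x, y, 1)

def line(u, v, d):
    """Homogeneous representation [u, v, -d] of the line ux + vy = d."""
    return (u, v, -d)

def dot(a, b):
    """Inner product of two 3-element vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

# Example line x + y = 2, with two points on it and one point off it.
l = line(1, 1, 2)
on_a, on_b, off = point(2, 0), point(0, 2), point(3, 3)

incidence_on = (dot(l, on_a), dot(l, on_b))   # both zero: the points lie on the line
incidence_off = dot(l, off)                   # nonzero: the point is off the line

# The vector [u, v] is orthogonal to the direction between two points on the line.
direction = (on_b[0] - on_a[0], on_b[1] - on_a[1])
normal_dot_direction = l[0] * direction[0] + l[1] * direction[1]
```

Because points and lines share one representation, the same `dot` function serves both roles; only the interpretation of the three components differs.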

Order type: Points • Order structure for points is orientation • Three points may be collinear, or traversing them in order can give a clockwise or counterclockwise rotation • Math: given three points x1, x2 and x3 in homogeneous coordinates, the determinant of the matrix [ x1 x2 x3 ] (one point per column) is 0 if the points are collinear, negative if they define a clockwise rotation, positive otherwise. • Why: [show informal proof if time] • For >3 points, take all subsets of 3 points and get each subset's orientation. Maintain a consistent ordering of the points.
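The determinant test can be sketched in a few lines of Python. This is only an illustration under the slide's convention (positive determinant means counterclockwise); the function names `det3`, `orientation` and `all_orientations` are made up for this example.

```python
from itertools import combinations

def det3(a, b, c):
    """Determinant of the 3x3 matrix whose columns are a, b and c."""
    return (a[0] * (b[1] * c[2] - c[1] * b[2])
            - b[0] * (a[1] * c[2] - c[1] * a[2])
            + c[0] * (a[1] * b[2] - b[1] * a[2]))

def orientation(p, q, r):
    """+1 for counterclockwise, -1 for clockwise, 0 for collinear (x, y) points."""
    d = det3((p[0], p[1], 1), (q[0], q[1], 1), (r[0], r[1], 1))
    return (d > 0) - (d < 0)

def all_orientations(points):
    """Orientation of every 3-subset, keeping the given numbering of the points."""
    return {(i, j, k): orientation(points[i], points[j], points[k])
            for i, j, k in combinations(range(len(points)), 3)}

# A square numbered counterclockwise: every triple taken in index order turns left.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
square_signs = all_orientations(square)
```

Swapping two points in a triple flips the sign, which is why the numbering has to be kept consistent across subsets.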

Order type: lines • Just as we took the orientation of 3 points, we can get the orientation of 3 lines • Since each line corresponds to a unique point (scale the line to [ x y 1]), taking the sign of the determinant of the matrix of three lines gives us their orientation • Again, for multiple lines take each ordered subset of 3 lines
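As a rough sketch of the same idea for lines, the Python below scales each line vector so its last component is 1 and then reuses the determinant sign. It assumes no line passes through the origin (otherwise the last component is 0 and this normalization is undefined); the names `normalize_last` and `line_orientation` are illustrative.

```python
def normalize_last(l):
    """Scale a line vector so that its last component is 1 (assumes l[2] != 0)."""
    return (l[0] / l[2], l[1] / l[2], 1.0)

def det3(a, b, c):
    """Determinant of the 3x3 matrix whose columns are a, b and c."""
    return (a[0] * (b[1] * c[2] - c[1] * b[2])
            - b[0] * (a[1] * c[2] - c[1] * a[2])
            + c[0] * (a[1] * b[2] - b[1] * a[2]))

def line_orientation(l1, l2, l3):
    """Sign of the determinant of three normalized line vectors."""
    d = det3(normalize_last(l1), normalize_last(l2), normalize_last(l3))
    return (d > 0) - (d < 0)

# Three example lines in [u, v, -d] form: x = 1, y = 1 and x + y = 3.
lx, ly, lxy = (1, 0, -1), (0, 1, -1), (1, 1, -3)
sign_forward = line_orientation(lx, ly, lxy)
sign_swapped = line_orientation(ly, lx, lxy)   # swapping two lines flips the sign
```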

Order type: lines (II) • Actually, Carlsson normalizes the middle component of the line vector (i.e. it becomes v = [ a 1 b ]). • This is nice because v then defines the set of points x s.t. vT.x = 0, or in other words ax + y + b = 0. Useful for combining points and lines. • The basic idea, however, is the same.

Order type: Points & Lines • Since the defining vector v for a line now always points up ([a 1 b]), we can think of the line as having a direction (say always right). • Each point in the plane now falls to the left or right of the line (or on the line). • Can use the sign of vT.x. Let y2 = -ax - b (y2 is the y coordinate of the point with x coordinate x that falls on the line). Now, if y > y2 (i.e. the point falls to the left of the line), then ax + y + b > 0, and likewise y < y2 implies vT.x < 0. • Hence we can get the orientation of a point relative to a line.
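A small Python sketch of the point-versus-line test, taking the convention that a positive value of vT.x means the point lies above the line (y > y2) and a negative value means it lies below. The function name `side_of_line` and the example line y = x are illustrative.

```python
def side_of_line(a, b, x, y):
    """Sign of vT.x for the line v = [a, 1, b] and the point [x, y, 1].

    +1: the point is above the line (y > -a*x - b)
    -1: the point is below the line
     0: the point is on the line
    """
    value = a * x + y + b
    return (value > 0) - (value < 0)

# The line y = x written as ax + y + b = 0 has a = -1, b = 0.
above = side_of_line(-1, 0, 0, 1)   # point (0, 1)
below = side_of_line(-1, 0, 1, 0)   # point (1, 0)
on = side_of_line(-1, 0, 2, 2)      # point (2, 2)
```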

Order type: “0” • The orientation “0” for 3 points (i.e. they are collinear) is an unstable orientation. By this I mean that a minuscule change in the alignment of the three points will cause the orientation to go to either +1 or -1. • Good idea to not use collinear points. What does Carlsson do? Hmm… • One could also try to filter against noise by discarding points that are nearly collinear, i.e. have a determinant that is within some epsilon of 0.
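The epsilon filter suggested in the last bullet might look like the Python sketch below. It is the presenter's suggestion rather than anything attributed to Carlsson, and the threshold `eps` is an arbitrary assumption: the determinant equals twice the signed triangle area, so a sensible value depends on the scale of the coordinates.

```python
from itertools import combinations

def orient_det(p, q, r):
    """Determinant of [p q r] in homogeneous coordinates, expanded for (x, y) points."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])

def stable_orientation(p, q, r, eps=1e-6):
    """Orientation sign, or 0 when the triple is within eps of being collinear."""
    d = orient_det(p, q, r)
    if abs(d) < eps:
        return 0
    return 1 if d > 0 else -1

def stable_triples(points, eps=1e-6):
    """Index triples whose orientation is not within eps of collinear."""
    return [t for t in combinations(range(len(points)), 3)
            if stable_orientation(*(points[i] for i in t), eps=eps) != 0]

pts = [(0.0, 0.0), (1.0, 0.0), (2.0, 1e-9), (2.0, 1.0)]
kept = stable_triples(pts)   # drops only the nearly collinear triple (0, 1, 2)
```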

Combining the information • So now for each set of points, lines, and points and lines we can get a series of binary values that contain information about their structure. • What information is useful? • Specifically: How do we filter out dependencies that arose because of how we numbered our lines and points?

Equivalence Classes for Points • The initial ordering of the points determines their orientation. In fact, all sets of 3 non-collinear points have the same “structure”, even though they can have different orientations. • For sets of >3 points, however, there are different structures. • We can use the information from the orientation of each subset of 3 points to deduce the actual equivalence class for the points.

Equivalence Classes for Points (II) • Two Equivalence classes

Equivalence Classes for Points (III) • Thus we can learn the equivalence class or the order type of a set of points. • Three different order types exist for 5 points
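One way to see the classes concretely is sketched below in Python. It relies on a standard fact not stated on the slides: for points in general position, the two classes for 4 points and the three classes for 5 points are distinguished by how many points lie on the convex hull. This is a shortcut for illustration, not necessarily how Carlsson derives the class from the triple orientations, and the names `in_triangle` and `hull_size` are made up.

```python
from itertools import combinations

def orient(p, q, r):
    """+1 counterclockwise, -1 clockwise, 0 collinear."""
    d = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (d > 0) - (d < 0)

def in_triangle(p, a, b, c):
    """True if p is strictly inside triangle abc (all three orientations agree)."""
    s1, s2, s3 = orient(a, b, p), orient(b, c, p), orient(c, a, p)
    return s1 != 0 and s1 == s2 == s3

def hull_size(points):
    """Number of points not strictly inside any triangle of the other points."""
    count = 0
    for i, p in enumerate(points):
        others = [q for j, q in enumerate(points) if j != i]
        if not any(in_triangle(p, a, b, c) for a, b, c in combinations(others, 3)):
            count += 1
    return count

# One example per order type for 5 points (hull sizes 5, 4 and 3).
pentagon = [(0, 0), (2, 0), (3, 2), (1, 3), (-1, 2)]
quad_plus_one = [(0, 0), (4, 0), (4, 4), (0, 4), (1, 2)]
tri_plus_two = [(0, 0), (6, 0), (0, 6), (1, 2), (2, 1)]
sizes = [hull_size(s) for s in (pentagon, quad_plus_one, tri_plus_two)]
```

Note that `in_triangle` uses nothing but triple orientations, so the class is indeed recoverable from the orientation information alone.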

Canonical Orderings • For each equivalence class, we define an arbitrary but fixed numbering scheme. This is the “canonical ordering”. • If cyclic equivalence class, we will call the leftmost point #1. • So, for five points:

Combining the information (II) • Now we have a fixed numbering scheme! • Using this numbering scheme (both for the points and their corresponding tangent lines), we now get the orientation information for the sets of lines and for each pair of a point and a line. • This information is now directly useful. No false orientation from the initial order of the points. • Note that we used the point ordering differently from the line and point/line ordering.

Combining the information (III) • For any subset of points, we can get a single number representing their structure. We call this the order structure index. • Two different sets of points with the same order structure index are very similar. We use this as the basis for our algorithm.

The Algorithm • The algorithm is very simple. It is divided into two stages: model building and actual indexing. • The two stages are similar, except that in the first we use the first image to build a table, and in the second we use the second image and the table to vote on point correspondences.

The Algorithm: Model Building For each subset of 5 points: • Compute the order structure index (OSI) for the points. • The OSI identifies a location in the table. Place the 5 point set at that location in the table.

The Algorithm: Indexing For each subset of 5 points (p1,…,p5): • Compute the OSI for the points, and the index in the table that the OSI indicates. • For every set of points (m1,…,m5) at that index, give one vote for p1 corresponding to m1, p2 to m2, and so on. This gives us a table that indicates how strongly points correspond.
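The two stages can be sketched in Python as follows. This is a heavily simplified stand-in, not Carlsson's method: the function `signature` replaces the real OSI with the ten triple orientations of the five points in their given numbering, so it skips the canonical ordering and the tangent-line information described earlier. All names here are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def orient(p, q, r):
    """+1 counterclockwise, -1 clockwise, 0 collinear."""
    d = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (d > 0) - (d < 0)

def signature(pts):
    """Simplified stand-in for the OSI (assumption): orientations of all 10 triples."""
    return tuple(orient(pts[i], pts[j], pts[k])
                 for i, j, k in combinations(range(5), 3))

def build_model(model_points):
    """Model building: file every 5-subset of indices under its signature."""
    table = defaultdict(list)
    for idx in combinations(range(len(model_points)), 5):
        table[signature([model_points[i] for i in idx])].append(idx)
    return table

def vote(table, image_points, n_model):
    """Indexing: votes[p][m] counts how often image point p is paired with model point m."""
    votes = [[0] * n_model for _ in image_points]
    for idx in combinations(range(len(image_points)), 5):
        key = signature([image_points[i] for i in idx])
        for m_idx in table.get(key, []):
            for p, m in zip(idx, m_idx):
                votes[p][m] += 1
    return votes

model = [(0, 0), (4, 1), (5, 4), (2, 6), (-1, 3), (2, 2)]
image = [(x + 10, y - 3) for x, y in model]   # translated copy, same numbering
table = build_model(model)
votes = vote(table, image, len(model))
```

In this toy case each image point belongs to five of the six subsets, and every subset matches its own translated copy, so each diagonal entry `votes[i][i]` receives at least five votes.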

Results: (normalized vote counts)

Discussion • Using point structure and tangent lines seemed to give good results. • This is just one of many possible schemes for extracting feature relations. • If n is the number of points extracted from an image, there are C(n,5) five-point subsets, so the running time grows at least as fast as n^5. Thus this algorithm is useful only for a small number of points (the authors say 25-30). • Also used for object recognition with multiple models: to classify an image, choose the model with the highest number of point correspondences.
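To give a feel for the growth, the short Python snippet below counts the 5-point subsets for a few values of n. The counts are plain binomial coefficients and say nothing about the actual running time of any implementation.

```python
from math import comb

def five_subsets(n):
    """Number of 5-point subsets of n points, i.e. C(n, 5)."""
    return comb(n, 5)

counts = {n: five_subsets(n) for n in (10, 25, 30, 100)}
# counts[25] is 53130 and counts[30] is 142506; by n = 100 the count exceeds 75 million.
```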

References • Combinatorial Geometry for Shape Representation and Indexing, Carlsson 1996.
