3-Point Alignment: Linear Transformation Techniques

Venus

Classification

Faces – Different

Faces -- Same

Lighting affects appearance

Three-point alignment

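Object Alignment: Given three model points P1, P2, P3 and three image points p1, p2, p3, there is a unique transformation (rotation R, translation d, scale S) that aligns the model with the image: (SR + d)Pi = pi for i = 1, 2, 3, that is, each model point is rotated by R, scaled by S, shifted by d, and then projected onto the image.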

Alignment -- comments
• The projection is orthographic (combined with scaling).
• The 3 points are required to be non-collinear.
• The transformation is determined up to a reflection of the points about the image plane and a translation in depth.

Proof of the 3-point Alignment:

The three 3-D points are P1, P2, P3. We can assume that they initially lie in the image plane. In the 2-D image we get q1, q2, q3. The correspondence P1 -> q1, P2 -> q2, P3 -> q3 defines a unique linear transformation of the plane, L(x), which is easy to recover. L is a 2x2 matrix. We fix the origin at P1 = q1; the two remaining points then give 4 linear equations for the 4 elements of L.

We now choose two orthogonal vectors of equal length, E1 and E2, in the original plane of P1, P2, P3, and compute E1' = L(E1), E2' = L(E2). We seek a scaling S and a rotation R such that the projection of SR(E1) is E1' and the projection of SR(E2) is E2'. Let V1 = SR(E1) and V2 = SR(E2) (before projection). V1 is E1' plus a depth component: $V_1 = E_1' + c_1 z$, where z is a unit vector in the z direction. Similarly, $V_2 = E_2' + c_2 z$. We wish to recover c1 and c2. This gives V1 and V2, and therefore the transformation between the points (we show that it is unique up to a reflection, so the transformation can be recovered).

A scaled rotation preserves orthogonality, so the scalar product $V_1 \cdot V_2 = 0$:

$(E_1' + c_1 z) \cdot (E_2' + c_2 z) = 0$

E1' and E2' lie in the image plane and are orthogonal to z, therefore $c_1 c_2 = -(E_1' \cdot E_2')$. The quantity $-(E_1' \cdot E_2')$ is measurable in the image; call it $c_{12}$, so $c_1 c_2 = c_{12}$.

Also |V1| = |V2|, because E1 and E2 have equal length. Therefore

$(E_1' + c_1 z) \cdot (E_1' + c_1 z) = (E_2' + c_2 z) \cdot (E_2' + c_2 z)$

This implies $c_1^2 - c_2^2 = k_{12}$, where $k_{12} = |E_2'|^2 - |E_1'|^2$ is also measurable in the image.

The two equations for c1 and c2 are:

$c_1 c_2 = c_{12}$
$c_1^2 - c_2^2 = k_{12}$

and they determine (c1, c2) up to a common sign. One way of seeing this is to define the complex number $Z = c_1 + i c_2$. Then $Z^2 = (c_1^2 - c_2^2) + 2i c_1 c_2 = k_{12} + 2i c_{12}$, so $Z^2$ is measurable. We take the square root and get Z, and therefore c1 and c2. There are exactly two roots, Z and -Z, giving the two mirror-reflection solutions.
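The steps of the proof can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the original implementation: the model points are given by their 2-D coordinates in the plane z = 0, E1 and E2 are taken to be the unit vectors (1, 0) and (0, 1), and the function names `solve_planar_map` and `three_point_alignment` are hypothetical.

```python
import cmath
import math

def solve_planar_map(P, q):
    """Recover the 2x2 matrix L with L(P[i] - P[0]) = q[i] - q[0], i = 1, 2.

    P: three model points (x, y) in the plane z = 0 (assumed).
    q: the three corresponding image points (x, y).
    """
    ax, ay = P[1][0] - P[0][0], P[1][1] - P[0][1]
    bx, by = P[2][0] - P[0][0], P[2][1] - P[0][1]
    ux, uy = q[1][0] - q[0][0], q[1][1] - q[0][1]
    vx, vy = q[2][0] - q[0][0], q[2][1] - q[0][1]
    det = ax * by - ay * bx
    if abs(det) < 1e-12:
        raise ValueError("model points must be non-collinear")
    # L = [u v] [a b]^(-1), with a, b, u, v as matrix columns.
    return [
        [(ux * by - vx * ay) / det, (vx * ax - ux * bx) / det],
        [(uy * by - vy * ay) / det, (vy * ax - uy * bx) / det],
    ]

def three_point_alignment(P, q):
    """Return the two mirror solutions as tuples (c1, c2, scale)."""
    L = solve_planar_map(P, q)
    e1 = (L[0][0], L[1][0])  # E1' = L(E1) with E1 = (1, 0)
    e2 = (L[0][1], L[1][1])  # E2' = L(E2) with E2 = (0, 1)
    c12 = -(e1[0] * e2[0] + e1[1] * e2[1])                 # c1 * c2
    k12 = (e2[0] ** 2 + e2[1] ** 2) - (e1[0] ** 2 + e1[1] ** 2)  # c1^2 - c2^2
    Z = cmath.sqrt(complex(k12, 2.0 * c12))                # Z = c1 + i c2
    solutions = []
    for sign in (1.0, -1.0):
        c1, c2 = sign * Z.real, sign * Z.imag
        scale = math.sqrt(e1[0] ** 2 + e1[1] ** 2 + c1 ** 2)  # |V1| = S
        solutions.append((c1, c2, scale))
    return solutions
```

Each solution gives V1 = (E1', c1) and V2 = (E2', c2); dividing by the scale yields two columns of the rotation matrix, and the third column is their cross product.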

Car Recognition

Car Models

Alignment: Cars

Alignment: Unmatch

Face Alignment

Linear Combination of Views

Linear Combination of Views: O is a set of object points. I1, I2, I3 are three images of O from different views. N is a novel view of O. Then N is a linear combination of I1, I2, I3.
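A small numerical check of this statement, as a sketch under assumptions that the slide does not spell out: orthographic projection, rigid rotation without translation, and three views whose projection directions are linearly independent. The helper names are hypothetical. The coefficients are fitted on three points and then used to predict the x-coordinates of the remaining points in the novel view.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation(ax, ay, az):
    """Rotation matrix Rz(az) * Ry(ay) * Rx(ax)."""
    cx, sx = math.cos(ax), math.sin(ax)
    cy, sy = math.cos(ay), math.sin(ay)
    cz, sz = math.cos(az), math.sin(az)
    Rx = [[1, 0, 0], [0, cx, -sx], [0, sx, cx]]
    Ry = [[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]]
    Rz = [[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]]
    return matmul(Rz, matmul(Ry, Rx))

def project_x(R, points):
    """Orthographic x-coordinate of each rotated 3-D point."""
    return [sum(R[0][k] * p[k] for k in range(3)) for p in points]

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

def solve3(A, b):
    """Solve the 3x3 system A a = b with Cramer's rule."""
    d = det3(A)
    return [det3([[b[i] if k == j else A[i][k] for k in range(3)]
                  for i in range(3)]) / d for j in range(3)]

def lc_coefficients(x1, x2, x3, xn):
    """Fit a1, a2, a3 with xn = a1*x1 + a2*x2 + a3*x3 on the first 3 points."""
    A = [[x1[i], x2[i], x3[i]] for i in range(3)]
    return solve3(A, xn[:3])

# Object points O and the x-coordinates seen in three views and a novel view.
O = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 2, 3), (-1, 0.5, 2), (2, -1, 0.5)]
x1 = project_x(rotation(0.0, 0.0, 0.0), O)
x2 = project_x(rotation(0.0, 0.5, 0.0), O)
x3 = project_x(rotation(0.0, 0.0, 0.6), O)
xn = project_x(rotation(0.3, -0.4, 0.2), O)

a = lc_coefficients(x1, x2, x3, xn)
predicted = [a[0] * u + a[1] * v + a[2] * w for u, v, w in zip(x1, x2, x3)]
max_error = max(abs(p - t) for p, t in zip(predicted, xn))
```

The y-coordinates of the novel view can be treated the same way, with a second set of three coefficients.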

Car Recognition

VW – SAAB

LC – Car Images

Linear Combination: Faces

Classification

Structural descriptions

RBC

Structural Description: parts G1, G2, G3, G4 connected by the relations Above, Touch, Left-of, Right-of

Fragment-based Representation

Mutual Information
Entropy of a binary variable: $-H(C) = P(C=1)\log P(C=1) + P(C=0)\log P(C=0)$

Mutual information: $I(C;F) = H(C) - H(C|F)$, where $H(C|F)$ is the average of H(C) when F=1 and H(C) when F=0, weighted by P(F=1) and P(F=0).
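As a worked illustration of these two formulas, the sketch below computes I(C;F) for a binary class C and a binary fragment F from p(C), p(F|C) and p(F|NC). The logarithm base is not given on the slide; base 2 (bits) is assumed here, and the function names are hypothetical.

```python
import math

def entropy(p):
    """Entropy in bits of a binary variable with P(1) = p; 0*log(0) counts as 0."""
    return -sum(x * math.log2(x) for x in (p, 1.0 - p) if x > 0)

def mutual_information(p_c, p_f_given_c, p_f_given_nc):
    """I(C;F) = H(C) - H(C|F) for a binary class C and binary fragment F."""
    p_f = p_c * p_f_given_c + (1.0 - p_c) * p_f_given_nc
    h_c_given_f = 0.0
    if p_f > 0:  # H(C) when F=1, weighted by P(F=1)
        h_c_given_f += p_f * entropy(p_c * p_f_given_c / p_f)
    if p_f < 1:  # H(C) when F=0, weighted by P(F=0)
        h_c_given_f += (1.0 - p_f) * entropy(p_c * (1.0 - p_f_given_c) / (1.0 - p_f))
    return entropy(p_c) - h_c_given_f
```

A fragment that appears in every class image and never elsewhere carries the full entropy of C, while a fragment that is equally likely in class and non-class images carries no information.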

Selecting Fragments

Fragments Selection
• For a set of training images:
• Generate candidate fragments
• Measure p(F|C), p(F|NC)
• Compute mutual information
• Select optimal fragment
• After k fragments: maximize the minimal addition in mutual information with respect to each of the first k fragments
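The selection loop above can be sketched as a greedy max-min search. This is one reading of the rule, not a confirmed implementation: the "addition in mutual information" of a candidate F with respect to an already selected fragment Fj is taken to be I(C; F, Fj) - I(C; Fj), probabilities are estimated by counting over the training images, and all names are hypothetical.

```python
import math
from collections import Counter

def _entropy(counter, n):
    """Entropy in bits of the empirical distribution given by counts."""
    return -sum((c / n) * math.log2(c / n) for c in counter.values())

def mutual_info(labels, *features):
    """Empirical I(C; F1..Fm) = H(C) + H(F) - H(C, F)."""
    n = len(labels)
    joint = list(zip(*features))  # per-image tuple of fragment detections
    return (_entropy(Counter(labels), n)
            + _entropy(Counter(joint), n)
            - _entropy(Counter(zip(labels, joint)), n))

def select_fragments(detections, labels, k):
    """Greedy selection of k fragment indices.

    detections[i][j] is 1 if candidate fragment i is detected in training
    image j, else 0; labels[j] is 1 for class images and 0 otherwise.
    """
    remaining = list(range(len(detections)))
    first = max(remaining, key=lambda i: mutual_info(labels, detections[i]))
    selected = [first]
    remaining.remove(first)
    while len(selected) < k and remaining:
        def min_gain(i):
            # Smallest addition of information relative to any selected fragment.
            return min(mutual_info(labels, detections[i], detections[j])
                       - mutual_info(labels, detections[j])
                       for j in selected)
        best = max(remaining, key=min_gain)
        selected.append(best)
        remaining.remove(best)
    return selected
```

The max-min criterion penalizes redundancy: a candidate that duplicates an already selected fragment adds nothing with respect to that fragment, so it loses to a less informative but complementary one.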

Optimal Face Fragments

Face Fragments by Type (1-st, 2-nd, 3-rd, 4-th)
Merit: 0.20 0.18 0.18 0.17 0.16 0.11 0.10 0.09
Weight: 6.5 5.5 6.45 5.45 3.52 2.9 2.9 2.86

Low-resolution Car Fragments: Front - Middle - Back

Intermediate Complexity
[Plots: 100 x Merit and weight vs. relative object size (panels a, b); relative mutual info. vs. relative resolution]

Fragment ‘Weight’ Likelihood ratio: Weight of F:

Combining fragments: fragment detections D1, D2, ..., Dk with weights w1, w2, ..., wk

Non-optimal Fragments: same total area covered (8*object), on a regular grid

Training & Test Images
• Frontal faces without distinctive features (K: 496, W: 385)
• Minimize background by cropping
• Training images for extraction: 32 for each class
• Training images for evaluation: 100 for each class
• Test images: 253 for Western and 364 for Korean

Training – Fragment Extraction

Extracted Fragments (Western Fragment, Korean Fragment)
Score: 0.92 0.82 0.77 0.76 0.75 0.74 0.72 0.68 0.67 0.65
Weight: 3.42 2.40 1.99 2.23 1.90 2.11 6.58 4.14 4.12 6.47

Classifying novel images: Detect Fragments -> Compare Summed Weights (wF, kF) -> Decision: Westerner, Unknown, or Korean
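The decision step can be sketched as follows. The slide only names the stages, so the details here are assumptions: wF and kF are read as the summed weights of the detected Western and Korean fragments, and the `margin` that produces the "Unknown" outcome is a hypothetical parameter.

```python
def classify(western_weights, korean_weights, margin=0.0):
    """Compare the summed weights of the fragments detected in one image.

    western_weights / korean_weights: weights of the detected fragments of
    each class (empty if none were detected).
    margin: hypothetical minimum difference needed for a decision.
    """
    w_sum = sum(western_weights)
    k_sum = sum(korean_weights)
    if w_sum - k_sum > margin:
        return "Westerner"
    if k_sum - w_sum > margin:
        return "Korean"
    return "Unknown"
```

Summing per-fragment weights treats the fragments as independent evidence, which is consistent with the independence assumption mentioned in the discussion of the number of fragments.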

Effect of Number of Fragments
• 7 fragments: 95%, 80 fragments: 100%
• Inherent redundancy of the features
• Slight violation of independence assumption

Comparison with Humans
• Algorithm outperformed humans for low-resolution images

Class examples
