
Lecture 20 Calibration


Presentation Transcript


  1. Lecture 20 Calibration CSE 6367 – Computer Vision Spring 2010 Vassilis Athitsos University of Texas at Arlington

  2. Pinhole Model
  [Figure: pinhole camera with x, y, z axes; points A and B project through the pinhole onto the image plane at P(A) and P(B); the focal length f is the distance from the pinhole to the image plane.]
  • Terminology:
  • The image plane is a planar surface of sensors. The response of those sensors to light forms the image.
  • The focal length f is the distance between the image plane and the pinhole.
  • A set of points is collinear if there exists a straight line going through all points in the set.

  3. Pinhole Model
  • Pinhole model:
  • Light from all points enters the camera through an infinitesimal hole, and then reaches the image plane.
  • The focal length f is the distance between the image plane and the pinhole.
  • The light from point A reaches image location P(A), such that A, the pinhole, and P(A) are collinear.

  4. Different Coordinate Systems
  • World coordinate system (3D):
  • The pinhole is at location t, and at orientation R.
  • Camera coordinate system (3D):
  • The pinhole is at the origin.
  • The camera faces towards the positive side of the z axis.

  5. Different Coordinate Systems
  • Normalized image coordinate system (2D):
  • Coordinates on the image plane.
  • The (x, y) values of the camera coordinate system.
  • We drop the z value (always equal to f, not of interest).
  • The center of the image is (0, 0).
  • Image (pixel) coordinate system (2D):
  • Pixel coordinates.

  6. Homogeneous Coordinates
  • Homogeneous coordinates are used to simplify formulas, so that camera projection can be modeled as matrix multiplication.
  • For a 3D point: instead of writing (x, y, z) we write (cx, cy, cz, c), where c can be any nonzero constant.
  • How many ways are there to write (x, y, z) in homogeneous coordinates?
  • INFINITE (one for each nonzero real number c).
  • For a 2D point (u, v): we write it as (cu, cv, c).
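  A minimal Matlab sketch of this conversion (the point values are illustrative, not from the slides):

    % Convert a 3D point to homogeneous coordinates (c = 1 is the usual choice).
    A = [2; 5; 10];                     % ordinary 3D coordinates
    A_hom = [A; 1];                     % homogeneous: [2; 5; 10; 1]
    % Any nonzero scaling represents the same point.
    A_hom2 = 3 * A_hom;                 % [6; 15; 30; 3] is the same point
    % Convert back by dividing by the last coordinate and dropping it.
    A_back = A_hom2(1:3) / A_hom2(4);   % recovers [2; 5; 10]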

  7. Camera Translation
  • Suppose that:
  • Initially, camera coordinates = world coordinates.
  • Then, the camera moves (without rotating), so that the pinhole is at (Cx, Cy, Cz).
  • Then, if point A = [Ax; Ay; Az; 1] is represented in world coordinates, T*A gives us the camera-based coordinates for A, where T is the translation matrix:
  T = [1 0 0 -Cx; 0 1 0 -Cy; 0 0 1 -Cz; 0 0 0 1]
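  A minimal Matlab sketch of the translation step (the pinhole location and the point are illustrative values):

    % Build T for a pinhole at world location (Cx, Cy, Cz) and apply it to a point.
    C = [4; 2; 7];                      % world coordinates of the pinhole
    T = [1 0 0 -C(1);
         0 1 0 -C(2);
         0 0 1 -C(3);
         0 0 0  1   ];
    A_world = [10; 5; 20; 1];           % point A in homogeneous world coordinates
    A_cam = T * A_world;                % the same point in camera-based coordinates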

  8. Camera Rotation
  • Any rotation R can be decomposed into three rotations:
  • a rotation Rx by θx around the x axis: Rx = [1 0 0 0; 0 cosθx -sinθx 0; 0 sinθx cosθx 0; 0 0 0 1]
  • a rotation Ry by θy around the y axis: Ry = [cosθy 0 sinθy 0; 0 1 0 0; -sinθy 0 cosθy 0; 0 0 0 1]
  • a rotation Rz by θz around the z axis: Rz = [cosθz -sinθz 0 0; sinθz cosθz 0 0; 0 0 1 0; 0 0 0 1]
  • Rotation of point A = R * A = Rz * Ry * Rx * A.
  • ORDER MATTERS: Rz * Ry * Rx * A is not the same as Rx * Ry * Rz * A.
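  A minimal Matlab sketch of the three rotation matrices (the angles are illustrative, in radians):

    tx = 0.1; ty = 0.2; tz = 0.3;       % rotation angles around x, y, z
    Rx = [1 0        0        0;
          0 cos(tx) -sin(tx)  0;
          0 sin(tx)  cos(tx)  0;
          0 0        0        1];
    Ry = [ cos(ty) 0 sin(ty) 0;
           0       1 0       0;
          -sin(ty) 0 cos(ty) 0;
           0       0 0       1];
    Rz = [cos(tz) -sin(tz) 0 0;
          sin(tz)  cos(tz) 0 0;
          0        0       1 0;
          0        0       0 1];
    R = Rz * Ry * Rx;                   % order matters: Rx * Ry * Rz is a different rotation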

  9. Rotation Matrix
  • The rotation matrix R = Rz * Ry * Rx = [r11 r12 r13 0; r21 r22 r23 0; r31 r32 r33 0; 0 0 0 1] has 9 unknown values, but they all depend on three parameters: θx, θy, and θz.
  • If we know θx, θy, and θz, we can compute R.

  10. Homography
  • The matrix mapping normalized image coordinates to pixel coordinates is called a homography.
  • A homography matrix H looks like this: H = [Sx 0 u0; 0 Sy v0; 0 0 1], where:
  • Sx and Sy define scaling (typically Sx = Sy).
  • Sx and Sy are the size, in world coordinates, of a rectangle on the image plane that corresponds to a single pixel.
  • u0 and v0 translate the image so that its center moves from (0, 0) to (u0, v0).
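  A minimal Matlab sketch of applying H to a normalized image point (all numeric values are illustrative assumptions):

    Sx = 800; Sy = 800; u0 = 320; v0 = 240;
    H = [Sx 0  u0;
         0  Sy v0;
         0  0  1 ];
    p_norm = [0.05; -0.02; 1];          % normalized image point, homogeneous
    p_pix = H * p_norm;                 % homogeneous pixel coordinates
    u = p_pix(1) / p_pix(3);            % pixel coordinates after dividing by
    v = p_pix(2) / p_pix(3);            % the third component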

  11. From Camera Coordinates to Pixels
  • Matrix H can easily be modified to directly map from camera coordinates to pixel coordinates: H = [-fSx 0 u0 0; 0 -fSy v0 0; 0 0 1 0]
  • f is the camera focal length.

  12. Perspective Projection
  • Let A = [Ax; Ay; Az; 1], R = [r11 r12 r13 0; r21 r22 r23 0; r31 r32 r33 0; 0 0 0 1], T = [1 0 0 -Cx; 0 1 0 -Cy; 0 0 1 -Cz; 0 0 0 1], H = [-fSx 0 u0 0; 0 -fSy v0 0; 0 0 1 0].
  • What pixel coordinates (u, v) will A be mapped to?
  • [u'; v'; w'] = H * R * T * A.
  • u = u'/w', v = v'/w'.
  • (H * R * T) is called the camera matrix.
  • H is called the calibration matrix. It does not change if we rotate/move the camera.
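  A minimal Matlab sketch of the whole pipeline (no rotation, and all numeric values are illustrative assumptions):

    f = 0.01; Sx = 80000; Sy = 80000; u0 = 320; v0 = 240;   % assumed intrinsics
    H = [-f*Sx  0     u0 0;
          0    -f*Sy  v0 0;
          0     0     1  0];
    R = eye(4);                         % assume the camera is not rotated
    C = [1; 0; 5];                      % pinhole location in world coordinates
    T = [eye(3) -C; 0 0 0 1];           % translation matrix
    A = [2; 1; 20; 1];                  % world point in homogeneous coordinates
    p = H * R * T * A;                  % p = [u'; v'; w']
    u = p(1) / p(3);
    v = p(2) / p(3);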

  13. Orthographic Projection
  • Let A = [Ax; Ay; Az; 1], R = [r11 r12 r13 0; r21 r22 r23 0; r31 r32 r33 0; 0 0 0 1], T = [1 0 0 -Cx; 0 1 0 -Cy; 0 0 1 -Cz; 0 0 0 1], H = [Sx 0 0 u0; 0 Sy 0 v0; 0 0 0 1].
  • What pixel coordinates (u, v) will A be mapped to?
  • [u'; v'; w'] = H * R * T * A.
  • u = u'/w', v = v'/w'.
  • Main difference from perspective projection: the z coordinate gets ignored.
  • To go from camera coordinates to normalized image coordinates, we just drop the z value.
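  A minimal Matlab sketch of the orthographic case (all numeric values are illustrative assumptions):

    Sx = 100; Sy = 100; u0 = 320; v0 = 240;
    H_ortho = [Sx 0  0 u0;
               0  Sy 0 v0;
               0  0  0 1 ];
    R = eye(4);                         % no rotation in this example
    T = [eye(3) [-1; 0; -5]; 0 0 0 1];  % pinhole at world location (1, 0, 5)
    A = [2; 1; 20; 1];
    p = H_ortho * R * T * A;            % [u'; v'; w'] with w' = 1
    u = p(1) / p(3);                    % the point's z value does not affect (u, v)
    v = p(2) / p(3);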

  14. Calibration
  • Let A = [Ax; Ay; Az; 1], R = [r11 r12 r13 0; r21 r22 r23 0; r31 r32 r33 0; 0 0 0 1], T = [1 0 0 -Cx; 0 1 0 -Cy; 0 0 1 -Cz; 0 0 0 1], H = [-fSx 0 u0 0; 0 -fSy v0 0; 0 0 1 0].
  • [u'; v'; w'] = H * R * T * A.
  • C = (H * R * T) is called the camera matrix.
  • Question: How do we compute C?
  • The process of computing C is called camera calibration.

  15. Calibration
  • Camera matrix C is always of the following form: C = [c11 c12 c13 c14; c21 c22 c23 c24; c31 c32 c33 1]
  • C is equivalent to any sC, where s != 0.
  • Why?

  16. Calibration
  • Camera matrix C is always of the following form: C = [c11 c12 c13 c14; c21 c22 c23 c24; c31 c32 c33 1]
  • C is equivalent to any sC, where s != 0: scaling C by s scales u', v', and w' by s, so u = u'/w' and v = v'/w' do not change.
  • That is why we can assume that c34 = 1. If c34 != 1, we can just multiply C by s = 1/c34.
  • To compute C, one way is to manually establish correspondences between points in 3D world coordinates and pixels in the image.
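  A minimal Matlab sketch of the scale equivalence (the camera matrix and the point are made up for illustration):

    C = [400 0   320 100;
         0   400 240 50;
         0   0   1   1  ];              % an illustrative camera matrix with c34 = 1
    P = [2; 1; 10; 1];                  % a world point in homogeneous coordinates
    p1 = C * P;        u1 = p1(1)/p1(3);  v1 = p1(2)/p1(3);
    p2 = (7 * C) * P;  u2 = p2(1)/p2(3);  v2 = p2(2)/p2(3);
    % u1 == u2 and v1 == v2: the common factor 7 cancels in the division.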

  17. Using Correspondences
  • Suppose that [xj, yj, zj, 1] maps to [uj, vj, 1].
  • This means that C * [xj, yj, zj, 1]' = [sj*uj, sj*vj, sj]'.
  • Note that the vectors [xj, yj, zj, 1] and [sj*uj, sj*vj, sj] are transposed.
  • This gives the following equations:
  • sj*uj = c11 * xj + c12 * yj + c13 * zj + c14.
  • sj*vj = c21 * xj + c22 * yj + c23 * zj + c24.
  • sj = c31 * xj + c32 * yj + c33 * zj + 1.
  • Multiplying Equation 3 by uj we get:
  • sj*uj = c31 * uj * xj + c32 * uj * yj + c33 * uj * zj + uj.
  • Multiplying Equation 3 by vj we get:
  • sj*vj = c31 * vj * xj + c32 * vj * yj + c33 * vj * zj + vj.

  18. Obtaining a Linear Equation
  • We combine two equations:
  • sj*uj = c11 * xj + c12 * yj + c13 * zj + c14.
  • sj*uj = c31 * uj * xj + c32 * uj * yj + c33 * uj * zj + uj.
  to obtain:
  c11*xj + c12*yj + c13*zj + c14 = c31*uj*xj + c32*uj*yj + c33*uj*zj + uj
  => uj = c11*xj + c12*yj + c13*zj + c14 - c31*uj*xj - c32*uj*yj - c33*uj*zj
  => uj = [xj, yj, zj, 1, -uj*xj, -uj*yj, -uj*zj] * [c11, c12, c13, c14, c31, c32, c33]'
  => uj = [xj, yj, zj, 1, 0, 0, 0, 0, -uj*xj, -uj*yj, -uj*zj] * [c11, c12, c13, c14, c21, c22, c23, c24, c31, c32, c33]'
  • In the above equations:
  • What is known, what is unknown?

  19. Obtaining a Linear Equation
  • We combine two equations:
  • sj*uj = c11 * xj + c12 * yj + c13 * zj + c14.
  • sj*uj = c31 * uj * xj + c32 * uj * yj + c33 * uj * zj + uj.
  to obtain:
  c11*xj + c12*yj + c13*zj + c14 = c31*uj*xj + c32*uj*yj + c33*uj*zj + uj
  => uj = c11*xj + c12*yj + c13*zj + c14 - c31*uj*xj - c32*uj*yj - c33*uj*zj
  => uj = [xj, yj, zj, 1, -uj*xj, -uj*yj, -uj*zj] * [c11, c12, c13, c14, c31, c32, c33]'
  => uj = [xj, yj, zj, 1, 0, 0, 0, 0, -uj*xj, -uj*yj, -uj*zj] * [c11, c12, c13, c14, c21, c22, c23, c24, c31, c32, c33]'
  • In the above equations:
  • c11, c12, c13, c14, c21, c22, c23, c24, c31, c32, c33 are unknown.
  • xj, yj, zj, uj, vj are assumed to be known.

  20. Obtaining Another Linear Equation
  • We combine two equations:
  • sj*vj = c21 * xj + c22 * yj + c23 * zj + c24.
  • sj*vj = c31 * vj * xj + c32 * vj * yj + c33 * vj * zj + vj.
  to obtain:
  c21*xj + c22*yj + c23*zj + c24 = c31*vj*xj + c32*vj*yj + c33*vj*zj + vj
  => vj = c21*xj + c22*yj + c23*zj + c24 - c31*vj*xj - c32*vj*yj - c33*vj*zj
  => vj = [xj, yj, zj, 1, -vj*xj, -vj*yj, -vj*zj] * [c21, c22, c23, c24, c31, c32, c33]'
  => vj = [0, 0, 0, 0, xj, yj, zj, 1, -vj*xj, -vj*yj, -vj*zj] * [c11, c12, c13, c14, c21, c22, c23, c24, c31, c32, c33]'
  • In the above equations:
  • What is known, what is unknown?

  21. Obtaining Another Linear Equation
  • We combine two equations:
  • sj*vj = c21 * xj + c22 * yj + c23 * zj + c24.
  • sj*vj = c31 * vj * xj + c32 * vj * yj + c33 * vj * zj + vj.
  to obtain:
  c21*xj + c22*yj + c23*zj + c24 = c31*vj*xj + c32*vj*yj + c33*vj*zj + vj
  => vj = c21*xj + c22*yj + c23*zj + c24 - c31*vj*xj - c32*vj*yj - c33*vj*zj
  => vj = [xj, yj, zj, 1, -vj*xj, -vj*yj, -vj*zj] * [c21, c22, c23, c24, c31, c32, c33]'
  => vj = [0, 0, 0, 0, xj, yj, zj, 1, -vj*xj, -vj*yj, -vj*zj] * [c11, c12, c13, c14, c21, c22, c23, c24, c31, c32, c33]'
  • In the above equations:
  • c11, c12, c13, c14, c21, c22, c23, c24, c31, c32, c33 are unknown.
  • xj, yj, zj, uj, vj are assumed to be known.

  22. Setting Up Linear Equations
  • Let A = [xj, yj, zj, 1, 0, 0, 0, 0, -xj*uj, -yj*uj, -zj*uj;
             0, 0, 0, 0, xj, yj, zj, 1, -xj*vj, -yj*vj, -zj*vj]
  • Let x = [c11, c12, c13, c14, c21, c22, c23, c24, c31, c32, c33]'. Note the transpose.
  • Let b = [uj, vj]'. Again, note the transpose.
  • Then, A*x = b.
  • This is a system of linear equations with 11 unknowns and 2 equations.
  • To solve the system, we need at least 11 equations.
  • How can we get more equations?

  23. Solving Linear Equations
  • Suppose we use 20 point correspondences between [xj, yj, zj, 1] and [uj, vj, 1].
  • Then, we get 40 equations.
  • They can still be jointly expressed as A*x = b, where:
  • A is a 40*11 matrix.
  • x is an 11*1 matrix.
  • b is a 40*1 matrix.
  • Row 2j-1 of A is equal to: xj, yj, zj, 1, 0, 0, 0, 0, -xj*uj, -yj*uj, -zj*uj.
  • Row 2j of A is equal to: 0, 0, 0, 0, xj, yj, zj, 1, -xj*vj, -yj*vj, -zj*vj.
  • Row 2j-1 of b is equal to uj.
  • Row 2j of b is equal to vj.
  • x = [c11, c12, c13, c14, c21, c22, c23, c24, c31, c32, c33]'.
  • How do we solve this system of equations?
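  A minimal Matlab sketch of assembling A and b from N correspondences (world_pts and pixel_pts are assumed inputs, not defined in the slides):

    % world_pts: N x 3 matrix of world coordinates [xj, yj, zj]
    % pixel_pts: N x 2 matrix of pixel coordinates [uj, vj]
    N = size(world_pts, 1);
    A = zeros(2*N, 11);
    b = zeros(2*N, 1);
    for j = 1:N
        x = world_pts(j, 1); y = world_pts(j, 2); z = world_pts(j, 3);
        u = pixel_pts(j, 1); v = pixel_pts(j, 2);
        A(2*j - 1, :) = [x, y, z, 1, 0, 0, 0, 0, -x*u, -y*u, -z*u];
        A(2*j,     :) = [0, 0, 0, 0, x, y, z, 1, -x*v, -y*v, -z*v];
        b(2*j - 1) = u;
        b(2*j)     = v;
    end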

  24. Solving A*x = b
  • If we have > 11 equations and only 11 unknowns, then the system is overconstrained.
  • If we try to solve such a system, what happens?

  25. Solving A*x = b
  • If we have > 11 equations and only 11 unknowns, then the system is overconstrained. There are two cases:
  • (Rare). An exact solution exists. In that case, usually only 11 equations are needed; the rest are redundant.
  • (Typical). No exact solution exists. Why?

  26. Solving A*x = b
  • If we have > 11 equations and only 11 unknowns, then the system is overconstrained. There are two cases:
  • (Rare). An exact solution exists. In that case, usually only 11 equations are needed; the rest are redundant.
  • (Typical). No exact solution exists. Why? Because there is always some measurement error in estimating world coordinates and pixel coordinates.
  • We need an approximate solution. This is an optimization problem, and we take the standard two steps:
  • Step 1: define a measure of how good any solution is.
  • Step 2: find the best solution according to that measure.
  • Note: "solution" here means any proposed solution, not the BEST solution. Most "solutions" are really bad!

  27. Least Squares Solution
  • Each solution produces an error for each equation.
  • Sum-of-squared-errors is the measure we use to evaluate a solution.
  • The least squares solution is the solution that minimizes the sum-of-squared-errors measure.
  • Example: let x2 be a proposed solution. Let b2 = A * x2. If x2 were the mathematically perfect solution, then b2 = b.
  • The error e(i) at position i is defined as |b2(i) - b(i)|.
  • The squared error at position i is defined as |b2(i) - b(i)|^2.
  • The sum of squared errors is sum((b2 - b).^2) in Matlab notation.

  28. Least Squares Solution
  • Each solution produces an error for each equation.
  • Sum-of-squared-errors is the measure we use to evaluate a solution.
  • The least squares solution is the solution that minimizes the sum-of-squared-errors measure.
  • Finding the least-squares solution to a set of linear equations is mathematically involved.
  • However, in Matlab it is really easy:
  • Given a system of linear equations expressed as A*x = b, to find the least squares solution, type:
  • x = A\b
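  A self-contained Matlab sketch of the whole procedure on synthetic data (the "true" camera matrix and the points are made up, only to check that least squares recovers the matrix):

    C_true = [400 0 320 100; 0 400 240 50; 0 0 1 30];
    C_true = C_true / C_true(3, 4);                  % force c34 = 1
    world_pts = 10 * rand(20, 3);                    % 20 random 3D points
    N = size(world_pts, 1);
    A = zeros(2*N, 11);  b = zeros(2*N, 1);
    for j = 1:N
        P = [world_pts(j, :)'; 1];
        p = C_true * P;  u = p(1)/p(3);  v = p(2)/p(3);
        x = P(1); y = P(2); z = P(3);
        A(2*j - 1, :) = [x y z 1 0 0 0 0 -x*u -y*u -z*u];  b(2*j - 1) = u;
        A(2*j,     :) = [0 0 0 0 x y z 1 -x*v -y*v -z*v];  b(2*j)     = v;
    end
    c = A \ b;                                       % least squares solution
    C_est = [c(1:4)'; c(5:8)'; c(9:11)' 1];          % should match C_true closely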

  29. Producing World Coordinates
  • Typically, a calibration object is used.
  • Checkerboard patterns and laser pointers are common.
  • A point on the calibration object is designated as the origin.
  • The x, y and z directions of the object are used as axis directions of the world coordinate system.
  • Correspondences from world coordinates to pixel coordinates can be established manually or automatically.
  • With a checkerboard pattern, automatic estimation of correspondences is not hard.

  30. Calibration in the Real World
  [Figure: two types of radial distortion, barrel distortion and pincushion distortion. Images from Wikipedia.]
  • Typically, cameras do not obey the perspective model closely enough.
  • Radial distortion is a common deviation.
  • Calibration software needs to account for radial distortion.
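  The slides do not give a distortion model; a commonly used one expresses the distortion as a polynomial in the squared distance from the image center. A minimal Matlab sketch, with made-up coefficients k1 and k2:

    % distorted = undistorted * (1 + k1*r^2 + k2*r^4), in normalized image coordinates
    k1 = -0.2; k2 = 0.05;               % illustrative distortion coefficients
    xn = 0.3; yn = -0.1;                % undistorted normalized image point
    r2 = xn^2 + yn^2;                   % squared distance from the image center
    scale = 1 + k1*r2 + k2*r2^2;
    xd = scale * xn;                    % coordinates actually recorded by the camera
    yd = scale * yn;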
