Formation et Analyse d’Images Session 12

Formation et Analyse d’Images Session 12 • Daniela Hall • 16 January 2006

Course Overview • Session 1 (19/09/05) • Overview • Human vision • Homogeneous coordinates • Camera models • Session 2 (26/09/05) • Tensor notation • Image transformations • Homography computation • Session 3 (3/10/05) • Camera calibration • Reflection models • Color spaces • Session 4 (10/10/05) • Pixel based image analysis • 17/10/05 course is replaced by Modelisation surfacique

Course overview • Session 5 + 6 (24/10/05) 9:45 – 12:45 • Contrast description • Hough transform • Session 7 (7/11/05) • Kalman filter • Session 8 (14/11/05) • Tracking of regions, pixels, and lines • Session 9 (21/11/05) • Gaussian filter operators • Session 10 (5/12/05) • Scale Space • Session 11 (12/12/05) • Stereo vision • Epipolar geometry • Session 12 (16/01/06): exercises and questions

Exam • Date: to be defined • Duration: to be defined (last year it was 3h) • Documents needed for the exam • Class notes • Pocket calculator • Kalman tutorial • Isard, Blake: Active Contours, chap 12.1, 12.2

Exercises

Exercises • You have a camera that observes a corridor. • People can enter at the left or the right of the image. • Your task is to count the number of people that walk by. • What approach do you propose?

Exercise • How can you count the number of flowers in the image and determine their scale?

Rectifying images • You need to display the image on the paper display. You have a steerable video projector and a camera. How do you proceed?

Exercise How can you automatically count the number of objects in the image?

Robust tracking of objects • [Diagram of the tracking loop. Stages: Predict, Detection, Correct. Data: List of predictions, Measurements, List of targets, Trigger regions, New targets.]

Detection methods • Background differencing • used to detect targets that have a different color than the background • the detection image is the difference between the current image and the background image. • Measure the energy of the detection image • If the energy is above a threshold, a target is detected. • The position of the new target is described by the first and second moments of the thresholded detection image. • Image differencing • used to detect moving targets. The detection image contains only the borders of the target. • the detection image is the difference between the current image and the previous image. • Measure the energy of the detection image • If the energy is above a threshold, a target is detected. • The position of the new target is more difficult to describe, because the detection image contains only the borders of the object. • Color histogram for detection • make a color histogram of the empty detection region, Htot • at each frame make a color histogram of the detection region, Hobj • if sum_i Hobj(i)/Htot(i) > threshold, a target is detected. • make a detection image where each pixel is marked by Hobj(i)/Htot(i) • The position of the new target is described by the first and second moments of the thresholded detection image.
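The background differencing steps above can be sketched in a few lines of Python. This is a minimal illustration, not the course's implementation: it assumes grey-level images stored as lists of rows, takes the energy to be the sum of squared differences, and the function name and both thresholds are hypothetical.

```python
def detect_by_background_differencing(image, background,
                                      pixel_threshold, energy_threshold):
    """Return (detected, centre, covariance) for two grey-level images."""
    # Detection image: absolute difference between current image and background.
    diff = [[abs(p - b) for p, b in zip(row, bg_row)]
            for row, bg_row in zip(image, background)]
    # Energy of the detection image (assumed here: sum of squared differences).
    energy = sum(d * d for row in diff for d in row)
    if energy <= energy_threshold:
        return False, None, None
    # Threshold the detection image and keep the coordinates of target pixels.
    points = [(x, y) for y, row in enumerate(diff)
              for x, d in enumerate(row) if d > pixel_threshold]
    if not points:
        return False, None, None
    n = len(points)
    # First moments: centre of gravity of the thresholded detection image.
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Second moments: covariance of the pixel coordinates (spatial extent).
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    return True, (mx, my), ((sxx, sxy), (sxy, syy))
```

The centre gives the position of the new target and the covariance gives its approximate size and orientation, which is enough to initialise a target in the tracker.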

[Figures: background (BG) differencing and image (Img) differencing]

Session overview • Tracking of point neighborhoods • using SSD • using CC and NCC • using Gaussian receptive fields

Tracking of point neighborhoods • When we have additive noise, the Euclidean norm is the optimal method for neighborhood matching, because it minimises the error probability. • Goal: find the position (i,j) of the image I(i,j) that is the most similar to the pattern X(m,n). • Hypotheses: • additive Gaussian noise • No image rotation (2D) • No rotation in space (3D) • No scale changes • The Euclidean norm is known as SSD (« sum of squared distances », also called sum of squared differences) • The method is efficient and precise, but sensitive to violations of these hypotheses.

Sum of squared distances (SSD) • Definition: • Let X(m,n) be the pattern with 0 ≤ m ≤ M-1, 0 ≤ n ≤ N-1 • Let I(i,j) be the image with 0 ≤ i ≤ I-1, 0 ≤ j ≤ J-1, (M << I, N << J) • The position (i,j) of the image I(i,j) that is the most similar to the pattern X(m,n) is the one that minimises $SSD(i,j) = \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} \left( I(i+m, j+n) - X(m,n) \right)^2$

Sum of squared distances • Searching a pattern X(m,n) within an image I(i,j) corresponds to placing the pattern at all possible positions (i,j) and computing the SSD(i,j). • Depending on the size of the pattern and the image, this can be costly. • SSD is sensitive to rotations, scale and noise.
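The exhaustive search described above can be sketched as follows. This is a minimal illustration that assumes images and patterns stored as lists of rows, indexed as image[i][j]; the function names are hypothetical.

```python
def ssd_at(image, pattern, i, j):
    """SSD between the pattern and the image neighborhood anchored at (i, j)."""
    return sum((image[i + m][j + n] - pattern[m][n]) ** 2
               for m in range(len(pattern))
               for n in range(len(pattern[0])))


def ssd_search(image, pattern):
    """Place the pattern at every valid position; return (best_ssd, i, j)."""
    M, N = len(pattern), len(pattern[0])
    I, J = len(image), len(image[0])
    best = None
    for i in range(I - M + 1):
        for j in range(J - N + 1):
            score = ssd_at(image, pattern, i, j)
            # The best match is the position with the smallest SSD.
            if best is None or score < best[0]:
                best = (score, i, j)
    return best
```

Each position costs M*N operations, so the whole search costs on the order of I*J*M*N operations, which is why the later slides look for ways to reduce the number of tested positions.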

Pattern as a feature vector • Any image patch can be seen as a vector. • To transform a 2D image patch into a vector, concatenate its rows one after another. For a patch of size MxN, you obtain a vector with M*N dimensions.

SSD using feature vectors • Transform the pattern X(m,n) and the neighborhood of size MxN at the position (i,j) of the image I into vectors. • SSD is the squared Euclidean norm of the difference of these two vectors.
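The vector view can be illustrated with a short sketch. It assumes image patches stored as lists of rows; the function names are hypothetical.

```python
def patch_to_vector(image, i, j, M, N):
    """Concatenate the rows of the MxN neighborhood at (i, j) into one vector."""
    return [image[i + m][j + n] for m in range(M) for n in range(N)]


def ssd_vectors(u, v):
    """Squared Euclidean norm of the difference of two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))
```

For example, the 2x2 neighborhood at (1, 1) of a 3x3 image becomes a vector with 4 dimensions, and comparing it with the flattened pattern gives the same value as the double sum over m and n.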

Cross Correlation (CC) • Another method for pattern matching is cross correlation (scalar product). The best match is the position that maximises the product. • In the case of normalised vectors, the scalar product is the cosine of the angle between the vectors. This is the definition of the normalised cross correlation (NCC). -1 ≤ NCC ≤ 1
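A minimal sketch of CC and NCC on feature vectors follows. It uses the definition from the slide (cosine of the angle, without subtracting the mean, which some other definitions of NCC do); the function names are hypothetical.

```python
import math


def cross_correlation(u, v):
    """Scalar product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalised_cross_correlation(u, v):
    """Cosine of the angle between u and v; lies between -1 and 1."""
    norm_u = math.sqrt(cross_correlation(u, u))
    norm_v = math.sqrt(cross_correlation(v, v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("NCC is undefined for a zero vector")
    return cross_correlation(u, v) / (norm_u * norm_v)
```

Because of the normalisation, NCC does not change when one of the vectors is multiplied by a positive constant, for example by a global change of image brightness gain.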

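Relation of SSD and NCC • The best match minimises SSD and maximises NCC. • We note, for the pattern vector X and the neighborhood vector Y: $SSD = \|Y - X\|^2 = \|Y\|^2 + \|X\|^2 - 2\,\langle Y, X \rangle$. For normalised vectors ($\|X\| = \|Y\| = 1$) this gives $SSD = 2 - 2\,NCC$.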
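For vectors normalised to unit length, expanding the squared norm of the difference gives SSD = 2 - 2 NCC, so minimising SSD and maximising NCC select the same match. The following sketch checks this identity numerically; the function names are hypothetical.

```python
import math


def unit(v):
    """Scale a non-zero vector to unit Euclidean length."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]


def ssd_and_ncc_side(u, v):
    """Return (SSD, 2 - 2*NCC) for the unit-length versions of u and v."""
    u, v = unit(u), unit(v)
    ssd = sum((a - b) ** 2 for a, b in zip(u, v))
    ncc = sum(a * b for a, b in zip(u, v))
    return ssd, 2 - 2 * ncc
```

Both returned values agree up to floating-point rounding for any pair of non-zero vectors of equal length.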

Tracking by correlation • Computation time of tracking by correlation depends on the size of the pattern (target) and the size of the image. • Testing all possible positions in the image is slow. • How can we optimise tracking by correlation (reduce the computation time)? • Reduce the number of tests by testing only one position out of two in each direction. This increases speed by a factor of 4, but reduces the precision of the result. Problem: if too few positions are tested, the target might be missed. • Reduce the number of tests by restricting the search to a small search region (region of interest, ROI).
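The first optimisation can be sketched by enumerating only every second position in each direction. This is a minimal illustration with a hypothetical function name; the matching score (SSD or NCC) would then be evaluated only at the returned positions.

```python
def strided_positions(image_size, pattern_size, step=2):
    """Positions (i, j) to test when only every step-th position is visited."""
    I, J = image_size
    M, N = pattern_size
    return [(i, j)
            for i in range(0, I - M + 1, step)
            for j in range(0, J - N + 1, step)]
```

With step=2 the number of tests drops by a factor of about 4, at the price of a position estimate that can be off by one pixel in each direction.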

Speed up of tracking • The search region can be determined from the position of the target at time t-1 and its maximum speed. This is measured in pixels/delta t. • If we can reduce the search region, we can process more images (reduce delta t), which allows us to reduce the search region further, and so on. • Problem: the speed in the image depends on the distance of the object to the camera. Close objects have higher apparent speeds than objects far away.
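A search region can be derived from the previous target position and the maximum speed roughly as follows. This is a sketch under stated assumptions: the target is an axis-aligned box given by its top-left corner, the margin is rounded up to whole pixels to stay on the safe side, and the function name and parameters are hypothetical.

```python
import math


def search_region(prev_top_left, target_size, max_speed, frame_rate, image_size):
    """Return (x_min, y_min, x_max, y_max) of the ROI, clipped to the image."""
    # Maximum displacement between two frames, in pixels (rounded up).
    margin = math.ceil(max_speed / frame_rate)
    x, y = prev_top_left
    w, h = target_size
    width, height = image_size
    return (max(0, x - margin), max(0, y - margin),
            min(width, x + w + margin), min(height, y + h + margin))
```

Doubling the frame rate halves the margin, which is the feedback effect described above: a smaller region is cheaper to search, which in turn allows a higher frame rate.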

Example

Example • A person traverses the entry hall in 5.2 s (130 frames / 25 frames/s) • Distance is 288 pixels, target size is 45x35 pixels • Speed: 288 pixels / 5.2 s = 55.4 pixels/s • Let the maximum speed be twice the measured speed: 110.8 pixels/s • At 25 frames/s this corresponds to a maximum displacement of 110.8 / 25 = 4.4 pixels per frame • ROI = target size +/- 4.4 pixels = 54 x 44 pixels.
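The arithmetic of this example can be reproduced with a short sketch. The function name and the safety_factor parameter (the factor of two between measured and maximum speed) are hypothetical names for the quantities on the slide.

```python
def roi_size(frames, frame_rate, distance_px, target_size, safety_factor=2.0):
    """Return (duration_s, speed_px_per_s, margin_px, (roi_width, roi_height))."""
    duration = frames / frame_rate           # seconds needed to cross the hall
    speed = distance_px / duration           # measured speed in pixels/s
    max_speed = safety_factor * speed        # assumed maximum speed
    margin = max_speed / frame_rate          # maximum displacement per frame
    width = round(target_size[0] + 2 * margin)
    height = round(target_size[1] + 2 * margin)
    return duration, speed, margin, (width, height)
```

Calling roi_size(130, 25, 288, (45, 35)) gives a duration of 5.2 s, a speed of about 55.4 pixels/s, a margin of about 4.4 pixels and a ROI of 54 x 44 pixels.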

Example • Number of tests for exhaustive search (searching the whole image of size 384x288 pixels) • (384-45)(288-35) = 85767 tests • Number of tests using the search region (54x44 pixels) • (54-45)(44-35) = 81 tests • Speed-up factor 85767/81 ≈ 1059
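The test counts can be checked with a few lines of Python. The sketch follows the counting convention of the slide, (width - target width) * (height - target height); counting both end positions would add 1 to each factor. The function name is hypothetical.

```python
def number_of_tests(region_size, target_size):
    """Number of tested positions, using the slide's counting convention."""
    (width, height), (w, h) = region_size, target_size
    return (width - w) * (height - h)


exhaustive = number_of_tests((384, 288), (45, 35))   # whole image
restricted = number_of_tests((54, 44), (45, 35))     # search region only
speed_up = exhaustive / restricted
```

The restricted search needs roughly a thousand times fewer tests than the exhaustive search in this example.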
