Homography Based Multiple Camera Detection and Tracking of People in a Dense Crowd

Homography Based Multiple Camera Detection and Tracking of People in a Dense Crowd • Paper by: Ran Eshel and Yael Moses • Presented by: Idan Aharoni

Motivation • Usually for surveillance, but not only. • Many cameras create an enormous amount of data that is impossible to track manually. • Many real-life scenes are crowded.

Single camera tracking • Many papers address this problem; some of them were presented in this course: • Floor fields for tracking in high density crowds • Unsupervised Bayesian detection of independent motion in crowds • Particle filters • etc.

Single camera tracking problems • Body parts are not isolated (a problem for human shape trackers) • Targets interact • Targets block each other • …

Algorithm Overview • Combine data from a set of cameras overlooking the same scene. • Based on that data, detect human head tops. • Track the detected head tops using assumptions about the expected trajectory.

Scene Example

What Is a Homography? • A homography is a coordinate transformation from one image to another, represented by a 3x3 matrix. • It is valid in 2 cases only: • The camera only rotates between the two views. • The observed points all lie on the same plane.
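As a minimal sketch of what "a coordinate transformation represented by a 3x3 matrix" means in practice, the snippet below maps pixel coordinates through a homography using homogeneous coordinates. The function name `apply_homography` and the example matrix `H_shift` are illustrative choices, not taken from the presentation.

```python
import numpy as np

def apply_homography(H, points):
    """Map an (n, 2) array of pixel coordinates through a 3x3 homography H."""
    pts = np.asarray(points, dtype=float)
    # Lift (x, y) to homogeneous coordinates (x, y, 1).
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homogeneous @ np.asarray(H, dtype=float).T
    # Divide by the third coordinate to return to pixel coordinates.
    return mapped[:, :2] / mapped[:, 2:3]

# Hypothetical example matrix: a translation by (5, -2) written as a homography.
H_shift = np.array([[1.0, 0.0, 5.0],
                    [0.0, 1.0, -2.0],
                    [0.0, 0.0, 1.0]])
shifted = apply_homography(H_shift, [[0, 0], [10, 20]])
```

Because of the final division, multiplying `H_shift` by any non-zero constant gives the same mapping.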

More Homographies • Translation: • Rotation: • Affine:
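The matrices from this slide are not available here. For reference, the standard textbook forms of these three special cases in homogeneous coordinates are given below; the symbols $t_x, t_y, \theta, a_{ij}$ are the usual conventions and are assumed, not copied from the slide.

```latex
H_{\text{translation}} =
\begin{pmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{pmatrix},
\qquad
H_{\text{rotation}} =
\begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix},
\qquad
H_{\text{affine}} =
\begin{pmatrix} a_{11} & a_{12} & t_x \\ a_{21} & a_{22} & t_y \\ 0 & 0 & 1 \end{pmatrix}
```

In all three the last row is $(0, 0, 1)$, so parallel lines stay parallel.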

More Homographies • Only 4 point correspondences are needed to calculate one (the matrix is defined up to a scaling factor). • Projection: • It describes what happens to the perceived positions of observed objects when the point of view of the observer changes.
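The "4 points, defined up to a scaling factor" statement can be illustrated with the direct linear transform (DLT): each correspondence gives two linear equations in the nine matrix entries, so four correspondences fix the matrix up to scale. This is a generic sketch, not necessarily how the authors computed their homographies; the function name and the sample points are hypothetical.

```python
import numpy as np

def homography_from_points(src, dst):
    """Estimate a 3x3 H (up to scale) with dst ~ H @ src from >= 4 point pairs."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear equations per correspondence.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.array(rows, dtype=float)
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    # Remove the free scale; assumes H[2, 2] is non-zero for this configuration.
    return H / H[2, 2]

# Hypothetical correspondences: unit square corners and where they appear in an image.
src = [(0, 0), (1, 0), (1, 1), (0, 1)]
dst = [(10, 10), (30, 12), (28, 35), (8, 30)]
H = homography_from_points(src, dst)
```

With more than four correspondences the same code returns a least-squares estimate, which is usually preferable when the points are marked by hand.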

Not a Homography! • Barrel Correction:

Homography Points Detection • For each camera, we want to find 4 points in each height plane.

Homography Points Detection

Height Calculation • Cross ratio of 4 pixels:
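The slide's formula is not available here. The property such a height computation relies on is that the cross ratio of four collinear points is preserved by any projective transformation, so a ratio measured in pixels equals the same ratio measured in the scene. The sketch below only demonstrates that invariance; the function names, the sample points and the matrix `H_example` are hypothetical.

```python
import numpy as np

def cross_ratio(a, b, c, d):
    """Unsigned cross ratio (a, b; c, d) of four collinear 2D points."""
    a, b, c, d = (np.asarray(p, dtype=float) for p in (a, b, c, d))

    def dist(p, q):
        return np.linalg.norm(p - q)

    return (dist(a, c) * dist(b, d)) / (dist(b, c) * dist(a, d))

def project(H, point):
    """Map one 2D point through a 3x3 homography."""
    x, y, w = H @ np.array([point[0], point[1], 1.0])
    return np.array([x / w, y / w])

# Four points on one line, and an arbitrary projective transformation.
line_points = [(0, 0), (1, 2), (3, 6), (7, 14)]
H_example = np.array([[1.0, 0.2, 3.0],
                      [0.1, 1.1, -2.0],
                      [0.001, 0.002, 1.0]])
before = cross_ratio(*line_points)
after = cross_ratio(*(project(H_example, p) for p in line_points))
# before and after agree up to floating-point error.
```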

Floor Plane Projection • We can define a homography from the image to itself that transforms one height plane to another height plane. • Again, all we need are 4 points at each height.

Head Top Detection • Head top: the highest 2D patch of a person. • The detection is based on co-temporal frames: frames taken at the same time from different cameras.

Head Top Detection • [Figure panels: Camera A; Camera B; B projected onto A plane; A on B]

Background Subtraction • First stage of the algorithm. • All subsequent stages are performed on foreground pixels only. • Each frame is subtracted from a background sample captured offline.
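A minimal sketch of this stage, assuming grey level frames stored as NumPy arrays and a simple absolute-difference test. The threshold value is a hypothetical choice; the presentation does not state how the difference is thresholded.

```python
import numpy as np

def foreground_mask(frame, background, threshold=25):
    """Boolean mask of pixels whose grey level differs from the background sample."""
    # Widen the type first so the subtraction of uint8 images cannot wrap around.
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return diff > threshold

# Toy example: a flat background and a frame with a brighter 2x2 patch.
background = np.full((4, 4), 100, dtype=np.uint8)
frame = background.copy()
frame[1:3, 1:3] = 200
mask = foreground_mask(frame, background)
```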

What is a Hyper-Pixel? • A hyper-pixel is an Nx1 vector of intensities (N denotes the number of cameras). • q: a reference image pixel. • The homography-related pixels of q in the rest of the images. • The homography transformation of image i onto the reference image (its inverse maps q into image i). • I: intensity level.

Hyper-Pixel Usage • A hyper-pixel is calculated for each foreground pixel of the reference image. • The variance of the hyper-pixel intensities estimates how well the pixels from the different images agree: low variance suggests they show the same scene point.
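The variance computation can be sketched as follows, for one height plane. It assumes that `homographies[i]` maps reference-image pixels into `others[i]` (the direction convention is an assumption), uses nearest-neighbour sampling, and marks pixels that cannot be sampled in every image with infinite variance. All names are hypothetical.

```python
import numpy as np

def hyper_pixel_variance(reference, others, homographies, mask):
    """Variance of the hyper-pixel intensities at each foreground reference pixel."""
    ys, xs = np.nonzero(mask)
    q = np.stack([xs, ys, np.ones_like(xs)]).astype(float)  # 3 x n homogeneous
    samples = [reference[ys, xs].astype(float)]
    valid = np.ones(len(xs), dtype=bool)
    for image, H in zip(others, homographies):
        p = H @ q
        px = np.rint(p[0] / p[2]).astype(int)
        py = np.rint(p[1] / p[2]).astype(int)
        rows, cols = image.shape
        valid &= (px >= 0) & (px < cols) & (py >= 0) & (py < rows)
        # Clip so indexing is safe; out-of-image samples are discarded below.
        samples.append(image[np.clip(py, 0, rows - 1),
                             np.clip(px, 0, cols - 1)].astype(float))
    variance = np.full(reference.shape, np.inf)
    variance[ys[valid], xs[valid]] = np.var(np.stack(samples), axis=0)[valid]
    return variance

# Toy example: the second view is the reference shifted one pixel to the right.
ref = np.arange(20, dtype=float).reshape(4, 5)
other = np.roll(ref, 1, axis=1)
H_shift = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
var_map = hyper_pixel_variance(ref, [other], [H_shift], np.ones((4, 5), dtype=bool))
```

In the toy example the two views agree wherever the shifted pixel exists, so the variance there is zero; the last column maps outside the second image and stays at infinity.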

Hyper Pixel Variance • [Figure: four example hyper-pixels labeled low variance, low variance, high variance, low variance]

2D patches • We now have a variance value for each pixel (a variance map). • We need to obtain candidates for real projected pixels. • Use variance thresholds and clustering into head-sized patches (K-Means).

K-Means Clustering • Partition N observations into K clusters. • Each observation belongs to the cluster with the nearest mean. • Repeat until convergence… (Source: Wikipedia)
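The two steps on this slide (assign each observation to the nearest mean, recompute the means, repeat) are Lloyd's algorithm. Below is a minimal sketch that takes the initial centers as an argument, since the result depends on them; how the presented method initializes the clusters is not stated, and the function name and sample data are hypothetical.

```python
import numpy as np

def k_means(points, initial_centers, iterations=100):
    """Lloyd's algorithm: returns (labels, centers) for the given starting centers."""
    points = np.asarray(points, dtype=float)
    centers = np.asarray(initial_centers, dtype=float)
    labels = np.zeros(len(points), dtype=int)
    for _ in range(iterations):
        # Assignment step: index of the nearest center for every point.
        distances = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Update step: mean of each cluster (an empty cluster keeps its center).
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(len(centers))
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Toy example: two well-separated groups of low-variance pixel coordinates.
pixels = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = k_means(pixels, initial_centers=[(0, 0), (10, 10)])
```

K-means only finds a local optimum, so a poor choice of starting centers can split one group and merge parts of another.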

Back to Floor Projection… • A person can be detected on more than one height plane. • All heights are projected to the floor, and only the highest patch is kept: the head.
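One simple way to keep only the highest patch per person, once every detection has been projected to floor coordinates, is a greedy pass from the tallest detection down. This is a sketch under assumptions: detections are `(x, y, height)` tuples, and `radius` (how close two floor positions must be to count as the same person) is a hypothetical parameter.

```python
def keep_highest(detections, radius):
    """Keep, for each floor neighbourhood, only the detection with the largest height."""
    kept = []
    # Visit detections from highest to lowest so the head is seen first.
    for x, y, h in sorted(detections, key=lambda d: d[2], reverse=True):
        far_from_all = all(
            (x - kx) ** 2 + (y - ky) ** 2 > radius ** 2 for kx, ky, _ in kept
        )
        if far_from_all:
            kept.append((x, y, h))
    return kept

# Toy example: the first two detections belong to the same person.
detections = [(0.0, 0.0, 1.2), (0.1, 0.0, 1.7), (3.0, 3.0, 1.6)]
heads = keep_highest(detections, radius=0.5)
```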

Example • [Figure panels: reference foreground; projected foregrounds; variance map; single height detection; all heights detection; track]

Tracking • So far we have a map of potential heads and their heights. • Tracking should remove false positives and recover false negatives. • For that we define a few prior-based measures.

Tracking – First Stage • In this stage, we aim to remove false negatives. • For that we have two head maps (projected to the floor): one with a high threshold and one with a low threshold. • The high threshold yields fewer false positives, but more false negatives.

Tracking – First Stage • High threshold map: • If a track has a hole, we try to fill it from the low threshold map.

Tracking – First Stage • If no match can be found in either the high or the low threshold map, the track is terminated at that point.
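The first stage can be sketched as a nearest-neighbour tracker with a fallback: look for the next position in the high threshold map, then in the low threshold map, and stop the track if neither has a match. The matching rule (nearest detection within `max_step`) and all names are assumptions made for illustration.

```python
import math

def nearest(position, candidates, max_step):
    """Closest candidate within max_step of position, or None."""
    best, best_distance = None, max_step
    for candidate in candidates:
        distance = math.dist(position, candidate)
        if distance <= best_distance:
            best, best_distance = candidate, distance
    return best

def extend_track(start, high_maps, low_maps, max_step):
    """Follow one head through per-frame lists of floor-projected detections."""
    track = [start]
    for high, low in zip(high_maps, low_maps):
        match = nearest(track[-1], high, max_step)
        if match is None:
            # Hole in the high threshold map: try the low threshold map.
            match = nearest(track[-1], low, max_step)
        if match is None:
            break  # no match in either map: the track ends here
        track.append(match)
    return track

# Toy example: the head is missing from the high map in the second frame.
high_maps = [[(1, 0)], [], [(3, 0)]]
low_maps = [[(1, 0)], [(2, 0)], [(3, 0)]]
track = extend_track((0, 0), high_maps, low_maps, max_step=1.5)
```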

Tracking – Second Stage • Now we have a list of fragmented tracks. • It is very easy for a human to figure out which one goes where…

Tracking – Second Stage • In this stage we aim to connect fragmented tracks by using priors on how people move. • For that we define a score, calculated from 6 measures, for each pair of time-overlapping tracks.

Second Stage - Scores 1) The difference in direction. 2) Direction change required.

Second Stage - Scores 3) Amount of overlap between tracks. 4) Minimal distance along the tracks. 5) Average distance along the tracks.

Second Stage - Scores 6) Height change: not very likely within a tracking time frame…
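The slides give only the names of these measures, so the sketch below is one plausible reading of four of them (overlap, minimal distance, average distance, difference in direction) for a pair of time-overlapping tracks. The exact definitions used by the authors may differ. Tracks are assumed to be dictionaries from frame index to floor position, and every name is hypothetical.

```python
import math

def heading(track):
    """Overall direction of a track, from its first to its last position."""
    frames = sorted(track)
    (x0, y0), (x1, y1) = track[frames[0]], track[frames[-1]]
    return math.atan2(y1 - y0, x1 - x0)

def pair_measures(track_a, track_b):
    """Measures for two tracks given as {frame: (x, y)}; None if they never overlap."""
    common = sorted(set(track_a) & set(track_b))
    if not common:
        return None
    distances = [math.dist(track_a[t], track_b[t]) for t in common]
    difference = abs(heading(track_a) - heading(track_b))
    # Wrap the angle difference into [0, pi].
    difference = min(difference, 2 * math.pi - difference)
    return {
        "overlap": len(common),
        "min_distance": min(distances),
        "avg_distance": sum(distances) / len(distances),
        "direction_difference": difference,
    }

# Toy example: two parallel tracks, one unit apart, overlapping in frames 1 and 2.
a = {0: (0, 0), 1: (1, 0), 2: (2, 0)}
b = {1: (1, 1), 2: (2, 1), 3: (3, 1)}
measures = pair_measures(a, b)
```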

Tracking - Scores • Score calculation: • : Maximum expected value of score

Tracking – Final stage • We now have a set of full-length trajectories. • In this stage, tracks suspected of being false positives are removed.

Tracking – Final stage • For each trajectory, we compute a consistency score between every 2 consecutive frames. • The consistency score is a weighted average of: • Unnatural speed changes. • Unnatural direction changes. • Changes in height. • Too short a track length.

Results - scene • Cameras: • 3 - 9 grey level cameras. • 15 fps, 640x512. • 30° relative to each other. • 45° below the horizon. • Scene: • 3x6 meters

Criteria • True Positive (TP): 75% - 100% of the trajectory is tracked (possibly with IDC). • Perfect True Positive (PTP): 100% of the trajectory is tracked (no IDC). • Detection Rate (DR): percentage of frames tracked compared to the full trajectory. • ID Change (IDC). • False Negative (FN): less than 75% of the trajectory is tracked. • False Positive (FP): a track with no real trajectory.
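The criteria above can be written as a small classification function for one ground-truth trajectory. Since every PTP is also a TP by these definitions, the sketch returns the most specific label. The function name and argument names are hypothetical.

```python
def classify_trajectory(tracked_frames, total_frames, id_changes=0):
    """Return (label, detection_rate) for one ground-truth trajectory."""
    detection_rate = tracked_frames / total_frames
    if detection_rate < 0.75:
        return "FN", detection_rate
    if tracked_frames == total_frames and id_changes == 0:
        return "PTP", detection_rate
    return "TP", detection_rate

# Toy examples: a fully tracked trajectory and one that is tracked half of the time.
perfect = classify_trajectory(100, 100)
missed = classify_trajectory(50, 100)
```

False positives are not covered, because they are tracks with no ground-truth trajectory to compare against.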

Results – Summary

Varying the number of cameras • It seems that at least 8-9 cameras are needed…

Questions?
