Homography based multiple camera detection and tracking of people in a dense crowd
Homography Based Multiple Camera Detection and Tracking of People in a Dense Crowd

Presented by: Idan Aharoni

Ran Eshel and Yael Moses


Motivation

  • Usually for surveillance, but not only.

  • Many cameras create an enormous amount of data, which is impossible to track manually.

  • Many real-life scenes are crowded.

Single camera tracking

  • Many papers address this problem; some were presented in this course:

    • Floor fields for tracking in high density crowds

    • Unsupervised Bayesian detection of independent motion in crowds

    • Particle Filters.

    • etc.

Single camera tracking problems

  • Body parts are not isolated (a problem for human-shape trackers)

  • Interactions between targets

  • Targets occluding each other

Algorithm Overview

  • Combine data from a set of cameras overlooking the same scene.

  • Based on that data, try to detect human head tops.

  • Track the detected head tops, using assumptions about the expected trajectory.

What Is a Homography?

  • A homography is a coordinate transformation from one image to another, represented by a 3x3 matrix.

  • It is possible in only 2 cases:

    • Camera rotation.

    • All points lying on the same plane.
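As a quick sketch of what the 3x3 matrix does, a homography acts on points in homogeneous coordinates and then divides out the scale. The function and example matrix below are illustrative, not from the slides:

```python
# Apply a 3x3 homography H to a 2D point (x, y) via homogeneous
# coordinates: [x', y', w']^T = H [x, y, 1]^T, then divide by w'.

def apply_homography(H, x, y):
    xp = H[0][0] * x + H[0][1] * y + H[0][2]
    yp = H[1][0] * x + H[1][1] * y + H[1][2]
    wp = H[2][0] * x + H[2][1] * y + H[2][2]
    return xp / wp, yp / wp

# A pure translation by (5, -2) is the simplest homography.
T = [[1, 0, 5],
     [0, 1, -2],
     [0, 0, 1]]
print(apply_homography(T, 10, 10))  # -> (15.0, 8.0)
```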

More Homographies

  • Translation:

  • Rotation:

  • Affine:

More Homographies

  • Only 4 point correspondences are needed to compute it (it is defined up to a scale factor).

  • Projection:

  • It describes what happens to the perceived positions of observed objects when the point of view of the observer changes.
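A minimal sketch of the 4-point computation, assuming the standard DLT formulation with the bottom-right entry fixed to 1 (the function names and the translation example are my own, not the paper's):

```python
# Estimate a homography from 4 point correspondences - the minimum needed,
# since H has 8 degrees of freedom once scale is fixed by setting h33 = 1.
# Solves the 8x8 DLT linear system with plain Gaussian elimination.

def solve(A, b):
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))  # partial pivoting
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c and M[r][c]:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * m for a, m in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def homography_from_points(src, dst):
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11 x + h12 y + h13) / (h31 x + h32 y + 1), and same for v.
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = solve(A, b)
    return [[h[0], h[1], h[2]], [h[3], h[4], h[5]], [h[6], h[7], 1.0]]

# Recover a pure translation by (2, 3) from the 4 corners of a unit square.
src = [(0, 0), (1, 0), (1, 1), (0, 1)]
dst = [(2, 3), (3, 3), (3, 4), (2, 4)]
H = homography_from_points(src, dst)
```

With 4 correspondences in general position (no 3 collinear) the system has a unique solution; with more points one would minimize the residual instead.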

Not a Homography!

  • Barrel Correction:

Homography Points Detection

  • For each camera, we want to find 4 points in each height plane.

Height Calculation

  • The height is computed from the cross-ratio of 4 collinear pixels:
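The cross-ratio is useful here because it is invariant under projective transformations, so a known reference height on the same vertical line constrains an unknown height. A small illustrative check (the example map is arbitrary):

```python
# Cross-ratio of four collinear points, given their 1D positions along
# the line. It is preserved by any projective transformation.

def cross_ratio(a, b, c, d):
    return ((a - c) * (b - d)) / ((a - d) * (b - c))

# Invariance check under an arbitrary 1D projective map t -> (2t+1)/(t+3):
f = lambda t: (2 * t + 1) / (t + 3)
pts = [0.0, 1.0, 2.0, 5.0]
before = cross_ratio(*pts)                 # 1.6
after = cross_ratio(*(f(t) for t in pts))  # also 1.6, up to rounding
```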

Floor Plane Projection

  • We can define a homography from the image to itself, that will transform a height plane to another height plane.

  • Again, all we need are 4 points at each height.

Head Top Detection

  • Head Top – The highest 2D patch of a person.

  • The detection is based on co-temporal frames – frames taken at the same time from different cameras.

Head Top Detection

[Figure: co-temporal frames from camera A and camera B, B projected onto A's plane, and A projected onto B's plane]

Background Subtraction

  • First stage of the algorithm.

  • All the next stages are performed on foreground pixels only.

  • Subtract each frame from an offline background sample.
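A minimal sketch of this stage: threshold the per-pixel absolute difference against the offline background sample. The threshold value and the tiny 2x3 images are illustrative assumptions:

```python
# Per-pixel background subtraction: keep pixels whose absolute intensity
# difference from the offline background exceeds a threshold.

def foreground_mask(frame, background, threshold=25):
    return [[abs(f - b) > threshold for f, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]

background = [[100, 100, 100],
              [100, 100, 100]]
frame      = [[100, 180, 102],
              [ 30, 100, 100]]
mask = foreground_mask(frame, background)
# mask -> [[False, True, False], [True, False, False]]
```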

What Is a Hyper-Pixel?

  • A hyper-pixel is an Nx1 vector (N denotes the number of cameras).

  • q – a pixel in the reference image.

  • qᵢ – the homography-related pixels in the rest of the images.

  • Hᵢ – the homography transformation of image i onto the reference image (its inverse maps the reference image onto image i).

  • I – intensity level.

Hyper-Pixel Usage

  • A hyper-pixel is calculated for each foreground pixel of the reference image.

  • Using the hyper-pixel intensity variance, we can estimate the correlation between the pixels from the different images.
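The idea can be sketched as follows: one intensity per camera, and a low variance suggests all cameras see the same physical surface at that height plane. The sample intensities below are illustrative:

```python
# Score a hyper-pixel (one intensity per camera) by its population
# variance: agreement across cameras -> low variance -> likely a real
# patch at this height plane.
from statistics import pvariance

def hyperpixel_variance(intensities):
    return pvariance(intensities)

consistent = [120, 122, 119, 121]  # cameras agree -> low variance
clutter    = [120, 40, 200, 90]    # projections of different objects
```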

Hyper-Pixel Variance

[Figure: example hyper-pixels; three show low variance, one shows high variance]

2D Patches

  • Now we have a variance map, with a value for each pixel.

  • We need to obtain candidates for real projected pixels.

  • Use variance thresholds and head-size clustering (K-means).

K-Means Clustering

  • Partition N observations into K clusters

  • Each observation belongs to the cluster with the nearest mean.

Repeat until convergence…

Thanks Wiki!
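The two alternating steps above can be sketched directly. The 2D points and the fixed initial means below are illustrative; a real run would seed the means from the candidate head patches:

```python
# Minimal K-means on 2D points (e.g. candidate head-top pixels).
# Deterministic here because the initial means are given.

def kmeans(points, means, iters=20):
    for _ in range(iters):
        clusters = [[] for _ in means]
        for p in points:  # assignment step: nearest mean (squared distance)
            j = min(range(len(means)),
                    key=lambda k: (p[0] - means[k][0]) ** 2
                                + (p[1] - means[k][1]) ** 2)
            clusters[j].append(p)
        # update step: move each mean to its cluster's centroid
        means = [(sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c))
                 if c else m for c, m in zip(clusters, means)]
    return means, clusters

pts = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
means, clusters = kmeans(pts, [(0, 0), (10, 10)])
```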

Back to Floor Projection…

  • A person can be detected on more than one height plane.

  • All heights are projected to the floor, and only the highest patch is kept…

  • A Head!


[Figure: reference foreground, projected foregrounds, variance map, single-height detection, and all-heights detection]



Tracking

  • So far we have a map of potential heads and heights.

  • Tracking should remove false positives and false negatives.

  • For that we define a few prior-based measurements.

Tracking – First Stage

  • In this stage, we aim to remove false negatives.

  • For that we build two head maps, one with a high threshold and one with a low threshold (both projected to the floor).

  • A high threshold yields fewer false positives, but more false negatives.

Tracking – First Stage

  • High threshold map:

  • If there is a hole, we try to fill it from the low-threshold map.
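The hole-filling idea can be sketched in one line: where the high-threshold map has no detection for a tracked head, take the low-threshold detection at that frame instead. The positions below are illustrative:

```python
# Dual-threshold hole filling along a track: None marks a frame where the
# high-threshold head map missed the head.

def fill_holes(high, low):
    return [h if h is not None else l for h, l in zip(high, low)]

high = [(3, 4), None, (5, 6)]    # a hole in the middle frame
low  = [(3, 4), (4, 5), (5, 6)]  # the low-threshold map still fires
track = fill_holes(high, low)    # -> [(3, 4), (4, 5), (5, 6)]
```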

Tracking – First Stage

  • If no match can be found in either the high- or the low-threshold map, tracking of this track is stopped.

Tracking – Second Stage

  • Now we have a list of fragmented tracks.

  • Very easy for a human to figure out which one goes where…

Tracking – Second Stage

  • In this stage we aim to connect fragmented tracks, by using priors of how people move.

  • For that we define a score, computed from 6 parameters, for each pair of time-overlapping tracks.

Second Stage - Scores

  1. The difference in direction

  2. Direction change

Second Stage - Scores

3) Amount of overlap between the tracks.

4) Minimal distance along the tracks.

5) Average distance along the tracks.

Second Stage - Scores

6) Height change – not very likely within a tracking time frame…

Tracking - Scores

  • Score calculation:

  • Each term is normalized by the maximum expected value of its score.
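One plausible way to combine the six measurements, purely illustrative since the slide omits the exact formula: normalize each raw measurement by its maximum expected value, clip to [0, 1], and sum. All numbers below are assumptions:

```python
# Combine the six pairwise-track measurements into one linking score;
# lower totals mean a better match between two track fragments.

def link_score(measurements, max_expected):
    return sum(min(m / s, 1.0) for m, s in zip(measurements, max_expected))

measurements = [0.2, 0.1, 0.0, 0.5, 0.6, 0.0]   # the six raw terms
max_expected = [1.0, 1.0, 1.0, 2.0, 2.0, 0.5]   # assumed normalizers
score = link_score(measurements, max_expected)  # 0.2+0.1+0+0.25+0.3+0 = 0.85
```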

Tracking – Final stage

  • We now have a full-length set of trajectories.

  • In this stage, tracks that are suspected as false positives are removed.

Tracking – Final stage

  • For each trajectory, we use a consistency score between every two consecutive frames.

  • The consistency score is a weighted average of:

    • Unnatural speed changes.

    • Unnatural direction changes.

    • Changes in height.

    • Too-short track length.

Results - Scene

  • Cameras:

    • 3 to 9 grey-level cameras.

    • 15 fps, 640x512.

    • Cameras placed 30° apart from each other.

    • Tilted 45° below the horizon.

  • Scene:

    • 3x6 meters.


  • True Positive (TP): 75%–100% of the trajectory is tracked (possibly with IDC).

  • Perfect True Positive (PTP): 100% of the trajectory is tracked (no IDC).

  • Detection Rate (DR): percentage of frames tracked, compared to the full trajectory.

  • ID Change (IDC): the track switches to a different person's trajectory.

  • False Negative (FN): less than 75% of the trajectory is tracked.

  • False Positive (FP): a track with no real trajectory.

Varying the number of cameras

  • It seems that at least 8–9 cameras are needed…