Video Synopsis

Yael Pritch, Alex Rav-Acha, Shmuel Peleg
The Hebrew University of Jerusalem

Detective Series: “Elementary”

Video Surveillance Problem
Cologne train bombs, 31-7-06. Terrorists, London tube, 7-7-05.
• It took weeks to find these events in video archives.
• The cost of lost information or of a delay may be very high.

Challenges in Video Surveillance
• Millions of surveillance cameras are installed, capturing data 24 hours a day, 365 days a year
• The number of cameras and their resolution are increasing rapidly
• Not enough people to watch captured data
• Human attention is lost after ~20 minutes
• Result: recorded video is lost video
• Less than 1% of surveillance video is examined

Handling Surveillance Video
• Object Detection and Tracking
  • Background Subtraction
• Object Recognition
  • Individual people
• Activity Recognition
  • Left luggage; Fight
• Much progress has been made. More work remains.

Handling Surveillance Video: Video Synopsis
• Object Detection and Tracking
  • Background Subtraction (Assume Done)
• Object Recognition (Do not use)
  • Individual people
• Activity Recognition (Do not use)
  • Left luggage; Fight
• Much progress has been made. More work remains.
• Let people do the recognition

Video Synopsis
[Figure: original video | video synopsis]
• A fast way to browse & index video archives.
• Summarize a full day of video in a few minutes.
• Events from different times appear simultaneously.
• Human inspection of the synopsis!

Synopsis of Surveillance Videos: Human Inspection of Search Results
• Serve queries regarding each camera:
  • Generate a 3-minute video showing most activities in the last 24 hours
  • Generate the shortest video showing all activities in the last 24 hours
• Each presented activity points back to its original time in the original video
• Orthogonal to Video Analytics

Non-Chronological Time: Dynamic Mosaicing, Video Synopsis
The Hebrew University of Jerusalem
Salvador Dali

Dynamic Mosaics Non Chronological Time

Handheld Stereo Mosaic

[Figure: strips taken from the original frames form the mosaic image; axes u and t]

[Figure: space-time slice between u_a and u_b; frames t_k and t_l with their visibility regions; axes u and t; mosaic image]
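The strip-pasting idea in the figures above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the camera pans by a known, constant number of columns per frame, and the names `build_strip_mosaic` and `strip_width` are hypothetical.

```python
def build_strip_mosaic(frames, strip_width):
    """Paste one vertical strip per frame side by side into a mosaic.

    frames: list of frames, each a list of rows (lists of pixel values).
    strip_width: columns taken from the centre of each frame. It is assumed
    to equal the camera shift per frame, so consecutive strips abut.
    """
    height = len(frames[0])
    width = len(frames[0][0])
    start = (width - strip_width) // 2  # left edge of the central strip
    mosaic = [[] for _ in range(height)]
    for frame in frames:
        for y in range(height):
            mosaic[y].extend(frame[y][start:start + strip_width])
    return mosaic
```

Each frame contributes only a narrow strip, so every mosaic column comes from a different time; this is what makes the time axis of the result non-chronological.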

Creating Dynamic Panoramic Movies
[Figure: slices of the space-time volume (axes u, t) are played from the first slice (first mosaic, appearance) to the last slice (last mosaic, disappearance)]

Dynamic Panorama: Iguazu Falls

From Video In to Video Out
Constructing an aligned Space-Time Volume
[Figure: aligned space-time volume, axes u and t; labels dt, a, α, v, b]
Alignment: Parallax, Dynamic Scenes, etc.

Aligned ST Volume: View from Top
[Figure: frames k and k+1 in the u-t plane, for a stationary camera and for a panning camera]

Generate Output Video
Sweeping a “Time Front” surface
Interpolation
Time is not chronological any more

Evolving Time Front
Mapping each time front (TF) to a new frame using spatio-temporal interpolation
[Figure: axes u, t and x]
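Sweeping a time front through an aligned space-time volume can be sketched as below. This is a simplified illustration under stated assumptions: the time front is a slanted plane t(u) = k + slope * u, interpolation is linear in time only, and `render_time_front`, `sweep_time_front` and `slope` are hypothetical names rather than anything defined in the slides.

```python
def render_time_front(volume, front):
    """Sample one output frame from an aligned space-time volume.

    volume[t][y][u] is the pixel at time t, row y, column u.
    front[u] is the (possibly fractional) time from which column u is taken.
    Pixels are linearly interpolated between the two nearest frames.
    """
    n_frames = len(volume)
    height = len(volume[0])
    width = len(volume[0][0])
    out = [[0.0] * width for _ in range(height)]
    for u in range(width):
        t = min(max(front[u], 0.0), n_frames - 1.0)  # clamp to the volume
        t0 = int(t)
        t1 = min(t0 + 1, n_frames - 1)
        w = t - t0
        for y in range(height):
            out[y][u] = (1.0 - w) * volume[t0][y][u] + w * volume[t1][y][u]
    return out


def sweep_time_front(volume, slope, n_out):
    """Sweep the planar time front t(u) = k + slope * u for k = 0..n_out-1."""
    width = len(volume[0][0])
    return [render_time_front(volume, [k + slope * u for u in range(width)])
            for k in range(n_out)]
```

With `slope = 0` every output frame is an original frame; a non-zero slope makes different columns of one output frame come from different input times.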

Example: Demolition

Example: Racing

Dynamic Panorama: Thessaloniki

Creating Panorama: 4D min-cut
[Figure: aligned space-time volume, axes t and x]

Mosaic Stitching Examples

Video Synopsis and Indexing: Making a Long Video Short
• 11 million cameras in 2008
• Expected 30 million in 2013
• Recording 24 hours a day, every day

Explosive growth in cameras…
[Chart labels: 11m, 24m; 2009, 2014]

Handling the Video Overflow
• Not enough people to watch captured data
• Guards are watching 1% of video
• Automatic Video Analytics covers less than 5%
  • Only when events can be accurately defined & detected
• Most video is never watched or examined!

A Recent Example

Related Work (Video Summary)
• Key frames: C. Kim and J. Hwang. An integrated scheme for object-based video abstraction. In ACM Multimedia, pages 303–311, New York, 2000.
• Collection of short video sequences: A. M. Smith and T. Kanade. Video skimming and characterization through the combination of image and language understanding. In CAIVD, pages 61–70, 1998.
• Adaptive fast forward: N. Petrovic, N. Jojic, and T. Huang. Adaptive video fast forward. Multimedia Tools and Applications, 26(3):327–344, August 2005.
Entire frames are used as the fundamental building blocks.
• Mosaic images together with some meta-data for video indexing: M. Irani, P. Anandan, J. Bergen, R. Kumar, and S. Hsu. Efficient representations of video sequences and their applications. Signal Processing: Image Communication, 8(4):327–351, 1996.
• Space-time video montage: H. Kang, Y. Matsushita, X. Tang, and X. Chen. Space-time video montage. In CVPR’06, pages 1331–1338, New York, June 2006.

Object Based Video Summary
• We proposed an object/event-based summary, as opposed to a frame-based summary
• Enables shortening a very long video into a short one
• No fast forward of objects (preserves dynamics)
• Causality is not necessarily kept

Video Synopsis
• Browse hours in minutes
• Index back to the original video
Original video: 24 hours. Video Synopsis: 1 minute.

Video Synopsis: Shift Objects in Time
[Figure: objects are shifted along the time axis t from the input video I(x,y,t) to the synopsis video S(x,y,t)]
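The time shift between I(x,y,t) and S(x,y,t) can be illustrated with a small sketch. It assumes each object is stored with its original start frame, its length, and the start frame chosen for it in the synopsis; the `Tube` record and `objects_at` function are hypothetical names introduced only for this example.

```python
from collections import namedtuple

# start: first frame in the input video; new_start: first frame in the synopsis.
Tube = namedtuple("Tube", ["name", "start", "length", "new_start"])


def objects_at(tubes, t_synopsis):
    """Return (name, original_frame) for each object visible at a synopsis frame."""
    visible = []
    for tube in tubes:
        offset = t_synopsis - tube.new_start
        if 0 <= offset < tube.length:
            # The object keeps its own dynamics; only its start time moves.
            visible.append((tube.name, tube.start + offset))
    return visible
```

For instance, with `Tube("a", 100, 10, 0)` and `Tube("b", 5000, 20, 5)`, synopsis frame 7 shows object "a" as it appeared in input frame 107 together with object "b" from input frame 5002. The same mapping is what lets a synopsis object index back into the original video.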

How does Video Synopsis work?
[Figure: original video, 9 hours; objects extracted to a database, with time labels 09:03, 10:00, 11:08, 14:38, 18:45, 21:50; video synopsis, 30 seconds]

How Does Video Synopsis Work?
Original: 9 hours. Video Synopsis: 30 seconds.

Steps in Video Synopsis
• Detect and track objects, store in database.
• Select relevant objects from database
• Display selected objects in a very short “Video Synopsis”
• In “Video Synopsis”, objects from different times can appear simultaneously
• Index from selected objects into original video
• Cluster similar objects

Object “Packing”
• Compute object trajectories
• Pack objects in shorter time (minimize overlap)
• Overlay objects on top of time-lapse background
[Figure: input video and synopsis video, axes t and x]
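The packing step can be sketched with a simple greedy placement. This is a stand-in for illustration only: the slides say just "minimize overlap" and do not specify an optimizer, so the greedy search, the cell-set tube representation, and the names `collision_cost` and `pack_tubes` are all assumptions.

```python
def collision_cost(tube_a, start_a, tube_b, start_b):
    """Count cells occupied by both tubes at the same synopsis frame.

    A tube is a list with one set of (x, y) cells per frame.
    """
    cost = 0
    for i, cells_a in enumerate(tube_a):
        j = start_a + i - start_b  # matching frame index inside tube_b
        if 0 <= j < len(tube_b):
            cost += len(cells_a & tube_b[j])
    return cost


def pack_tubes(tubes, synopsis_length):
    """Greedily choose a synopsis start frame for each tube, in input order."""
    starts = []
    for k, tube in enumerate(tubes):
        best_start, best_cost = 0, None
        for s in range(max(1, synopsis_length - len(tube) + 1)):
            cost = sum(collision_cost(tube, s, tubes[p], starts[p])
                       for p in range(k))
            # Strict comparison keeps the earliest start among equal costs.
            if best_cost is None or cost < best_cost:
                best_start, best_cost = s, cost
        starts.append(best_start)
    return starts
```

Two objects that cross the same cells are pushed to different synopsis times, while objects in different parts of the frame can share the same time, which is how events from different times end up appearing simultaneously.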

Example: Monitoring a Coffee Station

Original Movie Stroboscopic Movie

Panoramic Synopsis
Panoramic synopsis is possible when the camera is rotating.
[Figure: original video | panoramic video synopsis]

Endless video – Challenges
• Endless video – finite storage (“forget” events)
• Background changes over long time periods
• Stitching objects onto a background from a different time
• Fast response to user queries
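One way to picture "forgetting" under finite storage is a fixed-capacity store that evicts the least important object when full. The scoring rule below (activity that halves every `half_life` frames) is purely an assumption for illustration; the slides do not say how events are chosen for removal, and `TubeQueue` is a hypothetical name.

```python
class TubeQueue:
    """Fixed-capacity store that forgets the least important tube when full."""

    def __init__(self, capacity, half_life):
        self.capacity = capacity
        self.half_life = half_life  # frames after which importance halves
        self.tubes = []             # list of (end_time, activity, tube_id)

    def importance(self, tube, now):
        end_time, activity, _ = tube
        return activity * 0.5 ** ((now - end_time) / self.half_life)

    def insert(self, tube_id, end_time, activity):
        self.tubes.append((end_time, activity, tube_id))
        if len(self.tubes) > self.capacity:
            # Evict the tube with the lowest age-discounted activity.
            weakest = min(self.tubes,
                          key=lambda t: self.importance(t, end_time))
            self.tubes.remove(weakest)

    def ids(self):
        return [tube_id for _, _, tube_id in self.tubes]
```

Under this rule an old but very active object can outlive a newer, nearly static one, so the archive degrades gradually instead of simply dropping the oldest footage.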

Online Monitoring: 2-Phase Approach
Online Monitoring (real time)
• Compute background (background model)
• Find activity tubes and insert into database
• Handle a queue of objects
Query Service
• Collect tubes with desired properties (time…)
• Generate time-lapse background
• Pack tubes into desired length of synopsis
• Stitch objects to background
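The first step of the query phase, collecting tubes by time, can be sketched as a plain filter over the database built online. The record layout (dicts with `id`, `start`, `end`) and the function name `collect_tubes` are assumptions made for this example.

```python
def collect_tubes(database, t_from, t_to, min_length=1):
    """Return tubes that overlap [t_from, t_to) and last at least min_length frames.

    database: list of dicts with 'id', 'start' and 'end' (end is exclusive).
    """
    return [tube for tube in database
            if tube["start"] < t_to and tube["end"] > t_from
            and tube["end"] - tube["start"] >= min_length]
```

Because detection and tracking already ran in the online phase, a query only filters and packs stored tubes, which is what makes a fast response to user queries plausible.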

Extract Tubes: Object Detection and Tracking
• We used a simplification of Background-Cut*
  • combining background subtraction with min-cut
• Connect space-time tube components
• Morphological operations
* J. Sun, W. Zhang, X. Tang, and H. Shum. Background cut. In ECCV, pages 628–641, 2006.
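A much-reduced sketch of tube extraction follows. It swaps the Background-Cut segmentation for plain thresholded background subtraction and omits the morphological clean-up, keeping only the step of connecting foreground pixels into space-time components; `extract_tubes` and `threshold` are hypothetical names.

```python
from collections import deque


def extract_tubes(frames, background, threshold):
    """Label 3D connected components of foreground pixels as tubes.

    frames[t][y][x] and background[y][x] are grey levels. Returns a sorted
    list of tubes, each a sorted list of (t, y, x) foreground voxels.
    """
    n_t, n_y, n_x = len(frames), len(frames[0]), len(frames[0][0])
    # Simplification: a pixel is foreground if it differs enough from the
    # background (no min-cut smoothing as in Background-Cut).
    fg = {(t, y, x)
          for t in range(n_t) for y in range(n_y) for x in range(n_x)
          if abs(frames[t][y][x] - background[y][x]) > threshold}
    tubes = []
    while fg:
        seed = fg.pop()
        queue, tube = deque([seed]), [seed]
        while queue:  # breadth-first flood fill with 6-connectivity
            t, y, x = queue.popleft()
            for dt, dy, dx in ((1, 0, 0), (-1, 0, 0), (0, 1, 0),
                               (0, -1, 0), (0, 0, 1), (0, 0, -1)):
                n = (t + dt, y + dy, x + dx)
                if n in fg:
                    fg.remove(n)
                    queue.append(n)
                    tube.append(n)
        tubes.append(sorted(tube))
    return sorted(tubes)
```

Connecting along the time axis as well as in space is what turns per-frame foreground blobs into tubes, provided an object overlaps itself between consecutive frames.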

Extract Tubes
