Real-time and Retrospective Analysis of Video Streams using MPEG-7

This paper discusses the use of MPEG-7 standards to analyze HD video streams and still image collections. The paper presents a framework for real-time and retrospective analysis, including descriptor extraction and search techniques. The application architecture, user interface, and challenges are also discussed.


Presentation Transcript


  1. Real-time and Retrospective Analysis of Video Streams and Still Image Collections using MPEG-7 • Ganesh Gopalan, College of Oceanic and Atmospheric Sciences, Oregon State University

  2. Introduction • HD video streams have the potential to improve understanding of deep-sea ecosystems • However, the volume and complexity of HD streams and formats can be overwhelming • Our approach: use industry standards to turn video into a data type rather than treating it as viewing material

  3. MPEG-7 Overview • Multimedia content description interface • Consists of low-level descriptors and high-level description schemes • Low-level descriptors provide statistical information about the pixel values in content • Description Schemes are used to represent semantic information

  4. Low Level Descriptors • Structures that describe content in terms of the distribution of edges, colors, textures, shapes and motion • Descriptors extracted using MPEG-7 Experimental Model (XM) software • The input is a still image or a frame from video • The output is an XML description of the statistical information

  5. Examples of Low Level Descriptors • Edge Histogram • Homogeneous Texture • Color Layout • Color Structure • Motion Activity • Descriptors are rotation and scaling invariant

  6. Descriptor Extraction and Search • Phase 1: descriptor XML for a collection of frames or still images is generated and cached • Phase 2: the difference between the query image's descriptor and each value cached in phase 1 is computed • The cache can be augmented with the descriptors from a new video or still image collection
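
The two phases above can be sketched as follows. This is a minimal Python illustration, not the author's implementation: descriptors are assumed to be already parsed from the XM XML into plain numeric vectors, the names `add_to_cache` and `distances_to_query` are hypothetical, and the L1 distance is an assumed stand-in, since the slides do not say which measure is used.

```python
from typing import Dict, List

# Hypothetical cache layout: frame identifier -> descriptor vector
# (for example, the bins of a histogram-style descriptor).
DescriptorCache = Dict[str, List[float]]


def l1_distance(a: List[float], b: List[float]) -> float:
    """Sum of absolute differences; an assumed stand-in for the real measure."""
    if len(a) != len(b):
        raise ValueError("descriptor vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def add_to_cache(cache: DescriptorCache, new_descriptors: DescriptorCache) -> None:
    """Phase 1: store descriptors; can be called again for a new collection."""
    cache.update(new_descriptors)


def distances_to_query(cache: DescriptorCache, query: List[float]) -> Dict[str, float]:
    """Phase 2: compare the query descriptor with every cached descriptor."""
    return {frame_id: l1_distance(vec, query) for frame_id, vec in cache.items()}
```

Because phase 1 is separate from phase 2, a new video only costs one extra `add_to_cache` call; earlier descriptors are not recomputed.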

  7. Description Schemes • Description Schemes attempt to model the reality behind the content • Low level descriptors can be used to tag objects of interest; the tags are then used to construct a high level description • A search can then be performed against the higher level description schemes

  8. High Definition Video Search Engine • Applied MPEG-7 to the development of an HD search engine • Extracted descriptors for approximately 10,000 frames from 2.5 hours of high definition content • Content provided by the University of Washington from “Visions 05 Cruise” • Also applied to search for eddies in satellite image collections; super-cells in radar images

  9. Application Architecture • .NET Windows Forms front end with an embedded Windows Media Player • SQL Server back end • Common Language Runtime (CLR) integration for developing stored procedures that manage MPEG-7 XML • Procedures can be written in .NET languages rather than SQL

  10. Creating a CLR Stored Procedure
      CREATE FUNCTION FindUsingVisualDescriptor (
          @uid int,
          @token uniqueidentifier,
          @queryImage varbinary(MAX),
          @descriptorName nvarchar(256)
      )
      RETURNS nvarchar(MAX)
      AS EXTERNAL NAME MPEG7Document.StoredProcedures.FindUsingVisualDescriptor;
      GO

  11. Creating an HTTP Endpoint
      CREATE ENDPOINT MPEG7
          STATE = Started
      AS HTTP (
          SITE = 'XXX.XXX.XX.XXX',
          PATH = '/MPEG7Endpoint',
          AUTHENTICATION = (BASIC),
          PORTS = (SSL),
          SSL_PORT = 444
      )
      FOR SOAP (
          WEBMETHOD 'FindUsingVisualDescriptor' (
              NAME = 'looking.dbo.FindUsingVisualDescriptor',
              FORMAT = ALL_RESULTS
          ),
          …
      )

  12. User Interface • UI allows conversion of video into frames using ffmpeg • Descriptors of choice are then generated for all frames • Descriptors are persisted to the server
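
As a rough illustration of the frame-conversion step, the sketch below builds an ffmpeg command line that samples a video into numbered still images. The slides do not show the actual invocation, so the function name, the output file pattern, and the default sampling rate are all assumptions; only the `-i` input option and the `fps` video filter are standard ffmpeg features.

```python
from typing import List


def build_ffmpeg_frame_command(video_path: str, out_dir: str, fps: float = 1.0) -> List[str]:
    """Return an ffmpeg argument list that writes `fps` frames per second as PNGs.

    Hypothetical helper: the output naming scheme and the default rate are
    illustrative choices, not taken from the presentation.
    """
    return [
        "ffmpeg",
        "-i", video_path,                 # input video
        "-vf", f"fps={fps}",              # sample frames at a fixed rate
        f"{out_dir}/frame_%06d.png",      # numbered output images
    ]
```

The returned list can be passed to `subprocess.run` on a machine where ffmpeg is installed; building the list separately keeps the sampling rate easy to change.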

  13. Retrospective Search • A query image initiates the search • The descriptor value for the given image is compared with those cached from the video frames or still images • The top 100 frames that are closest to the query image are returned
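
The ranking step can be sketched as a top-k selection over precomputed distances. This assumes a mapping from frame identifier to its distance from the query image already exists; the function name `top_k_closest` is hypothetical, and `k=100` simply mirrors the figure on the slide.

```python
import heapq
from typing import Dict, List, Tuple


def top_k_closest(distances: Dict[str, float], k: int = 100) -> List[Tuple[str, float]]:
    """Return up to k (frame_id, distance) pairs, smallest distance first."""
    return heapq.nsmallest(k, distances.items(), key=lambda item: item[1])
```

`heapq.nsmallest` avoids sorting the whole collection when k is much smaller than the number of cached frames.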

  14. Retrospective Search Example

  15. Real-time Event Detection • In this case, we have a set of known images that have objects of interest • Descriptors of frames from a real-time stream are compared on a continuous basis with those in the “event library” • When the difference in descriptor values is below a threshold, an event has been detected
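
A minimal sketch of the threshold test described above, under stated assumptions: descriptors are plain numeric vectors, the stream is an iterable of `(frame_id, descriptor)` pairs, and Euclidean distance (`math.dist`) stands in for whatever measure the real system uses. The names `detect_events` and `event_library` are illustrative.

```python
import math
from typing import Callable, Dict, Iterable, Iterator, Sequence, Tuple

Descriptor = Sequence[float]


def detect_events(
    stream: Iterable[Tuple[str, Descriptor]],
    event_library: Dict[str, Descriptor],
    threshold: float,
    distance: Callable[[Descriptor, Descriptor], float] = math.dist,
) -> Iterator[Tuple[str, str]]:
    """Yield (frame_id, event_name) whenever a frame is close to a known event."""
    for frame_id, descriptor in stream:
        for event_name, reference in event_library.items():
            # An event is reported when the difference falls below the threshold.
            if distance(descriptor, reference) < threshold:
                yield frame_id, event_name
```

Writing this as a generator lets it consume an unbounded stream and report events as they occur, rather than waiting for the stream to end.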

  16. Example of an Event

  17. Reference Event

  18. Use of Multi-Core Systems • The descriptor extraction process can be made faster by taking advantage of multiple processors or cores • The total number of frames can be divided up amongst the available processors • Threads extract the descriptors concurrently to generate chunks of XML • The threads then signal each other to combine the chunks into a single file with the descriptor XML
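
The divide, extract, and combine pattern can be sketched in Python as below. Several things here are assumptions: `extract_chunk` is a placeholder that only records frame identifiers instead of running real descriptor extraction, the XML element names are invented, and the thread pool's ordered `map` replaces the explicit thread-to-thread signalling mentioned on the slide.

```python
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor
from typing import List


def split_into_chunks(items: List[str], n_chunks: int) -> List[List[str]]:
    """Divide items into n_chunks contiguous, nearly equal parts."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def extract_chunk(frame_ids: List[str]) -> ET.Element:
    """Placeholder extractor: a real one would compute descriptors per frame."""
    chunk = ET.Element("Chunk")
    for frame_id in frame_ids:
        ET.SubElement(chunk, "Frame", id=frame_id)
    return chunk


def extract_all(frame_ids: List[str], workers: int = 4) -> str:
    """Extract chunks concurrently, then merge them into one XML document."""
    root = ET.Element("Descriptors")
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in submission order, so frame order is kept.
        for chunk in pool.map(extract_chunk, split_into_chunks(frame_ids, workers)):
            root.extend(list(chunk))
    return ET.tostring(root, encoding="unicode")
```

In CPython, threads only give a speed-up when the extraction work releases the interpreter lock, for example when each worker launches an external extraction program; for pure-Python extraction a process pool would be the more likely choice.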

  19. Challenges • Shadows and other lighting issues can create false positives • May be necessary to use multiple descriptors for classification • Processing high definition video at 30fps is computationally intensive • Scaling to a large number of images such as on the web presents a challenge

  20. Conclusion • MPEG-7 supports a rich framework for content-based searches through its low level descriptors • Detected content can be tagged effectively using the high level description schemes that can be used to locate, search through and distribute content

  21. Future Directions • Explore ways to speed up descriptor extraction using GPUs or hybrid GPGPUs • Explore cloud services to implement video services: transcoding video on the fly for different devices, descriptor extraction using HPC clusters, streaming services • Explore the Surface Computer as a UI

  22. Acknowledgements • We are thankful to Professor John Delaney from the University of Washington for providing the HD footage • We are also thankful to the NSF funded LOOKING team for supporting this effort
