Video Retrieval

Topics
• Shot detection algorithm
• Video indexing
• Key frame-based video indexing
• Adaptive video indexing technique
• Automatic relevance feedback network for video retrieval
• Experiment on iARM: video search engine

Video Data
• Video is a continuous medium, but for database storage and manipulation (such as random access) it is important to be able to deal with portions of a video object.
• Video segmentation: cutting a long video into portions (shot, scene, and clip).
• Shots define the low-level syntactic building blocks of a video sequence.
• A scene is a logical grouping of shots into a semantic unit.
• A clip is not clearly defined; it can last from a few seconds to several hours.

Video Segmentation

Organization of Video Data

Shot Boundaries
• Shot boundary detection can be easy or difficult, depending on the type of transition:
• cut: hard boundary, a complete change of shot between consecutive frames
• fade: fade-out or fade-in, a gradual fade to/from a completely black (or white) frame
• dissolve: simultaneous fade-out and fade-in
• others
• Each of these post-production techniques makes the detection of shot boundaries more difficult.

[Figure: examples of a cut, a fade-in, a fade-out, and a dissolve]

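Frame-to-frame comparison
• Consecutive frames are compared through a difference measure on their color histograms, defined in terms of the color histogram of the j-th frame.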
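The frame-to-frame comparison can be illustrated with a minimal sketch. It assumes an L1 distance between normalized RGB histograms of consecutive frames and a fixed threshold; the slide's own distance measure is not given here, and the names color_histogram, detect_cuts, bins, and threshold are hypothetical. Only hard cuts are handled; fades and dissolves spread the change over many frames and would need a different test.

```python
import numpy as np

def color_histogram(frame, bins=8):
    """Normalized joint RGB histogram of an H x W x 3 uint8 frame."""
    pixels = frame.reshape(-1, 3).astype(np.float64)
    hist, _ = np.histogramdd(pixels, bins=(bins, bins, bins),
                             range=((0, 256),) * 3)
    return hist.ravel() / pixels.shape[0]

def detect_cuts(frames, threshold=0.5):
    """Return every index j where a cut is declared between frame j-1 and j.

    Assumption: the frame difference is the L1 distance between histograms,
    which lies in [0, 2] for normalized histograms.
    """
    hists = [color_histogram(f) for f in frames]
    cuts = []
    for j in range(1, len(hists)):
        diff = np.abs(hists[j] - hists[j - 1]).sum()
        if diff > threshold:
            cuts.append(j)
    return cuts
```

The threshold is the sensitive design choice: too low and camera motion triggers false cuts, too high and low-contrast cuts are missed.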

Shot Boundary Detection

Key-Frame for Shot Representation
[Figure: two video shots, each represented by its key-frame]

Content Representation
• The content of a video shot is described by a low-level feature (e.g., color histogram) of the corresponding key-frame.
• The m-th video shot is indexed by the content descriptor of its key-frame.
[Figure: video shot, key-frame, content descriptor]

Querying Video Database
[Figure: the query shot is matched against video shots 1, 2, 3, ..., J in the database]
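Matching a query shot against the database can be sketched as a nearest-neighbor search over key-frame descriptors. This is an illustration under assumptions: descriptors are taken to be precomputed histograms, the L1 distance is one common choice rather than the measure used on the slides, and rank_shots and top_k are hypothetical names.

```python
import numpy as np

def rank_shots(query_descriptor, shot_descriptors, top_k=3):
    """Rank database shots by L1 distance between key-frame descriptors.

    Returns a list of (shot_index, distance) pairs, closest first.
    """
    db = np.asarray(shot_descriptors, dtype=float)   # shape (J, D)
    q = np.asarray(query_descriptor, dtype=float)    # shape (D,)
    distances = np.abs(db - q).sum(axis=1)
    order = np.argsort(distances, kind="stable")[:top_k]
    return [(int(i), float(distances[i])) for i in order]
```

A linear scan like this is adequate for a database of a few hundred shots; larger collections would usually need an index structure.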

GUI for Key Frame-Based Video Retrieval
[Screenshot: query shot and "Play shot" control]

Problems
• Compared to an image, video data contains both spatial and temporal information.
• The key frame-based video indexing (KFVI) method can deal with spatial content but does not take temporal information into account.
• Furthermore, KFVI is not well adapted to representing video at the scene and story levels.

Adaptive Video Indexing (AVI) Technique
• A better technique for capturing temporal content as well as spatial content for effective video indexing
• AVI provides multiple access to the video database at three levels: shot, group-of-shots, and story

Database Organization based on AVI
• Multiple-level access to the video database: video shots, groups of shots, and stories
• Each video interval is represented by its own descriptor

Fundamentals of AVI
• A video sequence is a collection of visual templates (i.e., image frames)
• Similar videos use similar visual templates
[Figure: visual templates V1 to V9]

Fundamentals of AVI Cont.
• Visual templates: V1, V2, ..., V9
• Descriptor of shot 1: [0 0 2 0 3 0 0 2 0 ...]
• Descriptor of shot 2: [0 0 0 0 3 2 0 0 0 ...]
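The shot descriptors above read as per-template counts. A minimal sketch of how such a vector could be built, assuming each frame is assigned only to its single nearest template under Euclidean distance (the neighbor labels used by TFM are left out), with shot_descriptor as a hypothetical name:

```python
import numpy as np

def shot_descriptor(frame_features, templates):
    """Count how many frames of a shot fall on each visual template.

    frame_features: (F, D) array, one feature vector per frame.
    templates:      (R, D) array, one row per visual template.
    Returns a length-R integer vector of template counts.
    """
    frames = np.asarray(frame_features, dtype=float)
    temps = np.asarray(templates, dtype=float)
    # Squared Euclidean distance from every frame to every template.
    d2 = ((frames[:, None, :] - temps[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    return np.bincount(labels, minlength=len(temps))
```

Because every frame contributes, the descriptor reflects how long each kind of content stays on screen, which a single key-frame cannot capture.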

Template Generation
• Given a set of initial visual templates and a set of training vectors
• The templates are optimized through the following steps:
• Randomly choose an input vector from the training set
• Find the node (template) that is closest to the input vector
• Then, that node is updated
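The steps above resemble standard competitive learning. The following sketch assumes that reading: the winning template is moved toward the input with a linearly decaying learning rate. The update rule, the decay schedule, and the names train_templates, steps, lr0, and seed are assumptions for illustration, not taken from the slides.

```python
import numpy as np

def train_templates(initial_templates, training_vectors,
                    steps=1000, lr0=0.5, seed=0):
    """Optimize visual templates by winner-take-all competitive learning."""
    rng = np.random.default_rng(seed)
    templates = np.array(initial_templates, dtype=float)  # working copy
    data = np.asarray(training_vectors, dtype=float)
    for t in range(steps):
        # Step 1: randomly choose an input vector.
        x = data[rng.integers(len(data))]
        # Step 2: find the closest template (the winning node).
        c = np.argmin(((templates - x) ** 2).sum(axis=1))
        # Step 3: move the winner toward the input (assumed update rule).
        lr = lr0 * (1.0 - t / steps)
        templates[c] += lr * (x - templates[c])
    return templates
```

With well-separated training data, each template settles inside the cluster it was initialized closest to.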

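Template-frequency modeling (TFM)
• A video interval I is described by a set of descriptors, one per frame, where each descriptor is the histogram of the corresponding video frame
• Each frame descriptor is mapped to a Voronoi space defined by the templates: it receives the label of its best-match cell together with the labels of the cells neighboring the best-match cell (the n-th label belongs to the n-th neighboring cell)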

TFM Cont.
• The labels resulting from mapping all frames of the entire video interval are used to represent the video through a weighting scheme
• The weight of a template in a video depends on three quantities: the number of times the template is mentioned in the content of the video; N, the total number of videos in the system; and the number of videos in which the template appears
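The three quantities named above are those of a TF-IDF style weight. The sketch below assumes the conventional form, template frequency normalized by the largest frequency in the same video, multiplied by log(N / n_j). The exact formula on the slide is not reproduced here, and tfm_weights is a hypothetical name.

```python
import numpy as np

def tfm_weights(counts):
    """TF-IDF style weights from a (N videos, R templates) count matrix.

    Assumed form: w[i, j] = (f[i, j] / max_j f[i, j]) * log(N / n_j),
    where n_j is the number of videos in which template j appears.
    """
    counts = np.asarray(counts, dtype=float)
    n_videos = counts.shape[0]
    max_freq = counts.max(axis=1, keepdims=True)
    # Normalized template frequency; rows with no counts stay at zero.
    tf = np.divide(counts, max_freq, out=np.zeros_like(counts),
                   where=max_freq > 0)
    doc_freq = (counts > 0).sum(axis=0)
    idf = np.zeros(counts.shape[1])
    seen = doc_freq > 0
    idf[seen] = np.log(n_videos / doc_freq[seen])
    return tf * idf
```

A template that appears in every video gets weight zero, so only templates that discriminate between videos contribute to matching.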

Test Data
• Description of sequences in the database: CNN broadcast news (at 352 resolution and 30 frames/sec.)

Retrieval Results
[Figure: a comparison of the retrieval performance at the shot level: (a) obtained by KFVI; (b) obtained by AVI]

Performance Comparison
• Precision results averaged over 25 queries, comparing adaptive video indexing (AVI) with key-frame based video indexing (KFVI), using a video database containing 844 video shots
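For reference, averaging precision over a set of queries can be sketched as follows. It assumes precision is measured on the top k retrieved shots of each query; the cut-off used in the experiment is not stated here, and both function names are hypothetical.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k retrieved shots that are relevant."""
    top = ranked_ids[:k]
    hits = sum(1 for shot_id in top if shot_id in relevant_ids)
    return hits / k

def average_precision_over_queries(results, k):
    """Mean precision at k over (ranked_ids, relevant_ids) pairs, one per query."""
    scores = [precision_at_k(ranked, relevant, k) for ranked, relevant in results]
    return sum(scores) / len(scores)
```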

Query-by-Video-Clip
[Figure: precision and recall plots, panels (a) and (b)]
• Precision and recall rates obtained by retrieval of video groups, employing two links: shot-to-group (STG) and group-to-group (GTG)
• Precision and recall rates obtained by retrieval of video stories, employing two links: shot-to-story (STS) and group-to-story (GTS)

Query-by-Video-Clip Cont.
• (a) Query clip, <1.8 sec>
• (b) Rank 1, <1.8 sec>
• (c) Rank 2, <2.4 sec>
• (d) Rank 3, <1.9 sec>
• (e) Rank 4, <2.7 sec>
• (f) Rank 5, <3.3 sec>

Relevance Feedback for Video Retrieval: A client-server architecture
[Figure: search engine with relevance feedback]

Problem with Relevance Feedback (RF)
• The user has to play each retrieved video in a feedback cycle
• Compared to an image, video files are usually very large
• Time consuming
• High bandwidth is required in the RF training process

Automatic and Semi-Automatic RFs
[Figure: search engine with automatic relevance feedback network]

Automatic Relevance Feedback Network (ARFN)
• Goal: implement an adaptive system to improve retrieval accuracy
• Strategy: incorporate a self-learning neural network in the relevance feedback module in order to avoid user interaction during the retrieval process

ARFN Architecture
• Number of nodes in the second layer = number of visual templates
• Number of nodes in the third layer = number of videos in the database

Signal Propagation
[Figure: (a) forward propagation; (b) backward propagation; (c) new video template nodes in (b) introduce a new video node]
• This process results in the activation of new video nodes by expanding the original query templates, analogous to the traditional relevance feedback technique

Signal Propagation Cont.
• The activation level at the video template nodes can be calculated according to two criteria: positive feedback only, or positive and negative feedback
• Both are computed from the activation of the j-th video node, where Pos is the set of positive video nodes and Neg is the set of negative video nodes
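One way to picture the two criteria is the sketch below. It assumes a weight matrix linking video nodes to template nodes and a simple additive rule in which positive videos add their weighted activation to the templates and negative videos subtract theirs. The actual ARFN activation functions are not reproduced in this transcript, so the rule and the names backward_activation and forward_activation are purely illustrative.

```python
import numpy as np

def backward_activation(weights, video_activation, pos, neg=()):
    """Template activations from video nodes (assumed additive rule).

    weights:          (N videos, R templates) link weights.
    video_activation: length-N activations of the video nodes.
    pos, neg:         indices of positive and negative video nodes;
                      leave neg empty for positive-only feedback.
    """
    w = np.asarray(weights, dtype=float)
    a = np.asarray(video_activation, dtype=float)
    act = np.zeros(w.shape[1])
    for j in pos:
        act += a[j] * w[j]
    for j in neg:
        act -= a[j] * w[j]
    return act

def forward_activation(weights, template_activation):
    """Propagate template activations forward to every video node."""
    w = np.asarray(weights, dtype=float)
    return w @ np.asarray(template_activation, dtype=float)
```

Alternating the two calls expands the query: templates activated by the positive videos can in turn activate videos that the original query templates did not reach.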

Results
• Average Precision Rate, APR (%), obtained by retrieving 25 video shot queries. ARFN results are quoted relative to the APR observed with simple retrieval.

Experiment: Video Search Engine

Goals
• Set up a video search engine at the shot level, using JSP and a J2EE server
• Implement video indexing using AVI and compare it with KFVI
• Implement a simple user-controlled interactive retrieval method within the search engine

GUI in iARM search engine
[Screenshot: query shot and selected method]

Step I
• Copy all the files in the folder “Experiments” to drive C:
• The folder contains the feature database, the key-frame database, the JSP file and JavaBeans, and the video shot database

Step II: load feature vectors to database
>> java COM.cloudscape.tools.cview
• Open the video feature database “C:\Experiments\database\video

Step III: deploy application
• Run deploytool
• Create a New Application

Add Web components to the Application
• index.jsp
• autoFeedback.class
• CompType.class
• MyDateJose.class
• MyLocalRbf.class
• userData.class

Deploy the Application

Open the search engine: “http://localhost:8000/iARM/index.jsp”
