Video Retrieval - PowerPoint PPT Presentation

Presentation Transcript

  1. Video Retrieval

  2. Topics • Shot detection algorithm • Video indexing • key frame-based video indexing • Adaptive video indexing technique • Automatic relevance feedback network for video retrieval • Experiment on iARM: video search engine

  3. Video Data • Video is a continuous medium, but for database storage and manipulation such as random access it is important to be able to deal with portions of a video object. • Video segmentation: cutting a long video into portions (shot, scene, and clip) • A shot is the low-level syntactic building block of a video sequence. • A scene is a logical grouping of shots into a semantic unit. • A clip is not clearly defined, so it can last from a few seconds to several hours.

  4. Video Segmentation

  5. Organization of Video Data

  6. Shot Boundaries • Shot boundary detection can be easy or difficult, depending on the type of transition: • cut: hard boundary, a complete change of shot between consecutive frames • fade: fade-out or fade-in, a gradual fade to/from a completely black (white?) frame • dissolve: simultaneous fade-out and fade-in • ... others • Each of these post-production techniques makes the detection of shot boundaries more difficult.

  7. Example transitions: cut, fade-in, fade-out, dissolve

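  8. Frame-to-frame comparison • Consecutive frames are compared through their color histograms (the formula itself is not preserved in this transcript); the quantity it uses is the color histogram of the j-th frame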

  9. Shot Boundary Detection
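
The frame-to-frame comparison can be illustrated with a minimal sketch. The slide's own formula is not preserved, so the distance used here (sum of absolute bin differences) and the fixed threshold are assumptions, and `histogram_distance` and `detect_cuts` are hypothetical names. A single threshold like this mainly catches hard cuts; fades and dissolves change gradually and would need more than this.

```python
def histogram_distance(h1, h2):
    """Sum of absolute bin differences between two equal-length histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def detect_cuts(histograms, threshold):
    """Return every index j where frame j differs from frame j-1 by more than threshold."""
    return [
        j
        for j in range(1, len(histograms))
        if histogram_distance(histograms[j - 1], histograms[j]) > threshold
    ]


# Toy normalized 3-bin histograms: frames 0-2 look alike, frame 3 starts a new shot.
frames = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.8, 0.1, 0.1],
    [0.1, 0.1, 0.8],
    [0.1, 0.2, 0.7],
]
cuts = detect_cuts(frames, threshold=0.5)
print(cuts)
```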

  10. Key-Frame for Shot Representation (figure: each shot is represented by its key-frame)

  11. Content Representation • The content of a video shot is described by a low-level feature (e.g., color histogram) of the corresponding key-frame. • The m-th video shot is indexed by the content descriptor of its key-frame (diagram: Video Shot, Key-Frame, Content Descriptor)

  12. Querying Video Database (diagram: the query shot is matched against Video Shot 1, Video Shot 2, Video Shot 3, ..., Video Shot J in the database)
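
Key-frame based querying can be sketched as a nearest-neighbor search over key-frame descriptors. This is an illustration only: the L1 distance, the three-bin histograms, and the names `l1_distance` and `rank_shots` are assumptions, not details taken from the slides.

```python
def l1_distance(a, b):
    """L1 distance between two descriptors of equal length."""
    return sum(abs(x - y) for x, y in zip(a, b))


def rank_shots(query_descriptor, database):
    """Rank shot ids by the distance between their key-frame descriptor and the query.

    database maps a shot id to the histogram of that shot's key-frame.
    """
    return sorted(
        database,
        key=lambda shot_id: l1_distance(query_descriptor, database[shot_id]),
    )


# Toy database of three shots, each indexed by one key-frame histogram.
database = {
    "shot_1": [0.7, 0.2, 0.1],
    "shot_2": [0.1, 0.8, 0.1],
    "shot_3": [0.6, 0.3, 0.1],
}
ranking = rank_shots([0.7, 0.2, 0.1], database)
print(ranking)
```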

  13. GUI for Key Frame-Based Video Retrieval Query Shot Play shot

  14. Problems • Compared to an image, video data contains both spatial and temporal information • The key frame-based video indexing (KFVI) method can deal with spatial content but does not take temporal information into account. • Furthermore, KFVI is not well adapted for representing video at the scene and story levels

  15. Adaptive Video Indexing (AVI) Technique • A better technique for capturing temporal content as well as spatial content for effective video indexing • AVI provides multiple access to the video database at three levels: • shot • group-of-shots • story

  16. Database Organization based on AVI • Multiple-level access to the video database: video shot, group of shots, story • Each level is indexed by the descriptor of the corresponding video interval

  17. Fundamentals of AVI • A video sequence is a collection of visual templates (i.e., image frames) • Similar video contents use similar visual templates • Visual templates: V1 V2 V3 V4 V5 V6 V7 V8 V9

  18. Fundamentals of AVI • Visual templates: V1 V2 V3 V4 V5 V6 V7 V8 V9 • Descriptor of shot 1: [0 0 2 0 3 0 0 2 0 ...] • Descriptor of shot 2: [0 0 0 0 3 2 0 0 0 ...]

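  19. Template Generation • Given a set of initial visual templates and a set of training vectors, the templates are optimized through the following steps: • Randomly choose an input vector from the training vectors • Find the template node that is closest to the input vector • Then update that node (the update formula is not preserved in this transcript)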
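
The steps above resemble competitive (winner-take-all) learning, so a minimal sketch under that assumption follows. The update rule `w <- w + rate * (x - w)`, the learning rate, the step count, and the name `train_templates` are all assumptions, since the slide's own formulas are missing.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_templates(templates, training_vectors, learning_rate=0.1, steps=1000, seed=0):
    """Move the closest template toward a randomly chosen training vector, repeatedly."""
    rng = random.Random(seed)
    templates = [list(t) for t in templates]  # work on a copy
    for _ in range(steps):
        x = rng.choice(training_vectors)
        winner = min(
            range(len(templates)),
            key=lambda c: squared_distance(templates[c], x),
        )
        # Assumed update rule: pull the winning template toward the input vector.
        templates[winner] = [
            w + learning_rate * (xi - w) for w, xi in zip(templates[winner], x)
        ]
    return templates


# Two loose clusters of toy 2-bin histograms and two initial templates.
training = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
trained = train_templates([[0.4, 0.6], [0.6, 0.4]], training)
print(trained)
```

With this toy data each template drifts toward one of the two clusters, which is the behavior the slide's "visual templates" are meant to have.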

  20. Template-frequency modeling (TFM) • A video interval I is described by a set of descriptors, each of which is the histogram corresponding to one video frame • Each histogram is mapped to a Voronoi space; the mapping yields the label of the best-match cell together with the labels of the n cells neighboring the best-match cell (the symbols are not preserved in this transcript)
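
A rough sketch of this mapping: each frame histogram is assigned the label of its best-match template plus a few further labels, and the labels are counted over the interval. Treating the "neighboring cells" as the next-closest templates to the frame is an assumption, as are the names `nearest_labels` and `template_counts`.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_labels(frame, templates, n_labels=2):
    """Labels of the best-match template and the next-closest ones (assumed neighbors)."""
    order = sorted(
        range(len(templates)),
        key=lambda j: squared_distance(frame, templates[j]),
    )
    return order[:n_labels]


def template_counts(frames, templates, n_labels=2):
    """Count how often each template label is used across all frames of an interval."""
    counts = Counter()
    for frame in frames:
        counts.update(nearest_labels(frame, templates, n_labels))
    return [counts[j] for j in range(len(templates))]


# Toy one-dimensional "histograms" keep the arithmetic easy to follow.
templates = [[0.0], [0.5], [1.0]]
interval_frames = [[0.1], [0.2], [0.9]]
counts = template_counts(interval_frames, templates)
print(counts)
```

The resulting count vector has the same shape as the shot descriptors shown on slide 18.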

  21. TFM (cont.) • The labels that result from mapping all frames of the entire video interval are used as a representation of the video through a weighting scheme. Each weight depends on the number of times the template is mentioned in the content of the video, on N, the total number of videos in the system, and on the number of videos in which the index template appears.
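
The three quantities named on this slide are those of a term-frequency / inverse-document-frequency weight, so a sketch of that form follows. The exact formula is not preserved; normalizing by the largest count in the video and using the natural logarithm are assumptions, and `tfidf_weights` is a hypothetical name.

```python
import math


def tfidf_weights(count_vectors):
    """Weight each template count by (count / max count) * log(N / n_j).

    count_vectors holds one template-count vector per video;
    N is the number of videos, n_j the number of videos that use template j.
    """
    n_videos = len(count_vectors)
    n_templates = len(count_vectors[0])
    doc_freq = [
        sum(1 for counts in count_vectors if counts[j] > 0)
        for j in range(n_templates)
    ]
    weights = []
    for counts in count_vectors:
        max_count = max(counts) or 1  # guard against an all-zero vector
        weights.append([
            (c / max_count) * math.log(n_videos / doc_freq[j]) if c > 0 else 0.0
            for j, c in enumerate(counts)
        ])
    return weights


# The two example shot descriptors from slide 18, truncated to nine templates.
shots = [
    [0, 0, 2, 0, 3, 0, 0, 2, 0],
    [0, 0, 0, 0, 3, 2, 0, 0, 0],
]
weights = tfidf_weights(shots)
print(weights[0])
```

In this toy case template V5 appears in both shots, so its weight drops to zero, while templates unique to one shot keep a positive weight.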

  22. Test Data Description of sequences in the database: CNN broadcast news (at 352 resolution and 30 frames/sec.)

  23. Retrieval Results • A comparison of the retrieval performance at the shot level: (a) obtained by KFVI; (b) obtained by AVI

  24. Performance Comparison • Precision results averaged over 25 queries, comparing adaptive video indexing (AVI) with key-frame based video indexing (KFVI), using a video database containing 844 video shots

  25. Query-by-Video-Clip • Precision and recall rates (panels (a) and (b)) obtained by retrieval of: • video groups, employing two links: shot-to-group (STG) and group-to-group (GTG) • video story, employing two links: shot-to-story (STS) and group-to-story (GTS)
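
For reference, precision and recall for a single query can be computed as in the sketch below. The shot ids and the relevance judgments are made up for illustration, and `precision_recall` is a hypothetical helper rather than part of the evaluated system.

```python
def precision_recall(retrieved, relevant):
    """Precision = hits / number retrieved; recall = hits / number relevant."""
    relevant = set(relevant)
    hits = sum(1 for item in retrieved if item in relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Four retrieved shots, two of which are among the three relevant ones.
p, r = precision_recall(["s1", "s4", "s7", "s9"], ["s1", "s7", "s8"])
print(p, r)
```

Averaging the precision over all queries gives figures of the kind reported on the performance comparison slide.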

  26. Query-by-Video-Clip Cont.. (a) Query clip, <1.8 sec> (b) Rank 1, <1.8 sec> (c) Rank 2, <2.4 sec> (d) Rank 3, <1.9 sec> (e) Rank 4, <2.7 sec> (f) Rank 5, <3.3 sec>

  27. Relevance Feedback for Video Retrieval: A client-server architecture Search Engine with Relevance Feedback

  28. Problem with Relevance Feedback (RF) • The user has to play each retrieved video in a feedback cycle • Compared to an image, video files are usually very large • Time-consuming • High bandwidth required in the RF training process

  29. Automatic and Semi-Automatic RFs Search Engine with Automatic Relevance Feedback Network

  30. Automatic Relevance Feedback Network (ARFN) • Goal: implement an adaptive system to improve retrieval accuracy • Strategy: incorporate a self-learning neural network in the relevance feedback module in order to avoid user interaction during the retrieval process

  31. ARFN Architecture • Number of nodes in the second layer = number of visual templates • Number of nodes in the third layer = number of videos in the database

  32. Signal Propagation • (a) Forward propagation; (b) Backward propagation; (c) New video template nodes in (b) introduce a new video node. • This process results in the activation of new video nodes by expanding the original query templates, analogous to the traditional relevance feedback technique

  33. Signal Propagation (cont.) • The activation level at the video template nodes can be calculated according to two criteria: • Positive feedback • Positive and negative feedback • Both criteria use the activation of the j-th video node; Pos is the set of positive video nodes and Neg is the set of negative video nodes (the formulas are not preserved in this transcript)
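
The forward and backward passes can be sketched as below. This is not the ARFN formulation itself, whose formulas are missing from the transcript: the dot-product activation, the additive treatment of positive nodes, the subtractive treatment of negative nodes, and the names `forward` and `backward` are all assumptions chosen to show query expansion.

```python
def forward(template_activation, video_weights):
    """Activation of each video node: dot product of template activations and its weights."""
    return [
        sum(t * w for t, w in zip(template_activation, weights))
        for weights in video_weights
    ]


def backward(video_activation, video_weights, pos, neg=()):
    """New template activations: add contributions of positive video nodes, subtract negative ones."""
    n_templates = len(video_weights[0])
    out = [0.0] * n_templates
    for j in pos:
        for i in range(n_templates):
            out[i] += video_activation[j] * video_weights[j][i]
    for j in neg:
        for i in range(n_templates):
            out[i] -= video_activation[j] * video_weights[j][i]
    return out


# Rows are videos, columns are visual templates (toy weights).
video_weights = [
    [1.0, 0.0, 1.0],  # video 0
    [1.0, 1.0, 0.0],  # video 1
    [0.0, 0.0, 1.0],  # video 2
]
query = [1.0, 0.0, 0.0]  # the query activates template 0 only
first_pass = forward(query, video_weights)
expanded = backward(first_pass, video_weights, pos=[0, 1])
second_pass = forward(expanded, video_weights)
print(first_pass, expanded, second_pass)
```

In this toy run video 2 is inactive after the first pass and becomes active after the query templates are expanded, which mirrors the activation of new video nodes described on slide 32.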

  34. Results Average Precision Rate, APR (%) obtained by retrieving 25 video shot queries. ARFN results are quoted relative to the APR observed with simple retrieval.

  35. Experiment: Video Search Engine

  36. Goals • Setting up a video search engine at the shot level, using JSP and a J2EE server • Implementing video indexing using AVI and comparing it with KFVI • Implementing a simple user-controlled interactive retrieval method within the search engine

  37. GUI in iARM search engine Query Shot Selected method

  38. Step I • Copy all the files in the folder “Experiments” to drive C: • Feature database • Key-frame database • JSP file and Java Beans • Video shot database

  39. Step II: load feature vectors to database >> java Open the video feature database “C:\Experiments\database\video

  40. Step III: deploy application deploytool New Application

  41. Add Web components to the Application: • index.jsp • autoFeedback.class • CompType.class • MyDateJose.class • MyLocalRbf.class • userData.class

  42. Deploy the Application

  43. Open the search engine: “http://localhost:8000/iARM/index.jsp”