
Video Event Recognition: Multilevel Pyramid Matching

Dong Xu and Shih-Fu Chang, Digital Video and Multimedia Lab, Department of Electrical Engineering, Columbia University. Slides prepared courtesy of Eric Zavesky.


Presentation Transcript


  1. Video Event Recognition: Multilevel Pyramid Matching • Dong Xu and Shih-Fu Chang • Digital Video and Multimedia Lab • Department of Electrical Engineering • Columbia University • *Courtesy of Eric Zavesky for preparing the slides

  2. Video Event Recognition: Problem • Online video search and video indexing • Events are characterized by an evolution of scenes, objects, and actions over time • 56 events are defined in LSCOM (pictured examples: Airplane Flying, Car Exiting)

  3. Video Event Recognition: Challenges • Geometric and photometric variations • Cluttered backgrounds • Complex camera motion and object motion

  4. Event Recognition: Object Tracking • Figure labels: Object Detection & Localization, Inference, Tracking; example event "Airplane Landing" • Detect the object of interest, track it over time, and model its spatio-temporal dynamics • Hard to detect events without explicit object motion, such as Riot

  5. Event Recognition: Key-Frame based Matching • Figure labels: Keyframe, Feature, Similarity (example scores 18%, 15%, 50%) • Only the key-frame is used for matching • Low-level feature extraction, comparison to other frames, overall decision on matching

  6. Event Recognition: Multi-level Pyramid Matching • Figure labels: feature extraction, concept detectors, EMD distance, multi-level pyramid matching

  7. Content Representation: Low-level Features • Edge direction histogram • Gabor texture • Grid color moment (figure symbols: μ, σ, γ)
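The grid color moment feature named on slide 7 can be illustrated with a small sketch. The slide only shows the symbols μ, σ, γ; reading them as mean, standard deviation, and skewness of one color channel per grid cell is an assumption here, as are the function names, the cube-root form of the skewness term, and the grid size. The presentation's actual feature extractor may differ.

```python
import math

def color_moments(values):
    """Return (mean, std, skew) for the pixel values of one channel in one cell.

    Assumption: skew is the signed cube root of the third central moment.
    """
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    third = sum((v - mean) ** 3 for v in values) / n
    std = math.sqrt(var)
    skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
    return mean, std, skew

def grid_color_moments(channel, rows, cols):
    """Split a 2-D channel (list of rows) into rows x cols cells.

    Returns a flat list with three moments per cell, in row-major cell order.
    Assumes the image has at least `rows` rows and `cols` columns.
    """
    h, w = len(channel), len(channel[0])
    features = []
    for r in range(rows):
        for c in range(cols):
            cell = [channel[y][x]
                    for y in range(r * h // rows, (r + 1) * h // rows)
                    for x in range(c * w // cols, (c + 1) * w // cols)]
            features.extend(color_moments(cell))
    return features

# Hypothetical 2x2 single-channel image split into a 1x2 grid.
example_features = grid_color_moments([[0, 2], [4, 6]], rows=1, cols=2)
```

Concatenating the per-cell moments keeps coarse spatial layout in the descriptor, which a single whole-image moment vector would lose.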

  8. Content Representation: Mid-level Semantic Concept Scores • Figure labels: Image Database, +/-, Concept Detectors • Train detectors on low-level features • Mid-level semantic concept features are more robust • Developed and released 374 semantic concept detectors

  9. Earth Mover’s Distance (EMD): Approach • Supplier P has a given amount of goods • Receiver Q has a given limited capacity • Figure labels: ground distance d_ij; weights 1, 1/2, 1/2 • Solved by linear programming • Temporal shift: a frame at the beginning of P can be mapped to a frame at the end of Q • Scale variations: a frame from P can be mapped to multiple frames in Q
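The equations on slide 9 did not survive in the transcript. For orientation, the block below gives the general textbook form of the EMD as a transportation problem, which matches the supplier/receiver description and the linear-programming remark above. The symbols m, n, w_{p_i}, w_{q_j}, and f_{ij} are notation chosen here (only d_ij appears on the slide), and the exact weighting used in the presentation may differ.

```latex
\begin{aligned}
\min_{f_{ij}} \quad & \sum_{i=1}^{m} \sum_{j=1}^{n} f_{ij}\, d_{ij} \\
\text{s.t.} \quad & f_{ij} \ge 0, \qquad 1 \le i \le m,\; 1 \le j \le n, \\
& \sum_{j=1}^{n} f_{ij} \le w_{p_i}, \qquad 1 \le i \le m, \\
& \sum_{i=1}^{m} f_{ij} \le w_{q_j}, \qquad 1 \le j \le n, \\
& \sum_{i=1}^{m} \sum_{j=1}^{n} f_{ij} = \min\Bigl( \sum_{i=1}^{m} w_{p_i},\; \sum_{j=1}^{n} w_{q_j} \Bigr), \\[4pt]
\mathrm{EMD}(P, Q) \;=\; & \frac{\sum_{i=1}^{m} \sum_{j=1}^{n} \hat{f}_{ij}\, d_{ij}}{\sum_{i=1}^{m} \sum_{j=1}^{n} \hat{f}_{ij}}
\end{aligned}
```

Here \hat{f}_{ij} is the optimal flow. Because the flow is unconstrained in time order, an early frame of P can be matched to a late frame of Q (temporal shift), and because a supplier's weight can be split across several receivers, one frame of P can map to multiple frames of Q (scale variation).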

  10. Multi-level Pyramid Matching: Motivations • One clip = several subclips (stages of event evolution) • No prior knowledge about the number of stages in an event • Videos of the same event may include only a subset of stages • Solution: multi-level pyramid matching in the temporal domain

  11. Multi-level Pyramid Matching: Algorithm • Temporally constrained hierarchical agglomerative clustering (figure: Level-0, Level-1, and Level-2 subclips of two clips, with stages labeled Smoke and Fire) • Alignment of different subclips (Level-1 as an example): EMD distance matrix between subclips, then integer-value alignment • Fusion of information from different levels
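The first two steps on slide 11 can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the centroid-based merge cost, the brute-force search over permutations, and all function names are assumptions made here. The subclip distance matrix is taken as given (on the slide it comes from EMD between subclips).

```python
from itertools import permutations

def mean_vector(frames):
    """Component-wise mean of a list of equal-length feature vectors."""
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / len(frames) for d in range(dim)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def temporal_clusters(frames, k):
    """Merge temporally adjacent segments until k remain.

    Only neighbours in time may merge, so every cluster is a contiguous
    subclip. Returns (start, end) frame index pairs, end exclusive.
    """
    segments = [(i, i + 1) for i in range(len(frames))]
    while len(segments) > k:
        costs = []
        for i in range(len(segments) - 1):
            a = mean_vector(frames[segments[i][0]:segments[i][1]])
            b = mean_vector(frames[segments[i + 1][0]:segments[i + 1][1]])
            costs.append(sq_dist(a, b))  # assumed linkage: centroid distance
        i = costs.index(min(costs))
        segments[i:i + 2] = [(segments[i][0], segments[i + 1][1])]
    return segments

def best_alignment(dist):
    """One-to-one alignment of subclips minimising the summed distance.

    dist is a square matrix; dist[i][j] is the distance between subclip i
    of one clip and subclip j of the other. Brute force, so only suitable
    for the handful of subclips found at a pyramid level.
    """
    n = len(dist)
    best = min(permutations(range(n)),
               key=lambda p: sum(dist[i][p[i]] for i in range(n)))
    return list(best), sum(dist[i][best[i]] for i in range(n))

# Hypothetical 1-D frame features with two visually distinct stages.
example_segments = temporal_clusters([[0.0], [0.1], [5.0], [5.2]], k=2)
# Hypothetical 2x2 subclip distance matrix where the stages appear swapped.
example_mapping, example_cost = best_alignment([[5, 1], [2, 6]])
```

In the second example the cheapest one-to-one mapping pairs subclip 0 with subclip 1 and vice versa, which is the situation the Smoke/Fire figure depicts: the same stages occurring in a different order in the two clips.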

  12. Pyramid Matching: Projected Illustration

  13. Pyramid Matching: Animated Example

  14. Experiments: Keyframe based feature performance • Evaluation metric: Average Precision • Dataset: TRECVID 2005
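For readers unfamiliar with the metric on slide 14, here is a minimal sketch of non-interpolated average precision over a ranked result list. The function name and the optional `total_relevant` parameter are assumptions introduced here; the exact evaluation protocol used in the experiments (for example, any cutoff depth) is not stated in the transcript.

```python
def average_precision(ranked_relevance, total_relevant=None):
    """Non-interpolated average precision of a ranked list.

    ranked_relevance: sequence of truthy/falsy flags, best-ranked first.
    total_relevant: number of relevant items in the whole collection;
    defaults to the number of relevant items found in the list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    denom = total_relevant if total_relevant is not None else hits
    return precision_sum / denom if denom else 0.0

# Relevant clips at ranks 1 and 3 of a four-item ranking.
example_ap = average_precision([1, 0, 1, 0])
```

The metric rewards rankings that place relevant clips early: in the example the precisions at the two relevant ranks are 1/1 and 2/3, and their mean is the score.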

  15. Experiments: EMD concept performance

  16. Experiments: Benefits of multi-level pyramid fusion

  17. Video Event Recognition: Conclusions • Single-level EMD outperforms the key-frame based method • Multi-level pyramid matching further improves event detection accuracy • First systematic study of diverse visual event recognition in the unconstrained broadcast news domain

  18. Thank you very much!
