
Automatic Video Tagging using Content Redundancy

Stefan Siersdorfer (1), Jose San Pedro (2), Mark Sanderson (2). (1) L3S Research Center, Germany; (2) University of Sheffield, UK. SIGIR 2009. 2009. 11. 06. Summarized and presented by Hwang Inbeom, IDS Lab., Seoul National University.


Presentation Transcript


  1. Automatic Video Tagging using Content Redundancy • Stefan Siersdorfer (1), Jose San Pedro (2), Mark Sanderson (2) • (1) L3S Research Center, Germany; (2) University of Sheffield, UK • SIGIR 2009 • 2009. 11. 06. • Summarized and presented by Hwang Inbeom, IDS Lab., Seoul National University

  2. Large Amount of Data on YouTube • Traffic to/from YouTube accounts for over 20% of the web total • Comprising 60% of on-line watched videos • Growing beyond human perception • Necessity to provide effective knowledge mining and retrieval tools

  3. Knowledge Mining and Retrieval • Making use of human annotation: Folksonomy • Provides relevant results at a relatively low cost • Applications • Topic detection and tracking • Information filtering • Document ranking • Etc. • However, content-based retrieval techniques are not mature enough • Folksonomy-based techniques outperform content-based techniques

  4. Problem: Poorly Annotated YouTube Videos • Hard to annotate videos • Intellectually expensive process • Time consuming job • Low-quality tags • Often very sparse • Lack consistency • Present numerous irregularities • Difficult to provide retrieval and knowledge extraction relying on textual features

  5. Motivation • Significant amount of near-duplicate videos • Over 25% near-duplicate videos detected in search results • Usually considered a problem of online video • The authors instead treat this redundancy as a feature • Linkage between two different videos • Exploit redundancies to obtain richer video annotations

  6. PageRank-like Graph of Videos

  7. PageRank-like Graph of Videos • Overlap graph G_O = (V_O, E_O)

  8. Edge in Graph • [Figure: an edge connecting video i and video j] • An edge means that videos i and j share redundant visual information • Three types of links • Duplicate videos • Part-of relationship • Overlapping

  9. Related Work: VisualRank (WWW 2008) • Builds a graph of images using the visual similarity between pairs of images • Runs the PageRank algorithm to re-rank images

  10. Automatic Tagging • Different approach from that of VisualRank • Aims to enrich annotations • Not to improve search results • Three methods • Simple neighbor-based tagging • Overlap redundancy aware tagging • TagRank: Context-based tag propagation in video graphs

  11. Simple Neighbor-based Tagging • [Figure: videos i and j joined by two directed edges weighted w(v_i, v_j) and w(v_j, v_i)] • Transforms G_O into the directed graph G’_O = (V’_O, E’_O) of overlapping videos • The weight w(v_i, v_j) describes to what degree video j is covered by video i
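
The edge weighting on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the input format (per-video durations and undirected overlap records measured in seconds) and the function name `build_directed_overlap_graph` are assumptions made for the example. The only thing taken from the slide is that w(v_i, v_j) expresses how much of video j is covered by video i.

```python
def build_directed_overlap_graph(durations, overlaps):
    """Return weights[(i, j)] = fraction of video j covered by video i.

    durations: dict video_id -> length in seconds (assumed input format)
    overlaps:  iterable of (i, j, shared_seconds) undirected overlap records
    """
    weights = {}
    for i, j, shared in overlaps:
        # One undirected overlap yields two directed edges, and the two
        # weights differ whenever the videos have different lengths.
        weights[(i, j)] = shared / durations[j]
        weights[(j, i)] = shared / durations[i]
    return weights


# Video "b" (50 s) is entirely contained in video "a" (100 s).
durations = {"a": 100.0, "b": 50.0}
weights = build_directed_overlap_graph(durations, [("a", "b", 50.0)])
```

Here "b" is fully covered by "a" while only half of "a" is covered by "b", which is why the transformed graph has to be directed.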

  12. Simple Neighbor-based Tagging (contd.) • Indicator term: 1 if v_j is tagged with t, 0 otherwise • Gets tag t’s relevance score for a video from information of adjacent videos • Weighted sum of influences of overlapping videos tagged by t • Counts only adjacent videos’ tags
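
The weighted sum described on this slide might look like the sketch below. It assumes that tags flow along incoming edges (v_j, v_i), so that a neighbor v_j contributes w(v_j, v_i) to video v_i when v_j carries the tag; the slide text does not state the direction, so treat that choice, the data layout, and the name `neighbor_relevance` as assumptions.

```python
def neighbor_relevance(tag, video, weights, tags):
    """Sum w(v_j, video) over in-neighbors v_j that are tagged with `tag`.

    weights: dict (src, dst) -> weight of the directed edge src -> dst
    tags:    dict video_id -> set of user-provided tags
    """
    return sum(
        w
        for (src, dst), w in weights.items()
        # The indicator term: only neighbors carrying the tag contribute.
        if dst == video and tag in tags.get(src, set())
    )


weights = {("a", "c"): 0.5, ("b", "c"): 0.25, ("d", "c"): 0.75}
tags = {"a": {"goal"}, "b": {"goal", "soccer"}, "d": {"music"}}
goal_score = neighbor_relevance("goal", "c", weights, tags)
soccer_score = neighbor_relevance("soccer", "c", weights, tags)
```

Only directly adjacent videos are consulted, which matches the "counts only adjacent videos' tags" bullet.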

  13. An Example • [Figure: a video whose neighbors are tagged with t, showing how t’s relevance score is obtained]

  14. Overlap Redundancy Aware Tagging • Relevance score can increase sharply if a video has multiple redundant overlaps • Contribution of the same tag is reduced by a relaxation parameter
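
The slide does not give the exact formula, so the following is only one plausible way to damp repeated contributions of the same tag. It assumes a relaxation parameter `alpha` in (0, 1] and assumes that the k-th largest contribution is scaled by alpha to the power k; both the damping scheme and the name `damped_relevance` are illustrative choices, not the method from the paper.

```python
def damped_relevance(tag, video, weights, tags, alpha=0.5):
    """Neighbor-based score where each further contribution is damped.

    Assumed scheme: sort contributions in descending order and scale the
    k-th one (k = 0, 1, ...) by alpha ** k.
    """
    contributions = sorted(
        (
            w
            for (src, dst), w in weights.items()
            if dst == video and tag in tags.get(src, set())
        ),
        reverse=True,
    )
    return sum(w * alpha ** k for k, w in enumerate(contributions))


weights = {("a", "c"): 0.5, ("b", "c"): 0.25}
tags = {"a": {"goal"}, "b": {"goal"}}
damped = damped_relevance("goal", "c", weights, tags, alpha=0.5)
undamped = damped_relevance("goal", "c", weights, tags, alpha=1.0)
```

With `alpha=1.0` the sketch reduces to the plain weighted sum, so the parameter controls how strongly redundant overlaps are discounted.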

  15. TagRank • Weight of tag t propagates through the overlap graph • Relevance scores are computed in matrix form • TR converges to a fixed value: solved with the power iteration method • Power iteration starts from the original tagging information and runs for a limited number of iterations • To keep the original tag relevance • To prevent TR(t) from converging to a uniform value
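
A limited power iteration of the kind described here could be sketched as below. The update rule (each video's new score is the weighted sum of its in-neighbors' current scores), the lack of normalization, and the name `tag_rank` are assumptions; the slide only states that iteration starts from the original tags and is stopped early.

```python
def tag_rank(tag, videos, weights, tags, iterations=2):
    """Propagate one tag's weight through the graph for a few iterations.

    Starts from the original tagging (1.0 if the video has the tag, else
    0.0) and applies an assumed update: score[dst] = sum of w * score[src].
    """
    scores = {v: 1.0 if tag in tags.get(v, set()) else 0.0 for v in videos}
    for _ in range(iterations):
        nxt = {v: 0.0 for v in videos}
        for (src, dst), w in weights.items():
            nxt[dst] += w * scores[src]
        scores = nxt
    return scores


videos = ["a", "b", "c"]
weights = {("a", "b"): 0.5, ("b", "a"): 0.5, ("b", "c"): 0.5}
tags = {"a": {"goal"}}
after_one = tag_rank("goal", videos, weights, tags, iterations=1)
after_two = tag_rank("goal", videos, weights, tags, iterations=2)
```

After one step only the direct neighbor "b" has a score; after two steps the tag has reached "c", which is not adjacent to the originally tagged video. That reach beyond direct neighbors is what distinguishes this from the neighbor-based methods.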

  16. Evaluation • Two kinds of evaluation: Machine-oriented and human-oriented view • Data organization with automatically generated tags • Classification • Clustering • User-based evaluation

  17. Data Collection • 38,283 videos: initial set C • Videos returned for the top 500 general queries • Together with the related videos listed alongside the results • Redundancy analysis • Over 35% of videos (V_O) overlap with one or more other videos

  18. Data Organization • Classification with 7 YouTube categories • Each contains over 900 videos in V_O • Binary classification with SVM • Feature vectors constructed from original tags or automatically generated tags • Four strategies • BaseOrig: Only considering user-provided tags • NTag: Simple neighbor-based tagging • RedNTag: Overlap redundancy aware tagging • TagRankΓ: TagRank with Γ iterations
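
As a rough illustration of how tag-based feature vectors for the classifier might be built, the sketch below produces binary bag-of-tags vectors. The binary encoding, the relevance `threshold`, and the name `tag_feature_vector` are assumptions made for this example; the slide does not say how tags were encoded or weighted, and SVM training itself is left out.

```python
def tag_feature_vector(vocabulary, original_tags, auto_tags=None, threshold=0.0):
    """Binary bag-of-tags vector over a fixed vocabulary.

    original_tags: set of user-provided tags (the BaseOrig setting)
    auto_tags:     dict tag -> relevance score from an automatic tagger
    threshold:     assumed cut-off; auto tags must score above it
    """
    auto_tags = auto_tags or {}
    active = set(original_tags) | {
        tag for tag, relevance in auto_tags.items() if relevance > threshold
    }
    return [1 if tag in active else 0 for tag in vocabulary]


vocabulary = ["goal", "music", "soccer"]
base = tag_feature_vector(vocabulary, {"goal"})
enriched = tag_feature_vector(
    vocabulary, {"goal"}, {"soccer": 0.6, "music": 0.1}, threshold=0.5
)
```

The enriched vector has one more active dimension than the baseline, which is the extra signal the automatic tagging strategies hand to the classifier.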

  19. Data Organization • Clustering • k-Means clustering • Partition videos into k categories • Neighbor-based tagging and overlap redundancy aware tagging outperform baseline and TagRank methods in both experiments

  20. User-based Evaluation • Assessors rate new tags through a web interface • Average rating rises as the set of tags is restricted to those with higher autotag relevance scores
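
The trend reported on this slide can be checked with a simple threshold sweep, sketched below. The ratings are made-up illustration data, not results from the paper, and the rating scale and the name `mean_rating_above` are assumptions.

```python
def mean_rating_above(ratings, threshold):
    """Average assessor rating over tags with relevance >= threshold.

    ratings: list of (autotag_relevance, assessor_rating) pairs
    Returns None when no tag passes the threshold.
    """
    selected = [rating for relevance, rating in ratings if relevance >= threshold]
    return sum(selected) / len(selected) if selected else None


# Made-up illustration data: (relevance score, assessor rating).
ratings = [(0.25, 1), (0.5, 2), (0.75, 3), (1.0, 4)]
means = [mean_rating_above(ratings, th) for th in (0.0, 0.5, 0.75)]
```

If ratings correlate with relevance scores, the means grow as the threshold rises, which is the pattern the slide describes.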

  21. Conclusions • Content redundancy in social sharing systems can be used to obtain richer annotations • Additional information obtained by automatic tagging can largely improve automatic organization of content • There is information gain for users also • Future work • Authors plan to generalize this work to consider different domains • Photos in Flickr • Text in Delicious • Analysis and generation of deep tags • Tags linked to a small part of larger media source

  22. Discussion • Good idea and good formalization • Would be better if TagRank performed well • Considering only neighbors is too naive a method • How can we deal with the overhead of visual processing? • Would it be scalable enough to apply to all videos on YouTube?
