Automatic Video Tagging using Content Redundancy

Automatic Video Tagging using Content Redundancy. Stefan Siersdorfer (L3S Research Center, Germany), Jose San Pedro and Mark Sanderson (University of Sheffield, UK). SIGIR 2009. 2009. 11. 06. Summarized and presented by Hwang Inbeom, IDS Lab., Seoul National University

Large Amount of Data on YouTube • Traffic to/from YouTube accounts for over 20% of total web traffic • YouTube accounts for 60% of videos watched online • The collection is growing beyond what humans can survey • Effective knowledge mining and retrieval tools are necessary

Knowledge Mining and Retrieval • Making use of human annotation: Folksonomy • Provides relevant results at a relatively low cost • Applications • Topic detection and tracking • Information filtering • Document ranking • Etc. • However, content-based retrieval techniques are not mature enough • Folksonomy-based techniques outperform content-based techniques

Problem: Poorly Annotated YouTube Videos • Hard to annotate videos • Intellectually expensive process • Time consuming job • Low-quality tags • Often very sparse • Lack consistency • Present numerous irregularities • Difficult to provide retrieval and knowledge extraction relying on textual features

Motivation • Significant amount of near-duplicate videos • Over 25% near-duplicate videos detected in search results • Usually considered a problem of online video • The authors instead treat this redundancy as a feature • Linkage between two different videos • Exploit redundancies to obtain richer video annotations

PageRank-like Graph of Videos • Overlap graph G_O = (V_O, E_O)

Edge in Graph • An edge between video i and video j means that the two videos share redundant visual information • Three types of links • Duplicate videos • Part-of relationship • Overlapping

Related Work: VisualRank (WWW 2008) • Builds a graph of images using the visual similarity between two images • Runs the PageRank algorithm to re-rank images

Automatic Tagging • Different approach from that of VisualRank • Aims to enrich annotations • Not to improve search results • Three methods • Simple neighbor-based tagging • Overlap redundancy aware tagging • TagRank: Context-based tag propagation in video graphs

Simple Neighbor-based Tagging • Transforms G_O into the directed graph G'_O = (V'_O, E'_O) of overlapping videos • Each pair of overlapping videos i and j is connected by two directed edges, weighted w(v_i, v_j) and w(v_j, v_i) • The weight w(v_i, v_j) describes to what degree video j is covered by video i
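The edge weighting can be illustrated with a small sketch. It assumes that overlap detection has already produced the shared duration for each overlapping pair, and that "degree of coverage" is measured as shared duration divided by the duration of the covered video. The function names and the duration-based measure are assumptions made for illustration, not taken from the slides.

```python
def coverage_weight(shared_seconds: float, duration_j_seconds: float) -> float:
    """w(v_i, v_j): fraction of video j covered by content shared with video i."""
    if duration_j_seconds <= 0:
        raise ValueError("duration must be positive")
    # Clamp so that a noisy overlap estimate never yields a weight above 1.
    return min(shared_seconds / duration_j_seconds, 1.0)


def directed_weights(overlaps, durations):
    """Turn undirected overlaps {(i, j): shared_seconds} into directed weights.

    Returns {(i, j): w(v_i, v_j)}, with both directions for every pair.
    """
    weights = {}
    for (i, j), shared in overlaps.items():
        weights[(i, j)] = coverage_weight(shared, durations[j])
        weights[(j, i)] = coverage_weight(shared, durations[i])
    return weights
```

The two directions differ whenever the videos have different lengths: a short clip that is fully contained in a long video is completely covered, while the long video is only partly covered by the clip.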

Simple Neighbor-based Tagging (contd.) • Indicator: I(t, v_j) = 1 if v_j is tagged with t, 0 otherwise • Gets tag t's relevance score for a video from the information of adjacent videos • Weighted sum of the influences of overlapping videos tagged with t • Counts only adjacent videos' tags
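A minimal sketch of the weighted sum described on this slide: for a target video, every incoming edge from a neighbor that carries tag t adds that edge's weight to t's score. The data layout (a dict of directed edge weights and a dict of tag sets) and the function name are assumptions for illustration.

```python
from collections import defaultdict


def neighbor_tag_scores(weights, tags, video):
    """Relevance score of each tag for `video`, from adjacent videos only.

    weights: {(src, dst): w(src, dst)} for the directed overlap graph
    tags:    {video_id: set of user-provided tags}
    """
    scores = defaultdict(float)
    for (src, dst), w in weights.items():
        if dst != video:
            continue  # only edges pointing into the target video count
        for t in tags.get(src, ()):
            # The neighbor is tagged with t, so it contributes its edge weight.
            scores[t] += w
    return dict(scores)
```

Tags that appear only on the target video itself get no score here, since the sum runs over neighbors; the result is meant to supplement the original tags, not replace them.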

An Example (figure: neighboring videos tagged with t and the resulting relevance score of t)

Overlap Redundancy Aware Tagging • The relevance score can increase sharply if a video has multiple redundant overlaps • The contribution of a repeated tag is reduced by a relaxation parameter
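The slide does not give the formula, so the following is only one plausible reading, stated as an assumption: the neighbors carrying tag t are ordered by edge weight, and the k-th additional occurrence of the tag is scaled by alpha to the power k, where alpha in (0, 1] is a hypothetical name for the relaxation parameter. The paper's actual rule may differ, for example by accounting for which regions of the video overlap.

```python
def redundancy_aware_score(weights_for_tag, alpha=0.5):
    """Dampened sum of edge weights from neighbors that carry the same tag.

    weights_for_tag: edge weights w(v_j, v_i) of all neighbors tagged with t
    alpha:           relaxation parameter (assumed name and range, 0 < alpha <= 1)
    """
    ordered = sorted(weights_for_tag, reverse=True)
    # The strongest overlap counts fully; each further one is scaled by alpha**k.
    return sum(w * alpha ** k for k, w in enumerate(ordered))
```

With alpha = 1 this reduces to the plain neighbor-based sum, so the parameter controls how strongly repeated evidence for the same tag is discounted.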

TagRank • Tag weight propagates through the overlap graph • Relevance scores are computed in matrix form • TR(t) converges to a fixed value: solved with the power iteration method • Power iteration starts from the original tagging information and runs for a limited number of iterations • To keep original tag relevance • To prevent TR(t) from converging to a uniform value
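The propagation can be sketched as a truncated power iteration over the directed overlap graph. The sketch starts from the original tagging information for one tag and applies the edge weights a fixed number of times. Whether the authors normalize the vector or add a damping term is not stated on the slide, so this version does neither; the function name and data layout are assumptions.

```python
def tag_rank(weights, tags, tag, videos, iterations=2):
    """Propagate one tag through the overlap graph for a few iterations.

    weights:    {(src, dst): w(src, dst)} for the directed overlap graph
    tags:       {video_id: set of user-provided tags}
    iterations: number of propagation steps (the Γ of TagRankΓ)
    """
    # Initial vector: 1 for videos originally tagged with `tag`, else 0.
    score = {v: 1.0 if tag in tags.get(v, ()) else 0.0 for v in videos}
    for _ in range(iterations):
        nxt = {v: 0.0 for v in videos}
        for (src, dst), w in weights.items():
            nxt[dst] += w * score[src]  # one sparse matrix-vector product
        score = nxt
    return score
```

Keeping the iteration count small is what lets the result still depend on where the tag was originally assigned; running to convergence would discard that starting point.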

Evaluation • Two kinds of evaluation: Machine-oriented and human-oriented view • Data organization with automatically generated tags • Classification • Clustering • User-based evaluation

Data Collection • 38,283 videos: initial set C • Videos returned for the top 500 general queries • Together with the related videos listed with the results • Redundancy analysis • Over 35% of videos (V_O) overlap with one or more other videos

Data Organization • Classification with 7 YouTube categories • Each category contains over 900 videos in V_O • Binary classification with SVM • Feature vectors constructed from original tags or automatically generated tags • Four strategies • BaseOrig: Only considering user-provided tags • NTag: Simple neighbor-based tagging • RedNTag: Overlap redundancy aware tagging • TagRankΓ: TagRank with Γ iterations
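As a rough illustration of how tag-based feature vectors might be built before training a classifier, the sketch below merges user-provided tags with automatically generated ones. The weighting scheme (1.0 for an original tag, the relevance score for an auto-generated tag above a threshold) and all names are assumptions; the slides do not specify how the features were weighted.

```python
def build_feature_vectors(original_tags, auto_tags, threshold=0.0):
    """Build one dense tag vector per video.

    original_tags: {video_id: set of user-provided tags}
    auto_tags:     {video_id: {tag: relevance score}} from automatic tagging
    threshold:     auto-generated tags at or below this score are ignored
    """
    vocab = sorted(
        {t for ts in original_tags.values() for t in ts}
        | {t for sc in auto_tags.values() for t, s in sc.items() if s > threshold}
    )
    index = {t: k for k, t in enumerate(vocab)}
    vectors = {}
    for video, ts in original_tags.items():
        vec = [0.0] * len(vocab)
        for t in ts:
            vec[index[t]] = 1.0  # user-provided tag
        for t, s in auto_tags.get(video, {}).items():
            if s > threshold:
                vec[index[t]] = max(vec[index[t]], s)  # auto-generated tag
        vectors[video] = vec
    return vocab, vectors
```

Passing an empty auto_tags dict gives vectors built from user-provided tags only, which corresponds to the BaseOrig setting; the resulting vectors could then be fed to any SVM implementation.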

Data Organization • Clustering • k-Means clustering • Partition videos into k categories • Neighbor-based tagging and overlap redundancy aware tagging outperform baseline and TagRank methods in both experiments

User-based Evaluation • Assessors rate new tags through a web interface • The average rating rises as the evaluation is restricted to tags with higher autotag relevance scores

Conclusions • Content redundancy in social sharing systems can be used to obtain richer annotations • Additional information obtained by automatic tagging can largely improve automatic organization of content • There is also an information gain for users • Future work • Authors plan to generalize this work to other domains • Photos in Flickr • Text in Delicious • Analysis and generation of deep tags • Tags linked to a small part of a larger media source

Discussion • Good idea and good formalization • Would be better if TagRank performed well • Considering only neighbors is too naïve a method • How can we deal with the overhead of visual processing? • Would it be scalable enough to apply to all videos on YouTube?
