
Commentary-based Video Categorization and Concept Discovery



Presentation Transcript


  1. Commentary-based Video Categorization and Concept Discovery By Janice Leung

  2. Agenda • Introduction to Video Sharing Sites • Current Problem • Previous Works • Commentary-based Video Clustering • Conclusion • Future Works

  3. Video Sharing Sites • Allows users to upload videos • Shares videos worldwide • Example: • Dailymotion • YouTube • MySpace

  4. De Facto • YouTube • More than 65,000 new videos every day • 100 million video views daily • 20 million unique visitors per month

  5. Immense number of videos • Incredible growth in videos • How to search for a desired video? • YouTube: tags + simple categorization

  6. YouTube • Predefined categories • Videos • Provided by the one who uploads the video: title, description, tags, category • Provided by many users: comments

  7. Related Works • Classify videos: • Video features: color, grayscale histogram, pixel information • Keywords from description • Tags • Find user interests: • Object fetching information • Tags

  8. Problems • Video features • Cannot tell exactly what the video is about • User interests are not considered • Keywords from description • Description is provided only by the one who uploaded the video • Insufficient information

  9. Problems (Cont.) • Tags • Insufficient information • May reflect users' feelings about videos but are too brief to represent the videos' complex ideas • Object fetching information • Reflects users' interests but gives no information about the videos themselves

  10. Video Categorization and Concept Discovery • Site: YouTube • Videos: involving Hong Kong singers

  11. Comment vs Tag • Comments • Given by many users • Can be large in number • Express users' opinions • Rich words describe fine-grained ideas • Tags • Given by only one person (the one who uploaded the video) • Few tags • Describe the video in a very brief way • Singer name • Song name

  12. Comments • Include: • Video content • Music styles • Music ages • Singer description • Appearance • Style • News etc.

  13. Commentary-based Video Categorization • Objective: categorize videos based on user interests and discover video concepts • Cluster videos by using comments • Group videos based on user interests • Find video concepts • Clustering algorithm: multi-assignment NMF

  14. Video clustering • Bi-clustering: videos and words • Clusters videos and words into k groups by matrix factorization • Video-word matrix X as input • Video-word matrix X is derived by tf-idf

  15. Tf-idf • Term frequency (tf) • Suppose there are t distinct terms in document j: $tf_{i,j} = f_{i,j} / \sum_{k=1}^{t} f_{k,j}$, where $f_{i,j}$ is the number of occurrences of term i in document j

  16. Tf-idf (Cont.) • Inverse document frequency: $idf_i = \log(N / n_i)$, where N is the total number of documents in the dataset and $n_i$ is the number of documents containing term i

  17. Tf-idf (Cont.) • Importance weight of term i to document j: $w_{i,j} = tf_{i,j} \times idf_i$ • Matrix X, the input to NMF, holds these weights, with one row per video and one column per term
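
The tf-idf steps on slides 15 to 17 can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: it assumes the standard definitions (tf as a term's count divided by the document's total term count, idf as log(N / n_i)), treats each video's pooled comments as one document, and uses a made-up three-video corpus. The function name `tfidf_matrix` is hypothetical.

```python
import math
from collections import Counter

def tfidf_matrix(docs):
    """Build a video-word matrix X from tokenized documents.

    docs: list of token lists, one per video (its pooled comments).
    Returns (vocab, X) where X[j][i] is the tf-idf weight of term i in document j.
    Assumes tf = count / total terms in the document and idf = log(N / n_i).
    """
    vocab = sorted({term for doc in docs for term in doc})
    n_docs = len(docs)
    # n_i: number of documents that contain term i at least once
    doc_freq = Counter(term for doc in docs for term in set(doc))
    X = []
    for doc in docs:
        counts = Counter(doc)
        total = sum(counts.values())
        row = []
        for term in vocab:
            tf = counts[term] / total if total else 0.0
            idf = math.log(n_docs / doc_freq[term])
            row.append(tf * idf)
        X.append(row)
    return vocab, X

# Hypothetical toy corpus: three videos, comments already tokenized.
docs = [
    "ballad voice sweet ballad".split(),
    "dance cool dance voice".split(),
    "ballad classic voice".split(),
]
vocab, X = tfidf_matrix(docs)
```

A term that appears in every document (here "voice") gets idf = log(1) = 0, so it contributes nothing to any row of X, which is the intended effect of the idf factor.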

  18. Video Clustering (Cont.) • Decompose X into non-negative matrices W and H by minimizing the reconstruction error of X ≈ WH • Ref.: Document Clustering Based on Non-negative Matrix Factorization (Xu et al., SIGIR'03)
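
The objective on slide 18 was an image and is not part of this transcript. Assuming the talk follows the cited Xu et al. formulation, written with this deck's symbols W and H, the objective would plausibly be the squared Frobenius reconstruction error:

```latex
\min_{W \ge 0,\; H \ge 0} \; J = \frac{1}{2} \lVert X - WH \rVert_F^2,
\qquad
\lVert A \rVert_F^2 = \sum_{i,j} a_{i,j}^2
```

The factor 1/2 and the symbol J are taken from the usual presentation of this objective and are assumptions here.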

  19. Video Clustering (Cont.) NMF decomposition for video clustering

  20. Video Clustering (Cont.) • Suppose • Number of videos: N • Number of distinct terms: M • Threshold: β • W is of size N x K • $w_{n,k}$: coefficient indicating how strongly video n belongs to cluster k

  21. Video-cluster assignment • Videos can belong to multiple groups • Multi-cluster assignment • Video n belongs to cluster k if $w_{n,k}$ passes the threshold β • Set of clusters that video n belongs to: all such k in K, where K is the set of all clusters
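
A minimal sketch of the factorization step follows. The slides do not say which optimizer was used, so this assumes the Lee and Seung multiplicative update rules, a common choice for NMF with a squared-error objective. It also assumes X is N x M (videos by terms), W is N x K, and H is K x M, so that row n of W holds video n's cluster coefficients. The helper names are hypothetical, and the code is pure Python for readability rather than speed.

```python
import random

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frobenius_error(X, W, H):
    """Squared Frobenius norm of X - WH."""
    WH = matmul(W, H)
    return sum((x - y) ** 2 for rx, ry in zip(X, WH) for x, y in zip(rx, ry))

def nmf(X, k, iters=200, seed=0, eps=1e-9):
    """Factor non-negative X (N x M) into W (N x k) and H (k x M).

    Uses multiplicative updates (an assumption, see lead-in):
        H <- H * (W^T X) / (W^T W H)
        W <- W * (X H^T) / (W H H^T)
    eps keeps the denominators strictly positive.
    """
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        Wt = transpose(W)
        num, den = matmul(Wt, X), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(hr, nr, dr)]
             for hr, nr, dr in zip(H, num, den)]
        Ht = transpose(H)
        num, den = matmul(X, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(wr, nr, dr)]
             for wr, nr, dr in zip(W, num, den)]
    return W, H

# Hypothetical toy input: two groups of videos using disjoint vocabularies.
X_demo = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
W_demo, H_demo = nmf(X_demo, k=2)
```

Because every update multiplies by a non-negative ratio, W and H stay non-negative as long as they start that way, which is why no explicit projection step is needed.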

  22. Video-cluster assignment (Cont.) • Threshold, β • Many irrelevant videos for each cluster • Coefficient distribution varies for different clusters • Coefficient-distribution dependent • Different for different clusters
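
The multi-cluster assignment on slides 21 and 22 can be illustrated with a single fixed threshold. This is a sketch under assumptions: the comparison is taken to be "coefficient at least β" (the slide's exact condition is not in the transcript), and the matrix W and the value β = 0.4 are invented for the example.

```python
def assign_clusters(W, beta):
    """For each video n, return the set of cluster indices k with W[n][k] >= beta.

    A video may land in several clusters, or in none if no coefficient
    reaches the threshold.
    """
    return [{k for k, w in enumerate(row) if w >= beta} for row in W]

# Hypothetical coefficients: rows are videos, columns are clusters.
W = [
    [0.70, 0.25, 0.05],
    [0.45, 0.50, 0.05],
    [0.10, 0.10, 0.80],
]
clusters = assign_clusters(W, beta=0.4)
# clusters == [{0}, {0, 1}, {2}]: the second video belongs to two clusters.
```

With one global β, a cluster whose coefficients are generally large admits many weakly related videos, while a cluster with small coefficients may admit none. That is the motivation the slide gives for making the threshold depend on each cluster's coefficient distribution.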

  23. Concept Discovery • Matrix H is of size K x M • $h_{k,m}$: how likely term m belongs to cluster k • Terms that belong to a cluster describe the videos in that cluster • Concept words of cluster k videos • Top 10 words of cluster k
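
Picking the concept words amounts to sorting each row of H and keeping the highest-weighted terms. The sketch below assumes H is given as a list of K rows aligned with a vocabulary list. The function name, vocabulary, and weights are hypothetical, and `top=3` is used only to keep the example small (the slide uses the top 10).

```python
def concept_words(H, vocab, top=10):
    """Return, for each cluster k, the `top` terms with the largest weights in row k of H."""
    result = []
    for row in H:
        ranked = sorted(range(len(row)), key=lambda m: row[m], reverse=True)
        result.append([vocab[m] for m in ranked[:top]])
    return result

# Hypothetical vocabulary and cluster-term weights (K = 2, M = 6).
vocab = ["ballad", "classic", "cool", "dance", "sweet", "voice"]
H = [
    [0.9, 0.4, 0.0, 0.1, 0.6, 0.2],
    [0.0, 0.1, 0.7, 0.9, 0.0, 0.3],
]
words = concept_words(H, vocab, top=3)
# words == [["ballad", "sweet", "classic"], ["dance", "cool", "voice"]]
```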

  24. Experiment • 19305 videos • 102 Hong Kong singers • 7271 users • Number of clusters, k: 20

  25. Experiment (Cont.) • Threshold, β • Coefficient-distribution dependent • Threshold for cluster i is defined as

  26. Experiment (Cont.) • Video coefficients may be distributed extremely unevenly • Causes poor results • To compensate, threshold can be set as

  27. Experiment (Cont.)

          C1    C2    C3
      V1  0.65  0.23  0.12
      V2  0.35  0.64  0.01
      V3  0.65  0.22  0.13
      V4  0.05  0.64  0.31
      V5  0.00  0.30  0.70
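
The threshold formulas on slides 25 and 26 did not survive in this transcript, so the following is only a stand-in. It assumes, purely for illustration, that each cluster's threshold is the mean of that cluster's coefficients, and applies it to the slide 27 values (videos V1 to V5, clusters C1 to C3). The function names are hypothetical.

```python
from statistics import mean

# Coefficients from slide 27: rows V1..V5, columns C1..C3.
W = [
    [0.65, 0.23, 0.12],
    [0.35, 0.64, 0.01],
    [0.65, 0.22, 0.13],
    [0.05, 0.64, 0.31],
    [0.00, 0.30, 0.70],
]

def per_cluster_thresholds(W):
    """One threshold per cluster: the column mean (an assumed rule, not the slide's)."""
    return [mean(col) for col in zip(*W)]

def assign_with_thresholds(W, betas):
    """Video n joins cluster k when W[n][k] is at least that cluster's threshold."""
    return [{k for k, (w, b) in enumerate(zip(row, betas)) if w >= b} for row in W]

betas = per_cluster_thresholds(W)          # roughly 0.34, 0.406, 0.254
clusters = assign_with_thresholds(W, betas)
# clusters == [{0}, {0, 1}, {0}, {1, 2}, {2}]
```

Under this assumed rule V2 falls into both C1 and C2 and V4 into both C2 and C3, which shows how a distribution-dependent threshold produces multi-cluster assignments without a hand-tuned global β.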

  28. Experiment (Cont.)

  29. Experiment (Cont.)

  30. Experiment (Cont.)

  31. Concept Words vs Tags

  32. Concept Words vs Tags Percentage of videos with tags covering concept words across groups

  33. Singer Relationship Discovery • Comments on videos may talk about singers • Singer styles, appearance, news • Singer clustering using comments • Reveals relationships between singers • Discovers hidden phenomena

  34. Singer Relationship Discovery (Cont.)

  35. Conclusion • Captures user interests more accurately and fairly than human-predefined categories • Categories can change dynamically as user interests change over time • Obtains clusters that capture fine-grained ideas • Eases the task of video search by categorizing videos and refining the index

  36. Future Works • Extend to user clustering • Obtain relationships among videos, singers, and users across the entire social network • Study the social culture • Ease the job of targeting advertising at customers • Connect people who share the same interests

  37. Q & A Questions?
