Video Google – A google approach to Video Retrieval

    Presentation Transcript
    1. Video Google – A google approach to Video Retrieval

    2. Introduction • Problem: • Retrieve the key frames and shots of a video that contain a particular object or scene, with the ease and accuracy of Google. • Approach: • Effectively precompute matches • Textual analogy

    3. Architecture Visual Word User-End Storage Indexing

    4. Video Google – Visual words Dhruvan Dileep Nishant Pradeep Pramod Sunil

    5. MSER: Maximally Stable Extremal Regions • A Maximally Stable Extremal Region (MSER) is a connected component of an appropriately thresholded image.

    6. SA: Shape Adapted regions • The Shape Adapted regions are invariant to affine transformations. • The SA regions tend to be centered on corner-like features.

    7. SIFT: Scale Invariant Feature Transform • Invariant to image scaling and rotation • Partially invariant to changes in illumination and viewpoint • 128-dimensional descriptor

    8. Clustering • Aim: To vector quantize descriptors into clusters to be used as visual words • Clustering Techniques • Agglomerative • O(n^2) space. • K-means • O(n+k) space, O(n*k*e) time complexity • Fast K-means • Triangle inequality used. • O(n*k) space. • Distance calculations reduced to roughly n, rather than n*k*e
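
The vector quantization step on this slide can be sketched with plain K-means. This is a minimal illustration, not the authors' implementation: it uses toy 2-D points instead of 128-dimensional SIFT descriptors, and the function names (`kmeans`, `quantize`) and the fixed iteration count are assumptions made for the example. The triangle-inequality speed-up of Fast K-means is not included.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_center(point, centers):
    """Index of the cluster center closest to `point`."""
    return min(range(len(centers)),
               key=lambda j: squared_distance(point, centers[j]))


def kmeans(points, k, iterations=10, seed=0):
    """Plain (Lloyd's) K-means; returns the k cluster centers."""
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random.
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        # Assignment step: n * k distance computations per iteration.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest_center(p, centers)].append(p)
        # Update step: move each center to the mean of its members.
        for j, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous center
                dim = len(members[0])
                centers[j] = [sum(m[d] for m in members) / len(members)
                              for d in range(dim)]
    return centers


def quantize(descriptor, centers):
    """The visual word of a descriptor is the index of its nearest center."""
    return nearest_center(descriptor, centers)


# Toy data: two well-separated groups standing in for descriptors.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers = kmeans(points, k=2)
words = [quantize(p, centers) for p in points]
```

Each descriptor is then represented only by its visual word ID, which is what the indexing stage consumes.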

    9. Statistics • 19 half-hour videos: • Classification of 1060842 points – 9 hours

    10. Clustering Evaluation

    11. DB and API Indexing/Retrieval Visual Words UI

    12. Results

    13. Future Work • Vocabulary Tree for interest point classification • Increase the visual vocabulary through efficient clustering.

    14. Indexing and Retrieval in Vgoogle D Pavan Kumar B Rakesh Babu B Naveen Kumar Ankur Jaiswal V Sreekanth P Kowshik J Shashank

    15. Overview Visual Words Indexing Results Query

    16. Input format Pre-processing Query Set of visual words in the query rectangle

    17. Output format Retrieved Results

    18. Objectives Efficient Indexing Fast Retrieval Time Good Recall

    19. Approach … Removing the common words Reverse Indexing Ranking of results

    20. Indexing and Retrieval in Document Retrieval • Stop list: used to remove the common words. • Inverse File Structure: an entry for each word in the corpus, followed by a list of all the documents in which it appears. • Spatial Consistency Ranking: use the ordering and separation of words to calculate the relevance of a document.

    21. Stop list • In the textual context • Words are extracted from text. • Words are filtered based on their level of usefulness. • For instance, words which are independent of the subject or event being described are filtered out. • Removing such words will have no effect on the results. • E.g.: The way to the school is long and hard when walking in the rain. Removing `the` will have no effect on the result.

    22. Stop list (contd.) • In the current context • Stop list - list of visual words that occur very often or very rarely. • Determine stop list boundaries empirically. • Advantages • Reduce number of mismatches • Reduce size of inverted file • Meaningful visual vocabulary
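
A visual stop list of this kind could be built by counting how often each visual word occurs across all frames and dropping both ends of the frequency ranking. The sketch below is illustrative only: the function names and the two cut-off fractions are assumptions (the slide says the boundaries are determined empirically), and the frame data is made up.

```python
from collections import Counter


def build_stop_list(frames, top_fraction, bottom_fraction):
    """Return the set of visual words to discard.

    frames: dict mapping frame ID -> list of visual word IDs.
    top_fraction / bottom_fraction: share of the vocabulary to drop from
    the most frequent and least frequent ends (hypothetical parameters,
    to be tuned empirically).
    """
    counts = Counter(w for words in frames.values() for w in words)
    ranked = [w for w, _ in counts.most_common()]  # most frequent first
    n_top = int(len(ranked) * top_fraction)
    n_bottom = int(len(ranked) * bottom_fraction)
    stop = set(ranked[:n_top])
    if n_bottom:
        stop.update(ranked[-n_bottom:])
    return stop


def apply_stop_list(frames, stop):
    """Remove stop-listed visual words from every frame."""
    return {f: [w for w in words if w not in stop]
            for f, words in frames.items()}


# Made-up example: word 1 is very common, word 5 is very rare.
frames = {
    "f1": [1, 1, 1, 1, 1, 2, 2, 3, 4],
    "f2": [1, 1, 1, 1, 2, 2, 3, 3, 4, 5],
}
stop = build_stop_list(frames, top_fraction=0.2, bottom_fraction=0.2)
filtered = apply_stop_list(frames, stop)
```

Filtering before indexing is what shrinks the inverted file, since the most frequent words would otherwise carry the longest frame lists.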

    23. Stop list (contd…)

    24. Inverse File Structure • Inverted file structure for indexing • Popular data structure in document retrieval • Mapping from words to documents • Less query time compared to forward indexing • Forward indexing – sequential access • Inverted indexing – random access

    25. Words D1051 Movie D3 D23 D25 D1 Spain D1 D3 D8 ……. D2029 Table D2 D8 D100 ……. ……. ……. ……. ……. D12 D1078 D102 D25 Song

    26. Visual Analogy Words ~ Visual Words Documents ~ Frames Query vector ~ visual words in Sub-Part of frame

    27. Visual words D1051 V1 D3 D23 D25 D1 V2 D1 D3 D8 ……. D2029 V3 D2 D8 D100 ……. ……. ……. ……. ……. D12 D1078 D102 D25 Vn
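
The inverted file for visual words can be sketched as a mapping from each visual word to the set of frames that contain it. The frame IDs and word labels below are illustrative and the function names are assumptions, not part of the original system.

```python
from collections import defaultdict


def build_inverted_file(frames):
    """Map each visual word to the set of frames in which it appears.

    frames: dict mapping frame ID -> list of visual words (forward index).
    """
    inverted = defaultdict(set)
    for frame_id, words in frames.items():
        for word in words:
            inverted[word].add(frame_id)
    return inverted


def candidate_frames(inverted, query_words):
    """Frames sharing at least one visual word with the query,
    ordered by the number of distinct query words they contain."""
    hits = defaultdict(int)
    for word in set(query_words):
        # Direct lookup per word; no scan over all frames is needed.
        for frame_id in inverted.get(word, ()):
            hits[frame_id] += 1
    return sorted(hits, key=lambda f: (-hits[f], f))


# Illustrative forward index.
frames = {
    "D1": ["V1", "V2"],
    "D2": ["V3"],
    "D3": ["V1", "V2"],
    "D8": ["V2", "V3"],
}
inverted = build_inverted_file(frames)
result = candidate_frames(inverted, ["V1", "V2"])
```

Only frames that share a word with the query region are ever touched, which is why query time is lower than with a sequential scan of a forward index.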

    28. Ranking the results - tf-idf Document – vector of word frequencies Each component of the vector is given some weight Standard Weighting Method TF-IDF

    29. Ranking the results - tf-idf • Each document is represented as a vector <t_1, t_2, t_3, …, t_i, …, t_{k-1}, t_k> • n_id - number of occurrences of the ith word in document d • n_d - total number of words in document d • n_i - number of occurrences of the ith visual word in the whole database • N - number of documents in the whole database • IDF down-weights the most frequent words • Results are ranked by the cosine of the angle between the query vector and all document vectors
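
The weighting formula itself did not survive in the transcript. The sketch below assumes the usual tf-idf form that matches the four counts defined on the slide, t_i = (n_id / n_d) * log(N / n_i), and ranks frames by cosine similarity to the query. Function names and data are made up for illustration.

```python
import math
from collections import Counter


def tfidf_vector(words, n_i, N):
    """Sparse tf-idf vector {word: weight} for one document (frame).

    Assumed weighting: t_i = (n_id / n_d) * log(N / n_i), with n_i the
    number of occurrences of word i in the whole database, as on the slide.
    """
    counts = Counter(words)  # n_id for each word i
    n_d = len(words)         # total number of words in the document
    return {w: (c / n_d) * math.log(N / n_i[w])
            for w, c in counts.items() if w in n_i}


def cosine(u, v):
    """Cosine of the angle between two sparse vectors."""
    dot = sum(weight * v.get(w, 0.0) for w, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_frames(frames, query_words):
    """Return (frame ID, score) pairs, best match first."""
    N = len(frames)
    n_i = Counter(w for words in frames.values() for w in words)
    query_vec = tfidf_vector(query_words, n_i, N)
    scores = {f: cosine(query_vec, tfidf_vector(words, n_i, N))
              for f, words in frames.items()}
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Made-up frames described by visual words "a" to "d".
frames = {
    "f1": ["a", "a", "b"],
    "f2": ["b", "c"],
    "f3": ["c", "c", "d"],
    "f4": ["d"],
}
ranking = rank_frames(frames, ["a", "b"])
```

With n_i counted as total occurrences, a word that occurs more than N times gets a negative weight; counting the number of documents containing the word instead is a common variant that avoids this.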

    30. Ranking the results – Spatial Consistency “Google increases the probability of documents having all the search words close to one another.” There it is. That’s what I ….. been …. have …. I have been there once, while ……..

    31. Spatial Consistency Ranking Spatial arrangement of objects in images. Spatial consistency measure - Re-rank the results Neighboring matches in the query region lie in a surrounding area in the retrieved image.

    32. Spatial Consistency Ranking Search area is defined by 15 nearest neighbors. A neighbor in the surrounding area in the retrieved image counts as a vote. Match with no support / hits is rejected. Repeat this for every match. Total number of votes decides the rank.
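
The voting procedure can be sketched as follows. This is one reading of the slide, stated as an assumption: the search area around a match is taken to be the k nearest regions in the query frame and the k nearest regions in the retrieved frame, and every other match that falls inside both areas casts one vote. Region IDs, coordinates, and function names are illustrative.

```python
def k_nearest(regions, center_id, k):
    """IDs of the k regions closest to `center_id` (excluding itself).

    regions: dict mapping region ID -> (x, y) position in the frame.
    """
    cx, cy = regions[center_id]
    others = [r for r in regions if r != center_id]
    others.sort(key=lambda r: (regions[r][0] - cx) ** 2
                              + (regions[r][1] - cy) ** 2)
    return set(others[:k])


def spatial_consistency(query_regions, frame_regions, matches, k=15):
    """Return (total votes, supported matches) for one retrieved frame.

    matches: list of (query region ID, frame region ID) pairs.
    """
    total_votes = 0
    supported = []
    for q, f in matches:
        near_q = k_nearest(query_regions, q, k)
        near_f = k_nearest(frame_regions, f, k)
        # Another match votes if it lies in the search area in both frames.
        votes = sum(1 for q2, f2 in matches
                    if (q2, f2) != (q, f) and q2 in near_q and f2 in near_f)
        if votes > 0:  # a match with no support is rejected
            supported.append((q, f))
            total_votes += votes
    return total_votes, supported


# Toy example with k=2: three mutually consistent matches and one stray.
query_regions = {"q1": (0, 0), "q2": (1, 0), "q3": (0, 1),
                 "q4": (20, 20), "q5": (21, 20), "q6": (20, 21)}
frame_regions = {"f1": (5, 5), "f2": (6, 5), "f3": (5, 6),
                 "f4": (40, 40), "f5": (41, 40), "f6": (40, 41),
                 "f7": (90, 90)}
matches = [("q1", "f1"), ("q2", "f2"), ("q3", "f3"), ("q4", "f7")]
total_votes, supported = spatial_consistency(
    query_regions, frame_regions, matches, k=2)
```

Frames are then re-ranked by `total_votes`. The stray match `("q4", "f7")` has no matching neighbors in either frame and is dropped.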

    33. [Figure: spatial consistency voting example. Number of votes = 3]

    34. Frame 1 Frame 2 Frame 3 Frame N Visual words 4 V1 10 7 0 8 V2 4 3 0 ……. 14 V3 9 0 2 ……. ……. ……. ……. ……. 8 Vn 0 2 0 57 36 23 4

    35. [Figure panels: Initial Match, After Stoplist, After Spatial Consistency]

    36. Future Work More efficient implementation of spatial consistency. Improve the retrieval time.

    37. USER INTERFACE Chetan Chhaya Nishant Revanth Sandeep Sheetal

    38. Objective • Build a web interface for retrieving shots from a news video database that match the given image query • Display the ranked list of shots (e.g. by date, channel, maximum match, month)

    39. Input & Output

    40. About The Interface… The interface consists of the following three parts: • Database Schema • Data Directories • Source Code Files

    41. Database Schema All the videos and the metadata corresponding to the videos are stored in an SQL database, which can be queried using MySQL. The following two tables are used: Table 1, Table 2

    42. Data Directories The data is stored in the following five directories: • Thumbnails • Keyframes • Stories • Shots • videos

    43. Source Files The interface part consists of 8 files: • index.cgi • server.cgi • shots.cgi • keyframes.cgi • SelectRect.js • display.cgi • play.cgi • conf.py Each file is a module.

    44. index.cgi Home page of the interface. This page lists today's videos, each shown as the thumbnail of the first keyframe of the first shot of the video. It also gives the user the option to select specific videos by date and channel through combo boxes.

    45. server.cgi The user can be directed to this page from any of the other pages, since all of them let the user select from the combo boxes. This page lists the results of the user's selection from the combo boxes (based on the criteria of date and channel). The displayed result shows the thumbnail of the first keyframe of each video.

    46. shots.cgi Page used to display the shots of the video selected on the previous page. The stories that make up the video are displayed on the screen one after another. For each story, we display the keyframe thumbnails of all the shots in that story.