
SAPIR Search in Audio-visual content using P2p IR

Yosi Mass, Raul Santos. SAPIR Search in Audio-visual content using P2p IR. Why SAPIR? The searchable space created by the growing volume of video and multimedia files may greatly exceed the area covered by major search engines.


Presentation Transcript


  1. Yosi Mass, Raul Santos. SAPIR Search in Audio-visual content using P2p IR. Chorus cluster meeting, Vilamoura, 16-17 April 2008

  2. Why SAPIR?
     • The searchable space created by the growing volume of video and multimedia files may greatly exceed the area covered by major search engines.
     • Traditional search engines are limited to searching the text and metadata associated with multimedia content. If content providers do not describe their multimedia files clearly and accurately, or use inaccurate tags, this method falls short.
     • Current internet search is geared mainly to relatively powerful desktop machines accessed through regular web browsers, not to lightweight mobile devices with their connectivity and interactivity limitations.

  3. SAPIR Objectives
     • Develop cutting-edge technology to index and search large-scale audio-visual information by content.
     • Make information available on many devices, enhanced by social networking, while preserving privacy and preventing fraud.
     • Support new trends in multimedia (MM) content production: personal producers vs. professional producers.

  4. SAPIR challenges
     • Dimensions of the search problem:
         • Efficiency (scalability is the key issue)
         • Effectiveness (quality measures of results)
     • Efficiency challenges
         • Scale in collection size
         • Scale in number of users
     • Effectiveness challenges
         • New search paradigm combining text + audio-visual content
     • Usability challenges

  5. SAPIR Consortium

  6. SAPIR approach: P2P Architecture

  7. Image Database Search using the Query by Example Paradigm
     • Search for information about a physical object by taking an image of it with a mobile phone, or find a song by humming the melody.
     • Support similarity search for metric spaces.
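Similarity search in a metric space relies on a distance function that obeys the triangle inequality, which lets an index discard candidates without computing their distance to the query. The following is a minimal sketch of that idea, assuming features are plain numeric vectors compared with L1 distance and a single pivot. The slides do not say which distances or index structures SAPIR uses, so `l1_distance`, `build_pivot_table`, `range_search` and the sample data are all hypothetical.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def l1_distance(a: Vector, b: Vector) -> float:
    """Manhattan distance; it is a metric, so the triangle inequality holds."""
    return sum(abs(x - y) for x, y in zip(a, b))


def build_pivot_table(db: Dict[str, Vector], pivot: Vector) -> Dict[str, float]:
    """Precompute the distance from every stored object to one pivot."""
    return {obj_id: l1_distance(vec, pivot) for obj_id, vec in db.items()}


def range_search(
    db: Dict[str, Vector],
    pivot: Vector,
    pivot_table: Dict[str, float],
    query: Vector,
    radius: float,
) -> Tuple[List[Tuple[str, float]], int]:
    """Return (hits within radius sorted by distance, number of distances computed)."""
    d_query_pivot = l1_distance(query, pivot)
    computed = 1
    hits = []
    for obj_id, vec in db.items():
        # Triangle inequality: |d(q,p) - d(o,p)| <= d(q,o).
        # If this lower bound already exceeds the radius, skip the object.
        if abs(d_query_pivot - pivot_table[obj_id]) > radius:
            continue
        d = l1_distance(query, vec)
        computed += 1
        if d <= radius:
            hits.append((obj_id, d))
    hits.sort(key=lambda h: h[1])
    return hits, computed


# Hypothetical toy database of 2-D "descriptors".
db = {
    "img_a": [0.0, 0.0],
    "img_b": [1.0, 1.0],
    "img_c": [9.0, 9.0],
    "img_d": [10.0, 8.0],
}
pivot = [0.0, 0.0]
table = build_pivot_table(db, pivot)
hits, computed = range_search(db, pivot, table, query=[1.0, 0.0], radius=2.0)
print(hits, computed)
```

In this toy run the two distant images are pruned by the pivot bound alone, so only three distances are computed for four stored objects. Real metric indexes use many pivots or tree structures, but the pruning rule is the same.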

  8. Feature extraction
     <SapirMMObject>
       <title>when waves collide</title>
       <Mpeg7>
         <VisualDescriptor type="ScalableColorType">
         <VisualDescriptor type="ColorStructureType">
         <VisualDescriptor type="ColorLayoutType">
         <VisualDescriptor type="EdgeHistogramType">
         <VisualDescriptor type="HomogeneousTextureType">
       </Mpeg7>
       <comments>
         <comment id="…" author="…">beautiful…</comment>
         <comment ...>very powerful…</comment>
       </comments>
       <tags>
         <tag id="254" author="12@N00">waves</tag>
         <tag …>Victoria beach</tag>
       </tags>
     </SapirMMObject>
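To illustrate how such a record could be consumed, here is a small sketch that parses a simplified, well-formed variant of the slide's example with Python's standard library. The simplifications are assumptions made for illustration: descriptor payloads are omitted (each `VisualDescriptor` is written as an empty element), the comments block is dropped, and the elided attributes of the second tag are left out. The `summarize` helper is hypothetical and not part of any SAPIR API.

```python
import xml.etree.ElementTree as ET

# Simplified, well-formed variant of the slide's example (assumption:
# descriptor contents and elided attributes are left out).
SAMPLE = """
<SapirMMObject>
  <title>when waves collide</title>
  <Mpeg7>
    <VisualDescriptor type="ScalableColorType"/>
    <VisualDescriptor type="ColorStructureType"/>
    <VisualDescriptor type="ColorLayoutType"/>
    <VisualDescriptor type="EdgeHistogramType"/>
    <VisualDescriptor type="HomogeneousTextureType"/>
  </Mpeg7>
  <tags>
    <tag id="254" author="12@N00">waves</tag>
    <tag>Victoria beach</tag>
  </tags>
</SapirMMObject>
"""


def summarize(xml_text: str) -> dict:
    """Extract the title, the visual descriptor types and the tags."""
    root = ET.fromstring(xml_text.strip())
    return {
        "title": root.findtext("title"),
        "descriptor_types": [d.get("type") for d in root.iter("VisualDescriptor")],
        "tags": [t.text for t in root.iter("tag")],
    }


summary = summarize(SAMPLE)
print(summary)
```

The split between textual fields (title, tags) and visual descriptors in the summary mirrors the two kinds of information the record carries.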

  9. Indexing
     [Diagram labels: Visual Descriptors Overlay, Metric index; Text Overlay, Text index]
     <SapirMMObject>
       <title>when waves collide</title>
       <Mpeg7>
         <VisualDescriptor type="ScalableColorType">
         <VisualDescriptor type="ColorStructureType">
         <VisualDescriptor type="ColorLayoutType">
         <VisualDescriptor type="EdgeHistogramType">
         <VisualDescriptor type="HomogeneousTextureType">
       </Mpeg7>
       <comments>
         <comment id="…" author="…">beautiful…</comment>
         <comment ...>very powerful…</comment>
       </comments>
       <tags>
         <tag id="254" author="12@N00">waves</tag>
         <tag …>Victoria beach</tag>
       </tags>
     </SapirMMObject>
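The indexing slide pairs a visual-descriptor overlay with a metric index and a text overlay with a text index. A minimal single-process sketch of that routing is shown below, assuming an inverted index for text and a plain dictionary of vectors standing in for the metric index. The classes `TextIndex` and `MetricIndex`, the function `index_object`, and the sample vector are hypothetical; the actual SAPIR overlays are distributed across peers and are not described in the slides.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


class TextIndex:
    """Toy inverted index: term -> set of object ids."""

    def __init__(self) -> None:
        self.postings: Dict[str, Set[str]] = defaultdict(set)

    def add(self, obj_id: str, text: str) -> None:
        for term in text.lower().split():
            self.postings[term].add(obj_id)

    def lookup(self, term: str) -> Set[str]:
        return set(self.postings.get(term.lower(), set()))


class MetricIndex:
    """Stand-in for a metric index: stores vectors keyed by (object, descriptor type)."""

    def __init__(self) -> None:
        self.vectors: Dict[Tuple[str, str], List[float]] = {}

    def add(self, obj_id: str, descriptor_type: str, vector: List[float]) -> None:
        self.vectors[(obj_id, descriptor_type)] = vector


def index_object(
    obj_id: str,
    title: str,
    tags: List[str],
    descriptors: Dict[str, List[float]],
    text_index: TextIndex,
    metric_index: MetricIndex,
) -> None:
    """Route textual fields to the text index and descriptors to the metric index."""
    text_index.add(obj_id, title)
    for tag in tags:
        text_index.add(obj_id, tag)
    for descriptor_type, vector in descriptors.items():
        metric_index.add(obj_id, descriptor_type, vector)


text_index = TextIndex()
metric_index = MetricIndex()
index_object(
    "photo1",
    title="when waves collide",
    tags=["waves", "Victoria beach"],
    descriptors={"ColorLayoutType": [0.1, 0.4, 0.2]},  # made-up values
    text_index=text_index,
    metric_index=metric_index,
)
print(text_index.lookup("waves"))
```

Keeping the two indexes separate means each can be searched with the technique that suits it: term lookup for text, distance-based search for descriptors.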

  10. Querying
      [Diagram labels: Tag: names; Visual Descriptors Overlay; Text Overlay; Merge Results; Approximation]
      <Mpeg7Query weight="1">
        <VisualDescriptor type="ScalableColorType">
        <VisualDescriptor type="ColorStructureType">
        <VisualDescriptor type="ColorStructureType">
      </Mpeg7Query>
      <Mpeg7Query weight="0.5">
        <tag>waves</tag>
      </Mpeg7Query>
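The query example assigns weight 1 to the visual part and weight 0.5 to the tag part, and the diagram includes a "Merge Results" step. One plausible way to merge is a weighted sum of per-overlay scores; the sketch below assumes that scheme, with scores in [0, 1] where higher is better and a missing score counted as 0. The slides do not specify the actual merge function, so `merge_results` and the sample scores are hypothetical.

```python
from typing import Dict, List, Tuple


def merge_results(
    visual_scores: Dict[str, float],
    text_scores: Dict[str, float],
    w_visual: float = 1.0,
    w_text: float = 0.5,
    k: int = 10,
) -> List[Tuple[str, float]]:
    """Combine two score lists with a weighted sum and return the top k.

    Assumption: an object missing from one list contributes 0 for that part.
    Ties are broken by object id so the output is deterministic.
    """
    ids = set(visual_scores) | set(text_scores)
    combined = {
        obj_id: w_visual * visual_scores.get(obj_id, 0.0)
        + w_text * text_scores.get(obj_id, 0.0)
        for obj_id in ids
    }
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


# Made-up scores from the two overlays.
visual = {"a": 0.75, "b": 0.5, "c": 0.25}
text = {"b": 1.0, "c": 0.5, "d": 0.5}
ranked = merge_results(visual, text)
print(ranked)
```

Here object `b` is only the second-best visual match, but its strong tag match lifts it to the top of the merged ranking, which is the kind of interaction a combined text and image query is meant to capture.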

  11. Project status for Apr 2008
      • A scalable, extensible and versatile P2P architecture was defined. APIs for P2P content management, indexing and search were defined and implemented.
      • Several scenarios were defined and tested in focus groups.
      • A common schema for feature representation using MPEG-7 was defined.
      • A demo of indexing and search over 10M Flickr files, combining content-based image search with text and metadata, was implemented using the SAPIR APIs.
      • A testbed of 50M Flickr files was crawled using the EGEE grid, aiming at 100M by year end. This testbed collection will be available for scientific experiments (CoPhIR, http://cophir.isti.cnr.it).
      • The next demo (due Nov '08) will include search in music, video and speech, as well as some scenario integration.

  12. Tests
      • P2P architecture for search in audio-visual content
      • Efficiency: some initial results
          • 1M FlickrXML files: ~500 msec per query, 50 peers (8 CPU, 16 GB)
          • 10M FlickrXML files: ~500 msec per query, 500 peers (16 CPU, 64 GB)
      • Effectiveness
          • Text + image improves over text only or image only
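A quick back-of-the-envelope check on the two efficiency runs: if files are spread evenly across peers (an assumption; the slides do not say how the collection is partitioned), the average load per peer is the same in both configurations, which fits the roughly constant query time.

```python
# Figures taken from the slide; even distribution across peers is assumed.
configs = [
    {"files": 1_000_000, "peers": 50},
    {"files": 10_000_000, "peers": 500},
]

files_per_peer = [c["files"] // c["peers"] for c in configs]
print(files_per_peer)
```

Both runs come out at 20,000 files per peer, so the collection grew tenfold while the per-peer share stayed fixed.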

  13. WP9: Dissemination and exploitation
      • Public website: http://www.sapir.eu
      • Dissemination
          • First DUP was published
          • Participation in Chorus meetings and road map
          • Workshops: SIGIR'07, ECIR'08, SAC'08
          • Demos
      • Publications: more than 20 SAPIR-related publications so far
      • Contacts with standards bodies: MPEG-21, MPEG-A, MPEG-7
      • Exploitation

  14. WP9: Dissemination and exploitation
      • Proposed contributions to standards
          • Extension to MPEG-7 for music and speech
          • Proposals for MPQF (MPEG-7 Query Format)
          • A DRM implementation for P2P based on Chillout
          • Propose a call for MPEG-21 Query Format

  15. Thank You! For more info visit http://www.sapir.eu

  16. Results (Jan 2007 to Mar 2008)
      • WP1: Scenarios and a complete guideline for usability and user interface design
      • WP2: Architecture for P2P and APIs
      • WP3: Definition of a common schema for feature representation using MPEG-7
      • WP4, WP5: Demo of indexing and search in 10M Flickr files combining text and low-level visual descriptors
      • WP6: Work on an interoperable DRM solution (Chillout) for P2P networks
      • WP7: Initial design of social networking and support for mobile devices
