
MILOS: An Architecture for Multimedia Digital Libraries and Content Management Applications



Presentation Transcript


  1. MILOS: An Architecture for Multimedia Digital Libraries and Content Management Applications Pasquale Savino I.S.T.I.

  2. Scope of Digital Library technology • [Figure: technologies positioned along two axes, Knowledge of Users/Tasks (low to high) and Structure of Data (low to high). Labels: Web, Information Retrieval, Semantic Web, Semistructured data, Databases, Digital Library Technologies]

  3. Digital Libraries today • Focus on Cultural Heritage preservation and access • Access to the OPAC (Online Public Access Catalog) of public libraries, museums, etc. from the Web • New libraries (documents, images, audio/video) with digital multimedia content • Access based on standardized metadata, generic (e.g. Dublin Core) or area-specific • Distributed Web-based architectures • New services available: • Multilingual access • Personalization • Recommendation • Annotation • Collection support

  4. Digital Library Vision • Digital libraries should enable any citizen to access all human knowledge any time and anywhere, in a friendly, multi-modal, efficient, and effective way, by overcoming barriers of distance, language, and culture and by using multiple Internet-connected devices

  5. Digital Library Vision • DL Functionalities • Rich information needs • Multiple sources of related information • Heterogeneous information • Rich data sources • Multimedia information • Defined user populations • Motivated users • Task-orientation • Domain-orientation • Cross-lingual access • Collaboration

  6. Application areas • Multimedia digital archives • Publishing support • Broadcasting support • Production support • E-Learning • Corporate content management • Health and medicine • Biology • Government and Public Administration • …

  7. Why a Content Management System • Digital libraries manage documents containing many different types of data • Many different metadata models • DL software components are typically built for one specific use only • Lack of general purpose building components

  8. The main characteristics of the MCMS • Flexibility • Management of different types of data stored in different repositories with different storage strategies • Capability of describing documents with arbitrary, and possibly heterogeneous, metadata • Support of custom/personalized views on the metadata schema used • Scalability • Management of DLs of different sizes • Dealing with DL evolution • Efficiency

  9. The MILOS MCMS • MILOS is a general purpose Multimedia Content Management System • Manages and serves any multimedia documents • Manages any metadata of documents • MILOS is based on a standard platform • Developed using Web Service technology, which in many cases provides support for authentication, authorization management, distribution, etc. • Mainly developed in Java • Very easy installation (Drag and Drop) • Exploitation of advanced native XML database technology

  10. The MILOS MCMS • Search capabilities: • Traditional fielded search capabilities • Full text search (e.g. on video transcripts) • Search on automatically associated classification categories • Visual content similarity search • The system is not tied to a specific metadata schema • Any XML encoded metadata can be managed by the system (e.g. DC, MPEG-7, ECHO, proprietary model) • Metadata mapping techniques are used to provide users with a homogeneous view • Several different and heterogeneous applications can be supported

  11. Combined search capabilities • Example query: retrieve all videos with mountains in the background that discuss the Afghanistan earthquake and are classified as foreign affairs • [Architecture diagram with three layers: Interface Logic, Business Logic, Data Logic] • Interface Logic: Retrieval Interface, JSP (SOAP Comm.); Metadata Editor, Visual Basic (SOAP Comm.); labels Search, Browser, MDEdit • Business Logic: Multimedia Content Management Server (SOAP Web Service), with the Repository Metadata Integrator • Data Logic: Metadata Storage and Retrieval, an XML Search Engine (SOAP Web Service) offering structure search, fielded search, full text search, multimedia search, schema independence, and XQuery support, with Full Text Index, Topic Cat. Index, and Visual Cnt. Index over ECHO, MPEG-7, Dublin Core, … metadata; Multimedia Server, the multimedia document service (SOAP Web Service), storing MPEG-1, MPEG-2, JPEG, … • Metadata independence: the schema seen in the interface logic can differ from the one(s) used in the repository

  12. Metadata Storage and Retrieval • Based on a native XML database/repository • Solutions based on conventional DB technology may be too inefficient for complex metadata models • Metadata represented in XML • Arbitrary metadata structure allowed • Export/import of metadata easily managed • No XML schema definition is needed • Arbitrary and heterogeneous metadata representations • Search based on XQuery extended with similarity search support • Optional index definition for performance improvements • The system administrator can associate an index with specific XML elements • Support for free text search • Image similarity search
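
The combination of schema-free storage and optional, administrator-defined indexes can be illustrated with a small Python sketch. It is not MILOS code: the class `MetadataStore`, its methods, and the sample element names are hypothetical, and an in-memory dictionary stands in for the native XML database. The point is only that a query works on any XML structure without an index, and that adding an index on a specific element changes the speed, not the result.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def element_values(root, tag):
    """Yield the normalized text of every element named `tag`, at any depth."""
    for node in root.iter(tag):
        if node.text and node.text.strip():
            yield node.text.strip().lower()


class MetadataStore:
    """Toy schema-free XML store with optional per-element indexes (hypothetical)."""

    def __init__(self):
        self.docs = {}     # document id -> parsed XML root
        self.indexes = {}  # element tag -> {value -> set of document ids}

    def add(self, doc_id, xml_text):
        """Store a document of arbitrary structure and update existing indexes."""
        root = ET.fromstring(xml_text)
        self.docs[doc_id] = root
        for tag, index in self.indexes.items():
            for value in element_values(root, tag):
                index[value].add(doc_id)

    def create_index(self, tag):
        """Administrator action: index one element name across all documents."""
        index = defaultdict(set)
        for doc_id, root in self.docs.items():
            for value in element_values(root, tag):
                index[value].add(doc_id)
        self.indexes[tag] = index

    def find(self, tag, value):
        """Use the index when one exists, otherwise scan every document."""
        key = value.strip().lower()
        if tag in self.indexes:
            return set(self.indexes[tag].get(key, ()))
        return {d for d, root in self.docs.items() if key in element_values(root, tag)}


store = MetadataStore()
# Two different, undeclared structures that both carry a <creator> element.
store.add("a", "<record><title>Newsreel</title><creator>Istituto Luce</creator></record>")
store.add("b", "<video><info><creator>Istituto Luce</creator></info></video>")
store.add("c", "<record><creator>Reuters</creator></record>")

scan_hits = store.find("creator", "Istituto Luce")   # full scan, no index yet
store.create_index("creator")
index_hits = store.find("creator", "Istituto Luce")  # same query, index lookup
```

Similarity search and XQuery evaluation are left out of the sketch; they would sit behind the same `find`-style entry point.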

  13. Multimedia Server • Storage of data of any media • Support of different storage strategies, which may depend on the application (data size, access and transfer time). The required strategy may change over time. • DL application developers need not specify how and where data are stored, only the performance they require • Use of a mapping between URNs and actual location • Use of rules (based on MIME types) to enforce specific storage strategies
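
A minimal Python sketch of the two mechanisms named on this slide, the URN-to-location mapping and the MIME-type rules, is given below. Everything in it is assumed for illustration: the class names `StorageRule` and `MultimediaServer`, the storage paths, and the example URNs do not come from MILOS. It shows why applications that only hold URNs are unaffected when the storage strategy changes.

```python
from dataclasses import dataclass, field


@dataclass
class StorageRule:
    """Hypothetical rule: MIME types with this prefix go to this storage root."""
    mime_prefix: str
    storage_root: str


@dataclass
class MultimediaServer:
    rules: list[StorageRule]
    default_root: str = "/data/default"  # assumed fallback location
    # URN -> actual location; applications only ever see the URN.
    locations: dict[str, str] = field(default_factory=dict)

    def store(self, urn: str, mime_type: str, filename: str) -> str:
        """Pick a location from the first matching rule and record the mapping."""
        root = next(
            (r.storage_root for r in self.rules if mime_type.startswith(r.mime_prefix)),
            self.default_root,
        )
        self.locations[urn] = f"{root}/{filename}"
        return self.locations[urn]

    def resolve(self, urn: str) -> str:
        """Translate a URN into the current physical location."""
        return self.locations[urn]

    def relocate(self, urn: str, new_location: str) -> None:
        """Change where the data lives without changing the URN."""
        self.locations[urn] = new_location


server = MultimediaServer(
    rules=[
        StorageRule("video/", "/mnt/streaming"),  # e.g. disks tuned for large files
        StorageRule("image/", "/mnt/images"),
    ]
)
server.store("urn:example:video:42", "video/mpeg", "doc42.mpg")
server.store("urn:example:text:7", "text/xml", "doc7.xml")
```

Calling `server.relocate(...)` later moves a document to another location while every stored URN reference stays valid.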

  14. Repository Metadata Integrator • Metadata independence (via metadata mapping) • Use of schema mapping rules to map application metadata into Metadata Storage • Each rule specifies how to translate a metadata field known to the application into an XPath expression used to access that field in the Metadata Storage • Mapping rules are used to build the XQuery statements executed in the Metadata Storage and to transform the results back into application metadata
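
The mapping idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the MILOS rule format: the `MAPPING_RULES` table, the function `to_application_view`, and the element names in the two sample documents are invented and do not reproduce the real ECHO or MPEG-7 schemas. The standard library's `ElementTree` supports only a subset of XPath, which is enough for simple field paths.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping rules: application field -> path inside each repository schema.
MAPPING_RULES = {
    "echo": {
        "title": "./Work/Title",
        "creator": "./Work/Author",
    },
    "mpeg7": {
        "title": "./Description/CreationInformation/Title",
        "creator": "./Description/CreationInformation/Creator",
    },
}


def to_application_view(xml_text: str, schema: str) -> dict[str, str]:
    """Apply the rules for `schema` and return the fields the application knows."""
    root = ET.fromstring(xml_text)
    view = {}
    for field_name, path in MAPPING_RULES[schema].items():
        node = root.find(path)
        if node is not None and node.text:
            view[field_name] = node.text.strip()
    return view


# The same logical record stored under two invented repository structures.
ECHO_DOC = (
    "<AVDocument><Work><Title>Historical newsreel</Title>"
    "<Author>Istituto Luce</Author></Work></AVDocument>"
)
MPEG7_DOC = (
    "<Mpeg7><Description><CreationInformation><Title>Historical newsreel</Title>"
    "<Creator>Istituto Luce</Creator></CreationInformation></Description></Mpeg7>"
)

echo_view = to_application_view(ECHO_DOC, "echo")
mpeg7_view = to_application_view(MPEG7_DOC, "mpeg7")
```

Both calls return the same dictionary, so the application sees one homogeneous view regardless of which schema the repository uses.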

  15. Access to heterogeneous metadata repositories • [Diagram: an application providing a Dublin Core view on metadata, a MILOS repository based on ECHO metadata, and a MILOS repository based on MPEG-7 metadata]

  16. Ingestion of existing data and metadata in MILOS • [Diagram: a repository using a proprietary metadata model; ingestion of data and metadata in MILOS; a MILOS repository based on the proprietary metadata] • New metadata immediately accessible • Possibility to define indexes to speed up retrieval

  17. Distribution and multiple disk storage • [Diagram: a MILOS repository using multiple disk storage, with two Multimedia Server boxes]

  18. Examples of DL archives • Four DLs have been ingested • Reuters Data Set • 810,000 agency news items (2.6 GB), text and metadata encoded in XML • ACM Sigmod Record and DBLP data sets • Sigmod Record composed of 46 XML files • DBLP: one single XML file (187 MB) • Different structures, one single interface through mapping mechanisms • The ECHO data set • About 50 hours of historical documentaries (8,000 videos), coming from 4 different countries • 43,000 XML files (36 MB), 21 GB of MPEG-1 video and JPEG images • Image similarity search based on MPEG-7 image descriptors

  19. Indexing videos • Main components: • Entry point • Indexing workflows • Metadata editing station • Automatic processing services: speech recognition, segmentation, summarisation, … • [Workflow diagram labels: New Film, Entry point, Film repository, Automatic Processing, Video and Metadata repository, Manual metadata editing, Metadata repository]

  20. Searching videos • Main components: • Video Search • Access to metadata DB • Full text search on transcripts • Image similarity search • Cross-language retrieval on selected metadata fields and transcript • Query formulation • Metadata fields • Audio transcripts • Video key frames • Cross-language queries • [Diagram labels: Transcript repository, Video key frames repository, Video and Metadata repository] • Examples of queries: • Metadata associated with the entire video • E.g. find b&w videos produced before World War II by Istituto Luce • Metadata associated with video shots • E.g. find a shot where the audio transcript contains the words “Attentato Banca Nazionale dell’Agricoltura” • Metadata associated with single frames • E.g. find a video that contains a frame similar to this image [the image is provided as an example] • Any combination of the previous cases

  21. ECHO metadata model • Supports a multi-layer and hierarchical description of audio-video documents • Description of different aspects of the same document • The model can be adapted to specific application needs • Describes both automatically extracted and manually created metadata

  22. ECHO metadata model • Extends the IFLA-FRBR model • Four entities are used to describe different aspects of a resource: WORK, EXPRESSION, MANIFESTATION, ITEM • WORK: describes a distinct intellectual or artistic creation. It is the abstract idea of a creation; whether it is realized as a book, a film, or a cartoon is not specified here but described by the Expression entity. Examples of WORK are: the terrorist attack at Banca Nazionale dell’Agricoltura; 2001: A Space Odyssey; … • EXPRESSION: the intellectual or artistic realisation of a work in the form of alphanumeric, musical, or choreographic notation, sound, image, etc. No information on the physical embodiment is given. Examples of EXPRESSION are: TV news on the terrorist attack; a documentary on the terrorist attack; interviews on the terrorist attack; … • MANIFESTATION: the physical embodiment of an expression, e.g. manuscripts, books, maps, sound, CD-ROM • ITEM: a single exemplar of a manifestation
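
The four-level hierarchy can be pictured as nested records. The Python sketch below is only an illustration of the Work / Expression / Manifestation / Item chain described on this slide: the attribute names (`title`, `form`, `carrier`, `location`), the helper `all_items`, and the sample carrier and locations are assumptions, not fields of the actual ECHO model.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """A single exemplar of a manifestation."""
    location: str  # hypothetical attribute


@dataclass
class Manifestation:
    """Physical embodiment of an expression."""
    carrier: str  # hypothetical attribute, e.g. a file format or medium
    items: list[Item] = field(default_factory=list)


@dataclass
class Expression:
    """Intellectual or artistic realisation of a work."""
    form: str
    manifestations: list[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    """Abstract idea of a creation."""
    title: str
    expressions: list[Expression] = field(default_factory=list)

    def all_items(self) -> list[Item]:
        """Walk the hierarchy down to every physical exemplar."""
        return [
            item
            for expression in self.expressions
            for manifestation in expression.manifestations
            for item in manifestation.items
        ]


# Work and Expression examples are taken from the slide; the rest is invented.
work = Work(
    title="The terrorist attack at Banca Nazionale dell'Agricoltura",
    expressions=[
        Expression(
            form="TV news on the terrorist attack",
            manifestations=[
                Manifestation(
                    carrier="MPEG-1 video file",
                    items=[Item("archive copy 1"), Item("archive copy 2")],
                )
            ],
        ),
        Expression(form="A documentary on the terrorist attack"),
    ],
)
```

Keeping each level as its own record lets one work carry several expressions, each with its own physical embodiments and exemplars.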

  23. The ECHO metadata model

  24. MILOS Demo • Start
