DILIGENT: Digital libraries powered by the Grid (Bhaskar Mehta, FhG IPSI, Germany)

Presentation Transcript


  1. DILIGENT: Digital libraries powered by the Grid • Bhaskar Mehta, FhG IPSI, Germany • Work Package Leader, Content and Metadata Management • Bhaskar.Mehta@ipsi.fraunhofer.de

  2. Overview • Introduction to DILIGENT • Grid: Opportunity and Challenge • Challenges in Information Management • Data Management in DILIGENT • Open Issues and Next Steps (International Symposium on Grid Computing, Taipei, 3rd May 2006)

  3. Introduction to DILIGENT • DILIGENT: A Digital Library Infrastructure on Grid-Enabled Technology • Duration: 3 years • Commencement Date: September 2004 • Effort: 1024 person-months (p/m) • Cost: 9.8 M Euro • European Union funding: 6.3 M Euro

  4. Partners • Consiglio Nazionale delle Ricerche – ISTI (Italy, Scientific Coordinator) • European Research Consortium for Informatics and Mathematics (France, Administrative Coordinator) • University of Athens (Greece) • Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. – IPSI (Germany) • University for Health Informatics and Technology Tyrol (Austria)/ETH Zürich/UNI Basel • University of Strathclyde (United Kingdom) • Engineering Ingegneria Informatica SpA (Italy) • Fast Search & Transfer ASA (Norway) • 4D SOFT Software Development Ltd. (Hungary) • European Organization for Nuclear Research (Switzerland) • European Space Agency – ESA (Italy) • Scuola Normale Superiore (Italy) • RAI Radio Televisione Italiana (Italy)

  5. DILIGENT Objectives • To create an advanced test-bed that will allow members of dynamic virtual e-Science organizations to access shared knowledge and to collaborate in a secure, coordinated, dynamic and cost-effective way. • Expected Outcome • A Digital Library infrastructure which is Grid based • A test-bed based on this infrastructure • Two implemented scenarios: Earth Science and Cultural Heritage [Figure label: Grid Jobs]

  6. Motivation for NGDLs: Digital library challenges • Cost & Time • Construction and management of a DL require high investment and specialized personnel • Years are spent designing and setting up a DL • Shared Infrastructure with Authoring Capabilities • New functionality is computationally expensive and evolving • Multimedia indexing, clustering: e.g. LSI, pLSA • Multimedia querying: e.g. image retrieval by feature vectors • Multimedia processing: e.g. satellite images, partial encryption for video • Service-based Digital Libraries, with process management/distribution support • Heterogeneity & Distribution • DLs (and underlying components) use different models, APIs, data formats, etc. • DLs are distributed/replicated • Basing DLs on standards • Providing support for federated/distributed search, data brokering

  7. Grid as an Opportunity... and a Challenge [Diagram labels: Grid; DL; Potential; Digital Library; Challenge; Grid OS]

  8. Some Methodological Challenges • Service Oriented, Distributed Architecture • Requires open systems for • indexing, searching, feature extraction, metadata management • Distributed Search • Query Optimization • Semantic Data Integration • On Demand Service Activation • Satellite Images • Extraction • Virtual Organizations • Content Security • Resource Security
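
The "Distributed Search" item on this slide can be illustrated with a minimal sketch, assuming each library node returns its own relevance-ranked hits. The node names, document ids, score scale, and the `federated_search` helper are hypothetical and are not taken from DILIGENT.

```python
import heapq
from typing import Dict, List, Tuple

Hit = Tuple[float, str]  # (relevance score, document id)


def federated_search(
    results_per_node: Dict[str, List[Hit]], k: int
) -> List[Tuple[str, str, float]]:
    """Merge per-node hit lists into one global top-k list.

    Assumption: scores from different nodes are directly comparable.
    Real federated search usually has to normalize them first.
    """
    tagged = (
        (score, node, doc_id)
        for node, hits in results_per_node.items()
        for score, doc_id in hits
    )
    top = heapq.nlargest(k, tagged, key=lambda t: t[0])
    return [(node, doc_id, score) for score, node, doc_id in top]


# Hypothetical per-node results for one query.
nodes = {
    "earth-science": [(0.91, "sat-img-17"), (0.40, "report-3")],
    "cultural-heritage": [(0.75, "fresco-2"), (0.55, "map-9")],
}
top3 = federated_search(nodes, k=3)
```

The sketch only covers result merging; query optimization and semantic data integration, also listed on the slide, are separate problems it does not address.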

  9. Some Technological Challenges • Grid technology is file centric; DLs are collection centric • Metadata management with the Grid: based on key-value pairs • Lack of support for structured data (e.g. XML) • Retrieval support is limited • Availability and replication support: file based • Real-time processing vs batch processing • DL users require instantaneous response (à la Google) • Grid processes usually can't provide real-time response
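
The gap between key-value metadata and structured data named on this slide can be shown with a small sketch. It assumes a made-up XML record and a hypothetical `flatten` helper; neither reflects the actual gLite catalog schema or any DILIGENT metadata format.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Hypothetical metadata record, invented for illustration only.
RECORD = """
<record>
  <title>Flood extent map</title>
  <creator><name>A. Author</name><org>Example Org</org></creator>
  <coverage><region>Danube</region><year>2005</year></coverage>
</record>
"""


def flatten(element: ET.Element, prefix: str = "") -> Dict[str, str]:
    """Flatten an XML tree into path -> text pairs (a key-value view)."""
    path = f"{prefix}/{element.tag}"
    pairs: Dict[str, str] = {}
    text = (element.text or "").strip()
    if text:
        pairs[path] = text
    for child in element:
        pairs.update(flatten(child, path))
    return pairs


root = ET.fromstring(RECORD)

# Key-value view: nesting survives only as a naming convention in the keys.
flat = flatten(root)

# Structured view: the same value is reachable by a path query on the tree.
org = root.findtext("creator/org")
```

In this flattened form, repeated elements (for example two `creator` entries) would overwrite each other under the same key, which is one concrete way a key-value store loses structure that XML keeps.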

  10. DILIGENT Architecture

  11. Data Management in DILIGENT (1) • Common functionality for Content and Metadata management → effort duplication • Storage, Replication • Change Notification, Association Consistency • gLite functionality • Separate pipelines for Content and Metadata • Incomplete functionality (e.g. replication) • Insufficient for DILIGENT • File system vs data model • Flat records vs XML [Diagram labels: CM; MM; Emulation; Common Layer; XMLDB; gLite]

  12. Data Management in DILIGENT (2) • Identifying 3 basic layers • Base Layer: gLite functionality (SE, Catalog, FTS) • Storage Layer: replication, change notification, transactional support • Service Layer: service-specific functionality, API/WS view [Diagram labels: Content Management; Metadata Management; Storage Management Layer; Base Layer]
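
The three layers on this slide can be sketched roughly as follows. This is an illustration under stated assumptions, not the DILIGENT implementation: all class and method names are hypothetical, the base layer is an in-memory stand-in rather than real gLite calls, and transactional support is left out.

```python
from typing import Callable, Dict, List


class BaseLayer:
    """In-memory stand-in for file-level Grid storage (no real gLite calls)."""

    def __init__(self, sites: List[str]) -> None:
        self.sites: Dict[str, Dict[str, bytes]] = {s: {} for s in sites}

    def put(self, site: str, name: str, data: bytes) -> None:
        self.sites[site][name] = data

    def get(self, site: str, name: str) -> bytes:
        return self.sites[site][name]


class StorageLayer:
    """Adds replication and change notification on top of the base layer."""

    def __init__(self, base: BaseLayer) -> None:
        self.base = base
        self.listeners: List[Callable[[str], None]] = []

    def subscribe(self, listener: Callable[[str], None]) -> None:
        self.listeners.append(listener)

    def store(self, name: str, data: bytes) -> None:
        for site in self.base.sites:  # naive policy: replicate everywhere
            self.base.put(site, name, data)
        for listener in self.listeners:  # notify subscribers of the change
            listener(name)

    def load(self, name: str) -> bytes:
        for site, files in self.base.sites.items():
            if name in files:  # read from the first replica found
                return self.base.get(site, name)
        raise KeyError(name)


class ContentService:
    """Service layer: a document-oriented API over the storage layer."""

    def __init__(self, storage: StorageLayer) -> None:
        self.storage = storage

    def add_document(self, doc_id: str, text: str) -> None:
        self.storage.store(f"content/{doc_id}", text.encode("utf-8"))

    def get_document(self, doc_id: str) -> str:
        return self.storage.load(f"content/{doc_id}").decode("utf-8")


base = BaseLayer(["site-a", "site-b"])
storage = StorageLayer(base)
changes: List[str] = []
storage.subscribe(changes.append)
service = ContentService(storage)
service.add_document("d1", "hello grid")
```

A metadata service could sit beside `ContentService` on the same `StorageLayer`, which is the point of the shared layer: replication and notification are written once instead of once per pipeline.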

  13. Data Management in DILIGENT (3) [Diagram labels: API / WSDL (twice); Metadata Management; Annotation Manager; Query Processor; Content Manager; Metadata Broker; Content Security; Metadata Catalog; Storage Layer; Base Layer]

  14. Current Status and Future Steps • Detailed design has been completed • APIs under implementation • 1st experimental prototype based on OpenDLib • Extensive testing and deployment of gLite 1.1 -> 3.0 • Next Steps • Integrate finished components • Deploy DILIGENT on the Grid infrastructure • Develop prototypes based on DILIGENT • Testing & User Feedback

  15. Types of Involvement for Observers • Information about project activities (www.diligentproject.org) • Involvement in workshops • Possible involvement in validation • Feedback for DILIGENT development • Candidates for adoption of DILIGENT infrastructure

  16. Contact us • Co-operation with other projects/communities is welcome • www.diligentproject.org • Contact people: • Donatella Castelli, Pasquale Pagano, ISTI-CNR: donatella.castelli/pasquale.pagano@isti.cnr.it • Jessica Michael, ERCIM: jessica.michel@ercim.org • Bhaskar Mehta, Fraunhofer IPSI: bhaskar.mehta@ipsi.fraunhofer.de

  17. Questions?

  18. Research today • Research is carried out by groups of individuals, belonging to different institutions, that dynamically aggregate to carry out projects together • By sharing their resources these individuals create better conditions for their research • Digital libraries that maintain the produced knowledge and make it accessible worldwide are becoming key instruments for scientific collaboration in many research areas

  19. Complementary User Scenarios • Earth Science Domain: • Well-established tradition in exploiting new technologies • Wide variety of content types (maps, satellite images, measurements, text) • Very large, dynamic data sets • Support for community events, report generation, disaster management • Cultural Heritage Domain: • Exploitation of IT still in its infancy • Multidisciplinary collaborative research • Image-based retrieval/semantic analysis of images • Support for research and teaching

  20. DILIGENT Goals & Approach • Service Oriented Digital Library Infrastructure • Based on EGEE • High computing and storage capabilities for handling a wide variety of information objects • Controlled Sharing of Resources • Basic Services for • DL Creation & Management • Indexing, Search, Data Fusion • Content & Metadata Management • Process Management • Testbed for User Scenarios • Earth Sciences • Cultural Heritage
