
OGSA-DAI data access and integration

Slides from the NERC GridGIS workshop, eSI, 1 February 2006. They cover the data deluge (the challenges of increasing data availability and the benefits of bringing data together), give an overview of OGSA-DAI and its use as a data integration base layer, and discuss the challenges of managing data services.


Presentation Transcript


  1. OGSA-DAI: data access and integration
     NERC GridGIS workshop, eSI, 1 February 2006

  2. Overview
     • The Data Deluge
       • challenges of increasing data availability
       • benefits of bringing data together
     • OGSA-DAI
       • overview
       • use as a data integration base layer

  3. Data Services: challenges to management
     • Scale
       • Many sites, large collections, many uses
     • Longevity
       • Research requirements outlive technical decisions
     • Diversity
       • No “one size fits all” solutions will work
       • Primary Data, Data Products, Meta Data, Administrative data, …
     • Many Data Resources
       • Independently owned & managed
       • No common goals
       • No common design
       • Work hard for agreements on foundation types and ontologies
       • Autonomous decisions change data, structure, policy, …
       • Geographically distributed
     • and I haven’t even mentioned security yet!

  4. What is a data service?
     • An interface to a stored collection of data
       • e.g. Google and Amazon
       • web services
     • But the data could be:
       • replicated
       • shared
       • federated
       • virtual
       • incomplete
     • Don’t care about the underlying representation
       • do care about the information it represents
     • Adding a service layer to existing data sources can improve composability

  5. Use Cases for Data Services
     • Data Filtering:
       • Single source producing large amounts of data distributed to many sites downstream
     • Data Discovery:
       • many sources, many query entry points in a linked system
     • Data Translation:
       • source to sink, conversion of data model / structure
     • Data Federation:
       • many sources, linked to provide view as a single source
     • Data Replication
       • full or partial copies to improve throughput
     • Data Integration (model aggregation)
       • e.g. integration of time variant data, streams, files
     • Data Integration (knowledge expansion)
       • forming links between databases to increase knowledge

  6. OGSA-DAI In One Slide
     • An extensible framework for data access and integration.
     • Expose heterogeneous data resources to a grid through web services.
     • Interact with data resources:
       • Queries and updates.
       • Data transformation / compression
       • Data delivery.
     • Customise for your project using
       • Additional Activities
       • Client Toolkit APIs
       • Data Resource handlers
     • A base for higher-level services
       • federation, mining, visualisation, …

  7. The OGSA-DAI Framework
     [Diagram; labels recovered from the slide]
     • Application, Client Toolkit
     • OGSA-DAI service, Engine
     • Activities: XPath, readFile, SQLQuery, XSLT, GZip, GridFTP
     • Data Resources: JDBC, XMLDB, File
     • Databases: MySQL, SQL Server, DB2, XIndice, SWISS-PROT
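
The framework slide describes an engine that runs activities such as SQLQuery, XSLT and GZip against data resources. The Python sketch below only illustrates that chaining idea and is not the OGSA-DAI API: the function names (`sql_query`, `rows_to_xml`, `gzip_bytes`, `run_pipeline`) are hypothetical, a plain XML serialiser stands in for a transformation step, and an in-memory SQLite database with invented rows stands in for a relational data resource.

```python
import gzip
import sqlite3
import xml.etree.ElementTree as ET


def sql_query(connection, sql):
    """Hypothetical 'SQLQuery' activity: run a query, return rows as dicts."""
    cursor = connection.execute(sql)
    columns = [description[0] for description in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


def rows_to_xml(rows):
    """Hypothetical transformation activity: serialise rows as XML bytes."""
    root = ET.Element("results")
    for row in rows:
        item = ET.SubElement(root, "row")
        for name, value in row.items():
            ET.SubElement(item, name).text = str(value)
    return ET.tostring(root, encoding="utf-8")


def gzip_bytes(data):
    """Hypothetical 'GZip' activity: compress the payload before delivery."""
    return gzip.compress(data)


def run_pipeline(first_input, activities):
    """Toy engine: feed each activity's output into the next activity."""
    data = first_input
    for activity in activities:
        data = activity(data)
    return data


# An in-memory SQLite database stands in for a relational data resource.
# The table and its rows are invented for illustration.
resource = sqlite3.connect(":memory:")
resource.execute("CREATE TABLE site (name TEXT, country TEXT)")
resource.executemany(
    "INSERT INTO site VALUES (?, ?)",
    [("Edinburgh", "UK"), ("Paris", "FR")],
)

payload = run_pipeline(
    "SELECT name FROM site WHERE country = 'UK'",
    [lambda sql: sql_query(resource, sql), rows_to_xml, gzip_bytes],
)
print(gzip.decompress(payload).decode("utf-8"))
```

Keeping every step as a function from one value to the next is what lets a new step be added without touching the loop in `run_pipeline`, which is the property the slides call extensibility.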

  8. Intermediary
     • Simple intermediary
       • potential to accelerate development, logging, or filtering
     • Persistent intermediary
       • e.g. to allow efficient local indexing

  9. Redirector, Coordinator, Network
     • Allowing composition and decentralisation

  10. Extensibility Example
     [Diagram; labels recovered from the slide: OGSA-DAI service, GDS, Engine, SQLQuery (twice), Multiple SQL, JDBC (five times), SQL (four times), MySQL]
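
The diagram labels suggest an added "Multiple SQL" activity that sends SQL to several JDBC resources where the standard SQLQuery activity addresses one. The slide gives no further detail, so the Python sketch below is one possible reading and not the OGSA-DAI implementation: `multiple_sql` and `make_resource` are hypothetical names, and in-memory SQLite databases with invented rows stand in for the JDBC resources.

```python
import sqlite3


def make_resource(rows):
    """Create an in-memory SQLite database standing in for one JDBC resource."""
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE station (name TEXT, rainfall_mm REAL)")
    connection.executemany("INSERT INTO station VALUES (?, ?)", rows)
    return connection


def multiple_sql(resources, sql, parameters=()):
    """Hypothetical 'Multiple SQL' activity: run one query on every resource.

    Each result row is prefixed with the name of the resource it came from,
    so the caller receives a single combined result set.
    """
    combined = []
    for resource_name, connection in resources.items():
        for row in connection.execute(sql, parameters):
            combined.append((resource_name,) + tuple(row))
    return combined


# Invented example data: two independent resources with the same schema.
resources = {
    "north": make_resource([("Aviemore", 95.0), ("Wick", 60.5)]),
    "south": make_resource([("Exeter", 80.0), ("Dover", 45.0)]),
}

wet_stations = multiple_sql(
    resources,
    "SELECT name, rainfall_mm FROM station WHERE rainfall_mm > ? ORDER BY name",
    (70.0,),
)
for row in wet_stations:
    print(row)
```

This sketch assumes all resources share a schema. Where they do not, a translation step per resource would be needed before the rows can be combined.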

  11. Map Retrieval: Current
     [Diagram; labels recovered from the slide: browser, Internet, EDINA, OGC Service, GIS, Oracle]

  12. Map Retrieval: Grid Prototype
     • Basic client to demonstrate proof of concept
     [Diagram; labels recovered from the slide: Client, EDINA, SO-OGC, OGC, OGSA-DAI 1, GIS, Oracle]

  13. Map Retrieval: Security
     • Exploit NGS infrastructure to provide secure access layer
     [Diagram; labels recovered from the slide: Portlet, EDINA, NGS Authentication, Allowed users dn, SO-OGC, OGC, ODS 1, GIS, Oracle]
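
The "Allowed users dn" label suggests that access is granted by checking the caller's distinguished name (DN) against a list of permitted users. The slide does not say how that check is made, so the Python sketch below is only an illustration under stated assumptions: the class and function names are hypothetical, the DNs are invented, DNs are taken to be slash-separated, and attribute names are compared case-insensitively.

```python
def normalise_dn(dn):
    """Normalise a slash-separated DN so that two spellings compare equal.

    Assumption: attribute names are case-insensitive and surrounding
    whitespace is not significant; attribute values are compared as written.
    Values that themselves contain '/' are not handled by this sketch.
    """
    parts = []
    for component in dn.strip().strip("/").split("/"):
        name, _, value = component.partition("=")
        parts.append((name.strip().upper(), value.strip()))
    return tuple(parts)


class AllowedUsers:
    """Hypothetical access list consulted before a request is forwarded."""

    def __init__(self, allowed_dns):
        self._allowed = {normalise_dn(dn) for dn in allowed_dns}

    def is_allowed(self, dn):
        return normalise_dn(dn) in self._allowed


def handle_request(access_list, caller_dn, request):
    """Forward the request only if the caller's DN is on the list."""
    if not access_list.is_allowed(caller_dn):
        raise PermissionError(f"DN not authorised: {caller_dn}")
    return f"forwarded: {request}"


# Invented DNs, for illustration only.
access_list = AllowedUsers(["/C=UK/O=ExampleOrg/OU=Example/CN=alice example"])
print(handle_request(
    access_list,
    "/c=UK/o=ExampleOrg/ou=Example/cn=alice example",
    "GetMap",
))
```

The sketch covers authorisation only. Establishing that the caller really owns the DN is the job of the authentication layer shown on the slide.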

  14. Map Retrieval: Integration
     • Exploit OGSA-DAI extensibility to add e.g. overlay
     [Diagram; labels recovered from the slide: Portlet, NGS Authentication, ODS 1, ODS 2, ODS 3, SO-OGC (twice), OGC, GIS, Oracle (twice), JDBC, Census, SQL/XML, Application data]

  15. OGSA-DAI / EDINA prototyping work
     • Stage 1: Using existing OGSA-DAI technology
     • Stage 2: Extending OGSA-DAI
     [Diagram; labels recovered from the slide: OGSA-DAI service, Input Parameters, URL, GIS Client, DeliverFrom URL, GIS Activities, Image/XML File, HTTP Request, WMS Server, HTTP Data Resource, HTTP Response]
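
The prototype diagram shows a URL being turned into an HTTP request to a WMS server, with an image or XML file coming back. The Python sketch below shows how such a URL could be assembled for a WMS 1.1.1 GetMap request. The server address, layer name and bounding box are invented for illustration, the helper name `build_getmap_url` is hypothetical, and the sketch builds the URL without sending it.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_getmap_url(base_url, layers, bbox, width, height,
                     srs="EPSG:27700", image_format="image/png"):
    """Build a WMS 1.1.1 GetMap URL; nothing is sent over the network."""
    query = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "STYLES": "",  # empty value requests the server's default style
        "SRS": srs,
        "BBOX": ",".join(str(value) for value in bbox),  # minx,miny,maxx,maxy
        "WIDTH": str(width),
        "HEIGHT": str(height),
        "FORMAT": image_format,
    }
    return base_url + "?" + urlencode(query)


# Hypothetical server and layer; the coordinates are illustrative only.
url = build_getmap_url(
    "http://wms.example.org/map",
    layers=["example_layer"],
    bbox=(300000, 600000, 400000, 700000),
    width=512,
    height=512,
)
print(url)

# Parsing the URL back shows the parameters a server would receive.
parameters = parse_qs(urlsplit(url).query, keep_blank_values=True)
print(parameters["REQUEST"], parameters["BBOX"])
```

A delivery step would then fetch this URL and pass the returned image or XML on to the next activity. That step is left out here so the example runs without network access.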

  16. Distributed Query Processing
     • Higher level services building on OGSA-DAI
       • specialised metadata extraction
     • Execute queries in parallel over multiple data resources
     • Queries mapped to algebraic expressions for evaluation
     • Parallelism represented by partitioning queries
       • Use exchange operators
     • Equality based joins in current release
       • supported types: long, integer, string, double and float
     [Query plan diagram; labels recovered from the slide: table_scan (protein), table_scan termID=S92 (proteinTerm), reduce (three times), exchange (twice), hash_join (proteinId), op_call (Blast); numeric labels 1, 2 and 3,4]
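
The query plan on this slide combines table scans, exchange operators and an equality-based hash join on proteinId. The Python sketch below illustrates how an exchange step can partition both inputs by the join key so that each partition can be joined independently. It is a generic partitioned hash join and not the DQP implementation: the function names mirror the plan labels but are hypothetical, the rows are invented, and the partitions are processed one after another here instead of in parallel.

```python
from collections import defaultdict


def table_scan(table, predicate=lambda row: True):
    """Return the rows of a table that satisfy the predicate."""
    return [row for row in table if predicate(row)]


def exchange(rows, key, partitions):
    """Route each row to a partition chosen by hashing its join key."""
    buckets = [[] for _ in range(partitions)]
    for row in rows:
        buckets[hash(row[key]) % partitions].append(row)
    return buckets


def hash_join(left, right, key):
    """Equality join: build a hash table on the left, probe with the right."""
    index = defaultdict(list)
    for row in left:
        index[row[key]].append(row)
    return [
        {**left_row, **right_row}
        for right_row in right
        for left_row in index.get(right_row[key], [])
    ]


def partitioned_join(left, right, key, partitions=2):
    """Join matching partitions independently, then merge the results."""
    left_parts = exchange(left, key, partitions)
    right_parts = exchange(right, key, partitions)
    joined = []
    for left_part, right_part in zip(left_parts, right_parts):
        # Each pair could be handed to a different evaluation node.
        joined.extend(hash_join(left_part, right_part, key))
    return joined


# Invented rows for illustration.
protein = [
    {"proteinId": "P1", "sequence": "MKT"},
    {"proteinId": "P2", "sequence": "GAV"},
    {"proteinId": "P3", "sequence": "LLS"},
]
protein_term = [
    {"proteinId": "P1", "termID": "S92"},
    {"proteinId": "P2", "termID": "T10"},
    {"proteinId": "P3", "termID": "S92"},
    {"proteinId": "P4", "termID": "S92"},
]

result = partitioned_join(
    table_scan(protein),
    table_scan(protein_term, lambda row: row["termID"] == "S92"),
    key="proteinId",
)
print(sorted(row["proteinId"] for row in result))
```

Because rows with equal keys always hash to the same partition, no matches are lost by joining partitions separately. This only holds for equality joins, which may be why the slide lists them as the supported kind in the current release.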

  17. DQP architecture
     [Diagram not captured in the transcript]

  18. Contributing to OGSA-DAI
     • Additional functionality:
       • Provide activities which implement specific functionality
       • Provide extra client functionality
       • Provide different security mechanisms
       • Provide higher level components and applications
     • Different levels of contributions
       • Based on OGSA-DAI?
       • Works with OGSA-DAI?
       • Part of OGSA-DAI?

  19. In the near future
     • A new version of the OGSA-DAI Engine
       • should look mostly the same externally
       • better support for concurrency, sessions and monitoring
     • Implementing new versions of specifications
       • DAIS Specifications
     • Key things that we will be addressing:
       • Performance
       • A Security Model which can be applied across platforms
       • Full Transactions framework, distributed transactions
       • More data integration facilities
       • Better abstraction over DBMS variation
       • Application centric queries
       • collaborating with other projects
     • Research projects looking at:
       • schema mapping
       • extended data resources

  20. Associated Meetings and Workshops
     • DIALOGUE Workshops (http://www.datagrids.org)
       • Data Integration Applications: Linking Organisations to Gain Understanding and Experience
       • Bringing together Data Integration middleware and application providers with users
       • Next one at NeSC: 9-10th February 2006
       • http://www.nesc.ac.uk/esi/events/636/
     • Next Generation Distributed Data Management (HPDC15, Paris)
       • http://www.isi.edu/~annc/distributedDataWorkshop.html
     • Data Management on Grids (VLDB’06, Seoul)

  21. Conclusions
     • The benefits of trying to integrate data are hindered by challenges such as heterogeneity, scale and distribution
     • A common data service layer should make data integration easier
     • OGSA-DAI provides an extensible, data service based framework which makes it easier to implement data integration
     • GIS data is amenable to integration using data services

  22. Further information
     • The OGSA-DAI Project Site:
       • http://www.ogsadai.org.uk
     • The DAIS-WG site:
       • http://forge.gridforum.org/projects/dais-wg/
     • OGSA-DAI Users Mailing list
       • users@ogsadai.org.uk
       • General discussion on grid DAI matters
     • Formal support for OGSA-DAI releases
       • http://bugs.ogsadai.org.uk/
     • OGSA-DAI training courses
