Data Management Challenge - The View from OGF

OGF22 – February 28, 2008, Cambridge, MA, USA. Erwin Laure <Erwin.Laure@cern.ch>, David E. Martin <martinde@us.ibm.com>, Data Area Directors.


  1. Data Management Challenge - The View from OGF
     OGF22 – February 28, 2008, Cambridge, MA, USA
     Erwin Laure <Erwin.Laure@cern.ch>
     David E. Martin <martinde@us.ibm.com>
     Data Area Directors

  2. Early Grid View of Grids
     • Early Grid systems had a quite simplistic view:
       • Dispatch a job to a machine
       • GridFTP files to the machine from “Somewhere”
       • Run the job
       • GridFTP results to “Somewhere”
     • Grids defined “Computing Elements (CE)”
       • Data and storage were considered to be “there”
       • Storage Elements (SE) concept came much later
     • Barely OK for Initial Data Analysis
       • Physics, Geosciences, etc.

  3. Then Data kicked in …
     • Compute jobs have to deal with input/output data, transient data
     • Data is
       • Heterogeneous (storage, data formats)
       • Distributed
       • Independently managed

  4. The Grid Grows Up
     • Database Access
       • DAIS
     • Storage/File Management
       • SRM
     • File/Data Transfer
       • GridFTP, RFT, FTS
     • Data Location
       • RLS, LFC
     • Metadata
     • Data Management Systems
       • SRB
     • …

  5. SRM Interactions
     [Figure: Client, SRM, and Storage, with interactions numbered 1 to 5]
     (1) The client asks the SRM for the file, providing an SURL (Site URL)
     (2) The SRM asks the storage system to provide the file
     (3) The storage system notifies the SRM that the file is available and reports its location
     (4) The SRM returns a TURL (Transfer URL), i.e. the location from which the file can be accessed
     (5) The client interacts with the storage using the protocol specified in the TURL
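The five SRM steps can be sketched as a toy model. This is only an illustration, not the SRM interface: the `Storage`, `SRM`, and `fetch` names, the `gsiftp` protocol choice, and the `example.org` hosts are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class Storage:
    """Toy storage system: maps a file path to the disk host that can serve it."""
    files: dict = field(default_factory=dict)

    def locate(self, path: str) -> str:
        # Steps 2 and 3: make the file available and report its location.
        if path not in self.files:
            raise FileNotFoundError(path)
        return self.files[path]


@dataclass
class SRM:
    """Toy SRM front end that turns an SURL into a TURL (hypothetical API)."""
    storage: Storage
    protocol: str = "gsiftp"  # assumed transfer protocol for the TURL

    def get_turl(self, surl: str) -> str:
        # Step 1: the client supplies an SURL; only its path part is used here.
        path = urlparse(surl).path
        host = self.storage.locate(path)
        # Step 4: return a TURL naming the protocol and the actual location.
        return f"{self.protocol}://{host}{path}"


def fetch(srm: SRM, surl: str) -> str:
    turl = srm.get_turl(surl)
    # Step 5: a real client would now open the TURL with the protocol it
    # names; this sketch just returns the TURL.
    return turl
```

The point of the indirection is that the SURL stays stable while the TURL can change with wherever the storage system currently holds the file.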

  6. [Figure: OGSA-DAI architecture]
     • Application, Client Toolkit
     • OGSA-DAI service: Engine
     • Activities: XPath, SQLQuery, readFile, GZip, XSLT, GridFTP
     • Data Resources: JDBC, XMLDB, File
     • Databases: DB2, SQL Server, MySQL, XIndice, SWISS PROT

  7. GridFTP and RFT
     [Figure labels: RFT Client, SOAP Messages, Notifications (Optional), RFT Service, globus-url-copy, Control and Data channels]

  8. gLite FTS
     • Logical unit of management
       • Represents a directed network pipe between two sites
     • Mono-directional, dedicated link
     • Independently manageable
       • State
       • Number of streams
       • Number of concurrent transfers
     • Inter-VO scheduling
       • VO share
     • No routing involved
     • Non-dedicated channels
       • E.g. star channel
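The bullets above appear to describe the FTS channel concept. A minimal sketch of it as a data structure follows; the `Channel` fields, the `"Active"` state value, the `"*"` wildcard notation, and `pick_channel` are hypothetical names chosen for illustration, not the gLite FTS API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Channel:
    """One directed pipe between two sites; "*" marks a non-dedicated end."""
    source: str
    dest: str
    state: str = "Active"        # assumed state name
    streams: int = 1             # number of streams per transfer
    max_transfers: int = 1       # number of concurrent transfers
    vo_share: dict = field(default_factory=dict)  # VO name -> share


def pick_channel(channels: List[Channel], source: str, dest: str) -> Optional[Channel]:
    # Prefer a dedicated channel for exactly this source/destination pair.
    for c in channels:
        if c.source == source and c.dest == dest:
            return c
    # Otherwise fall back to a star channel that matches via a wildcard end.
    for c in channels:
        if c.source in ("*", source) and c.dest in ("*", dest):
            return c
    # No routing: channels are never chained, so no match means no transfer.
    return None
```

Because a channel is mono-directional, a channel from site A to site B does not serve transfers from B to A; in this sketch that direction would need its own channel or a star channel.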

  9. SRB as a Data Grid
     [Figure: MCAT DB connected to six SRB servers]
     • Data Grid has an arbitrary number of servers
     • Complexity is hidden from users
     Data Management in Production Grids

  10. Need for Grid Data Architecture and Standards
      • OGF OGSA Data Architecture WG
        • Started in October 2005
        • Data Architecture document published as GFD.121

  11. OGSA-Data Architecture
      [Figure legend: Service interface, Resource interface]
      [Figure labels: Client APIs (non-OGSA) / Other services; Sink/Source; Description; Storage Access; Access; Data Service; Storage Management; Stored Data Resources; Other Data Resources; Managed Storage]

  12. OGSA-Data: Data Replication/Transfer
      [Figure legend: Service interface, Resource interface]
      [Figure labels: Client APIs (non-OGSA) / Other services; Replication; Transfer; Sink/Source; Description; Access; Data Service; Data Resources; Transfer Protocols]

  13. OGF Data Area WGs I
      • Data Format Description Language WG (dfdl-wg)
        • Describe the structure of binary and character encoded files and data streams
      • Database Access and Integration Services WG (dais-wg)
        • Provide consistent access to existing, autonomously managed databases from web services
      • Grid File System Working Group (gfs-wg)
        • Service interface(s) and architecture of a logical file system
      • Grid Storage Management WG (gsm-wg)
        • Provide dynamic space allocation and file management of shared storage components on the Grid (Storage Resource Manager – SRM)
      • GridFTP WG (gridftp-wg)
        • Improvements of FTP suitable for grid applications

  14. OGF Data Area WGs II
      • Info Dissemination WG (infod-wg)
        • Develop a model for Information Dissemination
      • OGSA ByteIO Working Group (byteio-wg)
        • Define a minimal Web Service interface for providing "POSIX-like" file functionality
      • OGSA Data Movement Interface WG (ogsa-dmi-wg)
        • Managed data movement
      • OGSA-Data Working Group (ogsa-d-wg)
        • Data Architecture

  15. Activities related to file system and data movement
      • GFS:
        • Resource Namespace Service Specification (GFD.101)
      • Byte-IO:
        • Byte-IO OGSA WSRF Basic Profile Rendering (GFD.88)
      • GSM:
        • The Storage Resource Manager Interface Specification Version 2.2 (in public comment)
      • DMI:
        • OGSA-DMI Specification (in public comment)

  16. Data Architecture: Gaps
      • Standardized metadata
        • Identify query languages, data formats, transport protocols, …
        • Needed in DAIS, DMI, ByteIO, …
      • Data catalogs & Registries
        • Discovery is an important part of Grids
      • Replication/Caching
      • Data Federation

  17. Standards Gaps
      • Caching and Replication
      • Integrated Data Management
      • Transactions in a Grid
      • Storage Provisioning
      • Virtualization
      • Provenance, Integrity, Policy
      • File Metadata
      • Streaming
      • Versioning

  18. Standards Gaps
      • Dependencies
        • Security: IETF, OGF
        • Management: DMTF, SNIA
        • WS-*: OASIS and W3C

  19. Main Focus for Future Work
      Where can we exploit synergies with SNIA?
      • File systems
        • NFSv4, pNFS
      • Interface to Metadata stores
      • Policies (not only Data)
      • Name your favorite
