Data Management Challenge - The View from OGF. OGF22 – February 28, 2008 Cambridge, MA, USA. Erwin Laure <[email protected]> David E. Martin <[email protected]> Data Area Directors. Early Grid View of Grids. Early Grid systems had a quite simplistic view: Dispatch a job to machine


Data Management Challenge - The View from OGF

OGF22 – February 28, 2008

Cambridge, MA, USA

Erwin Laure <[email protected]>

David E. Martin <[email protected]>

Data Area Directors


Early Grid View of Grids

  • Early Grid systems had a quite simplistic view:

    • Dispatch a job to a machine

    • GridFTP files to the machine from “Somewhere”

    • Run the job

    • GridFTP results to “Somewhere”

  • Grids defined “Computing Elements (CE)”

    • Data and storage were considered to be “there”

    • Storage Elements (SE) concept came much later

  • Barely OK for Initial Data Analysis

    • Physics, Geosciences, etc
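
The four-step flow on this slide (dispatch, stage in, run, stage out) can be sketched roughly as below. This is an illustration only, not any middleware's API: `run_early_grid_job`, `transfer`, `execute`, and the scratch path are hypothetical names, and the transfer function is passed in so that a real GridFTP client could stand in for it. The function is assumed to run on the machine the job was dispatched to.

```python
from typing import Callable, List


def run_early_grid_job(
    input_urls: List[str],
    output_url: str,
    transfer: Callable[[str, str], None],  # hypothetical: copy source URL to destination URL
    execute: Callable[[List[str]], str],   # hypothetical: run the job, return the local result URL
    scratch: str = "file:///scratch",      # assumed local working area on the machine
) -> List[str]:
    """Stage inputs in, run the job, stage the result out; return a log of steps."""
    log = []
    local_inputs = []
    for url in input_urls:
        # Stage in: copy each input from "somewhere" to local scratch space.
        local = f"{scratch}/{url.rsplit('/', 1)[-1]}"
        transfer(url, local)
        log.append(f"stage-in {url}")
        local_inputs.append(local)
    # Run the job against the local copies.
    result = execute(local_inputs)
    log.append("run")
    # Stage out: copy the result to "somewhere".
    transfer(result, output_url)
    log.append(f"stage-out {output_url}")
    return log
```

Note that nothing here knows where the data lives or who manages it; the caller must supply every URL, which is the limitation the following slides address.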

Then Data kicked in …

  • Compute jobs have to deal with input/output data, transient data

  • Data is

    • Heterogeneous (storage, data formats)

    • Distributed

    • Independently managed

The Grid Grows Up

  • Database Access

    • DAIS

  • Storage/File Management

    • SRM

  • File/Data Transfer

    • GridFTP, RFT, FTS

  • Data Location

    • RLS, LFC

  • Metadata

  • Data Management Systems

    • SRB

SRM Interactions

[Figure: Client, SRM, and Storage, with the interactions between them numbered 1 to 5]

  • The client asks the SRM for the file, providing an SURL (Site URL)

  • The SRM asks the storage system to provide the file

  • The storage system notifies the SRM that the file is available and reports its location

  • The SRM returns a TURL (Transfer URL), i.e. the location from which the file can be accessed

  • The client interacts with the storage using the protocol specified in the TURL
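
The five interactions above can be sketched with two toy classes. This is a minimal in-memory illustration, not the SRM v2.2 interface: `ToySRM`, `ToyStorage`, `get_turl`, `fetch`, the `example.org` hosts, and the choice of `gsiftp` as the transfer protocol are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict
from urllib.parse import urlparse


@dataclass
class ToyStorage:
    """Hypothetical stand-in for the storage system behind an SRM."""
    host: str
    files: Dict[str, bytes] = field(default_factory=dict)

    def stage(self, path: str) -> str:
        # Steps 2 and 3: make the file available and report its location.
        if path not in self.files:
            raise FileNotFoundError(path)
        return f"{self.host}{path}"

    def read(self, location: str) -> bytes:
        # Step 5: serve the bytes at a location handed out earlier.
        return self.files[location[len(self.host):]]


@dataclass
class ToySRM:
    """Hypothetical SRM front end that maps SURLs to TURLs."""
    storage: ToyStorage
    protocol: str = "gsiftp"  # assumed transfer protocol

    def get_turl(self, surl: str) -> str:
        # Step 1: the client presents a SURL.
        parsed = urlparse(surl)
        if parsed.scheme != "srm":
            raise ValueError(f"not a SURL: {surl}")
        location = self.storage.stage(parsed.path)
        # Step 4: return a TURL naming the protocol and the location.
        return f"{self.protocol}://{location}"


def fetch(srm: ToySRM, surl: str) -> bytes:
    """Client side: resolve the SURL, then talk to the storage directly."""
    turl = urlparse(srm.get_turl(surl))
    return srm.storage.read(turl.netloc + turl.path)
```

The point of the indirection is that the client only ever names the site-level SURL; the storage host and transfer protocol appear only in the TURL and can change without the client noticing.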


[Figure: OGSA-DAI architecture. Diagram labels: Application; Client Toolkit; OGSA-DAI service; Engine; Activities (XPath, SQLQuery, readFile, GZip, XSLT, GridFTP); Data Resources (JDBC, XMLDB, File); Databases (DB2, SQL Server, MySQL, XIndice, SWISS-PROT)]


GridFTP and RFT

[Figure: GridFTP and RFT. Diagram labels: RFT Client; SOAP Messages; Notifications (Optional); RFT Service; globus-url-copy; four endpoints, each with a Control channel and a Data channel]


gLite FTS

  • Channel: the logical unit of management

    • Represents a directed network pipe between two sites

  • Mono-directional, dedicated link

  • Independently manageable

    • State

    • Number of streams

    • Number of concurrent transfers

  • Inter-VO scheduling

    • VO share

  • No routing involved

  • Non-dedicated channels

    • E.g. star channel
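
The properties listed above can be pictured as a small data structure. This is a sketch under stated assumptions, not the gLite FTS schema or API: the `Channel` class, its field names, the `"Active"` state value, the use of `"*"` to model a star channel, the proportional split of transfer slots by VO share, and the site and VO names in the usage note are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Channel:
    """Hypothetical model of a directed pipe between two sites."""
    source: str
    destination: str
    state: str = "Active"    # assumed state name
    streams: int = 1         # number of streams per transfer
    max_concurrent: int = 1  # number of concurrent transfers
    vo_shares: Dict[str, int] = field(default_factory=dict)

    def matches(self, src: str, dst: str) -> bool:
        # Mono-directional: source and destination are not interchangeable.
        # "*" is an assumed wildcard standing in for a star channel.
        return self.source in (src, "*") and self.destination in (dst, "*")

    def slots_for(self, vo: str) -> int:
        # Assumed policy: divide concurrent transfers in proportion to VO share.
        total = sum(self.vo_shares.values())
        if self.state != "Active" or total == 0 or vo not in self.vo_shares:
            return 0
        return self.max_concurrent * self.vo_shares[vo] // total
```

For example, `Channel("CERN", "RAL", max_concurrent=10, vo_shares={"atlas": 3, "cms": 2})` matches transfers from CERN to RAL but not the reverse, and under the assumed policy gives `atlas` six of the ten slots.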


SRB as a Data Grid

[Figure: DB, MCAT, and six SRB servers]

  • A Data Grid has an arbitrary number of servers

  • Complexity is hidden from users

Data Management in Production Grids


Need for Grid Data Architecture and Standards

  • OGF OGSA Data Architecture WG

    • Started in October 2005

    • Data Architecture document published as GFD.121

OGSA-Data Architecture

[Figure: OGSA Data Architecture. Diagram labels: Client APIs (non-OGSA) / Other services; service interfaces Sink/Source, Description, Access, and Storage; Data Service (two instances); Storage Management; resource interface; Stored Data Resources; Other Data Resources; Managed Storage]

OGSA-Data: Data Replication/Transfer

[Figure: OGSA-Data replication and transfer. Diagram labels: Client APIs (non-OGSA) / Other services; Replication; Transfer; service interfaces Sink/Source, Description, and Access; Data Service (two instances); resource interface; Data Resources (two instances); Transfer Protocols]

OGF Data Area WGs I

  • Data Format Description Language WG (dfdl-wg)

    • Describe the structure of binary and character encoded files and data streams

  • Database Access and Integration Services WG (dais-wg)

    • Provide consistent access to existing, autonomously managed databases from web services

  • Grid File System Working Group (gfs-wg)

    • Service interface(s) and architecture of a logical file system

  • Grid Storage Management WG (gsm-wg)

    • Provide dynamic space allocation and file management of shared storage components on the Grid (Storage Resource Manager – SRM)

  • GridFTP WG (gridftp-wg)

    • Improvements of FTP suitable for grid applications.

OGF Data Area WGs II

  • Info Dissemination WG (infod-wg)

    • Develop a model for Information Dissemination

  • OGSA ByteIO Working Group (byteio-wg)

    • Define a minimal Web Service interface for providing "POSIX-like" file functionality

  • OGSA Data Movement Interface WG (ogsa-dmi-wg)

    • Managed data movement

  • OGSA-Data Working Group (ogsa-d-wg)

    • Data Architecture

Activities related to file system and data movement

  • GFS:

    • Resource Namespace Service Specification (GFD.101)

  • Byte-IO:

    • Byte-IO OGSA WSRF Basic Profile Rendering (GFD.88)

  • GSM

    • The Storage Resource Manager Interface Specification Version 2.2 (in public comment)

  • DMI

    • OGSA-DMI Specification (in public comment)

Data Architecture: Gaps

  • Standardized metadata

    • Identify query languages, data formats, transport protocols, …

    • Needed in DAIS, DMI, ByteIO, …

  • Data catalogs & Registries

    • Discovery is an important part of Grids

  • Replication/Caching

  • Data Federation

Standards Gaps

  • Caching and Replication

  • Integrated Data Management

  • Transactions in a Grid

  • Storage Provisioning

  • Virtualization

  • Provenance, Integrity, Policy

  • File Metadata

  • Streaming

  • Versioning

Standards Gaps

  • Dependencies

    • Security: IETF, OGF

    • Management: DMTF, SNIA

    • WS-*: OASIS and W3C

Main Focus for Future Work

Where can we exploit synergies with SNIA?

  • File systems

    • NFSv4, pNFS

  • Interface to Metadata stores

  • Policies (not only Data)

  • Name your favorite
