GridFTP

Steve Tuecke

Argonne National Laboratory


Overview

  • Motivation for GridFTP Working Group

  • Requirements

  • GridFTP Solution

  • GridFTP Working Group Documents

  • Role of GridFTP Working Group


GridFTP Working Group Motivation

  • The Globus Project has developed data transfer solutions over the past ~5 years; GridFTP is the 3rd generation

  • Grid Forum started ~1 year ago to promote and develop Grid technologies

    • Critical mass of people working in this area

  • Grid Forum GridFTP working group formed to foster the further specification and development of GridFTP

    • Community effort to move GridFTP forward


Some Important Definitions

  • Resource

  • Network protocol

  • Network enabled service

  • Application Programmer Interface (API)

  • Syntax

  • Software Development Kit (SDK)


Resource

  • Entity that is to be shared

    • Includes computers, storage, data, software

  • Does not have to be physical entity

    • Condor pool, distributed file system, …

  • Defined in terms of interfaces, not devices

    • E.g. LSF defines compute resource

    • Open/close/read/write defines access to a distributed file system, e.g. NFS, AFS, DFS


Network Protocol

  • A formal description of message formats and a set of rules for message exchange

    • Rules may define sequence of message exchanges

    • Protocol may define state-change in endpoint, e.g. file system state change

  • Good protocols designed to do one thing

    • Protocols can be layered

  • Examples of protocols

    • IP, TCP, TLS, FTP, HTTP, Kerberos
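To make "message formats" concrete, here is a minimal sketch of parsing one line of an FTP-style reply, which RFC 959 defines as a three-digit code followed by a space (final line) or a hyphen (more lines follow) and then free text. The names `Reply` and `parse_reply_line` are hypothetical, and the sketch deliberately ignores the middle lines of multi-line replies, which need not start with the code.

```python
import re
from typing import NamedTuple


class Reply(NamedTuple):
    code: int       # three-digit status code
    is_final: bool  # False when more lines of the same reply follow
    text: str       # human-readable text after the separator


# Three digits, then a space (final line) or a hyphen (continuation), then text.
_REPLY_LINE = re.compile(r"^(\d{3})([ -])(.*)$")


def parse_reply_line(line: str) -> Reply:
    """Parse the first or last line of an FTP-style reply."""
    match = _REPLY_LINE.match(line.rstrip("\r\n"))
    if match is None:
        raise ValueError(f"not a reply line: {line!r}")
    code, separator, text = match.groups()
    return Reply(int(code), separator == " ", text)
```

The format is only half of a protocol; the rules about which reply may follow which command would sit on top of a parser like this.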


Network Enabled Services

[Figure: an FTP server and a Web server, each drawn on top of the protocols it uses. Labels: FTP Protocol, Telnet Protocol, HTTP Protocol, TLS Protocol, TCP Protocol, IP Protocol.]

  • Implementation of a protocol that defines a set of capabilities

    • Protocol defines interaction with service

    • All services require protocols

    • Not all protocols are used to provide services (e.g. IP, TLS)

  • Examples: FTP and Web servers


API (Application Programming Interface)

  • A specification for a set of routines to facilitate application development

    • Refers to definition, not implementation, e.g. there are many implementations of MPI

  • Spec often language-specific (or IDL)

    • Routine name, number, order and type of arguments; mapping to language constructs

    • Behavior or function of routine

  • Examples

    • GSS API, MPI


Syntax

  • A specification for how a defined set of information is encoded into bits

    • A syntax may be defined as part of a protocol or API

      • Protocol messages have defined syntax

      • A syntax may be used as API function argument

    • But syntax can also stand alone

  • Good syntax designed to do one thing

    • Syntaxes can be layered

  • Examples

    • XML, ASN.1, X.509, LDIF
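As a small illustration of a syntax standing alone, the sketch below encodes the same two-field record in two different syntaxes. JSON and a fixed binary layout are stand-ins chosen for brevity; they are not among the slide's examples, and the record fields and the `decode_binary` helper are hypothetical.

```python
import json
import struct

# The information to encode: a byte range within a file.
record = {"offset": 1024, "length": 4096}

# Syntax 1: JSON text, UTF-8 encoded.
as_json = json.dumps(record, sort_keys=True).encode("utf-8")

# Syntax 2: two unsigned 64-bit integers in network (big-endian) byte order.
as_binary = struct.pack("!QQ", record["offset"], record["length"])


def decode_binary(data: bytes) -> dict:
    """Reverse syntax 2 back into the original record."""
    offset, length = struct.unpack("!QQ", data)
    return {"offset": offset, "length": length}
```

Both byte strings carry the same information; only the encoding rules differ.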


SDK (Software Development Kit)

  • A particular instantiation of an API

  • SDK consists of libraries and tools

    • Provides implementation of API specification

  • Can have multiple SDKs for an API

  • Examples of SDKs

    • MPICH, Motif Widgets


Multiple APIs but a Single Protocol. Example: TCP/IP

  • Multiple APIs: BSD sockets, Winsock, System V streams, …

  • Different programs use different APIs

  • Interoperability: programs using different APIs can exchange information

[Figure: two applications, one using the WinSock API and one using the Berkeley Sockets API, communicating over the TCP/IP protocol (reliable byte streams).]
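The same point can be sketched in one language: below, a server written against Python's blocking `socket` API and a client written against the `asyncio` streams API exchange bytes, because both speak TCP underneath. These two Python APIs are stand-ins for the BSD sockets and Winsock pairing on the slide; the function names and the upper-casing behaviour are hypothetical.

```python
import asyncio
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Accept one connection with the blocking socket API and reply in upper case."""
    conn, _ = listener.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())


async def ask(host: str, port: int, message: bytes) -> bytes:
    """Talk to the same server through the asyncio streams API."""
    reader, writer = await asyncio.open_connection(host, port)
    writer.write(message)
    await writer.drain()
    reply = await reader.read(1024)
    writer.close()
    await writer.wait_closed()
    return reply


def demo() -> bytes:
    listener = socket.create_server(("127.0.0.1", 0))  # port 0: let the OS pick
    port = listener.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(listener,))
    worker.start()
    try:
        return asyncio.run(ask("127.0.0.1", port, b"hello"))
    finally:
        worker.join()
        listener.close()
```

Neither side knows or cares which API the other was written against; the shared protocol is what makes the exchange work.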


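Single API, but Multiple Protocols. E.g., GSS-API

[Figure: two applications, each calling GSS-API. One sits on the GSI SDK and speaks the GSI protocol; the other sits on the Kerberos SDK and speaks the Kerberos protocol. The two protocols use different message formats, exchange sequences, etc. Both run over TCP/IP.]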

  • GSS-API provides portability: any correct program compiles & runs on a platform

  • Does not provide interoperability: all processes must link against same SDK

    • E.g., GSI and Kerberos versions of GSS-API
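A toy sketch of portability without interoperability follows. It is not GSS-API and provides no security: `SecurityContext`, `SdkA`, and `SdkB` are hypothetical names, and the two "wire formats" are invented purely to show that code written against one API runs on either SDK while tokens from one SDK are unreadable by the other.

```python
from typing import Protocol


class SecurityContext(Protocol):
    """Hypothetical API: the only thing the application programs against."""

    def wrap(self, message: bytes) -> bytes: ...

    def unwrap(self, token: bytes) -> bytes: ...


class SdkA:
    """Hypothetical SDK whose wire format is a leading tag byte plus the payload."""

    def wrap(self, message: bytes) -> bytes:
        return b"A" + message

    def unwrap(self, token: bytes) -> bytes:
        if not token.startswith(b"A"):
            raise ValueError("token was not produced by SdkA")
        return token[1:]


class SdkB:
    """Hypothetical SDK with a different format: reversed payload, trailing tag."""

    def wrap(self, message: bytes) -> bytes:
        return message[::-1] + b"B"

    def unwrap(self, token: bytes) -> bytes:
        if not token.endswith(b"B"):
            raise ValueError("token was not produced by SdkB")
        return token[:-1][::-1]


def round_trip(sender: SecurityContext, receiver: SecurityContext, message: bytes) -> bytes:
    """Application code: identical no matter which SDK is linked in."""
    return receiver.unwrap(sender.wrap(message))
```

`round_trip` works when both ends use the same SDK and raises `ValueError` when they are mixed, which is the situation the slide describes.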


I.e., Standard APIs and Protocols are Both Important: For Different Reasons

  • Standard APIs/SDKs are important

    • They enable application portability

    • But w/o standard protocols, interoperability is hard (every SDK speaks every protocol?)

  • Standard protocols are important

    • Enable cross-site interoperability

    • Enable shared infrastructure

    • But w/o standard APIs/SDKs, application portability is hard (different platforms access protocols in different ways)


Grid Data Needs

  • Transfer of large amounts of data (petabytes or terabytes) between storage systems

  • Access to large amounts of data (terabytes or gigabytes) by many geographically distributed applications and users for analysis, visualization, etc.


Requirements

  • Grid Security Infrastructure (GSI) and Kerberos support

  • Third-party control of data transfer

  • Parallel data transfer

  • Striped data transfer

  • Partial file transfer

  • Automatic negotiation of TCP buffer/window size

  • Support for reliable/recoverable data transfer
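Partial and parallel transfer both come down to addressing a file by byte range. The sketch below splits a file of a given size into contiguous (offset, length) ranges, one per stream. It illustrates the idea only; the function name `split_ranges` is hypothetical, and this is not a description of how GridFTP itself assigns blocks to streams.

```python
def split_ranges(size: int, streams: int) -> list[tuple[int, int]]:
    """Split [0, size) into at most `streams` contiguous (offset, length) ranges."""
    if size < 0 or streams < 1:
        raise ValueError("size must be >= 0 and streams must be >= 1")
    base, extra = divmod(size, streams)
    ranges = []
    offset = 0
    for index in range(streams):
        # The first `extra` ranges take one additional byte each.
        length = base + (1 if index < extra else 0)
        if length == 0:
            break  # fewer bytes than streams: the remaining streams get nothing
        ranges.append((offset, length))
        offset += length
    return ranges
```

For example, `split_ranges(10, 3)` gives `[(0, 4), (4, 3), (7, 3)]`; each range could be fetched as a partial transfer, and a failed range could be retried on its own.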


Candidate Standards

  • FTP

    • Defined by a set of IETF RFCs

    • No partial file, parallel/striped, GSI, etc

    • Separate control & data channels

  • WebDAV

    • New extension to HTTP

    • No third party transfer, parallel/striped, etc.

    • Combined control & data channel


Separate Control & Data Channels

  • WebDAV combines control and data over single channel

  • FTP splits control and data

    • Supports multiple, user selectable data channel protocols

  • Advantage to split channels

    • Third party transfers handled cleanly

    • Can (cleanly) define new data channel protocols

      • E.g. parallel/striped transfer, automatic TCP buffer/window negotiation

    • Amenable to high-performance proxies

      • E.g. For firewalls, load balancing, etc.
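The third-party case can be sketched with standard FTP commands from RFC 959: the client asks one server to listen (PASV), hands the returned address to the other server (PORT), then issues STOR and RETR so the data flows directly between the two servers. The helper names and the "source"/"destination" labels are hypothetical; only the command names and the 227 reply format come from the RFC.

```python
import re


def port_argument_from_pasv(reply: str) -> str:
    """Extract 'h1,h2,h3,h4,p1,p2' from a 227 reply to PASV."""
    match = re.search(r"\d{1,3}(?:,\d{1,3}){5}", reply)
    if not reply.startswith("227") or match is None:
        raise ValueError(f"not a usable PASV reply: {reply!r}")
    return match.group(0)


def third_party_plan(pasv_reply: str, source_path: str, dest_path: str) -> list[tuple[str, str]]:
    """Return (server, command) pairs, in order, for a source-to-destination copy."""
    address = port_argument_from_pasv(pasv_reply)
    return [
        ("destination", "PASV"),               # already sent; pasv_reply is its answer
        ("source", f"PORT {address}"),         # tell the source where to connect
        ("destination", f"STOR {dest_path}"),  # destination waits for incoming data
        ("source", f"RETR {source_path}"),     # source connects and sends the file
    ]
```

The client only ever touches the two control channels; with a combined control and data channel there would be no separate connection to redirect.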


GridFTP Solution

  • Built on existing FTP standards

    • RFC 959: File Transfer Protocol

    • RFC 2228: FTP Security Extensions

    • RFC 2389: Feature Negotiation for the File Transfer Protocol

    • Draft: FTP Extensions

  • Extends standards with

    • Additions to security extensions, partial file transfer, parallel/striped transfer, TCP buffer/window size tuning,
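For a sense of why TCP buffer/window size matters, a common rule of thumb sizes the buffer to the bandwidth-delay product of the path. The sketch below computes that value and requests it through the standard socket options. It shows manual sizing only, not GridFTP's tuning mechanism; both function names are hypothetical, and the operating system may round or cap the size it actually grants.

```python
import socket


def bandwidth_delay_product(bits_per_second: int, rtt_ms: int) -> int:
    """Bytes that can be in flight on the path: bandwidth times round-trip time."""
    return bits_per_second * rtt_ms // 8000  # /8 for bits to bytes, /1000 for ms to s


def request_buffers(sock: socket.socket, size: int) -> None:
    """Ask the OS for send and receive buffers of `size` bytes."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
```

A 1 Gbit/s path with a 50 ms round trip works out to 6,250,000 bytes, far more than typical default buffer sizes, which is why untuned transfers over long, fast links tend to underuse the link.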


GridFTP Implementation Status

  • Modified wu-ftpd server

    • Most features

  • Modified ncftp client

    • Security, TCP buffer setting

  • Modified HPSS & Unitree ftpd server

    • Security

  • Globus Toolkit client and server SDKs, and command line tools

    • Most features

  • Striped FTP server (aka DPSS2)


GridFTP Working Group Documents

  • GridFTP: A Data Transfer Protocol for the Grid

    • Overview of working group activities and documents

    • Requirements

    • Informational draft

  • GridFTP: FTP Extensions for the Grid

    • Protocol specification


GridFTP Protocol Specifications

  • Existing standards

    • RFC 959: File Transfer Protocol

    • RFC 2228: FTP Security Extensions

    • RFC 2389: Feature Negotiation for the File Transfer Protocol

    • Draft: FTP Extensions

  • New drafts

    • GridFTP: FTP Extensions for the Grid
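Feature negotiation is what lets extensions like these coexist with plain FTP: under RFC 2389 a client sends FEAT and the server answers with a 211 reply listing one feature per line, each line starting with a single space. A minimal sketch of reading that reply follows; the function name `parse_feat` is hypothetical and the feature names in the usage note are ordinary FTP ones, not GridFTP-specific.

```python
def parse_feat(reply: str) -> list[str]:
    """Return the feature lines from a 211 reply to FEAT."""
    lines = reply.splitlines()
    if len(lines) == 1 and lines[0].startswith("211 "):
        return []  # single-line 211 reply: the server lists no features
    if len(lines) < 2 or not lines[0].startswith("211-") or not lines[-1].startswith("211 "):
        raise ValueError("not a 211 feature reply")
    # Feature lines sit between the first and last line and begin with one space.
    return [line[1:] for line in lines[1:-1] if line.startswith(" ")]
```

Given `"211-Features:\r\n SIZE\r\n REST STREAM\r\n211 End\r\n"`, this returns `["SIZE", "REST STREAM"]`, and a client can then decide which optional commands are safe to use.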


GridFTP APIs

  • Should there be standard API(s)?

    • Posix I/O

    • SRB client

    • grid_storage

    • globus_ftp_client

    • MPI-IO

    • HDF5

    • etc

  • Beyond scope of this working group

  • Common protocol beneath these APIs would allow interoperability


Role of GridFTP Working Group

  • Bring together those who are interested in the future of GridFTP to help foster the…

    • continued specification and standardization of GridFTP

    • development of inter-operable GridFTP implementations

    • widespread adoption of GridFTP as a transfer protocol for the Grid

  • Develop drafts which together define GridFTP

    • May submit some of them to IETF

  • Move GridFTP forward to better address Grid data transfer requirements


NOT Goals of GridFTP Working Group

  • This working group will not start from first principles

    • Starting point is roughly GridFTP as it now exists

    • FTP base is assumed

  • It's not design by committee

    • Seeking rough consensus, with broad input

    • Draft authors and WG chair have final say


GF5 GridFTP Working Session

  • Is this appropriate for Grid Forum?

  • Who is interested in participating, and in what capacity?

  • Is the problem scoped appropriately (at least for now)?

  • What are the right drafts to write?

  • Establish rough timeline for drafts


A Call To Arms

  • The Grid Forum data working group needs to do more than just gather 3 times a year to chat about data management.

  • But Grid Forum is only appropriate for this activity if people meaningfully participate.

    • I will be doing this regardless.

    • But it will hopefully be done better and faster with broad participation.

    • If there is not meaningful participation, I won’t bother with the overhead of Grid Forum.

