GridFTP



Steve Tuecke

Argonne National Laboratory


Overview

  • Motivation for GridFTP Working Group

  • Requirements

  • GridFTP Solution

  • GridFTP Working Group Documents

  • Role of GridFTP Working Group

GridFTP Working Group Motivation

  • Data transfer solutions have been developed by the Globus Project over the past ~5 years; GridFTP is the 3rd generation

  • Grid Forum started ~1 year ago to promote and develop Grid technologies

    • Critical mass of people working in this area

  • Grid Forum GridFTP working group formed to foster the further specification and development of GridFTP

    • Community effort to move GridFTP forward

Some Important Definitions

  • Resource

  • Network protocol

  • Network enabled service

  • Application Programmer Interface (API)

  • Syntax

  • Software Development Kit (SDK)


Resource

  • Entity that is to be shared

    • Includes computers, storage, data, software

  • Does not have to be physical entity

    • Condor pool, distributed file system, …

  • Defined in terms of interfaces, not devices

    • E.g. LSF defines compute resource

    • Open/close/read/write defines access to a distributed file system, e.g. NFS, AFS, DFS

Network Protocol

  • A formal description of message formats and a set of rules for message exchange

    • Rules may define sequence of message exchanges

    • Protocol may define state change in an endpoint, e.g. file system state change

  • Good protocols designed to do one thing

    • Protocols can be layered

  • Examples of protocols

    • IP, TCP, TLS, FTP, HTTP, Kerberos
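To make "message format plus rules" concrete, here is a minimal sketch that parses a single-line FTP-style reply (three-digit code, a space, then text) and applies one simple rule to it. The names `Reply`, `parse_reply` and `is_positive_completion` are hypothetical helpers for illustration, and multi-line replies are deliberately ignored.

```python
from typing import NamedTuple


class Reply(NamedTuple):
    code: int
    text: str


def parse_reply(line: str) -> Reply:
    """Parse a single-line reply of the form '<3 digits><space><text>'."""
    line = line.rstrip("\r\n")
    if len(line) < 4 or not line[:3].isdigit() or line[3] != " ":
        raise ValueError(f"malformed reply: {line!r}")
    return Reply(int(line[:3]), line[4:])


def is_positive_completion(reply: Reply) -> bool:
    """In FTP, 2xx reply codes signal that the requested action completed."""
    return 200 <= reply.code < 300
```

The format check is the "message format" half of the definition; deciding what may be sent next based on the reply code is the "rules for message exchange" half.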

Network Enabled Services

[Figure: layered protocol stacks for two services. FTP Server: FTP Protocol, Telnet Protocol, TCP Protocol, IP Protocol. Web Server: HTTP Protocol, TLS Protocol, TCP Protocol, IP Protocol.]

  • Implementation of a protocol that defines a set of capabilities

    • Protocol defines interaction with service

    • All services require protocols

    • Not all protocols are used to provide services (e.g. IP, TLS)

  • Examples: FTP and Web servers

API (Application Programming Interface)

  • A specification for a set of routines to facilitate application development

    • Refers to definition, not implementation, e.g. there are many implementations of MPI

  • Spec often language-specific (or IDL)

    • Routine name, number, order and type of arguments; mapping to language constructs

    • Behavior or function of routine

  • Examples

    • GSS API, MPI


Syntax

  • A specification for how a defined set of information is encoded into bits

    • A syntax may be defined as part of a protocol or API

      • Protocol messages have defined syntax

      • A syntax may be used as API function argument

    • But syntax can also stand alone

  • Good syntax designed to do one thing

    • Syntaxes can be layered

  • Examples

    • XML, ASN.1, X.509, LDIF

SDK (Software Development Kit)

  • A particular instantiation of an API

  • SDK consists of libraries and tools

    • Provides implementation of API specification

  • Can have multiple SDKs for an API

  • Examples of SDKs

    • MPICH, Motif Widgets

Multiple APIs but a Single Protocol. Example: TCP/IP

  • Multiple APIs: BSD sockets, Winsock, System V streams, …

  • Different programs use different APIs

  • Interoperability: programs using different APIs can exchange information



[Figure: two programs, one using the WinSock API and one using the Berkeley Sockets API, communicating over the TCP/IP Protocol: Reliable byte streams]
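The point about different APIs interoperating over one protocol can be sketched in Python, which happens to ship two distinct APIs over TCP. In this illustrative example the server is written against the blocking `socket` API and the client against the `asyncio` streams API. The function names and the uppercase-echo behaviour are assumptions made for the demonstration only.

```python
import asyncio
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Server side, written against the blocking sockets API."""
    conn, _ = listener.accept()
    with conn:
        data = b""
        while not data.endswith(b"\n"):
            chunk = conn.recv(1024)
            if not chunk:
                break
            data += chunk
        conn.sendall(data.upper())


async def ask(host: str, port: int, message: bytes) -> bytes:
    """Client side, written against the asyncio streams API."""
    reader, writer = await asyncio.open_connection(host, port)
    writer.write(message)
    await writer.drain()
    reply = await reader.readline()
    writer.close()
    await writer.wait_closed()
    return reply


def demo() -> bytes:
    # Port 0 asks the operating system for any free port.
    listener = socket.create_server(("127.0.0.1", 0))
    listener.settimeout(5.0)  # do not wait forever if the client never connects
    port = listener.getsockname()[1]
    thread = threading.Thread(target=serve_once, args=(listener,))
    thread.start()
    try:
        return asyncio.run(ask("127.0.0.1", port, b"hello grid\n"))
    finally:
        thread.join()
        listener.close()
```

Neither side knows or cares which API the other used; only the byte stream on the wire is shared.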

Single API, but Multiple Protocols. E.g., GSS-API

[Figure: one API over different SDKs and protocols. Surviving labels: Kerberos SDK, GSI protocol, Kerberos protocol. Caption: Different message formats, exchange sequences, etc.]

  • GSS-API provides portability: any correct program compiles & runs on a platform

  • Does not provide interoperability: all processes must link against same SDK

    • E.g., GSI and Kerberos versions of GSS-API
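The portability versus interoperability distinction can be illustrated with a toy sketch. The `Mechanism` interface below is hypothetical and only loosely in the spirit of a wrap/unwrap API; the two toy mechanisms are invented wire formats, not GSI or Kerberos.

```python
from abc import ABC, abstractmethod


class Mechanism(ABC):
    """Hypothetical common API that applications program against."""

    @abstractmethod
    def wrap(self, message: bytes) -> bytes: ...

    @abstractmethod
    def unwrap(self, token: bytes) -> bytes: ...


class ToyMechanismA(Mechanism):
    """Toy wire format: the prefix b'A1|' followed by the payload."""

    def wrap(self, message: bytes) -> bytes:
        return b"A1|" + message

    def unwrap(self, token: bytes) -> bytes:
        if not token.startswith(b"A1|"):
            raise ValueError("token is not in mechanism A's format")
        return token[3:]


class ToyMechanismB(Mechanism):
    """Toy wire format: 2-byte big-endian length followed by the payload."""

    def wrap(self, message: bytes) -> bytes:
        return len(message).to_bytes(2, "big") + message

    def unwrap(self, token: bytes) -> bytes:
        length = int.from_bytes(token[:2], "big")
        if len(token) != 2 + length:
            raise ValueError("token is not in mechanism B's format")
        return token[2:]


def round_trip(sender: Mechanism, receiver: Mechanism, message: bytes) -> bytes:
    """Application code written once against the API, whatever the mechanism."""
    return receiver.unwrap(sender.wrap(message))
```

`round_trip` works unchanged with either mechanism (portability), but pairing a `ToyMechanismA` sender with a `ToyMechanismB` receiver raises `ValueError`, because the two sides do not share a wire protocol (no interoperability).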

I.e., Standard APIs and Protocols Are Both Important, for Different Reasons

  • Standard APIs/SDKs are important

    • They enable application portability

    • But w/o standard protocols, interoperability is hard (every SDK speaks every protocol?)

  • Standard protocols are important

    • Enable cross-site interoperability

    • Enable shared infrastructure

    • But w/o standard APIs/SDKs, application portability is hard (different platforms access protocols in different ways)

Grid Data Needs

  • Transfer of large amounts of data (petabytes or terabytes) between storage systems

  • Access to large amounts of data (terabytes or gigabytes) by many geographically distributed applications and users for analysis, visualization, etc.

Requirements

  • Grid Security Infrastructure (GSI) and Kerberos support

  • Third-party control of data transfer

  • Parallel data transfer

  • Striped data transfer

  • Partial file transfer

  • Automatic negotiation of TCP buffer/window size

  • Support for reliable/recoverable data transfer
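As a rough illustration of how partial and parallel transfer relate, the sketch below splits a requested byte range of a file into contiguous sub-ranges, one per data stream. The function `split_ranges` is a hypothetical helper; it shows only the arithmetic and is not a description of how GridFTP itself assigns data to streams.

```python
def split_ranges(offset: int, length: int, streams: int) -> list[tuple[int, int]]:
    """Split [offset, offset + length) into at most `streams` (start, size) ranges.

    Sizes differ by at most one byte; empty ranges are dropped.
    """
    if offset < 0 or length < 0 or streams < 1:
        raise ValueError("offset and length must be >= 0 and streams >= 1")
    base, extra = divmod(length, streams)
    ranges = []
    start = offset
    for i in range(streams):
        size = base + (1 if i < extra else 0)
        if size == 0:
            break
        ranges.append((start, size))
        start += size
    return ranges
```

A partial transfer is the special case of one stream with a non-zero offset; a parallel transfer sends each returned range over its own connection.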

Candidate Standards

  • FTP

    • Defined by a set of IETF RFCs

    • No partial file, parallel/striped, GSI, etc

    • Separate control & data channels

  • WebDAV

    • New extension to HTTP

    • No third party transfer, parallel/striped, etc.

    • Combined control & data channel

Separate Control & Data Channels

  • WebDAV combines control and data over single channel

  • FTP splits control and data

    • Supports multiple, user selectable data channel protocols

  • Advantage to split channels

    • Third party transfers handled cleanly

    • Can (cleanly) define new data channel protocols

      • E.g. parallel/striped transfer, automatic TCP buffer/window negotiation

    • Amenable to high-performance proxies

      • E.g. For firewalls, load balancing, etc.
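A sketch of why split channels make third-party transfer clean: with standard FTP commands, a client can ask one server to listen (PASV) and tell the other server to connect to that address (PORT), so the data flows directly between the two servers. The helper names below are hypothetical, and the code only builds the control-channel commands; it does not open any connections.

```python
import re


def parse_pasv(reply: str) -> tuple[str, int]:
    """Extract (host, port) from a '227 ... (h1,h2,h3,h4,p1,p2)' reply."""
    match = re.search(r"(\d+),(\d+),(\d+),(\d+),(\d+),(\d+)", reply)
    if not reply.startswith("227") or match is None:
        raise ValueError(f"not a usable PASV reply: {reply!r}")
    nums = [int(n) for n in match.groups()]
    host = ".".join(str(n) for n in nums[:4])
    port = nums[4] * 256 + nums[5]
    return host, port


def port_command(host: str, port: int) -> str:
    """Build the PORT command that points a server at (host, port)."""
    fields = host.split(".") + [str(port // 256), str(port % 256)]
    return "PORT " + ",".join(fields)


def third_party_plan(dest_pasv_reply: str, src_path: str, dst_path: str) -> list[tuple[str, str]]:
    """Commands the client sends after the destination has answered PASV."""
    host, port = parse_pasv(dest_pasv_reply)
    return [
        ("source", port_command(host, port)),  # source will connect to destination
        ("destination", f"STOR {dst_path}"),   # destination waits for the data
        ("source", f"RETR {src_path}"),        # source sends the file
    ]
```

The client only ever touches the two control channels; no file data passes through it.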

GridFTP Solution

  • Built on existing FTP standards

    • RFC 959: File Transfer Protocol

    • RFC 2228: FTP Security Extensions

    • RFC 2389: Feature Negotiation for the File Transfer Protocol

    • Draft: FTP Extensions

  • Extends standards with

    • Additions to security extensions, partial file transfer, parallel/striped transfer, TCP buffer/window size tuning
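To show why TCP buffer/window size tuning matters on long, fast paths, the sketch below computes a bandwidth-delay product and applies it as the local socket buffer size. This is an assumption-laden illustration: the helper names are hypothetical, the operating system may clamp or adjust the requested size, and the protocol-level negotiation between client and server is not shown.

```python
import socket


def bdp_bytes(bandwidth_bits_per_s: int, rtt_ms: int) -> int:
    """Bandwidth-delay product in bytes: bits/s * seconds / 8."""
    return bandwidth_bits_per_s * rtt_ms // 8000


def tuned_socket(buffer_bytes: int) -> socket.socket:
    """Create a TCP socket and request send/receive buffers of the given size."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Request the sizes before connecting; the kernel may grant a different value.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, buffer_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, buffer_bytes)
    return sock
```

For example, a 1 Gbit/s path with a 100 ms round-trip time needs about 12.5 MB in flight to keep the pipe full, far more than typical default buffer sizes.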

GridFTP Implementation Status

  • Modified wu-ftpd server

    • Most features

  • Modified ncftp client

    • Security, TCP buffer setting

  • Modified HPSS & Unitree ftpd server

    • Security

  • Globus Toolkit client and server SDKs, and command line tools

    • Most features

  • Striped FTP server (aka DPSS2)

GridFTP Working Group Documents

  • GridFTP: A Data Transfer Protocol for the Grid

    • Overview of working group activities and documents

    • Requirements

    • Informational draft

  • GridFTP: FTP Extensions for the Grid

    • Protocol specification

GridFTP Protocol Specifications

  • Existing standards

    • RFC 959: File Transfer Protocol

    • RFC 2228: FTP Security Extensions

    • RFC 2389: Feature Negotiation for the File Transfer Protocol

    • Draft: FTP Extensions

  • New drafts

    • GridFTP: FTP Extensions for the Grid

GridFTP APIs

  • Should there be standard API(s)?

    • Posix I/O

    • SRB client

    • grid_storage

    • globus_ftp_client

    • MPI-IO

    • HDF5

    • etc

  • Beyond scope of this working group

  • Common protocol beneath these APIs would allow interoperability

Role of GridFTP Working Group

  • Bring together those who are interested in the future of GridFTP to help foster the…

    • continued specification and standardization of GridFTP

    • development of inter-operable GridFTP implementations

    • widespread adoption of GridFTP as a transfer protocol for the Grid

  • Develop drafts which together define GridFTP

    • May submit some of them to IETF

  • Move GridFTP forward to better address Grid data transfer requirements

NOT Goals of GridFTP Working Group

  • This working group will not start from first principles

    • Starting point is roughly GridFTP as it now exists

    • FTP base is assumed

  • It's not design by committee

    • Seeking rough consensus, with broad input

    • Draft authors and WG chair have final say

GF5 GridFTP Working Session

  • Is this appropriate for Grid Forum?

  • Who is interested in participating, and in what capacity?

  • Is the problem scoped appropriately (at least for now)?

  • What are the right drafts to write?

  • Establish rough timeline for drafts

A Call To Arms

  • The Grid Forum GridFTP working group needs to do more than just gather 3 times a year to chat about data management.

  • But Grid Forum is only appropriate for this activity if people meaningfully participate.

    • I will be doing this regardless.

    • But it will hopefully be done better and faster with broad participation.

    • If there is not meaningful participation, I won’t bother with the overhead of Grid Forum.