GridFTP. Steve Tuecke Argonne National Laboratory. Overview. Motivation for GridFTP Working Group Requirements GridFTP Solution GridFTP Working Group Documents Role of GridFTP Working Group. GridFTP Working Group Motivation.

GridFTP

Steve Tuecke

Argonne National Laboratory


Overview

  • Motivation for GridFTP Working Group

  • Requirements

  • GridFTP Solution

  • GridFTP Working Group Documents

  • Role of GridFTP Working Group


GridFTP Working Group Motivation

  • Data transfer solutions have been developed by the Globus Project over the past ~5 years; GridFTP is the 3rd generation

  • Grid Forum started ~1 year ago to promote and develop Grid technologies

    • Critical mass of people working in this area

  • Grid Forum GridFTP working group formed to foster the further specification and development of GridFTP

    • Community effort to move GridFTP forward


Some Important Definitions

  • Resource

  • Network protocol

  • Network enabled service

  • Application Programming Interface (API)

  • Syntax

  • Software Development Kit (SDK)


Resource

  • Entity that is to be shared

    • Includes computers, storage, data, software

  • Does not have to be physical entity

    • Condor pool, distributed file system, …

  • Defined in terms of interfaces, not devices

    • E.g. LSF defines compute resource

    • Open/close/read/write defines access to a distributed file system, e.g. NFS, AFS, DFS


Network Protocol

  • A formal description of message formats and a set of rules for message exchange

    • Rules may define sequence of message exchanges

    • Protocol may define a state change in an endpoint, e.g. a file system state change

  • Good protocols designed to do one thing

    • Protocols can be layered

  • Examples of protocols

    • IP, TCP, TLS, FTP, HTTP, Kerberos
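
The definition above (message formats plus rules for exchange, possibly with endpoint state changes) can be illustrated with a minimal sketch. It is loosely modeled on the FTP login exchange; the `State` enum and the `parse_message` and `step` helpers are hypothetical names invented for this example, not part of any FTP implementation.

```python
from enum import Enum, auto


class State(Enum):
    """Endpoint state that the protocol rules are allowed to change."""
    AWAIT_USER = auto()
    AWAIT_PASS = auto()
    LOGGED_IN = auto()


def parse_message(line: str) -> tuple[str, str]:
    """Message format: 'VERB argument' terminated by CRLF."""
    if not line.endswith("\r\n"):
        raise ValueError("message must end with CRLF")
    verb, _, arg = line[:-2].partition(" ")
    return verb.upper(), arg


def step(state: State, line: str) -> tuple[State, str]:
    """Exchange rules: USER must be sent first, then PASS."""
    verb, arg = parse_message(line)
    if state is State.AWAIT_USER and verb == "USER" and arg:
        return State.AWAIT_PASS, "331 Password required\r\n"
    if state is State.AWAIT_PASS and verb == "PASS":
        return State.LOGGED_IN, "230 Logged in\r\n"
    # Any other message leaves the endpoint state unchanged.
    return state, "503 Bad sequence of commands\r\n"
```

The format rule lives in `parse_message`, while the sequencing rule and the state change live in `step`; keeping them apart mirrors the way a syntax can be defined separately from the protocol that uses it.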


Network Enabled Services

[Diagram: two protocol stacks. FTP Server: FTP Protocol over Telnet Protocol over TCP Protocol over IP Protocol. Web Server: HTTP Protocol over TLS Protocol over TCP Protocol over IP Protocol.]

  • Implementation of a protocol that defines a set of capabilities

    • Protocol defines interaction with service

    • All services require protocols

    • Not all protocols are used to provide services (e.g. IP, TLS)

  • Examples: FTP and Web servers


API (Application Programming Interface)

  • A specification for a set of routines to facilitate application development

    • Refers to definition, not implementation, e.g. there are many implementations of MPI

  • Spec often language-specific (or IDL)

    • Routine name, number, order and type of arguments; mapping to language constructs

    • Behavior or function of routine

  • Examples

    • GSS API, MPI


Syntax

  • A specification for how a defined set of information is encoded into bits

    • A syntax may be defined as part of a protocol or API

      • Protocol messages have defined syntax

      • A syntax may be used as API function argument

    • But syntax can also stand alone

  • Good syntax designed to do one thing

    • Syntaxes can be layered

  • Examples

    • XML, ASN.1, X.509, LDIF
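
As a rough illustration of a syntax standing alone, the sketch below encodes the same set of information into bits in two different syntaxes, JSON and XML, using only the Python standard library. The record fields and the helper names are assumptions made up for this example.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record: the "defined set of information" to be encoded.
record = {"name": "data.h5", "size": "1048576"}


def to_json_bytes(rec: dict[str, str]) -> bytes:
    """Encode the record using JSON syntax."""
    return json.dumps(rec, sort_keys=True).encode("utf-8")


def from_json_bytes(data: bytes) -> dict[str, str]:
    return json.loads(data.decode("utf-8"))


def to_xml_bytes(rec: dict[str, str]) -> bytes:
    """Encode the same record using XML syntax."""
    root = ET.Element("file")
    for key in sorted(rec):
        ET.SubElement(root, key).text = rec[key]
    return ET.tostring(root, encoding="utf-8")


def from_xml_bytes(data: bytes) -> dict[str, str]:
    root = ET.fromstring(data)
    return {child.tag: child.text for child in root}
```

Both encodings carry the same information but produce different bytes, so a receiver must know which syntax was used before it can decode a message.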


SDK (Software Development Kit)

  • A particular instantiation of an API

  • SDK consists of libraries and tools

    • Provides implementation of API specification

  • Can have multiple SDKs for an API

  • Examples of SDKs

    • MPICH, Motif Widgets
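
The API/SDK distinction can be sketched as one specification with two independent implementations. Everything here is hypothetical: `ChecksumAPI` stands in for an API definition, and `LoopSDK` and `BuiltinSDK` stand in for two SDKs that implement it.

```python
from abc import ABC, abstractmethod


class ChecksumAPI(ABC):
    """The API: routine name, argument types, and behavior. No implementation."""

    @abstractmethod
    def checksum(self, data: bytes) -> int:
        """Return the sum of all bytes modulo 65536."""


class LoopSDK(ChecksumAPI):
    """One SDK: implements the API with an explicit loop."""

    def checksum(self, data: bytes) -> int:
        total = 0
        for byte in data:
            total = (total + byte) % 65536
        return total


class BuiltinSDK(ChecksumAPI):
    """Another SDK: same API, different implementation."""

    def checksum(self, data: bytes) -> int:
        return sum(data) % 65536


def application(api: ChecksumAPI, payload: bytes) -> int:
    """Application code depends only on the API, so it works with either SDK."""
    return api.checksum(payload)
```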


Multiple APIs but a Single Protocol. Example: TCP/IP

  • Multiple APIs: BSD sockets, Winsock, System V streams, …

  • Different programs use different APIs

  • Interoperability: programs using different APIs can exchange information

[Diagram: two applications, one on the WinSock API and one on the Berkeley Sockets API, both communicating over the TCP/IP protocol (reliable byte streams).]
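
A small sketch of the same idea, without real sockets: two differently shaped APIs that both speak one wire protocol. The protocol (a 4-byte big-endian length followed by the payload) and all names are assumptions chosen for illustration; they are not TCP/IP itself.

```python
import io
import struct

# Shared wire protocol (hypothetical): 4-byte big-endian length, then payload.


# API style 1: plain functions over bytes objects.
def frame(payload: bytes) -> bytes:
    return struct.pack("!I", len(payload)) + payload


def unframe(wire: bytes) -> bytes:
    (length,) = struct.unpack("!I", wire[:4])
    return wire[4:4 + length]


# API style 2: an object wrapping a file-like byte stream.
class MessageStream:
    def __init__(self, stream: io.BytesIO) -> None:
        self._stream = stream

    def write_message(self, payload: bytes) -> None:
        self._stream.write(struct.pack("!I", len(payload)))
        self._stream.write(payload)

    def read_message(self) -> bytes:
        (length,) = struct.unpack("!I", self._stream.read(4))
        return self._stream.read(length)
```

A message written through one API can be read through the other because both agree on the bytes on the wire, which is the interoperability property the slide describes.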


Single API, but Multiple Protocols. E.g., GSS-API

[Diagram: two applications both call GSS-API. One is backed by the GSI SDK speaking the GSI protocol, the other by the Kerberos SDK speaking the Kerberos protocol; the protocols use different message formats, exchange sequences, etc. Both run over TCP/IP.]

  • GSS-API provides portability: any correct program compiles & runs on a platform

  • Does not provide interoperability: all processes must link against same SDK

    • E.g., GSI and Kerberos versions of GSS-API
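
The portability-without-interoperability point can be sketched with two toy SDKs that expose the same two routines but emit incompatible tokens. This is not GSS-API, GSI, or Kerberos; `ColonSDK`, `JsonSDK`, and both token formats are invented for the example.

```python
import json


class ColonSDK:
    """Hypothetical SDK whose tokens look like b'T1:<user>'."""

    def make_token(self, user: str) -> bytes:
        return b"T1:" + user.encode("utf-8")

    def read_token(self, token: bytes) -> str:
        if not token.startswith(b"T1:"):
            raise ValueError("not a T1 token")
        return token[3:].decode("utf-8")


class JsonSDK:
    """Hypothetical SDK with the same routines but a JSON token format."""

    def make_token(self, user: str) -> bytes:
        return json.dumps({"user": user}).encode("utf-8")

    def read_token(self, token: bytes) -> str:
        try:
            return json.loads(token.decode("utf-8"))["user"]
        except (ValueError, KeyError, TypeError) as exc:
            raise ValueError("not a JSON token") from exc


def round_trip(sdk, user: str) -> str:
    """Portable application code: runs unchanged against either SDK."""
    return sdk.read_token(sdk.make_token(user))
```

`round_trip` works with either SDK, but a token made by one SDK is rejected by the other, so two processes can only talk if they use the same SDK.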


I.e., Standard APIs and Protocols are Both Important, for Different Reasons

  • Standard APIs/SDKs are important

    • They enable application portability

    • But w/o standard protocols, interoperability is hard (every SDK speaks every protocol?)

  • Standard protocols are important

    • Enable cross-site interoperability

    • Enable shared infrastructure

    • But w/o standard APIs/SDKs, application portability is hard (different platforms access protocols in different ways)


Grid Data Needs

  • Transfer of large amounts of data (petabytes or terabytes) between storage systems

  • Access to large amounts of data (terabytes or gigabytes) by many geographically distributed applications and users for analysis, visualization, etc.


Requirements

  • Grid Security Infrastructure (GSI) and Kerberos support

  • Third-party control of data transfer

  • Parallel data transfer

  • Striped data transfer

  • Partial file transfer

  • Automatic negotiation of TCP buffer/window size

  • Support for reliable/recoverable data transfer
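
Partial and parallel transfer both reduce to addressing a file by (offset, length) ranges. The sketch below shows one way such ranges might be computed and reassembled; it runs in memory and sequentially, and `split_ranges` and `transfer_in_ranges` are hypothetical helpers, not GridFTP operations.

```python
def split_ranges(size: int, streams: int) -> list[tuple[int, int]]:
    """Split `size` bytes into up to `streams` contiguous (offset, length) ranges."""
    if size < 0 or streams < 1:
        raise ValueError("size must be >= 0 and streams >= 1")
    base, extra = divmod(size, streams)
    ranges = []
    offset = 0
    for index in range(streams):
        # The first `extra` ranges carry one additional byte each.
        length = base + (1 if index < extra else 0)
        if length:
            ranges.append((offset, length))
        offset += length
    return ranges


def transfer_in_ranges(source: bytes, streams: int) -> bytes:
    """Copy `source` range by range, as if each range used its own data stream."""
    destination = bytearray(len(source))
    for offset, length in split_ranges(len(source), streams):
        # A single iteration of this loop is a partial file transfer.
        destination[offset:offset + length] = source[offset:offset + length]
    return bytes(destination)
```

Because every range carries its own offset, the ranges can arrive in any order, and a failed range can be retried on its own, which is also relevant to recoverable transfer.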


Candidate Standards

  • FTP

    • Defined by a set of IETF RFCs

    • No partial file, parallel/striped, GSI, etc.

    • Separate control & data channels

  • WebDAV

    • New extension to HTTP

    • No third party transfer, parallel/striped, etc.

    • Combined control & data channel


Separate Control & Data Channels

  • WebDAV combines control and data over single channel

  • FTP splits control and data

    • Supports multiple, user selectable data channel protocols

  • Advantage to split channels

    • Third party transfers handled cleanly

    • Can (cleanly) define new data channel protocols

      • E.g. parallel/striped transfer, automatic TCP buffer/window negotiation

    • Amenable to high-performance proxies

      • E.g. For firewalls, load balancing, etc.
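
To see why split channels make third-party transfer clean, here is a sketch of how a client might plan one using the standard FTP commands PASV, PORT, RETR and STOR. It only builds command strings and performs no network I/O; the helper names are hypothetical, and it assumes the client has already sent PASV on the source server's control channel and received the 227 reply passed in.

```python
import re


def parse_pasv_reply(reply: str) -> tuple[str, int]:
    """Extract host and port from a '227 ... (h1,h2,h3,h4,p1,p2)' reply."""
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if not reply.startswith("227") or match is None:
        raise ValueError("not a valid PASV reply")
    numbers = [int(group) for group in match.groups()]
    host = ".".join(str(number) for number in numbers[:4])
    port = numbers[4] * 256 + numbers[5]
    return host, port


def port_command(host: str, port: int) -> str:
    """Build the PORT command that points a server at host:port."""
    return "PORT {},{},{}".format(host.replace(".", ","), port // 256, port % 256)


def third_party_plan(source_pasv_reply: str, path: str) -> list[tuple[str, str]]:
    """Commands the client sends on each control channel after the PASV reply."""
    host, port = parse_pasv_reply(source_pasv_reply)
    return [
        ("destination", port_command(host, port)),  # connect to the source
        ("destination", f"STOR {path}"),
        ("source", f"RETR {path}"),
    ]
```

The client only ever touches the two control channels; the file data flows directly between the two servers over the data channel.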


GridFTP Solution

  • Built on existing FTP standards

    • RFC 959: File Transfer Protocol

    • RFC 2228: FTP Security Extensions

    • RFC 2389: Feature Negotiation for the File Transfer Protocol

    • Draft: FTP Extensions

  • Extends standards with

    • Additions to security extensions, partial file transfer, parallel/striped transfer, TCP buffer/window size tuning
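
Feature negotiation (RFC 2389) is how a client can discover which extensions a server supports before using them. A minimal sketch of parsing a FEAT reply follows; `parse_feat` is a hypothetical helper, and the sample feature lines used to exercise it are ordinary FTP extensions, not GridFTP-specific ones.

```python
def parse_feat(reply: str) -> dict[str, str]:
    """Parse a FEAT reply into {FEATURE_NAME: parameters}.

    Handles the multi-line form ('211-...' through '211 ...', with each
    feature line starting with a space) and the single-line form a server
    sends when it has no features to report.
    """
    lines = reply.splitlines()
    if len(lines) == 1 and lines[0].startswith("211 "):
        return {}
    if (len(lines) < 2 or not lines[0].startswith("211-")
            or not lines[-1].startswith("211 ")):
        raise ValueError("not a 211 FEAT reply")
    features = {}
    for line in lines[1:-1]:
        if not line.startswith(" "):
            continue  # only lines with a leading space are feature lines
        name, _, params = line.strip().partition(" ")
        features[name.upper()] = params
    return features
```

A client could check for a feature name in the returned dictionary before issuing the corresponding extended command, and fall back to plain FTP behavior when it is absent.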


GridFTP Implementation Status

  • Modified wu-ftpd server

    • Most features

  • Modified ncftp client

    • Security, TCP buffer setting

  • Modified HPSS & UniTree ftpd servers

    • Security

  • Globus Toolkit client and server SDKs, and command line tools

    • Most features

  • Striped FTP server (aka DPSS2)
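
For the TCP buffer setting mentioned above, a common rule of thumb is to size socket buffers to the bandwidth-delay product of the path. The sketch below assumes that rule; it is not taken from the slides, both function names are hypothetical, and the operating system may clamp the requested buffer sizes.

```python
import socket


def bandwidth_delay_product(bandwidth_bits_per_s: int, rtt_ms: int) -> int:
    """Bytes in flight needed to keep a path of this bandwidth and RTT full."""
    return bandwidth_bits_per_s * rtt_ms // 8000


def make_tuned_socket(buffer_bytes: int) -> socket.socket:
    """Create a TCP socket and request send/receive buffers of the given size."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Buffers are requested before connecting so they apply from the start.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, buffer_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, buffer_bytes)
    return sock
```

For example, a 1 Gbit/s path with a 100 ms round-trip time gives `bandwidth_delay_product(1_000_000_000, 100)`, which is 12,500,000 bytes, far above typical default buffer sizes.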


GridFTP Working Group Documents

  • GridFTP: A Data Transfer Protocol for the Grid

    • Overview of working group activities and documents

    • Requirements

    • Informational draft

  • GridFTP: FTP Extensions for the Grid

    • Protocol specification


GridFTP Protocol Specifications

  • Existing standards

    • RFC 959: File Transfer Protocol

    • RFC 2228: FTP Security Extensions

    • RFC 2389: Feature Negotiation for the File Transfer Protocol

    • Draft: FTP Extensions

  • New drafts

    • GridFTP: FTP Extensions for the Grid


GridFTP APIs

  • Should there be standard API(s)?

    • Posix I/O

    • SRB client

    • grid_storage

    • globus_ftp_client

    • MPI-IO

    • HDF5

    • etc.

  • Beyond scope of this working group

  • Common protocol beneath these APIs would allow interoperability


Role of GridFTP Working Group

  • Bring together those who are interested in the future of GridFTP to help foster the…

    • continued specification and standardization of GridFTP

    • development of interoperable GridFTP implementations

    • widespread adoption of GridFTP as a transfer protocol for the Grid

  • Develop drafts which together define GridFTP

    • May submit some of them to IETF

  • Move GridFTP forward to better address Grid data transfer requirements


NOT Goals of GridFTP Working Group

  • This working group will not start from first principles

    • Starting point is roughly GridFTP as it now exists

    • FTP base is assumed

  • It's not design by committee

    • Seeking rough consensus, with broad input

    • Draft authors and WG chair have final say


GF5 GridFTP Working Session

  • Is this appropriate for Grid Forum?

  • Who is interested in participating, and in what capacity?

  • Is the problem scoped appropriately (at least for now)?

  • What are the right drafts to write?

  • Establish rough timeline for drafts


A Call To Arms

  • The Grid Forum GridFTP working group needs to do more than just gather 3 times a year to chat about data management.

  • But Grid Forum is only appropriate for this activity if people meaningfully participate.

    • I will be doing this regardless.

    • But it will hopefully be done better and faster with broad participation.

    • If there is not meaningful participation, I won’t bother with the overhead of Grid Forum.

