Reliable Data Movement using Globus GridFTP and RFT: New Developments in 2008

John Bresnahan

Michael Link

Raj Kettimuthu

Argonne National Laboratory and The University of Chicago


GridFTP

  • High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks

  • Based on FTP protocol - defines extensions for high-performance operation and security

  • Standardized through Open Grid Forum (OGF)

  • Globus implementation of GridFTP is widely used for bulk data movement

    • Average of more than 3 million transfers per day


GridFTP

[Figure: GridFTP transfer diagram showing a client, two GridFTP servers, two control channels (CC) and one data channel (DC).]


Key features

  • Performance

  • Security

    • GSI, SSH

    • Username/password and anonymous

  • Cluster-to-cluster data movement/striping

  • Support for reliable and restartable transfers

  • Modular

    • Easy to plug in alternate transport protocols

    • Storage systems can be plugged in too, e.g. HPSS, SRB


Globus Reliable File Transfer Service (RFT)

  • RFT is a GridFTP client that adds reliability

  • GridFTP: an on-demand transfer service

    • Not a queuing service

  • RFT

    • Queues requests

    • Orchestrates transfers on client’s behalf

    • Writes to persistent store

    • Recovers from GridFTP and RFT service failures
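The queue-and-recover behaviour listed above can be illustrated with a minimal sketch. It is not RFT's implementation: the `TransferQueue` class, its table layout, and the `do_transfer` callback are hypothetical names chosen for illustration, and SQLite merely stands in for whatever persistent store a real deployment uses. The point is only that a request written to durable storage before it runs can be picked up again after a crash.

```python
import sqlite3


class TransferQueue:
    """Toy persistent queue: each request is stored in SQLite before it runs.

    Hypothetical illustration only; this is not the RFT service's schema or API.
    """

    def __init__(self, db_path):
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS transfers ("
            "id INTEGER PRIMARY KEY, src TEXT, dst TEXT, state TEXT)"
        )
        self.db.commit()

    def submit(self, src, dst):
        """Queue a request and return its id; nothing is transferred yet."""
        cur = self.db.execute(
            "INSERT INTO transfers (src, dst, state) VALUES (?, ?, 'pending')",
            (src, dst),
        )
        self.db.commit()
        return cur.lastrowid

    def pending(self):
        """Return every request that has not completed, oldest first."""
        rows = self.db.execute(
            "SELECT id, src, dst FROM transfers WHERE state != 'done' ORDER BY id"
        )
        return rows.fetchall()

    def run(self, do_transfer):
        """Attempt each pending request; failed ones stay pending for a later run."""
        for transfer_id, src, dst in self.pending():
            try:
                do_transfer(src, dst)
            except Exception:
                continue  # still recorded as pending, so a later run retries it
            self.db.execute(
                "UPDATE transfers SET state = 'done' WHERE id = ?", (transfer_id,)
            )
            self.db.commit()
```

Opening a new `TransferQueue` on the same database file after a simulated crash and calling `run` again resumes whatever was left unfinished, which is the property the slide attributes to the persistent store.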


RFT

[Figure: RFT architecture diagram showing a client, SOAP messages, optional notifications, the RFT service, a persistent store, two GridFTP servers, two control channels (CC) and one data channel (DC).]


New features in GridFTP

  • GridFTP information provider service

    • Max connections

    • Open connections

    • Load

  • Higher level services can utilize this information for scheduling data transfers

    • Help with selecting the appropriate replica of data
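As a rough illustration of how a higher-level service might use these three values to choose a replica, consider the sketch below. The `ServerInfo` record and the selection rule (skip servers with no free connection slots, then prefer the lowest load) are assumptions made for this example; the slides do not specify the information provider's data format or any particular scheduling policy.

```python
from dataclasses import dataclass


@dataclass
class ServerInfo:
    """Hypothetical record of the values the information provider publishes."""
    url: str
    max_connections: int
    open_connections: int
    load: float


def pick_replica(servers):
    """Return the least-loaded server that still has a free connection slot.

    Ties on load are broken by the fraction of connection slots already in use.
    Returns None when every server is full.
    """
    candidates = [s for s in servers if s.open_connections < s.max_connections]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda s: (s.load, s.open_connections / s.max_connections),
    )
```

A caller would build one `ServerInfo` per replica location and transfer from the URL of whichever server `pick_replica` returns.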


Concurrency

[Figure: transfer diagram showing a client, two GridFTP servers, two control channels (CC) and one data channel (DC).]


Concurrency

  • Client submits concurrent transfer requests to the server

    • Significantly improves performance when transferring lots of small files

    • APS used this feature to transfer 1 TB of data to Australia 30x faster than SCP

    • LIGO used this feature to transfer 1.5 TB of data from Milwaukee to Germany at 80 MB/s
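The reason concurrency helps with many small files is that per-file overhead is overlapped rather than paid one file at a time. The sketch below shows the pattern with local file copies standing in for GridFTP transfers; `copy_one`, `copy_concurrently`, and the default of four workers are illustrative choices, not part of any Globus tool.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def copy_one(src, dst_dir):
    """Copy a single file into dst_dir (stand-in for one transfer request)."""
    dst = Path(dst_dir) / Path(src).name
    shutil.copyfile(src, dst)
    return dst


def copy_concurrently(sources, dst_dir, concurrency=4):
    """Issue up to `concurrency` copies at once and return the destination paths."""
    Path(dst_dir).mkdir(parents=True, exist_ok=True)
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # map() preserves input order, so results line up with `sources`.
        return list(pool.map(lambda s: copy_one(s, dst_dir), sources))
```

With a real transfer client in place of `copy_one`, the same structure keeps several requests in flight so that setup latency on one file overlaps with data movement on another.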


Multicasting


GridFTP Overlay Network

[Figure: overlay network diagram; the only label that survives is BWsd.]


Bottleneck detection

  • Determine which stage is the bottleneck for data transfer performance

  • Candidates: network read, network write, disk read, disk write

  • NetLogger is used to measure these values

  • NetLogger ships with Globus starting with version 4.2; to build GridFTP with the NetLogger driver:

    • ./configure --enable-netlogger

    • make gridftp globus_xio_netlogger_driver
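Once a rate has been measured for each of the four stages, identifying the bottleneck amounts to finding the slowest one. The sketch below shows that step only; it does not parse NetLogger output, and the `find_bottleneck` helper and the sample figures are invented for illustration.

```python
def find_bottleneck(rates_mb_per_s):
    """Return (stage, rate) for the slowest stage in a {stage: MB/s} mapping."""
    if not rates_mb_per_s:
        raise ValueError("no measurements supplied")
    stage = min(rates_mb_per_s, key=rates_mb_per_s.get)
    return stage, rates_mb_per_s[stage]


# Hypothetical measurements, not taken from the slides.
measurements = {
    "disk read": 400.0,
    "network write": 95.0,
    "network read": 110.0,
    "disk write": 250.0,
}
```

For these made-up numbers the slowest stage is the sender's network write, so tuning disk I/O on either end would not improve end-to-end throughput.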


Popen

  • Popen XIO driver

    • Allows users to open pipes to the standard I/O of existing programs

    • Lets you chain existing programs into a transfer, as you can with UNIX pipes

    • globus-gridftp-server -p 5000 -fs-whitelist popen,file,ordering -aa

    • globus-url-copy -dst-fsstack popen:argv=#/usr/bin/zip#/home/bresnaha/text.txt.zip#-,ordering ftp://localhost:5000/home/bresnaha/text.txt ftp://localhost:5000/y
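The underlying idea, writing a data stream into a program's standard input and reading the result from its standard output, can be shown outside GridFTP with Python's `subprocess` module. This is an analogy only and does not use or model the XIO driver; the `pipe_through` helper and the `COMPRESS` command (a child Python process that zlib-compresses its input, chosen so the example runs without external tools) are assumptions for illustration.

```python
import subprocess
import sys

# Hypothetical filter: a child Python process that compresses stdin to stdout.
COMPRESS = [
    sys.executable,
    "-c",
    "import sys, zlib; "
    "sys.stdout.buffer.write(zlib.compress(sys.stdin.buffer.read()))",
]


def pipe_through(argv, data):
    """Feed `data` (bytes) to the program's stdin and return what it writes to stdout."""
    proc = subprocess.Popen(argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    out, _ = proc.communicate(data)
    if proc.returncode != 0:
        raise RuntimeError(f"{argv[0]} exited with status {proc.returncode}")
    return out
```

In the `globus-url-copy` example above, `zip` plays the role of the filter program and the transferred file plays the role of `data`.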


New features in RFT

  • Command line client in C

    • A new feature-rich, fast command-line client

    • globus-crft

  • GT4.2 RFT has more robust retry mechanisms

    • These help prevent overheating in certain cluster configurations
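The slides do not describe how the GT4.2 retry mechanism works. As a point of reference, one common way to keep retries from piling onto a struggling server is exponential backoff, sketched below. The `retry` function, its parameters, and the doubling schedule are assumptions for illustration and should not be read as RFT's actual behaviour.

```python
import time


def retry(operation, attempts=5, base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Call operation() until it succeeds, doubling the wait after each failure.

    The wait before retry n (counting from 0) is min(max_delay, base_delay * 2**n).
    The last failure is re-raised once `attempts` calls have been made.
    `sleep` is injectable so callers (and tests) can avoid real waiting.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(min(max_delay, base_delay * 2 ** attempt))
```

Spacing retries out this way gives a busy server time to recover instead of receiving the same request again immediately.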


Connection caching

  • Connections are now cached across all of a user's transfer requests, instead of only within a single transfer request

  • This dramatically improves performance when a user submits multiple requests

  • Eliminates repeated authentication overhead on the control and data channels
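A minimal sketch of the caching idea follows. It assumes connections are keyed by user and server host, which the slides do not state; `Connection`, `ConnectionCache`, and the `authentications` counter are hypothetical names, and the counter simply stands in for the cost of a real authentication handshake.

```python
class Connection:
    """Placeholder for an authenticated connection to a server."""

    def __init__(self, user, host):
        self.user = user
        self.host = host


class ConnectionCache:
    """Reuse one connection per (user, host) across any number of requests."""

    def __init__(self):
        self._cache = {}
        self.authentications = 0  # how many times the costly handshake ran

    def get(self, user, host):
        key = (user, host)
        conn = self._cache.get(key)
        if conn is None:
            # Cache miss: this is where a real client would authenticate.
            self.authentications += 1
            conn = Connection(user, host)
            self._cache[key] = conn
        return conn
```

Three requests from the same user to the same server then cost one authentication instead of three, which is the saving the bullets above describe.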


Connection caching

  • Measured performance improvement for jobs submitted using Condor-G

  • For 500 jobs, each requiring file stageIn, stageOut and cleanup (RFT tasks):

    • 30% improvement in overall performance

    • No timeouts caused by overwhelming the GridFTP servers with connection requests

