Reliable Data Movement using Globus GridFTP and RFT: New Developments in 2008


John Bresnahan

Michael Link

Raj Kettimuthu

Argonne National Laboratory and

The University of Chicago

GridFTP
  • High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks
  • Based on FTP protocol - defines extensions for high-performance operation and security
  • Standardized through Open Grid Forum (OGF)
  • Globus implementation of GridFTP is widely used for bulk data movement
    • Average of more than 3 million transfers per day
GridFTP

[Diagram labels: Client, two GridFTP Servers, control channels (CC), data channel (DC)]

Key features
  • Performance
  • Security
    • GSI, SSH
    • Username/password and anonymous
  • Cluster-to-cluster data movement/striping
  • Support for reliable and restartable transfers
  • Modular
    • Easy to plug in alternate transport protocols
    • Alternate storage systems too, e.g. HPSS, SRB
Globus Reliable File Transfer Service (RFT)
  • RFT is a GridFTP client that adds reliability
  • GridFTP is an on-demand transfer service
    • Not a queuing service
  • RFT
    • Queues requests
    • Orchestrates transfers on client’s behalf
    • Writes to persistent store
    • Recovers from GridFTP and RFT service failures
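The queue-and-recover behaviour listed above can be illustrated with a minimal sketch. This is not the RFT implementation: the `TransferQueue` class, its table layout, the state names, and the example URLs are all assumptions made for illustration. It only shows the idea of writing each request to a persistent store so that in-flight work can be requeued after a failure.

```python
import sqlite3


class TransferQueue:
    """Hypothetical persistent transfer queue (illustration only, not RFT)."""

    def __init__(self, db_path=":memory:"):
        # Pass a file path instead of ":memory:" to survive a real restart.
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS transfers ("
            "id INTEGER PRIMARY KEY, src TEXT, dst TEXT, state TEXT)"
        )
        self.db.commit()

    def submit(self, src, dst):
        """Queue a request and persist it before doing any work."""
        cur = self.db.execute(
            "INSERT INTO transfers (src, dst, state) VALUES (?, ?, 'pending')",
            (src, dst),
        )
        self.db.commit()
        return cur.lastrowid

    def mark(self, transfer_id, state):
        """Record a state change ('active', 'done', ...) in the store."""
        self.db.execute(
            "UPDATE transfers SET state = ? WHERE id = ?", (state, transfer_id)
        )
        self.db.commit()

    def recover(self):
        """After a failure, requeue transfers that were still in flight."""
        self.db.execute(
            "UPDATE transfers SET state = 'pending' WHERE state = 'active'"
        )
        self.db.commit()

    def pending(self):
        """Return queued requests in submission order."""
        rows = self.db.execute(
            "SELECT id, src, dst FROM transfers "
            "WHERE state = 'pending' ORDER BY id"
        )
        return rows.fetchall()
```

A service built this way would call `recover()` once at startup and then work through `pending()`, marking each transfer `active` before starting it and `done` when it completes.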
RFT

[Diagram labels: Client, SOAP Messages, Notifications (Optional), RFT Service, Persistent Store, control channels (CC), data channel (DC), two GridFTP Servers]

New features in GridFTP
  • GridFTP information provider service
    • Max connections
    • Open connections
    • Load
  • Higher level services can utilize this information for scheduling data transfers
    • Help with selecting the appropriate replica of data
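As a rough illustration of how a higher-level service might use these three values to choose a replica, here is a small sketch. The `ServerInfo` record, the `pick_replica` function, and the selection rule (skip full servers, then prefer the lowest load) are assumptions for illustration, not part of the information provider's actual interface.

```python
from dataclasses import dataclass


@dataclass
class ServerInfo:
    """Hypothetical snapshot of the values a GridFTP server publishes."""
    host: str
    max_connections: int
    open_connections: int
    load: float


def pick_replica(servers):
    """Pick the least-loaded server that still has a free connection slot."""
    candidates = [s for s in servers if s.open_connections < s.max_connections]
    if not candidates:
        return None  # every replica is saturated; the caller should wait or retry
    # Lowest load wins; ties are broken by the fraction of slots in use.
    return min(
        candidates,
        key=lambda s: (s.load, s.open_connections / s.max_connections),
    )
```

A real scheduler would probably also weigh network distance and file size; the sketch uses only the values named on the slide.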
Concurrency

[Diagram labels: Client, two GridFTP Servers, control channels (CC), data channel (DC)]

Concurrency
  • Client submits concurrent transfer requests to the server
    • Significantly improves the performance of transfers involving many small files
    • APS used this feature to transfer 1 TB of data to Australia 30x faster than SCP
    • LIGO used this feature to transfer 1.5 TB of data from Milwaukee to Germany at 80 MB/s
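Why concurrency helps with many small files can be shown with a toy model. In the sketch below, `fake_transfer` is a stand-in that only sleeps to mimic per-file overhead; it does not talk to a GridFTP server, and both function names and the latency value are assumptions for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_transfer(name, latency=0.02):
    """Stand-in for one small-file transfer; sleeps to mimic fixed per-file cost."""
    time.sleep(latency)
    return name


def transfer_all(names, concurrency):
    """Run up to `concurrency` transfers at once; results keep the input order."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(fake_transfer, names))
```

With `concurrency=1` the per-file overhead adds up serially; with a higher value the overheads overlap, which is the effect the slide attributes to concurrent transfer requests.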
Bottleneck detection
  • Determine the bottleneck for the data transfer performance
  • Network read, network write, disk read, disk write
  • NetLogger is used to determine these values
  • NetLogger ships with Globus starting with version 4.2
    • ./configure --enable-netlogger
    • make gridftp globus_xio_netlogger_driver
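Once a throughput figure is available for each of the four stages, picking the bottleneck is a simple comparison. The sketch below assumes the measurements have already been collected into a dictionary; the `find_bottleneck` function and the dictionary keys are hypothetical, and the sketch does not parse or reproduce NetLogger output.

```python
def find_bottleneck(throughput_mb_s):
    """Return the (stage, rate) pair with the lowest sustained throughput.

    `throughput_mb_s` maps a stage name to its measured rate in MB/s.
    """
    stage = min(throughput_mb_s, key=throughput_mb_s.get)
    return stage, throughput_mb_s[stage]


# Illustrative numbers only, not measurements from the presentation.
sample = {
    "disk read": 220.0,
    "network write": 95.0,
    "network read": 95.0,
    "disk write": 60.0,
}
```

For the sample values, `find_bottleneck(sample)` reports the disk write stage, since it is the slowest of the four.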
Popen
  • Popen XIO driver
    • Allows users to open pipes to the standard I/O of existing programs
    • Lets you chain existing programs into a transfer, as you can with UNIX pipes
    • globus-gridftp-server -p 5000 -fs-whitelist popen,file,ordering -aa
    • globus-url-copy -dst-fsstack popen:argv=#/usr/bin/zip#/home/bresnaha/text.txt.zip#-,ordering ftp://localhost:5000/home/bresnaha/text.txt ftp://localhost:5000/y
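The general idea of opening a pipe to a program's standard I/O can be sketched in plain Python. This is an analogy only and does not use the XIO driver: `pipe_through` and `UPPERCASE_FILTER` are hypothetical names, and the filter is a tiny Python one-liner chosen so the example does not depend on any particular external tool being installed.

```python
import subprocess
import sys


def pipe_through(argv, data):
    """Write `data` to the program's stdin and return what it writes to stdout."""
    result = subprocess.run(argv, input=data, capture_output=True, check=True)
    return result.stdout


# A small filter run with the current interpreter; it upper-cases its input.
UPPERCASE_FILTER = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read().upper())",
]
```

In the slide's example the program in the pipe is `/usr/bin/zip`; here any command that reads stdin and writes stdout could take the place of `UPPERCASE_FILTER`.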
New features in RFT
  • Command line client in C
    • A new feature-rich, fast command-line client
    • globus-crft
  • GT4.2 RFT has more robust retry mechanisms
    • Helps prevent overheating in certain cluster configurations
Connection caching
  • Connections used to be cached only within a user's single transfer request; they are now cached across all of that user's transfer requests
  • This yields a dramatic performance improvement when a user submits multiple requests
  • Eliminate authentication overhead on the control and data channels
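The effect of caching across requests can be sketched with a small counter. The `ConnectionCache` class, its `(host, user)` key, and the placeholder connection object are assumptions for illustration; the real RFT cache and its eviction rules are not described in the slides.

```python
class ConnectionCache:
    """Hypothetical cache shared by all of a user's transfer requests."""

    def __init__(self):
        self._cache = {}
        self.authentications = 0  # counts how often a new connection was set up

    def get(self, host, user):
        """Return a cached connection, authenticating only on the first use."""
        key = (host, user)
        if key not in self._cache:
            self.authentications += 1
            # Placeholder for an authenticated control/data connection.
            self._cache[key] = object()
        return self._cache[key]
```

If three separate requests from the same user target the same server, `get` authenticates once and returns the same connection for the other two, which is the overhead the slide says is eliminated.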
Connection caching
  • Measured performance improvement for jobs submitted using Condor-G
  • For 500 jobs - each job requiring file stageIn, stageOut and cleanup (RFT tasks)
    • 30% improvement in overall performance
    • No timeout due to overwhelming connection requests to GridFTP servers