Efficient access to many small files in a grid filesystem



Efficient Access to Many Small Files in a Grid Filesystem

Douglas Thain and Christopher Moretti

University of Notre Dame


Efficient Access to Many Small (and Big) Files in a Grid Filesystem

Douglas Thain and Christopher Moretti

University of Notre Dame


Abstract

  • Many grid data tools focus on transferring, storing, and managing large (GB-TB) files.

  • But, many users need to manage, transfer, and process lots (1000s) of small (KB-MB) files.

  • We describe protocols and interfaces for manipulating many small files over wide area networks. (Doesn’t hurt large files, either.)

  • Implemented in the Chirp file system.

  • Performance:

    • Best case: order of magnitude improvement.

    • Worst case: no slower than before.



Who has lots of small files?

  • Anyone using a batch system.

    • One file for submit, input, output, error, log...

  • Anyone using a large software package.

    • Executables, libraries, config files...

  • Anyone using a filesystem like a database.

    • Genomics, astronomy, physics...

  • Anyone who likes to write shell scripts.

    • foreach host in list ssh $host > $host.output


Why is this a problem?

  • Users do the “sensible” thing:

    • foreach file in (list) do transfer done

  • The “sensible” thing performs miserably:

    • New TCP Connection

    • SSL Authentication

    • Configuration Operations

    • Slow Start Again

  • Result is KB/s on a GB/s link.


Why not just use tar?

  • If you can, you should!

  • Sometimes you cannot:

    • The system semantics demand multiple files.

    • Packing and unpacking can be very slow.

    • Not enough disk space to unpack.

    • Different apps select different data subsets.

    • Using an existing script or program.

  • Users don’t know or care that it’s a distributed system. Why should they change?


The Challenge: How to design interfaces so that users get the expected performance and behavior?


Chirp and Parrot: A Grid Filesystem


Requirements for a Grid Filesystem

  • Transparent access to files in the same manner as a local Unix filesystem.

  • Non-privileged deployment at both client and server. (Root access is not possible on the grid.)

  • User control over policies for naming, caching, consistency, and fault tolerance.

  • Flexible access controls for sharing.

  • Good performance on both small and large files.


Chirp/Parrot – A Grid Filesystem

  • Client: an ordinary Unix program runs under Parrot, which traps its Unix system calls with ptrace. No privileges needed!

  • Network: Parrot talks to the Chirp server over a single TCP stream, with automatic recovery.

  • Server: Chirp exports an ordinary Unix filesystem. No privileges needed!

  • Authentication: Kerberos / Globus / Hostname / Unix

  • Protocol:

    • open / pread / pwrite / close

    • stat / mkdir / rmdir / unlink

    • getfile / putfile / movefile

  • Authorization:

    • kerberos:[email protected] RWLDA

    • globus:/O=ND/CN=Joe RWLDA

    • hostname:*.nd.edu RL

    • group:server.nd.edu/team RWL
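The authorization entries above pair a subject (an authentication method plus a name, possibly with a wildcard) with a string of rights letters. The following Python sketch shows one way such a list could be evaluated. It is an illustration only: the glob-style matching through `fnmatchcase`, the rule that rights from all matching entries are combined, and the function names are assumptions, not Chirp's actual implementation. Group entries would need a membership lookup that this sketch does not attempt.

```python
from fnmatch import fnmatchcase

def parse_acl(text):
    """Parse lines of the form '<subject pattern> <rights letters>'."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        pattern, rights = line.rsplit(None, 1)
        entries.append((pattern, set(rights)))
    return entries

def rights_for(entries, subject):
    """Union of the rights of every entry whose pattern matches (assumed rule)."""
    granted = set()
    for pattern, rights in entries:
        if fnmatchcase(subject, pattern):
            granted |= rights
    return granted

def is_allowed(entries, subject, right):
    return right in rights_for(entries, subject)

# Entries copied from the slide; the subjects queried below are hypothetical.
ACL_TEXT = """
globus:/O=ND/CN=Joe RWLDA
hostname:*.nd.edu RL
group:server.nd.edu/team RWL
"""
acl = parse_acl(ACL_TEXT)
print(is_allowed(acl, "hostname:alpha.nd.edu", "R"))
print(is_allowed(acl, "hostname:alpha.nd.edu", "W"))
```

A host matched only by the `hostname:*.nd.edu` entry can read and list but not write, which is the kind of sharing policy the slide describes.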


Ordinary Unix Commands

> parrot tcsh

> ls /chirp

alpha.nd.edu

beta.nd.edu

...

> cd /chirp/alpha.nd.edu/mydir

> cp /tmp/bigdata .

> emacs mydata.txt


Parrot Specific Commands

> parrot tcsh

> parrot_whoami

globus:/O=ND/CN=Joe

> parrot_getacl /chirp/alpha.nd.edu/

kerberos:[email protected] RWLDA

globus:/O=ND/CN=Joe RWL

hostname:*.nd.edu RL


Chirp as Remote Filesystem

[Diagram: apps running under Parrot at Grid Site A and Grid Site B reach a single Chirp server that exports a Unix filesystem. Other labels: Cert, Secured by GSI, Grid Middleware.]


Chirp as Cluster Filesystem

[Diagram: apps running under Parrot at Grid Site A and Grid Site B reach four Chirp servers, each exporting its own Unix filesystem. Other labels: aux, db, dir, server.]



Sample Applications

  • Image Processing for Biometrics

    • Moretti et al, PCGRID 2007

  • Bioinformatics on EGEE

    • Blanchet et al, Grid 2006

  • High Energy Physics on LCG

    • Sfiligoi et al, CHEP 2005

  • Molecular Dynamics Repository

    • Wozniak et al, HPDC 2005

  • Remote DB Access on EDG

    • Klous et al, CCPE 2005



What About FTP?

  • FTP is a great data transfer system, but it was never designed to be a file system:

    • New TCP stream per data transfer.

    • New TCP stream for each directory list.

    • Lots of connections can overwhelm net devices.

    • Coarse errors: 550 for all file system errors.

    • Semantic problems: e.g. empty directory.

    • Unix access controls. (But see SecPAL.)

    • Wildly varying implementations and support.


FTP Protocol Reminder

  • Control connection (FTP client to FTP server): AUTH GSSAPI, MIC, MIC, PORT, RETR

  • Data connection: AUTH GSSAPI, MIC, MIC, data transfer

  • Minimum of four round trips (plus auth overhead) to fetch a file + loss of TCP window.

  • Common practice is new control connection for every data transfer!


What About NFS?

  • NFS was designed for a local area network among (relatively) trusted hosts.

    • Fine-grained file access very slow on WAN.

    • Kernel support and root assistance needed to start server, mount client, change target.

    • Unix UID for ownership, access control.

    • Need to bind to privileged port, often filtered.

    • Use of “file handles” to refer to files makes it very difficult to build a user-level server, and adds lots of lookup operations over the WAN.


NFS Protocol Reminder

  • NFS client to NFS server: lookup(00,a), lookup(10,b), lookup(20,c), ...

  • Then: read 4KB, read 4KB, read 4KB, ...

  • On a WAN, throughput is limited to 4 KB / latency:

    • 10 ms = 400 KB/s

    • 100 ms = 40 KB/s
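The two figures above follow from dividing the block size by the round-trip latency. A short Python sketch of that arithmetic, assuming one 4 KB read is outstanding at a time (the function name is made up for this illustration):

```python
def rpc_throughput_kb_per_s(block_kb, rtt_ms):
    """One block per round trip: throughput = block size / latency."""
    return block_kb * 1000.0 / rtt_ms

for rtt_ms in (10, 100):
    rate = rpc_throughput_kb_per_s(4, rtt_ms)
    print(f"{rtt_ms} ms round trip: {rate:.0f} KB/s")
```

The link bandwidth never appears in the formula, which is why a faster network does not help a block-at-a-time protocol on a high-latency path.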


Chirp Hybrid Protocol Overview

  • Chirp client to Chirp server, on one connection:

    • auth globus (8 RTT)

    • open, read, write, close, ...

    • getfile(“mydata”): the server replies with size and data

    • putfile(“otherdata”,size): the client follows the request with data
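The hybrid idea is that whole-file requests share one long-lived stream with the small RPC-style calls, so each file costs one request and one streamed reply. The Python sketch below imitates that pattern over a local socket pair. The wire format (a text request line, then a decimal size line, then raw bytes) is an assumption chosen for the illustration and is not claimed to be Chirp's actual encoding. Authentication is omitted.

```python
import socket
import threading

FILES = {"mydata": b"x" * 5000, "notes": b"hello"}  # stand-in server storage

def serve(conn):
    """Answer 'getfile <name>' requests until the client closes the stream."""
    rfile = conn.makefile("rb")
    for line in rfile:
        op, name = line.decode().split()
        data = FILES.get(name)
        if op != "getfile" or data is None:
            conn.sendall(b"-1\n")  # negative size signals an error
            continue
        conn.sendall(f"{len(data)}\n".encode() + data)  # size, then raw bytes
    rfile.close()
    conn.close()

def getfile(conn, rfile, name):
    """One request, one streamed reply, on the already-open connection."""
    conn.sendall(f"getfile {name}\n".encode())
    size = int(rfile.readline())
    if size < 0:
        raise FileNotFoundError(name)
    return rfile.read(size)

client, server = socket.socketpair()
worker = threading.Thread(target=serve, args=(server,))
worker.start()
rfile = client.makefile("rb")
# Both files travel over the same connection: no reconnect, no new slow start.
results = {name: getfile(client, rfile, name) for name in ("mydata", "notes")}
client.shutdown(socket.SHUT_WR)
worker.join()
rfile.close()
client.close()
print({name: len(data) for name, data in results.items()})
```

Because the size precedes the data, the client knows exactly how many bytes to read and the stream stays usable for the next request.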


Protocol Comparison

  • FTP - Stream per File

    • Latency = 4+ RTT for each file

    • Throughput = TCP limit after slow start

  • NFS – Remote Procedure Call

    • Latency = 1 RTT for each file

    • Throughput = block size / latency

  • Chirp - Hybrid

    • Latency = 1 RTT for each file

    • Throughput = TCP limit in steady state
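To make the comparison concrete, here is a rough Python cost model for fetching many files under each style. It is a simplification under stated assumptions: the stream-per-file case charges four round trips per file and ignores authentication and slow start, the RPC case charges one round trip to name the file plus one per 4 KB block, and the hybrid case charges one round trip per file. The workload numbers are invented for illustration and are not measurements from the talk.

```python
import math

def ftp_style(n_files, size_kb, rtt_s, bw_kb_s, setup_rtts=4):
    """Stream per file: several round trips of setup, then a streamed transfer."""
    return n_files * (setup_rtts * rtt_s + size_kb / bw_kb_s)

def rpc_style(n_files, size_kb, rtt_s, block_kb=4):
    """Remote procedure call: one round trip to name the file, then one per block."""
    blocks = math.ceil(size_kb / block_kb)
    return n_files * (rtt_s + blocks * rtt_s)

def hybrid_style(n_files, size_kb, rtt_s, bw_kb_s):
    """One request per file on an already-open stream, then a streamed transfer."""
    return n_files * (rtt_s + size_kb / bw_kb_s)

# Hypothetical workload: 1000 files of 100 KB, 100 ms round trip, 100 MB/s link.
n, size_kb, rtt_s, bw_kb_s = 1000, 100, 0.1, 100_000
estimates = {
    "stream per file": ftp_style(n, size_kb, rtt_s, bw_kb_s),
    "rpc per block": rpc_style(n, size_kb, rtt_s),
    "hybrid": hybrid_style(n, size_kb, rtt_s, bw_kb_s),
}
for name, seconds in estimates.items():
    print(f"{name:16s} {seconds:8.1f} s")
```

In this model the per-file round trips dominate for small files, so the number of round trips matters far more than the link bandwidth.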






Standard Unix Copy

cp /tmp/source /chirp/B/target

  • cp: open(source), open(target), loop: read/write

  • Parrot, local side: open(source) and read against the local disk

  • Parrot, Chirp side: open and write sent to the Chirp server


Problem: The system does not know the context of the operation!

Solution: Introduce a higher-level operation, copyfile, that exploits the context.


Improved Copy with Copyfile

cp /tmp/source /chirp/B/target

  • new cp: copyfile(source,target)

  • Parrot, local side: open(source) against the local disk

  • Parrot, Chirp side: putfile(target) sent to the Chirp server


Is it reasonable to modify cp?

  • Installation:

    • Cannot modify /bin/cp.

    • Install a new parrot_cp.

    • Alias cp or link named “cp” in PATH.

  • Backwards compatibility:

    • parrot_cp without Parrot falls back to normal.

    • Ordinary cp on Parrot behaves as before.

    • parrot_cp on a different filesystem falls back.
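The backwards-compatibility rules above amount to "try the whole-file operation, and use the ordinary loop if it is unavailable." A minimal Python sketch of that fallback follows. The `whole_file_copy` hook is hypothetical and stands in for a copyfile-style call; the choice of `ENOSYS` and `EXDEV` as the "not supported here" signals is also an assumption, not the behavior of the real parrot_cp.

```python
import errno
import os
import shutil
import tempfile

def whole_file_copy(src, dst):
    """Hypothetical hook for a copyfile-style operation.

    This stand-in always reports 'not supported', as if Parrot were absent.
    """
    raise OSError(errno.ENOSYS, "copyfile not supported here")

def smart_cp(src, dst, fast_path=whole_file_copy):
    """Try the whole-file operation; fall back to an ordinary read/write loop."""
    try:
        fast_path(src, dst)
        return "copyfile"
    except OSError as e:
        if e.errno not in (errno.ENOSYS, errno.EXDEV):
            raise  # a real failure, not a missing feature
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return "read/write"

with tempfile.TemporaryDirectory() as d:
    src, dst = os.path.join(d, "source"), os.path.join(d, "target")
    with open(src, "wb") as f:
        f.write(b"payload")
    how = smart_cp(src, dst)
    with open(dst, "rb") as f:
        copied = f.read()
print(how, copied)
```

The caller sees the same result either way, which is what lets the replacement command be aliased to cp without surprising anyone.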


Improved Copy with Copyfile

cp /chirp/A/source /chirp/B/target

  • new cp: copyfile(source,target)

  • Parrot to Chirp server A: thirdput(source,B,target)

  • Chirp server A to Chirp server B: putfile(target)
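With a third-party transfer, the client sends only a request and the file data moves directly between the two servers. The Python sketch below models this with in-memory stand-ins and counts how many bytes cross the client's link in each case. The class and method names mirror the operations on the slide, but the code is an illustration under that assumption, not Chirp's implementation.

```python
class Server:
    """In-memory stand-in for a file server."""
    def __init__(self, name):
        self.name = name
        self.files = {}

    def getfile(self, path):
        return self.files[path]

    def putfile(self, path, data):
        self.files[path] = data

    def thirdput(self, path, target_server, target_path):
        """Push a local file directly to another server."""
        target_server.putfile(target_path, self.files[path])

class Client:
    def __init__(self):
        self.bytes_through_client = 0

    def copy_via_client(self, a, src, b, dst):
        data = a.getfile(src)
        self.bytes_through_client += len(data)  # downloaded from A
        b.putfile(dst, data)
        self.bytes_through_client += len(data)  # uploaded to B

    def copy_third_party(self, a, src, b, dst):
        a.thirdput(src, b, dst)  # only the request crosses the client's link

a, b = Server("A"), Server("B")
a.putfile("/source", b"d" * 1_000_000)

relay = Client()
relay.copy_via_client(a, "/source", b, "/target1")

direct = Client()
direct.copy_third_party(a, "/source", b, "/target2")

print(relay.bytes_through_client, direct.bytes_through_client)
```

Both copies leave identical data on server B, but only the relayed copy pays for the file twice on the client's network path.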


Directory Copy

cp -r /chirp/A/mydir /chirp/B/mydir

  • mydir on Chirp server A holds an ACL and the files X, Y, and Z.

  • cp, through Parrot, issues one operation at a time:

    • mkdir(mydir)

    • setacl(mydir)

    • thirdput(/mydir/X,B,/mydir/X)

    • thirdput(/mydir/Y,B,/mydir/Y)

    • thirdput(/mydir/Z,B,/mydir/Z)

  • Result: mydir on Chirp server B holds the ACL and the files X, Y, and Z.

Improved Directory Copy

cp -r /chirp/A/mydir /chirp/B/mydir

  • cp, through Parrot, issues a single operation to Chirp server A: thirdput(/mydir,B,/mydir)

  • Chirp server A to Chirp server B: mkdir, putfile*3, setacl

  • Result: mydir on Chirp server B holds the ACL and the files X, Y, and Z.


You get the idea...

  • ls -la D

    • Original: getdir D + N*stat

    • Improved: getlongdir D

  • rm -rf D

    • Original: getdir D + N*unlink (recursive)

    • Improved: rmall D

  • md5sum F

    • Original: open F + N*read + close

    • Improved: md5 F
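Each "improved" operation above replaces 1 + N requests with a single one. A small Python sketch with a request-counting stand-in server shows the difference for the ls -la case. The class is invented for this illustration; only the operation names getdir, stat, and getlongdir come from the slide.

```python
class CountingServer:
    """In-memory stand-in that counts how many requests it receives."""
    def __init__(self, entries):
        self.entries = dict(entries)  # name -> size in bytes
        self.requests = 0

    def getdir(self):
        self.requests += 1
        return sorted(self.entries)

    def stat(self, name):
        self.requests += 1
        return {"size": self.entries[name]}

    def getlongdir(self):
        self.requests += 1
        return {n: {"size": s} for n, s in sorted(self.entries.items())}

def ls_la_original(server):
    """getdir D, then one stat per entry."""
    return {name: server.stat(name) for name in server.getdir()}

def ls_la_improved(server):
    """A single getlongdir D returns names and metadata together."""
    return server.getlongdir()

entries = {f"file{i:04d}": i for i in range(1000)}

original = CountingServer(entries)
listing_a = ls_la_original(original)

improved = CountingServer(entries)
listing_b = ls_la_improved(improved)

print(original.requests, improved.requests)
```

Over a wide-area link, each avoided request is an avoided round trip, so the saving grows linearly with the size of the directory.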


Final Example

ls -la /chirp/alpha/data

md5sum /chirp/alpha/data/*

cp -r /chirp/alpha/data /chirp/beta/data

md5sum /chirp/beta/data/*

rm -rf /chirp/alpha/data


Original Implementation

[Diagram: app, Parrot, Chirp server A, Chirp server B. Operations shown toward server A: ls -la, md5, cp, rm. Operations shown toward server B: cp, md5.]


Improved Implementation

[Diagram: app, Parrot, Chirp server A, Chirp server B. Operations shown: ls -la, md5, cp, rm, md5.]



The Challenge: How to design interfaces so that users get the expected performance and behavior?


Summary

  • Good small-file performance requires attention to low-level network protocols.

    • getfile, putfile, thirdput, rmall, checksum

  • Exploiting protocols requires minor changes to the Unix I/O interface.

    • copyfile, rmall, checksum, others?

  • Easy to apply those changes in a user-transparent way.

    • cp, rm, md5sum all operate as normal

  • Usable performance in a wide-area FS.


For more information...

  • Douglas Thain

    • [email protected]

  • Chris Moretti

    • [email protected]

  • Parrot and Chirp

    • http://www.cctools.org