
Grid-Based File Access: The Legion I/O Model

Brian S. White, Andrew S. Grimshaw, Anh Nguyen-Tuong. Department of Computer Science, University of Virginia.


Presentation Transcript


  1. Grid-Based File Access: The Legion I/O Model • Brian S. White, Andrew S. Grimshaw, Anh Nguyen-Tuong • Department of Computer Science, University of Virginia

  2. Overview • Overview of Legion • I/O User Interface • Server Implementation • Performance

  3. Design Principles of Legion • Object-based grid operating system • a single virtual system • a collection of heterogeneous resources • extensible and replaceable components • Specify the functionality (interface), not the implementation • reference implementations are provided for core objects only • users are encouraged to substitute their own implementations for specific requirements

  4. Grid File Systems • Performance • Low response times • High throughput • Usability • Minimize changes to legacy code • An interface that hides the grid environment • A rich interface that takes full advantage of the grid's potential

  5. Naming and Binding • Three-level naming hierarchy • Name space is hierarchical and rooted • Human-readable context names • Location-independent identifiers (LOIDs, Legion Object Identifiers), reached via ContextObjects • Object addresses • [Figure labels: Context Name, Context Space, LOID, Binding Process, Object Address]
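
The three naming levels on this slide can be pictured as two successive lookups. The sketch below is only an illustration under stated assumptions: the dictionaries, the LOID string, the host names and the `resolve`/`migrate` helpers are all hypothetical and are not Legion data structures or APIs.

```python
# Illustrative sketch only: every table, value and function here is hypothetical.

# Level 1 -> 2: context space maps human-readable context names to LOIDs.
context_space = {
    "/User1/ContextA/Foo": "loid-0001",
    "/User1/Alias/Foo": "loid-0001",  # one object reachable under two names
}

# Level 2 -> 3: the binding process maps a LOID to the object's current address.
bindings = {
    "loid-0001": ("host-a.example.org", 7001),
}


def resolve(context_name):
    """Resolve a context name to an object address in two lookups."""
    loid = context_space[context_name]  # context name -> LOID
    return bindings[loid]               # LOID -> object address


def migrate(loid, new_address):
    """Moving an object updates only its binding; names and LOID are unchanged."""
    bindings[loid] = new_address


before = resolve("/User1/ContextA/Foo")
migrate("loid-0001", ("host-b.example.org", 7002))
after = resolve("/User1/Alias/Foo")
```

Because names refer to a location-independent identifier rather than to an address, both context names keep working after the object moves.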

  6. I/O User Interface • Familiar Interfaces • Command Line Utilities • e.g. legion_ls, legion_cat, etc. • Remote I/O Interface • Basic I/O Interface • raw and buffered • Low impact buffered interface • Legion-aware nfsd • Parallel I/O Interface

  7. Command Line Utilities • Use commands to navigate context space and manipulate its structure • UNIX-like commands: • legion_ls, legion_cat, etc. • Bridging the gap between context space and a traditional file system • legion_import_tree • copies a local directory tree, creating a Legion object for each subdirectory and file • legion_export_dir • makes a UNIX directory visible in context space, without creating stand-alone objects for each contained file and subdirectory

  8. Basic I/O Interface • C and C++ I/O library • Follows the naming and argument conventions of the C counterparts • Encapsulates the lower-level BasicFileObject primitives • Raw I/O interface • fine-grained file accesses are relatively expensive • unoptimized Legion protocol stack • Buffered I/O interface • C, C++ and Fortran • reduces the frequency of remote procedure calls
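
To see why buffering reduces the frequency of remote procedure calls, consider the minimal sketch below. It is not the Legion library: `RemoteFile` is a hypothetical stand-in that merely counts simulated remote calls, and `BufferedReader` and the 4096-byte block size are assumptions chosen for illustration.

```python
class RemoteFile:
    """Hypothetical stand-in for a remote file object; counts simulated calls."""

    def __init__(self, data):
        self.data = data
        self.calls = 0

    def read(self, offset, size):
        self.calls += 1  # each call models one remote procedure call
        return self.data[offset:offset + size]


class BufferedReader:
    """Serves small reads from one cached block fetched in a single call."""

    def __init__(self, remote, block_size=4096):
        self.remote = remote
        self.block_size = block_size
        self.block_start = None
        self.block = b""

    def read(self, offset, size):
        out = bytearray()
        while size > 0:
            start = (offset // self.block_size) * self.block_size
            if start != self.block_start:
                # Cache miss: fetch the whole enclosing block at once.
                self.block = self.remote.read(start, self.block_size)
                self.block_start = start
            within = offset - start
            chunk = self.block[within:within + size]
            if not chunk:
                break  # end of file
            out += chunk
            offset += len(chunk)
            size -= len(chunk)
        return bytes(out)


payload = bytes(range(256)) * 40  # 10240 bytes of sample data

raw = RemoteFile(payload)
raw_result = b"".join(raw.read(i, 1) for i in range(len(payload)))

backing = RemoteFile(payload)
reader = BufferedReader(backing)
buffered_result = b"".join(reader.read(i, 1) for i in range(len(payload)))
```

Reading the sample data one byte at a time costs one simulated call per byte through the raw path, but only one call per 4096-byte block through the buffered path.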

  9. Low Impact Buffered Interface • Gives users the option of making the smallest number of changes to their program while still using Legion file objects • Transfers the contents of BasicFileObjects to and from the local file system • lio_legion_to_tempfile • lio_tempfile_to_legion • Similar to GASS staging
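
The staging pattern behind this interface (copy in, run unmodified local-file code, copy out) can be sketched as follows. The slide does not give the signatures of lio_legion_to_tempfile or lio_tempfile_to_legion, so the sketch does not call them: `stage_in`, `stage_out`, `remote_store` and `legacy_uppercase` are hypothetical stand-ins, with a dictionary playing the role of the remote file objects.

```python
import os
import tempfile

# Hypothetical stand-in for remote file objects, keyed by context name.
remote_store = {"/User1/ContextA/Foo": b"hello grid\n"}


def stage_in(name):
    """Copy a remote object's contents into a local temporary file."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(remote_store[name])
    return path


def stage_out(path, name):
    """Copy the local file back to the remote object and remove the copy."""
    with open(path, "rb") as f:
        remote_store[name] = f.read()
    os.remove(path)


def legacy_uppercase(path):
    """'Legacy' code that only knows how to work on local files."""
    with open(path, "rb") as f:
        data = f.read()
    with open(path, "wb") as f:
        f.write(data.upper())


local_path = stage_in("/User1/ContextA/Foo")     # copy-in
legacy_uppercase(local_path)                     # unchanged application logic
stage_out(local_path, "/User1/ContextA/Foo")     # copy-out
```

Only the two staging calls are added around the existing code, which is the appeal of the approach; the cost is that the whole file is copied in each direction.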

  10. Legion-aware nfsd (lnfsd) • acts as an ordinary NFS server between the kernel client and Legion • receives NFS requests from the kernel • translates them into Legion method invocations • returns results to the client in NFS format • Performance Issues • use larger granules to reduce overhead • read-ahead (asynchronous prefetch) • asynchronous write-behind

  11. Legion-aware nfsd • Security • Legion users on a Legion host trust privileged processes on that host • User credentials are stored in the /tmp file system • lnfsd only accepts connections made from a reserved port • The NFS client and lnfsd are collocated on the same host

  12. Parallel I/O Interface • Synchronous or asynchronous interface • Allow user-specified striping of data across BasicFileObjects • Allow multiple clients to access the data without contending with one another at a central server object • Individual client performance benefits because multiple BasicFileObjects may be accessed concurrently to deliver the desired data
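
One way to picture striping across several file objects is a mapping from a logical file offset to an (object index, local offset) pair. The slide says the striping is user-specified and does not fix a layout, so the round-robin scheme, the function names and the parameters below are assumptions made purely for illustration.

```python
def locate(offset, stripe_size, num_objects):
    """Map a logical byte offset to (object index, offset within that object),
    assuming fixed-size stripes dealt out round-robin."""
    stripe = offset // stripe_size
    obj = stripe % num_objects
    local = (stripe // num_objects) * stripe_size + offset % stripe_size
    return obj, local


def split_request(offset, length, stripe_size, num_objects):
    """Break one logical request into per-object pieces (object, local offset,
    byte count) that could then be issued concurrently."""
    pieces = []
    while length > 0:
        obj, local = locate(offset, stripe_size, num_objects)
        n = min(length, stripe_size - offset % stripe_size)
        pieces.append((obj, local, n))
        offset += n
        length -= n
    return pieces


# An 8-byte request starting at offset 2, with 4-byte stripes over 3 objects.
pieces = split_request(2, 8, 4, 3)
```

In this example the request touches three different objects, so no single server object has to deliver all of the data.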

  13. Server Implementation • BasicFileObject • each object represents exactly one file • persistently stored in a LegionBuffer, a random-access array • LegionBuffer is stored in a VaultObject (storage server) • ProxyMultiObject • serves both ContextObject and BasicFileObject • changes to a BasicFileObject’s contents are immediately and automatically reflected to the user’s file

  14. BasicFileObject

  15. ProxyMultiObject

  16. Performance • Compared with standard ftp transfer and Globus GASS • Environment • an SGI Origin 2000, 56 processors, Irix 6.5, at NCSA in Champaign, Illinois • a dual-processor 400 MHz Pentium II, Linux kernel 2.2.14, at the University of Virginia in Charlottesville, Virginia • OC-12 backbone and OC-3 intermediate connection • Experiment • Data is read from (and written to) an NFS mount on the SGI Origin array • Security option is disabled

  17. Protocol Overhead • The fixed cost of connection setup and tear-down

  18. Protocol Overhead • Significant overhead in Legion mechanism • location-independent naming scheme which requires several context name/LOID translations and LOID/OA bindings • an expensive remote call to BasicFileObject

  19. Bandwidth Measurements • File transfers of various sizes • Result summary • Achieves 70-85% of ftp's read bandwidth for bulk transfers • Achieves 55-65% of ftp's write bandwidth • Suffers from its protocol overhead for sizes less than 1 MB
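
A simple cost model shows why a fixed per-transfer overhead hurts small files most: if a transfer costs a fixed setup time plus size divided by peak bandwidth, the achieved bandwidth approaches the peak only as the size grows. The model and the numbers used below (0.5 s of overhead, a 10 MB/s peak) are illustrative assumptions, not measurements from the presentation.

```python
def effective_bandwidth(size_bytes, fixed_overhead_s, peak_bytes_per_s):
    """Achieved bandwidth when every transfer pays a fixed setup cost."""
    total_time = fixed_overhead_s + size_bytes / peak_bytes_per_s
    return size_bytes / total_time


# Hypothetical parameters chosen only to show the shape of the curve.
OVERHEAD_S = 0.5
PEAK = 10e6  # bytes per second

small = effective_bandwidth(1e6, OVERHEAD_S, PEAK)    # 1 MB transfer
large = effective_bandwidth(100e6, OVERHEAD_S, PEAK)  # 100 MB transfer
small_fraction = small / PEAK
large_fraction = large / PEAK
```

Under these made-up parameters the 1 MB transfer reaches only a small fraction of the peak, while the 100 MB transfer comes close to it, which matches the qualitative trend reported on the slide.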

  20. Legion Read Bandwidth • low impact interface is similar to basic I/O interface • lnfs throughput is limited because • lnfs periodically queries the remote BasicFileObject to satisfy GETATTR requests • the maximum transfer size is restricted to a 4K page

  21. Legion Write Bandwidth • low impact interface is worse than basic I/O interface and lnfs, since the file must first be copied, then modified, and then written back

  22. Read Bandwidth • ftp and GASS have similar performance • basic I/O and low impact I/O reach 70-85% of ftp bandwidth; the gap is due to the extensible protocol stack • lnfs lags significantly due to more RPC calls and a sophisticated block cache

  23. Write Bandwidth • Legion basic I/O outperforms GASS and low impact I/O because it does not need copy-in/copy-out • lnfs lags significantly due to its sophisticated block cache

  24. References • The Core Legion Object Model. M. Lewis, A. Grimshaw. Technical Report, August 1995 • Architectural Support for Extensibility and Autonomy in Wide-Area Distributed Object Systems. A. S. Grimshaw, M. J. Lewis, A. J. Ferrari, J. F. Karpovich. Technical Report, June 3, 1998 • GASS: A Data Movement and Access Service for Wide Area Computing Systems. J. Bester, I. Foster, C. Kesselman, J. Tedesco, S. Tuecke. Proceedings of the Sixth Workshop on Input/Output in Parallel and Distributed Systems, May 1999 • A Flexible Security System for Metacomputing Environments. A. Ferrari, F. Knabe, M. Humphrey, S. Chapin, A. Grimshaw. Technical Report, December 1998 • The Legion Research Group. Legion 1.7 user manual • The Legion Research Group. Legion 1.7 development manual

  25. Related Work • NFS • artificially restricted transfer sizes based on the virtual memory architecture • cache consistency mechanism limits throughput • AFS • transfers and copies entire files • unnecessary traffic when a dataset is partitioned among multiple distributed workers • cache consistency mechanism limits throughput

  26. Related Work • Globus GASS • remote access through • x-gass • ftp • HTTP • whole-file caching • streaming append operation • specialized calls are cumbersome

  27. Legion Interface • Method calls are non-blocking and may be accepted in any order by the called object • Legion class interface can be described in an Interface Description Language. CORBA IDL and MPL are initially supported.

  28. Legion I/O Objects • HostObjects • computational resources running in a Legion system • VaultObjects • persistent storage for inactive Legion objects • BasicFileObjects • correspond to files in a conventional file system • ContextObjects • analogous to a distributed, rooted directory tree

  29. Legion Context Space

  30. Context Name • Context names • Use a UNIX-like structure • e.g. /User1/ContextA/Foo • A single object may have multiple context names

  31. GASS Staging

  32. Duplicate Request Cache • A short-term memory mechanism in which the original completion status of a request is remembered and the operation is attempted only once • If a duplicate copy of the request is received, the original completion status is returned
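
The behaviour described on this slide can be sketched as a small bounded map from request identifiers to completion statuses. This is a minimal illustration, not lnfsd code: the class name, the `request_id` key, the `max_entries` bound and the oldest-first eviction policy are all assumptions.

```python
from collections import OrderedDict


class DuplicateRequestCache:
    """Remembers the completion status of recent requests (hypothetical sketch)."""

    def __init__(self, max_entries=128):
        self.max_entries = max_entries
        self._status = OrderedDict()

    def execute(self, request_id, operation):
        """Run the operation once per request id; replay the status for duplicates."""
        if request_id in self._status:
            return self._status[request_id]  # duplicate: do not re-execute
        status = operation()
        self._status[request_id] = status
        if len(self._status) > self.max_entries:
            self._status.popitem(last=False)  # "short-term": forget the oldest
        return status


attempts = []


def remove_file():
    """A non-repeatable operation: running it twice would give a different result."""
    attempts.append("remove")
    return "OK" if len(attempts) == 1 else "ENOENT"


cache = DuplicateRequestCache()
first = cache.execute("xid-42", remove_file)
retry = cache.execute("xid-42", remove_file)  # e.g. a retransmitted request
```

The retransmitted request receives the original status instead of the error a second execution would produce.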

  33. LegionBuffer • The fundamental data container in the Legion library. LegionBuffer exports operations to read and write data from and to a logical buffer • Implementations for in-memory buffers, UNIX file buffers and Legion file objects • Can compress or encrypt data
