Reliable I/O on the Grid

This paper discusses the problem of "half-interactive" jobs on the grid and proposes a solution using the Grid Console and Kangaroo, a user-level data movement system. The prototype demonstrates improved reliability and throughput for I/O operations.

Presentation Transcript


  1. Reliable I/O on the Grid • Douglas Thain and Miron Livny • Condor Project • University of Wisconsin

  2. Outline • A Practical Problem • Half-Interactive Jobs • Solution: The Grid Console • Philosophical Musings • A New System: Kangaroo

  3. Problem: “Half-Interactive” Jobs • Users want to submit batch jobs to the Grid, but still be able to monitor the output interactively. • But network failures are expected as a matter of course, so keeping the job running takes priority over getting output. • Examples: • INFN: Collider event simulation and reconstruction with CMS • NCSA: Modelling with Gaussian

  4. Existing Tools are not Sufficient • Installing a uniform world-wide DFS is not feasible. Even if it were: • NFS: disconnect causes delay • AFS: close() can fail?!? • Condor • Vanilla: dependent on file system. • Standard: disconnect causes rollback. • GASS • Staging mode: no incremental output. • Append mode: no easy failure recovery.

  5. Solution: The Grid Console • Trap reads and writes on stdio and send them via RPCs to be executed at the home site. • If connection is lost, just keep writing to disk but retry connection periodically. • If re-made, send all spooled data back and then continue operation.
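The spool-and-retry behaviour described on this slide can be sketched roughly as follows. This is a minimal illustration in Python, not the Grid Console's actual code: `SpoolingWriter`, `FlakyLink`, and the spool file name are hypothetical, and the `send` callable stands in for the RPC channel to the home site.

```python
import os
import tempfile


class FlakyLink:
    """Hypothetical stand-in for the RPC channel: fails while `up` is False."""

    def __init__(self):
        self.up = True
        self.received = b""

    def send(self, data: bytes) -> None:
        if not self.up:
            raise ConnectionError("link down")
        self.received += data


class SpoolingWriter:
    """Forward writes to the home site; spool to local disk while disconnected."""

    def __init__(self, send, spool_path=None):
        self.send = send  # assumed to raise ConnectionError when the link is down
        self.spool_path = spool_path or os.path.join(tempfile.mkdtemp(), "gc.spool")
        self.connected = True

    def write(self, data: bytes) -> None:
        if self.connected:
            try:
                self.send(data)
                return
            except ConnectionError:
                self.connected = False
        # Disconnected: do not fail the job, just append to the spool file.
        with open(self.spool_path, "ab") as spool:
            spool.write(data)

    def retry(self) -> bool:
        """Call periodically: send all spooled data back, then resume direct sends."""
        if self.connected:
            return True
        spooled = b""
        if os.path.exists(self.spool_path):
            with open(self.spool_path, "rb") as spool:
                spooled = spool.read()
        try:
            if spooled:
                self.send(spooled)
        except ConnectionError:
            return False  # still down; keep the spool and try again later
        if os.path.exists(self.spool_path):
            os.remove(self.spool_path)
        self.connected = True
        return True
```

In this sketch `retry()` drains the spool synchronously, so the caller waits while spooled data is sent back.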

  6. Solution: The Grid Console [Diagram. Labels: Execution Site; Storage Site; APP; stdin, stdout, stderr; other files; FILE SYSTEM; BYPASS; existing storage system (NFS, AFS, GASS, etc.); GC AGENT; SPOOL DIR; RPC on TCP; Globus Auth; GC SHADOW]

  7. Observations on the Grid Console • Interfaces well with existing systems: • Applied to vanilla Condor(G) jobs. • Works on any dynamically-linked program. • Undesired properties: • Only applies to standard streams. • Job is blocked during recovery mode. • Strange property: • Disconnected mode might be faster than connected mode! • Can we have it both ways?

  8. Philosophical Musings • What have we done? • Hidden errors • Job is not designed to deal with unusual error conditions: • Write -> disconnected? • Close -> host not found? • Hidden latency • Job is not designed to deal with slow I/O. It assumes that I/O ops are low latency, or at least appear to be. • GC could be better at this.

  9. Philosophical Musings, #2 • These problems are one and the same: • Hiding errors: Retry, report the error to a third party, and use another resource to satisfy the request. • Hiding latency: Use another resource to satisfy the request in the background, but if an error occurs, there is no channel to report it. • Reliability is not a binary property. • A slow link can be just as damaging to throughput as a disconnection.
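The error-hiding recipe on this slide (retry, report to a third party, fall back to another resource) might look something like the sketch below. The function name `hide_errors`, the retry count, and the `report` callback are illustrative assumptions, not part of the system described in the talk.

```python
def hide_errors(operation, resources, report, attempts_per_resource=2):
    """Try `operation` on each resource in turn, retrying before moving on.

    Failures go to `report` (the "third party"), not to the caller.
    Only when every resource is exhausted does the caller see an error.
    """
    for resource in resources:
        for attempt in range(1, attempts_per_resource + 1):
            try:
                return operation(resource)
            except OSError as exc:
                # Report the failure elsewhere and keep going.
                report(f"{resource}: attempt {attempt} failed: {exc}")
    raise RuntimeError("all resources exhausted")
```

A caller would pass something like a log's `append` method as `report`, so that an operator can inspect failures the job itself never sees.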

  10. Philosophical Musings, #3 • A traditional OS deals with these same problems when it uses memory to buffer disk operations. • Let’s apply the same principle to the Grid: Use memory and disk to satisfy unscheduled I/O operations in the background.
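As a rough illustration of the buffering principle, the sketch below accepts writes into an in-memory queue and delivers them from a background thread, so the caller sees low latency even when the sink is slow. `BackgroundWriter`, its `push()` method, and the buffer size are assumptions made for this example. A fuller system would also spill to disk and retry failed blocks rather than only recording the error.

```python
import queue
import threading


class BackgroundWriter:
    """Buffer writes in memory and deliver them in the background."""

    def __init__(self, deliver, max_pending=64):
        self._deliver = deliver  # hypothetical slow sink, e.g. a network send
        self._pending = queue.Queue(maxsize=max_pending)
        self.errors = []  # failures are recorded here, not raised to the job
        worker = threading.Thread(target=self._drain, daemon=True)
        worker.start()

    def write(self, block) -> None:
        # Returns as soon as the block is buffered; blocks only if the buffer is full.
        self._pending.put(block)

    def _drain(self) -> None:
        while True:
            block = self._pending.get()
            try:
                self._deliver(block)
            except Exception as exc:
                self.errors.append(exc)
            finally:
                self._pending.task_done()

    def push(self) -> None:
        """Wait until every buffered block has been handed to the sink."""
        self._pending.join()
```

A single worker thread and a FIFO queue keep blocks in the order they were written.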

  11. Introducing Kangaroo - A user-level data movement system that ‘hops’ files piecemeal from node to node on the Grid. - A background process that will ‘fight’ for your jobs’ I/O needs. - A ‘damage control’ specialist that will give errors to a third party but never admit failure to the job.

  12. Our Vision: A Grid Data Movement System [Diagram. Labels: App; File System (×4); Disk; K nodes (×7)]

  13. Kangaroo Prototype • We have built a first-try Kangaroo that validates the central ideas of error and latency hiding. • Emphasis on high-level reliability and throughput, not on low-level optimizations. • First, work to improve writes, but leave room in the design to improve reads.

  14. User Interface • Like the GC, attach standard applications with Bypass. • A tool for trapping UNIX I/O operations and routing them through new code. • Works on any dynamically-linked, unmodified program. • Examples: • setenv LD_PRELOAD pfs_agent.so • vi kangaroo://coral.cs.wisc.edu/etc/hosts • gcc gsiftp://ftp/input.c -o kangaroo://host/out

  15. Kangaroo Prototype [Diagram. Labels: Execution Site; Storage Site; APP; BYPASS; KANGAROO AGENT; Reads; Writes; SPOOL DIR; K MOVER; K SERVER (×2); FILE SYSTEM]

  16. Microbenchmark: File Transfer • Create a large output file at the execution site, and send it to a storage site. • Ideal conditions: No competition for CPU, network, or disk bandwidth. • Three methods: • Stream output directly to target. • Stage output to disk, then copy to target. • Kangaroo

  17. Macrobenchmark: Image Processing • Post-processing of satellite image data: Need to compute various enhancements and produce output for each. • Read input image • For I=1 to N • Compute transformation of image • Write output image • Example: • Image size about 5 MB • Compute time about 6 sec • I/O-to-CPU ratio 0.91 MB/s

  18. I/O Models for Image Processing [Timeline diagram] • Offline I/O: INPUT | CPU | CPU | CPU | CPU | OUTPUT | OUTPUT | OUTPUT | OUTPUT • Online I/O: INPUT | CPU | OUTPUT | CPU | OUTPUT | CPU | OUTPUT | CPU | OUTPUT • Current Kangaroo: INPUT | CPU | CPU | CPU | CPU | PUSH, with OUTPUT | OUTPUT | OUTPUT | OUTPUT
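To see why overlapping output with computation helps, here is a back-of-the-envelope timing model. It is a simplification and not the benchmark from the talk. It takes the approximate image size (5 MB) and compute time (6 s) from the macrobenchmark slide. The 1 MB/s link, N = 4 iterations, and the assumption that background output costs the CPU nothing are all invented for the example.

```python
def serial_time(n, t_in, t_cpu, t_out):
    """Input, then n compute steps and n outputs with no overlap.

    Covers both the offline and online orderings, which differ only in
    where the outputs fall, not in the total under this simple model.
    """
    return t_in + n * (t_cpu + t_out)


def overlapped_time(n, t_in, t_cpu, t_out):
    """Each output is sent in the background while the next step computes.

    Output i can start only after compute i and output i-1 have finished,
    so the slower of the two stages sets the pace; the last output is
    never hidden.
    """
    return t_in + t_cpu + (n - 1) * max(t_cpu, t_out) + t_out


IMAGE_MB = 5.0      # from the slide (approximate)
COMPUTE_S = 6.0     # from the slide (approximate)
LINK_MB_S = 1.0     # assumed link bandwidth, for illustration only
N = 4               # assumed number of output images

t_io = IMAGE_MB / LINK_MB_S  # seconds to move one image over the link
serial = serial_time(N, t_io, COMPUTE_S, t_io)
overlapped = overlapped_time(N, t_io, COMPUTE_S, t_io)
```

Under these assumed numbers the serial models take 49 s and the overlapped model 34 s, because three of the four transfers are hidden behind computation.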

  19. Summary of Results • At the micro level, our prototype provides reliability with reasonable performance. • At the macro level, I/O overlap gives reliability and speedups (for some applications.) • Kangaroo allows the application to survive on its real I/O needs: .91 MB/s. Without it, there is ‘false pressure’ to provide fast networks.

  20. Research Problems • Virtual Memory • A K-node has one input, one output, and a memory/disk buffer. How should we move data to maximize throughput? • File System • Existing spool directory is clumsy and inefficient. Need a fs optimized for 1-write, 1-read, 1-delete. • Fine-Grained Scheduling • Reads should have priority over writes. This is easy at one node, but multiple nodes?
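The single-node case of "reads before writes" can be sketched with a priority queue, as below. `IoScheduler` and its method names are hypothetical, and the sketch says nothing about the harder multi-node problem raised on the slide.

```python
import heapq
import itertools

READ, WRITE = 0, 1  # lower value is served first


class IoScheduler:
    """Serve pending reads before writes; FIFO within each kind."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker that preserves arrival order

    def submit(self, kind, request) -> None:
        heapq.heappush(self._heap, (kind, next(self._seq), request))

    def next_request(self):
        """Return (kind, request) for the next operation, or None if idle."""
        if not self._heap:
            return None
        kind, _, request = heapq.heappop(self._heap)
        return kind, request
```

Because the sequence number is unique, requests themselves are never compared, so any object can be queued.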

  21. Conclusion • The Grid is BYOFS. • Error hiding and latency hiding are tightly-knit problems. • The solution to both is to overlap I/O and computation. • The benefits of high-level overlap can outweigh any low-level inefficiencies.

  22. Conclusion • Need more info? • {thain|miron}@cs.wisc.edu • http://www.cs.wisc.edu/condor/bypass • Demo time: • Wednesday, 9-12 AM • Room 3381 CS • Questions now?
