The ARC data client in EMI-1

Presentation Transcript

  1. The ARC data client in EMI-1
     Jon Kerr Nilsen
     Dept. of Physics, Univ. of Oslo
     On behalf of the ARC Data PT

  2. Outline
     • Why ARC data libs?
     • ARC data in production
     • ARC data architecture
     • ARC client tools
     • ARC data in EMI
     • Plans / way forward

  3. Why use ARC data libs?
     • Can be used on different platforms (Linux, Windows, Mac, Solaris)
     • Can be used from different languages (C++, Python, Java)
     • Pluggable infrastructure that keeps dependencies on other libraries to a very limited set
     • Plugins exist for several widely used data transfer and management protocols (GridFTP, SRM, LFC, https, stdio, …)
     • Lightweight and user-friendly, with easy client-side configuration (the default config works in most cases)

  4. ARC DDM in production
     • ARC is not only for the Nordics
     • Highly distributed: e.g. NDGF T-1 is spread over four countries (both CPU and storage)
     • Most resources are shared with other users/VOs

  5. ARC DDM in production
     • 4-6k cores being used
     • 20-40k jobs per day
     • Transferring 500 MB/s on average for downloads and uploads → ~100 TB/day
     • Approx. half of it on NDGF T-1
     • Not very big but very efficient
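
A rough check of the throughput figure above, as a minimal sketch. It assumes decimal units (1 TB = 10^6 MB) and a rate sustained over a full day; the slide does not say whether 500 MB/s is the combined rate or the rate in each direction, so both readings are computed. The function name `daily_volume_tb` is made up for this illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def daily_volume_tb(rate_mb_per_s):
    """Convert a sustained rate in MB/s to TB/day (decimal units)."""
    return rate_mb_per_s * SECONDS_PER_DAY / 1_000_000


# Reading 1 (assumption): 500 MB/s is the combined download + upload rate.
combined = daily_volume_tb(500)

# Reading 2 (assumption): 500 MB/s applies to downloads and uploads separately.
each_direction = daily_volume_tb(2 * 500)

print(f"combined: {combined:.1f} TB/day, each direction: {each_direction:.1f} TB/day")
```

Under the second reading the result is the same order of magnitude as the ~100 TB/day quoted on the slide.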

  6. ARC Internals
     • Clients can submit jobs using the WS interface, or to ARC classic and other legacy services via specialized adaptors
     • HED handles communication between services and with the outside world
     • A-REX handles data staging (through libarcdata2) and batch job submission (through the LRMS)
     • No middleware on compute nodes

  7. libarcdata2
     • Generic library for basic data access and movement
     • Protocols are supported through plugins (DMCs)
     • No external dependencies in libarcdata2, only in plugins
     • Same library used on client side and service side: smaller codebase, easier to maintain

  8. Data client functionality
     • Three command line tools
       • arcls
       • arccp
       • arcrm
     • Extra functionality in CLI options and URL options
     • Focused on the functionality needed in grid DDM, not on providing a fully POSIX-compliant interface

  9. Data client functionality
     • Protocols are supported through DMC plugins
       • GridFTP, SRM, LFC, HTTP, file, stdio
     • Clients support any protocol for which the corresponding plugin is installed
     • All plugins are installed via the nordugrid-arc-client-tools metapackage
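
The idea of picking a protocol plugin from the URL can be sketched as a small registry keyed by URL scheme. This is an illustration of the pattern only, not the libarcdata2 API: the classes `DataPlugin`, `FilePlugin`, `HttpPlugin` and the helpers `build_registry` and `plugin_for` are all hypothetical names.

```python
from urllib.parse import urlsplit


class DataPlugin:
    """Hypothetical stand-in for a DMC: one plugin per protocol family."""
    schemes = ()

    def list(self, url):
        raise NotImplementedError


class FilePlugin(DataPlugin):
    schemes = ("file",)

    def list(self, url):
        return f"listing local path {urlsplit(url).path}"


class HttpPlugin(DataPlugin):
    schemes = ("http", "https")

    def list(self, url):
        return f"listing over HTTP(S): {url}"


def build_registry(plugins):
    """Map each URL scheme to the plugin that handles it."""
    registry = {}
    for plugin in plugins:
        for scheme in plugin.schemes:
            registry[scheme] = plugin
    return registry


def plugin_for(url, registry):
    """Pick the plugin for a URL, or fail if none is installed for its scheme."""
    scheme = urlsplit(url).scheme
    try:
        return registry[scheme]
    except KeyError:
        raise LookupError(f"no plugin installed for protocol '{scheme}'") from None


# Only the "installed" plugins are registered; other protocols are rejected.
registry = build_registry([FilePlugin(), HttpPlugin()])
print(plugin_for("file:///tmp/file.01", registry).list("file:///tmp/file.01"))
```

In this sketch, supporting a new protocol means adding one plugin class to the registry; the client code that calls `plugin_for` stays unchanged.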

  10. Documentation
      • man arccp
      • arccp --help

          Usage:
            arccp [OPTION...] source destination

          The arccp command copies files to, from and between grid storage elements.

          Help Options:
            -h, -?, --help      Show help options

          Application Options:
            -p, --passive       use passive transfer (does not work if secure is on, default if secure is not requested)
            -n, --nopassive     do not try to force passive transfer
            -y, --cache=path    path to local cache (use to put file into cache)
            ...

  11. Example: LFC
      • Download from LFC
          arccp lfc:// ./file.01
      • Upload to LFC specifying SRM location and spacetoken
          arccp ./file.01 lfc:// --location=srm://;spacetoken=MY_TOKEN/test/file.01.a
      • List LFC replicas
          arcls -L lfc://
      • Remove LFC entry
          arcrm lfc://
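
A minimal sketch of how the upload command above could be assembled from a script. It only builds the argument list and does not run anything. The host names `srm.example.org` and `lfc.example.org`, the LFC path, and the helper names `with_url_options` and `arccp_upload_args` are hypothetical; the `;spacetoken=` URL option and the `--location` option are taken from the slide.

```python
def with_url_options(scheme, host, path, **options):
    """Build scheme://host;key=value/path, the URL-option form shown on the slide."""
    opts = "".join(f";{key}={value}" for key, value in options.items())
    return f"{scheme}://{host}{opts}{path}"


def arccp_upload_args(local_file, lfc_url, location_url):
    """Return the argv list for an upload registered in LFC at a given physical location."""
    return ["arccp", local_file, lfc_url, f"--location={location_url}"]


# Hypothetical endpoints, used here only to make the example concrete.
srm = with_url_options("srm", "srm.example.org", "/test/file.01.a",
                       spacetoken="MY_TOKEN")
args = arccp_upload_args("./file.01", "lfc://lfc.example.org/grid/file.01", srm)
print(" ".join(args))
```

The resulting list could be handed to `subprocess.run(args, check=True)` on a machine where the ARC client tools are installed.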

  12. Example: stdio
      • Copying to/from stdio
          arccp gsi
          arccp gsi–hello
          arccp - -hellohello

  13. Python API
      • libarcdata2 API available in Python

          import arc
          from arcom import datapoint_from_url

          # Source and destination data points
          src = datapoint_from_url(turl, ssl_config)
          dst = datapoint_from_url(...)  # arguments missing from the transcript

          # Configure the transfer
          mover = arc.DataMover()
          mover.verbose(verbose)
          mover.retry(False)
          mover.passive(True)

          # Run the transfer and check the result
          status = mover.Transfer(src, dst, arc.FileCache(), arc.URLMap())
          if status != status.Success:
              raise Exception("""Failed to download file. Try again!""")
          return str(status)

  14. ARC Data in EMI
      • ARC participates in the EMI Data group with two data experts
        • Zsombor Nagy, Jon Kerr Nilsen
      • ARC has no storage element in EMI, so limited effort has been needed for the ARC data libraries so far (GLUE2 and SRM changes are needed on the SEs first)
      • Most important task so far: consolidation

  15. Data lib consolidation
      • Two sets of data clients/libs in EMI
        • libarcdata2 and ARC data tools
        • GFAL and lcg_util
      • These have overlapping functionality
      • Important goal for EMI to reduce code base (and thus reduce maintenance costs)
      • Need to identify overlap and potential for consolidation

  16. Data lib consolidation
      • High-level plan
        • Evaluate current libs/clients
        • Investigate what features in the libs/clients are used/needed
        • Identify scenarios and analyze
          • Efforts, costs, benefits, risks, dependencies…
        • Suggest a decision for the best solution
        • Implement this solution

  17. Data lib consolidation
      • Had an F2F meeting with the gLite and ARC data client developers in March
      • A couple of consolidation possibilities were found
        • SRM and LFC libraries
      • Also found surprisingly small feature overlap
        • GFAL focused on POSIX API
        • ARC data libs focused on grid user needs: functionality not covered by the POSIX API
        • Both are useful
      • More details at the EMI all-hands in June :)

  18. Way forward
      • Consolidation, consolidation and consolidation
      • Also
        • Phasing out ng* clients, replaced by arc* clients
        • Move from SRM through httpg to SRM through https
        • GLUE2 as soon as SEs have implemented it
        • Bugfixing and improvements

  19. Thank you!
      Manual:
      Installation:
      Bugs and feature requests:
      Contact: Jon Kerr Nilsen