The ARC data client in EMI-1

Jon Kerr Nilsen

Dept. of Physics, Univ. of Oslo

On behalf of the ARC Data PT


Outline

  • Why ARC data libs?

  • ARC data in production

  • ARC data Architecture

  • ARC client tools

  • ARC data in EMI

  • Plans / way forward

Why use ARC data libs?

  • Can be used on different platforms (Linux, Windows, Mac, Solaris)

  • Can be used from different languages (C++, Python, Java)

  • Pluggable infrastructure which allows for a very limited set of dependencies from other libraries

  • Plugins exist for several widely used data transfer and management protocols (GridFTP, SRM, LFC, https, stdio, …)

  • Light-weight and user-friendly, with easy client-side configuration (the default config would work in most cases)

ARC DDM in production

  • ARC is not only for the Nordics

  • Highly distributed – e.g. NDGF T-1 is spread over four countries (both CPU and storage)

  • Most resources are shared with other users/VOs

ARC DDM in production

  • 4-6k cores being used

  • 20-40k jobs per day

  • Transferring 500 MB/s on average for downloads and uploads -> ~100 TB/day

  • Approx. half of it on NDGF T-1

    • Not very big but very efficient
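The ~100 TB/day figure can be sanity-checked from the average rate. This is a rough sketch only: it assumes decimal units (1 TB = 1,000,000 MB) and that the 500 MB/s figure applies to downloads and uploads separately, which the slide does not state explicitly. The function name is illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400 seconds


def daily_volume_tb(rate_mb_per_s, directions=1):
    """Convert an average rate in MB/s into TB/day (decimal units).

    `directions` is an assumption of this sketch: 1 counts the rate once,
    2 counts it for downloads and uploads separately.
    """
    return rate_mb_per_s * directions * SECONDS_PER_DAY / 1_000_000


one_direction = daily_volume_tb(500)       # about 43 TB/day
both_directions = daily_volume_tb(500, 2)  # about 86 TB/day
print(one_direction, both_directions)
```

Under that assumption, the two directions together come to roughly 86 TB/day, the same order as the ~100 TB/day quoted above.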

ARC Internals

  • Clients can submit jobs using WS interface or to ARC classic and other legacy services via specialized adaptors

  • HED handles communication between services and with outside world

  • A-REX handles data staging (through libarcdata2) and batch job submission (through LRMS)

  • No middleware on compute-nodes

libarcdata2

  • Generic library for basic data access and movement

  • Protocols are supported through plugins (DMCs)

  • No external dependencies in libarcdata2, only in plugins

  • Same library used on client side and service side – smaller codebase, easier to maintain
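The plugin idea can be illustrated with a small sketch. This is not the libarcdata2 API: `ProtocolPlugin`, `FilePlugin` and `PluginRegistry` are hypothetical names, and the sketch only shows how a dependency-free core might pick a protocol handler from the URL scheme.

```python
from urllib.parse import urlsplit


class ProtocolPlugin:
    """Hypothetical stand-in for a protocol plugin (DMC)."""

    schemes = ()  # URL schemes this plugin handles


class FilePlugin(ProtocolPlugin):
    """Example plugin for local files; a real one would wrap its own library."""

    schemes = ("file",)


class PluginRegistry:
    """Core part: no protocol knowledge, only a scheme -> plugin mapping."""

    def __init__(self):
        self._by_scheme = {}

    def register(self, plugin_cls):
        for scheme in plugin_cls.schemes:
            self._by_scheme[scheme] = plugin_cls

    def supported(self):
        return sorted(self._by_scheme)

    def for_url(self, url):
        scheme = urlsplit(url).scheme
        if scheme not in self._by_scheme:
            raise LookupError("no plugin installed for scheme %r" % scheme)
        return self._by_scheme[scheme]()


registry = PluginRegistry()
registry.register(FilePlugin)
handler = registry.for_url("file:///tmp/file.01")
```

With this layout, only the plugin modules need to import protocol-specific libraries, which matches the "no external dependencies in the core" point above.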

Data client functionality

  • Three command line tools

    • arcls

    • arccp

    • arcrm

  • Extra functionality in CLI options and URL options

  • Focused on what functionality is needed in grid DDM – not on providing a fully POSIX-compliant interface
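For scripting, the three tools can be driven from Python. The sketch below only builds the argument lists and does not run anything, so it works without ARC installed. The helper names are hypothetical; the flags used in the example (`-L`, `-p`) are the ones that appear in this presentation.

```python
import shlex


def arccp_command(source, destination, options=()):
    """Argument list for copying source to destination."""
    return ["arccp", *options, source, destination]


def arcls_command(url, options=()):
    """Argument list for listing a URL."""
    return ["arcls", *options, url]


def arcrm_command(url):
    """Argument list for removing a URL."""
    return ["arcrm", url]


def to_shell(argv):
    """Render an argument list as a shell-safe command line."""
    return " ".join(shlex.quote(arg) for arg in argv)


copy_cmd = arccp_command("lfc://lfc1.ndgf.org/grid/test/file.01", "./file.01")
list_cmd = arcls_command("lfc://lfc.ndgf.org/grid/test/file.01", ["-L"])
print(to_shell(copy_cmd))
print(to_shell(list_cmd))
```

A list built this way can be handed to `subprocess.run(copy_cmd, check=True)`. Passing a list avoids shell quoting problems with URL options that contain `;`.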

Data client functionality

  • Protocols are supported through DMC plugins

    • GridFTP, SRM, LFC, HTTP, file, stdio

  • Clients support any protocol for which the corresponding plugin is installed

  • All plugins are installed via the nordugrid-arc-client-tools metapackage

Documentation

  • man arccp

  • arccp --help

        Usage: arccp [OPTION...] source destination

        The arccp command copies files to, from and between grid storage elements.

        Help Options:
          -h, -?, --help      Show help options

        Application Options:
          -p, --passive       use passive transfer (does not work if secure is on, default if secure is not requested)
          -n, --nopassive     do not try to force passive transfer
          -y, --cache=path    path to local cache (use to put file into cache)
          ...

http://www.nordugrid.org/documents/arc-ui.pdf

http://www.nordugrid.org/documents/arc-client-install.html

Example: LFC

  • Download from LFC

    • arccp lfc://lfc1.ndgf.org/grid/test/file.01 ./file.01

  • Upload to LFC specifying SRM location and spacetoken

    • arccp ./file.01 lfc://lfc.ndgf.org/grid/myvo/file.01 --location=srm://srm.ndgf.org;spacetoken=MY_TOKEN/test/file.01.a

  • List LFC replicas

    • arcls -L lfc://lfc.ndgf.org/grid/test/file.01

      srm://srm.ndgf.org/test/file.01.a:guid=d11c1c69-3400-466c-995e-7d5681a33692

  • Remove LFC entry

    • arcrm lfc://lfc.ndgf.org/grid/test/file.01
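The URLs in these examples carry extra information: `;spacetoken=MY_TOKEN` after the host in the upload example, and `:guid=...` after the path in the replica listing. The sketch below splits those two forms apart. It covers only what is shown on this slide and is not a full parser for the ARC URL syntax; `split_arc_url` is a hypothetical helper.

```python
def _pairs(parts):
    """Turn ['key=value', ...] into a dict; items without '=' map to ''."""
    result = {}
    for part in parts:
        key, _, value = part.partition("=")
        result[key] = value
    return result


def split_arc_url(url):
    """Split a URL into scheme, host, host options, path and path metadata."""
    scheme, rest = url.split("://", 1)
    authority, slash, path = rest.partition("/")
    host, *option_parts = authority.split(";")      # host;key=value;...
    path, *meta_parts = (slash + path).split(":")   # /path:key=value:...
    return {
        "scheme": scheme,
        "host": host,
        "options": _pairs(option_parts),
        "path": path,
        "metadata": _pairs(meta_parts),
    }


upload = split_arc_url("srm://srm.ndgf.org;spacetoken=MY_TOKEN/test/file.01.a")
replica = split_arc_url(
    "srm://srm.ndgf.org/test/file.01.a:guid=d11c1c69-3400-466c-995e-7d5681a33692"
)
```

Here `upload["options"]` holds the space token and `replica["metadata"]` holds the GUID, while both point at the same host and path.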

Example: stdio

  • Copying to/from stdio

    • arccp gsiftp://grid.uio.no/jobdir/145434643346354/out.txt stdio:///stdout

      hello

    • arccp gsiftp://grid.uio.no/jobdir/145434643346354/out.txt -

      hello

    • arccp - -

      hello
      hello
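The `-` convention can be mimicked in a few lines of Python. This is an illustration of the idea only, not how arccp implements it: `copy_stream` is a hypothetical helper that treats `-` as standard input or output, and `stdio:///stdout` as standard output, as in the examples above. Everything else is opened as a local file.

```python
import io
import shutil
import sys


def copy_stream(source, destination, stdin=None, stdout=None):
    """Copy bytes from source to destination; '-' selects stdin/stdout."""
    if stdin is None:
        stdin = sys.stdin.buffer
    if stdout is None:
        stdout = sys.stdout.buffer
    src = stdin if source == "-" else open(source, "rb")
    dst = stdout if destination in ("-", "stdio:///stdout") else open(destination, "wb")
    try:
        shutil.copyfileobj(src, dst)
    finally:
        # Close only the files opened here, never the caller's streams.
        if src is not stdin:
            src.close()
        if dst is not stdout:
            dst.close()


# Equivalent of "arccp - -", with in-memory streams standing in for the terminal.
typed = io.BytesIO(b"hello\n")
echoed = io.BytesIO()
copy_stream("-", "-", stdin=typed, stdout=echoed)
```

After the call, `echoed` contains the same bytes that were "typed", mirroring the `hello` / `hello` pair in the last example.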

Python API

  • libarcdata2 API available in Python

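  • Example (a fragment from inside a function; turl, ssl_config, fobj and verbose are defined elsewhere):

        import arc
        from arcom import datapoint_from_url

        # Source and destination data points
        src = datapoint_from_url(turl, ssl_config)
        dst = datapoint_from_url(fobj.name)

        # Configure the mover: no retries, passive transfer
        mover = arc.DataMover()
        mover.verbose(verbose)
        mover.retry(False)
        mover.passive(True)

        # Run the transfer and check the result
        status = mover.Transfer(src, dst, arc.FileCache(), arc.URLMap())
        if status != status.Success:
            raise Exception("Failed to download file. Try again!")
        return str(status)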

ARC Data in EMI

  • ARC participates in the EMI Data group with two data experts

    • Zsombor Nagy, Jon Kerr Nilsen

  • ARC has no storage element in EMI, so only limited effort has been needed on the ARC data libraries so far (GLUE2 and SRM changes are needed on the SEs first)

  • Most important task so far: Consolidation

Data lib consolidation

  • Two sets of data clients/libs in EMI

    • libarcdata2 and the ARC data tools

    • GFAL and lcg_util

  • These have overlapping functionality

  • Important goal for EMI to reduce code base (and thus reduce maintenance costs)

  • Need to identify overlap and potential consolidation possibilities

Data lib consolidation

  • High-level plan

    • Evaluate current libs/clients

    • Investigate what features in the libs/clients are used/needed

    • Identify scenarios and analyze

      • Efforts, costs, benefits, risks, dependencies…

    • Suggest a decision for the best solution

    • Implement this solution

Data lib consolidation

  • Had F2F meeting with the gLite and ARC data client developers in March

  • A couple of consolidation possibilities were found

    • SRM and LFC libraries

  • Also found surprisingly small feature overlap

    • GFAL focused on POSIX API

    • ARC data libs focused on grid user needs – functionality not covered by POSIX API

    • Both are useful

  • More details at the EMI all-hands in June :)

Way forward

  • Consolidation, consolidation and consolidation

  • Also

    • Phasing out ng* clients, replaced by arc* clients

    • Move from SRM through httpg to SRM through https

    • GLUE2 as soon as SEs have implemented it

    • Bugfixing and improvements

Thank you!

Manual: http://www.nordugrid.org/documents/arc-ui.pdf

Installation: http://www.nordugrid.org/documents/arc-client-install.html

Bugs and feature requests: http://bugzilla.nordugrid.org

Contact: [email protected]
