Potential Data Access Architectures using xrootd

Potential Data Access Architectures using xrootd. OSG All Hands Meeting, Harvard University, March 7-11, 2011. Andrew Hanushevsky, SLAC. http://xrootd.org. Goals: describe xrootd (what it is and what it is not, the architecture, the clustering model) and its data access modes.


Presentation Transcript


  1. Potential Data Access Architectures using xrootd • OSG All Hands Meeting, Harvard University, March 7-11, 2011 • Andrew Hanushevsky, SLAC • http://xrootd.org

  2. Goals • Describe xrootd • What it is and what it is not • The architecture • The clustering model • Data access modes • How they relate to the xrootd architecture • Conclusion

  3. What Is xrootd? • A file access and data transfer protocol • Defines POSIX-style byte-level random access for • Arbitrary data organized as files of any type • Identified by a hierarchical directory-like name • A reference software implementation • Embodied as the xrootd and cmsd daemons • xrootd daemon provides access to data • cmsd daemon clusters xrootd daemons together • Attempts to brand software as Scalla have failed
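"Byte-level random access" means a reader can ask for any byte range of a file without reading what precedes it. The sketch below illustrates that access pattern against a local temporary file using only the Python standard library; it does not use the xrootd client, and `read_at` and `demo` are hypothetical names chosen for illustration.

```python
import os
import tempfile


def read_at(path: str, offset: int, length: int) -> bytes:
    """Return up to `length` bytes starting at byte `offset` of the file."""
    with open(path, "rb") as f:
        f.seek(offset)          # jump directly to the requested byte
        return f.read(length)   # read only the requested range


def demo() -> bytes:
    """Write a small file, then read 4 bytes from the middle of it."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"0123456789abcdef")
        return read_at(path, 10, 4)
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```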

  4. What Isn’t xrootd? • It is not a POSIX file system • There is a FUSE implementation called xrootdFS • An xrootd client simulating a mountable file system • It does not provide full POSIX file system semantics • It is not a Storage Resource Manager (SRM) • Provides SRM functionality via BeStMan • It is not aware of any file internals (e.g., root files) • But is distributed with root and proof frameworks • As it provides unique & efficient file access primitives

  5. Primary xrootd Access Modes • The root framework • Used by most HEP and many Astro experiments (MacOS, Unix and Windows) • POSIX preload library • Any POSIX compliant application (Unix only, no recompilation needed) • File system in User Space (FUSE) • A mounted xrootd data access system via FUSE (Linux and MacOS only) • SRM, globus-url-copy, gridFTP, etc • General grid access (Unix only) • xrdcp • The parallel stream, multi-source copy command (MacOS, Unix and Windows) • xrd • The command line interface for meta-data operations (MacOS, Unix and Windows)

  6. What Makes xrootd Unusual? • A comprehensive plug-in architecture • Security, storage back-ends (e.g., tape), proxies, etc • Clusters widely disparate file systems • Practically any existing file system • Distributed (shared-everything) to JBODs (shared-nothing) • Unified view at local, regional, and global levels • Very low support requirements • Hardware and human administration

  7. The Plug-In Architecture • Replaceable plug-ins to accommodate any environment • [Diagram labels: Protocol Driver (XRD); Protocol (1 of n) (xroot, proof, etc); Authentication (gsi, krb5, etc); Authorization (dbms, voms, etc); lfn2pfn prefix encoding; Logical File System (ofs, sfs, alice, etc); Physical Storage System (ufs, hdfs, hpss, etc); Clustering (cmsd)] • Let’s take a closer look at xrootd-style clustering
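The idea of replaceable plug-ins can be sketched as a server whose authentication, authorization, name translation, and storage steps are each supplied from outside. This is a toy model in Python, not the xrootd plug-in interface: `PluggableServer`, `prefix_lfn2pfn`, the user names, and the `/data/disk1` prefix are all assumptions made for illustration, and "lfn2pfn" is taken here to mean mapping a logical file name to a physical one by prepending a prefix.

```python
from dataclasses import dataclass
from typing import Callable, Dict


def prefix_lfn2pfn(prefix: str) -> Callable[[str], str]:
    """Build a name-translation plug-in that prepends a fixed prefix."""
    def convert(lfn: str) -> str:
        return prefix.rstrip("/") + "/" + lfn.lstrip("/")
    return convert


@dataclass
class PluggableServer:
    """Each field is a replaceable component (hypothetical interface)."""
    authenticate: Callable[[str], bool]      # e.g. a gsi- or krb5-like check
    authorize: Callable[[str, str], bool]    # may this user read this name?
    lfn2pfn: Callable[[str], str]            # logical name -> physical name
    storage: Dict[str, bytes]                # stand-in for physical storage

    def read(self, user: str, lfn: str) -> bytes:
        if not self.authenticate(user):
            raise PermissionError(f"unknown user: {user}")
        if not self.authorize(user, lfn):
            raise PermissionError(f"{user} may not read {lfn}")
        return self.storage[self.lfn2pfn(lfn)]


# Wire up one possible combination of plug-ins.
server = PluggableServer(
    authenticate=lambda user: user in {"user1", "user2"},
    authorize=lambda user, lfn: lfn.startswith("/my/"),
    lfn2pfn=prefix_lfn2pfn("/data/disk1"),
    storage={"/data/disk1/my/file": b"payload"},
)

if __name__ == "__main__":
    print(server.read("user1", "/my/file"))
```

Swapping any one callable changes that layer's behavior without touching the others, which is the property the slide attributes to the plug-in design.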

  8. Clustering • xrootd servers can be clustered • Increase access points and reliability • Uses highly effective clustering algorithms • Cluster overhead (human & non-human) scales linearly • Cluster size is not limited • I/O performance is not affected • Always pairs xrootd & cmsd servers • Symmetric cookie-cutter arrangement • Allows for a single configuration file • [Diagram: an xrootd daemon paired with a cmsd daemon]

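  9. A Simple xrootd Cluster • [Diagram: a client, a manager (a.k.a. redirector), and data servers A, B, and C, each running an xrootd and cmsd pair; /my/file resides on A and C] • 1: Client sends open(“/my/file”) to the manager • 2: Manager asks the data servers: Who has “/my/file”? • 3: A and C answer: I DO! • 4: Manager tells the client: Try open() at A • 5: Client sends open(“/my/file”) to A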
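The five-step exchange on this slide can be simulated in a few lines. The sketch below is a simplified in-process model, not the xrootd or cmsd protocol: `DataServer`, `Manager`, and `client_open` are hypothetical names, and the manager simply picks the first server that answers, which is an assumption about selection policy.

```python
from typing import Iterable, List


class DataServer:
    """A data server that knows which files it holds."""

    def __init__(self, name: str, files: Iterable[str] = ()):
        self.name = name
        self.files = set(files)

    def has(self, path: str) -> bool:
        return path in self.files            # step 3: "I DO!" (or silence)

    def open(self, path: str) -> str:
        if not self.has(path):
            raise FileNotFoundError(path)
        return f"{self.name}:{path}"         # stand-in for an open file handle


class Manager:
    """The redirector: it locates files but never serves data itself."""

    def __init__(self, servers: List[DataServer]):
        self.servers = servers

    def locate(self, path: str) -> DataServer:
        holders = [s for s in self.servers if s.has(path)]   # step 2
        if not holders:
            raise FileNotFoundError(path)
        return holders[0]                    # step 4: "Try open() at ..."


def client_open(manager: Manager, path: str) -> str:
    server = manager.locate(path)            # steps 1-4: ask, get redirected
    return server.open(path)                 # step 5: open at the data server


if __name__ == "__main__":
    cluster = Manager([
        DataServer("A", ["/my/file"]),
        DataServer("B"),
        DataServer("C", ["/my/file"]),
    ])
    print(client_open(cluster, "/my/file"))
```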

  10. Recapping The Fundamentals • An xrootd-cmsd pair is the building block • xrootd provides the client interface • Handles data and redirections • cmsd manages xrootd daemons (i.e. forms clusters) • Monitors activity and handles file discovery • Building blocks are stackable & replicable • Can create a wide variety of configurations • Much like you would do with LEGO® blocks • Extensive plug-ins provide adaptability

  11. Exploiting Stackability • [Diagram: a client, a meta-manager (a.k.a. global redirector), and three sites (ANL, SLAC, UTA), each with a manager (a.k.a. local redirector) in front of data servers A, B, and C running xrootd and cmsd pairs; copies of /my/file sit on several servers across the sites] • 1: Client sends open(“/my/file”) to the meta-manager • 2: Meta-manager asks the site managers: Who has “/my/file”? • 3: Each manager asks its own servers: Who has “/my/file”? • 4: Holders answer: I DO! • 5: Meta-manager tells the client: Try open() at ANL • 6: Client sends open(“/my/file”) to the ANL manager • 7: ANL manager tells the client: Try open() at A • 8: Client sends open(“/my/file”) to A • Distributed clusters become federated distributed clusters • Data is uniformly available by federating three distinct sites • An exponentially parallel search! (i.e. O(2^n))
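Stackability means a redirector can sit in front of other redirectors exactly as it sits in front of data servers. A minimal sketch of that recursion follows; it is a toy model rather than the cmsd protocol, `Node` and `resolve` are hypothetical names, the placement of files on servers is invented for the example, and the lookup here is sequential whereas the slide describes the real search as parallel.

```python
from typing import Iterable, List


class Node:
    """Either a data server (has files) or a redirector (has children)."""

    def __init__(self, name: str, files: Iterable[str] = (),
                 children: Iterable["Node"] = ()):
        self.name = name
        self.files = set(files)
        self.children = list(children)

    def has(self, path: str) -> bool:
        # A redirector "has" a file if anything beneath it does.
        return path in self.files or any(c.has(path) for c in self.children)

    def resolve(self, path: str) -> List[str]:
        """Follow redirections down to a data server; return the hops taken."""
        if path in self.files:
            return [self.name]
        for child in self.children:
            if child.has(path):
                return [self.name] + child.resolve(path)
        raise FileNotFoundError(path)


def site(name: str, holders: Iterable[str]) -> Node:
    """Build one site: a local redirector over data servers A, B, and C."""
    holders = set(holders)
    servers = [Node(s, ["/my/file"] if s in holders else []) for s in "ABC"]
    return Node(name, children=servers)


# Hypothetical file placement across three sites.
federation = Node("meta-manager", children=[
    site("ANL", "AC"), site("SLAC", "C"), site("UTA", "AC"),
])

if __name__ == "__main__":
    print(federation.resolve("/my/file"))
```

With this placement the hops are the meta-manager, then ANL, then server A, mirroring the "Try open() at ANL" and "Try open() at A" redirections on the slide.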

  12. Federated Distributed Clusters • Unites multiple site-specific data repositories • Each site enforces its own access rules • Usable even in the presence of firewalls • Scalability increases as more sites join • Essentially a real-time bit-torrent social model • Federations are fluid and changeable in real time • Provide multiple data sources to achieve high transfer rates • Increased opportunities for data analysis • Based on what is actually available

  13. What Federated Clusters Foster • Resilient analysis (Copy Data Access Architecture) • Fetch the “last” missing file at run-time • Copy only when necessary • Adaptable analysis (Cached Data Access Architecture) • Cache files where they are needed • Copy whatever analysis demands • Storage-starved analysis (Direct Data Access Architecture) • Real-time access to data across multiple sites • Deliver to wherever the compute cycles are

  14. Copy Data Access Architecture • The built-in File Residency Manager drives • Copy On Fault • Demand driven (fetch to restore missing file) • Copy On Request • Pre-driven (fetch files to be used for analysis) • Example: xrdcp -x xroot://mm.org//my/file /my • xrdcp copies data using two sources • [Diagram: a client issuing open(“/my/file”), a meta-manager (a.k.a. global redirector), and sites ANL, SLAC, and UTA, each with a manager (a.k.a. local redirector) over data servers A, B, and C running xrootd and cmsd pairs; copies of /my/file sit on several servers]
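The two copy modes differ only in what triggers the fetch: a failed local lookup (copy on fault) or an explicit up-front request (copy on request). The sketch below models both with plain dictionaries; it is not the File Residency Manager, and `open_copy_on_fault`, `copy_on_request`, and the `fetch_remote` callback are hypothetical names.

```python
from typing import Callable, Dict, Iterable

Fetch = Callable[[str], bytes]


def open_copy_on_fault(path: str, local: Dict[str, bytes],
                       fetch_remote: Fetch) -> bytes:
    """Demand driven: fetch the file only if the local store lacks it."""
    if path not in local:
        local[path] = fetch_remote(path)     # the "fault" triggers the copy
    return local[path]


def copy_on_request(paths: Iterable[str], local: Dict[str, bytes],
                    fetch_remote: Fetch) -> None:
    """Pre-driven: stage the listed files before analysis starts."""
    for path in paths:
        if path not in local:
            local[path] = fetch_remote(path)


if __name__ == "__main__":
    remote = {"/my/file": b"data", "/my/other": b"more"}
    fetched = []

    def fetch(path: str) -> bytes:
        fetched.append(path)                 # record every remote transfer
        return remote[path]

    store: Dict[str, bytes] = {}
    open_copy_on_fault("/my/file", store, fetch)   # missing: copied
    open_copy_on_fault("/my/file", store, fetch)   # present: no copy
    copy_on_request(["/my/file", "/my/other"], store, fetch)
    print(fetched)
```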

  15. Direct Data Access Architecture • Use servers as if all of them were local • Normal and easiest way of doing this • Latency may be an issue (depends on algorithms & CPU-I/O ratio) • Requires cost-benefit analysis to see if acceptable • [Diagram: a client issuing open(“/my/file”), a meta-manager (a.k.a. global redirector), and sites ANL, SLAC, and UTA, each with a manager (a.k.a. local redirector) over data servers A, B, and C running xrootd and cmsd pairs; copies of /my/file sit on several servers]
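One way to start the cost-benefit analysis is a back-of-the-envelope runtime estimate. The model below is an assumption made for illustration, not something stated in the talk: it treats every read as synchronous (no overlap of I/O with computation, no read-ahead), so the wall time is CPU time plus one round trip per read plus the transfer time. All names and numbers are hypothetical.

```python
def estimate_runtime(cpu_seconds: float, n_reads: int, rtt_seconds: float,
                     total_bytes: float, bytes_per_second: float) -> float:
    """Wall time under a fully synchronous read model (an assumption)."""
    latency_cost = n_reads * rtt_seconds           # one round trip per read
    transfer_cost = total_bytes / bytes_per_second
    return cpu_seconds + latency_cost + transfer_cost


def remote_penalty(cpu_seconds: float, n_reads: int, local_rtt: float,
                   remote_rtt: float, total_bytes: float,
                   bytes_per_second: float) -> float:
    """Ratio of remote to local runtime at equal bandwidth."""
    local = estimate_runtime(cpu_seconds, n_reads, local_rtt,
                             total_bytes, bytes_per_second)
    remote = estimate_runtime(cpu_seconds, n_reads, remote_rtt,
                              total_bytes, bytes_per_second)
    return remote / local


if __name__ == "__main__":
    # Hypothetical job: 100 s of CPU, 1000 reads, 1 GB over a 100 MB/s link.
    print(estimate_runtime(100.0, 1000, 0.1, 1e9, 1e8))
    print(remote_penalty(100.0, 1000, 0.001, 0.1, 1e9, 1e8))
```

Under this model a job dominated by CPU time barely notices the added latency, while one that issues many small reads per CPU second is hit hardest, which is the dependence on algorithms and CPU-I/O ratio that the slide points to.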

  16. Cached Data Access Architecture • Front servers with a caching proxy server • Clients access the proxy server for all data • Server can be central or local to client (i.e. laptop) • Data comes from proxy’s cache or other servers • [Diagram: a client issuing open(“/my/file”), a meta-manager (a.k.a. global redirector), and sites ANL, SLAC, and UTA, each with a manager (a.k.a. local redirector) over data servers A, B, and C running xrootd and cmsd pairs; copies of /my/file sit on several servers]
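A caching proxy answers from its cache when it can and goes to the other servers only on a miss. The sketch below models that with a bounded cache; it is not the xrootd proxy, `CachingProxy` is a hypothetical name, whole-file caching is a simplification, and the least-recently-used eviction policy is an assumption, since the slide does not say how the cache is managed.

```python
from collections import OrderedDict
from typing import Callable


class CachingProxy:
    """Serve reads from a bounded cache, fetching from origin on a miss."""

    def __init__(self, fetch: Callable[[str], bytes], capacity: int = 2):
        self.fetch = fetch                   # reaches the real data servers
        self.capacity = capacity             # maximum number of cached files
        self.cache: "OrderedDict[str, bytes]" = OrderedDict()

    def read(self, path: str) -> bytes:
        if path in self.cache:
            self.cache.move_to_end(path)     # mark as most recently used
            return self.cache[path]
        data = self.fetch(path)              # miss: go to the other servers
        self.cache[path] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)   # evict the least recently used
        return data


if __name__ == "__main__":
    origin = {"/a": b"1", "/b": b"2", "/c": b"3"}
    calls = []

    def fetch(path: str) -> bytes:
        calls.append(path)                   # record every origin access
        return origin[path]

    proxy = CachingProxy(fetch, capacity=2)
    for p in ["/a", "/a", "/b", "/c", "/a"]:
        proxy.read(p)
    print(calls)
```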

  17. Conclusion • The xrootd architecture promotes efficiency • Can federate almost any file system • Gives a uniform view of massive amounts of data • Assuming per-experiment common logical namespace • Secure and firewall friendly • Ideal platform for adaptive caching systems • Completely open source under a BSD license • See more at http://xrootd.org/

  18. Acknowledgements • Current Software Contributors • ATLAS: Doug Benjamin • CERN: Fabrizio Furano, Lukasz Janyst, Andreas Peters, David Smith • Fermi/GLAST: Tony Johnson • FZK: Artem Trunov • LBNL: Alex Sim, Junmin Gu, Vijaya Natarajan (BeStMan team) • Root: Gerri Ganis, Bertrand Bellenot, Fons Rademakers • OSG: Tim Cartwright, Tanya Levshina • SLAC: Andrew Hanushevsky, Wilko Kroeger, Daniel Wang, Wei Yang • UNL: Brian Bockelman • UoC: Charles Waldman • Operational Collaborators • ANL, BNL, CERN, FZK, IN2P3, SLAC, UTA, UoC, UNL, UVIC, UWisc • US Department of Energy • Contract DE-AC02-76SF00515 with Stanford University
