
Core SRB Technology for 2005 NCOIC Workshop

Core SRB Technology for 2005 NCOIC Workshop. By Michael Wan And Wayne Schroeder SDSC. SDSC/UCSD/NPACI. Outline. Basic Concepts behind SRB SRB architecture SRB features SRB Usage Model Wayne: SRB productization - Installation, Administration, etc Security and Authentication



Presentation Transcript


  1. Core SRB Technology for 2005 NCOIC Workshop. By Michael Wan and Wayne Schroeder, SDSC. SDSC/UCSD/NPACI

  2. Outline. Basic concepts behind SRB; SRB architecture; SRB features; SRB usage model; Wayne: SRB productization (installation, administration, etc.); security and authentication; examples and demo.

  3. Initial Design of SRB: Transparency and Uniformity. Data are increasingly distributed. Design goal: use a single interface and authorization mechanism to access data across multiple hosts, multiple OS platforms, and multiple resource types (UNIX FS, HPSS, UniTree, DBMS, ...).

  4. Initial Design of SRB: Global View. Global logical name space for data organization: UNIX-like directories (collections) and files (data); mapping of logical names to physical attributes (host address, physical path); UNIX-like API and utilities. Single global user name space: single sign-on; no need for a UNIX account on every system; robust access control.
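
The logical-to-physical mapping described on this slide can be illustrated with a small sketch. This is not SRB code: the class names, host names, and paths below are all hypothetical, and a real catalog (MCAT) lives in a database rather than in memory. The point is only that clients name data by logical path and the catalog supplies the host and physical path.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalLocation:
    host: str  # server that holds the bytes (hypothetical name)
    path: str  # path on that server's storage


class LogicalNamespace:
    """Toy catalog mapping logical paths to physical locations."""

    def __init__(self):
        self._entries = {}

    def register(self, logical_path, host, physical_path):
        self._entries[logical_path] = PhysicalLocation(host, physical_path)

    def resolve(self, logical_path):
        # Clients only ever supply the logical name.
        return self._entries[logical_path]

    def list_collection(self, collection):
        # Direct children only, like a UNIX directory listing.
        prefix = collection.rstrip("/") + "/"
        names = set()
        for path in self._entries:
            if path.startswith(prefix):
                names.add(path[len(prefix):].split("/", 1)[0])
        return sorted(names)


ns = LogicalNamespace()
ns.register("/home/demo/run1/a.dat", "server1.example.org", "/vault/0001/a.dat")
ns.register("/home/demo/run1/b.dat", "server2.example.org", "/hpss/x/b.dat")
```

Here the two files sit in the same logical collection even though they are stored on different hosts, which is the uniformity the design goal asks for.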

  5. SRB Architecture. Federated middleware system. Client/server model: a federation of resource servers with uniform interfaces, both client-server and server-server. Each request handler has two versions: local, and remote (pass the request off to a server that can handle it). All servers use the same software. Simplicity: easy to implement, easy to debug. Robust access control: user level (grant access to multiple users), group level, and tickets. MCAT: metadata catalog.
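
A minimal sketch of the "local or remote" handler pair, assuming each server knows which resources it owns and which peers it can reach. The `Server` class, its method names, and the resource names are invented for illustration; the actual SRB handlers are C functions and are not shown in the slides.

```python
class Server:
    """Hypothetical server that either handles a request or forwards it."""

    def __init__(self, name, local_resources):
        self.name = name
        self.local_resources = set(local_resources)
        self.peers = []

    def handle_read(self, resource, path):
        # Same entry point on every server; pick the local or remote version.
        if resource in self.local_resources:
            return self._read_local(resource, path)
        return self._read_remote(resource, path)

    def _read_local(self, resource, path):
        return f"{self.name} read {path} from {resource}"

    def _read_remote(self, resource, path):
        # Pass the request off to a peer that owns the resource.
        for peer in self.peers:
            if resource in peer.local_resources:
                return peer.handle_read(resource, path)
        raise LookupError(f"no server owns resource {resource!r}")


s1 = Server("server1", ["unix-disk"])
s2 = Server("server2", ["hpss-tape"])
s1.peers.append(s2)
s2.peers.append(s1)
```

Because every server runs the same code, a client can contact any of them and the request still reaches the server that holds the data.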

  6. Federation of Servers [Diagram labels: MCAT, MCAT server, Server1, Server2.]

  7. SRB as a Data Grid [Diagram labels: DB, MCAT, six SRB servers.] A data grid has an arbitrary number of servers. Complexity is hidden from users.

  8. SRB Server Design. Three-layer design. Top layer: interacts with clients and other servers through TCP/IP sockets; performs user authentication; handles function requests by parsing them and invoking handlers in the middle and bottom layers.

  9. SRB Server Design (cont1). Middle layer (logical layer): most requests pass through here. Input parameters are in their logical representations (logical path name, logical resource name). Generally, there are two types of requests. Data access: queries MCAT, translates from logical to physical representations, and calls functions in the bottom (physical) layer to access data. Metadata access: interacts with MCAT.

  10. SRB Server Design (cont2). Bottom layer (physical layer): where all data I/O to and from resources is done. Handles three types of resources. File systems: drivers interface with different file systems; supported: UNIX, HPSS, ADS, UniTree, GridFTP (to be released). DB large objects. DB tables: access DB tables (query, insert, ...).

  11. SRB Features: Authentication. Supports two authentication schemes: Encrypt1 (SDSC), which sends no plain-text password over the network, and GSI (Globus). Wayne will give details.

  12. Performance Enhancement. Parallel I/O: for transferring large files; uses multiple threads for data transfer and disk I/O; interfaces with HPSS's mover protocol for parallel I/O; parallel third-party transfer for copy and replicate; one-hop data transfer between client and data resource. Bulk operations: uploading and downloading large numbers of small files; multiple threads; bulk registration (500 files in one call); 3 to 10 times speedup.
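
The idea of splitting one large file across several transfer threads can be sketched as below. This is a local file-to-file copy used as a stand-in for a network transfer, under the assumption that each thread owns one byte range; the function names and the default of four threads are illustrative choices, not SRB parameters.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def copy_range(src, dst, offset, length):
    # Each worker opens its own handles so seeks do not interfere.
    with open(src, "rb") as fin, open(dst, "r+b") as fout:
        fin.seek(offset)
        fout.seek(offset)
        fout.write(fin.read(length))


def parallel_copy(src, dst, threads=4):
    size = os.path.getsize(src)
    # Pre-size the destination so every worker can write at its own offset.
    with open(dst, "wb") as f:
        f.truncate(size)
    chunk = max(1, -(-size // threads))  # ceiling division
    with ThreadPoolExecutor(max_workers=threads) as pool:
        futures = [
            pool.submit(copy_range, src, dst, off, min(chunk, size - off))
            for off in range(0, size, chunk)
        ]
        for fut in futures:
            fut.result()  # re-raise any worker error
```

Calling `parallel_copy("big.dat", "copy.dat", threads=8)` would copy the file in eight ranges. Whether this beats a single stream depends on the storage and network involved.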

  13. Sput: Serial Mode [Diagram: Sput issues a peer-to-peer request (srbObjCreate, srbObjWrite); numbered steps 1 to 6 connect Sput, SRB server1, and SRB server2, and each server spawns an SRB agent. MCAT provides (1) logical-to-physical mapping, (2) identification of replicas, and (3) access and audit control. Remaining labels: R, data transfer.]

  14. Parallel Mode Data Transfer: Client Initiated [Diagram: Sput -M calls srbObjPut; numbered steps 1 to 8 connect Sput, SRB server1, and SRB server2 and their SRB agents. The server returns a socket address, port, and cookie; the client connects to the server and transfers data. MCAT provides (1) logical-to-physical mapping, (2) identification of replicas, and (3) access and audit control. Remaining label: R.]

  15. Performance Enhancement (cont1). Container: physical grouping of small files for tape I/O or archival resources. Easy to use and transparent to users.
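
A container can be pictured as one large byte stream plus an index of where each small file sits inside it. The sketch below assumes an in-memory stream and an (offset, length) index; the `Container` class is hypothetical and says nothing about SRB's on-disk container format.

```python
import io


class Container:
    """Toy container: many small files in one byte stream plus an index."""

    def __init__(self):
        self._blob = io.BytesIO()
        self._index = {}  # logical name -> (offset, length)

    def add(self, name, data):
        # Append at the end and remember where the file starts.
        offset = self._blob.seek(0, io.SEEK_END)
        self._blob.write(data)
        self._index[name] = (offset, len(data))

    def read(self, name):
        # Users ask for a file by name; the index hides the grouping.
        offset, length = self._index[name]
        self._blob.seek(offset)
        return self._blob.read(length)
```

An archival resource then stores or retrieves one large object instead of many small ones, while users still read individual files by name.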

  16. Data Replication. An SRB file can have multiple replicas. Replicas can be stored in different resources. Example: "Sls -l mfile" lists two rows, "fedsrbbrick8 0 demoResc 3029449 2005-07-29-15.37 % mfile" and "fedsrbbrick8 1 demoResc1 3029449 2005-07-29-21.28 % mfile". Commands that use replicas: Sreplicate (replicate a file to the specified resource), Sbackupsrb (back up a file to the specified resource), SsyncD (synchronize the replicas of a file).
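
The replica numbering visible in the listing (replica 0 on demoResc, replica 1 on demoResc1) can be modeled with a short sketch. The `ReplicatedFile` class and its methods are hypothetical and only mimic the behavior the slide describes for replicating and synchronizing; they are not the Scommand implementations.

```python
class ReplicatedFile:
    """Toy model of one logical file with numbered replicas."""

    def __init__(self, name, resource, data):
        self.name = name
        self.replicas = {0: {"resource": resource, "data": data}}

    def replicate(self, resource):
        # The new replica copies replica 0 and takes the next free number.
        number = max(self.replicas) + 1
        self.replicas[number] = {
            "resource": resource,
            "data": self.replicas[0]["data"],
        }
        return number

    def synchronize(self, source_number):
        # Make every replica match the chosen source copy.
        data = self.replicas[source_number]["data"]
        for replica in self.replicas.values():
            replica["data"] = data

    def listing(self):
        # (replica number, resource, size) rows, one per replica.
        return [
            (number, rep["resource"], len(rep["data"]))
            for number, rep in sorted(self.replicas.items())
        ]
```

All replicas share one logical name, so a reader does not need to know which physical copy serves the request.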

  17. PhyMove: Move SRB Files to Another Resource. Moves files to another resource without making another replica. Normally used by admins to move files around. Bulk phyMove: large numbers of small files. Parallel I/O: large files. Container: move files into a container. Heavily used by the BBSRC project for distributed archiving: files are uploaded to a local server and eventually moved to a central archival resource by an admin.

  18. Performance Enhancement (cont2). Use of checksums: a checksum is MCAT metadata associated with a file. Checksum routines are part of the server and client code. Used for verification and synchronization of data. Built into most data-handling utilities: Sput, Sget, Srsync, Schksum.
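
Verifying a transferred file against a checksum stored in the catalog might look like the sketch below. MD5 is used here only as an assumption for illustration; the slides do not say which algorithm the SRB checksum routines use, and both function names are invented.

```python
import hashlib


def file_checksum(path, chunk_size=1 << 20):
    # Stream the file in blocks so large files do not need to fit in memory.
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_transfer(local_path, catalog_checksum):
    # True when the local copy matches the value recorded in the catalog.
    return file_checksum(local_path) == catalog_checksum
```

The same comparison supports synchronization: two copies whose checksums already match do not need to be transferred again.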

  19. Metadata in SRB. SRB system metadata. Free-form (user-defined) metadata: attribute-value-unit triplets. Extensible schema metadata: user-defined tables integrated into the MCAT core schema. External database metadata. Metadata operations: metadata insertion through user interfaces; bulk metadata insertion; template-based metadata extraction; metadata queries through well-defined interfaces.
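
Attribute-value-unit triplets are easy to picture as rows attached to an object. The sketch below is an in-memory stand-in with hypothetical names (`AvuStore`, `bulk_add`, `query`); in SRB this information is held in MCAT and queried through the client interfaces.

```python
from collections import defaultdict


class AvuStore:
    """Toy store of (attribute, value, unit) triplets per logical object."""

    def __init__(self):
        self._by_object = defaultdict(list)

    def add(self, obj, attribute, value, unit=""):
        self._by_object[obj].append((attribute, value, unit))

    def bulk_add(self, rows):
        # One call inserts many (object, attribute, value, unit) rows.
        for obj, attribute, value, unit in rows:
            self.add(obj, attribute, value, unit)

    def query(self, attribute, value):
        # Return the objects carrying a matching attribute and value.
        return sorted(
            obj
            for obj, triplets in self._by_object.items()
            if any(a == attribute and v == value for a, v, _ in triplets)
        )
```

A query such as `store.query("site", "SLAC")` then finds files by what they are rather than by where they are stored.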

  20. SRB Proxy Operation. Performs operations on the server on behalf of the user, where the data is located: file format conversion, MD5 checksum, subsetting and filtering, etc. Two types of proxy operations. Proxy commands: the server forks and execs an executable or script on the server and pipes the output back to the client. Proxy functions: functions built into the server, with a well-defined framework for writing them.
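
The proxy-command pattern (run a program next to the data, send its output back) can be sketched with Python's `subprocess` module. The allow-list, the `md5` command name, and `run_proxy_command` are assumptions made for this example; they do not describe how an SRB server registers or launches its proxy commands.

```python
import subprocess
import sys

# Hypothetical allow-list: command name -> argv prefix the server may execute.
ALLOWED = {
    "md5": [
        sys.executable,
        "-c",
        "import hashlib, sys; "
        "print(hashlib.md5(open(sys.argv[1], 'rb').read()).hexdigest())",
    ],
}


def run_proxy_command(name, args):
    if name not in ALLOWED:
        raise PermissionError(f"proxy command {name!r} is not installed")
    # Spawn a child process and capture its standard output for the client.
    result = subprocess.run(
        ALLOWED[name] + list(args), capture_output=True, check=True
    )
    return result.stdout
```

Only the command's output crosses the network, which is the benefit of operating where the data is located. Restricting execution to a fixed list is one way to keep such a facility from running arbitrary programs.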

  21. HDF5-SRB Model: Data Flow [Diagram: the client API srbObjRequest(void *obj, int objID) and the server API srbObjProcess(void *obj, int objID) exchange messages in six numbered steps: 1. packMsg(), 2. unpackMsg(), 3. H5Obj::op(), 4. access file, 5. packMsg(), 6. unpackMsg(). Components: HDF5 library, SRB server, HDF5 file.]

  22. Zone Federation. Federation of multiple MCATs. An MCAT zone defines a federation of SRB resources controlled by a single MCAT. Each zone has full control of its own administrative domain and can operate entirely independently of other zones. Data and resource sharing across zones: use storage resources in foreign zones, share data across zones, copy data across zones.

  23. Peer-to-Peer Federated MCAT Zones [Diagram labels: MCAT1 with Server1.1 and Server1.2; MCAT2 with Server2.1 and Server2.2; MCAT3 with Server3.1.]

  24. SRB Client Implementations. A set of basic APIs: over 160 APIs, used by all clients to make requests to servers. Scommands: UNIX-like command-line utilities for UNIX and Windows platforms; over 60, including Sls, Scp, Sput, Sget, ...

  25. SRB Client Implementations (cont). inQ: Windows GUI browser. Jargon: Java SRB client classes; pure Java implementation. mySRB: web-based GUI run from a web browser. Java Admin Tool: GUI for user and resource management. Matrix: web service for SRB workflow.

  26. inQ Windows GUI

  27. MySRB: Web-Based SRB Interface. SRB browser; advanced metadata manipulation.

  28. SRB Usage Model. Various usage models. Specific usages: SLAC's BaBar experiment, UK eScience BBSRC, BIRN.

  29. SRB Configuration: Peer-to-Peer Data Grid. Data sharing, no central resource. Projects: NARA, BIRN. [Diagram labels: four resource servers.]

  30. SRB Configuration: Exploding Star. Data source: physics experiment. Projects: BaBar, KEK. [Diagram labels: one source server and five satellite servers.]

  31. SRB Configuration: Imploding Star. Archival storage model. Projects: UK eScience BBSRC. [Diagram labels: five satellite source servers, a central cache server, and a central archival server.]

  32. Peer-to-Peer Federation of MCAT Zones [Diagram labels: MCAT1 with Server1.1 and Server1.2; MCAT2 with Server2.1 and Server2.2; MCAT3 with Server3.1.]

  33. Summary of the BaBar Project. Preproduction evaluation, 2003. Highlights of Wilko Kroeger's (SLAC) talk at IEEE 2003, titled “Distributing Babar Data using SRB”. BaBar computing resources are geographically distributed across five Tier-A centers: GridKA (D), IN2P3 (F), INFN-Padova (I), RAL (UK), SLAC (USA). Data have to be replicated to the Tier-A sites. Number of files: 1M. Size: hundreds of TB.

  34. BaBar Preproduction: SRB Usage. Allows transparent access to files: no need to know the host or storage medium (disk, tape). Access files and collections by attributes: find files that were produced at a certain time or site; find collections from a particular run period. Preproduction test: 2 weeks of MCAT and file transfer tests.

  35. BaBar Production Update. Transferred ~70 TB and 140K files. Peak rate ~2 TB/day; average rate 1 TB/day. Downtime encountered: hardware problems, DB updates. Plan to federate the SLAC and IN2P3 zones so that IN2P3 picks up some of the load. Thanks to Wilko Kroeger (SLAC) and Jean-Yves Nief (IN2P3) for the info.

  36. UK eScience BBSRC. Archival of biological data from 16 sites to a central resource. Data are ingested into local resources. An admin uses bulk Sphymove to move data from local resources to a central cache, moves the data into containers, replicates the containers to a cache resource at RAL, replicates the containers to the ADS archive at RAL, and removes the cache copies.

  37. UK eScience BBSRC. Developed some software of their own: a user interface using Jargon GUI, so users are not exposed to all SRB functionality; a request tracker to track data movement after ingestion. Status: project started at the beginning of this year; just finished a pilot program using SRB 3.2; upgrading to 3.3 for production.

  38. Biomedical Informatics Research Network (BIRN). Major collaboration with SDSC; several of the projects’ co-investigators and co-PIs are at SDSC. SRB provides the ability to transparently share data across remote sites.

  39. The BIRN SRB Data Grid

  40. The BIRN Data Grid

  41. SRB in BIRN [Diagram labels: BIRN Toolkit; Collaboration; Applications; Queries/Results; Data Management; Viewing/Visualization; Mediator; GridPort; Grid Management; Data Model; Database; Scheduler; Database; Data Grid; Computational Grid; NMI; SRB; Globus; MCAT; Data Access; HPSS; File System; Distributed Resources.]
