Storage Resource Broker

Presentation Transcript


  1. Storage Resource Broker
     Reagan Moore, Arcot Rajasekar*, Mike Wan*, George Kremenek*, Bing Zhu, Charles Cowart*, Sheau-Yen Chen, Roman Olschanowsky

  2. Agenda
     - Introduction
     - Application Interfaces
     - User Interfaces
     - Demo of Commands
     - Demo of Windows SRB Browser
     - Demo of Metadata Access
     - Introduction to Admin Tools
     - Accounts & Access
     - Discussion & Hands On Lab

  3. Introduction
     - Background
     - Problems with Data Handling Environment
     - Problems with Metadata Handling
     - SRB and MCAT as a Solution
     - Features of the SRB System
     - Introduction to Extensible MCAT

  4. Background
     - Handle 100s of millions of datasets
     - Handle petabytes of data
     - Integrate data collections
     - Handle metadata for collections
     - Provide information discovery
     - Handle legacy data and methods

  5. Large Data Projects
     - Astronomy: National Virtual Observatory (ITR proposal)
       - Integrate 18 sky surveys
       - 2MASS (2 Micron All Sky Survey): 10 TB; 5 million files
       - LSST (Large-scale Synoptic Survey Telescope): 3 TB/night
       - Co-locate images for spatial access
     - Particle Physics: GriPhyN (NSF ITR project)
       - CERN LHC (Large Hadron Collider): 1 PB/yr (1 billion objects)
       - Multi-lab integration

  6. Projects (Contd.)
     - Molecular Science (NPACI project)
       - SLAC: 2 MB/sec; 2 TB/yr
       - Privacy issues
     - Biology: Functional MRI Integration (NIH proposal)
       - Integrate multiple MRI facilities; 1 TB/dataset
     - Medicine: Digital Embryo (NIH project)
       - Collection management; 500 TB
     - Education: NSDL
       - Federate over 40 existing collections

  7. Projects (Contd.)
     - Earth System Sciences
       - ESIPS (Earth System Information Providers)
       - Build and federate multiple digital libraries
       - EOSDIS: 7 PB by 2007; 3-15 MB/sec
       - LTER (Long Term Ecological Research)
       - BioComplexity (KDI)
       - Federate collections from 20+ sites
     - Persistent Archives: NARA project
       - Store and recover data after 400 years
       - 5 million emails; 33 million web pages
       - 90 million personnel records; …

  8. Data Handling Problems
     - Large datasets; large number of datasets
     - Distributed, heterogeneous storage
     - Collaboration, access control, authentication
     - Replication, coherency
     - Caching and data placements
     - Data migration over time and space
     - Fault tolerance and load distribution
     - Collection curation and management
     - Uniform name space management

  9. Metadata Problems
     - Large number of attributes; large size
     - Standardized metadata
     - User-defined metadata
     - Federation: integration over space
     - Evolution: integration over time
     - Presentation
     - Extraction and maintenance

  10. Metadata Problems
      - Integration
      - Schema integration and crosswalks
      - Ontological differences
      - Context dependency
      - Attribute mappings
      - Type/semantic conversion
      - Inter-domain & intra-domain integration

  11. Solution: SRB
      - Resource transparency: local or remote, resource type & access method
      - Location transparency: path names, schemas
      - Cross-domain authentication & access control
      - Uniform user name space
      - Uniform data/collection name space
      - Data discovery: user-defined metadata
      - Scalable system

  12. Solution: SRB
      - Federated server architecture
      - Uniform access interface: API
      - Metadata catalog: handles transparency and access control
      - Proxy super user: access to remote users
      - Integration of data handling & digital library functionalities
      - Replicated data management

  13. [Architecture diagram. Labels: Application (x2); SRB Interface; SRB Server (x3); SRB Master; SRB Agent; MCAT; MCAT Core; Dublin Core; Eco Core]

  14. Federated SRB Operation
      [Diagram: numbered steps 1 to 6 connecting Application, SRB Master, two SRB agents, and MCAT]

  15. SRB Space
      [Diagram: clients and SRB servers interconnecting DR, DL, and MC nodes]
      Legend: DR - Data Repository; DL - Digital Library; MC - Meta Catalog

  16. Access to heterogeneous resources; concept of collections; concept of logical resources; replication support; container support; proxy operation support; support for methods; users and groups; extensive access control; multiple authentication schemes; user-defined metadata; information-based access; multiple platform support; rich interfaces; administration features

  17. Resources
      - Resources supported: HPSS (DCE and DCE-less), UniTree, ADSM, DMF, Unix FS, Mac OSX FS, NTFS, Oracle, DB2, Sybase, DPSS, HTTP, FTP
        - 32 & 64-bit file sizes
        - Database tables and LOBs
      - Platforms/OS supported: Cray, Sun, SGI, AIX, DEC, Linux, MacOSX, NT, 2000, 98*, Me*
        (* Browser only)

  18. Collections
      - Logical abstraction of directories/folders
      - Not tied to a host or file system
      - Independent of path names
      - Can have datasets in multiple resources
      - Access controlled
      - Curator of collections/sub-collections
      - Collection-level metadata

  19. Logical Resources
      - Group of multiple physical resources
      - Resource metadata: type, bandwidth, …
      - Resource class: Archival, Permanent, Cache, Volatile, Primary, Secondary
      - Various usage modes
        - Automatic replication
        - Choice (m of n resources)
        - Round robin (load balancing)
        - Distributed archive-cache system
        - Near-far system
        - Container movement
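
Two of the usage modes listed on this slide, round robin and choice of m out of n resources, can be illustrated with a small model. This is a minimal sketch, not SRB code: the `LogicalResource` class, its methods, and every resource name except `unix-sdsc` are hypothetical and exist only for illustration.

```python
import itertools
import random


class LogicalResource:
    """Hypothetical model: a named group of physical resources."""

    def __init__(self, name, physical):
        self.name = name
        self.physical = list(physical)
        # Endless iterator over the members, used for round-robin selection.
        self._cycle = itertools.cycle(self.physical)

    def round_robin(self):
        """Return the next physical resource in turn (load balancing)."""
        return next(self._cycle)

    def choose(self, m, rng=None):
        """Return m distinct physical resources out of the n in the group."""
        if not 1 <= m <= len(self.physical):
            raise ValueError("m must be between 1 and n")
        rng = rng or random.Random()
        return rng.sample(self.physical, m)


# Resource names other than 'unix-sdsc' are made up for the example.
lr = LogicalResource("demo-logical", ["unix-sdsc", "hpss-demo", "cache-demo"])
targets = [lr.round_robin() for _ in range(4)]  # wraps around after 3 picks
subset = lr.choose(2)                           # any 2 of the 3 members
```

The caller names only the logical resource; which physical resource receives the data is decided by the usage mode.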

  20. Replication
      - Core functionality
      - Synchronous replication
        - Replication via logical resource definition, integrated into open/create & write functions
        - Can choose: k out of n
        - Associate replication with containers/collections
        - Consistency
      - Asynchronous replication (offline)
        - srbObjReplicate API, Sreplicate command, GUI
      - Out-of-band replication (outside SRB)
        - Registering of replicas using srbRegisterReplica API
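
Synchronous replication through a logical resource can be pictured as one write call that fans out to several physical stores. The sketch below assumes that "k out of n" means writing copies to k of the n member resources. It uses in-memory dictionaries in place of real storage and does not call the SRB API; `replicated_write` and the store names other than `unix-sdsc` are hypothetical.

```python
def replicated_write(stores, name, data, k):
    """Write `data` under `name` to the first k of the n stores.

    `stores` maps a resource name to a dict standing in for that
    resource's storage. Returns one record per replica written,
    numbered from 0, similar in spirit to a catalog entry.
    """
    if not 1 <= k <= len(stores):
        raise ValueError("k must be between 1 and n")
    replicas = []
    for copy_number, (resource, store) in enumerate(list(stores.items())[:k]):
        store[name] = data  # the write itself
        replicas.append({"copy": copy_number, "resource": resource})
    return replicas


# Three stand-in resources; only two receive a copy.
stores = {"unix-sdsc": {}, "hpss-demo": {}, "unix-remote-demo": {}}
records = replicated_write(stores, "survey.dat", b"pixels", k=2)
```

Asynchronous and out-of-band replication would add replica records after the fact instead of inside the write path.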

  21. Replication (Contd.)
      - Choice at read
        - any replica
        - specific replica (by copy number)
        - round-robin
        - by resource characteristics
        - by timestamp or other characteristics
      - Data itself may be identified by meta characteristics
        - user-defined metadata & annotations
        - data type, owner, comments, ...
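
The read-time choices on this slide amount to filtering and ranking a list of replica records. A minimal sketch under assumed names follows: the `Replica` fields and the `pick_replica` function are hypothetical and not part of the SRB client library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    """Hypothetical catalog record for one copy of a dataset."""
    copy_number: int
    resource: str
    resource_class: str  # e.g. "Cache" or "Archival"
    timestamp: float     # last-modified time, seconds since the epoch


def pick_replica(replicas, copy_number=None, resource_class=None, newest=False):
    """Select one replica according to the caller's criteria."""
    candidates = list(replicas)
    if copy_number is not None:
        candidates = [r for r in candidates if r.copy_number == copy_number]
    if resource_class is not None:
        candidates = [r for r in candidates if r.resource_class == resource_class]
    if not candidates:
        raise LookupError("no replica matches the requested criteria")
    if newest:
        return max(candidates, key=lambda r: r.timestamp)
    return candidates[0]  # "any replica": take the first match


replicas = [
    Replica(0, "hpss-demo", "Archival", 100.0),
    Replica(1, "unix-sdsc", "Cache", 250.0),
]
fast_copy = pick_replica(replicas, resource_class="Cache")
latest = pick_replica(replicas, newest=True)
```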

  22. Containers
      - Physical grouping of objects
      - Similar to tar but has significant differences
      - Multiple uses:
        - To take advantage of resource characteristics
        - To aid access patterns
        - Move data sets together
        - Tie together logically different files
      - Automatic archiving/caching
      - Chaining of containers
      - Sharing of metadata
      - Containers for collections

  23. Containers (Contd.)
      1. Create Container
      2. Write File 1
      3. Write File 2
      4. Sync Container
      5. Read File 1
      6. Write File 3
      7. Sync & Purge
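
The seven steps above can be walked through with a toy model of a container that lives on a cache resource and is synchronized to an archive. This is an illustrative sketch only: the `Container` class and its behavior are assumptions about what "sync" and "purge" mean, not the SRB implementation.

```python
class Container:
    """Toy container: member files are written to a cache copy and
    synchronized to an archive copy on request."""

    def __init__(self, name):
        self.name = name
        self.cache = {}    # members currently on the cache resource
        self.archive = {}  # members saved on the archival resource

    def write(self, member, data):
        self.cache[member] = data

    def read(self, member):
        if member not in self.cache:
            # Stage archived members back into the cache, without
            # overwriting anything newer that is already cached.
            for name, data in self.archive.items():
                self.cache.setdefault(name, data)
        return self.cache[member]

    def sync(self, purge=False):
        """Copy cached members to the archive; optionally empty the cache."""
        self.archive.update(self.cache)
        if purge:
            self.cache.clear()


c = Container("demo-container")  # 1. create container
c.write("file1", b"one")         # 2. write file 1
c.write("file2", b"two")         # 3. write file 2
c.sync()                         # 4. sync container
first = c.read("file1")          # 5. read file 1 (served from cache)
c.write("file3", b"three")       # 6. write file 3
c.sync(purge=True)               # 7. sync & purge
```

After step 7 the cache is empty and all three files exist only in the archive copy; a later read stages them back.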

  24. Access Control
      - Access control on: datasets, collections, resources
      - Multi-level access: Read, Annotate, Write, Curate, Own
      - Access control for users and groups
      - Ticket-based access control
      - Audit access
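
Multi-level access for users and groups can be sketched as an ordered enumeration plus a lookup. The sketch assumes the five levels are strictly ordered, so that a higher level implies all lower ones; the slide lists them in that order but does not state this. `Access`, `is_allowed`, and the user and group names are hypothetical.

```python
from enum import IntEnum


class Access(IntEnum):
    """Access levels in the order given on the slide (ordering assumed)."""
    NONE = 0
    READ = 1
    ANNOTATE = 2
    WRITE = 3
    CURATE = 4
    OWN = 5


def is_allowed(acl, user, groups, needed):
    """Check whether `user`, directly or via any group, holds `needed`.

    `acl` maps a user or group name to the Access level granted on one
    object (a dataset, collection, or resource).
    """
    granted = [acl.get(user, Access.NONE)]
    granted += [acl.get(group, Access.NONE) for group in groups]
    return max(granted) >= needed


acl = {"alice": Access.OWN, "astro-group": Access.ANNOTATE}
can_annotate = is_allowed(acl, "bob", ["astro-group"], Access.ANNOTATE)
can_write = is_allowed(acl, "bob", ["astro-group"], Access.WRITE)
```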

  25. Authentication
      Four types of authentication:
      - Plain password (useful for web and ssh)
      - Challenge-response
      - SEA: RSA public/private keys and RC5 encryption algorithm
      - GSI: certificate-based systems
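
To show why challenge-response differs from sending a plain password, here is a generic sketch using HMAC-SHA256 from the Python standard library. The slide does not describe SRB's actual challenge-response protocol, so this construction is an assumption chosen only to show the idea: the server sends a random challenge and the client proves knowledge of the password without transmitting it.

```python
import hashlib
import hmac
import secrets


def make_challenge():
    """Server side: generate a fresh random challenge."""
    return secrets.token_bytes(32)


def respond(password, challenge):
    """Client side: key an HMAC with the password and sign the challenge."""
    return hmac.new(password.encode("utf-8"), challenge, hashlib.sha256).digest()


def verify(password, challenge, response):
    """Server side: recompute the expected response and compare in
    constant time."""
    return hmac.compare_digest(respond(password, challenge), response)


challenge = make_challenge()
answer = respond("example-password", challenge)
accepted = verify("example-password", challenge, answer)
rejected = verify("wrong-password", challenge, answer)
```

Because each challenge is random, a captured response cannot be replayed against a later challenge.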

  26. Proxy Operations
      - Operations performed server-side
      - Compiled/preloaded operations (secure)
      - Flexible interface
      - Examples
        - DataCutter (Univ. Maryland)
        - Copy & Replicate (third-party data movement)
        - Methods
        - Metadata

  27. User-defined Metadata
      - Annotations
      - Metadata for datasets and collections
        - 10 strings and 2 integers
        - Flexibly used for storing arbitrary number of metadata
        - Collection-level metadata can store attribute names for datasets
      - Extensible metadata (next generation of MCAT)
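
The "10 strings and 2 integers" scheme can be read as a fixed set of untyped slots per object, with the meaning of each slot recorded once at the collection level. The sketch below models that reading; the class, the function, and the attribute names are hypothetical, and the slide does not specify how MCAT lays the slots out.

```python
from dataclasses import dataclass, field


@dataclass
class UserMetadata:
    """Fixed slots per object: 10 strings and 2 integers."""
    strings: list = field(default_factory=lambda: [None] * 10)
    integers: list = field(default_factory=lambda: [None] * 2)


def describe(attribute_names, dataset_meta):
    """Pair collection-level attribute names with a dataset's string slots.

    `attribute_names` plays the role of collection-level metadata that
    records what each slot means for datasets in the collection.
    """
    return {
        name: value
        for name, value in zip(attribute_names, dataset_meta.strings)
        if value is not None
    }


# Collection-level metadata names the slots (names are made up).
slot_names = ["survey", "band", "observer"]

image_meta = UserMetadata()
image_meta.strings[0] = "2MASS"
image_meta.strings[1] = "J"
image_meta.integers[0] = 42

labelled = describe(slot_names, image_meta)
```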

  28. Access to heterogeneous resources; concept of collections; concept of logical resources; replication support; container support; proxy operation support; support for methods; users and groups; extensive access control; multiple authentication schemes; user-defined metadata; information-based access; multiple platform support; rich interfaces; administration features

  29. Client Registration
      - Get software at http://www.npaci.edu/DICE/SRB/tarfiles /???
      - To register as a user (at one of the SDSC-based SRB spaces), fill in the form at http://www.npaci.edu/DICE/SRB/install/SRBUserRegister.html
      - SRB Admin will respond with your authorization:
        - password (should be changed immediately)
        - user and domain name
        - host name and port number
        - home collection & default resource details

  30. Setting the Client Environs
      Two environment files:
      - .srb/.MdasEnv
          mdasCollectionHome '/home/myName.myDomain'
          mdasDomainHome 'myDomain'
          srbUser 'myName'
          srbHost 'srb.sdsc.edu'
          defaultResource 'unix-sdsc'
          AUTH_SCHEME 'ENCRYPT1'
      - .srb/.MdasAuth
          MYSRBPASSWD
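
A client tool needs to read these settings before it can contact a server. The sketch below parses text in the layout shown on the slide, assuming one `key 'value'` pair per line. The function name `parse_mdas_env` is hypothetical, the skipping of `#` comment lines is an assumption, and the real SRB clients ship their own parser.

```python
def parse_mdas_env(text):
    """Parse `key 'value'` lines into a dict.

    Accepts straight or typographic quotes around the value and skips
    blank lines and lines starting with '#'.
    """
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(" ")
        settings[key] = value.strip().strip("'\"\u2018\u2019")
    return settings


sample = """
mdasCollectionHome '/home/myName.myDomain'
mdasDomainHome 'myDomain'
srbUser 'myName'
srbHost 'srb.sdsc.edu'
defaultResource 'unix-sdsc'
AUTH_SCHEME 'ENCRYPT1'
"""

env = parse_mdas_env(sample)
home = env["mdasCollectionHome"]
```

The password in `.srb/.MdasAuth` is kept in a separate file, which allows it to carry stricter file permissions than the rest of the configuration.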
