Production Data Grids SRB - iRODS

Production Data Grids SRB - iRODS. Storage Resource Broker. Reagan W. Moore moore@sdsc.edu http://www.sdsc.edu/srb. Topics. Production data grids Architecture and installation challenges Production challenges Interoperability challenges (federation) Applications Data grids - sharing

Presentation Transcript


  1. Production Data Grids: SRB - iRODS • Storage Resource Broker • Reagan W. Moore • moore@sdsc.edu • http://www.sdsc.edu/srb

  2. Topics
     • Production data grids
       • Architecture and installation challenges
       • Production challenges
       • Interoperability challenges (federation)
     • Applications
       • Data grids - sharing
       • Digital libraries - publication
       • Persistent archive - preservation
       • Real-time sensor data - collection
       • Cyberinfrastructure - analysis

  3. BaBar High-Energy Physics
     • Stanford Linear Accelerator, Palo Alto, CA
     • IN2P3, Lyon, France
     • A functioning international Data Grid for high-energy physics
     • Manchester-SDSC mirror
     • Moved over 300 TB of data
     • Increasing to 5 TB per day

  4. Architecture Challenges
     • Infrastructure heterogeneity
       • Storage in file systems, archives, ORBs
       • Choice of database for metadata catalog
     • Network devices
       • Management of firewalls, private virtual networks, load levelers
     • Network latency
       • Geographic distance between storage locations

  5. Installation Choices
     • Infrastructure heterogeneity
       • Provision of drivers for each type of storage system or database
       • Porting of APIs for each preferred access mechanism
     • Network devices
       • Establishment of range of ports for access through firewall
       • Server-initiated parallel I/O and bulk operations
     • Network latency
       • Master-slave metadata catalogs
       • Federation across multiple independent data grids

  6. Using a Data Grid – in Abstract
     • User asks for data from the data grid
     • The data is found and returned
     • Where & how details are hidden
     [Diagram labels: Ask for data; Data Grid; Data delivered]
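
The abstraction on slide 6 (the user asks by name, and the "where and how" stay hidden) can be illustrated with a toy catalog that maps a logical file name to physical replicas. This is only a minimal sketch and not SRB or iRODS code: `Replica`, `Catalog`, `register`, and `resolve` are hypothetical names, and the resource names and paths are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Replica:
    """Physical location of one copy of a file (hypothetical fields)."""
    resource: str        # name of a storage system
    path: str            # physical path on that system
    online: bool = True  # whether the copy can be read right now


@dataclass
class Catalog:
    """Toy metadata catalog: logical name -> list of replicas."""
    entries: dict = field(default_factory=dict)

    def register(self, logical_name: str, replica: Replica) -> None:
        self.entries.setdefault(logical_name, []).append(replica)

    def resolve(self, logical_name: str) -> Replica:
        """Return the first online replica; the caller never names a location."""
        for replica in self.entries.get(logical_name, []):
            if replica.online:
                return replica
        raise LookupError(f"no online replica for {logical_name}")


catalog = Catalog()
catalog.register("/grid/home/demo/run42.dat",
                 Replica("tape-archive", "/hpss/a/0001", online=False))
catalog.register("/grid/home/demo/run42.dat",
                 Replica("disk-cache", "/data/b/run42.dat"))

# The user asks only by logical name; the catalog picks a usable copy.
chosen = catalog.resolve("/grid/home/demo/run42.dat")
print(chosen.resource)
```

In this sketch the first registered copy is offline, so the request is served from the second one without the caller knowing that either location exists.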

  7. Data Grid Management
     • Data grids integrate multiple system components
       • Application level client software
       • Federation software
       • Data grid servers
       • Data grid metadata catalog
       • Security infrastructure
       • Storage systems
       • Database catalog
       • Network
     • A failure in any of the systems is viewed as a failure of the data grid

  8. Operation Challenges
     • Data grids provide mechanisms to analyze all types of infrastructure failure
       • Integrity checks
       • Authenticity checks
       • System logs
     • Data grids provide mechanisms to manage all types of infrastructure failure
       • Replication of data and metadata
       • Synchronization of replicas
       • Federation of data grids
       • Server rebooting and server maintenance mode
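
As an illustration of how an integrity check and replica synchronization from slide 8 might fit together, the sketch below compares each replica against a stored checksum and overwrites damaged copies from a good one. It is an assumption-laden example rather than the mechanism SRB or iRODS uses: the choice of SHA-256, the function names `checksum` and `verify_and_repair`, and the temporary files standing in for storage systems are all invented here.

```python
import hashlib
import tempfile
from pathlib import Path


def checksum(path: Path) -> str:
    """SHA-256 of a file, read in 1 MiB chunks (hash choice is an assumption)."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_and_repair(replicas: list[Path], expected: str) -> list[Path]:
    """Overwrite replicas that fail the check with a good copy; return repaired paths."""
    good = [p for p in replicas if p.exists() and checksum(p) == expected]
    if not good:
        raise RuntimeError("all replicas are damaged; nothing to repair from")
    repaired = []
    for path in replicas:
        if path not in good:
            path.write_bytes(good[0].read_bytes())  # fine for a small demo file
            repaired.append(path)
    return repaired


with tempfile.TemporaryDirectory() as tmp:
    a, b = Path(tmp, "replica_a"), Path(tmp, "replica_b")
    a.write_bytes(b"event data")
    b.write_bytes(b"event dat\x00")  # simulated corruption of the second copy
    expected = hashlib.sha256(b"event data").hexdigest()

    repaired = verify_and_repair([a, b], expected)
    repaired_names = [p.name for p in repaired]
    both_match = checksum(a) == checksum(b) == expected
    print(repaired_names, both_match)
```

The expected checksum plays the role of state held in the metadata catalog; a production system would also log each failure so that reported data losses can be tracked.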

  9. Operation Procedures
     • Periodic system administration
       • Manage integrity checks on data
       • Manage audit trails
       • Manage consistency checks on collections
       • Manage synchronization of replicas
       • Manage deletion of files (empty trash can)
       • Track all errors and reported data losses
       • Manage upgrades to new versions of the data grid servers
     • Operational tasks for each data grid
       • Add servers for new storage systems
       • Add new users
       • Respond to user questions
       • Modify access controls on collections and storage
       • Restart data grid servers as needed
       • Identify problems with storage systems
       • Respond to installation questions
       • Integrate user interfaces with data grid

  10. Automation of Management Tasks
      • integrated Rule-Oriented Data System - iRODS
        • Express management policies as rules that control the execution of micro-services
        • Micro-service is a standard operation performed on a remote storage system
        • Manage persistent state information that describes outcome of the micro-service
        • Persistent Metadata catalog stores state information
      • Virtualize the management policies
        • Logical name space for rules
        • Logical name space for micro-services
        • Logical name space for state information
      • First release in December 2006
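
The relationship on slide 10 between rules, micro-services, and persistent state can be sketched in a few lines. This is not the iRODS rule language or its API; it is a toy model under stated assumptions, where `MICRO_SERVICES`, `RULES`, `STATE_LOG`, and `fire` are hypothetical names, the micro-services only return strings instead of touching storage, and an in-memory list stands in for the metadata catalog.

```python
from typing import Callable

# Logical name space for micro-services: name -> callable (hypothetical registry).
MICRO_SERVICES: dict[str, Callable[[dict], str]] = {}


def micro_service(name: str):
    """Register a function under a logical micro-service name."""
    def decorator(func):
        MICRO_SERVICES[name] = func
        return func
    return decorator


@micro_service("compute_checksum")
def compute_checksum(ctx: dict) -> str:
    return f"checksum recorded for {ctx['object']}"


@micro_service("replicate")
def replicate(ctx: dict) -> str:
    return f"{ctx['object']} replicated to {ctx['target']}"


# Rule base: event name -> list of (condition, micro-service names to run).
RULES = {
    "on_ingest": [
        (lambda ctx: True, ["compute_checksum"]),
        (lambda ctx: ctx["collection"].startswith("/archive"), ["replicate"]),
    ],
}

# Stand-in for persistent state information kept in the metadata catalog.
STATE_LOG: list[tuple[str, str, str]] = []


def fire(event: str, ctx: dict) -> None:
    """Run every micro-service whose rule condition holds and record the outcome."""
    for condition, services in RULES.get(event, []):
        if condition(ctx):
            for name in services:
                outcome = MICRO_SERVICES[name](ctx)
                STATE_LOG.append((ctx["object"], name, outcome))


fire("on_ingest", {"object": "scan001.tif", "collection": "/archive/maps", "target": "tape"})
fire("on_ingest", {"object": "notes.txt", "collection": "/scratch", "target": "tape"})
for entry in STATE_LOG:
    print(entry)
```

Changing a policy in this model means editing `RULES` only; the micro-services and the recorded state keep their logical names, which is the kind of separation the slide describes as virtualizing the management policies.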

  11. iRODS - integrated Rule-Oriented Data System
      [Architecture diagram. Component labels: Client Interface; Admin Interface; Rule Invoker; Service Manager; Resources; Rule Engine; Rule Modifier Module; Config Modifier Module; Metadata Modifier Module; Consistency Check Module (three instances); Micro Service Modules (two instances); Resource-based Micro-services; Metadata-based Micro-services; Rule Base; Current State; Confs; Metadata Persistent Repository]

  12. Interoperation Virtualization
      • Management of federation with other data grid technologies
        • Define micro-service that executes the protocols required by the alternate data grid
        • Define rule for when this micro-service is executed (link to explicit storage location)
        • Separately manage state information from application of this micro-service
      • iRODS enables encapsulation of the rules, access mechanisms, and state information needed for interoperation with other data grids
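
One way to picture the three steps on slide 12 is an adapter: a micro-service wraps the foreign protocol, a rule ties it to a storage location, and the resulting state is kept apart from both. The sketch below is a hypothetical illustration, not an iRODS interface. `ForeignGridProtocol`, `FakeForeignGrid`, `FOREIGN_PREFIXES`, and `interop_get` are invented names, and the "foreign grid" is just an in-memory dictionary.

```python
from typing import Protocol


class ForeignGridProtocol(Protocol):
    """What the interoperation micro-service needs from another grid (assumed)."""
    def fetch(self, remote_id: str) -> bytes: ...


class FakeForeignGrid:
    """Stand-in for a different data grid technology, invented for this example."""
    def __init__(self, objects: dict[str, bytes]):
        self._objects = objects

    def fetch(self, remote_id: str) -> bytes:
        return self._objects[remote_id]


# Rule: logical names under these prefixes are served by the foreign grid.
FOREIGN_PREFIXES: dict[str, ForeignGridProtocol] = {
    "/partner/": FakeForeignGrid({"survey.csv": b"id,value\n1,7\n"}),
}

# State information, managed separately from the micro-service itself.
access_state: list[dict] = []


def interop_get(logical_name: str) -> bytes:
    """Micro-service: speak the foreign protocol when the rule says so."""
    for prefix, grid in FOREIGN_PREFIXES.items():
        if logical_name.startswith(prefix):
            data = grid.fetch(logical_name[len(prefix):])
            access_state.append({"name": logical_name, "bytes": len(data)})
            return data
    raise LookupError(f"{logical_name} is not mapped to a foreign grid")


payload = interop_get("/partner/survey.csv")
print(len(payload), access_state)
```

Supporting another grid technology in this model means adding one more class that satisfies `ForeignGridProtocol` and one more prefix entry, without touching the callers.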

  13. Federation Between Data Grids
      • Data Access Methods (Web Browser, DSpace, OAI-PMH)
      • Data Collection A: Data Grid
        • Logical resource name space
        • Logical user name space
        • Logical file name space
        • Logical persistent state name space
        • Logical rule name space
        • Logical micro-service name space
      • Data Collection B: Data Grid
        • Logical resource name space
        • Logical user name space
        • Logical file name space
        • Logical persistent state name space
        • Logical rule name space
        • Logical micro-service name space
      • Access controls and consistency constraints on cross registration of logical name spaces
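
Cross registration of a user name space under access controls, as named on slide 13, can be sketched with two independent toy grids. Everything here is an assumption made for illustration: the `Zone` class, its methods, and the `user#zone` way of qualifying names are chosen for this example and do not describe how SRB or iRODS store federation state.

```python
class Zone:
    """Toy data grid with its own user and file name spaces (hypothetical)."""

    def __init__(self, name: str):
        self.name = name
        self.users: set[str] = set()            # local user name space
        self.trusted_users: set[str] = set()    # cross-registered "user#zone" names
        self.files: dict[str, set[str]] = {}    # logical file name -> allowed readers

    def add_user(self, user: str) -> None:
        self.users.add(user)

    def add_file(self, logical_name: str, readers: set[str]) -> None:
        self.files[logical_name] = set(readers)

    def cross_register_user(self, user: str, home_zone: "Zone") -> None:
        """Consistency constraint: only users that exist in their home zone."""
        if user not in home_zone.users:
            raise ValueError(f"{user} is not a user of zone {home_zone.name}")
        self.trusted_users.add(f"{user}#{home_zone.name}")

    def can_read(self, qualified_user: str, logical_name: str) -> bool:
        """Access control: the user must be known here and listed on the file."""
        local = {f"{u}#{self.name}" for u in self.users}
        known = qualified_user in local or qualified_user in self.trusted_users
        return known and qualified_user in self.files.get(logical_name, set())


zone_a, zone_b = Zone("A"), Zone("B")
zone_a.add_user("alice")
zone_b.add_file("/B/home/shared/report.pdf", readers={"alice#A"})

before = zone_b.can_read("alice#A", "/B/home/shared/report.pdf")
zone_b.cross_register_user("alice", zone_a)
after = zone_b.can_read("alice#A", "/B/home/shared/report.pdf")
print(before, after)
```

Each zone keeps full control of its own name spaces: zone B decides which remote users it recognizes and which files they may read, and zone A is never modified by the registration.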

  14. OGF Data Grid Federation

  15. For More Information • Reagan W. Moore • San Diego Supercomputer Center • moore@sdsc.edu • http://www.sdsc.edu/srb/
