
Hai-Ning Wu Academia Sinica Grid Computing hnwu@twgrid




  1. Hai-Ning Wu Academia Sinica Grid Computing hnwu@twgrid.org Data Grid Services/SRB/SRM & Practical

  2. Outline • Introduction • Characteristics of data grid • Storage Resource Management (SRM) • Storage Resource Broker (SRB) • SRB Practical • Summary

  3. Data Storage at Large Scale • Historically, data has been STORED rather than MANAGED • The amount of data grows so rapidly that traditional storage architectures are no longer suitable • Data are distributed across many types of sources, which makes them hard to integrate and raises the barriers between users and storage systems

  4. Challenges of Data Storage • Large data volumes • Distributed storage via network • Heterogeneous data resources • Managing data efficiently and safely • Long-term preservation

  5. The Solution: Data Grid • Data virtualization • Manipulates data at a high level • Hides low-level details • Provides a uniform interface to access the distributed data storage systems

  6. Virtualization • Data virtualization • Trust virtualization • Data grids are used to manage shared collections that are distributed across multiple sites and multiple storage systems

  7. Data Grid - The Idea [Figure: client users send a request for data to the data grid, and the data found is returned to them.]

  8. Data Grid - The Idea [Figure: client users send a request for data to the data grid system, and the data found is returned to them.] Details are hidden. The data grid system finds out where the data are located.

  9. Data Grid Transparencies • Find data without knowing the identifier: descriptive attributes • Access data without knowing the location: logical name space • Access data without knowing the type of storage: storage repository abstraction • Retrieve data using your preferred API: access abstraction • Provide transformations for any data collection: data behavior abstraction
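
The first two transparencies (finding data by descriptive attributes, and reaching it through a logical name space) can be illustrated with a small in-memory sketch. Everything in it is an assumption made for illustration: the class names, the sample paths, and the site labels are hypothetical, and a real data grid keeps this information in a metadata catalog rather than in Python dictionaries.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    site: str           # hypothetical site label
    storage_type: str   # e.g. "disk" or "tape"
    physical_path: str  # where the bytes actually live


@dataclass
class LogicalEntry:
    logical_name: str
    attributes: dict = field(default_factory=dict)
    replicas: list = field(default_factory=list)


class LogicalNameSpace:
    """Toy catalogue: users see logical names, never physical paths."""

    def __init__(self):
        self._entries = {}

    def register(self, logical_name, replica, **attributes):
        entry = self._entries.setdefault(logical_name, LogicalEntry(logical_name))
        entry.attributes.update(attributes)
        entry.replicas.append(replica)

    def locate(self, logical_name):
        """Access without knowing the location: resolve a name to its replicas."""
        return list(self._entries[logical_name].replicas)

    def find(self, **wanted):
        """Find without knowing the identifier: match descriptive attributes."""
        return sorted(
            name
            for name, entry in self._entries.items()
            if all(entry.attributes.get(k) == v for k, v in wanted.items())
        )


ns = LogicalNameSpace()
ns.register("/home/demo/scan01.tif", Replica("siteA", "disk", "/vault/a/0001"),
            collection="maps", year=1920)
ns.register("/home/demo/scan01.tif", Replica("siteB", "tape", "/hsm/b/77"))
ns.register("/home/demo/notes.txt", Replica("siteA", "disk", "/vault/a/0002"),
            collection="letters", year=1920)

print(ns.find(collection="maps"))
print([r.site for r in ns.locate("/home/demo/scan01.tif")])
```

The caller only ever handles the logical name; which site or storage type holds the bytes stays inside the catalogue.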

  10. Data Grid Components • Federated client-server architecture: servers can talk to each other independently of the client • Infrastructure-independent naming: logical names for users, resources, files, applications • Collective ownership of data: collection-owned data, with infrastructure-independent access control lists • Context management: record state information in a metadata catalog from data grid services such as replication • Abstractions for dealing with heterogeneity
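
As a rough illustration of collection-owned data with infrastructure-independent access control lists, the sketch below keys permissions on logical user names rather than on local operating-system accounts. The `Collection` class, the `user@domain` names, and the permission strings are all hypothetical and are not taken from any particular data grid product.

```python
class Collection:
    """Toy collection whose ACL uses logical user names, not local OS accounts."""

    def __init__(self, name):
        self.name = name
        self.acl = {}  # logical user name -> set of permissions

    def grant(self, user, *permissions):
        self.acl.setdefault(user, set()).update(permissions)

    def allowed(self, user, permission):
        return permission in self.acl.get(user, set())


maps = Collection("/home/demo/maps")
maps.grant("alice@ASGC", "read", "write")  # hypothetical user@domain names
maps.grant("bob@NTU", "read")

print(maps.allowed("bob@NTU", "read"))
print(maps.allowed("bob@NTU", "write"))
```

Because the ACL belongs to the collection, the same rule applies no matter which site or storage system holds a given file.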

  11. Data Grid Architecture [Figure: layered architecture diagram; the labels it contains are listed here.] • Application • Access interfaces: OAI, WSDL, OGSA; Linux I/O; DLL / Python, Perl; Java, NT Browsers; C, C++, Java Libraries; Unix Shell; HTTP • Federation Management • Consistency & Metadata Management / Authorization-Authentication Audit • Latency Management • Metadata Transport • Logical Name Space Management • Digital Component Transport • Standard Database Interface • Standard Storage System Operations Interface • Databases: DB2, Oracle, Sybase, Postgres, mySQL, Informix • Archives: Tape, HPSS, ADSM, UniTree, DMF, CASTOR, ADS • Databases: DB2, Oracle, Sybase, SQLserver, Postgres, mySQL, Informix • File Systems: Unix, NT, Mac OSX • ORB

  12. SRM The Data Grid Interface for EGEE Grid Middleware

  13. gLite Services

  14. SE • Storage Element • The Storage Element is the service that allows a user or an application to store data • Data Channel Protocols • File Transfer and File I/O

  15. SRM (Storage Resource Management) • What is SRM? • SRM is a protocol to manage storage resources (it is NOT a file access protocol!) • Provides a uniform interface for computing applications and client users to heterogeneous storage elements • Does not transfer files itself • Provides space management • Manages the lifetime of files
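
To make "space management" and "lifetime of files" more concrete, here is a toy model of the two ideas. It is only a sketch under stated assumptions: the class `ToySpaceManager`, its methods, and the `se.example.org` SURLs are invented for illustration and do not reproduce the SRM protocol or any SRM client library.

```python
class ToySpaceManager:
    """Toy model of space reservation and file lifetime; not the SRM protocol."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.reserved = 0
        self.expiry = {}  # SURL -> time (in seconds) at which the file expires

    def reserve(self, size_bytes):
        """Reserve space up front and return how much remains free."""
        if self.reserved + size_bytes > self.capacity:
            raise RuntimeError("not enough free space")
        self.reserved += size_bytes
        return self.capacity - self.reserved

    def pin(self, surl, now, lifetime_s):
        """Record how long a file is guaranteed to stay available."""
        self.expiry[surl] = now + lifetime_s

    def expired(self, now):
        """Files whose lifetime has run out and may be cleaned up."""
        return sorted(s for s, t in self.expiry.items() if t <= now)


se = ToySpaceManager(capacity_bytes=1000)
free = se.reserve(400)
se.pin("srm://se.example.org/data/f1", now=0, lifetime_s=60)
se.pin("srm://se.example.org/data/f2", now=0, lifetime_s=600)

print(free)
print(se.expired(now=120))
```

Note that nothing here moves file contents, which matches the slide's point that SRM manages storage but does not transfer files itself.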

  16. SRM & Grid

  17. Grid Files • Files in the Grid can be referred to by different names: • Logical File Name (LFN): An alias created by a user to refer to some item of data. For example, /grid/gilda/gridcamp/testFile.txt • Grid Unique IDentifier (GUID): A non-human-readable unique identifier for an item of data. For example, 37afd0cc-c53b-4795-a873-6a9dde35a9cc • Site URL (SURL): The location of an actual piece of data on a storage system. For example, srm://dpm01.grid.sinica.edu.tw/dpm/grid.sinica.edu.tw/home/twgrid/generated/2007-09-18/file4c4a5a6f-878d-4ef3-a73d-941ae6275383 • Transport URL (TURL): Temporary locator of a replica plus access protocol, understood by a SE. For example, gsiftp://dpm01.grid.sinica.edu.tw/dpm01.grid.sinica.edu.tw:/path1/twgrid/2007-09-18/file4c4a5a6f-878d-4ef3-a73d-941ae6275383.168233.0 • While the GUIDs and LFNs identify a file irrespective of its location, the SURLs and TURLs contain information about where a physical replica is located, and how it can be accessed.
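
The difference between the location-independent names and the location-bearing ones can be seen by taking the slide's own examples apart with Python's standard library. The helper names `describe` and `is_guid` are hypothetical; the GUID, SURL, and TURL strings are the ones quoted above.

```python
import uuid
from urllib.parse import urlsplit

GUID = "37afd0cc-c53b-4795-a873-6a9dde35a9cc"
SURL = ("srm://dpm01.grid.sinica.edu.tw/dpm/grid.sinica.edu.tw/home/twgrid/"
        "generated/2007-09-18/file4c4a5a6f-878d-4ef3-a73d-941ae6275383")
TURL = ("gsiftp://dpm01.grid.sinica.edu.tw/dpm01.grid.sinica.edu.tw:/path1/"
        "twgrid/2007-09-18/file4c4a5a6f-878d-4ef3-a73d-941ae6275383.168233.0")


def describe(url):
    """Split a SURL or TURL into protocol, host, and path."""
    parts = urlsplit(url)
    return {"protocol": parts.scheme, "host": parts.hostname, "path": parts.path}


def is_guid(text):
    """True if the text is a well-formed UUID string."""
    try:
        return str(uuid.UUID(text)) == text.lower()
    except ValueError:
        return False


print(describe(SURL)["protocol"], describe(SURL)["host"])
print(describe(TURL)["protocol"], describe(TURL)["host"])
print(is_guid(GUID))
```

The SURL and TURL both name a host, and the TURL additionally names the access protocol, whereas the GUID carries no location information at all.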

  18. LFC • File Catalogue (LFC) • The mappings between LFNs, GUIDs and SURLs are kept in a File Catalogue service, while the files themselves are stored in Storage Elements. • The only file catalogue officially supported in WLCG/EGEE is the LCG File Catalogue (LFC). [Figure: mapping by the "LFC" catalogue server.]
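
The mapping that a file catalogue maintains can be sketched as two dictionaries: several LFNs may point to one GUID, and one GUID may have several SURLs. This is an in-memory illustration only; `ToyFileCatalogue` and its methods are hypothetical names, the `example.org` SURLs are made up, and none of this is the LFC client API.

```python
import uuid


class ToyFileCatalogue:
    """In-memory stand-in for a file catalogue; not the LFC client API."""

    def __init__(self):
        self.guid_of = {}   # LFN -> GUID
        self.surls_of = {}  # GUID -> list of SURLs (one per replica)

    def register(self, lfn, surl):
        """Create a new entry: fresh GUID, one LFN, one replica."""
        guid = str(uuid.uuid4())
        self.guid_of[lfn] = guid
        self.surls_of[guid] = [surl]
        return guid

    def add_alias(self, existing_lfn, new_lfn):
        """A second human-readable name for the same file."""
        self.guid_of[new_lfn] = self.guid_of[existing_lfn]

    def add_replica(self, lfn, surl):
        """Another physical copy of the same file."""
        self.surls_of[self.guid_of[lfn]].append(surl)

    def replicas(self, lfn):
        return list(self.surls_of[self.guid_of[lfn]])


cat = ToyFileCatalogue()
lfn = "/grid/gilda/gridcamp/testFile.txt"
guid = cat.register(lfn, "srm://se1.example.org/data/f1")
cat.add_replica(lfn, "srm://se2.example.org/data/f1")
cat.add_alias(lfn, "/grid/gilda/gridcamp/alias.txt")

print(cat.replicas("/grid/gilda/gridcamp/alias.txt"))
```

Looking up either LFN yields the same replica list, because both resolve through the same GUID.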

  19. Upload a file to a SE • CASE 1: User needs to store data in a SE (from a UI) • Create a new LFN entry in the LFC, which returns a SURL • srmPrepareToPut (SURL) • Transfer the file • srmPutDone (SURL)

  20. Upload a file to a SE • CASE 2: Application needs to store data in a SE (from a WN) • Create a new LFN entry in the LFC, which returns a SURL • srmPrepareToPut (SURL) • Transfer the file • srmPutDone (SURL)
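
Both upload cases follow the same four steps, which can be traced with a mock storage element that simply records the order of calls. This is a sketch under stated assumptions: `MockStorageElement`, `upload`, the `se.example.org` host, and the way the SURL and TURL are built are all hypothetical, and no real SRM client or network transfer is involved.

```python
class MockStorageElement:
    """Records SRM-style calls in order; no network traffic takes place."""

    def __init__(self):
        self.calls = []
        self.stored = {}  # TURL -> bytes

    def srm_prepare_to_put(self, surl):
        self.calls.append("srmPrepareToPut")
        # Hypothetical TURL: same path, transfer protocol instead of srm.
        return surl.replace("srm://", "gsiftp://", 1)

    def transfer_in(self, turl, data):
        self.calls.append("transfer")
        self.stored[turl] = data

    def srm_put_done(self, surl):
        self.calls.append("srmPutDone")


def upload(catalogue, se, lfn, data):
    # Step 1: create the LFN entry; the catalogue hands back a SURL.
    surl = "srm://se.example.org/data" + lfn  # hypothetical naming scheme
    catalogue[lfn] = surl
    # Step 2: ask the SE to prepare for the incoming file.
    turl = se.srm_prepare_to_put(surl)
    # Step 3: move the bytes over the data channel.
    se.transfer_in(turl, data)
    # Step 4: tell the SE that the write is complete.
    se.srm_put_done(surl)
    return surl


catalogue = {}
se = MockStorageElement()
surl = upload(catalogue, se, "/grid/demo/testFile.txt", b"hello")

print(se.calls)
```

The transfer itself sits between the two SRM calls, which reflects the earlier point that SRM coordinates storage but leaves the data movement to a separate protocol.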

  21. Download files from a SE • CASE 3: User needs to retrieve (onto the UI) data stored in a SE • Query the file catalog to retrieve the SURL from the LFN • srmPrepareToGet (SURL) • Transfer the file (read) • srmReleaseFiles (SURL)

  22. Download files from a SE • CASE 4: Application needs to copy data locally (onto the WN) and use them • Query the file catalog to retrieve the SURL from the LFN • srmPrepareToGet (SURL) • Transfer the file (read) • srmReleaseFiles (SURL)
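
The download cases mirror the upload ones: look up the SURL, prepare, read, release. The sketch below traces that order with a mock and uses `try`/`finally` so that the release step runs even if the read fails. All names here (`MockStorageElement`, `download`, the `se.example.org` host, the TURL construction) are hypothetical stand-ins, not a real SRM client.

```python
class MockStorageElement:
    """Serves files from memory and records the order of SRM-style calls."""

    def __init__(self, files):
        self.files = dict(files)  # SURL -> bytes
        self.calls = []
        self.pinned = set()

    def srm_prepare_to_get(self, surl):
        self.calls.append("srmPrepareToGet")
        self.pinned.add(surl)
        # Hypothetical TURL: same path, transfer protocol instead of srm.
        return surl.replace("srm://", "gsiftp://", 1)

    def transfer_out(self, turl):
        self.calls.append("transfer")
        return self.files[turl.replace("gsiftp://", "srm://", 1)]

    def release(self, surl):
        self.calls.append("release")
        self.pinned.discard(surl)


def download(catalogue, se, lfn):
    surl = catalogue[lfn]                # Step 1: LFN -> SURL
    turl = se.srm_prepare_to_get(surl)   # Step 2: prepare the file for reading
    try:
        return se.transfer_out(turl)     # Step 3: read over the data channel
    finally:
        se.release(surl)                 # Step 4: release, even on failure


surl = "srm://se.example.org/data/grid/demo/testFile.txt"
catalogue = {"/grid/demo/testFile.txt": surl}
se = MockStorageElement({surl: b"hello"})
data = download(catalogue, se, "/grid/demo/testFile.txt")

print(data, se.calls)
```

Releasing in a `finally` block is a design choice of this sketch: it keeps a failed transfer from leaving the file pinned on the storage element.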

  23. SRB Storage Resource Broker

  24. Storage Resource Broker • Developed at San Diego Supercomputer Center • A distributed file management system (Data Grid), based on a client-server architecture • A uniform interface to heterogeneous data storage resources, addressed by their attributes rather than just their names or physical locations • Supports many data storage systems • Provides various types of client interfaces on different platforms
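
A uniform interface over heterogeneous storage, with resources chosen by attribute, is essentially a driver abstraction plus a broker. The sketch below shows the pattern in miniature; `StorageDriver`, `DiskDriver`, `ArchiveDriver`, `Broker`, and the resource names are hypothetical and do not reproduce SRB's actual driver layer.

```python
from abc import ABC, abstractmethod


class StorageDriver(ABC):
    """The one interface every storage back end must offer."""

    @abstractmethod
    def write(self, path, data): ...

    @abstractmethod
    def read(self, path): ...


class DiskDriver(StorageDriver):
    def __init__(self):
        self._blocks = {}

    def write(self, path, data):
        self._blocks[path] = data

    def read(self, path):
        return self._blocks[path]


class ArchiveDriver(StorageDriver):
    """Pretends to be a tape archive: every read counts as a staging operation."""

    def __init__(self):
        self._tape = {}
        self.stage_count = 0

    def write(self, path, data):
        self._tape[path] = data

    def read(self, path):
        self.stage_count += 1
        return self._tape[path]


class Broker:
    """Picks a resource by its attributes rather than by name or location."""

    def __init__(self):
        self._resources = {}  # resource name -> (attributes, driver)

    def add_resource(self, name, driver, **attributes):
        self._resources[name] = (attributes, driver)

    def select(self, **wanted):
        for name, (attrs, driver) in self._resources.items():
            if all(attrs.get(k) == v for k, v in wanted.items()):
                return name, driver
        raise LookupError("no resource matches the requested attributes")


broker = Broker()
broker.add_resource("vault-a", DiskDriver(), media="disk", site="A")
broker.add_resource("vault-b", ArchiveDriver(), media="tape", site="B")

name, driver = broker.select(media="tape")
driver.write("obj1", b"payload")
print(name, driver.read("obj1"))
```

The client code calls `write` and `read` the same way for disk and tape; only the driver knows that the archive needs staging.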

  25. SRB Physical Structure [Figure: components shown are a user at location X; four SRB servers, one of them the MCAT-enabled server with an Oracle client and an Oracle RDBMS; and SRB vaults at locations A, B, and D, each with a storage driver and storage space.]

  26. SRB Practical - inQ • Download inQ 3.5.0 from http://www.sdsc.edu/srb/tarfiles/inQ350.zip • Unzip inQ350.zip • Execute inQ.exe

  27. inQ – Login • Name: srbusr+your number • Host: tap07.grid.sinica.edu.tw • Domain: ASGC • Port: 6833 • Authorization: ENCRYPT1 • Password: The same as your user name

  28. SRB Client Tool - inQ

  29. SRB Demonstrations • Use inQ to upload, download, and remove files. • Use Scommands to upload, download, and remove files. • Sinit: log in to the SRB system • Syntax: Sinit • Sls: list directory content • Syntax: Sls • Sput: upload a file to the SRB server • Syntax: Sput filename • Sget: download a file from the SRB server • Syntax: Sget filename • Srm: remove a file stored on the SRB server • Syntax: Srm filename • Sreplicate: replicate data to another resource • Syntax: Sreplicate filename • Sexit: log out of the SRB system • Syntax: Sexit
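
A typical practice session strings these Scommands together in order. The helper below only builds the command lines, using exactly the syntax listed on the slide, and does not execute anything, so it runs without an SRB installation. The function name `scommand_session` and the sample file name are hypothetical.

```python
import shlex


def scommand_session(upload, download, remove=None):
    """Return the Scommand lines for one session, in order. Nothing is executed."""
    steps = [
        ["Sinit"],               # log in
        ["Sput", upload],        # upload a local file
        ["Sls"],                 # confirm that it is listed
        ["Sreplicate", upload],  # replicate it to another resource
        ["Sget", download],      # download it again
    ]
    if remove is not None:
        steps.append(["Srm", remove])  # remove it from the server
    steps.append(["Sexit"])            # log out
    # shlex.join quotes file names that contain spaces.
    return [shlex.join(step) for step in steps]


for line in scommand_session("report.txt", "report.txt", remove="report.txt"):
    print(line)
```

On a machine with the SRB client installed and configured, the printed lines could be typed into a shell one by one.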

  30. Data Grid Applications • Digital Archiving • Long-term preservation • Heterogeneous backup • Digital Library • Data sharing • Scientific Computing

  31. SRB Use Case • Build Data Grid Management System • Data Grid services in Academia Sinica • NDAP cross-organization data backup project

  32. SRB Data Grid Services in Academia Sinica (1) • Objective: to provide Grid services for long-term preservation and unified data access • Data Collection Status • File size: ~ 60 TB • File count: ~ 3.5 Million

  33. SRB Data Grid Services in Academia Sinica (2)

  34. NDAP Partners For Long-term Data Preservation • Academia Sinica (AS) • National Palace Museum (NPM) • National Taiwan University (NTU) • National Museum of History (NMH) • Academia Historica (DRNH) • National Central Library (NCL) • National Museum of Natural Science (NMNS) • Taiwan Historica (TH)

  35. Data grid for NDAP LTP service

  36. Summary • Data grids provide a new solution for large-scale storage with the following features: • Distributed data storage • Efficient and safe management of data • A uniform interface to heterogeneous systems • Flexibility to adopt new storage technologies

  37. SRM & SRB • SRM • Used in gLite middleware • A uniform interface between different SEs and grid middleware • SRB • Developed by SDSC • Supports many backend storage systems • Widely used data grid software

  38. SRM & SRB • SRM and SRB cannot interoperate without a common standard for communication • A bridge between SRM and SRB is being constructed to • Integrate SRB into the gLite environment • Bind resources from the two important data grid systems • This project is currently being developed by ASGC
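
One generic way to bridge two systems is an adapter that accepts requests in one system's vocabulary and forwards them in the other's. The sketch below shows only that general pattern; it is not the ASGC bridge, whose design the slides do not describe. `ToySRBClient`, `SRMFacade`, the `bridge.example.org` host, and the path mapping are all hypothetical.

```python
class ToySRBClient:
    """Stand-in for an SRB-like client that stores objects by logical path."""

    def __init__(self):
        self.objects = {}

    def put(self, path, data):
        self.objects[path] = data

    def get(self, path):
        return self.objects[path]


class SRMFacade:
    """Accepts SRM-style SURLs and forwards the work to the SRB-like client."""

    def __init__(self, host, srb):
        self.prefix = "srm://" + host
        self.srb = srb

    def _to_srb_path(self, surl):
        # Hypothetical mapping: strip the srm://host part, keep the path.
        if not surl.startswith(self.prefix + "/"):
            raise ValueError("SURL does not belong to this bridge")
        return surl[len(self.prefix):]

    def store(self, surl, data):
        self.srb.put(self._to_srb_path(surl), data)

    def fetch(self, surl):
        return self.srb.get(self._to_srb_path(surl))


srb = ToySRBClient()
bridge = SRMFacade("bridge.example.org", srb)
bridge.store("srm://bridge.example.org/home/demo/f1", b"payload")

print(sorted(srb.objects))
print(bridge.fetch("srm://bridge.example.org/home/demo/f1"))
```

A grid client that only understands SURLs can use the facade without knowing that the bytes end up in an SRB-style store.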

  39. SRM & SRB

  40. iRODS • The next-generation data grid system after SRB, developed by SDSC • A rule-oriented data grid system • More flexibility for data management • Current version: iRODS 1.0

  41. iRODS Workshop • Time – Tue 8 April 2008 • Location – 2nd Conference Room, 3F • For more information, please check on ISGC 2008 Website

  42. References [1] Use Cases on Data Services, Fu-Ming Tsai [2] Building Preservation Environments with Data Grid Technology, R. Moore [3] EGEE Middleware Architecture and Planning (Release 1)

  43. Thanks for your attention!
