
Towards Data Grid Standard Implementations

Towards Data Grid Standard Implementations. Arun Jagatheesan San Diego Supercomputer Center. Open Grid Forum 19 Jan 31, 2007 – session II. Outline. Community Introduction : OGF-GFS User perspective Developer/Vendor Perspective Need for standard community implementation





Presentation Transcript


  1. Towards Data Grid Standard Implementations Arun Jagatheesan San Diego Supercomputer Center Open Grid Forum 19 Jan 31, 2007 – session II

  2. Outline • Community Introduction : OGF-GFS • User perspective • Developer/Vendor Perspective • Need for standard community implementation • Community implementation process • GFS-WG community architecture sketch • Follow-up actions

  3. Motivation • Global namespace for unstructured data storage • Collaboration amongst multiple partners / teams • Long-term management of unstructured data • Files, collection-based digital entities

  4. NIH BIRN Data Grid

  5. World Wide Datagrid

  6. Used or Required by • Large scale academic projects • Federal agencies (NARA, LoC, …) • Fortune 500, Forbes Global 2000, ….

  7. DGMS Concept-wise • Large-scale logical file system • File System • Database System • Grid Computing = Data Grid Management System (DGMS) • Core Concepts • Logical shared collections • Logical shared resources • Collaborative communities

  8. Problem solved / Requirements –1 • Collaborative logical namespace • Global collaborations of multiple teams • Collaborations of multiple organizations • Avoid multiple mount points as they restrict scalability of the collaboration • Coordinated data sharing at any granular level (data, metadata, annotations,…)

  9. Problem solved / Requirements –2 • Data Distribution • Multi-site replicas reduce access times • Replicas have the same logical name everywhere in the enterprise (big plus for users) • Concept of replica, copy, cache • Replicas controlled by user, admin, system-enabled (automated or policy based) • Reduce WAN latency (chattiness)
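A minimal sketch of the replica idea on slides 8 and 9: one logical name resolves to whichever physical replica suits the caller. The class names, site labels, and paths are hypothetical illustrations, not part of any OGF-GFS specification or DGMS product.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    site: str           # hypothetical site label
    physical_path: str  # where the bytes actually live at that site


@dataclass
class LogicalNamespace:
    """Toy catalog mapping one logical name to many physical replicas."""
    catalog: dict = field(default_factory=dict)

    def register(self, logical_name: str, replica: Replica) -> None:
        # Every replica is filed under the same logical name.
        self.catalog.setdefault(logical_name, []).append(replica)

    def resolve(self, logical_name: str, preferred_site: str) -> Replica:
        # Prefer a replica at the caller's site; otherwise fall back
        # to the first registered replica.
        replicas = self.catalog[logical_name]
        for replica in replicas:
            if replica.site == preferred_site:
                return replica
        return replicas[0]


namespace = LogicalNamespace()
namespace.register("/collab/scans/subject42.img", Replica("site-a", "/vol1/s42.img"))
namespace.register("/collab/scans/subject42.img", Replica("site-b", "/data/mirror/s42.img"))

local = namespace.resolve("/collab/scans/subject42.img", preferred_site="site-b")
fallback = namespace.resolve("/collab/scans/subject42.img", preferred_site="site-z")
```

Users only ever see the logical name; which replica serves the request is a policy decision made behind the namespace.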

  10. Problem solved / Requirements –3 • Data Classification and Discovery • Major advantage for Global 2000 companies • Tag data with any arbitrary metadata schema • Each team can organize its data based on user-defined attributes • Multiple teams can have different metadata attributes on the same data • Query, discover and access data without knowing path or protocol to be used
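The classification and discovery requirements above can be sketched as a small catalog in which each team attaches its own attributes to the same logical name and then queries by attribute rather than by path. The class, team names, and attributes are assumptions made up for illustration.

```python
from collections import defaultdict


class MetadataCatalog:
    """Toy per-team metadata store keyed by logical name."""

    def __init__(self):
        # (team, logical_name) -> {attribute: value}
        self._tags = defaultdict(dict)

    def tag(self, team, logical_name, **attributes):
        # Each team keeps its own attribute set on the same data.
        self._tags[(team, logical_name)].update(attributes)

    def query(self, team, **criteria):
        # Return logical names whose attributes (for this team) match
        # every criterion; no path or protocol knowledge is needed.
        return sorted(
            name
            for (owner, name), attrs in self._tags.items()
            if owner == team
            and all(attrs.get(key) == value for key, value in criteria.items())
        )


catalog = MetadataCatalog()
catalog.tag("imaging", "/proj/scan1", modality="MRI")
catalog.tag("imaging", "/proj/scan2", modality="PET")
catalog.tag("clinical", "/proj/scan1", cohort="control")

mri_scans = catalog.query("imaging", modality="MRI")
control_data = catalog.query("clinical", cohort="control")
```

Both teams describe `/proj/scan1`, but each sees only its own schema when it queries.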

  11. User Perspective • Designed for off-the-shelf use • Don’t want to assemble (or DIY) • But able to customize the solution • One point of contact or responsibility • If it does not work, I have one mailing list or number to call

  12. Vendor/developer perspective • “OGF-GFS compatible” • OGF-GFS Data Grid Applications • OGF-GFS Data Grid Appliance • Ease of standard evolution • Avoid unnecessary dependencies on multiple interfaces for operations that are at the same granular level • Ability to collaborate, learn and compete • An end-to-end solution with a common interface • Additional capabilities that add value to the solution

  13. Lessons Learnt • Software vs. Specification • Implement software to engage and collaborate as we define standards (unless everyone wants to invest in software development from the start) • Make both the user and the vendor/developer happy • Users gain the confidence to share requirements and to demand the standards from vendors/developers • Vendors/developers know it’s a real thing that can be implemented around their existing products or software

  14. The scope (from GFS Architecture) • A single interface • Protocols • A hybrid of XML and byte-level protocol • XML – command channel of operations • Byte-level – data movement • Possible Functionalities • File namespace and file operations (read, write, …) • Meta-data operations (user-defined metadata, search) • Data Grid Language for policy, rules, etc.
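A rough sketch of the hybrid protocol idea on slide 14: an XML message carries the command, and the file content travels separately as raw bytes. The element names (`dgms-command`, `operation`, `logical-name`) and the in-memory store are invented for this example; the slides do not define a wire format.

```python
import io
import xml.etree.ElementTree as ET


def build_command(operation: str, logical_name: str) -> bytes:
    """Client side: encode a command as XML (hypothetical schema)."""
    root = ET.Element("dgms-command")
    ET.SubElement(root, "operation").text = operation
    ET.SubElement(root, "logical-name").text = logical_name
    return ET.tostring(root, encoding="utf-8")


def handle_command(message: bytes, store: dict) -> io.BytesIO:
    """Server side: parse the XML command, then open a byte-level stream."""
    root = ET.fromstring(message)
    operation = root.findtext("operation")
    logical_name = root.findtext("logical-name")
    if operation != "read":
        raise ValueError(f"unsupported operation in this sketch: {operation}")
    # The data itself is never wrapped in XML; it moves as plain bytes.
    return io.BytesIO(store[logical_name])


# Stand-in for server-side storage, keyed by logical name.
store = {"/grid/demo.dat": b"\x00\x01payload"}

command = build_command("read", "/grid/demo.dat")
data_stream = handle_command(command, store)
payload = data_stream.read()
```

Keeping commands in XML makes them easy to extend and inspect, while the byte channel avoids the encoding overhead XML would add to bulk data movement.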

  15. What could be the right high level picture? [Diagram labels: XML-command protocol; Byte-level data protocol; Facilitate SOA; DGMS; Object-transfer]

  16. What could be the right high level picture? [Diagram labels: XML-command protocol; Byte-level data protocol; three DGMS servers]

  17. User perspective [Diagram labels: User-defined metadata for data discovery; Secret Recipe; Logical Resources; Multiple Replicas; Users from different organizations]

  18. So what will we be doing (products?) • Definition • Concept (data grid namespace, resource namespace, …) • Initial functionalities (DGMS operations to be targeted) • Namespace (Files, Metadata, Resource, Policy rules) • XML protocol • XML handshake and message transfer between DGMS client and DGMS server • Most importantly… • Software as a common framework for the evolution, adoption and growth of the standard and DGMS concepts

  19. So how will we do it? (process) • Community-based open design (OPEN FORUM) • Design discussions as a community • Code through multiple parties to make sure we keep the vendor/developer community and user community engaged • Community-based open standard (OPEN STDS) • Specs written using wiki and other mechanisms • Community-based spec for OGF • Interoperability workshops, and workshops held with other relevant agencies like SNIA or DMTF

  20. How can you get started? • Initial requirements • Can you delete email? (Sign up for our mailing list) • Got bandwidth and a browser? (Visit our group page) • Can you scream or shout or smile? (Join our WG sessions) • Are you a user or consumer or researcher? • Tell us what is needed • What should be there for you to put this open source software/standard in production? • Are you a vendor/developer? • Have your engineer or developer talk to us (we will convert them into a DGMS developer or DGMS guru) • We are developing an open standard – take advantage of it and develop a value-added solution around it

  21. When do we get started? • Right now (Hmmm… we did a long time back) • Conference calls every other week • Mostly Wednesdays • Attend through phone call, Skype or Polycom video conference (anything you like) • Discussions influencing design requirements • Face-to-face meeting • Once every quarter (planned), OGF sessions

  22. Suggestions, comments, critiques • TO DO • Standard operations based on policies/rules • Take advantage of OGF standards where possible • Other commercial or magic tools could be used below the standard • NOT TO DO

  23. Conclusions • Data Grids • Data Grid Management Systems (DGMS) • Strong user need in both academic and non-academic settings • Need for standards framed by Grid File System WG • Software-included spec strategy
