Towards Data Grid Standard Implementations

Presentation Transcript

  1. Towards Data Grid Standard Implementations • Arun Jagatheesan, San Diego Supercomputer Center • Open Grid Forum 19, Jan 31, 2007 – session II

  2. Outline • Community Introduction : OGF-GFS • User perspective • Developer/Vendor Perspective • Need for standard community implementation • Community implementation process • GFS-WG community architecture sketch • Follow-up actions

  3. Motivation • Global namespace for unstructured data storage • Collaboration amongst multiple partners / teams • Long-term management of unstructured data • Files, collection-based digital entities

  4. NIH BIRN Data Grid

  5. World Wide Datagrid

  6. Used or Required by • Large scale academic projects • Federal agencies (NARA, LoC, …) • Fortune 500, Forbes Global 2000, ….

  7. DGMS Concept-wise • Large-scale logical file system • File System + Database System + Grid Computing = Data Grid Management System (DGMS) • Core Concepts • Logical shared collections • Logical shared resources • Collaborative communities

  8. Problem solved / Requirements –1 • Collaborative logical namespace • Global collaborations of multiple teams • Collaborations of multiple organizations • Avoid multiple mount points, as they restrict the scalability of the collaboration • Coordinated data sharing at any level of granularity (data, metadata, annotations, …)

  9. Problem solved / Requirements –2 • Data Distribution • Multi-site replicas reduce access times • Replicas have the same logical name everywhere in the enterprise (big plus for users) • Concept of replica, copy, cache • Replicas controlled by user, admin, system-enabled (automated or policy based) • Reduce WAN latency (chattiness)
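
The replica idea on this slide (one logical name, several physical copies) can be illustrated with a small in-memory catalog. This is only a sketch under stated assumptions: the class names, site labels and paths are hypothetical and are not taken from any GFS-WG specification or existing DGMS product.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Replica:
    site: str           # hypothetical site label
    physical_path: str  # where the bytes actually live at that site


@dataclass
class ReplicaCatalog:
    """Maps one logical name to the physical replicas that back it."""
    entries: Dict[str, List[Replica]] = field(default_factory=dict)

    def register(self, logical_name: str, replica: Replica) -> None:
        # The same logical name may be registered at many sites.
        self.entries.setdefault(logical_name, []).append(replica)

    def resolve(self, logical_name: str, preferred_site: str) -> Replica:
        replicas = self.entries[logical_name]  # KeyError if the name is unknown
        for replica in replicas:
            if replica.site == preferred_site:
                return replica
        # No replica at the preferred site: fall back to the first one registered.
        return replicas[0]


catalog = ReplicaCatalog()
name = "/grid/projectA/scan001.dat"  # illustrative logical name
catalog.register(name, Replica("site-a", "/vol1/a/scan001.dat"))
catalog.register(name, Replica("site-b", "/archive/x9/scan001.dat"))

nearby = catalog.resolve(name, preferred_site="site-b")
print(nearby.physical_path)
```

Users only ever see the logical name; which physical copy serves a request is a catalog decision, which is where user-, admin- or policy-controlled replication would plug in.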

  10. Problem solved / Requirements –3 • Data Classification and Discovery • Major advantage for Global 2000 companies • Tag data with any arbitrary metadata schema • Each team can organize its data based on user-defined attributes • Multiple teams can have different metadata attributes on the same data • Query, discover and access data without knowing path or protocol to be used
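
As a rough illustration of the tagging and discovery requirements above, the sketch below lets each team attach its own attributes to the same logical name and query by attribute rather than by path. The `MetadataCatalog` class, the team names and the attribute names are all hypothetical, chosen only for this example.

```python
from collections import defaultdict


class MetadataCatalog:
    """Per-team, user-defined metadata attached to logical names."""

    def __init__(self):
        # logical name -> team -> {attribute: value}
        self._tags = defaultdict(lambda: defaultdict(dict))

    def tag(self, logical_name, team, **attributes):
        # Each team keeps its own attribute set on the same data.
        self._tags[logical_name][team].update(attributes)

    def query(self, team, **criteria):
        # Return logical names whose attributes (as seen by `team`) match all criteria.
        matches = []
        for logical_name, by_team in self._tags.items():
            if team not in by_team:
                continue
            attrs = by_team[team]
            if all(attrs.get(key) == value for key, value in criteria.items()):
                matches.append(logical_name)
        return sorted(matches)


meta = MetadataCatalog()
meta.tag("/grid/projectA/scan001.dat", "imaging", modality="MRI", subject="s01")
meta.tag("/grid/projectA/scan001.dat", "archive", retention="10y")
meta.tag("/grid/projectA/scan002.dat", "imaging", modality="CT", subject="s01")

mri_scans = meta.query("imaging", modality="MRI")
print(mri_scans)
```

The caller never supplies a physical path or protocol; discovery returns logical names, which can then be resolved by whatever namespace service sits underneath.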

  11. User Perspective • Designed for off-the-shelf use • Don’t want to assemble (or DIY) • But able to customize the solution • One point of contact or responsibility • If it does not work, I have one mailing list or number to call

  12. Vendor/developer perspective • “OGF-GFS compatible” • OGF-GFS Data Grid Applications • OGF-GFS Data Grid Appliance • Ease of standard evolution • Avoid unnecessary dependencies on multiple interfaces for operations at the same level of granularity • Ability to collaborate, learn and compete • An end-to-end solution with a common interface • Additional capabilities that add value to the solution

  13. Lessons Learnt • Software vs. Specification • Software implementation to engage and collaborate as we define standards (unless everyone wants to invest in software development from the start) • Make both the user and the vendor/developer happy • Users who are confident enough to share requirements and to demand the standards from vendors/developers • Vendors/developers who know it is a real thing that can be implemented around their existing products or software

  14. The scope (from GFS Architecture) • A single interface • Protocols • A hybrid of XML and byte-level protocol • XML – command channel of operations • Byte-level – data movement • Possible Functionalities • File namespace and file operations (read, write, …) • Metadata operations (user-defined metadata, search) • Data Grid Language for policy, rules, etc.
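
The split between an XML command channel and a byte-level data channel might look roughly like the sketch below. The slides do not define a message format, so the element names (`dgmsRequest`, `operation`, `logicalName`) and the 8-byte length-prefixed framing are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET


def build_command(operation: str, logical_name: str) -> bytes:
    """Command channel: a small XML message (element names are hypothetical)."""
    root = ET.Element("dgmsRequest")
    ET.SubElement(root, "operation").text = operation
    ET.SubElement(root, "logicalName").text = logical_name
    return ET.tostring(root, encoding="utf-8")


def parse_command(message: bytes) -> dict:
    """Recover the command fields on the receiving side."""
    root = ET.fromstring(message)
    return {child.tag: child.text for child in root}


def frame_data(payload: bytes) -> bytes:
    """Data channel: raw bytes with an assumed 8-byte big-endian length prefix."""
    return len(payload).to_bytes(8, "big") + payload


def unframe_data(frame: bytes) -> bytes:
    """Strip the length prefix and return exactly the announced number of bytes."""
    length = int.from_bytes(frame[:8], "big")
    return frame[8:8 + length]


command = build_command("read", "/grid/projectA/scan001.dat")
parsed = parse_command(command)
payload = unframe_data(frame_data(b"\x00\x01binary file contents"))
print(parsed["operation"], len(payload))
```

Keeping file contents out of the XML avoids encoding binary data as text, while the command channel stays human-readable and easy to extend with new operations.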

  15. What could be the right high level picture? [Diagram labels: XML-command protocol; Byte-level data protocol; Facilitate SOA; DGMS; Object-transfer]

  16. What could be the right high level picture? [Diagram labels: XML-command protocol; Byte-level data protocol; DGMS server (three instances)]

  17. User perspective [Diagram labels: User-defined metadata for data discovery; Secret Recipe; Logical Resources; Multiple Replicas; Users from different organizations]

  18. So what will we be doing (products?) • Definition • Concept (data grid namespace, resource namespace, …) • Initial functionalities (DGMS operations to be targeted) • Namespace (Files, Metadata, Resource, Policy rules) • XML protocol • XML handshake and message transfer between DGMS client and DGMS server • Most importantly… • Software as a common framework for the evolution, adoption and growth of the standard and DGMS concepts

  19. So how will we do it? (process) • Community-based open design (OPEN FORUM) • Design discussions as a community • Code through multiple parties to make sure we keep the vendor/developer community and user community engaged • Community-based open standard (OPEN STDS) • Specs written using wiki and other mechanisms • Community-based spec for OGF • Interoperability workshops, and workshops held with other relevant organizations such as SNIA or DMTF

  20. How can you get started? • Initial requirements • Can you delete email? (Sign up for our mailing list) • Got bandwidth and a browser? (Visit our group page) • Can you scream or shout or smile? (Join our WG sessions) • Are you a user or consumer or researcher? • Tell us what is needed • What should be there for you to put this open source software/standard in production? • Are you a vendor/developer? • Have your engineer or developer talk to us (we will convert him to a DGMS developer or DGMS Guru) • We are developing an open standard – take advantage of it and develop a value-added solution around it

  21. When do we get started? • Right now (Hmmm... we did a long time back) • Conference calls every other week • Mostly Wednesdays • Attend through phone call, Skype or Polycom video conference (anything you like) • Discussions influencing design requirements • Face-to-face meetings • Once every quarter (planned), OGF sessions

  22. Suggestions, comments, critiques • TO DO • Standard operations based on policies/rules • Take advantage of OGF standards where possible • Other commercial or magic tools could be used below the standard • NOT TO DO

  23. Conclusions • Data Grids • Data Grid Management Systems (DGMS) • Strong user need in both academic and non-academic settings • Need for standards framed by the Grid File System WG • Software-included spec strategy