
Intrinsic References in Distributed Systems

Presented by: Nimish Pachapurkar. Snapshot: to compare and contrast intrinsic references with physical references; storage and retrieval mechanism using intrinsic references (Elephant Store).


Presentation Transcript


  1. Intrinsic References in Distributed Systems Presented by: Nimish Pachapurkar ScaLAB seminar 21st October 2002

  2. Snapshot:
     • To compare and contrast intrinsic references with physical references.
     • Storage and retrieval mechanism using intrinsic references: Elephant Store.
     • Use of intrinsic references in hierarchical data structures.
     Terminology:
     • Collision resistance:
        • It is extremely difficult to find two sequences with the same hash.
        • Implies that the hash is unique (sufficiently so…).
     • One-way hash:
        • Given the hash of a sequence, it is difficult to reconstruct the sequence.
     • Reference => hash AND referent => byte sequence
        • (Other reference/referent pairs, for example: memory addresses and data, URLs and web pages, etc.)

  3. Physical References –
     • The relationship between reference and referent is defined by the state of the physical system.
     • A change in the state changes the referent.
     • All accesses to the referent have to go through the system.
        • Bottleneck and potential failure point.
     Intrinsic References –
     • The reference is a collision-resistant (unique), one-way hash value.
     • State independence: the relationship between a byte sequence S (the referent) and its reference R depends only on the hash function.
     • Uniqueness: a given R refers only to the particular S from which it was obtained.
     • Physical storage is still required to store and retrieve the referents.
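The state-independence and uniqueness properties above can be illustrated with a minimal sketch. SHA-256 is an assumed choice of hash function (the slides do not name one), and `intrinsic_ref`, `verify`, and the two replica dictionaries are hypothetical names used only for illustration.

```python
import hashlib


def intrinsic_ref(referent: bytes) -> str:
    """Derive the reference R from the byte sequence S alone (SHA-256 is an assumed choice)."""
    return hashlib.sha256(referent).hexdigest()


def verify(ref: str, candidate: bytes) -> bool:
    """Check a candidate referent against its reference without consulting any system state."""
    return intrinsic_ref(candidate) == ref


s = b"some byte sequence"
r = intrinsic_ref(s)

# Two hypothetical replicas; the second one returns corrupted bytes.
good_replica = {r: s}
bad_replica = {r: b"some byte sequence!"}

print(verify(r, good_replica[r]))  # True
print(verify(r, bad_replica[r]))   # False
```

Because the check needs only the bytes and the hash function, any client can validate data fetched from any replica, which is what removes the single system as a bottleneck for correctness.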

  4. Intrinsic References and Distributed Storage –
     • Useful for distributed, replicated storage mechanisms.
     • No reference-referent inconsistency (the hash gives the reference).
     • Simple hashing can check the correctness of the data.
     Opaque Storage –
     • Used for storing an instance of a data structure in Elephant Store.
     • Serialize the data structure and store the byte sequence.
     • Called the OPAQUE representation because the data structure is hidden behind the byte sequence.
     • The hash of the sequence is the reference (digest).
     • Retrieval: retrieve the byte sequence from the store and de-serialize it.
     [Diagram: Data Structure → Serialization (makes the structure opaque) → Opaque Reference (Hash digest)]
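The store/retrieve cycle for an opaque representation might look like the following sketch. It is not the Elephant Store API: `OpaqueStore` is a hypothetical in-memory stand-in, JSON is an assumed serialization format, and SHA-256 is an assumed hash function.

```python
import hashlib
import json


class OpaqueStore:
    """Toy in-memory store keyed by hash digest (hypothetical, not the Elephant Store API)."""

    def __init__(self):
        self._blobs = {}  # digest -> serialized byte sequence

    def put(self, obj) -> str:
        # Serialize deterministically so equal structures give equal digests.
        blob = json.dumps(obj, sort_keys=True).encode("utf-8")
        digest = hashlib.sha256(blob).hexdigest()
        self._blobs[digest] = blob
        return digest

    def get(self, digest: str):
        blob = self._blobs[digest]
        # Re-hash on retrieval to check the correctness of the data.
        if hashlib.sha256(blob).hexdigest() != digest:
            raise ValueError("stored bytes do not match their reference")
        return json.loads(blob.decode("utf-8"))


opaque = OpaqueStore()
mailbox = {"owner": "alice", "inbox": ["mail-1"]}
ref = opaque.put(mailbox)
restored = opaque.get(ref)
```

Storing the same structure twice yields the same digest, so replicas never disagree about which reference names which bytes.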

  5. HDAGs –
     • Hash-based directed acyclic graphs.
     • Nodes are directories.
     • Arcs are directory/sub-directory relationships.
     • The root digest of a rooted HDAG is used as the intrinsic reference to the whole HDAG.
     • Application: can be used to represent a file system or a mail system.
     • The root digest uniquely represents the state of the whole directory structure, not just the root directory.

  6. Versions and Change (Problems with the Opaque Representation) –
     • For a file system, an example of an opaque representation is a tarball of the directory structure.
     • A change in any file causes the opaque representation to change.
     • The hash digest also changes.
     • There is no relationship between the old and new representations.
     Solution: Use HDAGs
     • Adding a file to a directory is the same as a new mail arriving in the Inbox.
     • The representation of all other files and directories is not changed.
     • More efficient than the opaque representation.
     • Saves communication cost among replicas in distributed storage.
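A minimal sketch of how an HDAG keeps unchanged parts shared across versions is shown below. The node layout is an assumption (each directory node is stored as a JSON map from entry name to child digest), as are SHA-256 and the helper names `put`, `build`, and `child_digest`. For brevity the sketch re-hashes the whole tree on each build; the point is which objects the store actually has to hold.

```python
import hashlib
import json

store = {}  # digest -> bytes; stands in for any replica


def put(data: bytes) -> str:
    """Store bytes under their own digest and return that digest."""
    digest = hashlib.sha256(data).hexdigest()
    store[digest] = data
    return digest


def build(tree) -> str:
    """Store a nested dict (directory) or bytes (file) and return its digest."""
    if isinstance(tree, bytes):
        return put(tree)
    # A directory node holds only the digests of its children.
    entries = {name: build(child) for name, child in tree.items()}
    return put(json.dumps(entries, sort_keys=True).encode("utf-8"))


def child_digest(dir_digest: str, name: str) -> str:
    """Look up the digest of one entry in a stored directory node."""
    return json.loads(store[dir_digest].decode("utf-8"))[name]


v1 = {"inbox": {"mail-1": b"hello"}, "docs": {"a.txt": b"A", "b.txt": b"B"}}
root_v1 = build(v1)
objects_v1 = len(store)

# Version 2: one new mail arrives in the inbox; "docs" is untouched.
v2 = {"inbox": {"mail-1": b"hello", "mail-2": b"world"}, "docs": v1["docs"]}
root_v2 = build(v2)
new_objects = len(store) - objects_v1

print(new_objects)  # 3: the new mail, the new inbox node, the new root node
print(child_digest(root_v1, "docs") == child_digest(root_v2, "docs"))  # True
```

Only the nodes on the path from the changed entry to the root get new digests, so a replica that already holds version 1 needs just those three objects to reconstruct version 2.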

  7. Advantages of HDAGs –
     • Efficient for distributed systems (version management).
     • Every version is represented by a unique intrinsic reference that is independent of the physical system.
     • Replication and caching never lead to inconsistencies.
     • Two versions of an object share the majority of their storage and communication costs.
     Conclusions –
     • HDAGs promise to be a useful mechanism for building and maintaining distributed storage systems.

  8. OS Support for P2P Programming: a Case for TPS Presented by: Nimish Pachapurkar ScaLAB seminar 21st October 2002

  9. Introduction –
     • Need for an RPC-like interaction mechanism for P2P infrastructures
        • Must be decoupled
        • Anonymous and asynchronous
     • Layers over RPC would certainly hamper performance
     • Type-based Publish/Subscribe (TPS) as a candidate
     • Abstraction of a low-level P2P library: JXTA
     • What’s in the paper:
        • Comparison of the implementation of TPS with pure JXTA
        • A “first” experience
        • Design and source code of applications

  10. JXTA
     • Three layers:
        • Core Layer: several protocols ensuring basic communication between peers, message routing, or peer group creation
        • Service Layer: ready-made services such as a content management system and a wire service
        • Application Layer: all the code written by the programmer
     • Six concepts:
        • ID: for any resource (peer, pipe, peergroup, codat)
        • Peer: any device with an electronic pulse (normal and special)
           • Special peers: rendez-vous and routers
        • Pipe: virtual communication channel, asynchronous and uni-directional (wire for many-to-many), independent of IP
        • PeerGroup: collection of peers
        • Advertisement: XML message with information about a new resource
        • Message: any kind of communication (using XML)

  11. Protocols for JXTA –
     • PDP – Peer Discovery Protocol
        • Allows different peers to find each other
     • PRP – Peer Resolver Protocol
        • Sits just above the transport layer; dispatches a JXTA message to the right service
     • PIP – Peer Information Protocol
        • Reports the status of a peer (time the peer has been up, channels available)
     • PMP – Peer Membership Protocol
        • Obtains group membership requirements (credentials, password, etc.)
     • PBP – Pipe Binding Protocol
        • Keeps the different peers in a pipe bound together (even when they move)
     • ERP – Endpoint Routing Protocol
        • Routes messages between peers
        • Enables communication between two peers even when they do not know how to connect to each other (due to a firewall, etc.)

  12. TPS over JXTA –
     • Publish/Subscribe paradigm
        • Time decoupling: publisher and subscriber do not need to be up at the same time
        • Space decoupling: publisher and subscriber do not need to know each other
        • Flow decoupling: sending or receiving messages does not block the participants
     • This decoupling suits server-less architectures.
     • Subscription based on subject and content
     • Type-based: subject => event object type; content => state of an instance of that type
     • Type safety
        • The subscriber knows the event type in advance
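The type-based subscription idea (subject is the event type, content is the state of the instance) can be sketched in a few lines. This is an in-process illustration only and does not use JXTA or the paper's TPS API: `TypeBus`, `SkiRental`, and their fields are hypothetical names, and the sketch shows space decoupling and filtering, not time or flow decoupling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SkiRental:
    """Hypothetical event type; its class is the subscription subject."""
    shop: str
    price_per_day: float


class TypeBus:
    """Toy type-based publish/subscribe bus (illustrative, not the TPS API)."""

    def __init__(self):
        self._subs = {}  # event type -> list of (content filter, handler)

    def subscribe(self, event_type, handler, content_filter=lambda event: True):
        self._subs.setdefault(event_type, []).append((content_filter, handler))

    def publish(self, event) -> int:
        """Deliver to every subscriber of the event's type; return the delivery count."""
        delivered = 0
        for event_type, subs in self._subs.items():
            if isinstance(event, event_type):  # subject match: the type
                for content_filter, handler in subs:
                    if content_filter(event):  # content match: the state
                        handler(event)
                        delivered += 1
        return delivered


bus = TypeBus()
offers = []

# The subscriber names only a type and a predicate, never a publisher.
bus.subscribe(SkiRental, offers.append, lambda e: e.price_per_day <= 30.0)

sent_cheap = bus.publish(SkiRental("Alpine Hut", 25.0))
sent_pricey = bus.publish(SkiRental("Summit Lux", 80.0))
```

After these calls `offers` holds only the 25.0-per-day rental: the handler receives fully typed `SkiRental` objects, which is the type-safety point made on the slide.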

  13. Example –
     • Ski renting application
        • Need to find ski rentals with reasonable rates
        • Must surf the net for a long time
     • Alternative: use the TPS-based P2P infrastructure
        • Subscribe to the ski-rental type and wait for answers
     • Publisher (a new shop is opened):
        • A search is launched for the ski-rental advertisement
        • If not found, a new one is created
     • Programming phases –

  14. Performance –
     • Invocation time: time for sendMessage()
        • Publisher produces 50 events
        • JXTA-WIRE is quicker
        • No difference between SR-JXTA and SR-TPS
     • Throughput: similar trends
     Conclusion –
     • TPS is a viable alternative abstraction to RPC for future Internet-wide operating systems to support P2P applications.
     • Simple to use, type-safe, and preserves the decoupled nature of P2P.
     • Makes programming easier than with pure JXTA.
