Distributed Systems

Highly Concurrent and Fault-Tolerant h-out of-k Mutual Exclusion Using Cohorts Coteries for Distributed Systems. A distributed system consists of interconnected, autonomous nodes which communicate with each other by passing messages.

shadi

Presentation Transcript


  1. Highly Concurrent and Fault-Tolerant h-out of-k Mutual Exclusion Using Cohorts Coteries for Distributed Systems

  2. Distributed Systems • A distributed system consists of interconnected, autonomous nodes which communicate with each other by passing messages.

  3. CS vs Mutual Exclusion • A node in the system may occasionally need to enter the critical section (CS) to access a shared resource, such as a shared file or a shared table. • How to control the nodes so that the shared resource is accessed by at most one node at a time is called the mutual exclusion problem.

  4. K-Mutual Exclusion • If there are k (k ≥ 1) identical copies of a shared resource, such as a k-user software license, then at most k nodes can access the resource at a time. • This raises the k-mutual exclusion problem.

  5. h-out of k-mutual exclusion • On some occasions, a node may need to access h (1 ≤ h ≤ k) copies out of the k shared resources at a time; for example, a node may need h disks from a pool of k disks to proceed. • How to control the nodes so that each acquires the desired number of resources while the total number of resources accessed concurrently does not exceed k is called the h-out of k-mutual exclusion problem or the h-out of-k resource allocation problem [10].
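
The constraint stated on this slide can be illustrated with a small, centralized sketch. It is not a distributed algorithm and not the method of the presentation; the class `ResourcePool` and its method names are hypothetical and only show the invariant that the copies held concurrently never exceed k.

```python
class ResourcePool:
    """Centralized stand-in for the h-out-of-k constraint (illustration only)."""

    def __init__(self, k):
        if k < 1:
            raise ValueError("k must be at least 1")
        self.k = k
        self.held = {}  # node name -> number of copies it currently holds

    def in_use(self):
        """Total number of copies held by all nodes."""
        return sum(self.held.values())

    def try_acquire(self, node, h):
        """Grant h copies to node only if the total in use stays <= k."""
        if not 1 <= h <= self.k:
            raise ValueError("h must satisfy 1 <= h <= k")
        if node in self.held or self.in_use() + h > self.k:
            return False
        self.held[node] = h
        return True

    def release(self, node):
        """Return all copies held by node to the pool."""
        self.held.pop(node, None)


pool = ResourcePool(k=5)
results = [
    pool.try_acquire("a", 3),  # 3 of 5 copies in use
    pool.try_acquire("b", 3),  # would need 6 copies, so it is refused
    pool.try_acquire("c", 2),  # 5 of 5 copies in use
]
```

With h fixed to 1 the same check reduces to k-mutual exclusion, and with k = h = 1 to ordinary mutual exclusion.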

  6. Related Work • Four distributed h-out of-k mutual exclusion algorithms have been proposed in the literature [2, 5, 9, 10]. • The first algorithm, which uses request broadcast, was proposed by Raynal in [10]; three algorithms using k-arbiters, (h, k)-arbiters, and k-coteries were later proposed in [2], [9], and [5], respectively.

  7. Jiang’s Algorithm • Among the four algorithms, only Jiang’s algorithm using k-coteries is fault-tolerant. • It can tolerate node and/or network link failures even when the failures lead to network partitioning. • Furthermore, it is shown in [5] to have lower message cost than others.

  8. Basic Idea of Jiang’s Alg. • The basic idea of the algorithm is simple: a node selects h mutually disjoint sets and collects permissions from all the nodes of the h sets to enter CS for accessing h resources. • To make the algorithm fault-tolerant, a node that fails to gather enough permissions to enter CS within a time-out period must repeatedly reselect h mutually disjoint sets and gather the permissions it is still missing.
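
The selection step described above, picking h mutually disjoint sets, can be sketched with a plain backtracking search. This is only an illustration under assumptions: the function name `pick_disjoint` and the sample quorums are hypothetical, and the exhaustive search is not how either Jiang's algorithm or the proposed algorithm selects sets.

```python
from typing import FrozenSet, List, Optional, Sequence


def pick_disjoint(
    quorums: Sequence[FrozenSet[str]], h: int
) -> Optional[List[FrozenSet[str]]]:
    """Return h pairwise disjoint quorums, or None if no such choice exists."""

    def search(start, chosen, used):
        if len(chosen) == h:
            return list(chosen)
        for i in range(start, len(quorums)):
            q = quorums[i]
            if used.isdisjoint(q):  # q shares no node with the sets chosen so far
                found = search(i + 1, chosen + [q], used | q)
                if found is not None:
                    return found
        return None  # no extension of `chosen` reaches h sets

    return search(0, [], frozenset())


# Hypothetical family of sets over nodes a..e
quorums = [frozenset("ab"), frozenset("bc"), frozenset("cd"), frozenset("de")]
picked = pick_disjoint(quorums, 2)
impossible = pick_disjoint(quorums, 3)
```

A node would then request permissions from every member of the picked sets. The search is exponential in the worst case, which is one reason a structured selection procedure is preferable.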

  9. Drawbacks of Jiang’s Alg. • First, it does not specify explicitly how a node can efficiently select and reselect h mutually disjoint sets. • Second, when there is contention, a low-priority node always yields its gathered permissions to high-priority nodes, which causes higher message overhead and may prevent nodes from entering CS concurrently.

  10. Overview of the Proposed Alg. • In this paper, we propose another h-out of-k mutual exclusion algorithm that uses a specific k-coterie, the cohorts coterie, to eliminate the drawbacks of Jiang’s algorithm. • It has constant message cost in the best case. • It is a candidate to achieve the highest availability among all the algorithms using k-coteries, where availability is the probability that a node can gather enough permissions to enter CS in an error-prone environment.

  11. k-Coterie

  12. Cohorts Structure

  13. Quorum under Coh(k, m)

  14. An Example

  15. Domination

  16. ND k-coteries • Since an available quorum implies an available entry to CS, we should always concentrate on nondominated (ND) k-coteries, that is, k-coteries that no other k-coterie dominates. • An algorithm using ND k-coteries, such as the proposed algorithm, is a candidate to achieve the highest availability.
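
The transcript does not reproduce the definition from the Domination slide, so the sketch below assumes the definition commonly used in the coterie literature: C dominates D when C and D differ and every quorum of D is a superset of some quorum of C. The function name `dominates` and the sample families are hypothetical.

```python
def dominates(c, d):
    """True if family c dominates family d under the assumed definition."""
    c, d = set(c), set(d)
    # Every quorum h of d must contain some quorum g of c (g <= h is subset test).
    return c != d and all(any(g <= h for g in c) for h in d)


# Two hypothetical 1-coteries over nodes a, b, c
d_family = {frozenset("ab"), frozenset("bc")}
c_family = {frozenset("b")}

c_over_d = dominates(c_family, d_family)
d_over_c = dominates(d_family, c_family)
```

Under this definition, whenever a quorum of `d_family` is fully available, so is the quorum `{b}` of `c_family`, which is why a dominated family can never offer better availability.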

  17. The quorum construction procedure

  18. Probe(Ci, g)

  19. Case 1 for Probe(Ci, g) to return

  20. Case 2 for Probe(Ci, g) to return

  21. Case 3 for Probe(Ci, g) to return

  22. Pre-release

  23. Maekawa’s Alg. (1983) uses six types of messages: • REQUEST • LOCKED • FAILED • RELEASE • INQUIRE • RELINQUISH

  24. Differences of Ours and Maekawa’s • Our mechanism does not use the FAILED message. • Our mechanism uses an extra PRE-RELEASE message. • Our mechanism sends the INQUIRE message conditionally (instead of insistently), only when there is a possibility of deadlock.
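
The two message vocabularies can be written down as a small enumeration. This is a sketch: the names `Msg`, `MAEKAWA` and `PROPOSED` are hypothetical, the comments give the usual reading of Maekawa's messages, and the exact semantics of PRE-RELEASE are not stated in this transcript.

```python
from enum import Enum


class Msg(Enum):
    REQUEST = "REQUEST"          # ask an arbiter for its permission
    LOCKED = "LOCKED"            # arbiter grants its permission
    FAILED = "FAILED"            # arbiter cannot grant the permission now
    RELEASE = "RELEASE"          # holder returns the permission after leaving CS
    INQUIRE = "INQUIRE"          # arbiter asks the current holder whether it can yield
    RELINQUISH = "RELINQUISH"    # holder yields the permission back to the arbiter
    PRE_RELEASE = "PRE-RELEASE"  # extra message of the proposed mechanism


# Maekawa's six message types
MAEKAWA = frozenset(Msg) - {Msg.PRE_RELEASE}
# Proposed mechanism: no FAILED, one extra PRE-RELEASE
PROPOSED = (MAEKAWA - {Msg.FAILED}) | {Msg.PRE_RELEASE}
```

Both sets contain six message types; they differ only in FAILED versus PRE-RELEASE.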

  25. Conflict resolution mechanism #1

  26. Conflict resolution mechanism #2

  27. Conflict resolution mechanism #3

  28. Conflict resolution mechanism #4

  29. Conflict resolution mechanism #5 • On receiving a RELINQUISH message from w (w must be the locker), node v swaps w with u, sets u as the locker, and sends a LOCKED message to u, where u is the node at the front of R-QUEUE.
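
The rule on this slide can be sketched as a message handler. Several details are assumptions: the class `Arbiter`, the `outbox` list standing in for the network, and the reading of "swaps w with u" as w taking u's place at the front of R-QUEUE.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Arbiter:
    """State node v keeps for its permission (names are illustrative)."""

    locker: Optional[str] = None                 # node currently holding v's permission
    r_queue: deque = field(default_factory=deque)  # waiting requesters (R-QUEUE)
    outbox: list = field(default_factory=list)     # (message, destination) pairs to send

    def on_relinquish(self, w: str) -> None:
        if w != self.locker:
            raise ValueError("RELINQUISH must come from the current locker")
        if not self.r_queue:
            raise ValueError("R-QUEUE is empty, nobody to hand the permission to")
        u = self.r_queue.popleft()    # u is the node at the front of R-QUEUE
        self.r_queue.appendleft(w)    # assumed meaning of "swap": w takes u's place
        self.locker = u               # u becomes the locker
        self.outbox.append(("LOCKED", u))


v = Arbiter(locker="w", r_queue=deque(["u", "x"]))
v.on_relinquish("w")
```

After the call, u holds v's permission and w waits in the queue ahead of x.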

  30. Conflict resolution mechanism #6

  31. Conflict resolution mechanism #7

  32. Conflict resolution mechanism #8

  33. Comparisons

  34. Conclusion • The proposed algorithm becomes a k-mutual exclusion algorithm for k>h=1, and becomes a mutual exclusion algorithm for k=h=1. • It is resilient to node and/or link failures and has constant message cost in the best case. Furthermore, it is a candidate to achieve the highest availability among all the algorithms using k-coteries since the cohorts coterie is ND. • It has the k-concurrency property, which allows more nodes to be in CS concurrently. • However, its mutual exclusion delay may be long since it probes nodes cohort by cohort instead of probing all nodes in a quorum at once.
