
Elections in a Distributed Computing System

Hector Garcia-Molina. Presenter: Srinath Rao.

Presentation Transcript


  1. Elections in a Distributed Computing System • Hector Garcia-Molina • Presenter: Srinath Rao

  2. Introduction to Elections • Strategies to deal with a node failure • Have software that can operate continuously even as failures occur • Halt temporarily and reorganize the system • Reorganization needs a coordinator, and hence an election • Election protocols can also be used to start up a system or to add/remove nodes • Issues?

  3. Issues • Constituent nodes may fail after the election • What does it mean to be a coordinator? • How to cope with failures during the election itself? • Is it always possible to select a unique coordinator? • We might wish to have more than one coordinator

  4. Outline • Assumptions • Elections with no communication failures • The Bully Election Algorithm • Elections with communication failures • The Invitation Election Algorithm • Related Work • Conclusions

  5. Assumptions • All nodes cooperate • Election algorithm makes use of “bug-free” software facilities • Communication subsystem will not spontaneously generate messages • Nodes have “safe” storage cells • Node halts processing when it fails • No transmission errors

  6. Assumptions (contd.) • Messages are processed in the order they are received • The communication subsystem does not fail • A node never pauses

  7. State Vector of a Node • A collection of safe storage cells • Principal components of the vector S(i) • Status of node i: S(i).s (one of Down, Election, Reorganization, Normal) • Coordinator according to node i: S(i).c • Definition of the task being performed: S(i).d
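
A minimal Python sketch of the state vector listed on this slide. The names `Status` and `StateVector` and the example values are illustrative assumptions that mirror the slide's notation; the slides do not prescribe any particular data layout.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    """The four values S(i).s can take, as listed on the slide."""
    DOWN = "Down"
    ELECTION = "Election"
    REORGANIZATION = "Reorganization"
    NORMAL = "Normal"


@dataclass
class StateVector:
    """Hypothetical in-memory stand-in for the safe storage cells S(i)."""
    s: Status = Status.DOWN      # S(i).s: status of node i
    c: Optional[int] = None      # S(i).c: coordinator according to node i
    d: Optional[str] = None      # S(i).d: definition of the task being performed


# Example (assumed values): node 3 believes node 7 is the coordinator.
S3 = StateVector(s=Status.NORMAL, c=7, d="task-v1")
```

In a real system these cells would have to survive a crash; the dataclass only models their contents.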

  8. Outline • Assumptions • Elections with no communication failures • The Bully Election Algorithm • Elections with communication failures • The Invitation Election Algorithm • Related Work • Conclusions

  9. Desired Characteristics • Assertion 1: For any two nodes i and j • S(i).c = S(j).c if nodes i and j are each in one of the states “Normal” or “Reorganization” • S(i).d = S(j).d if both i and j are in the “Normal” state • This states what it means to be a coordinator
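
Assertion 1 can be read as a predicate over a snapshot of all state vectors. The sketch below is one way to write that check, assuming a hypothetical layout in which each node's vector is a dict with keys `"s"`, `"c"` and `"d"`; the function name is not from the paper.

```python
from itertools import combinations

# States in which a node's view of the coordinator must agree with its peers.
SETTLED = {"Normal", "Reorganization"}


def satisfies_assertion_1(states):
    """states maps node id -> {"s": status, "c": coordinator, "d": task} (assumed layout)."""
    for i, j in combinations(states, 2):
        si, sj = states[i], states[j]
        # Coordinators must agree when both nodes are Normal or Reorganization.
        if si["s"] in SETTLED and sj["s"] in SETTLED and si["c"] != sj["c"]:
            return False
        # Task definitions must agree when both nodes are Normal.
        if si["s"] == "Normal" and sj["s"] == "Normal" and si["d"] != sj["d"]:
            return False
    return True


consistent = {
    1: {"s": "Normal", "c": 3, "d": "t"},
    2: {"s": "Reorganization", "c": 3, "d": None},
    3: {"s": "Normal", "c": 3, "d": "t"},
}
split = {
    1: {"s": "Normal", "c": 1, "d": "t"},
    2: {"s": "Normal", "c": 2, "d": "t"},
}
print(satisfies_assertion_1(consistent), satisfies_assertion_1(split))
```

Nodes in the “Down” or “Election” state are unconstrained here, matching the slide's wording.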

  10. Desired Characteristics (contd.) • Assertion 2: If no failures occur, the election will eventually transform a system in any state to a state in which: • There exists a node i with S(i).s = “Normal” and S(i).c = i • All other active nodes j have S(j).s = “Normal” and S(j).c = i
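
The target state named in Assertion 2 can also be tested on a snapshot. This is a small sketch under the assumption that each node's vector is a dict with keys `"s"` and `"c"` and that "active" means any node whose status is not “Down”; `reached_goal_state` is a hypothetical helper name.

```python
def reached_goal_state(states):
    """True if all active nodes are Normal and agree on an active coordinator."""
    active = {n: v for n, v in states.items() if v["s"] != "Down"}
    if not active:
        return False
    if any(v["s"] != "Normal" for v in active.values()):
        return False
    coordinators = {v["c"] for v in active.values()}
    if len(coordinators) != 1:
        return False
    (coord,) = coordinators
    # The agreed coordinator must itself be active; since every active node
    # names coord, this also gives S(coord).c == coord.
    return coord in active


done = {1: {"s": "Normal", "c": 2}, 2: {"s": "Normal", "c": 2}, 3: {"s": "Down", "c": None}}
pending = {1: {"s": "Election", "c": None}, 2: {"s": "Normal", "c": 2}}
print(reached_goal_state(done), reached_goal_state(pending))
```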

  11. The Bully Election Algorithm • Each node has a unique id number • The algorithm uses id numbers as priorities • Two-step algorithm • Step 1: Node i tries to contact all nodes with higher priority; if no reply is received, it assumes the role of coordinator • Step 2: Node i informs all lower-priority nodes • Send a “halt” message, forcing the state of node j to “Election” • Send an “I am elected” message; node j sets S(j).c = i and S(j).s = “Reorganization” • Distribute new algorithms to the nodes; all statuses change to “Normal”

  12. Bully (contd.) • A recovering node k attempts to become the coordinator using the same algorithm • It halts all lower-priority nodes that may be in the process of becoming coordinators • Step 1 ensures no conflict with higher-priority nodes
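
The two steps can be sketched as a failure-free, single-process simulation. Several simplifications are assumptions rather than part of the slides: a larger id means higher priority, message passing is replaced by direct state updates, the starter is itself up, and a node that gets a reply hands the election to the lowest responding higher-priority node, which then repeats step 1. The function name and message labels are hypothetical.

```python
def bully_election(starter, up_nodes, task="task-v2"):
    """Simulate one election; return (coordinator, final state, step-1 log)."""
    up = sorted(up_nodes)
    log = []
    candidate = starter
    # Step 1: contact every higher-priority node. A reply means a higher node
    # is up, so the candidate defers and that node runs step 1 in turn.
    while True:
        repliers = [n for n in up if n > candidate]
        log.append((candidate, "are-you-up", repliers))
        if not repliers:
            break
        candidate = repliers[0]

    lower = [n for n in up if n < candidate]
    state = {n: {"s": "Election", "c": None, "d": None} for n in lower}  # "halt"
    state[candidate] = {"s": "Election", "c": candidate, "d": None}
    # Step 2: "I am elected" moves everyone to Reorganization under the winner.
    for n in lower + [candidate]:
        state[n]["s"] = "Reorganization"
        state[n]["c"] = candidate
    # Finally the coordinator distributes the task and all nodes become Normal.
    for n in lower + [candidate]:
        state[n]["d"] = task
        state[n]["s"] = "Normal"
    return candidate, state, log


# Assumed scenario: old coordinator 6 is down; node 2 notices and starts.
coordinator, state, log = bully_election(starter=2, up_nodes={1, 2, 4, 5})
print(coordinator, [entry[0] for entry in log])
```

The sketch omits timeouts and failures during the election, which the real protocol must handle.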

  13. Outline • Assumptions • Elections with no communication failures • The Bully Election Algorithm • Elections with communication failures • The Invitation Election Algorithm • Related Work • Conclusions

  14. Discussion • Failures • Partitioning of nodes • A node may be able only to send or only to receive messages • Node i and node j can each talk to node k but not to each other • A node may pause and then resume • Observation: consensus on a single coordinator is impossible if the communication subsystem can fail or nodes can pause • Inference: redefine the meaning of an election

  15. Discussion (contd.) • Notion of a group of nodes and a group id • Node i stores its group id in its state vector: S(i).g • Nodes are free to change groups • Messages are tagged with the group id • The coordinator is unique within a group

  16. New Desired Characteristics • Assertion 3: For any two nodes i and j • S(i).c = S(j).c if nodes i and j are in the same group and are each in one of the states “Normal” or “Reorganization” • S(i).d = S(j).d if both i and j are in the “Normal” state and are in the same group
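
Assertion 3 is Assertion 1 restricted to pairs of nodes that share a group id. A minimal sketch of that check follows, assuming each node's vector is a dict with keys `"s"`, `"c"`, `"d"` and `"g"`; the layout, the function name and the example group ids are illustrative only.

```python
from itertools import combinations

SETTLED = {"Normal", "Reorganization"}


def satisfies_assertion_3(states):
    """states maps node id -> {"s", "c", "d", "g"} (assumed layout)."""
    for i, j in combinations(states, 2):
        si, sj = states[i], states[j]
        if si["g"] != sj["g"]:
            continue  # nodes in different groups are not constrained
        if si["s"] in SETTLED and sj["s"] in SETTLED and si["c"] != sj["c"]:
            return False
        if si["s"] == "Normal" and sj["s"] == "Normal" and si["d"] != sj["d"]:
            return False
    return True


# Two partitions, each with its own coordinator: allowed under Assertion 3.
partitioned = {
    1: {"s": "Normal", "c": 2, "d": "t", "g": "A"},
    2: {"s": "Normal", "c": 2, "d": "t", "g": "A"},
    3: {"s": "Normal", "c": 4, "d": "u", "g": "B"},
    4: {"s": "Normal", "c": 4, "d": "u", "g": "B"},
}
print(satisfies_assertion_3(partitioned))
```

The same snapshot would violate Assertion 1, which is why the meaning of an election had to be redefined once partitions are possible.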

  17. Desired Characteristics (contd.) • No requirements for nodes with only one-way communication • Assertion 4: If no failures occur, the election will eventually transform a set of nodes R that have two-way communication, in any state, to a state in which: • There exists a node i with S(i).s = “Normal” and S(i).c = i • All other active nodes j in R have S(j).s = “Normal”, S(j).c = i and S(j).g = S(i).g

  18. The Invitation Election Algorithm • A node “invites” other nodes to join it in forming a new group • A node may accept or decline an invitation • A receiving node can be made to form a new group with itself as the coordinator and only member • Objective: to merge groups • Coordinators periodically send “invite” messages • Can be used instead of the Bully algorithm
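
The merge objective can be illustrated with a small in-memory sketch. Everything concrete here is an assumption for illustration: groups are plain dicts, a group id is the pair (coordinator id, counter), and `accepts` is a hypothetical callback standing in for a coordinator's reply to an invitation. Real message exchange, timeouts and the reorganization phase are left out.

```python
def singleton_group(node, counter):
    """A node forms a new group with itself as coordinator and only member."""
    return {"g": (node, counter), "c": node, "members": {node}}


def invite_and_merge(inviter_group, other_groups, accepts, counter):
    """Merge every group whose coordinator accepts into a new group led by the inviter."""
    coordinator = inviter_group["c"]
    members = set(inviter_group["members"])
    remaining = []
    for grp in other_groups:
        if accepts(grp["c"]):
            members |= grp["members"]   # accepted: its members join the new group
        else:
            remaining.append(grp)       # declined or unreachable: left unchanged
    merged = {"g": (coordinator, counter), "c": coordinator, "members": members}
    return merged, remaining


a, b, c = singleton_group(1, 0), singleton_group(2, 0), singleton_group(3, 0)
# Coordinator 3 invites the others; node 2 declines (or cannot be reached).
merged, rest = invite_and_merge(c, [a, b], accepts=lambda n: n != 2, counter=1)
print(merged["c"], sorted(merged["members"]), [g["c"] for g in rest])
```

Giving the merged group a fresh id lets members discard messages tagged with the id of a group they have left.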

  19. Related Work • Scott D. Stoller (2000) • Modifies Bully algorithm to work with crash failures • Points out a flaw and proposes a new specification • Gurdip Singh (1996) • Proposes an algorithm for leader election in the presence of link failures

  20. Conclusions • Meaning of an election depends on the possible types of failures • Paper studied elections in two representative failure environments • Postulated assertions that define concept of an election • Presented an election algorithm for each environment
