
Distributed Systems - Plan 3 Report 2



Presentation Transcript


  1. Distributed Systems - Plan 3 Report 2 Siddharth Sarasvati, Karthikeyan Karur Balu

  2. Introduction • Traditional distributed system issues • Load Balancing • Data Integrity • Performance • Common approaches for load balancing • Virtual Servers • ID Reassignment • Multiple random choice scheme • Local Probing

  3. Research Paper 1 • Author: Gurmeet Singh Manku • Title: Balanced binary trees for ID management and load balance in distributed hash tables • Conference: Proceedings of the Twenty-Third Annual ACM Symposium on Principles of Distributed Computing (PODC) • Year: 2004 • URL: http://dl.acm.org/citation.cfm?id=1011797

  4. The ID Assignment Problem • How does a new host acquire an ID? • No global knowledge of the “current set of IDs” • Low cost (number of messages) • Almost equi-sized partitions • This paper presents a low-cost, decentralized algorithm for ID management in DHTs

  5. Naïve ID Assignment • Choose a random number r in [0, 1) as the ID • σ = partition-balance ratio = ratio of the largest to the smallest partition • σ = Θ(n log n) with n hosts in the system • σ > 100 when n = 4K • Can we do better? Perhaps, if we could “learn” a few IDs
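The σ > 100 figure is easy to check empirically. The sketch below (illustrative code, not from the paper; the function name and seed are arbitrary) assigns n uniform-random IDs, treats each host's partition as the arc up to the next ID on the circle, and measures σ:

```python
import random

def partition_balance_ratio(n, seed=0):
    """Assign n hosts uniform-random IDs in [0, 1) and return sigma,
    the ratio of the largest to the smallest partition (arc)."""
    rng = random.Random(seed)
    ids = sorted(rng.random() for _ in range(n))
    # host i owns the arc from ids[i] up to the next ID (with wrap-around)
    arcs = [ids[i + 1] - ids[i] for i in range(n - 1)]
    arcs.append(1.0 - ids[-1] + ids[0])  # wrap-around arc
    return max(arcs) / min(arcs)
```

For n = 4096 this typically yields σ in the thousands, consistent with the Θ(n log n) bound.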

  6. The Algorithm • Upon arrival, a host identifies the manager of a random number in [0, 1) • Identifies the IDs of c log n hosts adjacent to that manager along the circle • Splits the largest of these partitions into two
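The three steps above can be sketched as an in-memory model (a hedged sketch only: in the real algorithm the manager and its neighbours are discovered via overlay routing, not by scanning a shared list, and the helper names here are invented):

```python
import bisect
import math
import random

def join(ids, c=2.0, rng=random.Random(1)):
    """One host joins: find the manager of a random point, inspect
    roughly c*log2(n) adjacent partitions, split the largest in half."""
    n = len(ids)
    if n == 0:
        ids.append(0.0)          # first host takes the whole circle
        return 0.0
    r = rng.random()
    s = sorted(ids)
    k = max(1, int(c * math.log2(max(n, 2))))
    i = bisect.bisect_left(s, r) % n              # manager of r
    window = [s[(i + j) % n] for j in range(k)]   # k adjacent managers

    def arc(x):                  # size of the partition owned by ID x
        j = s.index(x)
        return (s[(j + 1) % n] - x) % 1.0 or 1.0

    owner = max(window, key=arc)                  # largest partition seen
    new_id = (owner + arc(owner) / 2) % 1.0       # split it in half
    ids.append(new_id)
    return new_id
```

Because each arrival halves a near-largest partition, the spread of partition sizes stays far smaller than under the naïve random scheme.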

  7. Balanced Binary Trees • Only leaf nodes correspond to IDs in [0, 1) (figure: leaves labeled .0000, .0001, .01100, .01101 on the interval [0, 1)) • A small fraction of internal nodes are marked active • For every leaf node, exactly one internal node along the path from that leaf to the root is active • Insertion is done in 3 steps

  8. Claim • Insertion: (1) random walk down the tree, (2) walk up until the sub-tree has c log n leaves, (3) split the shallowest leaf below that sub-tree • Claim: a newly arrived host needs Θ(R + log n) messages • Leaves lie in at most 3 different levels, so σ ≤ 4

  9. Features of the Algorithm • Generality: independent of overlay network topology • Low cost: Θ(R + log n) • Optimal re-assignments: host “departures” are handled with only 1 re-assignment, and “arrivals” require none • Small partition-balance ratio (σ ≤ 4), which is optimal

  10. Research Paper 2 • Authors: Brighten Godfrey, Karthik Lakshminarayanan, Sonesh Surana, Richard Karp and Ion Stoica • Title: Load Balancing in Dynamic Structured P2P Systems • Conference: Proceedings of IEEE INFOCOM, Hong Kong, March 2004 • Year: 2004 • URL: http://www.cs.berkeley.edu/~karthik/research/papers/infocom04.pdf

  11. Goal • Goal: maintain the system in a state in which the load on each node is less than its target • Load: depends on the particular P2P system, e.g., storage or bandwidth • Target: the maximum load a node can hold

  12. Random ID Space Distribution • Each node is responsible for a contiguous region of the ID space • Each node can be responsible for many virtual servers • Consider a Chord ring (figure: Nodes A, B and C placed on the Chord ring)
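This setup can be illustrated with a few lines of code (not code from the paper; the 16-bit ring width and the node#index naming are arbitrary choices): each physical node derives several virtual-server IDs by hashing, and the node responsible for a point is its clockwise successor on the ring.

```python
import hashlib

RING = 1 << 16  # illustrative 16-bit identifier circle

def chord_id(key):
    """Hash a string key onto the identifier circle."""
    return int(hashlib.sha1(key.encode()).hexdigest(), 16) % RING

def virtual_ids(node, k):
    """k virtual-server IDs for one physical node (hypothetical naming)."""
    return [chord_id(f"{node}#{v}") for v in range(k)]

def successor(sorted_ids, point):
    """The ID responsible for `point`: the first ID clockwise from it."""
    for i in sorted_ids:
        if i >= point:
            return i
    return sorted_ids[0]  # wrap past zero
```

With virtual servers, a physical node simply owns the arc preceding each of its ring IDs, so its total load is the sum over its virtual servers.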


  14. Random Mapping of Nodes • Random mapping may result in imbalance, either from the mapping itself or from the addition of new data to the system • (Figure: Chord ring with Node A (T=50), heavy Node B (T=35) and Node C (T=15) holding virtual servers with loads L=45, L=41, L=31 and L=3)

  15. ID Space Redistribution • Choose the nodes where L > T and check with other nodes to redistribute the load • (Figure: the same Chord ring; the heavy node hands part of its ID space to a node with spare capacity)

  16. ID Space Redistribution • The redistribution maintains the goal: L ≤ T on every node • (Figure: the same ring after redistribution, with loads L=45, L=31, L=14 and L=30)
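The redistribution step above can be sketched as follows. This is an idealized model that treats load as divisible, whereas the paper moves whole virtual servers, so the resulting numbers are illustrative; the loads and targets mirror the example on the slides.

```python
def rebalance(loads, targets):
    """Move load from nodes with L > T to nodes with spare capacity
    until every node satisfies L <= T (assumes total load fits
    within the total target capacity)."""
    for heavy in list(loads):
        while loads[heavy] > targets[heavy]:
            # pick the node with the most spare capacity
            light = max(loads, key=lambda x: targets[x] - loads[x])
            room = targets[light] - loads[light]
            if room <= 0:
                break  # no capacity left anywhere
            moved = min(loads[heavy] - targets[heavy], room)
            loads[heavy] -= moved
            loads[light] += moved
    return loads
```

With loads {A: 45, B: 41, C: 3} and targets {A: 50, B: 35, C: 15}, the heavy node B sheds 6 units to C, after which every node satisfies L ≤ T.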

  17. Load Balancing Scheme 1: One-to-One • A light node contacts the node x responsible for a random ID and takes load from x if x is heavy • Takes roughly O(N²) operations • (Figure: ring of heavy (H) and light (L) nodes)
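A minimal model of one probing round (illustrative only: load is treated as divisible, and the ID-space lookup is replaced by a uniform random probe over the node set):

```python
import random

def one_to_one_round(loads, targets, rng):
    """One round: each light node probes one random node and takes
    excess load from it if that node turns out to be heavy."""
    nodes = list(loads)
    for light in nodes:
        if loads[light] >= targets[light]:
            continue                      # only light nodes probe
        x = rng.choice(nodes)             # random probe
        if loads[x] > targets[x]:         # x is heavy: relieve it
            moved = min(loads[x] - targets[x],
                        targets[light] - loads[light])
            loads[x] -= moved
            loads[light] += moved
    return loads
```

Because each light node probes blindly, many rounds may pass before every heavy node is found, which is where the O(N²) cost comes from.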

  18. Load Balancing Scheme 2: One-to-Many • Light nodes report their load information to directories • Directories are stored in the DHT itself • A heavy node H gets this information by contacting a directory • H then contacts a light node that can accept its excess load • (Figure: heavy nodes H1–H3 and light nodes L1–L5 matched through directories D1 and D2)
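A sketch of the directory lookup from the heavy node's side (hypothetical data shapes; a directory is modeled here as just the list of (load, target) pairs that light nodes reported):

```python
def shed_via_directory(h_load, h_target, directory):
    """A heavy node walks a directory of light-node reports and sheds
    excess load to lights with room. Returns (new_load, new_directory)."""
    updated = []
    for l_load, l_target in directory:
        room = l_target - l_load
        excess = h_load - h_target
        moved = min(excess, room) if excess > 0 and room > 0 else 0
        h_load -= moved
        updated.append((l_load + moved, l_target))
    return h_load, updated
```

Unlike the one-to-one scheme, a single directory contact gives the heavy node a whole set of candidate light nodes, so it sheds its excess in far fewer probes.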

  19. Research Paper 3 • Author: Minseok Kwon, Gahyun Park • Title: Distributed Tries for Load Balancing in Peer-to-Peer Systems • Conference: Proceedings of IEEE IWQoS, June 2010 • Year: 2010 • URL: http://www.cs.rit.edu/~jmk/papers/trieload.pdf

  20. Algorithm • Goal: if the trie is balanced, the ID space will be balanced

  21. Basic Idea (New Node Join) • Optimal path discovery: a new node travels down the trie from the root, taking the path towards the minimum depth • Drawback: requires global knowledge of the ID space
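The optimal-path walk can be sketched directly as an in-memory model (illustrative code, not the paper's algorithm; note that depth() inspects the whole trie, which is exactly the global-knowledge drawback the slide mentions):

```python
class Trie:
    """Binary trie node; a node with no children is a leaf (an ID slot)."""
    def __init__(self):
        self.left = self.right = None

def depth(node):
    """Depth of the shallowest leaf below this node."""
    if node.left is None:
        return 0
    return 1 + min(depth(node.left), depth(node.right))

def join_min_depth(node, path=""):
    """Walk toward the shallower subtree and split the first leaf
    reached; return the bit path of the split point."""
    if node.left is None:
        node.left, node.right = Trie(), Trie()
        return path
    if depth(node.left) <= depth(node.right):
        return join_min_depth(node.left, path + "0")
    return join_min_depth(node.right, path + "1")

def leaf_depths(node, d=0):
    """Depths of all leaves, for checking how balanced the trie is."""
    if node.left is None:
        return [d]
    return leaf_depths(node.left, d + 1) + leaf_depths(node.right, d + 1)
```

Always splitting a minimum-depth leaf keeps all leaf depths within one level of each other, which is what makes the resulting ID partitions nearly equi-sized.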

  22. Node Join/Leave Process • A new node y joins with a random ID r and locates the host that owns the interval containing r • Starting from r, it travels up the trie (|id(r)| = the number of bits of id(r))

  23. Hypothesis • A distributed trie for load balancing in a structured P2P system allows a node to join or leave the system at low cost, R + Θ(log log n), where R denotes the routing cost and n the number of nodes

  24. Algorithm (Node Join Process) • |id(r)| = number of bits of id(r) • While i < log|id(r)| + 4

  25. Deliverables • Simulate a P2P distributed system with a DHT and implement the Balanced Binary Tree and Distributed Trie load-balancing algorithms • Graphical representation comparing node arrival and departure costs (routing cost, ID reassignments)

  26. Progress • Comprehensive understanding of the research papers • Discussions on a generic simulation design that accommodates different load-balancing algorithms as pluggable modules • Analysis of the discussed load-balancing algorithms

  27. QUESTIONS?
