
Distributed System  Report 1

Distributed System Report 1. Siddharth Sarasvati, Karthikeyen Balu, Sudipan Mishra. Overview: Distributed Tries for Load Balancing in P2P Systems (Distributed Hash Table, Load Balancing); Job Scheduling in Hadoop (Fair Scheduler, Capacity Scheduler).



Presentation Transcript


  1. Distributed System Report 1
     Siddharth Sarasvati, Karthikeyen Balu, Sudipan Mishra

  2. Overview
     Distributed Tries for Load Balancing in P2P Systems
     • Distributed Hash Table
     • Load Balancing
     Job Scheduling in Hadoop
     • Fair Scheduler
     • Capacity Scheduler

  3. Distributed Tries for Load Balancing in P2P Systems
     What are DHTs?
     • Decentralized Distributed Hash Tables
     • Properties
       • Decentralized
       • Fault tolerant
       • Scalable
     Load Balancing
     • Efficiency needed to avoid performance degradation

  4. Distributed Tries for load balancing in P2P Systems • DHT structure
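The slide's figure did not survive in this transcript. As a rough illustration of a DHT structure, here is a minimal sketch assuming a Chord-style ring in which each key is stored on the first node whose ID is at or after the key's ID. The ring layout, the 16-bit ID space, and the names `ring_id`, `Ring`, `owner`, and `load` are assumptions for illustration and do not come from the slides.

```python
import bisect
import hashlib

ID_BITS = 16  # deliberately small ID space, chosen only for illustration


def ring_id(name: str) -> int:
    """Hash a string onto the ring: SHA-1 digest reduced to ID_BITS bits."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << ID_BITS)


class Ring:
    """Hypothetical ring of nodes; each node owns the IDs up to its own ID."""

    def __init__(self, node_names):
        self.nodes = sorted((ring_id(n), n) for n in node_names)
        self.ids = [node_id for node_id, _ in self.nodes]

    def owner(self, key: str) -> str:
        """Return the first node at or after the key's ID, wrapping around."""
        idx = bisect.bisect_left(self.ids, ring_id(key)) % len(self.ids)
        return self.nodes[idx][1]

    def load(self, keys):
        """Count how many of the given keys each node is responsible for."""
        counts = {name: 0 for _, name in self.nodes}
        for key in keys:
            counts[self.owner(key)] += 1
        return counts


ring = Ring(["node-a", "node-b", "node-c", "node-d"])
print(ring.load([f"file-{i}" for i in range(1000)]))
```

Because node IDs come from a hash, the arcs between neighbouring nodes are usually uneven, so the printed counts typically differ from node to node. That imbalance is the load-balancing problem the rest of the deck addresses.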

  5. Naïve approaches to Load balancing
     • Virtual Servers
     • ID reassignment
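To make the virtual-server idea concrete, here is a minimal sketch assuming each physical node hosts several virtual servers and an overloaded node hands some of them to nodes with spare capacity. The greedy policy and the names `node_load` and `rebalance` are illustrative assumptions; this is not the scheme from the cited papers.

```python
def node_load(assignment, vs_load, node):
    """Total load of the virtual servers currently hosted on a node."""
    return sum(vs_load[v] for v in assignment[node])


def rebalance(assignment, vs_load, capacity):
    """Greedy sketch: move virtual servers off nodes that exceed capacity.

    assignment: node -> list of virtual server names
    vs_load:    virtual server -> load
    capacity:   node -> maximum load the node should carry
    Returns the new assignment and the list of (server, source, target) moves.
    """
    result = {node: list(servers) for node, servers in assignment.items()}
    moves = []
    for heavy in sorted(result):
        # Try the smallest virtual servers first to keep each move cheap.
        for vs in sorted(result[heavy], key=lambda v: vs_load[v]):
            if node_load(result, vs_load, heavy) <= capacity[heavy]:
                break  # this node is no longer overloaded
            candidates = [
                n for n in result
                if n != heavy
                and node_load(result, vs_load, n) + vs_load[vs] <= capacity[n]
            ]
            if not candidates:
                continue  # nobody can take this server; try the next one
            # Pick the node with the most spare capacity.
            target = max(
                candidates,
                key=lambda n: capacity[n] - node_load(result, vs_load, n),
            )
            result[heavy].remove(vs)
            result[target].append(vs)
            moves.append((vs, heavy, target))
    return result, moves


assignment = {"A": ["a1", "a2", "a3"], "B": ["b1"], "C": ["c1"]}
vs_load = {"a1": 5, "a2": 3, "a3": 4, "b1": 2, "c1": 1}
capacity = {"A": 6, "B": 6, "C": 6}
print(rebalance(assignment, vs_load, capacity))
```

Note that this version inspects every node's load before each move, which is the kind of global knowledge that becomes expensive in a large P2P system.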

  6. Limitations of naïve approaches
     • Communication cost for node join and leave is high
     • A join or leave operation requires prior knowledge of the entire system

  7. Hypothesis
     A distributed trie reduces communication costs compared with the naïve approaches.

  8. Trie Structure for Load balancing
     • Construct a distributed trie for the DHT ID space
     • To minimize load-balancing cost
     • To lower communication cost

  9. Approach
     Trie is balanced => DHT ID space is balanced
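One way to picture a trie over the ID space is the sketch below. It assumes a binary trie whose leaves are ID-prefix regions, and a leaf splits when it holds more than `max_keys` keys. It is a centralized, single-process illustration only, not the distributed protocol from the Kwon and Park paper; `TrieNode`, `insert`, `split`, `leaves`, and `max_keys` are hypothetical names.

```python
class TrieNode:
    """A region of the ID space identified by a binary prefix."""

    def __init__(self, prefix=""):
        self.prefix = prefix
        self.keys = []        # keys stored here (leaves only)
        self.children = None  # None for a leaf, else (zero_child, one_child)

    def is_leaf(self):
        return self.children is None


def split(node, max_keys):
    """Turn a leaf into an inner node and push its keys one bit deeper."""
    depth = len(node.prefix)
    zero = TrieNode(node.prefix + "0")
    one = TrieNode(node.prefix + "1")
    for key in node.keys:
        (one if key[depth] == "1" else zero).keys.append(key)
    node.keys = []
    node.children = (zero, one)
    for child in (zero, one):
        # Keep splitting while a child is overfull and bits remain.
        if len(child.keys) > max_keys and len(child.prefix) < len(child.keys[0]):
            split(child, max_keys)


def insert(root, key_bits, max_keys):
    """Insert a fixed-length bit string, splitting the target leaf if overfull."""
    node = root
    while not node.is_leaf():
        node = node.children[int(key_bits[len(node.prefix)])]
    node.keys.append(key_bits)
    if len(node.keys) > max_keys and len(node.prefix) < len(key_bits):
        split(node, max_keys)


def leaves(node):
    """Yield (prefix, key count) for every leaf, left to right."""
    if node.is_leaf():
        yield node.prefix, len(node.keys)
    else:
        for child in node.children:
            yield from leaves(child)


root = TrieNode()
for key in ["0000", "0001", "0010", "0011", "1000", "1111"]:
    insert(root, key, max_keys=2)
print(dict(leaves(root)))
# → {'000': 2, '001': 2, '01': 0, '1': 2}
```

Dense parts of the ID space end up under longer prefixes (smaller regions), so no leaf holds more than `max_keys` keys. In this sketch, keeping leaf loads even is what corresponds to a balanced ID space.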

  10. MapReduce

  11. Hadoop
      • Open source implementation of MapReduce
      • Default scheduling: FIFO
      • Critical Jobs? Ad-hoc Analysis?
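The concern about critical jobs and ad-hoc analysis can be illustrated with a toy model. This sketch assumes the cluster runs one job at a time in submission order and that all jobs are submitted at time 0. It is a simplification for illustration, not Hadoop's actual scheduler, and `fifo_completion_times` and the job names are hypothetical.

```python
def fifo_completion_times(jobs):
    """Completion time of each job when jobs run strictly in submission order.

    jobs: list of (name, duration) pairs, all submitted at time 0.
    """
    clock = 0
    done = {}
    for name, duration in jobs:
        clock += duration  # the job starts only after all earlier jobs finish
        done[name] = clock
    return done


print(fifo_completion_times([("nightly-batch", 100), ("ad-hoc-query", 2)]))
# → {'nightly-batch': 100, 'ad-hoc-query': 102}
```

The two-unit query finishes at time 102 because it queues behind the long batch job, which is the behaviour the fair and capacity schedulers aim to avoid.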

  12. Job Scheduling
      Fair Scheduler
      • Groups jobs into “pools”
      • Assigns each pool a guaranteed minimum share
      Capacity Scheduler
      • Jobs are submitted to a queue
      • Queues get their capacity when they contain jobs
      • Unused capacity is shared among queues
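The guaranteed-minimum-share idea can be sketched as follows, assuming a fixed number of task slots and pools that each declare a minimum share and a current demand. This is a simplified model for illustration, not the Hadoop Fair Scheduler's implementation; `fair_allocate` and the pool names are hypothetical.

```python
def fair_allocate(total_slots, pools):
    """Allocate slots to pools: minimum shares first, then spread the rest.

    pools: name -> (min_share, demand), both measured in slots.
    Returns name -> number of slots granted.
    """
    # Step 1: each pool gets its guaranteed minimum, capped by what it needs.
    alloc = {
        name: min(min_share, demand)
        for name, (min_share, demand) in pools.items()
    }
    remaining = total_slots - sum(alloc.values())
    if remaining < 0:
        raise ValueError("minimum shares exceed the cluster size")
    # Step 2: hand out leftover slots one at a time to the pool that still
    # has unmet demand and currently holds the fewest slots.
    while remaining > 0:
        hungry = [n for n in sorted(pools) if alloc[n] < pools[n][1]]
        if not hungry:
            break  # every pool's demand is satisfied
        target = min(hungry, key=lambda n: alloc[n])
        alloc[target] += 1
        remaining -= 1
    return alloc


pools = {"prod": (4, 8), "adhoc": (2, 3), "idle": (3, 0)}
print(fair_allocate(10, pools))
# → {'prod': 7, 'adhoc': 3, 'idle': 0}
```

The `idle` pool has no jobs, so its guaranteed slots go to the pools that can use them, mirroring the slide's point that unused capacity is shared.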

  13. Investigation
      • Simulate the systems
      • Show or refute that the discussed job scheduling algorithms are more efficient than the default (FIFO) implementation
      • Analyze the efficiency of load balancing using a distributed trie compared with naïve approaches

  14. References
      • Authors - Minseok Kwon, Gahyun Park
      • Title - Distributed Tries for Load Balancing in Peer-to-Peer Systems
      • Conference - Proceedings of IEEE IWQoS, June 2010
      • Year - 2010
      • URL - http://www.cs.rit.edu/~jmk/papers/trieload.pdf

  15. References
      • Authors - Yingwu Zhu, Yiming Hu
      • Title - Towards Efficient Load Balancing in Structured P2P Systems
      • Conference - Proceedings of the 18th International Parallel and Distributed Processing Symposium
      • Year - 2004
      • URL - http://fac-staff.seattleu.edu/zhuy/web/papers/load_bala.pdf

  16. References
      • Authors - Michael Isard, Vijayan Prabhakaran, Jon Currey, Udi Wieder, Kunal Talwar and Andrew Goldberg
      • Title - Quincy: Fair Scheduling for Distributed Computing Clusters
      • Conference - Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles
      • Year - 2009
      • URL - http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.154.5498

  17. Questions
