Distributed System Report 1

Distributed System Report 1 Siddharth Sarasvati, Karthikeyen Balu, Sudipan Mishra

Overview Distributed Tries for Load Balancing in P2P Systems • Distributed Hash Table • Load Balancing Job Scheduling in Hadoop • Fair Scheduler • Capacity Scheduler

Distributed Tries for Load Balancing in P2P Systems What are DHTs? • Decentralized Distributed Hash Tables • Properties • Decentralized • Fault tolerant • Scalable • Load Balancing • Efficient load balancing is needed to avoid performance degradation
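As a rough illustration of how a DHT maps keys to nodes, the sketch below places nodes and keys on a hashed ID ring and assigns each key to the first node at or after its ID. The names `Ring`, `key_id`, and `ID_BITS`, and the 16-bit ID space, are assumptions made for this example; the slides do not specify a particular DHT.

```python
import hashlib
from bisect import bisect_left

ID_BITS = 16  # assumed small ID space, for illustration only


def key_id(name: str) -> int:
    """Hash a string into the ID space [0, 2**ID_BITS)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** ID_BITS)


class Ring:
    """Hypothetical in-memory model of a DHT ring (no networking)."""

    def __init__(self, node_names):
        # Sort nodes by their hashed ID; each node owns the arc ending at its ID.
        self.nodes = sorted((key_id(n), n) for n in node_names)
        self.ids = [i for i, _ in self.nodes]

    def lookup(self, key: str) -> str:
        """Return the first node whose ID is >= the key's ID, wrapping around."""
        idx = bisect_left(self.ids, key_id(key))
        return self.nodes[idx % len(self.nodes)][1]

    def load(self, keys):
        """Count how many of the given keys each node is responsible for."""
        counts = {name: 0 for _, name in self.nodes}
        for k in keys:
            counts[self.lookup(k)] += 1
        return counts
```

Calling `Ring(["n1", "n2", "n3", "n4"]).load(keys)` on a set of keys typically shows uneven counts per node, which is the imbalance that the load-balancing schemes in this report try to correct.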

Distributed Tries for Load Balancing in P2P Systems • DHT structure [Figure: DHT structure, with nodes labeled H or L]

Naïve approaches to Load Balancing • Virtual Servers • ID reassignment
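The virtual-server idea can be sketched as follows: each physical node hosts several virtual servers, and an overloaded node hands virtual servers to lightly loaded nodes. This is a generic greedy illustration run in one process, not the exact scheme from the referenced papers; the function name `rebalance` and the single `target` threshold are assumptions.

```python
from typing import Dict, List


def rebalance(nodes: Dict[str, List[int]], target: int) -> Dict[str, List[int]]:
    """Move virtual servers from heavy nodes to light nodes.

    nodes maps a physical node name to the loads of its virtual servers.
    A node is "heavy" when its total load exceeds target.
    """
    # Work on a sorted copy so the caller's data is not modified.
    nodes = {name: sorted(vs) for name, vs in nodes.items()}
    moved = True
    while moved:
        moved = False
        heavy = [n for n in nodes if sum(nodes[n]) > target]
        for h in heavy:
            if not nodes[h]:
                continue
            smallest = nodes[h][0]  # lightest virtual server on the heavy node
            light = min(nodes, key=lambda n: sum(nodes[n]))
            # Transfer only if the receiver stays at or below the target.
            if light != h and sum(nodes[light]) + smallest <= target:
                nodes[h].pop(0)
                nodes[light].append(smallest)
                nodes[light].sort()
                moved = True
    return nodes
```

For example, `rebalance({"A": [5, 4, 3], "B": [1], "C": [2]}, target=6)` moves two virtual servers off node A. Note that this sketch picks the lightest node using a global view of all loads, which is the kind of system-wide knowledge a real P2P node would have to pay communication cost to obtain.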

Limitations of the Naïve approaches • Communication cost for node join and leave is high • Each join or leave operation requires prior knowledge of the entire system

Hypothesis A Distributed Trie reduces communication costs compared with the naïve approaches.

Trie Structure for Load Balancing • Construct a Distributed Trie over the DHT ID space • To minimize load balancing cost • To lower communication cost

Approach Trie is balanced => DHT ID space is balanced
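One way to see why a balanced trie implies a balanced ID space: if each leaf of a binary trie is a node's ID prefix, a leaf at depth d owns a 2^-d fraction of the IDs. The sketch below is a centralized toy model under that assumption, not the distributed algorithm from the referenced paper; `share`, `join`, and `imbalance` are hypothetical helper names.

```python
from fractions import Fraction
from typing import List


def share(prefix: str) -> Fraction:
    """Fraction of the ID space owned by a leaf with this binary prefix."""
    return Fraction(1, 2 ** len(prefix))


def join(leaves: List[str]) -> List[str]:
    """A new node joins by splitting the shallowest leaf into two children."""
    if not leaves:
        return [""]  # the first node owns the whole ID space
    victim = min(leaves, key=len)
    rest = [p for p in leaves if p != victim]
    return rest + [victim + "0", victim + "1"]


def imbalance(leaves: List[str]) -> Fraction:
    """Ratio of the largest to the smallest share (1 means perfectly even)."""
    shares = [share(p) for p in leaves]
    return max(shares) / min(shares)
```

Starting from `leaves = [""]` and calling `join` repeatedly keeps all leaf depths within one level of each other, so no node owns more than twice the ID range of any other.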

MapReduce

Hadoop • Open-source implementation of MapReduce • Default scheduling: FIFO • Critical Jobs? Ad-hoc Analysis?
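The concern with FIFO for critical or ad-hoc jobs can be shown with a small calculation: a short job queued behind a long one finishes late under FIFO but early when capacity is shared. The sketch assumes a single unit of capacity and all jobs arriving at time 0; the function names are hypothetical and this is not Hadoop code.

```python
def fifo_completion(durations):
    """Completion time of each job when jobs run one after another."""
    t = 0
    out = []
    for d in durations:
        t += d
        out.append(t)
    return out


def shared_completion(durations):
    """Completion times when unfinished jobs share capacity equally."""
    order = sorted((d, i) for i, d in enumerate(durations))
    out = [0.0] * len(durations)
    t = 0.0
    done = 0.0  # work already completed by every still-active job
    n = len(order)
    for k, (d, i) in enumerate(order):
        active = n - k  # jobs still running while job i finishes
        t += (d - done) * active
        done = d
        out[i] = t
    return out
```

With `durations = [100, 1]`, FIFO finishes the short job at time 101, while equal sharing finishes it at time 2 and delays the long job only from 100 to 101.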

Job Scheduling Fair Scheduler • Groups jobs into “pools” • Assigns each pool a guaranteed minimum share Capacity Scheduler • Jobs are submitted to queues • Each queue gets its configured capacity when it contains jobs • Unused capacity is shared among the other queues
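The pool idea can be sketched as a slot allocation: every pool first receives its guaranteed minimum (capped by what it actually needs), and leftover slots go one at a time to the pool that currently holds the fewest. This is a simplified model, not Hadoop's Fair Scheduler implementation; `fair_allocate`, the pool names, and the assumption that the minimum shares fit within the cluster are all illustrative.

```python
def fair_allocate(total_slots, pools):
    """Allocate slots to pools.

    pools maps a pool name to (min_share, demand).
    Assumes the sum of min shares does not exceed total_slots.
    """
    # Step 1: guaranteed minimum, never more than the pool demands.
    alloc = {name: min(ms, demand) for name, (ms, demand) in pools.items()}
    free = total_slots - sum(alloc.values())
    # Step 2: hand out spare slots, always to the hungriest small pool.
    while free > 0:
        hungry = [n for n, (_, demand) in pools.items() if alloc[n] < demand]
        if not hungry:
            break  # all demand satisfied; remaining slots stay idle
        poorest = min(hungry, key=lambda n: alloc[n])
        alloc[poorest] += 1
        free -= 1
    return alloc
```

For instance, with 10 slots and pools `{"prod": (4, 8), "adhoc": (2, 1), "research": (0, 10)}`, the `adhoc` pool only takes the single slot it needs, and the slot it leaves unused is redistributed to the other pools.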

Investigation • Simulate the systems • Prove or disprove the efficiency of the discussed job scheduling algorithms over the default (FIFO) implementation • Analyze the efficiency of load balancing using a Distributed Trie over the naïve approaches

References • Author - Minseok Kwon, Gahyun Park • Title - Distributed Tries for Load Balancing in Peer-to-Peer Systems • Conference - Proceedings of IEEE IWQoS, June 2010 • Year - 2010 • URL - http://www.cs.rit.edu/~jmk/papers/trieload.pdf

References • Author - Yingwu Zhu, Yiming Hu • Title - Towards Efficient Load Balancing in Structured P2P Systems • Conference - Proceedings of the 18th International Parallel and Distributed Processing Symposium • Year - 2004 • URL - http://fac-staff.seattleu.edu/zhuy/web/papers/load_bala.pdf

References • Author - Michael Isard, Vijayan Prabhakaran, Jon Currey, Udi Wieder, Kunal Talwar, and Andrew Goldberg • Title - Quincy: Fair Scheduling for Distributed Computing Clusters • Conference - Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles • Year - 2009 • URL - http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.154.5498

Questions
