
An Adaptive File Distribution Algorithm for Wide Area Network

Takashi Hoshino, Kenjiro Taura, Takashi Chikayama, University of Tokyo. Background: new environments for parallel and distributed computation (clusters, clusters of clusters, the Grid) offer scalability and good cost performance.


Presentation Transcript


  1. An Adaptive File Distribution Algorithm for Wide Area Network Takashi Hoshino, Kenjiro Taura, Takashi Chikayama University of Tokyo

  2. Background • New environments for parallel and distributed computation • Clusters, clusters of clusters, the Grid • Offer scalability and good cost performance • Setting up computation in such environments is complex, however • Installing programs/data

  3. Setting up computation in distributed systems • Often involves copying large programs/data to many nodes • Manually copying large files is troublesome because: • faults occur easily • firewalls block (some) connections • transfers must be scheduled carefully for good performance

  4. Contribution • NetSync • A file replicator optimized for copying large data to many nodes in parallel (application-level) • Features • Automatic load-balancing • scalability • Self-stabilizing construction of transfer route • fault-tolerant • Adaptive optimization of transfer route • No reliance on physical topology information

  5. Outline • What are efficient/inefficient transfer routes? • Demo • Algorithm • Base algorithm • Adaptive optimization • Implementation • Experiments • Related work • Summary and future work

  6. Inefficient Transfer Routes • Many inter-subnet/cluster transfer connections • Many branches [Figure: an inefficient route; legend: node, subnet/cluster, data-transfer line]

  7. What’s Wrong with Branches? • Branches • share the hardware capacity of the node itself • CPU power • Disk performance • NIC capacity • increase the chance of a bottleneck [Figure: with one child, the 100 Mbps NIC is no bottleneck; with three children, each gets only 33 Mbps, so the NIC becomes the bottleneck]
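The arithmetic behind this slide can be sketched as follows; the class and method names are illustrative, not part of NetSync:

```java
// Illustrative only (not NetSync code): a node's outgoing NIC bandwidth
// is divided among all children it serves simultaneously.
public class BranchBottleneck {
    // Per-child throughput in Mbps for a NIC of `nicMbps` serving `children`.
    static double perChildMbps(double nicMbps, int children) {
        return nicMbps / children;
    }

    public static void main(String[] args) {
        System.out.println(perChildMbps(100.0, 1)); // one child: 100.0 Mbps
        System.out.println(perChildMbps(100.0, 3)); // three children: ~33.3 Mbps each
    }
}
```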

  8. Efficient Transfer Route • Minimum inter-subnet/cluster transfer connections • No or minimal branches [Figure: an efficient route; legend: node, subnet/cluster, data-transfer line]

  9. Demo • Playback of our experiment using logs [Figure: animation of nodes A00–A07 and B00–B07 with their parent-child data flows]

  10. Algorithm • Simple base algorithm • Fault tolerance, scalability, self-stabilization • Add-on adaptive optimization heuristics • Well adapted to today’s typical networks • Very easy configuration • Needs only information about (some) neighbors • Needs no physical topology • Needs no performance measurement • Pseudo-code is given in our paper

  11. Base Algorithm (1) [Figure: animation of the pipeline; each node is labeled with the percentage of the file received so far] • Each node seeks a node to be its parent • Pipelined transfer across all nodes • On a fault, the node seeks a new parent again
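Why pipelining matters can be seen in a toy simulation (an illustration, not the NetSync implementation): with B blocks and N chained nodes, completion takes roughly N + B steps rather than N × B.

```java
// Toy model of pipelined transfer down a chain of nodes (illustrative,
// not NetSync code). Node 0 is the source and holds all blocks; in each
// step, every other node receives one block if its parent is ahead of it.
public class PipelineChain {
    static int stepsToComplete(int nodes, int blocks) {
        int[] have = new int[nodes]; // blocks received so far, per node
        have[0] = blocks;            // the source starts with the whole file
        int steps = 0;
        while (have[nodes - 1] < blocks) {
            for (int i = nodes - 1; i >= 1; i--) {
                if (have[i] < blocks && have[i - 1] > have[i]) have[i]++;
            }
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        // (nodes - 1) + (blocks - 1) = 103 steps, versus 4 * 100 = 400
        // if each node waited for the whole file before forwarding it.
        System.out.println(stepsToComplete(5, 100)); // prints 103
    }
}
```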

  12. Base Algorithm (2) • Child side (has no parent yet): send ASK to a candidate parent; on OK, start getting data; on NG, seek another candidate • Parent side: on receiving ASK from a node, if my offset > the node’s offset and # of children < LIMIT_CHILDREN, then send OK and start putting data; else send NG
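The parent-side admission check can be sketched in Java (the talk's implementation language); the field names and the value of LIMIT_CHILDREN here are assumptions, and the paper's pseudo-code is authoritative:

```java
// Sketch of the parent-side reply to an ASK message (slide 12).
public class ParentSide {
    static final int LIMIT_CHILDREN = 2; // assumed cap on concurrent children

    long myOffset;   // bytes of the file this node has received so far
    int numChildren; // children currently being served

    ParentSide(long myOffset, int numChildren) {
        this.myOffset = myOffset;
        this.numChildren = numChildren;
    }

    // true = reply OK and start putting data; false = reply NG.
    // OK requires being strictly ahead of the child and having capacity.
    boolean onAsk(long childOffset) {
        return myOffset > childOffset && numChildren < LIMIT_CHILDREN;
    }
}
```

The strict `myOffset > childOffset` test matters: a parent that is not ahead of the child has nothing new to send it.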

  13. Adaptive Optimization • Two heuristics • NearParent • Tree2List

  14. NearParent Heuristic [Figure: a node switches from a distant parent to a closer candidate] • NearParent: reduce "long" connections • Each node changes its parent to a closer node

  15. Tree2List Heuristic [Figure: a node re-parents under its sibling X] • Tree2List: reduce branches • If the current parent is not closer than one of the node’s siblings X, the node changes its parent to X • A node with more than one child suggests that its children re-parent under one of their siblings
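Both re-parenting rules reduce to comparisons on some closeness measure. A minimal sketch, assuming a scalar distance where smaller means closer (an assumption for illustration; NetSync derives closeness from the features on the next slide):

```java
// Sketch of the two re-parenting decisions (slides 14-15); illustrative only.
public class Reparent {
    // NearParent: switch to a candidate parent that is strictly closer
    // than the current parent, removing "long" connections.
    static boolean nearParent(double distToParent, double distToCandidate) {
        return distToCandidate < distToParent;
    }

    // Tree2List: if the current parent is not closer than a sibling X,
    // hang below X instead; applied everywhere, this straightens
    // branches into a list.
    static boolean tree2List(double distToParent, double distToSiblingX) {
        return distToSiblingX <= distToParent;
    }
}
```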

  16. How to measure closeness? [Figure: nodes A, B, C] • Features • Throughput • Latency • Shared prefix of the IP address
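One of these features, the shared IP-address prefix, is easy to make concrete. A sketch (class and method names are assumptions, not NetSync's API): two addresses sharing a longer leading bit prefix are likely in the same subnet or cluster.

```java
// Shared-prefix closeness for IPv4 addresses (illustrative sketch).
public class IpCloseness {
    // Parse a dotted-quad IPv4 address into a 32-bit integer.
    static int toBits(String ip) {
        int v = 0;
        for (String octet : ip.split("\\.")) v = (v << 8) | Integer.parseInt(octet);
        return v;
    }

    // Number of leading bits the two addresses share (0..32);
    // larger means "closer" in network terms.
    static int commonPrefixBits(String a, String b) {
        int x = toBits(a) ^ toBits(b);
        return x == 0 ? 32 : Integer.numberOfLeadingZeros(x);
    }

    public static void main(String[] args) {
        System.out.println(commonPrefixBits("192.168.1.10", "192.168.1.20")); // 27: same /24
        System.out.println(commonPrefixBits("192.168.1.10", "10.0.0.1"));     // 0: unrelated
    }
}
```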

  17. Property of Heuristics (1) • Assuming there is no firewall… 1. Minimum inter-cluster/subnet connections 2. All nodes are connected to each other as a list [Figure: subnets/clusters joined by single inter-cluster links]

  18. Property of Heuristics (2) • If a firewall blocks some connections… 1. Minimum inter-cluster/subnet connections 2. ≤ N − 1 branches for N subnets (assuming no firewalls inside a subnet) [Figure: a firewall blocking some inter-subnet connections]

  19. Property of Heuristics (3) • If multiple levels of groups exist (subnets, clusters), the heuristics optimize all levels simultaneously • Minimum inter-subnet edges • Minimum inter-cluster edges [Figure: subnets nested within clusters]

  20. Implementation • A file replicator for large data and many nodes, written in Java • Latency-detection resolution: about 1 ms • Usage: • Install and run NetSync on all nodes • Send the file information to a few nodes • Wait for the replication to finish • Very simple usage!

  21. Experiments • Measure performance of our heuristics • Distributed a file to many nodes • Compared completion time • Environments • A single cluster • Multiple clusters

  22. Experiment in a single cluster (1) • Distributed 500MB from one node to the others • 16 nodes in the cluster • Only the NIC (100Mbps) can be a bottleneck • Compared two settings • Random tree • Uses only the base algorithm • Limited # of children from 1 to 5 • Tree2List • NearParent has no effect within a single cluster

  23. Experiment in a single cluster (2) • The fewer the children, the better the performance • Tree2List is very close to optimal • A fixed limit of one child (base algorithm only) is not scalable

  24. Experiment in multiple clusters (1) [Figure: seven clusters connected by 100 Mbps and 1 Gbps links] • Distributed 300MB to over 150 nodes in seven clusters • Compared heuristics on, heuristics off, and a fixed, manually optimized tree

  25. Experiment in multiple clusters (2) • Our heuristics come close to the ideal fixed tree

  26. Related Work • Application-level multicast • Overcast [Jannotti], ALMI [Pendarakis], etc. • Aims to optimize bandwidth and latency • Content Distribution Networks (CDNs) • Have roots in HTTP accelerators and HTTP proxies • Aim to optimize latency and load balancing • Our approach • Maximize throughput, even at the cost of latency

  27. Summary and Future Work • We designed a simple algorithm • for copying large data to many nodes in parallel • with fault tolerance, scalability, self-organization, and adaptive optimization • Evaluations show our implementation is effective in real environments • Future work • Integration with content search, or with storage systems for distributed computing
