
Network Aware Resource Allocation in Distributed Clouds


aricin

Presentation Transcript


  1. Network Aware Resource Allocation in Distributed Clouds

  2. Contribution • Develops efficient resource allocation algorithms • The developed 2-approximation algorithm for optimum Data Center (DC) selection is found to be quite efficient • Develops a heuristic for partitioning the requested resources among the chosen DCs and racks • Minimizes distance (latency) between the selected DCs • Simulations show that this approach yields significant gains

  3. Introduction • Resource allocation is a key function of cloud management and automation • Resource allocation algorithms have a high impact on the performance of applications • They also affect the efficiency of DCs in accommodating requests • User requests require allocation of Virtual Machines (VMs) • To satisfy these requests, the resource allocator maintains an updated list of resources available at DCs, current allocations and future requirements

  4. Introduction • User requests include the number of VMs and the communication links required between the VMs • The automation software's objective is to choose the DC and rack such that overall resource usage is minimized and optimal performance is achieved • These two goals are complementary • Allocators usually attempt to place all requested resources on a single rack, which is not always possible • Thus, for best results, resource allocation algorithms that can handle many scenarios are required

  5. Introduction • Fragmentation of user requests reduces performance • Fragmentation is difficult to solve • This paper focuses on the resource allocation problem in distributed cloud systems spread out geographically over a WAN • Target: latency

  6. System Architecture: Distributed Cloud • Requests should be handled by DCs close to them, which helps improve performance • Racks consist of blade servers, each containing many cores • Communication between blade servers within the same rack happens via a top-of-rack (ToR) switch • Two different racks communicate using an aggregator switch • DC networks are designed with the assumption of locality of communication

  7. System Architecture: Distributed Cloud • As distance between machines increases, the bandwidth decreases • Bandwidth depends on the physical machines that the Virtual Machines (VMs) are assigned to • Overall efficiency of a DC also depends on this • Number of requests serviceable by the DC also depends on this

  8. System Architecture: Distributed Cloud

  9. System Architecture: Cloud Management and Automation S/W • Prior knowledge about communication links may not be available • Automation S/W has to assign resources based on worst-case conditions and then re-optimize • Other conditions also need to be satisfied, such as the number of VMs per DC (for fault tolerance) • Automation S/W computes the mapping of user requests to physical machines

  10. System Architecture: Cloud Management and Automation S/W • The output of the cloud automation software is a mapping of VMs to physical resources • The software interacts with the Network Management System (NMS) and the local Cloud Management System (CMS) • The cloud optimization software has two functionalities: tracking resource usage and optimizing the assignment of user requests • Assignment of user requests consists of identifying DCs and machines • Goal: reduce inter-DC and intra-DC traffic

  11. System Architecture: Cloud Management and Automation S/W

  12. System Architecture: Cloud Management and Automation S/W • Assignment is done in 4 steps • Step 1, DC selection: identify DCs based on user constraints and availability; identify the subset of DCs that minimizes latency • Step 2, partitioning across DCs: minimize inter-DC traffic; adhere to the given constraints and partition VMs accordingly • Step 3, rack, blade and processor selection: identify physical computational resources in the DCs; goal: identify machines with low inter-rack traffic • Step 4, VM placement: assign individual VMs to physical resources; minimize inter-rack traffic

  13. System Architecture: Data Center Selection • Select DCs that meet all specifications and constraints, optimize network resources, and maximize application performance • Use an algorithm that selects a subset of DCs with the fewest hops • Handle other constraints such as maximum or minimum VMs per DC

  14. System Architecture: Data Center Selection • The DC selection problem is a sub-graph selection problem • Given G = (V, E, w, l) • V – data centers • E – paths between DCs • w – number of available VMs at each DC • l – length (distance) of each path • Note: • If there is a constraint on the maximum number of VMs per DC, w takes this value instead • If there is a constraint on the minimum number of VMs per DC, DCs with fewer VMs are omitted
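The two notes on constraints can be read as a pre-processing step on the node weights w. The following is only a minimal sketch of that reading; the function name `effective_weights` and its parameters are hypothetical, and taking `min(free, max_per_dc)` (rather than replacing w outright) is an assumption made so that a DC never counts for more VMs than it has.

```python
def effective_weights(available, max_per_dc=None, min_per_dc=None):
    """Return the node weights w used for DC selection.

    available: dict mapping a DC name to its number of free VMs.
    max_per_dc / min_per_dc: optional per-DC constraints from the request.
    """
    weights = {}
    for dc, free in available.items():
        # DCs that cannot host the required minimum are omitted from the graph.
        if min_per_dc is not None and free < min_per_dc:
            continue
        # A per-DC maximum caps how much of the DC's capacity counts (assumption: min()).
        if max_per_dc is not None:
            free = min(free, max_per_dc)
        weights[dc] = free
    return weights


example = effective_weights({"a": 5, "b": 50, "c": 20}, max_per_dc=30, min_per_dc=10)
```

Here DC "a" is dropped because it has fewer than 10 free VMs, and DC "b" is capped at 30.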

  15. System Architecture: Data Center Selection • Let s be the number of VMs requested • Problem: find a sub-graph of G whose total weight is at least s and whose diameter is minimum • Goal: find the sub-graph with the minimum length of its longest edge • This is an NP-hard problem

  16. System Architecture: Data Center Selection

  17. System Architecture: Data Center Selection • This algorithm finds a star topology centered at v • The diameter of the output sub-graph is at most twice the diameter of the optimal sub-graph

  18. System Architecture: Data Center Selection

  19. System Architecture: Data Center Selection, Running Time • n = number of DCs • FindMinStar has to sort the DCs: O(n log n) • Computing the diameter: O(n^2) • O(FindMinGraph) = n * O(FindMinStar) = O(n^3)
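The algorithm listings themselves did not survive in this transcript. As a rough sketch based only on the description on these slides (grow a star around each candidate center, nearest DCs first, until the weight reaches s, then keep the star with the smallest diameter), the two routines might look like the following. The function names mirror FindMinStar and FindMinGraph from the slides, but the bodies, the `dist` callback, and the example coordinates are assumptions, not the authors' code.

```python
import math
from itertools import combinations


def find_min_star(v, weights, dist, s):
    """Grow a star centered at v: add DCs nearest-first until weight >= s."""
    order = sorted(weights, key=lambda u: dist(v, u))  # O(n log n); v itself is at distance 0
    chosen, total = [], 0
    for u in order:
        chosen.append(u)
        total += weights[u]
        if total >= s:
            # Diameter = longest pairwise distance among the chosen DCs, O(n^2).
            diameter = max((dist(a, b) for a, b in combinations(chosen, 2)), default=0.0)
            return chosen, diameter
    return None, math.inf  # not enough capacity in the whole cloud


def find_min_graph(weights, dist, s):
    """Try every DC as the star center and keep the smallest-diameter result."""
    best, best_diameter = None, math.inf
    for v in weights:  # n calls to find_min_star, so O(n^3) overall
        chosen, diameter = find_min_star(v, weights, dist, s)
        if chosen is not None and diameter < best_diameter:
            best, best_diameter = chosen, diameter
    return best, best_diameter


# Hypothetical example: four DCs on a plane, Euclidean distances.
coords = {"A": (0, 0), "B": (3, 4), "C": (100, 0), "D": (0, 1)}
weights = {"A": 10, "B": 10, "C": 5, "D": 5}
dist = lambda a, b: math.dist(coords[a], coords[b])
selected, diameter = find_min_graph(weights, dist, s=20)
```

In this example the far-away DC "C" is never needed: the three nearby DCs together supply 25 VMs, and the longest distance among them is the 5 units between "A" and "B".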

  20. System Architecture: Machine Selection within DC • Goal: find machines that reduce inter-rack traffic • The DC topology is a tree • Root – core switch • Children – top-level switches • Leaves – racks • Given the tree representation of the DC (T) and the total number of VMs (s) to be placed • Find the sub-tree of minimum height whose weight is at least s
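The sub-tree search can be illustrated with a small post-order traversal. This is a simplified sketch, not the algorithm from the slides (whose listing is missing here): it only considers complete sub-trees rooted at a switch or rack, and the names `children`, `capacity`, and `min_height_subtree` are hypothetical.

```python
def min_height_subtree(children, capacity, root, s):
    """Return (node, height) of the lowest complete sub-tree with weight >= s, or None.

    children: dict mapping a switch to its child nodes; racks have no entry.
    capacity: dict mapping each rack (leaf) to its number of free VMs.
    """
    best = None  # (height, node) of the best sub-tree seen so far

    def visit(node):
        nonlocal best
        kids = children.get(node, [])
        if not kids:
            weight, height = capacity.get(node, 0), 0  # a rack is a sub-tree of height 0
        else:
            results = [visit(k) for k in kids]
            weight = sum(w for w, _ in results)
            height = 1 + max(h for _, h in results)
        if weight >= s and (best is None or height < best[0]):
            best = (height, node)
        return weight, height

    visit(root)
    return None if best is None else (best[1], best[0])


# Hypothetical DC: one core switch, two aggregation switches, four racks.
children = {"core": ["agg1", "agg2"], "agg1": ["r1", "r2"], "agg2": ["r3", "r4"]}
capacity = {"r1": 10, "r2": 15, "r3": 40, "r4": 5}
```

A request for 30 VMs fits on rack "r3" alone, 45 VMs need both racks under "agg2", and 60 VMs force traffic through the core switch.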

  21. System Architecture: Machine Selection within DC

  22. System Architecture: Machine Selection within DC

  23. System Architecture: Virtual Machine Placement • Heuristic algorithms are required for assigning individual VMs to DCs and to CPUs within DCs • The problem is a variant of the graph partitioning and k-cut problems • A user request is represented as a graph G = (V, E) • Nodes represent the VMs to be placed • Edges represent the connections between them • Goal: partition G into disjoint sets c1, c2, …, cm such that communication between the sets is minimized • If traffic is asymmetric, take the average
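To make the objective concrete, the sketch below averages asymmetric traffic into one undirected weight per VM pair and then sums the weights of pairs that end up in different sets. The data layout (a dict keyed by directed VM pairs) and the function names are assumptions for illustration; a missing reverse direction is treated as zero traffic.

```python
def symmetric_traffic(traffic):
    """Average t(a, b) and t(b, a) into one undirected weight per VM pair."""
    sym = {}
    for (a, b), t in traffic.items():
        pair = frozenset((a, b))
        if a == b or pair in sym:
            continue  # ignore self-loops and pairs already handled
        sym[pair] = (t + traffic.get((b, a), 0.0)) / 2
    return sym


def cut_cost(sym, assignment):
    """Total traffic between VMs that are assigned to different sets."""
    return sum(w for pair, w in sym.items()
               if len({assignment[vm] for vm in pair}) > 1)


# Hypothetical request with three VMs and directed traffic demands.
traffic = {("vm1", "vm2"): 4.0, ("vm2", "vm1"): 2.0,
           ("vm2", "vm3"): 1.0, ("vm3", "vm2"): 1.0,
           ("vm1", "vm3"): 6.0}
sym = symmetric_traffic(traffic)
cost = cut_cost(sym, {"vm1": "c1", "vm2": "c1", "vm3": "c2"})
```

With vm1 and vm2 together in c1, the pair weights crossing the cut are 1.0 (vm2, vm3) and 3.0 (vm1, vm3).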

  24. System Architecture: Virtual Machine Placement

  25. System Architecture: Virtual Machine Placement

  26. System Architecture: Virtual Machine Placement • Algorithms 4 and 5 give a heuristic solution to the partition problem • Optimized using Kernighan–Lin heuristics • Runtime: O(n^2 log n)
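Algorithms 4 and 5 are not reproduced in this transcript. The snippet below is therefore not those algorithms and not the full Kernighan–Lin procedure; it is a much simpler pairwise-swap refinement in the same spirit, shown only to illustrate how swapping VMs between sets can lower the cut while keeping set sizes fixed. All names and the input layout are hypothetical.

```python
from itertools import combinations


def cut_cost(weights, assignment):
    """Sum of pair weights whose two VMs sit in different sets."""
    return sum(w for pair, w in weights.items()
               if len({assignment[vm] for vm in pair}) > 1)


def refine_by_swaps(weights, assignment):
    """Greedily swap VM pairs across sets while the cut cost strictly decreases."""
    assignment = dict(assignment)  # work on a copy
    current = cut_cost(weights, assignment)
    improved = True
    while improved:  # terminates: the cost strictly decreases on every accepted swap
        improved = False
        for a, b in combinations(sorted(assignment), 2):
            if assignment[a] == assignment[b]:
                continue
            assignment[a], assignment[b] = assignment[b], assignment[a]
            candidate = cut_cost(weights, assignment)
            if candidate < current:
                current, improved = candidate, True
            else:
                # Undo the swap; set sizes are preserved either way.
                assignment[a], assignment[b] = assignment[b], assignment[a]
    return assignment


# Hypothetical symmetric traffic between four VMs and a poor initial split.
weights = {frozenset("ab"): 5, frozenset("cd"): 5, frozenset("ac"): 1}
start = {"a": 0, "c": 0, "b": 1, "d": 1}
refined = refine_by_swaps(weights, start)
```

Starting from a cut of 10, the refinement ends with the heavy pairs (a, b) and (c, d) co-located and only the light (a, c) link crossing sets.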

  27. Simulation Results • Results are compared to a random approach and a greedy algorithm • The random approach selects a random DC and places as many VMs as possible in that DC • Greedy selects the DC with the maximum number of VMs • To measure performance: • A random topology is created • Random user requests are generated • The maximum distance between any two VMs is measured
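The two baselines can be sketched as the same fill loop driven by a different DC ordering. This is an interpretation of the one-line descriptions above, assuming each baseline keeps picking further DCs until the request is fully placed; the helper names are hypothetical.

```python
import random


def fill_in_order(order, weights, s):
    """Place up to weights[dc] VMs in each DC, in the given order, until s are placed."""
    placed, remaining = {}, s
    for dc in order:
        if remaining == 0:
            break
        take = min(weights[dc], remaining)
        if take > 0:
            placed[dc] = take
            remaining -= take
    return placed if remaining == 0 else None  # None: not enough capacity


def greedy_placement(weights, s):
    """Baseline: visit DCs in decreasing order of available VMs."""
    return fill_in_order(sorted(weights, key=weights.get, reverse=True), weights, s)


def random_placement(weights, s, rng):
    """Baseline: visit DCs in a random order."""
    order = list(weights)
    rng.shuffle(order)
    return fill_in_order(order, weights, s)


weights = {"a": 10, "b": 30, "c": 20}
greedy = greedy_placement(weights, 45)
rand = random_placement(weights, 45, random.Random(0))
```

Neither baseline looks at distances between DCs, which is what the diameter comparison in the experiments measures.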

  28. Simulation Results • Locations of DCs are randomly selected within a 1000x1000 grid • Distance between DCs is the Euclidean distance between points • Five different distributed cloud scenarios • 100 DCs • 75 DCs • 50 DCs • 25 DCs • 10 DCs • However, the average number of machines in each cloud is the same
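A topology of this kind is easy to reproduce. The sketch below assumes uniformly random coordinates and a fixed seed for repeatability; neither detail is stated on the slide, and `random_cloud` is a hypothetical name.

```python
import math
import random


def random_cloud(num_dcs, grid=1000.0, seed=None):
    """Scatter DCs uniformly at random on a grid x grid square."""
    rng = random.Random(seed)
    coords = {f"dc{i}": (rng.uniform(0, grid), rng.uniform(0, grid))
              for i in range(num_dcs)}

    def dist(a, b):
        # Distance between two DCs is the Euclidean distance between their points.
        return math.dist(coords[a], coords[b])

    return coords, dist


coords, dist = random_cloud(25, seed=1)
```

The returned `dist` callable can be passed to any selection routine that only needs pairwise DC distances.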

  29. Simulation Results: Experiment I • Measuring the diameter of the placement for a single request of 1000 VMs • Approximation algorithm performs 79% better • Note: diameter decreases as the number of DCs decreases

  30. Simulation Results: Experiment II • Study cloud systems with a series of user requests • Two experiments • Large requests: 100 requests for 50 – 100 VMs • Small requests: 500 requests for 10 – 20 VMs • Requests are uniformly distributed • Note: in both experiments, the average number of VMs requested is the same

  31. Simulation Results: Experiment II

  32. Simulation Results: Experiment II • Greedy performs better than random by 32.6% and 66.5% • Approximation algorithm performs better than greedy by 83.4% and 86.4% • Why do larger requests require a higher diameter?

  33. Simulation Results: Experiment III • Studies the performance of the cloud system when additional constraints are given • Same requests as in the previous experiment • Resilience is defined as the ratio of total VMs to the maximum number of VMs at any DC • A request therefore needs to be placed in at least as many DCs as its resilience value
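The link between the resilience ratio and the number of DCs is simple arithmetic, sketched below. Rounding the per-DC cap down (so the achieved ratio never falls below the requested resilience) is an assumption, as are the function names.

```python
import math


def max_vms_per_dc(s, resilience):
    """Largest per-DC share that keeps total / max-share >= resilience."""
    return max(1, math.floor(s / resilience))


def min_dcs_needed(s, cap):
    """Fewest DCs that can hold s VMs when each takes at most cap."""
    return math.ceil(s / cap)


cap = max_vms_per_dc(100, 4)    # a request of 100 VMs with resilience 4
dcs = min_dcs_needed(100, cap)
```

For 100 VMs and a resilience of 4, no DC may hold more than 25 VMs, so at least 4 DCs are used. In the DC selection graph this cap would play the role of the maximum-VMs-per-DC constraint on w.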

  34. Simulation Results: Experiment III • Larger requests have a longer diameter • As resilience increases, diameter increases • What is different about these results?

  35. Simulation Results: Experiment III • Performance of the heuristic algorithm • Given the communication requirements and the available capacity of DCs, the algorithm computes a placement of VMs that aims to minimize inter-DC traffic • Comparison of the heuristic algorithm with greedy and random algorithms • Random assigns a random DC to each VM • Greedy selects DCs in decreasing order of availability • While selecting VMs, it chooses VMs with maximum total traffic first

  36. Simulation Results: Experiment III • The experiment assigns a request of 100 VMs to DCs • Bandwidth is fixed randomly between 0 and 1 Mbps • Inter-DC traffic for the assignment of these VMs to k DCs (k = 2,…,8) was studied • Available resources at each DC were between 100/k and 200/k • Hence 100 VMs were being assigned to DCs with a combined capacity of 100 – 200 VMs

  37. Simulation Results: Experiment III • For all algorithms, inter-DC traffic increases as the number of DCs increases. Why? • Greedy algorithm performs better than random by 10.2% • Heuristic algorithm performs better than greedy by 4.6%

  38. Simulation Results: Experiment III • When the DCs did not have excess capacity, inter-DC traffic for the heuristic algorithm was higher by 28.2% • Heuristic algorithm performed better than the other two algorithms by 4.8% • Greedy and random had similar performance

  39. Simulation Results: Experiment IV • In this experiment, the effect of VM traffic on inter-DC traffic is studied • The percentage of links with traffic is varied between 20% and 100% and inter-DC traffic is measured • The DCs have no excess capacity in these experiments • Result: inter-DC traffic grows linearly with the percentage of links with traffic for all algorithms

  40. Conclusions • Main contribution is the development of algorithms for network-aware resource allocation of VMs in distributed cloud systems • Need for these efficient algorithms: inter-DC traffic may be very expensive • 2-approximation algorithm provided for selection of DCs • This algorithm can also be used for rack selection within a DC, but using prior knowledge about the network topology within the DC gives better results • Heuristic algorithm for mapping VMs to resources within a DC

  41. Related Work • Graph partitioning problems • K-cut problem • Maximum sub-graph problem • Assigning VMs inside DCs is studied in "Improving the scalability of data center networks with traffic-aware virtual machine placement"
