Client Assignment in Content Dissemination Networks for Dynamic Data

Client Assignment in Content Dissemination Networks for Dynamic Data. Shetal Shah and Krithi Ramamritham (Indian Institute of Technology Bombay); Chinya Ravishankar (University of California, Riverside). Examples of dynamic data: traffic data (packets through switches, vehicles on highways), stock prices, sports scores.

Presentation Transcript


  1. Client Assignment in Content Dissemination Networks for Dynamic Data
     Shetal Shah and Krithi Ramamritham (Indian Institute of Technology Bombay); Chinya Ravishankar (University of California, Riverside)

  2. Dynamic Data
     Examples: traffic data (packets through switches, vehicles on highways), stock prices, sports scores.
     • Rapid and unpredictable changes
     • Time critical, value critical
     • Used in online monitoring and decision making
     More and more of the data gathered from the web/internet is dynamic.

  3. Coherency of Dynamic Data
     Setting: source S(t), repository R(t), client U(t).
     • Strong coherency
       • The client and source are always in sync (U(t) = S(t))
       • Strong coherency is expensive!
     • Relax strong coherency: Δ-coherency
       • Time domain: Δt-coherency
       • Value domain: Δv-coherency
         • The difference between the data values at the client and the source is bounded by Δv at all times
         • E.g., only temperature changes greater than 1 degree are of interest
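
A minimal Python sketch of value-domain coherency, assuming the source forwards a new value only when it differs from the last forwarded value by more than the tolerance `v`. The function name `filter_updates` and the sample readings are hypothetical and only illustrate the temperature example on this slide.

```python
def filter_updates(values, v):
    """Return the (index, value) pairs that would be pushed for tolerance v.

    Assumption: the first value is always pushed, and a later value is
    pushed only if it differs from the last pushed value by more than v.
    """
    pushed = []
    last = None
    for i, x in enumerate(values):
        if last is None or abs(x - last) > v:
            pushed.append((i, x))
            last = x
    return pushed


# Hypothetical temperature readings with a 1-degree tolerance.
readings = [20.0, 20.4, 21.1, 21.3, 19.9, 20.5]
print(filter_updates(readings, 1.0))
```

Only three of the six readings are pushed here, which is the saving that relaxing strong coherency is meant to buy.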

  4. Broad Focus of Work
     To create a scalable content dissemination network (CDN) for streaming/dynamic data.
     Metric: fidelity, the percentage of time for which the coherency requirement is met.
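
The fidelity metric can be illustrated with a short Python sketch. It assumes the source and client values are sampled at the same equally spaced instants, so the fraction of samples within tolerance approximates the fraction of time; the function name `fidelity` and the traces are hypothetical.

```python
def fidelity(source, client, v):
    """Percentage of samples at which |client - source| <= v.

    Assumption: both traces are sampled at the same, equally spaced times.
    """
    if len(source) != len(client) or not source:
        raise ValueError("traces must be non-empty and of equal length")
    within = sum(1 for s, u in zip(source, client) if abs(u - s) <= v)
    return 100.0 * within / len(source)


# Hypothetical traces: the client lags the source at two instants.
source_trace = [10.0, 10.2, 10.5, 10.9, 11.0]
client_trace = [10.0, 10.0, 10.5, 10.5, 11.0]
print(fidelity(source_trace, client_trace, 0.3))  # 80.0
```

The loss in fidelity referred to in later slides is then 100 minus this value.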

  5. Basic Framework: Sources, Repositories, Clients
     • Clients request different data items, specifying a coherence requirement for each item
     • Repositories derive their requirements from the client requirements
     • The source pushes the changes of interest to the repositories
     • Repositories cooperate with each other and the source to serve clients

  6. Example Dissemination Network
     Data set: p, q, r. Max clients: 2.
     [Figure: a source and repositories R1 to R4; node labels "r: 0.2", "p: 0.4, r: 0.3", "p: 0.2, q: 0.2", "q: 0.3".]

  7. Challenges – I
     • Given the data and coherency needs of repositories, how should repositories cooperate to satisfy these needs?
     • How should repositories refresh the data so that the coherency requirements of dependents are satisfied?
     • How can the repository network be made resilient to failures?
     [VLDB02, VLDB03, IEEE TKDE]

  8. Challenges – II: Service to Clients
     • Assign data to repositories: given the data and coherency needs of clients, what data at what coherency should reside in each repository?
     • Assign clients to repositories: given the data and the coherency available at repositories, how should clients be assigned to the repositories?

  9. Assigning clients to repositories
     Assign <client, data item, coherence> to a repository such that:
     • The client request is satisfied
     • Overheads are low
       • Communication delay
       • Computational delay
     [Figure: the example network (Source, R1 to R4) with a client C1, labelled "q: 0.3", whose repository is still undecided ("?").]

  10. Overview
      • The client assignment problem is NP-hard
      • Solve using preferences
        • Clients and repositories order each other by preferences
        • Use stable marriages
      • Assign costs and do many-to-one client-repository pairing

  11. Cost-based Client Assignment
      • Assign a cost to each potential <client request, repository> pair
      • Minimum cost assignment = {1, 3, 7}
      [Figure: bipartite graph between <client, data item, coherence> requests and repositories, with edge costs 5, 3, 7, 9, 8, 1, 6.]

  12. Client Assignment
      • An assignment may contribute to delay for other assignments at the same node
      • Assignment = {1, 3, 8}
      [Figure: bipartite graph between <client, data item, coherence> requests and repositories, with edge costs 5, 3, 7, 9, 8, 1, 6, labelled "Minimum Weight Matching".]

  13. Many-to-one Matching: Min Cost Network Flows
      • Directed graph G = (V, E)
      • Start vertex
      • End vertex, or sink
      • Edge
        • Capacity: the maximum flow the edge can carry
        • Cost: per unit of flow
      • Intermediate vertex: inflow = outflow
      [Figure: example flow network from Start to End with numeric edge labels.]

  14. Maximum Flow
      • Value of the flow: the flow leaving the start vertex
      • Maximum flow: a flow whose value is maximum
      • Cost of a flow = Σ over edges (flow × cost per unit flow)
      • Min cost flow: a maximum flow of minimum cost
      [Figure: example flow network from Start to End with numeric edge labels.]
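
The definitions on the last two slides (capacity, inflow = outflow, value, cost) can be checked mechanically. The Python sketch below validates a given flow and computes its value and cost; the function name `check_flow`, the node names, and the numbers are hypothetical, since the figure's graph is not recoverable from the transcript.

```python
def check_flow(edges, flow, start, end):
    """Validate a flow and return (value, cost).

    edges: {(u, v): (capacity, cost_per_unit)}
    flow:  {(u, v): units of flow on that edge}
    """
    balance = {}  # inflow minus outflow per vertex
    cost = 0
    for (u, v), (cap, unit_cost) in edges.items():
        f = flow.get((u, v), 0)
        if not 0 <= f <= cap:
            raise ValueError(f"flow on {(u, v)} violates its capacity")
        balance[u] = balance.get(u, 0) - f
        balance[v] = balance.get(v, 0) + f
        cost += f * unit_cost  # cost of flow = sum of flow * unit cost
    for node, b in balance.items():
        if node not in (start, end) and b != 0:
            raise ValueError(f"inflow != outflow at {node}")
    return -balance.get(start, 0), cost  # value = flow leaving the start


# Hypothetical network with two intermediate vertices.
example_edges = {
    ("start", "a"): (2, 1),
    ("start", "b"): (2, 2),
    ("a", "end"): (2, 1),
    ("b", "end"): (1, 1),
}
example_flow = {("start", "a"): 2, ("start", "b"): 1, ("a", "end"): 2, ("b", "end"): 1}
print(check_flow(example_edges, example_flow, "start", "end"))  # (3, 7)
```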

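  15. Client Assignment Using Network Flows
      • Capacity of each <source, client request> edge = 1 (the source here is the Start vertex of the flow network)
      • Sum of capacities on <repository, sink> edges ≥ number of client requests
      • X, Y, Z: the number of clients each repository is willing to serve
      [Figure: flow network from Start through client requests and repositories to End, with capacity 1 on the Start edges and X, Y, Z on the repository-to-End edges.]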

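  16. Network Flows: Costs and Capacities
      • <client request, repository> edge
        • Capacity: 1
        • Cost: a function of communication delays and the coherence requirement
      • Cost of all other edges: 0
      [Figure: the same flow network, with unit capacities on the edges into and out of the client requests and capacities X, Y, Z on the repository-to-End edges.]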

  17. Max Flows
      • Flow out of the start node = number of client requests
      • Each unit of flow makes one assignment
      • Cost of a unit of flow = cost of the assignment
      • Maximum flow of minimum cost => required solution
      • But this could overload the repositories!
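
The construction on the last three slides can be sketched end to end in Python. The talk used the RelaxIV solver; the sketch below swaps in a plain successive-shortest-paths solver with Bellman-Ford purely for illustration. The function names, the cost matrix, and the repository capacities are hypothetical.

```python
INF = float("inf")


def min_cost_max_flow(n, edges, s, t):
    """Successive shortest paths (Bellman-Ford) on a residual graph.

    edges: list of (u, v, capacity, cost). Edge number e of that list is
    stored at index 2*e and its reverse at 2*e + 1.
    Returns (flow_value, total_cost, residual), residual[i] = [to, cap, cost].
    """
    residual, adj = [], [[] for _ in range(n)]
    for u, v, cap, cost in edges:
        adj[u].append(len(residual))
        residual.append([v, cap, cost])
        adj[v].append(len(residual))
        residual.append([u, 0, -cost])
    flow = total = 0
    while True:
        dist, prev = [INF] * n, [-1] * n
        dist[s] = 0
        for _ in range(n - 1):  # Bellman-Ford copes with negative reverse edges
            changed = False
            for u in range(n):
                if dist[u] == INF:
                    continue
                for ei in adj[u]:
                    v, cap, cost = residual[ei]
                    if cap > 0 and dist[u] + cost < dist[v]:
                        dist[v], prev[v], changed = dist[u] + cost, ei, True
            if not changed:
                break
        if dist[t] == INF:  # no augmenting path left: the flow is maximum
            return flow, total, residual
        push, v = INF, t
        while v != s:  # bottleneck capacity along the shortest path
            push = min(push, residual[prev[v]][1])
            v = residual[prev[v] ^ 1][0]
        v = t
        while v != s:  # augment along the path
            ei = prev[v]
            residual[ei][1] -= push
            residual[ei ^ 1][1] += push
            v = residual[ei ^ 1][0]
        flow += push
        total += push * dist[t]


def assign_clients(costs, willing):
    """costs[i][j]: cost of serving request i at repository j.
    willing[j]: number of requests repository j is willing to serve.
    Returns ({request: repository}, total_cost)."""
    m, k = len(costs), len(willing)
    start, end = 0, m + k + 1
    edges = [(start, 1 + i, 1, 0) for i in range(m)]  # capacity 1, cost 0
    pair_of_edge = {}
    for i in range(m):
        for j in range(k):
            pair_of_edge[len(edges)] = (i, j)
            edges.append((1 + i, 1 + m + j, 1, costs[i][j]))
    edges += [(1 + m + j, end, willing[j], 0) for j in range(k)]
    _, total, residual = min_cost_max_flow(end + 1, edges, start, end)
    # A saturated <request, repository> edge (residual capacity 0) is an assignment.
    assignment = {i: j for e, (i, j) in pair_of_edge.items() if residual[2 * e][1] == 0}
    return assignment, total


# Three requests, two repositories willing to serve 2 and 1 requests.
print(assign_clients([[1, 4], [2, 6], [3, 5]], [2, 1]))
```

In this example every request is cheapest at repository 0, but its capacity of 2 forces one request to repository 1; the solver moves the request whose extra cost is smallest.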

  18. Considering Load: Iterative Min Cost Flows
      • Load depends on the coherence requirements of the assignments
      • Assignments depend on this load!
      • Limit the number of requests assigned to a repository using the <repository, sink> capacity
        • But this number does not translate directly into load
        • It does translate into load if the coherences are close to each other

  19. Iterative Min Cost Flows
      Split the requests into ranges. For each range:
      • Calculate the approximate load at each repository due to the previous assignments
      • Calculate the approximate load of the assignments to be made in this range
      • Determine the capacity of each repository
      • Find the min-cost max flow

  20. For Each Range
      • Number of updates for coherence c_i is c_i^(-2)
      • Approximate load at a repository: A_i
      • Average load: A
      • For n client requests, expected load = n × c_i^(-2)
      • Number of repositories: k
      • Let t_i be the number of assignments in the current range to repository R_i
      • Total load at R_i will be A_i + t_i × c_i^(-2)
      • Average load at R after assignment =
      • Capacity for R_i
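
One way to turn these quantities into per-repository capacities is sketched below in Python. It rests on a loudly stated assumption: that capacities are chosen to balance load, so each repository may accept requests until its load reaches the average load after the range is assigned. That rule, the function name `range_capacities`, and the numbers are illustrative and are not taken from the talk.

```python
import math


def range_capacities(loads, n, c):
    """Capacity (number of requests) per repository for one coherence range.

    loads: approximate current load A_i at each repository
    n:     number of client requests in this range
    c:     representative coherence of the range; each request is taken
           to add c ** -2 updates of load
    ASSUMPTION: balance load by filling each repository up to the average
    load after assignment, (sum(A_i) + n * c**-2) / k.
    """
    per_request = c ** -2
    k = len(loads)
    target = (sum(loads) + n * per_request) / k
    return [max(0, math.ceil((target - a) / per_request)) for a in loads]


# Hypothetical loads at three repositories; six requests at coherence 0.5.
print(range_capacities([8.0, 20.0, 8.0], 6, 0.5))  # [3, 0, 3]
```

The already loaded repository receives capacity 0, and the capacities sum to the number of requests, so a maximum flow can still assign every request in the range.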

  21. Best Effort Service
      Client will be served q at coherence 0.2
      [Figure: the example network (Source, R1 to R4) with client C1; node labels "r: 0.2", "q: 0.3", "p: 0.4, r: 0.3", "p: 0.2, q: 0.2", "q: 0.1".]

  22. Augmentation
      Coherence of A for q is changed to 0.1.
      [Figure: the example network (Source, R1 to R4) with client C1; node labels "r: 0.2", "q: 0.3", "p: 0.4, r: 0.3", "p: 0.2, q: 0.1", "q: 0.1".]

  23. Experimental Methodology
      • Network: 1 source, 10 to 20 repositories, 10,000 to 80,000 client requests
      • Real stock traces: 100-1000
      • Time duration of observations: 10,000 s
      • Ranges for min cost flow: {0.01-0.03, 0.03-0.07, 0.07-0.2, 0.2-1.0}
      • Network flow solver: RelaxIV from www.di.unipi.it/di/groups/optimize/ORGroup.html

  24. For comparison…
      • Prior online Global Heuristic
        • Selector node for each data item
        • Selector keeps information on
          • coherence requirements at repositories
          • delays between the nodes in the network
          • number of clients assigned to each repository
        • A client is assigned to the repository where the sum of the delays is minimized
        • Two flavours: GHIS, GHES
      S. Agarwal et al. Construction of a Temporal Coherency Preserving Dynamic Data Dissemination Network. RTSS’04

  25. Performance of the algorithms
      • GHIS initially does better than MCF and GHES, but degrades rapidly
        • unsatisfied requests
        • source overloading!
      • Augmentation performs very well
      • GHES and MCF are comparable for a small number of repositories
      Workload: 50% of client requests have coherence between 0.01 and 0.09; the remainder between 0.1 and 0.99.

  26. MCF vs GHES (best effort)
      • MCF does better as the number of repositories increases
      • In fact, for some simple inputs, MCF did better than GHES by a factor of 9!
      Topology: 1 source, 10 repositories, 50 data items
      [Plot legend: GHIS, GHES, MCF]

  27. Augmentation helps, but…
      • As the load increases, augmentation increases the loss in fidelity
      • As the load increases, serving clients at less stringent coherence requirements might actually reduce the loss in fidelity!

  28. Need to adapt to load: fair vs. biased approaches
      [Figure: two panels, "Fair Approach" and "Biased Approach", comparing MCF and MCF_aug.]
      It is better to be biased than to be fair!

  29. Adaptive Algorithm
      • For each data item, the source maintains a list of unique coherences and the number of clients for each coherence
      • If the queuing delay at any source/repository crosses a threshold th1: for each data item, the source reduces the coherence of service for some clients
      • If the queuing delay at any source/repository goes below a threshold th2: service at the desired coherency is resumed for some of the clients
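
The two-threshold rule behaves like a hysteresis controller, which the Python sketch below illustrates. It tracks only how many clients are currently served at a relaxed coherence; how many clients are changed per step (`batch`), the function name `adapt_step`, and the delay values are assumptions, since the slide says only "some clients".

```python
def adapt_step(delay, relaxed, total_clients, th1, th2, batch=1):
    """Return the new number of clients served at a relaxed coherence.

    delay:   worst queuing delay observed at any source/repository
    relaxed: clients currently served at less stringent coherence
    th1/th2: upper and lower delay thresholds (th2 < th1)
    batch:   ASSUMED number of clients changed per step
    """
    if delay > th1:  # overloaded: relax service for some more clients
        return min(total_clients, relaxed + batch)
    if delay < th2:  # load has eased: restore desired coherence for some
        return max(0, relaxed - batch)
    return relaxed  # between the thresholds: leave things as they are


# Hypothetical delay observations for a data item with 3 clients.
relaxed = 0
history = []
for delay in [0.2, 0.9, 1.2, 0.7, 0.3, 0.1]:
    relaxed = adapt_step(delay, relaxed, 3, th1=0.8, th2=0.4)
    history.append(relaxed)
print(history)  # [0, 1, 2, 2, 1, 0]
```

Keeping th2 below th1 leaves a band in which nothing changes, which avoids flipping clients back and forth when the delay hovers near a single threshold.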

  30. Performance of the adaptive algorithm
      Augmented adaptation performs the best!

  31. Conclusions and Current Work
      Conclusions
      • We prove that the client assignment problem is NP-hard
      • We develop two new heuristics for the client assignment problem
      • We develop an adaptive algorithm for client assignment
      Current Work
      • Investigation of the algorithms in real network settings: PlanetLab

  32. Thank You!
