
The Cache Location Problem




Presentation Transcript


  1. The Cache Location Problem IEEE/ACM Transactions on Networking, Vol. 8, No. 5, October 2000 P. Krishnan, Danny Raz, Member, IEEE, and Yuval Shavitt, Member, IEEE

  2. Abstract • The goal is to minimize the overall flow or the average delay by placing a given number of caches in the network. • The location problems are formulated both for general caches and for transparent en-route caches (TERCs). • A computationally efficient dynamic programming algorithm is presented for the single-server case.

  3. Introduction • The popular locations for caches are at the edge of the network, in the form of browser and proxy caches. • Significant research has gone into optimizing cache performance, cooperation among several caches, and cache hierarchies. Web servers are also replicated to achieve load balancing. • Danzig et al. observed the advantage of placing caches inside the backbone rather than at its edge.

  4. Transparent En-route Caches • When using TERCs, caches are located only along routes from clients to servers. • An en-route cache intercepts any request that passes through it, and either satisfies the request or forwards it toward the server along the regular routing path. • TERCs are easier to manage than replicated web servers, since both the end user and the server remain oblivious to them.

  5. Model and definitions • We consider a general wide-area network in which the internal nodes are routers and the external nodes are servers, clients, or gateways to other subnets. • A client can request a web page from any of the servers, and the server v_s sends this page to the client v_c along the shortest path from the server to the client. • When caches are present, a client can request the page from a cache v_k rather than from the server.

  6. Model and definitions (Cont’d) • Simplifying “full dependency” assumption: if a page will be found in any cache, it will be found in the first cache on the way to the server. • Each client flow is associated with a single number p_f, the cachability of this flow; in other words, p_f is the flow's hit ratio. • Under the full dependency assumption, if all flows have the same hit ratio p, then the hit ratio at any node in the network is also p.

  7. The formal model • Shortest-path routing is used. • The network is represented by an undirected graph G = (V, E): • d(e): the length of edge e • d(v_i, v_j): the sum of the link lengths along the route between nodes v_i and v_j. • The request pattern is modeled by the demand set F: • f_{s,c}: the flow from server v_s to client v_c • p_{s,c}: the hit ratio of the flow

  8. The formal model (Cont’d) • K is the set of at most k nodes where the caches are to be placed. • The cost c_{s,c} of demand f_{s,c} using a cache at location v_k depends on the hit ratio and on the distances to the cache and to the server. • This model does not capture hierarchical structures.
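A plausible form of this cost, assuming that a fraction p_{s,c} of the requests is served from the cache at v_k and the remaining (1 − p_{s,c}) travels all the way to the server v_s:

  c_{s,c}(v_k) = f_{s,c} [ p_{s,c} · d(v_c, v_k) + (1 − p_{s,c}) · d(v_c, v_s) ]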

  9. The general k-cache location problem
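A sketch of the formal statement, consistent with the cost form assumed above: choose a set K of at most k cache locations so that every demand is served through its best cache (or directly by its server) and the total cost is minimized,

  minimize_{K ⊆ V, |K| ≤ k}   Σ_{f_{s,c} ∈ F}   min_{v ∈ K ∪ {v_s}}   f_{s,c} [ p_{s,c} · d(v_c, v) + (1 − p_{s,c}) · d(v_c, v_s) ]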

  10. The k-TERC location problem • The formal definition of the TERC k-cache location problem is exactly the same as the general k-cache location problem, except that the minimization in the objective function is over only those cache locations that lie on the routing path from the server to the client.
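Written out in the same notation (with path(v_s, v_c) denoting the nodes on the routing path from v_s to v_c), the inner minimization becomes

  min_{v ∈ (K ∩ path(v_s, v_c)) ∪ {v_s}}   f_{s,c} [ p_{s,c} · d(v_c, v) + (1 − p_{s,c}) · d(v_c, v_s) ]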

  11. Theorem • The solution of the problem with demands F = {f_{s,c}} and flow hit ratios P = {p_{s,c}} is equivalent to solving the problem for F’ = {f_{s,c} p_{s,c}} with a hit ratio of one. • Proof:
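A sketch of the argument, using the cost form assumed above (the paper's own proof may be organized differently): for any cache choice v,

  f_{s,c} [ p_{s,c} · d(v_c, v) + (1 − p_{s,c}) · d(v_c, v_s) ] = f_{s,c} p_{s,c} · d(v_c, v) + f_{s,c} (1 − p_{s,c}) · d(v_c, v_s).

The second term does not depend on the cache placement K, so it is a constant offset; minimizing the total cost is therefore equivalent to minimizing the first term alone, i.e., solving the problem with demands f’_{s,c} = f_{s,c} p_{s,c} and a hit ratio of one.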

  12. Theorem (Cont’d) The solution for the problem with F’ = {f_{s,c} p_{s,c}} and a hit ratio of one is given by
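A plausible form of that expression, in the notation above:

  cost(K) = Σ_{f_{s,c} ∈ F}   f_{s,c} p_{s,c} · min_{v ∈ K ∪ {v_s}} d(v_c, v)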

  13. Assumption • Based on the theorem, we assume that all flows have the same hit ratio, which we denote by p.

  14. Single web server case • Even the case of a single server is NP-hard for general networks. • This case can be solved on a tree graph. • Fortunately, if the shortest-path routing used in the Internet is stable, the routes from any single server to its various clients form a tree.

  15. Simple greedy algorithm • The intuitive greedy algorithm places caches on the tree iteratively in a greedy fashion. • It checks each node of the tree to determine where to place the first cache, and chooses the node that minimizes the total cost. • It assigns the first cache to this node, and then looks for the best location for the next cache, and so on (see the sketch below). • The complexity of the greedy algorithm is O(nk).
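A minimal Python sketch of this greedy procedure (the tree representation, the uniform hit ratio p, and all identifiers are illustrative, not taken from the paper):

```python
# Greedy cache placement on a tree rooted at the single server.
# parent[v]: v's parent (None for the server/root); edge[v]: length of the edge to the parent;
# demand[v]: client demand originating at v; p: uniform hit ratio (full dependency).

def placement_cost(caches, parent, edge, demand, p):
    total = 0.0
    for v, f in demand.items():
        dist, hit_dist = 0.0, None
        u = v
        while u is not None:                       # walk from the client up to the server
            if hit_dist is None and (u in caches or parent[u] is None):
                hit_dist = dist                    # first cache on the path, or the server itself
            if parent[u] is not None:
                dist += edge[u]
            u = parent[u]
        total += f * (p * hit_dist + (1 - p) * dist)   # dist is now the distance to the server
    return total

def greedy(k, nodes, parent, edge, demand, p):
    caches = set()
    for _ in range(k):                             # place one cache per round
        best = min((v for v in nodes if v not in caches),
                   key=lambda v: placement_cost(caches | {v}, parent, edge, demand, p))
        caches.add(best)
    return caches
```

Note that this naive cost evaluation does not by itself achieve the O(nk) bound quoted on the slide; that requires maintaining per-node cost changes incrementally rather than recomputing the whole placement cost for every candidate.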

  16. Worst case

  17. The optimal dynamic-programming algorithm • The general tree is converted into a binary tree by introducing at most n dummy nodes. • Sort all the nodes in reverse breadth-first order, i.e., all descendants of a node are numbered before the node itself. • For each node i having children i_L and i_R, for each possible number of caches in its subtree (up to k, the maximum number of caches to place), and for each possible distance l from i to the nearest cache above it (up to h, the height of the tree), we compute the optimal cost of the subtree rooted at i.

  18. The optimal dynamic-programming algorithm • One quantity kept per node is the cost of the subtree rooted at i with its caches optimally located, given that the next cache up the tree is at distance l from i. • The other is the sum of the demands in the subtree rooted at i that do not pass through a cache in that optimal solution.

  19. The optimal dynamic-programming algorithm • The recurrence distinguishes two cases: either no cache is to be put at node i, or we put a cache at node i; the subtree cost is the smaller of the two (a sketch of recurrences of this shape follows).
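A sketch of recurrences of this shape, using the hit-ratio-one reduction from the theorem above. Here c_i(l, j) denotes the optimal cost of the subtree rooted at i with j caches inside it when the nearest cache (or the server) above i is at distance l, f_i the demand originating at i, and d_L, d_R the lengths of the edges to the children i_L, i_R; this notation is introduced here for illustration, and the paper's own bookkeeping (including the miss-flow quantity) may differ:

  No cache at node i:      c_i(l, j) = f_i · l + min_{j_L + j_R = j} [ c_{i_L}(l + d_L, j_L) + c_{i_R}(l + d_R, j_R) ]

  Cache at node i (j ≥ 1): c_i(l, j) = min_{j_L + j_R = j − 1} [ c_{i_L}(d_L, j_L) + c_{i_R}(d_R, j_R) ]

c_i(l, j) is the minimum of the two cases, and the answer for the whole tree is c_root(0, k), since the server at the root behaves like a cache at distance zero.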

  20. The optimal dynamic-programming algorithm

  21. The optimal dynamic-programming algorithm • While running the dynamic program we also compute the corresponding miss-flow quantities, and keep track of the cache locations in these partial solutions. • The amount of data we have to keep is O(nhk). • The overall time complexity is bounded by O(nhk²).
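A compact memoized version of this dynamic program in Python, under the same illustrative notation and assumptions as the recurrences above (a toy binary tree; all identifiers are hypothetical):

```python
from functools import lru_cache

# Toy binary tree: children[v] = (left, right) or None for a leaf;
# edge_len[v] = length of the edge from v to its parent;
# demand[v] = demand originating at v, already reduced to hit ratio one.
children = {0: (1, 2), 1: (3, 4), 2: None, 3: None, 4: None}
edge_len = {1: 1, 2: 2, 3: 1, 4: 3}
demand = {0: 0, 1: 0, 2: 3, 3: 5, 4: 2}
K = 2  # number of caches to place

@lru_cache(maxsize=None)
def cost(i, l, j):
    """Minimal cost of the subtree rooted at i with j caches inside it,
    when the nearest cache (or the server) above i is at distance l."""
    kids = children[i]
    if kids is None:                       # leaf: either cache it or pay demand * l
        return 0 if j >= 1 else demand[i] * l
    left, right = kids
    dl, dr = edge_len[left], edge_len[right]
    best = float("inf")
    # Case 1: no cache at i -- local demand travels distance l to the cache above.
    for a in range(j + 1):
        best = min(best, demand[i] * l + cost(left, l + dl, a) + cost(right, l + dr, j - a))
    # Case 2: a cache at i -- children see a cache one edge away; one cache is used up.
    for a in range(j):
        best = min(best, cost(left, dl, a) + cost(right, dr, j - 1 - a))
    return best

# The single server sits at the root, which behaves like a cache at distance zero.
print(cost(0, 0, K))
```

Each state (i, l, j) is solved once, and each transition tries the O(k) ways of splitting the caches between the two children, which is consistent with the O(nhk²) bound quoted on the slide.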

  22. Greedy versus optimal

  23. Comparison of several placement strategies
