The Cache Location Problem IEEE/ACM Transactions on Networking, Vol. 8, No. 5, October 2000 P. Krishnan, Danny Raz, Member, IEEE, and Yuval Shavitt, Member, IEEE
Abstract The goal is to minimize the overall flow or the average delay by placing a given number of caches in the network. The location problems are formulated both for general caches and for transparent en-route caches (TERCs). A computationally efficient dynamic programming algorithm is presented for the single-server case.
Introduction The popular locations for caches are at the edge of networks, in the form of browser and proxy caches. Significant research has gone into optimizing cache performance, cooperation among several caches, and cache hierarchies. Web servers are also replicated to achieve load balancing. Danzig et al. observed the advantage of placing caches inside the backbone rather than at its edge.
Transparent En-route Caches When using TERCs, caches are located only along routes from clients to servers. An en-route cache intercepts any request that passes through it, and either satisfies the request or forwards it toward the server along the regular routing path. TERCs are easier to manage than replicated web servers since they are transparent to both the end user and the server.
Model and definitions Consider a general wide area network, where the internal nodes are routers and the external nodes are servers, clients, or gateways to different subnets. A client can request a web page from any of the servers, and the server v_s sends this page to the client v_c on the shortest path from the server to the client. When caches are present, a client can request the page from a cache v_k rather than from the server.
Model and definitions (Cont'd) Simplifying "full dependency" assumption: if a page will be found in any cache, it will be found in the first cache on the way to the server. Each client flow is associated with a single number p_f, the cachability of this flow; in other words, p_f is the flow's hit ratio. The full dependency assumption implies that if all the flows have the same hit ratio p, the hit ratio at any node in the network is also p.
The formal model Shortest-path routing is used. The network is represented by an undirected graph G = (V, E):
d(e) — the length of edge e
d(v_i, v_j) — the sum of the link distances along the route between nodes v_i and v_j
The request pattern is modeled by the demand set F:
f_s,c — the flow from server v_s to client v_c
p_s,c — the hit ratio of the flow
The formal model (Cont'd) K is the set of at most k nodes where the caches are to be placed. The cost c_s,c of demand f_s,c using a cache at location v_k is
c_s,c(v_k) = f_s,c · [ p_s,c · d(v_c, v_k) + (1 − p_s,c) · (d(v_c, v_k) + d(v_k, v_s)) ]
i.e., a fraction p_s,c of the flow is served by the cache, and the rest continues from the cache to the server. This model does not capture hierarchical structures.
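This cost model can be sketched in a few lines of code (a minimal illustration; the function and parameter names are mine, not the paper's, and it assumes a missed request continues from the cache on to the server):

```python
def demand_cost(f, p, d_ck, d_ks):
    """Cost of a demand of volume f with hit ratio p, using a cache
    at distance d_ck from the client and d_ks from the server.
    Hits travel only client<->cache; misses also travel cache<->server."""
    return f * (p * d_ck + (1 - p) * (d_ck + d_ks))

def no_cache_cost(f, d_cs):
    """Without a useful cache, every request travels the full
    client-to-server distance d_cs."""
    return f * d_cs

# Example: a cache halfway along a route of length 10.
print(demand_cost(f=100, p=0.4, d_ck=5, d_ks=5))  # 100 * (0.4*5 + 0.6*10)
print(no_cache_cost(f=100, d_cs=10))
```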
The general k-cache location problem Find a set K ⊆ V of at most k cache locations that minimizes the total cost
cost(K) = Σ_{f_s,c ∈ F} min_{v_k ∈ K ∪ {v_s}} c_s,c(v_k)
where choosing v_k = v_s means the demand is served directly by the server.
The k-TERC location problem The formal definition of the TERC k-cache location problem is exactly as the general k-cache location problem, except that the minimization in the objective function is over the set (K ∩ path(v_s, v_c)) ∪ {v_s}, i.e., a demand may only use a cache that lies on its routing path from the server to the client.
Theorem The solution of the problem with demands F = {f_s,c} and flow hit ratios P = {p_s,c} is equivalent to solving the problem for F' = {f_s,c · p_s,c} with a hit ratio of one. Proof: for a cache v_k on the route from v_s to v_c we have d(v_c, v_k) + d(v_k, v_s) = d(v_c, v_s), so
c_s,c(v_k) = f_s,c · p_s,c · d(v_c, v_k) + f_s,c · (1 − p_s,c) · d(v_c, v_s)
The second term does not depend on the cache location, hence minimizing the total cost is equivalent to minimizing Σ f_s,c · p_s,c · d(v_c, v_k), which is exactly the problem with demands F' and a hit ratio of one.
Theorem (Cont'd) The solution for the problem with F' = {f_s,c · p_s,c} and a hit ratio of one is given by
cost(K) = Σ_{f_s,c ∈ F} f_s,c · p_s,c · min_{v_k} d(v_c, v_k)
where v_k ranges over the allowed cache locations for the demand (plus v_s), and the constant miss term Σ f_s,c · (1 − p_s,c) · d(v_c, v_s) is added back to recover the original cost.
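The reduction can be checked numerically for the en-route case. This sketch (with illustrative names and made-up numbers) shows that scaling a flow by its hit ratio leaves the optimal cache location unchanged:

```python
# For a cache at distance d_ck from the client on a route of total
# length d_cs: cost = f*p*d_ck + f*(1-p)*d_cs, and the second term
# is the same for every cache location.

def cost_original(f, p, d_ck, d_cs):
    return f * p * d_ck + f * (1 - p) * d_cs

def cost_reduced(f_prime, d_ck):
    # reduced flow f' = f*p with hit ratio one
    return f_prime * d_ck

f, p, d_cs = 100, 0.3, 10
candidates = [0, 2, 5, 10]  # candidate cache distances from the client

best_orig = min(candidates, key=lambda d: cost_original(f, p, d, d_cs))
best_red = min(candidates, key=lambda d: cost_reduced(f * p, d))
assert best_orig == best_red  # both problems pick the same location
```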
Assumption Based on the theorem, we assume that all flows have the same hit ratio, which we denote by p. Complexity of the cache location problem:

             Line   Bounded-degree tree   Tree   General graph
1 server     Poly   Poly                  Poly   NP
m servers    Poly   Poly                  NP     NP
Single web server case Even the case where we have a single server is NP-hard for general networks, but this case can be solved on a tree graph. Fortunately, if the shortest-path routing implied by the Internet is stable, the routes to the various clients, as viewed by any single server, form a tree.
Simple greedy algorithm The intuitive greedy algorithm places caches on the tree iteratively. It checks each node of the tree to determine where to place the first cache, and chooses the node that minimizes the cost. It assigns the first cache to this node, and then looks for the best location for the next cache, and so on. The complexity of the greedy algorithm is O(nk).
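A minimal sketch of this greedy procedure on a tree rooted at the server, assuming the reduced model (hit ratio one) from the theorem; the `parent`/`dist`/`demand` structures are illustrative, not from the paper:

```python
def total_cost(parent, dist, demand, caches):
    """Each demand pays the distance from its node up to the first
    cache on the way to the server (the root itself acts as a cache)."""
    cost = 0.0
    for v, f in demand.items():
        d, u = 0.0, v
        while u not in caches and u in parent:  # stop at a cache or the root
            d += dist[u]
            u = parent[u]
        cost += f * d
    return cost

def greedy(parent, dist, demand, k):
    """Place k caches one at a time, each time at the node that
    minimizes the total cost given the caches placed so far."""
    caches = set()
    for _ in range(k):
        best = min((v for v in parent),  # candidates: all non-root nodes
                   key=lambda v: total_cost(parent, dist, demand, caches | {v}))
        caches.add(best)
    return caches

# Example: path s - a - b - c with unit edges; the heavy demand at c
# attracts the single cache.
parent = {"a": "s", "b": "a", "c": "b"}
dist = {"a": 1.0, "b": 1.0, "c": 1.0}
demand = {"a": 1.0, "b": 1.0, "c": 10.0}
print(greedy(parent, dist, demand, 1))
```

Each of the k placements scans all n candidate nodes, which is the source of the O(nk) iteration count quoted above (each candidate additionally requires a cost evaluation).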
Worst case
The optimal dynamic-programming algorithm The general tree is converted into a binary tree by introducing at most n dummy nodes. Sort all the nodes in reverse breadth-first order, i.e., all descendants of a node are numbered before the node itself. For each node i having children i_L and i_R, for each j, 0 ≤ j ≤ k, where k is the maximum number of caches to place, and for each l, 0 ≤ l ≤ h, where h is the height of the tree, we compute the quantity C_i,j(l).
The optimal dynamic-programming algorithm C_i,j(l) is the cost of the subtree rooted at i with j optimally located caches, where the next cache up the tree is at distance l from i. F_i,j(l) is the sum of the demands in the subtree rooted at i that do not pass through a cache in the optimal solution of C_i,j(l).
The optimal dynamic-programming algorithm If no cache is to be put at node i:
C_i,j(l) = min over j_L + j_R = j of [ C_iL,jL(l + d(i, i_L)) + C_iR,jR(l + d(i, i_R)) ] + f_i · l
If we put a cache at node i:
C_i,j(l) = min over j_L + j_R = j − 1 of [ C_iL,jL(d(i, i_L)) + C_iR,jR(d(i, i_R)) ]
C_i,j(l) is the minimum of these two options.
The optimal dynamic-programming algorithm
While running the dynamic program we should also compute the appropriate F_i,j(l) values, and keep track of the locations of the caches in these solutions. The amount of data we have to keep is O(nhk), and the overall time complexity is bounded by O(nhk²).
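The two-option recurrence can be sketched as a memoized program. This is an illustrative reconstruction, not the paper's pseudocode: it assumes unit-length edges (so l takes integer values 0..h) and the reduced model with hit ratio one, and it returns only the optimal cost rather than the cache locations:

```python
import functools

def optimal_cost(children, demand, root, k):
    """Minimum total cost of placing up to k caches in the tree
    rooted at the server `root` (hit ratio one, unit-length edges)."""

    @functools.lru_cache(maxsize=None)
    def C(i, j, l):
        # C(i, j, l): optimal cost of the subtree rooted at i using
        # j caches, when the nearest cache above i is at distance l.
        kids = tuple(children.get(i, ()))
        # No cache at i: i's own demand travels l to the next cache up,
        # and the children see that cache one edge farther away.
        best = split(kids, j, l + 1) + demand.get(i, 0.0) * l
        if j >= 1:
            # Cache at i: i's demand is served locally, and the
            # children see a cache at distance 1 (one unit edge).
            best = min(best, split(kids, j - 1, 1))
        return best

    def split(kids, j, l):
        # Distribute j caches among the (at most two) children.
        if not kids:
            return 0.0
        if len(kids) == 1:
            return C(kids[0], j, l)
        a, b = kids
        return min(C(a, ja, l) + C(b, j - ja, l) for ja in range(j + 1))

    # The server at the root acts as a cache at distance 0.
    return C(root, k, 0)

# Example: path s - a - b - c with demand concentrated at c;
# one cache at c leaves only a and b paying for their routes.
children = {"s": ("a",), "a": ("b",), "b": ("c",)}
demand = {"a": 1.0, "b": 1.0, "c": 10.0}
print(optimal_cost(children, demand, "s", 1))
```

On this small example the optimum coincides with the greedy choice; the slide above ("Greedy versus optimal") is precisely about the inputs where they diverge.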
Greedy versus optimal
Comparison of several placement strategies