Performance Evaluation of Web Proxy Cache Replacement Policies
Orit Brimer, Ravit Krayif, Sigal Ishay
Introduction
The World-Wide Web has grown tremendously in the past few years to become the most significant source of traffic on the Internet today. This growth has led to overloaded web servers, network congestion, and an increase in the response time observed by the client. Caching of web documents is widely used to reduce both latency and network traffic in accessing data.
Web proxy servers
A proxy server is a server that acts as an intermediary between a workstation user and the Internet so that the enterprise can ensure security, administrative control, and caching service. Web proxy servers that cache documents can potentially improve performance in three ways:
- Reduce the latency that an end user experiences in retrieving a web page.
- Lower the volume of network traffic resulting from web page requests.
- Reduce the number of requests that reach popular servers.
How does a web proxy work?
A proxy server receives a request for an Internet service (such as a web page request) from a user. The proxy server looks in its local cache of previously downloaded web pages.
- Cache hit: if it finds the page, it returns it to the user without forwarding the request to the Internet.
- Cache miss: if the page is not in the cache, the proxy server, acting as a client on behalf of the user, requests the page from the origin server.
How does a web proxy work? (cont.)
[Figure: clients, proxy cache, and servers across the Internet, with hit and miss paths]
When the page is returned, the proxy server matches it to the original request and forwards it to the user.
Cache Replacement Policy
Since the proxy has finite storage, some strategy must be devised to periodically replace documents in favor of more popular ones. The replacement policy decides which web pages are cached and which are replaced, and therefore determines which future requests will be cache hits. This makes the replacement policy an important factor in the cache's performance.
Project Goal
Design and implement a web proxy cache simulator in order to test several different replacement policies and other parameters. Evaluate the performance of the web proxy cache for each parameter by the following metrics:
- Hit rate
- Byte hit rate
- Response time
How does the simulation work?
1. Client requests are simulated by the ProwGen simulator.
2. The proxy simulator attempts to fulfill each request from the web pages stored in its cache.
3. If the requested web page is found (a cache hit), the proxy can immediately respond to the client's request, and the hit rate and byte hit rate are updated.
4. If the requested web page is not found (a cache miss), the proxy must retrieve the web page from the origin server, so the miss is written to the input file for NS.
5. The NS simulator is then started; it simulates the transfer of the missed pages from the servers to the proxy.
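[Figure: simulation overview. Clients (simulated by the ProwGen simulator) send requests for web pages to the web proxy. A cache hit is answered from the cache. On a cache miss, the request for the web page goes to the web servers (simulated by the NS simulator), and the requested web page is saved in the cache.]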
ProwGen simulator
ProwGen is a synthetic web proxy workload generator. The workload generator incorporates five selected workload characteristics which are relevant to caching performance:
- one-time referencing
- file popularity
- file size distribution
- correlation between file size and popularity
- temporal locality
ProwGen simulator (cont.)
ProwGen is used in our project to create a trace file with about [...] requests. Each request simulated by ProwGen has an id and a page size. The proxy simulator maps the id to the server that holds this page and adds a time of arrival to each request. The time of arrival has an exponential distribution with [...].
An example of the requests file:
Time of arrival | Server number | Page size
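The step of turning a ProwGen trace into a requests file can be sketched as follows. This is a minimal illustration, not the project's code: it assumes that the gaps between consecutive arrivals are exponentially distributed, and the `rate` parameter, the `page_id % num_servers` mapping, and the function name `build_request_file` are all hypothetical.

```python
import random

def build_request_file(trace, num_servers, rate, seed=0):
    """Turn (page_id, size) pairs into (arrival_time, server, size) rows.

    Assumptions (not from the slides): inter-arrival gaps are exponential
    with the given rate, and a page id maps to a server by modulo.
    """
    rng = random.Random(seed)
    now = 0.0
    rows = []
    for page_id, size in trace:
        now += rng.expovariate(rate)      # exponential gap to the next arrival
        server = page_id % num_servers    # hypothetical id -> server mapping
        rows.append((now, server, size))
    return rows

requests = build_request_file([(101, 2048), (7, 512), (101, 2048)],
                              num_servers=20, rate=5.0)
for arrival, server, size in requests:
    print(f"{arrival:.4f}\t{server}\t{size}")
```

Whatever mapping is used, it has to be deterministic so that repeated requests for the same page id always go to the same server.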
NS simulator
What is NS?
- NS is a discrete event simulator targeted at networking research.
- NS is a multiprotocol simulator that implements unicast and multicast routing algorithms, transport protocols, and session protocols.
What is it good for?
- Evaluating the performance of existing network protocols, so that protocols can be compared.
- Prototyping and evaluating new protocols.
- Running large-scale simulations that are not possible in real experiments.
Using NS in our project
Simulation script flow:
1. Create the event scheduler.
2. Create the network configuration.
3. Create the transport connections (TCP connections).
4. Create traffic on top of TCP (FTP application).
5. Transmit application-level data.
Input files:
- nsRand: contains the parameters for creating the network topology.
- nsout: contains the requests that were not in the cache. For each request it lists: 1. server number 2. page size 3. the arrival time of the request.
Using NS in our project (cont.)
Creating the network configuration: The network topology in our project is a star topology; the web proxy is connected to each server by a duplex link. The topology is given as an input file to the ns script. It is defined by the following parameters:
1. Number of servers.
For each duplex link we define:
2. Delay: random parameter between [...] ms.
3. Bandwidth: random parameter between 1 and 10 Mb.
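Generating the random link parameters for the star topology might look like the sketch below. The bandwidth range of 1 to 10 Mb comes from the slide. The delay range is not recoverable from the source, so `delay_range_ms` is a caller-supplied parameter and the value used in the example is purely illustrative. Uniform sampling and the name `random_topology` are also assumptions.

```python
import random

def random_topology(num_servers, delay_range_ms,
                    bandwidth_range_mb=(1.0, 10.0), seed=0):
    """Return one link description per server in a star around the proxy.

    delay_range_ms is an assumption supplied by the caller; uniform
    sampling is also an assumption.
    """
    rng = random.Random(seed)
    return [
        {
            "server": s,
            "delay_ms": rng.uniform(*delay_range_ms),
            "bandwidth_mb": rng.uniform(*bandwidth_range_mb),
        }
        for s in range(num_servers)
    ]

# (10.0, 100.0) is an illustrative delay range, not the project's value.
links = random_topology(20, delay_range_ms=(10.0, 100.0))
```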
NAM visualization for network configuration
Using NS in our project (cont.)
Creating connections and traffic: NS parses the input file and, for each miss, opens a TCP session with the origin server and retrieves the file from it. The pages are transferred from the servers to the proxy using the FTP application. NS creates an output file that contains the retrieval time of each request. This is done by defining a special procedure which is called automatically at the end of each session. The retrieval time of a request depends on the link attributes and has an effect on the web proxy performance. We compare this time for each replacement algorithm.
The procedure done:
Agent/TCP instproc done {} {
    global tcpsrc NodeNb ns ftp Out tcp_snk totalSum PR
    # print in $Out: node, session, start time, end time, duration,
    # trans-pkts, transm-bytes, retrans-bytes, throughput (how many bytes
    # transferred per second).
    set duration [expr [$ns now] - [$self set starts]]
    # k is the source node
    set k [$self set node]
    # l is the number of the session.
    set l [$self set sess]
    set totalSum [expr $totalSum + $duration]
    # ndatapack_ is the number of packets transmitted by the connection.
    # ndatabytes_ is the number of data bytes transmitted by the connection.
    # nrexmitbytes_ is the number of bytes retransmitted by the connection.
    puts $Out "$k \t $l \t [$self set starts] \t\
        [$ns now] \t $duration \t [$self set ndatapack_] \t\
        [$self set ndatabytes_] \t [$self set nrexmitbytes_]"
}
An example of the code from the Tcl script
Pruning Algorithms
We describe several cache replacement algorithms proposed in recent studies, which attempt to minimize various cost metrics such as miss ratio, byte miss ratio, average latency, and total cost. These algorithms are used in our simulation and compared at its end. In our implementation, each page has a pruning value field, and this field holds different information depending on the specific pruning algorithm. The HTML pages are sorted according to this field, so the pruning step is very simple and similar for almost all algorithms. The following algorithms were implemented and tested:
LRU: Least Recently Used
LRU evicts the document which was requested least recently. It is based on the observation that documents which have been referenced in the recent past will likely be referenced again in the near future. We implemented this algorithm by holding a time stamp in the pruning value field of the page. When a page in the cache is accessed, the value of this field is set to the current time. The page with the lowest time stamp is replaced.
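The time-stamp approach to LRU can be sketched in a few lines. This is only an illustration: a dictionary stands in for the project's sorted list, and the helper names `lru_access` and `lru_victim` are hypothetical.

```python
def lru_access(cache, page_id, now):
    """On insert or hit, set the pruning value to the current time."""
    cache[page_id] = now

def lru_victim(cache):
    """The page with the lowest time stamp is the one to replace."""
    return min(cache, key=cache.get)

cache = {}
lru_access(cache, "a", 1)
lru_access(cache, "b", 2)
lru_access(cache, "a", 3)   # "a" is touched again, so "b" is now the oldest
victim = lru_victim(cache)
```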
LFU: Least Frequently Used
The Least Frequently Used policy maintains a reference count for every object in the cache. The object with the lowest reference count is selected for replacement. The motivation for this algorithm is that some pages are accessed more frequently than others, so the reference count can be used as an estimate of the probability that a page will be referenced again. The page with the lowest estimated probability of being referenced again is replaced.
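A minimal sketch of the reference-count bookkeeping for LFU follows. The helper names are hypothetical, and ties are broken by dictionary insertion order here, which the slides do not specify.

```python
def lfu_access(counts, page_id):
    """Increment the reference count; a new page starts at 1."""
    counts[page_id] = counts.get(page_id, 0) + 1

def lfu_victim(counts):
    """The page with the lowest reference count is replaced."""
    return min(counts, key=counts.get)

counts = {}
for page in ["a", "a", "b", "c", "c"]:
    lfu_access(counts, page)
victim = lfu_victim(counts)   # "b" was referenced only once
```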
Hybrid Algorithm
The purpose of the HYB algorithm is to minimize the time that end users wait for a document to load. HYB is a hybrid of several factors, considering not only download time but also the number of references to a document and the document size. Each server in the serversDB holds the bandwidth and delay of the link which connects it to the proxy. HYB selects for replacement the document i with the lowest value of the following expression:
$(clat_{ser(i)} + W_B / cbw_{ser(i)}) \cdot (nref_i \cdot W_N) / s_i$
- $clat_{ser(i)}$: estimated latency (time) to open a connection to the server.
- $cbw_{ser(i)}$: bandwidth of the connection (in megabytes/second).
- $nref_i$: number of references to document i since it last entered the cache.
- $s_i$: the size in bytes of document i.
- $W_B$ and $W_N$ are constants.
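The HYB expression can be evaluated as in the sketch below, which follows the formula exactly as written on the slide. The weights `w_b = w_n = 1.0` and the two example pages are illustrative assumptions, not values from the project.

```python
def hyb_value(clat, cbw, nref, size, w_b, w_n):
    """Pruning value for HYB as written on the slide; lowest is evicted."""
    return (clat + w_b / cbw) * (nref * w_n) / size

# Hypothetical pages: (clat seconds, cbw, nref, size in bytes)
pages = {
    "near_big":  (0.01, 10.0, 1, 10000),
    "far_small": (0.10, 1.0, 2, 1000),
}
values = {name: hyb_value(*p, w_b=1.0, w_n=1.0) for name, p in pages.items()}
victim = min(values, key=values.get)
```

In this example the large page behind a fast link scores lowest and is evicted first, while the small, twice-referenced page behind a slow link is kept.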
GreedyDual-Size
This algorithm combines locality, size, and latency/cost concerns in order to achieve the best overall performance. The algorithm associates a value, H, with each cached page p. Initially, when a page is brought into the cache, H is set to the cost of bringing the page into the cache. When a replacement needs to be made, the page with the lowest H value (minH) is replaced, and all remaining pages reduce their H values by minH. If a page is accessed, its H value is restored to the cost of bringing it into the cache. Thus, the H values of recently accessed pages retain a larger portion of the original cost than those of pages that have not been accessed for a long time. GreedyDual-Size selects for replacement the document i with the lowest value of the following expression:
$(clat_{ser(i)} + 1 / cbw_{ser(i)}) / s_i$
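One common way to implement the "reduce all H values by minH" step without touching every page is to keep a running inflation value and add it to H whenever a page is inserted or accessed. The sketch below takes that approach, which preserves the same eviction order. It is an assumption that the project works this way (the `inflationVal` field in the WebProxy struct hints at it), and the class and method names are hypothetical.

```python
class GreedyDualSize:
    """GreedyDual-Size sketch using an inflation offset instead of
    subtracting minH from every cached page (equivalent ordering)."""

    def __init__(self):
        self.inflation = 0.0   # H value of the most recently evicted page
        self.h = {}            # page_id -> H value

    def access(self, page_id, clat, cbw, size):
        # On insert or hit: H = inflation + cost / size,
        # where cost = clat + 1 / cbw as in the expression on the slide.
        self.h[page_id] = self.inflation + (clat + 1.0 / cbw) / size

    def evict(self):
        victim = min(self.h, key=self.h.get)
        self.inflation = self.h.pop(victim)
        return victim

gds = GreedyDualSize()
gds.access("a", clat=0.1, cbw=1.0, size=1000)   # H = 0.0011
gds.access("b", clat=0.1, cbw=1.0, size=100)    # H = 0.011
first = gds.evict()                              # "a" has the lowest H
gds.access("c", clat=0.1, cbw=1.0, size=1000)   # H = 0.0011 + 0.0011
second = gds.evict()
```

Pages inserted after an eviction start from the raised baseline, so an old page that is never touched again eventually becomes the minimum even if its original cost was high.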
Size
The Size policy, designed specifically for web proxy caches, removes the largest object from the cache when space is needed for a new object. We implemented this algorithm by holding the page size in the pruning value field of the page.
Data Structures
struct HtmlPage {
    long int id;
    long int size;
    double prunningValue;
    int reference;
    long int timeStamp;
    struct HtmlPage* next;
};
The WebProxy holds a cache which is implemented as a sorted list of HtmlPages.
struct WebProxy {
    List* cache;
    double currMemory;
    long int proxyHits;
    double byteHits;
    double inflationVal;
};
Data Structures (cont.)
struct WebServer {
    /* the index in the servers array */
    int sNum;
    double bandwidth;
    int delay;
};
We hold an array of WebServers. Each WebServer holds information about the delay and bandwidth of the link that connects it to the proxy.
Basic Implementation
The program consists of several stages:
1. Create random values for the network configuration.
2. Read request by request from a trace file created by ProwGen.
3. For each request:
   - Check whether the page is stored in the cache.
   - If so, record a proxy hit and update the pruning value of the page according to the pruning algorithm.
   - If the page is not in the cache, record a miss and write the request to the misses file. The WebProxy creates a new page and updates its pruning value according to the pruning algorithm.
   - The WebProxy checks whether there is enough memory in the cache for this page.
Basic Implementation (cont.)
   - If not, it removes pages from the cache according to the pruning algorithm, so that the occupied memory in the cache after inserting the new page will not exceed TRESHOLD * CACHE_SIZE.
   - The page is cached.
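The per-request loop described above can be sketched as follows. This is a simplified illustration under several assumptions: the cache is a dictionary scanned for its minimum rather than a sorted list, the threshold value 0.9 is hypothetical, a page that cannot fit even in an empty cache is simply not cached, and the pruning value is supplied by a caller-provided function `prune_value(now, size)`.

```python
THRESHOLD = 0.9   # hypothetical value; the slides do not give it

def simulate(requests, cache_size, prune_value):
    """requests: list of (page_id, size). Returns (hits, misses, cache)."""
    cache = {}      # page_id -> [size, pruning value]
    used = 0
    hits = 0
    misses = []     # stands in for the misses file handed to NS
    limit = THRESHOLD * cache_size
    for now, (page_id, size) in enumerate(requests):
        if page_id in cache:
            hits += 1
            cache[page_id][1] = prune_value(now, size)
            continue
        misses.append((page_id, size))
        # Evict lowest pruning values until the new page fits under the limit.
        while cache and used + size > limit:
            victim = min(cache, key=lambda p: cache[p][1])
            used -= cache.pop(victim)[0]
        if used + size <= limit:
            cache[page_id] = [size, prune_value(now, size)]
            used += size
    return hits, misses, cache

trace = [("a", 40), ("b", 40), ("a", 40), ("c", 40), ("b", 40)]
# LRU: the pruning value is the time of the last access.
hits, misses, cache = simulate(trace, cache_size=100,
                               prune_value=lambda now, size: now)
```

Swapping the policy only changes `prune_value`; for example, `lambda now, size: -size` evicts the largest page first, as in the Size policy.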
Performance analysis
This section evaluates the performance of the web proxy cache for each replacement policy. We examined the replacement policies for different cache sizes: 4, 8, 16, 32, 64, 128, and 256 MB. The simulations were executed on two different network topologies: 20 and 100 servers. In this study we use three metrics to evaluate the performance of the proxy cache:
- Hit rate: the percentage of all requests that can be satisfied by searching the cache for a copy of the requested object.
- Byte hit rate: the percentage of all data that is transferred directly from the cache rather than from the origin server.
- Average response time: the average time it takes to bring a web page that caused a cache miss.
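The difference between hit rate and byte hit rate is easiest to see on a small example. The sketch below computes both from a log of (size, hit) pairs; the function name and the sample log are illustrative only.

```python
def hit_metrics(log):
    """log: list of (size_bytes, was_hit). Returns (hit rate %, byte hit rate %)."""
    total_requests = len(log)
    total_bytes = sum(size for size, _ in log)
    hit_count = sum(1 for _, hit in log if hit)
    hit_bytes = sum(size for size, hit in log if hit)
    return (100.0 * hit_count / total_requests,
            100.0 * hit_bytes / total_bytes)

# Two small pages hit, two large pages miss.
hit_rate, byte_hit_rate = hit_metrics(
    [(100, True), (900, False), (100, True), (900, False)]
)
print(hit_rate, byte_hit_rate)   # 50.0 10.0
```

Half of the requests are hits, but only a tenth of the bytes come from the cache, which is the pattern to expect from a policy that prefers to keep small pages.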
Hit Rate
The following table shows the hit rate of the tested algorithms. Network configuration: 20 servers in a star topology.
[Table: hit rate per cache size (MB); columns: GREEDY, HYBRID, SIZE, LFU, LRU]
Analyze results
As we expected, the GREEDY and HYBRID algorithms show the best hit rate (they were designed to maximize hit rate). The graph shows that the hit rate grows as the cache size grows, but the slope decreases starting from a cache size of 64 MB.
Byte Hit Rate
The following table shows the byte hit rate of the tested algorithms. Network configuration: 20 servers in a star topology.
[Table: byte hit rate per cache size (MB); columns: GREEDY, HYBRID, SIZE, LFU, LRU]
Analyze results
SIZE gets the lowest byte hit rate. This result is not surprising, since SIZE removes the largest pages from the cache. GREEDY and HYBRID also consider the size of the page when calculating its pruning value, and therefore these algorithms do not achieve the best byte hit rate.
Average time per request
The following table shows the average time per request of the tested algorithms. Network configuration: 20 servers in a star topology.
[Table: average time per request per cache size (MB); columns: GREEDY, HYBRID, SIZE, LFU, LRU]
Analyze results
SIZE gets the highest (worst) average time per request. This result is not surprising, since SIZE showed the worst byte hit ratio. GREEDY and HYBRID show the best result although they do not have an optimal byte hit ratio; this is because they take into consideration the cost (delay and bandwidth) of bringing a page from the origin server. In addition, they showed the best hit rate, which also affects this metric.
Hit Rate
The following table shows the hit rate of the tested algorithms. Network configuration: 100 servers in a star topology.
[Table: hit rate per cache size (MB); columns: GREEDY, HYBRID, SIZE, LFU, LRU]
Analyze results
The GREEDY and HYBRID algorithms still show the best hit rate. As expected, changing the network configuration did not influence the hit rate.
Byte Hit Rate
The following table shows the byte hit rate of the tested algorithms. Network configuration: 100 servers in a star topology.
[Table: byte hit rate per cache size (MB); columns: GREEDY, HYBRID, SIZE, LFU, LRU]
Analyze results
As in the graph for the 20-server network configuration, SIZE gets the lowest byte hit rate. GREEDY and HYBRID also consider the size of the page when calculating its pruning value, and therefore these algorithms do not achieve the best byte hit rate.
Average time per request
The following table shows the average time per request of the tested algorithms. Network configuration: 100 servers in a star topology.
[Table: average time per request per cache size (MB); columns: GREEDY, HYBRID, SIZE, LFU, LRU]
Analyze results
GREEDY and HYBRID give the lowest average time per request. These are the expected results, since they are the only algorithms that consider the cost of retrieving a page from an origin server. In this network configuration the gap between GREEDY and HYBRID on one side and the other algorithms on the other is clear.
Conclusions
The algorithm that gives the best overall results is GREEDY:
- Best hit rate
- Best average time per request
The algorithm that gives the worst results is SIZE:
- Worst byte hit rate
- Worst average time per request
LRU and LFU give the best byte hit rate. This can be explained by the fact that these are the only algorithms that do not take page size into account.
Conclusions (cont.)
The HYBRID algorithm shows good performance in the following metrics:
- Hit rate
- Average time per request
However, GREEDY shows better results than HYBRID in all the metrics.
For all the tested algorithms, the hit rate improves significantly when the cache size increases from 4 MB to 64 MB. Beyond this point the improvement is much more moderate.