1 Evaluation of Cooperative Web Caching with Web Polygraph
Ping Du and Jaspal Subhlok
Department of Computer Science, University of Houston
Presented at WCW 02, Aug 14
2 Web Cache Hierarchy
[Diagram labels: Parent Web Cache, Child Web Cache, sibling-sibling relationship, parent-child relationship, Request, ICP Queries, To Origin Server]
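The request flow in the diagram can be sketched as a small model. This is a simplified illustration, not Squid's actual algorithm: the `Cache` class, its `resolve` method, and the rule that a fetched object is always stored locally are assumptions made for the example, and the ICP exchange is reduced to a direct lookup in each sibling's store.

```python
class Cache:
    """Toy cache node (hypothetical model, not Squid code)."""

    def __init__(self, name):
        self.name = name
        self.store = set()    # URLs currently held by this cache
        self.siblings = []    # peers that are only asked "do you have it?"
        self.parent = None    # peer that will fetch on our behalf

    def resolve(self, url):
        """Return a label describing where the object was found."""
        if url in self.store:
            return "local hit"
        # Stand-in for ICP queries: ask each sibling whether it holds the URL.
        for sibling in self.siblings:
            if url in sibling.store:
                self.store.add(url)
                return f"sibling hit ({sibling.name})"
        # No sibling has it: forward to the parent if one is configured.
        if self.parent is not None:
            source = self.parent.resolve(url)
            self.store.add(url)
            return f"via parent ({source})"
        # Top of the hierarchy: go to the origin server.
        self.store.add(url)
        return "origin server"


# Two sibling child caches sharing one parent.
a, b, p = Cache("A"), Cache("B"), Cache("P")
a.siblings, b.siblings = [b], [a]
a.parent = b.parent = p

first = a.resolve("/x")    # misses everywhere, fetched through the parent
second = b.resolve("/x")   # now found in sibling A
third = a.resolve("/x")    # already stored locally
```

The three calls show the three outcomes the later slides distinguish: a miss that reaches the origin server, a remote (sibling) hit, and a local hit.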
3 Motivation and Goals
No practical methods exist to evaluate cache hierarchies under specific workload and network conditions
–Important for designing a “caching solution”
Criteria for an evaluation system
–Model reality well
–Applicable to different protocols & structures
–Experiments should be repeatable
–Use both hit rate and user response time as metrics
Solution based on Web Polygraph
4 Cache Evaluation with Web Polygraph
[Diagram labels: Polysrv, Proxy Cache, Polyclt]
Synthetic HTTP clients and servers on real machines on a LAN
Workload parameterized by size, distribution, popularity, load, and many others
5 Hierarchy Evaluation with Web Polygraph
[Diagram labels: Polysrv, Polyclt, Proxy Cache, Network Delay]
6 Evaluation Framework
Web Polygraph
–Reports throughput, response time, hit ratio, etc. from the client’s viewpoint (but is unaware of the hierarchy)
Dummynet
–Used to simulate networks of different capabilities by controlling bandwidth, latency, and packet loss
Squid cache and Squeezer log analysis tool
–Capture cache cooperation info
–Modified to monitor specific Polygraph phases
Squeezer and Polygraph info have to be reconciled
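The kind of cooperation information available on the cache side can be illustrated with a short log summary. The sketch below is not Squeezer; it assumes Squid's native access.log layout (elapsed time in field 2, result code in field 4, hierarchy code in field 9), and the sample lines, addresses, and the `classify` and `summarize` helpers are made up for the example. Treating every result code that contains `HIT` as a local hit is also a simplification.

```python
from collections import defaultdict

# Hypothetical sample entries in Squid's native access.log layout.
SAMPLE_LOG = """
1029312000.123     45 10.0.0.5 TCP_HIT/200 4512 GET http://example.com/a - NONE/- text/html
1029312000.456    180 10.0.0.5 TCP_MISS/200 8210 GET http://example.com/b - SIBLING_HIT/10.0.0.7 text/html
1029312001.001    950 10.0.0.6 TCP_MISS/200 1200 GET http://example.com/c - DIRECT/192.0.2.10 image/gif
1029312001.500     15 10.0.0.6 TCP_MEM_HIT/200 4512 GET http://example.com/a - NONE/- text/html
"""


def classify(line):
    """Return (kind, elapsed_ms) for one access.log line."""
    fields = line.split()
    elapsed_ms = int(fields[1])
    result_code = fields[3].split("/")[0]      # e.g. TCP_HIT, TCP_MISS
    hierarchy_code = fields[8].split("/")[0]   # e.g. NONE, SIBLING_HIT, DIRECT
    if "HIT" in result_code:
        kind = "local_hit"
    elif hierarchy_code.endswith(("SIBLING_HIT", "PARENT_HIT")):
        kind = "remote_hit"   # object was served by a cooperating cache
    else:
        kind = "miss"
    return kind, elapsed_ms


def summarize(lines):
    """Share of requests and mean elapsed time per kind."""
    totals = defaultdict(lambda: [0, 0])   # kind -> [count, total_ms]
    for line in lines:
        if not line.strip():
            continue
        kind, elapsed_ms = classify(line)
        totals[kind][0] += 1
        totals[kind][1] += elapsed_ms
    n = sum(count for count, _ in totals.values())
    if n == 0:
        return {}
    return {
        kind: {"share": count / n, "mean_ms": total / count}
        for kind, (count, total) in totals.items()
    }


summary = summarize(SAMPLE_LOG.splitlines())
```

Client-side tools see only whether a response was a hit and how long it took; the split between local and remote hits comes from the cache logs, which is why the two sources need to be reconciled.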
7 Experimental Setup
Experiments performed on cache hierarchies of two, three, and four Squid caches
All Squid machines have the same hardware configuration (800 MHz, 256 MB, four 30 GB disks)
Polygraph machines and caches on the same 100 Mbps switched Ethernet network
Balanced workload
Cache “fill-up” phase not measured
8 List of Experiments
Performance with different cache hierarchies
Influence of network latency
Influence of cache size
Influence of the document sharing pattern
One big cache compared to multiple caches
Virtually unlimited experiment space with many parameters (e.g., request rate, public interest, cache size, memory size)
9 List of Cache Hierarchies
[Diagram of hierarchies: 2OY, 3SY, 3OY, 1ON-2OY, 2SY, 2SY-1OY, 2OY-1OY, 1OY-2SY, 1ON-2SY]
[Legend: Cache, Client, Sibling-sibling, Parent-child]
Same memory and disk per cache, fixed total request rate, no network delay
10 Simulation Results - Different Hierarchies
Benefit of peering
Improved hit ratio overcomes overheads of peering
Parents appear less important than siblings
Improved hit ratio
11 List of Experiments
Performance with different cache hierarchies
Influence of network latency
–2 and 3 Squid caches, independent or as siblings
–Network delay of 0, 40, or 80 msec between caches
Influence of cache size
Influence of the document sharing pattern
One big cache compared to multiple caches
12 Impact of Network Latency
Hit ratio unaffected by latency
Hit and miss response times increase with latency
Some increase in response time going from 0 to 40 to 80 msec
Cache cooperation is helpful even with modest network delay
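A back-of-the-envelope model shows why inter-cache delay can raise both hit and miss response times while leaving the hit ratio unchanged. Everything in the sketch is an assumption for illustration, not a measured result: the function name, the hit ratios and service times, and the simplification that a sibling query costs one round trip and a sibling fetch costs one more.

```python
def mean_response_ms(local_hit, remote_hit, t_local_ms, t_origin_ms, delay_ms):
    """Mean response time under a simplified sibling-cache model.

    local_hit / remote_hit: fractions of requests served locally / by a sibling
    t_local_ms:  time to serve an object from the local cache
    t_origin_ms: time to fetch an object from the origin server
    delay_ms:    one-way network delay between sibling caches
    """
    if not 0.0 <= local_hit + remote_hit <= 1.0:
        raise ValueError("hit fractions must sum to a value in [0, 1]")
    miss = 1.0 - local_hit - remote_hit
    query_wait = 2 * delay_ms     # sibling query and reply (assumed one round trip)
    remote_fetch = 2 * delay_ms   # fetching the object from the sibling
    t_remote = t_local_ms + query_wait + remote_fetch
    t_miss = t_origin_ms + query_wait   # misses also wait for sibling replies
    return local_hit * t_local_ms + remote_hit * t_remote + miss * t_miss


# Hypothetical numbers: 50% local hits, 25% sibling hits, 25% misses.
no_delay = mean_response_ms(0.5, 0.25, 10, 1000, delay_ms=0)
with_delay = mean_response_ms(0.5, 0.25, 10, 1000, delay_ms=40)
```

In this model the hit fractions are inputs that do not depend on `delay_ms`, while both the sibling-hit and miss paths pick up the extra round trips, which is the qualitative pattern listed on the slide.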
13 Conclusion
Web Polygraph-based framework to evaluate cooperative caching:
–Flexible
–Works on a real network
–Workload characteristics are easy to specify
–Repeatable experiments
–Hit ratio and user response time based metrics
–Captures actual cooperation overheads
14 Future Work
Make the toolset easily usable by the community – currently only recipe-style help is available
Evaluation of large hierarchies may need a combination of experimental and analytical methods
More results on the performance of different kinds of hierarchies in different scenarios
15 Influence of Cache Size
Two Squid caches, running isolated or as siblings
Various total disk cache sizes
Same total memory cache size
Same constant request rate
No network latency between caches
16 Simulation Results - Cache Size
Cooperative caches
–Higher hit & miss response time
Miss response time is stable
Increase in hit response time
–Ratio of memory cache size to disk cache size
Performance with increase of cache size
–Improves quickly
–Stabilizes gradually
–Benefits of cooperation increase
17 Influence of the Document Sharing Pattern
Two Squid caches, running isolated or as siblings
Various document sharing patterns
–Global URL space
–Public interest: the percentage of all documents shared by Polygraph clients
Same total disk cache size
Same total memory cache size
Same constant request rate
No network latency between caches
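The public interest parameter can be pictured with a small request generator. This is an illustrative model, not Polygraph's implementation: the `next_url` function, the URL naming scheme, and the uniform document choice are all assumptions.

```python
import random


def next_url(client_id, public_interest, rng, space_size=1000):
    """Pick the next URL for one client (hypothetical model).

    With probability public_interest the request goes to a URL space shared
    by all clients; otherwise it goes to a space private to this client.
    """
    doc = rng.randrange(space_size)
    if rng.random() < public_interest:
        return f"/public/{doc}"
    return f"/private/{client_id}/{doc}"


rng = random.Random(2002)   # fixed seed so runs are repeatable
all_shared = [next_url("c1", 1.0, rng) for _ in range(100)]
none_shared = [next_url("c1", 0.0, rng) for _ in range(100)]
mixed = [next_url("c2", 0.5, rng) for _ in range(100)]
```

In this model only public URLs can be requested by clients attached to different caches, so raising the public interest mostly increases the chance that a sibling already holds a requested document.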
18 Simulation Results - Document Sharing Pattern
Performance improves with public interest
Influence is mainly on remote hits
19 Working Set Size
20 Performance of one big cache compared to multiple caches
One, two, three, and four Squid caches
Isolated or all-sibling cache hierarchies
Best effort workload
–Constant rate vs. best effort workload
–Used to get the best throughput
Same total disk cache size
Same total memory cache size
No network latency between caches
21 Simulation Results - one big cache compared to multiple caches
One cache: the worst throughput
Two separate caches: large improvement
More separate caches: performance declines
All-sibling hierarchy
–Improvement is more stable
–Levels off quickly
–Overheads of peering eventually outweigh the improved hit ratio
22 Methodology - Phase Schedule
Caches are in a stable state after the Fill phase
Simulate the daily Web traffic pattern in a short period
[Figure: phase schedule with phases Fill, Inc, Top, Dec]
Web Polygraph provides a way to customize the desired workload pattern through a phase schedule
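A phase schedule of this shape can be modeled as a list of phases, each with a duration and a load level that ramps linearly from start to end. The sketch below is written in Python rather than Polygraph's own configuration language, and the `Phase` fields, durations, and load factors are hypothetical values chosen only to reproduce the Fill, Inc, Top, Dec pattern.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str
    duration_min: float   # phase length in minutes (made-up values below)
    load_beg: float       # load factor at the start of the phase
    load_end: float       # load factor at the end of the phase


SCHEDULE = [
    Phase("Fill", 60, 1.0, 1.0),   # warm the caches at full load
    Phase("Inc", 30, 0.1, 1.0),    # ramp up, like morning traffic
    Phase("Top", 60, 1.0, 1.0),    # sustained peak
    Phase("Dec", 30, 1.0, 0.1),    # ramp down
]


def load_factor_at(minute, schedule=SCHEDULE):
    """Return (phase name, load factor) at the given minute of the run."""
    if minute < 0:
        raise ValueError("minute must be non-negative")
    start = 0.0
    for phase in schedule:
        end = start + phase.duration_min
        if minute < end:
            frac = (minute - start) / phase.duration_min
            load = phase.load_beg + frac * (phase.load_end - phase.load_beg)
            return phase.name, load
        start = end
    raise ValueError("minute is past the end of the schedule")


mid_ramp = load_factor_at(75)   # halfway through the Inc phase
```

Measurements would then be restricted to the phases after Fill, matching the setup in which the cache fill-up phase is not measured.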