Evaluating Content Management Techniques for Web Proxy Caches. Martin Arlitt, Ludmila Cherkasova, John Dilley, Rich Friedrich, and Tai Jin. Proceedings of the Second Workshop on Internet Server Performance, May 1999. Chun-Fu Kung, System Laboratory, Dept. of Computer Engineering and Science, Yuan-Ze University, 2000/9/13.
Outline: Workload characteristics; Experimental design; Simulation results; Virtual cache; Conclusion.
Workload Characteristics: Current web proxy caches use simple replacement policies to decide which files to retain in the cache. Access log: cable modem users, five months, 115 million requests for 1.3 TB of data. Cacheable objects: 92% of requested objects were cacheable. Object set size: large. Object size: 4 KB to 148 MB.
Workload Characteristics (cont.): Recency of reference: about 1/3 of re-references occur within one hour and approximately 2/3 within 24 hours. Frequency of reference: over 60% of objects are accessed only once; we call these objects "one-timers". Turnover: objects that were popular in the past do not necessarily remain popular.
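The one-timer statistic above can be measured directly from a request log. The following is a minimal sketch, assuming the log has already been reduced to a sequence of object identifiers; the function name `one_timer_fraction` and the toy trace are illustrative and are not taken from the paper.

```python
from collections import Counter


def one_timer_fraction(requests):
    """Return the fraction of distinct objects requested exactly once."""
    counts = Counter(requests)
    if not counts:
        return 0.0
    once = sum(1 for c in counts.values() if c == 1)
    return once / len(counts)


# Hypothetical trace: object "a" is popular, the rest are one-timers.
trace = ["a", "b", "a", "c", "d", "a", "e"]
print(one_timer_fraction(trace))
```

A high one-timer fraction matters for replacement policy design because caching such objects consumes space without ever producing a hit.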
Experimental Design: The replacement policies compared are Least Recently Used (LRU), SIZE, GreedyDual-Size (GD-Size), Least Frequently Used (LFU), GreedyDual-Size with Frequency (GDSF), and Least Frequently Used with Dynamic Aging (LFU-DA).
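To make the difference between the size-aware and frequency-based policies concrete, here is a minimal sketch of GDSF and LFU-DA as they are commonly formulated: each object carries a priority key, the object with the smallest key is replaced, and an aging value L is set to the key of the most recently replaced object. It assumes a retrieval cost of 1 for every object, so the GDSF key is L + frequency / size and the LFU-DA key is L + frequency. The class `KeyedCache`, the function `replay`, and the toy trace are hypothetical and are not the authors' simulator.

```python
class KeyedCache:
    """Toy cache that replaces the object with the smallest priority key."""

    def __init__(self, capacity_bytes, policy):
        self.capacity = capacity_bytes
        self.policy = policy      # "GDSF" or "LFU-DA"
        self.used = 0
        self.age = 0.0            # aging value L
        self.entries = {}         # name -> [key, frequency, size]

    def _key(self, freq, size):
        # Assumption: every object has retrieval cost 1.
        if self.policy == "GDSF":
            return self.age + freq / size
        return self.age + freq    # LFU-DA ignores size

    def access(self, name, size):
        """Return True on a hit, False on a miss."""
        if name in self.entries:
            entry = self.entries[name]
            entry[1] += 1
            entry[0] = self._key(entry[1], entry[2])
            return True
        if size > self.capacity:
            return False          # too large to cache at all
        while self.used + size > self.capacity:
            victim = min(self.entries, key=lambda n: self.entries[n][0])
            self.age = self.entries[victim][0]   # L follows the victim's key
            self.used -= self.entries[victim][2]
            del self.entries[victim]
        self.entries[name] = [self._key(1, size), 1, size]
        self.used += size
        return False


def replay(trace, capacity_bytes, policy):
    """Replay a non-empty trace; return (hit rate, byte hit rate)."""
    cache = KeyedCache(capacity_bytes, policy)
    hits = hit_bytes = total_bytes = 0
    for name, size in trace:
        total_bytes += size
        if cache.access(name, size):
            hits += 1
            hit_bytes += size
    return hits / len(trace), hit_bytes / total_bytes


# Hypothetical trace of (object name, size in bytes).
trace = [("big", 80), ("big", 80), ("big", 80), ("small", 20),
         ("new", 20), ("small", 20), ("big", 80)]
for policy in ("GDSF", "LFU-DA"):
    print(policy, replay(trace, 100, policy))
```

On this toy trace GDSF replaces the large object first because its key is divided by size, while LFU-DA keeps it because of its reference count, which is why the two policies favor different metrics.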
Results 1
Results 2
Results 3
Results 4
Virtual Cache: We developed an approach that can target both metrics, hit rate and byte hit rate, simultaneously. The cache is logically partitioned into n virtual caches, VC 0 through VC n-1. Replacements from VC i are moved to VC i+1. Replacements from VC n-1 are evicted from the cache. All objects that are reaccessed while in the cache are reinserted in VC 0.
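The movement rules above can be sketched in a few lines. This is a simplified illustration only: it assumes every virtual cache holds a fixed number of objects and is managed with LRU, and that new objects enter at VC 0. The class name `VirtualCaches` and the slot counts are hypothetical; a real configuration could manage each VC with a different replacement policy and with byte capacities.

```python
from collections import OrderedDict


class VirtualCaches:
    """Chain of virtual caches VC 0 .. VC n-1, each a small LRU list."""

    def __init__(self, slots_per_vc):
        self.slots = list(slots_per_vc)
        # Each OrderedDict keeps its least recently inserted object first.
        self.vcs = [OrderedDict() for _ in self.slots]

    def _insert(self, i, name):
        if i == len(self.vcs):
            return                      # replaced from VC n-1: leaves the cache
        vc = self.vcs[i]
        vc[name] = True
        if len(vc) > self.slots[i]:
            victim, _ = vc.popitem(last=False)
            self._insert(i + 1, victim)  # replacement from VC i moves to VC i+1

    def access(self, name):
        """Return True on a hit in any VC, False on a miss."""
        for vc in self.vcs:
            if name in vc:
                del vc[name]
                self._insert(0, name)    # reaccessed objects return to VC 0
                return True
        self._insert(0, name)            # assumption: new objects start in VC 0
        return False


cache = VirtualCaches([2, 2])
for obj in ["a", "b", "c", "d", "a", "e"]:
    print(obj, cache.access(obj))
print([list(vc) for vc in cache.vcs])
```

Because an object pushed out of one virtual cache gets a second chance in the next, the policy used in each VC can be chosen to favor a different metric.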
Results 5
Results 6
Results 7
Results 8
Conclusion: Our results indicate that size-based algorithms achieve higher hit rates than other algorithms. We also found that frequency-based algorithms are more effective at improving the byte hit rate of a proxy cache. We have developed virtual caches as an approach to provide optimal cache performance for multiple metrics simultaneously.