1 On Filter Effects in Web Caching Hierarchies
Carey Williamson
Department of Computer Science, University of Calgary
2 Introduction
- “The Web is both a blessing and a curse…”
- Blessing:
  - Internet available to the masses
  - Seamless exchange of information
- Curse:
  - Internet available to the masses
  - Stress on networks, protocols, servers, users
- Motivation: techniques to improve the performance and scalability of the Web
3 Why is the Web so slow?
- Client-side bottlenecks (PC, modem)
  - Solution: better access technologies
- Server-side bottlenecks (busy Web site)
  - Solution: faster, scalable server designs
- Network bottlenecks (Internet congestion)
  - Solutions: caching, replication; improved protocols for client-server communication
4 Example of a Web Proxy Cache
[Figure: Web client, proxy server, Web server]
5 Our Previous Work
- Evaluation of Canada’s national Web caching infrastructure for CANARIE’s CA*net II backbone
- Workload characterization and evaluation of the CA*net II Web caching hierarchy (IEEE Network, May/June 2000)
- Developed a Web proxy caching simulator for trace-driven simulation evaluation of Web proxy caching architectures
- Developed a synthetic Web proxy workload generator called ProWGen [Busari/Williamson, INFOCOM 2001]
6 CA*net II Web Caching Hierarchy (Dec 1998)
[Figure: hierarchy map showing USask and CANARIE (Ottawa), the selected measurement points for our traffic analyses (6-9 months of data from each), and a link to NLANR]
7 Caching Hierarchy Overview
Level                      Cache size   Cache hit ratio (empirically observed)
Regional/Univ.             5-10 GB      30-40%
National                   10-20 GB     15-20%
Top-Level/International    20-50 GB     5-10%
[Figure: clients (C) attached to proxies at the bottom of the hierarchy]
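The per-level hit ratios above can be combined with a little arithmetic. The sketch below assumes (this is not stated on the slide) that each ratio is measured over the requests that actually reach that level, i.e. the misses forwarded by the level below, and it uses the midpoints of the observed ranges. The function name is illustrative only.

```python
def cumulative_hit_ratio(level_hit_ratios):
    """Fraction of client requests served somewhere in the hierarchy.

    Assumes each ratio applies to the requests that reach that level,
    i.e. the misses forwarded by the level below.
    """
    miss_fraction = 1.0
    for hit_ratio in level_hit_ratios:
        miss_fraction *= 1.0 - hit_ratio
    return 1.0 - miss_fraction


# Midpoints of the ranges on the slide: regional, national, top-level.
midpoints = [0.35, 0.175, 0.075]
overall = cumulative_hit_ratio(midpoints)
print(f"overall hit ratio: {overall:.3f}")  # about 0.504
```

Under this assumption the two upper levels together add roughly 15 percentage points beyond the first level's 35%, which is one way to see the diminishing returns.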
8 Some Observations on Multi-Level Caching...
- Caching hierarchy not very effective, due to a “diminishing returns” effect
- Reason: workload characteristics change as you move up the caching hierarchy (due to filtering effects, etc.)
- Bigger caches aren’t really the answer
- Better caching system design might be...
9 Research Goals
- Develop better understanding of cache filter effects (intuitively, quantitatively)
- Try to do something about it!
- Idea #1: Try different cache replacement policies at different levels of the hierarchy
- Idea #2: Try partitioning cache content in the overall hierarchy based on size or type to limit replication, etc.
10 Talk Overview
- Background/Motivation
- Understanding Cache Filtering Effects
- Exploiting Cache Filtering Effects
- Summary and Conclusions
11 Part I: Understanding Cache Filter Effects
12 Simulation Model
[Figure: Web clients, lower-level (children) proxy servers, an upper-level (parent) proxy server, and Web servers]
13 Experimental Methodology
- Trace-driven simulation (empirical traces)
- Multi-factor experimental design
- Cache size
  - 1 MB to 32 GB
- Cache replacement policy
  - Recency-based: LRU (currently active docs)
  - Frequency-based: LFU-Aging (popular docs)
  - Size-based: GD-Size (favours smaller docs)
- Analyze workload characteristics
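To make the size-based policy concrete, here is a minimal sketch of GreedyDual-Size, the algorithm usually abbreviated GD-Size. It assumes a uniform retrieval cost of 1 per document, so each document's priority is H = L + 1/size and small documents are favoured. The class and method names are hypothetical and this is not the simulator used in the study.

```python
class GDSizeCache:
    """GreedyDual-Size sketch with uniform cost 1 (an assumption)."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.inflation = 0.0  # the running "L" value
        self.entries = {}     # doc_id -> (priority H, size in bytes)

    def access(self, doc_id, size):
        """Process one request; return True on a hit, False on a miss."""
        if doc_id in self.entries:
            _, stored_size = self.entries[doc_id]
            # On a hit, restore the priority relative to the current L.
            self.entries[doc_id] = (self.inflation + 1.0 / stored_size, stored_size)
            return True
        if size > self.capacity:
            return False  # too large to cache at all
        while self.used + size > self.capacity:
            # Evict the lowest-priority document and raise L to its priority.
            victim = min(self.entries, key=lambda d: self.entries[d][0])
            priority, victim_size = self.entries.pop(victim)
            self.inflation = priority
            self.used -= victim_size
        self.entries[doc_id] = (self.inflation + 1.0 / size, size)
        self.used += size
        return False


cache = GDSizeCache(capacity_bytes=100)
cache.access("big", 80)
cache.access("small", 10)
cache.access("new", 50)       # forces an eviction; "big" has the lowest priority
print(sorted(cache.entries))  # ['new', 'small']
```

The linear scan for the victim keeps the sketch short; a heap would be the usual choice for large caches.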
14 Web Workload Characteristics
- “One-timers” (60-70% of docs are useless!!!)
- Zipf-like document referencing popularity
- Heavy-tailed file size distribution (i.e., most files are small, but most bytes are in big files)
- Zero correlation between document size and document popularity (debate!)
- Temporal locality (temporal correlation between recent past and near future references)
[Mahanti et al. PER 2000]
15 Zipf-Like Referencing
- An intrinsic “power-law” relationship in the way that humans organize, access, and use information (e.g., library books, English words in text, movie rentals, Web sites, Web pages,...)
- Plotting item popularity versus relative rank on a log-log scale results in a straight line
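The straight-line property suggests a simple way to estimate the Zipf slope of a trace: count references per document, rank the documents by popularity, and fit a least-squares line to log(count) versus log(rank). The sketch below is one way to do that; the function name and the toy trace are illustrative and not taken from the talk.

```python
import math
from collections import Counter


def zipf_slope(trace):
    """Least-squares slope of log(reference count) versus log(rank).

    `trace` is a sequence of document identifiers and must contain
    at least two distinct documents.
    """
    counts = sorted(Counter(trace).values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Toy trace in which the document at rank r is referenced 840 / r times.
toy_trace = [doc for doc in range(1, 9) for _ in range(840 // doc)]
print(round(zipf_slope(toy_trace), 3))  # -1.0
```

Real traces only approximate a line, so the fitted slope depends on which ranks are included in the fit.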
16 Example: Zipf-Like Document Popularity Profile for UofS Trace
17-22 Quiz Time: What do you get AFTER the cache?
[Figures: candidate answers (a), (b), (c), and (d), revealed one per slide]
23 Quiz Time: What do you get AFTER the cache?
Answer: (c)
24 Simulation Results for Input Workload Traces with Different Initial Zipf Slopes
25 The Magnitude of the Filter Effect Depends on Cache Size
26 Filter Effect Depends on Cache Replacement Policy
27 Filter Effect is Most Pronounced at First-Level Cache
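The filter effect can be reproduced in a few lines: generate a Zipf-like request stream, pass it through a cache, and compare the popularity profile of the input with that of the miss stream the next level would see. The sketch below is a simplification under stated assumptions: a synthetic trace instead of the empirical traces, an LRU cache whose capacity is counted in documents rather than bytes, and illustrative function names and parameter values.

```python
import random
from collections import Counter, OrderedDict


def zipf_trace(num_docs, num_requests, alpha, seed=0):
    """Synthetic trace; the document with index r-1 has weight 1 / r**alpha."""
    rng = random.Random(seed)
    weights = [1.0 / rank ** alpha for rank in range(1, num_docs + 1)]
    return rng.choices(range(num_docs), weights=weights, k=num_requests)


def lru_miss_stream(trace, capacity_docs):
    """Return the requests that miss in an LRU cache holding capacity_docs documents."""
    cache = OrderedDict()
    misses = []
    for doc in trace:
        if doc in cache:
            cache.move_to_end(doc)          # refresh recency on a hit
        else:
            misses.append(doc)              # forwarded to the next level
            cache[doc] = True
            if len(cache) > capacity_docs:
                cache.popitem(last=False)   # evict the least recently used
    return misses


trace = zipf_trace(num_docs=1000, num_requests=20000, alpha=0.8)
misses = lru_miss_stream(trace, capacity_docs=100)
before, after = Counter(trace), Counter(misses)
print("requests before/after the cache:", len(trace), len(misses))
print("references to the most popular doc:", before[0], "->", after[0])
```

The most popular documents are almost always cache hits, so they nearly disappear from the miss stream: the head of the popularity profile is flattened while the tail of rarely referenced documents passes through.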
28 Part II: Exploiting Cache Filter Effects
29 Research Questions: Multi-Level Caches
- In a multi-level caching hierarchy, can overall caching performance be improved by using different cache replacement policies at different levels of the hierarchy?
- In a multi-level caching hierarchy, can overall performance be improved by keeping disjoint document sets at each level of the hierarchy?
30 Simulation Model
[Figure: Web clients, lower-level (children) proxy servers, an upper-level (parent) proxy server, and Web servers; workload scenarios: complete overlap, partial overlap (50%), and no overlap]
31 Performance Metrics
- Document Hit Ratio (HR)
  - Percent of requested docs found in cache
- Byte Hit Ratio (BHR)
  - Percent of requested bytes found in cache
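Both metrics can be computed from a log of (size, hit-or-miss) outcomes. The sketch below uses a hypothetical four-request log to show how the two can diverge when a single large file dominates the bytes; the function name and the sample values are illustrative only.

```python
def hit_ratios(outcomes):
    """Return (HR, BHR) in percent from (size_bytes, was_hit) pairs."""
    total_docs = hit_docs = 0
    total_bytes = hit_bytes = 0
    for size_bytes, was_hit in outcomes:
        total_docs += 1
        total_bytes += size_bytes
        if was_hit:
            hit_docs += 1
            hit_bytes += size_bytes
    return 100.0 * hit_docs / total_docs, 100.0 * hit_bytes / total_bytes


# Three small hits and one large miss (made-up values).
log = [(1000, True), (1000, True), (1000, True), (97000, False)]
hr, bhr = hit_ratios(log)
print(hr, bhr)  # 75.0 3.0
```

With a heavy-tailed size distribution, a policy that favours small documents can raise HR while lowering BHR, which is why both metrics are reported.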
32-34 Experiment 1: Different Policies at Different Levels of the Hierarchy
[Figures: (a) Hit Ratio and (b) Byte Hit Ratio, shown for the parent and children caches]
35 Experiment 2: Sensitivity to Workload Overlap
- The greater the degree of workload overlap amongst the child proxies, the greater the role for the parent cache
- In the “no overlap” scenario, the parent cache has negligible hit ratios, particularly when child caches are large
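A toy two-level simulation illustrates why overlap matters. The sketch below makes several assumptions that are not from the talk: two children, synthetic Zipf-like workloads, LRU at both levels with capacity counted in documents, and overlap modelled by giving a fixed fraction of document ranks the same identity at both children. All names and parameter values are hypothetical.

```python
import random
from collections import OrderedDict


class LRUCache:
    """LRU cache with capacity counted in documents (a simplification)."""

    def __init__(self, capacity_docs):
        self.capacity = capacity_docs
        self.docs = OrderedDict()

    def access(self, doc):
        """Return True on a hit; on a miss, insert the document."""
        if doc in self.docs:
            self.docs.move_to_end(doc)
            return True
        self.docs[doc] = True
        if len(self.docs) > self.capacity:
            self.docs.popitem(last=False)
        return False


def parent_hit_ratio(overlap, num_docs=2000, requests_per_child=20000,
                     child_capacity=100, parent_capacity=100,
                     alpha=0.8, seed=1):
    """Hit ratio seen at the parent for a given overlap fraction in [0, 1]."""
    rng = random.Random(seed)
    weights = [1.0 / rank ** alpha for rank in range(1, num_docs + 1)]
    children = [LRUCache(child_capacity) for _ in range(2)]
    parent = LRUCache(parent_capacity)
    streams = [rng.choices(range(num_docs), weights=weights, k=requests_per_child)
               for _ in children]
    shared_slots = round(overlap * 100)
    parent_requests = parent_hits = 0
    for step in range(requests_per_child):
        for child_id, child in enumerate(children):
            rank = streams[child_id][step]
            # Shared ranks name the same document at both children;
            # all other ranks are private to the requesting child.
            shared = rank % 100 < shared_slots
            doc = ("shared", rank) if shared else (child_id, rank)
            if not child.access(doc):       # a child miss goes to the parent
                parent_requests += 1
                parent_hits += parent.access(doc)
    return parent_hits / parent_requests


for overlap in (0.0, 0.5, 1.0):
    print(overlap, round(parent_hit_ratio(overlap), 4))
```

In this setup the parent can only score a hit on a private document if the same child re-requests it after evicting it, so parent hits should be rare without overlap and more common as more documents are shared.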
39 Experiment 3: Size-based Partitioning
- Partition files across the two levels of the hierarchy based on size (e.g., keep small files at the lower level and large files at the upper level, or vice versa)
- Three size thresholds for “small”...
  - 5,000 bytes
  - 10,000 bytes
  - 100,000 bytes
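The partitioning rule itself is a one-line decision per document. The sketch below shows one possible form; the function name, the returned labels, and the choice to treat a file exactly at the threshold as "small" are assumptions, not details from the talk.

```python
def caching_level(size_bytes, threshold_bytes, small_at_lower=True):
    """Return which level ("lower" or "upper") may cache a document.

    A document of at most threshold_bytes counts as small (an assumption).
    small_at_lower=False gives the reversed ("vice versa") partitioning.
    """
    is_small = size_bytes <= threshold_bytes
    return "lower" if is_small == small_at_lower else "upper"


for threshold in (5_000, 10_000, 100_000):
    levels = [caching_level(size, threshold) for size in (2_000, 8_000, 50_000)]
    print(threshold, levels)
```

Because each document is eligible at exactly one level, the two levels hold disjoint document sets, which limits replication across the hierarchy.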
40 Small files at the lower level; large files at the upper level
[Figures: parent and children results for size thresholds of 5,000 bytes and 10,000 bytes]
41 Large files at the lower level; small files at the upper level
[Figures: children and parent results for size thresholds of 5,000 bytes and 10,000 bytes]
42 Summary: Multi-Level Caches
- Different policies at different levels
  - LRU/LFU-Aging at the lower level + GD-Size at the upper level provided improvement in performance
  - GD-Size + GD-Size provided better performance in hit ratio, but with some penalty in byte hit ratio
- Size-threshold approach
  - Small files at the lower level + large files at the upper level provided improvement in performance
  - Reversing this policy offered no performance advantage
43 Conclusions
- Existing multi-level caching hierarchies are not always that effective, due to cache filtering effects
- “Heterogeneous” caching architectures may better exploit workload characteristics and improve Web caching performance
44 For More Information...
- M. Busari, “Simulation Evaluation of Web Caching Hierarchies”, M.Sc. Thesis, Dept of Computer Science, U. Saskatchewan, June 2000
- C. Williamson, “On Filter Effects in Web Caching Hierarchies”, ACM Transactions on Internet Technology, 2002 (to appear)
- Email: carey@cpsc.ucalgary.ca
  - http://www.cpsc.ucalgary.ca/~carey/