1 On Filter Effects in Web Caching Hierarchies
Carey Williamson
Department of Computer Science
University of Calgary

2 Introduction
- “The Web is both a blessing and a curse…”
- Blessing:
  - Internet available to the masses
  - Seamless exchange of information
- Curse:
  - Internet available to the masses
  - Stress on networks, protocols, servers, users
- Motivation: techniques to improve the performance and scalability of the Web

3 Why is the Web so slow?
- Client-side bottlenecks (PC, modem)
  - Solution: better access technologies
- Server-side bottlenecks (busy Web site)
  - Solution: faster, scalable server designs
- Network bottlenecks (Internet congestion)
  - Solutions: caching, replication; improved protocols for client-server communication

4 Example of a Web Proxy Cache
[Figure: Web client, proxy server, Web server]

5 Our Previous Work
- Evaluation of Canada’s national Web caching infrastructure for CANARIE’s CA*net II backbone
- Workload characterization and evaluation of the CA*net II Web caching hierarchy (IEEE Network, May/June 2000)
- Developed a Web proxy caching simulator for trace-driven simulation evaluation of Web proxy caching architectures
- Developed a synthetic Web proxy workload generator called ProWGen [Busari/Williamson INFOCOM 2001]

6 CA*net II Web Caching Hierarchy (Dec 1998)
[Figure: map of the caching hierarchy, with a link labelled "To NLANR". Selected measurement points for our traffic analyses: USask and CANARIE (Ottawa); 6-9 months of data from each.]

7 Caching Hierarchy Overview
[Figure: clients (C) connect to proxies at the bottom of a three-level hierarchy]

Level                     Cache size   Cache hit ratio (empirically observed)
Regional/Univ.            5-10 GB      30-40%
National                  10-20 GB     15-20%
Top-Level/International   20-50 GB     5-10%
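To get a feel for what these per-level hit ratios mean end to end, here is a rough arithmetic sketch. It assumes (this is not stated on the slide) that each level's hit ratio is measured on the requests that actually reach that level, i.e. the misses of the level below. The function name `hierarchy_hit_breakdown` and the choice of range midpoints are illustrative only.

```python
def hierarchy_hit_breakdown(level_hit_ratios):
    """Return (fraction of client requests served at each level, overall hit ratio).

    Assumption: each ratio applies only to requests that missed at every
    lower level and were forwarded upward.
    """
    served = []
    remaining = 1.0  # fraction of client requests not yet served by a cache
    for ratio in level_hit_ratios:
        served.append(remaining * ratio)
        remaining *= 1.0 - ratio
    return served, 1.0 - remaining


# Midpoints of the observed ranges: regional, national, top-level.
served, overall = hierarchy_hit_breakdown([0.35, 0.175, 0.075])
for name, share in zip(["regional", "national", "top-level"], served):
    print(f"{name}: {share:.1%} of client requests")
print(f"overall: {overall:.1%}")
```

Under that assumption, the two upper levels together serve only about 15% of client requests, even though they are the largest caches.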

8 Some Observations on Multi-Level Caching...
- Caching hierarchy not very effective, due to a “diminishing returns” effect
- Reason: workload characteristics change as you move up the caching hierarchy (due to filtering effects, etc.)
- Bigger caches aren’t really the answer
- Better caching system design might be...

9 Research Goals
- Develop better understanding of cache filter effects (intuitively, quantitatively)
- Try to do something about it!
- Idea #1: Try different cache replacement policies at different levels of the hierarchy
- Idea #2: Try partitioning cache content in the overall hierarchy based on size or type to limit replication, etc.

10 Talk Overview
- Background/Motivation
- Understanding Cache Filtering Effects
- Exploiting Cache Filtering Effects
- Summary and Conclusions

11 Part I: Understanding Cache Filter Effects

12 Simulation Model
[Figure: two-level hierarchy. Web clients connect to lower-level (child) proxy servers, which connect to an upper-level (parent) proxy server, which connects to the Web servers.]

13 Experimental Methodology
- Trace-driven simulation (empirical traces)
- Multi-factor experimental design
- Cache size
  - 1 MB to 32 GB
- Cache Replacement Policy
  - Recency-based LRU (currently active docs)
  - Frequency-based LFU-Aging (popular docs)
  - Size-based GD-Size (favours smaller docs)
- Analyze workload characteristics
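A minimal sketch of what trace-driven simulation of one cache can look like, shown for LRU only. This is not the simulator used in the study: the trace format (a list of hypothetical `(url, size_bytes)` pairs) and the function name `simulate_lru` are assumptions made for illustration. The returned miss stream is the request stream a parent cache would see.

```python
from collections import OrderedDict


def simulate_lru(trace, capacity_bytes):
    """Replay (url, size_bytes) requests through a byte-limited LRU cache.

    Returns (hits, miss_stream), where miss_stream lists the requests
    forwarded to the next level of the hierarchy.
    """
    cache = OrderedDict()  # url -> size, least recently used first
    used = 0
    hits = 0
    miss_stream = []
    for url, size in trace:
        if url in cache:
            cache.move_to_end(url)  # mark as most recently used
            hits += 1
            continue
        miss_stream.append((url, size))
        if size > capacity_bytes:
            continue  # document can never fit; forward without caching
        while used + size > capacity_bytes:
            _, evicted_size = cache.popitem(last=False)  # evict LRU entry
            used -= evicted_size
        cache[url] = size
        used += size
    return hits, miss_stream


trace = [("a", 1), ("b", 1), ("a", 1), ("c", 1), ("b", 1)]
hits, misses = simulate_lru(trace, capacity_bytes=2)
print(hits, [url for url, _ in misses])
```

Feeding `miss_stream` into a second call models a two-level hierarchy, which is one simple way to observe how the first cache filters the workload.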

14 Web Workload Characteristics
- “One-timers” (60-70% of docs are requested only once, so caching them is useless!)
- Zipf-like document referencing popularity
- Heavy-tailed file size distribution (i.e., most files are small, but most bytes are in big files)
- Zero correlation between document size and document popularity (debate!)
- Temporal locality (temporal correlation between recent past and near future references) [Mahanti et al. PER 2000]

15 Zipf-Like Referencing
- An intrinsic “power-law” relationship in the way that humans organize, access, and use information (e.g., library books, English words in text, movie rentals, Web sites, Web pages,...)
- Plotting item popularity versus relative rank on a log-log scale results in a straight line
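One way to quantify the straight line is to fit its slope by least squares on the log-log rank/frequency data. The sketch below is an illustration under that assumption, not the fitting procedure used in the study; `zipf_slope` is a hypothetical helper and it needs at least two distinct documents in the request list.

```python
import math
from collections import Counter


def zipf_slope(requests):
    """Least-squares slope of log(frequency) versus log(rank).

    `requests` is a sequence of document identifiers; a slope near -1
    corresponds to classic Zipf behaviour.
    """
    counts = sorted(Counter(requests).values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic example: the document of rank r is requested 120 / r times.
requests = ["d1"] * 120 + ["d2"] * 60 + ["d3"] * 40 + ["d4"] * 30
print(round(zipf_slope(requests), 3))
```

Running the same fit on the stream of requests that miss in a cache gives a simple numerical handle on how the cache changes the popularity profile.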

16 Example: Zipf-Like Document Popularity Profile for UofS Trace

17-23 Quiz Time: What do you get AFTER the cache?
[Figure: four candidate plots, (a) to (d)]
Answer: (c)

24 Simulation Results for Input Workload Traces with Different Initial Zipf Slopes

25 The Magnitude of the Filter Effect Depends on Cache Size

26 Filter Effect Depends on Cache Replacement Policy

27 Filter Effect is Most Pronounced at First-Level Cache

28 Part II: Exploiting Cache Filter Effects

29 Research Questions: Multi-Level Caches
- In a multi-level caching hierarchy, can overall caching performance be improved by using different cache replacement policies at different levels of the hierarchy?
- In a multi-level caching hierarchy, can overall performance be improved by keeping disjoint document sets at each level of the hierarchy?

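30 Simulation Model
[Figure: two-level hierarchy. Web clients connect to lower-level (child) proxy servers, which connect to an upper-level (parent) proxy server, which connects to the Web servers. Workload overlap scenarios: Complete Overlap, Partial Overlap (50%), No Overlap.]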

31 Performance Metrics
- Document Hit Ratio
  - Percent of requested docs found in cache (HR)
- Byte Hit Ratio
  - Percent of requested bytes found in cache (BHR)
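The two metrics can diverge sharply when hits fall mostly on small documents. A small sketch, assuming each request is recorded as a hypothetical `(size_bytes, was_hit)` pair; ratios are returned as fractions rather than percentages.

```python
def hit_ratios(outcomes):
    """Compute (HR, BHR) from an iterable of (size_bytes, was_hit) pairs."""
    total_docs = hit_docs = 0
    total_bytes = hit_bytes = 0
    for size, was_hit in outcomes:
        total_docs += 1
        total_bytes += size
        if was_hit:
            hit_docs += 1
            hit_bytes += size
    if total_docs == 0 or total_bytes == 0:
        return 0.0, 0.0  # nothing requested, so both ratios are defined as 0
    return hit_docs / total_docs, hit_bytes / total_bytes


# Two small hits and two large misses: half the docs, but few of the bytes.
hr, bhr = hit_ratios([(100, True), (900, False), (100, True), (900, False)])
print(f"HR = {hr:.0%}, BHR = {bhr:.0%}")
```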

32-33 Experiment 1: Different Policies at Different Levels of the Hierarchy
[Figures: (a) Hit Ratio and (b) Byte Hit Ratio, shown for parent and children]

35 Experiment 2: Sensitivity to Workload Overlap
- The greater the degree of workload overlap amongst the child proxies, the greater the role for the parent cache
- In the “no overlap” scenario, the parent cache has negligible hit ratios, particularly when child caches are large

39 Experiment 3: Size-based Partitioning
- Partition files across the two levels of the hierarchy based on size (e.g., keep small files at the lower level and large files at the upper level, or vice versa)
- Three size thresholds for “small”...
  - 5,000 bytes
  - 10,000 bytes
  - 100,000 bytes
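The partitioning rule itself is simple to state in code. The sketch below is illustrative only: the name `cache_level`, the string return values, and the choice to treat a file exactly at the threshold as "small" are all assumptions, not details taken from the study.

```python
def cache_level(size_bytes, threshold_bytes, small_at_lower=True):
    """Decide which level of a two-level hierarchy may cache a document.

    Documents of at most `threshold_bytes` count as "small" (the boundary
    case is an assumption). With small_at_lower=False the policy is reversed.
    """
    is_small = size_bytes <= threshold_bytes
    if is_small == small_at_lower:
        return "lower"
    return "upper"


for threshold in (5_000, 10_000, 100_000):
    print(threshold, cache_level(8_000, threshold))
```

With an 8,000-byte document, the 5,000-byte threshold sends it to the upper level, while the two larger thresholds keep it at the lower level.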

40 Small files at the lower level; large files at the upper level
[Figure: results for parent and children, with panels for size threshold = 5,000 bytes and size threshold = 10,000 bytes]

41 Large files at the lower level; small files at the upper level
[Figure: results for children and parent, with panels for size threshold = 5,000 bytes and size threshold = 10,000 bytes]

42 Summary: Multi-Level Caches
- Different policies at different levels
  - LRU/LFU-Aging at the lower level + GD-Size at the upper level provided improvement in performance
  - GD-Size + GD-Size provided better performance in hit ratio, but with some penalty in byte hit ratio
- Size-threshold approach
  - Small files at the lower level + large files at the upper level provided improvement in performance
  - Reversing this policy offered no performance advantage

43 Conclusions
- Existing multi-level caching hierarchies are not always that effective, due to cache filtering effects
- “Heterogeneous” caching architectures may better exploit workload characteristics and improve Web caching performance

44 For More Information...
- M. Busari, “Simulation Evaluation of Web Caching Hierarchies”, M.Sc. Thesis, Dept of Computer Science, U. Saskatchewan, June 2000
- C. Williamson, “On Filter Effects in Web Caching Hierarchies”, ACM Transactions on Internet Technology, 2002 (to appear)
- Email: carey@cpsc.ucalgary.ca
  - http://www.cpsc.ucalgary.ca/~carey/
