
1 On the Sensitivity of Web Proxy Cache Performance to Workload Characteristics
Mudashiru Busari, Carey Williamson
Department of Computer Science, University of Saskatchewan

2 Talk Outline
- Introduction and Motivation
- ProWGen: Proxy Workload Generator
  - Tool for Synthetic Web Proxy Workloads
- Simulation Study
  - Simulation Evaluation of Web Proxy Caches
- Conclusions and Future Work

3 Introduction
- “The Web is both a blessing and a curse…”
- Blessing:
  - Internet available to the masses
  - Seamless exchange of information
- Curse:
  - Internet available to the masses
  - Stress on networks, protocols, servers, users
- Motivation: techniques to improve the performance and scalability of the Web

4 Why is the Web so slow?
- Client-side bottlenecks (PC, modem)
  - Solution: better access technologies
- Server-side bottlenecks (busy Web site)
  - Solution: faster, scalable server designs
- Network bottlenecks (Internet congestion)
  - Solutions: caching, replication; improved protocols for client-server communication

5 Our Previous Work
- Evaluation of Canada’s national Web caching infrastructure for CANARIE’s CA*net II backbone
- Workload characterization and evaluation of CA*net II Web caching hierarchy (IEEE Network, May/June 2000)
- Developed a Web proxy caching simulator for trace-driven simulation evaluation of Web proxy caching architectures

6 CA*net II Web Caching Hierarchy (Dec 1998)
[Figure: hierarchy diagram. USask and CANARIE (Ottawa) are marked as selected measurement points for the traffic analyses (3-6 months of data from each); an uplink leads to NLANR.]

7 Caching Hierarchy Overview
[Figure: clients (C) and proxies feeding a three-level cache hierarchy.]

Level                      Cache size   Cache hit ratio (empirically observed)
Regional/Univ.             5-10 GB      30-40%
National                   10-20 GB     15-20%
Top-Level/International    20-50 GB     5-10%

8 Overview of This Paper
- Constructed synthetic Web proxy workload generation tool (ProWGen) that captures the salient characteristics of empirical Web proxy workloads
- Use ProWGen to evaluate sensitivity of proxy caches to selected Web proxy workload characteristics

9 Research Methodology
- Design, construction, and parameterization of aggregate workload models, based on empirical traces (Web proxy access logs)
- Validation of ProWGen (statistically, and versus empirical workloads)
- Simulation evaluation of single-level caches
  - Sensitivity to workload characteristics
  - Effect of cache size
  - Effect of cache replacement policy

10 ProWGen: Key Workload Characteristics
- “One-timers” (60-70% of docs are useless!!!)
- Zipf-like document referencing popularity
- Heavy-tailed file size distribution (i.e., most files small, but most bytes are in big files)
- Correlations (if any) between document size and document popularity (debate!)
- Temporal locality (temporal correlation between recent past and near future references)
[Mahanti et al. Perf.Eval. 2000]
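The two distributional properties named on this slide have standard mathematical forms, sketched below. The symbols are notation assumed here, not taken from the slides: \alpha is the Zipf slope, \beta the tail index, N the number of distinct documents, r a popularity rank, and S a file size.

```latex
% Zipf-like popularity: probability of a reference to the document of rank r
P_r = \frac{c}{r^{\alpha}}, \qquad c = \Bigl(\sum_{k=1}^{N} k^{-\alpha}\Bigr)^{-1}

% Heavy-tailed sizes: the complementary distribution decays as a power law
\Pr[S > s] \sim s^{-\beta} \quad (s \to \infty), \qquad 0 < \beta < 2
```

On log-log axes both relations plot as straight lines, with slope -\alpha for the rank-frequency plot and slope -\beta for the tail of the complementary distribution (LLCD) plot.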

11 ProWGen (Conceptual View)
[Figure: Input Parameters -> ProWGen Software -> Synthetic Workload]

12 ProWGen (Conceptual View)
[Figure: as slide 11, with a Zipf plot (P versus r) added to the input parameters]

13 ProWGen (Conceptual View)
[Figure: as slide 12]

14 ProWGen (Conceptual View)
[Figure: as slide 13, with an LLCD plot (F versus s) added to the input parameters]

15 ProWGen (Conceptual View)
[Figure: as slide 14, with a Correlation scale from -1 through 0 to +1 added to the input parameters]

16 ProWGen: Workload Modeling Details
- Modeled workload characteristics
  - One-time referencing
  - Zipf-like referencing behaviour (Zipf’s Law)
  - File size distribution
    - Body: lognormal distribution
    - Tail: Pareto distribution
  - Correlation between file size and popularity
  - Temporal locality
    - Static probabilities in finite-size LRU stack model
    - Dynamic probabilities in finite-size LRU stack model
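The hybrid file size model (lognormal body, Pareto tail) can be illustrated with a short sampler. This is a minimal sketch under stated assumptions, not ProWGen's actual code: the function names, the rejection step that keeps body samples below the tail start, and the choice to pick "tail" versus "body" by a fixed probability are all illustrative. The example values are the ones the talk reports for its validation trace.

```python
import math
import random


def lognormal_params(mean, std):
    """Convert the mean and standard deviation of a lognormal variable
    into the (mu, sigma) of the underlying normal distribution."""
    sigma2 = math.log(1.0 + (std / mean) ** 2)
    return math.log(mean) - sigma2 / 2.0, math.sqrt(sigma2)


def sample_file_size(rng, mean, std, tail_start, tail_index, tail_fraction):
    """Draw one file size in bytes from a lognormal body / Pareto tail mix."""
    if rng.random() < tail_fraction:
        # Pareto tail by inverse transform:
        # P[X > x] = (tail_start / x) ** tail_index for x >= tail_start.
        u = 1.0 - rng.random()  # u lies in (0, 1], so no division by zero
        return tail_start / u ** (1.0 / tail_index)
    mu, sigma = lognormal_params(mean, std)
    # Body: redraw until the sample falls below the start of the tail
    # (an assumption of this sketch; it slightly lowers the body mean).
    while True:
        size = rng.lognormvariate(mu, sigma)
        if size < tail_start:
            return size


# Example values: mean 7,000 and standard deviation 11,000 bytes,
# tail starting at 10,000 bytes, tail index 1.322, 22% of documents in the tail.
rng = random.Random(1)
sizes = [sample_file_size(rng, 7000, 11000, 10000, 1.322, 0.22)
         for _ in range(10000)]
```

With a tail index below 2 the Pareto component has infinite variance, which is what lets a small share of documents carry most of the bytes.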

17 Validation of ProWGen
- To establish that the synthetic workloads possess the desired characteristics (quantitative and qualitative), and that the characteristics are similar to those in empirical workloads
  - Example: analyze 5 million requests from a proxy server trace and parameterize ProWGen to generate a similar workload

18 Workload Synthesis

Parameter                                        Value
Total number of requests                         5,000,000
Unique documents (of total requests)             34%
One-timers (of unique documents)                 72%
Zipf slope                                       0.807
Tail index                                       1.322
Documents in the tail                            22%
Beginning of the tail (bytes)                    10,000
Mean of the lognormal file size distribution     7,000
Standard deviation                               11,000
Correlation between file size and popularity     Zero
LRU stack model for temporal locality            Static and Dynamic
LRU stack size                                   1,000
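To show how a Zipf slope turns into a reference stream, the sketch below draws document ids independently with probability proportional to rank^(-slope). It is a deliberately simplified illustration and not ProWGen itself: the function name and arguments are hypothetical, and the one-timer fraction, size-popularity correlation, and LRU stack model for temporal locality are all left out.

```python
import itertools
import random


def zipf_request_stream(num_docs, num_requests, slope, seed=0):
    """Return a list of document ids (0 = most popular), each drawn
    independently with probability proportional to rank ** -slope."""
    rng = random.Random(seed)
    weights = [rank ** -slope for rank in range(1, num_docs + 1)]
    cum_weights = list(itertools.accumulate(weights))
    return rng.choices(range(num_docs), cum_weights=cum_weights, k=num_requests)


# Small example using a Zipf slope of 0.807; the document and request
# counts are arbitrary illustration values.
stream = zipf_request_stream(num_docs=1000, num_requests=50000, slope=0.807)
```

Because the draws are independent, this stream has no temporal locality beyond what popularity alone induces.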

19 Zipf-like Referencing Behaviour
[Figure: two rank-frequency plots. Empirical trace: slope = 0.81. Synthetic trace: slope = 0.83.]
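A Zipf slope such as the ones quoted here is typically read off a log-log rank-frequency plot. The sketch below estimates it by ordinary least squares on log(frequency) versus log(rank). Whether the authors used exactly this fitting procedure is an assumption, and the function name is hypothetical.

```python
import math
from collections import Counter


def zipf_slope(requests):
    """Estimate the Zipf slope as the negated least-squares slope of
    log(frequency) against log(rank). Needs at least two distinct documents."""
    counts = sorted(Counter(requests).values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return -sxy / sxx
```

On real traces the long run of one-timers at the low-frequency end flattens the plot, so fits are often restricted to a range of ranks; this sketch fits all ranks.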

20 Transfer Size Distribution
[Figure: two panels, References and Bytes transferred.]

21 Simulation Evaluation of Single-Level Web Proxy Caches: Some Research Questions
- In a single-level proxy cache, how sensitive is Web proxy caching performance to certain workload characteristics (one-timers, Zipf slope, heavy-tail index)?
- How does the degree of sensitivity change depending on the cache replacement policy?

22 Simulation Model
[Figure: diagram with the labels Web Clients, Proxy server, Web Servers, and Aggregate Workload.]

23 Experimental Design: Factors and Levels
- Cache size
  - 1 MB to 32 GB
- Cache Replacement Policy
  - Recency-based LRU
  - Frequency-based LFU-Aging
  - Size-based GD-Size
- Workload Characteristics
  - One-timers, Zipf slope, tail index, correlation, temporal locality model
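The slide names the replacement policies without defining them. As one hedged illustration of a size-based policy, the sketch below implements GreedyDual-Size with uniform cost as it is commonly described in the caching literature: each cached document carries a value H = L + 1/size, the document with the smallest H is evicted, and L is raised to the evicted value. Whether the simulator in this talk used exactly this variant is an assumption, and the function name is hypothetical.

```python
import heapq


def simulate_gd_size(requests, capacity_bytes):
    """Replay (doc_id, size_bytes) requests through a GreedyDual-Size cache
    with uniform cost and return the document hit ratio."""
    inflation = 0.0   # the running value L
    priority = {}     # doc_id -> current H value of each cached document
    sizes = {}        # doc_id -> size of each cached document
    heap = []         # (H, doc_id) pairs; may contain stale entries
    used = 0
    hits = 0
    for doc_id, size in requests:
        if doc_id in priority:
            hits += 1
        elif size > capacity_bytes:
            continue  # too large to cache at all
        else:
            while used + size > capacity_bytes:
                h, victim = heapq.heappop(heap)
                if priority.get(victim) != h:
                    continue  # stale heap entry, skip it
                inflation = h  # L rises to the evicted document's H
                used -= sizes.pop(victim)
                del priority[victim]
            sizes[doc_id] = size
            used += size
        # On both hit and insert, (re)set H = L + cost / size with cost = 1.
        h = inflation + 1.0 / size
        priority[doc_id] = h
        heapq.heappush(heap, (h, doc_id))
    return hits / len(requests)
```

With uniform cost the policy favours keeping many small documents over a few large ones, which tends to help document hit ratio more than byte hit ratio.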

24 Performance Metrics
- Document Hit Ratio
  - Percent of requested docs found in cache (HR)
- Byte Hit Ratio
  - Percent of requested bytes found in cache (BHR)
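Both metrics fall out of a simple trace replay. The sketch below runs a request trace through a byte-capacity LRU cache and returns the two ratios as fractions (multiply by 100 for percentages). It is a minimal illustration with a hypothetical function name, not the authors' simulator, and it assumes each document keeps the same size across requests.

```python
from collections import OrderedDict


def simulate_lru(requests, capacity_bytes):
    """Replay (doc_id, size_bytes) requests through an LRU cache and
    return (document hit ratio, byte hit ratio)."""
    cache = OrderedDict()  # doc_id -> size, least recently used first
    used = 0
    hits = hit_bytes = total_bytes = 0
    for doc_id, size in requests:
        total_bytes += size
        if doc_id in cache:
            hits += 1
            hit_bytes += size
            cache.move_to_end(doc_id)  # mark as most recently used
            continue
        if size > capacity_bytes:
            continue  # too large to cache at all
        while used + size > capacity_bytes:
            _, evicted_size = cache.popitem(last=False)  # evict the LRU entry
            used -= evicted_size
        cache[doc_id] = size
        used += size
    return hits / len(requests), hit_bytes / total_bytes
```

The two ratios diverge whenever hits concentrate on small documents: a hit on a 10-byte document counts as much as any other for HR but contributes little to BHR.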

25 Simulation Results (Preview)
- Cache performance is very sensitive to:
  - Slope of Zipf-like doc referencing popularity
  - Temporal locality property
  - Correlations between size and popularity
- Cache performance is relatively insensitive to:
  - One-timers
  - Tail index of heavy-tailed file size distribution

26 Sensitivity to One-timers (LRU)
[Figure: (a) Doc Hit Ratio, (b) Byte Hit Ratio]

27 Sensitivity to Zipf Slope (LRU)
[Figure: (a) Hit Ratio, (b) Byte Hit Ratio]
A difference of 0.2 in Zipf slope changes hit ratio and byte hit ratio by as much as 10-15%.

28 Sensitivity to Heavy Tail Index (LRU Replacement Policy)
[Figure: (a) Doc Hit Ratio, (b) Byte Hit Ratio]

29 Sensitivity to Heavy Tail Index (GD-Size Replacement Policy)
[Figure: (a) Hit Ratio, (b) Byte Hit Ratio]
A difference of 0.2 in heavy tail index changes performance by less than 3%.

30 Sensitivity to Correlation (LRU)
[Figure: (a) Doc Hit Ratio, (b) Byte Hit Ratio]

31 Sensitivity to Temporal Locality (LRU)
[Figure: (a) Doc Hit Ratio, (b) Byte Hit Ratio]

32 Summary: Single-Level Caches
- Cache performance is sensitive to:
  - Slope of Zipf-like document referencing popularity (steeper slope implies better caching)
  - Temporal locality
  - Correlation between size and popularity
- Cache performance is insensitive to:
  - One-timers
  - Tail index of heavy-tailed file size distribution

33 Conclusions
- ProWGen is a useful tool for the generation of synthetic Web proxy workloads for the evaluation of Web proxy caches and Web proxy caching architectures
- Web proxy cache performance is quite sensitive to Zipf slope, temporal locality, and correlations (if any) between document size and document popularity

34 Future Work
- Extend and improve ProWGen
  - Request arrival process (timestamps)
  - File modifications, types, and lifetimes
  - Web page structure (spatial locality)
  - Scaling the workload model(s)...
- Evaluate multi-level Web proxy caches
- Port to network emulation testbed

35 For More Information...
- M. Busari, “Simulation Evaluation of Web Caching Hierarchies”, M.Sc. Thesis, Dept of Computer Science, U. Saskatchewan, June 2000
- ProWGen tool:
  - http://www.cs.usask.ca/faculty/carey/software/
- Email: carey@cs.usask.ca
  - http://www.cs.usask.ca/faculty/carey/

