Slide 1: Probabilistic Models for Web Caching
David Starobinski, David Tse, UC Berkeley
Conference and Workshop on Stochastic Networks, Madison, Wisconsin, June 2000
Slide 2: Overview
– Web caching goals
– Caching levels
– Classical caching algorithms and the Independent Reference (IR) model
– Web caching issues
– New algorithms and analysis for Web caches
– Discussion
Slide 3: Web Caching Goals
– Reduce response latency
– Reduce bandwidth consumption
– Reduce server load
– Exploit the locality of reference
Slide 4: Web Caching Levels
[Figure: clients connected to servers across the Internet, with caches at three levels – browser cache (client side), proxy cache, and reverse proxy (server side)]
Slide 5: Caching: Performance
– Cache buffers have finite capacity
– Goal: maximize the proportion of requests served by the cache (the hit ratio)
– Need to devise algorithms that keep the "hot" documents in the cache
Slide 6: Caching Algorithms
– LRU
– FIFO
– CLIMB (Transpose)
Slide 7: LRU (Least Recently Used)
The buffer is arranged as a stack.
[Figure: cache stack holding pages 1–5; page 5 is requested]
Slide 8: LRU (ii)
[Figure: stack state after the request for page 5]
Slide 9: LRU (iii)
[Figure: page 3 is requested]
Slide 10: LRU (iv)
[Figure: stack state after the request for page 3]
Slide 11: CLIMB (Transpose)
[Figure: cache stack holding pages 1–5]
Slide 12: CLIMB (ii)
[Figure: after a request for page 3, it swaps positions with the page directly above it, giving the order 1 3 2 4 5]
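The update rules illustrated on slides 7–12 are easy to state in code. Below is a minimal sketch (not from the slides): the cache is an ordered list with position 0 as the top of the stack, and the convention that CLIMB inserts a missed page at the bottom is an assumption.

```python
def lru_update(cache, page, capacity):
    """LRU: move the requested page to the top; on a miss with a full cache, evict the bottom page."""
    if page in cache:
        cache.remove(page)            # hit: pull the page out of its current slot
    elif len(cache) >= capacity:
        cache.pop()                   # miss with a full cache: evict the least recently used page
    cache.insert(0, page)             # the requested page goes to the top of the stack

def climb_update(cache, page, capacity):
    """CLIMB (transpose): on a hit, swap the requested page with the page directly above it."""
    if page in cache:
        i = cache.index(page)
        if i > 0:
            cache[i - 1], cache[i] = cache[i], cache[i - 1]   # climb one position toward the top
    else:
        if len(cache) >= capacity:
            cache.pop()               # miss with a full cache: evict the bottom page
        cache.append(page)            # assumption: a new page enters at the bottom

# The example from the slides: requesting page 3 in the cache [1, 2, 3, 4, 5]
lru = [1, 2, 3, 4, 5]
lru_update(lru, 3, capacity=5)        # -> [3, 1, 2, 4, 5]
climb = [1, 2, 3, 4, 5]
climb_update(climb, 3, capacity=5)    # -> [1, 3, 2, 4, 5]
```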
Slide 13: Analysis: The IR Model
– N: total number of pages
– p_i: the probability that page i (i = 1, 2, …, N) is requested, independent of previous requests
Remarks:
– The model is mostly justified for proxy caches
– Studies show that web page popularity follows a Zipf law
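A small sketch of how a request trace is generated under the IR model with Zipf-law popularities; the function names and the exponent value are illustrative, not taken from the slides.

```python
import random

def zipf_popularities(N, alpha=1.0):
    """Zipf law: p_i proportional to 1 / i**alpha, normalized to sum to 1."""
    weights = [1.0 / (i ** alpha) for i in range(1, N + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def ir_requests(p, num_requests, seed=0):
    """IR model: every request is drawn i.i.d. from the popularity distribution p."""
    rng = random.Random(seed)
    pages = range(1, len(p) + 1)
    return rng.choices(pages, weights=p, k=num_requests)

p = zipf_popularities(N=100)
trace = ir_requests(p, num_requests=10_000)   # i.i.d. sample of page ids
```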
Slide 14: Cache Algorithms
– K: storage capacity of the cache (in pages)
– Ideally, place the K pages with the greatest values of p_i into the cache
– Problem: the values p_i are unknown a priori
Slide 15: LRU, FIFO, CLIMB Analysis
– Under the IR model, the cache dynamics can be described by a Markov chain
– Each state {I_1, I_2, …, I_K} represents the identity (URL) and ordering of the pages within the cache
Slide 16: LRU – Stationary Probabilities
– The stationary distribution allows the hit ratio to be computed
– Similar results hold for FIFO and CLIMB
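For reference, the classical form of the LRU stationary distribution under the IR model, for a cache state (i_1, …, i_K) listed from most to least recently used, is

```latex
\pi(i_1, \dots, i_K) \;=\; \prod_{k=1}^{K} \frac{p_{i_k}}{1 - p_{i_1} - \cdots - p_{i_{k-1}}},
```

and the hit ratio follows by summing, over all states, the probability that the next request is already in the cache:

```latex
\text{hit ratio} \;=\; \sum_{(i_1, \dots, i_K)} \pi(i_1, \dots, i_K)\,\bigl(p_{i_1} + \cdots + p_{i_K}\bigr).
```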
Slide 17: Analysis – Summary
– CLIMB achieves the best hit ratio, followed by LRU, then FIFO
– The convergence rate is much faster for LRU and FIFO than for CLIMB
– Some mathematical issues are still open
Slide 18: New Issues
– Non-uniform page size
– Non-uniform access costs
  – Nearby vs. distant servers
  – Underloaded vs. overloaded servers
– Page updates
Slide 19: The Extended IR Model (Size)
– Same assumptions as in the IR model, plus:
– The size of page i is s_i
– The cache size is K
Slide 20: Off-Line Problem
With the p_i and s_i known, choosing which pages to cache is a Knapsack Problem!
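Written out with an introduced indicator variable x_i (1 if page i is cached, 0 otherwise), the off-line problem is

```latex
\max_{x_i \in \{0,1\}} \; \sum_{i=1}^{N} p_i\, x_i
\quad \text{subject to} \quad \sum_{i=1}^{N} s_i\, x_i \le K .
```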
Slide 21: Heuristics
– Place in the cache the documents with the greatest p_i/s_i values
– This performs at worst a factor of two worse than the optimal solution (except in extreme cases)
– Goal: devise new on-line algorithms that learn to order documents according to their p_i/s_i values
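A minimal sketch of the off-line density heuristic (the function name and input format are illustrative): it greedily packs pages in decreasing order of p_i/s_i.

```python
def greedy_by_density(pages, K):
    """Greedy knapsack heuristic: take pages in decreasing p_i / s_i order while they fit.

    `pages` is a list of (page_id, p_i, s_i) tuples and K is the cache size.
    Returns the ids of the cached pages.
    """
    chosen, used = [], 0
    for page_id, p, s in sorted(pages, key=lambda t: t[1] / t[2], reverse=True):
        if used + s <= K:
            chosen.append(page_id)
            used += s
    return chosen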
Slide 22: Size-LRU Algorithm
– Set s_min = min{s_1, s_2, …, s_N}
– A randomized algorithm
– When page i is requested:
  – Act like LRU with probability s_min/s_i
  – Otherwise, do not change the cache ordering
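A sketch of this randomized update rule. The slide does not spell out how misses and evictions interact with page sizes, so the convention below (an admitted page goes to the top, and pages are evicted from the bottom until everything fits) is an assumption.

```python
import random

def size_lru_update(cache, page, sizes, K, rng=random):
    """Size-LRU: with probability s_min / s_i act like LRU; otherwise leave the cache unchanged."""
    s_min = min(sizes.values())
    if rng.random() > s_min / sizes[page]:
        return                                    # do not change the cache ordering
    if page in cache:
        cache.remove(page)                        # hit: pull the page out of its current slot
    cache.insert(0, page)                         # the requested page goes to the top
    while sum(sizes[p] for p in cache) > K:
        cache.pop()                               # assumption: evict from the bottom until it fits
```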
Slide 23: Result
– IR model: LRU – p_i
– Extended IR model: Size-LRU – p_i/s_i
Size-LRU is dual to LRU
Slide 24: Example: Size-LRU Stationary Probabilities
Slide 25: Numerical Example
– N = 100 documents
– Page popularity
– Heavy-tailed document size
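A sketch of how a comparison in this spirit can be simulated, reusing zipf_popularities, ir_requests, and size_lru_update from the earlier sketches; the Pareto size distribution and the 10% cache budget are assumptions, not the authors' exact setup.

```python
import random

def sized_lru_update(cache, page, sizes, K):
    """Plain LRU with non-uniform sizes: move to the top, evict from the bottom until it fits."""
    if page in cache:
        cache.remove(page)
    cache.insert(0, page)
    while sum(sizes[p] for p in cache) > K:
        cache.pop()

def hit_ratio(trace, update, **kwargs):
    """Fraction of requests served from the cache when it is driven by the given update rule."""
    cache, hits = [], 0
    for page in trace:
        if page in cache:
            hits += 1
        update(cache, page, **kwargs)
    return hits / len(trace)

N = 100
p = zipf_popularities(N)                        # popularities from the earlier sketch
rng = random.Random(1)
sizes = {i: 1 + int(rng.paretovariate(1.5)) for i in range(1, N + 1)}   # heavy-tailed sizes
trace = ir_requests(p, num_requests=50_000)     # i.i.d. requests under the IR model

K = sum(sizes.values()) // 10                   # cache budget: roughly 10% of the total size
print("LRU hit ratio:     ", hit_ratio(trace, sized_lru_update, sizes=sizes, K=K))
print("Size-LRU hit ratio:", hit_ratio(trace, size_lru_update, sizes=sizes, K=K))
```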
Slide 26: Numerical Example (continued)
[Figure: results of the numerical example]
Slide 27: Summary
– New issues in Web caching
– Size-LRU algorithm: dual to LRU
– Extensions for the cost issue: on-going research
The End