Zipf-Distributions & Caching

Presentation on theme: "Zipf-Distributions & Caching"— Presentation transcript:

1 Zipf-Distributions & Caching
Christopher Fingar

2 Issues
Do users' requests for Web content fit a Zipf distribution?
Do the behavioral tendencies seen in proxy traces apply generally to all web accesses, or are they specific to the trace data?
If they apply to web access in general, can the distribution improve caching for a fixed community and cache hierarchy design?

3 Overview
Zipf's law: what it is, and how it differs from a Zipf-"like" distribution
Zipf-like behavior of web requests
Zipf model and asymptotic behavior
Cache replacement and the Zipf model

4 The Law: the probability P of a request for the i-th most popular document is proportional to 1/i^α
A document's request probability depends on its popularity rank i, with α = 1.
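
As a minimal sketch of the law above, the proportionality can be turned into actual probabilities by normalizing the weights 1/i^α over a finite catalog. The function name `zipf_probabilities` and the catalog size of 1000 documents are illustrative assumptions, not values from the presentation.

```python
def zipf_probabilities(n_docs, alpha=1.0):
    """Request probabilities for ranks 1..n_docs, with P(i) proportional to 1/i**alpha."""
    # Unnormalized weights: rank 1 is the most popular document.
    weights = [1.0 / (i ** alpha) for i in range(1, n_docs + 1)]
    total = sum(weights)
    # Divide by the total so the probabilities sum to 1.
    return [w / total for w in weights]


# Hypothetical catalog of 1000 documents, strict Zipf (alpha = 1).
probs = zipf_probabilities(1000, alpha=1.0)
ratio = probs[0] / probs[1]  # rank 1 vs. rank 2
```

With α = 1, the rank-1 document is requested twice as often as the rank-2 document, three times as often as rank 3, and so on.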

5

6 What it means
Zipf-like: a Zipf distribution whose α varies depending on the trace data.
Most request streams don't follow the exact Zipf distribution.
Allows for models that don't relate request frequency to document size or data change.
Models can be specific, such that size and update rate have no effect on how often a document is requested.

7 Zipf-Like Results
4 studies find Zipf-like results: 2 are fairly close to the law, 1 is on a server that follows the law, and another shows a weak Zipf-like relation.
1 study says requests are not related to Zipf's law.
Results are affected by: the timeframe of the trace data, the age of the data, and interpretations of the law.

8 Time for another study
6 traces: DEC, UCB, UPisa, Questnet, NLANR, and FuNet
Promising results: 0.64 <= α < 0.83
The 10/90 rule does not hold (depends on α?). The 10/90 rule states that 10% of documents account for 90% of requests.
Size doesn't matter, and change isn't important either.
Conflicting information: the source says the 10/90 rule depends on α, and later says the opposite, that α depends on the number of hot documents.

9 Promising results
Each axis is logarithmic.
A straight line on these axes means a 1/i^α relation.
The strongest results come from specific institutions, where people are more like-minded or simply have similar goals when browsing.
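
One common way to read α off such a plot is a least-squares line through log(count) versus log(rank); the negated slope is the estimate. The sketch below assumes that approach, and the name `estimate_alpha` and the synthetic counts are illustrative; it is not necessarily how the traces in the study were fitted.

```python
import math


def estimate_alpha(counts):
    """Estimate alpha as the negated least-squares slope of log(count) vs. log(rank)."""
    # Sort request counts from most to least popular; zero counts have no logarithm.
    ranked = sorted((c for c in counts if c > 0), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(c) for c in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    # Assumes at least two distinct ranks, otherwise sxx is zero.
    return -sxy / sxx


# Synthetic counts that follow 1/i**0.75 exactly (made-up data, not a trace).
synthetic = [1000.0 / (i ** 0.75) for i in range(1, 501)]
alpha_hat = estimate_alpha(synthetic)
```

On real traces the tail of the curve is noisy, so the fitted value depends on which ranks are included in the fit.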

10 Hot Docs: Not so hot
It takes 70%-80% of documents to capture 90% of requests.
This is why high hit rates are uncommon.
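
To see how the share of documents needed for a given share of requests moves with α, the following sketch walks down an idealized Zipf-like popularity ranking until the target share is covered. The function name `docs_needed`, the catalog size, and the α values in the loop are illustrative assumptions; this is a model calculation, not a reproduction of the trace analysis.

```python
def docs_needed(n_docs, alpha, target=0.9):
    """Fraction of documents (most popular first) needed to cover `target` of requests."""
    weights = [1.0 / (i ** alpha) for i in range(1, n_docs + 1)]
    total = sum(weights)
    covered = 0.0
    for rank, weight in enumerate(weights, start=1):
        covered += weight
        if covered / total >= target:
            return rank / n_docs
    return 1.0


# Hypothetical catalog of 100,000 documents; smaller alpha means a flatter curve.
for alpha in (0.64, 0.83, 1.0):
    print(alpha, round(docs_needed(100_000, alpha), 3))
```

The flatter the popularity curve (smaller α), the larger the fraction of documents a cache must hold to reach the same share of requests.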

11 Size and Change
If size is important, replacement algorithms and the overall design should consider size.
Low correlation was found between access and size.
Change rate = ratio of changes to accesses; it affects how cached files are updated.
Low correlation was found between change rate and size.

12 If Zipf-like behavior
3 cases of asymptotic behavior can be examined with a Zipf-based model:
Infinite cache, finite requests
Finite cache, infinite requests
Page request interarrival times

13 Infinite cache, finite requests
Stream of R requests.
Shows this behavior is not just a property of the trace data; it is also seen in other research.
Not usable if R > N.
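
The infinite-cache case can be sketched with a small simulation: draw R independent requests from a Zipf-like distribution and count a hit whenever the document has been requested before. Independent draws, the function name `infinite_cache_hit_ratio`, and all parameter values are assumptions made for illustration.

```python
import random
from itertools import accumulate


def infinite_cache_hit_ratio(n_docs, alpha, n_requests, seed=0):
    """Hit ratio of a cache that never evicts, over n_requests independent Zipf-like draws."""
    rng = random.Random(seed)
    # Cumulative weights let random.choices sample ranks proportionally to 1/i**alpha.
    cum_weights = list(accumulate(1.0 / (i ** alpha) for i in range(1, n_docs + 1)))
    stream = rng.choices(range(n_docs), cum_weights=cum_weights, k=n_requests)
    seen = set()
    hits = 0
    for doc in stream:
        if doc in seen:
            hits += 1      # requested before, so an infinite cache still holds it
        else:
            seen.add(doc)  # first request is always a miss
    return hits / n_requests


# Hypothetical setup: 10,000 documents, alpha = 0.8, short vs. long request streams.
short_run = infinite_cache_hit_ratio(10_000, 0.8, 200)
long_run = infinite_cache_hit_ratio(10_000, 0.8, 20_000)
```

Longer streams revisit more documents, so the hit ratio of the long run is higher than that of the short run.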

14 Finite cache, infinite requests
Cache can hold C web pages.
Shows this behavior is not just a property of the trace data; it is also seen in other research.
Flattening is expected: the requests are not infinite, so eventually the cache exceeds the trace.
The actual evaluation used R > 10^6 to avoid this problem.
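
For the finite-cache case, a simple model calculation assumes independent requests and a cache that ends up holding the C most popular pages; the long-run hit ratio is then the total probability of those C pages. The function name `top_c_hit_ratio` and the parameter values are illustrative assumptions.

```python
def top_c_hit_ratio(n_docs, cache_size, alpha):
    """Long-run hit ratio if the cache holds the cache_size most popular of n_docs pages."""
    weights = [1.0 / (i ** alpha) for i in range(1, n_docs + 1)]
    # Hits are exactly the requests that fall on the cached (top-ranked) pages.
    return sum(weights[:cache_size]) / sum(weights)


# Hypothetical catalog of 10,000 pages with alpha = 0.8 and growing cache sizes.
ratios = [top_c_hit_ratio(10_000, c, 0.8) for c in (10, 100, 1_000, 10_000)]
```

Each tenfold increase in cache size adds to the hit ratio, and the ratio reaches 1 only when the cache holds every page.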

15 Page request interarrival times
Given an infinite arrival stream and a request for page i, how many requests arrive before i is requested again?
The data isn't perfect; 0 < α < 1.
The middle region, where N ln N >> K >> ln, holds the best results and resembles the real world.
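
Under the assumption of independent requests, the gap until page i is requested again is geometric: it equals k with probability p_i (1 - p_i)^(k-1). Weighting by the chance p_i that the current request is for page i gives an overall interarrival distribution d(k). The sketch below computes it; the function name `interarrival_distribution` and the parameter values are illustrative, and this is a model calculation rather than the trace measurement.

```python
def interarrival_distribution(n_docs, alpha, max_gap):
    """d(k) for k = 1..max_gap: chance a request is repeated exactly k requests later."""
    weights = [1.0 / (i ** alpha) for i in range(1, n_docs + 1)]
    total = sum(weights)
    probs = [w / total for w in weights]
    # p * p * (1 - p)**(k - 1): request page now, miss it k - 1 times, then request it again.
    return [
        sum(p * p * (1.0 - p) ** (k - 1) for p in probs)
        for k in range(1, max_gap + 1)
    ]


# Hypothetical setup: 1000 pages, alpha = 0.8, gaps of 1 to 50 requests.
gaps = interarrival_distribution(1000, 0.8, 50)
```

Short gaps are dominated by the popular pages, so d(k) falls as k grows.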

16 Cache Replacement
The model uses fixed request probabilities and sizes.
Addresses 4 replacement algorithms and determines which has the highest hit ratio.
Can the model improve cache replacement?

17 LFU
2 kinds in the paper: Perfect and In-Cache
Perfect: an object's counter is never removed, even when the object is removed from the cache.
In-Cache: an object's counter is removed when the object is removed from the cache.
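
The difference between the two LFU variants comes down to one line: whether the counter is dropped on eviction. The sketch below is a simplified illustration, assuming unit-size objects, a cache size of at least 1, and that every missed object is admitted; the function name `lfu_hits` is hypothetical and this is not the paper's implementation.

```python
from collections import Counter


def lfu_hits(stream, cache_size, perfect):
    """Count hits for LFU; perfect=True keeps counters of evicted objects."""
    counts = Counter()
    cache = set()
    hits = 0
    for doc in stream:
        counts[doc] += 1
        if doc in cache:
            hits += 1
            continue
        if len(cache) >= cache_size:
            # Evict the cached object with the smallest request count.
            victim = min(cache, key=lambda d: counts[d])
            cache.remove(victim)
            if not perfect:
                # In-Cache LFU forgets the history of evicted objects.
                del counts[victim]
        cache.add(doc)
    return hits


# Tiny made-up request stream with a cache of 2 objects.
stream = [1, 1, 2, 3, 1, 2]
perfect_hits = lfu_hits(stream, 2, perfect=True)
in_cache_hits = lfu_hits(stream, 2, perfect=False)
```

The linear scan for the victim keeps the sketch short; a heap or frequency buckets would be the usual choice for a real cache.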

18 GD-Size & LRU
GD-Size: both document size and locality are considered.
LRU: remove the file that was used least recently.
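
The LRU half of this slide can be sketched with an ordered dictionary that keeps objects in recency order. The sketch assumes unit-size objects and a cache size of at least 1, and the function name `lru_hits` is hypothetical; GD-Size is not covered here.

```python
from collections import OrderedDict


def lru_hits(stream, cache_size):
    """Count hits for an LRU cache holding up to cache_size unit-size objects."""
    cache = OrderedDict()  # oldest entry first, most recently used last
    hits = 0
    for doc in stream:
        if doc in cache:
            hits += 1
            cache.move_to_end(doc)         # mark as most recently used
        else:
            if len(cache) >= cache_size:
                cache.popitem(last=False)  # evict the least recently used object
            cache[doc] = True
    return hits


# Tiny made-up request stream, replayed with two cache sizes.
stream = [1, 2, 1, 3, 2, 1]
small_cache_hits = lru_hits(stream, 2)
large_cache_hits = lru_hits(stream, 3)
```

With room for only two objects, the stream above keeps evicting the object it is about to need; a third slot removes that problem.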

19 How do they compare?
When the cache is large, perfect LFU and GD-Size are equal; In-Cache LFU is worst.

20 Why Compare?
Purpose of comparison: determine whether temporal locality should be considered when designing a replacement algorithm.
Outcome: LRU is outperformed by perfect LFU. LRU should be the best if temporal locality were important, so maybe temporal locality isn't important.
Perfect LFU isn't possible in practice: the counters would become too numerous to check.

21 Summary
Zipf distribution
Real-world Zipf-like results
Asymptotic behavior observed and shown with the Zipf model
Cache replacement

