Caching Characteristics of Internet and Intranet Web Proxy Traces. Arthur Goldberg, Ilya Pevzner, Robert Buff. Courant Institute of Mathematical Sciences.


2 1 Caching Characteristics of Internet and Intranet Web Proxy Traces
Arthur Goldberg, Ilya Pevzner, Robert Buff
Courant Institute of Mathematical Sciences, New York University

3 2 Clients, Servers and Proxy

4 3 HTTP Through a Proxy
[Diagram: Browser, Proxy, Server; request paths labeled Miss and Hit]
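The hit and miss paths in this diagram can be sketched in a few lines. This is a toy illustration only, not the Netscape or Squid implementation: the class `CachingProxy`, the callable `fetch_from_origin`, and the example URLs are all hypothetical, and expiry and cacheability rules are ignored.

```python
class CachingProxy:
    """Toy forward proxy: answers from its cache on a hit, asks the origin on a miss."""

    def __init__(self, fetch_from_origin):
        # fetch_from_origin is a hypothetical callable: url -> bytes
        self.fetch_from_origin = fetch_from_origin
        self.cache = {}
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self.cache:
            # Hit: the browser is served without contacting the origin server.
            self.hits += 1
            return self.cache[url]
        # Miss: fetch from the origin server, store a copy, then reply.
        self.misses += 1
        body = self.fetch_from_origin(url)
        self.cache[url] = body
        return body


origin_calls = []


def fake_origin(url):
    """Stand-in for the origin server; records which URLs reached it."""
    origin_calls.append(url)
    return ("body of " + url).encode()


proxy = CachingProxy(fake_origin)
proxy.get("http://example.com/a")  # miss
proxy.get("http://example.com/a")  # hit
proxy.get("http://example.com/b")  # miss
print(proxy.hits, proxy.misses, origin_calls)
```

Only the two misses reach `fake_origin`; the repeated request is answered from the proxy's cache.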

5 4 Potential Web Caching Benefits
• Reduce response time by delivering document from a closer and/or less loaded server than the origin server
• Save bandwidth costs between proxy and origin server

6 5 Goals
• Study large internet and intranet traces
• Evaluate caching opportunities and problems
• Examine cache size needs and document residence times

7 6 Part 1 Proxy trace sources and proxy configurations

8 7 Data Sources

9 8 ISP Usage
• 450,000 users
• Load
  – Peak: 500 unique clients; 30 requests per second
  – Average: 1M requests per day

10 9 ISP hardware details
• IBM RS/6000 system
• 256 MB RAM
• Three 4 GB disks

11 10 ISP proxy configuration details
• 8 proxies nationwide
• Netscape 2.5 proxy
• 5.5 GB cache size
• Netscape extended-2 log format
• Parameters
  – max-uncheck: 6 hours
  – lm-factor: 0.1
  – term-percent: 80%
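The slide lists these parameters without describing them. The sketch below assumes one reading of them, which is not confirmed by the slides: lm-factor scales the document's age since last modification into a freshness estimate, and max-uncheck caps how long a cached copy may be served without an up-to-date check. term-percent is not modeled. The function names are hypothetical and times are plain seconds.

```python
def fresh_for_seconds(fetched_at, last_modified, lm_factor=0.1, max_uncheck=6 * 3600):
    """Estimate how long a cached copy may be served without rechecking it.

    Assumed heuristic (not taken from the slides): the freshness period is
    lm_factor times the document's age at fetch time, capped at max_uncheck.
    """
    age_at_fetch = max(0, fetched_at - last_modified)
    return min(lm_factor * age_at_fetch, max_uncheck)


def needs_check(now, fetched_at, last_modified, lm_factor=0.1, max_uncheck=6 * 3600):
    """True when the cached copy is older than its estimated freshness period."""
    limit = fresh_for_seconds(fetched_at, last_modified, lm_factor, max_uncheck)
    return (now - fetched_at) > limit


HOUR = 3600
DAY = 24 * HOUR

# Document last modified 10 hours before it was fetched: about 1 hour of freshness.
recent = fresh_for_seconds(fetched_at=10 * HOUR, last_modified=0)

# Document last modified 100 days before it was fetched: 10 days by lm-factor,
# so the 6 hour max-uncheck cap applies instead.
old = fresh_for_seconds(fetched_at=100 * DAY, last_modified=0)

print(recent / HOUR, old / HOUR)
```

Under this reading, rarely modified documents are governed by the 6 hour cap, while recently modified ones are rechecked much sooner.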

12 11 Intranet Usage
• 8,000 employees
• Load
  – Peak: varies
  – Average: 500K requests per day, over 10 hours

13 12 Intranet hardware details
• Sun Microsystems Ultra 1 server
• 1 GB RAM
• Seven 4 GB disks

14 13 Intranet proxy configuration details
• 2 proxies
• Squid 1.1.21 proxy
• 12 GB disk cache size
• 750 MB memory cache size
• Extended log format

15 14 Part 2 Analysis of ISP and Intranet traces assuming unlimited cache storage

16 15 Key Cache Metrics
• Hit Ratio (HR)
• Fractional Bandwidth Savings (BT)
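The slide names the two metrics without giving formulas. A minimal sketch, assuming the usual definitions: HR is the fraction of requests served from the cache, and BT is the fraction of response bytes served from the cache. The function name `cache_metrics` and the sample records are hypothetical.

```python
from typing import Iterable, Tuple


def cache_metrics(requests: Iterable[Tuple[bool, int]]) -> Tuple[float, float]:
    """Return (HR, BT) for an iterable of (was_hit, size_bytes) records.

    Assumed definitions (not stated on the slide):
      HR = cache hits / total requests
      BT = bytes served from cache / total bytes served
    """
    total_requests = hit_requests = 0
    total_bytes = hit_bytes = 0
    for was_hit, size in requests:
        total_requests += 1
        total_bytes += size
        if was_hit:
            hit_requests += 1
            hit_bytes += size
    hr = hit_requests / total_requests if total_requests else 0.0
    bt = hit_bytes / total_bytes if total_bytes else 0.0
    return hr, bt


# Two small hits and two misses, one of them large.
sample = [(True, 1000), (False, 9000), (True, 500), (False, 500)]
hr, bt = cache_metrics(sample)
print(hr, bt)
```

In this sample half of the requests hit, but only 1500 of 11000 bytes come from the cache, which shows why the two metrics can differ widely when large documents miss.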

17 16 Analyzing Caching Properties

18 17 ISP documents that cannot be cached, as per HTTP specification

19 18 Comment about “cookies”
• For Prodigy, RFC figures assume that the Netscape proxy follows the RFC
• In reality, the Netscape proxy does not cache documents with cookies
• Documents with cookies account for 2% of responses in the Prodigy trace
• It follows that RFC figures for Prodigy may be up to 2% higher than shown

20 19 ISP Hit Ratio vs. Trace Length

21 20 ISP BT vs. Trace Length

22 21 Intranet HR vs. Trace Length

23 22 Intranet BT vs. Trace Length

24 23 Part 3 Analysis of ISP trace with finite cache sizes.

25 24 Prophetic Cache Replacement Algorithm
• A prophetic cache stores exactly the set of documents that will be referenced in the future
• An on-line prophetic cache algorithm cannot be built
• However, given a trace, prophetic caching decisions can be determined off-line

26 25 Prophetic Cache Replacement Algorithm (continued)
• Cache space used by a prophetic cache is the minimum size needed to avoid cache misses
  – notes:
    true for any maximum residence time
    analyses make cyclical traces
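The off-line computation described on these two slides can be sketched as follows. This is an illustrative reading, not the authors' code: it assumes a trace of `(time, url, size)` tuples sorted by time, assumes a document's size does not change, holds a document only between two consecutive references that are at most `max_residence` apart, and does not treat the trace as cyclical. The function name and sample trace are hypothetical.

```python
def prophetic_cache_space(trace, max_residence=float("inf")):
    """Peak bytes a prophetic cache must hold to avoid every avoidable miss.

    trace: list of (time, url, size) sorted by time.
    A document occupies the cache from one reference until its next reference,
    but only if that next reference arrives within max_residence.
    """
    last_seen = {}
    events = []  # (time, change in bytes held)
    for t, url, size in trace:
        if url in last_seen:
            prev_t = last_seen[url]
            if t - prev_t <= max_residence:
                events.append((prev_t, size))   # kept from the earlier reference
                events.append((t, -size))       # released at the later reference
        last_seen[url] = t

    # Tuples sort by time, then by delta, so releases at a given instant are
    # applied before acquisitions at that same instant.
    events.sort()
    held = peak = 0
    for _, delta in events:
        held += delta
        peak = max(peak, held)
    return peak


sample_trace = [
    (0, "a", 10),
    (1, "b", 20),
    (2, "a", 10),
    (3, "c", 5),   # never referenced again, so never stored
    (4, "b", 20),
    (10, "a", 10),
]
print(prophetic_cache_space(sample_trace))
print(prophetic_cache_space(sample_trace, max_residence=2))
```

Because the cache stores only documents that will be referenced again within the residence limit, document "c" never takes up space, and tightening `max_residence` can only shrink the space needed.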

27 26 Maximum Hit Rate as a function of residence time
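A curve of this kind can be derived from a trace off-line. The sketch below assumes a request counts as a hit when the same URL was last referenced no more than `max_residence` time units earlier; it ignores cacheability rules and document changes, which the real analysis would have to account for. The function name, the sample trace, and the use of hours as the time unit are hypothetical.

```python
def max_hit_rate(trace, max_residence):
    """Upper bound on hit rate when documents may stay cached for max_residence.

    trace: list of (time, url) sorted by time.
    """
    last_seen = {}
    hits = 0
    for t, url in trace:
        if url in last_seen and t - last_seen[url] <= max_residence:
            hits += 1
        last_seen[url] = t
    return hits / len(trace) if trace else 0.0


# Times in hours. Gaps between repeat references: 2, 29, 29 and 50 hours.
sample_requests = [
    (0, "a"), (1, "b"), (2, "a"),
    (30, "b"), (31, "a"), (80, "b"),
]

for hours in (1, 24, 48, float("inf")):
    print(hours, max_hit_rate(sample_requests, hours))
```

In this made-up trace most repeat references arrive slightly more than a day apart, so the achievable hit rate rises sharply once the residence time exceeds 24 hours.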

28 27 Maximum Hit Rate as a function of residence time, by document size

29 28 Conclusions
• We analyze very long Web proxy traces from an ISP and an intranet
• We propose a new method to evaluate a proxy by comparing the actual hit rate with the potential hit rate
• We show that it is important to keep the cache residence time above one day

30 29 Addresses
• E-mail: {artg,pevzner,buff}@cs.nyu.edu
• WWW: www.cs.nyu.edu/{artg,pevzner,buff}
• Paper and presentation are available at www.cs.nyu.edu/artg
