Published by Gerard Lawson. Modified over 9 years ago.
1 Caching Characteristics of Internet and Intranet Web Proxy Traces
Arthur Goldberg, Ilya Pevzner, Robert Buff
Courant Institute of Mathematical Sciences, New York University
2 Clients, Servers and Proxy
3 HTTP Through a Proxy
[Diagram: Browser, Proxy, Server; request paths labeled Miss and Hit]
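The hit and miss paths on this slide can be illustrated with a minimal sketch. The class name `CachingProxy` and the `fetch_from_origin` callback are hypothetical names chosen for illustration; the sketch ignores cacheability rules, expiry, and cache size limits.

```python
from typing import Callable, Dict, Tuple


class CachingProxy:
    """Toy proxy: serves from its cache on a hit, asks the origin on a miss."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin      # hypothetical origin-server call
        self._cache: Dict[str, bytes] = {}   # URL -> document body
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> Tuple[bytes, str]:
        """Return the document and whether it was a cache HIT or MISS."""
        if url in self._cache:
            self.hits += 1
            return self._cache[url], "HIT"
        # Miss: forward the request to the origin server and keep a copy.
        body = self._fetch(url)
        self._cache[url] = body
        self.misses += 1
        return body, "MISS"
```

With a stand-in origin such as `lambda url: url.encode()`, the first `get` for a URL reports a miss and every later `get` for the same URL reports a hit.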
4 Potential Web Caching Benefits
Reduce response time by delivering document from a closer and/or less loaded server than the origin server
Save bandwidth costs between proxy and origin server
5 Goals
Study large internet and intranet traces
Evaluate caching opportunities and problems
Examine cache size needs and document residence times
6 Part 1: Proxy trace sources and proxy configurations
7 Data Sources
8 ISP Usage
450,000 users
Load
–Peak: 500 unique clients, 30 requests per second
–Average: 1M requests per day
9 ISP hardware details
IBM RS/6000 system
256 MB RAM
Three 4 GB disks
10 ISP proxy configuration details
8 proxies nationwide
Netscape 2.5 proxy
5.5 GB cache size
Netscape extended-2 log format
Parameters
–max-uncheck: 6 hours
–lm-factor: 0.1
–term-percent: 80%
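The slide lists the parameter values without defining them. The sketch below shows one common reading, stated here as an assumption rather than as the Netscape proxy's documented behavior: a cached copy is served without revalidation for `lm-factor` times the document's age at fetch time (time since Last-Modified), capped at `max-uncheck`. The function names are hypothetical, and `term-percent` is not modeled.

```python
def unchecked_lifetime_s(
    fetched_at: float,
    last_modified: float,
    lm_factor: float = 0.1,               # value from the slide
    max_uncheck_s: float = 6 * 3600.0,    # 6 hours, value from the slide
) -> float:
    """Seconds a cached copy may be served without checking the origin.

    Assumed heuristic: lm_factor * (age of the document when fetched),
    never longer than max_uncheck_s.
    """
    age_at_fetch = max(0.0, fetched_at - last_modified)
    return min(lm_factor * age_at_fetch, max_uncheck_s)


def is_fresh(now: float, fetched_at: float, last_modified: float) -> bool:
    """True if the cached copy is still inside its unchecked lifetime."""
    return (now - fetched_at) <= unchecked_lifetime_s(fetched_at, last_modified)
```

Under this reading, a document last modified 10 hours before it was fetched would be served unchecked for about 1 hour, while a document last modified months earlier would hit the 6-hour cap.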
11 Intranet Usage
8,000 employees
Load
–Peak: varies
–Average: 500K requests per day, over 10 hours
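To put the two load figures on a common scale, the daily averages can be converted to requests per second. This is a rough sketch: it assumes the ISP's 1M requests are spread over a full 24-hour day (the slides do not say), while the intranet's 500K requests fall within the stated 10 hours. The function name is hypothetical.

```python
def average_rate_per_second(requests_per_day: int, active_hours: float) -> float:
    """Average request rate over the hours in which traffic occurs."""
    return requests_per_day / (active_hours * 3600.0)


# ISP: 1M requests per day, assumed to be spread over 24 hours.
isp_avg = average_rate_per_second(1_000_000, 24)
# Intranet: 500K requests per day, over 10 hours (from the slide).
intranet_avg = average_rate_per_second(500_000, 10)
# The ISP slide reports a peak of 30 requests per second.
isp_peak_to_average = 30 / isp_avg

print(f"ISP average:      {isp_avg:.1f} requests/s")
print(f"Intranet average: {intranet_avg:.1f} requests/s")
print(f"ISP peak/average: {isp_peak_to_average:.1f}")
```

Under these assumptions the ISP averages about 11.6 requests per second against a peak of 30, and the intranet averages about 13.9 requests per second during its working day.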
12 Intranet hardware details
Sun Microsystems Ultra 1 server
1 GB RAM
Seven 4 GB disks
13 Intranet proxy configuration details
2 proxies
Squid 1.1.21 proxy
12 GB disk cache size
750 MB memory cache size
Extended log format
14 Part 2: Analysis of ISP and Intranet traces assuming unlimited cache storage
15 Key Cache Metrics
Hit Ratio (HR)
Fractional Bandwidth Savings (BT)
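The slide names the two metrics without giving formulas. A minimal sketch, assuming HR is the fraction of requests served from the cache and BT is the fraction of requested bytes served from the cache, and assuming unlimited cache storage so that every repeat reference is a hit. The function name and the `(url, size)` trace layout are hypothetical; cacheability and consistency checks are ignored.

```python
from typing import Iterable, Set, Tuple


def hit_ratio_and_bandwidth_savings(
    trace: Iterable[Tuple[str, int]],
) -> Tuple[float, float]:
    """Return (HR, BT) for a trace of (url, size_in_bytes) requests.

    Assumes an unlimited cache: the first reference to a URL is a miss,
    every later reference to the same URL is a hit.
    """
    seen: Set[str] = set()
    requests = hits = 0
    total_bytes = hit_bytes = 0
    for url, size in trace:
        requests += 1
        total_bytes += size
        if url in seen:
            hits += 1
            hit_bytes += size
        else:
            seen.add(url)
    hr = hits / requests if requests else 0.0
    bt = hit_bytes / total_bytes if total_bytes else 0.0
    return hr, bt
```

The two metrics can diverge: if the repeated documents are mostly small, HR is high while BT stays low.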
16 Analyzing Caching Properties
17 ISP documents that cannot be cached, as per HTTP specification
18 Comment about “cookies”
For Prodigy, RFC figures assume that the Netscape proxy follows the RFC
In reality, the Netscape proxy does not cache documents with cookies
Documents with cookies account for 2% of responses in the Prodigy trace
It follows that RFC figures for Prodigy may be up to 2% higher than shown
19 ISP Hit Ratio vs. Trace Length
20 ISP BT vs. Trace Length
21 Intranet HR vs. Trace Length
22 Intranet BT vs. Trace Length
23 Part 3: Analysis of ISP trace with finite cache sizes
24 Prophetic Cache Replacement Algorithm
A Prophetic cache stores exactly the set of documents that will be referenced in the future
An on-line prophetic cache algorithm cannot be built
However, given a trace, prophetic caching decisions can be determined off-line
25 Prophetic Cache Replacement Algorithm (continued)
Cache space used by a prophetic cache is the minimum size needed to avoid cache misses
Notes:
–true for any maximum residence time
–analyses make cyclical traces
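Given a complete trace, the space a prophetic cache needs can be computed off-line. The sketch below is one possible way to do it, not the authors' implementation: a document occupies the cache from one reference until its next reference, provided the gap does not exceed the maximum residence time, and the result is the peak number of bytes held at once. The function name and the `(time, url, size)` trace layout are hypothetical, the trace is assumed to be sorted by time, and the cyclical-trace treatment mentioned in the notes is not modeled.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def prophetic_cache_peak_bytes(
    trace: Iterable[Tuple[float, str, int]],
    max_residence: Optional[float] = None,
) -> int:
    """Peak bytes held by a cache that stores only documents it will reuse.

    trace: (time, url, size_in_bytes) tuples in non-decreasing time order.
    max_residence: longest allowed gap between two references for the
    document to be kept in between; None means no limit.
    """
    last_seen: Dict[str, float] = {}
    events: List[Tuple[float, int]] = []  # (time, change in bytes held)
    for t, url, size in trace:
        prev = last_seen.get(url)
        if prev is not None and (max_residence is None or t - prev <= max_residence):
            # The document was worth keeping during [prev, t).
            events.append((prev, size))
            events.append((t, -size))
        last_seen[url] = t
    # At equal times, process releases (negative deltas) before additions
    # so a document re-referenced at time t is not counted twice.
    events.sort(key=lambda e: (e[0], e[1]))
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak
```

Documents that are never referenced again contribute nothing, which is what distinguishes this from the size of an unlimited cache that stores everything it sees.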
26 Maximum Hit Rate as a function of residence time
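The relationship named in this slide title can be sketched as follows, under the assumption that a request counts as a hit when the same document was last referenced no more than the maximum residence time earlier. The function name and the `(time, url)` trace layout are hypothetical, cache size is treated as unlimited, and the trace is assumed to be sorted by time.

```python
from typing import Dict, Iterable, Tuple


def max_hit_rate(trace: Iterable[Tuple[float, str]], max_residence: float) -> float:
    """Best achievable hit rate when documents may stay cached for at most
    max_residence time units after their most recent reference."""
    last_seen: Dict[str, float] = {}
    hits = total = 0
    for t, url in trace:
        total += 1
        prev = last_seen.get(url)
        if prev is not None and t - prev <= max_residence:
            hits += 1
        last_seen[url] = t  # each reference restarts the residence clock
    return hits / total if total else 0.0
```

Evaluating the function over a range of residence times (for example one hour, one day, one week, expressed in the trace's time unit) gives the points of a curve of this kind.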
27 Maximum Hit Rate as a function of residence time, by document size
28 Conclusions
We analyze very long Web proxy traces from an ISP and an intranet
We propose a new method to evaluate a proxy by comparing the actual hit rate with potential hit rate
We show that it is important to keep the cache residence time above one day
29 Addresses
E-mail: {artg,pevzner,buff}@cs.nyu.edu
WWW: www.cs.nyu.edu/{artg,pevzner,buff}
Paper and presentation are available at www.cs.nyu.edu/artg