Summary of WWW Characterizations. James E. Pitkow, Xerox Palo Alto Research Center. WWW Journal 99. Presenter: 노양우.


1 Summary of WWW Characterizations. James E. Pitkow, Xerox Palo Alto Research Center. WWW Journal 99. Presenter: 노양우

2 Contents (System Software Research Lab.)
- Introduction
  - Client, Proxies and Gateways, Server
  - Traces and Analysis (distribution)
- Summary
  - 1994
  - 1995
  - 1996
  - 1997
  - 1998
- Conclusion

3 Introduction
- Growth of Web usage
  - representative characterization --> enjoyable Web surfing
  - various data sets at various points (clients, proxies and gateways, servers)
  - several invariants
- Clients
  - informative but rare --> browser implementation, sufficient APIs
- Proxies and Gateways
  - greater availability
  - less concentration on characteristics --> caching algorithm
- Servers
  - traffic analysis

4 1994
- A Caching Relay for WWW
  - DEC proxy, 4000/day from 100 users
  - document popularity --> Zipf
  - cache: hit (1/3), miss (1/3), invalidation (1/3)
- Mosaic Will Kill Me
  - Intel intranet proxies --> image traffic
- A Simple Yet Robust Caching Algorithm
  - Georgia Tech server: recency > frequency
  - LRU: server-side cache hit rate (80%)
- Invalidation in Large Scale Object Cache
  - Harvest Cache, Xmosaic
  - HTML: frequently modified (75 days), images (107 days)
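The Zipf popularity and LRU findings on this slide can be explored with a small simulation. The following is a minimal sketch, not the method of any of the cited studies: the document count, Zipf exponent, cache sizes, and the names `zipf_weights` and `lru_hit_rate` are all assumptions chosen for illustration.

```python
import random
from collections import OrderedDict


def zipf_weights(n_docs, alpha=1.0):
    """Unnormalised Zipf weights: the document of rank r gets weight 1 / r**alpha."""
    return [1.0 / (rank ** alpha) for rank in range(1, n_docs + 1)]


def lru_hit_rate(requests, capacity):
    """Replay a request stream through an LRU cache and return the hit rate."""
    cache = OrderedDict()
    hits = 0
    for doc in requests:
        if doc in cache:
            hits += 1
            cache.move_to_end(doc)  # mark as most recently used
        else:
            cache[doc] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used entry
    return hits / len(requests) if requests else 0.0


# Hypothetical workload: 1000 documents, 20000 requests, Zipf exponent 1.
rng = random.Random(0)
requests = rng.choices(range(1000), weights=zipf_weights(1000), k=20000)
rate_small = lru_hit_rate(requests, 10)    # cache holds 1% of the documents
rate_large = lru_hit_rate(requests, 100)   # cache holds 10% of the documents
```

Because a few documents receive most of the requests under a Zipf distribution, even a cache that holds a small fraction of the documents serves a disproportionate share of requests, and a larger LRU cache never does worse on the same stream.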

5 1995
- Characteristics of WWW Client-Based Traces
  - Xmosaic, 600 users, 6 months
  - transmission times, doc. size, doc. size versus # of requests (Pareto)
  - vs. Unix file systems: more small and more large files exist
- Application Level Document Caching
- Explaining WWW Traffic Self-Similarity
  - 1 second to 100 seconds: self-similar
  - busiest periods: self-similar; idle periods: not self-similar
- Caching Proxies: Limitations and Potentials
  - # of requests per server (Zipf), CGI (0.5%)
- Network Behaviour of a Busy Web Server
  - DEC, Congressional Election Server
  - images --> major traffic, inter-arrival time --> not Poisson
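The Pareto distributions mentioned on this slide are heavy-tailed: the probability of exceeding a size x falls off as a power law, P[X > x] = (x_min / x)^alpha. The sketch below draws samples by inverse transform and compares the empirical tail with that formula. The shape `alpha`, the scale `x_min`, and the function names are assumptions for illustration, not values taken from the traces.

```python
import random


def pareto_sample(rng, alpha, x_min):
    """Inverse-transform sample from a Pareto law with P[X > x] = (x_min / x)**alpha."""
    u = 1.0 - rng.random()  # uniform on (0, 1], so the division below is safe
    return x_min / (u ** (1.0 / alpha))


def empirical_ccdf(samples, x):
    """Fraction of samples strictly greater than x."""
    return sum(1 for s in samples if s > x) / len(samples)


# Hypothetical parameters: tail index 1.2, minimum size 1000 bytes.
rng = random.Random(1)
alpha, x_min = 1.2, 1000.0
sizes = [pareto_sample(rng, alpha, x_min) for _ in range(50000)]

observed = empirical_ccdf(sizes, 10 * x_min)   # share of files over 10x the minimum
expected = (x_min / (10 * x_min)) ** alpha     # power-law prediction, 10**-1.2
```

With a tail index below 2 the variance is infinite, which is one reason averages computed over such traces are unstable and medians are often reported alongside means.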

6 1996
- WWW Cache Consistency
  - Microsoft, BU, Harvard
  - popularity: inversely related to frequency of change
  - images: 65%, CGI: 9%
  - HTML: 50 days, GIF: 85 days
- Web Server Workload Characterization
  - University of Waterloo, Calgary, NASA, NCSA
  - 10% of documents --> 90% of requests
  - 10% of domains --> 75% of usage
- Evaluating History Mechanism
  - Xmosaic, 6 weeks
  - new URL: 42%, revisited URL: 58%
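A concentration figure such as "10% of documents --> 90% of requests" can be computed directly from an access log. The sketch below assumes the log has already been reduced to a list of requested paths; the function name `top_share` and the toy log are hypothetical.

```python
from collections import Counter


def top_share(requests, fraction):
    """Share of all requests that go to the most popular `fraction` of distinct documents."""
    counts = sorted(Counter(requests).values(), reverse=True)
    if not counts:
        return 0.0
    k = max(1, round(len(counts) * fraction))  # number of documents in the top group
    return sum(counts[:k]) / sum(counts)


# Toy log (made up): 10 distinct documents, one of which dominates.
log = ["/index.html"] * 90 + [f"/page{i}.html" for i in range(1, 10)] + ["/page1.html"]
share = top_share(log, 0.10)  # top 10% of documents = 1 document here
```

The same function applies to client domains instead of documents if the list holds the requesting domain of each log entry.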

7 1997
- Strong Regularities in Web Surfing
  - clicks per site --> inverse Gaussian
  - average clicks: 8.32, typical case: 1 click
- Shared User Behaviour
  - DEC, Korean National Proxy, Virginia Tech, AOL
  - median file size: 2 KB, mean file size: 27 KB
  - 25% of servers: 80-95% of requests; 90% of bytes: 25% of servers
- Characterizing WWW Queries
  - CGI: 4% (KNP), 9% (AOL), 12% (VT)
  - 99% of queries: simple
- Web Facts and Fantasy
  - Educational (Harvard, Rice), Business (BUS, ISP, FSS, AE), Info (GOV, PROF)
  - Characterization of sites (size of the site, diversity of users, user access patterns)
    - Renovational growth (Business)
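The gap between the average of 8.32 clicks and the typical case of 1 click is what a strongly right-skewed inverse Gaussian would produce: its mean can sit far above its mode. The sketch below evaluates the standard inverse Gaussian density and locates the mode on a grid. The mean `mu` is the figure quoted on the slide; the shape parameter `lam` is a hypothetical value chosen only to illustrate the skew, not a fitted one.

```python
import math


def inverse_gaussian_pdf(x, mu, lam):
    """Density of the inverse Gaussian distribution with mean mu and shape lam."""
    if x <= 0:
        return 0.0
    return math.sqrt(lam / (2.0 * math.pi * x ** 3)) * math.exp(
        -lam * (x - mu) ** 2 / (2.0 * mu ** 2 * x)
    )


mu = 8.32   # mean number of clicks, as quoted on the slide
lam = 2.0   # hypothetical shape parameter (assumption, not from the source)

# Locate the mode numerically on a grid from 0.01 to 50 in steps of 0.01.
grid = [i / 100.0 for i in range(1, 5001)]
mode = max(grid, key=lambda x: inverse_gaussian_pdf(x, mu, lam))
```

For these parameters the density peaks below one click while the mean stays at 8.32, which mirrors the qualitative pattern reported on the slide: most visits are very short, and a few long sessions pull the average up.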

8 1998
- Web Facts and Fantasy (continued from the previous slide)
    - Size growth (Educational sites)
    - Visit growth by the same user (Information sites)
    - Attraction (Adult Entertainment: Search Engine)
  - CGI: low requests, low traffic: counter, login, search engine
  - Peak activity: network bottleneck
- Generating Representative Web Workloads
  - SURGE
    - file size: body (lognormal), tail (Pareto); popularity: Zipf; request size: Pareto; reading times: Pareto; ...
    - realistic benchmark: HTTP-NG
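The hybrid file-size model listed for SURGE (lognormal body, Pareto tail) can be sketched as a two-branch sampler. This is a crude illustration under stated assumptions and not SURGE's actual implementation: every parameter value below, the constant `CUTOFF`, and the function name `sample_file_size` are hypothetical, and no attempt is made to keep the density continuous at the cutoff.

```python
import random

CUTOFF = 100_000.0  # hypothetical boundary between body and tail, in bytes


def sample_file_size(rng, mu=9.0, sigma=1.3, alpha=1.1, tail_prob=0.07):
    """Draw one file size: Pareto tail with probability tail_prob, else lognormal body."""
    if rng.random() < tail_prob:
        # paretovariate returns values >= 1, so tail sizes start at CUTOFF
        return CUTOFF * rng.paretovariate(alpha)
    while True:
        # body: lognormal truncated to sizes below CUTOFF (rejection sampling)
        size = rng.lognormvariate(mu, sigma)
        if size < CUTOFF:
            return size


rng = random.Random(42)
sizes = [sample_file_size(rng) for _ in range(20000)]
tail_share = sum(1 for s in sizes if s >= CUTOFF) / len(sizes)
```

A workload generator in this style would combine such a size sampler with a Zipf popularity model and heavy-tailed reading times, which are the other distributions named on the slide.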

9 Conclusion
- Dynamic Web --> Several Invariants
  - file popularity, file size, # of requests per user
  - site popularity, life span, request type, ...
- Future Research
  - relation between file popularity and reoccurrence rate
  - users' navigation paths

