
Web Performance
성민영, SNU Computer Systems Lab.

2 Contents
• Modeling the Performance of HTTP Over Several Transport Protocols
• Summary Cache: A Scalable Web Cache Sharing Protocol
• Web Server Workload Characterization

Modeling the Performance of HTTP Over Several Transport Protocols.

4 Transport Protocols (1/2)
• TCP
• Persistent-Connection HTTP (P-HTTP)
  – Proposed by Padmanabhan and Mogul.
  – A variant of HTTP that uses one TCP connection to carry multiple HTTP requests.
  – Amortizes TCP's connection overhead over multiple HTTP interactions.
  – A version of P-HTTP is part of the HTTP/1.1 specification.
  – Pipelining can be used to improve performance further.

5 Transport Protocols (2/2)
• Transaction TCP (T/TCP)
  – Caches enough per-host information to bypass TCP's three-way handshake and avoid slow start.
  – Also shortens TCP's TIME_WAIT period from 240 to 12 seconds.
• UDP-Based Request-Response Protocols
  – A reliable message-passing protocol built atop UDP.
  – Example: Asynchronous Reliable Delivery Protocol (ARDP).
  – ARDP borrows TCP-style flow-control, congestion-avoidance, and retransmission algorithms.
  – Avoids TCP's three-way handshake.

6 Network and Traffic Model (1/3)
• Network Model
  – Network characteristics
      round-trip time (rtt)
      bandwidth (bw)
      maximum segment size (mss)
      segment-transmission time (stt): $stt = mss / bw$
      maximum useful window size (muws): $muws = \lceil rtt / stt \rceil$
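The two derived quantities in the network model can be computed directly from the three measured ones. The sketch below is illustrative only: it assumes rtt in seconds, bw in bits per second, and mss in bytes (the slides do not state units), and the sample link values are hypothetical rather than taken from the slides.

```python
import math

def segment_transmission_time(mss_bytes: int, bw_bps: float) -> float:
    """stt = mss / bw, with mss converted from bytes to bits."""
    return mss_bytes * 8 / bw_bps

def max_useful_window(rtt_s: float, mss_bytes: int, bw_bps: float) -> int:
    """muws = ceil(rtt / stt): segments needed to keep the link busy for one rtt."""
    return math.ceil(rtt_s / segment_transmission_time(mss_bytes, bw_bps))

# Hypothetical 10 Mb/s link with 1460-byte segments.
lan_muws = max_useful_window(0.0007, 1460, 10e6)   # short round trip
wan_muws = max_useful_window(0.1, 1460, 10e6)      # long round trip
print(lan_muws, wan_muws)
```

A larger muws means more segments must be in flight before the sender can use the full bandwidth, which is why networks with a high bandwidth-delay product are more sensitive to slow start.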

7 Network and Traffic Model (2/3)
  – Network characteristics for several existing networks (table not included in the transcript).

8 Network and Traffic Model (3/3)
• Traffic Model
  – small page: single 5 KB web page
  – medium page: single 25 KB web page
  – large page: single 100 KB web page
  – small cluster: single 3,220 KB page with three embedded images, sizes 57,613 B, 2,344 B, and 14,190 B
  – large cluster: single 100 KB page with 10 embedded 25 KB images

9 Protocol Analysis (1/7)
• Classes of protocols
  – TCP, connection-caching protocols (P-HTTP, T/TCP), and UDP-based request-response protocols.
• Minimum Transmit Times
  – The minimum possible transaction time.

10 Protocol Analysis (2/7)
  – A series of n independent requests, pipelined.
• Simple Model
  – One round-trip overhead per reply: rtt / (reply size / bw)
• HTTP over TCP
  – TCP slow-start
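One way to read the simple model is as a ratio: one round trip of overhead, divided by the ideal time to transmit the reply at full bandwidth. The sketch below follows that reading; the function name, the unit choices (seconds, bytes, bits per second), and the sample link are assumptions made for illustration.

```python
def simple_overhead(rtt_s: float, reply_bytes: int, bw_bps: float) -> float:
    """Overhead ratio rtt / (reply size / bw) of the simple model."""
    ideal_transmit_s = reply_bytes * 8 / bw_bps  # time to send the reply at full bandwidth
    return rtt_s / ideal_transmit_s

# Hypothetical link: 100 ms round trip, 1.5 Mb/s, with the page sizes from the traffic model.
for label, size in (("small", 5 * 1024), ("medium", 25 * 1024), ("large", 100 * 1024)):
    print(label, round(simple_overhead(0.1, size, 1.5e6), 3))
```

For a fixed link the ratio falls as the reply grows, so the round trip matters most for small pages.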

11 Protocol Analysis (3/7)
  – Performance
      Overhead for Ethernet, modem, and ISDN networks is reasonable (muws at most 2).
      Networks such as Fast-Ethernet, Fast-Internet, and ADSL have higher overheads (much higher muws).

12 Protocol Analysis (4/7)
  – TCP congestion avoidance overhead ($S_{TCP} / S_{min}$)

13 Protocol Analysis (5/7)
• HTTP over TCP with connection caching
  – P-HTTP, T/TCP

14 Protocol Analysis (6/7)
  – Observations
      TCP with connection caching performs somewhat better than standard TCP for the cluster cases.
      Overhead is still high for Fast-Ethernet and Fast-Internet (large bandwidth-delay product).
• HTTP over UDP-Based Protocols
  – ARDP avoids TCP's three-way handshake.

15 Protocol Analysis (7/7)
  – Avoiding the three-way handshake is especially helpful for single, brief request-response interactions.

Summary Cache: A Scalable Web Cache Sharing Protocol.

17 Web Caching
• Internet Cache Protocol (ICP)
  – A web cache sharing protocol from the Harvest group.
  – ICP discovers cache hits in other proxies by having the proxy multicast a query message to all other proxies whenever a cache miss occurs.
  – Not widely deployed because of its overhead.
  – One alternative: the Cache Array Routing Protocol, which partitions the URL space among proxies.

18 Overhead of ICP
  – Not a scalable protocol
      As the number of proxies increases, the overhead quickly becomes prohibitive.
      Simulation results show that ICP incurs considerable overhead even when the number of proxies is as low as four.
  – The effort spent on processing ICP is proportional to the total number of cache misses experienced by other proxies, rather than to the number of actual remote cache hits.

19 Summary Cache (1/7)
• Summary Cache Scheme
  – Each proxy stores a summary of the URLs of the documents cached at every other proxy.
  – If the requested document might be stored in other proxies, the proxy sends requests to the relevant proxies to fetch the document.
  – Scalable: summaries do not have to be up to date or accurate.
  – Errors: false misses, false hits, remote stale hits.
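The lookup path described above, and two of the error types, can be sketched in a few lines. This is an illustration only: the `Proxy` class, the `lookup` function, and the use of plain Python sets as summaries are hypothetical stand-ins, not the protocol's actual data structures or message format.

```python
from typing import Dict, Set, Tuple

class Proxy:
    def __init__(self, name: str) -> None:
        self.name = name
        self.cache: Set[str] = set()                   # URLs actually cached here
        self.peer_summaries: Dict[str, Set[str]] = {}  # possibly stale copies of peers' summaries

    def publish_summary(self) -> Set[str]:
        """Snapshot of the cache directory that peers keep until the next update."""
        return set(self.cache)

def lookup(proxy: Proxy, peers: Dict[str, Proxy], url: str) -> Tuple[str, int]:
    """Return (outcome, number of peers queried) for one request."""
    if url in proxy.cache:
        return ("local hit", 0)
    queried = 0
    for name in sorted(proxy.peer_summaries):
        if url in proxy.peer_summaries[name]:   # summary says the peer may have it
            queried += 1
            if url in peers[name].cache:
                return (f"remote hit at {name}", queried)
    return ("miss", queried)

a, b = Proxy("a"), Proxy("b")
b.cache.add("/x")
a.peer_summaries["b"] = b.publish_summary()
print(lookup(a, {"b": b}, "/x"))   # summary is accurate
b.cache.discard("/x")
b.cache.add("/y")                  # b changes; a's copy of the summary is now stale
print(lookup(a, {"b": b}, "/x"))   # false hit: one wasted query
print(lookup(a, {"b": b}, "/y"))   # false miss: b is never asked
```

Unlike ICP, a local miss generates no traffic unless some summary matches, so stale summaries cost only occasional wasted queries or missed remote hits.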

20 Summary Cache (2/7)
• Impact of Update Delays
  – Delaying updates
      Delay the update until the percentage of cached documents that are new reaches a threshold.
      Update summaries at regular time intervals.
  – Simulation results show that the degradation in total cache hit ratio increases linearly with the update threshold.
  – The false hit ratio is very small, though it does increase linearly with the threshold.

21 Summary Cache (3/7)
• Summary Representation
  – Summaries need to be stored in main memory.
  – Two naive summary representations
      exact-directory: represents each URL by its 16-byte MD5 signature; consumes too much memory.
      server-name: generates too many false hits, significantly increasing network traffic.

22 Summary Cache (4/7)

23 Summary Cache (5/7)
• Bloom Filters
  – Invented by Burton Bloom in 1970.
  – [Figure: a bit vector v of m bits. Element a is hashed by four functions, H1(a) = P1, H2(a) = P2, H3(a) = P3, H4(a) = P4, and the bits at positions P1 to P4 are set to 1.]

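24 Summary Cache (6/7)
  – k independent hash functions, each with range {1, …, m}.
  – False positives
      After inserting n keys into a table of size m, the probability of a false positive is
      $\left(1 - \left(1 - \frac{1}{m}\right)^{kn}\right)^{k} \approx \left(1 - e^{-kn/m}\right)^{k}$
      The right-hand side is minimized for $k = \ln 2 \cdot m / n$.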
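A Bloom filter with k hash functions over an m-bit vector can be sketched as follows. This is a minimal illustration, not the implementation evaluated in the slides: deriving the k positions by salting MD5 with the function index is an assumption made here for convenience, positions run from 0 to m-1 rather than 1 to m, and one byte per bit is used for readability rather than compactness.

```python
import hashlib
from typing import Iterator

class BloomFilter:
    def __init__(self, m: int, k: int) -> None:
        self.m = m                  # number of bits in the vector
        self.k = k                  # number of hash functions
        self.bits = bytearray(m)    # one byte per bit, all initially 0

    def _positions(self, key: str) -> Iterator[int]:
        # Assumed construction: hash function i is MD5 over "i:key", reduced mod m.
        for i in range(self.k):
            digest = hashlib.md5(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, key: str) -> None:
        for p in self._positions(key):
            self.bits[p] = 1

    def __contains__(self, key: str) -> bool:
        # True for every inserted key; may also be True for others (false positive).
        return all(self.bits[p] for p in self._positions(key))

summary = BloomFilter(m=1024, k=4)
summary.add("http://example.com/index.html")
print("http://example.com/index.html" in summary)
```

Membership tests never miss an inserted key, so a Bloom-filter summary can produce false hits but no additional false misses beyond those caused by update delays.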

25 Summary Cache (7/7)
• Bloom Filters as Summaries
  – 8, 16, or 32 bits for each document.
  – 4 hash functions.
• Experiment Results
  – Bloom filter summaries have virtually the same cache hit ratio as the exact-directory approach.
  – In terms of the total size of inter-proxy network messages, Bloom filter based summaries improve over ICP by 55% to 64%.
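The configurations listed above can be plugged into the standard Bloom filter approximation, false-positive probability ≈ (1 - e^(-kn/m))^k, where m/n is the number of bits per document. The sketch below only evaluates that formula; the function names are hypothetical and the printed values are theoretical estimates, not the measured results from the experiments.

```python
import math

def false_positive_rate(bits_per_doc: float, k: int) -> float:
    """Approximate false-positive probability (1 - e^(-k * n / m))^k, with m / n = bits_per_doc."""
    return (1.0 - math.exp(-k / bits_per_doc)) ** k

def optimal_k(bits_per_doc: float) -> float:
    """Number of hash functions that minimizes the approximation: k = ln 2 * m / n."""
    return math.log(2) * bits_per_doc

for bits in (8, 16, 32):
    print(bits, false_positive_rate(bits, 4), optimal_k(bits))
```

With 4 hash functions, each doubling of the bits per document cuts the estimated false-positive rate by roughly an order of magnitude, at the price of doubling the summary's memory footprint.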

Web Server Workload Characterization: The Search for Invariants

29 Invariants Found in Web Server Workloads (1/3)
• Success Rate
  – Success rate for lookups at the server = 88%.
• File Types
  – HTML and image files account for 90-100% of requests.
• Mean Transfer Size
  – Mean transfer size <= 21 kilobytes.
• One-Time Referencing
  – Approximately one-third of the files and bytes accessed in the log are accessed only once.

30 Invariants Found in Web Server Workloads (2/3)
• Size Distribution
  – The file size distribution is Pareto with $0.40 < \alpha < 0.63$.
• Concentration of References
  – 10% of the files accessed account for 90% of server requests and 90% of the bytes transferred.
• Inter-Reference Times
  – File inter-reference times are exponentially distributed and independent.

31 Invariants Found in Web Server Workloads (3/3)
• Remote Requests
  – Remote sites account for >= 70% of the accesses to the server and >= 60% of the bytes transferred.
• Wide Area Usage
  – Web servers are accessed by thousands of domains, with 10% of the domains accounting for >= 75% of usage.

32 Self-Similarity
• Recent work has suggested that WWW traffic may be self-similar.
• Moving from the bottom plot to the top plot, burstiness clearly exists across several different time scales.
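To make the size-distribution invariant concrete, the sketch below uses the textbook Pareto tail, P[X > x] = (k / x)^α for x >= k, and draws samples by inverse-transform sampling. The scale parameter k = 1, the choice α = 0.5 (inside the reported range), and the function names are assumptions for illustration; none of the numbers come from the measured logs.

```python
import random

def pareto_tail(x: float, alpha: float, k: float = 1.0) -> float:
    """P[X > x] for a Pareto distribution with shape alpha and minimum value k."""
    return 1.0 if x < k else (k / x) ** alpha

def sample_pareto(rng: random.Random, alpha: float, k: float = 1.0) -> float:
    """Inverse-transform sample: solve u = (k / x)^alpha for x, with u uniform in (0, 1]."""
    u = 1.0 - rng.random()
    return k / u ** (1.0 / alpha)

rng = random.Random(0)  # fixed seed so runs are repeatable
sizes = [sample_pareto(rng, alpha=0.5) for _ in range(10_000)]
print(pareto_tail(100.0, 0.5), max(sizes) / sum(sizes))
```

With α below 1 the tail decays so slowly that a handful of very large files can dominate the total bytes, which fits the concentration-of-references observation.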
