Increasing Web Server Throughput with Network Interface Data Caching. October 9, 2002. Hyong-youb Kim, Vijay S. Pai, and Scott Rixner, Rice Computer Architecture Group.

Slide 1: Increasing Web Server Throughput with Network Interface Data Caching
October 9, 2002. Hyong-youb Kim, Vijay S. Pai, and Scott Rixner.
Rice Computer Architecture Group, http://www.cs.rice.edu/CS/Architecture/

Slide 2: Anatomy of a Web Request
 Static content web server
[Diagram: request and response flow among the CPU, main memory, and network interface over the local interconnect; file data and headers cross the interconnect for every request. Callout: 95% utilization.]

Slide 3: Problem
 Inefficient use of the local interconnect
– Repeated transfers
– Every bit of data sent out to the network crosses the interconnect
 The local interconnect becomes the bottleneck
 Transfer overhead exacerbates the inefficiency
– Overhead reduces available bandwidth
– E.g., the Peripheral Component Interconnect (PCI) bus incurs about 30% transfer overhead
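The 30% overhead figure can be made concrete with back-of-envelope arithmetic, using the 64-bit, 33 MHz PCI bus described later in the talk; the exact overhead varies by workload, so this is only an approximation:

```python
# Rough PCI bandwidth arithmetic (illustrative; actual overhead is workload-dependent).
bus_width_bits = 64
clock_hz = 33 * 10**6
raw_mbps = bus_width_bits * clock_hz / 10**6   # 2112 Mb/s, the "2 Gb/s" bus on slide 15
transfer_overhead = 0.30                       # the slide's PCI overhead estimate
usable_mbps = raw_mbps * (1 - transfer_overhead)
print(round(raw_mbps), round(usable_mbps))     # prints: 2112 1478
```

So roughly 30% of the nominal 2 Gb/s never carries payload, which is why the measured limits on later slides fall well below the bus's raw rate.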

Slide 4: Solution
 Network interface data caching
– Cache data in the network interface
– Reduces interconnect traffic
– Software-controlled cache
– Minimal changes to the operating system
 Prototype web server
– Up to 57% reduction in PCI traffic
– Up to 31% increase in server performance
– Peak of 1571 Mb/s of content throughput, breaking the PCI bottleneck

Slide 5: Outline
 Background
 Network Interface Data Caching
 Implementation
 Experimental Prototype / Results
 Summary

Slide 6: Network Interface Data Cache
 Software-controlled cache in the network interface
[Diagram: as in slide 2, but file data already cached on the network interface no longer crosses the interconnect; only requests and headers are transferred.]

Slide 7: Web Traces
 Five web traces with realistic working sets and file distributions:
 Berkeley computer science department
 IBM
 NASA Kennedy Space Center
 Rice computer science department
 1998 World Cup

Slide 8: Content Locality
 Block cache with 4 KB block size
[Chart: cache hit rate versus cache size for the five traces. Callout: 8-16 MB caches capture the locality.]
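The block-cache behavior on the slide can be illustrated with a small simulation. This is a sketch (an LRU cache of 4 KB blocks driven by a toy request trace), not the trace-analysis tool behind the chart, and all names are invented:

```python
from collections import OrderedDict

BLOCK_SIZE = 4096  # 4 KB blocks, as on the slide

class BlockCache:
    """LRU cache of fixed-size file blocks, keyed by (file, block index)."""
    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes // BLOCK_SIZE
        self.blocks = OrderedDict()
        self.hits = self.accesses = 0

    def access(self, file_id, offset, length):
        # Touch every block the request covers.
        first = offset // BLOCK_SIZE
        last = (offset + length - 1) // BLOCK_SIZE
        for blk in range(first, last + 1):
            key = (file_id, blk)
            self.accesses += 1
            if key in self.blocks:
                self.hits += 1
                self.blocks.move_to_end(key)   # mark most recently used
            else:
                if len(self.blocks) >= self.capacity:
                    self.blocks.popitem(last=False)  # evict least recently used
                self.blocks[key] = True

    def hit_rate(self):
        return self.hits / self.accesses if self.accesses else 0.0

# Toy trace: a small set of popular files requested repeatedly,
# mimicking the temporal reuse seen in real web traces.
cache = BlockCache(capacity_bytes=8 * 1024 * 1024)
for _ in range(100):
    for f, size in [("index.html", 6000), ("logo.gif", 2000)]:
        cache.access(f, 0, size)
print(f"hit rate: {cache.hit_rate():.2f}")  # prints: hit rate: 0.99
```

With a skewed popularity distribution like this, even a cache far smaller than the total content captures nearly all accesses, which is the effect the 8-16 MB result reflects.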

Slide 9: Outline
 Background
 Network Interface Data Caching
 Implementation
– OS modification / NIC API
 Experimental Prototype / Results
 Summary

Slide 10: Unmodified Operating System
 Transmit data flow:
1. Network stack identifies the file pages to send
2. Protocol processing breaks the data into packets
3. Device driver informs the network interface
[Diagram: pages flow from the file through the network stack and device driver to the interface.]

Slide 11: Modified Operating System
 The OS completely controls the network interface data cache
 Minimal changes to the OS; transmit data flow:
1. Identify pages (unmodified)
2. Annotate pages using the cache directory (new step)
3. Protocol processing breaks the data into packets (unmodified)
4. Query the cache directory (new step)
5. Inform the network interface (unmodified)

Slide 12: Operating System Modification
 Device driver
– Completely controls the cache
– Makes allocation, use, and replacement decisions
 Cache directory (in the device driver)
– Each entry is a tuple of: file identifier, offset within the file, file revision number, and flags
– Sufficient to maintain cache coherence
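The directory entry and its role in coherence can be sketched as follows. This is a Python illustration of the idea with invented names, not the FreeBSD driver code:

```python
class CacheDirectory:
    """Driver-side directory of blocks resident in the NIC cache.

    Each entry mirrors the slide's tuple: (file_id, offset, revision, flags).
    A bumped revision number marks an entry stale, which is enough to keep
    the NIC cache coherent with the file system.
    """
    def __init__(self, num_slots):
        self.slots = [None] * num_slots   # slot index stands in for a NIC buffer
        self.index = {}                   # (file_id, offset) -> slot

    def insert(self, slot, file_id, offset, revision, flags=0):
        old = self.slots[slot]
        if old is not None:               # replacement decision is the driver's
            self.index.pop((old[0], old[1]), None)
        self.slots[slot] = (file_id, offset, revision, flags)
        self.index[(file_id, offset)] = slot

    def lookup(self, file_id, offset, revision):
        slot = self.index.get((file_id, offset))
        if slot is None:
            return None                   # not cached: data must cross the PCI bus
        if self.slots[slot][2] != revision:
            return None                   # file changed: stale entry is ignored
        return slot                       # cached and current: send only a reference

d = CacheDirectory(num_slots=4)
d.insert(0, file_id=17, offset=0, revision=1)
assert d.lookup(17, 0, revision=1) == 0       # hit: NIC already holds the data
assert d.lookup(17, 0, revision=2) is None    # revision bump invalidates the entry
```

Because the driver alone updates the directory, no cache-coherence protocol is needed on the NIC side; a stale entry is simply never referenced again and is eventually replaced.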

Slide 13: Network Interface API
 Initialize
 Insert data into the cache
 Append data to a packet
 Append cached data to a packet
[Diagram: plain appends copy data from main memory across the interconnect into the TX buffer, while cached appends copy directly from the network interface cache.]
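A toy model of the four API calls shows why cached appends save interconnect traffic. It is a sketch with made-up names, not the actual Tigon 2 firmware interface:

```python
class NICCache:
    """Toy model of the slide's four-call NIC API, counting PCI traffic."""
    def __init__(self):
        self.buffers = {}
        self.pci_bytes = 0           # bytes that cross the interconnect

    def initialize(self, num_slots):
        self.buffers = {i: None for i in range(num_slots)}

    def insert(self, slot, data):
        # Data crosses PCI once, then stays resident in NIC memory.
        self.pci_bytes += len(data)
        self.buffers[slot] = data

    def append(self, packet, data):
        # Uncached data is transferred over PCI for every packet.
        self.pci_bytes += len(data)
        packet.extend(data)

    def append_cached(self, packet, slot):
        # Cached data is copied from the NIC's own memory: no PCI traffic.
        packet.extend(self.buffers[slot])

nic = NICCache()
nic.initialize(num_slots=2)
nic.insert(0, b"<html>...</html>")              # pay the 16-byte transfer once
for _ in range(3):                              # three responses reuse the cached body
    pkt = bytearray()
    nic.append(pkt, b"HTTP/1.0 200 OK\r\n\r\n") # headers still cross PCI each time
    nic.append_cached(pkt, 0)
# nic.pci_bytes == 73, versus 105 if the body crossed PCI on every response
```

Headers differ per response and so are always appended from host memory; only the repeatedly served file data benefits from the cache, matching the content-traffic reductions on the results slides.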

Slide 14: Outline
 Background
 Network Interface Data Caching
 Implementation
 Experimental Prototype / Results
 Summary

Slide 15: Prototype Server
 Athlon 2200+ processor, 2 GB RAM
 64-bit, 33 MHz PCI bus (2 Gb/s)
 Two Gigabit Ethernet NICs (4 Gb/s)
– Based on the programmable Tigon 2 controller
– Firmware implements the new API
 FreeBSD 4.6
– 850 lines of new code, 150 lines of kernel changes
 thttpd web server
– High-performance, lightweight web server
– Supports zero-copy sendfile

Slide 16: Results: PCI Traffic
[Chart callouts: PCI saturated; ~1260 Mb/s is the limit; ~60% content traffic; 60% utilization; 1198 Mb/s of HTTP content; 30% overhead.]

Slide 17: Results: PCI Traffic Reduction
[Chart callouts per trace: low temporal reuse; low PCI utilization; good temporal reuse; CPU bottleneck. Overall: 36-57% reduction with four traces.]

Slide 18: Results: World Cup
 Temporal reuse: 84%; PCI utilization: 69%
 57% traffic reduction, 7% throughput increase (794 Mb/s without caching, 849 Mb/s with caching)
 CPU bottleneck

Slide 19: Results: Rice
 Temporal reuse: 40%; PCI utilization: 91%
 40% traffic reduction, 17% throughput increase (1126 Mb/s without caching, 1322 Mb/s with caching)
 Breaks the PCI bottleneck

Slide 20: Results: NASA
 Temporal reuse: 71%; PCI utilization: 95%
 54% traffic reduction, 31% throughput increase (1198 Mb/s without caching, 1571 Mb/s with caching)
 Breaks the PCI bottleneck

Slide 21: Summary
 Network interface data caching
– Exploits web request locality
– Independent of the network protocol
– Independent of the interconnect architecture
– Minimal changes to the OS
 36-57% reductions in PCI traffic
 7-31% increase in server performance
 Peak of 1571 Mb/s of content throughput, surpassing the PCI bottleneck

