Performance Issues in WWW Servers
Erich Nahum, Tsipora Barzilai, and Dilip Kandlur
IBM T.J. Watson Research Center
SIGMETRICS
Presented by Chul Lee, KAIST CORE LAB (Feb)

Contents
– Introduction
– Background
– Experimental Setup/Testbed
– Evaluation Results
– Conclusion and Critique
Introduction
– Performance issues in WWW servers on UNIX-style platforms
– Issues studied
  New socket functions
  Per-byte optimizations
  Per-connection optimizations

Background
Experimental Setup/Testbed
– Hardware
  4 IBM RS/6000 43P workstations (200 MHz PowerPC 604e, 128 MB RAM)
  100 Mbps Ethernet
– Workload generators
  WebStone as a microbenchmark
  SURGE as a macrobenchmark
Experimental Setup/Testbed (cont.)
– OS: AIX with several modifications
– Web server software: Flash (POLL variant)
  Single-threaded, event-driven server
  Fastest and most heavily optimized; exploits almost all of the studied optimizations
  Uses poll() rather than select() (a minimal event-loop sketch follows)
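For readers unfamiliar with the Flash style of server, here is a minimal sketch of what a poll()-based, single-threaded event loop looks like. This is not Flash's actual code: the port number, response body, and array size are placeholders, and a real server keeps a per-connection state machine instead of closing after one read/write.

/* Minimal sketch of a single-threaded, event-driven server loop built on
 * poll().  Illustrative only; see the caveats in the paragraph above. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAX_FDS 1024

int main(void)
{
    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = { 0 };
    addr.sin_family      = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port        = htons(8080);          /* placeholder port */
    bind(listen_fd, (struct sockaddr *)&addr, sizeof(addr));
    listen(listen_fd, 128);

    struct pollfd fds[MAX_FDS];
    int nfds = 1;
    fds[0].fd = listen_fd;
    fds[0].events = POLLIN;

    const char *resp = "HTTP/1.0 200 OK\r\nContent-Length: 2\r\n\r\nok";

    for (;;) {
        /* One poll() call covers the listening socket and all connections;
         * unlike select(), the fd array does not have to be rebuilt and
         * rescanned bit-by-bit on every iteration. */
        if (poll(fds, nfds, -1) < 0)
            continue;

        if (fds[0].revents & POLLIN) {           /* new connection */
            int c = accept(listen_fd, NULL, NULL);
            if (c >= 0 && nfds < MAX_FDS) {
                fds[nfds].fd = c;
                fds[nfds].events = POLLIN;
                fds[nfds].revents = 0;           /* not yet polled */
                nfds++;
            } else if (c >= 0) {
                close(c);
            }
        }

        for (int i = 1; i < nfds; i++) {         /* existing connections */
            if (!(fds[i].revents & POLLIN))
                continue;
            char buf[4096];
            ssize_t n = read(fds[i].fd, buf, sizeof(buf));
            if (n > 0)
                write(fds[i].fd, resp, strlen(resp));
            close(fds[i].fd);                    /* HTTP/1.0: one request per connection */
            fds[i] = fds[--nfds];                /* compact the pollfd array */
            i--;
        }
    }
}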
WebStone benchmark: throughput in operations/sec (figure)
Evaluation Results: New Socket Functions
– acceptex(): combines accept(), getsockname(), and recv() into a single call (the three-call sequence it replaces is sketched below)
– send_file() vs. mmap()/writev()
  send_file() copies data from the file system; mmap()/writev() copies data from user space
  The baseline send_file() performs poorly
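For reference, this is the conventional three-call sequence that an acceptex()-style combined call collapses into one kernel crossing, written with standard POSIX sockets. The AIX acceptex() signature itself is not reproduced here, and the helper name accept_and_read() is purely illustrative.

/* Conventional accept path: three system calls per new connection. */
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns the connected socket (or -1) and stores the bytes read in *nread. */
int accept_and_read(int listen_fd, char *buf, size_t buflen, ssize_t *nread)
{
    struct sockaddr_in peer, local;
    socklen_t plen = sizeof(peer), llen = sizeof(local);

    /* 1. accept(): dequeue the connection and learn the peer address */
    int conn_fd = accept(listen_fd, (struct sockaddr *)&peer, &plen);
    if (conn_fd < 0)
        return -1;

    /* 2. getsockname(): learn the local address the client connected to
     *    (useful, e.g., for IP-based virtual hosting) */
    if (getsockname(conn_fd, (struct sockaddr *)&local, &llen) < 0) {
        close(conn_fd);
        return -1;
    }

    /* 3. recv(): read the first chunk of the HTTP request */
    *nread = recv(conn_fd, buf, buflen, 0);
    if (*nread <= 0) {
        close(conn_fd);
        return -1;
    }
    return conn_fd;   /* caller parses the request and replies on conn_fd */
}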
Evaluation Results: throughput in HTTP ops/sec (figure: new socket functions)
Evaluation Results: Per-Byte Optimizations
– send_file() with an mbuf cache: a close approximation of a zero-copy send path
– Disabling the Internet checksum: requires a network interface that supports checksum offload, so the host CPU does not touch the data at all
(the two transmit paths are contrasted in the sketch below)
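To make the per-byte argument concrete, here is a sketch of the two transmit paths: mmap()/writev(), where every byte is copied from user space into socket buffers, versus a sendfile-style call, where the kernel moves data straight from the file cache. Linux's sendfile(2) is used only as a stand-in; the AIX send_file() measured in the paper has a different interface, and its mbuf cache additionally keeps pre-built copies of hot files in network buffers.

#include <string.h>
#include <sys/mman.h>
#include <sys/sendfile.h>
#include <sys/uio.h>
#include <unistd.h>

/* mmap()/writev(): the file pages are mapped into user space, so the kernel
 * must copy (and checksum) every byte from user memory into socket buffers. */
static int serve_mmap_writev(int sock, int fd, off_t size, const char *hdr)
{
    void *body = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
    if (body == MAP_FAILED)
        return -1;
    struct iovec iov[2] = {
        { .iov_base = (void *)hdr, .iov_len = strlen(hdr) },  /* HTTP header */
        { .iov_base = body,        .iov_len = (size_t)size }, /* file body   */
    };
    ssize_t n = writev(sock, iov, 2);
    munmap(body, size);
    return n < 0 ? -1 : 0;
}

/* sendfile-style: the kernel moves data directly from the file cache to the
 * socket, avoiding the per-byte copy through user space. */
static int serve_sendfile(int sock, int fd, off_t size, const char *hdr)
{
    if (write(sock, hdr, strlen(hdr)) < 0)   /* header still sent separately */
        return -1;
    off_t off = 0;
    while (off < size) {
        ssize_t n = sendfile(sock, fd, &off, size - off);
        if (n <= 0)
            return -1;
    }
    return 0;
}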
Evaluation Results: throughput in HTTP ops/sec (figure: per-byte optimizations)
Evaluation Results: Per-Connection Optimizations
– Combining send_file() with close()
– Piggybacking the FIN
– Delaying the ACK of the FIN
– Delaying the ACK of the SYN-ACK
(a user-level approximation is sketched below)
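The per-connection optimizations above are kernel changes and cannot be reproduced exactly from user space. The sketch below shows only a rough user-level approximation on Linux: TCP_CORK coalesces the response header and body into full segments, and calling close() immediately after the last write gives the stack a chance to send the FIN together with the final data segment. The delayed-ACK changes have no user-space counterpart here.

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>
#include <sys/sendfile.h>
#include <sys/socket.h>
#include <unistd.h>

static void serve_and_close(int sock, int file_fd, off_t size, const char *hdr)
{
    int on = 1, off_flag = 0;

    /* Hold back partial segments while the response is being queued. */
    setsockopt(sock, IPPROTO_TCP, TCP_CORK, &on, sizeof(on));

    write(sock, hdr, strlen(hdr));            /* HTTP response header */
    off_t off = 0;
    while (off < size && sendfile(sock, file_fd, &off, size - off) > 0)
        ;                                     /* file body */

    /* Uncork and close right away; the last data segment and the FIN may
     * then leave together instead of as two separate packets. */
    setsockopt(sock, IPPROTO_TCP, TCP_CORK, &off_flag, sizeof(off_flag));
    close(sock);
}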
Evaluation Results: throughput with the per-connection optimizations (figures)
Evaluation Results: overall performance with SURGE (figure)
Conclusion
– New socket functions: little performance increase from acceptex()
– Per-byte optimizations: observed throughput increases of up to 51%
– Per-connection optimizations: raised server throughput by up to 20%
– Aggregate benefits: improved aggregate server performance by 25%

Conclusion and Critique
– IBM's AIX division has released these features in AIX
– Future work: evaluating these mechanisms with HTTP 1.1 workloads
– Critique
  Good observation and evaluation
  Contributed to the newly released OS
  Considered only throughput as a metric