Published by Marlene Cook. Modified over 9 years ago.
1
Scalable Kernel Performance for Internet Servers under Realistic Loads
Gaurav Banga et al.
Western Research Lab, Research Report 1998/06 (Proceedings of the 1998 USENIX Annual Technical Conference)
Computer Architecture Lab., CS Dept., KAIST
2000/11/ Kim, Sung-Wan
2
1/16 Contents
Introduction
Problems of select() & ufalloc() in event-driven servers
Scalable select() & ufalloc()
Experimental evaluation
Performance of a live system
Conclusions
3
2/16 Introduction
Event-driven servers
  – A single thread manages all connections
  – Lower context-switching and synchronization overhead, so faster than a thread-per-connection or pre-forked system
  – But they perform poorly under realistic conditions because of select() & ufalloc()
select()
  – Used for asynchronous I/O
ufalloc()
  – Allocates a new file descriptor for a process
4
3/16 Problems in select() & ufalloc()
WAN environments
  – Larger round-trip times and more packet losses than LAN environments
  – Many open connections
select()
  – select() -> do_scan() -> selscan() -> soo_select()
  – select_wakeup() -> do_scan() -> selscan() -> soo_select()
  – soo_select() checks whether the selected condition is true for a socket
  – Linear search over all open sockets
ufalloc()
  – Single bitmap (must return the lowest free descriptor number)
  – Too costly
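The ufalloc() cost named on this slide can be illustrated with a minimal sketch. It assumes a map with one bit per descriptor, as the slide suggests; the names fd_bitmap, fd_alloc_linear, fd_free and MAX_FDS are hypothetical and this is not the Digital UNIX code. Because the lowest free number must be returned, every allocation scans from descriptor 0, so the work grows with the number of descriptors already open.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_FDS   4096                 /* hypothetical table size */
#define WORD_BITS 64
#define NWORDS    (MAX_FDS / WORD_BITS)

/* One bit per descriptor: 1 = in use, 0 = free. */
typedef struct {
    uint64_t used[NWORDS];
} fd_bitmap;

/* Return the lowest free descriptor and mark it used, or -1 if full.
 * The scan always starts at descriptor 0, so a process holding many
 * open connections pays for all of them on every allocation. */
int fd_alloc_linear(fd_bitmap *m)
{
    assert(m != NULL);
    for (int fd = 0; fd < MAX_FDS; fd++) {
        uint64_t bit = (uint64_t)1 << (fd % WORD_BITS);
        if ((m->used[fd / WORD_BITS] & bit) == 0) {
            m->used[fd / WORD_BITS] |= bit;
            return fd;
        }
    }
    return -1;
}

/* Mark a descriptor as free again. */
void fd_free(fd_bitmap *m, int fd)
{
    assert(m != NULL && fd >= 0 && fd < MAX_FDS);
    m->used[fd / WORD_BITS] &= ~((uint64_t)1 << (fd % WORD_BITS));
}
```

With several hundred slow WAN connections held open, nearly every call walks past all of them before it finds a free slot.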
5
4/16 Environment
Server
  – AlphaStation 500 (400 MHz), 192 MB of main memory
  – Digital UNIX 4.0B
  – Squid 1.1.11, NetCache 3.1.2c-OSF
Client
  – AlphaStation 500 (333 MHz)
  – Digital UNIX 3.2C
  – S-Client
Network
  – 100 Mbps FDDI
Profiling
  – DCPI
6
5/16 CPU times in unmodified kernel
7
6/16 Scalable select() & ufalloc()
select()
  – READY, INTERESTED, HINTS sets
  – sowakeup() records a hint in the HINTS set of every thread in the referencing processes whose INTERESTED set contains this socket
  – Set updates (∪ union, ∩ intersection, ¬ complement):
      INTERESTED_new = SELECTING ∪ INTERESTED_old
      READY_new = C(INTERESTED_new ∩ (¬INTERESTED_old ∪ READY_old ∪ HINTS))
      READY_to_user = SELECTING ∩ READY_new
ufalloc()
  – 2-level bitmap
  (Figure: example bitmap values 10 and 11110011, labeled "Level 0 map" and "Level 1 map")
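Both ideas on this slide can be sketched in C under some stated assumptions. Sets are modeled as 64-bit masks (bit i stands for descriptor i), C() is taken to be a status check that keeps only the descriptors that are really ready, and HINTS is assumed to be consumed by each call. For the bitmap, the sketch assumes a level-0 bit means "the corresponding level-1 word is full"; the slide does not state the convention. All names (select_with_hints, fd_map2, fd_alloc2, and so on) are hypothetical and none of this is the kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t fdset64;        /* bit i set = descriptor i is in the set */

typedef struct {
    fdset64 interested;          /* INTERESTED */
    fdset64 ready;               /* READY */
    fdset64 hints;               /* HINTS, set by the wakeup path */
} select_state;

/* Stand-in for C(): keep the candidates that are ready in `actual`. */
static fdset64 check_status(fdset64 candidates, fdset64 actual)
{
    return candidates & actual;
}

/* Apply the three set equations once and return READY_to_user.
 * Only new, previously ready, or hinted descriptors get checked. */
fdset64 select_with_hints(select_state *s, fdset64 selecting, fdset64 actual)
{
    assert(s != NULL);
    fdset64 interested_new = selecting | s->interested;
    fdset64 to_check = interested_new & (~s->interested | s->ready | s->hints);
    fdset64 ready_new = check_status(to_check, actual);
    s->interested = interested_new;
    s->ready = ready_new;
    s->hints = 0;                /* assumption: hints are consumed here */
    return selecting & ready_new;
}

#define L1_WORDS 64              /* 64 words * 64 bits = 4096 descriptors */

typedef struct {
    uint64_t level0;             /* bit w set = level1[w] is completely full */
    uint64_t level1[L1_WORDS];   /* bit b of word w = descriptor w*64+b in use */
} fd_map2;

/* Index of the lowest zero bit; x must not be all ones. */
static int lowest_zero(uint64_t x)
{
    int i = 0;
    while (x & 1) { x >>= 1; i++; }
    return i;
}

/* Lowest free descriptor: one level-0 lookup plus one level-1 lookup. */
int fd_alloc2(fd_map2 *m)
{
    assert(m != NULL);
    if (m->level0 == UINT64_MAX) return -1;
    int w = lowest_zero(m->level0);
    int b = lowest_zero(m->level1[w]);
    m->level1[w] |= (uint64_t)1 << b;
    if (m->level1[w] == UINT64_MAX) m->level0 |= (uint64_t)1 << w;
    return w * 64 + b;
}

/* Free a descriptor; its level-1 word can no longer be full. */
void fd_free2(fd_map2 *m, int fd)
{
    assert(m != NULL && fd >= 0 && fd < L1_WORDS * 64);
    m->level1[fd / 64] &= ~((uint64_t)1 << (fd % 64));
    m->level0 &= ~((uint64_t)1 << (fd / 64));
}
```

In this sketch a descriptor that becomes ready between two calls is only noticed if the wakeup path has set its bit in hints, which is why the slide has sowakeup() record hints for every interested thread.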
8
7/16 Experimental Evaluation - Scalability with respect to connection rate
* 750 infinitely slow connections
9
8/16 Experimental Evaluation - Scalability with respect to connection rate
10
9/16 Experimental Evaluation - Scalability with respect to connection count
11
10/16 Performance of a live system
Server
  – A Web proxy system at DEC
  – AlphaStation 500 (500 MHz), 512 MB of RAM
  – Ran the system for an entire day
Proxy
  – Squid
  – NetCache
12
11/16 Performance of a live system - NetCache with caching disabled
13
12/16 Performance of a live system - NetCache with caching disabled
14
13/16 Performance of a live system - NetCache with caching enabled
15
14/16 Performance of a live system - NetCache with caching enabled
16
15/16 Performance of a live system - Squid with caching disabled
17
16/16 Performance of a live system - Squid with caching disabled
18
17/16 Performance of a live system - Squid with caching disabled
19
18/16 Conclusions
WAN delays combined with linear scaling in select() & ufalloc()
  – lead to excessive kernel CPU consumption
Scalable versions
  – improve the performance of Web servers and proxies
20
19/16 select(maxfd, &readfds, &writefds, …, …);
    for (i = 0; i < maxfd; i++) {
        /* Check each open socket for a handler. */
        if (fd_table[i].read_handler) {
            if (fd_table[i].stall_until <= squid_curtime) {
                nfds++;
                FD_SET(i, &readfds);
            }
        }
        if (fd_table[i].write_handler) {
            nfds++;
            FD_SET(i, &writefds);
        }
    }
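A self-contained version of this loop may help show where the per-call cost comes from. The sketch below is not Squid's code: struct fd_entry, TABLE_SIZE, poll_once and noop_handler are hypothetical stand-ins, and the timeout is set to zero so the call never blocks. It rebuilds both descriptor sets by walking every slot below maxfd and then calls the standard POSIX select().

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <time.h>
#include <unistd.h>

#define TABLE_SIZE 64                  /* hypothetical table size */

typedef void (*io_handler)(int fd);

struct fd_entry {
    io_handler read_handler;           /* non-NULL = wants read events */
    io_handler write_handler;          /* non-NULL = wants write events */
    time_t stall_until;                /* reads are deferred until this time */
};

static struct fd_entry fd_table[TABLE_SIZE];

/* Placeholder handler so a slot can be registered. */
void noop_handler(int fd) { (void)fd; }

/* Walk every slot below maxfd, rebuild the read/write sets, then poll.
 * Returns select()'s result; *nfds_out counts the registrations found.
 * The user-level work is proportional to maxfd on every call. */
int poll_once(int maxfd, time_t now, fd_set *readfds, fd_set *writefds,
              int *nfds_out)
{
    assert(maxfd >= 0 && maxfd <= TABLE_SIZE && maxfd <= FD_SETSIZE);
    int nfds = 0;
    FD_ZERO(readfds);
    FD_ZERO(writefds);
    for (int i = 0; i < maxfd; i++) {
        if (fd_table[i].read_handler && fd_table[i].stall_until <= now) {
            nfds++;
            FD_SET(i, readfds);
        }
        if (fd_table[i].write_handler) {
            nfds++;
            FD_SET(i, writefds);
        }
    }
    *nfds_out = nfds;
    struct timeval tv = {0, 0};        /* do not block in this sketch */
    return select(maxfd, readfds, writefds, NULL, &tv);
}
```

A caller would register a handler in fd_table for each open descriptor and pass the highest descriptor plus one as maxfd. The loop runs over every slot even when only a few connections are active, so the work done before the kernel is entered also grows with the connection count.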