Computer Science

Scalability of Linux Event-Dispatch Mechanisms

Abhishek Chandra, University of Massachusetts Amherst
David Mosberger, Hewlett Packard Labs, Palo Alto

Motivation
- Large volume of Web and Internet traffic
- Heavily loaded/accessed Web servers
  - cnn.com, britneyspearsfans.com, …
  - Starr Report, Napster ruling, ...
- Challenge: make Web servers scalable
[Figure: clients, WAN, Web server]

Server Scalability Issues
- Large number of concurrent/idle connections
  - Last-mile problem: slow end-connections
  - High-latency WAN traffic
  - HTTP/1.1 persistent connections
- Heavy request loads
  - Need for high throughput
- Pure thread-based vs. event-based servers
- Focus: scalability of event-based servers on Linux

Outline
- Motivation
- Event-based Servers
- Linux Event-Dispatch Mechanisms
- Evaluation: Handling concurrent connections
- RT signals and Signal-per-fd enhancement
- Evaluation: Handling request load
- Concluding Remarks

Event-Based Servers
- Server specifies interest set to kernel
- Kernel notifies server of an event on a connection
- Server handles I/O on the connection
[Figure: kernel, server, connections, interest set. Steps: 1. Interest set specification, 2. Network event, 3. Event notification, 4. I/O handling]

Linux Event-Dispatch Mechanisms
- select() system call
- poll() system call
- /dev/poll interface
- POSIX.4 Real-Time Signals

select() system call
- Interest set specified on each call
- Notification requires scan of interest set
[Figure: kernel, server, connections; interest set, scan, ready set]

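The two points above can be seen in a minimal sketch using Python's `select.select()`, which wraps the same system call. The helper name `ready_to_read` and the use of local socket pairs as stand-ins for client connections are assumptions made for illustration.

```python
import select
import socket

def ready_to_read(interest_set, timeout=1.0):
    """Return the sockets in interest_set that select() reports readable."""
    # The complete interest set is handed to the kernel on every call,
    # and the kernel scans all of it to build the ready set.
    readable, _, _ = select.select(interest_set, [], [], timeout)
    return readable

# Three hypothetical connections, modelled with local socket pairs.
pairs = [socket.socketpair() for _ in range(3)]
interest_set = [server_end for server_end, _ in pairs]

# Only the second connection receives a request.
pairs[1][1].sendall(b"GET / HTTP/1.0\r\n\r\n")

ready = ready_to_read(interest_set)
ready_indices = [interest_set.index(s) for s in ready]
requests = [s.recv(4096) for s in ready]

for server_end, client_end in pairs:
    server_end.close()
    client_end.close()
```

Note that `interest_set` is rebuilt into kernel form on each call even though only one of the three connections is active; with thousands of mostly idle connections that per-call cost dominates.
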
poll() and /dev/poll
- poll()
  - Interest set: list of pollfd structures
  - Better for sparse interest sets, worse for dense sets
  - Notification requires scan of interest set
- /dev/poll
  - Interest set specified incrementally
  - More compact ready set

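A short sketch of the poll() style, using Python's `select.poll` object. Each `register()` call corresponds to one pollfd entry; the registered list is still passed to the kernel and scanned on every `poll()` call. The socket pairs and variable names are illustrative assumptions. /dev/poll is not shown, since Python's `select` module does not expose it on Linux.

```python
import select
import socket

# Three hypothetical connections, modelled with local socket pairs.
pairs = [socket.socketpair() for _ in range(3)]
by_fd = {server_end.fileno(): server_end for server_end, _ in pairs}

poller = select.poll()
for fd in by_fd:
    # One pollfd-style entry (descriptor plus requested events) per connection.
    poller.register(fd, select.POLLIN)

# The first and third connections become readable.
pairs[0][1].sendall(b"a")
pairs[2][1].sendall(b"c")

# poll() takes its timeout in milliseconds and returns (fd, revents) pairs
# only for descriptors that have events pending.
events = poller.poll(1000)
ready_fds = sorted(fd for fd, revents in events if revents & select.POLLIN)
expected_fds = sorted([pairs[0][0].fileno(), pairs[2][0].fileno()])

for server_end, client_end in pairs:
    server_end.close()
    client_end.close()
```
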
POSIX.4 Real-Time Signals
- RT signals are queued
  - Multiple signals of same type can be delivered
- RT signals carry a data payload (siginfo)
  - Provides the context of the signal
- sigwaitinfo() system call: dequeues signals
  - Avoids overhead of calling signal handler
  - Signal can be blocked

Using Real-Time Signals for Network I/O
- Interest set specified incrementally
- No scanning of interest set required
[Figure: kernel, server, sockets, interest set, signal queue. Steps: 1. Associate RT signal, 2. Network event, 3. Enqueue signal, 4. sigwaitinfo(), 5. Dequeue signal]

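The five steps in the figure can be sketched on Linux with `fcntl()` and Python's `signal` module. This is a minimal single-threaded sketch, not the server code from the talk: the names `IO_SIGNAL`, `arm_rt_signal` and `wait_for_event` are hypothetical, `SIGRTMIN` is an arbitrary choice of RT signal, and `sigtimedwait()` is used in place of `sigwaitinfo()` only so the example cannot block forever. A C server would read the descriptor from `si_fd` in the siginfo; Python's siginfo object does not expose that field, so only `si_signo` and `si_band` are available here.

```python
import fcntl
import os
import signal
import socket

IO_SIGNAL = signal.SIGRTMIN  # assumption: any unused RT signal would do

def arm_rt_signal(sock, signo=IO_SIGNAL):
    """Step 1: associate an RT signal with the socket (add it to the interest set)."""
    fd = sock.fileno()
    fcntl.fcntl(fd, fcntl.F_SETOWN, os.getpid())  # deliver signals to this process
    fcntl.fcntl(fd, fcntl.F_SETSIG, signo)        # queue an RT signal instead of SIGIO
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_ASYNC | os.O_NONBLOCK)

def wait_for_event(signo=IO_SIGNAL, timeout=1.0):
    """Steps 4 and 5: dequeue one signal; returns None if none arrives in time."""
    return signal.sigtimedwait({signo}, timeout)

# Block the signal so events are queued rather than run through a handler.
signal.pthread_sigmask(signal.SIG_BLOCK, {IO_SIGNAL})

server_end, client_end = socket.socketpair()
arm_rt_signal(server_end)

client_end.sendall(b"ping")      # steps 2 and 3: network event, kernel enqueues signal
info = wait_for_event()          # no scan of the interest set is needed
data = server_end.recv(4096)     # handle the I/O

# Clean up, then drain any signals queued while closing.
server_end.close()
client_end.close()
while signal.sigtimedwait({IO_SIGNAL}, 0) is not None:
    pass
```

The interest set is built incrementally, one `arm_rt_signal()` call per socket, and the cost of `wait_for_event()` does not depend on how many sockets have been armed.
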
Evaluation: Handling Concurrent Connections
- Dispatch overhead and latency as a function of number of concurrent connections
- Experimental setup
  - 400 MHz P3 Linux test7 server
  - μ-server using select(), /dev/poll or RT signals
  - 10 clients running httperf
  - Fixed request rate, increasing number of connections

Server CPU Usage
[Plot: server CPU usage, 500 req/s]
- RT signal overhead independent of number of concurrent connections

Response Time
[Plot: response time, 500 req/s]
- RT signal response time independent of number of concurrent connections

Limitations of Real-Time Signals
- Signal queue overflow: new events lost
  - Can lead to "hung server"
- Unfair allocation of signal queue
[Figure: kernel, server, interest set, sockets, signal queue; a network event is dropped]

Handling Signal Queue Overflow
- Fallback mechanism
  - select(), poll(), etc.
  - Reconstruct current state
- Issues
  - Server complexity
  - Overhead of maintaining explicit interest sets
  - Potential performance penalty

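The recovery step might look like the following sketch. It assumes the server has already detected the overflow (on Linux the kernel falls back to delivering plain SIGIO when the RT signal queue is full) and shows only how an explicitly maintained interest set lets one `poll()` scan rebuild the ready set. The class `InterestSet` and its methods are hypothetical names, and socket pairs again stand in for client connections.

```python
import select
import socket

class InterestSet:
    """Explicit user-space record of every socket the server watches.

    With RT signals alone this bookkeeping is unnecessary; it exists only
    so that the server can fall back to poll() after an overflow.
    """

    def __init__(self):
        self._socks = {}

    def add(self, sock):
        self._socks[sock.fileno()] = sock

    def remove(self, sock):
        self._socks.pop(sock.fileno(), None)

    def recover_ready(self):
        """Rebuild the ready set with a single non-blocking poll() scan."""
        poller = select.poll()
        for fd in self._socks:
            poller.register(fd, select.POLLIN)
        return [self._socks[fd]
                for fd, revents in poller.poll(0)
                if revents & select.POLLIN]

# Four hypothetical connections; events on two of them are assumed lost.
pairs = [socket.socketpair() for _ in range(4)]
interest = InterestSet()
for server_end, _ in pairs:
    interest.add(server_end)

pairs[1][1].sendall(b"x")
pairs[3][1].sendall(b"y")

recovered = interest.recover_ready()
recovered_fds = sorted(s.fileno() for s in recovered)
expected_fds = sorted([pairs[1][0].fileno(), pairs[3][0].fileno()])

for server_end, client_end in pairs:
    server_end.close()
    client_end.close()
```

The extra class and the full scan in `recover_ready()` illustrate the issues listed above: more server code, an interest set kept in two places, and a scan whose cost grows with the number of connections.
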
RT Signal Enhancement: Signal-per-fd
- Goals
  - Avoid signal queue overflows
  - Fair allocation of signal queue
- Solution: enqueue only one signal per socket
[Figure: kernel, server, interest set, sockets, signal queue; a repeated network event for a socket is discarded]

Signal-per-fd
- Idea
  - Signal queue length same as fdset size
  - Bitmap used to efficiently determine presence/absence of signal in queue
- Advantages
  - Simpler server implementation
  - No signal queue overflows
  - No need for fallback mechanisms
  - Fair allocation of signal queue resource
  - Avoids too fine-grained event notification
    - Coalesce multiple events for a socket

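The idea can be illustrated with a toy user-space model. This is not the kernel patch: the classes `PlainSignalQueue` and `SignalPerFdQueue`, the queue capacity of 4, and the event sequence are all assumptions chosen to show the difference between a bounded queue that drops events and a per-descriptor queue guarded by a bitmap.

```python
from collections import deque

class PlainSignalQueue:
    """Bounded queue: every event takes a slot, extra events are dropped."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = deque()
        self.dropped = 0

    def post(self, fd):
        if len(self.queue) >= self.capacity:
            self.dropped += 1          # overflow: the event is lost
            return False
        self.queue.append(fd)
        return True

class SignalPerFdQueue:
    """At most one queued signal per descriptor, tracked with a bitmap."""

    def __init__(self, max_fds):
        self.max_fds = max_fds
        self.pending = 0               # bit fd is set while fd has a queued signal
        self.queue = deque()

    def post(self, fd):
        if not 0 <= fd < self.max_fds:
            raise ValueError("descriptor out of range")
        bit = 1 << fd
        if self.pending & bit:
            return False               # coalesced with the signal already queued
        self.pending |= bit
        self.queue.append(fd)
        return True

    def take(self):
        fd = self.queue.popleft()
        self.pending &= ~(1 << fd)     # fd may be queued again from now on
        return fd

# A busy socket (fd 3) generates most of the events; fds 5 and 7 are quiet.
events = [3, 3, 3, 3, 5, 3, 7]

plain = PlainSignalQueue(capacity=4)
per_fd = SignalPerFdQueue(max_fds=16)
for fd in events:
    plain.post(fd)
    per_fd.post(fd)

plain_contents = list(plain.queue)
per_fd_contents = list(per_fd.queue)
first = per_fd.take()
requeued = per_fd.post(3)
```

In this run the plain queue fills up with events for fd 3 and drops the events for fds 5 and 7, while the per-descriptor queue holds one entry for each of the three sockets and can never hold more than `max_fds` entries.
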
Server Throughput
[Plot: server throughput, 6000 idle connections]
- Linear scaling of RT signals, signal-per-fd

Server CPU Usage
[Plot: server CPU usage, 6000 idle connections]
- Linear scaling of RT signals, signal-per-fd

Related Work
- Event-delivery API [BMD99]
- Performance studies: select() [BM98], /dev/poll [PL00], RT signals [PLT00]
- Web servers
  - Event-based: Flash [PDZ99], phhttpd [Brown99]
  - In-kernel: TUX, khttpd, AFPA [JKNRT01]
- Future: Linux 2.5 asynchronous I/O?

Summary
- Scalability issues with Linux event-dispatch mechanisms
- Real-time signals are scalable
  - Performance independent of number of concurrent connections
  - Signal queue overflow problems
- Signal-per-fd enhancement
  - Potentially improves performance
  - Reduces server complexity
  - Provides fairness
- Patch available at