Lazy Asynchronous I/O for Event-Driven Servers. Khaled Elmeleegy, Anupam Chanda, and Alan L. Cox, Department of Computer Science, Rice University, Houston, Texas. Willy Zwaenepoel, School of Computer and Communication Sciences, EPFL, Lausanne, Switzerland.
2 Event-Driven Architecture: Event-driven architecture is widely used for servers, for its performance and scalability.
3 Problem: Developing event-driven servers is hard for many reasons. We focus on problems with non-blocking I/O: incomplete coverage and the burden of state maintenance.
4 Lazy Asynchronous I/O (LAIO): Addresses problems with non-blocking I/O. Universality: covers all I/O operations. Simplicity: requires less code. High performance for event-driven servers: meets or exceeds alternatives.
5 Outline: Background. Lazy Asynchronous I/O (LAIO). Evaluation. Conclusions.
6 Outline: Background: event-driven servers. Lazy Asynchronous I/O (LAIO). Evaluation. Conclusions.
7 Event-Driven Servers: The event loop processes incoming events. For each incoming event, it dispatches that event's handler. There is a single thread of execution. Diagram: Event Loop dispatching to Handler #1, Handler #2, ..., Handler #k.
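The dispatch pattern on this slide can be sketched in a few lines of Python. This is a minimal illustration, not code from Flash or LAIO: the function `run_event_loop`, the event kinds, and the handlers are all hypothetical names chosen for the example.

```python
from collections import deque

def run_event_loop(events, handlers):
    """Dispatch each queued (kind, payload) event to its handler, one at a time."""
    log = []
    queue = deque(events)
    while queue:
        kind, payload = queue.popleft()
        handler = handlers[kind]        # look up the handler for this event type
        log.append(handler(payload))    # run it to completion on the single thread
    return log

# Hypothetical handlers, for illustration only.
handlers = {
    "accept": lambda conn: f"accepted {conn}",
    "read": lambda conn: f"read request on {conn}",
}
events = [("accept", "c1"), ("read", "c1"), ("accept", "c2")]
result = run_event_loop(events, handlers)
```

Because every handler runs on the same thread, no handler starts until the previous one returns, which is why a handler that blocks holds up the whole server.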
8 Event Handler (flowchart): I/O operation (network/disk) -> complete handling event -> to event loop.
9 Event Handler (flowchart): I/O operation (network/disk), which may block -> complete handling event -> to event loop. If the I/O operation blocks, the server stalls.
10 Outline: Background. Lazy Asynchronous I/O (LAIO): API, example of usage, implementation. Evaluation. Conclusions.
11 LAIO API (return type, function name, parameters): int laio_syscall(int number, ...); void* laio_gethandle(void); int laio_poll(laio_completion[] completions, int ncompletions, timespec* ts).
12 laio_syscall(): Lazily converts any system call into an asynchronous call. If the system call doesn't block, laio_syscall() returns immediately with the return value of the system call. If it blocks, laio_syscall() returns immediately with return value -1 and errno set to EINPROGRESS, and the call continues as a background LAIO operation.
13 laio_gethandle(): Returns a handle representing the last issued LAIO operation. If the operation didn't block, NULL is returned.
14 laio_poll(): Returns a count of completed background LAIO operations. Fills an array with completion entries, one for each operation. Each completion entry has a handle, a return value, and an error value.
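The contract of the three calls can be modelled with a small Python toy. This is only a sketch of the semantics listed on the slides, not the LAIO library, whose API is in C. The class `ToyLaio`, the `Completion` record, and the explicit `blocks` flag are assumptions made for illustration; in LAIO the kernel detects blocking, and in this toy a background operation simply "completes" when `poll()` is called.

```python
import errno
from dataclasses import dataclass

@dataclass
class Completion:
    handle: int   # which background operation finished
    retval: int   # the system call's return value
    error: int    # the system call's error value (0 = none)

class ToyLaio:
    """Toy model of the laio_syscall / laio_gethandle / laio_poll contract."""

    def __init__(self):
        self._next_handle = 1
        self._last_handle = None
        self._background = []   # (handle, operation) pairs still in progress
        self.errno = 0

    def syscall(self, operation, blocks):
        if not blocks:
            # Common case: the call finishes at once and its value is returned.
            self._last_handle = None
            self.errno = 0
            return operation()
        # Blocking case: return -1 / EINPROGRESS and continue in the background.
        handle = self._next_handle
        self._next_handle += 1
        self._background.append((handle, operation))
        self._last_handle = handle
        self.errno = errno.EINPROGRESS
        return -1

    def gethandle(self):
        # None plays the role of NULL: the last operation did not block.
        return self._last_handle

    def poll(self):
        # Assumption: every background operation has finished by now.
        done = [Completion(h, op(), 0) for h, op in self._background]
        self._background.clear()
        return len(done), done

laio = ToyLaio()
fast = laio.syscall(lambda: 42, blocks=False)
fast_handle = laio.gethandle()
slow = laio.syscall(lambda: 7, blocks=True)
slow_errno = laio.errno
slow_handle = laio.gethandle()
count, completions = laio.poll()
```

The caller sees one code path for both cases: check the return value, and only when it is -1 with EINPROGRESS fetch the handle and wait for a completion entry.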
15 Event Handler With LAIO (flowchart): laio_syscall() -> operation completed? Yes: complete handling event, then to event loop. No: to event loop. If the operation completes, the handler continues execution.
16 Event Handler With LAIO (flowchart): laio_syscall() -> operation completed? Yes: complete handling event, then to event loop. No: handle = laio_gethandle(), then to event loop. If the operation doesn't complete, laio_syscall() returns immediately, the handler records the LAIO handle and returns to the event loop, and the completion notification arrives later.
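The two branches of this flowchart can be sketched as a handler that either finishes immediately or parks a continuation under the LAIO handle. The functions prefixed `fake_` are stand-ins written for this example, not the real LAIO calls, and `handle_write`, `finish`, and `event_loop_step` are hypothetical names. The toy also ignores errno and treats -1 alone as "in progress".

```python
from itertools import count

_handles = count(1)
_last = {"handle": None}
background = []   # (handle, retval) pairs for operations that "blocked"
pending = {}      # LAIO handle -> continuation to run on completion

def fake_laio_syscall(result, blocks):
    """Stand-in: return the result, or -1 if the call would have blocked."""
    if blocks:
        handle = next(_handles)
        background.append((handle, result))
        _last["handle"] = handle
        return -1
    _last["handle"] = None
    return result

def fake_laio_gethandle():
    return _last["handle"]

def fake_laio_poll():
    """Stand-in: report every background operation as completed."""
    done = list(background)
    background.clear()
    return done

def finish(conn, nbytes, log):
    log.append(f"{conn}: sent {nbytes} bytes")

def handle_write(conn, nbytes, blocks, log):
    ret = fake_laio_syscall(nbytes, blocks)
    if ret != -1:
        finish(conn, ret, log)            # completed: handler continues
        return
    handle = fake_laio_gethandle()        # did not complete: record the handle
    pending[handle] = lambda retval: finish(conn, retval, log)
    # ...and return to the event loop

def event_loop_step(log):
    # Completion notifications arrive later and resume the parked handlers.
    for handle, retval in fake_laio_poll():
        pending.pop(handle)(retval)

log = []
handle_write("c1", 100, blocks=False, log=log)
handle_write("c2", 200, blocks=True, log=log)
before = list(log)
event_loop_step(log)
```

Only the handle and a continuation are kept between iterations; the handler does not track how far the operation has progressed.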
17 LAIO Implementation: LAIO uses scheduler activations. With scheduler activations, the kernel delivers an upcall when an operation blocks (laio_syscall()) or unblocks (laio_poll()).
18 laio_syscall() – Non-blocking case (flowchart): The application calls laio_syscall(). Library: save context, enable upcalls, issue operation. System call blocks? No: disable upcalls and return retval.
19 laio_syscall() – Blocking case (flowchart): The application calls laio_syscall(). Library: save context, enable upcalls, issue operation. System call blocks? Yes.
20 laio_syscall() – Blocking case (flowchart): The application calls laio_syscall(). Library: save context, enable upcalls, issue operation. System call blocks? Yes: the kernel delivers an upcall on a new thread, and the blocked laio_syscall continues in the background.
21 laio_syscall() – Blocking case (flowchart): The application calls laio_syscall(). Library: save context, enable upcalls, issue operation. System call blocks? Yes: the kernel delivers an upcall on a new thread, and the blocked call becomes a background LAIO operation. Library upcall handler: steals the old stack using the stored context.
22 laio_syscall() – Blocking case (flowchart): The application calls laio_syscall(). Library: save context, enable upcalls, issue operation. System call blocks? Yes: the kernel delivers an upcall on a new thread, and the blocked call becomes a background LAIO operation. Library upcall handler: steals the old stack using the stored context, disables upcalls, sets errno = EINPROGRESS, and returns -1.
23 Unblocking case: The background LAIO operation completes and its thread dies. The kernel delivers an upcall on the current thread. The library's upcall handler constructs a completion structure (LAIO operation handle, system call return value, error code) and adds it to the list of completions. The application retrieves the list of completions using laio_poll().
24 Outline: Background. Lazy Asynchronous I/O (LAIO). Evaluation: methodology, experiments, LAIO vs. conventional non-blocking I/O, LAIO vs. AMPED, programming complexity. Conclusions.
25 Methodology: Flash web server. Intel Xeon 2.4 GHz with 2 GB memory. Gigabit Ethernet between machines. FreeBSD 5.2-CURRENT. Two web workloads: Rice (1.1 GB footprint) and Berkeley (6.4 GB footprint).
26 Experiments: LAIO vs. conventional non-blocking I/O. Compare performance.
Server-Network-Disk | Threaded | Blocking operations
Flash-NB-B | Single | Disk I/O
Flash-NB-AIO | Single | Disk I/O other than read and write
Flash-LAIO-LAIO | Single | None
27 Performance: Large Workload
28 Performance: Small Workload
29 Why Be Lazy? Most potentially blocking operations don't actually block. Experiments: 73% - 86% of such operations don't block. Lower overhead. Micro-benchmark: AIO is 3.2 times slower than LAIO for non-blocking operations.
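The two figures on this slide can be turned into simple arithmetic. The sketch below only restates the slide's numbers; the 100-call batch and the choice of one cost unit per LAIO call are assumptions made for illustration, and nothing here estimates end-to-end server throughput.

```python
# Figures quoted on the slide.
non_blocking_share = (0.73, 0.86)   # fraction of potentially blocking calls that don't block
aio_slowdown = 3.2                  # AIO cost relative to LAIO when the call doesn't block

# Share of calls that do block and therefore become background operations.
blocking_share = tuple(round(1.0 - s, 2) for s in non_blocking_share)

# Relative cost of 100 calls that don't block, taking one LAIO call as 1 unit (assumption).
calls = 100
laio_cost = calls * 1.0
aio_cost = calls * aio_slowdown
```

So between 14% and 27% of potentially blocking calls ever need the asynchronous path, while the rest pay only the cheaper cost of a call that returns immediately.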
30 Experiments: LAIO vs. AMPED. Compare performance.
Server-Network-Disk | Threaded | Blocking operations
Flash-LAIO-LAIO | Single | None
Flash-NB-AMPED | Process-based helpers | None
31 Asymmetric Multi-process Event-Driven (AMPED) Architecture: Event-driven core. Potentially blocking I/O is handed off to a helper. The helper can block because it runs on a separate thread of execution. Diagram: Event Loop with Handler #1, Handler #2, ..., Handler #k and Helper #1, Helper #2, ..., Helper #n.
32 Flash-NB-AMPED: Stock version of Flash. Two relevant helpers. File read: reads a file from disk. Name conversion: checks for a file's existence and permissions.
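The helper hand-off in AMPED can be sketched with a small worker pool. This is an illustration under stated assumptions, not Flash's implementation: Flash uses process-based helpers, whereas this sketch uses threads from Python's `concurrent.futures` for brevity, and `read_file_helper` and `serve` are hypothetical names.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor, as_completed

def read_file_helper(path):
    """Helper: may block on disk, so it runs off the event-loop thread."""
    with open(path, "rb") as f:
        return f.read()

def serve(paths):
    """Event-driven core: hand reads to helpers, then consume completions."""
    responses = {}
    with ThreadPoolExecutor(max_workers=2) as helpers:
        # Hand each potentially blocking read to a helper.
        in_flight = {helpers.submit(read_file_helper, p): p for p in paths}
        # Treat each finished helper as a completion event for the core.
        for future in as_completed(in_flight):
            responses[in_flight[future]] = len(future.result())
    return responses

# Usage: serve two small temporary files and record their sizes.
with tempfile.TemporaryDirectory() as d:
    paths = []
    for name, data in (("a.html", b"hello"), ("b.html", b"hi")):
        p = os.path.join(d, name)
        with open(p, "wb") as f:
            f.write(data)
        paths.append(p)
    sizes = sorted(serve(paths).values())
```

The cost of this design is the extra machinery: the helpers themselves, the hand-off, and the bookkeeping that maps each completion back to its request.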
33 Performance of LAIO vs. AMPED
34 Performance of LAIO vs. AMPED
35 Programming Complexity: Where performance is comparable (Flash-LAIO-LAIO vs. Flash-NB-AMPED), Flash-LAIO-LAIO is simpler: no helpers and no state maintenance.
36 State Maintenance With Non-blocking I/O (handler flowchart): I/O operation -> operation completed? Yes: complete handling event, then to event loop. No: to event loop. If the operation does not complete, the handler returns to the event loop.
37 State Maintenance With Non-blocking I/O (handler flowchart): I/O operation -> operation completed? Yes: complete handling event, then to event loop. No: to event loop. If the operation does not complete, the handler returns to the event loop and the operation is continued later.
38 State Maintenance With Non-blocking I/O (handler flowchart): I/O operation -> operation completed? Yes: complete handling event, then to event loop. No: store state, then to event loop. If the operation does not complete, the handler returns to the event loop and the operation is continued later. This may require storing state.
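A partial write on a non-blocking socket is a concrete case of the state the slide refers to. The sketch below uses Python's standard `socket` module; the `PendingWrite` class and the 4 MB payload are assumptions chosen for illustration, and the code is not taken from Flash.

```python
import socket

class PendingWrite:
    """State that must survive between event-loop iterations."""

    def __init__(self, sock, data):
        self.sock = sock
        self.data = memoryview(data)
        self.offset = 0            # how much has been written so far

    def resume(self):
        """Write as much as possible; return True once everything is sent."""
        while self.offset < len(self.data):
            try:
                self.offset += self.sock.send(self.data[self.offset:])
            except BlockingIOError:
                return False       # buffer full: keep the state, try again later
        return True

a, b = socket.socketpair()
a.setblocking(False)
payload = b"x" * (4 * 1024 * 1024)   # large enough to overfill typical socket buffers

write = PendingWrite(a, payload)
received = 0
rounds = 0
done = write.resume()
while not done:
    received += len(b.recv(65536))   # the peer drains some data
    done = write.resume()            # the "operation is continued later"
    rounds += 1
while received < len(payload):       # drain whatever is still buffered
    received += len(b.recv(65536))
a.close()
b.close()
```

The handler has to carry the buffer and the offset across iterations itself. With LAIO the write either finishes or continues in the background, so this bookkeeping is not needed.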
39 Lines of Code Comparison. Table columns: Component; Flash-NB-AMPED lines of code; Flash-LAIO-LAIO lines of code. Rows: total code size; file read; name conversion; partial-write state maintenance. One cell value, 700, is preserved. Result: 9.5% reduction in lines of code.
40 Conclusions: LAIO is universal: it supports all system calls. LAIO is simpler: used uniformly, no state maintenance, no helpers, fewer lines of code. LAIO meets or exceeds the performance of other alternatives.
41 LAIO source can be found at: