Eliminating Receive Livelock in an Interrupt-Driven Kernel
J. C. Mogul and K. K. Ramakrishnan
Presented by I. Kim, 01/04/13
Introduction
Interrupt vs. Polling
  – Interrupts are designed for relatively slow I/O devices
Target applications
  – Host-based routing, passive network monitoring, network file systems
  – High event rates, and protocols without a flow-control mechanism
Receive Livelock
  – Interrupt handlers consume all system resources handling input events
  – Normal kernel and user threads are starved
Requirements
High throughput
  – Maximum Loss Free Receive Rate (MLFRR)
Low latency and jitter
Fair allocation of resources
  – Packet reception, protocol processing, transmission, and other tasks
Overall stability
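The MLFRR can be read off a measured throughput curve as the highest offered input rate at which no packets are lost. The following is a minimal sketch under assumptions: the function name `mlfrr`, the `tolerance` parameter, and the sample numbers are hypothetical and are not taken from the paper's measurements.

```python
def mlfrr(samples, tolerance=0.0):
    """Return the highest offered rate that is still loss free.

    samples:   iterable of (input_rate, output_rate) pairs, in packets/sec.
    tolerance: allowed fractional loss (hypothetical knob; 0.0 = strictly loss free).
    """
    best = 0.0
    for offered, delivered in sorted(samples):
        if delivered >= offered * (1.0 - tolerance):
            best = offered
        else:
            break  # first lossy rate: everything above it is past the MLFRR
    return best


# Made-up numbers for illustration only (not measured results).
curve = [(1000, 1000), (2000, 2000), (3000, 3000), (4000, 3500), (5000, 2000)]
print(mlfrr(curve))
```

With this made-up curve, output matches input up to 3000 packets/sec and falls behind above that, so 3000 is reported as the MLFRR.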
Traditional Interrupt-Driven System (4.2 BSD)
Packet arrival -> interrupt -> device driver (buffer management + data-link processing) -> software interrupt
Several queues between steps
  – Overload => queue overflow => drop
Batch processing of bursty traffic
Consequences: receive livelock, longer latency, and starvation of transmit processing
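The livelock behavior of this pipeline can be reproduced with a toy model. The sketch below assumes a fixed CPU budget per time tick, receive-interrupt work that always runs first, and forwarding work that only gets the leftover cycles. All names and cost values (`cpu_per_tick`, `rx_cost`, `fwd_cost`, `queue_limit`) are hypothetical and chosen only to make the effect visible.

```python
def simulate_interrupt_driven(arrivals_per_tick, ticks=1000,
                              cpu_per_tick=100, rx_cost=2, fwd_cost=3,
                              queue_limit=50):
    """Return the average number of packets forwarded per tick."""
    queue = 0        # packets waiting between the driver and forwarding
    delivered = 0
    for _ in range(ticks):
        budget = cpu_per_tick

        # Interrupt level: highest priority, so it is served first.
        handled = min(arrivals_per_tick, budget // rx_cost)
        budget -= handled * rx_cost

        # Packets that find the queue full are dropped AFTER the CPU
        # has already paid rx_cost for them (wasted work).
        accepted = min(handled, queue_limit - queue)
        queue += accepted

        # Lower-priority forwarding only gets what is left.
        forwarded = min(queue, budget // fwd_cost)
        queue -= forwarded
        delivered += forwarded
    return delivered / ticks


for rate in (10, 20, 30, 40, 50):
    print(rate, simulate_interrupt_driven(rate))
```

With these assumed costs, delivered throughput tracks the input rate up to 20 packets per tick, then declines as interrupt work crowds out forwarding, and reaches zero at 50 packets per tick, when the interrupt level consumes the entire budget.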
Latency induced by interrupt and batch processing
Avoiding Livelock
Limit the interrupt rate (adapt between interrupts and polling)
  – Drop excess packets at reception
Disable interrupts temporarily and re-enable them later
  – High/low watermarks on CPU occupancy
Polling
  – Round-robin through registered devices
No preemption while processing a packet that has already been received
  – Ensure work conservation
  – Do almost nothing in the receive interrupt handler
  – Remove almost all queues from the IP processing chain
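The combination of a minimal interrupt handler, round-robin polling, and run-to-completion processing can be sketched as follows. This is an illustrative user-space model, not kernel code: `Device`, `poll_once`, the `quota` parameter, and the ring size are hypothetical names and values.

```python
from collections import deque


class Device:
    def __init__(self, name, ring_size=32):
        self.name = name
        self.rx_ring = deque()
        self.ring_size = ring_size
        self.interrupts_enabled = True
        self.needs_service = False
        self.dropped = 0

    def packet_arrives(self, pkt):
        if len(self.rx_ring) >= self.ring_size:
            self.dropped += 1      # dropped at the interface, before any CPU work
            return
        self.rx_ring.append(pkt)
        if self.interrupts_enabled:
            self.rx_interrupt()

    def rx_interrupt(self):
        # The handler only masks further interrupts and flags the device.
        self.interrupts_enabled = False
        self.needs_service = True


def poll_once(devices, quota, process):
    """One round-robin pass; returns the number of packets processed."""
    done = 0
    for dev in devices:
        if not dev.needs_service:
            continue
        for _ in range(quota):
            if not dev.rx_ring:
                break
            process(dev.name, dev.rx_ring.popleft())   # runs to completion
            done += 1
        if not dev.rx_ring:
            # Nothing pending: go back to interrupt-driven mode.
            dev.needs_service = False
            dev.interrupts_enabled = True
    return done


a, b = Device("a"), Device("b")
for i in range(5):
    a.packet_arrives(i)
for i in range(2):
    b.packet_arrives(i)

out = []
while poll_once([a, b], quota=2, process=lambda name, pkt: out.append((name, pkt))):
    pass
print(out)
```

In this run the quota keeps device `a` from monopolizing the polling pass: device `b` is served after two packets from `a`, and interrupts on each device are re-enabled only once its ring is empty.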
Measurements
Methodology
  – IP packet router configuration
  – Router: DEC 3000/300 Alpha running Digital UNIX 3.2 (OSF/1), a rather slow machine
  – Packet generator: DEC 3000/400
  – 10,000 UDP packets with 4-byte payloads
  – Output counted from the netstat output-packet counter (Opkts)
  – With and without screend (a user-level packet filter)
Receive Livelock Example
Other Scheduling Heuristics
Quota on packets processed per burst
Feedback from a full queue
Cycle limit on the device driver
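Two of these heuristics, queue feedback and a cycle limit, can be sketched as small gates that tell the polling code whether input processing may continue. The class names, the 25% resume threshold, and the cycle figures are assumptions made for illustration, not values taken from this deck.

```python
class QueueFeedback:
    """Inhibit input when a downstream queue fills; resume after it drains."""

    def __init__(self, capacity, resume_fraction=0.25):
        self.capacity = capacity
        self.resume_level = int(capacity * resume_fraction)  # assumed low watermark
        self.depth = 0
        self.input_enabled = True

    def enqueue(self):
        if self.depth >= self.capacity:
            return False                   # queue full: caller must drop
        self.depth += 1
        if self.depth >= self.capacity:
            self.input_enabled = False     # full: stop pulling in more input
        return True

    def dequeue(self):
        if self.depth == 0:
            return False
        self.depth -= 1
        if not self.input_enabled and self.depth <= self.resume_level:
            self.input_enabled = True      # drained enough: resume input
        return True


class CycleLimit:
    """Cap the share of each period that packet processing may consume."""

    def __init__(self, period_cycles, max_fraction):
        self.limit = period_cycles * max_fraction
        self.used = 0

    def charge(self, cycles):
        """Record work done; return True while processing may continue."""
        self.used += cycles
        return self.used < self.limit

    def new_period(self):
        self.used = 0


q = QueueFeedback(capacity=8)
for _ in range(8):
    q.enqueue()
print(q.input_enabled)      # input is inhibited once the queue is full
```

The gap between the full mark and the resume level gives hysteresis, so input processing does not flap on and off around a single threshold.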
Other Performance Results
End-system transport protocols
  – Improves user-level delivery performance
Promiscuous network monitor
Concluding Remarks
Related and future work
  – Clocked interrupts
  – 4.3 BSD terminal I/O
  – Lazy Receiver Processing
  – Feedback-based scheduling algorithm
  – Early packet drop
  – Extension to multiprocessor kernels
This research contributed to improved performance in the Click project