Presented by Kristen Carlson Accardi

Presentation transcript:

Soft Timers: Efficient Microsecond Software Timer Support for Network Processing
Mohit Aron and Peter Druschel
Presented by Kristen Carlson Accardi

CS533 - Concepts of Operating Systems

Presentation Topics
- What problems are we trying to solve?
- What is a soft timer?
- How do soft timers solve our problem?
- Conclusions

What problem are we trying to solve?
- Interrupt overhead: context switching, cache misses and pollution, TLB misses and pollution
- Efficient implementation of proposed optimizations for networking
- Event generation at microsecond granularity without increasing overhead

Interrupts cause a context switch, which results in stalls due to cache misses and TLB misses, not only on entry to the interrupt service routine but also when the interrupted process resumes. If interrupts occur at frequent intervals, the cost of these context switches significantly degrades overall system performance. "Normal" interrupts, such as those for scheduling or disk I/O, arrive only every few tens of milliseconds, but high-speed networking can cause interrupts at intervals of tens of microseconds. Two optimizations for networking have been proposed in other papers: rate-based clocking and network polling. Both require a way to generate events at microsecond granularity without increasing interrupt overhead.

Operation of Traditional Timers
- Hardware based: software programs a chip that interrupts the CPU either after a given amount of time or at a regular frequency.
- Each interrupt causes a context switch into kernel mode.
- The interrupt service routine (ISR) then runs and uses the cache and TLB, missing in both initially.
- The process resumes on exit from the ISR and may no longer have relevant information in the cache/TLB. This is known as cache and TLB pollution.
- For fine-grained timers, this would have to happen very frequently.

Hardware timers work by programming a piece of hardware on the system to interrupt the CPU after a certain amount of time, or at a regular frequency. This causes context switch overhead, plus cache and TLB misses in both the interrupt service routine and the process that is resumed after the interrupt. Why is this a problem? On high-speed networks, or in applications that require very small intervals between timer events, the overhead becomes very large.

Generating Events without a Timer (sorta)
- Amortize the overhead by using "trigger states" that already exist in the system to generate a soft timer event.
- Soft timer events are just a function call to a handler.
- Hardware timers are used to guarantee an upper bound on the interval.
- Soft timers allow finer clock granularity without adding more interrupt overhead.

1. Soft timers try to amortize the overhead of these context switches by inserting themselves at the end of frequently occurring activities such as syscalls, page faults, interrupt handlers and the idle loop. The paper calls these "trigger states". (Insert table showing possible trigger states.)
2. Whenever a trigger state is reached, a CPU register is read to determine whether the "timer" for the next event has expired. If it has, the associated handlers are invoked with a plain function call.
3. Because the amount of time between trigger states is probabilistic, a hardware timer is used to guarantee an upper bound on the amount of time between events. (Draw picture on board: a timeline showing soft event triggers vs. hardware timer expiration.)
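The three steps above can be sketched as a small simulation. This is a toy model under stated assumptions, not the paper's kernel implementation: the names `SoftTimers`, `check` and `simulate` are hypothetical, time is an integer count of microseconds, and the backup hardware timer is modeled as a periodic tick that runs the same check as a trigger state.

```python
import heapq


class SoftTimers:
    """Toy model of a soft-timer facility (all names are illustrative)."""

    def __init__(self):
        self.pending = []  # min-heap of (deadline_us, seq, handler)
        self.seq = 0       # tie-breaker so handlers are never compared

    def schedule(self, deadline_us, handler):
        heapq.heappush(self.pending, (deadline_us, self.seq, handler))
        self.seq += 1

    def check(self, now_us):
        """Run at every trigger state and at every backup hardware tick."""
        fired = 0
        while self.pending and self.pending[0][0] <= now_us:
            deadline, _, handler = heapq.heappop(self.pending)
            handler(now_us - deadline)  # plain function call to the handler
            fired += 1
        return fired


def simulate(trigger_times_us, deadlines_us, hw_period_us, end_us):
    """Return the delay (microseconds) with which each scheduled event fired."""
    timers = SoftTimers()
    delays = []
    for deadline in deadlines_us:
        timers.schedule(deadline, delays.append)
    # Assumption: the hardware timer ticks every hw_period_us as a fallback.
    hw_ticks = range(hw_period_us, end_us + 1, hw_period_us)
    for now in sorted(set(trigger_times_us) | set(hw_ticks)):
        timers.check(now)
    return delays


# Trigger states at 5, 12 and 30 us; events due at 10, 20 and 100 us.
print(simulate([5, 12, 30], [10, 20, 100], hw_period_us=100, end_us=300))
# -> [2, 10, 0]
```

With no trigger states at all, an event due at 10 us fires at the first hardware tick, so its delay is bounded by the hardware timer period.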

Trigger Events in the System
- Syscalls, page faults, interrupt handlers, idle loop

How often these events occur depends on the workload of the system and how fast the CPU is. ip_output can be reached from multiple sources: from the application, or from a received ACK.

TCP has shortcomings on high-bandwidth, high-latency networks
- A long RTT causes the window-based congestion control algorithm to mistakenly assume the network is congested.
- The slow start algorithm also does not work well on high-latency networks.
- Fairness is compromised because connections with a shorter RTT are given more bandwidth.
- Delayed ACKs or compressed ACKs can cause bursty behavior and increase congestion.

"Rate-based clocking" is desired so that TCP can efficiently utilize high-bandwidth, high-delay networks, but rate-based clocking requires packets to be produced at a granularity of tens of microseconds. What is the problem with a high-bandwidth, high-delay network? When the RTT is long, TCP's window-based congestion control algorithm limits performance, because it assumes there is a lot of congestion on the network due to the length of time it takes for the receiver's ACK to arrive. But if it just takes a long time because your receiver is across the globe, the RTT is not a good indication of congestion, and TCP will not be able to utilize all its available bandwidth. The "slow start" feature of TCP is problematic for the same reason. This also causes a "fairness" problem, in that connections with a shorter RTT are given more bandwidth.
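A back-of-the-envelope calculation shows why a long RTT hurts a window-based sender: it can have at most one window of data in flight per round trip. The sketch below uses the standard relation throughput = window / RTT. The numbers (a 65,535-byte window, a 100 ms RTT, a 1 Gbit/s link) are illustrative assumptions and do not come from the slides.

```python
def window_limited_throughput_bps(window_bytes, rtt_s):
    """Upper bound on throughput when one window is sent per round trip."""
    return window_bytes * 8 / rtt_s


def bandwidth_delay_product_bytes(link_bps, rtt_s):
    """Bytes that must be in flight to keep the link full."""
    return link_bps * rtt_s / 8


# Assumed example values, not taken from the presentation.
rtt = 0.100          # 100 ms round trip
window = 65_535      # largest TCP window without window scaling
link = 1_000_000_000 # 1 Gbit/s

print(round(window_limited_throughput_bps(window, rtt) / 1e6, 2), "Mbit/s")
print(round(bandwidth_delay_product_bytes(link, rtt) / 1e6, 1), "MB in flight needed")
```

Under these assumptions the sender tops out at roughly 5 Mbit/s on a 1 Gbit/s path, because filling the path would need about 12.5 MB in flight.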

Rate-based Clocking for TCP
- Rate-based clocking has been proposed as a method of alleviating these issues.
- Packets are sent out at a specified rate regardless of the ACKs received.
- Rate-based clocking requires sending packets at a rate of one every few tens of microseconds.
- The higher the bandwidth of the network, the faster the rate.
- With hardware timers, this causes a lot of overhead.
- Soft timers can be used to reduce the number of interrupts this implementation would cause.

In order to implement rate-based clocking, the system must be able to send out packets at a rate of one every few tens of microseconds (this depends on the speed of the network: the higher the bandwidth, the faster the rate must be). If hardware interrupt latency is measured at almost 5 microseconds, that is a very large overhead relative to the interval between packets.
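The "one packet every few tens of microseconds" figure follows from simple arithmetic on packet size and link speed. The sketch below assumes 1500-byte packets and treats the roughly 5 microsecond interrupt cost mentioned above as a fixed per-packet cost; the function names are hypothetical.

```python
def packet_interval_us(packet_bytes, link_bps):
    """Time between packet starts when sending back to back at line rate."""
    return packet_bytes * 8 * 1_000_000 / link_bps


def interrupt_overhead_fraction(interrupt_cost_us, interval_us):
    """Share of each packet interval spent in one timer interrupt."""
    return interrupt_cost_us / interval_us


# Assumption: 1500-byte packets.
for name, bps in [("100 Mbit/s", 10**8), ("1 Gbit/s", 10**9)]:
    interval = packet_interval_us(1500, bps)
    share = interrupt_overhead_fraction(5, interval)
    print(f"{name}: one packet every {interval:.0f} us, "
          f"interrupt share {share:.0%}")
```

At 1 Gbit/s the interval is 12 microseconds, so one hardware timer interrupt per packet would consume a large fraction of the CPU time available between packets.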

Implementing Rate-based Clocking with Soft Timers
- Keep track of the desired transmission rate.
- Keep track of the maximum allowable number of packets that can be transmitted in a burst, in case the time between events is longer than expected.
- Schedule transmissions to sustain an average that achieves the desired transmission rate.

The authors implemented rate-based clocking with soft timers. They used an algorithm that keeps track of the desired transmission rate and the maximum allowable number of packets that can be transmitted in a burst. Transmissions are then scheduled so that the rate of transmission averages out to the desired rate, while the maximum burst size prevents heavy bursts of traffic.
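One way to read the description above is as a pacer that, at each soft timer event, sends however many packets are "due" at the target rate, capped by the burst limit. The slides do not give the authors' exact algorithm, so the following is only a minimal sketch; `RatePacer`, `interval_us` and `max_burst` are hypothetical names.

```python
class RatePacer:
    """Sketch of rate-based clocking driven by irregular timer events."""

    def __init__(self, interval_us, max_burst):
        self.interval_us = interval_us  # target time between packets
        self.max_burst = max_burst      # cap on packets sent per event
        self.start_us = None
        self.sent = 0

    def on_event(self, now_us):
        """Return how many packets to transmit at this soft timer event."""
        if self.start_us is None:
            self.start_us = now_us
        # Packets that should have gone out by now at the target rate.
        due = (now_us - self.start_us) // self.interval_us + 1
        burst = max(0, min(due - self.sent, self.max_burst))
        self.sent += burst
        return burst


pacer = RatePacer(interval_us=10, max_burst=3)
print([pacer.on_event(t) for t in (0, 10, 45, 50, 200)])
# -> [1, 1, 3, 1, 3]
```

In this sketch a late event (at 45 us) is compensated with a small burst so the average rate is preserved, while a very late event (at 200 us) is capped at three packets. The remaining deficit is carried forward and paid off by later events, which is a design choice of this sketch.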

Network Interrupt Processing Overhead
- Normally 2 interrupts per packet.
- On high-speed networks, interrupts will occur every few microseconds.
- Other processes will starve.
- System performance suffers due to context switches and cache/TLB misses.

Normal packet transmission on a hardware-interrupt-based system usually involves 2 separate interrupts: one for receiving the packet, and one for indicating back to the OS (device driver) that the packet has been sent. On a busy high-speed network, this can cause an interrupt to occur every few microseconds. Other processes on the system will be starved because they are constantly being interrupted, and the performance of the system will grind to a halt due to all the context switches and cache/TLB misses.

Network Polling
- Removes interrupt overhead by scheduling periodic checks for new packets.
- Allows batch processing of packets.
- Other processes can run.
- Normal scheduling does not allow for a small enough period of time between polling intervals.
- Most implementations use interrupts when the network is lightly loaded and switch to polling mode when the network is heavily loaded.
- Soft timers provide small enough granularity that interrupts are only needed when the CPU is not loaded.

An alternative to interrupt-driven processing of network packets is polling mode. The scheduler schedules the device driver to check for packets periodically. This also allows "batch processing" of packets after a context switch has already occurred, instead of requiring one context switch per packet. Other processes on the system can run between polling intervals; however, to maintain high performance, the time between polls must be small, and normal scheduling does not allow for a small enough interval. Instead of pure polling mode, most implementations use interrupt mode when the network is not under heavy load and polling mode when it is heavily loaded. Soft timers can be used to implement polling mode and provide a small enough interval between polls that interrupts need not be used when the CPU is saturated, since the "trigger events" can occur after only a few microseconds.
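The hybrid scheme described above (interrupts under light load, polling under heavy load, packets handled in batches) can be sketched with a toy receive queue. This is not code from the paper or from any real driver; `PolledNic`, `batch_limit` and the mode-switching rule are assumptions made for illustration.

```python
from collections import deque


class PolledNic:
    """Toy network interface that mixes interrupts with polling."""

    def __init__(self, batch_limit):
        self.rx_queue = deque()
        self.batch_limit = batch_limit   # max packets handled per poll
        self.interrupts_enabled = True   # start in interrupt mode
        self.interrupts_taken = 0

    def packet_arrives(self, pkt):
        self.rx_queue.append(pkt)
        if self.interrupts_enabled:
            # One interrupt wakes the driver, then polling takes over.
            self.interrupts_taken += 1
            self.interrupts_enabled = False

    def poll(self):
        """Called from a soft timer event; returns the batch processed."""
        batch = []
        while self.rx_queue and len(batch) < self.batch_limit:
            batch.append(self.rx_queue.popleft())
        if not batch:
            # Nothing waiting: the network is quiet, go back to interrupts.
            self.interrupts_enabled = True
        return batch


nic = PolledNic(batch_limit=4)
for pkt in range(1, 7):
    nic.packet_arrives(pkt)
print(nic.poll(), nic.poll(), nic.poll(), nic.interrupts_taken)
# -> [1, 2, 3, 4] [5, 6] [] 1
```

Six packets cost a single interrupt here, and the number of packets returned per poll corresponds to the aggregation measure discussed in the polling performance results.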

Network Polling Performance
Aggregation is the number of packets processed per poll. Aggregation works together with rate-based clocking and the interrupt backup; otherwise you would be concerned about waiting so long to process packets.

Effect of Removing Sources of Events
This attempts to show that removing ip_intr does not hurt soft timer latency, since traps and ip_output are the largest sources of events.

Conclusions
- Soft timers reduce cache and TLB pollution.
- Soft timers can provide fine-grained timer events due to the frequency of trigger states, which allows network polling.
- Removing network interrupts still leaves a sufficient number of trigger states.
- Soft timers reduce overhead in rate-based packet transmission, even with the use of a hardware timer to guarantee an upper bound.

Soft timers can be experimentally shown to reduce cache and TLB misses compared to hardware timers. Soft timers can provide fine-grained timer event support due to the frequency of trigger states in a system. Even if you remove network interrupts from the system, there are still enough trigger events to achieve fine-grained timer events. Soft timers greatly reduce overhead in rate-based packet transmission while maintaining an average interval in the tens of microseconds. (Insert Table 6 here if desired.) Even using the hardware timer to put an upper bound on the amount of time between trigger events adds very little overhead to the soft timers.