Supporting Time-Sensitive Applications on a Commodity OS A. Goel, L. Abeni, C. Krasic, J. Snow and J. Walpole Proceedings of USENIX 5th Symposium on Operating.

Slides:



Advertisements
Similar presentations
Effects of Clock Resolution on the Scheduling of Interactive and Soft Real- Time Processes by Yoav Etsion, Dan Tsafrir, Dror G. Feitelson Presented by.
Advertisements

Chapter 7 Protocol Software On A Conventional Processor.
ECE 526 – Network Processing Systems Design Software-based Protocol Processing Chapter 7: D. E. Comer.
CS 3013 & CS 502 Summer 2006 Scheduling1 The art and science of allocating the CPU and other resources to processes.
1 Soft Timers: Efficient Microsecond Software Timer Support For Network Processing Mohit Aron and Peter Druschel Rice University Presented By Jonathan.
Uniprocessor Scheduling Chapter 9. Aim of Scheduling The key to multiprogramming is scheduling Scheduling is done to meet the goals of –Response time.
Introduction to Operating Systems – Windows process and thread management In this lecture we will cover Threads and processes in Windows Thread priority.
Threads 1 CS502 Spring 2006 Threads CS-502 Spring 2006.
Home: Phones OFF Please Unix Kernel Parminder Singh Kang Home:
Supporting Time-sensitive Application on a Commodity OS Ashvin Goel, Luca Abeni, Charles Krasic, Jim Snow, Jonathan Walpole Presented by Wen Sun Some Slides.
Supporting Time-Sensitive Applications on a Commodity OS by Ashvin Goel, Luca Abeni, Charles Krasic, Jim Snow, Jonathan Walpole Jimi Watson.
Soft Timers: Efficient Microsecond Software Timer Support For Network Processing Mohit Aron and Peter Druschel Rice University Presented by Reinette Grobler.
Chapter 13 Embedded Systems
Supporting Time-sensitive Application on a Commodity OS By Ashvin Goel, Luca Abeni, Charles Krasic, Jim Snow, Jonathan Walpole Presenter: Shuping Tien.
Wk 2 – Scheduling 1 CS502 Spring 2006 Scheduling The art and science of allocating the CPU and other resources to processes.
1 Chapter 13 Embedded Systems Embedded Systems Characteristics of Embedded Operating Systems.
1 Threads Chapter 4 Reading: 4.1,4.4, Process Characteristics l Unit of resource ownership - process is allocated: n a virtual address space to.
1Chapter 05, Fall 2008 CPU Scheduling The CPU scheduler (sometimes called the dispatcher or short-term scheduler): Selects a process from the ready queue.
Scheduler Activations Jeff Chase. Threads in a Process Threads are useful at user-level – Parallelism, hide I/O latency, interactivity Option A (early.
Chapter 19: Real-Time Systems Silberschatz, Galvin and Gagne ©2005 AE4B33OSS Chapter 19: Real-Time Systems System Characteristics Features of Real-Time.
Introduction to Embedded Systems
1 Previous lecture review n Out of basic scheduling techniques none is a clear winner: u FCFS - simple but unfair u RR - more overhead than FCFS may not.
Chapter 6 Scheduling. Basic concepts Goal is maximum utilization –what does this mean? –cpu pegged at 100% ?? Most programs are I/O bound Thus some other.
Scheduling policies for real- time embedded systems.
CPU Scheduling CSCI 444/544 Operating Systems Fall 2008.
TCPivo A High-Performance Packet Replay Engine Wu-chang Feng Ashvin Goel Abdelmajid Bezzaz Wu-chi Feng Jonathan Walpole.
1 Scheduling The part of the OS that makes the choice of which process to run next is called the scheduler and the algorithm it uses is called the scheduling.
Cpr E 308 Spring 2004 Real-time Scheduling Provide time guarantees Upper bound on response times –Programmer’s job! –Every level of the system Soft versus.
I/O Computer Organization II 1 Interconnecting Components Need interconnections between – CPU, memory, I/O controllers Bus: shared communication channel.
What Every Developer Should Know about the Kernel Dr. Michael L. Collard 1.
Scheduling Lecture 6. What is Scheduling? An O/S often has many pending tasks. –Threads, async callbacks, device input. The order may matter. –Policy,
Fall 2013 SILICON VALLEY UNIVERSITY CONFIDENTIAL 1 Introduction to Embedded Systems Dr. Jerry Shiao, Silicon Valley University.
Time Management.  Time management is concerned with OS facilities and services which measure real time, and is essential to the operation of timesharing.
Lecture 7: Scheduling preemptive/non-preemptive scheduler CPU bursts
Uniprocessor Scheduling
Kernel Locking Techniques by Robert Love presented by Scott Price.
Chapter 2 Processes and Threads Introduction 2.2 Processes A Process is the execution of a Program More specifically… – A process is a program.
1 Soft Timers: Efficient Microsecond Software Timer Support For Network Processing Mohit Aron and Peter Druschel Rice University Presented By Oindrila.
System Components ● There are three main protected modules of the System  The Hardware Abstraction Layer ● A virtual machine to configure all devices.
Processes, Threads, and Process States. Programs and Processes  Program: an executable file (before/after compilation)  Process: an instance of a program.
CSCI1600: Embedded and Real Time Software Lecture 24: Real Time Scheduling II Steven Reiss, Fall 2015.
Supporting Time Sensitive Application in a Commodity OS Ashvin Goel, Luca, Charles, Jim, Jonathan Walpole Oregon Graduate Institute, Portland Presented.
19.1 Silberschatz, Galvin and Gagne ©2009 Operating System Concepts with Java – 8 th Edition Chapter 19: Real-Time Systems.
Chapter 19: Real-Time Systems Silberschatz, Galvin and Gagne ©2005 Operating System Concepts Chapter 19: Real-Time Systems System Characteristics.
CS333 Intro to Operating Systems Jonathan Walpole.
Time Management.  Time management is concerned with OS facilities and services which measure real time.  These services include:  Keeping track of.
Introduction Contain two or more CPU share common memory and peripherals. Provide greater system throughput. Multiple processor executing simultaneous.
1.  System Characteristics  Features of Real-Time Systems  Implementing Real-Time Operating Systems  Real-Time CPU Scheduling  An Example: VxWorks5.x.
1 CPU Scheduling Basic Concepts Scheduling Criteria Scheduling Algorithms Multiple-Processor Scheduling Real-Time Scheduling.
Chapter 11 System Performance Enhancement. Basic Operation of a Computer l Program is loaded into memory l Instruction is fetched from memory l Operands.
Scheduling.
Advanced Operating Systems CS6025 Spring 2016 Processes and Threads (Chapter 2)
Real-Time Operating Systems RTOS For Embedded systems.
Real-Time Performance of Linux “A Measurement-Based Analysis of the Real-Time Performance of Linux” (L. Abeni, A. Goel, C. Krasic, J. Snow, J. Walpole)
CPU Scheduling Scheduling processes (or kernel-level threads) onto the cpu is one of the most important OS functions. The cpu is an expensive resource.
Mohit Aron Peter Druschel Presenter: Christopher Head
Chapter 19: Real-Time Systems
CS 6560: Operating Systems Design
Chapter 2 Scheduling.
Real-time Software Design
Chapter 6: CPU Scheduling
Chapter 6: CPU Scheduling
CSCI1600: Embedded and Real Time Software
CPU scheduling decisions may take place when a process:
Threads Chapter 4.
Chapter 6: CPU Scheduling
Chapter 19: Real-Time Systems
Supporting Time-Sensitive Applications on a Commodity OS
COMP755 Advanced Operating Systems
Chapter 13: I/O Systems “The two main jobs of a computer are I/O and [CPU] processing. In many cases, the main job is I/O, and the [CPU] processing is.
Presentation transcript:

Supporting Time-Sensitive Applications on a Commodity OS A. Goel, L. Abeni, C. Krasic, J. Snow and J. Walpole Proceedings of USENIX 5th Symposium on Operating System Design and Implementation (OSDI) December 2002

Introduction Many applications time-sensitive –Ex: periodic execution with low jitter (e.g. soft modem) –Ex: quick response to external event (e.g. frame capture in videoconference) OS must allocate resources at appropriate times Needs: –High precision timing facility –Well-designed preemptible kernel –Appropriate scheduling Most commodity (Windows/Linux) OSes don’t Special OS enhancements can support real-time –But hard real-time, s.t. degradation of non-real-time applications suffer

Approach 1) Firm timers for efficient, high-resolution timing 2) Fine-grained kernel preemptibility 3) Priority and Reservation–based CPU scheduling Integrate into Linux kernel  Time-sensitive Linux (TSL) Evaluate to show benefits real-time application, but not degrade performance of other apps

Outline Introduction(done) Related Work(next) Requirements Implementation Evaluation Conclusions

Related Work Illustration of real-time implementation difficulties [6,15,16] Mathematical real-time scheduling [10,19] –But ignore practical issues such as non- preemptibility Practical real-time scheduling [12,17,22] –But performance of non-real-time apps suffers Real-time micro-kern-lish [4] –But hard-timers add more overhead New OSes [9] –But different API so hard to port legacy apps

Time-Sensitive Requirements From time needed to handle event until actual dispatch is kernel latency Need: Timing Mechanism, Responsive Kernel, CPU Scheduling Algorithm

Timer Mechanism Accurate timer adds the most to kernel latency To address, can use: 1.One-shot timer (hard timer) – on x86, use on-chip CPU Advanced Programmable Interrupt Controller (APIC). Needs to be reprogrammed each time. 2.Soft Timer – check for expired timers at strategic locations, reduce the number of interrupts Solution: Combine 1. and 2., called firm timers

Responsive Kernel If timer is accurate, might still not have low kernel latency if kernel cannot respond –(Traditionally, thread in kernel runs until done) Solution: reduce size of non-preemptible regions

CPU Scheduling Algorithm Need to schedule the right process as quickly as possible Solutions: –Priority-based scheduler – pre-assign priorities and schedule in that order –Proportion-period scheduler – schedule with an upper-bound on delay

Misc Note, any one alone not sufficient! –High-resolution timer doesn’t help if kernel not preemtible or: –Responsive kernel not useful without accurate time Note, tasks may not be independent: –Ex: X server operates (and is scheduled) in FIFO order, even if apps that request them have different priority –Ex: Video application with higher priority than X server will have priority inversion (waiting on low priority) (will address later)

Outline Introduction(done) Related Work(done) Requirements(done) Implementation –Firm Timers(next) –Fine-Grained Preemptibility –CPU Scheduling Evaluation Conclusions

Periodic Timers Commodity OSes implement timing with periodic timers. –Ex: on Intel x86, interrupts generated with Programmable Interval Timer (PIT) –Ex: is 10 ms on Linux, thus is max latency Can reduce latency by reducing period, but adds more interrupt overhead Instead, move to one-shot timer Ex: two tasks, period 5 and 7 ms, timer period 1 ms, 35 ms running time –Periodic: 35 interrupts generated –One-shot: 11 interrupts generated (5, 7, 10, 14 …) –Plus, one-shot timer reduces timer latency

Firm Timer Design One-shot timer costs: timer reprogramming and fielding timer interrupts –Reprogramming cost has decreased in modern hardware (P2+) PIT on x86 used to use slow out on bus Newer APIC resides on CPU chip –Thus, remaining cost is interrupt cost Reduce by using soft-timers –Poll for expired timers at strategic points where context switch is occurring Ex: system call, interrupt, exception return Two new problems: poll cost and added timer latency Can solve 2 nd problem with timer overshoot –Provides upper bound on latency –Tradeoff between accuracy and overhead 0  hard timers, large  soft-timers At 100 MHz, theoretical accuracy of 10 nanoseconds

Firm Timer Implementation Timer queue for each queue, sorted by expiry When timer expires –execute callback function for each expired timer –Reprogram APIC Global overshoot value (but could be done per timer) Accessible through standard POSIX functions: nanosleep(), pause(), setitimer(), select() and poll()

Outline Introduction(done) Related Work(done) Requirements(done) Implementation –Firm Timers(done) –Fine-Grained Preemptibility(next) –CPU Scheduling Evaluation Conclusions

Reasons Scheduler Cannot Run Interrupts disabled –Hopefully, short Another thread in critical region Commodity OSes have no preemption for entire kernel period –Ex: when interrupt fires or duration of system call –Unless known it will be long (ex: disk I/O) –Preemption latency under Linux can be 30 ms

Enabling More Preemption 1) Add more preemption points –Must be done manually 2) Allow preemption anytime not using shared data structures –Protect shared structures with locks –Can still result in long latencies Combining 1) and 2) works best –(Done by Robert Love [11]) –(Authors evaluated in [1])

Outline Introduction(done) Related Work(done) Requirements(done) Implementation –Firm Timers(done) –Fine-Grained Preemptibility(done) –CPU Scheduling(next) Evaluation Conclusions

CPU Scheduling Priority CPU scheduling is simple, POSIX compliant –But assumes applications well-behaved in that mis-behaving high-priority application can deny service for others So, combine with proportion-period scheduling on top of the priority scheduling to give protection

Proportion-Period CPU Scheduling For single independent tasks, assign highest priority task –Mis-behaving task can consume “too much” –Need temporal protection Proportion-period provides temporal protection by allocating fixed CPU amount each period –Task executes as “real-time” (highest priority) for time Q every T –Period determined by application requirements (Ex: 30ms for video) Implemented using Earliest Deadline First (EDF)

Priority CPU Scheduling Priority inversion occurs when an application has multiple tasks that are independent –Example: Video application uses X –Video is highest since time-sensitive –Sends frame to X server and blocks –X server may be preempted by other medium priority task, hence delaying Video client To solve, use highest-locking priority (HLP) [19] in which task inherits priority when using shared resource –Example: display is shared resource so X server gets highest priority of blocking clients

Outline Introduction(done) Related Work(done) Requirements(done) Implementation(done) Evaluation(next) Conclusions

Evaluation 1) Behavior of time-sensitive applications running on TSL 2) The Overheads of TSL Setup: –Software Linux Robert Love’s lock-breaking preemptible kernel patch Proportion-period scheduler –Hardware 1.5 GHz Intel P4 512 MB RAM

Latency in Micro Benchmarks Test low-level components of kernel latency: timer, preemption and scheduling –Time-sensitive process that sleeps for a specified amount of time (using nanosleep() ) –Results: 10 ms in standard Linux, few microseconds in TSL Test preemption latency under loads –Results: Linux worst case 100 ms (when copying data from kernel to user space), but typically less than 10 ms and is hidden by timer latency. TSL is 1 ms. (Result details in [1])

Latency in Real Applications Tested two applications: –mplayer – a open-source audio/video player –Proportion-period scheduler - a kernel-level “application”

mplayer Details Synchronizes audio and video using time- stamps Audio card used as timing source When video frame decoded, time stamp compared with audio clock. –If late, then play –If early, then sleep for time then play If kernel not responsive or has coarse timing, will be poor audio/video synch and high inter-frame display jitter –Correlate with each other, so only look at synch

Testing MPlayer Compare Linux with TSL under: –Non-kernel CPU load – run user-level stress test –Kernel CPU load – large (8 MB) mem buffer copied to a file (one write() call), 90% in kernel mode –File-system load – large dir (linux src, files, 180 MB data, ext2 ) copied (via DMA) recursively and flushed For each test, run mplayer for 100 seconds at real-time priority

Non-kernel CPU Load : Linux -5 ms to 50 ms when X server run normal prio

Non-Kernel CPU Load : TSL (Much lower, but can still be 35 msec)

(X server at real-time, 250 microseconds (not y-axis)) (This config used for all others) Non-Kernel CPU Load : TSL

Kernel CPU Load : Linux (90 msec for Linux, since done in non-preemptible section)

Kernel CPU Load : TSL (Skew improves to less than 400 microseconds)

File System Load : Linux (Skew often low, but as high as 120 msec)

File System Load : TSL (Skew less than 500 microseconds, often lower)

Comparison with Real-Time Kernel Linux-SRT [6], includes finer-grained timers and reservation scheduler –(See figure 5a, 5b, 5c) Non-kernel CPU load skew less than 2ms, but as high as 7 ms (compare w/TSL of 250 microsec) Kernel CPU load worst case was 60 ms (compare w/TSL of 400 microsec) File-System load worst case was 30 ms (compare w/TSL of 500 microsec) Shows real-time scheduling and more precise timers insufficient. Responsive kernel also required.

Proportion-Period Scheduler Simultaneously ran 2 time-sensitive apps with proportions of 40% and 20% and periods of 8192 microsec and 512 microsec Each process records time via gettimeofday() and records in array Measure performance by differences in array compared with period

Maximum Deviation Deviations low. Higher when load is high. Maximum gives you bounds. Example: soft-modem needs CPU every 4 to 16 ms so could be supported.

System Overhead Measure 2 costs: 1.Costs of executing code at newly inserted preemption points 2.Costs of executing firm timers

Cost of Preemption Memory access test (sequentially access 128 MB array), fork test (create 512 processes) and file-system access test (copy 2 MB buffers to 8 MB file) –Designed in [1], should be worst case Tests hit additional preemption checks Measure ratio of completion time under TSL / Linux Result: memory.42%+-.18%, fork.53%+-.06%, file sys had no significant overhead

Cost of Firm Timers Firm timers user hard and soft timers. Costs: –Hard timers costs only – interrupt handling and cache pollution –Hard and soft timers common costs – manipulation timers from queue executing preemption for expired thread –Soft timers costs only – checking for expired timers

Firm Timers : Setup Timer process - time-sensitive process is periodic task, wakes up via setitimer() call, measures time, goes to sleep Throughput process – povray, a ray- tracing program rendering skyvase benchmark, measure elapsed time Run timer with 10 ms period since is supported by Linux

Firm Timer Overhead (Different overshoot values. 8 times w/95% confidence intervals) (Only small decrease in overhead with larger overshoot)

Firm Timer Overhead (Larger decrease in overhead since more timers) (Linux slower with 500 since synchronizes 50 procs. Artifact of setup)

Firm Timer Overhead at High Frequencies (Compare with hard timers only since Linux cannot do 1 ms)

Discussion Firm timers lower overhead when soft- timer checks find timers Firm timers higher overhead when soft- timer checks find nothing and timer goes off –From their work, firm timers lower when more than 2.1% of timer checks find timer

Conclusions TSL can support applications needing fine-grained resource allocation and low latency response –Firm timers for accurate timing –Fine-grained kernel preemptibility for improving kernel response –Proportion-period scheduling for providing precise allocation of tasks Variations of less than 400 microseconds under heavy CPU and file system load Overhead is low