Real-Time Systems Introduction Johnnie W. Baker
What is a Real-Time System Correctness of the system depends not only on the logical results, but also on the time in which the results are produced. Works in a reactive and time-constrained environment Examples Real-time temperature control of a chemical reactor Space mission control system Nuclear power generator system Many safety-critical systems
Parts of Typical Real-Time System Controlling System Computer system acquires information about the environment through sensors Performs computations on data Activates the actuators through some controls Controlled System Operating Environment
Example – Car Driver Mission Reach destination safely Controlled System Car Operating Environment Road Conditions and other cars Controlling System Sensors Human driver: eyes and ears Computer: Camera, Infrared receiver, Laser telemeter Controls Accelerator, Steering wheel, Brake Pedal Actuators Wheels, Engine, Brakes
Example – Car Driver (Cont.) Critical Tasks Steering and Braking Non-critical Tasks Turning on the radio Performance A measure of the goodness of outcome, relative to the best outcome possible under the given circumstances Reliability (of driver) Fault-tolerance is a must
Typical Real-Time System Parts
Tasks and Jobs Jobs are units of work that are scheduled and executed by the system. A set of related jobs that can be solved by the same algorithm is called a task. A job is an instance of a task. Use of “instance” is the same as in complexity theory. Not all books distinguish between tasks & jobs. In some situations, it is awkward to maintain a distinction between these two concepts. However, it is important in some situations to understand the differences between these two concepts.
Type of Deadlines A deadline d is a time at which a task/job must be complete. A hard deadline means that it is vital for the safety that this deadline always be met. A soft deadline means that it is desirable to finish executing the task/job by the deadline, but that no catastrophe occurs if completion is late Some soft tasks only require that the task be completed as soon as possible. A firm deadline means there is no value in completing the task after its deadline.
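The three deadline types can be pictured as value functions: how much a result is worth as a function of when it is finished. The sketch below is only an illustration. The names `DeadlineKind` and `completion_value`, the linear decay for soft deadlines, and the use of negative infinity for a missed hard deadline are assumptions made here, not definitions from the slides.

```python
from enum import Enum


class DeadlineKind(Enum):
    HARD = "hard"
    FIRM = "firm"
    SOFT = "soft"


def completion_value(kind, finish, deadline, full_value=1.0, decay=0.1):
    """Value of finishing a job at time `finish` (hypothetical model)."""
    if finish <= deadline:
        return full_value          # on time: full value for every kind
    if kind is DeadlineKind.HARD:
        return float("-inf")       # a miss is modeled as catastrophic
    if kind is DeadlineKind.FIRM:
        return 0.0                 # a late result has no value
    # SOFT: value degrades with lateness (linear decay is an assumption)
    return max(0.0, full_value - decay * (finish - deadline))


if __name__ == "__main__":
    for kind in DeadlineKind:
        print(kind.value, completion_value(kind, finish=12, deadline=10))
```

Only the shape of each function matters here: a step down to nothing for firm, a gradual decline for soft, and an unacceptable outcome for hard.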
Typical Pictorial Interpretation
Periodic Tasks Periodic Tasks are activated regularly at fixed rates. These tasks are time-driven Example: Monitoring temperature of a patient The length of time between two successive activations is called the period. In many practical cases, a periodic task can be characterized by its computation time C and its deadline D. Often the deadline is set equal to the period. Consists of a sequence of identical “jobs” or “instances” of the task.
Aperiodic Tasks Aperiodic tasks are tasks which are activated irregularly at some unknown and possibly unbounded rate. Event driven Example – Activated when condition of patient changes. Sporadic tasks are tasks that are activated with some known and bounded rate. Necessary to bound the workload generated by such tasks.
Some Task/Job Parameters Arrival Time (or Release time): The time r at which a job becomes ready for execution. Computation Time: The time C necessary for the processor to execute the task without interruptions. Absolute Deadline: The time d before which a task should be completed to avoid damage (if hard) or system degradation (if soft). Start Time: The time s at which a job starts its execution. Finish Time: The time f at which a job finishes its execution. Response Time: The time R between the release time and the finish time (i.e., R = f – r).
Some Task/Job Parameters (cont.) Criticalness: A parameter related to the consequence of missing the deadline Usually hard, soft, firm Lateness: A parameter L giving the delay of the task completion with regard to its deadline. (L = f – d) Laxity (or Slack time): The maximum time X that a task can be delayed on its activation and still complete within its deadline. (X = d – r – C)
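The derived parameters follow directly from r, C, d, s, and f. A minimal sketch, assuming a hypothetical `Job` record whose field names mirror the symbols on these slides:

```python
from dataclasses import dataclass


@dataclass
class Job:
    r: float  # arrival (release) time
    C: float  # computation time
    d: float  # absolute deadline
    s: float  # start time
    f: float  # finish time

    def response_time(self):
        return self.f - self.r           # R = f - r

    def lateness(self):
        return self.f - self.d           # L = f - d (negative if early)

    def laxity(self):
        return self.d - self.r - self.C  # X = d - r - C


job = Job(r=2, C=3, d=10, s=4, f=7)
print(job.response_time(), job.lateness(), job.laxity())  # prints: 5 -3 5
```

A negative lateness means the job finished before its deadline. A laxity of 5 means activation could have been delayed by up to 5 time units without missing the deadline.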
Computing Systems Considered Uniprocessor Multiprocessor Systems Called MIMD Computers in Parallel Architecture Multiprocessors (shared memory systems) UMA or SMP (Symmetric Multiprocessors) NUMA Multicomputers (Message Passing Systems) Traditional in Real-Time Systems & Complexity theory to refer to MIMD systems as multiprocessors Distributed Systems
UMA or SMP
UMA or SMP Straightforward extension of uniprocessor Add CPUs to bus All processors share same primary memory Memory access time same for all CPUs Uniform memory access (UMA) multiprocessor Processors communicate using shared memory Also called a symmetrical multiprocessor (SMP)
Problems Associated with Shared Data with SMP/UMA The cache coherence problem Replicating data across multiple caches reduces contention among processors for shared data values. But how can we ensure different processors have the same value for same address? The cache coherence problem occurs when an obsolete value is still stored in a processor’s cache.
Distributed Multiprocessor or NUMA Distributes primary memory among processors Increase aggregate memory bandwidth and lower average memory access time Allows greater number of processors Also called non-uniform memory access (NUMA) multiprocessor Local memory access time is fast Non-local memory access time can vary Distributed memories have one logical address space
Distributed Multiprocessors or NUMA
Multicomputers Distributed memory multiple-CPU computer Same address on different processors refers to different physical memory locations Processors interact through message passing
Typically, Two Flavors of Multicomputers Commercial multicomputers Custom switch network Low latency (the time it takes to get a response from something). High bandwidth (data path width) across processors Commodity clusters Mass produced computers, switches and other equipment Use low cost components Message latency is higher Communications bandwidth is lower
Multicomputer Communication Processors are connected by an interconnection network Each processor has a local memory and can only access its own local memory Data is passed between processors using messages, as dictated by the program Data movement across the network is also asynchronous A common approach is to use MPI to handle message passing
Multicomputer Communications (cont) Multicomputers can be scaled to larger sizes much more easily than multiprocessors. The data transmissions between processors have a huge impact on the performance The distribution of the data among the processors is a very important factor in the performance efficiency.
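The message-passing style can be illustrated without MPI. The sketch below is a stand-in only: Python threads and `queue.Queue` objects play the role of processors and network links, and the names `worker` and `run_exchange` are hypothetical. Real threads share memory, so the rule that each worker touches only its own list is a convention of this example and is not enforced by the language.

```python
import queue
import threading


def worker(local_data, inbox, outbox):
    # Each "processor" reads and writes only its own local_data.
    partial = sum(local_data)
    outbox.put(partial)        # explicit send to the root
    total = inbox.get()        # blocking receive from the root
    local_data.append(total)   # the received value is a local copy


def run_exchange():
    to_root = queue.Queue()
    inboxes = [queue.Queue() for _ in range(3)]
    memories = [[1, 2], [3, 4], [5, 6]]
    threads = [
        threading.Thread(target=worker, args=(memories[i], inboxes[i], to_root))
        for i in range(3)
    ]
    for t in threads:
        t.start()
    # Root gathers one partial sum per worker, then sends the total back.
    total = sum(to_root.get() for _ in range(3))
    for q in inboxes:
        q.put(total)
    for t in threads:
        t.join()
    return total, memories


if __name__ == "__main__":
    print(run_exchange())
```

Every transfer appears as an explicit send or receive in the program text, and the total ends up copied into each local memory instead of being shared.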
Message-Passing Disadvantages Programmers must make explicit message-passing calls in the code This is low-level programming and is error prone. Data is not shared but copied, which increases the total data size. Data Integrity Difficulty in maintaining correctness of multiple copies of data item.
Message-Passing Advantages No problem with simultaneous access to data. Allows different PCs to operate on the same data independently. Allows PCs on a network to be easily upgraded when faster processors become available.
Real-Time System Size & Coordination Currently, in most real-time systems, either The entire system code is loaded into memory or Else there are well-defined phases and each phase is loaded just prior to its execution. These options may not be practical for the larger systems that will arise in the next generation In many applications, subsystems are highly independent of each other & little coordination is needed. Simplifies many aspects of building & analyzing system Increasing system size and/or task coordination give rise to many problems and complicate providing predictability
Environments The environment in which a real-time system operates plays a large role in the design of the system Many environments are well-defined and “deterministic”. Give rise to small static real-time systems Characteristics of all tasks known in advance All deadlines can be guaranteed to be met. Other environments may be complicated, and less controllable. Many future applications expected to be more complex, distributed, non-deterministic, & dynamic. Systems to handle these complex environments are expected to require dynamic real-time systems Observe: It does not follow from above that either static real-time systems have to be small and uncomplicated, or complex systems have to be dynamic
Predictability One of the most important properties that a hard real-time system should have. The system should be able to predict the evolution of tasks and guarantee in advance that all critical timing constraints will be met. The reliability of this guarantee depends on a number of factors which involve The architectural features of the computer hardware For uniprocessors, the policies adopted in the kernel Programming language used to implement the application.
Predictability & Types of Systems In a static system where characteristics of tasks are known in advance, guarantees can be given at design time that all their timing constraints will be met. For dynamic systems, characteristics of all tasks are not known in advance. Generally accepted for dynamic systems that Essentially no guarantees can be given at design time Guarantees can only be given at run time using an online schedulability analysis approach.
Predictability Redefined The online schedulability analysis approach to provide guarantees: Determines whether a given task can be completed by its deadline without jeopardizing other tasks. If constraints can not be met, task is rejected and system may invoke some type of recovery action. Predictability for dynamic real-time systems is reinterpreted to mean that Once a task (or job?) is admitted into a system, its deadline guarantee is never violated as long as the assumptions under which it was admitted hold. See Reference 6 in Murthy & Manimaran on page 18
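An online admission test of this kind can be sketched for a very restricted case. The assumptions here are not from the slides: a single processor, all already-admitted jobs ready at the current time `now`, and each job described only by its remaining computation time and absolute deadline. Under those assumptions, running jobs in earliest-deadline order meets all deadlines whenever any order does, so checking that one order is enough. The function name `admit` is hypothetical.

```python
def admit(active, new_job, now):
    """Decide whether new_job can be guaranteed alongside the active jobs.

    active:  list of (remaining_C, absolute_deadline) already guaranteed
    new_job: (C, absolute_deadline)
    Returns (accepted, resulting job list).
    """
    candidate = sorted(active + [new_job], key=lambda job: job[1])
    t = now
    for remaining, deadline in candidate:
        t += remaining            # completion time in deadline order
        if t > deadline:
            return False, active  # reject; earlier guarantees are untouched
    return True, candidate


active = [(2, 5), (3, 10)]
print(admit(active, (4, 9), now=0))  # accepted
print(admit(active, (5, 6), now=0))  # rejected
```

A rejected job leaves the guaranteed set unchanged, which matches the reinterpreted notion of predictability: once admitted, a job's guarantee holds as long as the admission assumptions hold.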
Determining Task Performance Difficult to obtain specific information on task characteristics for dynamic real-time systems For schedulability analysis, worst-case analysis is assumed Possibly derived by extensive simulation, testing, etc. Values used may not be true worst-case values Actual values may exceed these “worst-case” values on rare occasions. Called a specification violation Real-time systems are supposed to monitor such events and take recovery action when they occur. See Reference [2] pg 18 in text for more information.
Common Misconceptions about RTS Real-time computing is equivalent to fast computing. Belief is that a sufficiently fast computer will solve problem Predictability, not speed, is the foremost goal in real-time systems design. Fast computing does not, in general, support predictability Predictability depends on factors like architecture, implementation language, and environment
Common Misconceptions about RTS (cont) Real-time computing is assembly coding, priority interrupt programming, and writing device drivers To meet stringent timing requirements, current practices rely heavily on these techniques. Are labor intensive, difficult to trace, and a major source of bugs. A primary objective in RTS is to automate the production of highly efficient code
Common Misconceptions about RTS (cont) Real-time systems operate in a static environment. A real-time system will have to satisfy different timing constraints at different times. E.g., in ATC the takeoff, high altitude flight, landing The environment is usually nondeterministic, resulting in aperiodic tasks Claims these demands require a dynamic system
Common Misconceptions about RTS (cont) The problems in real-time systems have all been solved in other areas of computer science. RTS present unique problems that have not been addressed in other areas Deadline requirements, tasks are periodic or aperiodic, tasks have synchronization or resource constraints, etc. Assumptions in other areas may make results useless These areas include operations research, databases, & software engineering Real-time databases have a different set of problems than traditional databases.
Overview of Issues in RTS Resource Management Scheduling, resource reclaiming, fault-tolerance, communications This is focus of text Architecture issues Processor architecture, network architecture, I/O architecture Software Specification and verification of real time systems Programming Languages Databases
Task Scheduling Overview Involves allocation of processors, resources, and time in a way that performance requirements are met. Scheduling algorithms must satisfy timing, resource, and precedence constraints Must provide predictable intertask communication
Preemptive vs Non-preemptive Scheduling Non-preemptive scheduling Once a task starts execution, it executes to completion Has less ability to produce a feasible schedule Has less overhead because of less context switching Preemptive scheduling Task can be preempted and resumed later Allows arriving tasks with higher priority to move ahead of lower priority tasks already executing Greater overhead due to context switching. Provides greater ability to produce a feasible schedule Preempted tasks frequently redo earlier computation Task may not resume computing on same processor
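A small simulation shows why preemption can give greater ability to produce a schedule. This is a sketch under assumptions that are not in the slides: one processor, integer time ticks, computation times of at least 1, zero context-switch cost, and a work-conserving scheduler that picks the ready job with the earliest deadline. The job names and the function `simulate_edf` are hypothetical.

```python
def simulate_edf(jobs, preemptive):
    """jobs: name -> (release, C, deadline), integer times, C >= 1.

    Returns name -> finish time on a single processor.
    """
    remaining = {name: c for name, (_, c, _) in jobs.items()}
    finish = {}
    running = None
    t = 0
    while len(finish) < len(jobs):
        ready = [n for n in jobs if jobs[n][0] <= t and n not in finish]
        if ready:
            # Non-preemptive: keep the running job until it completes.
            if preemptive or running is None:
                running = min(ready, key=lambda n: jobs[n][2])
            remaining[running] -= 1
            if remaining[running] == 0:
                finish[running] = t + 1
                running = None
        t += 1
    return finish


jobs = {"long": (0, 4, 10), "urgent": (1, 1, 3)}
print(simulate_edf(jobs, preemptive=False))  # urgent finishes at 5, after its deadline 3
print(simulate_edf(jobs, preemptive=True))   # urgent finishes at 2, long at 5
```

In the non-preemptive run the long job holds the processor when the urgent job arrives, so the urgent deadline is missed. With preemption both deadlines are met. The zero-overhead assumption hides the context-switching cost that the slide lists as the price of preemption.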
Task-Scheduling Paradigms Static table-driven approaches Perform static schedulability analysis Resulting schedule is usually stored in a table and controls when a task will start executing. Static priority-driven preemptive approaches Perform schedulability analysis but do not create a table Tasks are executed on a highest-priority-first basis
Task-Scheduling Paradigms (cont) Dynamic planning-based approach Schedulability of a task is checked at run time An arriving task is accepted for execution if it is found to be schedulable. Once scheduled, a task is guaranteed to meet performance requirements. Dynamic best-effort approaches No schedulability check is done System does its best to meet requirements of task Task may be aborted during execution
Performance Metrics RT System Metrics Schedulability: Ability to schedule tasks to meet their deadlines. Non-RT System Metrics Throughput, response time, minimizing schedule length.
Resource Reclaiming Problem of reclaiming unused processor time and resource time due to Task executing in less time than worst-case Task being deleted from current schedule
Fault Tolerance RT systems must be able to deliver the expected performance, even in presence of faults. Fault tolerance is an inherent requirement of any real-time system. Basic principle of fault tolerance is redundancy. Costs both time and money System design must trade off the cost of redundancy against the level of fault tolerance required
Real Time Communication Any type of communication in which the messages involved have timing constraints. Two categories Periodic messages are generated in communicating periodic tasks. If a periodic message encounters a delay longer than its period, it is considered lost. Often used for sending sensor data Aperiodic messages are generated in communicating aperiodic tasks Arrival pattern is stochastic in nature Each has an end-to-end deadline and is considered lost if it fails to reach its destination before this deadline.
Architecture Issues Needs predictability in Instruction execution time Memory access Context switching Interrupt handling Avoids caches, virtual memory, superscalar features. Fast & predictable I/O systems required Support for fast and reliable communications Support for error handling Support for scheduling algorithms Support for real-time operating system Support for RT language features.
Software Engineering Issues: Requirements, Specification, Verification Functional requirements: Operations of the system and their effects Address logical correctness Non-functional requirements: Timings, constraints F & NF requirements need to be precisely defined and used together to construct the specification of the system A specification is an abstract mathematical statement of properties to be exhibited by system. It is checked for conformity with the requirements Its properties can be examined independent of the way it is implemented Should not enforce any decisions about the structure of software, the programming language to be used, or system architecture
Software Engineering (cont.) Usual process for specifying computing system behavior involves Enumerating events or actions that the system participates in And describing the order in which they can occur. Not well understood how to extend such approaches for real-time constraints
Real-Time Languages Some desirable features a real-time language should support are Language constructs for expressing timing constraints and keeping track of resource utilization. Support for schedulability analysis, with the goal of being able to perform schedulability checks at compile time. Support for reusable real-time software modules using object-oriented methodology. Support for predicting the timing behavior of real-time programs in distributed systems A few formal languages that support timing constraints have been developed. Used to specify & verify small scale systems Do not adequately address fundamental RT problems
Real-Time Databases Most conventional database systems are disk-based They use transaction logging and two-phase locking protocols to ensure transaction atomicity and serializability While these preserve data integrity, they result in relatively slow and unpredictable response times. Important real-time database systems issues include Transaction scheduling to meet deadlines Explicit semantics for specifying timing and other constraints Checking the database system’s ability to meet transaction deadlines during initialization of application
Additional Basic Concepts from Stankovic & Buttazzo From Chapter 2 of each Book
Goals of Additional Slides In the previous slides, we covered the concepts in Chapter 1 of text by Murthy and Manimaran. Before moving on to scheduling, we add in some additional preliminary material covered in Ch. 2 of book by Stankovic et al. or in Ch. 2 of book by Buttazzo. Some of this was mentioned earlier, but not covered completely. Since it is available at our website, Chapter 2 in Stankovic is added to our list of reading & study assignments.
Deadlines and Release Times Ref: Stankovic, page 16-17 The deadline time for one instance of a periodic task is usually the release time of its next instance. $r_{i,j} = (j-1)T_i$ Here, $r$ is the release time, $i$ denotes the $i$-th task, $j$ indicates the $j$-th instance of a task, and $T$ denotes the period (or interarrival time of a task) Equivalently, the release time for one instance of a periodic task is usually the deadline time for the preceding instance of this task. Note $d_{i,j} = r_{i,j} + T_i = jT_i$
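The two formulas can be tabulated directly. A minimal sketch, assuming the first instance is released at time 0 (as the formula for the release time implies); the function name `periodic_instances` is hypothetical:

```python
def periodic_instances(T, n):
    """(release, deadline) pairs for instances j = 1..n of a task with period T."""
    return [((j - 1) * T, j * T) for j in range(1, n + 1)]


print(periodic_instances(5, 3))  # prints: [(0, 5), (5, 10), (10, 15)]
```

Each deadline in the output coincides with the release time of the next instance, which is the relationship the slide states in words.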
Deadlines and Release Times For sporadic tasks, we assume that the release time of two consecutive instances must be separated at least by their minimal interarrival time. Symbolically, $r_{i,j} \ge r_{i,j-1} + T_i$ The deadline of a sporadic task is often assumed to be equal to the earliest possible release time of the next instance. Symbolically, $d_{i,j} = r_{i,j} + T_i$ Review the chart & discussion on pg 17.
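For a sporadic task the release times are observed, not computed, so the natural check is whether a given sequence respects the minimal interarrival time. A short sketch with hypothetical function names, where `T` is the minimal interarrival time of the task:

```python
def respects_min_interarrival(releases, T):
    """True if every pair of consecutive releases is at least T apart."""
    return all(b - a >= T for a, b in zip(releases, releases[1:]))


def sporadic_deadlines(releases, T):
    """Deadline of each instance: earliest possible release of the next one."""
    return [r + T for r in releases]


releases = [0, 7, 12]
print(respects_min_interarrival(releases, 5))  # prints: True
print(sporadic_deadlines(releases, 5))         # prints: [5, 12, 17]
```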
Initial Assumptions for Scheduling Liu & Layland (Ref 9 in Ch 2 of Stankovic) A1: All hard tasks are periodic A2: Jobs are ready to run at their release time A3: Deadlines are equal to periods A4: Jobs do not suspend themselves A5: Jobs are independent in that No synchronization is required between them Jobs do not have shared resources, other than the CPU. There are no relative dependencies or constraints on release times or completion times. A6: There are no overhead costs for preemption, scheduling, or interrupt handling. A7: Processing is fully preemptable at any point.
Assumptions for Scheduling (Cont) The preceding assumptions are initially made in Ch. 3 of Stankovic However, these are not practical for most actual systems. See pg 18-19 of Stankovic for possible relaxations of some of these initial assumptions.
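These slides list the assumptions but not the results they lead to. For context, the classical Liu & Layland utilization tests that rely on assumptions A1 to A7 can be sketched as follows. The tests are the well-known textbook results and are not taken from these slides; the function names are hypothetical, and each task is given as a pair (C, T) of computation time and period.

```python
def utilization(tasks):
    """Total processor utilization U = sum of C_i / T_i."""
    return sum(C / T for C, T in tasks)


def rm_sufficient_test(tasks):
    """Rate-monotonic bound: U <= n(2^(1/n) - 1). Sufficient, not necessary."""
    n = len(tasks)
    if n == 0:
        return True
    return utilization(tasks) <= n * (2 ** (1 / n) - 1)


def edf_test(tasks):
    """Earliest-deadline-first: schedulable exactly when U <= 1."""
    return utilization(tasks) <= 1.0


light = [(1, 4), (2, 8)]   # U = 0.5
full = [(2, 4), (3, 6)]    # U = 1.0
print(rm_sufficient_test(light), edf_test(light))
print(rm_sufficient_test(full), edf_test(full))
```

A task set that fails the rate-monotonic bound is not necessarily unschedulable under fixed priorities, since the bound is only sufficient. Once any of A1 to A7 is relaxed, neither test applies as written.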
Static, Dynamic, & Offline Scheduling Ref: Stankovic, pg 19-22. Static Scheduling refers to the fact that the scheduling algorithm has complete knowledge about the task set and its constraints such as deadlines, computation times, etc. Static scheduling is viewed as realistic for many real-time systems, such as simple process control applications. A Dynamic Scheduling Algorithm (in this book) has complete knowledge of the currently active tasks, but new arrivals may occur in the future that are not known to the algorithm at the time the current set of activities is being scheduled.
Static, Dynamic, & Offline Scheduling (cont) Off-line scheduling is done prior to the design of the scheduler. In particular, off-line scheduling is not the same as static scheduling. In static real-time scheduling, an off-line schedule is found that meets all deadlines. The designer must identify a maximum set of tasks and their worst case assumptions The off-line analysis is sometimes used to produce a static set of priorities that is used to drive the schedule that is produced. If a real-time system is operating in a more dynamic environment, then it is not feasible to meet the assumptions of static scheduling (that everything is known a priori)
Schedulability Ref: Buttazzo pg 22-23, Stankovic, pg 23-24. Uniprocessor: Given a set of tasks, a schedule is an assignment of tasks to the processor so that each is executed to completion. Multiprocessor: Given sets of tasks, processors, and resources, a schedule is an assignment of processors and resources to tasks so that all tasks are completed. A schedule is said to be feasible if all tasks can be completed according to a set of specified constraints. Stankovic suggests that tasks might be restricted to hard tasks. A set of jobs is schedulable if there exists at least one algorithm that can produce a feasible schedule.
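Checking whether a given uniprocessor schedule is feasible is mechanical once the constraints are fixed. The sketch below considers timing constraints only (release times, computation times, deadlines) and ignores resource and precedence constraints. The representation of a schedule as a list of (job, start, end) slots and the name `is_feasible` are assumptions of this example.

```python
def is_feasible(jobs, schedule):
    """jobs: name -> (release, C, deadline).
    schedule: list of (name, start, end) execution slots on one processor.
    """
    executed = {name: 0 for name in jobs}
    last_end = None
    for name, start, end in sorted(schedule, key=lambda slot: slot[1]):
        release, _, deadline = jobs[name]
        if last_end is not None and start < last_end:
            return False  # two slots overlap on the single processor
        if end <= start or start < release or end > deadline:
            return False  # slot is empty, too early, or past the deadline
        executed[name] += end - start
        last_end = end
    # Every job must receive exactly its computation time.
    return all(executed[n] == jobs[n][1] for n in jobs)


jobs = {"long": (0, 4, 10), "urgent": (1, 1, 3)}
print(is_feasible(jobs, [("long", 0, 1), ("urgent", 1, 2), ("long", 2, 5)]))
print(is_feasible(jobs, [("long", 0, 4), ("urgent", 4, 5)]))
```

The first schedule splits the long job around the urgent one and satisfies every constraint. The second runs the urgent job past its deadline and is rejected. Schedulability of a job set is the stronger question of whether some algorithm can produce at least one schedule that passes such a check.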
Optimality of Algorithms Ref: Stankovic, pg 23-24 An optimal real-time scheduling algorithm is one which may fail to meet a deadline only if no other scheduling algorithm can meet this deadline. This definition of “optimal” is the typical one used in real-time scheduling. The usual non-real-time definition of optimal says an algorithm is optimal if it minimizes (maximizes) some cost function. Running time for uniprocessors and cost = (running time × number of processors) for parallel. Another metric sometimes used is to maximize the number of task arrivals that meet their deadline.