9/9/ :17 1 Can Real-Time Systems be built with Off the Shelf Components? Krithi Ramamritham Real-Time Systems Laboratory University of Massachusetts, Amherst & Indian Institute of Technology, Bombay
2 Talk Outline: Using off-the-shelf components for real-time applications?; Future application characteristics; Problems and challenges; Characteristics and limitations of OTS OSs; OTS component-based solutions for distributed real-time applications; User-level scheduling of communicating RT tasks; Conclusions, recommendations, future
3 Industrial Control Environment [diagram labels: EWS; Network; Field Devices; Field Network; PLC; Server; Operator Stations; M. Services on LynxOS, VxWorks, and NT; traffic types: Operator Commands, Video/audio, Device monitoring, Background]
4 Future Application Characteristics: Interactions and communication (human/device to human/device); Co-existence of multiple types of media (small control and signal data; periodic updates; bursty file/page/image access and transfer; continuous media); Computer and communication technology advances
5 What Are the New Challenges? Communication centered; Operating system is no longer standalone; Support for real-time and non-real-time co-existence; Integrated, end-to-end solutions required; Application-level control of system resources; Need to support new applications and technologies
6 System Research to Meet the Challenges [diagram labels: Network Architecture; Network Interface Design; Resource Manager; Scheduling Algorithms; Scheduler; Middleware Services; Traffic Management; Rate Control]
7 Real-Time Spectrum [diagram: axis from Hard to Soft]
8 Real-Time OS Spectrum [diagram: axis from Hard to Soft; labels: Real-Time Operating System; General-Purpose Operating System; VxWorks, Lynx, QNX, ...; Intime, HyperKernel, RTLinux; (Windows NT, Linux)]
9 Using General Purpose Operating Systems: GPOSs offer some capabilities useful for real-time system builders; RT applications can obtain leverage from existing development tools and applications; Some GPOSs are accepted as de facto standards for industrial applications
10 Windows NT -- for RT applications? Scheduling and priorities: preemptive, priority-based scheduling; non-degradable priorities; priority adjustment; no priority inheritance; no priority tracking; limited number of priorities; no explicit support for guaranteeing timing constraints
11 Windows NT -- for RT applications? (contd.) Quick recognition of external events: priority inversion due to Deferred Procedure Calls (DPC). I/O management. Timer granularity and accuracy: high-resolution counter with resolution of 0.8 µsec; periodic and one-shot timers with resolution of 1 msec. Rich set of synchronization objects and communication mechanisms: object queues are FIFO
13 Goals - I: Evaluate the real-time capabilities of NT. Identify areas where NT is (not) suitable for real-time applications. Determine to what extent the unpredictable parts of NT can be “masked”. Offer recommendations to designers using NT.
14 Priority Model [diagram: priority scale from 1 to 31; Real-time class above the Dynamic classes (High class, Normal class, Idle class); thread levels: Time-critical, Highest, Above Normal, Normal, Below Normal, Lowest, Idle; Time-critical marks 31 in the Real-time class and 15 in the Dynamic classes; Idle marks 1]
15 Thread Priority = Process class + level [same diagram as slide 14]
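The rule "thread priority = process class + level" can be sketched as a small lookup. The class base values (4, 8, 13, 24) and the level offsets are taken from the documented Win32 priority scheme and do not appear on the slides; the names `CLASS_BASE`, `LEVEL_OFFSET`, and `thread_priority` are hypothetical.

```python
# Sketch of "thread priority = process class + level".
# Assumption: class base values follow the documented Win32 scheme,
# they are not stated on the slides.
CLASS_BASE = {"idle": 4, "normal": 8, "high": 13, "realtime": 24}
LEVEL_OFFSET = {"lowest": -2, "below_normal": -1, "normal": 0,
                "above_normal": 1, "highest": 2}


def thread_priority(process_class: str, level: str) -> int:
    """Return the base priority (1..31) of a thread."""
    realtime = process_class == "realtime"
    # The two extreme levels saturate instead of adding an offset:
    # Time-critical is 31 (real-time) or 15 (dynamic classes),
    # Idle is 16 (real-time) or 1 (dynamic classes).
    if level == "time_critical":
        return 31 if realtime else 15
    if level == "idle":
        return 16 if realtime else 1
    return CLASS_BASE[process_class] + LEVEL_OFFSET[level]
```

Only the real-time class (16 to 31) gives priorities that the system does not adjust, which is why the slides single it out from the dynamic classes.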
16 Scheduling: Threads scheduled by executive. Priority-based preemptive scheduling. [diagram labels: Interrupts; Deferred Procedure Calls (DPC); System and user-level threads]
17-24 Servicing an interrupt [diagram built up over eight slides; labels: Interrupt Dispatch Table (Power Failure, APC, Normal Exec., Dispatch/DPC, Device X); Device Driver with Interrupt Service Routine and DPC Routine; DPC FIFO Queue; Interrupts, DPC, user-level threads]. Steps: (1) Device interrupts. (2) Transfer control to ISR. (3) Stop interrupt and queue DPC. (4) Task level drops and DPC can execute. (5) Transfer control to driver’s DPC. (6) Execution of DPC routine.
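The consequence of the FIFO DPC queue for user threads can be illustrated with a toy timeline. This is a simplified, non-preemptive sketch under stated assumptions: all DPCs are already queued, every queued DPC runs before any thread, and the names and durations (`disk_dpc`, `rt_thread`, arbitrary time units) are hypothetical.

```python
from collections import deque


def run(dpcs, threads):
    """Return {name: finish_time} for a toy DPC-then-threads timeline.

    dpcs: list of (name, duration) in arrival order (FIFO).
    threads: list of (name, priority, duration), all ready at time 0.
    Assumption: every queued DPC runs before any thread is dispatched.
    """
    t = 0
    finish = {}
    queue = deque(dpcs)
    while queue:  # DPCs are served strictly in FIFO order
        name, dur = queue.popleft()
        t += dur
        finish[name] = t
    # Threads run only after the DPC queue drains, highest priority first.
    for name, _prio, dur in sorted(threads, key=lambda th: -th[1]):
        t += dur
        finish[name] = t
    return finish


timeline = run(dpcs=[("disk_dpc", 3), ("net_dpc", 2)],
               threads=[("rt_thread", 31, 1), ("ui_thread", 8, 4)])
```

Even the priority-31 thread finishes only after both DPCs have run, which is the priority inversion the slides attribute to DPCs.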
25 I/O Handling: I/O request is sent to device driver. Device completes operation and interrupts. Complete I/O request. [diagram: Buffered I/O (keyboard, mouse) and Direct I/O (disk, network), each showing Device, System, and User’s space; APC]
26 Prototype Software Architecture [diagram labels: Operator Station; Acquisition & Control Equipment; Heartbeat; Timer; Command; Ack.; Real Video Receiver; Producer; Consumer; Buffer; Highest Priority; Normal Priority; Real-Time Class]
27 Performance Metrics: Round Trip Time (RTT) as seen by the operator input. Rate of execution of sensor data processing entities. Quality of the video output.
28 Workload: Number of sensor data streams (2-20). Period of new sensor values ( ms). Period of control messages ( ms). Amount of work done in processing data. One video and audio.
29 Operator Command Performance: Two 1 KB sensor data streams; 1 second update rate; 30 ms period for control messages; no work
30 Operator Command Performance: NT scheduler, same RT priority; 16 sensor data streams (1 KB); update rate (100, 200, 500, 1000 ms); 90 ms period for control messages; work
31 Operator Command Performance (user-level scheduling): Rate Monotonic with limited levels: 16 sensor data streams (1 KB); update rate (100, 200, 500, 1000 ms); 50 ms period for control messages; work. Time-cognizant dispatcher + plan: 8 sensor data streams (1 KB); update rate (100, 200, 500, 1000 ms); 100 ms period for control messages; work
32 Design principles and recommendations: Do not depend on the NT scheduler to accomplish timing behavior in interactive applications. Utilize user-level scheduling to achieve higher predictability. If possible, characterize the duration of I/O activity and its frequency. Lock pages in memory for real-time threads. Manage and control utilization of system resources.
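One user-level policy named in the experiments is rate monotonic with limited levels. The slides do not say how periods were mapped to the available priorities; the following is a minimal sketch of one possible mapping, in which neighbouring periods share a level when there are more distinct periods than levels. The function name `assign_levels` and the example level numbers are hypothetical.

```python
def assign_levels(periods_ms, levels):
    """Map each period to a priority level in rate monotonic order.

    periods_ms: task periods in milliseconds.
    levels: available priority levels, highest priority first.
    Shortest period gets the highest level. If there are more distinct
    periods than levels, neighbouring periods share a level.
    """
    distinct = sorted(set(periods_ms))
    mapping = {}
    for rank, period in enumerate(distinct):
        if len(distinct) > len(levels):
            idx = rank * len(levels) // len(distinct)
        else:
            idx = rank
        mapping[period] = levels[idx]
    return mapping


# Update rates from the experiments, mapped onto two and four levels.
two_levels = assign_levels([100, 200, 500, 1000, 100], [26, 25])
four_levels = assign_levels([100, 200, 500, 1000], [26, 25, 24, 23])
```

With only two levels, the 100 ms and 200 ms streams share the higher priority and the 500 ms and 1000 ms streams share the lower one.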
33 Use a General Purpose OS for RT? It is possible to improve the predictability of real-time tasks. It is not possible to “mask” all sources of unpredictability within NT “as is”. The designer needs to be aware of the effects of the DPC queue on any user thread and of I/O handling. The prototype application demonstrates the use of the recommendations. K. Ramamritham, C. Shen, O. González, S. Sen, S.B. Shirgurkar. Using Windows NT for Real-Time Applications. In Proc. of the 4th IEEE Real-Time Technology and Applications Symposium, Denver, CO, June 1998.
35 Challenges in Supporting Communicating Real-Time Tasks [diagram labels: Network Architecture; Network Interface Design; Resource Manager; Scheduling Algorithms; Scheduler; Middleware Services; Traffic Management; Rate Control]
36 Communications Middleware: Transporting multiple media types (control, data, visual, image, alarms, video and audio); Integrating control with visual monitoring; Enabling plug-and-play; QoS management
37 Real-Time Channel-based Reflective Memory (RT-CRM) [diagram labels: Writer; ReMA; DPA_1 ... DPA_n; DRA_n; Reader_i; Node 1; Node 2; Network]. ReMA = Reflective Memory Area; DPA = Data Push Agent; DRA = Data Receive Agent
38 RT-CRM Operation Modes: Synchronous / Asynchronous; Blocking / Non-blocking [diagram labels: Writer; ReMA; DPA_1 ... DPA_n; Reader_i; Node 1; Node 2; Network]
39 Using MidART services: Synchronous mode [diagram labels: Node 1; Node 2; Network; Write Cmd; DPA; Process Cmd; DPA; Result]
40 End-to-End QoS Provisions: Memory-to-Memory; Application-to-Application [diagram labels: Writer; ReMA; DPA_1 ... DPA_n; Reader_i; Node 1; Node 2; Network]
41 GPOS Problems: No priority inheritance (among user processes/threads); No priority tracking (from user to system/network threads); Limited number of priorities; No explicit support for guaranteeing timing constraints
42 Server-based User Level Scheduling. Problem alleviated: No priority inheritance and limited priorities; No priority tracking; Hidden protocol stack; Priority inversion. Solution: Modified Dual Priority Scheduling
43 Dual Priority Scheduling (York): Three priority bands. High band: real-time tasks after promotion time. Middle band: non-real-time tasks. Low band: real-time tasks before promotion time.
44 Dual Priority in MidART: Four priority bands. Time Critical band: critical real-time tasks. High band: real-time tasks after promotion time. Middle band: non-real-time tasks. Low band: real-time tasks before promotion time.
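The band assignment can be sketched as a function of the task kind and the current time. This is an illustrative reading of the four-band diagram, not the MidART implementation; the function `band`, its parameters, and the task kind names are hypothetical, and the promotion offset is assumed to be measured from the task's release.

```python
# Bands listed from highest to lowest priority (assumed reading of the diagram).
BAND_ORDER = ["time_critical", "high", "middle", "low"]


def band(kind, now=0, release=0, promotion_offset=0):
    """Return the priority band of a task at time `now`.

    kind: "critical", "realtime", or "non_realtime".
    A real-time task starts in the low band and is promoted to the high
    band once `promotion_offset` time units have passed since `release`.
    """
    if kind == "critical":
        return "time_critical"
    if kind == "non_realtime":
        return "middle"
    if kind == "realtime":
        return "high" if now >= release + promotion_offset else "low"
    raise ValueError(f"unknown task kind: {kind}")
```

Before promotion, a real-time task yields to non-real-time work in the middle band; after promotion it preempts that work, so its deadline can still be met.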
45 Communication Server [diagram labels: ReMA/DPA pairs; Queues (High, Promoted, Middle, Low); Rate Controllers; Data Pusher; To promotion time]. Request = actual messages. Queues: limited priorities. Calculation of promotion time.
46 Admission control (to calculate the schedulability of T_i):
Promotion time_i = Deadline_i - worst-case response time_i
worst-case response time_i = worst-case delay (w_i) + release jitter_i
$$ w_i^{m+1} = C_i + S_o + \sum_{j \in hp(i)} \left\lceil \frac{w_i^m + J_j}{T_j} \right\rceil C_j $$
where C_i is the computation time, S_o is the blocking time, and the sum is the interference of high-priority tasks.
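The recurrence for the worst-case delay is solved by fixed-point iteration, starting from the task's own computation time plus blocking. The sketch below assumes the bracket in the interference term is a ceiling, as in standard response-time analysis, and uses integer time units; the task fields (`C`, `T`, `J`, `D`), the function names, and the example task set are hypothetical.

```python
import math


def worst_case_delay(i, tasks, blocking=0):
    """Iterate w = C_i + blocking + sum(ceil((w + J_j) / T_j) * C_j).

    tasks: list of dicts with C (computation), T (period), J (jitter),
    D (deadline), sorted highest priority first, so tasks[:i] is hp(i).
    Returns the converged delay, or None if the response time would
    exceed the deadline (the task is rejected at admission).
    """
    ci = tasks[i]["C"]
    limit = tasks[i]["D"] - tasks[i]["J"]  # delay budget before the deadline
    w = ci + blocking
    while w <= limit:
        interference = sum(math.ceil((w + hp["J"]) / hp["T"]) * hp["C"]
                           for hp in tasks[:i])
        w_next = ci + blocking + interference
        if w_next == w:  # fixed point reached
            return w
        w = w_next
    return None


def promotion_time(i, tasks, blocking=0):
    """Promotion time = deadline - worst-case response time, or None."""
    w = worst_case_delay(i, tasks, blocking)
    if w is None:
        return None
    return tasks[i]["D"] - (w + tasks[i]["J"])


# Hypothetical task set, highest priority first, deadlines equal to periods.
TASKS = [{"C": 1, "T": 4, "J": 0, "D": 4},
         {"C": 2, "T": 6, "J": 0, "D": 6},
         {"C": 3, "T": 13, "J": 0, "D": 13}]
```

For the lowest-priority task in this example the iteration goes 3, 6, 7, 9, 10 and stops at 10, so the task may run in the low band for 3 time units after release before it must be promoted.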
47 Rate Control with Uniform Message Size: Bounding the priority inversion time due to non-preemptive message transmission. The “optimal” message size is a platform-dependent parameter. Receiving node capability is the key (the known live-lock problem). Match value between the system timer granularity and the data push latency (e.g., in our case, 8 KB).
49 Communication Server [diagram labels: ReMA/DPA pairs; Queues (High, Promoted, Middle, Low); Rate Controllers; Data Pusher; To promotion time; STOP]. Uniform message size to allow transmission with fixed preemption points.
50 Communication Server [diagram labels: ReMA/DPA pairs; Queues (High, Promoted, Middle, Low); Rate Controllers; Data Pusher; To network]. Uniform message size to allow transmission with fixed preemption points.
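The idea of fixed preemption points can be sketched as a data pusher that sends one uniform-size chunk at a time and re-examines the queues between chunks. This is a minimal illustration, not the MidART communication server: rate control and the network itself are not modelled, the queue names are taken from the diagram and assumed to be in priority order, and `DataPusher`, `split`, and the message names are hypothetical.

```python
from collections import deque

CHUNK = 8 * 1024  # uniform message size; 8 KB is the value quoted in the talk
BANDS = ["high", "promoted", "middle", "low"]  # assumed priority order


def split(payload: bytes, chunk: int = CHUNK):
    """Cut a payload into pieces of at most `chunk` bytes."""
    return [payload[i:i + chunk] for i in range(0, len(payload), chunk)]


class DataPusher:
    def __init__(self):
        self.queues = {b: deque() for b in BANDS}

    def submit(self, band, msg_id, payload):
        """Queue a message as a sequence of uniform-size chunks."""
        for seq, part in enumerate(split(payload)):
            self.queues[band].append((msg_id, seq, part))

    def push_one(self):
        """Send one chunk from the highest-priority non-empty queue."""
        for b in BANDS:
            if self.queues[b]:
                msg_id, seq, part = self.queues[b].popleft()
                return msg_id, seq, len(part)
        return None


pusher = DataPusher()
pusher.submit("low", "image", bytes(20 * 1024))
first = pusher.push_one()   # first chunk of the low-priority image
pusher.submit("high", "alarm", bytes(100))
second = pusher.push_one()  # the alarm overtakes the rest of the image
third = pusher.push_one()   # image transmission resumes
```

Because the pusher re-examines the queues after every chunk, a high-priority message waits for at most one chunk transmission, which bounds the priority inversion.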
51 Server-based User Level Scheduling. Problem alleviated: No priority inheritance and limited priorities; No priority tracking; Hidden protocol stack; Priority inversion. Solution: Modified Dual Priority Scheduling; Real-time data pusher; Rate control; Uniform message size. Integration of real-time and non-real-time applications on the same computing and communication platform.
52 Industrial Control Environment [diagram labels: EWS; Network; Field Devices; Field Network; PLC; Operator Stations; Server; Periodic Monitoring (10 msec); Command and control (20 msec)]
55 Summary: User Level Communication: Communication Scheduler; End-to-End Task Model; Local scheduling of precedence-constrained distributed tasks; Integration of real-time and non-real-time tasks. C. Shen, O. González, K. Ramamritham, and I. Mizunuma. User Level Scheduling of Communicating Real-Time Tasks. In Proc. of the 5th IEEE Real-Time Technology and Applications Symposium, Vancouver, Canada, June 1999.
56 Conclusion: Characterization and understanding of limitations imposed by GPOSs. Development and evaluation of an efficient user-level scheduling scheme for communicating real-time tasks. Support for coexistence of real-time and non-real-time applications using general-purpose operating systems and networks. Ongoing work on QoS management support at the middleware level.
57 Incorporation of Quality of Service: QoS management is needed to support multiple real-time applications on a single platform (operating system; application). Dynamic QoS management. Extend QoS guarantees provided by the network to end-point applications. Predictable performance for real-time and multimedia applications according to user QoS parameters.
58 System Research to Meet the Challenge [diagram labels: Network Architecture; Network Interface Design; Resource Manager; Scheduling Algorithms; Scheduler; Middleware Services; Traffic Management; Rate Control; QoS Management]
59 Research Questions: What type of support is needed to optimize resource use? How to implement efficient QoS adaptation at the middleware level? The real problem: integrated end-to-end Quality of Service management, from the mouse to the wire (user to network) and from the wire to the glass (network to display).
60 Problems: Specification of user QoS requirements; Efficient adaptation mechanisms (mode change); Integrated scheduling of multiple resources (resource reservation; account for interrupts in a GPOS); Support for management of trade-offs between different types of requirements (dependability vs. performance)