ASQF Fachgruppe Automation/ Automation Day

Realtime capable Linux: Requirements

Compared systems: (Standard) Linux, Realtime Linux, realtime kernel

Response time
– (Standard) Linux: ~1 ms
– Realtime Linux: < 10 µs
– Realtime kernel: < 10 µs

Address space
– Realtime kernel: one (physical) address space with n tasks
– Linux: one process address space with n threads

Homogeneity (Realtime Linux: same homogeneity)
– Realtime kernel: all tasks have one common API implemented by one common library
– Linux: all threads of a process have one common API implemented by one common library

Application program (API)
– 100% source compatibility
– better binary compatibility

Realtime capable Linux: The important questions

Is the POSIX-API suitable?
– Only a subset is relevant for realtime applications
– The mechanism for the notification of asynchronous events determines widespread usage

Can it provide competitive performance?
– Ideally it should be as good as specialized realtime kernels
– Ultimately cycle times of 31.25 µs should be possible

What about additional changes to the Linux kernel?
– Ideally none
– Only minimal changes allow for long-term support

The POSIX-API: Minimal set for realtime tasks

Thread management
– Creation, Termination, Scheduling

Synchronization
– Mutex
– Spinlock

Communication
– POSIX Message Queue
– Semaphore
– Shared Memory

Timer
– types (periodic, one shot, absolute, relative)
– resolution
– clock tick
– notification

Needs improvements

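As an illustration of the thread management and scheduling part of this set, the following minimal sketch prepares thread attributes for a static priority under SCHED_FIFO. The function name rt_attr_init and the priority value used in the usage note are assumptions chosen for the example; they do not come from the slides.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* Fill `attr` so that a thread created with it runs under SCHED_FIFO at the
   static priority `prio`. Returns 0 on success or the error code of the
   failing pthread_attr_* call. (rt_attr_init is a hypothetical helper.) */
int rt_attr_init(pthread_attr_t *attr, int prio)
{
    assert(attr != NULL);
    struct sched_param sp = { .sched_priority = prio };
    int rc = pthread_attr_init(attr);
    if (rc != 0)
        return rc;
    /* Without PTHREAD_EXPLICIT_SCHED the new thread would inherit the
       scheduling policy of its creator and ignore the values set below. */
    if ((rc = pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED)) != 0 ||
        (rc = pthread_attr_setschedpolicy(attr, SCHED_FIFO)) != 0 ||
        (rc = pthread_attr_setschedparam(attr, &sp)) != 0) {
        pthread_attr_destroy(attr);
        return rc;
    }
    return 0;
}
```

A caller would pass the prepared attribute object to pthread_create(). On a stock Linux system that call typically needs the privilege to use realtime policies, whereas preparing the attributes does not.
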
The POSIX Notification-API: The Timer-API

Figure: a timer is driven by a clock (CLOCK_REALTIME) and notifies through a sigevent object (SIGEV_THREAD, SIGEV_SIGNAL, SIGEV_NONE).

timer_create(clockid_t clockid, struct sigevent *evp, timer_t *timerid);

Problems:

SIGEV_THREAD:
– The implementation within glibc is not realtime capable
– Another implementation fixes it (deterministic, performance improvement by a factor of 10)

SIGEV_SIGNAL:
– Sending a signal to the entire process is not deterministic. Potentially the entire thread list of a process must be scanned.
– Minimal overhead and deterministic behavior require sending to a thread (Linux can handle it)
– No response based on signal handlers (signal handlers are for synchronous exceptions only)
– Allowing only an explicit wait makes extremely fast notification possible

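The combination argued for above (signal sent to one thread, consumed by an explicit wait instead of a handler) can be sketched with what Linux already offers: the Linux-specific SIGEV_THREAD_ID notification and sigwaitinfo(). This is a minimal sketch under stated assumptions, not the implementation from the talk; the function name wait_timer_ticks and the choice of SIGRTMIN are made up for the example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Some glibc versions do not define this accessor macro. */
#ifndef sigev_notify_thread_id
#define sigev_notify_thread_id _sigev_un._tid
#endif

/* Arm a periodic timer whose signal is directed at the calling thread and
   collect `count` expirations with sigwaitinfo().
   Returns the number of notifications received, or -1 on error. */
int wait_timer_ticks(int count, long period_ns)
{
    assert(count > 0 && period_ns > 0 && period_ns < 1000000000L);
    const int signo = SIGRTMIN;
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, signo);
    /* Block the signal: it is consumed by an explicit wait, never by a handler. */
    if (pthread_sigmask(SIG_BLOCK, &set, NULL) != 0)
        return -1;

    struct sigevent sev;
    memset(&sev, 0, sizeof sev);
    sev.sigev_notify = SIGEV_THREAD_ID;   /* Linux-specific: signal one thread */
    sev.sigev_signo = signo;
    sev.sigev_notify_thread_id = (pid_t)syscall(SYS_gettid);

    timer_t timer;
    if (timer_create(CLOCK_REALTIME, &sev, &timer) != 0)
        return -1;

    struct itimerspec its = {
        .it_value    = { .tv_sec = 0, .tv_nsec = period_ns },
        .it_interval = { .tv_sec = 0, .tv_nsec = period_ns },
    };
    int received = -1;
    if (timer_settime(timer, 0, &its, NULL) == 0) {
        received = 0;
        while (received < count) {
            siginfo_t info;
            /* Count only signals that really stem from a timer expiration. */
            if (sigwaitinfo(&set, &info) == signo && info.si_code == SI_TIMER)
                received++;
        }
    }
    timer_delete(timer);
    return received;
}
```

For example, wait_timer_ticks(3, 2000000L) waits for three expirations of a 2 ms timer. Older glibc versions may need -lrt and -pthread at link time.
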
SIGEV_THREAD: The realtime capable implementation

Figure, current glibc implementation: a managing thread waits in sigwaitinfo() for the signal to thread and calls pthread_create() for a carrier thread, which runs the handler and ends with pthread_exit().

Figure, realtime implementation: the carrier thread itself waits in sigwaitinfo() for the signal to thread and runs the handler.

Solution: creation of the carrier thread at timer creation

The trade-off between memory consumption and independent handlers is controllable:
– same pthread_attr object with same values: share an existing carrier thread
– different pthread_attr object, or the same but with different values: create a new carrier thread

Completely deterministic

Performance improvement by more than a factor of 10

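The sharing rule for carrier threads can be pictured as a lookup keyed by the attribute object and its current values. The sketch below is purely illustrative: the table, the names carrier_key and carrier_lookup, and the choice of compared attributes (policy, priority, stack size) are assumptions, and the real library code is not shown in the slides. It is also not thread safe.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

#define MAX_CARRIERS 16

/* Hypothetical key: address of the attribute object plus a value snapshot. */
struct carrier_key {
    const pthread_attr_t *attr;
    int policy;
    int priority;
    size_t stacksize;
};

static struct carrier_key carriers[MAX_CARRIERS];
static int carrier_count;

/* Return the slot of the carrier serving `attr`. A new slot is allocated when
   no carrier was registered with the same object and the same values.
   Returns -1 if the table is full or the attributes cannot be read. */
int carrier_lookup(const pthread_attr_t *attr)
{
    assert(attr != NULL);
    struct carrier_key k = { .attr = attr };
    struct sched_param sp;
    if (pthread_attr_getschedpolicy(attr, &k.policy) != 0 ||
        pthread_attr_getschedparam(attr, &sp) != 0 ||
        pthread_attr_getstacksize(attr, &k.stacksize) != 0)
        return -1;
    k.priority = sp.sched_priority;

    for (int i = 0; i < carrier_count; i++) {
        const struct carrier_key *c = &carriers[i];
        if (c->attr == k.attr && c->policy == k.policy &&
            c->priority == k.priority && c->stacksize == k.stacksize)
            return i;                /* share the existing carrier thread */
    }
    if (carrier_count == MAX_CARRIERS)
        return -1;
    carriers[carrier_count] = k;     /* a real library would create the thread here */
    return carrier_count++;
}
```

Two timers created with the same unchanged attribute object get the same slot, while a second attribute object or a changed value yields a new one. This is how the memory cost of extra carrier threads stays under the application's control.
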
The POSIX Notification-API: Extensions (1)

Figure: time axis t with timer expirations ("timer fires") and three scenarios a, b and c for the position of the sigwait() call. Labels: "timer overrun", "POSIX definition (not very helpful)", "Can be queried", "This situation must get signalled".

An additional call sigevent_set_notification() makes sending the signal to a thread available at the API level. This call
– maps the threadid of the API to the tid of the kernel
– allows supervision of timeliness via an additional encoding of the sigev_notify field
– permits an implementation with the best possible deterministic behavior

The existing sigevent structure needs no changes.

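The overrun count that POSIX lets an application query can be observed with the standard timer_getoverrun() call. The following sketch deliberately arrives late at its wait and then reads the count; it uses only existing Linux and POSIX calls and does not model the proposed sigevent_set_notification(). The function name missed_expirations and the signal number are assumptions for the example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Some glibc versions do not define this accessor macro. */
#ifndef sigev_notify_thread_id
#define sigev_notify_thread_id _sigev_un._tid
#endif

/* Start a periodic timer, miss expirations by sleeping for `sleep_ns`, then
   accept one notification and return the overrun count reported for it.
   Returns -1 on error. */
int missed_expirations(long period_ns, long sleep_ns)
{
    assert(period_ns > 0 && period_ns < 1000000000L);
    assert(sleep_ns > 0 && sleep_ns < 1000000000L);
    const int signo = SIGRTMIN + 1;
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, signo);
    if (pthread_sigmask(SIG_BLOCK, &set, NULL) != 0)
        return -1;

    struct sigevent sev;
    memset(&sev, 0, sizeof sev);
    sev.sigev_notify = SIGEV_THREAD_ID;   /* Linux-specific: signal one thread */
    sev.sigev_signo = signo;
    sev.sigev_notify_thread_id = (pid_t)syscall(SYS_gettid);

    timer_t timer;
    if (timer_create(CLOCK_MONOTONIC, &sev, &timer) != 0)
        return -1;

    struct itimerspec its = {
        .it_value    = { .tv_sec = 0, .tv_nsec = period_ns },
        .it_interval = { .tv_sec = 0, .tv_nsec = period_ns },
    };
    int overrun = -1;
    if (timer_settime(timer, 0, &its, NULL) == 0) {
        struct timespec nap = { .tv_sec = 0, .tv_nsec = sleep_ns };
        nanosleep(&nap, NULL);            /* the thread is late for its wait */
        siginfo_t info;
        if (sigwaitinfo(&set, &info) == signo)
            overrun = timer_getoverrun(timer);   /* expirations missed meanwhile */
    }
    timer_delete(timer);
    return overrun;
}
```

With a 1 ms period and a 20 ms sleep the returned count is expected to be well above zero. The application only learns about the lateness because it asks; nothing signals the situation by itself, which is the gap the proposed supervision addresses.
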
The POSIX Notification-API: Extensions (2)

A device must be a possible source of a timer tick (clock)
– Example: a Profinet Realtime Ethernet Controller ERTEC provides a 250 µs tick, not necessarily synchronous with CLOCK_REALTIME
– additional call register_clock()

A firing timer is one special type of event
– arbitrary asynchronous IO events should get signalled the same way
– additional call event_create()

Figure: Extensions to the POSIX API. A timer is fed either by CLOCK_REALTIME or by a device driver clock registered with register_clock(). A direct IO event is created with create_event(). Timers and IO events notify through the sigevent object: SIGEV_THREAD (realtime capable implementation of the library module), SIGEV_SIGNAL (inhibited), or, via sigevent_set_notification(), signal to thread (optionally with supervision).

Realtime capable Linux: Base concept

Figure: a Linux process uses the POSIX API through the common library. The system call entry is exactly the same for both kernels. Ipipe (ADEOS) passes HW interrupts and soft interrupts to the realtime kernel and the Linux kernel.
– Realtime kernel: threads controlled by the realtime scheduler
– Linux kernel: threads controlled by the Linux scheduler
– Static priorities (POSIX counting) are divided between the realtime kernel and Linux; the boundary is implementation defined
– Configurations shown: Linux + Realtime, Linux

Realtime capable Linux: Functionality of the realtime kernel

Typical partitioning of a realtime thread
– Setup phase: much functionality is needed, but there are neither realtime requirements nor high performance requirements
– Time critical response path (wait for event, event): minimal functionality needed

For all calls not implemented, the thread temporarily runs under the control of the Linux scheduler.

Minimal realtime kernel
– timer functions and their notification
– communication and synchronization
– scheduling

Realtime capable Linux: Limits of homogeneity

Figure: a Linux process with threads managed by the realtime scheduler (realtime kernel) and threads managed by the Linux scheduler (Linux kernel). Ipipe passes HW interrupts and soft interrupts to both kernels. Each domain has its own signal sources, e.g. a timer or an IO event.

A signal can only get sent to the domain which created the signal source.

This restriction is
– important for the best possible performance (allows for separation of implementation)
– acceptable and manageable (violations are reported)

Realtime capable Linux: Implementation Kernel

Two additional loadable modules:

A device driver
– Functional extensions of the API (API calls via ioctl() system calls, additional internal interfaces for realtime capable device drivers)
– Interface for initialization/configuration of the realtime module, e.g. registration of a process as realtime process

The realtime module (20K for x86)
– Functionality of the realtime kernel
– Communication between realtime and Linux kernel

Three modified kernel files:
– futex.c (hook), posix_timers.c (clock registration), mqueue.c

Ipipe modifications:
– System call handling (additional hook at system call exit + optimization)
– Interrupt handling (optimization, support for a truly preemptive realtime kernel)

Realtime capable Linux: Implementation glibc

Modified functions:
– Initialization: registration of the realtime process
– Spinlock function also for static priorities
– Realtime capable implementation for SIGEV_THREAD

Additional functions:
– Device driver functions via ioctls: registration of a timer clock, registration of an (IO) event
– Thread specific sigevent notification (the kernel is able to do it)
– Fast message queues

Realtime capable Linux: System call overhead in clocks

Test scenario:
– Measurement with getuid(), since getpid() is handled inside glibc
– Measurement with time stamp counter (94 clocks resp. 12/5 clocks)
– glibc generated with either int 80 or sysenter system entry (…/…)

Celeron 2.8 GHz: 1000 clocks = 357 ns
Athlon 2 GHz: 1000 clocks = 500 ns

Table: system call overhead for Linux original, ipipe original, and Linux + Ipipe patch with "+ AuD patch" and "+ AuD optimized", each for the Linux domain and the realtime domain.
Values as extracted: 343/147; 1124/ /914; 1260/ / / /257; 317/ / / /641 /185

Realtime capable Linux: Performance inter-thread communication

Test scenario: two threads communicating either via semaphores or through POSIX message queues

Figure: a high priority thread and a low priority thread exchange sem_post()/mq_send() and sem_wait()/mq_receive() calls in the directions "up" and "down". A helper thread of lowest priority is also present.

The low priority thread must have a chance to proceed to sem_wait() / mq_receive().

Note: times for down are always longer than for up, since an additional sem_wait() / mq_receive() call is included.

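The semaphore variant of such a two-thread exchange can be sketched as a simple ping-pong. The sketch only shows the communication pattern; thread priorities, the helper thread and the time measurement of the test scenario are left out, and the names pingpong, responder and run_pingpong are assumptions.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>

struct pingpong {
    sem_t ping;      /* posted by the initiating thread */
    sem_t pong;      /* posted by the responding thread */
    int rounds;
    int answered;    /* written by the responder, read after pthread_join() */
};

/* sem_wait() may be interrupted by a signal; retry in that case. */
static void sem_wait_retry(sem_t *s)
{
    while (sem_wait(s) == -1 && errno == EINTR)
        ;
}

static void *responder(void *arg)
{
    struct pingpong *pp = arg;
    for (int i = 0; i < pp->rounds; i++) {
        sem_wait_retry(&pp->ping);
        pp->answered++;
        sem_post(&pp->pong);
    }
    return NULL;
}

/* Run `rounds` round trips between the caller and a second thread.
   Returns the number of answered rounds, or -1 on setup failure. */
int run_pingpong(int rounds)
{
    assert(rounds > 0);
    struct pingpong pp = { .rounds = rounds, .answered = 0 };
    pthread_t th;
    if (sem_init(&pp.ping, 0, 0) != 0 || sem_init(&pp.pong, 0, 0) != 0)
        return -1;
    if (pthread_create(&th, NULL, responder, &pp) != 0)
        return -1;
    for (int i = 0; i < rounds; i++) {
        sem_post(&pp.ping);           /* wake the other thread */
        sem_wait_retry(&pp.pong);     /* block until it has answered */
    }
    pthread_join(th, NULL);
    sem_destroy(&pp.ping);
    sem_destroy(&pp.pong);
    return pp.answered;
}
```

Replacing sem_post()/sem_wait() with mq_send()/mq_receive() on two POSIX message queues would give the message queue variant of the same pattern.
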
Realtime capable Linux: Interthread communication in clocks

Optimized system call entry: int 80

Celeron 2.8 GHz: 1000 clocks = 357 ns
Athlon 64 2.0 GHz: 1000 clocks = 500 ns

Figure: up and down times for semaphore and message queue, for the realtime domain and the Linux domain.

Times for Realtime Linux:
– first switch to the interrupted Linux thread, afterwards switch to the woken up Linux thread (message queue)
– semaphores (futex) need an additional proxy thread for futex_wake

Realtime capable Linux: Interthread communication in clocks

Optimized system call entry: sysenter

Celeron 2.8 GHz: 1000 clocks = 357 ns
Athlon 64 2.0 GHz: 1000 clocks = 500 ns

Figure: up and down times for semaphore and message queue, for the realtime domain and the Linux domain.

Times for Realtime Linux:
– first switch to the interrupted Linux thread, afterwards switch to the woken up Linux thread (message queue)
– semaphores (futex) need an additional proxy thread for futex_wake

Realtime capable Linux: Interrupt response time in clocks

Table: ISR response and thread response, min and max, for Athlon64 2.0 GHz (1000 clocks = 500 ns) and Celeron 2.8 GHz (1000 clocks = 357 ns).

Test scenario:
– An application thread issues an ipipe soft interrupt via a driver
– The ISR resumes another application thread (waiting with sigtimedwait()) via send_event
– ISR frequency varied between 100 µs and 1 s
– System load generated via GUI (Eclipse start, file transfer, etc.)

The minimal figures are of interest in several respects:
– For high interrupt frequencies they are representative of the average and therefore of the load
– They are typical for the path length and therefore useful for scalability projections: what can be achieved with less powerful HW (e.g. an SoC with a mediocre memory interface)?
– For a dedicated processor of a multicore system (private TLB, L1 cache, L2 cache) the guaranteed worst case time comes closer to the minimal figure

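The waiting side of this test scenario, an application thread blocked in sigtimedwait(), can be sketched with standard calls only. The driver, the soft interrupt and send_event are not modelled; the function name wait_event and its return convention are assumptions for the example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <time.h>

/* Wait up to `timeout_ms` milliseconds for `signo`.
   Returns 1 if the signal arrived, 0 on timeout, -1 on any other error. */
int wait_event(int signo, long timeout_ms)
{
    assert(timeout_ms >= 0);
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, signo);
    /* The signal must be blocked so that it stays pending until the wait. */
    if (pthread_sigmask(SIG_BLOCK, &set, NULL) != 0)
        return -1;

    struct timespec timeout = {
        .tv_sec  = timeout_ms / 1000,
        .tv_nsec = (timeout_ms % 1000) * 1000000L,
    };
    for (;;) {
        int rc = sigtimedwait(&set, NULL, &timeout);
        if (rc == signo)
            return 1;
        if (rc == -1 && errno == EAGAIN)
            return 0;                 /* nothing arrived within the timeout */
        if (rc == -1 && errno == EINTR)
            continue;                 /* interrupted by an unrelated signal */
        return -1;
    }
}
```

The timeout gives the waiting thread a simple form of supervision: a return value of 0 tells it that the expected event did not arrive in time.
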
Realtime capable Linux: Positioning to RT_PREEMPT

The kernel implementation is transparent to application programs
– An application binary for a dual domain system runs without any changes on a Linux-only system and therefore also on an RT_PREEMPT system

Possible coexistence scenarios with RT_PREEMPT
– RT_PREEMPT as an alternative if there are no extremely strong realtime requirements
– An additional realtime domain for extremely strong requirements for response time and thread communication, if demonstration (proof) is required
– Realtime domain: tiny kernel, short paths
– RT_PREEMPT: large complex kernel, lengthy paths, nested PI mutexes, RCU locks

Commonalities with RT_PREEMPT
– The same modifications of the glibc are needed
– The same kernel enhancements (registering a clock, realtime capable POSIX message queues) are needed

Realtime capable Linux: Conclusion

The POSIX Realtime Extensions need some fine tuning
– Minimal extensions to the API definition improve usability and allow for the best possible performance

Realtime under Linux can be provided transparently
– The performance figures are comparable to specialized realtime kernels

Ipipe is a sustainable base
– Small optimizations improve system call handling and interrupt response