Fault-Tolerant Rate-Monotonic Scheduling. Sunondo Ghosh, Rami Melhem, Daniel Mosse and Joydeep Sen Sarma
Outline: Background; System, task and fault models; IBRMS; FTRMS; Conditions & bounds; Simulation & conclusion
Definitions & classifications. Real-time scheduling algorithms: preemptive & non-preemptive. Tasks: periodic & aperiodic. Real-time systems: static & dynamic. Three kinds of hardware faults: permanent, transient or intermittent. This paper focuses on adding time redundancy to a schedule of preemptive, periodic real-time tasks so that faults can be tolerated.
System, task & fault models. Sets of independent, periodic, preemptive tasks are considered. A task is eligible for execution at the beginning of its period and has to complete before the end of its period. Each task i has a computation time Ci, a period Ti and a utilization Ui = Ci/Ti. The total utilization of n tasks is U = U1 + U2 + ... + Un.
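The task model above can be written down in a few lines. This is a minimal sketch, not code from the paper: the `Task` class, its field names and the sample values are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """A periodic task (hypothetical helper type, not from the paper)."""
    c: float  # computation time Ci
    t: float  # period Ti (the deadline is the end of the period)

    @property
    def utilization(self) -> float:
        # Ui = Ci / Ti
        return self.c / self.t


def total_utilization(tasks) -> float:
    # U = U1 + U2 + ... + Un
    return sum(task.utilization for task in tasks)


# Illustrative values only.
example_tasks = [Task(c=1.5, t=5), Task(c=2, t=8)]
example_total = total_utilization(example_tasks)
```

Keeping Ci and Ti on the task and deriving Ui avoids the two going out of sync when a task set is edited.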
Rate monotonic scheduling. Tasks with higher request rates (shorter periods) are assigned higher priorities. Proved result: such a priority assignment is optimal among fixed-priority assignments. Utilization bound: any set of n tasks with a total utilization below U_LL = n(2^(1/n) - 1) is schedulable on a uniprocessor system. For large values of n, the RMS bound approaches ln 2, about 0.69.
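The utilization bound on this slide is the Liu and Layland bound n(2^(1/n) - 1). A short sketch of the resulting test follows; the function names are hypothetical. Note that the test is only sufficient: a task set that fails it may still be schedulable under RMS.

```python
import math


def rms_bound(n: int) -> float:
    """Liu and Layland utilization bound for n tasks: n * (2**(1/n) - 1)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return n * (2 ** (1 / n) - 1)


def passes_rms_test(utilizations) -> bool:
    """Sufficient (not necessary) schedulability test for RMS."""
    utilizations = list(utilizations)
    return sum(utilizations) <= rms_bound(len(utilizations))


# The bound decreases with n and approaches ln 2 from above.
bound_for_two = rms_bound(2)
bound_for_many = rms_bound(1000)
limit = math.log(2)
```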
Inserted-Backup RMS. The general fault-tolerance approach is to insert enough slack in the schedule to guarantee re-execution. The amount of slack available over an interval of time is proportional to the length of that interval. The slack is treated as a backup task B with backup utilization U_B. The same reserved time is used as the backup for all the tasks in the system.
An example of an IBRMS schedule: C1 = 1.5, T1 = 5, U1 = 30%; C2 = 2, T2 = 8, U2 = 25%. Assume U_B = 30%.
Conditions (recovery from a single fault). [S1]: For every task Si, a slack of at least Ci should be present between kTi and (k+1)Ti. [S2]: If there is a fault during the execution of task Sr, then the recovery scheme should enable task Sr to re-execute for a duration Cr before its deadline. [S3]: When a task re-executes, it should not cause any other task to miss its deadline. If the task set satisfies [S1], then the following recovery scheme ensures that both [S2] and [S3] are satisfied.
Recovery scheme. Recovery mode: any instance of a task that has a priority higher than that of the re-executing task t and a deadline later than Dt (the deadline of t) will be delayed until recovery is complete.
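The recovery-mode rule can be read as a filter over the ready task instances. The sketch below is one possible reading, not the paper's algorithm: the `Job` type and function name are hypothetical, and priority is assumed to follow RMS, so a shorter period means a higher priority.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """One instance of a periodic task (hypothetical helper type)."""
    name: str
    period: float    # under RMS, a shorter period means a higher priority
    deadline: float  # absolute deadline of this instance


def delayed_during_recovery(ready_jobs, recovering: Job):
    """Return the jobs that are held back while `recovering` re-executes.

    A job is delayed if it has a higher priority than the re-executing job
    and its deadline is later than the re-executing job's deadline.
    """
    return [
        job for job in ready_jobs
        if job.period < recovering.period and job.deadline > recovering.deadline
    ]


# Illustrative values: a task with period 8 re-executes, deadline at time 8.
recovering = Job("t2", period=8, deadline=8)
ready = [
    Job("t1_first", period=5, deadline=5),    # higher priority, earlier deadline
    Job("t1_second", period=5, deadline=10),  # higher priority, later deadline
    Job("t3", period=12, deadline=12),        # lower priority
]
delayed = delayed_during_recovery(ready, recovering)
```

Only `t1_second` is delayed here: `t1_first` must still meet its earlier deadline, and `t3` has lower priority than the re-executing task.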
Recovery algorithm
Schedulability test. If the total utilization is lower than the bound (the least upper bound), then the task set is schedulable. U_naive = U_LL - U_B, with U_B = max{Ui}. U_G-FT-RMS = U_LL (1 - U_B).
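The two bounds on this slide can be compared numerically. The sketch below makes several assumptions: the function names are hypothetical, the total utilization is taken over the real tasks only (excluding the backup), and the large-n limit ln 2 stands in for U_LL, since the slide does not say for which n it is evaluated.

```python
import math


def naive_bound(u_ll: float, u_b: float) -> float:
    # U_naive = U_LL - U_B
    return u_ll - u_b


def g_ft_rms_bound(u_ll: float, u_b: float) -> float:
    # U_G-FT-RMS = U_LL * (1 - U_B)
    return u_ll * (1.0 - u_b)


def is_schedulable(utilizations, u_ll: float, bound_fn) -> bool:
    """Sufficient test: total utilization must be lower than the bound."""
    utilizations = list(utilizations)
    u_b = max(utilizations)  # U_B = max{Ui}
    return sum(utilizations) < bound_fn(u_ll, u_b)


u_ll = math.log(2)          # assumption: large-n value of the RMS bound
utils = [0.10, 0.15, 0.25]  # illustrative utilizations, total 0.50
ok_naive = is_schedulable(utils, u_ll, naive_bound)
ok_g_ft_rms = is_schedulable(utils, u_ll, g_ft_rms_bound)
```

Because U_LL is at most 1, U_LL (1 - U_B) is never smaller than U_LL - U_B, so the G-FT-RMS bound admits every task set that the naive bound admits. The illustrative set above passes the G-FT-RMS test but not the naive one.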
Minimum fault interval. Proved: one fault can be tolerated within T_n + T_(n-1) if the backup task with U_B = max{Ui} and the recovery scheme RS are used. Recovery from multiple faults: with U_BT = m * max{Ci/Ti}, at least m faults can be tolerated. The FTRMS bounds can be further improved by making assumptions about the task utilizations; see the paper for details.
Example. Three tasks: T1 = 10, T2 = 15, T3 = 24; C1 = 2.5, C2 = 3, C3 = 3.6. Then U1 = 25%, U2 = 20%, U3 = 15%. Fault tolerance requirement: each task should tolerate one transient fault; task 3 should tolerate one additional transient fault within T_n + T_(n-1). U_B1 = max{Ui} = 25%, U_B2 = 15%.
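The numbers in this example can be reproduced with a short calculation. The sketch assumes one reading of the slide: the k-th backup utilization is the largest Ui among the tasks that must tolerate at least k faults. The function and variable names are hypothetical.

```python
def backup_levels(tasks, faults_required):
    """Return [U_B1, U_B2, ...] for the given fault requirements.

    tasks: name -> (Ci, Ti); faults_required: name -> number of faults.
    Assumption: level k is sized by the largest Ui among the tasks that
    must tolerate at least k faults.
    """
    utils = {name: c / t for name, (c, t) in tasks.items()}
    levels = []
    for k in range(1, max(faults_required.values()) + 1):
        levels.append(
            max(utils[name] for name in tasks if faults_required[name] >= k)
        )
    return levels


# Values from the example: (Ci, Ti) per task.
tasks = {"task1": (2.5, 10), "task2": (3, 15), "task3": (3.6, 24)}
# Every task tolerates one fault; task 3 tolerates one additional fault.
faults_required = {"task1": 1, "task2": 1, "task3": 2}

u_b1, u_b2 = backup_levels(tasks, faults_required)
```

Under this reading U_B1 comes from task 1, the task with the largest utilization, and U_B2 comes from task 3, the only task that needs a second re-execution.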
Simulation & conclusion. An event-driven simulator is used to compare pure RMS and FT+RMS: ☻ Schedulability ☻ Utilization ☺ Lost tasks