Prabhat Kumar Saraswat Paul Pop Jan Madsen


1 Task Migration for Fault-Tolerance in Mixed-Criticality Embedded Systems
Prabhat Kumar Saraswat, Paul Pop, Jan Madsen
Workshop on Adaptive and Reconfigurable Embedded Systems (at ESWeek’09), October 11, 2009, Grenoble, France

2 Problem Formulation
[Figure: hard and soft tasks on a bus-based platform, with their mapping and utilization left open.]
Given: implementation and fault occurrence
Determine: mapping and utilization
Such that: deadlines for all hard real-time tasks are satisfied, with graceful degradation for soft tasks
Approach: online task migration and utilization allocation algorithm

3 Example Embedded Applications
Timing requirements: hard, soft
Safety requirements: permanent faults, transient faults, no fault tolerance
Example automotive applications: ABS (anti-lock braking), engine control, steering wheel, transmission control, audio, climate control, power seat, sun roof, driver’s info panel
Same platform: economic pressures, multicore
[Figure labels: FAULTS! Hard constraints, soft constraints.]
Real-time systems can be classified into two types: hard real-time and soft real-time. Hard real-time systems have strict timing constraints: the deadlines of hard tasks must be satisfied, otherwise the system degrades catastrophically. Soft real-time systems have flexible timing constraints, and deadline misses are tolerable. Static approaches decide before runtime how to recover from faults. However, as systems become more complex, these approaches become quite expensive. Because system requirements keep changing and faults are dynamic in nature, it is important that fault-recovery decisions can be taken at runtime. In this paper we consider a mixed real-time system and provide an online algorithm for migrating tasks from a faulty processor to the healthy processors.

4 Outline
Application Model
Platform Model
Example
Task Migration and Bandwidth Allocation (TMBA)
Experimental results
Conclusions

5 Application Model
Hard real-time tasks: WCET, deadline, period; safety-criticality (permanent faults, transient faults)
Soft real-time tasks: execution time PDF (probability over execution time), soft deadline, period
Tasks are periodic.
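The task attributes listed on this slide can be captured in a small data structure. The following is only a sketch: the class names, field names and the example values are assumptions chosen for illustration, and the execution time PDF is represented as a discrete dictionary, which the slides do not specify.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class HardTask:
    name: str
    wcet: float      # worst-case execution time
    period: float    # tasks are periodic
    deadline: float
    safety_critical: bool = True  # assumed flag for the safety-criticality attribute

    @property
    def utilization(self) -> float:
        return self.wcet / self.period


@dataclass(frozen=True)
class SoftTask:
    name: str
    period: float
    soft_deadline: float
    exec_time_pdf: Dict[float, float]  # execution time -> probability (assumed discrete)

    @property
    def mean_exec_time(self) -> float:
        return sum(c * p for c, p in self.exec_time_pdf.items())


# Hypothetical values, for illustration only.
hard_example = HardTask("hard_example", wcet=8, period=20, deadline=20)
soft_example = SoftTask("soft_example", period=150, soft_deadline=33,
                        exec_time_pdf={10: 0.5, 20: 0.3, 30: 0.2})
print(hard_example.utilization, soft_example.mean_exec_time)
```

Keeping the full PDF for soft tasks, rather than a single WCET, is what later allows a probability of meeting the deadline to be estimated.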

6 Platform Model
[Figure: task τ1 shown without checkpointing as a single execution segment, and with checkpointing and fault recovery as execution segments τ11 and τ12 together with the checkpointing overhead, error detection overhead and recovery overhead.]

7 Constant Bandwidth Server
Each soft task is assigned a CBS with parameters:
Qi: maximum server budget (bandwidth)
Ti: server period (equal to the period of the soft task)
A soft task is allowed to execute for only Qi units of time every period Ti.
The probability of meeting the deadline (QoS) depends on Qi.
[Figure: processor utilization split between soft and hard tasks.]
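A short sketch of how the server parameters translate into processor utilization. It assumes EDF scheduling and hard tasks with deadlines equal to their periods, in which case a processor is schedulable when the hard utilization plus the sum of server bandwidths Qi/Ti does not exceed 1. The function name and the example numbers are illustrative assumptions.

```python
def total_utilization(hard_tasks, servers):
    """hard_tasks: list of (WCET, period); servers: list of (Q, T).

    Returns the hard utilization plus the sum of CBS bandwidths Q/T.
    """
    hard = sum(wcet / period for wcet, period in hard_tasks)
    soft = sum(q / t for q, t in servers)
    return hard + soft


def fits_on_processor(hard_tasks, servers, eps=1e-9):
    # Assumed test: EDF utilization bound of 1 (implicit deadlines).
    return total_utilization(hard_tasks, servers) <= 1.0 + eps


# Illustrative values: two hard tasks and three CBS servers on one PE.
hard = [(8, 20), (13, 40)]
cbs = [(13, 150), (17, 200), (20, 230)]
print(round(total_utilization(hard, cbs), 4), fits_on_processor(hard, cbs))
```

Shrinking a budget Qi frees utilization for migrated tasks, at the cost of a lower probability that the corresponding soft task meets its deadline.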

8 CBS Example [Abeni 98]
[Figure: schedule timelines (time axis 0 to 22) for a hard task with WCET = 2 and period = 3, and for soft requests of 3, 2 and 1 time units served by a CBS with bandwidth = 2 and period = 7; the server deadlines are labeled 2+7, 2+7+7 and 2+7+7+7.]
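The deadline labels in this example follow the CBS rule that the server deadline is postponed by one server period each time the budget is exhausted. The sketch below assumes the server is idle when the request arrives, so it starts with a full budget and a deadline of arrival + T. The arrival time of 2 and request size of 3 are one reading of the figure labels, not values confirmed by the transcript.

```python
import math


def cbs_server_deadlines(arrival, exec_time, budget, period):
    """Server deadlines used while serving one request.

    Assumes the server is idle at `arrival`: the first deadline is
    arrival + period, and each budget exhaustion adds one more period.
    """
    chunks = math.ceil(exec_time / budget)  # budget refills needed, including the first
    return [arrival + k * period for k in range(1, chunks + 1)]


# A request of 3 time units with Q = 2, T = 7 needs two budgets.
print(cbs_server_deadlines(arrival=2, exec_time=3, budget=2, period=7))
```

Under these assumptions, a request that needs more than Q time units is served across several server periods, which is why a smaller budget delays completion.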

9 Stochastic Analysis Example
How does Q affect the QoS (the probability of meeting the deadline for soft tasks)?
It is important to choose the right Q!
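To give a feel for why the choice of Q matters, the sketch below computes a deliberately simplified proxy: the probability that a job's execution time fits within a single budget Q, given a discrete execution time PDF. This is not the stochastic analysis used in the talk; the function name and the PDF values are hypothetical.

```python
def prob_fits_in_budget(exec_time_pdf, budget):
    """Probability that one job needs at most `budget` time units.

    exec_time_pdf: dict mapping execution time -> probability.
    """
    return sum(p for c, p in exec_time_pdf.items() if c <= budget)


# Hypothetical execution time distribution of a soft task.
pdf = {10: 0.5, 20: 0.3, 30: 0.2}
for q in (10, 20, 30):
    print(q, prob_fits_in_budget(pdf, q))
```

Even in this crude form the trend is visible: a larger Q raises the probability of finishing early, but it also consumes more processor utilization.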

10 Example: PE3 Fails! (Offline Solution)
[Figure: tasks τ1 to τ10 mapped on PE1, PE2 and PE3, before and after PE3 fails.]
Hard tasks (τi: WCET, period):
τ1: 8, 20
τ2: 15, 35
τ3: 13, 40
τ4: 8, 25
Soft tasks (τi: Q (deadline), period):
τ5: initial 29(34) 150; offline 9(34) 150
τ6: initial 29(33) 150; offline 18(33) 150
τ7: initial 31(39) 190; offline 12(39) 190
τ8: initial 31(62) 300; offline 19(62) 300
τ9: initial 33(46) 200; offline 12(46) 200
τ10: initial 35(48) 230; offline 30(48) 230
QoS: initial 99.54%, offline 72.21%

11 Example: PE3 Fails! (Proposed Solution)
[Figure: tasks τ1 to τ10 mapped on PE1, PE2 and PE3, before and after PE3 fails.]
Hard tasks (τi: WCET, period):
τ1: 8, 20
τ2: 15, 35
τ3: 13, 40
τ4: 8, 25
Soft tasks (τi: Q (deadline), period):
τ5: initial 29(34) 150; proposed 11(34) 150
τ6: initial 29(33) 150; proposed 13(33) 150
τ7: initial 31(39) 190; proposed 14(39) 190
τ8: initial 31(62) 300; proposed 23(62) 300
τ9: initial 33(46) 200; proposed 17(46) 200
τ10: initial 35(48) 230; proposed 20(48) 230
QoS: initial 99.54%, offline 72.21%, proposed 70.58%
Time: Proposed << Offline

12 Greedy based Task Migration and Bandwidth Allocation (TMBA)
Greedy:
Hard tasks are considered first.
Tasks are ordered according to their utilizations.
CBS parameters are adjusted proportionally to their means.
Iteration trace (columns: iteration, system QoS, decision): trying τ4 on PE (X, can't be mapped); trying τ4 on PE; trying τ9 on PE; trying τ9 on PE; trying τ10 on PE; trying τ10 on PE.
Before the failure (τi: parameters, utilization):
τ1: 8 20 (0.4); τ2: 15 35 (0.4); τ3: 13 40 (0.32); τ4: 8 25 (0.32)
τ5: 29(34) 150 (0.19); τ6: 29(33) 150 (0.18); τ7: 31(39) 190 (0.16); τ8: 31(62) 300 (0.10); τ9: 33(46) 200 (0.16); τ10: 35(48) 230 (0.15)
After migration from the failed processor:
PE1: τ1 8 20 (0.4); τ3 13 40 (0.32); τ6 13(33) 150 (0.08); τ9 17(46) 200 (0.08); τ10 20(48) 230 (0.09)
PE2: τ2 15 35 (0.4); τ4 8 25 (0.32); τ5 11(34) 150 (0.07); τ7 14(39) 190 (0.07); τ8 23(62) 300 (0.07)
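The greedy steps named on this slide can be sketched as follows. This is not the authors' TMBA implementation: the function names and the dictionary layout are assumptions, and the slide's system QoS evaluation is replaced by a stand-in rule (pick the feasible PE with the lowest current load). The soft-task utilizations are hypothetical mean demands, and the rescaling step is one possible reading of "adjusted proportionally to their means".

```python
def hard_util(tasks):
    return sum(t["util"] for t in tasks if t["hard"])


def migrate(failed_tasks, pes):
    """Greedy placement sketch; mutates `pes` and returns task -> PE.

    Each task is a dict {"name", "hard", "util"}; `pes` maps a PE name
    to the list of tasks already mapped on it.
    """
    # Hard tasks first, then decreasing utilization.
    order = sorted(failed_tasks, key=lambda t: (not t["hard"], -t["util"]))
    placement = {}
    for task in order:
        extra = task["util"] if task["hard"] else 0.0
        # Feasible if the hard tasks alone still fit (assumed EDF bound of 1).
        feasible = [pe for pe, ts in pes.items() if hard_util(ts) + extra <= 1.0]
        if not feasible:
            raise ValueError(f"{task['name']} cannot be mapped")
        # Stand-in for the QoS-based decision: least loaded feasible PE.
        best = min(feasible, key=lambda pe: sum(t["util"] for t in pes[pe]))
        pes[best].append(task)
        placement[task["name"]] = best
    return placement


def rescale_soft_bandwidths(tasks):
    """Share the utilization left by hard tasks among the soft tasks,
    proportionally to their mean utilization (assumed interpretation)."""
    spare = 1.0 - hard_util(tasks)
    soft = [t for t in tasks if not t["hard"]]
    total = sum(t["util"] for t in soft)
    return {t["name"]: spare * t["util"] / total for t in soft} if soft else {}


def task(name, hard, util):
    return {"name": name, "hard": hard, "util": util}


# Hard utilizations follow WCET/period from the example; soft ones are hypothetical.
pes = {
    "PE1": [task("tau1", True, 8 / 20), task("tau3", True, 13 / 40),
            task("tau6", False, 0.08)],
    "PE2": [task("tau2", True, 15 / 35), task("tau5", False, 0.07),
            task("tau7", False, 0.07), task("tau8", False, 0.07)],
}
failed = [task("tau9", False, 0.08), task("tau4", True, 8 / 25),
          task("tau10", False, 0.09)]
placement = migrate(failed, pes)
bandwidths = rescale_soft_bandwidths(pes["PE1"])
print(placement, bandwidths)
```

With these inputs the hard task is placed before the soft ones and is rejected by the PE whose hard utilization would exceed 1, mirroring the "can't be mapped" step in the iteration trace.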

13 Experimental Results
Case study: portable media player
QoS reported by TMBA : %
Optimal QoS : %
Synthetic benchmarks:
10 to 78 tasks mapped on 3 to 18 PEs
WCETs for hard tasks between 3 and 18 ms
PDFs for soft tasks generated to match real-life scenarios
Initial QoS of the system ≈ 100%
Average 7% spare utilization on each PE
All tasks are safety-critical
Number of permanent faults: 1 to 3
The QoS obtained by TMBA is quite close to that of the offline solution (a difference of only 0.66%).
TMBA runs in polynomial time.
Hard deadlines were satisfied in all cases.

14 Conclusion
A greedy-based online heuristic is proposed for migration of safety-critical tasks to tolerate permanent faults on a mixed hard/soft real-time system.
Better design choices can be made by taking stochastic execution times of soft tasks into consideration.
The proposed heuristic provides very good quality solutions.

15 Thanks
Questions?

