Published by Natasha Poynor. Modified over 10 years ago.
1 Software Fault Tolerance (SWFT): How to Design, Develop and Evaluate Robust SW and OSs
Dependable Embedded Systems & SW Group, www.deeds.informatik.tu-darmstadt.de
Prof. Neeraj Suri, Abdelmajid Khelil (Majid), Constantin Sârbu (Dinu), Brahim Ayari
Dept. of Computer Science, TU Darmstadt, Germany
2 Outline of today’s lecture
- Course info
- Course goals
- Research related to course
© DEEDS Group SWFT WS ‘07
3 Related Courses
- Lectures
  - SW/OS Fault-Tolerance
  - Kanonik: Introduction to Trusted Systems
- Seminars
  - Embedded Mobile Computing
  - Secure and Reliable OS
- Labs
  - Selected Topics in Dependable SW & Mobile Computing
4 Course Info
- Lecture (in English): Wed. (11:40am - 1:20pm), C120
- Exercises (in E & G): Thu., 3. DS (10:45-13:20), C110
- Starts 25th 2007
- Course webpage: http://www.deeds.informatik.tu-darmstadt.de
5 Grading Related
- Credit points: 7.5; SWS: 5 (2+3)
- Exam
  - Mid-term exam: 25% (E or G), December 17th, 2007
  - Presentations: 24% (E or G)
  - Final exam: 51% (E or G)
- Exercises (E or G): practice + presentations
- Lab stuff: optional
  - Do some live programming
  - Gain some practical experience
  - Please take this opportunity!
  - May improve your grade (bonus points)
  - If you have a suggestion for a lab, discuss it with us!
6 Learn more...
- We have a selection of sub-projects related to this lecture
  - They will be targeted to the interests of students
  - For examples, see the Research@DEEDS slides
- We offer
  - Bachelor's/Master's theses
  - HiWi
  - Fun
7 Course Goals
- Learn software fault tolerance concepts
- Learn how to develop robust programs
  - How to deal with software bugs
  - Software fault tolerance: continuation of service in the face of failures
- Learn concepts and mechanisms to build software fault tolerance tools
- Learn how to evaluate and test robust SW/OS
- Learn some SW issues related to (a) mobile SW and (b) security
8 Course Outline
1. Introduction/Concepts of SWFT
2. SW-FT Mechanisms: Design Aspects
   - Process pairs, selective retries, graceful degradation, …
   - Checkpointing, N-copy programming (NCP), N-version programming (NVP), micro-reboots, ...
   - Robust programming, …
3. Evaluation of fault-tolerant SW & OSs
   - SW reliability
   - SW/OS stress testing
   - Hardening of OSs, patching
   - OS driver profiling and testing
4. Transactional/Mobile SW
   - Mobile transactions (FT, recovery, ..)
   - Wireless sensor networks (energy-efficient FT, spatial/temporal redundancy, ..)
5. SW and Security: buffer overflows etc.
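To make one of the mechanisms named in the outline concrete, here is a minimal Python sketch of N-version programming (NVP) with a majority voter. It is an illustration only, not material from the course: the function `nvp_vote` and the three toy "versions" are hypothetical names introduced here, and a real NVP system would use independently developed implementations and a more careful comparison of results.

```python
from collections import Counter

def nvp_vote(versions, x):
    """Run every version on x and return the result a strict majority agrees on."""
    results = []
    for version in versions:
        try:
            results.append(version(x))
        except Exception:
            # A crashing version simply contributes no vote.
            continue
    if not results:
        raise RuntimeError("all versions failed")
    value, votes = Counter(results).most_common(1)[0]
    # Require agreement from more than half of all versions, not just survivors.
    if votes <= len(versions) // 2:
        raise RuntimeError("no majority among versions")
    return value

# Three hypothetical, independently written versions of "square a number".
def square_a(x):
    return x * x

def square_b(x):
    return x ** 2

def square_c(x):
    return x * x + 1  # deliberately faulty version

result = nvp_vote([square_a, square_b, square_c], 4)
print(result)
```

The faulty third version is outvoted by the two correct ones, so service continues despite the bug. The voter itself remains a single point of failure in this sketch.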
9 Literature
- Most lectures will be based on research papers
  - URLs of papers available via class page
  - References on slides (available on web)
- Coverage for exams is primarily (a) the lecture content and (b) issues covered in the exercises, so attending is important
10 Research@DEEDS Related to Course
DEEDS: Dependable Embedded Systems & Software Group
www.deeds.informatik.tu-darmstadt.de
11 What isn’t an Embedded System?
[Figure labels: response, motes, sensors, UAVs, XBW, FBW]
12 The Spread: Dependability, Safety, Security
- X-by-Wire: Safety-Service Critical Systems (Aerospace/Automotive)
  - System Architecture Design
  - Protocols (Synchronous, Membership, Diagnosis, Recovery, Scheduling)
  - Dependability Evaluation
  - Verification & Validation (V&V)
  - Experimental fault injection (PROPANE)
  - Formal Methods
- Distributed Systems: Byzantine Consensus, Failure Detectors, Verification
- Mobile/WSN Networks: Fault-tolerant protocols, routing, reliability analysis
- CPU Architectures: Energy-efficient FT, Transient Resilience, …
- Operating Systems: OS Robustness Evaluation, Driver Testing & Evaluation, Vulnerability Profiling, Embedded/Desktop OS, …
13 Complexity (Devices, Systems) Within a Domain
[Figure panels: in-vehicle networks; aircraft flight control]
14 Automotive/Aerospace (Federated Systems)
[Figure labels: Dist. Resources (Nodes + Comm.); Comm., Diagnostics, Steering, Env. Ctrl., Braking, Engine/Flight Control, Navigation, User I/O, Multimedia, Body Elec.; Applications, Middleware, Resources]
- Multiple nodes, varied criticality
- Buses, clusters, bridges (HW, SW), …
15 Managing Complexity: Federated to Integrated
[Figure labels: Re-usable Core (technology + domain invariant); Core Services; MW + Arch; Platforms: Shared, Distributed/Networked; High-level Services; app 1, app 2, app n; Applications will change; Technologies will change]
- Multi-Domain Solutions: Automotive, Aerospace, Control
- Compositional Framework (+Tools)
  - Integrates diverse criticality apps
  - Delineation over integration for functionality and safety
  - Flexible building blocks & interfaces
- Benefits
  - Design flexibility, short time-to-market
  - Reduced number of nodes
  - Reduced complexity and cost
16 P1: X-by-Wire Protocols
- On-Line Diagnosis
  - Enhance sustained autonomic system operations
  - Self-healing: on-line recovery (transient faults)
  - Self-diagnosing: maintenance actions (permanent faults)
- Challenges
  - Avoid overreaction to transient faults: the cure can be worse than the disease!
  - Support mixed-criticality applications: from X-by-Wire to comfort applications
  - Portability for time-triggered (TT) platforms: add-on, middleware approach
Contact: Marco Serafini (marco@informatik.tu-darmstadt.de)
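The challenge "avoid overreaction to transient faults" on the P1 slide can be illustrated with a simple penalty counter. The slide does not say how the DEEDS protocols do this, so the following Python sketch is an assumption made for illustration: the class `PenaltyCounter`, its parameters, and the threshold value are all hypothetical. The idea shown is that a node is only declared permanently faulty after repeated faulty diagnosis rounds, while isolated glitches decay away.

```python
class PenaltyCounter:
    """Hypothetical filter that separates transient from permanent faults."""

    def __init__(self, threshold=3, penalty=1, reward=1):
        self.threshold = threshold  # count at which the node is isolated
        self.penalty = penalty      # added for every faulty round
        self.reward = reward        # subtracted for every healthy round
        self.count = 0

    def observe(self, faulty):
        """Record one diagnosis round; return True if the node should be isolated."""
        if faulty:
            self.count += self.penalty
        else:
            self.count = max(0, self.count - self.reward)
        return self.count >= self.threshold

# Sporadic glitches never accumulate enough penalty to trigger isolation.
glitchy = PenaltyCounter(threshold=3)
transient = [glitchy.observe(f) for f in [True, False, True, False, False]]

# A node that is faulty in every round is isolated at the third round.
broken = PenaltyCounter(threshold=3)
permanent = [broken.observe(f) for f in [True, True, True]]

print(transient, permanent)
```

Raising the threshold tolerates longer fault bursts at the price of slower detection of permanent faults, which is one form of the trade-off the slide hints at.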
17 P2: Mobile Database Systems
- Mobile transactions
- Commit protocols
- Challenges: frequent perturbations
  - Heterogeneity
    - Wireless links (WLAN, UMTS, …)
    - Mobile nodes (laptops, PDAs, …)
  - Failures
    - Unpredictable disconnections
    - Node/communication failures
- Infrastructure-based vs. ad-hoc
  - Mobile Ad-hoc NETworks (MANETs)
  - Wireless Sensor Networks (WSNs)
[Figure labels: Wired Network, WLAN, UMTS, GPRS, WAVE]
Contacts: Brahim Ayari (brahim@informatik.tu-darmstadt.de), Abdelmajid Khelil (khelil@informatik.tu-darmstadt.de)
18 P3: Dependable Ad-hoc Sensor Networks
- Applications
  - Car2Car communication
    - Cooperative driving
    - Announcements
  - Tracking & monitoring
  - Measurement
  - Disaster rescue
- Research challenges
  - Energy (efficiency, maintenance, ..)
  - Frequent failures (detection, diagnosis, ..)
  - Safety-critical applications
  - Reliable communication
[Figure labels: WAVE, WLAN, ZigBee]
Contacts: Faisal Karim (fkarim@informatik.tu-darmstadt.de), Abdelmajid Khelil (khelil@informatik.tu-darmstadt.de)
19 P4: Energy Efficient Dependable Systems
- Trends
  - Heterogeneous systems
  - Increased dependence upon technology
  - Mobility: low voltage, smaller noise margins, more transient errors
  - Increased complexity
  - Integration/communication between systems
- Energy efficient fault tolerance
  - Evaluate
  - Characterize
  - Optimize/trade-off
- Dimensions
  - Design-time vs. run-time
  - Time vs. space
  - System level vs. component level
  - Service degradations and reconfiguration
Contact: Neeraj Suri (suri@informatik.tu-darmstadt.de)
20 P5: Robustness Evaluation of Embedded OS/SW
- Problem: SW systems are vulnerable to errors in Commercial-Off-The-Shelf (COTS) components. Characterizing the impact of third-party SW is hard.
- Approach
  - Focus on device drivers
  - Error propagation analysis using fault injection
  - Robustness-enhancing wrappers
- Applications
  - Verification
  - COTS integration (acceptance)
  - Robustness enhancement
Contact: Andréas Johansson (aja@informatik.tu-darmstadt.de)
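The two techniques named in the P5 approach, fault injection and robustness-enhancing wrappers, can be sketched together in a few lines of Python. This is a toy illustration under stated assumptions, not the group's tooling: `read_block` stands in for a hypothetical COTS component, and `robust` and `inject_faults` are names invented here. The injector flips one bit of a valid argument per call and counts how many calls end in an uncaught exception.

```python
import random

def read_block(index):
    """Hypothetical COTS component: fails on out-of-range input."""
    data = [10, 20, 30, 40]
    return data[index]  # raises IndexError for indices >= 4

def robust(component, is_valid, fallback):
    """Wrap a component: reject invalid inputs and contain its exceptions."""
    def wrapper(arg):
        if not is_valid(arg):
            return fallback
        try:
            return component(arg)
        except Exception:
            return fallback
    return wrapper

def inject_faults(target, valid_arg, trials=100, seed=0):
    """Call target with single-bit-flipped arguments; count uncaught failures."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        corrupted = valid_arg ^ (1 << rng.randrange(8))  # flip one of the low 8 bits
        try:
            target(corrupted)
        except Exception:
            failures += 1
    return failures

safe_read = robust(
    read_block,
    is_valid=lambda i: isinstance(i, int) and 0 <= i < 4,
    fallback=None,
)

raw_failures = inject_faults(read_block, valid_arg=1)
wrapped_failures = inject_faults(safe_read, valid_arg=1)
print(raw_failures, wrapped_failures)
```

Comparing the two counts is a crude form of the evaluation the slide describes: the same injected errors propagate out of the bare component but are contained by the wrapper.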
21 P6: Improved Testing of Device Drivers
- Problem: Faulty COTS drivers used in modern OSs have a significant impact on system reliability. They are hard to test because they execute in kernel space and are delivered without source code.
- Applications
  - Black-box testing for COTS drivers
  - Profiling, debugging
  - System activity monitoring
- Research aims
  - Profile driver behavior at runtime
  - Expedite driver testing by focusing on runtime activity
  - Test methods tuned to OS/driver operational profiles
[Figure labels: …, Hardware Layer, System Services, OS kernel, Application 1, Application p, Driver Monitor]
Contact: Constantin Sarbu (cs@cs.tu-darmstadt.de)
22 Questions?