Published by Myles Ryan. Modified over 9 years ago.
1
Reducing Risk by Managing Software Related Failures in Networked Control Systems
Girish Baliga, Google, Inc.
Scott Graham, Air Force Inst. of Technology (AFIT)
Carl A. Gunter, Dept. of Computer Science, UIUC
P. R. Kumar, Dept. of ECE and CSL, UIUC
2
Information Technology Convergence Lab
[Figure labels: Vision Sensors, Automatic Control, Ad Hoc Network, Planning and Scheduling]
3
Networked Control Systems
[Figure labels: Network, Supervisor, Sensor 1, Filter 1, Controller 1, Actuator 1, Plant 1, Sensor 2, Controller 2, Actuator 2, Plant 2]
4
Software related failures
Programming errors
– Simple errors such as an incorrect storage size can be catastrophic
– E.g. the Ariane 5 failure was due to an overflow in a 16-bit integer variable
Passive failures
– Software, node, and link failures can cut off sub-systems
– E.g. car controller failures can cause a car to collide with other cars
Active failures
– Faulty software can interfere with other sub-systems
– E.g. car controller or sensor errors can cause car collisions
Byzantine failures
– Malicious agents can actively interfere with system operation
– E.g. rogue cars can try to block intersections and collide with other cars
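The storage-size error named on this slide can be illustrated with a minimal sketch. It shows the general class of bug (a value silently wrapping when forced into 16 bits), not the actual flight software; the function names `to_int16_unchecked` and `to_int16_checked` are hypothetical.

```python
import struct

INT16_MIN, INT16_MAX = -32768, 32767


def to_int16_unchecked(value: float) -> int:
    """Truncate to an integer and keep only the low 16 bits (wraps silently)."""
    raw = int(value) & 0xFFFF
    # Reinterpret the 16-bit pattern as a signed integer.
    return struct.unpack("<h", struct.pack("<H", raw))[0]


def to_int16_checked(value: float) -> int:
    """Convert to a signed 16-bit integer, raising instead of wrapping."""
    result = int(value)
    if not INT16_MIN <= result <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in a signed 16-bit integer")
    return result


wrapped = to_int16_unchecked(40000.0)  # out of range: comes back negative
safe = to_int16_checked(1234.9)        # in range: truncated to 1234
```

The checked variant turns a silent wrap into an explicit error that the surrounding component can handle or that can trigger a restart.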
5
Preventing software related failures
Robust control laws
– Control laws can be designed to tolerate software failures
– But, errors could exist in control law implementations!
Software verification using formal methods
– Formal methods could be used to verify software implementations
– But, failures could occur in systems software, libraries, hardware, or links
– Also, software verification is very hard for large systems
Presence of software errors must be a basic assumption in system design
6
Component based design
[Figure: control system design (Controller, Plant) alongside component based design (Supervisor, Sensor, Actuator, Plant, Controller)]
Component based software design isolates programming errors
7
Virtual Collocation
Etherware (Baliga & Kumar '03)
Etherware manages all software components in a networked control system
Etherware
– Location independence
– Semantic addressing of components
– System startup and upgrade during execution
– Time translation
– Automatic migration of components for performance
Etherware manages software failures
– Quick and efficient component restarts
– Maintain interconnections across failures
[Figure labels: Transport Layer, Network Layer, MAC, Physical Layer, Application Layer, Service 2, Service 3, Timing, Discrete Event Scheduler, Kalman filter, Trajectory Planner, Car controller, Model Predictive Controller, Set Point Generation, Image Processing, Control Law Optimization]
8
Etherware mechanisms for managing software related failures
[Figure: Sensor and Controller connected by a MessageStream]
Message streams connect software components
- Message streams are set up and managed automatically by Etherware
- Message streams are persisted across component restarts
[Figure labels: Kalman Filter, Filter]
Filters intercept messages
- Filters can be added to components and message streams
- Filters can be used to manage component interactions
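The two mechanisms on this slide, message streams and filters, can be sketched roughly as follows. This is not Etherware's API: the `MessageStream` class, its `add_filter`, `rebind`, and `send` methods, and the `plausible_position` filter with its 0 to 10 m range are all assumptions made for illustration.

```python
from typing import Callable, List, Optional

Message = dict
# A filter returns the (possibly modified) message, or None to drop it.
Filter = Callable[[Message], Optional[Message]]


class MessageStream:
    """Hypothetical stream: delivers messages to a receiver through filters."""

    def __init__(self, receiver: Callable[[Message], None]) -> None:
        self._receiver = receiver
        self._filters: List[Filter] = []

    def add_filter(self, message_filter: Filter) -> None:
        self._filters.append(message_filter)

    def rebind(self, receiver: Callable[[Message], None]) -> None:
        """Point the stream at a restarted component; filters are kept."""
        self._receiver = receiver

    def send(self, message: Message) -> bool:
        """Run the message through each filter, then deliver it if it survives."""
        current = message
        for message_filter in self._filters:
            current = message_filter(current)
            if current is None:
                return False  # a filter intercepted and dropped the message
        self._receiver(current)
        return True


def plausible_position(message: Message) -> Optional[Message]:
    """Example filter: drop position readings outside an assumed 0-10 m track."""
    return message if 0.0 <= message["x"] <= 10.0 else None


received: List[Message] = []
stream = MessageStream(received.append)
stream.add_filter(plausible_position)
stream.send({"x": 3.5})    # delivered
stream.send({"x": 250.0})  # dropped by the filter

restarted: List[Message] = []
stream.rebind(restarted.append)  # the component restarts; the stream persists
stream.send({"x": 4.0})
```

In this sketch the sender never learns that the receiving component was replaced, which is the property the slide describes as persisting streams across restarts.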
9
Local temporal autonomy
[Figure labels: VisionSensor 1, VisionSensor 2, VisionServer, Supervisor, Controller 1, Actuator 1, State estimator (twice), Control buffer]
Local temporal autonomy reduces component dependencies to tolerate passive failures
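One way to read the control buffer in this figure is sketched below: the controller sends a short sequence of planned controls, and the actuator keeps consuming them if the controller or the link goes silent. The `ControlBuffer` class and the fallback to a fixed safe control once the buffer runs out are assumptions, not details given on the slide, and the state estimators are not modeled here.

```python
from collections import deque
from typing import Deque, Iterable


class ControlBuffer:
    """Hypothetical actuator-side buffer of planned future controls."""

    def __init__(self, safe_control: float) -> None:
        self._safe_control = safe_control
        self._pending: Deque[float] = deque()

    def update(self, planned_controls: Iterable[float]) -> None:
        """Replace the buffered plan with a fresh one from the controller."""
        self._pending = deque(planned_controls)

    def next_control(self) -> float:
        """Return the next planned control, or the safe control if none is left."""
        if self._pending:
            return self._pending.popleft()
        return self._safe_control


buffer = ControlBuffer(safe_control=0.0)
buffer.update([1.0, 0.8, 0.5])  # last plan received before the controller fails

# The actuator keeps running for three ticks on buffered controls,
# then falls back to the assumed safe control (here: stop).
applied = [buffer.next_control() for _ in range(5)]
```

The length of the buffered plan bounds how long the actuator can tolerate a passive failure upstream before it has to fall back.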
10
Component restarts
12
Collation
[Figure labels: CA Supervisor, CA Filter, VisionSensor 1, VisionSensor 2, VisionServer, Supervisor, Controller, Actuator]
Collation of multiple independent inputs safeguards from active failures
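The slide does not say how inputs are collated. A minimal sketch, assuming a median vote over independent readings of the same quantity, might look like this; the `collate` function and the `max_spread` threshold are hypothetical.

```python
from statistics import median
from typing import List, Sequence, Tuple


def collate(readings: Sequence[float], max_spread: float) -> Tuple[float, List[int]]:
    """Return the median reading and the indices of sources that disagree with it.

    A source is flagged when its reading differs from the median by more
    than max_spread (an assumed tolerance, in the same units as the readings).
    """
    if not readings:
        raise ValueError("at least one reading is required")
    agreed = median(readings)
    suspects = [i for i, r in enumerate(readings) if abs(r - agreed) > max_spread]
    return agreed, suspects


# Three independent position estimates; the third source is faulty.
value, suspects = collate([10.0, 10.2, 55.0], max_spread=1.0)
```

With three or more sources, a single faulty input cannot move the median outside the range of the correct readings, which is what lets collation contain an active failure.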
13
Security overrides
[Figure labels: Security Supervisor, CA Supervisor, CA Filter, VisionSensor 1, VisionSensor 2, VisionServer, Supervisor, Controller, Actuator, Override]
Security overrides are used to manage Byzantine failures
- Security overrides must preserve low-level safety mechanisms
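The requirement that overrides preserve low-level safety can be sketched as an ordering constraint: the override replaces the controller's command, but the result still passes through the same safety check. The functions `safety_limit` and `select_command`, and the 2 m minimum gap, are assumptions for illustration only.

```python
from typing import Optional


def safety_limit(speed_cmd: float, gap_m: float, min_gap_m: float = 2.0) -> float:
    """Low-level safety check: stop when the gap ahead is too small."""
    if gap_m < min_gap_m:
        return 0.0
    return max(0.0, speed_cmd)


def select_command(
    controller_cmd: float, override_cmd: Optional[float], gap_m: float
) -> float:
    """Prefer the security override when present, then apply the safety check."""
    chosen = override_cmd if override_cmd is not None else controller_cmd
    # The safety check runs last, so an override cannot bypass it.
    return safety_limit(chosen, gap_m)


normal = select_command(1.0, None, gap_m=5.0)     # controller command passes
overridden = select_command(1.0, 0.3, gap_m=5.0)  # override replaces it
blocked = select_command(1.0, 3.0, gap_m=1.0)     # safety check still stops the car
```

Placing the safety check after the override is the design choice that matters here: even a compromised or mistaken override cannot command motion into an obstacle.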
14
Safety preserving security overrides
16
Conclusions
Presence of software failures - basic assumption of systems design
– Component based design isolates failures
– Etherware provides mechanisms to manage software failures
– Design principles to manage risk due to software failures:
» Component based design to contain programming errors
» Local temporal autonomy to tolerate passive failures
» Collation to safeguard from active failures
» Safety preserving security overrides to manage Byzantine failures
17
Contact information
Email: prkumar@uiuc.edu
Website: http://decision.csl.uiuc.edu/~testbed/