CSE 466 – Fall 2000 – Introduction

Slide 1: Physical Layer API (same as Lab)
- Function calls: enq(), deq(), scan()
- Timer interrupt routine: periodically, if not currently master, select a non-empty outgoing queue and send a start condition (capture the bus)
- External interrupt routine:
  - If bus master and not at end of frame, send the next byte; otherwise send a stop condition
  - If bus slave, get the byte and process header information, or place it into the appropriate incoming queue based on previous header information
- Deal with flow control somehow (control frames for acknowledgement)
- Deal with security (dst port)
Slide 2: A Complete Development Environment
- Kernel: task scheduling, memory management, resource protection, interrupt handling, task-message passing, task event management
- File I/O: file creation and deletion, file I/O, directory structures and navigation
- Device I/O: device drivers
- Networking: protocols and device-driver support for data communication; task-to-task message passing over a network
- Debugging: simulation, monitor-based debugging, hardware emulator
Slide 3: Debugging
- Simulation: you've seen the limits of that approach.
- In-system debug (with a monitor):
  - Ability to download code and upload state information (RAM, registers)
  - Breakpoints: replace an instruction with SWI or JSR; the SWI handler waits for user commands
  - On "continue", the monitor executes the trapped instruction and returns control to the app
  - Not good for final test unless the monitor is part of the shipped system
  - Requires system resources: interrupts, RAM, ROM
- HW emulation: a circuit board with a socket connector that acts just like the real processor; allows debugging without a monitor
- Monitor program debugging: host serial interface, R/W code and data space, SW-based single stepping
- Single stepping in the 8051, ISR for External Interrupt 0:

      JNB  P3.2,$   ; wait here till INT0 goes high
      JB   P3.2,$   ; now wait here till it goes low
      RETI          ; go back and execute one instruction
Slide 4: Projects/Reports
- Option 1: a final lab assignment... to be negotiated
  - Some improvement on the music player (I think we can do 20K music at 11 MHz anyway)
  - Maybe streaming from NT through one player to another player over I2C; no upload needed
  - Chat
  - A PCB for the music player with I2C and serial ports
- Option 2: a report/presentation
  - Design approaches for some existing system
  - Fuzzy logic controller design
  - Automotive apps and networking (ECU, airbag, CAN)
  - Embedded TCP (IP?/I2C?)
  - Embedded web servers
  - Robot architectures
  - Convergence devices
  - Object technology for 8-bit microcontrollers
  - HW that we didn't cover: motors, watchdog circuits, DAC, ADC, audio codecs, displays (VGA/LCD), radios, memory devices and interfaces
Slide 5: Safety
- Examples
- Terms and concepts
- Safety architectures
- Safe design process
- Software-specific issues
Sources: Doing Hard Time by Bruce Powel Douglass, which references Safeware by Nancy Leveson
Slide 6: What is a Safe System?
- Example system: brake pedal sensor → processor → bus → brake w/ local controller and engine w/ local controller
- Is it safe? What does "safe" mean? How can we make it safe?
  - Add an electronic watchdog between the brake and the bus
  - Add a mechanical linkage from a separate brake pedal directly to the brake
  - Add a third mechanical linkage...
Slide 7: Terms and Concepts
- Reliability of component i can be expressed as P_i(t), the probability that component i is still functioning at time t.
- Is system reliability P_s(t) = ∏_i P_i(t)?
- Assuming that all components have the same component reliability, is a system with fewer components always more reliable?
- Does component failure imply system failure?
- Failure rate varies over time: high during the burn-in period, then nearly constant. A low, constant failure rate means a nearly constant probability of being operational, and 1/(failure rate) = MTBF.
Slide 8: A Safety System
- A system is safe if its deployment involves assuming an acceptable amount of risk... acceptable to whom? (We don't all have the same risk tolerance!)
- Risk factors:
  - Probability of something bad happening
  - Consequences of something bad happening (severity)
- Examples:
  - Airplane travel: high severity, low probability
  - Electric shock from battery-powered devices: high probability, low severity
- On a severity-vs-probability chart, an airplane autopilot sits toward the danger zone, while an MP3 player or a PC sits in the safe zone.
Slide 9: More Precise Terminology
- Accident or mishap: (unintended) damage to property or harm to persons. Economic impact of failure to meet warranted performance is outside the scope of safety.
- Hazard: a state of the system that will inevitably lead to an accident or mishap
  - Release of energy
  - Release of toxins
  - Interference with life-support functions
  - Supplying misleading information to safety personnel or control systems; this is the desktop-PC nightmare scenario: bad information
  - Failure to alarm when hazardous conditions exist
Slide 10: Faults
- A fault is an "unsatisfactory system condition or state". A fault is not necessarily a hazard; in fact, assessments of safety are based on the notion of fault tolerance.
- Systemic faults:
  - Design errors (including process errors such as failure to test or failure to apply a safety design process)
  - Faults due to software bugs are systemic
  - Security breach
- Random faults:
  - Random events that can cause permanent or temporary damage to the system
  - Includes EMI and radiation, component failure, power supply problems, wear and tear
Slide 11: Component vs. System
- Reliability is a component issue; safety and availability are system issues.
- A system can be safe even if it is unreliable!
- If a system has lots of redundancy, the likelihood of a component failure (a fault) increases, but the safety and availability of that system may increase as well.
- Safety and availability are different and sometimes at odds: safety may require the shutdown of a system that can still perform its function. A backup system that can fully operate a nuclear power plant might instead always shut it down in the event of a failure of the primary system. The plant could remain available, but it is unsafe to continue operation.
Slide 12: Single Fault Tolerance (for safety)
- The existence of any single fault does not result in a hazard.
- Single-fault-tolerant systems are generally considered to be safe, but more stringent requirements may apply to high-risk cases: airplanes, power plants, etc.
- Example: a main H2 valve control and a backup H2 valve control, linked by a watchdog protocol. If the handshake fails, then either one or both can shut off the gas supply.
- Is this a single fault tolerant system?
Slide 13: Is This?
- The same main and backup H2 valve controllers with a watchdog handshake
- Watch out for common mode failures
Slide 14: Now Safe?
- Main and backup H2 valve control with a watchdog handshake, plus:
  - Separate clock source
  - Power fail-safe (non-latching) valves
- What about a power spike that confuses both processors at the same time? Maybe the watchdog can't be software based.
- Does it ever end?
Slide 15: Time is a Factor – The TUV Fault Assessment Flow Chart
- T0: fault tolerance time for the first fault
- T1: time after which a second fault is likely (based on MTBF data)
- The chart captures time, and the notion of "latent faults"
- Safety requires that T_test < T0 < T1
- The flow: after the 1st fault, is there a hazard after T0? If yes, the system is unsafe. If no, is the fault detected before T1? If yes, the system is safe. If no, consider a 2nd fault: if the pair causes a hazard, the system is unsafe; otherwise it is safe.
Slide 16: Latent Faults
- Any fault that is not detectable by the system during operation has a probability of 1: it doesn't count in the single fault tolerance assessment.
- Detection might not mean diagnosis. If the system can detect a secondary effect of a device failure before a hazard arises, then this could be considered safe.
- In the H2 valve example (main and backup controllers with a watchdog handshake), stuck valves could be latent faults if the controllers cannot check their state. May as well assume that they are stuck!
Slide 17: Fail-Safe Design (just an example)
- A watchdog resets the processor on protocol failure; on reset, the processor checks a status input.
- If the status is bad, enter "safe mode":
  - power off
  - reduced/altered functionality
  - alarm
  - restart
- Safe mode is application dependent.
Slide 18: Safety Architectures
- Self checking (single channel protected design)
- Redundancy
- Diversity or heterogeneity
- Example: brake pedal sensor → computer → bus → brake and engine control, with a watchdog protocol, parity/CRC on the bus, and periodic internal CRC/checksum computation (against code/data corruption)
Slide 19: Single Channel Protection
- Self checking: perform periodic checksums on code and data (against code corruption), with parity/CRC on the bus
- How long does this take? Is T0 < T1?
- Feasibility of self checking depends on:
  - fault tolerance time
  - speed of the processor
  - amount of ROM/RAM
- Redundant memory subsystems: special HW; a recurring cost vs. development cost tradeoff
Slide 20: Redundancy
- Homogeneous redundancy: two identical computers, a voting stage, and a bus to brake and engine control
  - Low development cost... just duplicate
  - High recurring cost
  - No protection against systemic faults (e.g. the same code corruption in both copies)
- Voting could be implemented similarly to collision detection on the bus
Slide 21: Multi-Channel Protection
- Heterogeneous redundancy: Proc/SW 1 and Proc/SW 2, a voting stage, and a bus to brake and engine control
- Protects against random and some systemic faults
- Best if the implementation teams are kept separated
- Space shuttle: five computers, four the same and one different
Slide 22: Design for Safety
1. Hazard identification and fault tree analysis
2. Risk assessment
3. Define safety measures
4. Create safe requirements
5. Implement safety
6. Assure safety process
7. Test, test, test, test, test
Slide 23: 1. Hazard Identification – Ventilator Example

Hazard: Hypoventilation (severe; tolerance time 5 min)
- Fault example: vent fails (rare); detection time 30 sec; mechanism: independent pressure sensor w/ alarm; exposure time 40 sec
- Fault example: esophageal intubation (medium); detection time 30 sec; mechanism: CO2 sensor alarm; exposure time 40 sec
- Fault example: user mis-attaches breathing hoses (never); mechanism: different mechanical fittings for intake and exhaust (human in the loop)

Hazard: Over-pressurization (severe; tolerance time 0.05 sec)
- Fault example: release valve failure (rare); detection time 0.01 sec; mechanism: secondary valve opens; exposure time 0.01 sec
Slide 24: FMEA Example – notably missing the time factor!
[Figure-only slide: an FMEA worksheet example]
Slide 25: Fault Tree Analysis – Working Backwards
[Figure-only slide: a pacemaker example fault tree, highlighting a single fault hazard]
Slide 26: FMEA – Working Forward
- Failure mode: how a device can fail
  - Battery: never a voltage spike, only low voltage
  - Valve: stuck open? stuck closed?
  - Motor controller: stuck fast? stuck slow?
  - Hydrogen sensor: will the failure be latent, or mimic the presence of hydrogen?
- FMEA: for each mode of each device, perform hazard analysis as in the previous flow chart
- Huge search space
Slide 27: 2. Risk Assessment
Determine how risky your system is (TUV risk graph):
- S: extent of damage – slight injury, single death, several deaths, catastrophe
- E: exposure time – infrequent, continuous
- G: preventability – possible, impossible
- W: probability – low, medium, high
[Figure: TUV risk graph mapping combinations of S1–S4, E1–E2, G1–G2, and W1–W3 to risk levels 1–8]
Slide 28: Example Risk Assessment

Device                       | Hazard                            | Extent of Damage | Exposure Time | Hazard Prevention | Probability | TUV Risk Level
Microwave oven               | Irradiation                       | S2               | E2            | G2                | W3          | 5
Pacemaker                    | Pacing too slowly / pacing too fast | S2             | E2            | G2                | W3          | 5
Power station burner control | Explosion                         | S3               | E1            | --                | W3          | 6
Airliner                     | Crash                             | S4               | E2            | G2                | W2          | 8
Slide 29: 3. Define the Safety Measures
- Obviation: make the hazard physically impossible (mechanical hookups, etc.)
- Education: educate users to prevent misuse or dangerous use
- Alarming: inform the users/operators, or higher-level automatic monitors, of hazardous conditions
- Interlocks: take steps to eliminate the hazard when hazardous conditions exist (shut off power, fuel supply, etc.)
- Restrict access: high-voltage sources should be in compartments that require tools to access, with proper labels
- Labeling
- Consider tolerance time
- Consider supervision of the system: constant, occasional, unattended. Airport people movers have to be designed to a much higher level of safety than attended trains, even if they both have fully automated control.
Slide 30: 4. Create Safe Requirements
- Specifications: document the safety functionality, e.g. "The system shall NOT pass more than 10 mA through the ECG lead." Typically the use of NOT implies a much more general requirement about functionality... in ALL CASES.
- Create safe designs:
  - Start with a safe architecture
  - Keep hazard/risk analysis up to date; search for common mode failures
  - Assign responsibility for safe design... hire a safety engineer
  - Design systems that check for latent faults
  - Use safe design practices... this is very domain specific; we will talk about software
Slide 31: 5. Implement Safety – Safe Software
- Language features: compile-time checking, run-time checking, exception handling
- Re-use: objects, operating systems, protocols
- Testing: regression testing, exception testing (fault seeding)
Slide 32: Language Features
- Compile-time checking: lint-like tools; no overhead at run time (code size, execution time)
- Pascal, an example of a strongly typed language:

      Program WontCompile1;
      type
        MySubRange = 10 .. 20;
        Day = (Mo, Tu, We, Th, Fr, Sa, Su);
      var
        MyVar:  MySubRange;
        MyDate: Day;
      begin
        MyVar  := 9;  { will not compile – range error }
        MyDate := 0;  { will not compile – wrong type }
      end.

- But compile-time checking can't catch range errors whose values are only known at run time.
Slide 33: Error Checking in Software
[Figure-only slide]
Slide 34: Testing
[Figure-only slide]