Fault Detection in a HW/SW CoDesign Environment Prepared by A. Gaye Soykök.

Outline
Introduction
System Specification
Fault Model
Some Terminology
Methodology Analysis
Reliable Communication
HW/SW Partitioning

Introduction
System reliability aspects are generally considered only toward the end of the design process, at low abstraction levels.
Working at low abstraction levels introduces more overhead.
Not all systems can be considered at low levels.
It is better to handle fault detection at higher levels.
It is better to assess whether fault detection should be done in HW or SW for the sake of system performance.

Introduction
At system level, several parameters are considered and one design is chosen among several alternatives:
–Time constraints
–Power consumption
–Testability
–Area

Introduction
Fault detection facilities are introduced at system level
–HW/SW binding of components is affected
System specification: which parts are critical and need fault detection.
Design methodologies: how these detection facilities are applied, in either HW or SW.
HW/SW partitioning: which parts are in SW and which are in HW, guided by the methodologies.

System Specification
Language must support:
–User should be able to specify which sections require reliability aspects
–For example: SystemC or OCCAM
Architecture: CPU (DSP or general purpose), coprocessors (ASIC or FPGA)

Fault Model
Single functional failure
–Any number of physical faults causes a functional module to perform incorrectly
–HW is faulty; software is affected by the hardware
–The CPU, communication channels, one of the coprocessors, or memory may fail
–A module failure is detected before any other module fails
Temporal, architectural, and informational redundancy are adopted.

Some Terminology
Nominal: original system function elements
Checking: redundant elements for fault detection
Checker: element that compares the checking and nominal elements
Each of these elements can be independently implemented in either HW or SW.
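The three roles can be illustrated with a minimal Python sketch. The function names (`nominal_sum`, `checking_sum`, `checker`, `run_checked`) and the choice of a summation as the nominal function are hypothetical, picked only for illustration; the presentation does not prescribe any particular computation or implementation.

```python
def nominal_sum(values):
    """Nominal element: the original system function."""
    total = 0
    for v in values:
        total += v
    return total


def checking_sum(values):
    """Checking element: redundant computation of the same result.

    Summing in reverse order is an arbitrary illustrative choice.
    """
    total = 0
    for v in reversed(values):
        total += v
    return total


def checker(nominal_result, checking_result):
    """Checker element: True when both results agree."""
    return nominal_result == checking_result


def run_checked(values):
    """Run nominal and checking elements, then let the checker compare them."""
    n = nominal_sum(values)
    c = checking_sum(values)
    if not checker(n, c):
        raise RuntimeError("fault detected: nominal and checking results differ")
    return n
```

Here all three elements are in software, but each function stands for a block that could equally be bound to hardware.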

HW or SW
Nominal SW, checker SW, checking SW
–Checking and checker are executed either by the system processor or by a dedicated processor
–Examples: self-checking SW, assertions, dual processor, and VLIW
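Of the software-only examples, assertions are probably the easiest to show in a few lines. The sketch below is an assumption-laden illustration, not the method used in the presentation: `sort_nominal` and `sort_self_checking` are hypothetical names, and sorting is just a convenient nominal function whose output properties are cheap to verify.

```python
from collections import Counter


def sort_nominal(data):
    """Nominal function: return a sorted copy of the input."""
    return sorted(data)


def sort_self_checking(data):
    """Nominal function wrapped with executable assertions on its output."""
    result = sort_nominal(data)

    # Check 1: the output is in non-decreasing order.
    if not all(a <= b for a, b in zip(result, result[1:])):
        raise AssertionError("order violated")

    # Check 2: the output holds exactly the same elements as the input.
    if Counter(result) != Counter(data):
        raise AssertionError("elements lost or altered")

    return result
```

The checks verify properties of the result instead of recomputing it, so the checking code runs on the same processor as the nominal code with little extra work.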

HW or SW (Cont’d)
Nominal SW, checker HW, checking SW
–Interface for functional redundancy check, VLIW with hardware, DMA checker
Nominal SW, checker HW, checking HW
–CED (concurrent error detection) solutions are implemented totally in HW. Example: dynamically configurable checker

HW or SW (Cont’d)
Nominal HW, checker HW, checking HW
–Classical approach. Examples: duplication, TSC devices

Methodologies Analysis - Concepts
Number and type of processing elements
Whether a special architecture is necessary
Synchronization issues between processing elements
Allocation of checker memory space
Checker structure and complexity
Selection of a checker methodology to raise errors in case of mismatches

Methodologies Analysis - Metrics
Detection latency: the time between the instant an error occurs and the instant it is detected
Coverage: how many of the existing faults can be detected
Performance degradation: overhead caused by fault detection facilities compared to nominal functions

Methodologies Analysis – Metrics (Cont’d)
Material cost: cost of physical components
Design cost: effort needed to design the system
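The three quantitative metrics (detection latency, coverage, performance degradation) could be computed from fault-injection records roughly as follows. This is a sketch under stated assumptions: the `FaultRecord` type and its fields are hypothetical, coverage is expressed as a fraction of injected faults, and performance degradation as relative overhead. None of these conventions is fixed by the presentation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FaultRecord:
    """One injected fault (hypothetical record layout)."""
    occurred_at: float             # time the error occurred
    detected_at: Optional[float]   # time it was detected, None if missed


def detection_latencies(records: List[FaultRecord]) -> List[float]:
    """Latency of every detected fault: detection time minus occurrence time."""
    return [r.detected_at - r.occurred_at
            for r in records if r.detected_at is not None]


def coverage(records: List[FaultRecord]) -> float:
    """Fraction of the injected faults that were detected."""
    detected = sum(1 for r in records if r.detected_at is not None)
    return detected / len(records)


def performance_degradation(t_nominal: float, t_with_detection: float) -> float:
    """Relative overhead of the detection facilities over the nominal run time."""
    return (t_with_detection - t_nominal) / t_nominal
```

Material cost and design cost are left out because they depend on component prices and engineering effort, which cannot be derived from run-time measurements.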

Reliable Communication
Apart from data processing, communication also needs to be reliable.
Hardware redundancy: line duplication
Information redundancy: data encoding
Most effective when data encoding is used where SW is involved and hardware sections employ dedicated lines (duplicated, encoded).
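Both forms of redundancy can be sketched in a few lines of Python. The example assumes a single even-parity bit as the data encoding and a simple equality comparison for duplicated lines; the presentation names neither a specific code nor a specific comparison scheme, and all function names are hypothetical.

```python
def encode_even_parity(byte):
    """Information redundancy: append a parity bit so the total number of 1s is even."""
    parity = bin(byte).count("1") % 2
    return (byte << 1) | parity


def decode_even_parity(word):
    """Return the data bits, or raise if the parity check fails."""
    if bin(word).count("1") % 2 != 0:
        raise ValueError("parity error: transmission fault detected")
    return word >> 1


def read_duplicated_lines(line_a, line_b):
    """Hardware redundancy (modelled in software): compare two copies of a line."""
    if line_a != line_b:
        raise ValueError("line mismatch: transmission fault detected")
    return line_a
```

A single parity bit detects any odd number of flipped bits but misses an even number of flips, so a stronger code would be needed where multi-bit errors are expected.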

HW/SW Partitioning
After the system is specified, the methodologies have been assessed, and different alternatives have been produced with cost functions, the partitioning step takes place.
Evaluate the cost functions and the user's constraints.
Reliability aspects make partitioning more complex.
Make the partitioning in two stages!

HW/SW Partitioning (Cont’d)
First level: classical aspects and functions are taken into account.
Second level: given the first-level solution, reliability aspects are introduced, and the solution with the best trade-off that still satisfies the first-level constraints is chosen from the solution set.
If no reliability constraints are given, the second level is not carried out.

HW/SW Partitioning (Cont’d)
If a specific architecture is required for reliability (for example, dual processor), the first level benefits from earlier partitioning solutions.
A solution may not exist after reliability constraints are introduced, and the first level may need to be repeated.

HW/SW Partitioning (Cont’d)
Reliability constraints, which drive the second stage, may be
–Hard, e.g. 100% fault coverage
–Soft, e.g. any fault coverage
Parameters considered
–Fault coverage
–Performance degradation
–Detection latency
–Area overhead
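The two-stage partitioning with hard and soft reliability constraints might look like the following Python sketch. Everything here is an assumption made for illustration: the `Candidate` fields, the representation of the first level as a boolean flag, and especially the unweighted `tradeoff` score, which simply adds the three overheads and subtracts coverage. The presentation does not give the actual cost functions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """One HW/SW partitioning alternative (all fields are illustrative)."""
    name: str
    meets_first_level: bool    # satisfies the classical constraints
    fault_coverage: float      # fraction of faults detected, 0.0 to 1.0
    perf_degradation: float    # relative slowdown caused by detection
    detection_latency: float   # arbitrary time units
    area_overhead: float       # relative extra area


def first_level(candidates: List[Candidate]) -> List[Candidate]:
    """First level: keep alternatives that satisfy the classical constraints."""
    return [c for c in candidates if c.meets_first_level]


def second_level(feasible: List[Candidate],
                 min_coverage: Optional[float] = None) -> Optional[Candidate]:
    """Second level: apply reliability constraints and pick the best trade-off.

    min_coverage set  -> hard constraint (1.0 means 100% fault coverage).
    min_coverage None -> soft constraint (any fault coverage is acceptable).
    Returns None when no candidate survives, which signals that the
    first level may have to be repeated.
    """
    if min_coverage is not None:
        feasible = [c for c in feasible if c.fault_coverage >= min_coverage]
    if not feasible:
        return None

    def tradeoff(c: Candidate) -> float:
        # Assumed scoring: penalize the three overheads, reward coverage.
        return (c.perf_degradation + c.detection_latency
                + c.area_overhead - c.fault_coverage)

    return min(feasible, key=tradeoff)
```

A call such as `second_level(first_level(candidates), min_coverage=1.0)` models the hard case, while omitting `min_coverage` models the soft case. A real cost function would normalize and weight the four parameters, since they are measured in different units.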

Conclusion
Design for reliability has been merged into the HW/SW codesign process, resulting in a final design that has on-line fault detection properties.
Future work is introducing fault tolerance into the HW/SW codesign process.
