Design of Distributed Real-Time Systems Ramani Arunachalam.

Case Study: MARS ● MARS (MAintainable Real-time System) – Distributed, fault-tolerant, hard real-time – Objectives ● Guaranteed timeliness ● Testability ● Maintainability ● Fault-tolerance ● Systematic software development – Time-triggered architecture

Objectives ● Guaranteed timeliness – Based on resource adequacy at peak load – Statistical assurances not enough ● Testability – Architecture should support testability of timeliness ● Maintainability – Needed to remedy hardware faults, design errors and respond to change requests – Localized consequences -> minimized effort

Objectives ● Fault Tolerance – Redundancy – On-line maintenance ● Systematic software development – No 'trial and error' integration – OS guarantees predictable temporal behaviour

State View ● Time-triggered observation of states – Observe RT entities at predefined intervals ● Intelligent input/output – Observation grid – Intelligent sensor ● Preprocesses raw data from input device ● Observes at a finer granularity, called the perception granularity

State View ● Intelligent actuator – Post-processes data from computer system before sending to output device ● State Messages – Produced at observation points – Minimal synchronization requirement – No need for buffer management – Unidirectional (from RT entity)
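The "no need for buffer management" property of state messages can be illustrated with a minimal sketch. It assumes that a newer observation simply overwrites the older one and that reading does not consume the value. The names `StateMessage`, `StateCell`, and `valve_temp` are hypothetical and are not taken from MARS.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StateMessage:
    """One observation of an RT entity (hypothetical field names)."""
    name: str
    value: float
    observed_at: int  # tick of the observation grid


class StateCell:
    """Holds only the latest state message, so there is no queue to manage."""

    def __init__(self) -> None:
        self._latest: Optional[StateMessage] = None

    def write(self, msg: StateMessage) -> None:
        # A newer observation replaces the older one in place.
        self._latest = msg

    def read(self) -> Optional[StateMessage]:
        # Reading does not consume: every reader sees the same latest value.
        return self._latest


cell = StateCell()
cell.write(StateMessage("valve_temp", 71.5, observed_at=10))
cell.write(StateMessage("valve_temp", 72.0, observed_at=20))
latest = cell.read()
```

Because the receiver only ever needs the most recent state, sender and receiver do not have to agree on queue lengths or acknowledge individual messages.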

Structure ● Clusters – Autonomous subsystems – Disjoint name spaces – State message exchanges – Composed of Fault-tolerant units (FTUs) – Real-time communication channel (TDMA) ● FTU – Composed of replicated components – Active and shadow components

Structure ● Component – Smallest replaceable unit – Fail-silent (Correct results or none) – Termination upon failure ● Task Execution – Task : Software inside component – Starts at predefined time – Proceeds without any communication or synchronization – Execution time is deterministic

Operation ● Results of periodic tasks sent as state messages ● Execution time of communication is also predefined ● A Real-time transaction is a progression of processing and communication actions between a stimulus from and a response to the environment. ● Static scheduling (at compile time!) ● At run-time, no surprises ● Modes (operating, emergency)
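Static scheduling can be pictured as a table built off-line and merely looked up at run time. The sketch below is an illustration under assumptions: the cycle length, slot durations, and action names are invented, and the feasibility check is a simplification of what a real off-line scheduler would verify.

```python
from typing import List, NamedTuple, Optional


class Slot(NamedTuple):
    start: int     # offset within the cycle, in ticks
    duration: int  # worst-case budget for the action, in ticks
    action: str    # task or message name (hypothetical)


CYCLE = 20  # assumed cycle length in ticks
SCHEDULE = [
    Slot(0, 4, "read_sensor"),
    Slot(4, 2, "send_state_msg"),
    Slot(8, 6, "control_law"),
    Slot(14, 2, "send_actuator_cmd"),
]


def check_schedule(schedule: List[Slot], cycle: int) -> bool:
    """Off-line check: slots are ordered, do not overlap, and fit in the cycle."""
    end = 0
    for slot in schedule:
        if slot.start < end:
            return False
        end = slot.start + slot.duration
    return end <= cycle


def action_at(schedule: List[Slot], cycle: int, tick: int) -> Optional[str]:
    """Run-time dispatch is a plain table lookup on the global time."""
    offset = tick % cycle
    for slot in schedule:
        if slot.start <= offset < slot.start + slot.duration:
            return slot.action
    return None  # the processor or channel is idle at this tick
```

Since both processing and communication slots appear in the same table, the duration of a whole real-time transaction can be read off before the system ever runs.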

Fault-tolerance ● Two levels of redundancy ● Active redundancy at FTU level – If a component fails, standby becomes active ● Time redundancy at component level – Every task is executed twice and results compared ● TDMA monitor – Monitors temporal behaviour – Controls the output from component ● Distributed clock synchronization
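Time redundancy at the component level can be sketched as follows: the task runs twice on the same inputs, and the component stays silent if the two results differ. This is a minimal illustration, not the MARS implementation. `run_twice` and `make_flaky` are hypothetical names, and returning `None` stands in for fail-silent behaviour.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def run_twice(task: Callable[[T], R], inputs: T) -> Optional[R]:
    """Execute the task twice and release a result only if both runs agree."""
    first = task(inputs)
    second = task(inputs)
    if first != second:
        # A mismatch points to a transient fault: emit nothing (fail-silent).
        return None
    return first


def make_flaky() -> Callable[[int], int]:
    """Build a task whose first run is disturbed, simulating a transient fault."""
    calls = {"n": 0}

    def task(x: int) -> int:
        calls["n"] += 1
        return x + 1 if calls["n"] == 1 else x + 2

    return task


good = run_twice(lambda x: x * 2, 21)
bad = run_twice(make_flaky(), 1)
```

When a component goes silent in this way, the redundancy at the FTU level is what keeps the service available.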

Fault-tolerance ● Replica determinism – All replicated components perform the same state changes at the same point in time – Prohibit reading of local time – All replicas should agree when to change mode ● Component reintegration – i-state, h-state – Reintegration point: when size of h-state is small – New component gets the h-state at this point

Summary ● Maintenance – Failed component doesn't affect FTU – On-line reintegration after repair – Change in software ● Does it fit in current schedule? ● Otherwise, new mode with new schedule ● Summary – Strict separation of functionality, timeliness and dependability. – Designed for temporal behaviour, testing simplified.

Delta-4 XPA ● Objectives – “A real-time system is not assured to meet deadlines outside its operational envelope” – Bounded-demand school ● Operational envelope is predictable ● Impractical assumption for complex systems – Unbounded-demand school ● Complete definition of operational envelope is not possible ● Graceful degradation if the system falls outside the envelope – XPA implements hard real-time but falls back to best-effort behaviour when required.

Layered architecture (figure) ● Deltase ● Group management layer ● Time and group communication ● Abstract network layer (physical + MAC + firmware)

Architecture ● Network infrastructure – FDDI supports urgent traffic, built-in fault tolerance – Token bus/ring has media redundancy for availability ● Time – Internal time maintained by distributed time server – Clocks synchronized to within tens of microseconds – External time – one of the standard times ● Group communication – Services from atomic multicast to datagram – Very fast services of varying reliability

Architecture ● Group communication – Distributed replication management ● BestEffortN – guarantees delivery to N elements ● BestEffortTo – guarantees delivery to named elements ● AtLeastN, AtLeastTo – guaranteed service even when sender fails ● Group management – Distributed Group manager object – Management and distribution of groups of objects – Incorporates knowledge of various modes of replication

Architecture ● Application support environment (Deltase) – Client-server and producer-consumer interactions – Apps written using Deltase or converted using preprocessors ● Timeliness – What to do under overload conditions? ● Static off-line scheduling – too many possibilities ● On-line scheduling – can find feasible schedules if not overloaded.

Timeliness ● Scheduling policy uses “precedence” – Combination of priority and earliest-deadline – Few priority classes to avoid unfairness – Within priority class, earliest-deadline-first. ● Design-time and run-time timeliness – Targetline : instant chosen by designer for provision of service – Liveline and deadline: earliest and latest time at which service may be provided – Violation of these detected at runtime and design-time actions defined.
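The precedence rule and the liveline/deadline window can be sketched in a few lines. The sketch assumes that a lower `priority_class` number means a more urgent class. `Job`, `pick_next`, and `classify` are hypothetical names and not part of XPA.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Job:
    name: str
    priority_class: int  # assumption: lower number = more urgent class
    deadline: int        # absolute deadline, in ticks


def precedence_key(job: Job) -> Tuple[int, int]:
    """Priority class first, then earliest deadline within the class."""
    return (job.priority_class, job.deadline)


def pick_next(ready: List[Job]) -> Optional[Job]:
    """Select the ready job with the highest precedence."""
    return min(ready, key=precedence_key) if ready else None


def classify(completion: int, liveline: int, deadline: int) -> str:
    """Run-time check of a completion instant against its allowed window."""
    if completion < liveline:
        return "early"  # service provided before the liveline
    if completion > deadline:
        return "late"   # service provided after the deadline
    return "ok"


ready = [Job("log", 2, 5), Job("alarm", 0, 50), Job("control", 0, 30)]
chosen = pick_next(ready)
```

Here `control` wins over `alarm` because both are in class 0 and `control` has the earlier deadline, while `log` waits even though its deadline is the earliest overall.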

Preemption ● Leader-follower model for replication – Decisions made by a privileged replica, i.e. the leader – Preemption point ● Point at which an interrupt will be served – High precedence msg arrives for a process not running currently ● Increase the process's precedence to that of msg ● Causes the process to be scheduled ● These actions propagated to followers ● Followers perform identical operations

Desynchronization ● Followers must not be too far apart from the leader ● Followers too fast – Reach the preemption point before leader – Remain blocked until leader notifies ● Followers too slow – Leader timestamps notifications – If follower didn't execute the action by T+t(desync) ● Desynchronization event raised ● Another follower takes over
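The "followers too slow" check can be sketched as a comparison against the leader's timestamp. This is an illustration under assumptions: `follower_status` and its return strings are hypothetical, `notified_at` plays the role of T, and `t_desync` plays the role of t(desync).

```python
from typing import Optional


def follower_status(notified_at: int, t_desync: int, now: int,
                    executed_at: Optional[int] = None) -> str:
    """Classify a follower relative to a leader notification stamped at T."""
    limit = notified_at + t_desync  # T + t(desync)
    if executed_at is not None and executed_at <= limit:
        return "in_sync"         # action performed within the allowed lag
    if now > limit:
        return "desynchronized"  # a desynchronization event would be raised
    return "pending"             # still inside the allowed lag


status_waiting = follower_status(100, 5, now=103)
status_done = follower_status(100, 5, now=104, executed_at=104)
status_lost = follower_status(100, 5, now=106)
```

A follower reported as `desynchronized` is the one that would be replaced, with another follower taking over its role.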

Summary ● Communication support using groups – Oriented to distributed computing ● Tradeoffs between QoS and efficiency – Group manager uses atomic multicast for orderly delivery – Leader-follower uses reliable, non-ordered delivery ● Group management service – Executes leader-follower, detects replica failure – Clones the replica at another node.
