
Design of Distributed Real-Time Systems

Ramani Arunachalam. Design of Distributed Real-Time Systems. Case Study: MARS. MARS (Maintainable Real-time system) Distributed, fault-tolerant, hard real-time Objectives Guaranteed timeliness Testability Maintainability Fault-tolerance Systematic software development



Presentation Transcript


  1. Ramani Arunachalam Design of Distributed Real-Time Systems

  2. Case Study: MARS • MARS (Maintainable Real-time system) • Distributed, fault-tolerant, hard real-time • Objectives • Guaranteed timeliness • Testability • Maintainability • Fault-tolerance • Systematic software development • Time-triggered architecture

  3. Objectives • Guaranteed timeliness • Based on resource adequacy at peak load • Statistical assurances not enough • Testability • Architecture should support testability of timeliness • Maintainability • Needed to remedy hardware faults, design errors and respond to change requests • Localized consequences -> minimized effort

  4. Objectives • Fault Tolerance • Redundancy • On-line maintenance • Systematic software development • No 'trial and error' integration • OS guarantees predictable temporal behaviour

  5. State View • Time Triggered observation of states • Observe RT entities at predefined intervals • Intelligent input output • Observation grid • Intelligent sensor • Preprocesses raw data from input device • observes at finer granularity called Perception granularity

  6. State View • Intelligent actuator • Post-processes data from computer system before sending to output device • State Messages • Produced at observation points • Minimal synchronization requirement • No need for buffer management • Unidirectional (from RT entity)
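
The state-message properties on slide 6 can be illustrated with a single-slot holder. This is only a sketch: it assumes the usual state-message semantics (a new message overwrites the old one, and reading does not consume it), and the names `StateMessage` and `StateCell` are hypothetical, not part of MARS.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StateMessage:
    value: float       # observed state of the RT entity
    observed_at: int   # observation point on the grid, in ticks


class StateCell:
    """Single-slot holder: a new state message overwrites the previous one."""

    def __init__(self) -> None:
        self._current: Optional[StateMessage] = None

    def write(self, msg: StateMessage) -> None:
        # Overwrite in place; nothing is queued, so there is no buffer to manage.
        self._current = msg

    def read(self) -> Optional[StateMessage]:
        # Reading does not consume the message; every reader sees the latest state.
        return self._current


cell = StateCell()
cell.write(StateMessage(value=20.5, observed_at=100))
cell.write(StateMessage(value=21.0, observed_at=200))
latest = cell.read()
```

Because the producer never waits for a consumer and the consumer never drains a queue, the only coordination needed is that a read returns a complete message.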

  7. Structure • Clusters • Autonomous subsystems • Disjoint name spaces • State message exchanges • Composed of Fault-tolerant units (FTUs) • Real-time communication channel (TDMA) • FTU • Composed of replicated components • Active and shadow components

  8. [Figure: fault-tolerant units (FTUs)]

  9. Structure • Component • Smallest replaceable unit • Fail-silent (Correct results or none) • Termination upon failure • Task Execution • Task : Software inside component • Starts at predefined time • Proceeds without any communication or synchronization • Execution time is deterministic

  10. Operation • Results of periodic tasks sent as state messages • Execution time of communication is also predefined • A Real-time transaction is a progression of processing and communication actions between a stimulus from and a response to the environment. • Static scheduling (at compile time!) • At run-time, no surprises • Modes (operating, emergency)
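
Static scheduling as described on slide 10 can be sketched as a dispatch table that is built and checked offline, so that the run-time side only performs a lookup. The slot layout, cycle length, and action names below are invented for illustration and do not come from MARS.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Slot:
    start: int      # offset within the cycle, in ticks
    duration: int   # predefined execution or transmission time
    action: str     # hypothetical task or message name


def check_schedule(slots: List[Slot], cycle: int) -> None:
    """Offline check: slots are ordered, non-overlapping, and fit in the cycle."""
    end_of_previous = 0
    for slot in slots:
        if slot.start < end_of_previous:
            raise ValueError(f"{slot.action} overlaps the previous slot")
        end_of_previous = slot.start + slot.duration
    if end_of_previous > cycle:
        raise ValueError("schedule does not fit in the cycle")


def action_at(slots: List[Slot], cycle: int, now: int) -> Optional[str]:
    """Run-time lookup: which action owns the current tick, if any."""
    offset = now % cycle
    for slot in slots:
        if slot.start <= offset < slot.start + slot.duration:
            return slot.action
    return None  # idle tick


CYCLE = 10
SCHEDULE = [
    Slot(0, 2, "read_sensor"),
    Slot(2, 1, "send_state_msg"),
    Slot(4, 3, "compute_control"),
    Slot(8, 2, "send_result"),
]
check_schedule(SCHEDULE, CYCLE)  # any conflict is found here, before run time
```

With this table, `action_at(SCHEDULE, CYCLE, 25)` falls at offset 5 and returns `"compute_control"`, while offset 3 is an idle tick.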

  11. Fault-tolerance • Two levels of redundancy • Active redundancy at FTU level • If a component fails, standby becomes active • Time redundancy at component level • Every task is executed twice and results compared • TDMA monitor • Monitors temporal behaviour • Controls the output from component • Distributed clock synchronization
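
Time redundancy at component level can be sketched as a wrapper that runs a task twice and compares the results. Returning `None` on a mismatch is an assumption that ties this to the fail-silent rule from slide 9 (correct results or none); `run_twice` and `replay` are hypothetical helpers, not MARS functions.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def run_twice(task: Callable[[], T]) -> Optional[T]:
    """Execute the task twice; deliver a result only if both runs agree."""
    first = task()
    second = task()
    if first != second:
        # Mismatch: assume a transient fault and stay silent instead of
        # sending a possibly wrong result.
        return None
    return first


def replay(values: Iterable[T]) -> Callable[[], T]:
    """Test helper: a task that returns the given values on successive calls."""
    iterator = iter(values)
    return lambda: next(iterator)


healthy = run_twice(replay([42, 42]))   # both runs agree
faulty = run_twice(replay([7, 8]))      # runs disagree, no output
```

The cost is a doubled execution time per task, which the static schedule has to budget for.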

  12. Fault-tolerance • Replica determinism • All replicated components perform the same state changes at the same point in time • Prohibit reading of local time • All replicas should agree when to change mode • Component reintegration • i-state, h-state • Reintegration point: when size of h-state is small • New component gets the h-state at this point

  13. Summary • Maintenance • Failed component doesn't affect FTU • On-line reintegration after repair • Change in software • Does it fit in current schedule? • Otherwise, new mode with new schedule • Summary • Strict separation of functionality, timeliness and dependability. • Designed for temporal behaviour, testing simplified.

  14. Delta-4 XPA • Objectives • “A real-time system is not assured to meet deadlines outside operational envelope” • Bounded-demand school • operational envelope is predictable • Impractical assumption for complex systems • Unbounded-demand school • Complete definition of operational envelope is not possible • Graceful degradation if it falls outside the envelope • XPA implements hard real-time but falls into best-effort behaviour when required.

  15. [Figure: XPA layers: DELTASE; group management layer; time and group communication; abstract network layer (physical + MAC + firmware)]

  16. Architecture • Network infrastructure • FDDI supports urgent traffic, built-in fault tolerance • Token bus/ring has media redundancy for availability • Time • Internal time maintained by distributed time server • Clocks synchronized to tens of microseconds • External time – one of the standard time references • Group communication • Services from atomic multicast to datagram • Very fast services of varying reliability

  17. Architecture • Group communication • Distributed replication management • BestEffortN – guarantee delivery to N elements • BestEffortTo - guarantee delivery to named elements • AtLeastN, atLeastTo – guaranteed service even when sender fails • Group management • Distributed Group manager object • Management and distribution of groups of objects • Incorporates knowledge of various modes of replication

  18. Architecture • Application support environment (Deltase) • Client-server and producer-consumer interactions • Apps written using Deltase or converted using preprocessors • Timeliness • What to do under overload conditions? • Static off-line scheduling – too many possibilities • On-line scheduling – can find feasible schedules if the system is not overloaded.

  19. Timeliness • Scheduling policy uses “precedence” • Combination of priority and earliest-deadline • Few priority classes to avoid unfairness • Within priority class, earliest-deadline-first. • Design-time and run-time timeliness • Targetline : instant chosen by designer for provision of service • Liveline and deadline: earliest and latest time at which service may be provided • Violation of these detected at runtime and design-time actions defined.
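
The "precedence" policy on slide 19 (priority class first, earliest deadline within a class) can be sketched as a sort key. The sketch assumes that a lower class number is more urgent; the job names and numbers are made up.

```python
import heapq
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass(order=True)
class Job:
    priority_class: int                 # assumption: lower number = more urgent
    deadline: int                       # absolute deadline, in ticks
    name: str = field(compare=False)    # not part of the precedence


def dispatch_order(jobs: Iterable[Job]) -> List[str]:
    """Return job names in precedence order: class first, then earliest deadline."""
    heap = list(jobs)
    heapq.heapify(heap)
    return [heapq.heappop(heap).name for _ in range(len(heap))]


order = dispatch_order([
    Job(1, 50, "logger"),
    Job(0, 90, "alarm"),
    Job(0, 30, "control"),
    Job(1, 40, "display"),
])
```

Here `order` is `["control", "alarm", "display", "logger"]`: the class-0 jobs run first even though `alarm` has the latest deadline, and within each class the earlier deadline wins.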

  20. Preemption • Leader-follower model for replication • Decisions made by a privileged replica, i.e. the leader • Preemption point • Point at which an interrupt will be served • High-precedence msg arrives for a process not currently running • Increase the process's precedence to that of the msg • Causes the process to be scheduled • These actions are propagated to followers • Followers perform identical operations
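
The precedence raise on slide 20 can be sketched as follows. It assumes precedence is a `(priority class, deadline)` pair in which a smaller tuple is more urgent, and it models the leader's notification as a plain tuple that the follower replays; `Replica` and its methods are hypothetical names, not the XPA interface.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Precedence = Tuple[int, int]  # (priority class, deadline); smaller is more urgent (assumption)


@dataclass
class Replica:
    precedence: Dict[str, Precedence]

    def raise_for_message(self, process: str, msg_precedence: Precedence) -> None:
        """Raise the process to the message's precedence if that is more urgent."""
        if msg_precedence < self.precedence[process]:
            self.precedence[process] = msg_precedence

    def next_to_run(self) -> str:
        # The process with the most urgent precedence is scheduled next.
        return min(self.precedence, key=self.precedence.get)


initial = {"control": (0, 30), "logger": (1, 50)}
leader = Replica(dict(initial))
follower = Replica(dict(initial))

before = leader.next_to_run()            # "control"

# At a preemption point the leader serves a high-precedence message for "logger".
notification = ("logger", (0, 10))
leader.raise_for_message(*notification)

# The same action is propagated, and the follower performs the identical operation.
follower.raise_for_message(*notification)
after = follower.next_to_run()           # "logger"
```

Because the follower applies the leader's decision rather than deciding for itself, both replicas end up with the same precedence table.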

  21. Desynchronization • Followers must not drift too far from the leader • Followers too fast • Reach the preemption point before the leader • Remain blocked until the leader notifies • Followers too slow • Leader timestamps notifications • If a follower has not executed the action by T + t(desync) • Desynchronization event raised • Another follower takes over
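
The "follower too slow" check can be sketched as a comparison against the bound T + t(desync), where T is the leader's timestamp on the notification. The function name and the integer tick time base are assumptions for illustration.

```python
from typing import Optional


def is_desynchronized(
    notified_at: int,            # T: leader's timestamp on the notification
    t_desync: int,               # t(desync): allowed lag
    now: int,                    # current time on the synchronized clock
    executed_at: Optional[int],  # when the follower executed the action, if it has
) -> bool:
    """True if the follower missed the bound T + t(desync)."""
    bound = notified_at + t_desync
    if executed_at is not None:
        return executed_at > bound
    # Not executed yet: the follower is late once the bound has passed.
    return now > bound


on_time = is_desynchronized(notified_at=100, t_desync=20, now=130, executed_at=115)
too_slow = is_desynchronized(notified_at=100, t_desync=20, now=130, executed_at=None)
```

In the second case the bound of 120 has passed without the action being executed, so a desynchronization event would be raised and another follower would take over.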

  22. Summary • Communication support using groups • Oriented to distributed computing • Tradeoffs between QoS and efficiency • Group mgr uses atomic multicast for orderly delivery • Leader-follower uses reliable, non-ordered delivery • Group management service • Executes leader-follower, detects replica failure • Clones the replica at another node.
