
FT-ERF Fault-Tolerance in an Event Rule Framework for Distributed Systems

FT-ERF Fault-Tolerance in an Event Rule Framework for Distributed Systems. Hillary Caituiro-Monge, Graduate Student. Advisor: Javier Arroyo-Figueroa, Ph.D. Presentation 3. Presentation Objectives. Understand the Architecture of the Scalable and Fault-Tolerant ERF Architecture


Presentation Transcript


  1. FT-ERF: Fault-Tolerance in an Event Rule Framework for Distributed Systems. Hillary Caituiro-Monge, Graduate Student. Advisor: Javier Arroyo-Figueroa, Ph.D. Presentation 3

  2. Presentation Objectives • Understand the Scalable and Fault-Tolerant ERF Architecture • Describe the Challenges of Active Replication • Analyze the core deficiencies among RUBIES replicas that must be resolved to achieve fault tolerance: • Lack of Timing Synchronization of Rule Evaluation Cycles (REC) • Lack of Consistency of Event Sets (ES) • Distributed Agreement Protocol

  3. Presentation Objectives (continued) • Introduce the New Research Objective

  4. SCALABLE AND FAULT TOLERANT ERF ARCHITECTURE [Diagram: grid of RUBIES instances. Distribution dimension (columns): RUBIES(γ11,δ1), RUBIES(γ21,δ2), ..., RUBIES(γN1,δN). Replication dimension (rows): RUBIES(γi1,δi), RUBIES(γi2,δi), ..., RUBIES(γiM,δi) for each column i.]
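
The index pattern of the grid on this slide can be sketched in a few lines of Python. The slide does not define γ and δ, so the sketch only reproduces the labels: δ carries the distribution index i and γ carries both i and the replication index j. The function names `build_grid` and `replica_group` are hypothetical and used only for illustration.

```python
from itertools import product


def build_grid(n_groups, m_replicas):
    """Map (i, j) to a label, with i in 1..N (distribution) and j in 1..M (replication)."""
    return {
        (i, j): f"RUBIES(γ{i}{j}, δ{i})"
        for i, j in product(range(1, n_groups + 1), range(1, m_replicas + 1))
    }


def replica_group(grid, i):
    """Return the labels of all replicas in column i (same δ index)."""
    return [label for (gi, _), label in sorted(grid.items()) if gi == i]


# Hypothetical sizes: N = 3 columns, M = 2 replicas per column.
grid = build_grid(3, 2)
group_two = replica_group(grid, 2)
```

Under this reading, a replica group is one column of the grid, and fault tolerance concerns keeping the members of each column consistent with one another.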

  5. REPLICATION CLASS DIAGRAM

  6. DISTRIBUTION CLASS DIAGRAM

  7. Challenges of Active Replication • Strong replica consistency • All replicas must have the same state after each method invocation • Detection and suppression of duplicated invocations
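
Duplicate detection and suppression can be illustrated with a minimal Python sketch. It assumes that every logical invocation carries an identifier that is the same no matter which replica of the caller sent it; the identifier format, the `DuplicateSuppressor` class, and the `Counter` target are all hypothetical and not taken from the slides or from any CORBA product.

```python
class DuplicateSuppressor:
    """Execute each logical invocation once and replay the cached result for duplicates."""

    def __init__(self, target):
        self._target = target
        self._results = {}  # invocation_id -> cached result

    def invoke(self, invocation_id, method, *args):
        # A duplicate (same ID sent by another replica of the caller) is not re-executed.
        if invocation_id in self._results:
            return self._results[invocation_id]
        result = getattr(self._target, method)(*args)
        self._results[invocation_id] = result
        return result


class Counter:
    """Toy stateful object used as the invocation target."""

    def __init__(self):
        self.value = 0

    def increment(self, by):
        self.value += by
        return self.value


counter = Counter()
guard = DuplicateSuppressor(counter)
first = guard.invoke(("client-group-A", 1), "increment", 5)
second = guard.invoke(("client-group-A", 1), "increment", 5)  # duplicate, suppressed
third = guard.invoke(("client-group-A", 2), "increment", 5)   # new invocation
```

Without the guard, the duplicate would increment the counter a second time and the state would no longer match that of a replica that received the invocation only once.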

  8. Lack of Timing Synchronization of Rule Evaluation Cycles (REC) among RUBIES replicas • The REC is a source of non-deterministic behavior among RUBIES replicas • It is not triggered by a client’s direct or indirect method invocation • It is always running • Therefore replica consistency cannot be reached by means of interface-based consistency mechanisms

  9. Lack of Timing Synchronization of Rule Evaluation Cycles (REC) among RUBIES replicas • Each replica in a group has its own independent REC, whose • Starting time differs • Duration differs • As a result, each group member (replica) runs each REC over a different set of events.
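
The effect of unsynchronized cycles can be shown with a small Python sketch. It assumes a simplified model in which a REC covers exactly the events that arrive during its time window; the event names, arrival times, and the `events_in_cycle` helper are hypothetical.

```python
def events_in_cycle(arrivals_ms, start_ms, duration_ms):
    """Return the events whose arrival time falls in [start, start + duration)."""
    end_ms = start_ms + duration_ms
    return {name for name, t in arrivals_ms.items() if start_ms <= t < end_ms}


# Both replicas observe the same arrivals (times in milliseconds).
arrivals_ms = {"e1": 500, "e2": 1200, "e3": 2100, "e4": 2900}

# The "same" REC starts at a different time and lasts a different time on each replica.
replica_a = events_in_cycle(arrivals_ms, start_ms=0, duration_ms=2000)
replica_b = events_in_cycle(arrivals_ms, start_ms=1000, duration_ms=1500)
```

Here replica A evaluates its rules over e1 and e2 while replica B evaluates them over e2 and e3, so the two replicas can fire different rules in what should be equivalent cycles.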

  10. Lack of Consistency of Event Sets (ES) among RUBIES replicas • The ES is a source of non-deterministic behavior among RUBIES replicas • The ES content changes differently for each replica • The ES content changes for two reasons: • Incoming events • Dead events

  11. Lack of Consistency of Event Sets (ES) among RUBIES replicas • The ES content changes differently for each replica as a consequence of the communication delay in delivering events to each replica.
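
A minimal Python sketch of how delivery delay makes event sets diverge. It assumes each event has a send time and a lifetime measured from that send time, and that each replica sees its own delivery delay per event; these modeling choices, the numbers, and the `event_set_at` helper are hypothetical.

```python
def event_set_at(events, delays_ms, now_ms):
    """Return the events a replica holds at now_ms: already delivered and not yet dead."""
    live = set()
    for name, (sent_ms, lifetime_ms) in events.items():
        delivered = sent_ms + delays_ms[name] <= now_ms
        dead = now_ms >= sent_ms + lifetime_ms
        if delivered and not dead:
            live.add(name)
    return live


# name -> (send time, lifetime), both in milliseconds.
events = {"e1": (0, 1000), "e2": (1000, 5000), "e3": (2000, 5000)}

# Per-replica delivery delays for the same events.
delays_a = {"e1": 100, "e2": 100, "e3": 100}
delays_b = {"e1": 100, "e2": 400, "e3": 1500}

es_a = event_set_at(events, delays_a, now_ms=2500)
es_b = event_set_at(events, delays_b, now_ms=2500)
```

At the same instant, replica A already holds e3 while replica B is still waiting for it, and e1 has died on both. Even with perfectly synchronized cycles, the two replicas would evaluate rules over different event sets.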

  12. What is the problem? • Each replica belonging to the same group includes different events in each consecutive, equivalent REC execution. • As a result, each RUBIES replica posts different events at different times and with different state. • Such behavior is a problem for load distribution and/or replication.

  13. What is the issue? • Strong replica consistency • Synchronize rule evaluation cycles among RUBIES replicas • Make event sets consistent among RUBIES replicas

  14. How to do it? • Distributed Agreement or Consensus Protocol (currently working on this) • RUBIES replicas must start each REC after an agreement. • Each REC must have a unique ID • RECs with the same ID must run simultaneously
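
The three requirements on this slide can be sketched as a simple barrier in Python. The slides do not specify the protocol (it is described as work in progress), so this is only an illustration: a single in-process coordinator that is itself a single point of failure, not a fault-tolerant consensus protocol. The `RecCoordinator` class and the replica names are hypothetical.

```python
class RecCoordinator:
    """Start a REC only after every replica in the group has reported ready."""

    def __init__(self, replica_ids):
        self._replicas = frozenset(replica_ids)
        self._ready = set()
        self.current_rec_id = 0  # ID of the most recently started REC

    def ready(self, replica_id):
        """Record a ready vote; return the new REC ID once all replicas agree, else None."""
        if replica_id not in self._replicas:
            raise ValueError(f"unknown replica: {replica_id}")
        self._ready.add(replica_id)
        if self._ready == self._replicas:
            # Agreement reached: assign the next unique ID and release all replicas together.
            self._ready.clear()
            self.current_rec_id += 1
            return self.current_rec_id
        return None


coordinator = RecCoordinator(["r1", "r2", "r3"])
round_one = [coordinator.ready(r) for r in ("r1", "r2", "r3")]
round_two = [coordinator.ready(r) for r in ("r3", "r1", "r2")]
```

In a real deployment the returned ID would be broadcast to all replicas so that they begin the REC with that ID together; a replicated or leaderless agreement would be needed to tolerate failure of the coordinator.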

  15. How to do it? (continued) • Distributed Agreement or Consensus Protocol (currently working on this) • RUBIES replicas must include the same events in RECs with the same ID • The agreement must include which events will be considered • Sliding window
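
One possible way to agree on the events of a REC is sketched below in Python. The slides name the goal and a sliding window but not the rule, so the rule used here is an assumption: a REC includes only events that every replica has already received and whose timestamps fall inside the current window. The `agree_on_events` function, event names, and timestamps are hypothetical.

```python
def agree_on_events(proposals, timestamps, window_start, window_size):
    """Return the sorted events that all replicas hold and that lie in the window.

    proposals maps replica ID to the collection of event IDs that replica has received.
    """
    if not proposals:
        return []
    common = set.intersection(*(set(p) for p in proposals.values()))
    window_end = window_start + window_size
    return sorted(e for e in common if window_start <= timestamps[e] < window_end)


timestamps = {"e1": 10, "e2": 20, "e3": 30, "e4": 45}

# REC k: replica r1 has not received e4 yet, r2 never received e1.
proposals = {
    "r1": {"e1", "e2", "e3"},
    "r2": {"e2", "e3", "e4"},
    "r3": {"e1", "e2", "e3", "e4"},
}
rec_k = agree_on_events(proposals, timestamps, window_start=15, window_size=20)

# REC k+1: e4 has now reached r1 and the window has slid forward.
proposals["r1"].add("e4")
rec_k_plus_1 = agree_on_events(proposals, timestamps, window_start=35, window_size=20)
```

Every replica that applies the same agreed list evaluates the REC over identical events; an event that has not reached all replicas simply waits for a later window.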

  16. New Research Objective • The proposed research will focus on the fault-tolerance problem in ERF. • The main purpose is to design and implement a strong replica consistency mechanism to achieve fault tolerance.

  17. Procedure • Select active replication software • Must be compatible with Fault-Tolerant CORBA • Must be portable • Must not be intrusive • Must not be commercial • Build a Distributed Agreement Protocol • As described above

  18. OGS (Object Group Service) • Non-intrusive • Service approach • Requires no change to the underlying ORB • Compliant with the CORBA specification • Not proprietary • Designed and implemented as a set of CORBA objects, which makes it interoperable between different ORBs • There are plans to extend OGS and make it compliant with the FT-CORBA specification • White box

  19. Eternal Systems FTORB • Non-intrusive • Interception approach • CORBA objects above the ORB support the interfaces of the OMG Fault-Tolerant standard specifications • Replication mechanisms below the ORB provide strong replica consistency • Interceptors achieve independence from the ORB and the applications

  20. Others • GMS (Group Communication Service) • IRL • Isis+Orbix • Electra • AQuA

  21. Comparison among Fault-Tolerant CORBA systems. Source: Carlo Marchetti et al., “Architectural Issues on Fault Tolerance in CORBA”, in Proceedings of the SSGRR 2000 Computer & Business Conference, L'Aquila, Italy, 2000

  22. Conclusion • Fault tolerance in ERF requires the design and implementation of an agreement protocol in order to achieve strong replica consistency. • Strong replica consistency will enable ERF in distributed scenarios such as replication, load distribution, and load balancing.

  23. Thanks
