Fault and Intrusion Tolerant (FIT) Event Broker & BFT-SMaRt

Fault and Intrusion Tolerant (FIT) Event Broker & BFT-SMaRt. A. Casimiro, D. Kreutz, A. Bessani, J. Sousa, I. Antunes, P. Veríssimo. University of Lisboa, Portugal. Meeting PT, November 27, 2012.


Presentation Transcript


  1. Fault and Intrusion Tolerant (FIT) Event Broker & BFT-SMaRt. A. Casimiro, D. Kreutz, A. Bessani, J. Sousa, I. Antunes, P. Veríssimo. University of Lisboa, Portugal. Meeting PT, November 27, 2012

  2. Cloud Infrastructures • Alert! Cloud infrastructures are one of the new hot targets of attacks! [Diagram labels: Switching and Routing; Storage farm; Processing farm; Monitoring Tools and Control Engines; Events and Control flows]

  3. Example scenario: Portugal Telecom Cloud Computing Infrastructure • SmartCloud product • First and main problem: • Centralized monitoring approach • Diversity of monitoring tools • ArcSight, Pulse, SCOM • Problems: faults and attacks; diversity is hard to achieve in practice. [Diagram labels: agent-based monitoring (agent with ArcSight); agentless monitoring (monitoring probe); events sent to ArcSight (engine) or other tool]

  4. The TRONE approach • 1. Fault and Intrusion Tolerant (FIT) Event Broker • 2. Automated Failure Diagnosis • 3. Multi-homing for fast reconfiguration

  5. FIT Event Broker Goals and challenges • Overarching goals: • To provide support for trustworthy and resilient monitoring of cloud/datacenter infrastructures • To achieve improved Quality of Protection without neglecting Quality of Service (performance) needs • Some specific challenges: • Deal with large flows of information (events) • Support different kinds of events (e.g. different criticality) • Low intrusiveness and easy integration

  6. FIT Event Broker Assumptions • System entities: • Probes, event collectors/brokers, consoles • Some event processing may be done by collectors • Fully connected network • E.g., all the entities lie in the same monitoring VLAN • Partially synchronous system • Clocks may be used to timestamp events • Faults • Some FIT brokers may crash or fail in a Byzantine way • We do not require/enforce clients (probes/consoles) to be correct • If this is a problem for monitoring, then it must also be solved

  7. FIT Event Broker Baseline design options • Topic-based Publish-Subscribe paradigm • Good fit to considered scenarios • State Machine Replication • Active replication is better for Byzantine fault tolerance • f out of n replicas of a FIT Broker may fail in a Byzantine way • Public-key cryptography • Client authentication, avoid attacks from malicious probes • Event channels with support for QoP and QoS • Differentiated fault-tolerance support (e.g. crash only or BFT)

  8. FIT Event Broker High-level architectural view

  9. FIT Event Broker Interface • Create event channel (In: TAG and CLASS) • Destroy event channel (In: TAG) • Register to channel (In: TAG) • Publish event (In: EVENT) • Subscribe to channel (In: TAG) • Receive event (Out: EVENT)
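
The six operations on the interface slide can be sketched as a small in-memory Java class. This is only an illustration under stated assumptions, not the TRONE API: the class and method names, the `ChannelClass` values, the fields of `Event`, the publisher identifier passed to `register`, and the callback used for "receive event" are all hypothetical, and nothing here is replicated or authenticated.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;

/** Hypothetical channel classes; the slides only mention "crash only or BFT". */
enum ChannelClass { CRASH_ONLY, BFT }

/** Hypothetical event shape; the slides only say that publish takes an EVENT. */
final class Event {
    final String tag;
    final String publisherId;
    final String payload;

    Event(String tag, String publisherId, String payload) {
        this.tag = tag;
        this.publisherId = publisherId;
        this.payload = payload;
    }
}

/** Unreplicated, in-memory sketch of the topic-based interface. */
final class LocalEventBroker {
    private static final class Channel {
        final ChannelClass channelClass;
        final Set<String> publishers = new HashSet<>();
        final List<Consumer<Event>> subscribers = new ArrayList<>();

        Channel(ChannelClass channelClass) { this.channelClass = channelClass; }
    }

    private final Map<String, Channel> channels = new HashMap<>();

    /** Create event channel (In: TAG and CLASS). False if the tag already exists. */
    boolean createChannel(String tag, ChannelClass channelClass) {
        return channels.putIfAbsent(tag, new Channel(channelClass)) == null;
    }

    /** Destroy event channel (In: TAG). */
    boolean destroyChannel(String tag) {
        return channels.remove(tag) != null;
    }

    /** Register to channel (In: TAG); the publisher id is an assumption. */
    boolean register(String tag, String publisherId) {
        Channel channel = channels.get(tag);
        return channel != null && channel.publishers.add(publisherId);
    }

    /** Subscribe to channel (In: TAG); "receive event" is modelled as a callback. */
    boolean subscribe(String tag, Consumer<Event> onEvent) {
        Channel channel = channels.get(tag);
        if (channel == null) return false;
        channel.subscribers.add(onEvent);
        return true;
    }

    /** Publish event (In: EVENT). Returns how many subscribers received it. */
    int publish(Event event) {
        Channel channel = channels.get(event.tag);
        if (channel == null || !channel.publishers.contains(event.publisherId)) return 0;
        for (Consumer<Event> subscriber : channel.subscribers) subscriber.accept(event);
        return channel.subscribers.size();
    }
}
```

A probe would call `register` and then `publish`, while a console would call `subscribe` with a callback; events from publishers that never registered are dropped.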

  10. FIT Event Broker Internal state • From the SMR perspective, it is important to identify the relevant state that must be kept consistent across replicas • Data related to the broker configuration • Existing channels and their CLASS • Registered publishers and subscribers • Data related to events • Events that are ready to be delivered • All client input that affects the FIT broker state (e.g. channel and subscription data, some events) must be handled as a state machine command
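
To make the state machine view concrete, the sketch below keeps the kinds of state listed on this slide (channels and their class, subscribers, pending events) and changes them only through ordered commands. The textual command format and all names are assumptions made for illustration; they are not taken from the FIT broker. The point is that replicas applying the same command sequence end in the same state.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.SortedMap;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

/** Deterministic replica state, updated only through ordered commands. */
final class BrokerReplicaState {
    // Sorted collections give every replica the same iteration order,
    // so snapshot() is comparable across replicas.
    private final SortedMap<String, String> channelClassByTag = new TreeMap<>();
    private final SortedMap<String, SortedSet<String>> subscribersByTag = new TreeMap<>();
    private final Deque<String> pendingEvents = new ArrayDeque<>();

    /**
     * Applies one command in a hypothetical text format:
     * "CREATE tag class", "SUBSCRIBE tag client" or "PUBLISH tag payload".
     * Malformed or unknown commands are ignored, identically on every replica.
     */
    void apply(String command) {
        String[] parts = command.split(" ", 3);
        if (parts.length < 3) return;
        String tag = parts[1];
        switch (parts[0]) {
            case "CREATE":
                channelClassByTag.putIfAbsent(tag, parts[2]);
                break;
            case "SUBSCRIBE":
                if (channelClassByTag.containsKey(tag)) {
                    subscribersByTag.computeIfAbsent(tag, k -> new TreeSet<>()).add(parts[2]);
                }
                break;
            case "PUBLISH":
                if (channelClassByTag.containsKey(tag)) {
                    pendingEvents.addLast(tag + ":" + parts[2]);
                }
                break;
            default:
                break;
        }
    }

    /** Textual view of the whole state, used here to compare replicas. */
    String snapshot() {
        return channelClassByTag + "|" + subscribersByTag + "|" + pendingEvents;
    }
}
```

The sketch deliberately avoids clocks, randomness and unordered collections, since any of those could make two correct replicas diverge after applying the same commands.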

  11. BFT-SMaRt Overview • Java-based platform for BFT SMR, available at http://code.google.com/p/bft-smart/ • Actively being developed and improved in our group • BFT SMR “common” features • State machine programming model • n ≥ 3f+1 replicas required • A small step away from being a commercial product • Advanced features • Replica recovery (state transfer) • Reconfigurations • Extensible API: e.g. custom voter
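
The n ≥ 3f+1 requirement translates into simple arithmetic, sketched below. The class and method names are hypothetical. The quorum size ⌈(n+f+1)/2⌉ is the usual Byzantine quorum formula and is an addition here; the slides do not state which quorum size BFT-SMaRt uses.

```java
/** Replica-count arithmetic for BFT state machine replication. */
final class BftReplicaMath {
    private BftReplicaMath() {}

    /** Smallest n satisfying n >= 3f + 1. */
    static int minReplicas(int f) {
        if (f < 0) throw new IllegalArgumentException("f must be non-negative");
        return 3 * f + 1;
    }

    /** Largest f satisfying n >= 3f + 1 for a given n. */
    static int maxFaults(int n) {
        if (n < 1) throw new IllegalArgumentException("n must be positive");
        return (n - 1) / 3;
    }

    /** Usual Byzantine quorum size, ceil((n + f + 1) / 2); assumed, not from the slides. */
    static int quorumSize(int n, int f) {
        return (n + f + 2) / 2; // integer form of the ceiling
    }

    public static void main(String[] args) {
        for (int f = 0; f <= 3; f++) {
            int n = minReplicas(f);
            System.out.println("f=" + f + " n=" + n + " quorum=" + quorumSize(n, f));
        }
    }
}
```

For example, tolerating one Byzantine broker replica needs four replicas, and adding a fifth or sixth replica does not raise the number of tolerated faults; seven are needed for f = 2.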

  12. BFT-SMaRt Service invocation • Agreement on order performed by SMaRt [Diagram labels: PROBE; FIT Broker state]

  13. BFT-SMaRt Execution and response • Commands are delivered to the FIT broker, which updates the state/queues and replies • Voting on client side
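
Client-side voting can be sketched as follows, assuming the common rule that a reply is accepted once f+1 distinct replicas have returned the same value, so that at least one of them is correct. This is not BFT-SMaRt's actual voter API; `ReplyVoter`, `onReply` and the use of strings for replies are hypothetical simplifications.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/** Accepts a reply once f + 1 distinct replicas agree on it. */
final class ReplyVoter {
    private final int f;
    // For each reply value, the set of replica ids that sent it.
    private final Map<String, Set<Integer>> votersByReply = new HashMap<>();

    ReplyVoter(int f) {
        if (f < 0) throw new IllegalArgumentException("f must be non-negative");
        this.f = f;
    }

    /**
     * Records one reply. Returns the reply once f + 1 distinct replicas
     * have sent that same value, and an empty Optional otherwise.
     */
    Optional<String> onReply(int replicaId, String reply) {
        Set<Integer> voters = votersByReply.computeIfAbsent(reply, k -> new HashSet<>());
        voters.add(replicaId); // a repeated vote from the same replica counts once
        return voters.size() >= f + 1 ? Optional.of(reply) : Optional.empty();
    }

    public static void main(String[] args) {
        ReplyVoter voter = new ReplyVoter(1);
        System.out.println(voter.onReply(0, "delivered"));
        System.out.println(voter.onReply(3, "forged"));
        System.out.println(voter.onReply(1, "delivered"));
    }
}
```

With f = 1, a single faulty replica can never get its own value accepted, because any accepted value needs a second, distinct replica behind it.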

  14. BFT-SMaRt Implementation & Evaluation • The FIT Broker is currently being implemented… • …and integrated with BFT-SMaRt • Evaluation: • Throughput • Aim is to deal with 40K events/sec • Resilience • Measure performance under attack • Verify recovery and reconfiguration capabilities • A simple demo is available

  15. BFT-SMaRt Implementation & Evaluation • Preliminary results available [DAIS 2012] • [Figure: throughput for up to 100 channels]

  16. Summary • FIT Event Broker – Event dissemination support • For easier deployment of multiple monitoring tools • Manage which events are propagated, to which consoles, with which QoS • BFT-SMaRt – Byzantine fault tolerant replication • First usable implementation of BFT replication • Leading edge worldwide • Resilience against malicious attacks with small overhead • Portugal Telecom’s cloud infrastructure is being used as a real use case for application and evaluation of the work
