Fault Tolerance and Security Geraint Price Information Security Group Royal Holloway
Outline • Introduction • Background • Security • Fault Tolerance • Major Contributions • A Personal Perspective • Future Challenges • Conclusions Security and Protection of Information 2005
Introduction • Computer Security and Fault Tolerance share a subset of goals • The ability to tolerate or mitigate failure in a computer system • The assumptions that underpin traditional solutions make their merger non-trivial • Security: Remove any replication and tighten control • Fault Tolerance: Replicate and compare results
Introduction – II • Recent cross-over research began with Reiter’s work on Rampart (mid 90s) • Spawned a new interest in the application of fault tolerant mechanisms in security: • Tacoma: Provision of replication for mobile agents • MAFTIA: A large-scale project to study survivability in Internet applications • We concentrate on two avenues of research: • Development of the fault model • Progression of the replication mechanisms
Background – Security • Why the relatively late interaction? • In our opinion, it has much to do with the history of computer security: • Trusted Computing Base • Research was weighted towards confidentiality and integrity – not availability • Others had noted this gap in the computer security literature [Needham, ’94]
Background – Security – II • Very little in the open literature dealt with Denial of Service (the absence of availability) • A notable exception [Gligor, ’86]: • An increase in Maximum Waiting Time (MWT) • Legitimate and other forms of denial of service – system returns before MWT • Interesting exception [Turn and Habibi, ’86]: • A security function is fault tolerant if, given the presence of a fault, the system’s security policy remains intact
Background – Fault Tolerance • Fault Modelling: • Fault → Error → Failure • Fault: Adjudged or hypothesized cause of an error • Error: The part of the system state that may lead to failure • Failure: Service deviates from specification • Four techniques within the dependability paradigm: • Fault prevention, fault tolerance, fault removal, fault forecasting
Background – Fault Tolerance – II • Replication Mechanisms: • Underlying group communication mechanisms • Early work conducted at Cornell University: • Isis toolkit: CBCAST (Causal broadcast), ABCAST (Atomic broadcast) • Group Structures: • State Machine Approach: Active replication, which masks the failure of a proportion of the servers • Primary Backup Approach: Passive replication; if the primary fails, a backup takes over
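The two group structures above can be contrasted with a minimal Python sketch. The function names (`vote`, `primary_backup`) and the use of `ConnectionError` to signal a crashed server are assumptions made for illustration, not part of Isis or any system named in the talk. The voting rule assumes at most f replicas return wrong answers, so any value reported by f + 1 replicas is backed by at least one correct replica.

```python
from collections import Counter
from typing import Callable, Hashable, Optional, Sequence


def vote(replies: Sequence[Hashable], f: int) -> Optional[Hashable]:
    """Active replication: accept a reply only if at least f + 1 replicas agree.

    Returns None when no value reaches the f + 1 threshold.
    """
    if not replies:
        return None
    value, count = Counter(replies).most_common(1)[0]
    return value if count >= f + 1 else None


def primary_backup(servers: Sequence[Callable[[str], str]], request: str) -> str:
    """Passive replication: use the primary, fall over to backups on a crash.

    Only crash failures (modelled here as ConnectionError) are handled;
    a server that answers wrongly is not detected.
    """
    for server in servers:
        try:
            return server(request)
        except ConnectionError:
            continue  # this replica has crashed, try the next one
    raise RuntimeError("all replicas failed")
```

The sketch makes the trade-off visible: voting masks a wrong answer at the cost of running every replica, while the primary backup loop does less work but only copes with servers that stop responding.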
Major Contributions • Rampart • Castro and Liskov • Quorum Systems • MAFTIA • Tacoma • Other Projects
Rampart • Group communication implemented by Reiter [Reiter, ’94 & ’96] • First system to implement replicated service based on Byzantine agreement protocols • Main communication structure derived from the earlier work on Isis at Cornell • Extension over the Isis work through its ability to tolerate the malicious failure of a proportion of the servers within the group
Rampart – II • Choices over communication primitives within Rampart: • State machine approach to replication • Digital signatures to provide message authentication in group communication primitive • Lack of efficiency and scalability • Although it has its drawbacks, it inspired the majority of the remaining work • The main research agenda as a result was the search for more efficient protocols
Castro & Liskov • A new replication mechanism to overcome efficiency concerns [Castro & Liskov, ’99] • Two main differences from Rampart: • Primary backup model • Pair-wise symmetric key Message Authentication Codes • A test implementation over NFS was only 3% slower than Digital Unix NFS • Efficiency gains are due to optimistic protocols under normal operation
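As a rough illustration of pair-wise symmetric key MACs replacing digital signatures, the Python sketch below has a sender attach one MAC per receiving replica, each computed under the key it shares with that replica. The function names, the replica identifiers, and the choice of HMAC-SHA-256 are assumptions for this example; the slide does not state which MAC construction was used.

```python
import hashlib
import hmac
from typing import Dict


def make_authenticator(message: bytes, pairwise_keys: Dict[str, bytes]) -> Dict[str, bytes]:
    """Compute one MAC per receiver, keyed by the secret shared with that receiver."""
    return {
        replica_id: hmac.new(key, message, hashlib.sha256).digest()
        for replica_id, key in pairwise_keys.items()
    }


def verify_entry(message: bytes, replica_id: str, key: bytes,
                 authenticator: Dict[str, bytes]) -> bool:
    """A receiver checks only its own entry, using its own shared key."""
    tag = authenticator.get(replica_id)
    if tag is None:
        return False
    expected = hmac.new(key, message, hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected)  # constant-time comparison
```

Each receiver performs a single cheap symmetric operation, which is where the efficiency gain over signature verification comes from. The cost is that a MAC convinces only the replica holding the matching key, so it cannot be forwarded to a third party as proof in the way a signature can.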
Quorum Systems • Data replication in a group of servers [Malkhi & Reiter, ’97] • Move away from the state machine approach • Increase scalability by removing the server-to-server communication for a read operation • However, their work does require server-to-server communication for a state update, and hence for a write operation
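A client-side read that needs no server-to-server communication might look like the Python sketch below. It assumes the threshold masking quorum construction from the Byzantine quorum literature (n ≥ 4f + 1 servers, quorums of size ⌈(n + 2f + 1)/2⌉) and a read rule that trusts only (timestamp, value) pairs returned by at least f + 1 servers. These parameters and the function names are assumptions for illustration and are not stated on the slide.

```python
import math
from collections import Counter
from typing import List, Optional, Tuple

Reply = Tuple[int, str]  # (timestamp, value) as reported by one server


def masking_quorum_size(n: int, f: int) -> int:
    """Quorum size under the assumed threshold construction."""
    if n < 4 * f + 1:
        raise ValueError("assumed construction needs n >= 4f + 1")
    return math.ceil((n + 2 * f + 1) / 2)


def read_result(replies: List[Reply], f: int) -> Optional[Reply]:
    """Pick the newest pair vouched for by at least f + 1 servers.

    With at most f faulty servers, a forged pair can appear at most f times,
    so it never reaches the f + 1 threshold.
    """
    counts = Counter(replies)
    vouched = [pair for pair, count in counts.items() if count >= f + 1]
    if not vouched:
        return None
    return max(vouched, key=lambda pair: pair[0])
```

The client contacts a quorum directly and filters the replies itself, which is why the read path scales without involving communication among the servers.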
MAFTIA • Malicious and Accidental Fault Tolerance for Internet Applications • Large EU-funded project: • 6 partners • Expertise in fault tolerance, distributed computing, cryptography, formal verification and intrusion detection • 3 main areas of work: conceptual framework and architecture; mechanisms and protocols; formal verification and assessment
MAFTIA – Conceptual Model • Extension of the Fault → Error → Failure model • Re-defining a Fault as an Intrusion: • Intrusion: A malicious, externally-induced fault resulting from an attack that has been successful in exploiting a vulnerability • Attack: A malicious interaction fault, through which an attacker aims to deliberately violate one or more security properties • Vulnerability: A fault created during development of the system, or during operation, that could be exploited to create an intrusion
MAFTIA – Conceptual Model – II • In breaking down an Intrusion, they highlight the possibility of targeting the removal or prevention of both Attacks and Vulnerabilities • Although MAFTIA’s main focus was Intrusion Tolerance, they classify a whole range of security mechanisms according to the fault prevention, tolerance, removal and forecasting paradigms mentioned earlier
MAFTIA – Hybrid Failure Model • Composite fault model with a hybrid failure assumption • The presence and severity of vulnerabilities, attacks and intrusions vary from component to component • Assumptions present in their architectural design: • Built on top of trustworthy components: • Java Card • Trusted Timely Computing Base (TTCB) • Trusted Middleware component
MAFTIA – Hybrid Failure Model – II • The key element of the MAFTIA architecture is the TTCB: • Provision of time-based services through the use of a Control Channel • Dedicated and heavily protected security kernel – fail silent rather than arbitrary failure • Implementation of a reliable broadcast protocol that can tolerate up to f failures in a group of f+2 processes [Correia et al., ’02]
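To see what the f+2 figure buys, the short Python sketch below compares it with the 3f + 1 group size commonly required for Byzantine agreement when no trusted component is available. The function name and the framing as a "minimum group size" are assumptions for illustration; only the f+2 figure comes from the slide.

```python
def min_group_size(f: int, trusted_component: bool) -> int:
    """Smallest group that tolerates f arbitrary failures under each assumption.

    trusted_component=True  -> f + 2, the figure quoted for the TTCB-based protocol
    trusted_component=False -> 3f + 1, the usual bound without trusted hardware
    """
    if f < 0:
        raise ValueError("f must be non-negative")
    return f + 2 if trusted_component else 3 * f + 1


if __name__ == "__main__":
    for f in range(1, 5):
        print(f, min_group_size(f, True), min_group_size(f, False))
```

The gap widens linearly with f, which is the practical argument for pushing a small fail-silent kernel into the architecture.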
Tacoma • Tromsø And COrnell Moving Agents project • Security and fault tolerance were two key elements • Resilience for the agent on a potentially malicious host: • Replicated agents, with voting mechanisms • Fault tolerance for mobile agents: • Extension of the primary backup approach • “… preserving the necessary consistency between replicas can be done efficiently only within a local-area network”
Other Projects • COCA: • Replication of a CA to provide availability • Byzantine quorum systems • Proactive recovery • OASIS (Organically Assured and Survivable Information Systems) • Umbrella project which sponsors separate work items in the field of resilient security
A Personal Perspective • Control of Execution: • Adapting fault tolerant principles for a secure environment can come down to a principle of control • In the Fault → Error → Failure model, breaking the chain requires retaining control • Whose security policy are we protecting? • Proposed mechanisms for allowing a client to share that control [Price, ’99]
A Personal Perspective – II • Use of Other Mechanisms: • Some of our previous work identified the possibility of using timing checks [Price, ’01] • Remove the attacker’s ability to delay or replay messages with impunity • Some variants of replay attacks rely on this • With hindsight, there is an interesting comparison with MAFTIA’s use of a Control Channel
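One generic way to limit delay and replay is a freshness check on each received message, sketched below in Python. This is not the mechanism from [Price, ’01]; the class name, the nonce field, and the assumptions of loosely synchronised clocks and already-authenticated messages are all introduced here for illustration.

```python
from typing import Set


class FreshnessChecker:
    """Reject messages that are too old, dated in the future, or already seen."""

    def __init__(self, max_delay: float) -> None:
        self.max_delay = max_delay   # largest acceptable transit time, in seconds
        self.seen: Set[str] = set()  # nonces of accepted messages (unbounded in this sketch)

    def accept(self, nonce: str, sent_at: float, now: float) -> bool:
        age = now - sent_at
        if age < 0 or age > self.max_delay:
            return False  # delayed beyond the window, or timestamp in the future
        if nonce in self.seen:
            return False  # replay of a message accepted earlier
        self.seen.add(nonce)
        return True
```

A real deployment would also expire nonces older than `max_delay` so that the set does not grow without bound; that housekeeping is omitted here to keep the check itself visible.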
Future Challenges • Relaxation of assumptions: • Fully Byzantine failure models are difficult to protect against – and hence solutions are inefficient • Most of the work since Rampart has concentrated on feasible means of relaxing these failure assumptions: can we do better? • Further use of hardware: • MAFTIA’s use of trusted hardware allows for more efficient protocols – can the principle be generalised? • Mixed failure environments [Siu et al., ’98] • Trusted Computing Group
Future Challenges – II • Other dependability models: • Fault tolerance is only part of a very mature dependability literature • Disjoint vs. Inclusive error recovery? • MAFTIA defined a whole classification within their model • Security service classification: • Quorum based systems use the parallelism of a read operation to increase efficiency • Can we class different services according to their communication requirements?
Conclusions • Until 10 years ago, the work in this field was sparse and sporadic • Now there is a large body of work in this area • Practical efficiency is still a key research topic • Broaden our search for other applicable mechanisms • Availability and survivability on the Internet are only going to become more important