Probabilistic Validation of Computer System Survivability

Probabilistic Validation of Computer System Survivability William H. Sanders University of Illinois at Urbana-Champaign whs@uiuc.edu www.iti.uiuc.edu

HIGH INNOVATION PLANNING STEALTH COORDINATION LOW Defending Against a Wide Variety of Attacks Economic intelligence Military spying Nation-states, Terrorists, Multinationals Information terrorism Disciplined strategiccyber attack Selling secrets Civil disobedience Serious hackers Embarrassing organizations Harassment Stealing credit cards Collecting trophies Script kiddies Copy-cat attacks Curiosity Thrill-seeking

Prevent Intrusions (Access Controls, Cryptography, Trusted Computing Base) Cryptography Access Control & Physical Security Multiple Security Levels Trusted Computing Base But intrusions will occur 1st Generation: Protection Detect Intrusions, Limit Damage (Firewalls, Intrusion Detection Systems, Virtual Private Networks, PKI) PKI VPNs 2nd Generation: Detection Intrusion Detection Systems But some attacks will succeed Boundary Controllers Firewalls Tolerate Attacks (Redundancy, Diversity, Deception, Wrappers, Proof-Carrying Code, Proactive Secret Sharing) Hardened Operating System Intrusion Tolerance Graceful Degradation Big Board View of Attacks Real-Time Situation Awareness & Response 3rd Generation: Tolerance Intrusion Tolerance: A New Paradigm for Security

Validation of Computer System/Network Survivability • Security is no longer absolute • Trustworthy computer systems/networks must operated through attacks, providing proper service in spite of possible partially successful attacks • Intrusion tolerance claims to provide proper operation under such conditions • Validation of security/survivability must be done: • During all phases of the design process, to make design choices • During testing, deployment, operation, and maintenance, to gain confidence that the “amount” of intrusion tolerance provided is as advertised.

Validating Computer System Security: Research Goal CONTEXT: Create robust software and hardware that are fault-tolerant, attack resilient, and easily adaptable to changes in functionality and performance over time. GOAL: Create an underlying scientific foundation, methodologies, and tools that will: • Enable clear and concise specifications, • Quantify the effectiveness of novel solutions, • Test and evaluate systems in an objective manner, and • Predict system assurance with confidence.

Existing Security/Survivability Validation Approaches • Most traditional approaches to security validation have focus on avoiding intrusions (non-circumventability), or have not been quantitative, instead focusing on and specifying procedures that should be followed during the design of a system (e.g., the Security Evaluation Criteria [DOD85, ISO99]). • When quantitative methods have been used, they have typically either been based on formal methods (e.g., [Lan81]), aiming to prove that certain security properties hold given a specified set of assumptions, or been quite informal, using a team of experts (often called a “red team,” e.g. [Low01]) to try to compromise a system. • Both of these approaches have been valuable in identifying system vulnerabilities, but probabilistic techniques are also needed.

Example Probabilistic Validation Study • Evaluation of DPASA-DV Project design • Designing Protection and Adaptation into a Survivability Architecture: Demonstration and Validation • Design of a Joint Battlespace Infosphere • Publish, Subscribe and Query features (PSQ) • Ability to fulfill its mission in the presence of attacks, failures, or accidents • Uses Multiple, synergistic validation techniques

JBI Management Staff Quad 1 Quad 2 Quad 3 Quad 4 Executive Zone JBI Core Operations Zone Crumple Zone Network Access Proxy (Isolated Process Domains in SE-Linux) Domain6 Local Controller First Restart Domains Eventually Restart Host Protection Domains Isolation among selected functions on individual core hosts and on clients Domain1 Domain2 Domain3 Domain4 Domain5 Forward/ Ratelimit Proxy Logic Inspect / Forward / Rate Limit PS Sensor Rpts DC PSQImpl PSQImpl Eascii IIOP RMI IIOP TCP UDP TCP TCP STCP JBI Design Overview

Survivability/Security Validation Goal • Provide convincing evidence that the design, when implemented, will provide satisfactory mission support under real use scenarios and in the face of cyber-attacks. • More specifically, determine whether the design, when implemented will meet the project goals: • This assurance case is supported by: • Rigorous logical arguments • Experimental evaluation • A detailed executable model of the design

Goal: Design a Publish and Subscribe Mechanism that … • Provides 100% of critical functionality when under sustained attack by a “Class-A” red team with 3 months of planning. • Detects 95% of large scale attacks within 10 mins. of attack initiation and 99% of attacks within 4 hours with less than 1% false alarm rate. • Displays meaningful attack state alarms. Prevent 95% of attacks from achieving attacker objectives for 12 hours. • Reduces low-level alerts by a factor of 1000 and display meaningful attack state alarms. • Shows survivability versus cost/performance trade-offs.

Integrated Survivability Validation Procedure R Requirement Decomposition S Q P Functional Model of the System (Probabilistic or Logical) Functional Model of the Relevant Subset of the System Model for Access Proxy Model for Client Model for PSQ Server … Assumptions AP1 AP2 AA1 AA2 AA3 Supporting Logical Arguments and Experimentation M6 M5 M4 M1 (Network Domains) M2 M3 L1 (ADF) L2 L3

Integrated Survivability Validation Procedure Steps R • A precise statement of the requirements S Q P • High-level functional model description: • Data and alerts flows for the processes related to the requirements, • Assumed attacks and attack effects [Threat/vulner-ability analysis; whiteboarding] Functional Model of the Relevant Subset of the System Model for Access Proxy Model for Client Model for PSQ Server … AP1 AP2 AA1 AA2 AA3 M6 M5 M4 M1 (Network Domains) M2 M3 L1 (ADF) L2 L3

Integrated Survivability Validation Procedure Steps R S Q P • Detailed descriptions of model component behaviors representing 2a and 2b, along with statements of underlying assumptions made for each component. [Probabilistic modeling or logical argumentation, depending on requirement] Functional Model of the Relevant Subset of the System Model for Access Proxy Model for Client Model for PSQ Server … AP1 AP2 AA1 AA2 AA3 M6 M5 M4 M1 (Network Domains) M2 M3 L1 (ADF) L2 L3

Integrated Survivability Validation Procedure Steps R S Q P • Construct executable functional model [Probabilistic modeling, if model constructed in 3 is probabilistic] Functional Model of the Relevant Subset of the System Model for Access Proxy Model for Client Model for PSQ Server … In Parallel • a) Verification of the modeling assumptions of Step 3 [Logical argumentation]and, • b) where possible, justification of model parameter values chosen in Step 4. [Experimentation] AP1 AP2 AA1 AA2 AA3 M6 M5 M4 M1 (Network Domains) M2 M3 L1 (ADF) L2 L3

Integrated Survivability Validation Procedure Steps R S Q P • Run the executable model for the measures that correspond to the requirements of Step 1. [Probabilistic modeling] Functional Model of the Relevant Subset of the System Model for Access Proxy Model for Client Model for PSQ Server … AP1 AP2 AA1 AA2 AA3 M6 M5 M4 M1 (Network Domains) M2 M3 L1 (ADF) L2 L3

Integrated Survivability Validation Procedure Steps R ? • Comparison of results obtained in Step 6, noting in particular the configurations and parameter values for which the requirements of Step 1 are satisfied. S Q P Functional Model of the Relevant Subset of the System Model for Access Proxy Model for Client Model for PSQ Server … AP1 AP2 AA1 AA2 AA3 Note that if the requirement being addressed is not quantitative, steps 4 and 6 are skipped. M6 M5 M4 M1 (Network Domains) M2 M3 L1 (ADF) L2 L3

Step 1: Requirement Specification • Expressed in an argument graph: JBI critical mission objectives JBI critical functionality JBI mission Detection / Correlation Requirements Initialized JBI provides essential services JBI properly initialized IDS objectives Authorized publish processed successfully Authorized subscribe processed successfully Authorized query processed successfully Authorized join/leave processed successfully Unauthorized activity properly rejected Confidential info not exposed

Requirements decomposition Executable model Model assumptions Supporting arguments Argument Graph for the Design

Step 2: System and Attack Assumption Definition Example High level description … Steps 4-5 Access proxy verifies if the client is in valid session by sending the session key accompanying the IO to the Downstream Controller for verification Step 6 Access Proxy forwards the IO to the PSQ Server in its quadrant. ....

Attack Model Description • Definitions • Intrusion, prevented intrusion, tolerated intrusion • New vulnerabilities • Assumptions • Outside attackers only • Attacker(s) with unlimited resources • Consider successful (and harmful) attacks only • No patches applied for vulnerabilities found during the mission/scenario execution

Attack Model Description • Attack propagation • MTTD: mean time to discovery of a vulnerability • MTTE: mean time to exploitation of a vulnerability • 3 types of vulnerabilities: • Infrastructure-Level Vulnerabilities  attacks in depth • OS vulnerability • Non-JBI-specific application-level vulnerability • pcommon : common-mode failure • Data-Level Vulnerabilities  attacks in breadth • Using the application data of JBI software • Across process domains • flaw in protection domains

Attack Model Description • Attack effects • Compromise • Launching pad for further attacks • Malicious behavior • Crash • Attack propagation stopped • (DoS) • Distinction between OSes with and without protection domains

Attack Model Description • Intrusion Detection • pdetect=0 if the sensors are compromised • pdetect > 0 otherwise. • Attack Responses • Restart Processes • Secure Reboot • Permanent Isolation

AP IO AP IO AP Hb AP Hb ADF NIC ADF NIC AP Alert AP Alert Se Se Ac Ac LC LC PSQ PSQ ADF NIC ADF NIC Se Se Ac Ac SM, Quad 1, OS 3 SM, Quad 1, OS 2 ADF NIC ADF NIC LC LC SM SM Infrastructure Attacks Example Quadrant 1 Access Proxy, Quad 1, OS 1 DC, Quad 1, OS 1 Policy Server, Quad 1, OS 1 T=85 min.: discovery of a vulnerability on the Main PD, OS1 AP IO DC ADF NIC Outside PS ADF NIC AP Hb Se all quad components ADF NIC AP Alert Ac Se LC Ac PSQ Server, Quad 1, OS 1 Guardian, Quad 1, OS 1 Outside LC PSQ Gu ADF NIC ADF NIC Se Se SM, Quad 1, OS 1 Publishing Client, OS1 Ac Ac ADF NIC SD LC LC SM ADF NIC Se Ac Correlator, Quad 1, OS 1 ADF NIC LC Co Crumple Zone Operations Zone Executive Zone Access Proxy, Quad 2, OS 2 Access Proxy, Quad 3, OS 3 PSQ Server, Quad 2, OS 2 Access Proxy, Quad 4, OS 4 PSQ Server, Quad 3, OS 3 AP IO PSQ Server, Quad 4, OS 4 SM, Quad 1, OS 4 PSQ AP Hb ADF NIC ADF NIC Se ADF NIC AP Alert SM Ac Se Outside LC Ac LC

Step 3: Detailed descriptions of model component behaviors and Assumptions (Access Proxy) 4.4 Access Proxy 4.4.1 Model Description AM1: If a process domain in the DJM proxy is not corrupted, it forwards the traffic it is designated to handle from the Quadrant isolation switch to core quadrant elements and vice versa. All traffic being forwarded is well-formed (if the proxy is correct). The following kinds of traffic are handled: 1. IOs (together with tokens) sent from publishing clients to the core (we do not distinguish between IOs sent via different protocols such as RMI or SOAP/HTTP). …. AM2: Attacks on access proxy: attacks on an access proxy are enabled if either/both 1. Quadrant isolation switch is ON, and one or more clients are corrupted, leading to: a) Direct attacks: can cause the corruption of the process domain corresponding to the domain of the attacking process on the compromised client. …. AM3: If an attack occurs on the access proxy, it can have the following effects: 1. Direct attacks leading to process corruption: a) Enable corruption of other process domains on the host. ….. 4.4.2 Facts and Simplifications AF1: Each access proxy runs on a dedicated host machine. AF2: DoS attacks result in increased delays. …. Model of Access Proxy 4.4.3 Assumptions AA1: Only well-formed traffic is forwarded by a correct access proxy. AA2: The access proxy cannot access cryptographic keys used to sign messages that pass through it. AA3: Access proxy cannot access the contents on an IO if application-level end-to-end encryption is being used. AA4: Attacks on an access proxy can only be launched from compromised clients, or from corrupted core elements that interact with the access proxy during the normal course of a mission. …. Assumptions

Step 4: Construct Executable Functional Model

Step 5: Supporting Logical Arguments

Logical Argument Sample PSQ Server Model Access Proxy Model Functional Model SA4: Client Confidentiality in PSQ Server AA2: AP Application-layer Integrity AA3: AP Application-layer Confidentiality SA3: IO Integrity in PSQ Server Model Assumptions Private Key Confidentiality Supporting Arguments No Cryptography in Access Proxy No Unauthorized Direct Access No Unauthorized Indirect Access Not Preconfigured Not Reconfigurable ADF NIC services protected Keys Not Guessable Keys Protected from Theft Physical Protection of CAC device Protection of CAC Authentication Data No Compromise of Authorized Process Accessing CAC DoD Common Access Card (CAC) Algorithmic Framework Key Length Key Lifetime PKCS #11 Compliance Tamperproof

Steps 6 and 7: Measures and Results • Assumptions: CPUB is the conjunction of • C1PUB= the publishing client is successfully registered with the core • C2PUB= the publishing client's mission application interacts with the client as intended • Definition of a successful publish: EPUB is the conjunction of • E1PUB = the data flow for the IO is correct • E2PUB = the time required for the publish operation is less than tmax • E3PUB = the content of the IO received by the subscriber has the same essential content as that assembled by the publisher • Measure: P[EPUB|CPUB] • Fraction of successful publishes in a 12 hour period • Between clients that cannot be compromised • Objective • P[EPUB|CPUB] ≥ pPUB for a 12-hour mission

Vulnerability Discovery Rate Study Fraction of successful publishes versus MTTD Number of successful intrusions versus MTTD

Varying the number of OS and OS w/ process domains

Autonomic Distributed Firewall (ADF) NIC policies Fraction of successful publishes Total number of intrusions • Per-pd policies considerably increase the performance (10% unavailability vs. 1.5% at MTTD=100 minutes) • ADF NICs can handle per-port policies => should take advantage of this feature, implying to set the communication ports in advance

Experimental Validation Overview • System Validation Results • Methodology Review • Example Attack Graphs • Example Attack Step Priority Determination • Detailed Analysis of Example Attack Steps • ADF susceptibility to DoS attack vulnerability • RMI Registration vulnerability • RMI Method Fuzzer • Remaining work

System-Level Validation Methodology (SVM) Objectives: • Improve the system’s survivability • Conduct specific system-level validation tasks • Address all of the system-level concepts and mechanisms that may contribute to improvement, e.g., protocols and application scenarios Main Idea: • Think like an attacker • Examine whether a given attacker goal can be achieved • If so, alter the implementation so as to preclude such achievement Procedure: • Top-down, beginning with a specific high-level attacker goal • Critical steps of the high-level attack tree are elaborated further as sub-trees, down to a level that admits adversarial testing.

SVM: Step-by-Step • Step 1: State an attacker goal G in precise terms, along with accompanying assumptions regarding the system and its environment. • Step 2: Understand the system components and interactions that support the event the attacker wishes to preclude. • Step 3: Construct an attack tree T(G) based on Step 1 and Step 2, where a leaf (attack step) of this tree is specific enough to be pursued by a team of 1-3 persons. • Step 4: Determine all the minimal attacks associated with T(G) and, using appropriate quantitative and qualitative criteria, prioritize the attack steps to be considered by the individual teams. • Step 5: For a given attack step s, its assigned team determines whether further expansion is required. If so, a sub-tree (having s as its root) is constructed such that its leaves (low-level steps) are specific enough to admit to adversarial testing. • Step 6: Determine all the minimal low-level attacks associated with T(s) and prioritize these attacks using appropriate quantitative and qualitative criteria. • Step 7: Guided by priorities determined in Step 6, conduct adversarial testing with respect to selected low-level attacks to see if goal s can indeed be realized. • Step 8: Report Results.

SVM: Attacker Goals • We are currently considering the following attacker goals: G1: Prevent client publish G2: Prevent IO delivery to client (Subscription) G3: Prevent a successful query operation G4: Prevent a successful client registration G5: Defeat confidentiality of IO data G6: Modify IO data G7: Modify data in repository

G5: Defeat Confidentiality of IO Data Definitions for G5: • IO data is data that has been or will be carried as the protected payload of an IO (Information Object). • Confidentiality of such data is defeated if it can be viewed in plaintext (not encrypted) by an adversary. Assumptions for G5: • The clients are assumed to not be compromised in their initial state. An adversary may compromise any host as part of an attack. • IOs are encrypted by the publishing client, decrypted by the PSQ server for guardian checks, and encrypted again for delivery to the subscribed client. • IOs archived in the PSQ repository are stored in plain-text. • With the exception of the man in the middle attack (MITM), the core and client have successfully exchanged contact information, configuration information, and have successfully authenticated each other. • The core has successfully set up the publishing client’s NICs.

G5: High-level Attack Tree

G5: Attack Steps/Minimal Attacks

Elaboration of Attack Steps • An attack step s of a high-level tree often requires further development as a sub-tree (with root s). • If so, it is assigned to a Focus Group that • Develops an attack sub-tree wherein each leaf (low-level attack step) is refined enough to either • admit to adversarial testing, or • based on other evidence, decide whether it is realizable by an attacker. • Conducts adversarial tests for type a) steps • Provides arguments for type b) steps

Attack Sub-tree for Bypass AP Step

Bypass AP: Attack Steps/Attacks

Focus Group Status Reports • Progress of work by each focus group is reported via a Focus Group Status Report (FGSR) that is typically updated at least once per week. • For each low-level attack step (leaf of an FG sub-tree), the FGSR reports the following information, as illustrated by several entries of the Bypass AP report.

Bypass AP FGSR (Sample Entries) Focus Group Status Report Focus Group name: BypassAP Last updated on: 11/15/04

Summary of Attack Steps/Minimal Attacks • For the seven high-level attack trees that have been developed, there are • 524 attack steps (including repeats) • 114 different attack steps • 35 attack steps under consideration • 20 attack steps yet to be addressed • 55 attack steps are trivial, but depend on others to be accomplished • 4 attack steps that are being ignored for different reasons • The number of different minimal attacks for each high-level goal (these are derived automatically from a goal’s attack tree) are as follows. • G1: 54 • G2: 43 • G3: 36 • G4: 52 • G5: 8 • G6: 12 • G7: 11 • Total number of minimal attacks with respect to all goals: 216

Attack Steps Currently Being Addressed

Attack Steps That Will Be Addressed

Attack Steps that are Trivial, but Depend on Others to accomplish

Attack Steps That Will Be Ignored

Example Attack Step Analysis: ADF DOS Attack • Three Metrics were used to benchmark the ADF. • Max. Throughput: The fastest receive rate at which there is no packet loss • Available Bandwidth: The amount of data that can be transmitted in a fixed amount of time (when no flood in progress) • Minimum Flood Rate: The lowest rate of flood which leads to a successful denial of service attack. • Floods cause packet loss, which in turn lowers bandwidth due to TCP congestion control. UDP will suffer high packet loss. • Experimental Setup • Follows rfc2544 as much as possible • Max flood rate is ~44000 frames/sec = 22 Mbits/sec (for 64 Byte frames)

Probabilistic Validation of Computer System Survivability

Probabilistic Validation of Computer System Survivability

Presentation Transcript

Network Survivability

Survivability

Validation of Rating System

VALIDATION OF WATER SUPPLY SYSTEM

CSE 221: Probabilistic Analysis of Computer Systems

Validation of pharmaceutical process, Analytical Method development Computer system validation, ERP

CSE 221: Probabilistic Analysis of Computer Systems

CSE 3504: Probabilistic Analysis of Computer Systems

CSE 221: Probabilistic Analysis of Computer Systems

Survivability Of SDH Network

SURVIVABILITY OF COMPLEX NETWORKS

CSE 3504: Probabilistic Analysis of Computer Systems

CSE 221: Probabilistic Analysis of Computer Systems

Probabilistic Reasoning System (II)

Analytical validation of Computer Codes

Survivability Panel

Probabilistic Validation of Computer System Survivability

FDA Computer System Validation Steps

Computer System Validation Compliance Required by FDA

Reduce costs for compliance with data integrity: 21 CFR Part 11, SaaS/Cloud, EU GDPR

Survivability

SURVIVABILITY OF COMPLEX NETWORKS