Improving Self-Defense by Learning from Limited Experience

Karen Haigh (BBN Technologies) and Steven Harp (Adventium Labs)
Haigh & Harp, Learning from Limited Experience

Overview
• Goal: Systems that autonomously improve their defenses with experience.
• There are several ways to do this.
• Examples discussed:
  • Learning to recognize anomalies
  • Self-immunizing against observed exploits
  • Acquiring multistage attack concepts
  • Learning effective responses

Learning in Cyber Security
• What is (machine) learning?
  • Automatically using prior experience to improve performance over time
• Which problems are addressable by learning?
  • Detection: distinguish problem from non-problem
  • Immunity:
    • Good: “an exploit should succeed at most once”
    • Better: “a vulnerability should be exploitable at most once”
  • Response: how best to actively counter an attack?
Long-term goal: Cognitive Immunity

Opportunities & Techniques

Experimental Sandbox: Modelling Defended Systems
Approaches compared: Expert Rules, Offline Learning, Online Learning, Experimental Sandbox
• Expert Heuristics: + good data, - complex environment, - dynamic system
• Offline Training: + good data, + complex environment, - dynamic system
• Online Training: - unknown data, + complex environment, + dynamic system
• Experimental Sandbox: + good data (self-labeled), + complex environment, + dynamic system
Very hard for an adversary to “train” the learner!

Complex Domain: Human Rules are Incomplete
Figure: Registration Time by Quad, DPASA (DARPA OASIS)
• Quads 0 and 1 are slower than Quads 2 and 3.
• Complex domain: human calibration (incorrectly) claimed that Quad 1 was the slowest, missing Quad 0.

Complex Domain (2)
Figure: Registration Time by Client Type, DPASA (DARPA OASIS)
• caf_plan, chem_haz and maf_plan are slower than the other clients.
• Complex domain: human calibration (incorrectly) claimed that caf_plan and maf_plan were the slowest because of the hand-typed password, missing chem_haz.

Learning for Calibration
• Calibrate the parameters of rules for normal operating conditions
• Important first step because it learns how to respond to normal conditions
• For example: learn timing parameters for the rapid response controller, e.g.
  • Client registration, PSQ server local probes, SELinux enforcement, SELinux flapping, file integrity checks
• Need to handle multi-modal data
CSISM / BBN

Results for All Registration Times
• The two “shoulder” points indicate the upper and lower limits.
• Beta = 0.0005
• As more observations are collected, the estimates become more confident of the range of expected values (i.e. tighter estimates to observations).
• Algorithm of Last & Kandel, 2001
CSISM / BBN
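The calibration idea can be illustrated with a much simpler stand-in than the Last & Kandel algorithm the slides cite. The sketch below is an assumption for illustration only: it estimates lower and upper limits from empirical order statistics, and it handles multi-modal data by keeping a separate range per group (for example per quad or per client type). The names `expected_range`, `calibrate`, `is_anomalous` and the `tail` parameter are hypothetical and do not come from the CSISM system.

```python
from collections import defaultdict


def expected_range(observations, tail=0.05):
    """Return (lower, upper) limits after trimming a `tail` fraction at each end."""
    ordered = sorted(observations)
    k = int(len(ordered) * tail)  # number of points trimmed per side
    return ordered[k], ordered[len(ordered) - 1 - k]


def calibrate(samples, tail=0.05):
    """Learn one range per group from (group, seconds) samples."""
    by_group = defaultdict(list)
    for group, seconds in samples:
        by_group[group].append(seconds)
    return {g: expected_range(v, tail) for g, v in by_group.items()}


def is_anomalous(group, seconds, limits):
    """A timing is anomalous if it falls outside its group's learned range."""
    lower, upper = limits[group]
    return not (lower <= seconds <= upper)


if __name__ == "__main__":
    samples = [("quad0", t) for t in (5.0, 5.2, 5.1, 4.9, 5.3)]
    samples += [("quad2", t) for t in (1.0, 1.1, 0.9, 1.2, 1.0)]
    limits = calibrate(samples, tail=0.0)
    print(limits)
    print(is_anomalous("quad2", 5.0, limits))
```

Keeping one range per group means a registration time that is normal for a slow quad can still be flagged when it appears on a fast one, which a single global range would miss.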

Cortex Project: Generalization of Attack Signatures

Generalization
• Goal: Learn a most general concept from instances of attacks and block all similar attacks against the vulnerability.
  • Dealing with zero-day attacks
• Payload analysis challenges
  • How to automatically recognize which element(s) of an attack are essential?
  • How to generalize them to their boundary conditions?
  • Avoid the fragility of simple pattern-matching rules
• Approach: Experimentation
  • Validation of attack concepts → 0 false positives
Cortex / Honeywell

Generalization by Experimentation
• Model of normal traffic; the model contains axes of vulnerability:
  • Payload content
  • Binary machine instructions
  • Unusual payload (e.g. unix commands, registry keys, database administrative commands)
  • Length (# bytes/terms)
  • Resource consumption patterns
  • Probing (e.g. password guessing)
  • Session-wide (multiple queries)
• Experiment:
  • Score suspicious elements
  • Replace with innocuous or generalized values
  • Validate in tester
Diagram labels: normal, attack, Taste Tester, Experiment, Blocking Rules
Cortex / Honeywell
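The experiment loop (replace an element with an innocuous value, then validate in the tester) can be sketched as follows. This is a minimal illustration under stated assumptions, not the Cortex implementation: the query is modelled as a dictionary of fields, `crashes_taster` is a hypothetical oracle standing in for a disposable taste-tester replica, and the boundary search assumes the failure is monotonic in field length.

```python
def essential_fields(attack, innocuous, crashes_taster):
    """Fields whose replacement by an innocuous value makes the attack fail."""
    essential = []
    for name in attack:
        trial = dict(attack)
        trial[name] = innocuous[name]
        if not crashes_taster(trial):
            essential.append(name)
    return essential


def length_boundary(attack, field, crashes_taster):
    """Smallest length of `field` that still triggers the failure.

    Assumes length 0 is safe, the full attack value fails, and failure is
    monotonic in length, so a binary search is valid.
    """
    low, high = 0, len(attack[field])
    while high - low > 1:
        mid = (low + high) // 2
        trial = dict(attack)
        trial[field] = attack[field][:mid]
        if crashes_taster(trial):
            high = mid
        else:
            low = mid
    return high


if __name__ == "__main__":
    # Simulated vulnerability (assumption): passwords longer than 16 bytes crash the server.
    def crashes_taster(query):
        return len(query["password"]) > 16

    attack = {"user": "root", "password": "A" * 64, "db": "test"}
    innocuous = {"user": "guest", "password": "secret", "db": "test"}
    print(essential_fields(attack, innocuous, crashes_taster))
    print(length_boundary(attack, "password", crashes_taster))
```

A blocking rule derived this way covers every query at or beyond the learned boundary, rather than only the exact payload that was observed, and it is only emitted after the tester has confirmed it.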

Cortex Demo Architecture and Use Cases: Normal Query
Figure: architecture diagram. Components: Mission Planning, AMP, CSM, Master DB, Replicator, RTS, Tasters, Proxy (Dexter), Learner. Labeled actions: query tasters; replicate queries; switch, create, delete and rebuild tasters; heartbeat status; replicate (once per phase); send to learning; block known bad queries; taste test; log results; read training data; experiment; generate rules.
Cortex / Honeywell

Cortex Demo Architecture and Use Cases: Attack
Figure: the same architecture diagram, showing two cases: “Attack is blocked” and “Attack gets through”. Components: AMP, CSM, Master DB, Replicator, RTS, Tasters, Proxy (Dexter), Learner. Labeled actions: query tasters; replicate queries; switch, create, delete and rebuild tasters; heartbeat status; replicate; send to learning; block known bad queries; taste test; log results; read training data; experiment; generate rules.
Cortex / Honeywell

Example Results: MySQL Attacks
Attack: Notes
• String buffer overflow (password): Correctly generalized a single attack to the number of valid bytes.
• Integer overflow: Correctly generalized a single attack to the 0x7FFF max value.
• MySQL DoS attack: Noted that hex bytes were suspicious, so generalized bytes and correctly blocked integer overflow!
Project was tested with a red-team model.
Cortex / Honeywell

CSISM Project: Identification of Multistage Attacks

Multistage Attacks: Challenges
• Detect and generalize multi-step attacks across time and space.
• Multistage attacks involve a sequence of actions that span multiple hosts and take multiple steps to succeed.
• Challenges:
  • Which observations are necessary & sufficient?
    • Incidental observations that are either
      • side effects of normal operations, or
      • chaff explicitly added by an attacker to divert the defender.
    • Concealment (e.g. to remove evidence)
    • Probabilistic actions (e.g. to improve probability of attack success)
  • What are the most reliable observations?
  • What are the parameter boundaries?
• Approach: Experimentation
  • Allows validation of pruning
CSISM / BBN

Architectural Schema
Figure: CSISM sensors (ILC, IDS) supply observations (1 to 6) ending in failure of the protected system; only some are essential. In a “Sandbox”, the Attack Theory Experimenter tests candidate action sequences (e.g. A, B, C, D) and yields viable attack theories; the Defense Measures Experimenter yields viable defense strategies and detection rules.
CSISM / BBN

Multi-Stage Learner
• Do {
  • Generate Theory according to heuristic (the hard part!)
    • Complete set of theories is Permutations( Powerset( observations ))
  • Test Theory
  • Incrementally update controller rulebases
• } while Theories remain
• For only 10 observations, there are roughly 10,000,000 possible theories (9,864,100 non-empty ordered subsets, not including variations on steps!)
CSISM / BBN
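The size of the theory space follows directly from the definition Permutations( Powerset( observations )): every non-empty subset of the observations, in every order. The sketch below counts and enumerates that space using the Python standard library; `count_theories` and `theories` are illustrative names, not part of CSISM.

```python
from itertools import permutations
from math import perm


def count_theories(n):
    """Number of non-empty ordered selections drawn from n observations."""
    return sum(perm(n, k) for k in range(1, n + 1))


def theories(observations):
    """Yield every ordering of every non-empty subset, shortest first."""
    for k in range(1, len(observations) + 1):
        yield from permutations(observations, k)


if __name__ == "__main__":
    print(count_theories(10))
    print(list(theories("ABC")))
```

For 10 observations `count_theories` gives 9,864,100, on the order of the ten million the slide cites, which is why exhaustive testing is impractical and the order in which theories are generated matters.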

Hypothesis Generation
• Query learner generates attack hypotheses
  • in heuristic order, to acquire the concept rapidly
• Candidate heuristics
  • Look for shorter attacks first (adjustable prior)
  • Suspect that the order of steps has an influence
  • Suspect that steps interact positively (for the attacker)
  • Prefer hypotheses with less common / more suspicious elements
Project was tested with a red-team model.
CSISM / BBN
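Two of the candidate heuristics (shorter attacks first, more suspicious elements first) can be sketched as an ordering over hypotheses. Everything here is an assumption for illustration: the `suspicion` scores, the `reproduces_failure` oracle that stands in for a sandbox experiment, and the `budget` limit are hypothetical, and sorting each length class in memory is only practical for small observation sets.

```python
from itertools import permutations


def ranked_hypotheses(observations, suspicion):
    """Yield hypotheses shortest first; within a length, most suspicious first."""
    for k in range(1, len(observations) + 1):
        same_length = permutations(observations, k)
        yield from sorted(
            same_length,
            key=lambda steps: -sum(suspicion[s] for s in steps),
        )


def first_confirmed(observations, suspicion, reproduces_failure, budget=1000):
    """Test hypotheses in heuristic order; return (steps, experiments used) or None."""
    ranked = ranked_hypotheses(observations, suspicion)
    for tried, steps in enumerate(ranked, start=1):
        if tried > budget:
            return None
        if reproduces_failure(steps):
            return steps, tried
    return None


if __name__ == "__main__":
    observations = ["login", "overflow", "ls", "disable_selinux"]
    suspicion = {"login": 0.1, "overflow": 0.9, "ls": 0.0, "disable_selinux": 0.8}

    # Simulated sandbox (assumption): the attack needs overflow before disable_selinux.
    def reproduces_failure(steps):
        return (
            "overflow" in steps
            and "disable_selinux" in steps
            and steps.index("overflow") < steps.index("disable_selinux")
        )

    print(first_confirmed(observations, suspicion, reproduces_failure))
```

In this toy run the true two-step attack is the fifth hypothesis tested, out of 64 possible for four observations.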

CSISM Project: Response Learning

Situation-dependent Action Utilities
• Learn tradeoffs among potential responses; as context changes, the appropriateness of responses changes
  • Context includes descriptions of users, attack elements, system performance, etc.
  • Benefit is the effectiveness of the defense action
  • Cost includes the effort to mount the response and the impact on availability
• Challenges:
  • Measuring the effect of responses is hard:
    • Complex domain → rarely identical situations → non-deterministic actions/effects
• Approach: Experimentation
  • System “snapshots” get close to identical conditions
CSISM / BBN

Response Learning: Results Pending
• Bias toward results that worked in similar situations in the past
  • Hybrid Reinforcement Learning and Nearest-Neighbour approaches
• Given a set of hypotheses about the locus of an attack
  • Search for the true locus:
    • Hierarchical, based on system architecture
    • Bias by historical attack patterns
  • Select response based on similarity match to prior attacks:
    • Same response when quality was high
    • Alternate response when quality was low
• Project will be tested by a red-team on 20 May 2008. Goal is to demonstrate “better” responses over time.
CSISM / BBN
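The similarity-based selection rule (same response when quality was high, alternate response when it was low) can be sketched with a plain nearest-neighbour lookup. This covers only the nearest-neighbour half of the hybrid approach and is an assumption-laden illustration: the numeric context vector, the `Episode` record, the response names and the `good_enough` threshold are all hypothetical.

```python
from dataclasses import dataclass
from math import dist


@dataclass
class Episode:
    context: tuple   # numeric description of the situation (assumed encoding)
    response: str    # defensive action that was taken
    quality: float   # observed outcome in [0, 1]; higher is better


def choose_response(context, history, responses, good_enough=0.5):
    """Reuse the nearest episode's response if it worked, else try an alternative."""
    if not history:
        return responses[0]
    nearest = min(history, key=lambda e: dist(e.context, context))
    if nearest.quality >= good_enough:
        return nearest.response
    alternatives = [r for r in responses if r != nearest.response]
    return alternatives[0] if alternatives else nearest.response


if __name__ == "__main__":
    responses = ["quarantine_host", "restart_server", "block_source"]
    history = [
        Episode((0.9, 0.1), "restart_server", 0.8),
        Episode((0.1, 0.9), "block_source", 0.2),
    ]
    print(choose_response((0.8, 0.2), history, responses))
    print(choose_response((0.2, 0.8), history, responses))
```

Recording each new outcome as another `Episode` is what would let the choice improve over time; a reinforcement-learning component would additionally update quality estimates from delayed effects.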

Conclusion

Learning Benefits
• Learning can improve the defensive posture
  • better knowledge (about the attacks or attacker), better policies
• Learning can improve how the system responds to symptoms
  • better connection between response actions and their triggers
• Active Learning
  • A mechanism for recognizing zero-day attacks
  • No false positives: only validated attacks are added
• Learning techniques are enablers for the next level of enhancements in adaptive defense
Adaptation is the key to survival

From Proof-of-Concept to Production

Backup

Multistage Attacks
• Detect and then generalize multi-step attacks across time and space.
• Multistage attacks involve a sequence of actions that span multiple hosts and take multiple steps to succeed.
• A sequence of actions with causal relationships.
  • An action A must occur to set up the initial conditions for action B. Action B would have no effect without previously executing action A.
• For example:
  • gain the ability to execute commands on Box1 as an unprivileged user by exploiting a buffer overflow in Service1
  • gain a root shell by running an exploit of a race condition
  • disable a protection mechanism, e.g. SELinux
  • replace the dpasa jar with attacker jar code
  • run attacker code that sends bad refs to Box2, Box3, Box4.

Attacks (MySQL DoS-1)
• mysql-com_table-dump-memory-corruption
• Malformed request leaves MySQL unstable
• Countermeasures:
  • Block the malformed com_table_dump command using learned pattern and proxy filter rules.
  • Restart the server.
  • Block all requests from the offending sources.

Attacks (MySQL DoS-2)
• mysql-password-handler-buffer-overflow
• Excessive password length can crash the server
• Countermeasures:
  • Block connections which proffer “abnormal” passwords (learned response or statistical anomaly).
  • Restart the server.
  • Block all requests from the offending sources.

Attacks (MySQL DoS-3)
• mysql-remote-fulltext-search-DoS
• Malformed request crashes the server
• Countermeasures:
  • Detect and block malformed queries.
  • Block all queries of this type (fulltext-search).
  • Block all requests from the offending sources.
  • Restart the server.
