
Hierarchical Adaptive Control of QoS for Intrusion Tolerance

Hierarchical Adaptive Control of QoS for Intrusion Tolerance. HACQIT. James E. Just James C. Reynolds Karl Levitt 20 August 2002. The UC Davis Computer Security Laboratory. Outline. Overview Technical Approach How Does It Work What’s New Testing Transition Plans


Presentation Transcript


  1. Hierarchical Adaptive Control of QoS for Intrusion Tolerance HACQIT James E. Just James C. Reynolds Karl Levitt 20 August 2002 The UC Davis Computer Security Laboratory

  2. Outline • Overview • Technical Approach • How Does It Work • What’s New • Testing • Transition • Plans • Karl Levitt, UC Davis -- Attack Diagnosis

  3. The HACQIT Idea • Deliver ongoing critical application services to selected users while under network-based cyber-attack • Specific project goals • Understand how to deliver, and demonstrate delivery of, critical user services for 4 hours while under active attack with no more than 25% degradation in user performance • Handle unknown attacks that can exhaust spare resources • Ignore denial-of-service attacks on bandwidth and users • Force massive increase in adversary work factor for attacks • Support broad classes of COTS HW & SW for near term military utility without footprint of Byzantine fault tolerance • Provide extensible (architecture-based) intrusion tolerance framework for longer term utility • Understand the design space of intrusion tolerant systems for real world use with COTS/GOTS hardware and software

  4. Formal System Boundaries & Assumptions • Figure key: attacker, critical user, non-critical user, critical service, non-critical service • Goal: Provide critical services to selected users while under attack with <25% degradation in performance • Key Assumptions: • LAN is reliable, cannot be flooded • No direct DoS attacks against critical users • HACQIT cluster HW & SW are pristine at startup & attackers have no physical access • Unknown vulnerabilities exist in cluster • Critical users & administrators are trusted • Users interact with services via LAN & hosts

  5. Technical Approach • Intrusion resistant architecture with strong separation boundaries • Behavior specification approach to recognizing errors and intrusions – defend in depth • Process pair (hot spare) redundancy and failover with diversity if possible • Identify, learn and block new attacks • Generalize new attack blocking filters • Multiple response types • Failover • Randomly rejuvenate • Block calls or attacker • Filter out bad requests (e.g., known attacks) • Perform failure diagnosis

  6. Illustration of HACQIT Response to Attacks Attacker compromises critical user host

  7. With Generalization enabled, Code Red 1, 2 and all variants are blocked after first Code Red 1 attack Performance impact: less than 4% in simple tests
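
A minimal sketch of what generalizing a learned blocking filter could look like. It is not the HACQIT implementation: the function names, the length threshold, and the shortened request strings are all assumptions made for illustration. The only outside fact it relies on is that Code Red requests target `/default.ida` with a long padding run in the query string (`N` characters in the first version, `X` in Code Red II), so an exact-match filter learned from the first attack misses the variant while a rule keyed to the file extension and query length catches both.

```python
from urllib.parse import urlsplit

# Hypothetical threshold: longest query string the application legitimately needs.
MAX_QUERY_LEN = 64

def exact_filter(learned_request: str):
    """Block only requests identical to the one captured during the attack."""
    return lambda request_line: request_line == learned_request

def generalized_filter(learned_request: str, max_query_len: int = MAX_QUERY_LEN):
    """Block any over-long query sent to the same file extension as the attack."""
    path = urlsplit(learned_request.split()[1]).path
    ext = path.rsplit(".", 1)[-1].lower()

    def blocks(request_line: str) -> bool:
        parts = request_line.split()
        if len(parts) < 2:
            return False
        target = urlsplit(parts[1])
        same_ext = target.path.lower().endswith("." + ext)
        return same_ext and len(target.query) > max_query_len

    return blocks

# Abbreviated stand-ins for the real requests (payload truncated on purpose).
code_red_1 = "GET /default.ida?" + "N" * 224 + "%u9090 HTTP/1.0"
code_red_2 = "GET /default.ida?" + "X" * 224 + "%u9090 HTTP/1.0"
benign = "GET /board/list.asp?page=2 HTTP/1.0"

exact = exact_filter(code_red_1)
general = generalized_filter(code_red_1)
```

Here `exact` stops only the first request, while `general` stops both padded requests and lets the message-board request through. A broader rule trades precision for coverage, so the threshold would need to be checked against legitimate traffic.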

  8. HACQIT Capabilities • Current - Protects web enabled message board (dynamic state) • Leverages critical application diversity but does not require it • Adapts cluster sensor settings and responses through dynamic policy changes • Detects faults / intrusions by observed deviations from behavior specification • Maintains full event logs and N-minute service request buffer • Fails over as required in response to attacks or faults • Restores state and resyncs a new process pair with spare • Performs continuous recovery to clean up compromised machine • Rejuvenates application software at random or fixed intervals as preventive action • Learns/blocks unknown attacks using forensics analysis of captured traffic to reproduce symptoms in Sandbox • Generalizes some rules to block simple variants of attack • Outputs SNORT rule for broader detection of new attack • Builds allowed process list in semi-automated mode • New work under exercised option • Diagnosis and recovery: Learning and generalizing additional attacks – enhanced forensics • Survivable server demonstration • Possible • Change security posture as result of external alerts • Engage in group learning (attack notification, filter settings, etc) with other clusters, Cyber Panel, firewall, others
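
As a rough illustration of the "Outputs SNORT rule" capability, the sketch below formats a learned content signature as a basic Snort alert rule. The helper name, the message text, and the SID value are assumptions; the slides do not show the rules HACQIT actually emits. The rule layout (header, then `msg`, `content`, `nocase`, `sid`, `rev` options) follows standard Snort rule syntax, and SIDs of 1000000 and above are the conventional range for locally written rules.

```python
def snort_rule(msg: str, content: str, sid: int, rev: int = 1) -> str:
    """Format a basic Snort alert rule for inbound HTTP traffic."""
    escaped = content
    # Backslash, double quote and semicolon must be escaped inside a content
    # string. Backslash goes first so later escapes are not escaped again.
    for ch in ("\\", '"', ";"):
        escaped = escaped.replace(ch, "\\" + ch)
    return (
        "alert tcp $EXTERNAL_NET any -> $HTTP_SERVERS $HTTP_PORTS "
        f'(msg:"{msg}"; content:"{escaped}"; nocase; sid:{sid}; rev:{rev};)'
    )

# Hypothetical learned signature: requests that reach the .ida handler.
rule = snort_rule("Learned attack: over-long .ida query", ".ida?", sid=1000001)
print(rule)
```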

  9. Recent Accomplishments • Ability to handle non-diverse applications (time-diversity) • Vulnerability simulator to enable more robust attacks and learning behavior • Continuous recovery against unauthorized files and processes • SNORT rules generated for previously unknown attack • More informative GUI implemented • Modified critical application – web enabled message board using SQL Server backend database with active server pages (more realistic dynamic data) on both IIS and Apache • More robust implementations of cluster, host, and application monitors and controllers • Improved performance – some rearchitecting • Policy database for storage & incremental distribution • Additional learning

  10. New HACQIT GUI

  11. Penetration Testing • Background • UC Davis Computer Security Class (under Prof. Matt Bishop) is learning Penetration Testing/Fault Hypothesis Methodology on HACQIT cluster (Jan-Mar 2002) • Complete HACQIT documentation provided – including our source code, versions of all COTS OS & products • Described known vulnerabilities • Results • No successful attacks, i.e., no attacks caused failover • Great system test vehicle – found and fixed lots of bugs • Some effort continuing over summer with student “winner”

  12. Performance Test Results • Simulated background workload and successful attacks using • Microsoft web stress tool used to generate workload by posting new messages to message board every 2-3 seconds • Successful “attacks” every 5-7 minutes, i.e., caused failover • Lots of statistics collected • Selected highlights below 69.6% 72.5% 94.8% 98.9% 91% 286%

  13. Other Validation Efforts Ongoing • Validation peer review • Continued software analysis and penetration testing • Internet exposure • Own use of hacker tools, e.g., whisker • Use of new vulnerability simulator and script injector -- ongoing • Analysis -- ongoing • New attack completion time, number of new attacks, recovery time from successful attack, time to learn and block a new attack, number of spares (& diversity), time to identify and stop attacker

  14. Transition • Patent application/investigation ongoing • Communications with selected protection vendors • Examining open source or tool kit distribution • Other agencies/Services • DISA JPO discussions • CECOM Survivable Server

  15. Plans • Focus on more extensive learning and generalization capabilities for single and two stage attacks • Extensive use of vulnerability simulator • Vulnerability and attack categorization • Validation • Update validation framework • Participate in validation peer review • Continue open network access and Davis reviews • Incorporation of UC Davis results • Integration with other OASIS technologies • More transition effort Demo tonight – y’all come

  16. HACQIT – A Beginning for “Systems That Know” • Self awareness -- distinguish between self / non-self • Behavior specification – defines allowable or normal behavior, not statistical • Attack definition – repeatable failure caused by user / agent service request • Host monitor -- hierarchy of reference monitors that evaluate input content, key dll/system calls, critical application “behavior”, behavior of other allowed processes, network traffic behavior, host behavior, etc. against specifications – defense in depth • Cluster monitor -- failover and recovery decisions based on QoS monitoring, integrity and liveness tests, health status reports, random timing, etc. • Recovery mechanisms (reflex and deliberative actions) • Start/kill processes, restore/delete files, mediate application calls, isolate at propagation boundaries, restart OS • Continuous recovery, random rejuvenation, checkpoint and restore, reconstitute process pair with new servers, etc. • Adapt to unknown attacks (beginnings of reflective actions) • Learn -- prevent attacker from exhausting backups • Black-box recorder for input service requests (implemented as N-minute circular buffer) and sensor reports • Sandbox (isolated duplicate of critical servers) • Diagnostic analyzer (i.e., was failure caused by repeatable attack and, if so, which service request(s) caused it) • Adaptive content filter • Generalize -- refine blocking signature for some attack types -- shorten, identify initiating event, generalize, etc. -- can be done remotely
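
The black-box recorder is described as an N-minute circular buffer of service requests. A minimal sketch of that idea follows, assuming a time-based window and a simple (timestamp, client, request) record; the class and field names are hypothetical and the real recorder also stores sensor reports.

```python
import time
from collections import deque

class RequestRecorder:
    """Keep only the service requests seen in the last `window_s` seconds."""

    def __init__(self, window_s: float):
        self.window_s = window_s
        self._buf = deque()  # (timestamp, client, request) in arrival order

    def record(self, client: str, request: str, now: float = None) -> None:
        now = time.time() if now is None else now
        self._buf.append((now, client, request))
        self._evict(now)

    def _evict(self, now: float) -> None:
        # Drop entries from the old end until the oldest is inside the window.
        while self._buf and now - self._buf[0][0] > self.window_s:
            self._buf.popleft()

    def snapshot(self, now: float = None) -> list:
        """Return the requests still inside the window, oldest first."""
        now = time.time() if now is None else now
        self._evict(now)
        return list(self._buf)

# Five-minute window; explicit timestamps keep the example deterministic.
rec = RequestRecorder(window_s=300)
rec.record("10.0.0.5", "GET /a", now=0)
rec.record("10.0.0.9", "GET /b", now=200)
rec.record("10.0.0.5", "GET /c", now=400)
recent = [request for _, _, request in rec.snapshot(now=400)]
```

After a failover, a snapshot like `recent` is the candidate set of requests that a diagnostic step could replay in the Sandbox to find which one reproduces the failure.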

  17. UC Davis Contribution to HACQIT Forensics PI: Karl Levitt Adults: Jeff Rowe Jim Alves-Foss (visiting from Idaho) Mark Heckman (now at Promia Corp) Grad Students: Melissa Danforth, Nicole Carlson, Tye Stallard, Marcus Tylutki Undergrad Student: Barry Allard

  18. Reminder: Unique Features of HACQIT • Attack Assumptions: unknown, strike again as variant • Detection through violation of QoS specifications and specification-based intrusion detection • Future: Specifications on confidentiality, functional behavior of applications • Analysis: Logs and Sandbox provide data for decision tree • Filters: Block packets and system calls in server • Future: Block calls in client • Rollover: hot process-pair, processor • Future: Other resources-- processes, files, disk blocks, ports, IP addresses; randomization to achieve diversity

  19. Attack Management in HACQIT • When an attack occurs do begin • Detect the event causing a trigger; • Determine if event is really an attack; • If attack do begin until “successful blocking” • Determine minimal switchover that effects recovery from attack; • Determine event sequences that caused attack; • Identify most likely event sequence; • Determine signals to blocking effectors that cause minimal mission impact; • Generalize blocking signals to address generalizations of attack • end • end
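
The control flow on this slide can be sketched as a small function. This is an interpretation, not the project's code: every parameter is a hypothetical callable standing in for a HACQIT component (diagnosis, hypothesis generation, Sandbox replay, generalization), and the switchover step is left out to keep the sketch short.

```python
def manage_attack(trigger, is_attack, hypotheses, blocking_options,
                  blocks_attack, generalize):
    """Return a generalized blocking rule, or None if the trigger is not an
    attack or no candidate rule stops it."""
    if not is_attack(trigger):
        return None
    # Try the most likely event sequence first.
    ranked = sorted(hypotheses(trigger), key=lambda h: h[1], reverse=True)
    for sequence, _likelihood in ranked:
        # For each sequence, try the option with the lowest mission impact first.
        options = sorted(blocking_options(sequence), key=lambda o: o[1])
        for rule, _mission_cost in options:
            if blocks_attack(rule, sequence):
                return generalize(rule)
    return None

# Toy stand-ins so the loop can be exercised end to end.
rule = manage_attack(
    trigger="web server crash",
    is_attack=lambda t: True,
    hypotheses=lambda t: [(("GET /a",), 0.2),
                          (("GET /default.ida?NNNN",), 0.7)],
    blocking_options=lambda seq: [("block client", 5), ("drop " + seq[-1], 1)],
    blocks_attack=lambda r, seq: r.startswith("drop"),
    generalize=lambda r: r.split("?")[0] + "?*",
)
```

With these stand-ins the higher-likelihood sequence is tried first, the cheaper "drop" option is accepted, and the rule is widened to `drop GET /default.ida?*`.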

  20. Forensic Process Flow Trigger Policy/spec violation Analysis Signature generalization Root cause forensics Response Options Data Collection More logging Alternate Hypothesis Mission Model

  21. Outline • Specification-Based Intrusion Detection: The best approach to detect unknown attacks or variants of known attacks. • Forensics Principles: Based on how a human forensics expert analyzes logs to identify attacks. • Forensics Analysis: Use decision tree to identify possible event sequences that could have caused triggered specification violation. • Attack Blocking: Perform immediate and safe blocking, that is subsequently refined to be close to optimal. • Forensic Analysis of Code Red(s): • HMAP: A tool to remotely test web servers • Mission Model: Necessary to determine optimal response to attacks • Completeness Verification of Specification-Based Intrusion Detection: Verify specifications of Unix privileged programs with respect to an accepted protection model. • Related Work (DARPA, other) in Forensics:

  22. Outline • Specification-Based Intrusion Detection: The best approach to detect unknown attacks or variants of known attacks. • Suggests a fast and safe response based on those constraints that are violated

  23. Approaches to Integrity Attack Detection • Static: Detect an inconsistency in system state • Tripwire: Inconsistency in a file • Diagnosis: Test a component • Dynamic: 1. Misuse: Detect known attacks through their signatures 2. Anomaly: Detect activity that does not match a profile; can profile users, processes, programs, systems, networks,… 3. Specification-Based: Detect activity that is inconsistent with a priori specification (aka constraints) for an object. Can write specifications for: programs; protocols; policies on users, … 4. Hybrid (2 and 3): a priori template specifications with parameters discovered by profiling • Only Static and Dynamic (2, 3, 4) can detect unknown attacks

  24. Useful Types of Constraints • Policy on Users • Files a user can access • Resources a user is allowed to possess • Protocol Specifications -- operational view • Defines allowable transitions • Defines allowable time in a given state • Protocol Specifications -- message content • Mappings delivered by DNS should accurately represent view of authoritative router • IP addresses are not spoofed

  25. Useful Types of Constraints • Protocols -- Invariants and assumptions • IP Routers approximate Kirchhoff’s law • Packets are not sniffed by third party • Packet source must be a non-congested/non-DoSed host • Programs -- valid access constraints • Programs access only certain objects • Programs -- Interaction constraints • Program interaction should not change the semantics • Data Integrity • e.g., passwords, other authentication information • authorization information, process table

  26. Constraints User constraints Access constraints Program constraints Interaction constraints Data constraints Operational constraints Protocol constraints Message constraints Protocol Invariants Application constraints

  27. Access Constraints for Programs • Can Detect • remote users gaining local access • local users gaining additional privileges • Trojan Horses • Work well for many programs, e.g., passwd, lpr, lprm, lpq, fingerd, at, atq, … • Some programs can potentially access many files, e.g., httpd, ftpd • Break the execution into pieces (or threadlets). Define the valid access for each threadlet. • Threadlet defined by transition operations
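
A minimal sketch of checking an observed file access against a per-program access specification. The specification table, its paths, and the function name are illustrative assumptions, not the specifications used in the project; threadlets are not modeled.

```python
from fnmatch import fnmatchcase

# Hypothetical specification: program -> (operation, path pattern) pairs it may use.
ACCESS_SPEC = {
    "lpr": [("read", "/etc/printcap"), ("write", "/var/spool/lpd/*")],
    "passwd": [("read", "/etc/passwd"), ("write", "/etc/passwd"),
               ("write", "/etc/ptmp")],
}

def violates(program: str, operation: str, path: str, spec=ACCESS_SPEC) -> bool:
    """Return True if the observed access is outside the program's specification."""
    allowed = spec.get(program)
    if allowed is None:
        return True  # assumption: a program with no specification is flagged
    return not any(op == operation and fnmatchcase(path, pattern)
                   for op, pattern in allowed)

# lpr writing a spool file is fine; lpr writing the password file is not.
ok = violates("lpr", "write", "/var/spool/lpd/job1")
bad = violates("lpr", "write", "/etc/passwd")
```

Because the check compares behavior with what the program is allowed to do rather than with known attack signatures, it flags the second access no matter which exploit produced it.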

  28. An ARP specification ARP Request i ARP Request reply_wait ARP Response cached ARP cache timeout

  29. Monitoring for Intrusions alarm Bogus ARP Response Unsolicited ARP Response Malformed Request ARP Request i ARP Request reply_wait ARP Response cached ARP cache timeout
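
The two ARP slides show a state machine whose diagram did not survive in this transcript. The sketch below is one plausible reading of the labels, with states `i`, `reply_wait`, and `cached`; the transition table, including the repeated request in `reply_wait`, is an assumption. Any event with no transition from the current state raises an alarm, which covers the unsolicited-response case. Bogus responses and malformed requests would need checks on message content that are not modeled here.

```python
# Assumed transitions read off the slide labels: (state, event) -> next state.
TRANSITIONS = {
    ("i", "arp_request"): "reply_wait",
    ("reply_wait", "arp_request"): "reply_wait",  # assumed: retransmitted request
    ("reply_wait", "arp_response"): "cached",
    ("cached", "cache_timeout"): "i",
}

class ArpMonitor:
    """Track one specification state machine per IP address being resolved."""

    def __init__(self):
        self.state = {}   # ip -> current state, defaulting to "i"
        self.alarms = []  # (ip, state, event) for every disallowed event

    def observe(self, ip: str, event: str) -> bool:
        current = self.state.get(ip, "i")
        nxt = TRANSITIONS.get((current, event))
        if nxt is None:
            self.alarms.append((ip, current, event))
            return False
        self.state[ip] = nxt
        return True

monitor = ArpMonitor()
monitor.observe("10.0.0.7", "arp_request")   # i -> reply_wait
monitor.observe("10.0.0.7", "arp_response")  # reply_wait -> cached
monitor.observe("10.0.0.8", "arp_response")  # no request outstanding: alarm
```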

  30. Other Protocols Specified for Intrusion Detection • Domain Name System (DNS) • Network File System (NFS) • Dynamic Host Configuration Protocol (DHCP) • TCP • FTP • RIP routing protocol • OSPF routing protocol

  31. Difficult Unknown Attacks • Loss of Confidentiality: Need to detect “exfiltration” • Causes change to application functionality: Need to write specification for application behavior • Insider browsing in unexpected locations: Anomaly detection, or detect activity inconsistent with a policy -- “Demids”: An intrusion detection system for databases

  32. Outline • Forensics Principles: Based on how a human forensics expert analyzes logs to identify attacks.

  33. Prevention Forensics • Protecting against future attack instances requires determination of the cause. • Forensics need only proceed until a response blocking future attacks is determined, not to the ultimate root cause. • The HACQIT Forensic Agent automatically detects the suspicious events in the application and network traffic logs. • Reports from the Forensic Agent are used by the Response Agent to block future instances.

  34. Abstract Forensic Leads • Login Record 3:22 12/11/2001 • Code Compiled 3:27 12/11/2001 • Network Connection 4:02 12/11/2001 • Inconsistent File System 4:11 12/11/2001 • IDS Alarm 4:12 12/11/2001 • Login Record 4:12 12/11/2001 • Suspicious Transaction in Application Log 4:13 12/11/2001
      Raw Data
      dbtpto tty03 SVRC05 Thu Feb 21 12:48 - 12:52 (00:03)
      tgtawb tty02 SVRC05 Thu Feb 21 12:44 still logged in
      Last login: Wed Jul 28 19:59:56 1999 from beukel. Porcupine
      Jul 30 99 18:45:45 3743 .a. -rw-r--r-- root wheel /etc/make.conf
      4347 .a. -r--r--r-- bin bin /usr/include/machine/ansi.h
      3911 .a. -r--r--r-- bin bin /usr/include/machine/endian.h
      2697 .a. -r--r--r-- bin bin /usr/include/machine/types.h
      5903 .a. -r--r--r-- bin bin /usr/include/sys/types.h
      3528 .a. -r--r--r-- bin bin /usr/share/mk/bsd.own.mk
      3945 .a. -r--r--r-- bin bin /usr/share/mk/sys.mk
      Jul 30 99 18:45:46 1949 .a. -r--r--r-- bin bin /usr/lib/crt0.o
      22544 .a. -r--r--r-- bin bin /usr/lib/libgcc.a
      May 20 01:04:42 tuegate: 14498 systatd: connect from litp.ibp.fr
      May 20 01:10:19 tuegate: 14536 systatd: connect from monk.rutgers.edu
      May 20 01:23:49 tuegate: 15040 systatd: connect from monk.rutgers.edu
      Automated Forensic Methodology
      Forensic Evidence Analysis: Effect 1 Cause A Cause B Effect 2 Cause C Cause D Effect 3 Cause F Cause E Effect 4 Effect 4

  35. Outline • Forensics Analysis: To identify possible event sequences that could have caused triggered specification violation. • Determine blocking rules that are: • Immediate • Mission aware • Reversible • Subsequently optimized • Related Work (DARPA, other) in Forensics: • DERBI: SRI • Maita: MIT

  36. Forensic Analysis: Overview • Create a tree whose • Roots are initiating events for sequences of actions • Leaves are the terminating events of sequences of actions, usually events that are triggered by specification violations • Upon notice of a trigger 1. Identify trigger in the forensics tree 2. Identify predecessor actions of the trigger 3. Identify all predecessors that are matched with events in log (circular buffer) 4. For all predecessors identified in (3), repeat starting with (1)
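
The four-step walk above can be sketched as a recursive backward search. The predecessor map below borrows node labels from the rsh example on the later slides, but the edges between them are assumptions, since the diagram itself is not in the transcript; the function name and data layout are likewise hypothetical.

```python
# Hypothetical predecessor map: event -> events that can directly lead to it.
PREDECESSORS = {
    "suspicious rsh login": ["spoofed connection", "'+ +' .rhosts",
                             "stolen password"],
    "spoofed connection": ["denial of service on trusted host"],
    "stolen password": ["unencrypted network connection"],
}

def trace_back(trigger, logged_events, predecessors=PREDECESSORS):
    """Return each chain, initiating event first, that the log supports."""
    chains = []

    def walk(event, chain):
        # Steps 2 and 3: predecessors of this event that also appear in the log.
        matched = [p for p in predecessors.get(event, [])
                   if p in logged_events and p not in chain]
        if not matched:
            chains.append(list(reversed(chain)))
            return
        # Step 4: repeat from each matched predecessor.
        for p in matched:
            walk(p, chain + [p])

    walk(trigger, [trigger])  # step 1: start at the trigger
    return chains

# Contents of the circular buffer at the time of the trigger (illustrative).
log = {"spoofed connection", "denial of service on trusted host",
       "unencrypted network connection"}
chains = trace_back("suspicious rsh login", log)
```

Only the spoofing branch survives here: the unencrypted connection was logged, but its successor ("stolen password") was not, so that branch is pruned at step 3.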

  37. Connection Spoofing • Participants: Alice, Bob, Eve, Cathy • 5. Forge SYN-ACK packet to establish the connection as trusted client

  38. Network Intrusion Detection System trigger Host B Host A denial of service Trojaned rsh daemon buffer overflow suspicious rsh login entry in log file spoofed connection between A & B “+ +” .rhosts active connection between A & B no password stolen password via password cracking unencrypted network connections

  39. Network Intrusion Detection System trigger Host B Host A denial of service Trojaned rsh daemon buffer overflow suspicious rsh login entry in log file spoofed connection between A & B “+ +” .rhosts STOP active connection between A & B no password stolen password via password cracking unencrypted network connections
