
Selected Topics in Automated Diversity

Stephanie Forrest, University of New Mexico. Mike Reiter and Dawn Song, Carnegie Mellon University.

Presentation Transcript


  1. Selected Topics in Automated Diversity
     Stephanie Forrest, University of New Mexico
     Mike Reiter and Dawn Song, Carnegie Mellon University

  2. Automated Diversity for Security
     • Computer systems are highly uniform
       • Easy targets for standardized attacks
     • Use idea of biological diversity:
       • Introduce changes that make each system unique
       • Attack will need to be rewritten for each computer
       • Provide population resilience to unknown environmental threats
     • Two approaches:
       • Interface diversity: Adapt vulnerable interfaces such as machine language, system call numbers, and standard library locations.
       • Implementation diversity: Utilize diverse implementations of common services
     • Two projects:
       • Randomized instruction set emulation [Barrantes, Ackley and Forrest]
       • Behavioral distance for anomaly detection [Gao, Reiter and Song]

  3. Randomized Instruction Set Emulation (RISE): an example of interface diversity
     • Many current attacks insert binary code into a running program, which is then executed.
     • RISE protects the code itself, rather than points of entry:
       • Perimeter defense (e.g., stack protection) is not enough.
     • Randomize the binary code instruction set for every program:
       • Foreign malicious code will try to execute in the standard format and will fail.
       • Knowledge of a particular translation will gain access only to that particular program.
     • Modify the compiler/virtual machine to accept this “new” language:
       • Prototype in the open-source binary-to-binary translator Valgrind.
     • Related to encrypting compilers.
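
The randomization idea on this slide can be illustrated with a small sketch. It assumes the scrambling is a byte-wise XOR with a random per-process mask; the slide does not specify the transformation, and `make_mask`, `xor_bytes`, and the byte values are hypothetical stand-ins, not the prototype's code.

```python
import secrets


def make_mask(length: int) -> bytes:
    """Hypothetical helper: draw a fresh random mask for one process."""
    return secrets.token_bytes(length)


def xor_bytes(data: bytes, mask: bytes) -> bytes:
    """XOR each byte with the mask (repeated if the data is longer).

    XOR is its own inverse, so the same call scrambles and unscrambles.
    """
    return bytes(b ^ mask[i % len(mask)] for i, b in enumerate(data))


# Load time: the program's legitimate code is scrambled with its own mask.
program = bytes([0x55, 0x89, 0xE5, 0x5D, 0xC3])  # stand-in code bytes
mask = make_mask(len(program))
scrambled = xor_bytes(program, mask)

# Fetch time: the emulator unscrambles every byte before executing it,
# so legitimate code is recovered exactly.
recovered = xor_bytes(scrambled, mask)

# Injected code arrives in the standard (unscrambled) format. The emulator
# still applies the mask, so what it tries to execute is effectively random.
injected = bytes([0x90, 0x90, 0xCC, 0xC3, 0x90])  # stand-in attack bytes
fetched = xor_bytes(injected, mask)
```

Because each process draws its own mask in this sketch, learning one mask would help an attacker only against that one program, which matches the slide's point about a particular translation.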

  4. How does foreign code infect a running program?

  5. Results
     • Prototype implementation available under GPL from http://www.cs.unm.edu/~immsec:
       • Normal code runs properly.
       • Binary code injection attacks stopped (100% of tested examples).
     • Performance (preliminary):
       • Emulation overhead of Valgrind is high.
       • Incremental cost of RISE is small.
       • (Very) roughly a factor of 2 slowdown in current configuration.
     • Significant space penalty:
       • Libraries
       • Mask

  6. Host-Based Anomaly Detector
     [Figure: system call requests crossing from user space to kernel space are checked against a model that answers "Is this system call request anomalous?" (Y/N); the labels 3, 5, 11 appear in the figure]
     • Research focus: What is the best model for anomaly detection? Can we use another computer as the model?

  7. Fault-Tolerant System
     • Commercial off-the-shelf applications: may not produce the same responses
     • Intrusions that do not result in observable deviation in the responses
     • Need to observe the behavior

  8. The Problem
     • Diverse platform (Linux and Windows)
       • System call numbers observed do not have semantic meanings
       • System calls may not have one-to-one correspondence
       • System call sequences may have different lengths
     • Diverse implementation (Apache and Abyss)
       • Correspondence may not exist between individual system calls
     [Figure: system call number sequences compared under the label "Match?"; numbers shown: 3 43 5 3 4 9 6 302 10 46 6 222]

  9. Evolutionary Distance
     • Are two DNA sequences derived from a common ancestral sequence?
     • Evolutionary distance between two DNA sequences:
       • Substitutions
       • Deletions
       • Insertions
     Example, with "-" as the insertion/deletion (I/D) symbol:
       Original sequences:  ATGCGTCGTT
                            ATCCGCGAT
       Aligned sequences:   ATGC-GTCGTT
                            AT-CCG-CGAT
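
As a rough illustration of a distance built from substitutions, deletions, and insertions, the sketch below computes a classic edit distance by dynamic programming. It assumes every operation costs 1; the slide gives no costs, and the function names are illustrative only.

```python
def edit_distance(a: str, b: str) -> int:
    """Minimum number of substitutions, insertions, and deletions
    needed to turn a into b, assuming each operation costs 1."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j - 1] + (ca != cb),  # substitution (free on a match)
                prev[j] + 1,               # delete ca from a
                curr[j - 1] + 1,           # insert cb into a
            ))
        prev = curr
    return prev[-1]


def alignment_cost(x: str, y: str) -> int:
    """Score one given alignment: count the columns that differ.

    Both strings must already be padded to equal length with "-".
    """
    if len(x) != len(y):
        raise ValueError("aligned sequences must have equal length")
    return sum(cx != cy for cx, cy in zip(x, y))


seq1, seq2 = "ATGCGTCGTT", "ATCCGCGAT"
best = edit_distance(seq1, seq2)
shown = alignment_cost("ATGC-GTCGTT", "AT-CCG-CGAT")
```

Under these assumed unit costs the alignment shown on the slide scores 4 while the minimum is 3, so it is one valid alignment rather than the cheapest one; with a different cost assignment (for example, an expensive G/C substitution) the slide's alignment could be the cheaper choice.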

  10. Behavioral Distance and Evolutionary Distance
     • Similarities
       • Both evaluate the difference between two sequences
       • Both use substitutions, deletions, and insertions
     • Differences
       • The same system call number in two sequences does not denote the “same” call
       • The behavioral distance measure has no given cost table
       • We have training data

  11. Behavioral Distance
     • Behavioral distance calculation
     • Learning the cost table
       • Initializing the cost table
       • Iteratively updating the cost table
     • System call phrase extraction

  12. Behavioral Distance Calculation
       ATGCGTCGTT      ATGC-GTCGTT
       ATCCGCGAT       AT-CCG-CGAT
     [Formula not preserved in the transcript; its annotation reads: "The set of sequences obtained by inserting n-len(s) I/D symbols into s, at any location"]
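
One way to read this slide is that the distance is the smallest total cost over all ways of padding the two sequences with I/D symbols to a common length. The sketch below computes that minimum by dynamic programming. It is an illustration under stated assumptions, not the authors' formula: the cost table, the default cost of 1.0 for unlisted pairs, and all system call numbers are hypothetical.

```python
from typing import Dict, Optional, Sequence, Tuple

GAP = None  # stands for the I/D symbol
CostTable = Dict[Tuple[Optional[int], Optional[int]], float]


def behavioral_distance(s1: Sequence[int], s2: Sequence[int],
                        cost: CostTable, default: float = 1.0) -> float:
    """Minimum total cost of aligning s1 with s2, where each column pairs
    a call from s1 (or GAP) with a call from s2 (or GAP)."""
    def c(a: Optional[int], b: Optional[int]) -> float:
        return cost.get((a, b), default)

    m, n = len(s1), len(s2)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):          # s1 prefix against nothing
        d[i][0] = d[i - 1][0] + c(s1[i - 1], GAP)
    for j in range(1, n + 1):          # nothing against s2 prefix
        d[0][j] = d[0][j - 1] + c(GAP, s2[j - 1])
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(
                d[i - 1][j - 1] + c(s1[i - 1], s2[j - 1]),  # pair two calls
                d[i - 1][j] + c(s1[i - 1], GAP),            # s1 call vs I/D
                d[i][j - 1] + c(GAP, s2[j - 1]),            # I/D vs s2 call
            )
    return d[m][n]


# Hypothetical cost table: low cost means "these calls usually correspond".
cost_table: CostTable = {(3, 43): 0.0, (5, 302): 0.25, (6, 46): 0.0}

normal = behavioral_distance([3, 5, 6], [43, 302, 46], cost_table)
extra_call = behavioral_distance([3, 5, 11, 6], [43, 302, 46], cost_table)
```

In this toy table the second comparison comes out larger because the unmatched call 11 has to be paired with an I/D symbol at the default cost, which is the kind of deviation a threshold on the distance would flag.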

  13. Learning the Cost Table
     • Training data: subjecting the replicas to a battery of well-formed (benign) requests and observing the system calls induced
     • Initializing the cost table
       • The first approach: comparing semantics of individual system calls
       • The second approach: using frequency information
     • Iteratively updating the cost table
       • Use the initialized cost table to calculate behavioral distance between system call sequences in the training data
       • Results of the behavioral distance reveal the “proper alignments” between system calls
       • Use these “proper alignments” to update the cost table
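
The align-then-update loop described on this slide can be sketched as follows. The slide does not state the update rule, so the rule used here is an assumption made purely for illustration: the cost of a pair (a, b) becomes 1 minus the fraction of a's alignments that went to b. The training sequences, the initial table, and the function names are likewise hypothetical.

```python
from collections import Counter

GAP = None  # stands for the I/D symbol


def align(s1, s2, cost, default=1.0):
    """Return the aligned pairs of one minimum-cost alignment of s1 and s2."""
    def c(a, b):
        return cost.get((a, b), default)

    m, n = len(s1), len(s2)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + c(s1[i - 1], GAP)
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + c(GAP, s2[j - 1])
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(d[i - 1][j - 1] + c(s1[i - 1], s2[j - 1]),
                          d[i - 1][j] + c(s1[i - 1], GAP),
                          d[i][j - 1] + c(GAP, s2[j - 1]))
    # Trace back from the corner, re-evaluating the same expressions
    # that produced each cell so the comparisons are exact.
    pairs, i, j = [], m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + c(s1[i - 1], s2[j - 1]):
            pairs.append((s1[i - 1], s2[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and d[i][j] == d[i - 1][j] + c(s1[i - 1], GAP):
            pairs.append((s1[i - 1], GAP))
            i -= 1
        else:
            pairs.append((GAP, s2[j - 1]))
            j -= 1
    pairs.reverse()
    return pairs


def learn_cost_table(training, cost, rounds=3):
    """Repeat: align the training data, then rebuild the table from the
    observed alignments (assumed rule: 1 - relative alignment frequency)."""
    for _ in range(rounds):
        counts = Counter()
        for s1, s2 in training:
            counts.update(align(s1, s2, cost))
        per_symbol = Counter()
        for (a, _b), k in counts.items():
            per_symbol[a] += k
        cost = {(a, b): 1.0 - k / per_symbol[a] for (a, b), k in counts.items()}
    return cost


# Hypothetical benign training traces from two replicas.
training = [([3, 5, 6], [43, 302, 46]), ([3, 5, 11, 6], [43, 302, 46])]
initial = {(3, 43): 0.5, (5, 302): 0.5, (6, 46): 0.5}  # e.g. from semantics
learned = learn_cost_table(training, initial)
```

Pairs that never appear in an alignment are simply absent from the learned table and fall back to the default cost, so an initial table that is badly wrong can lock in poor alignments; that sensitivity is one reason the slide treats initialization as a separate step.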

  14. System Call Phrases
     • Correspondence may not exist between individual system calls
     • Behavioral distance calculation is very slow when sequences are long
     • Solution: group system calls into system call phrases
       • System call phrases are also called system call subsequences
       • A system call phrase is a sequence of system calls that frequently appear together in program execution
     • TEIRESIAS algorithm (also taken from biology)
       • TEIRESIAS algorithm has been used in other intrusion/anomaly detection systems
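
To show why phrases shorten the sequences that must be aligned, here is a deliberately simple stand-in: it counts contiguous runs of system calls and keeps the frequent ones, then segments a trace greedily. This is not the TEIRESIAS algorithm named on the slide; the thresholds, function names, and traces are assumptions for illustration.

```python
from collections import Counter


def frequent_phrases(traces, min_len=2, max_len=4, min_support=2):
    """Collect contiguous subsequences (as tuples) that occur at least
    min_support times across all traces."""
    counts = Counter()
    for trace in traces:
        for n in range(min_len, max_len + 1):
            for i in range(len(trace) - n + 1):
                counts[tuple(trace[i:i + n])] += 1
    return {phrase for phrase, k in counts.items() if k >= min_support}


def segment(trace, phrases, max_len=4):
    """Greedily rewrite a trace as a list of phrases, always taking the
    longest known phrase at the current position; calls that start no
    known phrase become single-call phrases."""
    out, i = [], 0
    while i < len(trace):
        for n in range(max_len, 1, -1):
            candidate = tuple(trace[i:i + n])
            if len(candidate) == n and candidate in phrases:
                out.append(candidate)
                i += n
                break
        else:
            out.append((trace[i],))
            i += 1
    return out


# Hypothetical system call traces.
traces = [[3, 5, 6, 4, 3, 5, 6], [3, 5, 6, 9]]
phrases = frequent_phrases(traces)
compact = segment(traces[0], phrases)
```

In this toy run the first trace shrinks from seven calls to three phrases, and the alignment would then be computed over phrases instead of individual calls.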

  15. Evaluation – Experimental Setup

  16. Behavioral Distance – Same Application
     [Figure panels: Apache Webserver; Myserver Webserver]

  17. Behavioral Distance – Different Application
     [Figure panels: Linux: Myserver Webserver / Windows: Apache Webserver; Linux: Apache Webserver / Windows: Myserver Webserver]

  18. Behavioral Distance – Mimicry Attacks
     [Table; values not preserved in the transcript]
     • Measures: true acceptance rate when the threshold is set to detect the best mimicry attack; behavioral distance of the best mimicry attack
     • Scenarios: attacker knows the individual IDS on one replica; attacker knows the behavioral distance and the cost table

  19. Performance Overhead

  20. Conclusion
     • Behavioral distance detects an attack on one process that causes its behavior to deviate from that of another
     • Behavioral distance makes evasion attacks more difficult with moderate overhead
