
Intrusion Detection using Sequences of System Calls

Intrusion Detection using Sequences of System Calls. By S. Hofmeyr & S. Forrest. Overview. Focus: privileged processes Discriminator: system call sequences Building a database: defining “normal” Detecting anomalies: how to measure Results: promising numbers Concerns: remaining doubts



  1. Intrusion Detection using Sequences of System Calls By S. Hofmeyr & S. Forrest

  2. Overview • Focus: privileged processes • Discriminator: system call sequences • Building a database: defining “normal” • Detecting anomalies: how to measure • Results: promising numbers • Concerns: remaining doubts • Extensions of research: Jones, Li & Lin

  3. Inspiration • Human immune system • Recognition of self • Rejection of nonself • How would we describe “self” for a software system, or a program?

  4. Focus and Motivation • Focus on privileged processes • Exploitation can give a user root access • They provide a natural boundary • e.g. telnet daemon, login daemon • Privileged processes are easier to track • Specific, limited function • Stable over time • Contrast with the diversity of user actions

  5. Where do we look? • Need to distinguish when: • Privileged process runs normally • Privileged process exhibits an anomaly • The discriminator is the observable entity used to distinguish between these two • Use sequences of system calls as the discriminator, the signature

  6. How much detail? • Discriminator is sequences of system calls • Simple temporal ordering is chosen • Ignore parameters • Ignore specific timing information • Ignore everything else! • Why? As much as possible, work with simple assumptions • Is it “enough”?

  7. Is it enough detail? • Does the discriminator include enough detail for the hypothesis (that short call sequences separate normal from anomalous behavior) to hold? • Answer seems to be yes! • Extra complication: because configuration and use vary between individual systems, the set of “normal” system call sequences will differ from system to system

  8. Design Decisions • Remember temporal ordering of calls • Not total sequence, but sequences of length k • What size should k be? • Long enough to detect anomalies, but as short as possible • Empirical observation: length 6 to 10 is sufficient • So “self” is a database (an unordered set) of short call sequences
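The windowing step described on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `build_normal_db` and the call names in the sample trace are made up for the example, and k is set to 3 only to keep the output small (the slide reports 6 to 10 as sufficient in practice).

```python
from typing import List, Set, Tuple

Seq = Tuple[str, ...]


def build_normal_db(trace: List[str], k: int) -> Set[Seq]:
    """Slide a window of length k over the trace and keep each unique window.

    The result is an unordered set; each member is an ordered k-tuple of calls.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # A trace shorter than k yields no windows, so the set is empty.
    return {tuple(trace[i:i + k]) for i in range(len(trace) - k + 1)}


# Hypothetical trace of system call names (parameters and timing ignored).
trace = ["open", "read", "mmap", "mmap", "open", "read", "mmap"]
db = build_normal_db(trace, k=3)

for seq in sorted(db):
    print(seq)
```

The trace has five windows of length 3, but the first and last are identical, so the database holds four sequences.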

  9. Building the “normal” database • Synthetic • Assurance that the normal database contains no intrusions; reproducible • But does not reflect any particular real user activity • Actual use • Necessary to generate from actual use in order to have a unique “self” • How long to accumulate? Is it clean?

  10. The normal database • Database of normal sequences does not contain all legal sequences • If it did, anomalies would not be detected • Some rare sequences will not be used during database initialization • Database is stored as a forest to save space
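One way to picture the forest storage is a set of tries, one per first call, in which sequences sharing a prefix share nodes. The sketch below uses nested dictionaries; the helper names (`insert`, `contains`, `build_forest`, `count_nodes`) and the sample trace are assumptions for illustration, and the original implementation may differ.

```python
from typing import Dict, List, Sequence

Forest = Dict[str, dict]  # call name -> subtree of calls that may follow it


def insert(forest: Forest, seq: Sequence[str]) -> None:
    """Add one sequence, reusing nodes for any prefix already stored."""
    node = forest
    for call in seq:
        node = node.setdefault(call, {})


def contains(forest: Forest, seq: Sequence[str]) -> bool:
    """Return True if seq follows an existing path from a root.

    Every stored sequence has length k, so callers should pass length-k
    sequences; a shorter prefix of a stored sequence would also match.
    """
    node = forest
    for call in seq:
        if call not in node:
            return False
        node = node[call]
    return True


def build_forest(trace: List[str], k: int) -> Forest:
    forest: Forest = {}
    for i in range(len(trace) - k + 1):
        insert(forest, trace[i:i + k])
    return forest


def count_nodes(forest: Forest) -> int:
    return sum(1 + count_nodes(child) for child in forest.values())


# Hypothetical trace; call names are illustrative only.
trace = ["open", "read", "mmap", "mmap", "open", "read", "mmap"]
forest = build_forest(trace, k=3)
print(count_nodes(forest))
```

For this trace the four unique length-3 sequences would need 12 entries if stored flat; the forest needs 11 nodes because two sequences share the root `mmap`. The saving grows as more sequences share longer prefixes.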

  11. Signature Database Structure (length 3) • [Figure: forest of length-3 sequences built from a trace of fopen, fread, and strcmp calls]

  12. Derive Robust Signature Database

  13. Detecting anomalies • A call sequence not in the database is an anomalous sequence • Strength of that anomalous sequence is measured by “Hamming distance” to the closest normal sequence (called dmin) • Any call trace with an anomalous sequence is an anomalous trace

  14. Detecting anomalies • Strength of an anomalous trace is the maximum dmin of the trace normalized for the value of k (length of sequences in the database): • ŜA = max{dmin values for the trace} / k • Value is between 0 and 1 • By adjusting the threshold value for ŜA, false positives can be reduced
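The two measures on slides 13 and 14 can be sketched together. This is a brute-force illustration under stated assumptions: the function names, the sample database, and the threshold value are invented for the example, and a production monitor would not rescan the whole database for every window.

```python
from typing import List, Set, Tuple

Seq = Tuple[str, ...]


def hamming(a: Seq, b: Seq) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def d_min(seq: Seq, db: Set[Seq]) -> int:
    """Hamming distance from seq to the closest normal sequence."""
    if not db:
        raise ValueError("normal database is empty")
    if seq in db:
        return 0
    return min(hamming(seq, normal) for normal in db)


def anomaly_strength(trace: List[str], db: Set[Seq], k: int) -> float:
    """Maximum d_min over all length-k windows of the trace, divided by k."""
    windows = [tuple(trace[i:i + k]) for i in range(len(trace) - k + 1)]
    if not windows:
        return 0.0
    return max(d_min(w, db) for w in windows) / k


# Hypothetical normal database for k = 3 (call names are illustrative).
K = 3
normal_db: Set[Seq] = {
    ("open", "read", "mmap"),
    ("read", "mmap", "mmap"),
    ("mmap", "mmap", "open"),
    ("mmap", "open", "read"),
}

# A trace containing one call ("execve") never seen during training.
suspect = ["open", "read", "mmap", "execve", "open", "read"]
strength = anomaly_strength(suspect, normal_db, K)

THRESHOLD = 0.3  # assumed value; raising it trades false positives for misses
print(strength, strength >= THRESHOLD)
```

In this example every window that includes `execve` is one substitution away from a normal sequence, so the maximum d_min is 1 and the normalized strength is 1/3.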

  15. Efficiency • Complexity of computing dmin • O(k(RA·N + 1)) • k is the sequence length, RA is the ratio of anomalous to normal sequences, N is the number of sequences in the database • dmin is calculated after every system call • The constant associated with this algorithm is very important • Not yet running in real time
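One plausible reading of the bound, reconstructed here rather than taken from the slides: assume a normal sequence is confirmed by a single tree lookup costing O(k), while an anomalous sequence must be compared against all N stored sequences at O(k) each. For every normal sequence processed there are, on average, R_A anomalous ones, which gives:

```latex
\begin{align*}
\text{cost of one normal sequence}    &= O(k) \\
\text{cost of one anomalous sequence} &= O(kN) \\
\text{cost per normal sequence, including its share of anomalies}
  &= O(k) + R_A \cdot O(kN) \\
  &= O\bigl(k\,(R_A N + 1)\bigr)
\end{align*}
```

Under this reading the cost stays close to O(k) per call while anomalies are rare, and approaches a full database scan per call as R_A grows.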

  16. Results (synthetic) • Sanity test: if different programs are not distinguishable, anomalies within one program will certainly not be either • Easy to distinguish between programs; mismatches on well over 50% of the system call sequences (and ŜA ≥ 0.6) • All intrusions (both attempted and successful) produced anomalies of varying strengths

  17. Results (real environment) • The conjecture of unique normal databases • Experiments in two configurations (at UNM and MIT) had very different databases for the same program (lpr) • Is this typical?

  18. Closing concerns • False positives vs false negatives • If forced to choose, UNM prefers false negatives, because layering can mitigate them • Saw about 1 false positive per 100 print jobs (lpr) • Due to system problems • Is ŜA a good measure? • It could help generate false positives • A single extra system call might make ŜA = 0.5

  19. Annex Material • Some UVa experiments • S. Li, Y. Lin, and A. Jones

  20. Signature Length Has Little Effect • Illustrated by two attacks on Apache • Varied sequence length from 2 to 30 • We chose length 10 to have margin of error

  21. Effectiveness: Buffer Overflow • [Figure annotation: high normalized anomaly signals indicate attacks] • Successfully detected buffer overflow attacks against wu-ftpd • Works well because attacker code adds new sequences of library calls

  22. Effectiveness: Denial of Service • [Figure annotations: “No intrusion detected”; “High normalized anomaly signal indicates attack”] • Simulated DoS attack that uses up all available memory • As the attack progresses, library calls requesting memory return abnormally and are re-issued • The DoS attack caused the application to invoke a new library call, fsync
