Learning Sequential Models for Detecting Anomalous Protocol Usage (work in progress)

Lloyd Greenwald, Lucent Bell Labs
Lucent Technologies – Proprietary
Use pursuant to company instruction

Machine Learning Algorithms for Surveillance and Event Detection
• Surveillance:
  • Network traffic
• Event Detection:
  • Unknown vulnerability exploits using sequences of messages
• Machine Learning Algorithms:
  • Learning Markov models to capture recent sequential protocol usage

NIDS Monitors Traffic and Detects Events That Violate Security Policy
(from Bro user manual)

Example Attack Sequence: NIDS Evasion Attack
• Fake a missing packet (to cause buffering)
• Send two interspersed sequences for the same connection
• Even with the same TTLs, it is ambiguous how end systems will re-create the sequence
(from Handley et al. 01)

Example Attack: Multi-Step
• Apache/mod_ssl worm (aka Slapper)
  • Probe/scan target for vulnerability by sending an HTTP GET request on TCP port 80 that violates the HTTP/1.1 standard
  • Response identifies server as Apache
  • Exploit for SSLv2-enabled OpenSSL 0.9.6d vulnerability sent to TCP port 443
  • Target sends traffic back to attacker on UDP port 2002
  • Target begins scanning for other vulnerable hosts

Technical Approach
• Automatically build sequential models of recent protocol usage
• Analyze models for common and uncommon sequences
• Proactively exercise protocol implementation with uncommon sequences sampled from models
• Reactively detect uncommon sequences
• Build new defense policies for NIDS

Prior Work: Machine Learning Algorithms for Automated Test Case Generation
[Diagram: Internet, Session Data]
• Surveillance:
  • Web logs
• Event Detection:
  • Exercise errors in web applications
• Machine Learning Algorithms:
  • Learning Markov models to capture recent sequential web application usage

Prior Work: Automated Test Case Generation
• Leverage dynamic user information to automatically generate NEW test cases for web applications.
[Diagram: Session Data]
Key contribution 1) Sequential statistical models built using machine learning techniques.
Key contribution 2) Flexible test case generation exploiting probabilistic sampling methods.

Web Application Studied
• Front end – JSP
• Back end – MySQL
• 10K lines of code, 118 methods, 12 classes
• 123 user sessions (sequential application usage extracted from web log)
Question: Can we build models that can be used to generate new, valid user sessions?

Building Markov Models From Web Logs
• Extract user sessions from web log
  • 12.3.40.65 GET index.jsp
  • 12.3.40.65 GET login.jsp
  • 12.3.40.65 GET /apps/bookstore/reg.jsp?member_login=hello&member_password=world&member_password2=world
  • 12.3.40.65 GET myinfo.jsp
• Control Model: possible sequences of URLs that are visited
• Data Model: possible sets of parameter values (name-value pairs)
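
The session-extraction step can be sketched roughly as follows. This is a minimal illustration, not the authors' tooling: it assumes each log line has the simplified "IP METHOD URL" shape shown on the slide and that one client IP corresponds to one user session. The function name `extract_sessions` is hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit, parse_qsl

def extract_sessions(log_lines):
    """Group requests by client IP, preserving log order (assumed: one IP = one session)."""
    sessions = defaultdict(list)
    for line in log_lines:
        ip, _method, url = line.split(maxsplit=2)
        parts = urlsplit(url)
        page = parts.path                       # input to the control model
        params = dict(parse_qsl(parts.query))   # input to the data model (name-value pairs)
        sessions[ip].append((page, params))
    return dict(sessions)

# The four log lines from the slide.
log = [
    "12.3.40.65 GET index.jsp",
    "12.3.40.65 GET login.jsp",
    "12.3.40.65 GET /apps/bookstore/reg.jsp?member_login=hello&member_password=world&member_password2=world",
    "12.3.40.65 GET myinfo.jsp",
]
sessions = extract_sessions(log)
```

Splitting each request into a page and a parameter dictionary keeps the two inputs separate, so the control model sees only the page sequence and the data model sees only the name-value pairs.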

Control Models
• unigram: Probability of a user visiting a given page, independent of the previous page
  • P(currentPage=X)
[Figure: example probabilities over the pages default, search, bookDetail, register; values shown: 0.10, 0.20, 0.65, 0.05]

Control Models
• bigram: Conditional probability of a user visiting a page, given the previous page
  • P(currentPage=X | lastPage=Y)
[Figure: example transition probabilities over the pages default, search, bookDetail, register; values shown: 0.30, 0.15, 0.10, 0.45]

Control Models
• trigram: Conditional probability of a user visiting a page, given the previous two pages
  • P(currentPage=X | lastPage1=Y1, lastPage2=Y2)
[Figure: example transition probabilities over the pages default, search, bookDetail, register; values shown: 0.30, 0.05, 0.10, 0.55]
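
A minimal sketch of how unigram, bigram, and trigram control models could be estimated by relative frequency is given below. The slides do not say how the probabilities were estimated, so the counting scheme, the `START` padding symbol, and the toy sessions are all assumptions; only the page names come from the slides.

```python
from collections import Counter, defaultdict

START = "<s>"  # hypothetical padding symbol marking the start of a session

def train_ngram(sessions, n):
    """Estimate P(current page | previous n-1 pages) by relative frequency."""
    counts = defaultdict(Counter)
    for pages in sessions:
        padded = [START] * (n - 1) + list(pages)
        for i in range(n - 1, len(padded)):
            context = tuple(padded[i - n + 1:i])  # empty tuple when n == 1
            counts[context][padded[i]] += 1
    return {
        ctx: {page: c / sum(nxt.values()) for page, c in nxt.items()}
        for ctx, nxt in counts.items()
    }

# Toy sessions, invented for illustration.
sessions = [
    ["default", "search", "bookDetail"],
    ["default", "search", "search", "bookDetail"],
    ["default", "register"],
    ["default", "search", "register"],
]
unigram = train_ngram(sessions, 1)   # P(currentPage=X)
bigram = train_ngram(sessions, 2)    # P(currentPage=X | lastPage=Y)
trigram = train_ngram(sessions, 3)   # P(currentPage=X | previous two pages)
```

With these toy sessions, `bigram[("default",)]` assigns 0.75 to `search` and 0.25 to `register`, while the trigram context `("default", "search")` has only three observations to split among three outcomes.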

Reliability vs. Discrimination
[Figure: unigram, bigram, trigram on a scale; greater discrimination (more context) toward trigram, greater reliability (more training data) toward unigram]

Data Models
• simple: P(values=X | currentPage=Y)
• “important parameter” example:
  • Books.do?category=3, then BookDetail.do?category=3&itemId=8
• advanced: P(values=X | lastPage+importantParams=Y1, currentPage=Y2)

Simple Data Model
Page1: http://decide.cs/bookstore/BookDetail.do?itemId=18
Page2: http://decide.cs/bookstore/AddOrder.do?quantity=99&itemId=36

Advanced Data Model
Page1: http://decide.cs/bookstore/BookDetail.do?itemId=18
Page2: http://decide.cs/bookstore/AddOrder.do?quantity=1&itemId=18
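
The difference between the simple and advanced data models can be sketched as two count tables keyed on different contexts. This is an illustrative reading of the slides, not the authors' implementation: the helper names, the choice of `itemId` as the "important parameter", and the toy sessions are assumptions.

```python
from collections import Counter, defaultdict

def freeze(params):
    """Turn a dict of name-value pairs into a hashable, order-independent key."""
    return tuple(sorted(params.items()))

def train_data_models(sessions, important=("itemId",)):
    """Count parameter sets per page (simple) and per
    (last page, its important params, current page) context (advanced)."""
    simple = defaultdict(Counter)
    advanced = defaultdict(Counter)
    for session in sessions:
        last_page, last_params = None, {}
        for page, params in session:
            values = freeze(params)
            simple[page][values] += 1
            kept = freeze({k: v for k, v in last_params.items() if k in important})
            advanced[(last_page, kept, page)][values] += 1
            last_page, last_params = page, params
    return simple, advanced

# Toy sessions, invented for illustration.
sessions = [
    [("BookDetail.do", {"itemId": "18"}), ("AddOrder.do", {"quantity": "1", "itemId": "18"})],
    [("BookDetail.do", {"itemId": "36"}), ("AddOrder.do", {"quantity": "99", "itemId": "36"})],
]
simple, advanced = train_data_models(sessions)
```

Under the simple table, `AddOrder.do` has two equally frequent parameter sets, so sampling can pair `itemId=36` with a preceding `BookDetail.do?itemId=18`. Under the advanced table, the context `("BookDetail.do", (("itemId", "18"),), "AddOrder.do")` has seen only `itemId=18`, which keeps the generated pair consistent. Dividing each count by its row total gives the conditional probabilities.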

Generating Test Cases by Combining Control and Data Models
• Generate arbitrary queries about user sessions and use these queries to build test cases
  • What are the k most likely user sessions?
  • What are the k least likely user sessions?
  • Generate k user sessions randomly, according to the distribution represented in a web log.
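
Two of these queries can be sketched against a bigram control model as follows. The model probabilities, the `START`/`END` markers, and the function names are invented for illustration, and the search assumes every page can eventually reach `END`. The best-first search is one possible way to answer the "k most likely" query; the slides do not say which method was used.

```python
import heapq
import random

START, END = "<s>", "</s>"  # hypothetical session boundary markers

# Toy bigram control model with made-up probabilities.
model = {
    START: {"default": 1.0},
    "default": {"search": 0.75, "register": 0.25},
    "search": {"bookDetail": 0.5, "search": 0.25, END: 0.25},
    "bookDetail": {END: 1.0},
    "register": {END: 1.0},
}

def sample_session(model, rng, max_len=50):
    """Draw one session at random according to the model's distribution."""
    session, page = [], START
    while len(session) < max_len:
        nxt = model[page]
        page = rng.choices(list(nxt), weights=list(nxt.values()))[0]
        if page == END:
            break
        session.append(page)
    return session

def most_likely_sessions(model, k):
    """Best-first search. A prefix's probability never increases as it grows,
    so completed sessions leave the heap in order of decreasing probability."""
    heap = [(-1.0, [START])]
    results = []
    while heap and len(results) < k:
        neg_p, path = heapq.heappop(heap)
        if path[-1] == END:
            results.append((-neg_p, path[1:-1]))
            continue
        for page, p in model[path[-1]].items():
            heapq.heappush(heap, (neg_p * p, path + [page]))
    return results

top2 = most_likely_sessions(model, 2)
random_session = sample_session(model, random.Random(0))
```

The "k least likely" query needs an extra constraint such as a maximum session length: in this toy model the `search` self-loop allows arbitrarily long sessions with arbitrarily small probability, so there is no well-defined least likely session without a bound.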

Can our models be used to generate valid user sessions?

Network Protocol Modeling Challenges
• Using live network data instead of logs
  • Access to reconstructed traffic in both directions
  • Can build models using data from multiple machines (instead of web log from single server)
• What are we generating?
  • Sequences of packets
  • Sequences of high-level events that can be turned into packets
• What is a user session?
  • Single connection
  • Cluster connections from subset of 5-tuple (srcIP, dstIP, srcPort, dstPort, Protocol)
• What are control and data models?
• Can we generate valid new sequences?
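
One way to read "cluster connections from subset of 5-tuple" is to group connection records on a chosen subset of fields. The sketch below assumes that reading; the `Conn` record, the `group_connections` helper, and the example addresses (from the documentation range 192.0.2.0/24) are hypothetical, while the ports echo the Slapper example earlier in the deck.

```python
from collections import defaultdict, namedtuple

# Hypothetical connection record holding the 5-tuple.
Conn = namedtuple("Conn", "src_ip dst_ip src_port dst_port protocol")

def group_connections(conns, key_fields):
    """Cluster connections that agree on the chosen subset of 5-tuple fields."""
    groups = defaultdict(list)
    for c in conns:
        groups[tuple(getattr(c, f) for f in key_fields)].append(c)
    return dict(groups)

conns = [
    Conn("192.0.2.10", "192.0.2.20", 40001, 80, "tcp"),
    Conn("192.0.2.10", "192.0.2.20", 40002, 443, "tcp"),
    Conn("192.0.2.20", "192.0.2.10", 2002, 2002, "udp"),
]
by_host_pair = group_connections(conns, ("src_ip", "dst_ip"))
by_connection = group_connections(conns, Conn._fields)  # full 5-tuple: one connection each
```

Grouping on `(src_ip, dst_ip)` puts the port 80 probe and the port 443 exploit in one session, but the return traffic on UDP port 2002 lands in a separate group because the direction is reversed. Treating the address pair as unordered would be one way to merge them.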

Building Sequential Model to Discover NIDS Evasion Attack
• Control model: sequence numbers
• Data model: TTLs and payload
• How hard is it to discover that this pattern is “uncommon”?
(from Handley et al. 01)
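
One possible way to quantify "uncommon" is the average negative log-probability per transition under a learned bigram model. The sketch below assumes that definition, which the slides leave open. The event names (`next`, `gap`, `overlap`), the probabilities, and the `floor` value for unseen transitions are all invented for illustration.

```python
import math

def surprise(model, sequence, floor=1e-6):
    """Average negative log2-probability per transition under a bigram model.
    Transitions never seen in training get the small probability `floor`."""
    total = 0.0
    for prev, cur in zip(sequence, sequence[1:]):
        p = model.get(prev, {}).get(cur, floor)
        total += -math.log2(p)
    return total / max(len(sequence) - 1, 1)

# Toy model over hypothetical sequence-number events:
#   next    = segment carries the expected sequence number
#   gap     = sequence number skips ahead (apparent missing packet)
#   overlap = segment covers an already-seen sequence range
model = {
    "next": {"next": 0.97, "gap": 0.02, "overlap": 0.01},
    "gap": {"next": 0.90, "overlap": 0.10},
    "overlap": {"next": 0.95, "overlap": 0.05},
}

normal = ["next"] * 5
evasion_like = ["next", "gap", "overlap", "overlap", "overlap"]
normal_score = surprise(model, normal)
evasion_score = surprise(model, evasion_like)
```

Here the evasion-like sequence scores far higher than the in-order one. Choosing the alerting threshold is where the false positive and false negative tradeoff enters, and this sketch does not address it.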

Discussion
• Are Markov models sufficient for this task? Too propositional?
• Are data models too sparse? Are state spaces too large?
• How hard is anomaly detection in this framework?
  • What is a good definition for “uncommon” traffic that doesn’t produce many false positives or false negatives?
  • What about emerging new usage patterns?
  • How to avoid “training attacks”?
• How much protocol knowledge to use in building models?
  • Can signature matching events be used in data model?
• Besides generating sequences, what other analyses can we perform?
  • Entropy of models to determine level of history-dependence in traffic?
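
The entropy question could be explored by comparing the conditional entropy of the next event under models of increasing order. The sketch below shows one such calculation; the function name and the toy counts are assumptions, and the input is a table of raw counts in the form {context: {next: count}}.

```python
import math

def conditional_entropy(joint_counts):
    """H(next | context) in bits, from counts shaped {context: {next: count}}."""
    total = sum(sum(nxt.values()) for nxt in joint_counts.values())
    h = 0.0
    for nxt in joint_counts.values():
        ctx_total = sum(nxt.values())
        for c in nxt.values():
            # P(context, next) * log2 P(next | context)
            h -= (c / total) * math.log2(c / ctx_total)
    return h

# Toy traffic that strictly alternates between events "a" and "b".
unigram_counts = {(): {"a": 2, "b": 2}}
bigram_counts = {("a",): {"b": 2}, ("b",): {"a": 2}}

h_unigram = conditional_entropy(unigram_counts)  # no history
h_bigram = conditional_entropy(bigram_counts)    # one event of history
```

In this toy case one event of history removes all uncertainty (1 bit without history, 0 bits with it). A large drop from one order to the next suggests the traffic is strongly history-dependent at that order, while a small drop suggests the extra context adds little.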

Related Work
• Host-based and Network-based Intrusion Detection Systems (NIDS)
  • Signature-based anomaly detection – manual analysis
  • Packet-based or with context – detect known vulnerabilities and behaviors
• Formal verification of protocols – require extensive protocol knowledge; do not account for implementation variations
• Scrubbers and normalizers remove TCP/IP ambiguities – do not account for application-layer ambiguities and must make tradeoffs concerning removing ambiguities that change semantics or lead to performance loss
• Fuzzing/fault injection – random generation of inputs for vulnerability detection – generates invalid sequences
