Weighting versus Pruning in Rule Validation for Detecting Network and Host Anomalies

  • Gaurav Tandon
  • (joint work with Philip K. Chan)
  • Center for Computation and Intelligence
  • Department of Computer Sciences
  • Florida Institute of Technology
  • Melbourne, Florida 32901.
  • gtandon@fit.edu
Outline
  • Intrusion detection systems taxonomy
  • Aspects of rule quality
  • Rule pruning and weighting
  • Weight update methods
  • Experimental evaluation and results
  • Summary

Intrusion Detection Systems
  • Signature Detection
    • Model “known attacks”
    • Advantage: Accuracy
    • Disadvantage: Unable to detect novel attacks
  • Anomaly Detection
    • Model “normal behavior”
    • Advantage: Detecting new attacks
    • Disadvantage: False alarms
  • Machine learning for Anomaly Detection
    • Training from normal data only
    • “One-class” learning

Learning Rules for Anomaly Detection (LERAD)
  • LERAD (Mahoney and Chan, ICDM 2003)
    • Learns rules of the form: if A = a and B = b then X ∈ {x1, x2}
    • A, B, and X are attributes
    • a, b, x1, x2 are values for the corresponding attributes (see the sketch below)
  • Anomaly Score
    • Abnormal events: degree of anomaly
    • Normal events: zero
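As a concrete illustration, a rule such as “if SrcIP=128.1.2.3 and DestIP=128.4.5.6 then DestPort ∈ {80, …}” can be held in a small record. Below is a hypothetical Python sketch (the names `Rule`, `applies`, and `conforms` are mine, not from the paper); the later sketches in this deck reuse it.

```python
from dataclasses import dataclass, field

@dataclass
class Rule:
    antecedent: dict          # e.g. {"SrcIP": "128.1.2.3", "DestIP": "128.4.5.6"}
    consequent_attr: str      # e.g. "DestPort"
    consequent: set = field(default_factory=set)  # observed values, e.g. {80, ...}
    n: int = 0                # instances that satisfied the antecedent
    w: float = 1.0            # rule belief (used by the weighting variants)
    last_anomaly_time: float = 0.0  # bookkeeping for the t factor in scoring

    def applies(self, inst: dict) -> bool:
        # The antecedent holds when every attribute=value condition matches.
        return all(inst.get(a) == v for a, v in self.antecedent.items())

    def conforms(self, inst: dict) -> bool:
        # A matching instance conforms if its consequent value was seen before.
        return inst.get(self.consequent_attr) in self.consequent
```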

Aspects of Rule Quality
  • Predictiveness
    • Measure of accuracy of consequent given antecedent
    • P (consequent | antecedent)
    • Examples: RIPPER, C4.5 rules
  • Belief
    • Measure of trust for entire rule
    • Example: Weights in ensemble methods, boosting

Predictiveness vs. Belief for a LERAD Rule
  • Predictiveness: p
    • P(not consequent | antecedent)
  • Belief: w
    • Weight for the entire rule

Motivation and Problem Statement
  • Rule Pruning
    • Reduce overfitting
  • Rule Weighting
    • Use “belief” to combine predictions
  • Previous studies:
    • Pruning vs. no-pruning
    • Weighting vs. non-weighting
  • Current work:
    • Pruning vs. weighting

Overview of LERAD
  • Generate candidate rules from a small training sample
  • Perform coverage test to minimize the rule set
  • Update rules with the entire training set
  • Validate rules on a separate validation set

Anomaly Score
    • p = r / n: probability of observing a value not in the consequent (Witten and Bell, 1991)
    • r: cardinality of the set {x1, x2, …} in the consequent
    • n: number of instances that satisfy the antecedent
  • Anomaly score = 1/p = n/r
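A minimal sketch of this estimate and the per-rule score (function names are mine):

```python
def novel_value_probability(r: int, n: int) -> float:
    # Witten-Bell estimate: r distinct consequent values observed over n
    # instances matching the antecedent gives p = r / n.
    return r / n

def rule_surprise(r: int, n: int) -> float:
    # A violation contributes 1/p = n/r, so high-coverage, low-variety
    # rules (large n, small r) yield the largest anomaly scores.
    return n / r
```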

Revisit Validation Step
  • Generate candidate rules from a small training sample
  • Perform coverage test to minimize the rule set
  • Update rules with the entire training set
  • Validate rules on a separate validation set

Rule Pruning

[Diagram: a rule set r1 … r10, learned from normal training data, is validated against normal validation data; each rule either conforms or is violated]

  • Conformed rules are kept
  • Violated rules are pruned (false alarms)
Rule Pruning
  • Given a rule and a data instance, three cases apply:
    • rule conformed
    • rule violated
    • rule inapplicable – no changes

Case 1 - Rule Conformed (Rule Pruning)
  • Rule: if SrcIP=128.1.2.3 and DestIP=128.4.5.6 then DestPort ∈ {80, …} (r = 3, n = 100, so p = 3/100)
  • Data instance:

<SrcIP=128.1.2.3, DestIP=128.4.5.6, DestPort=80>

  • Updated rule:
    • Consequent - no changes
    • n = 101, so p = 3/101

Case 2 - Rule Violated (Rule Pruning)
  • Rule: as above, if SrcIP=128.1.2.3 and DestIP=128.4.5.6 then DestPort ∈ {80, …}
  • Data instance:

< SrcIP=128.1.2.3, DestIP=128.4.5.6, DestPort=23 >

  • Updated rule set:
    • Any rule violation is a false alarm - the rule is removed (see the validation sketch below)
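Putting the three cases together, a sketch of the pruning pass, reusing the hypothetical Rule class from earlier:

```python
def validate_with_pruning(rules, validation_data):
    # Validate each rule against normal data; any violation prunes the rule.
    kept = []
    for rule in rules:
        violated = False
        for inst in validation_data:
            if not rule.applies(inst):
                continue                # case 3: inapplicable, no change
            if rule.conforms(inst):
                rule.n += 1             # case 1: consequent unchanged, p = r/n shrinks
            else:
                violated = True         # case 2: any violation is a false alarm
                break
        if not violated:
            kept.append(rule)           # violated rules are discarded outright
    return kept
```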

LERAD Rule Generation
  • Generate candidate rules from a small training sample
  • Perform coverage test to minimize the rule set
  • Update rules with the entire training set
  • Validate rules on a separate validation set

Coverage and Rule Pruning
  • Minimal set of rules to cover the training set
  • Each rule has large coverage on training set
  • Pruning reduces coverage
  • Potentially miss detections
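For intuition, a coverage test in the greedy set-cover spirit might look like the sketch below; this form is my assumption, and LERAD's actual selection criteria may differ.

```python
def coverage_test(candidates, sample):
    # Greedily keep high-coverage rules that cover not-yet-covered instances.
    covered = set()
    kept = []
    for rule in sorted(candidates, key=lambda r: r.n, reverse=True):
        newly = {i for i, inst in enumerate(sample) if rule.applies(inst)} - covered
        if newly:                 # keep only rules that add new coverage
            kept.append(rule)
            covered |= newly
    return kept
```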

LERAD Rule Generation
  • Generate candidate rules from a small training sample
  • Perform coverage test to minimize the rule set
  • Update rules with the entire training set
  • Validate rules on a separate validation set

Rule Weighting

[Diagram: a weighted rule set (r1, w1) … (r10, w10), learned from normal training data, is validated against normal validation data; each rule either conforms or is violated]

  • Weight increases for conformed rules
  • Weight decreases for violated rules (false alarms)

Case 1 - Rule Conformed (Rule Weighting)
  • Rule: if SrcIP=128.1.2.3 and DestIP=128.4.5.6 then DestPort ∈ {80, …} (r = 3, n = 100, weight w)
  • Data instance:

<SrcIP=128.1.2.3, DestIP=128.4.5.6, DestPort=80>

  • Updated rule:
    • Consequent - no change
    • p = 3/101
    • Weight increased: w → w′

Case 2 - Rule Violated (Rule Weighting)
  • Rule: as above
  • Data instance:

<SrcIP=128.1.2.3, DestIP=128.4.5.6, DestPort=23>

  • Updated rule:
    • Consequent: add DestPort value 23
    • p = 4/101
    • Weight decreased: w → w′ (rule retained; see the sketch below)
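A sketch of the weighting pass under the same three cases, with the update rule left pluggable (the three methods that follow supply `increase` and `decrease`); hypothetical code, reusing the earlier Rule class:

```python
def validate_with_weighting(rules, validation_data, increase, decrease):
    # No rule is ever removed; belief weights move up or down instead.
    for rule in rules:
        for inst in validation_data:
            if not rule.applies(inst):
                continue                                   # inapplicable: no change
            rule.n += 1
            if rule.conforms(inst):
                rule.w = increase(rule.w)                  # case 1: belief up
            else:
                rule.consequent.add(inst[rule.consequent_attr])  # extend consequent (r grows)
                rule.w = decrease(rule.w)                  # case 2: belief down
    return rules
```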

Anomaly Score
  • Rule Pruning:
    • rule predictiveness only: Score = Σ t/p over violated rules
  • Rule Weighting:
    • rule predictiveness and rule belief: Score = Σ w · t/p over violated rules

where t is the time elapsed since the rule's last anomaly, p = r/n, and w is the rule weight
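A reconstruction of the scoring step as code (the `time` field on instances and the `last_anomaly_time` bookkeeping are my assumptions):

```python
def anomaly_score(rules, inst, weighted=False):
    # Each violated rule contributes t/p; the weighting variant
    # additionally scales the contribution by the rule's belief w.
    score = 0.0
    for rule in rules:
        if rule.applies(inst) and not rule.conforms(inst):
            p = len(rule.consequent) / rule.n           # p = r / n
            t = inst["time"] - rule.last_anomaly_time   # time since last anomaly
            score += (rule.w if weighted else 1.0) * t / p
            rule.last_anomaly_time = inst["time"]
    return score
```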

Weighting Method 1: Winnow-specialist
  • Rule k with weight wk
  • Decrease weight on violation: wk ← β · wk
  • Increase weight on conformance: wk ← α · wk

where α > 1 and 0 < β < 1

  • 2 parameters (α and β)
  • Sum of rewards might not equal sum of penalties
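A sketch of these multiplicative updates with illustrative constants (the paper's actual α and β values are not shown here):

```python
def winnow_increase(w, alpha=1.01):
    # Promotion on conformance; alpha > 1 (value chosen for illustration).
    return alpha * w

def winnow_decrease(w, beta=0.5):
    # Demotion on violation; 0 < beta < 1 (value chosen for illustration).
    return beta * w

# Plugged into the weighting pass sketched earlier:
# validate_with_weighting(rules, val_data, winnow_increase, winnow_decrease)
```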

Weighting Method 2: Equal Reward Apportioning
  • Weight sum does not change
    • Total reward = Total Penalty (TP)
  • Violated rules: each loses a fixed fraction of its weight
  • Conformed rules: each gains TP / Nc
    • where Nc is the number of conformed rules
  • 1 parameter (the penalty fraction)
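A sketch of one validation-step update under this scheme; the exact penalty form is my assumption, and only the conserved weight sum is taken from the slide:

```python
def equal_reward_update(conformed, violated, penalty_frac=0.1):
    # Violated rules each lose a fraction of weight (the one parameter);
    # the total penalty TP is split equally among the Nc conformed rules,
    # so the overall weight sum stays constant.
    tp = 0.0
    for rule in violated:
        loss = penalty_frac * rule.w
        rule.w -= loss
        tp += loss
    if conformed:
        for rule in conformed:
            rule.w += tp / len(conformed)   # TP / Nc each
```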

Weighting Method 3: Weight of Evidence
  • Each rule's weight is set to its weight of evidence, derived from the rule's behavior on the validation set
  • Subset of pruned rules kept
    • Only rules with negative weight of evidence are removed
  • 0 parameters
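The original formula was on a slide image; the following is only a speculative stand-in that matches the stated removal property (negative when violations dominate), not the paper's definition:

```python
import math

def weight_of_evidence(conforms: int, violates: int) -> float:
    # Log-odds of conforming vs. violating on the validation set,
    # with add-one smoothing to avoid division by zero.
    return math.log((conforms + 1) / (violates + 1))
```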

Empirical Evaluation

Experimental Data

  • Network
    • IDEVAL-TCP, IDEVAL-PKT, IDEVAL-COMB, UNIV-TCP, UNIV-PKT, UNIV-COMB
  • Host
    • IDEVAL-BSM, UNM, FIT-UTK

Evaluation Criteria

  • AUC: area under the ROC curve
  • Computed up to 0.1% and 1% False Alarm (FA) rates
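For reference, a partial AUC of this kind can be computed with scikit-learn's `max_fpr` option (note it returns the standardized McClish value, which may differ in scale from the paper's numbers):

```python
from sklearn.metrics import roc_auc_score

def auc_up_to_fa(y_true, scores, fa_rate):
    # Area under the ROC curve truncated at the given false-alarm rate,
    # e.g. fa_rate=0.001 (0.1%) or fa_rate=0.01 (1%).
    return roc_auc_score(y_true, scores, max_fpr=fa_rate)
```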

Analysis of new attacks detected by rule weighting
  • New detections are due to higher anomaly scores, from two sources:
    1) Increased weights of conformed rules (kept by both pruning and weighting)
      • 2 new detections
    2) Decreased weights of violated rules (removed by pruning but retained by weighting)
      • 18 new detections

Overhead
  • Training time
    • Avg. increase: 2.9%
  • Testing (detection) time
    • Avg. increase: 0.8%
  • Number of rules in rule set
    • Avg. increase: 2.9%

Summary
  • Proposed weights representing rule belief for anomaly detection
  • Presented three weighting schemes
  • Compared pruning and weighting variants of LERAD on several network and host data sets
  • Weighting detects more attacks at low false alarm rates than pruning
  • Most new attacks were detected by violated rules that pruning discards
  • Weighting has higher memory and time requirements than pruning, but remains feasible for an online system

Thank You

Poster # 2 tonight

Questions/Comments?
