Evolving Insider Threat Detection

Pallabi Parveen

Dr. Bhavani Thuraisingham (Advisor)

Dept of Computer Science

University of Texas at Dallas

Funded by AFOSR


Outline

  • Evolving Insider Threat Detection
    • Unsupervised Learning
    • Supervised Learning


Evolving Insider Threat Detection

[Architecture diagram: system traces (system logs) gathered from week i go through feature extraction & selection and into an online learning algorithm — unsupervised graph-based anomaly detection (GBAD) or supervised one-class SVM (OCSVM) — which updates an ensemble of models via ensemble-based stream mining; data from week i+1 is tested against the ensemble to flag anomalies.]



Outline: Unsupervised Learning

  • Insider Threat on Graphs
  • Related Work
  • Proposed Method
  • Experiments & Results


Definition of an Insider

An insider is someone who exploits, or has the intention to exploit, their legitimate access to assets for unauthorized purposes.


Insider Threat Is a Real Threat


Insider Threat (continued)

  • Countering insider threats on graphs:
    • Detection
    • Prevention
  • Detection-based approach:
    • Unsupervised learning: graph-based anomaly detection
    • Ensemble-based stream mining


Related Work

  • "Intrusion Detection Using Sequences of System Calls" — supervised learning, by Hofmeyr et al.
  • "Mining for Structural Anomalies in Graph-Based Data Representations (GBAD) for Insider Threat Detection" — unsupervised learning, by Eberle and Holder
  • All are static in nature and cannot learn from an evolving data stream.



Why Unsupervised Learning?

  • One approach to detecting insider threats is supervised learning, where models are built from training data.
  • Approximately 0.03% of the training data is associated with insider threats (minority class), while 99.97% is associated with non-threat activity (majority class).
  • Unsupervised learning is an alternative that does not require labeled threat instances.


Why Stream Mining?

[Diagram: as the data stream arrives chunk by chunk, the decision boundary drifts; instances that were normal under the previous boundary may fall on the anomalous side of the current boundary (concept drift), so a static model misclassifies them.]


Proposed Method

Graph-based anomaly detection (GBAD, unsupervised learning) [2]
+
Ensemble-based stream mining


GBAD Approach: Unsupervised Pattern Discovery

Graph compression and the minimum description length (MDL) principle:

  • The best graphical pattern S minimizes the description length of S plus the description length of the graph G compressed with pattern S, i.e., an MDL heuristic that minimizes:

    M(S, G) = DL(G|S) + DL(S)

  • where description length DL(S) is the minimum number of bits needed to represent S (SUBDUE)
  • Compression can be based on inexact matches to the pattern

[Diagram: a graph whose repeated substructures S1 and S2 form the normative patterns.]
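To make the MDL scoring concrete, here is a toy sketch. The encoding below is a deliberately crude stand-in (real GBAD/SUBDUE uses a much more elaborate graph encoding); `dl` and `mdl_score` are hypothetical helpers, not part of GBAD.

```python
# Toy illustration of M(S, G) = DL(G|S) + DL(S): a pattern that repeats
# often compresses the graph more, so it gets a lower (better) score.
import math

def dl(num_vertices, num_edges):
    """Crude description length of a graph: bits for vertices plus edges."""
    return (num_vertices + num_edges) * math.log2(max(num_vertices, 2))

def mdl_score(pattern, graph, instances):
    """Cost of the pattern plus the graph compressed by replacing each
    pattern instance with a single vertex."""
    pv, pe = pattern          # pattern vertices / edges
    gv, ge = graph            # graph vertices / edges
    cv = gv - instances * (pv - 1)   # vertices after compression
    ce = ge - instances * pe         # edges after compression
    return dl(cv, ce) + dl(pv, pe)

whole = (30, 40)
frequent = mdl_score(pattern=(3, 3), graph=whole, instances=8)
rare = mdl_score(pattern=(3, 3), graph=whole, instances=1)
assert frequent < rare    # the frequent substructure is the normative one
```

The pattern with the lowest score becomes the normative substructure; anomalies are the instances that deviate from it.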


Three Types of Anomalies

  • GBAD-MDL finds anomalous modifications
  • GBAD-P (Probability) finds anomalous insertions
  • GBAD-MPS (Maximum Partial Substructure) finds anomalous deletions


Example: Graph with Normative Pattern and Different Types of Anomalies

[Diagram: a graph in which the normative structure (vertices labeled A, B, C, D, G) repeats; one instance contains an extra vertex E (GBAD-P, insertion), one is missing a vertex (GBAD-MPS, deletion), and one has a changed vertex (GBAD-MDL, modification).]


Proposed Method (recap)

Graph-based anomaly detection (GBAD, unsupervised learning)
+
Ensemble-based stream mining


Characteristics of a Data Stream

  • Continuous flow of data
  • Examples: network traffic, sensor data, call center records


Data Stream Classification

  • Single-model incremental classification
  • Ensemble-model-based classification
    • The ensemble-based approach is more effective than the incremental approach.


Ensemble of Classifiers

[Diagram: an unlabeled input x is classified by each of C1, C2, C3; the individual outputs (+, +, -) are combined by majority voting to produce the ensemble output (+).]
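The voting scheme in the diagram can be sketched in a few lines. This is a minimal illustration (the model functions and thresholds are invented for the example), assuming each model returns +1 for normal and -1 for anomaly:

```python
# Majority voting over per-model verdicts: flag an anomaly only when more
# than half of the ensemble members flag it.
def ensemble_predict(models, x):
    votes = [m(x) for m in models]
    return -1 if votes.count(-1) > len(votes) // 2 else +1

# Three toy models with different anomaly thresholds (illustrative only).
models = [lambda x: -1 if x > 5 else +1,
          lambda x: -1 if x > 7 else +1,
          lambda x: -1 if x > 9 else +1]

assert ensemble_predict(models, 8) == -1   # two of three flag it
assert ensemble_predict(models, 6) == +1   # only one flags it
```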


Proposed Ensemble-based Insider Threat Detection (EIT)

  • Maintain K GBAD models, each with q normative patterns
  • Majority voting
  • Updated ensembles:
    • Always maintain K models
    • Drop the least accurate model


Ensemble-based Classification of Data Streams (Unsupervised Learning — GBAD)

  • Build a model (with q normative patterns) from each data chunk
  • Keep the best K such models as the ensemble
  • Example: K = 3

[Diagram: data chunks D1…D5 each yield a model with normative patterns C1…C5; the best K = 3 models form the ensemble used for prediction on the testing chunk, and the ensemble is updated as each new chunk (D6) arrives.]


EIT-U Pseudocode

Ensemble(Ensemble A, test graph t, chunk S):
  // LABEL/TEST THE NEW MODEL
  1:  Compute new model with q normative substructures using GBAD from S
  2:  Add new model to A
  3:  for each model M in A
  4:    for each class/normative substructure q in M
  5:      Results1 ← Run GBAD-P with test graph t and q
  6:      Results2 ← Run GBAD-MDL with test graph t and q
  7:      Results3 ← Run GBAD-MPS with test graph t and q
  8:      Anomalies ← ParseResults(Results1, Results2, Results3)
      end for
      end for
  9:  for each anomaly N in Anomalies
  10:   if more than half of the models agree
  11:     AgreedAnomalies ← N
  12:   Add 1 to the incorrect count of the disagreeing models
  13:   Add 1 to the correct count of the agreeing models
      end for
  // UPDATE THE ENSEMBLE
  14: Remove the model with the lowest correct/(correct + incorrect) ratio
End Ensemble
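Steps 9-14 of the pseudocode above can be sketched as follows. This is a hedged sketch, not the authors' implementation: the `Model` bookkeeping class and `update_ensemble` helper are invented names, and the per-anomaly votes are assumed to be boolean "flagged?" verdicts.

```python
# Sketch of the EIT-U accuracy bookkeeping and ensemble pruning:
# models that agree with the majority gain credit, dissenters lose it,
# and the least accurate model is dropped to keep the ensemble at size k.
class Model:
    def __init__(self, name):
        self.name, self.correct, self.incorrect = name, 0, 0

    def ratio(self):
        total = self.correct + self.incorrect
        return self.correct / total if total else 0.0

def update_ensemble(models, k, votes_per_anomaly):
    for votes in votes_per_anomaly:           # votes: {model: flagged?}
        majority = sum(votes.values()) > len(votes) // 2
        for m, flagged in votes.items():
            if flagged == majority:
                m.correct += 1                # step 13
            else:
                m.incorrect += 1              # step 12
    models.sort(key=lambda m: m.ratio(), reverse=True)
    return models[:k]                         # step 14: drop worst model

a, b, c = Model("a"), Model("b"), Model("c")
votes = [{a: True, b: True, c: False}]        # a and b agree; c dissents
kept = update_ensemble([a, b, c], 2, votes)
assert all(m.name != "c" for m in kept)       # least accurate model dropped
```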


Experiments

  • 1998 MIT Lincoln Laboratory dataset
  • 500,000+ vertices
  • K = 1, 3, 5, 7, 9 models
  • q = 5 normative substructures per model/chunk
  • 9 weeks of data; each chunk covers 1 week


A Sample System Call Record from the MIT Lincoln Dataset

header,150,2,execve(2),,Fri Jul 31 07:46:33 1998, + 652468777 msec
path,/usr/lib/fs/ufs/quota
attribute,104555,root,bin,8388614,187986,0
exec_args,1,
/usr/sbin/quota
subject,2110,root,rjm,2110,rjm,280,272,0-0-172.16.112.50
return,success,0
trailer,150


Token Sub-graph

[Diagram: sub-graph constructed from the tokens of a system call record.]


Performance

Total Ensemble Accuracy [chart]


Performance (continued)

  • 0 false negatives
  • Significant decrease in false positives as the number of models increases
    • False positives decrease slowly after K = 3


Performance (continued)

Distribution of False Positives [chart]


Performance (continued)

Summary of Datasets A & B [table]


Performance (continued)

[Charts: the effect of q on TP rates, FP rates, and runtime for fixed K = 6 on dataset A; true positives vs. number of normative substructures for fixed K = 6 on dataset A; the effect of K on TP rates and runtime for fixed q = 4 on dataset A.]



Outline: Supervised Learning

  • Related Work
  • Proposed Method
  • Experiments & Results



Why One-Class SVM?

  • Insider threat data is a minority class.
  • Traditional support vector machines (SVMs) trained on such an imbalanced dataset are likely to perform poorly on test data, especially on the minority class.
  • One-class SVMs (OCSVM) address the rare-class issue by building a model from only normal (i.e., non-threat) data.
  • During the testing phase, test data is classified as normal or anomalous based on geometric deviation from the model.


Proposed Method

One-class SVM (OCSVM, supervised learning)
+
Ensemble-based stream mining


One-Class SVM (OCSVM)

  • Maps training data into a high-dimensional feature space (via a kernel).
  • Iteratively finds the maximal-margin hyperplane that best separates the training data from the origin, corresponding to the classification rule:

    f(x) = <w, x> + b

    where w is the normal vector and b is a bias term.

  • For testing: if f(x) < 0, label x as an anomaly; otherwise, label it as normal.
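As a concrete illustration, scikit-learn's `OneClassSVM` implements this scheme; the sketch below trains on synthetic "normal" data only and checks the sign of the decision function on an outlier. The data and the `nu`/`gamma` values are illustrative, not the settings used in the experiments.

```python
# One-class SVM sketch: train on normal data only, then label a test point
# as an anomaly when the decision function f(x) is negative.
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
normal = rng.normal(0.0, 1.0, size=(200, 2))   # non-threat training data only
clf = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale").fit(normal)

far_point = np.array([[8.0, 8.0]])             # far outside the training cloud
assert clf.decision_function(far_point)[0] < 0  # f(x) < 0 => anomaly
assert clf.predict(far_point)[0] == -1          # -1 encodes "anomaly"
```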


Proposed Ensemble-based Insider Threat Detection (EIT)


Ensemble-based Classification of Data Streams (Supervised Learning)

  • Divide the data stream into equal-sized chunks
    • Train a classifier from each data chunk
    • Keep the best K OCSVM classifiers as the ensemble
    • Example: K = 3
  • Addresses infinite length and concept drift

[Diagram: labeled chunks D1…D5 yield classifiers C1…C5; the best K = 3 classifiers form the ensemble that predicts on the unlabeled chunk, and the ensemble is updated as each new chunk (D6) arrives.]


EIT-S Pseudocode (Testing)

Algorithm 1: Testing
Input:  A ← Build-initial-ensemble()
        Du ← latest chunk of unlabeled instances
Output: prediction/label of Du
  1: Fu ← Extract&Select-Features(Du)   // feature set for Du
  2: for each xj ∈ Fu do
  3:   Results ← NULL
  4:   for each model M in A
  5:     Results ← Results ∪ Prediction(xj, M)
     end for
  6:   Anomalies ← MajorityVoting(Results)
     end for


EIT-S Pseudocode (Ensemble Update)

Algorithm 2: Updating the classifier ensemble
Input:  Dn: the most recently labeled data chunk,
        A: the current ensemble of the best K classifiers
Output: an updated ensemble A
  1: for each model M ∈ A do
  2:   Test M on Dn and compute its expected error
  3: end for
  4: Mn ← newly trained one-class SVM classifier (OCSVM) from data Dn
  5: Test Mn on Dn and compute its expected error
  6: A ← best K classifiers from Mn ∪ A based on expected error
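The selection step of Algorithm 2 reduces to ranking the old ensemble plus the new model by expected error and keeping the top K. A minimal sketch (the `error_fn` callback and model names are hypothetical stand-ins for "test M on Dn and compute its expected error"):

```python
# Algorithm 2 sketch: candidate pool = current ensemble + newly trained
# model; keep the K candidates with the lowest expected error on the
# latest labeled chunk.
def update_ensemble(ensemble, new_model, labeled_chunk, k, error_fn):
    pool = ensemble + [new_model]
    pool.sort(key=lambda m: error_fn(m, labeled_chunk))
    return pool[:k]

# Toy usage: pretend errors were measured on the labeled chunk.
errors = {"m1": 0.2, "m2": 0.1, "m3": 0.4, "m_new": 0.15}
best = update_ensemble(["m1", "m2", "m3"], "m_new", None, 3,
                       lambda m, _chunk: errors[m])
assert best == ["m2", "m_new", "m1"]   # m3, the worst model, is dropped
```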


Feature Set Extracted

Time, userID, machine IP, command, argument, path, return

Example (sparse format): 1 1:29669 6:1 8:1 21:1 32:1 36:0
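The example line above uses the sparse "index:value" encoding common to SVM tools (a label followed by feature-index/value pairs). A minimal parser for that format, as a sketch (the helper name is invented):

```python
# Parse a sparse "label idx:val idx:val ..." feature line into a label
# and a {feature_index: value} dictionary.
def parse_sparse(line):
    parts = line.split()
    label = int(parts[0])
    feats = {}
    for token in parts[1:]:
        idx, val = token.split(":")
        feats[int(idx)] = float(val)
    return label, feats

label, feats = parse_sparse("1 1:29669 6:1 8:1 21:1 32:1 36:0")
assert label == 1
assert feats[1] == 29669.0 and feats[36] == 0.0   # absent indices are implicit zeros
```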


Performance

[charts]


Performance (continued)

Updating vs. non-updating stream approach [chart]


Performance (continued)

Supervised (EIT-S) vs. Unsupervised (EIT-U) Learning — Summary on Dataset A [table]


Conclusion & Future Work

Conclusion:
Evolving insider threat detection using
  • Stream mining
  • Unsupervised and supervised learning

Future Work:
  • Misuse detection on mobile devices
  • Cloud computing to improve processing time


Publications

Conference Papers:
  • Pallabi Parveen, Jonathan Evans, Bhavani Thuraisingham, Kevin W. Hamlen, Latifur Khan, "Insider Threat Detection Using Stream Mining and Graph Mining," in Proc. of the Third IEEE International Conference on Information Privacy, Security, Risk and Trust (PASSAT 2011), October 2011, MIT, Boston, USA (full paper acceptance rate: 13%).
  • Pallabi Parveen, Zackary R. Weger, Bhavani Thuraisingham, Kevin Hamlen, and Latifur Khan, "Supervised Learning for Insider Threat Detection Using Stream Mining," to appear in 23rd IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2011), Nov. 7-9, 2011, Boca Raton, Florida, USA (acceptance rate: 30%).
  • Pallabi Parveen, Bhavani M. Thuraisingham, "Face Recognition Using Multiple Classifiers," ICTAI 2006, 179-186.

Journal:
  • Jeffrey Partyka, Pallabi Parveen, Latifur Khan, Bhavani M. Thuraisingham, Shashi Shekhar, "Enhanced geographically typed semantic schema matching," J. Web Sem. 9(1): 52-70 (2011).

Others:
  • Neda Alipanah, Pallabi Parveen, Sheetal Menezes, Latifur Khan, Steven Seida, Bhavani M. Thuraisingham, "Ontology-driven query expansion methods to facilitate federated queries," SOCA 2010, 1-8.
  • Neda Alipanah, Piyush Srivastava, Pallabi Parveen, Bhavani M. Thuraisingham, "Ranking Ontologies Using Verified Entities to Facilitate Federated Queries," Web Intelligence 2010: 332-337.


References

  • W. Eberle and L. Holder, "Anomaly Detection in Data Represented as Graphs," Intelligent Data Analysis, Volume 11, Number 6, 2007. http://ailab.wsu.edu/subdue
  • Ling Chen, Shan Zhang, Li Tu, "An Algorithm for Mining Frequent Items on Data Stream Using Fading Factor," COMPSAC (2) 2009: 172-177.
  • S. A. Hofmeyr, S. Forrest, and A. Somayaji, "Intrusion Detection Using Sequences of System Calls," Journal of Computer Security, vol. 6, pp. 151-180, 1998.
  • M. Masud, J. Gao, L. Khan, J. Han, B. Thuraisingham, "A Practical Approach to Classify Evolving Data Streams: Training with Limited Amount of Labeled Data," Int. Conf. on Data Mining, Pisa, Italy, December 2010.


Thank You