

Presented by: Sandeep

Dept of Computer & Information Sciences

University of Delaware

Detection of unknown computer worms based on behavioral classification of the host

Robert Moskovitch, Yuval Elovici, Lior Rokach


Worms

  • Worms are considered malicious in nature.

  • Worms propagate actively over a network, whereas other types of malicious code, such as viruses, commonly require human activity to propagate.

  • A virus infects a file (its host); a worm does not require a host file.

What Do Antivirus Packages Do?

  • Antivirus software packages inspect each file that enters the system, looking for known signatures which uniquely identify an instance of known malcode.

  • Polymorphism and metamorphism are two common obfuscation techniques used by malware.

  • A polymorphic virus obfuscates its decryption loop using several transformations, such as nop-insertion and code transposition.
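
To make the limitation of signature matching concrete, here is a minimal sketch of a byte-signature scanner and a toy nop-insertion transform. The signature name, its bytes, and both helper functions are hypothetical and chosen only for illustration; a real obfuscator inserts NOPs between whole instructions, whereas this toy works on raw bytes.

```python
# Hypothetical signature database; the bytes are illustrative, not a real signature.
SIGNATURES = {"demo-malcode": bytes.fromhex("59e3108bf1b801d6")}

NOP = 0x90  # x86 single-byte no-op instruction


def scan(data: bytes) -> list:
    """Return the names of all known signatures found verbatim in data."""
    return [name for name, sig in SIGNATURES.items() if sig in data]


def insert_nops(code: bytes, every: int = 2) -> bytes:
    """Toy nop-insertion: append a NOP after every `every` bytes of code."""
    out = bytearray()
    for position, byte in enumerate(code, start=1):
        out.append(byte)
        if position % every == 0:
            out.append(NOP)
    return bytes(out)


original = b"\x00\x01" + SIGNATURES["demo-malcode"] + b"\x02"
morphed = b"\x00\x01" + insert_nops(SIGNATURES["demo-malcode"]) + b"\x02"

print(scan(original))  # ['demo-malcode']
print(scan(morphed))   # []
```

The morphed bytes no longer contain the signature as a contiguous run, so the exact-match scan misses them.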

Obfuscation Techniques

  • Metamorphic viruses attempt to evade detection by obfuscating the entire virus. When they replicate, these viruses change their code in a variety of ways, such as code transposition, substitution of equivalent instruction sequences, change of conditional jumps, and register reassignment.



  • Virus code and morphed virus code (from Chernobyl CIH 1.4):

    Virus code              Morphed virus code      Morphed virus code
    ----------------------  ----------------------  ----------------------
    Loop: pop ecx           Loop: pop ecx           Loop: pop ecx
          jecxz SFModMark         nop                     nop
          mov esi, ecx            jecxz SFModMark         jmp L1
          mov eax, 0d601h         xor ebx, ebx      L3:   call edi
          pop edx                 beqz N1                 xor ebx, ebx
          pop ecx           N1:   mov esi, ecx            beqz N2
          call edi                nop               N2:   jmp Loop
                                  mov eax, 0d601h         jmp L4
                                  pop edx           L2:   nop
                                  pop ecx                 mov eax, 0d601h
                                  nop                     pop edx
                                  call edi                pop ecx
                                  xor ebx, ebx            nop
                                  beqz N2                 jmp L3
                            N2:   jmp Loop          L1:   jecxz SFModMark
                                                          xor ebx, ebx
                                                          beqz N1
                                                    N1:   mov esi, ecx
                                                          jmp L2
                                                    L4:

Current Methods

  • Existing methods rely on analysis of the binary to detect unknown malcode.

  • Some less typical worms remain undetected, so an additional detection layer at runtime is required.

Proposed Approach

  • Malicious actions are reflected in the general behavior of the host. By monitoring the host, one can implicitly identify malcode.

  • A classifier is trained with computer measurements from infected and uninfected computers.


Contributions of the Paper

  • Machine learning techniques are capable of detecting and classifying worms.

  • Feature selection techniques show that a relatively small set of features is sufficient for solving the problem without sacrificing accuracy.

  • Empirical results from an extensive study of various machine configurations suggest that the proposed methods achieve high detection rates on previously unseen worms.

Dataset creation

  • The lab network consisted of seven computers with heterogeneous hardware and a server computer simulating the Internet.

  • Windows performance counters and Vtrace were used to monitor system features.

  • A vector of 323 features was recorded every second.

  • The worms were chosen from among the available worms so that they differ in their behavior.

Feature selection methods

  • Chi-Square

  • Gain Ratio

  • Relief

  • Features’ ensemble: fi is a feature, and filter is one of the k filtering (feature selection) methods.
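
The ensemble formula itself is not in the transcript. As a rough illustration only, the sketch below assumes the ensemble orders features by their average rank across the k filters, which is one common way to combine filter methods and may differ from the authors' definition. The feature names and scores are made up.

```python
from statistics import mean

# Hypothetical filter scores (higher = more relevant); all values are invented.
filter_scores = {
    "chi_square": {"f1": 12.0, "f2": 3.5, "f3": 8.1, "f4": 0.4},
    "gain_ratio": {"f1": 0.30, "f2": 0.45, "f3": 0.20, "f4": 0.01},
    "relief": {"f1": 0.08, "f2": 0.02, "f3": 0.09, "f4": 0.00},
}


def ranks(scores):
    """Map each feature to its rank under one filter (1 = best)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {feature: position for position, feature in enumerate(ordered, start=1)}


def ensemble_ranking(all_scores):
    """Order features by their mean rank over all filters (assumed combination rule)."""
    per_filter = [ranks(scores) for scores in all_scores.values()]
    features = list(per_filter[0])
    average = {f: mean(r[f] for r in per_filter) for f in features}
    # Ties are broken by feature name so the result is deterministic.
    return sorted(features, key=lambda f: (average[f], f))


print(ensemble_ranking(filter_scores))  # ['f1', 'f3', 'f2', 'f4']
```

Taking the first n entries of the returned list gives a "top n" feature set under this assumed rule.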

Feature Sets

Classification algorithms

  • Decision Trees

  • Naïve Bayes

  • Bayesian Networks

  • Artificial Neural Networks
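
To show what training a classifier on host measurements might look like, here is a minimal Gaussian Naïve Bayes sketch in plain Python. It is not the authors' implementation. The two features (CPU load and outbound connections per second) and all numbers are hypothetical; the labels "worm" and "none" follow the slides' wording for infected and worm-free activity.

```python
import math
from collections import defaultdict


def train(rows, labels):
    """Estimate a prior and per-feature mean/variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    model = {}
    for label, members in by_class.items():
        n = len(members)
        columns = list(zip(*members))
        means = [sum(col) / n for col in columns]
        # A small floor keeps the variance strictly positive.
        variances = [sum((x - m) ** 2 for x in col) / n + 1e-9
                     for col, m in zip(columns, means)]
        model[label] = (n / len(rows), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest Gaussian log-posterior for row."""
    def log_posterior(prior, means, variances):
        total = math.log(prior)
        for x, m, v in zip(row, means, variances):
            total += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        return total
    return max(model, key=lambda label: log_posterior(*model[label]))


# Hypothetical measurements: [cpu_load, outbound_connections_per_second]
rows = [[0.10, 2], [0.15, 3], [0.12, 1], [0.20, 4],
        [0.60, 40], [0.70, 55], [0.65, 48], [0.80, 60]]
labels = ["none"] * 4 + ["worm"] * 4

model = train(rows, labels)
print(predict(model, [0.68, 50]))  # worm
print(predict(model, [0.11, 2]))   # none
```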

Evaluation measures

Experiment I

  • Each classifier is trained on a single dataset i and tested on each one (j) of the eight datasets. Eight corresponding evaluations were done on each of the datasets, resulting in 64 evaluation runs.

  • When i = j, 10-fold cross-validation is used: the dataset is randomly partitioned into ten partitions, and the classifier is repeatedly trained on nine partitions and tested on the remaining one.
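
The cross-validation procedure described above can be sketched as follows. The function names, the fixed seed, and the `evaluate` callback are assumptions made for illustration; the callback stands in for whatever trains a classifier on the training indices and scores it on the test indices.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Randomly partition range(n_samples) into k folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(n_samples, evaluate, k=10, seed=0):
    """Call evaluate(train_idx, test_idx) once per fold and average the scores."""
    folds = k_fold_indices(n_samples, k, seed)
    scores = []
    for held_out, test_idx in enumerate(folds):
        train_idx = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        scores.append(evaluate(train_idx, test_idx))
    return sum(scores) / k


folds = k_fold_indices(25)
print(sorted(len(fold) for fold in folds))  # [2, 2, 2, 2, 2, 3, 3, 3, 3, 3]
```

Every sample lands in exactly one test fold, so each one is used for testing once and for training nine times.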


Experiment I (Contd)

  • Each evaluation run (out of the 64) was repeated for each combination of feature selection method, classification algorithm, and number of top features.

  • Each evaluation run was repeated for the 33 feature sets described earlier.

  • This gives 132 evaluations (four classification algorithms applied to 33 feature sets), each comprising 64 runs, for a total of 8448 evaluation runs.
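
The run counts quoted above can be checked with a few lines of arithmetic. The constant names are invented for this sketch; the numbers come from the slides.

```python
from itertools import product

N_DATASETS = 8        # eight datasets, each used for training and for testing
N_CLASSIFIERS = 4     # four classification algorithms
N_FEATURE_SETS = 33   # feature sets described earlier

# Every (training dataset i, test dataset j) pair is one evaluation run.
train_test_pairs = list(product(range(N_DATASETS), repeat=2))
evaluations = N_CLASSIFIERS * N_FEATURE_SETS
total_runs = evaluations * len(train_test_pairs)

# Pairs with i == j are the ones evaluated by 10-fold cross-validation.
cross_validated_pairs = sum(1 for i, j in train_test_pairs if i == j)

print(len(train_test_pairs), evaluations, total_runs)  # 64 132 8448
print(cross_validated_pairs)                           # 8
```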

Results

Results (Contd)

Results (Contd)

Experiment II

  • Classifiers were trained on part of the (five) worms and the none activity, and tested on the excluded worms (those left out of the training set) and the none activity.

  • The training set consisted of 5 − k worms and the testing set contained the k excluded worms, while the none activity appeared in both sets.

  • This process was repeated for all possible combinations of the k worms (k = 1–4).

  • The Top20 features, which performed best in Experiment I, were used.
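
The leave-k-worms-out protocol above can be enumerated with `itertools.combinations`. The worm names below are placeholders, since the transcript does not list them, and `splits` is a hypothetical helper.

```python
from itertools import combinations

WORMS = ["w1", "w2", "w3", "w4", "w5"]  # placeholder names for the five worms


def splits(worms, k):
    """Yield (training worms, excluded test worms) for every choice of k excluded worms."""
    for excluded in combinations(worms, k):
        training = [w for w in worms if w not in excluded]
        yield training, list(excluded)


counts = {k: sum(1 for _ in splits(WORMS, k)) for k in range(1, 5)}
print(counts)                # {1: 5, 2: 10, 3: 10, 4: 5}
print(sum(counts.values()))  # 30
```

In each split the none activity would be added to both the training and the test set, as the slide states.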

Results

Conclusion

  • Q1: In the detection of known malicious code, based on a computer’s measurements, using machine learning techniques, what is the achievable level of accuracy?

  • Q2: Is it possible to reduce the number of features to below 30 while maintaining a high level of accuracy?

Conclusions (Contd)

  • Q3: Will the computer configuration and the computer background activity, from which the training sets were taken, have a significant influence on the detection accuracy?

  • Q4: Is the detection of unknown worms possible, based on a training set of known worms?