Detecting software theft via system call based birthmarks
Detecting Software Theft via System Call Based Birthmarks. Xinran Wang, Yoon-Chan Jhi, Sencun Zhu, Peng Liu. ACSAC 2009. Outline: Introduction and Related Work; System Call Based Birthmarks; System Design and Implementation; Evaluation; Discussion and Conclusion.


Presentation Transcript


Detecting Software Theft via System Call Based Birthmarks

Xinran Wang, Yoon-Chan Jhi, Sencun Zhu,

Peng Liu

ACSAC 2009


OUTLINE

  • Introduction and Related Work

  • System Call Based Birthmarks

  • System Design and Implementation

  • Evaluation

  • Discussion and Conclusion


Software Theft (or plagiarism)

  • Reuse someone else’s code

    • Even only a small part of the original program

  • Obfuscation techniques

    • Different compilers

    • Different compiler optimization levels

    • SandMark


Defender

  • Software watermark

    • Theoretically, any watermark can be removed

  • Software birthmark

    • A unique characteristic that a program inherently possesses


Defender (Cont.)

  • Requirements

    • R1: Resiliency to obfuscation techniques

    • R2: Capability to detect theft of components

    • R3: Scalability to large-scale programs

    • R4: Applicability to binary executables

    • R5: Platform independence


Related Work

  • Software Birthmark

    • Static source code based birthmark

    • Static executable code based birthmark

    • Dynamic whole program path (WPP) based birthmark

    • Dynamic API based birthmark

  • Clone Detection

    • String-based, AST-based, token-based, and PDG-based

  • None of these approaches satisfies all five requirements


System Call Based Birthmarks

  • Behavior based birthmarks

    • Unique behaviors in features and implementation details

  • SCSSB (System Call Short Sequence Birthmark)

  • IDSCSB (Input Dependant System Call Subsequence Birthmark)


SCSSB (System Call Short Sequence Birthmark)

  • Definition 1: (System Call Trace)

  • Definition 2: (System Call Sequence Set)
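
The formulas for Definitions 1 and 2 did not survive in this transcript. As a minimal sketch, assuming a trace is the ordered list of system call names recorded for program p on input I, and assuming the sequence set S(p, I, k) is the set of all contiguous length-k windows over that trace, the two definitions might look like this in Python. The function name `sequence_set` and the sample trace are hypothetical.

```python
from typing import Sequence, Set, Tuple


def sequence_set(trace: Sequence[str], k: int) -> Set[Tuple[str, ...]]:
    """Return every contiguous length-k window of a system call trace.

    Assumption: this sliding-window reading of "System Call Sequence Set"
    is an illustration, not the authors' exact definition.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return {tuple(trace[i:i + k]) for i in range(len(trace) - k + 1)}


# Hypothetical trace of system call names, in the order they were observed.
trace = ["open", "read", "write", "read", "write", "close"]
s3 = sequence_set(trace, 3)
```

Using a set discards ordering and repetition across windows, which keeps the result small even for long traces.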


SCSSB (System Call Short Sequence Birthmark)

  • Definition 3: (SCSSB: System Call Short Sequence Birthmark)

    SCSSB(p, I, k) is a subset of set S(p, I, k) that satisfies


SCSSB (System Call Short Sequence Birthmark)

  • Definition 4: (Containment) The containment of A in B is defined as:

    Here A is the birthmark of a plaintiff program or its component, and B is the birthmark of a suspect program.
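
The containment formula itself is missing from this transcript. A common way to measure how much of one set appears in another is |A ∩ B| / |A|, and the sketch below assumes that form; treat it as an illustration rather than the authors' exact definition. The function name `containment` and the sample birthmarks are hypothetical.

```python
from typing import Set, Tuple

Birthmark = Set[Tuple[str, ...]]


def containment(a: Birthmark, b: Birthmark) -> float:
    """Fraction of plaintiff birthmark `a` that also occurs in suspect birthmark `b`.

    Assumption: containment is |A ∩ B| / |A|.
    """
    if not a:
        raise ValueError("plaintiff birthmark must not be empty")
    return len(a & b) / len(a)


# Hypothetical birthmarks built from length-2 system call sequences.
plaintiff = {("open", "read"), ("read", "write"), ("write", "close")}
suspect = {("open", "read"), ("read", "write"), ("read", "close"), ("stat", "open")}
score = containment(plaintiff, suspect)  # 2 of the 3 plaintiff sequences appear in the suspect
```

Under this form the measure is asymmetric: a small plaintiff component can be fully contained in a much larger suspect program and still score 1.0.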


System Design and Implementation

  • System Call Tracer

  • System Call Abstraction

  • Birthmark Generator

  • Input Dependant System Call Subsequence Birthmarks


System Call Tracer

  • The simplest way

    • strace

  • With thread identifier

    • SATracer based on Valgrind

  • Prepare a list of all subroutines of the component in SATracer

    • The list is automatically generated by Elsa

  • SATracer checks the execution stack of the running thread when a system call is called
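
For the simplest option above, strace, the recorded text has to be turned into structured system call records before a birthmark can be built. The sketch below parses lines in the usual `name(args) = return` layout that strace prints, with an optional pid prefix. It is a simplified, hypothetical parser: it skips unfinished or resumed calls and signal lines, and it does not reproduce what SATracer does.

```python
import re
from typing import Iterable, List, NamedTuple


class Call(NamedTuple):
    name: str  # system call name, e.g. "open"
    args: str  # raw argument text as printed by strace
    ret: str   # raw return value text, e.g. "3", "-1", "0x7f..."


# Optional "[pid N] " or "N " prefix, then name(args) = ret.
LINE_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?(\w+)\((.*)\)\s+=\s+(-?\w+)")


def parse_strace(lines: Iterable[str]) -> List[Call]:
    """Parse strace-style lines, silently skipping anything that does not match."""
    calls = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            calls.append(Call(match.group(1), match.group(2), match.group(3)))
    return calls


# Hypothetical sample lines in strace's usual output layout.
sample = [
    'open("/etc/passwd", O_RDONLY) = 3',
    'access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)',
    '[pid  4242] read(3, "root:x:0:0"..., 4096) = 1024',
    "+++ exited with 0 +++",
]
calls = parse_strace(sample)
```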


System Call Abstraction

  • Ignore system calls that do not reflect a program's characteristic behavior

    • brk , mmap

  • Consider aliases or multiple versions of a system call as the same

    • Ex: fstat(int fd, struct stat *sb) and stat(const char *path, struct stat *sb)

  • Ignore failed system calls
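
The three abstraction rules above can be sketched as a single filtering pass. The example assumes each traced call is a hypothetical `(name, return_value)` pair and that a negative return value marks a failed call; the ignore list and the alias table contain only the examples named on the slide, and mapping `fstat` onto `stat` (rather than the reverse) is an arbitrary choice.

```python
from typing import Iterable, List, Tuple

# Calls the slide lists as not characterizing program behavior.
IGNORED = {"brk", "mmap"}

# Aliases collapsed to one canonical name (direction chosen arbitrarily).
ALIASES = {"fstat": "stat"}


def abstract_calls(calls: Iterable[Tuple[str, int]]) -> List[str]:
    """Apply the three abstraction rules to (name, return_value) pairs."""
    result = []
    for name, ret in calls:
        if ret < 0:  # assumption: a negative return value means the call failed
            continue
        name = ALIASES.get(name, name)
        if name in IGNORED:
            continue
        result.append(name)
    return result


# Hypothetical raw trace.
raw = [("brk", 0), ("open", 3), ("fstat", 0), ("stat", -1), ("read", 10), ("mmap", 4096)]
abstracted = abstract_calls(raw)
```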


Birthmark Generator

  • Remove system calls that depend on the loading environment

    • Run multiple times with the same input

  • Remove common (noisy) system calls

    • Establish a database of common system call short sequences
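
The slide does not say how the repeated runs are combined. One plausible reading, sketched below, keeps only the short sequences that appear in every run with the same input, then subtracts any sequence found in the database of common sequences. Both the intersection step and the set-based "database" are assumptions made for illustration, and the function name `generate_birthmark` is hypothetical.

```python
from typing import List, Set, Tuple

Seq = Tuple[str, ...]


def generate_birthmark(runs: List[Set[Seq]], common: Set[Seq]) -> Set[Seq]:
    """Keep sequences stable across all runs, then drop common (noisy) ones.

    Assumption: sequences that vary between runs on identical input are
    treated as dependent on the loading environment.
    """
    if not runs:
        raise ValueError("at least one run is required")
    stable = set.intersection(*runs)
    return stable - common


# Hypothetical sequence sets from three runs with the same input.
run1 = {("open", "read"), ("read", "write"), ("getpid", "open")}
run2 = {("open", "read"), ("read", "write"), ("time", "open")}
run3 = {("open", "read"), ("read", "write")}
common_db = {("open", "read")}  # hypothetical database of common short sequences

birthmark = generate_birthmark([run1, run2, run3], common_db)
```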


Input Dependant System Call Subsequence Birthmarks

  • Definition 7: (IDSCSB: Input Dependant System Call Subsequence Birthmark)

  • Containment:


Input Dependant System Call Subsequence Birthmarks

  • “file id” and “process id” are ignored

  • Large parameters are hashed with MD5
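
The two parameter rules above can be sketched with Python's standard `hashlib` module. The parameter names `fd` and `pid`, the dictionary representation of a call, and the 32-character cutoff for "large" are all assumptions made for illustration; the slides do not specify them.

```python
import hashlib
from typing import Dict, Tuple

IGNORED_PARAMS = {"fd", "pid"}  # hypothetical names for "file id" and "process id"
MAX_LEN = 32                    # assumed cutoff for what counts as a "large" parameter


def normalize_call(name: str, params: Dict[str, object]) -> Tuple[str, tuple]:
    """Drop run-specific ids and replace large parameter values with an MD5 digest."""
    normalized = []
    for key, value in params.items():
        if key in IGNORED_PARAMS:
            continue
        text = str(value)
        if len(text) > MAX_LEN:
            text = hashlib.md5(text.encode("utf-8")).hexdigest()
        normalized.append((key, text))
    return (name, tuple(normalized))


# Two hypothetical write calls that differ only in the file descriptor.
call_a = normalize_call("write", {"fd": 3, "buf": "a" * 100, "count": 100})
call_b = normalize_call("write", {"fd": 7, "buf": "a" * 100, "count": 100})
```

After normalization the two calls compare equal, so the same behavior is recognized even when the operating system hands out different descriptors between runs.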


Evaluation

  • SCSSB and IDSCSB:

    • Evaluated against advanced obfuscation techniques and on 15 large real-world applications

  • SandMark implements 39 bytecode obfuscators

  • x86 Linux executable

  • GCJ 4.1.2


Evaluation (Cont.)

  • Programs

    • bzip2.c, gzip.c and oggenc.c

  • Impact of Compiler Optimization Levels

    • Five GCC optimization switches (-O0, -O1, -O2, -O3, and -Os), e.g., bzip2-O0, bzip2-O3

  • Impact of Different Compilers

    • GCC, TCC and Watcom (e.g., bzip2-gcc, bzip2-tcc)


SCSSB Experiment I (JLex and JFlex)


SCSSB Experiment I (Cont.)

  • JLex and JFlex


SCSSB Experiment I (Cont.)

  • Containment scores

    • JLex

      • CO: 87.9%

      • DO: 85.2%

    • JFlex

      • CO: 96%

      • DO: 96%


SCSSB Experiment II (Gecko)

  • Gecko: Layout engine used in all Mozilla software and its derivatives


SCSSB Experiment II (Cont.)


IDSCSB Experiment I (JLex and JFlex)

  • The containment scores between original and obfuscated JLex are all 100%

  • The scores between JLex and obfuscated JFlex are all below 46%

  • The scores between JLex/JFlex and other programs are no more than 7%


IDSCSB Experiment II (Gecko)


Discussion

  • Counterattacks

    • System call injection attack

    • System call reordering attack

  • Limitations

    • If the program does not involve any system calls…

    • Requires the program to exhibit unique system call behavior

    • The detection result depends on a user-defined threshold


Conclusion

  • A novel type of birthmarks

  • Resilient to code obfuscated by SandMark, a state-of-the-art obfuscator

  • The first birthmarks that:

    • Can detect theft of software components

    • Scale to detecting large-scale software theft

