
Detecting Software Theft via System Call Based Birthmarks



Detecting Software Theft via System Call Based Birthmarks. Xinran Wang, Yoon-Chan Jhi, Sencun Zhu, Peng Liu ACSAC 2009. OUTLINE. Introduction and Related Work System Call Based Birthmarks System Design and Implementation Evaluation Discussion and Conclusion.


Detecting Software Theft via System Call Based Birthmarks

Xinran Wang, Yoon-Chan Jhi, Sencun Zhu, Peng Liu

ACSAC 2009


OUTLINE

  • Introduction and Related Work

  • System Call Based Birthmarks

  • System Design and Implementation

  • Evaluation

  • Discussion and Conclusion


Software Theft (or plagiarism)

  • Reuse someone else’s code

    • Even a small part of the original program

  • Obfuscation techniques

    • Different compilers

    • Different compiler optimization levels

    • SandMark


Defender

  • Software watermark

    • Theoretically, any watermark can be removed

  • Software birthmark

    • A unique characteristic that a program inherently possesses


Defender (Cont.)

  • Requirements

    • R1: Resiliency to obfuscation techniques

    • R2: Capability to detect theft of components

    • R3: Scalability to large-scale programs

    • R4: Applicability to binary executables

    • R5: Independence from platforms


Related Work

  • Software Birthmark

    • Static source code based birthmark

    • Static executable code based birthmark

    • Dynamic whole program path (WPP) based birthmark

    • Dynamic API based birthmark

  • Clone Detection

    • String-based, AST-based, Token-based and PDG-based

  • None of these satisfies all five requirements


System Call Based Birthmarks

  • Behavior based birthmarks

    • Unique behaviors in features and implementation details

  • SCSSB (System Call Short Sequence Birthmark)

  • IDSCSB (Input Dependant System Call Subsequence Birthmark)


SCSSB (System Call Short Sequence Birthmark)

  • Definition 1: (System Call Trace)

  • Definition 2: (System Call Sequence Set)
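The formulas for Definitions 1 and 2 are not reproduced in this transcript. As a rough illustration, the sketch below assumes that a system call trace is the ordered list of calls made by program p on input I, and that S(p, I, k) collects every window of k consecutive calls in that trace. The function name and the sample trace are hypothetical.

```python
def sequence_set(trace, k):
    """Return the set of all length-k windows of consecutive system calls.

    Assumption: this sliding-window reading of S(p, I, k) is inferred from
    the definition titles; the slides do not show the formula.
    """
    return {tuple(trace[i:i + k]) for i in range(len(trace) - k + 1)}


# Hypothetical trace of one run of a program on one input.
trace = ["open", "read", "write", "read", "write", "close"]
short_sequences = sequence_set(trace, 3)
```

A trace shorter than k yields an empty set, so very short runs contribute nothing to the birthmark under this reading.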



SCSSB (System Call Short Sequence Birthmark)

  • Definition 3: (SCSSB: System Call Short Sequence Birthmark)

    SCSSB(p, I, k) is a subset of set S(p, I, k) that satisfies


SCSSB (System Call Short Sequence Birthmark)

  • Definition 4: (Containment) The containment of A in B is defined as:

    Here A is the birthmark of a plaintiff program or its component, and B is the birthmark of a suspect program.
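The containment formula itself is missing from this transcript. A common set-based definition is |A ∩ B| / |A|, and the sketch below assumes that form. The birthmark contents are made up for illustration.

```python
def containment(a, b):
    """Fraction of the elements of birthmark a that also occur in birthmark b.

    Assumption: containment(A, B) = |A & B| / |A|; the slide omits the formula.
    """
    if not a:
        raise ValueError("plaintiff birthmark must not be empty")
    return len(a & b) / len(a)


# Hypothetical birthmarks: sets of system call short sequences.
plaintiff = {
    ("open", "read", "write"),
    ("read", "write", "close"),
    ("stat", "open", "read"),
    ("write", "close", "unlink"),
}
suspect = {
    ("open", "read", "write"),
    ("read", "write", "close"),
    ("stat", "open", "read"),
    ("socket", "connect", "send"),
    ("connect", "send", "recv"),
}
score = containment(plaintiff, suspect)
```

Under this definition the measure is asymmetric: extra behavior in the suspect program does not lower the score, which fits the goal of detecting a stolen component inside a larger program.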



System Design and Implementation

  • System Call Tracer

  • System Call Abstraction

  • Birthmark Generator

  • Input Dependant System Call Subsequence Birthmarks


System Call Tracer

  • The simplest way

    • strace

  • With thread identifier

    • SATracer based on Valgrind

  • Prepare a list of all subroutines of the component in SATracer

    • The list is automatically generated by Elsa

  • SATracer checks the execution stack of the running thread whenever a system call is invoked
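For the simplest case, plain strace output can be turned into a trace with a small parser. The sketch below assumes the usual `name(args) = return` line layout and does not model SATracer, thread identifiers, or the call-stack check; the function name is hypothetical.

```python
import re

# Matches lines such as: open("/etc/passwd", O_RDONLY) = 3
# The return value may be decimal, hexadecimal (addresses), or "?".
LINE = re.compile(r'^(\w+)\((.*)\)\s+=\s+(0x[0-9a-fA-F]+|-?\d+|\?)')


def parse_strace_line(line):
    """Return a dict with the call name, raw arguments and return value.

    Lines that do not look like a completed system call (signals,
    exit notices, unfinished calls) yield None.
    """
    m = LINE.match(line)
    if m is None:
        return None
    name, args, ret = m.groups()
    return {"name": name, "args": args, "ret": ret}


parsed = parse_strace_line('open("/etc/passwd", O_RDONLY) = 3')
```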


System Call Abstraction

  • Ignore system calls that do not characterize the program's behavior

    • brk, mmap

  • Consider aliases or multiple versions of a system call as the same

    • Ex: fstat(int fd, struct stat *sb) and stat(const char *path, struct stat *sb)

  • Ignore failed system calls
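The three abstraction rules can be sketched as a single filtering pass. This is a minimal illustration: the ignore list and alias table contain only the calls named on the slide, the direction of the alias mapping is arbitrary, and treating a negative return value as failure is an assumption.

```python
IGNORED = {"brk", "mmap"}      # calls the slide names as uninformative
ALIASES = {"fstat": "stat"}    # variants of one call mapped to a single name


def abstract_trace(calls):
    """calls: iterable of (name, return_value) pairs.

    Returns the list of canonical call names that survive the three rules.
    """
    result = []
    for name, ret in calls:
        if ret < 0:            # assumption: negative return value marks failure
            continue
        name = ALIASES.get(name, name)
        if name in IGNORED:
            continue
        result.append(name)
    return result


# Hypothetical raw trace.
raw = [("brk", 0), ("open", 3), ("fstat", 0), ("open", -1),
       ("read", 128), ("mmap", 139642), ("close", 0)]
abstracted = abstract_trace(raw)
```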


Birthmark Generator

  • Remove loading-environment-dependent system calls

    • Run multiple times with the same input

  • Remove noisy system calls

    • Establish a database of common system call short sequences
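One way to read these two steps: keep only the short sequences that appear in every run on the same input, then subtract the sequences found in the database of common ones. The sketch below follows that reading, which is an assumption; the function names and sample data are hypothetical.

```python
def k_grams(trace, k):
    """All length-k windows of consecutive calls in one trace."""
    return {tuple(trace[i:i + k]) for i in range(len(trace) - k + 1)}


def generate_birthmark(runs, common_sequences, k):
    """runs: non-empty list of traces recorded with the same input.

    Sequences that vary between runs are treated as environment-dependent
    and dropped; sequences in common_sequences are treated as noise.
    """
    per_run = [k_grams(trace, k) for trace in runs]
    stable = set.intersection(*per_run)
    return stable - common_sequences


# Two hypothetical runs; the second has an extra call at load time.
runs = [
    ["open", "read", "write", "close"],
    ["getpid", "open", "read", "write", "close"],
]
common = {("open", "read")}    # stand-in for the database of common sequences
birthmark = generate_birthmark(runs, common, 2)
```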


Input Dependant System Call Subsequence Birthmarks

  • Definition 7: (IDSCSB: Input Dependant System Call Subsequence Birthmark)

  • Containment:


Input Dependant System Call Subsequence Birthmarks

  • “file id” and “process id” are ignored

  • Large parameters are hashed with MD5
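A minimal sketch of this parameter handling, assuming each call is recorded with named string parameters. The parameter names "fd" and "pid" and the length cutoff for a "large" parameter are hypothetical choices, not values from the slides.

```python
import hashlib

IGNORED_PARAMS = {"fd", "pid"}   # hypothetical names for "file id" and "process id"
MAX_LEN = 32                     # hypothetical cutoff for a "large" parameter


def normalize_call(name, params):
    """params: dict mapping parameter name to its string value.

    Returns a hashable (name, parameters) pair suitable for comparison.
    """
    normalized = []
    for key in sorted(params):
        if key in IGNORED_PARAMS:
            continue
        value = params[key]
        if len(value) > MAX_LEN:
            # Replace a large parameter by its MD5 digest.
            value = hashlib.md5(value.encode("utf-8")).hexdigest()
        normalized.append((key, value))
    return (name, tuple(normalized))


# Two writes of the same buffer through different file descriptors.
a = normalize_call("write", {"fd": "3", "buf": "x" * 100})
b = normalize_call("write", {"fd": "7", "buf": "x" * 100})
```

With the descriptor ignored and the buffer hashed, the two calls compare equal, so run-specific identifiers do not break a match.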


Evaluation

  • SCSSB and IDSCSB:

    • Against some advanced obfuscation techniques and 15 real-world large applications

  • SandMark implements 39 byte code obfuscators

  • x86 Linux executable

  • GCJ 4.1.2


Evaluation (Cont.)

  • Programs

    • bzip2.c, gzip.c and oggenc.c

  • Impact of Compiler Optimization Levels

    • Five optimization switches (-O0, -O1, -O2, -O3 and -Os) of GCC (e.g., bzip2-O0, bzip2-O3, etc.)

  • Impact of Different Compilers

    • GCC, TCC and Watcom (e.g., bzip2-gcc, bzip2-tcc)


SCSSB Experiment I (JLex and JFlex)



SCSSB Experiment I (Cont.)

  • Containment scores

    • JLex

      • CO: 87.9%

      • DO: 85.2%

    • JFlex

      • CO: 96%

      • DO: 96%


SCSSB Experiment II (Gecko)

  • Gecko: Layout engine used in all Mozilla software and its derivatives



IDSCSB Experiment I (JLex and JFlex)

  • The containment scores between original and obfuscated JLex are all 100%

  • The scores between JLex and obfuscated JFlex are all less than 46%

  • The scores between JLex/JFlex and other programs are no more than 7%



Discussion

  • Counterattacks

    • System call injection attack

    • System call reordering attack

  • Limitations

    • If the program does not involve any system calls…

    • Need unique system call behaviors

    • The detection result depends on the threshold the user defines


Conclusion

  • A novel type of birthmarks

  • Resilient to code obfuscated by SandMark, a state-of-the-art obfuscator

  • The first birthmark that:

    • Detects software component theft

    • Scales to detect large-scale software theft
