Software Reverse Engineering Education

http://www.reversingproject.info

Teodoro Cipresso, tcipress@hotmail.com

San José State University, Spring 2009

Advisor: Dr. Mark Stamp

Committee: Dr. Robert Chun, Dr. David Taylor

Background Information: Introduction to Software Reverse Engineering
  • Software Reverse Engineering (SRE) can be described as the practice of analyzing a software system to create abstractions that identify the individual components and their dependencies, and, if possible, the overall system architecture [1].
  • Once the components and design of an existing system have been recovered, it becomes possible to repair and even enhance them.
  • Reverse engineering skills are also used to detect and neutralize viruses, worms and other malware, as well as to protect intellectual property [1].
Background Information (cont’d): Importance of SRE Education
  • “More emphasis is needed in SE [and CS] undergraduate and graduate programs on the issue of software evolution and change. Students need to be educated on the theory and practice of software comprehension, maintenance and reengineering. They need to learn how to live with the monsters from the past and tame them” [2].
  • “Most of the time, students are trained in developing very small programs starting from scratch. This approach is really misleading since most students learn to believe that software engineering is just about developing brand new software. In fact many students will be involved in evolution-related activities after completion of their studies” [3].
Background Information (cont’d): Student Feedback on SRE Education
  • Incorporation of software reverse engineering techniques and methodologies into regular course work was tried at the University of Missouri-Rolla [1].
  • The results of this experiment were quite positive:
    • 77% of students thought that the incorporation of SRE techniques and methodologies reinforced concepts taught during lectures.
    • 82% of students wanted SRE to be included in future courses, especially those that deal with software design.
Background Information (cont’d): Development-Related Reversing Scenarios

Figure 1. Development-related software reverse engineering scenarios.

Background Information (cont’d): Security-Related Reversing Scenarios

Figure 2. Security-related software reverse engineering scenarios.

Background Information (cont’d): Legacy Software Development Process

Figure 3. Software development process in a typical enterprise software system.

Project Overview: Baseline Education in Software Reverse Engineering
  • Educate programmers on software reengineering and reuse
  • Educate programmers on software security and malware detection
  • Educate programmers on software reversing, antireversing, and patching
  • Intended outcome: computer programmers with an improved ability to understand, evolve, and secure software.

Figure 4. Activities related to providing a baseline SRE education.

Materials and Methods
  • More than ten peer-reviewed articles on the topics of software reverse engineering, re-engineering, maintenance, reuse, and security were selected and used to address the research questions.
  • Of the articles selected, three were chosen for their specific coverage of experiences with teaching courses in software reversing, reengineering, and maintenance.
  • I also drew upon nearly a decade of my own experience designing and developing legacy software modernization tools at IBM.
Results: Overview of Developed SRE Course Modules
  • Reversing and Patching Wintel Machine Code
  • Reversing and Patching Java Bytecode
  • Applying Anti-Reversing Techniques to Machine Code
  • Applying Anti-Reversing Techniques to Java Bytecode
  • Reengineering and Reuse of Legacy Software
  • Identifying, Monitoring, and Reporting Malware
Results (cont’d): Reversing and Patching Wintel Machine Code
  • An introduction to the compilation of high-level languages to machine code is provided. Assembly language is contrasted as having a one-to-one mapping to machine code.
  • The negative results of experimentation with two decompilers (Boomerang and REC) for machine code are documented. Given the current state of decompiler technology, it was concluded that working with disassembly is the most feasible approach.
  • A Wintel machine code reversing and patching exercise was developed against Password Vault, a non-trivial application that is provided with the exercise to avoid any legal concerns with reversing software written by others.
Results (cont’d): Reversing and Patching Wintel Machine Code (cont’d)
  • The machine code reversing and patching exercise asks the learner to create a new executable version of the application that no longer has a trial limitation of five password records per user.
  • A reliable and repeatable reversing strategy is used: place a breakpoint on a memory artifact, then trace back through the stack frames to locate the relevant section in the disassembly.
  • For instructional purposes, an animated solution that demonstrates the application of this reversing strategy using OllyDbg, an interactive debugger-disassembler, was developed using Qarbon Viewlet Builder.
Results (cont’d): Reversing and Patching Wintel Machine Code (cont’d)

Figures 5–14. Animated solution to the Wintel reversing and patching exercise.

Results (cont’d): Reversing and Patching Wintel Machine Code (cont’d)

Idea for an advanced Wintel machine code (**) exercise:

  • It should be feasible to patch additional functionality into the Password Vault machine code:
    • The GCC compiler can generate assembly language instead of machine code, so the programmer can write the new functionality in a high-level language.
    • Patching in the generated assembly code would require a significant amount of time in the program understanding phase.
    • Final integration of the new code would require modifying the Windows PE headers to increase the size of the code section (typically named .text), and also the .rdata and .data sections if new constants and variables are added.
Results (cont’d): Reversing and Patching Java Bytecode
  • An introduction to interpreted/intermediate executable formats such as Java bytecode is provided. These formats are contrasted with machine code and assembly language.
  • Java bytecode “disassembly” using javap is covered for help with analysis of bytecode generated by javac.
  • The positive results of experimentation with the Jad Java bytecode decompiler are documented; it is concluded that direct reading/writing of bytecode is not necessary.
  • A Java bytecode reversing and patching exercise was developed against a Java version of Password Vault.
Results (cont’d): Reversing and Patching Java Bytecode (cont’d)
  • The Java bytecode reversing and patching exercise asks the learner to create a new executable version of the application that no longer has a trial limitation of five password records per user.
  • Since the Password Vault application consists of a small number of classes in a single package, a simple reversing strategy of unpacking the Jar archive, batch decompiling the classes, modifying the generated Java source, and recompiling is used.
  • For instructional purposes, an animated solution that demonstrates the application of this reversing strategy using FrontEnd Plus, a graphical interface to Jad, was developed using Qarbon Viewlet Builder.
Results (cont’d): Reversing and Patching Java Bytecode (cont’d)

Figures 15–22. Animated solution to the Java bytecode reversing and patching exercise.

Results (cont’d): Reversing and Patching Java Bytecode (cont’d)

Idea for an advanced Java bytecode (**) exercise:

  • Use available Java class libraries, such as jclasslib, to directly read and write Java bytecode.
    • Write a Java program that scans through the bytecode for the Java Password Vault application and locates the instructions for the trial limitation.
    • Once the instructions are located, overwrite them with a sequence that disables the trial limitation.
    • This can be good practice for getting a feel for writing code that patches an executable.
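
The locate-and-overwrite step of such a patcher can be sketched independently of any class-file library. The following is a minimal illustration, not the exercise's solution: it is written in C++ rather than Java, it treats the executable as a flat byte buffer, and the names `Bytes`, `findPattern`, and `patchFirst` are hypothetical. It assumes a same-length replacement so that no offsets after the patch site shift; a real bytecode patch would also have to keep the class file's internal structures consistent.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

typedef std::vector<std::uint8_t> Bytes;

// Sentinel returned when the pattern does not occur in the image.
const std::size_t kNotFound = static_cast<std::size_t>(-1);

// Returns the offset of the first occurrence of `pattern` in `image`,
// or kNotFound when the pattern is empty or absent.
std::size_t findPattern(const Bytes& image, const Bytes& pattern) {
    if (pattern.empty()) return kNotFound;
    Bytes::const_iterator it =
        std::search(image.begin(), image.end(), pattern.begin(), pattern.end());
    if (it == image.end()) return kNotFound;
    return static_cast<std::size_t>(it - image.begin());
}

// Overwrites the first occurrence of `pattern` with `replacement`.
// Both sequences must have the same length so the rest of the image is untouched.
bool patchFirst(Bytes& image, const Bytes& pattern, const Bytes& replacement) {
    assert(pattern.size() == replacement.size());
    const std::size_t offset = findPattern(image, pattern);
    if (offset == kNotFound) return false;
    std::copy(replacement.begin(), replacement.end(),
              image.begin() + static_cast<Bytes::difference_type>(offset));
    return true;
}
```

A caller would load the file into a `Bytes` buffer, call `patchFirst` with the instruction bytes identified during analysis, and write the buffer back out.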
Results (cont’d): Applying Anti-Reversing Techniques to Machine Code
  • A brief introduction to basic anti-reversing techniques is provided: Eliminating Symbolic Information, Obfuscating the Program, and Embedding Anti-Debugger Code.
  • Machine code typically has very little symbolic information that can be eliminated; a discussion therefore illustrates how debuggers insert quite a bit of information that makes machine code easier to reverse.
  • The Obfuscating the Program technique is demonstrated in a Wintel machine code anti-reversing exercise where data, computation, and control flow obfuscations are applied to the C++ source code for Password Vault.
Results (cont’d): Applying Anti-Reversing Techniques to Machine Code (cont’d)
  • Commercial tools such as EXECryptor (www.strongbit.com) fully obfuscate and pack Windows executables using advanced algorithms that are based on the elementary techniques described in this module.
  • It is difficult to provide a “before and after” illustration of machine code that is obfuscated using EXECryptor, so the examples and exercise in this module are implemented first at the source code level and then confirmed in the machine code using live and static analysis.
    • In the case of control-flow obfuscation, only static analysis is used, where subsequent run traces are compared using an edit-distance measurement.
Results (cont’d): Applying Anti-Reversing Techniques to Machine Code (cont’d)
  • The Wintel machine code anti-reversing exercise asks the learner to create a new executable version of the Password Vault application where the following transformations are applied:
    • Encryption of string literals (data obfuscation).
    • Obfuscation of the numeric representation of the password record limit (computation obfuscation).
    • Obfuscation of the method that performs the record limit check (control flow obfuscation).
Results (cont’d): Applying Anti-Reversing Techniques to Machine Code (cont’d)
  • Encryption of String Literals (data obfuscation):

Figure 23. Strings are decrypted each time they are used using a bundled cipher.
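
The idea of decrypting a string literal at each point of use can be sketched as follows. This is an assumption-laden illustration rather than the code from the figure: the slides only say that a "bundled cipher" is used, so a single-byte XOR stands in for it here, and the key `kKey`, the array `kEncryptedOk`, and the function `decryptLiteral` are hypothetical names. Only the ciphertext bytes are embedded in the program, so a strings scan of the executable would not reveal "[OK]".

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical single-byte key; a stand-in for the module's bundled cipher.
const unsigned char kKey = 0x5A;

// "[OK]" encrypted ahead of time by XOR-ing each character with kKey.
const unsigned char kEncryptedOk[] = {0x01, 0x15, 0x11, 0x07};

// Decrypts a stored literal on demand; the plaintext exists only in the
// returned temporary, not in the executable's data section.
std::string decryptLiteral(const unsigned char* data, std::size_t length) {
    assert(data != nullptr || length == 0);
    std::string plain;
    plain.reserve(length);
    for (std::size_t i = 0; i < length; ++i) {
        plain.push_back(static_cast<char>(data[i] ^ kKey));
    }
    return plain;
}
```

A call site would then print `decryptLiteral(kEncryptedOk, sizeof(kEncryptedOk))` instead of the literal itself. XOR is trivially reversible, so this shows the mechanism only, not a strong cipher.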

Results (cont’d): Applying Anti-Reversing Techniques to Machine Code (cont’d)
  • Obfuscation of the numeric representation of the password record limit (computation obfuscation):

Figure 24. Complex evaluations obscure the actual condition.

Results (cont’d): Applying Anti-Reversing Techniques to Machine Code (cont’d)
  • Obfuscation of the numeric representation of the password record limit (computation obfuscation) (cont’d):

Figure 25. Testing for a function of a number can slow a reverser down.
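
Both ideas, hiding the constant behind an evaluation and testing a function of the number rather than the number itself, can be sketched together. The figures are not reproduced in this transcript, so the code below is an illustrative guess: the function f(n) = n*n + 3*n, the constants 0x2F and 0x2A, and both function names are assumptions. Only the limit of five records comes from the exercise.

```cpp
#include <cassert>

// Plain check: the constant 5 is directly visible in the disassembly.
bool limitReachedPlain(int recordCount) {
    return recordCount >= 5;
}

// Obfuscated check: compares f(n) = n*n + 3*n against f(5) = 40, and the
// value 40 itself is assembled at run time from unrelated-looking constants.
// f is strictly increasing for n >= 0, so for non-negative (and reasonably
// small) counts the result equals that of the plain check.
bool limitReachedObfuscated(int recordCount) {
    assert(recordCount >= 0);
    volatile int a = 0x2F;           // volatile discourages constant folding
    volatile int b = 0x2A;
    const int hidden = (a ^ b) * 8;  // (0x2F ^ 0x2A) == 5, so hidden == 40
    return recordCount * recordCount + 3 * recordCount >= hidden;
}
```

A reverser searching for a comparison against 5 will not find one. The relationship to the record limit has to be recovered by working out what the arithmetic computes.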

Results (cont’d): Applying Anti-Reversing Techniques to Machine Code (cont’d)
  • Obfuscation of the method that performs the record limit check (control flow obfuscation):
    • We introduce some non-essential, recursive, and randomized logic to the password limit check to make it more difficult for a reverser to perform static and/or live analysis.
    • Since no standards exist for control flow obfuscation, a custom algorithm was designed to hinder live and static analysis through use of recursive and randomized procedure calls.
    • Recursion grows the stack considerably, making stepping through the code difficult, while randomization makes execution unpredictable (breakpoints may not trigger and run traces differ).
Results (cont’d): Applying Anti-Reversing Techniques to Machine Code (cont’d)

Depth of the recursion is randomized on each check of the limit.

Random procedure call targets generate and return a number that is added to an instance variable, preventing the procedures from being identified as NOOPs by a code optimizer.

Figure 26. A control flow obfuscation algorithm for the record limit check.
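
The two properties annotated in the figure (a recursion depth that is re-randomized on each check, and random decoy procedures whose return values are folded into an instance variable) can be sketched as below. This is a simplified illustration, not the module's actual algorithm: the class `LimitChecker`, the decoy bodies, and the depth range of 1 to 8 are all hypothetical.

```cpp
#include <cassert>
#include <cstdlib>

class LimitChecker {
public:
    explicit LimitChecker(int limit) : limit_(limit), noise_(0) {
        assert(limit >= 0);
    }

    // The answer depends only on recordCount and limit_, but the path taken
    // to reach it differs from call to call.
    bool limitReached(int recordCount) {
        const int depth = 1 + std::rand() % 8;  // new recursion depth per check
        return check(recordCount, depth);
    }

    long noise() const { return noise_; }

private:
    bool check(int recordCount, int depth) {
        // Call a randomly chosen decoy and keep its result in an instance
        // variable so the call cannot be discarded as having no effect.
        noise_ += decoy(std::rand() % 3);
        if (depth > 0) {
            return check(recordCount, depth - 1);
        }
        return recordCount >= limit_;
    }

    static int decoy(int which) {
        switch (which) {
            case 0:  return std::rand() % 7;
            case 1:  return std::rand() % 11;
            default: return std::rand() % 13;
        }
    }

    int limit_;
    long noise_;
};
```

Because the sketch does not seed `std::rand()`, it is reproducible across runs as written. A real build would seed it unpredictably, and would need to confirm in the disassembly that the compiler has not flattened the recursion into a loop.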

Results (cont’d): Applying Anti-Reversing Techniques to Machine Code (cont’d)
  • To measure the effectiveness of the control flow algorithm in hindering analysis, three execution traces of the section of the code containing the record limit check were compared.
  • The Levenshtein distance (LD) was computed between the three traces, with LD modified to compare whole lines (one instruction per line) rather than individual characters.
  • The execution traces were collected using OllyDbg and had to be cleaned of disassembly artifacts such as line numbers, base addresses, and comments in order to ensure that the analysis was fair.
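
A line-based Levenshtein distance of the kind described above can be sketched as follows. This is a generic textbook formulation with unit costs, not the code used in the study, and the function name `lineEditDistance` is an assumption. Each element of the input vectors is one cleaned trace line.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Levenshtein distance in which the unit of comparison is a whole trace line
// rather than a single character. Uses two rolling rows of the DP table.
std::size_t lineEditDistance(const std::vector<std::string>& a,
                             const std::vector<std::string>& b) {
    std::vector<std::size_t> prev(b.size() + 1), curr(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = j;

    for (std::size_t i = 1; i <= a.size(); ++i) {
        curr[0] = i;
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const std::size_t substitution =
                prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            const std::size_t deletion = prev[j] + 1;
            const std::size_t insertion = curr[j - 1] + 1;
            curr[j] = std::min(substitution, std::min(deletion, insertion));
        }
        std::swap(prev, curr);
    }
    // The distance can never exceed the length of the longer trace.
    assert(prev[b.size()] <= std::max(a.size(), b.size()));
    return prev[b.size()];
}
```

Two traces of the same check that differ by one extra decoy call would yield a distance of 1. Larger distances between runs on identical input indicate that the obfuscation is varying the executed path.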
Results (cont’d): Applying Anti-Reversing Techniques to Machine Code (cont’d)

Figure 27. Comparison of executions of record limit check on identical program input.

Results (cont’d): Applying Anti-Reversing Techniques to Machine Code (cont’d)
  • The Wintel anti-reversing module also demonstrates source code obfuscation, an anti-reversing technique that applies when the source code itself is distributed.
  • There may exist a requirement to ship the source code of an application so that the machine code can be generated on the end user’s computer.
  • If the source code contains intellectual property that is worth protecting, one can perform transformations to the source code which make it difficult to read, but have no impact on the machine code that would ultimately be generated when the program is compiled.
Results (cont’d): Applying Anti-Reversing Techniques to Machine Code (cont’d)
  • Demonstration of the COBF source code obfuscator:

VerifyPassword.cpp:

    #include <iostream>
    #include <string>
    using namespace std;

    int main(int argc, char *argv[])
    {
        // Hard-coded secret that the obfuscator is expected to hide.
        const char *password = "jup!ter";
        string specified;
        cout << "Enter password: ";
        getline(cin, specified);
        if (specified.compare(password) == 0)
        {
            cout << "[OK] Access granted." << endl;
        } else
        {
            cout << "[Error] Access denied." << endl;
        }
    }

COBF invocation:

    C:\cobf_1.06\src\win32\release\cobf.exe @C:\cobf_1.06\src\setup_cpp_tokens.inv -o cobfoutput -b -p C:\cobf_1.06\etc\pp_eng_msvc.bat VerifyPassword.cpp

Results (cont’d): Applying Anti-Reversing Techniques to Machine Code (cont’d)

COBF obfuscated source for VerifyPassword.cpp:

    #include"cobf.h"
    ls lp lk;lf lo(lf ln,ld*lj[]){ll ld*lc="\x6a\x75\x70\x21\x74\x65\x72";lh la;lb<<"\x45\x6e\x74\x65\x72\x20\x70\x61\x73\x73\x77\x6f\x72\x64""\x3a\x20";li(lq,la);lm(la.lg(lc)==0){lb<<"\x5b\x4f\x4b\x5d\x20\x41" "\x63\x63\x65\x73\x73\x20\x67\x72\x61\x6e\x74\x65\x64\x2e"<<le;}lr{lb<<"\x5b\x45\x72\x72\x6f\x72\x5d\x20\x41\x63\x63\x65\x73\x73\x20\x64" "\x65\x6e\x69\x65\x64\x2e"<<le;}}

COBF generated header (cobf.h):

Results (cont’d): Applying Anti-Reversing Techniques to Java Bytecode
  • While experiments with decompiling machine code were not successful, decompilation of Java bytecode to Java source code yielded acceptable results.
  • Given these results, one does need to be concerned with protecting Java bytecode from decompilation if there is significant intellectual property in the program.
  • Obfuscating bytecode is inherently easier than obfuscating source code because bytecode has a significantly stricter and more organized representation.
Results (cont’d): Applying Anti-Reversing Techniques to Java Bytecode (cont’d)
  • Variable, class, and method names are all left intact when compiling Java source code to Java bytecode. This is a stark difference from machine code, where variable and local method names are not preserved.
  • A high level of protection can be achieved for Java bytecode by applying three transformations: Name Obfuscation, String Encryption, and Control Flow Obfuscation.
  • Zelix Klassmaster, a commercial product, is capable of performing all three. Unfortunately, no open-source or free tool exists that can perform all three.
Results (cont’d): Applying Anti-Reversing Techniques to Java Bytecode (cont’d)
  • The trial version of Zelix Klassmaster is restricted to 30 days, and the company will only e-mail a trial version to “non-free” e-mail addresses.
  • Not much is learned by having everything done for us, so this module sees how far one can get with open-source and free software.
  • ProGuard and RetroGuard are free Java bytecode obfuscators capable of Name Obfuscation.
  • SandMark, a Java bytecode watermarking and obfuscation tool from the University of Arizona, is capable of String Encryption and some weak control flow obfuscations.
Results (cont’d): Applying Anti-Reversing Techniques to Java Bytecode (cont’d)
  • A Java bytecode anti-reversing exercise was developed against the Java version of Password Vault.
  • Since the learner will have already experienced manually applying obfuscations in the Wintel machine code anti-reversing exercise, this exercise focuses on the use of tools.
  • In the exercise, it is expected that the Java bytecode for the Password Vault application will be incrementally obfuscated using two or more tools.
  • For instructional purposes, an animated solution that demonstrates obfuscating the Password Vault Java bytecode to the point of inhibiting decompilation was developed using Qarbon Viewlet Builder.
Results (cont’d): Applying Anti-Reversing Techniques to Java Bytecode (cont’d)

Figures 28–50. Animated solution to the Java bytecode anti-reversing exercise.

Results (cont’d): Reengineering and Reuse of Legacy Software
  • The question of whether to reengineer or reuse components of a software system most often arises in the context of large business or government organizations.
  • Over time the processes and procedures of a business or organization will inevitably be reflected in the software systems that enable efficient, day-to-day operations [5].
  • While reverse engineering of legacy software is inherently intractable, some of us will inevitably find ourselves in a situation where no other option is available because the cost of rewriting a large, complex software system is prohibitive [6].
Results (cont’d) Reengineering and Reuse of Legacy Software (cont’d)
  • If good development practices were followed, legacy software is typically composed of three layers [5]:

Figure 51. Layers of a well-structured legacy software application.

Results (cont’d) Reengineering and Reuse of Legacy Software (cont’d)
  • Legacy applications that are not componentized enough for their general organization to resemble these three layers are not good candidates for reengineering and reuse.
  • The most widely accepted technique to reuse legacy application components is that of Wrappering [5], where a new piece of code provides an interface to a legacy application component or layer without requiring code changes to it.
  • Typically, candidate applications should be well-structured such that the business logic can be isolated, encapsulated, and made into reusable components.
Results (cont’d) Reengineering and Reuse of Legacy Software (cont’d)
  • Unless enough of an application's source code remains such that it's possible to identify the names of reusable entry points (procedures) and their I/O data structures, attempting to reuse the application may be difficult.
  • While it is possible to learn the names of entry points that have been explicitly exported by an application in the case of a DLL, the names don't indicate the layout of the expected I/O data structures.
  • One way to discover the entry points and I/O data structures in legacy machine code is to read the source code of other applications which depend on it.
Results (cont’d) Reengineering and Reuse of Legacy Software (cont’d)
  • The COBOL programming language is most often associated with legacy software applications.
  • Normally, COBOL programs have a single entry point; additional “alternate” entry points are rare.
  • Legacy COBOL programs often include functional discriminators in their I/O data structures.

Figure 52. Mapping legacy functional discriminators to an object-oriented design.

Results (cont’d) Reengineering and Reuse of Legacy Software (cont’d)
  • In a real-world situation, we would be looking to reuse legacy components whose machine code is the result of thousands of lines of high-level language statements (COBOL) that implement a particular business process.
  • Since our focus is more on reuse and reengineering of legacy code at a basic level, it's not necessary to encumber ourselves with a very large program in order to learn strategies for reuse and reengineering.
  • Included with this module is a small COBOL “calculator” that we wish to make reusable from Java. This program is assumed to be something from the business logic layer.
Results (cont’d) Reengineering and Reuse of Legacy Software (cont’d)

      ******************************************************************
      ** Simple COBOL program that performs integer arithmetic        **
      ******************************************************************
       IDENTIFICATION DIVISION.
       PROGRAM-ID. 'SMPLCALC'.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       77  MSG-NUMERIC-OVERFLOW   PIC X(25)
           VALUE 'Numeric overflow occurred'.
       77  MSG-SUCCESSFUL         PIC X(22)
           VALUE 'Completed successfully'.
       LINKAGE SECTION.
      * Input/Output data structure
       01  SMPLCALC-INTERFACE.
           02  SI-OPERAND-1       PIC S9(9) COMP-5.
           02  SI-OPERAND-2       PIC S9(9) COMP-5.
           02  SI-OPERATION       PIC X.
               88  DO-ADD         VALUE '+'.
               88  DO-SUB         VALUE '-'.
               88  DO-MUL         VALUE '*'.
           02  SI-RESULT          PIC S9(18) COMP-3.
           02  SI-RESULT-MESSAGE  PIC X(128).
       PROCEDURE DIVISION USING
           BY REFERENCE SMPLCALC-INTERFACE.
       MAINLINE SECTION.
      * Perform requested arithmetic

Results (cont’d) Reengineering and Reuse of Legacy Software (cont’d)

           INITIALIZE SI-RESULT SI-RESULT-MESSAGE
           EVALUATE TRUE
             WHEN DO-ADD
               COMPUTE SI-RESULT = SI-OPERAND-1 + SI-OPERAND-2
                 ON SIZE ERROR
                   PERFORM HANDLE-SIZE-ERROR
               END-COMPUTE
             WHEN DO-SUB
               COMPUTE SI-RESULT = SI-OPERAND-1 - SI-OPERAND-2
                 ON SIZE ERROR
                   PERFORM HANDLE-SIZE-ERROR
               END-COMPUTE
             WHEN DO-MUL
               COMPUTE SI-RESULT = SI-OPERAND-1 * SI-OPERAND-2
                 ON SIZE ERROR
                   PERFORM HANDLE-SIZE-ERROR
               END-COMPUTE
           END-EVALUATE
      * Successful return
           MOVE MSG-SUCCESSFUL TO SI-RESULT-MESSAGE
           MOVE 2 TO RETURN-CODE
           GOBACK
           .
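
The 88-level condition names on SI-OPERATION (DO-ADD, DO-SUB, DO-MUL) act as a functional discriminator: one character in the I/O structure selects which operation the single entry point performs. A minimal Java sketch of how such a discriminator might be mapped onto named operations follows. The enum name CalcOperation and its methods are hypothetical and are not taken from the module's solution code; only the three discriminator characters come from the listing.

```java
import java.util.function.LongBinaryOperator;

/**
 * Hypothetical mapping of the SI-OPERATION discriminator characters
 * ('+', '-', '*') to named operations. Names are illustrative only.
 */
enum CalcOperation {
    ADD('+', (a, b) -> a + b),
    SUBTRACT('-', (a, b) -> a - b),
    MULTIPLY('*', (a, b) -> a * b);

    private final char discriminator;
    private final LongBinaryOperator function;

    CalcOperation(char discriminator, LongBinaryOperator function) {
        this.discriminator = discriminator;
        this.function = function;
    }

    /** The character that would be placed in SI-OPERATION. */
    char discriminator() {
        return discriminator;
    }

    /** Operands are 32-bit like the COMP-5 fields; the result is widened to 64 bits. */
    long apply(int operand1, int operand2) {
        return function.applyAsLong(operand1, operand2);
    }

    /** Looks up the operation selected by a discriminator character. */
    static CalcOperation fromDiscriminator(char c) {
        for (CalcOperation op : values()) {
            if (op.discriminator == c) {
                return op;
            }
        }
        throw new IllegalArgumentException("Unknown discriminator: " + c);
    }
}
```

An object-oriented wrapper can then expose one method per enum constant, so callers never handle the discriminator character directly.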

Results (cont’d) Reengineering and Reuse of Legacy Software (cont’d)
  • Many commercial tools support importing a COBOL data structure and generating Java marshalling classes.
  • These marshalling classes are intended to be used with the J2EE Connector Architecture (JCA) where a Java application wrappers a legacy software application.

Figure 53. Example JCA implementation for accessing a legacy application.
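
To make the job of a marshalling class concrete, the sketch below hand-encodes the SMPLCALC-INTERFACE record as a byte array. It is an illustration under stated assumptions, not the output of any commercial tool: it assumes COMP-5 items are 4-byte little-endian integers (as on Wintel), that the compiler inserts no slack bytes, that PIC X data is single-byte ASCII, and that COMP-3 uses sign nibbles 0xC (positive) and 0xD (negative). Actual layouts depend on the COBOL compiler and its options. The class name SmplCalcRecord and its methods are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

/** Hypothetical hand-written marshaller for the SMPLCALC-INTERFACE record. */
final class SmplCalcRecord {
    static final int RESULT_OFFSET = 4 + 4 + 1;   // two COMP-5 operands + PIC X
    static final int RESULT_BYTES = 10;           // S9(18) COMP-3: 18 digits + sign nibble
    static final int MESSAGE_BYTES = 128;         // PIC X(128)
    static final int RECORD_BYTES = RESULT_OFFSET + RESULT_BYTES + MESSAGE_BYTES;

    private SmplCalcRecord() {}

    /** Builds a request record with a zero result and a blank message. */
    static byte[] encodeRequest(int operand1, int operand2, char operation) {
        ByteBuffer buf = ByteBuffer.allocate(RECORD_BYTES).order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(operand1);
        buf.putInt(operand2);
        buf.put((byte) operation);
        buf.put(packDecimal(0L, RESULT_BYTES));
        byte[] blanks = new byte[MESSAGE_BYTES];
        Arrays.fill(blanks, (byte) ' ');
        buf.put(blanks);
        return buf.array();
    }

    /** Reads SI-RESULT back out of a record returned by the callee. */
    static long decodeResult(byte[] record) {
        return unpackDecimal(Arrays.copyOfRange(record, RESULT_OFFSET, RESULT_OFFSET + RESULT_BYTES));
    }

    /** Encodes a value as packed decimal: two digits per byte, sign in the last nibble. */
    static byte[] packDecimal(long value, int length) {
        if (value == Long.MIN_VALUE) {
            throw new IllegalArgumentException("Value out of range");
        }
        int[] nibbles = new int[length * 2];
        nibbles[nibbles.length - 1] = value < 0 ? 0xD : 0xC;
        long magnitude = Math.abs(value);
        for (int i = nibbles.length - 2; i >= 0; i--) {
            nibbles[i] = (int) (magnitude % 10);
            magnitude /= 10;
        }
        if (magnitude != 0) {
            throw new IllegalArgumentException("Value does not fit in " + length + " bytes");
        }
        byte[] out = new byte[length];
        for (int i = 0; i < length; i++) {
            out[i] = (byte) ((nibbles[2 * i] << 4) | nibbles[2 * i + 1]);
        }
        return out;
    }

    /** Decodes packed decimal produced by packDecimal. */
    static long unpackDecimal(byte[] packed) {
        long value = 0;
        for (int i = 0; i < packed.length; i++) {
            int high = (packed[i] >> 4) & 0xF;
            int low = packed[i] & 0xF;
            value = value * 10 + high;
            if (i < packed.length - 1) {
                value = value * 10 + low;
            } else if (low == 0xD) {
                value = -value;
            }
        }
        return value;
    }
}
```

Under these assumptions the record is 147 bytes long. None of that is visible from the name of an exported entry point, which is why the data structure definition matters so much for reuse.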

Results (cont’d) Reengineering and Reuse of Legacy Software (cont’d)
  • A popular alternative to using the JCA architecture to reengineer and reuse legacy applications is to implement a Service Oriented Architecture (SOA).
  • SOA components become capable of communicating without the tight and fragile coupling of traditional binary interfaces because they are wrappered with a platform-neutral interface such as XML and Web services.
  • When XML is used as envisioned, all data, both character and numeric, is represented as printable text, completely divorced from any platform-specific representation or encoding.
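
As a small illustration of the last point, the sketch below renders the calculator's inputs as XML text and reads a value back, using only the DOM and transformer classes that ship with the JDK. The element names mirror the COBOL field names; the root element name SMPLCALC-INTERFACE and the class CalcXml are assumptions made for this example and are not the JAXB code used in the exercise.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

/** Hypothetical helper that represents calculator inputs as printable XML text. */
final class CalcXml {
    private CalcXml() {}

    /** Serializes the request; numbers become decimal text, not binary fields. */
    static String toXml(int operand1, int operand2, char operation) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("SMPLCALC-INTERFACE"); // assumed root name
            doc.appendChild(root);
            appendText(doc, root, "SI-OPERAND-1", Integer.toString(operand1));
            appendText(doc, root, "SI-OPERAND-2", Integer.toString(operand2));
            appendText(doc, root, "SI-OPERATION", String.valueOf(operation));

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not build XML", e);
        }
    }

    /** Returns the text content of the first element with the given name. */
    static String readText(String xml, String elementName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName(elementName).item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not read element " + elementName, e);
        }
    }

    private static void appendText(Document doc, Element parent, String name, String text) {
        Element child = doc.createElement(name);
        child.setTextContent(text);
        parent.appendChild(child);
    }
}
```

Because every value travels as text, the receiver needs no knowledge of byte order, packed decimal, or character encoding beyond that of the XML document itself.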
Results (cont’d) Reengineering and Reuse of Legacy Software (cont’d)
  • The net effect of this is that two entities or programs can interact without having to know the data structures that comprise each other's binary interface.
  • Of course, the XML that is exchanged cannot be arbitrary, so industry standards such as XML Schema (XSD) and Web Services Description Language (WSDL) fill this gap.
  • A Web service is considered to be WS-I compliant, or generally interoperable, if it meets many criteria, one of which is the use of XML for the input and output of each operation exposed by the service.
Results (cont’d) Reengineering and Reuse of Legacy Software (cont’d)
  • This particular WS-I requirement, where XML is the interoperable interface of choice, sets the stage for a meaningful exercise.
  • A Legacy Software Reengineering and Reuse Exercise was developed for this module where the focus is on wrappering a COBOL program so that it is reusable from Java using XML in a local environment.
  • The learner is asked to create a language neutral XML interface to the COBOL calculator program and invoke it from a Java program, which incidentally makes it reusable from other Java programs.
Results (cont’d) Reengineering and Reuse of Legacy Software (cont’d)
  • Overview of the architecture for the exercise:

Figure 54. Architecture for legacy application reengineering and reuse from Java.

Results (cont’d) Reengineering and Reuse of Legacy Software (cont’d)
  • Steps in the reengineering and reuse exercise:
    • Create an XML Schema which represents all of the data in the SMPLCALC-INTERFACE COBOL data structure.
    • Write a Java interface ISimpleCalculator.java for the three computation types supported by SMPLCALC.cbl.
    • Write a Java class JSimpleCalculator.java that implements the interface defined in ISimpleCalculator.java and provides a user interface.
    • Use the Java command-line utility xjc, in combination with the XML Schema, to generate Java-to-XML marshalling code (JAXB).
Results (cont’d) Reengineering and Reuse of Legacy Software (cont’d)
  • Steps in the reengineering and reuse exercise (cont’d):
    • Write a small C/C++ JNI program Java2CblXmlBridge.cpp which exports a method Java2SmplCalc that:
      • Invokes XML2CALC.cbl, passing the XML document received from JSimpleCalculator.java.
      • Returns the XML generated by XML2CALC.cbl to JSimpleCalculator.java.
Results (cont’d) Reengineering and Reuse of Legacy Software (cont’d)
  • Steps in the reengineering and reuse exercise (cont’d):
    • Write a COBOL program XML2CALC.cbl that:
      • Marshals XML from Java2CblXmlBridge.cpp into SMPLCALC-INTERFACE.
      • Invokes SMPLCALC.cbl, passing SMPLCALC-INTERFACE by reference.
      • Marshals SMPLCALC-INTERFACE back to XML before returning to Java2CblXmlBridge.cpp.
    • Compile XML2CALC.cbl and link it with the object code for SMPLCALC.cbl (SMPLCALC.obj).
Results (cont’d) Reengineering and Reuse of Legacy Software (cont’d)
  • Steps in the reengineering and reuse exercise (cont’d):
    • Create a DLL to be loaded by JSimpleCalculator.java by compiling and linking Java2CblXmlBridge.cpp with the object code for XML2CALC.cbl.
    • Update JSimpleCalculator.java to use the JAXB marshalling code to send/receive XML through the JNI layer and display the results.
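
On the Java side, the JNI layer described in these steps amounts to a native method declaration plus a library load. The sketch below shows one possible shape. The method name Java2SmplCalc comes from the exercise description, but its String-to-String signature, the class name CobolBridge, and the fallback mechanism are assumptions for illustration only. Note that JNI resolves native methods by a mangled symbol name of the form Java_<class>_<method>, so the C/C++ export must match the declaring class.

```java
import java.util.function.UnaryOperator;

/** Hypothetical Java-side view of the JNI bridge to the COBOL program. */
final class CobolBridge {
    private CobolBridge() {}

    /**
     * Assumed signature: takes the request XML and returns the response XML.
     * The implementation would live in the native bridge DLL.
     */
    static native String Java2SmplCalc(String requestXml);

    /**
     * Tries to load the native library; if it is not available, returns the
     * supplied fallback so the rest of the program can still be exercised.
     */
    static UnaryOperator<String> load(String libraryName, UnaryOperator<String> fallback) {
        try {
            System.loadLibrary(libraryName);
            return CobolBridge::Java2SmplCalc;
        } catch (UnsatisfiedLinkError e) {
            return fallback;
        }
    }
}
```

A caller might write `CobolBridge.load("Java2CblXmlBridge", xml -> xml)` (the library name here is a guess) and then pass marshalled XML through the returned function. Keeping the bridge behind a functional interface also makes the Java code testable on machines where the DLL has not been built.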
Results (cont’d) Reengineering and Reuse of Legacy Software (cont’d)
  • Highlights of the solution code:
    • SimpleCalculator.xsd

<element name="SI-OPERAND-1">
  <simpleType>
    <restriction base="integer">
      <totalDigits value="9" />
    </restriction>
  </simpleType>
</element>
. . .
<element name="SI-OPERATION">
  <simpleType>
    <restriction base="string">
      <enumeration value="+" />
      <enumeration value="-" />
      <enumeration value="*" />
    </restriction>
  </simpleType>
</element>

Results (cont’d) Reengineering and Reuse of Legacy Software (cont’d)
  • Highlights of the solution code (cont’d):
    • ISimpleCalculator.java
Results (cont’d) Reengineering and Reuse of Legacy Software (cont’d)
  • Highlights of the solution code (cont’d):
    • JSimpleCalculator.java
Results (cont’d) Reengineering and Reuse of Legacy Software (cont’d)
  • Highlights of the solution code (cont’d):
    • JSimpleCalculator.java (cont’d)
Results (cont’d) Reengineering and Reuse of Legacy Software (cont’d)
  • Highlights of the solution code (cont’d):
    • Java2CblXmlBridge.cpp
Results (cont’d) Reengineering and Reuse of Legacy Software (cont’d)
  • Highlights of the solution code (cont’d):
    • XML2CALC.cbl
Results (cont’d) Reengineering and Reuse of Legacy Software (cont’d)
  • Sample run of solution code:

Figure 55. Reuse of COBOL from Java using JAXB, JNI, and COBOL XML Support.

Results (cont’d) Reengineering and Reuse of Legacy Software (cont’d)
  • Sample run of solution code:

Figure 56. Reuse of COBOL from Java using JAXB, JNI, and COBOL XML Support.

Results (cont’d) Overview of Developed SRE Course Modules
  • Reversing and Patching Wintel Machine Code
  • Reversing and Patching Java Bytecode
  • Applying Anti-Reversing Techniques to Machine Code
  • Applying Anti-Reversing Techniques to Java Bytecode
  • Reengineering and Reuse of Legacy Software
  • Identifying, Monitoring, and Reporting Malware
Results (cont’d) Identifying, Monitoring, and Reporting Malware
  • Malware describes a category of software that does not always operate in a way that benefits the user.
  • Of course, those of us who have ever used software might contend that this definition of malware will cause programs that we use every day to be categorized as malware.
  • So let's qualify it a bit: the malicious or annoying behaviors of malware are intentional, not the result of one or more bugs.
Results (cont’d) Identifying, Monitoring, and Reporting Malware (cont’d)
  • There are currently five types of malware that affect computer systems [6] [7]:
    • Viruses: require some deliberate action to help them spread.
    • Worms: similar to viruses but can spread by themselves over computer networks.
    • Trojan Horses: functional software that performs hidden malicious or annoying operations.
    • Backdoors: vulnerabilities purposely embedded in software.
    • Rabbits: programs that exhaust system resources.
Results (cont’d) Identifying, Monitoring, and Reporting Malware (cont’d)
  • Malware usually isn't of just one type; for example, 3 of the top 10 malicious code families reported in 2008 were Trojans with a backdoor component [8].
  • Using the machine code and bytecode reversing experiences gained from the previous modules, one could try reversing malware.
  • Using virtualization tools such as VMware to create secondary operating system images on which to analyze malware can still result in infection of the primary operating system.
Results (cont’d) Identifying, Monitoring, and Reporting Malware (cont’d)
  • The goal of this module is to help the learner become familiar with using tools to identify, monitor, and report software that might be malicious.
  • Since it's not practical to ask a learner to install a virus, worm, backdoor, or rabbit, we are left with the possibility of a benign software Trojan (discussed later).
  • In 1996, Mark Russinovich founded a company called “Winternals Software” where he was the chief software architect on a comprehensive suite of tools for diagnosing, debugging, and repairing Windows® systems and applications [9].
Results (cont’d) Identifying, Monitoring, and Reporting Malware (cont’d)
  • Mark's company has since been purchased by Microsoft, and his suite of tools has been rebranded “Windows Sysinternals” and is offered for free on Microsoft TechNet.
  • Mark's story is an interesting one because he is recognized as an expert on the internals of Windows® even though he did not participate in its development—a true testament to what can be learned about software through reverse engineering.
  • The Sysinternals suite contains 66 different utilities, but we'll focus on the most useful one in this context of analyzing the behavior of malware: Process Monitor.
Results (cont’d) Identifying, Monitoring, and Reporting Malware (cont’d)
  • The Process Monitor can capture detailed information about any running process in a Windows® system including: file system, registry, and network activity.

Figure 57. Process Monitor session for the Password Vault application.

Results (cont’d) Identifying, Monitoring, and Reporting Malware (cont’d)
  • Of course, Process Monitor itself doesn't identify malware; it simply reports what a process is doing.
  • With a little bit of ingenuity, one can identify Trojan Horses by looking for activities that don't seem to fit with the advertised functionality of a program.
  • It's common practice to download free software from the Internet, and because we've been convinced that open-source software, which is sometimes confused with free software, should have the fewest number of vulnerabilities, we do it without much afterthought.
Results (cont’d) Identifying, Monitoring, and Reporting Malware (cont’d)
  • Incidentally, the data on the number of vulnerabilities found in popular Internet browsers does not support this belief.
  • “Mozilla browsers were affected by 99 new vulnerabilities in 2008, more than any other browser; there were 47 new vulnerabilities identified in Internet Explorer, 40 in Apple Safari, 35 in Opera™, and 11 in Google® Chrome” [8].
  • It seems counter-intuitive that an open-source browser would have twice as many security holes as a closed-source browser like Internet Explorer.
Results (cont’d) Identifying, Monitoring, and Reporting Malware (cont’d)
  • Becoming familiar with the Windows® Sysinternals suite can help you evaluate whether the software on your Windows® machine is acting in your best interest.
  • If you suspect a particular program to be malware, it can be submitted online to a service called ThreatExpert.
  • ThreatExpert is a Web-based tool that supports submission of software executables that are to be evaluated against an on-line malware database.
  • Matching against existing malware is just one part of ThreatExpert's automated engine; the service tries to execute suspected malware in an isolated environment in order to perform heuristic analysis of its actions.
Results (cont’d) Identifying, Monitoring, and Reporting Malware (cont’d)

Figure 59. Example ThreatExpert report summary for submitted malware.

Results (cont’d) Identifying, Monitoring, and Reporting Malware (cont’d)
  • A Malware Identification and Monitoring Exercise was developed against a Java Alarm Clock application. This program was written to be a benign software Trojan.
  • The exercise asks the learner to identify the behaviors of the Alarm Clock application that make it a software Trojan using the Windows Sysinternals tool suite.
  • The Alarm Clock application bytecode has been aggressively obfuscated to discourage the use of decompilation as a strategy for learning the program’s behavior.
Results (cont’d) Identifying, Monitoring, and Reporting Malware (cont’d)
  • The Alarm Clock application is a benign software Trojan that, in addition to being a rudimentary alarm clock, performs unadvertised functions on background threads:
    • Logs information from the Windows® registry.
    • Logs locations of office documents in the file system.
    • Scans for computers that respond to an ICMP ping.
    • Paced background threads are used.
Results (cont’d) Identifying, Monitoring, and Reporting Malware (cont’d)

Figure 60. Background threads log information about the user’s system.

Results (cont’d) Identifying, Monitoring, and Reporting Malware (cont’d)

Figure 61. Process Monitor session for the Alarm Clock application.
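
One way to triage a Process Monitor capture such as the one in Figure 61 is to export it to CSV and flag events that fall outside what the program advertises. The sketch below is a rough illustration of that idea, not part of the exercise. It assumes the export has a header row containing "Operation" and "Path" columns, and it groups operations with a simple naming heuristic (names starting with "Reg" as registry, "TCP" or "UDP" as network, names ending in "File" as file system). The class name ProcmonTriage and the category scheme are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical triage helper for events exported from a process-monitoring tool. */
final class ProcmonTriage {
    enum Category { REGISTRY, NETWORK, FILE_SYSTEM, OTHER }

    private ProcmonTriage() {}

    /** Heuristic grouping of operation names; real captures may need more rules. */
    static Category categorize(String operation) {
        if (operation.startsWith("Reg")) return Category.REGISTRY;
        if (operation.startsWith("TCP") || operation.startsWith("UDP")) return Category.NETWORK;
        if (operation.endsWith("File") || operation.equals("QueryDirectory")) return Category.FILE_SYSTEM;
        return Category.OTHER;
    }

    /** Splits one CSV line, honoring double-quoted fields and doubled quotes. */
    static List<String> parseCsvLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                if (inQuotes && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    current.append('"');
                    i++;
                } else {
                    inQuotes = !inQuotes;
                }
            } else if (c == ',' && !inQuotes) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    /** Returns "operation path" for each event whose category is not expected. */
    static List<String> flagUnexpected(List<String> csvLines, Set<Category> expected) {
        List<String> flagged = new ArrayList<>();
        if (csvLines.isEmpty()) return flagged;
        List<String> header = parseCsvLine(csvLines.get(0));
        int opCol = header.indexOf("Operation");
        int pathCol = header.indexOf("Path");
        if (opCol < 0 || pathCol < 0) {
            throw new IllegalArgumentException("Missing Operation or Path column");
        }
        for (String line : csvLines.subList(1, csvLines.size())) {
            List<String> fields = parseCsvLine(line);
            if (fields.size() <= Math.max(opCol, pathCol)) continue; // skip malformed rows
            if (!expected.contains(categorize(fields.get(opCol)))) {
                flagged.add(fields.get(opCol) + " " + fields.get(pathCol));
            }
        }
        return flagged;
    }
}
```

For an alarm clock, one might expect only file-system activity (for example, reading a sound file), so registry queries or network operations would stand out for closer inspection.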

Conclusions
  • Since programmers would benefit from reverse engineering education, instructors need to be able to teach it to them.
  • At the present time, computer science instructors will be hard pressed to find materials for teaching a course that are compatible with classroom delivery.
  • Several books exist on reverse engineering that cater to industry professionals or those interested in self-study.
  • However, in a university setting, instructors engage students in ordered learning through exercises, quizzes, and exams.
Conclusions
  • Universities should continue to work toward establishing standard content for software reverse engineering and software maintenance courses.
  • Software Reverse Engineering is an activity that relies heavily on tools. Better tools can only make this activity more feasible and reliable.
  • The market for reverse engineering tools does not seem saturated; there appear to be some opportunities for either new open-source projects or commercial products.
References

[1] M. R. Ali, “Why teach reverse engineering?” ACM SIGSOFT SEN, v.30, n.4, pp.1-4, Jul 2005.

[2] M. El-Ramly, “Experience in teaching a software reengineering course,” in Proceedings of the 28th International Conference on Software Engineering (ICSE). Shanghai, China, 2006, pp. 699-702.

[3] A. V. Deursen, J. Favre, R. Koschke, and J. Rilling, “Experiences in Teaching Software Evolution and Program Comprehension,” in Proceedings of the 11th IEEE international Workshop on Program Comprehension, Washington, DC, 2003, pp. 2834-284.

[4] B. W. Weide, W. D. Heym, J. E. Hollingsworth, “Reverse engineering of legacy code exposed,” in Proceedings of the 17th International Conference on Software Engineering, Seattle, WA, 1995, pp. 327-331.

[5] H. M. Sneed, “Encapsulation of legacy software: A technique for reusing legacy software components,” in Annals of Software Engineering, v.9, n.4, pp.293-313, 2000.

References (cont’d)

[6] B. W. Weide, W. D. Heym, J. E. Hollingsworth, “Reverse engineering of legacy code exposed,” in Proceedings of the 17th International Conference on Software Engineering, Seattle, WA, 1995, pp. 327-331.

[7] E. Eilam, Reversing: Secrets of Reverse Engineering, Indianapolis, IN: Wiley, 2005. M. Stamp, Information Security: Principles and Practice, Hoboken, NJ: John Wiley & Sons, 2006.

[8] Symantec Corp. (2009, Apr.). Symantec Global Internet Security Threat Report. [Online]. Available: http://eval.symantec.com/mktginfo/enterprise/white_papers/bwhitepaper_internet_security_threat_report_xiv_04-2009.en-us.pdf. (Accessed April 26th, 2009).

[9] Microsoft Corporation, Windows Sysinternals: utilities to help manage, troubleshoot and diagnose Windows systems and applications. [Online]. Available: http://technet.microsoft.com/en-us/sysinternals/default.aspx. (Accessed April 30th, 2009).