Disassembling for Fun

Views: 84 | Added: 20-02-2012
1. Disassembling for Fun
Jason Haley

2. Who is this guy?
- Certifiable (MCSD.net certified, that is)
- Blog: http://jasonhaley.com/blog
- Co-leader of Beantown .Net User Group
- Member of Boston Area Code Brew
- A nerd dinner organizer for Boston area
- TA for Programming .Net at Harvard
- Sr. Software Engineer, Cheshire Software

3. Disassembling is useful
- See how efficient a compiler is
- Translate IL to a higher level language
- View all pieces of an assembly
- Extract resources
- Edit source code to recompile

4. Example of disassembling
- What is round-tripping?
- Demos: ILDasm, Reflector

Notes:
Round-tripping means disassembling an assembly, doing something with it, then compiling the result of the disassembly (including changes).

Three reasons to alter the ILAsm source code during a round trip:
- Change the code emitted by a high-level compiler or a tool in a way the compiler would not allow you to do
- Add items written in ILAsm to extend your application's functionality beyond the capabilities of a high-level compiler (e.g. Mike Stall's IL inlining in C# and VB)
- Combine several modules into a single module

Demo instructions:
    ildasm /out:hello.il helloworld.exe
    ilasm /exe hello.il /out:HelloWorld2.exe

5. Agenda
- Define disassembling
- Applied disassembling
- Writing a disassembler

6. What is disassembling?
- Disassembling is not reflection
- Demos: WinCV, Asmex
- Disassemble or decompile
- Demos: ILDasm, Reflector

Notes:
- WinCV shows interface-level information using reflection.
- Asmex shows the same interface-level information, but it also shows more detailed data that is contained in the metadata of the assembly.
- Disassemble: take machine code and convert it to assembler.
- Decompile: take machine code and convert it to a higher level language (C, C++, etc.).

7. Agenda
- Define disassembling
- Applied disassembling
- Writing a disassembler

8. What is in an assembly file?
- PE/COFF File
- CLR Header
- Metadata
- IL code

9. PE File
- Portable Executable File Format
- PE/COFF headers
- Data directories
- Sections
- Demos: Dumpbin, .Net Explorer

Notes:
- The PE/COFF Specification outlines the file formats for (executable) image and object files.
- "Portable" means not architecture specific (Windows runs on multiple hardware platforms: Intel, MIPS, Alpha, etc.).

From the PE spec:
- Image file: An executable file, either a .exe file or a .dll. An image file can be thought of as a "memory image". The term "image file" is usually used instead of "executable file", because the latter is sometimes taken to mean only a .exe file.
- Object file: A file given as input to the linker. The linker produces an image file, which in turn is used as input by the loader. The term "object file" does not necessarily imply any connection to object-oriented programming.
- RVA: Relative Virtual Address. In an image file, an RVA is always the address of an item once loaded into memory, with the base address of the image file subtracted from it. The RVA of an item will almost always differ from its position within the file on disk (file pointer).
- Virtual Address (VA): Same as RVA, except that the base address of the image file is not subtracted. The address is called a "virtual address" because Windows creates a distinct virtual address space for each process, independent of physical memory. A virtual address is not as predictable as an RVA, because the loader might not load the image at its preferred location.
- File pointer: Location of an item within the file itself, before being processed by the linker or the loader. In other words, this is a position within the file as stored on disk.
- Section: The basic unit of code or data within a PE/COFF file.
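
The relationship between an RVA, a virtual address, and a file pointer can be sketched in a few lines of Python. This is a minimal illustration rather than code from the talk: the `Section` tuple, the function names, and the sample values (a `.text` section mapped at RVA 0x2000 whose raw data starts at file offset 0x200, and an image base of 0x400000) are assumptions chosen for the example.

```python
from typing import NamedTuple, Optional, Sequence


class Section(NamedTuple):
    """The few section-header fields needed for address translation."""
    name: str
    virtual_address: int      # RVA at which the section is mapped
    virtual_size: int         # size of the section once loaded
    pointer_to_raw_data: int  # file pointer of the section's data on disk
    size_of_raw_data: int     # size of the section's data on disk


def rva_to_va(rva: int, image_base: int) -> int:
    """A VA is the RVA with the actual load address added back in."""
    return image_base + rva


def rva_to_file_pointer(rva: int, sections: Sequence[Section]) -> Optional[int]:
    """Translate an RVA to a position in the file, or None if unmapped."""
    for section in sections:
        span = max(section.virtual_size, section.size_of_raw_data)
        if section.virtual_address <= rva < section.virtual_address + span:
            return rva - section.virtual_address + section.pointer_to_raw_data
    return None


# Hypothetical layout for illustration only.
SAMPLE_SECTIONS = [
    Section(".text", 0x2000, 0x1000, 0x0200, 0x1000),
    Section(".rsrc", 0x4000, 0x0400, 0x1200, 0x0400),
]

print(hex(rva_to_file_pointer(0x2008, SAMPLE_SECTIONS)))
print(hex(rva_to_va(0x2008, 0x400000)))
```

The file pointer depends only on the section table, while the VA depends on where the loader actually placed the image, which is why the notes call the RVA the more predictable of the two.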

10. CLR Header
- Contains CLR specific information
- "Required runtime" version
- Metadata location
- Managed resources location
- Strong name signature location
- Demo: .Net Explorer

Notes:
MajorRuntimeVersion and MinorRuntimeVersion may not be the values you are expecting, because they really refer to the CLR header version, as alluded to in the comment at corhdr.h line 124: "// CLR 2.0 header structure". The header version can therefore differ from the "runtime" version, which here is v1.1.4322, as can be seen in the metadata root information.
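
As a rough sketch of what a tool reads from this header, the snippet below unpacks the leading fields of the 72-byte `IMAGE_COR20_HEADER` structure with Python's `struct` module. The field order follows corhdr.h; the `ClrHeader` tuple, the function name, and the sample bytes are hypothetical and exist only to make the example runnable.

```python
import struct
from typing import NamedTuple

# cb, MajorRuntimeVersion, MinorRuntimeVersion, MetaData (RVA, size),
# Flags, EntryPointToken, Resources (RVA, size),
# StrongNameSignature (RVA, size), then four more RVA/size pairs.
COR20_FORMAT = "<IHH2III2I2I8I"


class ClrHeader(NamedTuple):
    cb: int
    major_runtime_version: int
    minor_runtime_version: int
    metadata_rva: int
    metadata_size: int
    flags: int
    entry_point_token: int
    resources_rva: int
    resources_size: int
    strong_name_rva: int
    strong_name_size: int


def parse_clr_header(data: bytes, offset: int = 0) -> ClrHeader:
    """Unpack the CLR header found at `offset` (a file pointer, not an RVA)."""
    fields = struct.unpack_from(COR20_FORMAT, data, offset)
    return ClrHeader(*fields[:11])  # the trailing directories are ignored


# Made-up header bytes for illustration; real values come from an assembly.
SAMPLE_HEADER = struct.pack(
    COR20_FORMAT,
    72, 2, 0,          # cb and the header (not runtime) version
    0x2068, 0x0400,    # metadata RVA and size
    0x1,               # flags
    0x06000001,        # entry point token: Method table, row 1
    0, 0,              # no managed resources
    0, 0,              # no strong name signature
    *([0] * 8),
)

header = parse_clr_header(SAMPLE_HEADER)
print(header.major_runtime_version, header.minor_runtime_version)
print(hex(header.metadata_rva), header.metadata_size)
```

The locations in the header are RVAs, so each one still has to be translated to a file pointer before the data can be read from disk.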

11. Metadata
- Assembly metadata
- Metadata header
- Metadata streams (tables and heaps)
- Demos: Monodis, Asmex, Spices.Net

Notes:
The metadata header (storage header) contains information such as the signature, version information (which is the expected version 1.1.4322), and the number of metadata streams in the assembly.

Up to 5 streams in an assembly (from Chapter 4 in Serge's book); the table stream is either #~ or #-, not both:
- #Strings: contains the names of metadata items (class names, method names, field names, etc.).
- #Blob: contains internal binary objects such as default values and signatures.
- #GUID: contains all sorts of GUIDs used by the assembly.
- #US: contains user-defined strings, meaning string constants defined in the code (kept in Unicode).
- #~: contains a compressed metadata stream, which is an optimized system of metadata tables.
- #-: contains an uncompressed metadata stream, which will include intermediate lookup tables; usually used by debuggers in cases such as edit and continue.

Before version 2.0 of .Net, the metadata schema defines 44 tables. Additional tables have been added in 2.0 for generics.
- RID: a record identifier, which is the one-based row number in the table containing the record.
- Token: a 4-byte unsigned integer whose senior byte carries a zero-based table index (which is the same as the internal metadata RID type). The remaining 3 bytes are left for the RID.
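
The token layout described above (table index in the senior byte, RID in the remaining three) can be illustrated with a short sketch. The function names are made up for this example, and `TABLE_NAMES` lists only a handful of table indices from ECMA-335 rather than the full schema.

```python
from typing import Tuple

# A small, incomplete subset of metadata table indices.
TABLE_NAMES = {
    0x00: "Module",
    0x01: "TypeRef",
    0x02: "TypeDef",
    0x04: "Field",
    0x06: "Method",
    0x0A: "MemberRef",
    0x20: "Assembly",
    0x23: "AssemblyRef",
}


def split_token(token: int) -> Tuple[int, int]:
    """Return (table index, RID) for a 4-byte metadata token."""
    if not 0 <= token <= 0xFFFFFFFF:
        raise ValueError("a token is a 4-byte unsigned integer")
    return token >> 24, token & 0x00FFFFFF


def make_token(table: int, rid: int) -> int:
    """Combine a zero-based table index and a one-based RID into a token."""
    if not 0 <= table <= 0xFF or not 0 <= rid <= 0x00FFFFFF:
        raise ValueError("table must fit in 1 byte and RID in 3 bytes")
    return (table << 24) | rid


def describe_token(token: int) -> str:
    """Render a token the way a disassembler listing might label it."""
    table, rid = split_token(token)
    name = TABLE_NAMES.get(table, "table 0x%02X" % table)
    return "%s row %d" % (name, rid)


print(describe_token(0x06000001))
print(hex(make_token(0x02, 2)))
```

Turning on token display in ILDasm and decoding a few tokens this way is a quick check on which table a given piece of output came from.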

12. IL Code
- Recognizing the pieces
- Metadata table contents
- Metadata heap contents
- IL code
- Demos: Metadata diagram, ILDasm, Dis#

Notes:
Disassemblers and decompilers make the output look easy. The output of a disassembly is similar to any complicated report that you might have created in the past, pulling all the data from a relational database using multiple tables. To really understand the pieces, turn on all the information in ILDasm and start looking at the tokens to see where each specific piece of information is coming from.

13. Disassemblers/Decompilers
- ILDasm: http://www.microsoft.com/downloads/details.aspx?FamilyID=9b3a2ca6-3647-4070-9f41-a333c6b9181d&displaylang=en (written by Microsoft; cost: $0)
- Monodis: http://www.mono-project.com/Downloads (written by mono; cost: $0)
- DILE, Dotnet IL Editor: http://sourceforge.net/projects/dile (written by Petrény Zsolt; cost: $0)
- Reflector for .Net: http://www.aisto.com/roeder/dotnet/ (written by Lutz Roeder; cost: $0)
- Asmex, Free-source .Net Assembly Examiner: http://www.jbrowse.com/products/asmex/ (written by Ben Peterson; cost: $0)
- Dis#, .Net decompiler: http://www.netdecompiler.com/ (written by NETdecompiler.com; cost: ~$345)
- .Net Explorer: http://www.remotesoft.com/dotexplorer/ (written by Remotesoft; cost: ~$1,099)
- Spices.Net: http://www.9rays.net/cgi-bin/components.cgi?act=1&cid=86 (written by 9rays.net; cost: ~$693)

14. DILE: Dotnet IL Editor
- Open source (Zsolt Petreny): http://sourceforge.net/projects/dile
- Disassembles to IL
- Quick search for name and tokens
- Debugger functionality: can debug IL!
- Demo: Debugging IL vs. Assembler

15. Reflector for .Net
- Lutz Roeder: http://www.aisto.com/roeder/dotnet
- Great code browsing tool
- Add-ins created by community: http://csharp21.tripod.com/ReflectorAddIns
- Demo: Reflector and its add-ins

16. Asmex: Assembly Examiner
- Free source (Ben Peterson): http://www.jbrowse.com/products/asmex/
- Graphical representation
- Most pieces of an assembly
- Demo: Look at the code

17. Agenda
- Define disassembling
- Applied disassembling
- Writing a disassembler

18. Writing a disassembler
- PE/COFF File
- CLR Header
- Metadata
- IL Code

19. PE File
- Finding the PE header
- Signatures (MS-DOS, PE)
- Necessary structures
- Demos: Vijay

Notes:
Winnt.h file:
- Line 6392: #define IMAGE_DOS_SIGNATURE 0x4D5A // MZ
- Line 6395: #define IMAGE_NT_SIGNATURE 0x50450000 // PE00
- Line 6515: COFF Header _IMAGE_FILE_HEADER
- Line 6589: PE Header _IMAGE_OPTIONAL_HEADER

The two signature values above are the byte-swapped definitions; winnt.h also defines the little-endian forms 0x5A4D and 0x00004550. Either way, the bytes on disk are "MZ" and "PE" followed by two zero bytes.

Sizes:
- MS-DOS header: 64 bytes (its last field holds the file offset of the PE signature)
- PE Signature: 4 bytes
- COFF Header: 20 bytes
- PE Header: 224 bytes
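
The lookup described on this slide can be sketched as follows: check the MS-DOS signature, read the 4-byte offset stored at 0x3C, check the PE signature there, then unpack the 20-byte COFF header that follows it. The function names and the synthetic image built by `build_sample_image` are assumptions made so the example runs without a real assembly on disk.

```python
import struct
from typing import NamedTuple


class CoffHeader(NamedTuple):
    machine: int
    number_of_sections: int
    time_date_stamp: int
    pointer_to_symbol_table: int
    number_of_symbols: int
    size_of_optional_header: int
    characteristics: int


def find_pe_signature(image: bytes) -> int:
    """Return the file offset of the 4-byte PE signature."""
    if len(image) < 0x40 or image[:2] != b"MZ":
        raise ValueError("missing MS-DOS signature 'MZ'")
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    return e_lfanew


def read_coff_header(image: bytes, pe_offset: int) -> CoffHeader:
    """The 20-byte COFF header starts right after the PE signature."""
    return CoffHeader(*struct.unpack_from("<HHIIIHH", image, pe_offset + 4))


def build_sample_image() -> bytes:
    """Build a tiny synthetic header region (hypothetical values)."""
    dos = bytearray(0x80)
    dos[0:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x80)  # PE signature at 0x80
    coff = struct.pack("<HHIIIHH", 0x014C, 1, 0, 0, 0, 224, 0x0102)
    return bytes(dos) + b"PE\x00\x00" + coff


image = build_sample_image()
pe_offset = find_pe_signature(image)
coff = read_coff_header(image, pe_offset)
print(hex(pe_offset), coff.number_of_sections, coff.size_of_optional_header)
```

Comparing raw bytes (`b"MZ"`, `b"PE\x00\x00"`) instead of integer constants sidesteps the byte-order question that the header-file definitions raise.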

20. CLR Header
- Finding the CLR Header
- Need information from PE Header
- Calculate the offset in file
- Demos: Vijay

Notes:
- CorHdr.h Line 125: IMAGE_COR20_HEADER
- The 15th data directory of the PE Header contains the RVA and size of the CLR Header
- Stored in the .text section of the PE file
- Offset in file = RVA % section alignment + size of headers

This shortcut assumes the CLR header lies in the first section and that the section's raw data starts right after the headers. In general, offset in file = RVA - section's virtual address + section's pointer to raw data.
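
A worked version of the slide's calculation might look like the sketch below. It reads the section alignment, the size of headers, and the 15th data directory from a 224-byte PE32 optional header, then applies the shortcut formula. The field offsets (32, 60, and 96 for the start of the data directories) are those of the PE32 optional header; the function name and the sample header bytes are invented for illustration.

```python
import struct
from typing import Tuple

PE32_MAGIC = 0x010B
SECTION_ALIGNMENT_OFFSET = 32
SIZE_OF_HEADERS_OFFSET = 60
DATA_DIRECTORIES_OFFSET = 96
CLR_DIRECTORY_INDEX = 14  # zero-based, so this is the 15th directory


def clr_header_location(optional_header: bytes) -> Tuple[int, int]:
    """Return (file offset, size) of the CLR header via the slide's shortcut.

    Only valid when the CLR header sits in the first section and that
    section's raw data begins immediately after the headers.
    """
    (magic,) = struct.unpack_from("<H", optional_header, 0)
    if magic != PE32_MAGIC:
        raise ValueError("this sketch only handles PE32 optional headers")
    (alignment,) = struct.unpack_from(
        "<I", optional_header, SECTION_ALIGNMENT_OFFSET)
    (size_of_headers,) = struct.unpack_from(
        "<I", optional_header, SIZE_OF_HEADERS_OFFSET)
    rva, size = struct.unpack_from(
        "<II", optional_header,
        DATA_DIRECTORIES_OFFSET + 8 * CLR_DIRECTORY_INDEX)
    if rva == 0:
        raise ValueError("no CLR header: not a managed image")
    return rva % alignment + size_of_headers, size


# Hypothetical optional header: alignment 0x2000, headers 0x200 bytes,
# CLR header at RVA 0x2008 and 72 bytes long.
sample = bytearray(224)
struct.pack_into("<H", sample, 0, PE32_MAGIC)
struct.pack_into("<I", sample, SECTION_ALIGNMENT_OFFSET, 0x2000)
struct.pack_into("<I", sample, SIZE_OF_HEADERS_OFFSET, 0x200)
struct.pack_into("<II", sample,
                 DATA_DIRECTORIES_OFFSET + 8 * CLR_DIRECTORY_INDEX,
                 0x2008, 72)

offset, size = clr_header_location(bytes(sample))
print(hex(offset), size)
```

With these sample values, 0x2008 % 0x2000 leaves 8, and adding the 0x200 bytes of headers gives a file offset of 0x208.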

21. Metadata
- Tables are a "normalized database"
- Heaps
- String: zero-terminated character strings
- GUID: 16-byte binary objects
- Blob: binary object, preceded by its length
- Manifest
- Demos: metainfo, Vijay

Notes:
Tables
- Coded token: same idea as a token, but used in tables where fields can reference multiple tables, and stored in a more compact format (tokens are always 4 bytes, coded tokens will be either 2 or 4 bytes).
- The size of an RID is determined by the row count of the table.
- The Mask field in the metadata stream header contains a bit vector indicating which tables exist in the metadata.
- The table and column schema is "hard coded" and known, but not stored in the file itself; you need to know the layout of the records in order to read the bytes correctly.

Heap
- Blobs contain their length first, then the data. The length is stored in a compressed format that you will need to decompress before you know how long the data is.

Manifest
- Metadata that describes an assembly and its modules. Made up of entries in the following metadata tables:
- Assembly (assembly identity; prime module only)
- AssemblyRef (other assemblies referenced in this one)
- CustomAttribute
- DeclSecurity (prime module only)
- ExportedType (types exposed by this assembly and defined in other modules; prime module only)
- File (other files of the same assembly)
- ManifestResource (managed resources defined in this assembly or defined or used in this module)
- Module (identity of this module)
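
Three of the points above lend themselves to a small sketch: the RID width that follows from a table's row count, the compressed length that prefixes every blob, and the tag bits of a coded token. The encodings follow ECMA-335 as I understand it; the function names are illustrative, and `TYPE_DEF_OR_REF` covers only one coded-token family (a 2-bit tag selecting TypeDef, TypeRef, or TypeSpec).

```python
from typing import Tuple


def rid_size(row_count: int) -> int:
    """A RID takes 2 bytes unless the table is too large for that."""
    return 2 if row_count < 0x10000 else 4


def read_compressed_length(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode a compressed length; return (value, bytes consumed)."""
    first = data[offset]
    if (first & 0x80) == 0x00:   # 0xxxxxxx: 1 byte
        return first, 1
    if (first & 0xC0) == 0x80:   # 10xxxxxx: 2 bytes
        return ((first & 0x3F) << 8) | data[offset + 1], 2
    if (first & 0xE0) == 0xC0:   # 110xxxxx: 4 bytes
        value = ((first & 0x1F) << 24) | (data[offset + 1] << 16) \
            | (data[offset + 2] << 8) | data[offset + 3]
        return value, 4
    raise ValueError("invalid compressed length prefix")


def read_blob(heap: bytes, index: int) -> bytes:
    """Return the blob stored at `index` in a #Blob heap."""
    length, used = read_compressed_length(heap, index)
    start = index + used
    return heap[start:start + length]


TYPE_DEF_OR_REF = ("TypeDef", "TypeRef", "TypeSpec")


def decode_type_def_or_ref(coded: int) -> Tuple[str, int]:
    """Split a TypeDefOrRef coded token into (table name, RID)."""
    tag = coded & 0x3
    if tag >= len(TYPE_DEF_OR_REF):
        raise ValueError("invalid TypeDefOrRef tag")
    return TYPE_DEF_OR_REF[tag], coded >> 2


# Hypothetical heap: an empty blob at index 0, a 3-byte blob at index 1.
sample_heap = b"\x00" + b"\x03\x20\x00\x01"
print(read_blob(sample_heap, 1).hex())
print(decode_type_def_or_ref(0x0009))
```

The tag bits are what let a coded token fit in 2 bytes when the referenced tables are small, which is why a reader has to know each column's schema before it can size the records.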

22. IL Code
- Getting to the IL code
- Signatures
- RVA
- Method format (tiny or fat)
- Method data section
- Exception handling clause (small or fat)
- Demos: Dile, Vijay

Notes:
Signatures are:
- Constructs that define the types of program items
- Built from encoded references to various classes and value types
- Binary objects containing one or more encoded types, residing in the #Blob heap

References:
- Ecma standard doc Sect 23.2: Blobs and Signatures
- Method table schema
- Ecma standard doc Sect 25.4.1: Method header and type values

Method header
- Tiny header: only 1 byte, with the 2 low-order bits holding the type (binary 10) and the 6 remaining bits holding the method IL code size in bytes (which must be less than 64 bytes). Used if all of the following are true:
- No local variables
- No exceptions
- No extra data sections
- The operand stack is no bigger than 8 entries
- Fat header: 12 bytes in size. Used if any of the following is true:
- The method is too large to encode the size (at least 64 bytes)
- There are exceptions
- There are extra data sections
- There are local variables
- The operand stack needs more than 8 entries (or is set to a specific value)
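
Telling the two header formats apart can be sketched as below. The bit layout follows the ECMA-335 description of tiny and fat method headers (format in the two low-order bits, fat flags 0x08 for extra sections and 0x10 for initialized locals); the `MethodHeader` tuple, the function name, and the sample bytes are assumptions made for the example.

```python
import struct
from typing import NamedTuple

TINY_FORMAT = 0x2
FAT_FORMAT = 0x3
FLAG_MORE_SECTIONS = 0x08
FLAG_INIT_LOCALS = 0x10


class MethodHeader(NamedTuple):
    is_fat: bool
    header_size: int          # bytes to skip to reach the IL code
    max_stack: int
    code_size: int
    local_var_sig_token: int  # 0 when the method has no locals
    more_sections: bool       # extra data sections, e.g. exception clauses
    init_locals: bool


def parse_method_header(data: bytes, offset: int = 0) -> MethodHeader:
    """Parse the method header at the file position a method's RVA maps to."""
    first = data[offset]
    fmt = first & 0x3
    if fmt == TINY_FORMAT:
        # Tiny: the upper 6 bits are the code size; max stack is fixed at 8.
        return MethodHeader(False, 1, 8, first >> 2, 0, False, False)
    if fmt == FAT_FORMAT:
        flags_and_size, max_stack, code_size, local_sig = struct.unpack_from(
            "<HHII", data, offset)
        flags = flags_and_size & 0x0FFF
        header_size = (flags_and_size >> 12) * 4  # stored in 4-byte units
        return MethodHeader(True, header_size, max_stack, code_size,
                            local_sig,
                            bool(flags & FLAG_MORE_SECTIONS),
                            bool(flags & FLAG_INIT_LOCALS))
    raise ValueError("unknown method header format")


# Hypothetical headers: a tiny one for 10 bytes of IL, and a fat one with
# initialized locals, max stack 2, 100 bytes of IL.
tiny = parse_method_header(bytes([0x2A]))
fat = parse_method_header(struct.pack("<HHII", 0x3013, 2, 100, 0x11000001))
print(tiny.code_size, fat.header_size, fat.code_size, fat.init_locals)
```

The IL bytes begin `header_size` bytes after the start of the header, and when `more_sections` is set the exception handling clauses follow the code.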

23. Summary
- What is disassembling?
- What is a disassembler and what can it do for you?
- Where can I find a disassembler?
- What are some of the things you need to know to write your own disassembler?
- Why do you care?

24. Resources
- Inside Microsoft .Net IL Assembler, Serge Lidin
- Standard ECMA-335, CLI: http://ecma-international.org/publications/standards/Ecma-335.htm
- Metadata diagram, Chris King
- .Net SDK (especially ILDasm)

25. Questions?

