
Interfacing with ELF files


hertz

Presentation Transcript


  1. Interfacing with ELF files: An introduction to the Executable and Linkable Format (ELF) binary file specification standard

  2. Overview of source translation (diagram)
     User-created files: Makefile, C/C++ Source and Header Files, Assembly Source Files, Linker Command File
     Tools: Make Utility, preprocessor, compiler, assembler, Archive Utility, Linker and Locator
     Intermediate files: Object Files, Library Files
     Linker and Locator outputs: Shared Object File, Linkable Image File, Executable Image File, Link Map File

  3. Executable versus Linkable
     Linkable File                      Executable File
     ELF Header                         ELF Header
     Program-Header Table (optional)    Program-Header Table
     Section 1 Data                     Segment 1 Data
     Section 2 Data                     Segment 2 Data
     Section 3 Data                     Segment 3 Data
     …                                  …
     Section n Data                     Segment n Data
     Section-Header Table               Section-Header Table (optional)

  4. Role of the Linker (diagram)
     Linkable File (input): ELF Header, Section 1 Data, Section 2 Data, …, Section n Data, Section-Header Table
     Linkable File (input): ELF Header, Section 1 Data, Section 2 Data, …, Section n Data, Section-Header Table
     Executable File (output): ELF Header, Program-Header Table, Segment 1 Data, Segment 2 Data, …, Segment n Data

  5. ELF Header
     Fields: e_ident[EI_NIDENT], e_type, e_machine, e_version, e_entry, e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx
     Section-Header Table: e_shoff, e_shentsize, e_shnum, e_shstrndx
     Program-Header Table: e_phoff, e_phentsize, e_phnum, e_entry
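As a rough illustration of how these fields sit in a file, the sketch below unpacks the 52-byte ELF32 header with Python's `struct` module. It assumes a little-endian 32-bit file, which is what the Linux tools in these slides produce; the function name `parse_elf32_header` is hypothetical and is not taken from the course's 'loadmap.cpp'.

```python
import struct

# Elf32_Ehdr layout (little-endian): 16 identification bytes, then
# two 16-bit fields, five 32-bit fields, and six 16-bit fields (52 bytes).
EHDR_FORMAT = "<16sHHIIIIIHHHHHH"
EHDR_FIELDS = (
    "e_ident", "e_type", "e_machine", "e_version", "e_entry",
    "e_phoff", "e_shoff", "e_flags", "e_ehsize", "e_phentsize",
    "e_phnum", "e_shentsize", "e_shnum", "e_shstrndx",
)

def parse_elf32_header(image: bytes) -> dict:
    """Return the ELF32 header fields of 'image' as a dictionary."""
    size = struct.calcsize(EHDR_FORMAT)
    if len(image) < size or image[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    values = struct.unpack_from(EHDR_FORMAT, image, 0)
    return dict(zip(EHDR_FIELDS, values))
```

For a linkable object-file one would expect `e_type` to be 1 (ET_REL) and `e_phnum` to be 0, since the program-header table is optional there.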

  6. Section-Headers
     Fields: sh_name, sh_type, sh_flags, sh_addr, sh_offset, sh_size, sh_link, sh_info, sh_addralign, sh_entsize
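A section-header entry in a 32-bit ELF file is ten 32-bit words, and `sh_name` is an offset into the section-name string table selected by `e_shstrndx`. The following sketch walks the table under those assumptions; the helper names are hypothetical.

```python
import struct

SHDR_FORMAT = "<10I"  # ten little-endian 32-bit words, 40 bytes per entry
SHDR_FIELDS = (
    "sh_name", "sh_type", "sh_flags", "sh_addr", "sh_offset",
    "sh_size", "sh_link", "sh_info", "sh_addralign", "sh_entsize",
)

def parse_section_headers(image, e_shoff, e_shentsize, e_shnum):
    """Unpack every section-header entry, using the ELF header's table fields."""
    headers = []
    for index in range(e_shnum):
        start = e_shoff + index * e_shentsize
        values = struct.unpack_from(SHDR_FORMAT, image, start)
        headers.append(dict(zip(SHDR_FIELDS, values)))
    return headers

def section_name(image, headers, e_shstrndx, header):
    """Look up a section's NUL-terminated name in the section-name string table."""
    strtab = headers[e_shstrndx]
    start = strtab["sh_offset"] + header["sh_name"]
    end = image.index(b"\0", start)
    return image[start:end].decode("ascii")
```

Each section's `sh_offset` and `sh_size` give its location and length within the object-module.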

  7. Program-Headers
     Fields: p_type, p_offset, p_vaddr, p_paddr, p_filesz, p_memsz, p_flags, p_align
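Program-header entries in a 32-bit ELF file are eight 32-bit words in the order listed above. The sketch below unpacks one entry and also shows one common use of the fields: when `p_memsz` exceeds `p_filesz`, a loader is expected to zero-fill the difference. The helper names are hypothetical.

```python
import struct

PHDR_FORMAT = "<8I"  # eight little-endian 32-bit words, 32 bytes per entry
PHDR_FIELDS = (
    "p_type", "p_offset", "p_vaddr", "p_paddr",
    "p_filesz", "p_memsz", "p_flags", "p_align",
)

def parse_program_header(image, e_phoff, e_phentsize, index):
    """Unpack program-header entry number 'index'."""
    start = e_phoff + index * e_phentsize
    values = struct.unpack_from(PHDR_FORMAT, image, start)
    return dict(zip(PHDR_FIELDS, values))

def zero_fill_bytes(phdr):
    """Bytes the loader must clear beyond the data stored in the file."""
    return phdr["p_memsz"] - phdr["p_filesz"]
```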

  8. Memory: Physical vs. Virtual (diagram)
     Virtual address space (4 GB); physical address space (1 GB)
     Portions of physical memory are “mapped” by the CPU into regions of each task’s ‘virtual’ address-space

  9. Linux ‘Executable’ ELF files
     • The Executable ELF files produced by the Linux linker are configured for execution in a private ‘virtual’ address space, whereby every program gets loaded at the identical virtual memory-address (i.e., 0x08048000)
     • We will soon study the Pentium’s paging mechanism which makes this possible (i.e., after we have finished Project #2)

  10. Linux ‘Linkable’ ELF files
      • But it is possible that some ‘linkable’ ELF files are self-contained (i.e., they do not need to be linked with other object-files or libraries)
      • Our ‘manydots.o’ is such an example
      • So we can write our own system-code that can execute the instructions contained in a stand-alone ‘linkable’ object-module, using the CPU’s ‘segmented’ physical memory

  11. Our ‘loadmap.cpp’ utility
      • We created a tool that ‘parses’ a linkable ELF file, to identify each section’s length, type, and location within the object-module
      • For those sections containing the ‘text’ and ‘data’ for the program, we build segment-descriptors, based on where the linkable image-file will reside in physical memory

  12. 32-bit versus 16-bit code
      • The Linux compilers, and the ‘as’ assembler, produce object-files that are intended to reside in ‘32-bit’ memory-segments (i.e., the ‘default’ bit in the segment-descriptor is set to 1)
      • This affects the CPU’s interpretation of the machine-instructions that it fetches
      • Our ‘as86’ assembler can produce either 16-bit or 32-bit code (though its default is 16-bit code)
      • We can employ ‘USE32’ or ‘USE16’ directives

  13. Example: ‘as86’ Listing
                              USE32
      0x0000  01 D8           add  eax, ebx
      0x0002  66 01 D8        add  ax, bx
      0x0005  90              nop
                              USE16
      0x0006  66 01 D8        add  eax, ebx
      0x0009  01 D8           add  ax, bx
      0x000B  90              nop
                              END
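The pattern in the listing can be summarized in a few lines: the opcode bytes 01 D8 are the same in both modes, and the 0x66 operand-size prefix appears only when the operand width differs from the segment's default. The sketch below reproduces just this one instruction; the function name `encode_add` is hypothetical, and this is not how 'as86' is implemented.

```python
OPERAND_SIZE_PREFIX = bytes([0x66])
# Opcode 01 (ADD r/m, r) with ModRM byte D8: destination AX/EAX, source BX/EBX.
ADD_ACC_BX = bytes([0x01, 0xD8])

def encode_add(default_bits: int, operand_bits: int) -> bytes:
    """Encode 'add ax, bx' (16) or 'add eax, ebx' (32) for a USE16/USE32 segment."""
    if default_bits not in (16, 32) or operand_bits not in (16, 32):
        raise ValueError("sizes must be 16 or 32")
    # The prefix toggles the operand size away from the segment default.
    prefix = OPERAND_SIZE_PREFIX if operand_bits != default_bits else b""
    return prefix + ADD_ACC_BX
```

This is why code assembled for a 32-bit segment misbehaves when fetched from a 16-bit segment: the same bytes select the other operand width.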

  14. Demo-program
      • We created a Linux program (‘hello.s’) that invokes two system-calls (‘write’ and ‘exit’)
      • We assembled it with the ‘as’ assembler:
            $ as hello.s -o hello.o
      • The linkable ELF object-file ‘hello.o’ is then written to our boot-disk (track 0, sector 14) using:
            $ dd if=hello.o of=/dev/fd0 seek=13
      • (It will get loaded into memory by ‘trackldr’)

  15. Memory-Map
      ‘hello.o’ image       0x00011800
      ‘try32bit.b’ image    0x00010000
      BOOT-LOADER           0x00007C00
      ROM-BIOS DATA         0x00000400
      IVT
      (The images above the boot-loader are loaded from Track 0 of the boot-disk by ‘trackldr.b’; ‘trackldr.b’ itself is read from Track 0 of the boot-disk by the ROM-BIOS bootstrap)
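The sector number on the boot-disk and the load-address in the memory-map are linked by simple arithmetic. The sketch below works it through under three assumptions: sectors are 512 bytes, `dd` counts `seek` in its default 512-byte blocks, and the loader places sector 2 at 0x00010000. The last assumption is inferred from the memory-map addresses and is not stated in the slides.

```python
SECTOR_SIZE = 512
IMAGE_BASE = 0x00010000      # start of the 'try32bit.b' image in the memory-map
FIRST_LOADED_SECTOR = 2      # assumption: sector 1 holds the boot-loader itself

def dd_seek_to_sector(seek_blocks: int) -> int:
    """Sectors are numbered from 1, so skipping N blocks lands on sector N + 1."""
    return seek_blocks + 1

def load_address(sector: int) -> int:
    """Physical address at which the given track-0 sector ends up."""
    return IMAGE_BASE + (sector - FIRST_LOADED_SECTOR) * SECTOR_SIZE
```

With `seek=13` this gives sector 14 and a load-address of 0x00011800, matching the memory-map.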

  16. Segment Descriptors
      • We created 32-bit segment-descriptors for the ‘text’ and ‘data’ sections of ‘hello.o’ (in a Local Descriptor Table, with DPL=3)
      • For the ‘.text’ section: offset in ELF file = 0x34, size = 0x23
      • So its segment-descriptor is:
            .WORD 0x0023, 0x1834, 0xFA01, 0x0040
        (base-address = load-address + file-offset)

  17. Descriptors (continued)
      • For the ‘.data’ section: offset in ELF file = 0x58, size = 0x0D
      • So its segment-descriptor is:
            .WORD 0x000D, 0x1858, 0xFA01, 0x0040
        (base-address = load-address + file-offset)
      • For the ring3 stack (not part of ELF file):
            .WORD 0x0FFF, 0x2100, 0xF201, 0x0040
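The four words of each descriptor follow from the base-address, limit, access byte, and flags. The sketch below packs them in the standard x86 layout and reproduces the values on the slides, taking the load-address 0x00011800 from the memory-map. The access bytes (0xFA, 0xF2) and the limits are copied from the slides as given; the function name `pack_descriptor` is hypothetical.

```python
LOAD_ADDRESS = 0x00011800  # where 'hello.o' resides, per the memory-map

def pack_descriptor(base: int, limit: int, access: int, flags: int = 0x4):
    """Return the four 16-bit words of a segment-descriptor.

    'flags' is the high nibble of byte 6; 0x4 sets only the default-size
    bit, marking a 32-bit segment with byte granularity.
    """
    word0 = limit & 0xFFFF                                  # limit 15..0
    word1 = base & 0xFFFF                                   # base 15..0
    word2 = ((access & 0xFF) << 8) | ((base >> 16) & 0xFF)  # access, base 23..16
    word3 = ((((base >> 24) & 0xFF) << 8)                   # base 31..24
             | ((flags & 0xF) << 4)                         # G, D, L, AVL
             | ((limit >> 16) & 0xF))                       # limit 19..16
    return (word0, word1, word2, word3)

def section_descriptor(file_offset: int, size: int, access: int):
    """base-address = load-address + file-offset, as on the slides."""
    return pack_descriptor(LOAD_ADDRESS + file_offset, size, access)
```

For example, `section_descriptor(0x34, 0x23, 0xFA)` yields the ‘.text’ words 0x0023, 0x1834, 0xFA01, 0x0040.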

  18. Task-State Segment
      • Because the system-calls (via int 0x80) will cause privilege-level transitions, we will need to set up a Task-State Segment (to store the ring0 stacktop pointer)
            theTSS: .LONG 0, 0, 0 ; 3 longwords
      • Its segment-descriptor goes into our GDT:
            .WORD 0x000B, theTSS, 0x8901, 0x0000

  19. Transition to Ring 3
      • Recall that we use ‘retf’ to enter ring 3:
            push word #userSS
            push word #0x1000
            push word #userCS
            push word #0x0000
            retf

  20. System-Call Dispatcher
      • All system-calls are ‘vectored’ through IDT interrupt-gate 0x80
      • For ‘hello.o’ we only require implementing two system-calls: ‘exit’ and ‘write’
      • But to simplify future enhancements, we used a ‘jump-table’ anyway (for now it has a few ‘dummy’ entries that we can modify later)

  21. System-Call ID-numbers
      • System-call ID #0 (it will never be needed)
      • System-call ID #1 is for ‘exit’ (required)
      • System-call ID #2 is for ‘fork’ (deferred)
      • System-call ID #3 is for ‘read’ (deferred)
      • System-call ID #4 is for ‘write’ (required)
      • System-call ID #5 is for ‘open’ (deferred)
      • System-call ID #6 is for ‘close’ (deferred)
      (NOTE: over 200 system-calls exist in Linux)

  22. Defining our jump-table
      sys_call_table:
            .LONG do_nothing ; for service 0
            .LONG do_exit    ; for service 1
            .LONG do_nothing ; for service 2
            .LONG do_nothing ; for service 3
            .LONG do_write   ; for service 4
      NR_SYS_CALLS EQU (* - sys_call_table)/4

  23. Setting up Interrupt-Gate 0x80
      • The Descriptor Privilege Level must be 3
            mov edi, #0x80         ; gate ID-number
            lea di, theIDT[edi*8]  ; descriptor addr
            mov 0[di], #isrSVC     ; entry-pt loword
            mov 2[di], #sel_CS     ; USE32 code
            mov 4[di], #0xEE00     ; DPL=3 intr-gate
            mov 6[di], #0x0000     ; entry-pt hiword
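The word 0xEE00 stored at offset 4 combines the present bit, DPL=3, and the 32-bit interrupt-gate type. The sketch below assembles the four words of a gate descriptor in the standard x86 layout so that each piece is visible; the function names and the example selector value are hypothetical.

```python
def pack_interrupt_gate(offset: int, selector: int, dpl: int = 3, gate_type: int = 0xE):
    """Return the four 16-bit words of an IDT gate descriptor.

    gate_type 0xE is a 32-bit interrupt-gate; bit 7 of the access byte
    is the 'present' bit and bits 6..5 hold the DPL.
    """
    access = 0x80 | ((dpl & 0x3) << 5) | (gate_type & 0xF)
    return (
        offset & 0xFFFF,          # entry-point loword
        selector & 0xFFFF,        # code-segment selector
        access << 8,              # access byte in the high half, low byte zero
        (offset >> 16) & 0xFFFF,  # entry-point hiword
    )

def gate_table_offset(gate_id: int) -> int:
    """Each gate occupies 8 bytes, matching the 'edi*8' scaling in the slide."""
    return gate_id * 8
```

With DPL=3 the access word comes out as 0xEE00; a DPL of 0 would give 0x8E00, and a ring3 ‘int 0x80’ would then fault instead of reaching the handler.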

  24. Using our jump-table
      isrSVC: ; service-number is found in EAX
            cmp eax, #NR_SYS_CALLS
            jb  idok
            xor eax, eax
      idok: CSEG
            jmp dword sys_call_table[eax*4]
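The dispatch logic can be modelled in a few lines: an out-of-range service-number is replaced by 0, so it falls through to the ‘do nothing’ entry. This Python sketch mirrors the five-entry table from the slides; the handler bodies and their return strings are placeholders, not the course's implementation.

```python
def do_nothing():
    return "ignored"   # placeholder result for unimplemented services

def do_exit():
    return "exit"      # placeholder: the real handler leaves protected-mode

def do_write():
    return "write"     # placeholder: the real handler writes to the display

SYS_CALL_TABLE = [do_nothing, do_exit, do_nothing, do_nothing, do_write]
NR_SYS_CALLS = len(SYS_CALL_TABLE)

def dispatch(eax: int):
    """Model of isrSVC: unsigned range check, then an indexed jump."""
    eax &= 0xFFFFFFFF            # 'jb' compares unsigned, so -1 is a huge value
    if eax >= NR_SYS_CALLS:
        eax = 0                  # same effect as 'xor eax, eax'
    return SYS_CALL_TABLE[eax]()
```

Because the comparison is unsigned, a negative service-number cannot index below the start of the table.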

  25. Our ‘exit’ service
      • When the application invokes the ‘exit’ system-call, our mini ‘operating system’ leaves protected-mode and returns back to our ‘trackldr’ boot-loader program
      • (The ‘exit-code’ is simply discarded, since this isn’t a multitasking operating-system)

  26. Our ‘write’ service
      • We only implement writing to the STDOUT device (i.e., the video display terminal)
      • For most characters in the user’s buffer, we just write the ascii-code (and standard display-attribute) directly to video memory at the current cursor-location and advance the cursor (scrolling the screen if needed)
      • Special ascii control-codes (‘\n’, ‘\r’, ‘\b’) are treated differently, as on a TTY device
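The behaviour described above can be modelled with a small screen buffer. The sketch below assumes an 80x25 text display, that ‘\n’ moves to the start of the next line, that ‘\r’ returns to column 0, and that ‘\b’ moves back one column without erasing; these details are assumptions, and the display-attribute byte is omitted. The class name `Screen` is hypothetical.

```python
COLS, ROWS = 80, 25  # assumed text-mode dimensions

class Screen:
    def __init__(self):
        self.cells = [[" "] * COLS for _ in range(ROWS)]
        self.row = 0
        self.col = 0

    def _next_line(self):
        """Move down one row, scrolling the screen up when at the bottom."""
        self.row += 1
        if self.row == ROWS:
            self.cells.pop(0)
            self.cells.append([" "] * COLS)
            self.row = ROWS - 1

    def write(self, text: str) -> int:
        """Model of the 'write' service; returns the number of characters handled."""
        for ch in text:
            if ch == "\n":
                self.col = 0
                self._next_line()
            elif ch == "\r":
                self.col = 0
            elif ch == "\b":
                self.col = max(0, self.col - 1)
            else:
                self.cells[self.row][self.col] = ch
                self.col += 1
                if self.col == COLS:      # wrap at the right margin
                    self.col = 0
                    self._next_line()
        return len(text)
```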

  27. In-Class Exercise
      • The ‘manydots.s’ demo (to be used with Project #2) uses the ‘read’ system-call (in addition to ‘write’ and ‘exit’)
      • However, you could still ‘execute’ it using the ‘try32bit.s’ mini operating-system, letting the ‘read’ service simply “do nothing” (or return with “hard-coded” buffer-contents)
      • Just modify the LDT descriptors so they conform to the sections in ‘manydots.o’
