Practical Session 8


Position Independent Code - self-sufficiency when combined with other programs
• A Position Independent Code (PIC) program has everything it needs internally
• PIC can be placed anywhere in memory and executes properly regardless of its absolute address
• PIC can be added to any other program without fear that it might not work

Position Independent Code - requirements
• No direct usage of labels
• Only relative jumps (“call”)
• One section only
• No library functions, only system calls

No direct usage of labels
Labels are resolved to absolute addresses by the assembler and linker when the program is built. If the code is later moved (into an appropriate section), the absolute address calculated before the move is no longer correct.

No direct usage of labels
The absolute addresses of STR1 and STR2 are resolved from their relative addresses within the .rodata section. These absolute addresses may change if all the code is moved.

Only relative jumps
We can use only relative jumps, because the code we wish to jump to may change its position. If all the code changes its position together, relative jumps remain valid, because the address difference between the jump and its target is preserved. We can use the “call” instruction because it makes a relative jump to the function: the new IP (after “call”) is the address of the instruction following the “call” plus a number (possibly negative), which is the address difference between the two places as calculated by the assembler.
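The claim that the address difference is preserved can be checked with a little arithmetic. The sketch below is a Python model rather than assembly; the two offsets and the three load addresses are illustrative values chosen for this example, not taken from a real program.

```python
# Offsets of two places inside one block of code (illustrative values).
CALL_NEXT_OFFSET = 0x0A   # offset of the instruction that follows the call
TARGET_OFFSET = 0x1F      # offset of the function being called

def absolute(load_address, offset):
    """Absolute address of an offset once the block is loaded at load_address."""
    return load_address + offset

def displacement(load_address):
    """Number a relative call needs: target minus address of the next instruction."""
    return (absolute(load_address, TARGET_OFFSET)
            - absolute(load_address, CALL_NEXT_OFFSET))

# Moving the whole block changes every absolute address,
# but the displacement stays the same.
for base in (0x08048000, 0x20000000, 0x00001000):
    print(hex(base), hex(absolute(base, TARGET_OFFSET)), hex(displacement(base)))
```

The middle column changes with every load address, while the last column does not. This is why an instruction that stores only the displacement keeps working after the code is moved.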

Only relative jumps - example
• The first column (from the left) is simply the line number in the listing and is otherwise meaningless
• The second column is the relative address, in hex, of where the code will be placed in memory
• The third column is the actual assembled machine code
• For the normal type of call in 32-bit mode (relative near call), the binary code for ‘CALL myFunc’ is the opcode E8 followed by a 4-byte value that specifies the target address, relative to the next instruction after the call
• Address of the myFunc label = 0x1F
• Address of the next instruction after the call (i.e. ‘mov [answer], eax’) = 0xA
• 0x1F - 0xA = 0x15, stored as 4 little-endian bytes, which gives exactly the binary code in the listing: ‘E815000000’
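The encoding above can be reproduced in a few lines. This is a Python sketch of the arithmetic only; the function name encode_near_call is made up for illustration, and the call address 0x05 is inferred from the listing (the next instruction is at 0xA and the call occupies 5 bytes).

```python
import struct

def encode_near_call(call_address, target_address):
    """Encode a 32-bit relative near call: opcode E8 followed by a
    signed 32-bit little-endian displacement."""
    next_instruction = call_address + 5      # E8 plus a 4-byte displacement
    rel32 = target_address - next_instruction
    return b"\xE8" + struct.pack("<i", rel32)

# Call at 0x05, next instruction at 0x0A, myFunc at 0x1F.
print(encode_near_call(0x05, 0x1F).hex().upper())   # E815000000
```

A backward call gives a negative displacement, stored in two's complement, which is why the number added to IP "may be negative".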

One section only
We put all the code in a single section: .text (read-only) or .data (read-write). Both .text and .data sections may contain any valid assembly instruction. Using a single section makes it possible to calculate the difference between any pair of instructions at assembly time, and thus to execute relative jumps.

No library functions, only system calls
We don’t know if and where the library functions are. Thus there is no “printf”, “gets” and so on. To perform I/O operations we have to use the Linux system calls. This works because INT 0x80 isn’t a regular procedure: it is called via the interrupt table, which is static.

Finding a code address at run-time
• Since the ‘call’ instruction executes a relative jump, we may ‘call’ functions that are defined in PIC
• The ‘call’ instruction pushes the return address at run-time
• Thus we may calculate the run-time address of any label relative to the ‘call’ return address

Finding a code address at run-time
The get_my_loc function gets the address of the ‘next_i’ label at run-time:
get_my_loc:
        call next_i
next_i:
        pop edx
        ret
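The stack traffic behind get_my_loc can be traced step by step. The following is a minimal Python model, not real machine code: the function run_get_my_loc, the offsets 0x0C and 0x13, and the load address are assumptions picked for illustration.

```python
def run_get_my_loc(load_address, next_i_offset=0x0C, caller_return_offset=0x13):
    """Model the pushes and pops of 'call get_my_loc' for code
    loaded at load_address (offsets are illustrative)."""
    stack = []
    # The caller executes 'call get_my_loc': push the caller's return address.
    stack.append(load_address + caller_return_offset)
    # get_my_loc executes 'call next_i': push the address of the following
    # instruction, which is exactly the run-time address of next_i.
    stack.append(load_address + next_i_offset)
    # next_i: 'pop edx' takes that address off the stack.
    edx = stack.pop()
    # 'ret' pops the caller's return address into EIP.
    eip = stack.pop()
    return edx, eip, stack

edx, eip, stack = run_get_my_loc(0x08048000)
print(hex(edx), hex(eip), stack)
```

After the call, edx holds the run-time address of next_i, control is back in the caller, and the stack is balanced, because the inner ‘call’ and the ‘pop’ cancel each other out.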

Using strings – PIC example
section .text
name:    db "Moshe",10,0
nameLen  equ $ - name
global _start
get_my_loc:
        call next_i
next_i:
        pop edx                  ; edx gets 'next_i' address
        ret                      ; EIP gets address of 'sub edx, next_i-name'
_start:
        call get_my_loc
        sub edx, next_i - name   ; edx = 'next_i' address - ('next_i' address - 'name' address)
        mov ecx, edx             ; ecx = run-time address of 'name' (buffer to write)
        mov edx, nameLen         ; edx = number of bytes to write
        mov eax, 4               ; system call number 4 = write
        mov ebx, 1               ; file descriptor 1 = stdout
        int 80h
        mov eax, 1               ; system call number 1 = exit (status is taken from ebx)
        int 80h

[Figure: stack diagram with ESP]

The address difference between “next_i” and “name” is constant even if the code changes its position.

Why may we use the ‘nameLen’ label directly?

Using strings – PIC example
>nasm -f elf sample.s -l sample.lst
In the listing: 0x0C = ‘next_i’ - ‘name’
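The value 0x0C and the subtraction in the example can be checked by hand. The Python sketch below models only the arithmetic; the constant names and the load addresses are illustrative, and it assumes the section is loaded as one block with ‘name’ at offset 0, as in the listing.

```python
NAME_OFFSET = 0x00      # 'name' is the first item in the section
NEXT_I_OFFSET = 0x0C    # 7 bytes of string data + 5 bytes of 'call next_i'
NAME_LEN = 7            # "Moshe" (5) + newline (1) + terminating 0 (1)

def name_address_at_runtime(load_address):
    """Value of edx after 'call get_my_loc' and 'sub edx, next_i - name'."""
    edx = load_address + NEXT_I_OFFSET       # left in edx by get_my_loc
    edx -= NEXT_I_OFFSET - NAME_OFFSET       # sub edx, next_i - name
    return edx

# Wherever the code is loaded, edx ends up pointing at 'name'.
for base in (0x08048000, 0x20000000, 0x00001000):
    assert name_address_at_runtime(base) == base + NAME_OFFSET
    print(hex(base), hex(name_address_at_runtime(base)), NAME_LEN)
```

In this model neither `next_i - name` nor `nameLen` depends on the load address: both are differences between two positions in the same section, so they are plain constants rather than addresses.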
