
Assemblers and Linkers


farrah

Presentation Transcript


  1. Assemblers and Linkers

  2. Goals for this Lecture • Help you to learn about: • The assembly process • IA-32 machine language • Why? • Machine language is the last stop on the “language levels” tour • A power programmer knows about the relationship between assembly language and machine language • A systems programmer knows how an assembler translates assembly language to machine language

  3. The Build/Execute Process • myprog.c → Compiler → myprog.s → Assembler → myprog.o (+ libraries) → Linker → myprog → Execution • Compiler: covered in COS 320: Compiling Techniques • Assembler and linker: covered here

  4. The Forward Reference Problem • Problem • Assembler must generate machine language code for “jmp mylabel” • But assembler hasn’t seen definition of mylabel • I.e., the jmp instruction contains a forward reference to mylabel • Example: … jmp mylabel … mylabel: … • Any assembler must deal with the forward reference problem

  5. The Forward Reference Solution • Solution • Assembler performs 2 passes over assembly language program • Different assemblers perform different tasks in each pass • One straightforward design…

  6. Assembler Passes • Pass 1 • Assembler traverses assembly program to create… • Symbol table • Key: label • Value: information about label • Label name, which section, what offset within that section, … • Pass 2 • Assembler traverses assembly program again to create… • RODATA section • DATA section • BSS section • TEXT section • Relocation record section • Each relocation record indicates an area that the linker must patch
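The two-pass design can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular assembler is implemented: the tuple format, the hand-supplied instruction lengths, and the names `pass1` and `pass2` are assumptions made for this example.

```python
# Toy two-pass resolution of a forward reference.
# Each item is ("label", name) or ("instr", text, length_in_bytes).
# Lengths are supplied by hand; a real assembler derives them itself.
program = [
    ("instr", "jmp mylabel", 2),
    ("instr", "nop", 1),
    ("label", "mylabel"),
    ("instr", "ret", 1),
]

def pass1(items):
    """Build the symbol table: label -> offset within the section."""
    symbols, counter = {}, 0
    for item in items:
        if item[0] == "label":
            symbols[item[1]] = counter
        else:
            counter += item[2]
    return symbols

def pass2(items, symbols):
    """Resolve each jump now that every label offset is known."""
    resolved, counter = [], 0
    for item in items:
        if item[0] == "label":
            continue
        _, text, length = item
        counter += length  # counter now marks the next instruction
        if text.startswith("jmp "):
            target = symbols[text.split()[1]]
            resolved.append((text, target - counter))  # relative displacement
        else:
            resolved.append((text, None))
    return resolved

symbols = pass1(program)
result = pass2(program, symbols)
print(symbols)
print(result)
```

Pass 1 never needs to know what `mylabel` refers to; it only counts bytes. By the time Pass 2 reaches the `jmp`, the label is already in the table.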

  7. An Example Program • A simple (nonsensical) program:
         Assembly language:
             .section ".rodata"
         msg:
             .asciz "Hi\n"
             .section ".text"
             .globl main
         main:
             pushl %ebp
             movl %esp, %ebp
             call getchar
             cmpl $'A', %eax
             jne skip
             pushl $msg
             call printf
             addl $4, %esp
         skip:
             movl $0, %eax
             movl %ebp, %esp
             popl %ebp
             ret
         C source:
         #include <stdio.h>
         int main(void) {
             if (getchar() == 'A')
                 printf("Hi\n");
             return 0;
         }
     • Let’s consider how the assembler handles that program…

  8. Assembler Data Structures (1) • Symbol Table • Relocation Records • RODATA Section (location counter: 0) • TEXT Section (location counter: 0) • No DATA or BSS section in this program • Initially all sections are empty

  9. Assembler Pass 1 (program as on slide 7) • At .section ".rodata": assembler notes that the current section is RODATA • At msg: assembler adds binding to Symbol Table…

  10. Assembler Data Structures (2) • Symbol Table • Relocation Records • (Same) • RODATA Section (location counter: 0) • (Same) • TEXT Section (location counter: 0) • (Same) • msg marks a spot in the RODATA section at offset 0 • msg is a local label • Assign msg sequence number 0

  11. Assembler Pass 1 (cont.) (program as on slide 7) • At .asciz "Hi\n": assembler increments RODATA section location counter by the byte count of the string (4, including the terminating null byte)…

  12. Assembler Data Structures (3) • Symbol Table • Relocation Records • (Same) • RODATA Section (location counter: 4) • (Same) • TEXT Section (location counter: 0) • (Same) • RODATA location counter now is 4 • If another label were defined at this point, it would mark a spot in RODATA at offset 4

  13. Assembler Pass 1 (cont.) (program as on slide 7) • At .section ".text": assembler notes that current section is TEXT • At .globl main: assembler does nothing • At main: assembler adds binding to Symbol Table…

  14. Assembler Data Structures (4) • Symbol Table • Relocation Records • (Same) • RODATA Section (location counter: 4) • (Same) • TEXT Section (location counter: 0) • (Same) • main marks a spot in the TEXT section at offset 0 • main is a local label (assembler will discover otherwise in Pass 2) • Assign main sequence number 1

  15. Assembler Pass 1 (cont.) (program as on slide 7) • From pushl %ebp through addl $4, %esp: assembler increments TEXT section location counter by the length of each instruction…

  16. Assembler Data Structures (5) • Symbol Table • Relocation Records • (Same) • RODATA Section (location counter: 4) • (Same) • TEXT Section (location counter: 26) • (Same) • TEXT location counter now is 26 • If another label were defined at this point, it would mark a spot in TEXT at offset 26

  17. Assembler Pass 1 (cont.) (program as on slide 7) • At skip: assembler adds binding to Symbol Table…

  18. Assembler Data Structures (6) • Symbol Table • Relocation Records • (Same) • RODATA Section (location counter: 4) • (Same) • TEXT Section (location counter: 26) • (Same) • skip marks a spot in the TEXT section at offset 26 • skip is a local label • Assign skip sequence number 2

  19. Assembler Pass 1 (cont.) (program as on slide 7) • From movl $0, %eax through ret: assembler increments TEXT section location counter by the length of each instruction…

  20. Assembler Data Structures (7) • Symbol Table • Relocation Records • (Same) • RODATA Section (location counter: 4) • (Same) • TEXT Section (location counter: 35) • (Same) • TEXT location counter now is 35 • If another label were defined at this point, it would mark a spot in TEXT at offset 35

  21. From Assembler Pass 1 to Pass 2 • End of Pass 1 • Assembler has (partially) created Symbol Table • So assembler now knows which location each label marks • Beginning of Pass 2 • Assembler resets all section location counters…
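Pass 1 over the example program can be replayed with a short Python sketch. The line encoding (`"section"`, `"label"`, `"data"`, `"instr"`, `"globl"` tuples) and the function name `pass1` are assumptions for illustration, and the instruction lengths are hard-coded rather than derived from encoding rules as a real assembler would.

```python
# Pass 1: one location counter per section; each label is recorded with
# its section, offset, and sequence number.
source = [
    ("section", "RODATA"),
    ("label", "msg"),
    ("data", b"Hi\n\0"),             # .asciz adds a terminating NUL byte
    ("section", "TEXT"),
    ("globl", "main"),               # ignored in Pass 1 in this design
    ("label", "main"),
    ("instr", "pushl %ebp", 1),
    ("instr", "movl %esp, %ebp", 2),
    ("instr", "call getchar", 5),
    ("instr", "cmpl $'A', %eax", 3),
    ("instr", "jne skip", 2),
    ("instr", "pushl $msg", 5),
    ("instr", "call printf", 5),
    ("instr", "addl $4, %esp", 3),
    ("label", "skip"),
    ("instr", "movl $0, %eax", 5),
    ("instr", "movl %ebp, %esp", 2),
    ("instr", "popl %ebp", 1),
    ("instr", "ret", 1),
]

def pass1(lines):
    counters = {}   # section name -> location counter
    symbols = {}    # label -> (section, offset, sequence number)
    current = None
    for line in lines:
        kind = line[0]
        if kind == "section":
            current = line[1]
            counters.setdefault(current, 0)
        elif kind == "label":
            symbols[line[1]] = (current, counters[current], len(symbols))
        elif kind == "data":
            counters[current] += len(line[1])
        elif kind == "instr":
            counters[current] += line[2]
        # "globl" directives are skipped in this pass
    return symbols, counters

symbols, counters = pass1(source)
print(symbols)
print(counters)
```

The resulting table and counters match the walkthrough: msg at RODATA offset 0, main at TEXT offset 0, skip at TEXT offset 26, and final counters of 4 and 35.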

  22. Assembler Data Structures (8) • Symbol Table • Relocation Records • (Same) • RODATA Section (location counter: 0) • (Same) • TEXT Section (location counter: 0) • (Same) • Location counters reset to 0

  23. Assembler Pass 2 (program as on slide 7) • At .section ".rodata": assembler notes that the current section is RODATA • At msg: assembler does nothing • At .asciz "Hi\n": assembler places bytes in RODATA section, and increments location counter…

  24. Assembler Data Structures (9) • Symbol Table • (Same) • Relocation Records • (Same) • RODATA Section (location counter: 4) • TEXT Section (location counter: 0) • (Same) • Location counter incremented to 4 • RODATA section contains the bytes comprising the string
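As a small check on the byte count, the bytes that `.asciz "Hi\n"` contributes can be computed directly. The variable names below are illustrative only.

```python
# .asciz stores the characters followed by a terminating NUL byte.
rodata = "Hi\n".encode("ascii") + b"\x00"
location_counter = len(rodata)

# Print each byte in hex alongside the resulting location counter.
print(" ".join(f"{b:02x}" for b in rodata), location_counter)
```

The four bytes are 'H', 'i', newline, and NUL, which is why the RODATA location counter ends at 4.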

  25. Assembler Pass 2 (cont.) (program as on slide 7) • At .section ".text": assembler notes that the current section is TEXT • At .globl main: assembler updates Symbol Table…

  26. Assembler Data Structures (10) • Symbol Table • Relocation Records • (Same) • RODATA Section (location counter: 4) • (Same) • TEXT Section (location counter: 0) • (Same) • main is a global label

  27. Assembler Pass 2 (cont.) (program as on slide 7) • At main: assembler does nothing • At pushl %ebp: assembler generates machine language code in current (TEXT) section… • Tangent to describe IA-32 machine language…

  28. IA-32 Machine Language • Difficult to generalize about IA-32 instruction format • Many (most!) instructions are exceptions to the rules • Generally, instructions use this format…

  29. IA-32 Instruction Format • Fields, in order: Instruction prefixes (up to 4 prefixes of 1 byte each, optional) | Opcode (1, 2, or 3 bytes) | ModR/M (1 byte, if required) | SIB (1 byte, if required) | Displacement (0, 1, 2, or 4 bytes) | Immediate (0, 1, 2, or 4 bytes) • ModR/M bit fields: Mod (bits 7-6), Reg/Opcode (bits 5-3), R/M (bits 2-0) • SIB bit fields: Scale (bits 7-6), Index (bits 5-3), Base (bits 2-0) • Instruction prefix • Sometimes a repeat count • Rarely used; don’t be concerned

  30. IA-32 Instruction Format (cont.) (field layout as on slide 29) • Opcode • Specifies which operation should be performed • Add, move, call, etc.

  31. IA-32 Instruction Format (cont.) (field layout as on slide 29) • ModR/M • Specifies types of operands (immediate, register, memory) • Specifies sizes of operands (byte, word, long) • Sometimes denotes a register: 000 = EAX/AL; 001 = ECX/CL; 010 = EDX/DL; 011 = EBX/BL; 100 = ESP/AH; 101 = EBP/CH; 110 = ESI/DH; 111 = EDI/BH • Sometimes contains an extension of the opcode
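To make the ModR/M bit fields concrete, the sketch below splits one ModR/M byte into its Mod, Reg/Opcode, and R/M fields. The helper name `split_modrm` and the `REG32` list are made up for this example. The sample byte 0xE5 is the ModR/M byte of `movl %esp, %ebp`, which assembles to the two bytes 89 E5.

```python
# 32-bit register names indexed by their 3-bit encoding.
REG32 = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]

def split_modrm(byte):
    """Return the (mod, reg, rm) fields of a ModR/M byte."""
    mod = (byte >> 6) & 0b11    # bits 7-6
    reg = (byte >> 3) & 0b111   # bits 5-3
    rm = byte & 0b111           # bits 2-0
    return mod, reg, rm

mod, reg, rm = split_modrm(0xE5)
print(mod, REG32[reg], REG32[rm])
```

Here Mod = 11 means both operands are registers, the Reg field (100) names ESP, and the R/M field (101) names EBP.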

  32. IA-32 Instruction Format (cont.) (field layout as on slide 29) • SIB • Used when one of the operands is a memory operand that uses a scale, an index register, and/or a base register

  33. IA-32 Instruction Format (cont.) (field layout as on slide 29) • Displacement • Used in jump and call instructions • Indicates the displacement between the destination instruction and the jump/call instruction • More precisely, indicates: [addr of destination instr] – [addr of instr following the jump/call] • Uses little-endian byte order
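The displacement rule can be written as a short helper. The function name `rel_displacement` and the sample addresses are hypothetical; the sketch only shows the subtraction and the little-endian encoding using Python's standard `struct` module.

```python
import struct

def rel_displacement(dest_addr, next_instr_addr, size=4):
    """Encode dest - next as a signed little-endian field of `size` bytes."""
    disp = dest_addr - next_instr_addr
    fmt = {1: "<b", 2: "<h", 4: "<i"}[size]
    return struct.pack(fmt, disp)

# Forward jump: destination lies 16 bytes past the next instruction.
forward = rel_displacement(0x20, 0x10, size=1)
# Backward call: destination lies 8 bytes before the next instruction.
backward = rel_displacement(0x00, 0x08, size=4)
print(forward.hex(), backward.hex())
```

A backward reference produces a negative displacement, stored in two's complement with the least significant byte first.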

  34. IA-32 Instruction Format (cont.) (field layout as on slide 29) • Immediate • Specifies an immediate operand • Uses little-endian byte order
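Putting the fields together, the following sketch concatenates them in the order given by the instruction format and encodes two instructions from the example program. The helpers `encode` and `modrm` are hypothetical names for this illustration, and the opcodes are supplied by hand (0x83 with opcode extension 000 for add with an 8-bit immediate, 0xB8 for a move of a 32-bit immediate into EAX).

```python
def modrm(mod, reg, rm):
    """Pack the three ModR/M bit fields into one byte."""
    return (mod << 6) | (reg << 3) | rm

def encode(opcode, modrm_byte=None, sib=None, disp=b"", imm=b"", prefixes=b""):
    """Concatenate instruction fields in their architectural order."""
    out = bytearray(prefixes)
    out += opcode
    if modrm_byte is not None:
        out.append(modrm_byte)
    if sib is not None:
        out.append(sib)
    out += disp
    out += imm
    return bytes(out)

# addl $4, %esp: Mod=11 (register), Reg=000 (opcode extension), R/M=100 (ESP)
addl = encode(b"\x83", modrm_byte=modrm(0b11, 0b000, 0b100),
              imm=(4).to_bytes(1, "little"))
# movl $0, %eax: the register is folded into the opcode, so no ModR/M byte
movl = encode(b"\xb8", imm=(0).to_bytes(4, "little"))
print(addl.hex(), movl.hex())
```

The resulting lengths, 3 and 5 bytes, agree with the amounts by which the TEXT location counter advances for these two instructions.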

  35. Assembler Data Structures (11) • Symbol Table • (Same) • Relocation Records • (Same) • RODATA Section (location counter: 4) • (Same) • TEXT Section (location counter: 1)

  36. Assembler Pass 2 (cont.) (program as on slide 7) • At movl %esp, %ebp: assembler generates machine language code in current (TEXT) section…

  37. Assembler Data Structures (12) • Symbol Table • (Same) • Relocation Records • (Same) • RODATA Section (location counter: 4) • (Same) • TEXT Section (location counter: 3)

  38. Assembler Pass 2 (cont.) (program as on slide 7) • At call getchar: assembler generates machine language code in current (TEXT) section…

  39. Assembler Data Structures (12) • Symbol Table • (Same) • Relocation Records • (Same) • RODATA Section (location counter: 4) • (Same) • TEXT Section (location counter: 8) • Assembler looks in Symbol Table to find offset of getchar • getchar is not in Symbol Table • Assembler cannot compute displacement that belongs at offset 4 • So…

  40. Assembler Data Structures (13) • Symbol Table • Relocation Records • (Same) • RODATA Section (location counter: 4) • (Same) • TEXT Section (location counter: 8) • (Same) • Assembler adds getchar to Symbol Table • Then…

  41. Assembler Data Structures (14) • Symbol Table • (Same) • Relocation Records • RODATA Section (location counter: 4) • (Same) • TEXT Section (location counter: 8) • (Same) • Assembler generates a relocation record, thus asking linker to patch code • Dear Linker, Please patch the TEXT section at offset 4. Do a “displacement” type of patch. The patch is with respect to the label whose seq number is 3 (i.e. getchar). Sincerely, Assembler
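The "Dear Linker" note maps naturally onto a small record type. The sketch below is one possible shape, not the layout of any real object file format: the `Relocation` class, its field names, and `emit_call` are assumptions, and the sketch assumes the call target is not defined in this file. It also writes zero bytes as the placeholder, whereas real object formats may store other values there.

```python
from dataclasses import dataclass

@dataclass
class Relocation:
    section: str     # section the linker must patch
    offset: int      # offset of the field to patch within that section
    kind: str        # "displacement" or "absolute"
    symbol_seq: int  # sequence number of the label the patch refers to

def emit_call(text, symbols, relocations, target):
    """Append `call target`, leaving the 4-byte displacement to the linker."""
    seq = symbols.setdefault(target, len(symbols))  # add label if unseen
    text.append(0xE8)                               # call, 32-bit displacement
    relocations.append(Relocation("TEXT", len(text), "displacement", seq))
    text.extend(b"\x00\x00\x00\x00")                # placeholder bytes

text = bytearray(b"\x55\x89\xe5")          # pushl %ebp; movl %esp, %ebp
symbols = {"msg": 0, "main": 1, "skip": 2}  # label -> sequence number
relocations = []
emit_call(text, symbols, relocations, "getchar")
print(relocations[0], len(text))
```

After the call, getchar has sequence number 3, the record points at TEXT offset 4, and the TEXT location counter (the length of `text`) is 8.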

  42. Assembler Pass 2 (cont.) (program as on slide 7) • At cmpl $'A', %eax: assembler generates machine language code in current (TEXT) section…

  43. Assembler Data Structures (15) • Symbol Table • (Same) • Relocation Records • (Same) • RODATA Section (location counter: 4) • (Same) • TEXT Section (location counter: 11)

  44. Assembler Pass 2 (cont.) (program as on slide 7) • At jne skip: assembler generates machine language code in current (TEXT) section…

  45. Assembler Data Structures (16) • Symbol Table • (Same) • Relocation Records • (Same) • RODATA Section (location counter: 4) • (Same) • TEXT Section (location counter: 13) • Assembler looks in Symbol Table to find offset of skip (26) • Assembler subtracts offset of next instruction (13) • Resulting displacement is 13
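Because skip is defined in the same section as the jump, the assembler can finish this instruction without help from the linker. A minimal sketch of that case follows; the function name `encode_jne` and the symbol-table shape are assumptions. 0x75 is the opcode for jne with an 8-bit displacement.

```python
def encode_jne(symbols, current_section, target, next_offset):
    """Encode `jne target` when the target is in the current section."""
    section, offset = symbols[target]
    if section != current_section:
        raise ValueError("target is elsewhere; a relocation record is needed")
    disp = offset - next_offset
    if not -128 <= disp <= 127:
        raise ValueError("displacement does not fit in one byte")
    return bytes([0x75, disp & 0xFF])

symbols = {"skip": ("TEXT", 26)}
# jne occupies TEXT offsets 11-12, so the next instruction is at offset 13.
code = encode_jne(symbols, "TEXT", "skip", 13)
print(code.hex())
```

The displacement 26 - 13 = 13 is stored as the single byte 0x0D.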

  46. Assembler Pass 2 (cont.) (program as on slide 7) • At pushl $msg: assembler generates machine language code in current (TEXT) section…

  47. Assembler Data Structures (16) • Symbol Table • (Same) • Relocation Records • (Same) • RODATA Section (location counter: 4) • (Same) • TEXT Section (location counter: 18) • Assembler knows offset of msg (0) within RODATA section • But assembler does not know the location of the RODATA section • So assembler does not know location of msg • So…

  48. Assembler Data Structures (17) • Symbol Table • (Same) • Relocation Records • RODATA Section (location counter: 4) • (Same) • TEXT Section (location counter: 18) • (Same) • Assembler generates a relocation record, thus asking linker to patch code • Dear Linker, Please patch the TEXT section at offset 14. Do an “absolute” type of patch. The patch is with respect to the label whose seq number is 0 (i.e. msg). Sincerely, Assembler
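To show how the two patch types differ, here is a rough sketch of what a linker might do with such records once it has chosen addresses. Everything concrete in it is hypothetical: the function `apply_patch`, the base address of the TEXT section, and the addresses given to msg and getchar are invented for the example.

```python
import struct

def apply_patch(code, offset, kind, symbol_addr, section_base):
    """Overwrite the 4-byte field at `offset` in a section loaded at `section_base`."""
    if kind == "absolute":
        value = symbol_addr
    elif kind == "displacement":
        # Relative to the instruction that follows the 4-byte field.
        value = symbol_addr - (section_base + offset + 4)
    else:
        raise ValueError(f"unknown patch type: {kind}")
    code[offset:offset + 4] = struct.pack("<I", value & 0xFFFFFFFF)

TEXT_BASE = 0x08048000     # hypothetical address chosen for the TEXT section
MSG_ADDR = 0x08049000      # hypothetical final address of msg
GETCHAR_ADDR = 0x08048100  # hypothetical final address of getchar

code = bytearray(23)       # stand-in for the first 23 bytes of TEXT
apply_patch(code, 14, "absolute", MSG_ADDR, TEXT_BASE)
apply_patch(code, 4, "displacement", GETCHAR_ADDR, TEXT_BASE)
print(code[14:18].hex(), code[4:8].hex())
```

An absolute patch stores the label's address itself, while a displacement patch stores the distance from the following instruction to the label.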

  49. Assembler Pass 2 (cont.) (program as on slide 7) • At call printf: assembler generates machine language code in current (TEXT) section…

  50. Assembler Data Structures (18) • Symbol Table • (Same) • Relocation Records • (Same) • RODATA Section (location counter: 4) • (Same) • TEXT Section (location counter: 23) • Assembler looks in Symbol Table to find offset of printf • printf is not in Symbol Table • Assembler cannot compute displacement that belongs at offset 19 • So…
