IKI10230 Pengantar Organisasi Komputer (Introduction to Computer Organization), Lecture no. 09: Compiling-Assembling-Linking

Sources: 1. Paul Carter, PC Assembly Language; 2. Hamacher, Computer Organization, 5th ed.; 3. Lecture materials from CS61C/2000 & CS152/1997, UCB. 21 April 2004

Presentation Transcript


  1. IKI10230 Pengantar Organisasi Komputer (Introduction to Computer Organization)
     Lecture no. 09: Compiling-Assembling-Linking
     Sources:
       1. Paul Carter, PC Assembly Language
       2. Hamacher, Computer Organization, 5th ed.
       3. Lecture materials from CS61C/2000 & CS152/1997, UCB
     21 April 2004
     L. Yohanes Stefanus (yohanes@cs.ui.ac.id)
     Bobby Nazief (nazief@cs.ui.ac.id)
     Course materials: http://www.cs.ui.ac.id/kuliah/POK/

  2. Steps to Starting a Program
     C program: foo.c
       → Compiler
     Assembly program: foo.s
       → Assembler
     Object (machine language module): foo.o
       → Linker ← lib.o
     Executable (machine language program): foo.exe
       → Loader
     Memory

  3. Example: C → Asm → Obj → Exe → Run

     #include <stdio.h>

     int main (int argc, char *argv[]) {
       int i;
       int sum = 0;

       for (i = 0; i <= 100; i = i + 1)
         sum = sum + i * i;
       printf ("The sum from 0 .. 100 is %d\n", sum);
     }

  4. Compiler
     • Input: High-Level Language Code (e.g., C, Java)
     • Output: Assembly Language Code (e.g., Intel x86)
     • Note: Output may contain directives & pseudoinstructions

  5. Example: C → Asm → Obj → Exe → Run

     segment .text
     LC0:    db "The sum from 0 .. 100 is %d",0xa,0
     _main:  push ebp
             mov ebp,esp
             sub esp,24
             mov dword [ebp-8],0
             mov dword [ebp-4],0
     L3:     cmp dword [ebp-4],100
             jle L6
             jmp L4
     L6:     mov eax,[ebp-4]
             imul eax,[ebp-4]
             add [ebp-8],eax
     L5:     inc dword [ebp-4]
             jmp L3
     L4:     add esp,-8
             mov eax,[ebp-8]
             push eax
             push dword LC0
             call _printf
             add esp,16
     L2:     mov esp,ebp
             pop ebp
             ret

  6. Where Are We Now?
     C program: foo.c
       → Compiler
     Assembly program: foo.s
       → Assembler
     Object (machine language module): foo.o
       → Linker ← lib.o
     Executable (machine language program): a.out
       → Loader
     Memory

  7. Assembler
     • Reads and uses directives
     • Replaces pseudoinstructions
     • Produces machine language
     • Creates object file

  8. Producing Machine Language
     • Simple Case
       • Arithmetic, Logical, Shifts, and so on.
       • All necessary info is within the instruction already.
     • What about Branches?
       • PC-Relative
       • So once pseudoinstructions are replaced by real ones, we know by how many instructions to branch.
     • What about jumps?
       • Some require absolute address.
     • What about references to data?
       • These will require the full 32-bit address of the data.
     • Addresses can’t be determined yet, so we create two tables…
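
     The PC-relative case can be sketched in a few lines of C. This is an illustration only, assuming 32-bit x86 near jumps whose displacement is measured from the address of the next instruction; the function names `branch_target` and `branch_disp` are made up for this note and are not from the lecture.

```c
#include <stdint.h>

/* Address reached by a relative jump: the displacement is added to the
   address of the instruction that follows the jump. */
uint32_t branch_target(uint32_t jump_addr, uint32_t jump_len, int32_t disp)
{
    return jump_addr + jump_len + (uint32_t)disp;
}

/* Displacement to encode so that the jump at jump_addr reaches target.
   (Hypothetical helper; the uint32_t to int32_t conversion assumes the
   usual two's-complement behaviour of common compilers.) */
int32_t branch_disp(uint32_t jump_addr, uint32_t jump_len, uint32_t target)
{
    return (int32_t)(target - (jump_addr + jump_len));
}
```

     With the offsets from the object-file listing in this lecture, the 5-byte jmp that follows the inc at 0x4c sits at 0x4f; its displacement 0xffffffe0 (that is, -32) gives 0x54 - 32 = 0x34, the address of L3. Because only the distance is encoded, the result does not change when the whole text segment is moved.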

  9. Symbol Table
     • List of “items” in this file that may be used by other files.
     • What are they?
       • Labels: function calling
       • Data: anything in the .data section; variables which may be accessed across files
     • First Pass: record label-address pairs
     • Second Pass: produce machine code
       • Result: can jump to a later label without first declaring it
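
     The two passes can be illustrated with a toy model in C. This is a minimal sketch, not how any particular assembler is implemented: the `struct line` input format, the fixed per-line sizes, and the names `first_pass` and `lookup` are all assumptions made for the illustration.

```c
#include <string.h>

/* Toy input line: an optional label it defines, an optional label it
   jumps to, and its encoded size in bytes (all hypothetical). */
struct line {
    const char *label;   /* label defined on this line, or NULL */
    const char *target;  /* label this line jumps to, or NULL */
    unsigned size;       /* size of the encoded line in bytes */
};

struct symbol {
    const char *name;
    unsigned addr;
};

/* First pass: walk the lines in order and record label-address pairs.
   syms must have room for one entry per label. Returns the entry count. */
int first_pass(const struct line *prog, int n, struct symbol *syms)
{
    unsigned addr = 0;
    int count = 0;
    for (int i = 0; i < n; i++) {
        if (prog[i].label) {
            syms[count].name = prog[i].label;
            syms[count].addr = addr;
            count++;
        }
        addr += prog[i].size;
    }
    return count;
}

/* Used by the second pass: returns 1 and stores the address if the label
   is defined in this file, 0 if it must be left for the linker. */
int lookup(const struct symbol *syms, int count, const char *name,
           unsigned *addr)
{
    for (int i = 0; i < count; i++) {
        if (strcmp(syms[i].name, name) == 0) {
            *addr = syms[i].addr;
            return 1;
        }
    }
    return 0;
}
```

     Because the table is complete before any code is emitted, a forward reference such as `jle L6` can be resolved in the second pass even though L6 appears later in the file. A failed lookup (for example `_printf`) is what ends up in the relocation information.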

  10. Relocation Table
      • List of “items” for which this file needs the address.
      • What are they?
        • Any label jumped to: jmp or call
          • internal
          • external (including lib files)
        • Any piece of data

  11. Object File Format
      • object file header: size and position of the other pieces of the object file
      • text segment: the machine code
      • data segment: binary representation of the data in the source file
      • relocation information: identifies lines of code that need to be “handled”
      • symbol table: list of this file’s labels and data that can be referenced
      • debugging information

  12. Example: C → Asm → Obj → Exe → Run

      segment .text
      0x0:    db "The sum from 0 .. 100 is %d",0xa,0
      0x1d:   push ebp
              mov ebp,esp
              sub esp,24
              mov dword [ebp-8],0
              mov dword [ebp-4],0
      0x34:   cmp dword [ebp-4],100
              jle 0x05 (0x42)
              jmp 0x00000012 (0x54)
      0x42:   mov eax,[ebp-4]
              imul eax,[ebp-4]
              add [ebp-8],eax
      0x4c:   inc dword [ebp-4]
              jmp 0xffffffe0 (0x34)
      0x54:   add esp,-8
              mov eax,[ebp-8]
              push eax
              push 0x0
              call 0x0
              add esp,16
      0x6e:   mov esp,ebp
              pop ebp
              ret

  13. Symbol Table Entries
      • Symbol Table

        Label   Address
        LC0:    0x00000000
        main:   0x0000001d
        L3:     0x00000034
        L6:     0x00000042
        L5:     0x0000004c
        L4:     0x00000054
        L2:     0x0000006e

      • Relocation Information

        Offset      Type    Value
        0x0000005f  dir32   .text (LC0: offset 0 of .text segment)
        0x00000064  DISP32  _printf

  14. Where Are We Now?
      C program: foo.c
        → Compiler
      Assembly program: foo.s
        → Assembler
      Object (machine language module): foo.o
        → Linker ← lib.o
      Executable (machine language program): a.out
        → Loader
      Memory

  15. Link Editor/Linker
      • Step 1: Take text segment from each .o file and put them together.
      • Step 2: Take data segment from each .o file, put them together, and concatenate this onto end of text segments.
      • Step 3: Resolve References
        • Go through Relocation Table and handle each entry
        • That is, fill in all absolute addresses
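
      Steps 1 and 2 amount to assigning a base address to every segment. The C sketch below shows one way to picture that; it is a simplification that ignores alignment and padding, and the `struct object` fields and the `layout` function are hypothetical names chosen for this note.

```c
#include <stdint.h>

/* Sizes come from each object file's header; the bases are filled in
   by layout(). Field names are assumptions made for this sketch. */
struct object {
    uint32_t text_size;
    uint32_t data_size;
    uint32_t text_base;
    uint32_t data_base;
};

/* Step 1: place all text segments back to back starting at `start`.
   Step 2: place all data segments right after the last text segment.
   Returns the first address past the end of the image. */
uint32_t layout(struct object *objs, int n, uint32_t start)
{
    uint32_t addr = start;
    for (int i = 0; i < n; i++) {
        objs[i].text_base = addr;
        addr += objs[i].text_size;
    }
    for (int i = 0; i < n; i++) {
        objs[i].data_base = addr;
        addr += objs[i].data_size;
    }
    return addr;
}
```

      For example, if the objects linked ahead of foo.o contribute 0x15c0 bytes of text, foo.o's text segment gets base 0x15c0, which matches the addresses in the linked example of this lecture. Once every base is known, Step 3 has what it needs to turn segment offsets into absolute addresses.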

  16. Four Types of Addresses
      • PC-Relative Addressing (beq, bne on MIPS; conditional jumps such as jle on x86): never relocate
      • Absolute Address (jmp, call): always relocate
      • External Reference (usually call): always relocate
      • Data Reference: always relocate

  17. Resolving References
      • Linker assumes first word of first text segment is at address 0x00000000.
      • Linker knows:
        • length of each text and data segment
        • ordering of text and data segments
      • Linker calculates:
        • absolute address of each label to be jumped to (internal or external) and each piece of data being referenced
      • To resolve references:
        • search for reference (data or label) in all symbol tables
        • if not found, search library files (for example, for printf)
        • once absolute address is determined, fill in the machine code appropriately
      • Output of linker: executable file containing text and data (plus header)
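
      "Fill in the machine code appropriately" can be made concrete with the two relocation types that appear in this lecture's example, dir32 and DISP32. The sketch below is an assumption-laden illustration rather than a real linker: it treats dir32 as "add the symbol's absolute address to the stored 32-bit field" and DISP32 as "store the distance from the end of the field to the symbol", and the names `apply_reloc`, `read_le32`, and `write_le32` are invented here.

```c
#include <stdint.h>

enum reloc_type { RELOC_DIR32, RELOC_DISP32 };

/* Read a 32-bit little-endian value (x86 byte order). */
uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Write a 32-bit little-endian value. */
void write_le32(uint8_t *p, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        p[i] = (uint8_t)(v >> (8 * i));
}

/* Patch the 32-bit field at `offset` inside a text segment that the
   linker has placed at `text_base`. `symbol_addr` is the absolute
   address found in a symbol table or a library. */
void apply_reloc(uint8_t *text, uint32_t text_base, uint32_t offset,
                 enum reloc_type type, uint32_t symbol_addr)
{
    uint32_t value = read_le32(text + offset) + symbol_addr;
    if (type == RELOC_DISP32)
        value -= text_base + offset + 4;  /* relative to next instruction */
    write_le32(text + offset, value);
}
```

      Applied to the example, with foo.o's text at 0x15c0: the dir32 entry at offset 0x5f becomes 0x000015c0 (the address of the string), and the DISP32 entry at offset 0x64 becomes 0x2da0 - 0x1628 = 0x1778 when _printf is at 0x2da0, the same values shown in the linked listing.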

  18. Example: C → Asm → Obj → Exe → Run

      segment .text
      0x15c0: db "The sum from 0 .. 100 is %d",0xa,0
      0x15dd: push ebp
              mov ebp,esp
              sub esp,24
              mov dword [ebp-8],0
              mov dword [ebp-4],0
      0x15f4: cmp dword [ebp-4],100
              jle 0x05 (0x1602)
              jmp 0x12 (0x1614)
      0x1602: mov eax,[ebp-4]
              imul eax,[ebp-4]
              add [ebp-8],eax
      0x160c: inc dword [ebp-4]
              jmp 0xffffffe0 (0x15f4)
      0x1614: add esp,-8
              mov eax,[ebp-8]
              push eax
              push 0x000015c0
              call 0x00001778 (0x2da0)*
              add esp,16
      0x162e: mov esp,ebp
              pop ebp
              ret

      * 0x1628 + 0x1778 = 0x2da0

  19. Memory Map of the .EXE

      Address                 Contents
      00000000 ...            other objects
      000015C0 - 00001631     Foo.o .text
      ...                     other objects (..., _printf, ...)
      0000B000 ... 0000BB04   .data

  20. Where Are We Now?
      C program: foo.c
        → Compiler
      Assembly program: foo.s
        → Assembler
      Object (machine language module): foo.o
        → Linker ← lib.o
      Executable (machine language program): a.out
        → Loader
      Memory

  21. Loader (1/3)
      • Executable files are stored on disk.
      • When one is run, loader’s job is to load it into memory and start it running.
      • In reality, loader is the operating system (OS)
        • loading is one of the OS tasks

  22. Loader (2/3)
      • So what does a loader do?
      • Reads executable file’s header to determine size of text and data segments
      • Creates new address space for program large enough to hold text and data segments, along with a stack segment
      • Copies instructions and data from executable file into the new address space (this may be anywhere in memory)

  23. Loader (3/3)
      • Copies arguments passed to the program onto the stack
      • Initializes machine registers
        • Most registers cleared, but stack pointer assigned address of 1st free stack location
      • Jumps to start-up routine that copies program’s arguments from stack to registers and sets the PC
      • If main routine returns, start-up routine terminates program with the exit system call
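
      The first loader steps (read the sizes, create an address space, copy text and data, set the stack pointer) can be mimicked in user-level C. This is only a toy model under stated assumptions: the "address space" is a heap buffer, the stack is assumed to grow downward from the end of that buffer, and `struct exe_image`, `struct address_space`, and `load_image` are hypothetical names. A real loader is part of the OS and works with page mappings rather than `memcpy`.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* What the loader learns from the executable's header (simplified). */
struct exe_image {
    const uint8_t *text;
    size_t text_size;
    const uint8_t *data;
    size_t data_size;
};

/* Simulated address space: one buffer holding text, data, and stack. */
struct address_space {
    uint8_t *mem;
    size_t size;
    size_t text_at;    /* offset of the text segment in mem */
    size_t data_at;    /* offset of the data segment in mem */
    size_t stack_top;  /* initial stack pointer (offset in mem) */
};

/* Returns 0 on success, -1 if the address space cannot be allocated.
   The caller releases as->mem with free(). */
int load_image(const struct exe_image *exe, size_t stack_size,
               struct address_space *as)
{
    as->size = exe->text_size + exe->data_size + stack_size;
    as->mem = calloc(as->size, 1);   /* zero-filled, like cleared memory */
    if (as->mem == NULL)
        return -1;
    as->text_at = 0;
    as->data_at = exe->text_size;
    memcpy(as->mem + as->text_at, exe->text, exe->text_size);
    memcpy(as->mem + as->data_at, exe->data, exe->data_size);
    as->stack_top = as->size;        /* assumption: stack grows downward */
    return 0;
}
```

      Copying the program's arguments onto the stack and jumping to the start-up routine are left out, since they depend on the calling convention and on actually executing the loaded code.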

  24. Example: C → Asm → Obj → Exe → Run

      0x000015c0: 0x20656854 0x206d7573 0x6d6f7266 0x2e203020
      0x000015d0: 0x3031202e 0x73692030 0x0a642520 0xe5895500
      0x000015e0: 0x0018ec81 0x45c70000 0x000000f8 0xfc45c700
      0x000015f0: 0x00000000 0x64fc7d81 0x7e000000 0x0012e905
      0x00001600: 0x458b0000 0x45af0ffc 0xf84501fc 0xe9fc45ff
      0x00001610: 0xffffffe0 0xfff8c481 0x458bffff 0xc06850f8
      0x00001620: 0xe8000015 0x00001778 0x0010c481 0xec890000
      0x00001630: 0x0000c35d

      0x000015c0: 54 68 65 20 73 75 6d 20 66 72 6f 6d 20 30 20 2e
                  T  h  e     s  u  m     f  r  o  m     0     .

      000015dd: 55              push ebp
      000015de: 89e5            mov ebp,esp
      000015e0: 81ec18000000    sub esp,0x18
      000015e6: c745f800000000  mov [ebp-8],0
      000015ed: c745fc00000000  mov [ebp-4],0
      000015f4: 817dfc64000000  cmp [ebp-4],0x64
      000015fb: 7e05            jle 0x1602
      000015fd: e912000000      jmp 0x1614
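
      The word dump and the byte dump above show the same memory; they differ because x86 is little-endian, so the lowest-addressed byte is the rightmost pair of hex digits in each 32-bit word. A small C helper makes the conversion explicit (the name `word_to_bytes` is made up for this note):

```c
#include <stdint.h>

/* Split a 32-bit word into the four bytes it occupies in little-endian
   memory, lowest address first. */
void word_to_bytes(uint32_t word, uint8_t out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (uint8_t)(word >> (8 * i));
}
```

      The first word, 0x20656854, therefore corresponds to the bytes 54 68 65 20, which are the characters 'T', 'h', 'e', ' ' at addresses 0x15c0 through 0x15c3.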

  25. .ASM, .O, & .EXE (COFF format)

  26. Example: C → Asm → Obj → Exe → Run

      .text
      LC0:    .ascii "The sum from 0 .. 100 is %d\12\0"
      main:   pushl %ebp
              movl %esp,%ebp
              subl $24,%esp
              movl $0,-8(%ebp)
              movl $0,-4(%ebp)
      L3:     cmpl $100,-4(%ebp)
              jle L6
              jmp L4
      L6:     movl -4(%ebp),%eax
              imull -4(%ebp),%eax
              addl %eax,-8(%ebp)
      L5:     incl -4(%ebp)
              jmp L3
      L4:     addl $-8,%esp
              movl -8(%ebp),%eax
              pushl %eax
              pushl $LC0
              call _printf
              addl $16,%esp
      L2:     movl %ebp,%esp
              popl %ebp
              ret

  27. Example: C → Asm → Obj → Exe → Run

      .text
      0x0:    .ascii "The sum from 0 .. 100 is %d\12\0"
      0x20:   pushl %ebp
              movl %esp,%ebp
              subl $24,%esp
              movl $0,-8(%ebp)
              movl $0,-4(%ebp)
      0x34:   cmpl $100,-4(%ebp)
              jle 6 (0x40)
              jmp 0x14 (0x50)
      0x40:   movl -4(%ebp),%eax
              imull -4(%ebp),%eax
              addl %eax,-8(%ebp)
      0x4a:   incl -4(%ebp)
              jmp -0x1b (0x34)
      0x50:   addl $-8,%esp
              movl -8(%ebp),%eax
              pushl %eax
              pushl $0x0
              call 0x0 (undefined)
              addl $16,%esp
      0x64:   movl %ebp,%esp
              popl %ebp
              ret

  28. Symbol Table Entries
      • Symbol Table

        Label   Address
        LC0:    0x00000000
        L2:     0x00000064
        L3:     0x00000034
        L4:     0x00000050
        L5:     0x0000004a
        L6:     0x00000040
        main:   0x00000020

      • Relocation Information

        Address     Instr. Type  Dependency
        0x0000005c  call         printf

  29. Example: C → Asm → Obj → Exe → Run

      .text
      0x15c0: .ascii "The sum from 0 .. 100 is %d\12\0"
      0x15e0: pushl %ebp
              movl %esp,%ebp
              subl $24,%esp
              movl $0,-8(%ebp)
              movl $0,-4(%ebp)
      0x15f4: cmpl $100,-4(%ebp)
              jle 6 (0x1600)
              jmp 0x14 (0x1610)
      0x1600: movl -4(%ebp),%eax
              imull -4(%ebp),%eax
              addl %eax,-8(%ebp)
      0x160a: incl -4(%ebp)
              jmp -0x1b (0x15f4)
      0x1610: addl $-8,%esp
              movl -8(%ebp),%eax
              pushl %eax
              pushl $0x15c0
              call 0x2d90
              addl $16,%esp
      0x1624: movl %ebp,%esp
              popl %ebp
              ret