The Assembly Process


Computer Organization and Assembly Language: Module 10

Machine Code Generation
  • Assembling a program entails translating the assembly language into binary machine code
  • This requires more than simply mapping assembly instructions to machine instructions
    • Each instruction is bound to an address
    • Labels are bound to addresses
    • Assembly instructions which refer to labels generate machine instructions which contain the label's address
    • Pseudo-instructions are translated into one or more machine instructions
Instruction Format

(see Appendix A of Patterson & Hennessy for complete details)

addi $13,$7,50

  field:   opcode    rs       rt       immediate operand
  width:   6 bits    5 bits   5 bits   16 bits
  bits:    0010 00   00111    01101    0000 0000 0011 0010

add $13,$7,$8

  field:   opcode    rs       rt       rd       extended opcode
  width:   6 bits    5 bits   5 bits   5 bits   11 bits (5-bit shift amount, 6-bit function code)
  bits:    0000 00   00111    01000    01101    000 0010 0000
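The two encodings above can be reproduced by shifting each field into place. The following is a minimal Python sketch, not part of the original slides; the function names `encode_i_type` and `encode_r_type` are made up for illustration, and the opcode and function-code constants are the bit patterns shown in the slide.

```python
def encode_i_type(opcode, rs, rt, imm):
    """Pack an I-type word: opcode(6) rs(5) rt(5) immediate(16)."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def encode_r_type(rs, rt, rd, shamt, funct):
    """Pack an R-type word: opcode(6)=0 rs(5) rt(5) rd(5) shamt(5) funct(6)."""
    return (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


ADDI_OPCODE = 0b001000  # opcode bits shown for addi in the slide
ADD_FUNCT = 0b100000    # low 6 bits of the extended opcode shown for add

# addi $13,$7,50: the destination $13 goes in rt, the source $7 in rs
addi_word = encode_i_type(ADDI_OPCODE, rs=7, rt=13, imm=50)
# add $13,$7,$8: the destination $13 goes in rd
add_word = encode_r_type(rs=7, rt=8, rd=13, shamt=0, funct=ADD_FUNCT)

print(f"{addi_word:032b}")  # the four addi fields, concatenated
print(f"{add_word:032b}")   # the add fields, concatenated
```

Note that the destination register appears first in the assembly text but sits in a different field depending on the format.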

The symbol table
  • The assembler scans the source code and generates the appropriate bit string for each line encountered
  • The assembler must remember
    • what memory locations have been allocated
    • to which address each label is bound
  • A symbol table is a list of (label, address) pairs
  • When the data and text segments have been generated, they are stored as an executable file
  • The file is used by a program called the loader to initialize memory to the appropriate state before execution
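As a rough illustration of how labels get bound to addresses, the sketch below walks a list of `.word` directives and records a (label, address) pair for each. It is a simplified Python model, not SPIM's implementation: the function name `build_data_symbols`, the input format, and the data-segment base of 0x10000000 are assumptions (the base matches the worked example later in this deck).

```python
DATA_BASE = 0x10000000  # assumed start of the data segment


def build_data_symbols(directives, base=DATA_BASE):
    """Bind each label to the next free address; each .word value takes 4 bytes."""
    symbols = {}  # label -> address
    memory = {}   # address -> word value (the allocated locations)
    address = base
    for label, values in directives:
        symbols[label] = address
        for value in values:
            memory[address] = value
            address += 4
    return symbols, memory


# a1: .word 3 / a2: .word 16, 16, 16, 16 / a3: .word 5
directives = [("a1", [3]), ("a2", [16, 16, 16, 16]), ("a3", [5])]
symbols, memory = build_data_symbols(directives)
for label, address in symbols.items():
    print(f"{label} {address:08x}")
```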
Instructions
  • The .text directive tells the assembler that the lines which follow are instructions.
    • By default, the text segment starts at 0x00400000
  • In some cases, a symbol may not yet have an assigned address when the assembler scans a line that refers to it
    • A second pass through the code can update instructions containing unresolved labels
    • Alternatively, maintain a list of the addresses at which each unresolved label appears
      • When the label is added to the symbol table, all locations in the corresponding list are updated to hold the address associated with the label
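The list-based approach can be sketched in a few lines of Python. This is an illustrative model under stated assumptions, not an actual assembler: the input format (tuples tagged "label" or "instr"), the function name `assemble_with_backpatching`, and the sample program are all hypothetical, and every instruction is assumed to occupy one 4-byte word.

```python
def assemble_with_backpatching(lines, base=0x00400000):
    """lines: ("label", name) or ("instr", text, referenced_label_or_None).

    Returns (symbols, resolved); resolved maps each instruction address to
    (text, address_of_the_label_it_refers_to_or_None).
    """
    symbols = {}
    pending = {}   # label -> addresses of instructions still waiting for it
    resolved = {}
    address = base
    for line in lines:
        if line[0] == "label":
            name = line[1]
            symbols[name] = address
            # Update every location that referred to this label earlier.
            for use_address in pending.pop(name, []):
                text, _ = resolved[use_address]
                resolved[use_address] = (text, address)
        else:
            _, text, ref = line
            if ref is None or ref in symbols:
                resolved[address] = (text, symbols.get(ref))
            else:
                resolved[address] = (text, None)  # placeholder until defined
                pending.setdefault(ref, []).append(address)
            address += 4
    if pending:
        raise ValueError(f"undefined labels: {sorted(pending)}")
    return symbols, resolved


program = [
    ("instr", "beq $2, $0, done", "done"),  # forward reference
    ("label", "top"),
    ("instr", "addi $2, $2, -1", None),
    ("instr", "bne $2, $0, top", "top"),    # backward reference
    ("label", "done"),
    ("instr", "syscall", None),
]
symbols, resolved = assemble_with_backpatching(program)
```

The forward reference to `done` is recorded in `pending` when first seen and filled in once the label is reached, so a single scan is enough.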
Pseudo-instructions

  Pseudo    Actual machine implementation
  add       add, addi, addu, or addiu
  mul       mult and mflo
  div       div and mflo (extra for div by zero check)
  rem       div and mfhi (extra for div by zero check)
  li        lui [and ori]
  la        lui and ori
  move      ori with $0
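To make the "li" row concrete, here is a hedged Python sketch of one possible expansion rule. It is an assumption about how an assembler might choose, not a description of SPIM's actual logic: a value that fits in 16 bits becomes a single ori with $0 (as in the worked example later in this deck), and anything larger becomes lui, followed by ori only when the low half is nonzero. The name `expand_li` is hypothetical and only unsigned 32-bit values are handled.

```python
def expand_li(reg, value):
    """Expand `li reg, value` (unsigned 32-bit value) into real instructions."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("this sketch only handles unsigned 32-bit values")
    upper, lower = value >> 16, value & 0xFFFF
    if upper == 0:
        return [f"ori {reg}, $0, {lower}"]  # fits in the 16-bit immediate
    instructions = [f"lui {reg}, 0x{upper:04x}"]  # load the upper half
    if lower != 0:
        instructions.append(f"ori {reg}, {reg}, 0x{lower:04x}")  # fill in the lower half
    return instructions


print(expand_li("$v0", 10))
print(expand_li("$6", 0x10000004))
```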

Branch offset in the MIPS R2000
  • In machine code, the target address of a branch must be specified as an offset from the address of the branch.
  • During execution, this offset is added to the program counter (PC) to obtain the address of the next instruction to fetch
    • PC contains the address of the branch instruction
    • Offset is measured in words, not bytes

PC_NEW = offset*4 + PC_OLD

  • To calculate the offset, the assembler uses the formula:

offset = (target instruction address - branch instruction address) / 4
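The two formulas are inverses of each other, which a short Python sketch can confirm. The function names `branch_offset` and `next_pc` are hypothetical; the sketch follows the slide's formula as written, measuring from the branch's own address.

```python
def branch_offset(target_address, branch_address):
    """Word offset per the formula: (target - branch) / 4."""
    byte_distance = target_address - branch_address
    if byte_distance % 4 != 0:
        raise ValueError("instructions must sit on word boundaries")
    return byte_distance // 4


def next_pc(pc_old, offset):
    """PC_NEW = offset*4 + PC_OLD."""
    return offset * 4 + pc_old


# A branch at 0x00400014 back to an instruction at 0x00400008
offset = branch_offset(0x00400008, 0x00400014)
print(offset)                              # -3
print(hex(next_pc(0x00400014, offset)))    # 0x400008
```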

Branch offset calculation
  • The offset is stored in the instruction as a word offset rather than a byte offset.
    • Instructions are only stored at word boundaries
    • For both the target and the branch instruction, the two least-significant bits of the address are zero
  • An offset may be negative
    • If the target instruction precedes the branch instruction
  • The offset is stored in the 16-bit immediate field
    • This means the branch can only jump about 2^15 instructions before or after the current address
      • 2^15 instructions (words) = 2^17 bytes
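A signed offset fits the 16-bit immediate field as a two's-complement value. The Python sketch below shows the range check and the conversion in both directions; `encode_offset16` and `decode_offset16` are illustrative names, not part of any real tool.

```python
def encode_offset16(offset):
    """Store a signed word offset in a 16-bit two's-complement field."""
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("branch target out of range")
    return offset & 0xFFFF


def decode_offset16(field):
    """Recover the signed word offset from the 16-bit field."""
    return field - 0x10000 if field & 0x8000 else field


print(f"{encode_offset16(-26):04x}")  # ffe6
print(decode_offset16(0xFFFD))        # -3
print((1 << 15) * 4 == 1 << 17)       # True: 2^15 words span 2^17 bytes
```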
Branch offset calculation
  • An entry in the SPIM instruction list

[0x00400068] 0x1440ffe6 bne $2, $0, -104 [__start-0x00400068] ; 44: bnez $v0, __start

    • [0x00400068]: instruction address
    • 0x1440ffe6: machine code
    • -104 [__start-0x00400068]: offset calculation, in bytes (ignores PC increment)
    • 44: line number in source file
    • bnez $v0, __start: original assembly code
  • Offset in bytes (__start = 0x00400000): 0x00400000 - 0x00400068 = -104
  • Stored offset: 0xffe6 = -26 = -104/4
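The machine code in that SPIM entry can be taken apart to check the numbers. The following Python sketch assumes the I-type field layout shown earlier in the deck; `decode_branch_word` is a made-up helper name.

```python
def decode_branch_word(word):
    """Split an I-type branch word into opcode, rs, rt, and signed word offset."""
    opcode = word >> 26
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    field = word & 0xFFFF
    offset_words = field - 0x10000 if field & 0x8000 else field
    return opcode, rs, rt, offset_words


opcode, rs, rt, offset_words = decode_branch_word(0x1440FFE6)
offset_bytes = offset_words * 4
# Following the listing, the offset is applied to the branch's own address.
target = 0x00400068 + offset_bytes

print(f"{opcode:06b}", rs, rt)     # 000101 2 0  -> bne $2, $0
print(offset_words, offset_bytes)  # -26 -104
print(f"{target:#010x}")           # 0x00400000, i.e. __start
```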

Jump target calculation

[Figure: memory divided into sixteen regions, labeled f down to 0]

  • The jump instruction has two forms
    • Pseudo-direct, for j and jal
    • Register direct for jr and jalr
  • jr and jalr specify a register containing the address to be loaded into the PC
  • j and jal specify most of the address of the target within the instruction.
    • However, they have a range of at most one-sixteenth of the memory space
Jump target calculation

[Figure: the 32-bit target address is formed from the high-order 4 bits of the PC, the 26 jump target bits that follow the opcode in the instruction, and two low-order 00 bits]

  • The target address is a 32-bit quantity
    • Since all word addresses are multiples of 4, there is no need to store the last two bits
    • The jump instruction format has 26 bits for the target address
      • The remaining 6 bits of the instruction are used for the opcode
    • The highest-order 4 bits of the target are taken from the address currently stored in the program counter
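The way the three pieces combine can be sketched in Python. This follows the description above literally (top 4 bits from the value in the PC); the helper names `jump_field` and `jump_target` are hypothetical.

```python
def jump_field(target_address):
    """The 26 bits stored in the instruction: drop the low 2 bits and the top 4."""
    return (target_address >> 2) & 0x03FFFFFF


def jump_target(pc, field):
    """Rebuild the 32-bit target: PC's top 4 bits, the 26-bit field, then 00."""
    return (pc & 0xF0000000) | ((field & 0x03FFFFFF) << 2)


field = jump_field(0x00400008)
print(f"{field:#09x}")                           # 0x0100002
print(f"{jump_target(0x00400014, field):#010x}") # 0x00400008
```

Because the top 4 bits come from the PC rather than the instruction, a jump cannot leave the region of memory that shares those 4 bits.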
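Jump target calculation

[Figure: memory divided into sixteen regions, labeled f down to 0]

  • Jump instructions have a range of 2^26 words, or 2^26 x 2^2 = 2^28 bytes
    • This range is NOT symmetric about the jump instruction
    • Example from the figure: a jump at 0x80000080 can reach back 0x00000080 bytes (to 0x80000000) and forward 0x0fffff7c bytes (to 0x8ffffffc)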
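The asymmetry follows directly from fixing the top 4 bits. The Python sketch below computes the lowest and highest reachable addresses for a jump located at 0x80000080, the address used in the figure; `jump_reach` is an illustrative name.

```python
def jump_reach(pc):
    """Lowest and highest word addresses reachable by j/jal from this PC."""
    lowest = pc & 0xF0000000       # 26-bit field all zeros
    highest = lowest | 0x0FFFFFFC  # 26-bit field all ones, low two bits 00
    return lowest, highest


pc = 0x80000080
lowest, highest = jump_reach(pc)
backward = pc - lowest
forward = highest - pc

print(f"-{backward:#010x}")  # -0x00000080
print(f"+{forward:#010x}")   # +0x0fffff7c
```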

Program relocation
  • Program modules may be developed separately by different programmers. When these modules are loaded into memory, they must not be assigned overlapping memory space.
  • To handle this problem, the modules have to be relocated
    • Relative addresses are relocatable
    • Any absolute references must be "fixed" by the loader
      • Use a logical base address known at load time
      • Absolute addresses are stored as offsets from this base, which is not determined until load time
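A toy model of the loader's fix-up step might look like the Python sketch below. Everything about the format is an assumption made for illustration: the module is a list of 32-bit words, `relocation_entries` is a hypothetical list of indices whose words hold offsets from the base, and `relocate` is a made-up name. Real object file formats record this information differently.

```python
def relocate(words, relocation_entries, base):
    """Return a copy of `words` with `base` added at every listed index."""
    loaded = list(words)
    for index in relocation_entries:
        loaded[index] = (loaded[index] + base) & 0xFFFFFFFF
    return loaded


# Words 0 and 2 hold absolute references stored as offsets from the base;
# word 1 is ordinary data and must be left alone.
module = [0x00000010, 0x12345678, 0x00000004]
loaded = relocate(module, relocation_entries=[0, 2], base=0x10000000)

print([f"{word:08x}" for word in loaded])
```

Loading the same module at a different base only changes the `base` argument; the module itself is untouched.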
From source to executable

[Figure: high-level source code -> compiler -> asm -> assembler -> obj; obj and lib -> linker -> exe -> loader -> memory]

Some examples of assembling code

          .data
  a1:     .word 3
  a2:     .word 16, 16, 16, 16
  a3:     .word 5
          .text
  __start:
          la $6, a2
  loop:
          lw $7, 4($6)
          mul $9, $10, $7
          b loop
          li $v0, 10
          syscall
Some examples of assembling code

Symbol Table

  symbol     address
  a1         1000 0000
  a2         1000 0004
  a3         1000 0014
  __start    0040 0000
  loop       0040 0008

Memory map of data section

  address      contents
  1000 0000    0000 0003
  1000 0004    0000 0010
  1000 0008    0000 0010
  1000 000c    0000 0010
  1000 0010    0000 0010
  1000 0014    0000 0005
Translate pseudo-instructions

Original code:

          la $6, a2
  loop:
          lw $7, 4($6)
          mul $9, $10, $7
          b loop
          li $v0, 10
          syscall

After translation:

          lui $6, 0x1000
          ori $6, $6, 0x0004
  loop:
          lw $7, 4($6)
          mult $10, $7
          mflo $9
          b loop
          ori $v0, $0, 10
          syscall
Translate to machine code

  address     contents     instruction
  00400000    3c06 1000    lui $6, 0x1000
  00400004    34c6 0004    ori $6, $6, 0x0004
  00400008    8cc7 0004    lw $7, 4($6)
  0040000c    0147 0018    mult $10, $7
  00400010    0000 4812    mflo $9
  00400014    1000 xxxx    b loop (assembled as beq; offset not yet resolved)
  00400018    3402 000a    ori $v0, $0, 10
  0040001c    0000 000c    syscall

Resolve relative references

  address     contents          instruction
  00400000    3c06 1000         lui $6, 0x1000
  00400004    34c6 0004         ori $6, $6, 0x0004
  00400008    8cc7 0004         lw $7, 4($6)
  0040000c    0147 0018         mult $10, $7
  00400010    0000 4812         mflo $9
  00400014    1000 fffd (-3)    b loop
  00400018    3402 000a         ori $v0, $0, 10
  0040001c    0000 000c         syscall

[0x400008 - 0x400014]/4 = -12/4 = -3 = 0xfffd
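This last step amounts to filling the low 16 bits of the branch word once the address of loop is known. The Python sketch below does that for the word at 0x00400014, using the slide's formula (offset measured from the branch's own address); `patch_branch` is a hypothetical helper name.

```python
def patch_branch(word, target_address, branch_address):
    """Fill the 16-bit immediate of a branch word with the resolved word offset."""
    offset = (target_address - branch_address) // 4
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("branch target out of range")
    return (word & 0xFFFF0000) | (offset & 0xFFFF)


LOOP = 0x00400008  # address of loop from the symbol table
# 0x1000 in the upper half is the branch encoding; the lower half is the unresolved offset
patched = patch_branch(0x10000000, LOOP, 0x00400014)

print(f"{patched:08x}")  # 1000fffd
```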
