CprE 381 Computer Organization and Assembly Level Programming, Fall 2013 Chapter 2 Instructions: Language of the Computer Zhao Zhang Iowa State University Revised from original slides provided by MKP
Procedure/Function Calling (§2.8 Supporting Procedures in Computer Hardware)
• Steps required
  • Place parameters in registers
  • Transfer control to procedure
  • Acquire storage for procedure
  • Perform procedure's operations
  • Place result in register for caller
  • Return to place of call

Register Usage Review
• $a0 – $a3: arguments (regs 4 – 7)
• $v0, $v1: result values (regs 2 and 3)
• $t0 – $t9: temporaries
  • Can be overwritten by callee
• $s0 – $s7: saved
  • Must be saved/restored by callee
• $gp: global pointer for static data (reg 28)
• $sp: stack pointer (reg 29)
• $fp: frame pointer (reg 30)
• $ra: return address (reg 31)
Note: There are additional rules for floating-point registers.

Procedure Call Instructions
• Procedure call: jump and link
    jal ProcedureLabel
  • Address of following instruction put in $ra
  • Jumps to target address
• Procedure return: jump register
    jr $ra
  • Copies $ra to program counter
  • Can also be used for computed jumps
    • e.g., for case/switch statements
Leaf Procedure Example
• C code:
    int leaf_example(int g, int h, int i, int j)
    {
        int f;
        f = (g + h) - (i + j);
        return f;
    }
• Arguments g, …, j in $a0, …, $a3
• f in $s0 (hence, need to save $s0 on stack)
• Result in $v0
Leaf Procedure Example
• MIPS code:
    leaf_example:
        addi $sp, $sp, -4      # save $s0 on stack
        sw   $s0, 0($sp)
        add  $t0, $a0, $a1     # procedure body
        add  $t1, $a2, $a3
        sub  $s0, $t0, $t1
        add  $v0, $s0, $zero   # result
        lw   $s0, 0($sp)       # restore $s0
        addi $sp, $sp, 4
        jr   $ra               # return
Exercise
• Write MIPS code for
    int add2(int x, int y)
    {
        return x + y;
    }

Exercise
• First version, with stack frame
    # x in $a0, y in $a1, return in $v0
    add2:
        addi $sp, $sp, -4      # alloc frame
        sw   $s0, 0($sp)       # save $s0
        add  $s0, $a0, $a1     # tmp = x + y
        add  $v0, $s0, $zero   # $v0 = tmp
        lw   $s0, 0($sp)       # restore $s0
        addi $sp, $sp, 4       # release frame
        jr   $ra

Exercise
• Optimized version, without stack frame
    # x in $a0, y in $a1, return in $v0
    add2:
        add  $v0, $a0, $a1     # $v0 = x + y
        jr   $ra
• In this case, we have nothing to store in the stack frame

Exercise
• Write MIPS code for
    int max(int x, int y)
    {
        if (x > y)
            return x;
        else
            return y;
    }
Exercise
    # x in $a0, y in $a1, return in $v0
    max:
        slt  $t0, $a1, $a0     # $t0 = (y < x)
        beq  $t0, $zero, else  # no, do else
        add  $v0, $a0, $zero   # to return x
        jr   $ra               # return
    else:
        add  $v0, $a1, $zero   # to return y
        jr   $ra               # return
Non-Leaf Procedures
• Procedures that call other procedures
• For nested call, caller needs to save on the stack:
  • Its return address
  • Any arguments and temporaries needed after the call
• Restore from the stack after the call

Stack Frame Contents
• A complete stack frame may hold
  • Extra arguments exceeding $a0-$a3
  • Saved registers ($s0-$s7) that will be overwritten
  • Return address ($ra)
  • Local, automatic variables
A non-leaf function must have a stack frame, because $ra has to be saved.

Local Data on the Stack
• Local data allocated by callee
  • e.g., C automatic variables
• Procedure frame (activation record)
  • Used by some compilers to manage stack storage
• Our examples do not use $fp
Non-Leaf Procedure Example
• Write MIPS code for
    int max3(int x, int y, int z)
    {
        return max(max(x, y), z);
    }
• We have to use a procedure frame in the stack (stack frame)

Non-Leaf Procedure Example
    # x in $a0, y in $a1, z in $a2, ret in $v0
    max3:
        addi $sp, $sp, -8      # alloc stack frame
        sw   $ra, 4($sp)       # preserve $ra
        sw   $a2, 0($sp)       # preserve z
        jal  max               # call max(x, y)
        add  $a0, $v0, $zero   # $a0 = max(x, y)
        lw   $a1, 0($sp)       # $a1 = z
        jal  max               # 2nd call max(…)
        lw   $ra, 4($sp)       # restore $ra
        addi $sp, $sp, 8       # free stack frame
        jr   $ra               # return

Non-Leaf Procedure Example
• Write MIPS code for
    int add3(int x, int y, int z)
    {
        return add2(add2(x, y), z);
    }
Non-Leaf Procedure Example
• C code:
    int fact(int n)
    {
        if (n < 1)
            return 1;
        else
            return n * fact(n - 1);
    }
• Argument n in $a0
• Result in $v0
Non-Leaf Procedure Example
• MIPS code:
    fact:
        addi $sp, $sp, -8      # adjust stack for 2 items
        sw   $ra, 4($sp)       # save return address
        sw   $a0, 0($sp)       # save argument
        slti $t0, $a0, 1       # test for n < 1
        beq  $t0, $zero, L1
        addi $v0, $zero, 1     # if so, result is 1
        addi $sp, $sp, 8       # pop 2 items from stack
        jr   $ra               # and return
    L1: addi $a0, $a0, -1      # else decrement n
        jal  fact              # recursive call
        lw   $a0, 0($sp)       # restore original n
        lw   $ra, 4($sp)       # and return address
        addi $sp, $sp, 8       # pop 2 items from stack
        mul  $v0, $a0, $v0     # multiply to get result
        jr   $ra               # and return
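To make the role of the saved argument concrete, here is a minimal C sketch that replaces the recursion with an explicit stack. It models only the saved copies of n (the `sw $a0` / `lw $a0` pair); the return address is not modeled. The name `fact_explicit_stack` and the depth limit of 64 are assumptions made for this illustration.

```c
/* Sketch: factorial with an explicit stack of saved arguments.
 * Each push corresponds to one recursive call saving its n;
 * each pop corresponds to one return restoring n and multiplying.
 * MAX_DEPTH is an arbitrary limit chosen for this sketch. */
#define MAX_DEPTH 64

int fact_explicit_stack(int n)
{
    int saved_n[MAX_DEPTH];
    int sp = 0;                      /* number of saved items */

    /* Descent: save n, then "call" with n - 1 until n < 1. */
    while (n >= 1 && sp < MAX_DEPTH) {
        saved_n[sp++] = n;
        n = n - 1;
    }

    int result = 1;                  /* base case: n < 1 returns 1 */

    /* Ascent: restore each saved n and multiply into the result. */
    while (sp > 0) {
        result = saved_n[--sp] * result;
    }
    return result;
}
```

Calling `fact_explicit_stack(5)` walks down through 5, 4, 3, 2, 1 and then multiplies on the way back up, which is the same order in which the MIPS version executes its `mul` instructions. With a 32-bit `int` the result overflows for n above 12.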
Memory Layout
• Text: program code
• Static data: global variables
  • e.g., static variables in C, constant arrays and strings
  • $gp initialized to address allowing ±offsets into this segment
• Dynamic data: heap
  • e.g., malloc in C, new in Java
• Stack: automatic storage

Character Data (§2.9 Communicating with People)
• Byte-encoded character sets
  • ASCII: 128 characters
    • 95 graphic, 33 control
  • Latin-1: 256 characters
    • ASCII, +96 more graphic characters
• Unicode: 32-bit character set
  • Used in Java, C++ wide characters, …
  • Most of the world's alphabets, plus symbols
  • UTF-8, UTF-16: variable-length encodings

Byte/Halfword Operations
• Could use bitwise operations
• MIPS byte/halfword load/store
  • String processing is a common case
    lb  rt, offset(rs)
    lh  rt, offset(rs)
  • Sign extend to 32 bits in rt
    lbu rt, offset(rs)
    lhu rt, offset(rs)
  • Zero extend to 32 bits in rt
    sb  rt, offset(rs)
    sh  rt, offset(rs)
  • Store just rightmost byte/halfword
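The difference between the sign-extending loads (lb, lh) and the zero-extending loads (lbu, lhu) can be sketched in C. The helper names below are hypothetical, chosen for this illustration; they operate on a value already fetched from memory, so byte order is not modeled.

```c
#include <stdint.h>

/* What lb does to the loaded byte: copy bit 7 into bits 31..8. */
int32_t sign_extend8(uint8_t b)
{
    return ((int32_t)b ^ 0x80) - 0x80;
}

/* What lbu does: upper 24 bits become zero. */
uint32_t zero_extend8(uint8_t b)
{
    return (uint32_t)b;
}

/* What lh does to the loaded halfword: copy bit 15 into bits 31..16. */
int32_t sign_extend16(uint16_t h)
{
    return ((int32_t)h ^ 0x8000) - 0x8000;
}

/* What lhu does: upper 16 bits become zero. */
uint32_t zero_extend16(uint16_t h)
{
    return (uint32_t)h;
}
```

For the byte 0xFF, `sign_extend8` yields -1 while `zero_extend8` yields 255, which is why character data is usually loaded with lbu.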
String Copy Example
• C code (array-based version)
  • Null-terminated string
    void strcpy(char x[], char y[])
    {
        int i = 0;
        while ((x[i] = y[i]) != '\0')
            i++;
    }
• Addresses of x, y in $a0, $a1
• i in $s0

String Copy Example
• MIPS code:
    strcpy:
        addi $sp, $sp, -4      # adjust stack for 1 item
        sw   $s0, 0($sp)       # save $s0
        add  $s0, $zero, $zero # i = 0
    L1: add  $t1, $s0, $a1     # addr of y[i] in $t1
        lbu  $t2, 0($t1)       # $t2 = y[i]
        add  $t3, $s0, $a0     # addr of x[i] in $t3
        sb   $t2, 0($t3)       # x[i] = y[i]
        beq  $t2, $zero, L2    # exit loop if y[i] == 0
        addi $s0, $s0, 1       # i = i + 1
        j    L1                # next iteration of loop
    L2: lw   $s0, 0($sp)       # restore saved $s0
        addi $sp, $sp, 4       # pop 1 item from stack
        jr   $ra               # and return

String Copy Example
• C code, pointer-based version
    void strcpy(char *x, char *y)
    {
        while ((*x++ = *y++) != '\0') {
        }
    }
• A good optimizing compiler may generate the same, efficient code for both versions (see next)

Strcpy: Optimized Version
    strcpy:
        # reg: x in $a0, y in $a1, *y in $t0
    Loop:
        lbu  $t0, 0($a1)       # load *y
        sb   $t0, 0($a0)       # store to *x
        addi $a0, $a0, 1       # x++
        addi $a1, $a1, 1       # y++
        bne  $t0, $zero, Loop  # *y != 0?
        jr   $ra               # return
• 5 vs. 7 instructions in the loop
• 6 vs. 13 instructions in the function

Arrays vs. Pointers (§2.14 Arrays versus Pointers)
• Array indexing involves
  • Multiplying index by element size
  • Adding to array base address
• Pointers correspond directly to memory addresses
  • Can avoid indexing complexity
Another Example
• Clear an array, array access
    void clear1(int array[], int size)
    {
        int i;
        for (i = 0; i < size; i++) {
            array[i] = 0;
        }
    }

Array Access MIPS Code
    # array in $a0, size in $a1
    clear1:
        move $t0, $zero        # i = 0
    loop1:
        sll  $t1, $t0, 2       # $t1 = i * 4
        add  $t2, $a0, $t1     # $t2 = &array[i]
        sw   $zero, 0($t2)     # array[i] = 0
        addi $t0, $t0, 1       # i = i + 1
        slt  $t3, $t0, $a1     # $t3 = (i < size)
        bne  $t3, $zero, loop1 # if true, repeat
        jr   $ra               # return
Pointer Access
• Clear an array, pointer access
    void clear2(int *array, int size)
    {
        int *p;
        for (p = array; p < array + size; p++) {
            *p = 0;
        }
    }

Pointer Access MIPS Code
    clear2:
        move $t0, $a0          # p = array
        sll  $t1, $a1, 2       # $t1 = size * 4
        add  $t2, $a0, $t1     # $t2 = &array[size]
        j    loop2_cond
    loop2:
        sw   $zero, 0($t0)     # *p = 0
        addi $t0, $t0, 4       # p++
    loop2_cond:
        slt  $t3, $t0, $t2     # p < &array[size]?
        bne  $t3, $zero, loop2 # if true, repeat
        jr   $ra               # return
Comparison of Array vs. Ptr
• Multiply "strength reduced" to shift
• Array version requires shift to be inside loop
  • Part of index calculation for incremented i
  • cf. incrementing pointer
• Compiler can achieve same effect as manual use of pointers
  • Induction variable elimination
  • Better to make program clearer and safer
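As a rough illustration of what the array version computes on every iteration, the C sketch below spells out the address arithmetic hidden inside `array[i]`: scale the index by the element size, then add the base address. The function names are hypothetical and exist only for this example.

```c
#include <stddef.h>

/* Address of element i, computed the way the array-indexed loop does it:
 * byte offset = i * sizeof(int), then base + offset. For a 4-byte int
 * this is the "sll by 2, then add" pair in the MIPS code. */
int *element_addr_by_index(int *base, size_t i)
{
    return (int *)((char *)base + i * sizeof(int));
}

/* Address of the next element, computed the way the pointer loop does it:
 * a single add of the element size, with no multiply or shift. */
int *next_element(int *p)
{
    return (int *)((char *)p + sizeof(int));
}
```

For any valid index, `element_addr_by_index(a, i)` equals `&a[i]`, and `next_element(&a[i])` equals `&a[i + 1]`. Induction variable elimination is the compiler rewriting the first form into the second.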
For-Loop Example
• Calculate the sum of array
    int array_sum(int X[], int size)
    {
        int sum = 0;
        for (int i = 0; i < size; i++)
            sum += X[i];
        return sum;
    }
FOR Loop
[Figure: control and data flow graph of a for loop, shown beside its linear code layout. Linear layout, top to bottom: (optional prologue), init-expr, jump to test, for-body, incr-expr, test cond with branch to for-body if true, (optional epilogue).]
For-Loop MIPS Code
    # X in $a0, size in $a1, return in $v0
    array_sum:
        add  $v0, $zero, $zero    # sum = 0
        add  $t0, $zero, $zero    # i = 0
        j    for_cond
    for_loop:
        sll  $t1, $t0, 2          # $t1 = i*4
        add  $t1, $a0, $t1        # $t1 = &X[i]
        lw   $t1, 0($t1)          # $t1 = X[i]
        add  $v0, $v0, $t1        # sum += X[i]
        addi $t0, $t0, 1          # i++
    for_cond:
        slt  $t1, $t0, $a1        # i < size?
        bne  $t1, $zero, for_loop # if true, repeat
        jr   $ra
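The linear layout used by this MIPS code (init, jump to the test, body, increment, test at the bottom) can be mirrored in C with goto. This is only a sketch of the layout; the name `array_sum_goto` is made up for the illustration, and the labels are chosen to match the assembly.

```c
/* Sum an array using the same block order as the MIPS for-loop code. */
int array_sum_goto(const int X[], int size)
{
    int sum = 0;
    int i = 0;              /* init-expr */
    goto for_cond;          /* jump to the test before the first iteration */

for_loop:
    sum += X[i];            /* for-body */
    i++;                    /* incr-expr */

for_cond:
    if (i < size)           /* test cond */
        goto for_loop;      /* branch if true */

    return sum;
}
```

Placing the test at the bottom means each iteration executes one conditional branch and no unconditional jump; the single `goto for_cond` before the loop keeps the size == 0 case correct.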
For-Loop: Pointer Version
• Calculate the sum of array
    int array_sum(int X[], int size)
    {
        int *p, sum = 0;
        for (p = X; p < &X[size]; p++)
            sum += *p;
        return sum;
    }
Again, do not write the pointer version for performance; a good compiler will take care of it.

Optimized MIPS Code
    # X in $a0, size in $a1, return in $v0
    array_sum:
        add  $v0, $zero, $zero    # sum = 0
        add  $t0, $a0, $zero      # p = X
        sll  $a1, $a1, 2          # $a1 = 4*size
        add  $a1, $a0, $a1        # $a1 = &X[size]
        j    for_cond
    for_loop:
        lw   $t1, 0($t0)          # $t1 = *p
        add  $v0, $v0, $t1        # sum += *p
        addi $t0, $t0, 4          # p++
    for_cond:
        slt  $t1, $t0, $a1        # p < &X[size]?
        bne  $t1, $zero, for_loop # if true, repeat
        jr   $ra
32-bit Constants (§2.10 MIPS Addressing for 32-Bit Immediates and Addresses)
• Most constants are small
  • 16-bit immediate is sufficient
• For the occasional 32-bit constant
    lui rt, constant
  • Copies 16-bit constant to left 16 bits of rt
  • Clears right 16 bits of rt to 0
• Example: loading 4,000,000
    lui $s0, 61           0000 0000 0011 1101 0000 0000 0000 0000
    ori $s0, $s0, 2304    0000 0000 0011 1101 0000 1001 0000 0000
32-bit Constants
• Translate C to MIPS
    f = 0x10203040;
    # Assume f in $s0
    lui $s0, 0x1020
    ori $s0, $s0, 0x3040

32-bit Constants
• Load a big value in MIPS
    int *p = array;
    # assume p in $s0
    la  $s0, array
• MIPS assembly supports the pseudo-instruction "la", equivalent to
    lui $s0, upper_of_array
    ori $s0, $s0, lower_of_array
• The assembler decides the values of upper_of_array and lower_of_array
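The lui/ori pair can be modeled in a few lines of C. The sketch below uses hypothetical helper names (`model_lui`, `model_ori`, `load_constant32`); the last one splits a 32-bit value into upper and lower halves, which is the same kind of split the assembler performs when it expands la.

```c
#include <stdint.h>

/* lui: place the 16-bit immediate in the upper half, clear the lower half. */
uint32_t model_lui(uint16_t imm)
{
    return (uint32_t)imm << 16;
}

/* ori: OR the zero-extended 16-bit immediate into the register value. */
uint32_t model_ori(uint32_t rs, uint16_t imm)
{
    return rs | (uint32_t)imm;
}

/* Build any 32-bit value from its two 16-bit halves. */
uint32_t load_constant32(uint32_t value)
{
    uint16_t upper = (uint16_t)(value >> 16);
    uint16_t lower = (uint16_t)(value & 0xFFFFu);
    return model_ori(model_lui(upper), lower);
}
```

For example, `model_ori(model_lui(61), 2304)` gives 4000000, and `model_ori(model_lui(0x1020), 0x3040)` gives 0x10203040, matching the two slide examples.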
Shift Instructions
    sll  rd, rt, shamt     # shift left logical
    Ex: sll  $s0, $s0, 4   # shift left by 4 bits
    sllv rd, rt, rs        # shift left logical, variable
    Ex: sllv $s0, $s0, $t0 # shift left by $t0 bits
• Encoding (R-format)
    Field   op      rs      rt      rd      shamt   funct
    Bits    6       5       5       5       5       6
    sll     0       --      rt      rd      shamt   0
    sllv    0       rs      rt      rd      0       4
Source: textbook pp. B-55, B-56

Shift Instructions
• Other shift instructions
    srl  rd, rt, shamt     # shift right logical
    srlv rd, rt, rs        # SRL variable
    sra  rd, rt, shamt     # shift right arithmetic
    srav rd, rt, rs        # SRA variable
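The three kinds of shift can be sketched in C as follows. The function names are hypothetical. Because right-shifting a negative signed value is implementation-defined in C, the arithmetic shift is built from unsigned operations. The shift amount is masked to 5 bits, matching both the 5-bit shamt field and the variable forms, which use only the low 5 bits of rs.

```c
#include <stdint.h>

/* sll / sllv: shift left, filling with zeros on the right. */
uint32_t model_sll(uint32_t rt, unsigned shamt)
{
    return rt << (shamt & 31u);
}

/* srl / srlv: shift right, filling with zeros on the left. */
uint32_t model_srl(uint32_t rt, unsigned shamt)
{
    return rt >> (shamt & 31u);
}

/* sra / srav: shift right, filling with copies of the sign bit. */
uint32_t model_sra(uint32_t rt, unsigned shamt)
{
    shamt &= 31u;
    uint32_t shifted = rt >> shamt;
    if ((rt & 0x80000000u) != 0 && shamt > 0) {
        /* Set the top shamt bits that the logical shift left as zero. */
        shifted |= (uint32_t)~(0xFFFFFFFFu >> shamt);
    }
    return shifted;
}
```

With rt = 0xFFFFFFF0 (that is, -16) and a shift of 2, `model_srl` returns 0x3FFFFFFC while `model_sra` returns 0xFFFFFFFC (that is, -4), so only the arithmetic shift behaves like signed division by 4 here.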
Branch Addressing
• Branch instructions specify
  • Opcode, two registers, target address
    op (6 bits) | rs (5 bits) | rt (5 bits) | constant or address (16 bits)
• Most branch targets are near branch
  • Forward or backward
• PC-relative addressing
  • Target address = PC + offset × 4
  • PC already incremented by 4 by this time
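The PC-relative rule can be written out as a small C function. This is a sketch with a hypothetical name; it takes the address of the branch instruction itself, adds 4 to model the already-incremented PC, and sign-extends the 16-bit offset field so that backward branches work.

```c
#include <stdint.h>

/* Target of a conditional branch located at branch_pc whose 16-bit
 * offset field holds offset_field (a signed count of instructions). */
uint32_t branch_target(uint32_t branch_pc, uint16_t offset_field)
{
    /* Sign-extend the 16-bit field to 32 bits. */
    int32_t offset = ((int32_t)offset_field ^ 0x8000) - 0x8000;

    /* PC has already advanced to the next instruction (branch_pc + 4);
     * the offset counts 4-byte words from there. */
    return branch_pc + 4u + (uint32_t)(offset * 4);
}
```

As an illustration with made-up addresses, a branch at address 80012 with offset field 2 targets 80012 + 4 + 8 = 80024, and an offset field of 0xFFFF (that is, -1) targets the branch instruction itself.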
Jump Addressing
• Jump (j and jal) targets could be anywhere in text segment
  • Encode full address in instruction
    op (6 bits) | address (26 bits)
• (Pseudo)Direct jump addressing
  • Target address = PC[31…28] : (address × 4)
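The pseudo-direct rule concatenates the upper 4 bits of the PC with the 26-bit address field shifted left by 2. A minimal C sketch follows; the function name is hypothetical, and it assumes the upper bits are taken from the already-incremented PC, consistent with the branch case.

```c
#include <stdint.h>

/* Target of a j or jal instruction located at jump_pc whose 26-bit
 * address field holds address_field. */
uint32_t jump_target(uint32_t jump_pc, uint32_t address_field)
{
    uint32_t upper4 = (jump_pc + 4u) & 0xF0000000u;        /* PC[31..28] */
    uint32_t low28  = (address_field & 0x03FFFFFFu) << 2;  /* address x 4 */
    return upper4 | low28;
}
```

For instance, an address field of 20000 in a jump located in the lowest 256 MB region yields the byte address 80000. A jump cannot leave the 256 MB region selected by the upper 4 bits of the PC, which is why jr is needed for arbitrary targets.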
Target Addressing Example
• Loop code from earlier example
  • Assume Loop at location 80000
Branching Far Away
• If branch target is too far to encode with 16-bit offset, assembler rewrites the code
• Example
        beq $s0, $s1, L1
            ↓
        bne $s0, $s1, L2
        j   L1
    L2: …
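Whether a branch needs this rewrite depends on whether the word offset fits in a signed 16-bit field. The check below is a sketch of that decision only, with a hypothetical function name; it is not how any particular assembler is implemented, and it assumes both addresses are word-aligned.

```c
#include <stdint.h>

/* Returns 1 if a branch at branch_pc can reach target directly,
 * 0 if the offset does not fit and the branch must be rewritten
 * as an inverted branch around a jump. */
int branch_in_range(uint32_t branch_pc, uint32_t target)
{
    /* Offset is measured in words from the instruction after the branch. */
    int64_t word_offset = ((int64_t)target - ((int64_t)branch_pc + 4)) / 4;
    return word_offset >= -32768 && word_offset <= 32767;
}
```

The reachable window is therefore 32767 instructions forward and 32768 instructions backward from the instruction following the branch.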
Addressing Mode Summary