370 likes | 679 Views
CDA 3101 Fall 2013 Introduction to Computer Organization. Pointers & Arrays MIPS Programs 16 September 2013. Overview. Pointers (addresses) and values Argument passing Storage lifetime and scope Pointer arithmetic Pointers and arrays
E N D
CDA 3101 Fall 2013Introduction to Computer Organization Pointers & Arrays MIPS Programs 16 September 2013
Overview • Pointers (addresses) and values • Argument passing • Storage lifetime and scope • Pointer arithmetic • Pointers and arrays • Pointers in MIPS
Review: Pointers • Pointer: a variable that contains the address of another variable • HLL version of machine language memory address • Why use Pointers? • Sometimes only way to express computation • Often more compact and efficient code • Why not? • Huge source of bugs in real software, perhaps the largest single source • 1) Dangling reference (premature free) • 2) Memory leaks (tardy free): can't have long-running jobs without periodic restart of them
Review: C Pointer Operators • Suppose c has value 100, it is located in memory at address 0x10000000 • Unary operator & gives address: p = &c; gives address of c to p; • p “points to” c (p == 0x10000000) (Referencing) • Unary operator * gives value that pointer points to • if p = &c =>*p == 100(Dereferencing a pointer) • Deferencing data transfer in assembler • ... = ... *p ...; load (get value from location pointed to by p) • *p = ...; store (put value into location pointed to by p)
Review: Pointer Arithmetic 3 2 int x = 1, y = 2; /* x and y are integer variables */ int z[10]; /* an array of 10 ints, z points to start */ int *p; /* p is a pointer to an int */ x = 21; /* assigns x the new value 21 */ z[0] = 2; z[1] = 3 /* assigns 2 to the first, 3 to the next */ p = &z[0]; /* p refers to the first element of z */ p = z; /* same thing; p[ i ] == z[ i ]*/ p = p+1; /* now it points to the next element, z[1] */ p++; /* now it points to the one after that, z[2] */ *p = 4; /* assigns 4 to there, z[2] == 4*/ p = 3; /* bad idea! Absolute address!!! */ p = &x; /* p points to x, *p == 21 */ z = &y illegal!!!!! array name is not a variable p: 4 z[1] z[0] y: 2 x: 2 1
Review: Assembly Code • c is int, has value 100, in memory at address 0x10000000, p in $a0, x in $s0 • p = &c; /* p gets 0x10000000*/ • lui $a0,0x1000 # p = 0x10000000 • x = *p; /* x gets 100 */ • lw $s0, 0($a0) # dereferencing p • *p = 200; /* c gets 200 */ addi $t0,$0,200 sw $t0, 0($a0) # dereferencing p
Argument Passing Options • 2 choices • “Call by Value”: pass a copy of the item to the function/procedure • “Call by Reference”: pass a pointer to the item to the function/procedure • Single word variables passed by value • Passing an array? e.g., a[100] • Pascal (call by value) copies 100 words of a[] onto the stack • C (call by reference) passes a pointer (1 word) to the array a[] in a register
Lifetime of Storage and Scope Code Static Stack • Automatic (stack allocated) • Typical local variables of a function • Created upon call, released upon return • Scope is the function • Heap allocated • Created upon malloc, released upon free • Referenced via pointers • External / static • Exist for entire program Heap
Arrays, Pointers, and Functions • 4 versions of array function that adds two arrays and puts sum in a third array (sumarray) • Third array is passed to function • Using a local array (on stack) for result and passing a pointer to it • Third array is allocated on heap • Third array is declared static • Purpose of example is to show interaction of C statements, pointers, and memory allocation
Version 1 • int x[100], y[100], z[100]; • sumarray(x, y, z); • C calling convention means: • sumarray(&x[0], &y[0], &z[0]); • Really passing pointers to arrays • addi $a0,$gp,0 # x[0] starts at $gp • addi $a1,$gp,400 # y[0] above x[100] • addi $a2,$gp,800 # z[0] above y[100] • jal sumarray
Version 1: Compiled Code void sumarray(int a[], int b[], int c[]) {int i; for(i = 0; i < 100; i = i + 1) c[i] = a[i] + b[i]; } addi $t0,$a0,400 # beyond end of a[]Loop: beq $a0,$t0,Exit lw $t1, 0($a0) # $t1=a[i] lw $t2, 0($a1) # $t2=b[i] add $t1,$t1,$t2 # $t1=a[i] + b[i] sw $t1, 0($a2) # c[i]=a[i] + b[i] addi $a0,$a0,4 # $a0++ addi $a1,$a1,4 # $a1++ addi $a2,$a2,4 # $a2++ j LoopExit: jr $ra
Version 2 int *sumarray(int a[],int b[]) { int i, c[100]; for(i=0;i<100;i=i+1) c[i] = a[i] + b[i]; return c;} addi $t0,$a0,400 # beyond end of a[] addi $sp,$sp,-400 # space for c addi $t3,$sp,0 # ptr for c addi $v0,$t3,0 # $v0 = &c[0]Loop: beq $a0,$t0,Exit lw $t1, 0($a0) # $t1=a[i] lw $t2, 0($a1) # $t2=b[i] add $t1,$t1,$t2 # $t1=a[i] + b[i] sw $t1, 0($t3) # c[i]=a[i] + b[i] addi $a0,$a0,4 # $a0++ addi $a1,$a1,4 # $a1++ addi $t3,$t3,4 # $t3++ j LoopExit: addi $sp,$sp, 400 # pop stack jr $ra $sp c[100] a[100] B[100]
Version 3 Code Static Stack int * sumarray(int a[],int b[]) { int i; int *c; c = (int *) malloc(100); for(i=0;i<100;i=i+1) c[i] = a[i] + b[i]; return c;} c[100] Heap • Not reused unless freed • Can lead to memory leaks • Java, Scheme have garbagecollectors to reclaim free space
Version 3: Compiled Code addi $t0,$a0,400 # beyond end of a[] addi $sp,$sp,-12 # space for regs sw $ra, 0($sp) # save $ra sw $a0, 4($sp) # save 1st arg. sw $a1, 8($sp) # save 2nd arg. addi $a0,$zero,400 jal malloc addi $t3,$v0,0 # ptr for c lw $a0, 4($sp) # restore 1st arg. lw $a1, 8($sp) # restore 2nd arg.Loop: beq $a0,$t0,Exit ... (loop as before on prior slide ) j LoopExit:lw $ra, 0($sp) # restore $ra addi $sp, $sp, 12 # pop stack jr $ra
Code Stack Version 4 int * sumarray(int a[],int b[]) { int i; static int c[100]; for(i=0;i<100;i=i+1) c[i] = a[i] + b[i]; return c;} Static c[100] Heap • Compiler allocates once forfunction, space is reused • Will be changed next time sumarray invoked • Why describe? used in C libraries
New Topic - MIPS Programs • Data types and addressing included in the ISA • Compromise between application requirements and hardware implementation • MIPS data types • 32-bit word • 16-bit half word • 8-bit bytes • Addressing Modes • Data • Register • 16-bit signed constants • Base addressing • Instructions • PC-relative • (Pseudo) direct
Compiler (cc) Assembly program: foo.s Assembler (as) Object(mach lang module): foo.o lib.o Linker (ld) Executable(mach lang pgm): a.out Loader Memory Overview of Program Development C program: foo.c
Assembler • Reads and use directives • Replace pseudoinstructions • subu $sp,$sp,32 addiu $sp, $sp, -32 • sd $a0, 32($sp) sw $a0, 32($sp) sw $a1, 36($sp) • mul $t7,$t6,$t5 mult $t6,$t5 mflo $t7 • la $a0, 0xAABBCCDD lui $at, 0xAABB ori $a0, $at, 0xCCDD • Produce machine language • Create object file (*.o)
Assembler Directives • Directions to assembler that don’t produce machine instructions .align n Align the next datum on a 2n byte boundary .text Subsequent items put in user text segment .data Subsequent items put in user data segment .globlsymsym can be referenced from other files .asciiz str Store the string str in memory .word w1…wn Store the n 32-bit quantities in successive memory words .byte b1..bn Store n 8-bit values in successive bytes of memory .float f1..fn: Store n floating-point numbers in successive memory words
j/jal xxxxx address lw/sw $gp $x address beq/bne $rs $rt Absolute Addresses • Which instructions need relocation editing? • Loads / stores to variables in static area • Conditional branches • PC-relative addressing preserved even if code moves • Jump instructions • Direct (absolute) references to data (e.g. la)
Producing Machine Language • Simple case • Arithmetic, logical, shifts, etc. • All necessary info is within the instruction already. • Conditional branches (beq, bne) • Once pseudoinstructions are replaced by real ones, we know by how many instructions for branch span • PC-relative, easy to handle • Direct (absolute) addresses • Jumps (j and jal) • Direct (absolute) references to data • These can’t be determined yet, so we create two tables
Assembler Tables • Symbol table • List of “items” in this file that may be used by other files. • Labels: function calling • Data: anything in the .data section; variables which may be accessed across files • First Pass: record label-address pairs • Second Pass: produce machine code • Can jump to a later label without first declaring it • Relocation Table • List of “items” for which this file needs the address. • Any label jumped to: j or jal • internal • external (including lib files) • Any piece of data (e.g. la instruction)
Object File Format • Object file header: size and position of the other pieces of the object file • Text segment: the machine code • Data segment: binary representation of the data in the source file • Relocation information: identifies lines of code that need to be “handled” • Symbol table: list of this file’s labels and data that can be referenced • Debugging information
Linker (Link Editor) • Combines object (.o) files into an executable file • Enables separate (independent) compilation of files • Only recompile modified file (module) • Windows NT source is >30 M lines of code! And Growing! • Edits the “links” in jump and linkinstructions • Process (input: object files produced by assembler) • Step 1: put together the text segments from each .o file • Step 2: put together the data segments from each .o file and concatenate this onto end of text segments • Step 3: resolve references. Go through Relocation Table and handle each entry (fill in all absolute addresses)
Resolving References • Four types of references (addresses) • PC-relative (e.g. beq, bne): never relocate • Absolute address (j, jal): always relocate • External reference (jal): always relocate • Data reference (lui and ori): always relocate • Linker assumes first word of first text segment is at address 0x00000000. • Linker knows: • Length of each text and data segment • Ordering of text and data segments • Linker calculates: • Absolute address of each label to be jumped to (internal or external) and each piece of data being referenced
Loader • Executable file is stored on disk. • Loader’s job: load it into memory and start it running. • In reality, loader is part of the operating system (OS) • Reads header to determine size of text and data segments • Creates new address space for program large enough to hold text and data segments, along with a stack segment • Copies instructions and data from executable file memory • Copies arguments passed to the program onto the stack • Initializes machine registers, $sp = 1st free stack location • Jumps to start-up routine that copies program’s arguments from stack to registers and sets the PC • If main routine returns, start-up routine terminates program with the exit system call
Example: C => Asm => Obj => Exe => Run #include <stdio.h> int main (int argc, char *argv[]) { int i; int sum = 0; for (i = 0; i <= 100; i = i + 1) sum = sum + i * i; printf ("The sum from 0 .. 100 is %d\n", sum); }
.text .align 2 .globl main main: subu $sp,$sp,32 sw $ra, 20($sp) sd $a0, 32($sp) sw $0, 24($sp) sw $0, 28($sp) loop: lw $t6, 28($sp) mul $t7, $t6,$t6 lw $t8, 24($sp) addu $t9, $t8,$t7 sw $t9, 24($sp) Example: C => Asm => Obj => Exe => Run addu $t0, $t6, 1 sw $t0, 28($sp) ble $t0,100, loop la $a0, str lw $a1, 24($sp) jal printf move $v0, $0 lw $ra, 20($sp) addiu $sp,$sp,32 jr $ra .data .align 0 str: .asciiz "The sum from 0 .. 100 is %d\n"
00addiu $29,$29,-32 04 sw $31,20($29) 08 sw $4, 32($29) 0c sw $5, 36($29) 10 sw $0, 24($29) 14 sw $0, 28($29) 18 lw $14,28($29) 1c mult $14,$14 20 mflo $15 24 lw $24,24($29) 28 addu $25,$24,$15 2c sw $25,24($29) Example: C => Asm => Obj => Exe => Run Replace pseudoinstructions; assign addresses (start at 0x00) 30 addiu $8,$14, 1 34 sw $8,28($29) 38 slti $1,$8, 101 3c bne $1,$0, loop 40lui $4, hi.str 44ori $4,$4,lo.str 48 lw $5,24($29) 4c jal printf 50 add $2, $0, $0 54 lw $31,20($29) 58 addiu $29,$29,32 5c jr $31 60 The
Symbol and Relocation Tables • Symbol Table • Label Address main: 0x00000000 loop: 0x00000018 str: 0x10000430 printf: 0x004003b0 • Relocation Information • Address Instr. Type Dependency • 0x00000040 HI16 str • 0x00000044 LO16 str • 0x0000004c jal printf
00 addiu $29,$29,-32 04 sw $31,20($29) 08 sw $4,32($29) 0c sw $5,36($29) 10 sw $0, 24($29) 14 sw $0, 28($29) 18 lw $14, 28($29) 1c multu $14, $14 20 mflo $15 24 lw $24, 24($29) 28 addu $25,$24,$15 2c sw $25, 24($29) Example: C => Asm => Obj => Exe => Run Edit addresses 30 addiu $8,$14, 1 34 sw $8,28($29) 38 slti $1,$8, 101 3c bne $1,$0, -9 40 lui $4, 4096 44 ori $4,$4,1072 48 lw $5,24($29) 4c jal 1048812 50 add $2, $0, $0 54 lw $31,20($29) 58 addiu $29,$29,32 5c jr $31
Example: C => Asm => Obj => Exe => Run 0x00400000 00100111101111011111111111100000 0x00400004 10101111101111110000000000010100 0x00400008 10101111101001000000000000100000 0x0040000c 10101111101001010000000000100100 0x00400010 10101111101000000000000000011000 0x00400014 10101111101000000000000000011100 0x00400018 10001111101011100000000000011100 0x0040001c 10001111101110000000000000011000 0x00400020 00000001110011100000000000011001 0x00400024 00100101110010000000000000000001 0x00400028 00101001000000010000000001100101 0x0040002c 10101111101010000000000000011100 0x00400030 00000000000000000111100000010010 0x00400034 00000011000011111100100000100001 0x00400038 00010100001000001111111111110111 0x0040003c 10101111101110010000000000011000 0x00400040 00111100000001000001000000000000 0x00400044 10001111101001010000000000011000 0x00400048 00001100000100000000000011101100 0x0040004c 00100100100001000000010000110000 0x00400050 10001111101111110000000000010100 0x00400054 00100111101111010000000000100000 0x00400058 00000011111000000000000000001000 0x0040005c 00000000000000000001000000100001
MIPS Program - Summary • Compiler converts a single HLL file into a single assembly language file. • Assembler removes pseudos, converts what it can to machine language, and creates a checklist for the linker (relocation table). This changes each .s file into a .o file. • Linker combines several .o files and resolves absolute addresses. • Loader loads executable into memory and begins execution.