Computer Architecture

Lab 2.1
Prof. Jerry Breecher
CSCI 240
Fall 2003

What you will do in this lab.
The purpose of this lab is to help you become familiar with the compiler on your test machine and to better understand the code that’s produced by that compiler. You have three tasks before you:
1. Write C code and run it through the compiler. Determine all the addressing modes that seem to be possible with your compiler and with your architecture. In particular, for your architecture:
   • How many address modes are there? Can you document them?
2. Determine the CPI of a piece of code.
3. Write a piece of Intel Assembler code as demonstrated in the example in this lab.

What you will do in this lab.
What is a verbal lab? You prepare, document, and tie up all the pieces of your lab just as if you were handing it in. Instead of handing it in, you and your teammate talk over the results with me. The discussion will be professional, the way I would talk about a problem with a junior colleague.
WARNING: I am going to ask you all kinds of questions. Be prepared. Do NOT expect that you can remember everything you did. Write it down; having a lab notebook is a good thing.

Where To Get Documentation
There are many sources of information to help you with this lab. Here are some of those sources:
Intel Architecture, at Intel (up to date, but can take a long time to download):
  • Volume 1: Basic Architecture
    http://developer.intel.com/design/pentium4/manuals/245470.htm
  • Volume 2: Instruction Set Reference
    http://developer.intel.com/design/pentium4/manuals/245471.htm
  • Volume 3: System Programming Guide
    http://developer.intel.com/design/pentium4/manuals/245472.htm
  • Optimization
    http://developer.intel.com/design/pentium4/manuals/248966.htm
Intel Architecture, local (may not be completely up to date, but fast):
  • http://babbage.clarku.edu/~jbreecher/docs/Intel_Arch_Software_Dev_Manual1.pdf
  • http://babbage.clarku.edu/~jbreecher/docs/Intel_Arch_Software_Dev_Manual2.pdf
  • http://babbage.clarku.edu/~jbreecher/docs/Intel_Arch_Software_Dev_Manual3.pdf
  • http://babbage.clarku.edu/~jbreecher/docs/Intel_Arch_Optimization.pdf
Intel Architecture, not as intense as those above:
  • http://babbage.clarku.edu/~jbreecher/docs/IntelInstructionsSimpleVersion.htm

Where To Get Documentation
GNU Debugger:
  • Remote copy: http://www.gnu.org/manual/gdb-4.17/html_mono/gdb.html
  • Local copy: http://babbage.clarku.edu/~jbreecher/docs/Debugging_With_GDB.html
  • A get-started tutorial by Aaron Brown: http://babbage.clarku.edu/~jbreecher/docs/GDB_Tutorial.html
GCC compiler:
  • Remote definition: http://gcc.gnu.org/onlinedocs/gcc-3.0.1/gcc.html
A fairly simple tutorial on using C:
  • http://babbage.clarku.edu/~jbreecher/docs/C_Tutorial.html
The source code for the examples used here:
  • http://babbage.clarku.edu/~jbreecher/arch/labs/lab2.1.c
  • http://babbage.clarku.edu/~jbreecher/arch/labs/lab2.1b.c

Task 1: Determine all the addressing modes that seem to be possible with your compiler and with your architecture.
On the next few pages is code that will produce strange effects (alignment faults) on a RISC machine. An Intel processor is a CISC machine that handles unaligned accesses in hardware, so by default it doesn’t produce alignment faults. It’s still a nice piece of code to demonstrate C: it shows how to handle input arguments, and it shows how to do a loop.
I used my favorite editor to type in the code on the next few pages. I then ran:
    gcc -O3 -S proj2.1.c           ### This produces an assembly file proj2.1.s
    gcc -O3 proj2.1.c -o proj21    ### This produces an executable named proj21
At this point, there are TWO methods for observing the assembly code produced by the compiler:
Method 1: Observation of the executable code using GDB.
Method 2: Observation of the assembly file produced by the compiler.
Both of these are explained on the following pages.

Code To Determine Address Modes
    /*********************************************************************
      This program is designed to show the time required to handle
      an alignment fault.
    *********************************************************************/
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    long global[32];             /* A total of 128 bytes */

    int main( int argc, char *argv[] )
    {
        long    offset;          /* This is the number of bytes to be added
                                    to an aligned location */
        long    iterations;      /* How many times we want to do the operation. */
        long    *alignment;      /* This is the address where we will try to
                                    read the data. */
        long    index;
        long    long_temp = 0;
        time_t  start_time, end_time;

        if (argc < 3 )
        {
            printf( "Usage alignment <byte offset> <iterations>\n");
            exit(0);
        }

Code To Determine Address Modes
        offset     = atol( argv[1] );
        iterations = atol( argv[2] );
        /* long values need %ld; pointers need %p */
        printf( "Inputs: Offset: %ld Iterations: %ld\n", offset, iterations );
        alignment = (long *)((long)(&global) + offset);
        printf( "Global addr: %p Offset Address: %p\n",
                (void *)&global, (void *)alignment );
        *alignment = 12345678;

        /* DO ITERATIONS FOR LONG VALUE */
        time ( &start_time );
        for ( index = 0; index < iterations; index++ )
            long_temp = (*alignment + long_temp) % 47;
        time( &end_time );
        printf( "Time for %ld iterations of longs is %ld seconds.\n",
                iterations, (long)(end_time - start_time) );
        return 0;
    }   /* End of main */
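To see which byte offsets actually make the pointer misaligned, it can help to test the address directly. The sketch below is not part of the lab code: the names global_demo, is_aligned_for_long, and offset_address are hypothetical, and it assumes that converting a pointer to uintptr_t yields its byte address, which holds on the usual flat-memory machines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

long global_demo[32];   /* hypothetical stand-in for the lab's global[] array */

/* Returns 1 if addr is a multiple of sizeof(long), 0 otherwise. */
int is_aligned_for_long( const void *addr )
{
    assert( addr != NULL );
    return ( (uintptr_t)addr % sizeof(long) ) == 0;
}

/* Builds the same kind of address the lab program builds:
   the start of the array plus a byte offset. */
const void *offset_address( long offset )
{
    return (const char *)global_demo + offset;
}
```

With this helper, an offset of 0 (or any multiple of sizeof(long)) should report aligned, and the offsets in between should report misaligned; those in-between offsets are the interesting ones to feed to the lab program.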

Task 1 – Method 1: Determine all the addressing modes that seem to be possible with your compiler and with your architecture.
Method 1:
    gdb proj21
    disassemble main
Doing this produced the Assembly output shown on the next few pages, which will get you started. To continue:
• You will need to examine this Assembler output to identify all the addressing modes evident. Here’s one mode as an example:
      0x804851f <main+127>: movl $0xbc614e,0x80498a0(%ebx)
• You will then modify this C code in order to generate other Assembler that will contain additional addressing modes.
• If you hunt around on the web, you will find various people who have defined the addressing modes for Intel processors.
That’s all there is to this part of the assignment.

Assembly Output – Method 1
    0x80484a0 <main>:     push  %ebp
    0x80484a1 <main+1>:   mov   %esp,%ebp
    0x80484a3 <main+3>:   sub   $0x1c,%esp
    0x80484a6 <main+6>:   push  %edi
    0x80484a7 <main+7>:   push  %esi
    0x80484a8 <main+8>:   push  %ebx
    0x80484a9 <main+9>:   mov   0xc(%ebp),%esi
    0x80484ac <main+12>:  cmpl  $0x2,0x8(%ebp)              <- if (argc < 3 )
    0x80484b0 <main+16>:  jg    0x80484d0 <main+48>
    0x80484b2 <main+18>:  add   $0xfffffff4,%esp
    0x80484b5 <main+21>:  push  $0x8048660
    0x80484ba <main+26>:  call  0x8048398 <printf>
    0x80484bf <main+31>:  add   $0xfffffff4,%esp
    0x80484c2 <main+34>:  push  $0x0
    0x80484c4 <main+36>:  call  0x80483a8 <exit>
    0x80484c9 <main+41>:  lea   0x0(%esi,1),%esi
    0x80484d0 <main+48>:  push  $0x0                        <- offset = atol( argv[1] );
    0x80484d2 <main+50>:  push  $0xa
    0x80484d4 <main+52>:  push  $0x0
    0x80484d6 <main+54>:  pushl 0x4(%esi)
    0x80484d9 <main+57>:  call  0x8048378 <__strtol_internal>
    0x80484de <main+62>:  mov   %eax,%ebx
    0x80484e0 <main+64>:  add   $0x10,%esp
    0x80484e3 <main+67>:  push  $0x0                        <- iterations = atol( argv[2] );
    0x80484e5 <main+69>:  push  $0xa
    0x80484e7 <main+71>:  push  $0x0
    0x80484e9 <main+73>:  pushl 0x8(%esi)
    0x80484ec <main+76>:  call  0x8048378 <__strtol_internal>

Assembly Output – Method 1
    0x80484f1 <main+81>:  mov   %eax,0xffffffec(%ebp)
    0x80484f4 <main+84>:  add   $0x10,%esp
    0x80484f7 <main+87>:  add   $0xfffffffc,%esp
    0x80484fa <main+90>:  push  %eax
    0x80484fb <main+91>:  push  %ebx
    0x80484fc <main+92>:  push  $0x80486a0
    0x8048501 <main+97>:  call  0x8048398 <printf>          <- printf( "Inputs: Offset: …
    0x8048506 <main+102>: lea   0x80498a0(%ebx),%esi
    0x804850c <main+108>: add   $0xfffffffc,%esp
    0x804850f <main+111>: push  %esi
    0x8048510 <main+112>: push  $0x80498a0
    0x8048515 <main+117>: push  $0x80486e0
    0x804851a <main+122>: call  0x8048398 <printf>          <- printf( "Global addr: …
    0x804851f <main+127>: movl  $0xbc614e,0x80498a0(%ebx)   <- *alignment = 12345678;
    0x8048529 <main+137>: add   $0x20,%esp
    0x804852c <main+140>: add   $0xfffffff4,%esp
    0x804852f <main+143>: lea   0xfffffffc(%ebp),%eax
    0x8048532 <main+146>: push  %eax
    0x8048533 <main+147>: call  0x8048368 <time>            <- time ( &start_time );
The code continues . . .

Task 1 – Method 2: Determine all the addressing modes that seem to be possible with your compiler and with your architecture.
Method 2:
    vi proj2.1.s
Doing this produced the Assembly output shown on the next pages.
• You will need to examine this Assembler output to identify all the addressing modes evident. Here’s one mode as an example:
      movl $12345678,global(%ebx)
• You will then modify this C code in order to generate other Assembler that will contain additional addressing modes.
• If you hunt around on the web, you will find various people who have defined the addressing modes for Intel processors.
That’s all there is to this part of the assignment.

Assembly Output – Method 2
            .file   "proj2.1.c"
            .version "01.01"
    gcc2_compiled.:
    .section .rodata
            .align 32
    .LC0:
            .string "Usage alignment <byte offset> <iterations>\n"
            .align 32
    .globl main
            .type   main,@function
    main:
            pushl %ebp
            movl  %esp,%ebp
            subl  $28,%esp
            pushl %edi
            pushl %esi
            pushl %ebx
            movl  12(%ebp),%esi
            cmpl  $2,8(%ebp)
            jg    .L33
            addl  $-12,%esp
            pushl $.LC0
            call  printf
            addl  $-12,%esp
            pushl $0
            call  exit
            .p2align 4,,7
This code matches Method 1; it’s just formatted differently.

Assembly Output – Method 2
    .L33:
            pushl $0
            pushl $10
            pushl $0
            pushl 4(%esi)
            call  __strtol_internal
            movl  %eax,%ebx
            addl  $16,%esp
            pushl $0
            pushl $10
            pushl $0
            pushl 8(%esi)
            call  __strtol_internal
            movl  %eax,-20(%ebp)
            addl  $16,%esp
            addl  $-4,%esp
            pushl %eax
            pushl %ebx
            pushl $.LC1
            call  printf
            leal  global(%ebx),%esi
            addl  $-4,%esp
            pushl %esi
            pushl $global
            pushl $.LC2
            call  printf
            movl  $12345678,global(%ebx)
            addl  $32,%esp
            addl  $-12,%esp
            leal  -4(%ebp),%eax
            pushl %eax
            call  time
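One way to approach the "modify this C code" step of Task 1 is to add small functions whose memory accesses have different shapes, then look at what the compiler emits for each. The sketch below is only a starting point under stated assumptions: every name in it (table, struct point, sample_point, and the read_* functions) is hypothetical, and the addressing mode named in each comment is a guess at what gcc may choose, not a guarantee. The actual instructions depend on the compiler version and optimization level, so check the gcc -S output yourself.

```c
#include <assert.h>
#include <stddef.h>

long table[8] = { 10, 20, 30, 40, 50, 60, 70, 80 };

struct point { long x; long y; };          /* hypothetical example type */
struct point sample_point = { 7, 9 };

/* Fixed global element: a candidate for a displacement-only address. */
long read_fixed( void )
{
    return table[3];
}

/* Plain pointer dereference: a candidate for register indirect. */
long read_pointer( const long *p )
{
    assert( p != NULL );
    return *p;
}

/* Struct field: a candidate for base register plus displacement. */
long read_field( const struct point *pt )
{
    assert( pt != NULL );
    return pt->y;
}

/* Variable index into a global: a candidate for displacement plus
   scaled index. */
long read_indexed_global( size_t i )
{
    return table[i];
}

/* Variable index through a pointer: a candidate for base register plus
   scaled index. */
long read_indexed( const long *base, size_t i )
{
    assert( base != NULL );
    return base[i];
}
```

Compiling a file like this with gcc -S, once with -O3 and once without optimization, and comparing the two outputs is a cheap way to see how much the chosen addressing modes depend on the compiler rather than on the architecture.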

Task 2: Determine the CPI of a Program.
It’s probably easiest to start with the code used in Task 1. Your goal is to take a simple piece of code, something with a loop in it, and determine the CPI. To accomplish this, you will need to do the following:
• Think: what formula are you going to use to get CPI? I’m purposely NOT giving you the formula; it’s part of what you need to figure out.
• How will you get the pieces required as input for this formula?
• Everyone will get a different answer to this task. In fact, doing this test in real life is extremely difficult because it must be done on “realistic code”, not on just a loop. But for this test, a loop will be just fine.
A very useful program is found at
    ~jbreecher/public/docs/arch_params
This program gives you all kinds of information about the machine you’re executing on. A crucial and required piece of data is the CPU speed of the machine, in megahertz; this is the very last thing printed out.

Task 2: Determine the CPI of a Program.
Please do this assignment on one of the more modern processors in the lab. We have two flavors of “modern” processor:
• AMD running at 1.5 GHz. These are interesting machines because they are dual processors, but those dual processors MIGHT get in the way. I’d recommend you get your setup running on a uniprocessor and then give it a try on the dual AMDs.
    Ullman, Zermelo, Yates, Vandermonde, Ulam
• 2.4 GHz Pentium 4s. I recommend that you concentrate your testing on these machines.
    aiken, godel, iverson, rice

Task 3: Write a Piece of Intel Assembler Code
There are at least two ways to do this.
• One way is to write a piece of Assembler code and then run it through gcc. GCC understands from a file name of the form name.s that the file is assembler, and it automatically assembles rather than compiles. If you want to go this route, fine, but I’m going to recommend a different way.
• Insert a few lines of assembler code into a larger C program. The GNU compiler has a technique that allows you to write assembler within the C program. This has a big advantage because it allows you to feed inputs to and obtain outputs from the Assembler code without worrying about printf and similar statements.
Your job for this task is to modify the assembler code as given in the following pages. You will use SOME OTHER Intel instruction, not the OR instruction. You will feed inputs to the instruction, get the result back, and print that result. The MORE COMPLEX the Intel instruction you choose, the more credit, so go wild.

Task 3: Write a Piece of Intel Assembler Code
Of special note in the code are the constructs:
    asm ("movl %0, %%eax": : "g" (input1));
which says “move the variable ‘input1’ to the register eax”, and
    asm ("movl %%eax, %0": "=g" (result) );
which says “move the contents of register eax to the variable ‘result’”.
Here’s an example of the code being compiled, along with its usage:
    hopper% gcc -g -o assembler_code assembler_code.c -lm
    hopper% assembler_code 3 9
    Input #1:    3: 00000000000000000000000000000011
    Input #2:    9: 00000000000000000000000000001001
    Result is:   11: 00000000000000000000000000001011
    hopper%
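The two constructs above move one value per asm statement. GCC's extended asm syntax also lets a single statement declare its inputs, outputs, and side effects, so the compiler picks the registers and no manual push/pop is needed. The sketch below shows that form for the same OR instruction the lab example uses. It is an illustration, not the lab's method: the function name do_or is hypothetical, __asm__ is used as the spelling of asm that gcc also accepts in strict ISO C modes, and the plain C fallback exists only so the sketch builds on non-x86 machines.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper mirroring the (input1, input2, result) shape of the
   lab's macro: stores input1 OR input2 into *result. */
void do_or( unsigned int input1, unsigned int input2, unsigned int *result )
{
    unsigned int out;

    assert( result != NULL );
#if defined(__i386__) || defined(__x86_64__)
    __asm__ ( "orl %2, %0"        /* operand 0 = operand 0 | operand 2         */
              : "=r" (out)        /* output: any general register              */
              : "0" (input1),     /* input: starts in the same register as %0  */
                "r" (input2)      /* input: any general register               */
              : "cc" );           /* the instruction changes the flags         */
#else
    out = input1 | input2;        /* fallback so the sketch builds elsewhere   */
#endif
    *result = out;
}
```

Called as do_or( 3, 9, &result ), this leaves 11 in result, matching the sample run above. The same operand list should carry over to other two-operand integer instructions by changing only the instruction string, though instructions with fixed registers or extra outputs need different constraints.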

Code to Insert Assembler Within C
    /*********************************************************************
      This program shows how to call assembler from within C.
    *********************************************************************/
    #include <stdlib.h>
    #include <stdio.h>
    #include <math.h>

    /*********************************************************************
      This is the definition of the assembler macro.
      The parameter names match the names used in the macro body, so the
      macro operates on whatever arguments it is given.
    *********************************************************************/
    #define DO_INSTR(input1, input2, result)                \
    ( {                                                     \
        asm ("push %eax" );                                 \
        asm ("push %ebx" );                                 \
        asm ("movl %0, %%eax": : "g" (input1));             \
        asm ("movl %0, %%ebx": : "g" (input2));             \
        asm ("or %ebx, %eax" );                             \
        asm ("movl %%eax, %0": "=g" (result) );             \
        asm ("pop %ebx");                                   \
        asm ("pop %eax");                                   \
    } )

Code to Insert Assembler Within C
    #define DO_NOP(Input_1, Input_2, Result)                \
    ( {                                                     \
        asm ("nop" );                                       \
        asm ("nop" );                                       \
        asm ("nop" );                                       \
        asm ("nop" );                                       \
        asm ("nop" );                                       \
        asm ("nop" );                                       \
        asm ("nop" );                                       \
        asm ("nop" );                                       \
        asm ("nop" );                                       \
        asm ("nop" );                                       \
        asm ("nop" );                                       \
        asm ("nop" );                                       \
        asm ("nop" );                                       \
    } )

    void int_to_binary( unsigned int , char *);

    /*********************************************************************
      The main code starts here.
    *********************************************************************/
    int main( int argc, char *argv[] )
    {
        int          input1;    /* This is the 1st input argument. */
        int          input2;    /* This is the 2nd input argument. */
        unsigned int result;    /* This is where the macro places the result. */
        char         binary_text[40];

Code to Insert Assembler Within C
        if (argc < 3 )
        {
            printf( "Usage: assembler_code <input1> <input2>\n");
            exit(0);
        }
        input1 = atol( argv[1] );
        input2 = atol( argv[2] );
        result = 0;

        int_to_binary( input1, binary_text );
        printf( "Input #1: %4d: %s\n", input1, binary_text );
        int_to_binary( input2, binary_text );
        printf( "Input #2: %4d: %s\n", input2, binary_text );

        DO_INSTR( input1, input2, result );
        DO_NOP( input1, input2, result );

        int_to_binary( result, binary_text );
        printf( "Result is: %4u: %s\n", result, binary_text );  /* result is unsigned */
        return 0;
    }   /* End of main */

Code to Insert Assembler Within C
    /*********************************************************************
      This function converts an integer to a string representing binary
    *********************************************************************/
    void int_to_binary( unsigned int input, char *output )
    {
        int          index;
        unsigned int comparison;

        for ( index = 0; index < 32; index++ )
        {
            comparison = pow( (double)2, (double)index );
            if ( ( input & comparison ) != 0 )
                output[31 - index] = '1';
            else
                output[31 - index] = '0';
        }
        output[32] = '\0';
        return;
    }   /* End of int_to_binary */
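The function above builds each bit mask with pow(), which is why the program is linked with -lm. A shift-based variant is a common alternative that stays entirely in integer arithmetic. The sketch below is offered as an option, not as the lab's code: the name int_to_binary_shift is hypothetical, and like the original it assumes unsigned int holds at least 32 bits and that output has room for 33 characters.

```c
#include <assert.h>
#include <stddef.h>

/* Writes the 32-bit binary form of input into output, most significant
   bit first, followed by a terminating '\0' (33 bytes in total). */
void int_to_binary_shift( unsigned int input, char *output )
{
    int index;

    assert( output != NULL );
    for ( index = 0; index < 32; index++ )
    {
        /* Bit number 'index', counted from the least significant bit. */
        unsigned int bit = ( input >> index ) & 1u;

        output[31 - index] = bit ? '1' : '0';
    }
    output[32] = '\0';
}
```

Because this variant never calls pow(), a program that uses it in place of int_to_binary would not need the math library for this function.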

Code to Insert Assembler Within C
    0x8048551 <main+145>: push %eax        <- Code for DO_INSTR
    0x8048552 <main+146>: push %ebx
    0x8048553 <main+147>: mov  %edi,%eax
    0x8048555 <main+149>: mov  %esi,%ebx
    0x8048557 <main+151>: or   %ebx,%eax
    0x8048559 <main+153>: mov  %eax,%esi
    0x804855b <main+155>: pop  %ebx
    0x804855c <main+156>: pop  %eax
    0x804855d <main+157>: nop              <- Code for DO_NOP
    0x804855e <main+158>: nop
    0x804855f <main+159>: nop
    0x8048560 <main+160>: nop
    0x8048561 <main+161>: nop
    0x8048562 <main+162>: nop
    0x8048563 <main+163>: nop
    0x8048564 <main+164>: nop
    0x8048565 <main+165>: nop
    0x8048566 <main+166>: nop
    0x8048567 <main+167>: nop
    0x8048568 <main+168>: nop
    0x8048569 <main+169>: nop
