
CprE 381 Computer Organization and Assembly Level Programming, Fall 2013

Chapter 2

Instructions: Language of the Computer

Zhao Zhang

Iowa State University

Revised from original slides provided by MKP

Review of Week 4
  • MIPS procedure/function call convention
  • Leaf and non-leaf examples
  • Clearing array example
  • String copy example
  • Other issues:
    • Load 32-bit immediate
    • Assembler, loader, and compiler effects

§2.8 Supporting Procedures in Computer Hardware

Chapter 2 — Instructions: Language of the Computer — 2

Announcements
  • Exam 1 on Friday Oct. 4
  • Course review on Wednesday Oct. 2
  • HW4 is due on Sep. 27
  • HW5 will be due on Oct. 11
    • Do HW5 as exercise before Exam 1
    • No HW and quizzes next week
  • Lab 2 demo is due this week and Lab 3 demo due next week
  • Lab 4 starts next week, due in one week

Chapter 1 — Computer Abstractions and Technology — 3

Exam 1
  • Open book and open notes; calculators are allowed
  • E-book reader is allowed
    • Must be put in airplane mode
  • Coverage
    • Chapter 1, Computer Abstractions and Technology
    • Chapter 2, Instructions: Language of the Computer
    • Some contents from Appendix B
    • MIPS floating-point instructions

Exam Question Types
  • Short conceptual questions
  • Calculation: speedup, power saving, CPI, etc.
  • MIPS assembly programming
    • Translate C statements to MIPS (arithmetic, load/store, branch and jump, others)
    • Translate C functions to MIPS (call convention)
  • Among others

Suggestions:

  • Review slides and textbook
  • Review homework and quizzes

Overview for Week 5

Overview for Week 5, Sep. 23 - 27

  • Bubble sorting example
    • It will be used in Mini-Projects
  • Floating point instructions
  • ARM and x86 instruction set overview

Classic Bubble Sorting
  • Bubble sort: swap two adjacent elements if they are out of order
  • Make n passes over the array; in each pass the largest remaining element floats to the top
  • Look at the first pass over five elements

1st try: 5 3 8 2 7 => 3 5 8 2 7

2nd try: 3 5 8 2 7 => 3 5 8 2 7

3rd try: 3 5 8 2 7 => 3 5 2 8 7

4th try: 3 5 2 8 7 => 3 5 2 7 8

Classic Bubble Sorting
  • Pass i only has to check (n-i) adjacent pairs
    • In each pass, an element may float up until it meets a larger element
    • The sorted sub-array grows by one element per pass

1st pass: 5 3 8 2 7 => 3 5 2 7 8

2nd pass: 3 5 2 7 8 => 3 2 5 7 8

3rd pass: 3 2 5 7 8 => 2 3 5 7 8

4th pass: 2 3 5 7 8 => 2 3 5 7 8
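
The classic scheme above can be sketched in C as follows. This is a minimal illustration, not code from the slides: the names bubble_pass and bubble_sort are made up here, and the pass counter i starts at 1 so that pass i checks n - i adjacent pairs.

```c
#include <assert.h>
#include <stddef.h>

/* One pass of classic bubble sort over the first `len` elements:
 * compare each adjacent pair and swap it when it is out of order.
 * Returns the number of swaps performed (hypothetical helper). */
static int bubble_pass(int v[], size_t len)
{
    int swaps = 0;
    for (size_t j = 0; j + 1 < len; j++) {
        if (v[j] > v[j + 1]) {
            int tmp = v[j];
            v[j] = v[j + 1];
            v[j + 1] = tmp;
            swaps++;
        }
    }
    return swaps;
}

/* Classic bubble sort: pass i (counting from 1) looks at the first
 * n - i + 1 elements, i.e. n - i adjacent pairs, because the last
 * i - 1 elements are already in their final place. */
static void bubble_sort(int v[], size_t n)
{
    assert(v != NULL || n == 0);
    for (size_t i = 1; i < n; i++)
        bubble_pass(v, n - i + 1);
}
```

Calling bubble_pass on 5 3 8 2 7 with len = 5 reproduces the first pass shown above; bubble_sort repeats it on a shrinking prefix.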

Revised Bubble Sorting
  • The textbook bubble-sort is optimized to reduce comparisons

void sort (int v[], int n)
{
    int i, j;
    for (i = 0; i < n; i++) {
        for (j = i - 1; j >= 0 && v[j] > v[j+1]; j--)
            swap(v, j);
    }
}

Revised Bubble Sorting
  • The classic version lets the largest element float to the top of the unsorted sub-array
  • The revised version lets an element float to its correct place in the sorted sub-array

1st pass: 5 3 8 2 7 => 3 5 8 2 7

2nd pass: 3 5 8 2 7 => 3 5 8 2 7

3rd pass: 3 5 8 2 7 => 2 3 5 8 7

4th pass: 2 3 5 8 7 => 2 3 5 7 8
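
To check the trace above by hand, the revised sort can be split so that one outer-loop iteration runs on its own. This is a sketch under that assumption: sort_pass is a hypothetical helper that holds only the inner loop, and swap is the two-argument helper used by the textbook sort.

```c
#include <assert.h>

/* Exchange v[k] and v[k+1]. */
static void swap(int v[], int k)
{
    int temp = v[k];
    v[k] = v[k + 1];
    v[k + 1] = temp;
}

/* Inner loop for one value of i: the element that starts at v[i]
 * moves left until the element before it is not larger. */
static void sort_pass(int v[], int i)
{
    for (int j = i - 1; j >= 0 && v[j] > v[j + 1]; j--)
        swap(v, j);
}

/* The revised sort is the inner loop repeated for i = 0 .. n-1.
 * The i = 0 iteration does nothing, so the "1st pass" in the trace
 * corresponds to i = 1. */
static void sort(int v[], int n)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        sort_pass(v, i);
}
```

Running sort_pass with i = 1, 2, 3, 4 on 5 3 8 2 7 yields the four lines of the trace in order.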

The Swap Function
  • The swap function is a leaf function

void swap(int v[], int k)
{
    int temp;
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
}

    • v in $a0, k in $a1, temp in $t0

§2.13 A C Sort Example to Put It All Together

The Swap Function

swap: sll $t1, $a1, 2 # $t1 = k * 4

add $t1, $a0, $t1 # $t1 = v+(k*4)

# (address of v[k])

lw $t0, 0($t1) # $t0 (temp) = v[k]

lw $t2, 4($t1) # $t2 = v[k+1]

sw $t2, 0($t1) # v[k] = $t2 (v[k+1])

sw $t0, 4($t1) # v[k+1] = $t0 (temp)

jr $ra # return to calling routine
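
The address arithmetic in the first two instructions can be mirrored in C. The sketch below assumes 4-byte words (int32_t) and a platform where a pointer converts to uintptr_t as a plain byte address; element_address and swap_by_address are illustrative names, not part of the slides.

```c
#include <assert.h>
#include <stdint.h>

/* Byte address of v[k], computed the way the MIPS code does it:
 * shift k left by 2 (multiply by 4) and add the base address. */
static int32_t *element_address(int32_t *v, uint32_t k)
{
    uintptr_t base = (uintptr_t)v;          /* $a0 */
    uintptr_t offset = (uintptr_t)k << 2;   /* sll $t1, $a1, 2 */
    return (int32_t *)(base + offset);      /* add $t1, $a0, $t1 */
}

/* Swap v[k] and v[k+1] through the computed address. */
static void swap_by_address(int32_t *v, uint32_t k)
{
    int32_t *p = element_address(v, k);
    assert(p == &v[k]);      /* same address the compiler would form */
    int32_t temp = p[0];     /* lw $t0, 0($t1) */
    int32_t next = p[1];     /* lw $t2, 4($t1) */
    p[0] = next;             /* sw $t2, 0($t1) */
    p[1] = temp;             /* sw $t0, 4($t1) */
}
```

The offsets 0 and 4 in the lw/sw instructions correspond to p[0] and p[1] here.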

The Sort Function

for (i = 0; i < n; i++) {
    for (j = i - 1; j >= 0 && v[j] > v[j+1]; j--)
        swap(v, j);
}

  • Save $ra to stack, as it’s a non-leaf function
  • Assign i and j to $s0 and $s1
    • They must be preserved when calling swap()
  • Move v, n from $a0 and $a1 to $s2 and $s3
    • They must be preserved, too
    • $a0 and $a1 are used when calling swap()
  • We need a stack frame of 5 words or 20 bytes

Sort Prologue and Epilogue

sort: addi $sp, $sp, -20 # make room on stack for 5 registers

sw $ra, 16($sp) # save $ra on stack

sw $s3,12($sp) # save $s3 on stack

sw $s2, 8($sp) # save $s2 on stack

sw $s1, 4($sp) # save $s1 on stack

sw $s0, 0($sp) # save $s0 on stack

… # procedure body

exit1: lw $s0, 0($sp) # restore $s0 from stack

lw $s1, 4($sp) # restore $s1 from stack

lw $s2, 8($sp) # restore $s2 from stack

lw $s3,12($sp) # restore $s3 from stack

lw $ra,16($sp) # restore $ra from stack

addi $sp,$sp, 20 # restore stack pointer

jr $ra # return to calling routine

  • Entry: Get a frame, save $ra and $s3-$s0
  • Exit: Restore $s0-$s3 and $ra, free the frame

Sort Function Body

A new pseudo-instruction

move rd, rs

is equivalent to

add rd, rs, $zero

Example

move $s2, $a0 # $s2 = $a0

move $s3, $a1 # $s3 = $a1

Pseudo-instructions may not be used in Exam 1

Sort Function Body

# Move params
         move $s2, $a0            # save $a0 into $s2
         move $s3, $a1            # save $a1 into $s3
# Outer loop
         move $s0, $zero          # i = 0
for1tst: slt  $t0, $s0, $s3       # $t0 = 0 if $s0 ≥ $s3 (i ≥ n)
         beq  $t0, $zero, exit1   # go to exit1 if $s0 ≥ $s3 (i ≥ n)
# Inner loop
         addi $s1, $s0, -1        # j = i - 1
for2tst: slti $t0, $s1, 0         # $t0 = 1 if $s1 < 0 (j < 0)
         bne  $t0, $zero, exit2   # go to exit2 if $s1 < 0 (j < 0)
         sll  $t1, $s1, 2         # $t1 = j * 4
         add  $t2, $s2, $t1       # $t2 = v + (j * 4)
         lw   $t3, 0($t2)         # $t3 = v[j]
         lw   $t4, 4($t2)         # $t4 = v[j + 1]
         slt  $t0, $t4, $t3       # $t0 = 0 if $t4 ≥ $t3
         beq  $t0, $zero, exit2   # go to exit2 if $t4 ≥ $t3
# Pass params & call
         move $a0, $s2            # 1st param of swap is v (old $a0)
         move $a1, $s1            # 2nd param of swap is j
         jal  swap                # call swap procedure
# Inner loop
         addi $s1, $s1, -1        # j -= 1
         j    for2tst             # jump to test of inner loop
# Outer loop
exit2:   addi $s0, $s0, 1         # i += 1
         j    for1tst             # jump to test of outer loop
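
One way to see how the two for loops map onto the labels and branches is to rewrite the C function with explicit gotos. This is only an illustrative sketch: sort_goto is a made-up name, and the label names are borrowed from the assembly so each test lines up with its slt/branch pair.

```c
#include <assert.h>

/* Exchange v[k] and v[k+1]. */
static void swap(int v[], int k)
{
    int temp = v[k];
    v[k] = v[k + 1];
    v[k + 1] = temp;
}

/* Same behavior as the nested-for sort, written with gotos so that
 * each statement corresponds to a small group of MIPS instructions. */
static void sort_goto(int v[], int n)
{
    int i, j;
    assert(n >= 0);
    i = 0;                                /* move $s0, $zero */
for1tst:
    if (!(i < n)) goto exit1;             /* slt + beq ... exit1 */
    j = i - 1;                            /* addi $s1, $s0, -1 */
for2tst:
    if (j < 0) goto exit2;                /* slti + bne ... exit2 */
    if (!(v[j + 1] < v[j])) goto exit2;   /* lw, lw, slt + beq ... exit2 */
    swap(v, j);                           /* move, move, jal swap */
    j -= 1;                               /* addi $s1, $s1, -1 */
    goto for2tst;                         /* j for2tst */
exit2:
    i += 1;                               /* addi $s0, $s0, 1 */
    goto for1tst;                         /* j for1tst */
exit1:
    return;                               /* epilogue, then jr $ra */
}
```

The && in the inner loop condition becomes two separate early exits to exit2, which is why the assembly has two branches to that label.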

Sort Function Optimized

Old version:

void sort(int v[], int n)
{
    int i, j;
    for (i = 0; i < n; i++) {
        for (j = i - 1; j >= 0 && v[j] > v[j+1]; j--)
            swap(v, j);
    }
}

New version:

void sort(int v[], int n)
{
    int *pi, *pj;
    for (pi = v; pi < &v[n]; pi++)
        for (pj = pi - 1; pj >= v && swap(pj); pj--)
            {}
}

New Swap Function
  • A more efficient swap function that reduces memory loads

// swap two adjacent elements if they are
// out of order. Return 1 if swapped, 0
// otherwise
int swap(int *p)
{
    if (p[0] > p[1]) {
        int tmp = p[0];
        p[0] = p[1];
        p[1] = tmp;
        return 1;
    }
    else
        return 0;
}
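
Putting the new swap together with the pointer-based sort gives a complete routine that can be compiled and tried. The sketch below makes one deliberate change, stated as an assumption: the inner loop starts at pi and calls swap(pj - 1), rather than starting at pi - 1, so that it never forms a pointer before the start of the array (which ISO C does not guarantee to be valid). The pairs compared are the same.

```c
#include <assert.h>

/* Swap two adjacent elements if they are out of order.
 * Returns 1 if swapped, 0 otherwise. */
static int swap(int *p)
{
    if (p[0] > p[1]) {
        int tmp = p[0];
        p[0] = p[1];
        p[1] = tmp;
        return 1;
    }
    return 0;
}

/* Pointer-based sort. pi walks the array; pj walks back toward v
 * for as long as swap() reports that it exchanged a pair. */
static void sort(int v[], int n)
{
    assert(n >= 0);
    for (int *pi = v; pi < &v[n]; pi++)
        for (int *pj = pi; pj > v && swap(pj - 1); pj--)
            {}
}
```

Because swap both compares and exchanges, the loop condition no longer loads v[j] and v[j+1] separately before the call, which is the saving in memory loads that the slide refers to.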

New Swap Function
  • MIPS code for the new swap function

swap:
        lw   $t0, 0($a0)          # load p[0]
        lw   $t1, 4($a0)          # load p[1]
        slt  $t2, $t1, $t0        # p[1] < p[0]?
        beq  $t2, $zero, else     # no? skip the swap
        sw   $t1, 0($a0)          # swap
        sw   $t0, 4($a0)          # swap
        addi $v0, $zero, 1        # $v0 = 1
        jr   $ra
else:
        addi $v0, $zero, 0        # $v0 = 0
        jr   $ra

New Sort Function

The sort() function optimized

  • Register usage
    • $s0: v
    • $s1: &v[n]
    • $s2: pi
    • $s3: pj
  • Need a frame of 5 words to save $ra and $s0-$s3

Sort Prologue and Epilogue

sort:
        addi $sp, $sp, -20        # frame of 5 words
        sw   $ra, 16($sp)
        sw   $s3, 12($sp)
        sw   $s2, 8($sp)
        sw   $s1, 4($sp)
        sw   $s0, 0($sp)

        # MIPS code for sort function body

        lw   $s0, 0($sp)
        lw   $s1, 4($sp)
        lw   $s2, 8($sp)
        lw   $s3, 12($sp)
        lw   $ra, 16($sp)
        addi $sp, $sp, 20         # release frame
        jr   $ra

New Sort: Outer Loop

for (pi = v; pi < &v[n]; pi++)
    for (pj = pi - 1; pj >= v && swap(pj); pj--)   /* C code for the inner loop */
        {}

        add  $s0, $a0, $zero      # $s0 = v
        sll  $a1, $a1, 2          # $a1 = 4*n
        add  $s1, $s0, $a1        # $s1 = &v[n]
        add  $s2, $s0, $zero      # pi = v
        j    for1_tst
for1_loop:
        # MIPS code for the inner loop
        addi $s2, $s2, 4          # pi++
for1_tst:
        slt  $t0, $s2, $s1        # pi < &v[n]?
        bne  $t0, $zero, for1_loop # yes? repeat

New Sort: Inner Loop

for (pj = pi - 1; pj >= v && swap(pj); pj--)
    {}

        addi $s3, $s2, -4         # pj = pi-1
        j    for2_tst
for2_loop:
        addi $s3, $s3, -4         # pj--
for2_tst:
        slt  $t0, $s3, $s0        # pj < v?
        bne  $t0, $zero, for2_exit # yes? exit
        add  $a0, $s3, $zero      # $a0 = pj
        jal  swap                 # swap(pj)
        bne  $v0, $zero, for2_loop # ret 1? cont
for2_exit:

Lab Mini-Projects
  • You will use the sorting code to test your CPU design in the lab mini-projects
  • Use the new sorting code
    • The new code is more optimized
    • It will simplify the debugging

FP Instructions in MIPS

Reading: Textbook Ch. 3.5 and B-71 – B-80

  • FP hardware is coprocessor 1
    • Adjunct processor that extends the ISA
  • Separate FP registers
    • 32 single-precision: $f0, $f1, … $f31
    • Paired for double-precision: $f0/$f1, $f2/$f3, …
      • Release 2 of MIPS ISA supports 32 × 64-bit FP reg’s

Chapter 3 — Arithmetic for Computers — 25

FP Instructions in MIPS
  • FP instructions operate only on FP registers
    • Programs generally don’t do integer ops on FP data, or vice versa
    • More registers with minimal code-size impact

FP Instructions in MIPS
  • FP load and store instructions
    • lwc1, ldc1, swc1, sdc1
      • e.g., ldc1 $f8, 32($sp)
    • lwc1, swc1: Load/store single-precision
    • ldc1, sdc1: Load/store double-precision

FP Instructions in MIPS
  • Single-precision arithmetic
    • add.s, sub.s, mul.s, div.s
      • e.g., add.s $f0, $f1, $f6
  • Double-precision arithmetic
    • add.d, sub.d, mul.d, div.d
      • e.g., mul.d $f4, $f4, $f6

FP Instructions in MIPS
  • Single- and double-precision comparison
    • c.xx.s, c.xx.d (xx is eq, lt, le, …)
    • Sets or clears FP condition-code bit
      • e.g. c.lt.s $f3, $f4
  • Branch on FP condition code true or false
    • bc1t, bc1f
      • e.g., bc1t TargetLabel

MIPS Call Convention: FP
  • The first two FP parameters in registers
    • 1st parameter in $f12 or $f12:$f13
      • A double-precision parameter takes two registers
    • 2nd FP parameter in $f14 or $f14:$f15
    • Extra parameters in stack
  • $f0 stores single-precision FP return value
  • $f0:$f1 stores double-precision FP return value
  • $f0-$f19 are FP temporary registers
  • $f20-$f31 are FP saved temporary registers

FP Example: °F to °C
  • C code:

float f2c (float fahr)

{ return ((5.0/9.0) * (fahr - 32.0));}

  • fahr in $f12, result in $f0
  • Assume literals in global memory space, e.g. const5 for 5.0 and const9 for 9.0
    • Can FP immediate be encoded in MIPS instructions?

FP Example: °F to °C
  • Compiled MIPS code:

f2c:  lwc1  $f16, const5($gp)
      lwc1  $f18, const9($gp)
      div.s $f16, $f16, $f18
      lwc1  $f18, const32($gp)
      sub.s $f18, $f12, $f18
      mul.s $f0, $f16, $f18
      jr    $ra
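
The MIPS code works entirely in single precision, whereas the unsuffixed literals in the C source (5.0, 9.0, 32.0) have type double. A C sketch that follows the assembly step by step therefore uses float constants. The names const5, const9 and const32 echo the memory labels in the assembly; f2c_check is a hypothetical self-check added for illustration.

```c
#include <assert.h>

/* Fahrenheit to Celsius, one C statement per FP instruction. */
static float f2c(float fahr)
{
    const float const5 = 5.0f, const9 = 9.0f, const32 = 32.0f;
    float ratio = const5 / const9;   /* lwc1, lwc1, div.s */
    float diff = fahr - const32;     /* lwc1, sub.s */
    return ratio * diff;             /* mul.s, result in $f0 */
}

/* Two well-known points on the scale: 32 °F is exactly 0 °C,
 * and 212 °F is 100 °C up to single-precision rounding. */
static void f2c_check(void)
{
    float boil = f2c(212.0f);
    assert(f2c(32.0f) == 0.0f);
    assert(boil > 99.999f && boil < 100.001f);
}
```

The constants live in memory in the assembly version because the FP arithmetic instructions shown here take only register operands.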

FP Example: Function Call

extern float fahr, cel;

cel = f2c(fahr);

Assume fahr is at 100($gp), cel is at 104($gp)

lwc1 $f12, 100($gp) # load 1st param

jal f2c

swc1 $f0, 104($gp) # save ret val

FP Example: Max

double max(double x, double y)

{

return (x > y) ? x : y;

}

max:

c.lt.d $f14, $f12 # y < x?

bc1f else # if false, do else

mov.d $f0, $f12 # $f0:$f1 = x

jr $ra

else:

mov.d $f0, $f14 # $f0:$f1 = y

jr $ra

FP Example: Max
  • How to call max?
    • Assume a, b, c at 100($gp), 108($gp), and 116($gp)

extern double a, b, c;

c = max(a, b);

ldc1 $f12, 100($gp) # $f12:$f13 = a

ldc1 $f14, 108($gp) # $f14:$f15 = b

jal max

sdc1 $f0, 116($gp) # c = $f0:$f1

FP Example: Search Value

int search(double X[], int size, double value)

{

for (int i = 0; i < size; i++)

if (X[i] == value)

return 1;

return 0;

}

Note 1: There are integer and FP parameters, and the return value is integer

Note 2: A real program may search a value in a range, e.g. [value - delta, value + delta]
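
The range search mentioned in Note 2 might look like the sketch below. The function name search_range and the extra delta parameter are assumptions made for illustration; the comparison is written exactly as the closed interval [value - delta, value + delta].

```c
#include <assert.h>

/* Return 1 if some X[i] lies in [value - delta, value + delta],
 * 0 otherwise. delta must be non-negative. */
static int search_range(const double X[], int size,
                        double value, double delta)
{
    assert(delta >= 0.0);
    for (int i = 0; i < size; i++)
        if (X[i] >= value - delta && X[i] <= value + delta)
            return 1;
    return 0;
}
```

A tolerance is useful because values produced by FP arithmetic carry rounding error, so an exact == comparison can miss a value that is mathematically equal. With delta = 0.0 the function reduces to the exact-match search above.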

FP Example: Search Value

search:

add $t0, $zero, $zero # i = 0

j for_cond

for_loop:

sll $t1, $t0, 3 # $t1 = 8*i

add $t1, $a0, $t1 # $t1 = &X[i]

ldc1 $f2, 0($t1) # $f2:$f3 = X[i]

c.eq.d $f2, $f12 # X[i] == value?

bc1f endif # if false, skip

addi $v0, $zero, 1 # $v0 = 1

jr $ra # return

endif:

addi $t0, $t0, 1 # i++

for_cond:

slt $t1, $t0, $a1 # i < size?

bne $t1, $zero, for_loop # repeat if true

add $v0, $zero, $zero # to return 0

jr $ra

FP Example: Array Multiplication
  • X = X + Y × Z
    • All 32 × 32 matrices, 64-bit double-precision elements
  • C code:

void mm (double x[][32], double y[][32], double z[][32])
{
    int i, j, k;
    for (i = 0; i != 32; i = i + 1)
        for (j = 0; j != 32; j = j + 1)
            for (k = 0; k != 32; k = k + 1)
                x[i][j] = x[i][j] + y[i][k] * z[k][j];
}

    • Addresses of x, y, z in $a0, $a1, $a2, and i, j, k in $s0, $s1, $s2

FP Example: Array Multiplication
  • MIPS code:

      li   $t1, 32          # $t1 = 32 (row size/loop end)
      li   $s0, 0           # i = 0; initialize 1st for loop
L1:   li   $s1, 0           # j = 0; restart 2nd for loop
L2:   li   $s2, 0           # k = 0; restart 3rd for loop
      sll  $t2, $s0, 5      # $t2 = i * 32 (size of row of x)
      addu $t2, $t2, $s1    # $t2 = i * size(row) + j
      sll  $t2, $t2, 3      # $t2 = byte offset of [i][j]
      addu $t2, $a0, $t2    # $t2 = byte address of x[i][j]
      l.d  $f4, 0($t2)      # $f4 = 8 bytes of x[i][j]
L3:   sll  $t0, $s2, 5      # $t0 = k * 32 (size of row of z)
      addu $t0, $t0, $s1    # $t0 = k * size(row) + j
      sll  $t0, $t0, 3      # $t0 = byte offset of [k][j]
      addu $t0, $a2, $t0    # $t0 = byte address of z[k][j]
      l.d  $f16, 0($t0)     # $f16 = 8 bytes of z[k][j]
      …

FP Example: Array Multiplication

      …
      sll   $t0, $s0, 5       # $t0 = i*32 (size of row of y)
      addu  $t0, $t0, $s2     # $t0 = i*size(row) + k
      sll   $t0, $t0, 3       # $t0 = byte offset of [i][k]
      addu  $t0, $a1, $t0     # $t0 = byte address of y[i][k]
      l.d   $f18, 0($t0)      # $f18 = 8 bytes of y[i][k]
      mul.d $f16, $f18, $f16  # $f16 = y[i][k] * z[k][j]
      add.d $f4, $f4, $f16    # $f4 = x[i][j] + y[i][k]*z[k][j]
      addiu $s2, $s2, 1       # k = k + 1
      bne   $s2, $t1, L3      # if (k != 32) go to L3
      s.d   $f4, 0($t2)       # x[i][j] = $f4
      addiu $s1, $s1, 1       # j = j + 1
      bne   $s1, $t1, L2      # if (j != 32) go to L2
      addiu $s0, $s0, 1       # i = i + 1
      bne   $s0, $t1, L1      # if (i != 32) go to L1
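
The sll/addu/sll sequences compute the byte offset of an element as (row * 32 + column) * 8. The C sketch below spells that out and also mirrors how the assembly keeps x[i][j] in a register across the k loop. It assumes 8-byte doubles; byte_offset and the constant N are illustrative names.

```c
#include <assert.h>
#include <stddef.h>

enum { N = 32 };   /* matrix dimension used throughout the example */

/* Byte offset of element [row][col] in a row-major N x N array of
 * 8-byte doubles: shift by 5 multiplies by 32, shift by 3 by 8. */
static size_t byte_offset(size_t row, size_t col)
{
    assert(row < N && col < N);
    return ((row << 5) + col) << 3;   /* sll 5, addu, sll 3 */
}

/* x = x + y * z, with the running sum held in a local variable,
 * as the assembly holds it in $f4. */
static void mm(double x[][N], double y[][N], double z[][N])
{
    for (int i = 0; i != N; i++)
        for (int j = 0; j != N; j++) {
            double acc = x[i][j];            /* l.d $f4 before loop L3 */
            for (int k = 0; k != N; k++)
                acc += y[i][k] * z[k][j];    /* mul.d, add.d */
            x[i][j] = acc;                   /* s.d $f4 after loop L3 */
        }
}
```

Loading x[i][j] once before the innermost loop and storing it once afterwards saves a load and a store per iteration compared with a literal translation of the C statement.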

ARM & MIPS Similarities
  • ARM: the most popular embedded core
  • Similar basic set of instructions to MIPS

§2.16 Real Stuff: ARM Instructions

Compare and Branch in ARM
  • Uses condition codes for result of an arithmetic/logical instruction
    • Negative, zero, carry, overflow
    • Compare instructions to set condition codes without keeping the result
  • Each instruction can be conditional
    • Top 4 bits of instruction word: condition value
    • Can avoid branches over single instructions

Instruction Encoding

The Intel x86 ISA
  • Evolution with backward compatibility
    • 8080 (1974): 8-bit microprocessor
      • Accumulator, plus 3 index-register pairs
    • 8086 (1978): 16-bit extension to 8080
      • Complex instruction set (CISC)
    • 8087 (1980): floating-point coprocessor
      • Adds FP instructions and register stack
    • 80286 (1982): 24-bit addresses, MMU
      • Segmented memory mapping and protection
    • 80386 (1985): 32-bit extension (now IA-32)
      • Additional addressing modes and operations
      • Paged memory mapping as well as segments

§2.17 Real Stuff: x86 Instructions

The Intel x86 ISA
  • Further evolution…
    • i486 (1989): pipelined, on-chip caches and FPU
      • Compatible competitors: AMD, Cyrix, …
    • Pentium (1993): superscalar, 64-bit datapath
      • Later versions added MMX (Multi-Media eXtension) instructions
      • The infamous FDIV bug
    • Pentium Pro (1995), Pentium II (1997)
      • New microarchitecture (see Colwell, The Pentium Chronicles)
    • Pentium III (1999)
      • Added SSE (Streaming SIMD Extensions) and associated registers
    • Pentium 4 (2001)
      • New microarchitecture
      • Added SSE2 instructions

The Intel x86 ISA
  • And further…
    • AMD64 (2003): extended architecture to 64 bits
    • EM64T – Extended Memory 64 Technology (2004)
      • AMD64 adopted by Intel (with refinements)
      • Added SSE3 instructions
    • Intel Core (2006)
      • Added SSE4 instructions, virtual machine support
    • AMD64 (announced 2007): SSE5 instructions
      • Intel declined to follow, instead…
    • Advanced Vector Extension (announced 2008)
      • Longer SSE registers, more instructions
  • If Intel didn’t extend with compatibility, its competitors would!
    • Technical elegance ≠ market success

Basic x86 Registers

Basic x86 Addressing Modes
  • Two operands per instruction
  • Memory addressing modes
    • Address in register
    • Address = Rbase + displacement
    • Address = Rbase + 2^scale × Rindex (scale = 0, 1, 2, or 3)
    • Address = Rbase + 2^scale × Rindex + displacement
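
The most general mode in the list can be written as a single expression; the simpler modes fall out by setting the index or the displacement to zero. This is a sketch of the arithmetic only, assuming 32-bit addresses that wrap around; effective_address is a made-up name and does not model segment registers or instruction encoding.

```c
#include <assert.h>
#include <stdint.h>

/* Address = base + 2^scale * index + displacement, in 32-bit
 * wrap-around arithmetic. scale must be 0, 1, 2, or 3. */
static uint32_t effective_address(uint32_t base, uint32_t index,
                                  unsigned scale, int32_t displacement)
{
    assert(scale <= 3);
    return base + (index << scale) + (uint32_t)displacement;
}
```

For example, with base 0x1000, index 3, scale 2 and displacement 8, the result is 0x1000 + 4 × 3 + 8 = 0x1014, which is the address of a field 8 bytes into the fourth 4-byte-scaled slot.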

x86 Instruction Encoding
  • Variable length encoding
    • Postfix bytes specify addressing mode
    • Prefix bytes modify operation
      • Operand length, repetition, locking, …

Implementing IA-32
  • Complex instruction set makes implementation difficult
    • Hardware translates instructions to simpler microoperations
      • Simple instructions: 1–1
      • Complex instructions: 1–many
    • Microengine similar to RISC
    • Market share makes this economically viable
  • Comparable performance to RISC
    • Compilers avoid complex instructions

Fallacies
  • Powerful instruction ⇒ higher performance
    • Fewer instructions required
    • But complex instructions are hard to implement
      • May slow down all instructions, including simple ones
    • Compilers are good at making fast code from simple instructions
  • Use assembly code for high performance
    • But modern compilers are better at dealing with modern processors
    • More lines of code ⇒ more errors and less productivity

§2.18 Fallacies and Pitfalls

Fallacies
  • Backward compatibility ⇒ instruction set doesn’t change
    • But they do accrete more instructions

x86 instruction set

Pitfalls
  • Sequential words are not at sequential addresses
    • Increment by 4, not by 1!
  • Keeping a pointer to an automatic variable after procedure returns
    • e.g., passing pointer back via an argument
    • Pointer becomes invalid when stack popped

Concluding Remarks
  • Design principles

1. Simplicity favors regularity

2. Smaller is faster

3. Make the common case fast

4. Good design demands good compromises

  • Layers of software/hardware
    • Compiler, assembler, hardware
  • MIPS: typical of RISC ISAs
    • cf. x86

§2.19 Concluding Remarks

Concluding Remarks
  • Measure MIPS instruction executions in benchmark programs
    • Consider making the common case fast
    • Consider compromises
