
COM 249 – Computer Organization and Assembly Language
Chapter 2 Instructions: Language of the Computer

Based on slides from D. Patterson and www-inst.eecs.berkeley.edu/~cs152/

Modified by S. J. Fritz Spring 2009 (1)

Introduction
  • Words of a computer’s language are called its instructions
  • Its vocabulary is its instruction set.
  • Goal:
    • Find a language that makes it easy to build the hardware and the compiler,
    • while maximizing performance and minimizing cost

Instruction Set

§2.1 Introduction

  • Different computers have different instruction sets
    • But with many aspects in common
  • Early computers had very simple instruction sets
    • Simplified implementation
  • Many modern computers also have simple instruction sets

Instruction Set Architecture
  • Early trend was to add more and more instructions to new CPUs to do elaborate operations
    • VAX architecture had an instruction to multiply polynomials!
  • RISC philosophy (Cocke at IBM, Patterson, Hennessy, 1980s): Reduced Instruction Set Computing (RISC)
    • Keep the instruction set small and simple; this makes it easier to build fast hardware.
    • Let software do complicated operations by composing simpler ones.

The MIPS Instruction Set
  • Stored-program concept: instructions and data are stored as numbers.
  • MIPS Instruction Set is used as the example throughout the book
  • Stanford MIPS commercialized by MIPS Technologies (www.mips.com)
  • Large share of embedded core market
    • Applications in consumer electronics, network/storage equipment, cameras, printers, …
  • Typical of many modern ISAs
    • See MIPS Reference Data tear-out card, and Appendixes B and E

MIPS Architecture
  • MIPS – semiconductor company that built one of the first commercial RISC architectures
  • We will study the MIPS architecture in some detail in this class
  • Why MIPS instead of Intel 80x86?
    • MIPS is simple, elegant. Don’t want to get bogged down in gritty details.
    • MIPS is widely used in embedded applications, x86 is little used there, and there are more embedded computers than PCs


Review: Instruction Set Design

[Figure: the instruction set is the interface between software (above) and hardware (below).]

Which is easier to change?

Stored Program Computer
  • Basic Principles
    • Use of instructions that are indistinguishable from numbers
    • Use of alterable memory for programs
  • Demands balance among number of instructions, the number of clock cycles needed by an instruction and the speed of the clock.

Overview of Design Principles

1. Simplicity favors regularity

  • keep all instructions a single size
  • require three register operands for arithmetic
  • keep register fields in same place in each instruction
  • regularity makes implementation simpler
  • simplicity enables higher performance at lower cost

2. Smaller is faster

  • the reason that MIPS has 32 registers rather than many more

Overview of Design Principles

3. Make the common case fast

  • PC-relative addressing for conditional branch
  • immediate addressing for constant operands

4. Good design demands good compromises

  • compromise between larger addresses and keeping instructions same length

MIPS Instructions
  • Design Principle 1: Simplicity favors regularity
  • The MIPS assembly language instruction

add a, b, c means a = b + c

  • Each line represents one instruction
  • Each instruction has exactly 3 operands for simplicity
  • There is one operation per MIPS instruction
  • Instructions are related to operations (=, +, -, *, /) in C or Java

§2.2 Operations of the Computer Hardware

Arithmetic Operations- Addition
  • The MIPS assembly language instruction

add a, b, c means a = b+c

  • This sequence adds four variables (a=b+c+d+e)

add a, b, c # the sum of b and c is placed in a

add a, a, d # the sum of b,c, and d is now in a

add a, a, e # the sum of b,c,d and e is now in a

  • Notice that it takes 3 instructions to add four variables

MIPS Addition and Subtraction
  • Syntax of Instructions:

1 2,3,4

where:

1) operation by name

2) operand getting result (“destination”)

3) 1st operand for operation (“source1”)

4) 2nd operand for operation (“source2”)

  • Syntax is rigid:
    • 1 operator, 3 operands
    • Why? Keep Hardware simple via regularity

MIPS Addition and Subtraction of Integers
  • Addition in Assembly
    • Example: add $s0,$s1,$s2 (in MIPS)

Equivalent to: a = b + c (in C/Java)

where MIPS registers $s0,$s1,$s2 are associated with C variables a, b, c

  • Subtraction in Assembly
    • Example: sub $s3,$s4,$s5 (in MIPS)

Equivalent to: d = e - f (in C)

where MIPS registers $s3,$s4,$s5 are associated with C variables d, e, f

Addition and Subtraction
  • How would MIPS do this C/Java statement?

a = b + c + d - e;

  • Break into multiple instructions

add $t0, $s1, $s2 # temp = b + c

add $t0, $t0, $s3 # temp = temp + d

sub $s0, $t0, $s4 # a = temp - e

  • Notice: A single line of C or Java may break up into several lines of MIPS. Everything after the hash mark (#) on each line is a comment and is ignored.

Compiling C into MIPS
  • How do we do this?
  • C code:

f = (g + h) - (i + j);

  • Use intermediate temporary registers:
  • Compiled MIPS pseudocode:

add t0, g, h # temp t0 = g + h

add t1, i, j # temp t1 = i + j

sub f, t0, t1 # f = t0 - t1

  • Comments are to the right of the #
  • Each line contains at most one instruction

C, Java Variables vs. Registers
  • In C (and most High Level Languages) variables are declared first and given a type
    • Example: int fahr, celsius; char a, b, c, d, e;
  • Each variable can ONLY represent a value of the declared type (cannot mix and match int and char variables).
  • In Assembly Language, the registers have no type; operation determines how register contents are treated

Operands of the Computer Hardware
  • Operands of arithmetic instructions must be from a limited number of special memory locations called registers
  • The size of a MIPS register is 32 bits, called a word (although there is a 64-bit version).
  • The major difference between variables in a programming language (unlimited) and registers is the limited number of registers, typically 32 in MIPS.

§2.3 Operands of the Computer Hardware

Operands of the Computer Hardware
  • Design Principle 2: Smaller is faster
    • Very large number of registers may increase clock cycle time because it takes electronic signals longer to travel farther.
    • Using more than 32 registers would require a different instruction format.
    • MIPS register convention is to use two character names following a dollar sign:

$s0, $s1… for variables

$t0, $t1… for temporary locations

$a0, $a1… for arguments

Compiling C into MIPS Using Registers
  • C code: (similar to previous example)

f = (g + h) - (i + j);

  • where f, g, h, i, j are assigned to registers $s0, $s1, $s2, $s3 and $s4 respectively:
  • Compiled MIPS code:

add $t0,$s1,$s2 # register $t0 = g + h

add $t1,$s3,$s4 # register $t1 = i + j

sub $s0,$t0,$t1 # f gets $t0 - $t1

  • Variables have been replaced with registers
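
The three-instruction sequence for f = (g + h) - (i + j) can be traced with a small Python sketch. This is only an illustration: the dictionary `regs`, the helper names `add` and `sub`, and the sample values for g, h, i, j are assumptions made for the example, and overflow behavior is not modeled.

```python
# Hypothetical register contents: f=$s0, g=$s1, h=$s2, i=$s3, j=$s4
regs = {"$s0": 0, "$s1": 10, "$s2": 20, "$s3": 3, "$s4": 4, "$t0": 0, "$t1": 0}

def add(rd, rs, rt):
    """add rd, rs, rt  ->  rd = rs + rt"""
    regs[rd] = regs[rs] + regs[rt]

def sub(rd, rs, rt):
    """sub rd, rs, rt  ->  rd = rs - rt"""
    regs[rd] = regs[rs] - regs[rt]

add("$t0", "$s1", "$s2")   # $t0 = g + h
add("$t1", "$s3", "$s4")   # $t1 = i + j
sub("$s0", "$t0", "$t1")   # f = $t0 - $t1

print(regs["$s0"])         # (10 + 20) - (3 + 4) = 23
```

Each call touches exactly three registers, mirroring the rigid one-operator, three-operand instruction syntax.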

Register Operands
  • Arithmetic instructions use register operands
  • MIPS has a 32 × 32-bit register file (32 registers, each 32 bits)

    • Use for frequently accessed data
    • Numbered 0 to 31
    • 32-bit data called a “word”

Memory Operands
  • Programming Languages have both simple variables and complex data structures.
  • How can we handle large data structures with just a few registers?
    • Data structures are kept in memory.
  • MIPS includes instructions to transfer data between memory and registers.
    • Data transfer instructions ( load, store)

Memory Operands
  • Data transfer Instruction
    • load copies data from memory to register
    • lw - load word
  • Format

opcode register, constant(register)

where constant(register) gives the memory address

  • Syntax

lw $t0, 8($s3)

where 8 is the offset and $s3 holds the base address


Memory Addressing

Since 1980, almost every machine has used addresses to the level of 8 bits (a byte).

Two questions for the design of an instruction set architecture:

  • Since we could read a 32-bit word as four loads of bytes from sequential byte addresses, or as one load word from a single byte address, how do byte addresses map onto words?
  • Can a word be placed on any byte boundary?

Addressing Objects: Alignment
  • Since 8-bit bytes are useful, most architectures address individual bytes.
  • Address of a word matches the address of one of the four bytes in the word
  • Addresses of sequential words differ by 4 bytes
  • MIPS words must start at addresses that are multiples of 4 - called alignment restriction
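
The alignment restriction comes down to one remainder test. A minimal Python sketch, assuming 4-byte words; the name `is_word_aligned` is made up for this illustration and is not part of MIPS or any library:

```python
WORD_SIZE = 4  # bytes in a 32-bit MIPS word

def is_word_aligned(byte_address):
    """True if byte_address is a multiple of 4, i.e. a legal word address."""
    return byte_address % WORD_SIZE == 0

# Addresses of sequential words differ by 4 and are all aligned.
word_addresses = [0, 4, 8, 12]
print([is_word_aligned(a) for a in word_addresses])  # [True, True, True, True]
print(is_word_aligned(10))                           # False
```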

Memory Operands
  • Arithmetic operations occur on registers
  • More complex data structures (arrays and structures) are kept in memory
  • MIPS must include instructions that transfer data between memory and registers (called data transfer instructions)

  • To access a word in memory, the instruction must include the memory address.

Memory Operands
  • Main memory used for composite data
    • Arrays, structures, dynamic data
  • To apply arithmetic operations
    • Load values from memory into registers
    • Store result from register to memory
  • Memory is byte addressed
    • Each address identifies an 8-bit byte
  • Words are aligned in memory
    • Address must be a multiple of 4
  • MIPS is Big Endian
    • Most-significant byte at least address of a word
    • c.f. Little Endian: least-significant byte at least address

Addressing Objects: Endianness
  • Computers are grouped into those that use:
    • the address of the leftmost or “big end byte” as the word address
    • and those that use the “little end” or rightmost byte
  • MIPS is in the BIG Endian group

Addressing Objects: “Endianness” and Alignment
  • Big Endian: the word address is the address of the most significant byte: IBM 360/370, Motorola 68k, MIPS, Sparc, HP PA
  • Little Endian: the word address is the address of the least significant byte: Intel 80x86, DEC Vax, DEC Alpha (Windows NT)
  • Alignment: require that objects fall on an address that is a multiple of their size.

[Figure: byte numbering within a word from msb to lsb. Little endian: byte 0 is at the lsb end (3 2 1 0). Big endian: byte 0 is at the msb end (0 1 2 3). Examples of aligned and not-aligned objects.]

Big Endian
  • "Big Endian" means that the high-order (most significant) byte of the number is stored in memory at the lowest address, and the low-order (least significant) byte at the highest address. (The “big end” comes first.)
  • A LongInt would then be stored as:

Base Address+0 Byte3 (the “big end”)

Base Address+1 Byte2

Base Address+2 Byte1

Base Address+3 Byte0 (the “little end”)

  • Motorola processors (those used in Macs) and mainframes use "Big Endian" byte order.
  • http://www.cs.umass.edu/~Verts/cs32/endian.html

Little Endian
  • "Little Endian" means that the low-order byte of the number is stored in memory at the lowest address, and the high-order byte at the highest address. (The little end comes first.)
  • A 4 byte LongInt Byte3 Byte2 Byte1 Byte0 will be arranged in memory as follows:

Base Address+0 Byte0

Base Address+1 Byte1

Base Address+2 Byte2

Base Address+3 Byte3

  • Intel processors (those used in PCs) use "Little Endian" byte order.
  • http://www.cs.umass.edu/~Verts/cs32/endian.html

Big Endian and Little Endian
  • To represent the value 1025 (as a 4 byte integer):

00000000 00000000 00000100 00000001

Address   Big-Endian   Little-Endian
00        00000000     00000001
01        00000000     00000100
02        00000100     00000000
03        00000001     00000000

If the string UNIX were stored as two 2-byte words, then in a Big-Endian system it would be stored as UNIX; in a Little-Endian system it would be stored as NUXI. (See http://www.webopedia.com/TERM/B/big_endian.html )
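
The two byte orders for the value 1025 can be reproduced with Python's built-in `int.to_bytes` and `int.from_bytes`. This is only a sketch of the idea; the helper `as_bits` is a name introduced here for display purposes.

```python
value = 1025  # 0x00000401

big = value.to_bytes(4, byteorder="big")        # most significant byte at lowest address
little = value.to_bytes(4, byteorder="little")  # least significant byte at lowest address

def as_bits(data):
    """Format each byte as 8 binary digits, lowest address first."""
    return [format(b, "08b") for b in data]

print("big-endian:   ", as_bits(big))
print("little-endian:", as_bits(little))

# Reading the bytes back with the matching byte order recovers the value;
# reading them with the wrong order does not.
print(int.from_bytes(big, "big"), int.from_bytes(big, "little"))
```

The same four bytes are present in both layouts; only the address at which each one sits changes.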

Big and Little Endian
  • Both have advantages and disadvantages
  • In "Big Endian" form, by having the high-order byte come first, you can test whether the number is positive or negative by looking at the byte at offset zero.
  • In "Little Endian" form, assembly language instructions for picking up a 1, 2, 4, or longer byte number proceed in exactly the same way for all formats and multiple precision math routines are correspondingly easy to write.
  • http://www.cs.umass.edu/~Verts/cs32/endian.html


MIPS I Registers

  • Programmable storage
    • 2^32 bytes of memory
    • 32 x 32-bit general-purpose registers (GPRs), r0-r31, with R0 = 0
    • Each register is 32 bits “wide”

Memory Addresses and Contents

[Figure: the Processor sends an Address to Memory and receives Data; memory locations 0, 1, 2, and 3 hold the values 1, 101, 10, and 100.]

The address of the third data element is 2 and the contents of Memory[2] is 10.

Registers vs. Memory
  • Registers are faster to access than memory
  • Operating on memory data requires loads and stores
    • More instructions to be executed
  • Compiler must use registers for variables as much as possible
    • Only spill to memory for less frequently used variables
    • Register optimization is important!

Arrays and Data Structures
  • C and Java variables map onto registers; what about large data structures like arrays?
  • One of the five components of a computer, the memory, contains such data structures
  • But MIPS arithmetic instructions only operate on registers, never directly on memory.
  • Data transfer instructions transfer data between registers and memory:
    • Memory to register (Load)
    • Register to memory (Store)


Anatomy: 5 components of any Computer

Registers are in the datapath of the processor; if operands are in memory, we must transfer them to the processor to operate on them, and then transfer back to memory when done.

[Figure: the five components of a (personal) computer: the Processor, made up of Control (the “brain”) and the Datapath (which contains the Registers); Memory; and the Input and Output devices. Arrows between processor and memory are labeled Store (to) and Load (from).]

These are “data transfer” instructions…


Data Transfer: Memory to Registers

  • To transfer a word of data, we need to specify two things:
    • Register: specify this by # ($0 - $31) or symbolic name ($s0,…, $t0, …)
    • Memory address: more difficult
      • Think of memory as a single one-dimensional array, so we can address it simply by supplying a pointer to a memory address.
      • Other times, we want to be able to offset from this pointer.
  • Remember: “Load FROM memory”


Data Transfer: Memory to Registers

  • To specify a memory address to copy from, specify two things:
    • A register containing a pointer to memory
    • A numerical offset (in bytes)
  • The desired memory address is the sum of these two values.
  • Example: 8($t0)
    • specifies the memory address pointed to by the value in $t0, plus 8 bytes


Data Transfer: Memory to Register

  • Load Instruction Syntax:

1 2, 3(4)

lw $t0,12($s0)

where

1) operation name

2) register that will receive value

3) numerical offset in bytes

4) register containing pointer to memory

  • MIPS Instruction Name:
    • lw (meaning Load Word, so 32 bits or one word are loaded at a time)


Data Transfer: Memory to Register


Example: lw $t0,12($s0)

This instruction will take the pointer in $s0, add 12 bytes to it, and then load the value from the memory pointed to by this calculated sum into register $t0

  • Notes:
    • $s0 is called the base register
    • 12 is called the offset
    • offset is generally used in accessing elements of array or structure: base register points to beginning of array or structure


Data Transfer: Register to Memory

  • Also want to store from register into memory
    • Store instruction syntax is identical to Load’s
  • MIPS Instruction Name:

sw (meaning Store Word, so 32 bits or one word are stored at a time)

  • Example: sw $t0,12($s0)

This instruction will take the pointer in $s0, add 12 bytes to it, and then store the value from register $t0 into that memory address

  • Remember: “Store INTO memory”
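
The address calculation shared by lw and sw (base register plus byte offset) can be sketched in Python. Everything here is an assumption for illustration: the 64-byte `memory`, the starting pointer value 16 in $s0, and the helper names `lw` and `sw`. Big-endian byte order is used to match the slides.

```python
memory = bytearray(64)            # a tiny byte-addressed memory
regs = {"$s0": 16, "$t0": 0}      # $s0 holds a pointer (base address)

def lw(rt, offset, base):
    """lw rt, offset(base): load the word at address regs[base] + offset."""
    addr = regs[base] + offset
    assert addr % 4 == 0, "word accesses must be aligned"
    regs[rt] = int.from_bytes(memory[addr:addr + 4], "big")

def sw(rt, offset, base):
    """sw rt, offset(base): store regs[rt] at address regs[base] + offset."""
    addr = regs[base] + offset
    assert addr % 4 == 0, "word accesses must be aligned"
    memory[addr:addr + 4] = (regs[rt] & 0xFFFFFFFF).to_bytes(4, "big")

regs["$t0"] = 1025
sw("$t0", 12, "$s0")   # sw $t0,12($s0): bytes 28..31 now hold 1025
regs["$t0"] = 0
lw("$t0", 12, "$s0")   # lw $t0,12($s0): $t0 is loaded back from bytes 28..31
print(regs["$t0"])     # 1025
```

Load copies FROM memory into the register; store copies the register INTO memory. In both cases the register in parentheses only supplies the address.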

Data Transfer Instructions- Load
  • Load copies data from memory to a register; in MIPS this is lw (load word)
  • Format: the operation name followed by the register to be loaded, then a constant and a register used to access memory
  • The sum of the constant portion of the instruction and the contents of the second register forms the memory address

Data Transfer Instructions- Load
  • Assume A is an array of 100 words, with a starting or base address in $s3
  • Let g, h be variables associated with $s1,$s2
  • C Assignment Statement:

g = h + A[8];

  • Compiling to MIPS with operand in Memory
  • First transfer A[8] to a register: (use load word - lw)

lw $t0, 8($s3) #Temp register $t0 gets A[8]

add $s1,$s2,$t0 # g = h + A[8]

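  • The constant in the instruction is called the offset, and the register ($s3) added to form the address is called the base register. The offset 8 is a simplification for now: because MIPS memory is byte addressed, the actual offset for A[8] is 4 x 8 = 32.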

Actual MIPS Memory Addresses and Contents

[Figure: the Processor sends an Address to Memory and receives Data; the words at byte addresses 0, 4, 8, and 12 hold the values 1, 101, 10, and 100.]

Since MIPS addresses each byte, word addresses are multiples of 4; there are 4 bytes in a word. Byte address of third word is 8.

Data Transfer Instructions- Store
  • The instruction complementary to load is called store, or store word (sw), which copies data from a register to memory
  • Format similar to load instruction: name of operation, followed by the register to be stored, then offset to select the element, and finally the base register.

Data Transfer Instructions- Store
  • Assume variable h is associated with register $s2
  • Base address of array A is in $s3.
  • C code: A[12] = h + A[8];
  • MIPS code:

lw $t0, 32($s3) # temp reg $t0 gets A[8]

add $t0, $s2, $t0 # temp reg $t0 gets h + A[8]

sw $t0, 48($s3) # stores h + A[8] into A[12]

MIPS Memory Addressing
  • Most architectures address individual bytes; therefore the address of a word matches the address of one of the 4 bytes within the word.
  • Addresses of sequential words differ by 4.
  • In MIPS, words must start at addresses that are multiples of 4; this is called the alignment restriction.
  • Remember MIPS is “big endian”
  • Byte addressing affects the array index.
  • Offset to be added to the base register $s3 (in previous example) must be (4 x 8) or 32.
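
The index-to-offset rule is a single multiplication. A minimal Python sketch, assuming an array of 4-byte words; `byte_offset` and `element_address` are illustrative names, and the base address 1000 is an arbitrary example value:

```python
WORD_SIZE = 4  # bytes per word

def byte_offset(index):
    """Byte offset of A[index] from the start of a word array."""
    return WORD_SIZE * index

def element_address(base, index):
    """Byte address of A[index] when A starts at byte address base."""
    return base + byte_offset(index)

print(byte_offset(8))             # 32, the offset used in lw $t0, 32($s3)
print(byte_offset(12))            # 48, the offset used in sw $t0, 48($s3)
print(element_address(1000, 5))   # 1020
```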

Constants or Immediate Operands
  • Design Principle 3: Make the common case FAST
  • Constants occur frequently. Including them directly in arithmetic instructions is faster than loading them from memory, which would require an instruction such as:

lw $t0, AddrConstant4($s1) # $t0 = constant 4

  • To add 4 to register $s3, use add immediate (addi) instead:

addi $s3, $s3, 4 # $s3 = $s3+4

  • Since MIPS supports negative constants, there is no need for a subtract immediate instruction.


Pointers v. Values

  • Key Concept:
  • A register can hold any 32-bit value. That value can be a (signed) int, an unsigned int, a pointer (memory address), and so on
  • If you write add $t2,$t1,$t0 then $t0 and $t1 must contain values
  • If you write lw $t2,0($t0) then $t0 must contain a pointer to memory
  • Don’t mix these up!


Addressing: Byte vs. Word

  • Every word in memory has an address (called the “address” of a word), similar to an index in an array
  • Early computers numbered words like C numbers elements of an array:
    • Memory[0], Memory[1], Memory[2], …
  • Computers needed to access 8-bit bytes as well as words (4 bytes/word)
  • Today machines address memory as bytes (i.e., “Byte Addressed”), hence 32-bit (4 byte) word addresses differ by 4
    • Memory[0], Memory[4], Memory[8], …

Immediates
  • Immediates are numerical constants.
  • They appear often in code, so there are special instructions for them.
  • Add Immediate:

addi $s0,$s1,10 (in MIPS)

f = g + 10 (in C)

where MIPS registers $s0,$s1 are associated with C or Java variables f, g

  • Syntax similar to add instruction, except that the last argument is a number instead of a register.

Immediates and Subtraction
  • There is no Subtract Immediate in MIPS: Why?
  • Limit types of operations that can be done to absolute minimum
    • negative constants are less frequent
    • if an operation can be decomposed into a simpler operation, don’t include it
    • addi …, -X = subi …, X => no need for subi
  • addi $s0,$s1,-10 (in MIPS)

f = g - 10 (in C)

where MIPS registers $s0, $s1 are associated with C or Java variables f, g

Register Zero
  • One particular immediate, the number zero (0), appears very often in code.
  • So we define register zero ($0 or $zero) to always have the value 0; for example:

add $s0,$s1,$zero (in MIPS)

f = g (in C)

where MIPS registers $s0, $s1 are associated with C variables f, g

  • Defined in hardware, so an instruction

add $zero,$zero,$s0

will not do anything!
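
The hard-wired zero register can be modeled in a few lines of Python. This is a sketch under stated assumptions: the class `RegisterFile` and the helper `add` are invented for the example, values are simply truncated to 32 bits, and overflow trapping is not modeled. The register numbers 0, 16, and 17 for $zero, $s0, and $s1 follow the MIPS convention.

```python
class RegisterFile:
    """32 registers of 32 bits; register 0 always reads as zero."""

    def __init__(self):
        self._regs = [0] * 32

    def read(self, number):
        return self._regs[number]

    def write(self, number, value):
        if number == 0:
            return                               # writes to $zero are discarded
        self._regs[number] = value & 0xFFFFFFFF  # keep only 32 bits

def add(rf, rd, rs, rt):
    """add rd, rs, rt"""
    rf.write(rd, rf.read(rs) + rf.read(rt))

ZERO, S0, S1 = 0, 16, 17

rf = RegisterFile()
rf.write(S1, 42)           # g = 42
add(rf, S0, S1, ZERO)      # add $s0,$s1,$zero   ->  f = g
add(rf, ZERO, ZERO, S0)    # add $zero,$zero,$s0 ->  no effect
print(rf.read(S0), rf.read(ZERO))   # 42 0
```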

Summarizing...
  • In MIPS Assembly Language:
    • Registers replace C variables
    • One Instruction (simple operation) per line
    • Simpler is Better
    • Smaller is Faster
  • New Instructions:

add, addi, sub # arithmetic operations

lw, sw # load, store from/to memory

  • New Registers:

C or Java Variables: $s0 - $s7

Temporary Variables: $t0 - $t9

Zero: $zero


Compilation with Memory

  • What offset in lw to select A[5] in C/Java?
  • 4x5=20 to select A[5]: byte v. word
  • Compile by hand using registers:

g = h + A[5];

where g: $s1, h: $s2, $s3: base address of A

  • 1st transfer from memory to register:

lw $t0,20($s3) # $t0 gets A[5]

    • Add 20 to $s3 to select A[5], put into $t0
  • Next add it to h and place in g

add $s1,$s2,$t0 # $s1 = h+A[5]

MIPS Instruction Encoding

MIPS Assembler Register Convention
  • “caller saved”
  • “callee saved”
  • On Green Card in Column #2 at bottom


Notes about Memory

  • Pitfall:
  • Forgetting that sequential word addresses in machines with byte addressing do not differ by 1.
    • Many an assembly language programmer has toiled over errors made by assuming that the address of the next word can be found by incrementing the address in a register by 1 instead of by the word size in bytes.
    • So remember that for both lw and sw, the sum of the base address and the offset must be a multiple of 4 (to be word aligned)

Memory Operand Example 1
  • C code:

g = h + A[8];

    • g in $s1, h in $s2, base address of A in $s3
  • Compiled MIPS code:
    • Index 8 requires offset of 32
      • 4 bytes per word

lw $t0, 32($s3) # load word: offset 32, base register $s3

add $s1, $s2, $t0

Memory Operand Example 2
  • C code:

A[12] = h + A[8];

    • Variable h in $s2, base address of A in $s3
  • Compiled MIPS code:
    • Index 8 requires offset of 32

lw $t0, 32($s3) # load word

add $t0, $s2, $t0

sw $t0, 48($s3) # store word

Unsigned Binary Integers
  • Computers store numbers as binary digits (bits)
  • Given an n-bit number: x = x_{n-1}×2^{n-1} + x_{n-2}×2^{n-2} + … + x_1×2^1 + x_0×2^0

§2.4 Signed and Unsigned Numbers

  • Range: 0 to +2^n – 1
  • Example

0000 0000 0000 0000 0000 0000 0000 1011 (base 2) = 0 + … + 1×2^3 + 0×2^2 + 1×2^1 + 1×2^0 = 0 + … + 8 + 0 + 2 + 1 = 11 (base 10)

  • Using 32 bits

0 to +4,294,967,295
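
The positional sum for unsigned numbers can be checked with a short Python sketch. The function name `unsigned_value` is made up for this example; bit strings are written most significant bit first, and spaces are ignored.

```python
def unsigned_value(bits):
    """Sum bit_i * 2^i over a string of '0'/'1' characters (spaces ignored)."""
    digits = bits.replace(" ", "")
    # enumerate(reversed(...)) pairs each digit with its power of two,
    # starting from the least significant (rightmost) bit.
    return sum(int(d) << i for i, d in enumerate(reversed(digits)))

print(unsigned_value("0000 0000 0000 0000 0000 0000 0000 1011"))  # 11
print(unsigned_value("1" * 32))                                   # 4294967295
```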

Binary Numbers
  • The MIPS word is 32 bits so we can represent 2^32 different values.
  • Least significant bit refers to the rightmost bit
  • Most significant bit is the leftmost bit.
  • Sign and magnitude uses a separate sign bit to distinguish positive and negative numbers. Not used because of difficulty with arithmetic…

Two’s Complement Representation
  • Makes hardware representation simple:
  • Leading zero (0) means positive, leading one (1) means negative; this leading bit is called the sign bit
  • All negative numbers begin with a 1.
  • Has one negative number, –2,147,483,648 (base 10), that does not have a corresponding positive number.

Two’s Complement Representation
  • To form the negation of a binary number
    • Invert all bits to form the complement
    • Add one

For example, to negate binary 28

00011100 Binary 28

- Invert the digits. (0 becomes 1, 1 becomes 0)

11100011 Then we add 1.

+ 1

11100100 Binary -28

For more information see:

http://www.cs.cornell.edu/~tomf/notes/cps104/twoscomp.html

Two’s Complement Representation
  • Going in the opposite direction- taking the negation and transforming it into the positive binary number
    • Invert all bits to form the complement
    • Add one

For example, to negate binary -28

11100100 Binary -28

- Invert the digits. (0 becomes 1, 1 becomes 0)

00011011 Then we add 1.

+ 1

00011100 Binary 28

  • This works because the sum of a number and its bitwise inverse has every bit set to 1, which is the representation of –1

x + x̄ = –1 (where x̄ is x with all bits inverted), so –x = x̄ + 1
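
The invert-and-add-one rule can be tried out in Python. This is a sketch using 8-bit patterns to match the worked example; `negate` is an illustrative name, and the mask is needed because Python integers are not fixed-width.

```python
def negate(x, bits=8):
    """Two's complement negation: invert every bit, then add one."""
    mask = (1 << bits) - 1
    inverted = ~x & mask          # flip all bits within the given width
    return (inverted + 1) & mask

x = 0b00011100                    # binary 28
neg = negate(x)
print(format(neg, "08b"))         # 11100100, the pattern for -28
print(format(negate(neg), "08b")) # 00011100, back to 28

# A pattern plus its inverse is all ones, the pattern for -1.
all_ones = x + (~x & 0xFF)
print(format(all_ones, "08b"))    # 11111111
```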

2’s Complement Simulator
  • Try it with a simulator:
  • http://scholar.hw.ac.uk/site/computing/activity12.asp?outline

2s-Complement Signed Integers Example
  • Given an n-bit number represented as x = –x_{n-1}×2^{n-1} + x_{n-2}×2^{n-2} + … + x_1×2^1 + x_0×2^0
  • Range: –2^{n-1} to +2^{n-1} – 1
  • Example

1111 1111 1111 1111 1111 1111 1111 1100 (base 2) = –1×2^31 + 1×2^30 + … + 1×2^2 + 0×2^1 + 0×2^0 = –2,147,483,648 + 2,147,483,644 = –4 (base 10)

  • Using 32 bits

–2,147,483,648 to +2,147,483,647
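
The signed interpretation gives the top bit a negative weight and treats the remaining bits as usual. A minimal Python sketch of that reading; `to_signed` is a name chosen for this example only:

```python
def to_signed(pattern, bits=32):
    """Value of a two's complement pattern: the top bit weighs -2^(bits-1)."""
    pattern &= (1 << bits) - 1                # keep only the low `bits` bits
    top = pattern >> (bits - 1)               # the sign bit (0 or 1)
    rest = pattern & ((1 << (bits - 1)) - 1)  # the remaining bits, weighted normally
    return -top * (1 << (bits - 1)) + rest

print(to_signed(0xFFFFFFFC))   # -4
print(to_signed(0x7FFFFFFF))   # 2147483647  (most positive)
print(to_signed(0x80000000))   # -2147483648 (most negative)
```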

2s-Complement Signed Integers
  • Bit 31 is sign bit
    • 1 for negative numbers
    • 0 for non-negative numbers
  • –(–2^{n-1}) can’t be represented
  • Non-negative numbers have the same unsigned and 2s-complement representation
  • Some specific numbers:
    • 0: 0000 0000 … 0000
    • –1: 1111 1111 … 1111
    • Most-negative: 1000 0000 … 0000
    • Most-positive: 0111 1111 … 1111

More Examples
  • References for Two’s Complement notation
  • http://www.duke.edu/~twf/cps104/twoscomp.html
  • http://en.wikipedia.org/wiki/Two's_complement
  • http://mathforum.org/library/drmath/sets/select/dm_twos_complement.html
  • http://www.fact-index.com/t/tw/two_s_complement.html
  • http://www.hal-pc.org/~clyndes/computer-arithmetic/twoscomplement.html
  • http://www.vb-helper.com/tutorial_twos_complement.html
  • http://web.bvu.edu/faculty/traylor/CS_Help_Stuff/Two's%20Complement.htm

Sign Extension
  • Representing a number using more bits
    • Preserve the numeric value
  • In MIPS instruction set
    • addi: extend immediate value
    • lb, lh: extend loaded byte/halfword
    • beq, bne: extend the displacement
  • Replicate the sign bit to the left
    • c.f. unsigned values: extend with 0s
  • Examples: 8-bit to 16-bit
    • +2: 0000 0010 => 0000 0000 0000 0010
    • –2: 1111 1110 => 1111 1111 1111 1110
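
Sign extension replicates the sign bit into the new high-order positions, while zero extension fills them with 0s. A Python sketch of both, with the function names `sign_extend` and `zero_extend` introduced only for this illustration:

```python
def sign_extend(value, from_bits, to_bits):
    """Widen a from_bits-wide two's complement pattern to to_bits bits."""
    low_mask = (1 << from_bits) - 1
    value &= low_mask                          # keep only the original bits
    if value & (1 << (from_bits - 1)):         # sign bit set: fill new bits with 1s
        value |= ((1 << to_bits) - 1) & ~low_mask
    return value

def zero_extend(value, from_bits):
    """Unsigned widening: the new high-order bits are all 0."""
    return value & ((1 << from_bits) - 1)

print(format(sign_extend(0b00000010, 8, 16), "016b"))  # 0000000000000010 (+2)
print(format(sign_extend(0b11111110, 8, 16), "016b"))  # 1111111111111110 (-2)
print(format(zero_extend(0b11111110, 8), "016b"))      # 0000000011111110 (254)
```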

Representing Instructions
  • Instructions are encoded in binary
    • Called machine code
  • MIPS instructions
    • Encoded as 32-bit instruction words
    • Small number of formats encoding operation code (opcode), register numbers, …
    • Regularity!
  • Register numbers ( important!)
    • $t0 – $t7 are registers 8 – 15
    • $t8 – $t9 are registers 24 – 25
    • $s0 – $s7 are registers 16 – 23

§2.5 Representing Instructions in the Computer

R-Format Instructions
  • Define “fields” of the following number of bits each: 6 + 5 + 5 + 5 + 5 + 6 = 32

    6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits

  • For simplicity, each field has a name:

    opcode | rs     | rt     | rd     | shamt  | funct

R-Format Instructions
  • Meaning of fields:
    • rs (Source Register): generally used to specify register containing first operand
    • rt (Target Register): generally used to specify register containing second operand (note that name is misleading)
    • rd (Destination Register): generally used to specify register which will receive result of computation
    • shamt (Shift amount)
    • funct (Function): selects the specific variant of the opcode operation; sometimes called the function code

MIPS R-Format Instructions - Summary

    op     | rs     | rt     | rd     | shamt  | funct
    6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits

  • MIPS fields are given names to make them easier to remember
  • Instruction fields
    • op: operation code (opcode)
    • rs: first source register number
    • rt: second source register number
    • rd: destination register number
    • shamt: shift amount (00000 for now)
    • funct: function code (extends opcode)

R-format Example

add $t0, $s1, $s2

Instruction format or layout:

              op      | rs     | rt     | rd     | shamt  | funct
    width:    6 bits  | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits
    fields:   special | $s1    | $s2    | $t0    | 0      | add
    decimal:  0       | 17     | 18     | 8      | 0      | 32
    binary:   000000  | 10001  | 10010  | 01000  | 00000  | 100000

MIPS instruction:

0000 0010 0011 0010 0100 0000 0010 0000two = 02324020hex
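
The packing of the six R-format fields into one 32-bit word can be sketched in C. This is a minimal illustration, not an assembler; the name encode_r is hypothetical, and each argument is masked to its field width so that stray high bits are dropped.

```c
#include <stdint.h>

/* Pack the R-format fields (6 + 5 + 5 + 5 + 5 + 6 bits) into a 32-bit word.
   op occupies bits 31-26, rs 25-21, rt 20-16, rd 15-11, shamt 10-6,
   and funct 5-0. */
uint32_t encode_r(uint32_t op, uint32_t rs, uint32_t rt,
                  uint32_t rd, uint32_t shamt, uint32_t funct)
{
    return ((op    & 0x3Fu) << 26) |
           ((rs    & 0x1Fu) << 21) |
           ((rt    & 0x1Fu) << 16) |
           ((rd    & 0x1Fu) << 11) |
           ((shamt & 0x1Fu) <<  6) |
            (funct & 0x3Fu);
}
```

With the field values of add $t0, $s1, $s2 (op 0, rs 17, rt 18, rd 8, shamt 0, funct 32), encode_r returns 0x02324020, matching the hexadecimal word shown on the slide.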

Hexadecimal
  • Base 16
    • Compact representation of bit strings
    • 4 bits per hex digit
  • Example: e c a 8 6 4 2 0

1110 1100 1010 1000 0110 0100 0010 0000

Why Multiple Instruction Formats?
  • Design Principle 4: Good design demands good compromises -
    • there is a need to keep instructions the same length and desire for a single format
  • There is a problem using previous (R-format) when an instruction needs longer fields
    • for example, lw must specify two registers and a constant, but the constant would have only 5 bits available, so it could take only 2^5 = 32 distinct values
  • Solution: allow I and J formats for different instructions - but keep all the same length = 32 bits

Overview of MIPS

  • Simple instructions, all 32 bits wide
  • Very structured, no unnecessary baggage
  • Only three instruction formats:

    R | op | rs | rt | rd | shamt | funct
    I | op | rs | rt | 16 bit address
    J | op | 26 bit address

  • Rely on compiler to achieve performance. What are the compiler's goals?
  • Help compiler where we can

Additional MIPS Instruction Formats
  • I-format: used for instructions with immediates, lw and sw (since the offset counts as an immediate), and the branches (beq and bne),
    • (but not the shift instructions; later)
  • J-format: used for j and jal (jump and link)
  • R-format: used for all other instructions
  • It will soon become clear why the instructions have been partitioned in this way.

MIPS I-format Instructions

    op     | rs     | rt     | constant or address
    6 bits | 5 bits | 5 bits | 16 bits

  • Format for immediate arithmetic and load/store instructions
    • rt: destination or source register number
    • Constant: –2^15 to +2^15 – 1
    • Address: offset added to base address in rs

Instruction Format Names and Field Descriptions

    Name                | Instruction Fields                                                                            | Comments
                        | 6 bits (31-26) | 5 bits (25-21) | 5 bits (20-16) | 5 bits (15-11) | 5 bits (10-6) | 6 bits (5-0) | All MIPS instructions 32 bits
    R-format (6 fields) | op | rs | rt | rd | shamt | funct                                                             | Arithmetic instruction format
    I-format (4 fields) | op | rs | rt | Address/immediate                                                             | Data transfer, branch, immediate instruction format
    J-format (2 fields) | op | Target address                                                                          | Jump instruction format

Instruction field notes:

The op and funct fields form the opcode. The rs field gives a source register, and rt is also normally a source register. rd is the destination register, and shamt supplies the shift amount for logical shift operations.

R-Format Example
  • MIPS Instruction:

add $8,$9,$10

Decimal number per field representation:

    0 | 9 | 10 | 8 | 0 | 32

Binary number per field representation:

    000000 | 01001 | 01010 | 01000 | 00000 | 100000

hex representation: 012A 4020hex

decimal representation: 19,546,144ten

On Green Card: Format in column 1, opcodes in column 3

Green Card
  • green card /n./ [after the "IBM System/360 Reference Data" card] A summary of an assembly language, even if the color is not green. For example, "I'll go get my green card so I can check the addressing mode for that instruction."

www.jargon.net

Image from Dave's Green Card Collection: http://www.planetmvs.com/greencard/

J-Format Instructions
  • Define “fields” of the following number of bits each:

    6 bits | 26 bits

  • As usual, each field has a name:

    opcode | target address

  • Key Concepts
    • Keep opcode field identical to R-format and I-format for consistency.
    • Combine all other fields to make room for large target address.

Translating Assembly Language into Machine Language
  • Suppose $t1 has base of array A and $s2 corresponds to h in the assignment

A[300] = h + A[300]

  • In MIPS : ( try this )

lw $t0, 1200($t1)   # temp register $t0 gets A[300]

add $t0, $s2, $t0   # temp register $t0 gets h + A[300]

sw $t0, 1200($t1)   # stores h + A[300] back into A[300]

These instructions can then be represented in machine language…

Translating MIPS Assembly Language into Machine Language

For the lw instruction, the opcode is 35, the base register $t1 is register 9 (rs field), and the destination register $t0 is register 8 (rt field). The offset 1200 = 300 × 4 goes in the address field.

The add instruction is specified by 0 in the op field and 32 in the funct field.

The sw instruction has opcode 43; the rest of its fields are the same as for the lw instruction.

See the summary on page 101.

Translating MIPS Assembly Language into Machine Language

Since 1200ten = 0000 0100 1011 0000two , the binary equivalent of the previous form is:

  • Notice the similarity in the first and last instructions. The only difference is in the third bit from the left.
  • This similarity simplifies hardware design…
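
As a hedged sketch, the lw and sw words for this example can be assembled from the field values stated in the text (opcode 35 or 43, base register 9, target register 8, offset 1200). The helper name encode_i is hypothetical; it simply packs the four I-format fields.

```c
#include <stdint.h>

/* Pack the I-format fields (6 + 5 + 5 + 16 bits) into a 32-bit word.
   The immediate is truncated to its low 16 bits, so a negative offset is
   stored in two's-complement form. */
uint32_t encode_i(uint32_t op, uint32_t rs, uint32_t rt, int32_t imm)
{
    return ((op & 0x3Fu) << 26) |
           ((rs & 0x1Fu) << 21) |
           ((rt & 0x1Fu) << 16) |
           ((uint32_t)imm & 0xFFFFu);
}
```

Under these assumptions, encode_i(35, 9, 8, 1200) gives 0x8D2804B0 for the lw and encode_i(43, 9, 8, 1200) gives 0xAD2804B0 for the sw. Their exclusive OR is 0x20000000, a single bit: the third bit from the left.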

Stored Program Computers

The BIG Picture

  • Instructions represented in binary, just like data
  • Instructions and data stored in memory
  • Programs can operate on programs
    • e.g., compilers, linkers, …
  • Binary compatibility allows compiled programs to work on different computers
    • Standardized ISAs

Logical Operations

§2.6 Logical Operations

  • Instructions for bitwise manipulation
  • Useful for extracting and inserting groups of bits in a word

Shift Operations

    op     | rs     | rt     | rd     | shamt  | funct
    6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits

  • shamt: how many positions to shift
  • Shift left logical
    • Shift left and fill with 0 bits
    • sll by i bits multiplies by 2^i
    • sll $t2, $s0, 4 # reg $t2 = reg $s0 << 4 bits
  • Shift right logical
    • Shift right and fill with 0 bits
    • srl by i bits divides by 2^i (unsigned only)
    • srl $t2, $s0, 4 # reg $t2 = reg $s0 >> 4 bits
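
The effect of the two logical shifts can be modeled in C on unsigned 32-bit values. This is a sketch only; the names sll and srl are borrowed from the mnemonics, and masking the shift amount to 5 bits mirrors the width of the shamt field.

```c
#include <stdint.h>

/* Shift left logical: vacated low bits are filled with 0s. */
uint32_t sll(uint32_t value, unsigned shamt)
{
    return value << (shamt & 31u);
}

/* Shift right logical: vacated high bits are filled with 0s, so the result
   equals division by a power of two only for unsigned values. */
uint32_t srl(uint32_t value, unsigned shamt)
{
    return value >> (shamt & 31u);
}
```

For example, sll(9, 4) is 144 (9 × 16) and srl(144, 4) is 9. Shifting the bit pattern of –16 (0xFFFFFFF0) right by 4 gives 0x0FFFFFFF rather than –1, which is why the division shortcut applies to unsigned values only.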

AND Operations
  • Useful to mask bits in a word
    • Select some bits, clear others to 0
    • Bit-by-bit operation: 1 if both bits are 1, 0 otherwise

and $t0, $t1, $t2 # reg $t0 = reg $t1 & reg $t2

$t2   0000 0000 0000 0000 0000 1101 1100 0000
$t1   0000 0000 0000 0000 0011 1100 0000 0000
$t0   0000 0000 0000 0000 0000 1100 0000 0000

OR Operations
  • Useful to include bits in a word
    • Set some bits to 1, leave others unchanged
    • Places 1 in the result if either bit is 1, 0 otherwise

or $t0, $t1, $t2 # reg $t0 = reg $t1 | reg $t2

$t2   0000 0000 0000 0000 0000 1101 1100 0000
$t1   0000 0000 0000 0000 0011 1100 0000 0000
$t0   0000 0000 0000 0000 0011 1101 1100 0000

NOT Operations
  • Useful to invert bits in a word
    • Change 0 to 1, and 1 to 0
  • For consistency, MIPS has NOR, a 3-operand instruction, instead of NOT
    • a NOR b == NOT ( a OR b )

nor $t0, $t1, $zero # reg $t0 = ~(reg $t1 | $zero)

Register 0: always read as zero

$t1   0000 0000 0000 0000 0011 1100 0000 0000
$t0   1111 1111 1111 1111 1100 0011 1111 1111

The full MIPS instruction set also includes XOR
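
The AND, OR, and NOR examples from these slides can be checked with a few lines of C. The function names below are hypothetical stand-ins for the instructions; the register contents 0x0DC0 and 0x3C00 are the $t2 and $t1 bit patterns from the slides written in hexadecimal.

```c
#include <stdint.h>

/* and: keeps only the bits that are 1 in both operands (masking). */
uint32_t mips_and(uint32_t a, uint32_t b) { return a & b; }

/* or: sets every bit that is 1 in either operand. */
uint32_t mips_or(uint32_t a, uint32_t b)  { return a | b; }

/* nor: NOT (a OR b); with b = 0 it behaves as a plain NOT of a. */
uint32_t mips_nor(uint32_t a, uint32_t b) { return ~(a | b); }
```

With $t1 = 0x00003C00 and $t2 = 0x00000DC0, mips_and gives 0x00000C00, mips_or gives 0x00003DC0, and mips_nor($t1, 0) gives 0xFFFFC3FF, matching the three bit patterns shown.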

Conditional Operations
  • MIPS includes two decision making instructions, similar to an if statement as well as a “go to”
  • Branch to a labeled instruction if a condition is true
    • Otherwise, continue sequentially
  • beq rs, rt, L1 #branch on equal
    • if (rs == rt) branch to instruction labeled L1;
  • bne rs, rt, L1 #branch not equal
    • if (rs != rt) branch to instruction labeled L1;
  • j L1
    • unconditional jump to instruction labeled L1

§2.7 Instructions for Making Decisions

Compiling C/Java if into MIPS

  • Compile by hand

if (i == j) f = g + h; else f = g - h;

  • Use this mapping: f: $s0 g: $s1 h: $s2 i: $s3 j: $s4

[Figure: flowchart. Test i == j? If true (i == j), f = g + h; if false (i != j), f = g - h; both paths join at Exit.]

Compiling If Statements
  • C code:

if (i == j) f = g + h; else f = g - h;

where f, g, … in $s0, $s1, …

  • Compiled MIPS code:

      bne $s3, $s4, Else   # go to Else if i ≠ j
      add $s0, $s1, $s2    # f = g + h (skipped if i ≠ j)
      j   Exit             # go to Exit
Else: sub $s0, $s1, $s2    # f = g – h (skipped if i = j)
Exit: …

Assembler calculates addresses

Compiling Loop Statements
  • Code for loops is similar to that for decisions
  • C code:

while (save[i] == k) i += 1;

where i is in $s3, k in $s5, address of the array save is in $s6

    • First load save[i] into a temporary register. To do so, we need to form the address by multiplying the index by 4.
    • Then add $t1 to the base of save in $s6.
  • Compiled MIPS code:

Loop: sll  $t1, $s3, 2      # temp reg $t1 = i * 4
      add  $t1, $t1, $s6    # $t1 = address of save[i]
      lw   $t0, 0($t1)      # temp reg $t0 = save[i]
      bne  $t0, $s5, Exit   # go to Exit if save[i] ≠ k
      addi $s3, $s3, 1      # i = i + 1
      j    Loop             # go to Loop
Exit: …

Basic Blocks
  • A basic block is a sequence of instructions with
    • No embedded branches (except at end)
    • No branch targets (except at beginning)
  • A compiler identifies basic blocks for optimization
  • An advanced processor can accelerate execution of basic blocks

More Conditional Operations
  • Test for equality or inequality
  • Set result to 1 if a condition is true
    • Otherwise, set to 0
  • slt rd, rs, rt #set on less than
    • if (rs < rt) rd = 1; else rd = 0;
  • slti rt, rs, constant #set immediate
    • if (rs < constant) rt = 1; else rt = 0;
  • Use in combination with beq, bne

slt $t0, $s1, $s2   # if ($s1 < $s2)
bne $t0, $zero, L   #   branch to L

Branch Instruction Design
  • MIPS does not include a branch on less than instruction.
  • Why not blt, bge, etc?
  • Follows von Neumann’s warning to keep the equipment simple
  • Hardware for <, ≥, … slower than =, ≠
    • Combining with branch involves more work per instruction, requiring a slower clock
    • All instructions penalized!
  • beq and bne are the common case
  • This is a good design compromise: use only slt, slti, beq, bne, and $zero for all relative conditions.

Signed vs. Unsigned
  • Signed comparison: slt, slti
  • Unsigned comparison: sltu, sltiu
  • Example
    • $s0 = 1111 1111 1111 1111 1111 1111 1111 1111
    • $s1 = 0000 0000 0000 0000 0000 0000 0000 0001
    • slt $t0, $s0, $s1 # signed
      • –1 < +1 ⇒ $t0 = 1
    • sltu $t0, $s0, $s1 # unsigned
      • +4,294,967,295 > +1 ⇒ $t0 = 0
    • The value in reg $s0 is –1 if it is interpreted as a signed integer and 4,294,967,295 if it is interpreted as an unsigned integer.
    • Register $s1 contains a 1 in either case.
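
The difference between the signed and unsigned comparisons can be reproduced in C by reinterpreting the same 32-bit pattern. A minimal sketch: the function names mirror the mnemonics but are otherwise hypothetical, and the cast to int32_t assumes the usual two's-complement wraparound.

```c
#include <stdint.h>

/* slt: compare the two bit patterns as signed integers. */
uint32_t slt(uint32_t rs, uint32_t rt)
{
    return ((int32_t)rs < (int32_t)rt) ? 1u : 0u;
}

/* sltu: compare the same bit patterns as unsigned integers. */
uint32_t sltu(uint32_t rs, uint32_t rt)
{
    return (rs < rt) ? 1u : 0u;
}
```

With $s0 = 0xFFFFFFFF and $s1 = 1, slt returns 1 (–1 is less than +1) while sltu returns 0 (4,294,967,295 is not less than 1).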

Signed vs. Unsigned
  • Treating signed numbers as if they were unsigned gives us a low-cost way to check if 0 ≤ x < y
  • This can also be used for a bounds check for an array.
  • An unsigned comparison of x < y also checks if x is negative as well as if x is less than y.
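
The bounds-check shortcut can be sketched in C. The function name in_bounds is hypothetical, and the sketch assumes that length is non-negative; a negative index becomes a very large unsigned value and therefore fails the single comparison.

```c
#include <stdint.h>

/* Returns 1 when index is non-negative and less than length, else 0.
   One unsigned comparison replaces the two signed tests
   (index >= 0) and (index < length). Assumes length >= 0. */
int in_bounds(int32_t index, int32_t length)
{
    return (uint32_t)index < (uint32_t)length;
}
```

For a 10-element array, in_bounds(0, 10) and in_bounds(9, 10) are 1, while in_bounds(10, 10) and in_bounds(-1, 10) are 0.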

Case/Switch Statement
  • Simplest implementation of switch is with a sequence of if-then-else statements
  • Alternatives include a jump address table, or jump table.
    • Program indexes into the table and then jumps to the appropriate sequence
    • Jump table is array of addresses corresponding to the labels in the code
    • Loads entry into a register and then jumps to the address in the register
    • MIPS includes a jump register instruction (jr)
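
A jump table can be imitated in C with an array of function pointers: indexing the array corresponds to loading the table entry, and the indirect call corresponds to jr. Everything in this sketch (the handler functions, their bodies, and the name dispatch) is invented for illustration.

```c
#include <stddef.h>

/* Hypothetical case bodies; each stands for one labeled code sequence. */
static int case_zero(int x)    { return x + 1; }
static int case_one(int x)     { return x * 2; }
static int case_two(int x)     { return x - 3; }
static int case_default(int x) { return x; }

typedef int (*handler_fn)(int);

/* The jump table: an array of code addresses indexed by the selector. */
static const handler_fn jump_table[] = { case_zero, case_one, case_two };

int dispatch(unsigned selector, int x)
{
    size_t count = sizeof jump_table / sizeof jump_table[0];
    if (selector >= count) {
        return case_default(x);       /* selector outside the table */
    }
    return jump_table[selector](x);   /* load the entry, then jump to it */
}
```

The range test before the table lookup matters: an out-of-range selector would otherwise load an arbitrary address and jump to it.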


“And in Conclusion…”

  • Memory is byte-addressable, but lw and sw access one word at a time.
  • A pointer (used by lw and sw) is just a memory address, so we can add to it or subtract from it (using offset).
  • A Decision allows us to decide what to execute at run-time rather than compile-time.
  • C/Java decisions are made using conditional statements within if, while, do while, for.
  • MIPS Decision making instructions are the conditional branches: beq and bne.
  • New Instructions:

lw, sw, beq, bne, j

Procedure Calling
  • Steps required
    • Place parameters in registers
    • Transfer control to procedure
    • Acquire storage for procedure
    • Perform procedure’s operations
    • Place result in register for caller
    • Return to place of call

§2.8 Supporting Procedures in Computer Hardware

Register Usage
  • $a0 – $a3: arguments (registers 4 – 7)
  • $v0, $v1: result values (registers 2 and 3)
  • $t0 – $t9: temporaries
    • Can be overwritten by callee
  • $s0 – $s7: saved
    • Must be saved/restored by callee
  • $gp: global pointer for static data (reg 28)
  • $sp: stack pointer (reg 29)
  • $fp: frame pointer (reg 30)
  • $ra: return address (reg 31)

Procedure Call Instructions
  • Procedure call: jump and link (jal)

jal ProcedureLabel

    • Address of following instruction put in $ra
    • Jumps to target address
  • Procedure return: jump register (jr)

jr $ra

    • Copies $ra to program counter
    • Can also be used for computed jumps
      • e.g., for case/switch statements

Procedure Call Instructions
  • The “link” is the return address: an address pointing to the calling site, stored in $ra, that allows the procedure to return to the proper place
  • The calling program (the caller) puts the parameter values in registers $a0–$a3 and uses jal X to jump to procedure X (the callee).
  • The callee performs the calculations and then returns to the caller using jr $ra
  • The address of the current instruction is held in the program counter (PC); jal saves PC + 4 in $ra

Using More Registers
  • Compilers often need more registers than are available, so they spill registers to memory
  • Stack: a last-in, first-out (LIFO) data structure
  • Stack pointer: adjusted by one word for each register saved or restored
  • MIPS reserves a register for the stack pointer, $sp
  • Stack “grows” from higher to lower addresses
  • Push: places data on the stack (subtracts from the stack pointer)
  • Pop: removes data from the stack (adds to the stack pointer)

Leaf Procedure Example
  • C code:

int leaf_example (int g, int h, int i, int j)
{
    int f;
    f = (g + h) - (i + j);
    return f;
}

    • Arguments g, …, j in $a0, …, $a3
    • f in $s0 (hence, need to save $s0 on stack)
    • Result in $v0

Leaf Procedure Example
  • MIPS code:

leaf_example:
    addi $sp, $sp, -4       # Save $s0 on stack
    sw   $s0, 0($sp)
    add  $t0, $a0, $a1      # Procedure body
    add  $t1, $a2, $a3
    sub  $s0, $t0, $t1
    add  $v0, $s0, $zero    # Result
    lw   $s0, 0($sp)        # Restore $s0
    addi $sp, $sp, 4
    jr   $ra                # Return

Non-Leaf Procedures
  • Procedures that call other procedures
  • For nested call, caller needs to save on the stack:
    • Its return address
    • Any arguments and temporaries needed after the call
  • Restore from the stack after the call

Non-Leaf Procedure Example
  • C code:

int fact (int n)
{
    if (n < 1) return 1;
    else return n * fact(n - 1);
}

    • Argument n in $a0
    • Result in $v0

Non-Leaf Procedure Example
  • MIPS code:

fact:
    addi $sp, $sp, -8     # adjust stack for 2 items
    sw   $ra, 4($sp)      # save return address
    sw   $a0, 0($sp)      # save argument
    slti $t0, $a0, 1      # test for n < 1
    beq  $t0, $zero, L1
    addi $v0, $zero, 1    # if so, result is 1
    addi $sp, $sp, 8      #   pop 2 items from stack
    jr   $ra              #   and return
L1: addi $a0, $a0, -1     # else decrement n
    jal  fact             # recursive call
    lw   $a0, 0($sp)      # restore original n
    lw   $ra, 4($sp)      #   and return address
    addi $sp, $sp, 8      # pop 2 items from stack
    mul  $v0, $a0, $v0    # multiply to get result
    jr   $ra              # and return

Local Data on the Stack
  • Local data allocated by callee
    • e.g., C automatic variables
  • Procedure frame (activation record)
    • Used by some compilers to manage stack storage

Memory Layout
  • Text: program code
  • Static data: global variables
    • e.g., static variables in C, constant arrays and strings
    • $gp initialized to address allowing ±offsets into this segment
  • Dynamic data: heap
    • E.g., malloc in C, new in Java
  • Stack: automatic storage

Character Data
  • Byte-encoded character sets
    • ASCII: 128 characters
      • 95 graphic, 33 control
    • Latin-1: 256 characters
      • ASCII, +96 more graphic characters
  • Unicode: 32-bit character set
    • Used in Java, C++ wide characters, …
    • Most of the world’s alphabets, plus symbols
    • UTF-8, UTF-16: variable-length encodings

§2.9 Communicating with People

Byte/Halfword Operations
  • Could use bitwise operations
  • MIPS byte/halfword load/store
    • String processing is a common case

lb rt, offset(rs) lh rt, offset(rs)

    • Sign extend to 32 bits in rt

lbu rt, offset(rs) lhu rt, offset(rs)

    • Zero extend to 32 bits in rt

sb rt, offset(rs) sh rt, offset(rs)

    • Store just rightmost byte/halfword
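
The difference between lb, lbu, and sb can be modeled in C on a small byte array standing in for memory. This is a sketch; the function names and the memory array are assumptions made for the example, and address checking is omitted.

```c
#include <stdint.h>

/* lb: read one byte and sign-extend it to 32 bits. */
int32_t load_byte(const uint8_t *memory, uint32_t address)
{
    return (int32_t)(int8_t)memory[address];
}

/* lbu: read one byte and zero-extend it to 32 bits. */
uint32_t load_byte_unsigned(const uint8_t *memory, uint32_t address)
{
    return (uint32_t)memory[address];
}

/* sb: store only the rightmost (least significant) byte of the register. */
void store_byte(uint8_t *memory, uint32_t address, uint32_t reg)
{
    memory[address] = (uint8_t)(reg & 0xFFu);
}
```

Storing a register that holds 0x123456FE writes only the byte 0xFE; loading that byte back gives –2 with load_byte and 254 with load_byte_unsigned.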

String Copy Example
  • C code (naïve):
    • Null-terminated string

void strcpy (char x[], char y[])
{
    int i;
    i = 0;
    while ((x[i] = y[i]) != '\0')
        i += 1;
}

    • Addresses of x, y in $a0, $a1
    • i in $s0

String Copy Example
  • MIPS code:

strcpy:
    addi $sp, $sp, -4         # adjust stack for 1 item
    sw   $s0, 0($sp)          # save $s0
    add  $s0, $zero, $zero    # i = 0
L1: add  $t1, $s0, $a1        # addr of y[i] in $t1
    lbu  $t2, 0($t1)          # $t2 = y[i]
    add  $t3, $s0, $a0        # addr of x[i] in $t3
    sb   $t2, 0($t3)          # x[i] = y[i]
    beq  $t2, $zero, L2       # exit loop if y[i] == 0
    addi $s0, $s0, 1          # i = i + 1
    j    L1                   # next iteration of loop
L2: lw   $s0, 0($sp)          # restore saved $s0
    addi $sp, $sp, 4          # pop 1 item from stack
    jr   $ra                  # and return

32-bit Constants
  • Most constants are small
    • 16-bit immediate is sufficient
  • For the occasional 32-bit constant

lui rt, constant

    • Copies 16-bit constant to left 16 bits of rt
    • Clears right 16 bits of rt to 0

§2.10 MIPS Addressing for 32-Bit Immediates and Addresses

lui $s0, 61          # $s0 = 0000 0000 0011 1101 0000 0000 0000 0000

ori $s0, $s0, 2304   # $s0 = 0000 0000 0011 1101 0000 1001 0000 0000
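
The two-instruction sequence for building a 32-bit constant can be sketched in C. The function names lui and ori mirror the mnemonics; load_constant32 is a hypothetical helper that splits a value into its upper and lower halves the way an assembler might.

```c
#include <stdint.h>

/* lui: place a 16-bit constant in the upper half, clear the lower half. */
uint32_t lui(uint32_t imm16)
{
    return (imm16 & 0xFFFFu) << 16;
}

/* ori: OR a zero-extended 16-bit constant into the register value. */
uint32_t ori(uint32_t rs, uint32_t imm16)
{
    return rs | (imm16 & 0xFFFFu);
}

/* Build any 32-bit constant from its two 16-bit halves. */
uint32_t load_constant32(uint32_t value)
{
    return ori(lui(value >> 16), value & 0xFFFFu);
}
```

With the constants from the slide, lui(61) is 0x003D0000 and ori(lui(61), 2304) is 0x003D0900, which is 4,000,000 in decimal (61 × 65,536 + 2,304).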

Branch Addressing

    op     | rs     | rt     | constant or address
    6 bits | 5 bits | 5 bits | 16 bits

  • Branch instructions specify
    • Opcode, two registers, target address
  • Most branch targets are near branch
    • Forward or backward
  • PC-relative addressing
    • Target address = PC + offset × 4
    • PC already incremented by 4 by this time

Jump Addressing

    op     | address
    6 bits | 26 bits

  • Jump (j and jal) targets could be anywhere in text segment
    • Encode full address in instruction
  • (Pseudo)Direct jump addressing
    • Target address = PC[31…28] : (address × 4)

Target Addressing Example
  • Loop code from earlier example
    • Assume Loop at location 80000
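
The address arithmetic for this example can be sketched in C. The sketch assumes the six loop instructions occupy consecutive words starting at 80000, which puts the bne at 80012, the j at 80020, and Exit at 80024; the function names are hypothetical.

```c
#include <stdint.h>

/* PC-relative branch: target = (address of branch + 4) + offset * 4. */
uint32_t branch_target(uint32_t branch_pc, int16_t offset)
{
    return branch_pc + 4u + (uint32_t)((int32_t)offset * 4);
}

/* Inverse: the word offset an assembler would store in the branch. */
int16_t branch_offset(uint32_t branch_pc, uint32_t target)
{
    int32_t byte_distance = (int32_t)(target - (branch_pc + 4u));
    return (int16_t)(byte_distance / 4);
}

/* Pseudodirect jump: upper 4 bits of PC + 4, then the 26-bit field * 4. */
uint32_t jump_target(uint32_t jump_pc, uint32_t address_field)
{
    return ((jump_pc + 4u) & 0xF0000000u) |
           ((address_field & 0x03FFFFFFu) << 2);
}
```

Under these assumptions, the bne at 80012 needs an offset of 2 to reach Exit at 80024 (80016 + 2 × 4), and the j at 80020 stores 20000 in its address field because 20000 × 4 = 80000.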

Branching Far Away
  • If branch target is too far to encode with 16-bit offset, assembler rewrites the code
  • Example

    beq $s0, $s1, L1

is rewritten as

    bne $s0, $s1, L2
    j   L1
L2: …

Addressing Mode Summary

Synchronization
  • Two processors sharing an area of memory
    • P1 writes, then P2 reads
    • Data race if P1 and P2 don’t synchronize
      • Result depends on order of accesses
  • Hardware support required
    • Atomic read/write memory operation
    • No other access to the location allowed between the read and write
  • Could be a single instruction
    • E.g., atomic swap of register ↔ memory
    • Or an atomic pair of instructions

§2.11 Parallelism and Instructions: Synchronization

Synchronization in MIPS
  • Load linked: ll rt, offset(rs)
  • Store conditional: sc rt, offset(rs)
    • Succeeds if location not changed since the ll
      • Returns 1 in rt
    • Fails if location is changed
      • Returns 0 in rt
  • Example: atomic swap (to test/set lock variable)

try: add $t0, $zero, $s4   # copy exchange value
     ll  $t1, 0($s1)       # load linked
     sc  $t0, 0($s1)       # store conditional
     beq $t0, $zero, try   # branch if store fails
     add $s4, $zero, $t1   # put load value in $s4
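
The same retry pattern can be sketched with C11 atomics. This is an analogy rather than a translation: C has no ll/sc, so the sketch uses a compare-and-exchange loop, which retries when the location changed between the read and the write. The name atomic_swap is hypothetical, and C11 also offers atomic_exchange, which does this in a single call.

```c
#include <stdatomic.h>

/* Atomically replace *location with new_value and return the old value.
   The loop plays the role of the branch back to try: if another thread
   changed the location after our read, the exchange fails, observed is
   refreshed with the current value, and we try again. */
int atomic_swap(atomic_int *location, int new_value)
{
    int observed = atomic_load(location);      /* like ll: read the value */
    while (!atomic_compare_exchange_weak(location, &observed, new_value)) {
        /* like a failed sc: nothing was stored, so retry */
    }
    return observed;
}
```

Used as a test-and-set lock, a thread calls atomic_swap(&lock, 1) and owns the lock only when the returned old value is 0.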

Translation and Startup

Many compilers produce object modules directly

§2.12 Translating and Starting a Program

Static linking

Assembler Pseudoinstructions
  • Most assembler instructions represent machine instructions one-to-one
  • Pseudoinstructions: figments of the assembler’s imagination

move $t0, $t1     →   add $t0, $zero, $t1

blt $t0, $t1, L   →   slt $at, $t0, $t1
                      bne $at, $zero, L

    • $at (register 1): assembler temporary

Producing an Object Module
  • Assembler (or compiler) translates program into machine instructions
  • Provides information for building a complete program from the pieces
    • Header: describes contents of object module
    • Text segment: translated instructions
    • Static data segment: data allocated for the life of the program
    • Relocation info: for contents that depend on absolute location of loaded program
    • Symbol table: global definitions and external refs
    • Debug info: for associating with source code

Linking Object Modules
  • Produces an executable image

1. Merge segments

2. Resolve labels (determine their addresses)

3. Patch location-dependent and external refs

  • Could leave location dependencies for fixing by a relocating loader
    • But with virtual memory, no need to do this
    • Program can be loaded into absolute location in virtual memory space

Loading a Program
  • Load from image file on disk into memory

1. Read header to determine segment sizes

2. Create virtual address space

3. Copy text and initialized data into memory

      • Or set page table entries so they can be faulted in

4. Set up arguments on stack

5. Initialize registers (including $sp, $fp, $gp)

6. Jump to startup routine

      • Copies arguments to $a0, … and calls main
      • When main returns, do exit syscall

Dynamic Linking
  • Only link/load library procedure when it is called
    • Requires procedure code to be relocatable
    • Avoids image bloat caused by static linking of all (transitively) referenced libraries
    • Automatically picks up new library versions

Lazy Linkage

Indirection table

Stub: loads routine ID, jumps to linker/loader

Linker/loader code

Dynamically mapped code

Starting Java Applications

Simple portable instruction set for the JVM

Compiles bytecodes of “hot” methods into native code for host machine

Interprets bytecodes

C Sort Example
  • Illustrates use of assembly instructions for a C bubble sort function
  • Swap procedure (leaf)

void swap(int v[], int k)
{
    int temp;
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
}

    • v in $a0, k in $a1, temp in $t0

§2.13 A C Sort Example to Put It All Together

The Procedure Swap

swap: sll $t1, $a1, 2 # $t1 = k * 4

add $t1, $a0, $t1 # $t1 = v+(k*4)

# (address of v[k])

lw $t0, 0($t1) # $t0 (temp) = v[k]

lw $t2, 4($t1) # $t2 = v[k+1]

sw $t2, 0($t1) # v[k] = $t2 (v[k+1])

sw $t0, 4($t1) # v[k+1] = $t0 (temp)

jr $ra # return to calling routine

The Sort Procedure in C
  • Non-leaf (calls swap)

void sort (int v[], int n)
{
    int i, j;
    for (i = 0; i < n; i += 1) {
        for (j = i - 1; j >= 0 && v[j] > v[j + 1]; j -= 1) {
            swap(v, j);
        }
    }
}

    • v in $a0, n in $a1, i in $s0, j in $s1

The Procedure Body

         # Move params
         move $s2, $a0            # save $a0 into $s2
         move $s3, $a1            # save $a1 into $s3
         # Outer loop
         move $s0, $zero          # i = 0
for1tst: slt  $t0, $s0, $s3       # $t0 = 0 if $s0 ≥ $s3 (i ≥ n)
         beq  $t0, $zero, exit1   # go to exit1 if $s0 ≥ $s3 (i ≥ n)
         # Inner loop
         addi $s1, $s0, -1        # j = i – 1
for2tst: slti $t0, $s1, 0         # $t0 = 1 if $s1 < 0 (j < 0)
         bne  $t0, $zero, exit2   # go to exit2 if $s1 < 0 (j < 0)
         sll  $t1, $s1, 2         # $t1 = j * 4
         add  $t2, $s2, $t1       # $t2 = v + (j * 4)
         lw   $t3, 0($t2)         # $t3 = v[j]
         lw   $t4, 4($t2)         # $t4 = v[j + 1]
         slt  $t0, $t4, $t3       # $t0 = 0 if $t4 ≥ $t3
         beq  $t0, $zero, exit2   # go to exit2 if $t4 ≥ $t3
         # Pass params & call
         move $a0, $s2            # 1st param of swap is v (old $a0)
         move $a1, $s1            # 2nd param of swap is j
         jal  swap                # call swap procedure
         # Inner loop
         addi $s1, $s1, -1        # j –= 1
         j    for2tst             # jump to test of inner loop
         # Outer loop
exit2:   addi $s0, $s0, 1         # i += 1
         j    for1tst             # jump to test of outer loop

The Full Procedure

sort: addi $sp, $sp, -20 # make room on stack for 5 registers

sw $ra, 16($sp) # save $ra on stack

sw $s3,12($sp) # save $s3 on stack

sw $s2, 8($sp) # save $s2 on stack

sw $s1, 4($sp) # save $s1 on stack

sw $s0, 0($sp) # save $s0 on stack

… # procedure body

exit1: lw $s0, 0($sp) # restore $s0 from stack

lw $s1, 4($sp) # restore $s1 from stack

lw $s2, 8($sp) # restore $s2 from stack

lw $s3,12($sp) # restore $s3 from stack

lw $ra,16($sp) # restore $ra from stack

addi $sp,$sp, 20 # restore stack pointer

jr $ra # return to calling routine

Effect of Compiler Optimization

Compiled with gcc for Pentium 4 under Linux

Effect of Language and Algorithm

Lessons Learned
  • Instruction count and CPI are not good performance indicators in isolation
  • Compiler optimizations are sensitive to the algorithm
  • Java/JIT compiled code is significantly faster than JVM interpreted
    • Comparable to optimized C in some cases
  • Nothing can fix a dumb algorithm!

Arrays vs. Pointers

§2.14 Arrays versus Pointers

  • Array indexing involves
    • Multiplying index by element size
    • Adding to array base address
  • Pointers correspond directly to memory addresses
    • Can avoid indexing complexity

Example: Clearing an Array

Comparison of Array vs. Ptr
  • Multiply “strength reduced” to shift
  • Array version requires shift to be inside loop
    • Part of index calculation for incremented i
    • c.f. incrementing pointer
  • Compiler can achieve same effect as manual use of pointers
    • Induction variable elimination
    • Better to make program clearer and safer
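
The two styles being compared can be sketched in C as an index-based and a pointer-based routine that both clear an array. The names clear1 and clear2 and the use of size_t are choices made for this sketch; it is not presented as the code from the missing slide.

```c
#include <stddef.h>

/* Array version: each iteration computes the address of array[i] from
   the index (multiply by element size, add to base). */
void clear1(int array[], size_t size)
{
    for (size_t i = 0; i < size; i += 1) {
        array[i] = 0;
    }
}

/* Pointer version: the pointer itself is the induction variable and is
   advanced by one element per iteration. */
void clear2(int *array, size_t size)
{
    for (int *p = array; p < array + size; p += 1) {
        *p = 0;
    }
}
```

An optimizing compiler can typically turn the first form into the second on its own (strength reduction plus induction variable elimination), which is the reason for preferring the clearer indexed version.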

ARM & MIPS Similarities
  • ARM: the most popular embedded core
  • Similar basic set of instructions to MIPS

§2.16 Real Stuff: ARM Instructions

Compare and Branch in ARM
  • Uses condition codes for result of an arithmetic/logical instruction
    • Negative, zero, carry, overflow
    • Compare instructions to set condition codes without keeping the result
  • Each instruction can be conditional
    • Top 4 bits of instruction word: condition value
    • Can avoid branches over single instructions

Instruction Encoding

The Intel x86 ISA
  • Evolution with backward compatibility
    • 8080 (1974): 8-bit microprocessor
      • Accumulator, plus 3 index-register pairs
    • 8086 (1978): 16-bit extension to 8080
      • Complex instruction set (CISC)
    • 8087 (1980): floating-point coprocessor
      • Adds FP instructions and register stack
    • 80286 (1982): 24-bit addresses, MMU
      • Segmented memory mapping and protection
    • 80386 (1985): 32-bit extension (now IA-32)
      • Additional addressing modes and operations
      • Paged memory mapping as well as segments

§2.17 Real Stuff: x86 Instructions

The Intel x86 ISA
  • Further evolution…
    • i486 (1989): pipelined, on-chip caches and FPU
      • Compatible competitors: AMD, Cyrix, …
    • Pentium (1993): superscalar, 64-bit datapath
      • Later versions added MMX (Multi-Media eXtension) instructions
      • The infamous FDIV bug
    • Pentium Pro (1995), Pentium II (1997)
      • New microarchitecture (see Colwell, The Pentium Chronicles)
    • Pentium III (1999)
      • Added SSE (Streaming SIMD Extensions) and associated registers
    • Pentium 4 (2001)
      • New microarchitecture
      • Added SSE2 instructions

The Intel x86 ISA
  • And further…
    • AMD64 (2003): extended architecture to 64 bits
    • EM64T – Extended Memory 64 Technology (2004)
      • AMD64 adopted by Intel (with refinements)
      • Added SSE3 instructions
    • Intel Core (2006)
      • Added SSE4 instructions, virtual machine support
    • AMD64 (announced 2007): SSE5 instructions
      • Intel declined to follow, instead…
    • Advanced Vector Extensions (announced 2008)
      • Longer SSE registers, more instructions
  • If Intel didn’t extend with compatibility, its competitors would!
    • Technical elegance ≠ market success

Basic x86 Registers

Basic x86 Addressing Modes
  • Two operands per instruction
  • Memory addressing modes
    • Address in register
    • Address = Rbase + displacement
    • Address = Rbase + 2^scale × Rindex (scale = 0, 1, 2, or 3)
    • Address = Rbase + 2^scale × Rindex + displacement
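
The most general mode in the list can be written as a single C function; the simpler modes are the same formula with the index, the displacement, or both set to zero. The name `effective_address` and the 32-bit register width are assumptions for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address = base + (index * 2^scale) + displacement,
 * computed with 32-bit wraparound as on IA-32. */
uint32_t effective_address(uint32_t base, uint32_t index,
                           unsigned scale, int32_t displacement)
{
    assert(scale <= 3); /* index is multiplied by 1, 2, 4, or 8 only */
    return base + (index << scale) + (uint32_t)displacement;
}
```

For example, `effective_address(0x1000, 3, 2, 8)` yields `0x1014`: the base plus element 3 of an array of 4-byte items, plus a field offset of 8. The scale values match the common element sizes of 1, 2, 4, and 8 bytes.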

x86 Instruction Encoding
  • Variable length encoding
    • Postfix bytes specify addressing mode
    • Prefix bytes modify operation
      • Operand length, repetition, locking, …
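
To make the variable-length idea concrete, the sketch below counts the prefix bytes at the start of an instruction. It assumes 32-bit (IA-32) code and recognizes only the legacy prefix byte values; the helper names `is_legacy_prefix` and `count_prefix_bytes` are hypothetical, and this is not a full decoder.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Legacy IA-32 prefix bytes (REX prefixes of 64-bit mode are not handled). */
static bool is_legacy_prefix(uint8_t byte)
{
    switch (byte) {
    case 0x66: /* operand-size override */
    case 0x67: /* address-size override */
    case 0xF0: /* LOCK */
    case 0xF2: /* REPNE/REPNZ */
    case 0xF3: /* REP/REPE */
    case 0x26: case 0x2E: case 0x36:
    case 0x3E: case 0x64: case 0x65: /* segment overrides */
        return true;
    default:
        return false;
    }
}

/* Number of prefix bytes before the opcode in `code`. */
size_t count_prefix_bytes(const uint8_t *code, size_t length)
{
    assert(code != NULL || length == 0);
    size_t n = 0;
    while (n < length && is_legacy_prefix(code[n]))
        n++;
    return n;
}
```

The byte sequence `F3 A4` (a repeated `MOVSB`) has one prefix, and `66 89 D8` has one operand-size prefix in front of the two-byte `89 D8`, whose second byte selects the register operands. Because the instruction length depends on these bytes, a decoder cannot know where the next instruction starts until it has examined the current one.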

Implementing IA-32
  • Complex instruction set makes implementation difficult
    • Hardware translates instructions to simpler microoperations
      • Simple instructions: 1–1
      • Complex instructions: 1–many
    • Microengine similar to RISC
    • Market share makes this economically viable
  • Comparable performance to RISC
    • Compilers avoid complex instructions

Fallacies

§2.18 Fallacies and Pitfalls

  • Powerful instruction ⇒ higher performance
    • Fewer instructions required
    • But complex instructions are hard to implement
      • May slow down all instructions, including simple ones
    • Compilers are good at making fast code from simple instructions
  • Use assembly code for high performance
    • But modern compilers are better at dealing with modern processors
    • More lines of code ⇒ more errors and less productivity

Fallacies
  • Backward compatibility ⇒ instruction set doesn’t change
    • But they do acquire more instructions

[Figure: x86 instruction set]

Pitfalls
  • Sequential words are not at sequential addresses
    • Increment by 4, not by 1!
  • Keeping a pointer to an automatic variable after procedure returns
    • e.g., passing pointer back via an argument
    • Pointer becomes invalid when stack popped
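
Both pitfalls can be shown in a few lines of C. The names `word_byte_offset`, `leak_local`, and `fill_value` are hypothetical, and 4-byte words are assumed as on MIPS. The faulty function is left in a comment so that the block stays safe to compile and run.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pitfall 1: memory is byte addressed, so word i starts at byte 4 * i. */
size_t word_byte_offset(size_t word_index)
{
    return word_index * sizeof(uint32_t);
}

/*
 * Pitfall 2 (shown in a comment on purpose): `local` lives in the stack
 * frame of leak_local, so the pointer stored through `out` dangles as
 * soon as the function returns and the frame is popped.
 *
 *     void leak_local(int **out) { int local = 42; *out = &local; }
 */

/* Safer pattern: the caller owns the storage and passes its address in. */
void fill_value(int *out)
{
    assert(out != NULL);
    *out = 42;
}
```

Here `word_byte_offset(3)` is 12, not 3. A caller of `fill_value` declares its own `int v;` and passes `&v`, so the storage outlives the call.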

Concluding Remarks

§2.19 Concluding Remarks

  • Design principles

1. Simplicity favors regularity

2. Smaller is faster

3. Make the common case fast

4. Good design demands good compromises

  • Layers of software/hardware
    • Compiler, assembler, hardware
  • MIPS: typical of RISC ISAs
    • cf. x86

Concluding Remarks
  • Measure MIPS instruction executions in benchmark programs
    • Consider making the common case fast
    • Consider compromises
