Review of RISC CPU Designs

Review of RISC CPU Designs • Lecturer: 吳安宇 (An-Yeu Wu) • Date: 2005/3/4

Computer Architecture • After this course, you should: • Have a firm grasp of processor instruction sets. • Recognize the main components of a computer and how they interact. • Be able to design a simple pipelined processor. • Have the HW knowledge necessary for later courses in the curriculum. Prof. An-Yeu Wu (吳安宇), National Taiwan University

Why should you care? • It is interesting. • How do you make a processor that runs at 3 GHz?

What do we cover? • The course is roughly split into three parts. • The first third discusses instruction set architectures. • Next, we focus on processor implementations. • Finally, we talk about memory systems, I/O, and how to connect it all together.

Instruction set architectures • An instruction set describes the basic functions that a processor can perform. • It serves as an interface between hardware and software; programs are sequences of instructions that get executed by hardware. • Several important issues: • The instruction set in CA lacked many features, such as support for function calls. We’ll work with a larger, more realistic processor. • We’ll also see more ways in which the instruction set architecture affects the hardware design. • We (i.e., you) will do more assembly-language programming too.

Processor design • The second part of the semester will address two other limitations of the single-cycle processor from CA. • Supporting more complex instructions would increase the cycle time. • The CPU hardware is not fully utilized, so it runs slower than it could. • We will focus on pipelining, which is one of the most important ways of speeding up processors. • The idea behind pipelining is very simple, but there are many details and special cases that must be handled. • Every modern processor uses pipelining.

Memory and I/O • Memory and I/O are often bottlenecks in modern machines. • Processor speeds far outpace memory and I/O speeds (e.g., the network). • A 4 GHz processor won’t help you browse the web any faster if you’re stuck on a 56 kbps modem. • The issues associated with memory and I/O (NOT covered in this course): • How caches can dramatically improve the speed of memory accesses. • How processors, memory and peripheral devices can be connected, and CPU support for I/O communications.

MIPS • In this class, we’ll use the MIPS instruction set architecture (ISA) to illustrate concepts in assembly language and machine organization. • Of course, the concepts are not MIPS-specific. • MIPS is just convenient because it is realistic, yet simple (unlike x86, which is a CISC ISA). • MIPS was one of the first RISC ISAs. It is still used in many places today, primarily in embedded systems, like: • Various routers from Cisco • Game machines like the Nintendo 64 and Sony PlayStation 2 (PS2)

SoC Example: Emotion Engine in PS2

PS2 and IP • Emotion Engine • MIPS R5900-based design • MPEG decoder • Vector generator (co-processor) • Reaches 6.2 GFLOPS

Instruction Set Architecture (ISA) • As mentioned earlier, the ISA is the interface between hardware and software. • The ISA serves as an abstraction layer between the HW and SW. • Software doesn’t need to know how the processor is implemented. • Any processor that implements the ISA appears equivalent. • An ISA enables processor innovation without changing software. • This software compatibility has made billions of dollars for Intel. • Before ISAs existed, software was rewritten for each new machine.

A little ISA history • 1964: IBM System/360, the first computer “family” • IBM wanted to sell a range of machines that ran the same software • 1960s, 1970s: Complex Instruction Set Computer (CISC) era • Much assembly programming, compiler technology immature • Simple machine implementations • Complex instructions simplified programming, with little impact on design • 1980s: Reduced Instruction Set Computer (RISC) era • Most programming in high-level languages, mature compilers • Aggressive machine implementations • Simpler, cleaner ISAs facilitated pipelining and higher clock frequencies • 1990s: Post-RISC era • ISA complexity largely relegated to a non-issue • CISC and RISC chips use the same techniques (pipelining, superscalar, ...) • ISA compatibility outweighs any RISC advantage in general-purpose computing • Embedded processors prefer RISC for lower power and cost

Basic MIPS Architecture

Basic MIPS Architecture • We started with how instruction set architectures (ISA) abstract away the hardware implementation details, enabling software compatibility across processor generations. • Today we’ll begin our discussion of the MIPS ISA, which will be our example system for much of this semester. • We present the basic instruction set architecture. • This also involves some discussion of the CPU hardware. • This architecture is mostly a superset of the one from CA, so today’s lecture should also serve as a quick review.

MIPS: register-to-register, three address • MIPS is a register-to-register, or load/store, architecture. • The destination and sources must all be registers. • Special instructions, which we’ll see later today, are needed to access main memory. • MIPS uses three-address instructions for data manipulation. • Each ALU instruction contains a destination and two sources. • For example, an addition instruction (a = b + c) has the form add a, b, c, where add is the operation, a is the destination, and b and c are the sources.

Register file review • Here is a block symbol for a general 2^k x n register file. • If Write = 1, then D data is stored into D address. • You can read from two registers at once, by supplying the A address and B address inputs. The outputs appear as A data and B data. • Registers are clocked, sequential devices. • We can read from the register file at any time. • Data is written only on the positive edge of the clock.

MIPS register file • MIPS processors have 32 registers, each of which holds a 32-bit value. • Register addresses are 5 bits long. • The data inputs and outputs are 32 bits wide. • More registers might seem better, but there is a limit to the goodness. • It’s more expensive, because of both the registers themselves and the decoders and muxes needed to select individual registers. • Instruction lengths may be affected, as we’ll see on Friday.

MIPS register names • MIPS register names begin with a $. There are two naming conventions: • By number: $0 $1 $2 … $31 • By (mostly) two-character names, such as: $a0-$a3 $s0-$s7 $t0-$t9 $sp $ra • Not all of the registers are equivalent: • E.g., register $0 or $zero always contains the value 0 • (go ahead, try to change it) • Other registers have special uses, by convention: • E.g., register $sp is used to hold the “stack pointer” • You have to be a little careful in picking registers for your programs.
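The register file behavior described on the last few slides can be sketched in software. This is only an illustrative C model, not part of the original slides: the names RegFile, regfile_read and regfile_write are hypothetical, and the clock edge is reduced to a single function call.

```c
#include <stdint.h>

/* Hypothetical software model of the MIPS register file:
   32 registers of 32 bits each, selected by 5-bit addresses. */
typedef struct {
    uint32_t regs[32];
} RegFile;

/* Read one register. Reads can happen at any time.
   Register 0 ($zero) always reads as 0. */
uint32_t regfile_read(const RegFile *rf, unsigned addr) {
    addr &= 0x1Fu;                     /* keep only the 5 address bits */
    return addr == 0 ? 0u : rf->regs[addr];
}

/* Write one register. A call stands in for one positive clock edge;
   data is stored only when write_enable is 1.
   Writes to register 0 are silently discarded. */
void regfile_write(RegFile *rf, unsigned addr, uint32_t data, int write_enable) {
    addr &= 0x1Fu;
    if (write_enable && addr != 0) {
        rf->regs[addr] = data;
    }
}
```

With this model, writing 42 to register 8 ($t0) and reading it back returns 42, while any attempt to write register 0 leaves it reading as 0.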

Basic arithmetic and logic operations • The basic integer arithmetic operations include the following: add sub mul div • And here are a few logical operations: and or xor • Remember that these all require three register operands; for example: add $t0, $t1, $t2 # $t0 = $t1 + $t2 mul $s1, $s1, $a0 # $s1 = $s1 x $a0

Larger expressions • More complex arithmetic expressions may require multiple operations at the instruction set level. t0 = (t1 + t2) x (t3 - t4) add $t0, $t1, $t2 # $t0 = $t1 + $t2 sub $s0, $t3, $t4 # temporary value $s0 = $t3 - $t4 mul $t0, $t0, $s0 # $t0 = ($t1 + $t2) x ($t3 - $t4) • Temporary registers may be necessary, since each MIPS instruction can access only two source registers and one destination. • In this example, we could re-use $t3 instead of introducing $s0. • But be careful not to modify registers that are needed again later.

Immediate operands • The ALU instructions we’ve seen so far expect register operands. How do you get data into registers in the first place? • Some MIPS instructions allow you to specify a signed constant, or “immediate” value, for the second source instead of a register. For example, here is the immediate add instruction, addi: addi $t0, $t1, 4 # $t0 = $t1 + 4 • Immediate operands can be used in conjunction with the $zero register to write constants into registers: addi $t0, $0, 4 # $t0 = 4 • MIPS is still considered a load/store architecture, because arithmetic operands cannot be from arbitrary memory locations. They must either be registers or constants that are embedded in the instruction.
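As a rough illustration of what “signed constant” means for addi, here is a small C sketch. It assumes the immediate occupies a 16-bit field that is sign-extended to 32 bits before the addition. The function names are hypothetical, and the overflow exception that a real addi can raise is not modeled.

```c
#include <stdint.h>

/* Sign-extend a 16-bit immediate field to a 32-bit signed value. */
int32_t sign_extend16(uint16_t imm) {
    return (imm & 0x8000u) ? (int32_t)imm - 65536 : (int32_t)imm;
}

/* Model of addi: result = source register value + sign-extended immediate.
   Unsigned arithmetic is used so that wraparound is well defined in C;
   overflow trapping is deliberately left out of this sketch. */
uint32_t addi_model(uint32_t rs_value, uint16_t imm) {
    return rs_value + (uint32_t)sign_extend16(imm);
}
```

For example, addi_model(10, 4) gives 14, and passing 0 as the source value mimics addi $t0, $0, 4 by producing the constant itself. An immediate field of 0xFFFF stands for -1, so addi_model(10, 0xFFFF) gives 9.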

We need more space! • Registers are fast and convenient, but we have only 32 of them, and each one is just 32 bits wide. • That’s not enough to hold data structures like large arrays. • We also can’t access data elements that are wider than 32 bits. • We need to add some main memory to the system! • RAM is cheaper and denser than registers, so we can add lots of it. • But memory is also significantly slower, so registers should be used whenever possible. • In the past, using registers wisely was the programmer’s job. • For example, C has a keyword “register” that marks commonly-used variables which should be kept in the register file if possible. • However, modern compilers do a pretty good job of using registers intelligently and minimizing RAM accesses.

Memory review • Memory sizes are specified much like register files; here is a 2^k x n RAM. • A chip select input CS enables or “disables” the RAM. • ADRS specifies the memory location to access. • WR selects between reading from or writing to the memory. • To read from memory, WR should be set to 0. OUT will be the n-bit value stored at ADRS. • To write to memory, we set WR = 1. DATA is the n-bit value to store in memory.

MIPS memory • MIPS memory is byte-addressable, which means that each memory address references an 8-bit quantity. • The MIPS architecture can support up to 32 address lines. • This results in a 2^32 x 8 RAM, which would be 4 GB of memory. • Not all actual MIPS machines will have this much!

Loading and storing bytes • The MIPS instruction set includes dedicated load and store instructions for accessing memory, much like the CA example processor. • The main difference is that MIPS uses indexed addressing. • The address operand specifies a signed constant and a register. • These values are added to generate the effective address. • The MIPS “load byte” instruction lb transfers one byte of data from main memory to a register. lb $t0, 20($a0) # $t0 = Memory[$a0 + 20] • The “store byte” instruction sb transfers the lowest byte of data from a register into main memory. sb $t0, 20($a0) # Memory[$a0 + 20] = $t0
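The effective-address calculation for lb and sb can be sketched in C as follows. This is an illustrative model only: the Memory type, the 8192-byte size and the function names are assumptions made for the example. The sketch also sign-extends the loaded byte, which is how lb fills the upper bits of the destination register.

```c
#include <stdint.h>

#define MEM_BYTES 8192u   /* hypothetical small memory for illustration */

typedef struct {
    uint8_t bytes[MEM_BYTES];
} Memory;

/* Effective address = register value + signed constant. */
uint32_t effective_address(uint32_t base_reg, int16_t offset) {
    return base_reg + (uint32_t)(int32_t)offset;
}

/* lb: load one byte from memory and sign-extend it to 32 bits.
   Addresses wrap modulo MEM_BYTES only to keep the sketch in bounds. */
uint32_t load_byte(const Memory *m, uint32_t base_reg, int16_t offset) {
    uint8_t b = m->bytes[effective_address(base_reg, offset) % MEM_BYTES];
    return (b & 0x80u) ? (0xFFFFFF00u | b) : b;
}

/* sb: store the lowest byte of a register value into memory. */
void store_byte(Memory *m, uint32_t base_reg, int16_t offset, uint32_t reg_value) {
    m->bytes[effective_address(base_reg, offset) % MEM_BYTES] =
        (uint8_t)(reg_value & 0xFFu);
}
```

Calling store_byte(&m, 2000, 8, 0x12345678) writes only the byte 0x78 to address 2008, mirroring sb $t0, 8($a0) with $a0 = 2000.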

Indexed addressing and arrays • Indexed addressing is good for accessing contiguous locations of memory, like arrays or structures. • The constant is the base address of the array or structure. • The register indicates the element to access. • For example, if $a0 contains 0, then lb $t0, 2000($a0) reads the first byte of an array starting at address 2000. • If $a0 contains 8, then the same instruction would access the ninth byte of the array, at address 2008. • This is why array indices in C and Java start at 0 and not 1.

Arrays and indexed addressing • You can also reverse the roles of the constant and register. This can be useful if you know exactly which array or structure elements you need. • The register could contain the address of the data structure. • The constant would then be the index of the desired element. • For example, if $a0 contains 2000, then lb $t0, 0($a0) accesses the first byte of an array starting at address 2000. • Changing the constant to 8, as in lb $t0, 8($a0), would reference the ninth byte of the array, at address 2008.

Loading and storing words • You can also load or store 32-bit quantities (a complete word instead of just a byte) with the lw and sw instructions. lw $t0, 20($a0) # $t0 = Memory[$a0 + 20] sw $t0, 20($a0) # Memory[$a0 + 20] = $t0 • Most programming languages support several 32-bit data types. • Integers • Single-precision floating-point numbers • Memory addresses, or pointers • Unless otherwise stated, we’ll assume words are the basic unit of data.

Memory alignment • Keep in mind that memory is byte-addressable, so a 32-bit word actually occupies four contiguous locations of main memory. • The MIPS architecture requires words to be aligned in memory; 32-bit words must start at an address that is divisible by 4. • 0, 4, 8 and 12 are valid word addresses. • 1, 2, 3, 5, 6, 7, 9, 10 and 11 are not valid word addresses. • Unaligned memory accesses result in a bus error, which you may have unfortunately seen before. • This restriction has relatively little effect on high-level languages and compilers, but it makes things easier and faster for the processor.
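The alignment rule can be checked with a single bit test. The helper below is a minimal C sketch with a hypothetical name. It relies on the fact that an address is divisible by 4 exactly when its two low-order bits are zero.

```c
#include <stdbool.h>
#include <stdint.h>

/* A word address is aligned when it is divisible by 4,
   i.e. when its two low-order bits are both 0. */
bool is_word_aligned(uint32_t address) {
    return (address & 0x3u) == 0;
}
```

This accepts 0, 4, 8 and 12 and rejects every address in between, matching the lists on the slide.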

The array example revisited • Remember to be careful with memory addresses when accessing words. • For instance, assume an array of words begins at address 2000. • The first array element is at address 2000. • The second word is at address 2004, not 2001. • Revisiting the earlier example, if $a0 contains 2000, then lw $t0, 0($a0) accesses the first word of the array, but lw $t0, 8($a0) would access the third word of the array, at address 2008.
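The address arithmetic for word arrays amounts to scaling the index by the word size. Here is a minimal C sketch, with a hypothetical function name:

```c
#include <stdint.h>

/* Byte address of element `index` in an array of 32-bit words
   that begins at byte address `base`. Each word occupies 4 bytes. */
uint32_t word_element_address(uint32_t base, uint32_t index) {
    return base + 4u * index;
}
```

For the array at address 2000, indices 0, 1 and 2 map to addresses 2000, 2004 and 2008, so the constant 8 in lw $t0, 8($a0) selects the third word.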

Computing with memory • So, to compute with memory-based data, you must: • Load the data from memory to the register file. • Do the computation, leaving the result in a register. • Store that value back to memory if needed. • For example, let’s say that an integer array A starts at address 4096. How can we do the following using MIPS assembly language? A[2] = A[1] x A[1]
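One possible answer follows the load, compute, store pattern directly. The C sketch below models each step and shows a plausible MIPS instruction for it in the comments. It is not taken from the original slides: the helper names, the 8192-byte memory size and the big-endian byte order are all assumptions made for illustration.

```c
#include <stdint.h>

#define MEM_SIZE 8192u  /* hypothetical memory size, large enough to hold A */

/* lw: read a 32-bit word from an aligned byte address.
   Big-endian byte order is an assumption of this sketch. */
uint32_t load_word(const uint8_t *mem, uint32_t addr) {
    return ((uint32_t)mem[addr] << 24) | ((uint32_t)mem[addr + 1] << 16) |
           ((uint32_t)mem[addr + 2] << 8) | (uint32_t)mem[addr + 3];
}

/* sw: write a 32-bit word to an aligned byte address (big-endian). */
void store_word(uint8_t *mem, uint32_t addr, uint32_t value) {
    mem[addr]     = (uint8_t)(value >> 24);
    mem[addr + 1] = (uint8_t)(value >> 16);
    mem[addr + 2] = (uint8_t)(value >> 8);
    mem[addr + 3] = (uint8_t)value;
}

/* A[2] = A[1] x A[1], for an integer array A starting at address 4096. */
void square_second_element(uint8_t *mem) {
    uint32_t t0;
    t0 = load_word(mem, 4096u + 4u);   /* lw  $t0, 4100($0)  # $t0 = A[1]        */
    t0 = t0 * t0;                      /* mul $t0, $t0, $t0  # $t0 = A[1] x A[1] */
    store_word(mem, 4096u + 8u, t0);   /* sw  $t0, 4104($0)  # A[2] = $t0        */
}
```

A[1] lives at 4096 + 4 = 4100 and A[2] at 4096 + 8 = 4104 because each word takes four bytes. Both constants fit in a signed 16-bit offset, so $0 can serve as the base register.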

Basic MIPS Summary • We introduced the MIPS architecture. • The MIPS processor has thirty-two 32-bit registers. • Three-address, register-to-register instructions are used. • Immediates can be used to load or compute with constants. • Loads and stores use indexed addressing to access RAM. • Memory is byte-addressable, and words must be aligned. • In section, we’ll begin discussing control flow. • In the next lecture, we’ll continue with control flow and some other new instructions that will let us write more interesting programs.

More MIPS Summary • We saw several additional MIPS features. • Assemblers can translate more powerful pseudo-instructions into the simpler instructions actually supported in hardware. • Branches and jumps help to implement various high-level control flow structures, like conditional statements and loops. • We also studied MIPS machine language. • All instructions are the same length, 32 bits. • The three instruction formats are I-type, R-type and J-type. • The 16-bit constant field in I-type instructions is enough for most common situations. In other cases, we can always resort to longer code fragments.
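To make the fixed 32-bit formats concrete, here is a small C sketch that packs the fields of R-type and I-type instructions. The field widths follow the standard MIPS32 layout (6-bit opcode, 5-bit register fields, 5-bit shift amount, 6-bit function code, 16-bit immediate). The function names are hypothetical and the J-type format is omitted.

```c
#include <stdint.h>

/* R-type: opcode(6) | rs(5) | rt(5) | rd(5) | shamt(5) | funct(6) */
uint32_t encode_r_type(unsigned opcode, unsigned rs, unsigned rt,
                       unsigned rd, unsigned shamt, unsigned funct) {
    return ((uint32_t)(opcode & 0x3Fu) << 26) |
           ((uint32_t)(rs & 0x1Fu) << 21) |
           ((uint32_t)(rt & 0x1Fu) << 16) |
           ((uint32_t)(rd & 0x1Fu) << 11) |
           ((uint32_t)(shamt & 0x1Fu) << 6) |
           (uint32_t)(funct & 0x3Fu);
}

/* I-type: opcode(6) | rs(5) | rt(5) | immediate(16) */
uint32_t encode_i_type(unsigned opcode, unsigned rs, unsigned rt, uint16_t imm) {
    return ((uint32_t)(opcode & 0x3Fu) << 26) |
           ((uint32_t)(rs & 0x1Fu) << 21) |
           ((uint32_t)(rt & 0x1Fu) << 16) |
           (uint32_t)imm;
}
```

With $t0, $t1, $t2 and $a0 numbered 8, 9, 10 and 4, encode_r_type(0, 9, 10, 8, 0, 0x20) yields 0x012A4020 for add $t0, $t1, $t2, and encode_i_type(8, 9, 8, 4) yields 0x21280004 for addi $t0, $t1, 4.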

Functions in MIPS Summary • We focused on implementing function calls in MIPS. • We call functions using jal, passing arguments in registers $a0-$a3. • Functions place results in $v0-$v1 and return using jr $ra. • Managing resources is an important part of function calls. • To keep important data from being overwritten, registers are saved according to conventions for caller-save and callee-save registers. • Each function call uses stack memory for saving registers, storing local variables and passing extra arguments and return values. • MIPS programmers must follow many conventions. Nothing prevents a rogue program from overwriting registers or stack memory used by some other function. • In section, we’ll look at writing recursive functions.
