CS 35101 Computer Architecture Spring 2006 Week 6/7

## CS 35101 Computer Architecture Spring 2006 Week 6/7


Paul Durand (www.cs.kent.edu/~durand)
Course url: www.cs.kent.edu/~durand/cs35101.htm

**Heads Up**
• Week 6 & 7 material
  • Digital Logic Design
  • Processor organization / description
  • MIPS arithmetic operations
  • PH 3.1, 3.2, 3.3
• Reminders
  • Midterm #1: Thursday, February 23rd
• Next week's material
  • MIPS arithmetic operations
  • Reading assignment: PH 3.4 through 3.5

To make the architect's crucial task even conceivable, it is necessary to separate the architecture, the definition of the product as perceivable by the user, from its implementation. Architecture versus implementation defines a clean boundary between parts of the design task, and there is plenty of work on each side of it.
The Mythical Man-Month, Brooks, pg. 256

**Review: MIPS Organization, so far**
[Figure: block diagram of the MIPS organization so far. The processor cycles through Fetch (PC = PC+4), Decode, and Exec. The Register File holds 32 registers ($zero - $ra), with 5-bit src1, src2, and dst addresses, 32-bit src1 and src2 data outputs, and a 32-bit write data input. The PC feeds one adder that adds 4 and one that adds the br offset. The ALU operates on 32-bit values. Memory holds 2^30 words of 32 bits, with a 32-bit read/write addr, 32-bit read data, and 32-bit write data; words sit at byte addresses 0…0000, 0…0100, 0…1000, 0…1100, up to 1…1100 (big Endian).]

**Processor Organization**
• Processor control needs to have the
  • Ability to input instructions from memory
  • Logic to control instruction sequencing and to issue signals that control the way information flows between the datapath components and the operations performed by them
• Processor datapath needs to have the
  • Ability to load data from and store data to memory
  • Interconnected components - functional units (e.g., ALU) and storage units (e.g., Register File) - for executing the ISA
• Need a way to describe the organization
  • High level (block diagram) description
  • Schematic (gate level) description
  • Textual (simulation/synthesis level) description

**Levels of Description of a Digital System**
Moving down the list, the models become less abstract and more accurate, and they simulate more slowly.
• Architectural: models programmer's view at a high level; written in your favorite programming language
• Functional/Behavioral: more detailed model, like the block diagram view
• Register Transfer: model is in terms of datapath FUs, registers, busses; register xfer operations are clock phase accurate
• Logic: model is in terms of logic gates; delay information can be specified for gates; digital waveforms
• Circuit: model is in terms of circuits (electrical behavior); accurate analog waveforms
Tools noted on the slide:
• Schematic capture + logic simulation package like LogicWorks
• Special languages + simulation systems for describing the inherent parallel activity in hardware (VHDL and Verilog)

**Why Simulate First?**
• Physical breadboarding
  • discrete components/lower scale integration precedes actual construction of the prototype
  • verification of the initial design
  • No longer possible as designs reach higher levels of integration!
• Simulation before construction - aka functional verification
  • high level constructs mean faster design and test
  • can play "what if" more easily
  • however, limited performance (can't usually simulate all possible input transitions) and accuracy (can't usually model wiring delays accurately)

Because ease of use is the purpose, this ratio of function to conceptual complexity is the ultimate test of system design. Neither function alone nor simplicity alone defines a good design.
The Mythical Man-Month, Brooks, pg.
43

**Review: MIPS Organization, so far**
[Figure: the same block diagram of the processor (Fetch, Decode, Exec), Register File, ALU, PC adders, and Memory shown earlier in the deck.]

**Arithmetic**
• Where we've been:
  • Abstractions:
    • Instruction Set Architecture (ISA)
    • Assembly and machine language
• What's up ahead:
  • Implementing the architecture
[Figure: ALU symbol with 32-bit inputs A and B, a 4-bit operation select m, a 32-bit result, and 1-bit zero and ovf outputs.]

**Number Representation**
• Bits are just bits (have no inherent meaning)
  • conventions define the relationships between bits and numbers
• Binary numbers (base 2) - integers 0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 . . .
  • in decimal from 0 to 2^n - 1 for n bits
• Of course, it gets more complicated
  • storage locations (e.g., register file words) are finite, so have to worry about overflow (i.e., when the number is too big to fit into 32 bits)
  • have to be able to represent negative numbers, e.g., how do we specify -8 in addi $sp, $sp, -8 # $sp = $sp - 8
  • in real systems have to provide for more than just integers, e.g., fractions and real numbers (and floating point)

**Possible Representations**
• Issues:
  • balance
  • number of zeros
  • ease of operations
• Which one is best? Why?

**MIPS Representations**
• 32-bit signed numbers (2's complement):

| Bit pattern (base two) | Value (base ten) |
|---|---|
| 0000 0000 0000 0000 0000 0000 0000 0000 | 0 |
| 0000 0000 0000 0000 0000 0000 0000 0001 | +1 |
| 0000 0000 0000 0000 0000 0000 0000 0010 | +2 |
| ... | ... |
| 0111 1111 1111 1111 1111 1111 1111 1110 | +2,147,483,646 |
| 0111 1111 1111 1111 1111 1111 1111 1111 | +2,147,483,647 (maxint) |
| 1000 0000 0000 0000 0000 0000 0000 0000 | -2,147,483,648 (minint) |
| 1000 0000 0000 0000 0000 0000 0000 0001 | -2,147,483,647 |
| 1000 0000 0000 0000 0000 0000 0000 0010 | -2,147,483,646 |
| ... | ... |
| 1111 1111 1111 1111 1111 1111 1111 1101 | -3 |
| 1111 1111 1111 1111 1111 1111 1111 1110 | -2 |
| 1111 1111 1111 1111 1111 1111 1111 1111 | -1 |

• What if the bit string represented addresses?
  • need operations that also deal with only positive (unsigned) integers

**Review: Signed Binary Representation**
• A 4-bit two's complement number runs from -2^3 (the minimum) through -(2^3 - 1), and on up to 2^3 - 1 (the maximum)
• Negation example: starting from 0101, complement all the bits to get 1010, then add a 1 to get 1011

**Two's Complement Operations**
• Negating a two's complement number: complement all the bits and add a 1
  • remember: "negate" and "invert" are quite different!
• Converting n-bit numbers into numbers with more than n bits:
  • MIPS 16-bit immediate gets converted to 32 bits for arithmetic
  • copy the most significant bit (the sign bit) into the other bits
    • 0010 -> 0000 0010
    • 1010 -> 1111 1010
  • sign extension versus zero extension (lb vs. lbu)

**Goal: Design an ALU for the MIPS ISA**
• Must support the Arithmetic/Logic operations of the ISA
• Tradeoffs of cost and speed based on frequency of occurrence, hardware budget

**MIPS Arithmetic and Logic Instructions**
• Instruction formats (bit positions marked on the slide: 31, 25, 20, 15, 5, 0)
  • R-type: op, Rs, Rt, Rd, funct
  • I-Type: op, Rs, Rt, Immed 16
• Signed arithmetic generates overflow, but no carry out

| Type | op | funct |
|---|---|---|
| ADDI | 001000 | xx |
| ADDIU | 001001 | xx |
| SLTI | 001010 | xx |
| SLTIU | 001011 | xx |
| ANDI | 001100 | xx |
| ORI | 001101 | xx |
| XORI | 001110 | xx |
| LUI | 001111 | xx |

| Type | op | funct |
|---|---|---|
| ADD | 000000 | 100000 |
| ADDU | 000000 | 100001 |
| SUB | 000000 | 100010 |
| SUBU | 000000 | 100011 |
| AND | 000000 | 100100 |
| OR | 000000 | 100101 |
| XOR | 000000 | 100110 |
| NOR | 000000 | 100111 |
| | 000000 | 101000 |
| | 000000 | 101001 |
| SLT | 000000 | 101010 |
| SLTU | 000000 | 101011 |
| | 000000 | 101100 |

**Design Trick: Divide & Conquer**
• Break the problem into simpler problems, solve them and glue together the solution
• Example: assume the immediates have been taken care of before the ALU
  • now down to 10 operations
  • can encode in 4 bits (the codes below read as octal and match the low four bits of the funct field)

| Code | Operation |
|---|---|
| 00 | add |
| 01 | addu |
| 02 | sub |
| 03 | subu |
| 04 | and |
| 05 | or |
| 06 | xor |
| 07 | nor |
| 12 | slt |
| 13 | sltu |

**Addition & Subtraction**
• Just like in grade school (carry/borrow 1s)
  • 0111 + 0110 = 1101
  • 0111 - 0110 = 0001
  • 0110 - 0101 = 0001
• Two's complement operations easy
  • subtraction using addition of negative numbers: 0111 - 0110 = 0001 becomes 0111 + 1010 = 1 0001 (the carry out of the top bit is dropped)
• Overflow (result too large for finite computer word):
  • e.g., adding two n-bit numbers does not yield an n-bit number: 0111 + 0001 = 1000

**Building a 1-bit Binary Adder**
• S = A xor B xor carry_in
• carry_out = (A and B) or (A and carry_in) or (B and carry_in) (majority function)
[Figure: 1-bit Full Adder block with inputs A, B, and carry_in and outputs S and carry_out.]
• How can we use it to build a 32-bit adder?
• How can we modify it easily to build an adder/subtractor?

**Building 32-bit Adder**
[Figure: chain of 32 1-bit FAs. FA i takes A_i, B_i, and c_i and produces S_i and c_(i+1), with c0 = carry_in and c32 = carry_out.]
• Just connect the carry-out of the least significant bit FA to the carry-in of the next least significant bit and connect . . .
• Ripple Carry Adder (RCA)
  • advantage: simple logic, so small (low cost)
  • disadvantage: slow and lots of glitching (so lots of energy consumption)

**Building 32-bit Adder/Subtractor**
[Figure: the same chain of 32 1-bit FAs with an add/subt control line (0 = add, 1 = subt). The control drives c0 = carry_in, and each B input passes through unchanged if control = 0 and inverted if control = 1 (shown for bit 0 as B0 versus !B0).]
• Remember 2's complement is just
  • complement all the bits
  • add a 1 in the least significant bit
• Example with A = 0111 and B = 0110: 0111 - 0110 becomes 0111 + 1010

**Overflow Detection and Effects**
• Overflow: the result is too large to represent in the number of bits allocated
• When adding operands with different signs, overflow cannot occur! Overflow occurs when
  • adding two positives yields a negative
  • or, adding two negatives gives a positive
  • or, subtracting a negative from a positive gives a negative
  • or, subtracting a positive from a negative gives a positive
• On overflow, an exception (interrupt) occurs
  • Control jumps to predefined address for exception
  • Interrupted address (address of instruction causing the overflow) is saved for possible resumption
• Don't always want to detect (interrupt on) overflow

**New MIPS Instructions**
• Sign extend - addiu, sltiu
• Zero extend - lbu
• No overflow detected - addu, subu, addiu, sltu, sltiu

**Conclusion**
• We can build an ALU to support the MIPS ISA
  • we can efficiently perform subtraction using two's complement
  • we can replicate a 1-bit ALU to produce a 32-bit ALU
• Important points about hardware
  • all of the gates are always working (concurrent)
  • the speed of a gate is affected by the number of inputs to the gate (fan-in) and the number of gates that the output is connected to (fan-out)
  • the speed of a circuit is affected by the number of gates in series (on the "critical path" or the "number of levels of logic")
• Our primary focus: comprehension, however,
  • Clever changes to organization can improve performance (similar to using better algorithms in software)
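
The rules on the Two's Complement Operations slide (negate by complementing all the bits and adding a 1; widen by copying the sign bit) can be sketched in a few lines of Python. This is an illustrative sketch, not part of the original deck: the function names `negate`, `sign_extend`, `zero_extend`, and `to_signed` are hypothetical, and bit patterns are modeled as non-negative Python integers.

```python
def negate(value: int, bits: int = 32) -> int:
    """Two's complement negation: complement all the bits, then add a 1."""
    mask = (1 << bits) - 1
    return ((value ^ mask) + 1) & mask  # keep only the low `bits` bits


def sign_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Widen by copying the most significant (sign) bit into the new bits."""
    value &= (1 << from_bits) - 1
    if (value >> (from_bits - 1)) & 1:
        # Sign bit is 1: fill bit positions from_bits .. to_bits-1 with ones.
        fill = ((1 << (to_bits - from_bits)) - 1) << from_bits
        return value | fill
    return value


def zero_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Widen by filling the new upper bits with zeros (to_bits only documents the target width)."""
    return value & ((1 << from_bits) - 1)


def to_signed(value: int, bits: int = 32) -> int:
    """Interpret a bit pattern as a signed two's complement integer."""
    if (value >> (bits - 1)) & 1:
        return value - (1 << bits)
    return value


if __name__ == "__main__":
    print(format(negate(0b0101, 4), "04b"))          # negation example from the slides
    print(format(sign_extend(0b1010, 4, 8), "08b"))  # sign bit copied upward
    print(format(zero_extend(0b1010, 4, 8), "08b"))  # upper bits left as zeros
    print(to_signed(negate(8)))                      # the -8 used in addi $sp, $sp, -8
```

The difference between `sign_extend` and `zero_extend` mirrors the lb versus lbu distinction on the slide: the same low bits, but different upper bits after widening.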
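
The 1-bit full adder equations and the ripple carry adder/subtractor slides can be simulated bit by bit. The sketch below is a software model under stated assumptions, not a hardware description from the course: `full_adder` and `add_sub` are hypothetical names, and the `control` argument plays the role of the add/subt line (0 = add, 1 = subt), which both inverts each B bit and supplies c0.

```python
def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """1-bit full adder: S = A xor B xor carry_in; carry_out is the majority."""
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (a & carry_in) | (b & carry_in)
    return s, carry_out


def add_sub(a: int, b: int, control: int, bits: int = 32) -> tuple[int, int]:
    """Ripple carry adder/subtractor built from `bits` full adders.

    control = 0 computes a + b; control = 1 computes a - b by inverting
    every bit of b and feeding a 1 into the least significant carry-in.
    Returns (result bit pattern, carry out of the top bit).
    """
    carry = control  # c0 = carry_in
    result = 0
    for i in range(bits):
        a_i = (a >> i) & 1
        b_i = ((b >> i) & 1) ^ control  # B_i if control = 0, !B_i if control = 1
        s_i, carry = full_adder(a_i, b_i, carry)
        result |= s_i << i
    return result, carry


if __name__ == "__main__":
    total, c_out = add_sub(0b0111, 0b0110, 0, bits=4)
    print(format(total, "04b"), c_out)
    diff, c_out = add_sub(0b0111, 0b0110, 1, bits=4)
    print(format(diff, "04b"), c_out)
```

The loop makes the ripple carry cost visible: bit i cannot be computed until the carry from bit i-1 is known, which is the serial dependency behind the "slow" disadvantage listed on the slide.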
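
The four overflow cases on the Overflow Detection and Effects slide reduce to sign-bit comparisons. The following is a minimal sketch with hypothetical helper names (`sign_bit`, `add_overflows`, `sub_overflows`); it only reports whether overflow would occur and does not model the exception that MIPS raises.

```python
def sign_bit(value: int, bits: int) -> int:
    """Most significant bit of a `bits`-wide pattern."""
    return (value >> (bits - 1)) & 1


def add_overflows(a: int, b: int, bits: int = 32) -> bool:
    """Signed a + b overflows when both operands share a sign the result lacks."""
    result = (a + b) & ((1 << bits) - 1)
    return (sign_bit(a, bits) == sign_bit(b, bits)
            and sign_bit(result, bits) != sign_bit(a, bits))


def sub_overflows(a: int, b: int, bits: int = 32) -> bool:
    """Signed a - b overflows when the operand signs differ and the result's sign differs from a's."""
    result = (a - b) & ((1 << bits) - 1)
    return (sign_bit(a, bits) != sign_bit(b, bits)
            and sign_bit(result, bits) != sign_bit(a, bits))


if __name__ == "__main__":
    print(add_overflows(0b0111, 0b0001, bits=4))  # two positives giving a negative
    print(add_overflows(0b0111, 0b1010, bits=4))  # operands with different signs
    print(sub_overflows(0b0111, 0b1111, bits=4))  # positive minus negative
```

Instructions such as addu and subu correspond to performing the masked arithmetic without consulting a check like this at all.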