Chapter Five Part 2: Control


Control

  • Selecting the operations to perform (ALU, read/write, etc.)

  • Controlling the flow of data (multiplexor inputs)

  • Information comes from the 32 bits of the instruction

  • Example: add $8, $17, $18

    Instruction Format:

    op      rs     rt     rd     shamt  funct
    000000  10001  10010  01000  00000  100000

  • ALU's operation based on instruction type and function code


Instructions in MIPS:


Overview of MIPS

  • three instruction formats

    R    op | rs | rt | rd | shamt | funct

    I    op | rs | rt | 16 bit address

    J    op | 26 bit address


Machine Language

  • R-Instruction Format:

    op      rs     rt     rd     shamt  funct
    000000  10001  10010  01000  00000  100000

    op: op code

    rs: the first register source operand

    rt: the second register source operand

    rd: the register destination operand; gets result

    shamt: shift amount (see chap 5)

    funct: function; selects the variant of the operation in the op field.
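
The field boundaries can be checked by slicing the example word with shifts and masks. This is a minimal sketch in Python; the function name decode_r_type is made up for illustration, and the word is the add $8, $17, $18 encoding shown earlier.

```python
def decode_r_type(word: int) -> dict:
    """Split a 32-bit MIPS R-format word into its six fields."""
    return {
        "op":    (word >> 26) & 0x3F,  # bits 31-26
        "rs":    (word >> 21) & 0x1F,  # bits 25-21
        "rt":    (word >> 16) & 0x1F,  # bits 20-16
        "rd":    (word >> 11) & 0x1F,  # bits 15-11
        "shamt": (word >> 6) & 0x1F,   # bits 10-6
        "funct": word & 0x3F,          # bits 5-0
    }


# add $8, $17, $18 (underscores only separate the fields for readability)
word = 0b000000_10001_10010_01000_00000_100000
fields = decode_r_type(word)
print(fields)
```

The decoded register numbers should come back as rs = 17, rt = 18, rd = 8, matching the operands of the add.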


Branches and Jumps

  • Instructions:

    bne $t4,$t5,Label    Next instruction is at Label if $t4 ≠ $t5

    beq $t4,$t5,Label    Next instruction is at Label if $t4 = $t5

    j Label              Next instruction is at Label

  • Formats:

    I    op | rs | rt | 16 bit address

    J    op | 26 bit address


Control

  • Must describe ALU control hardware to compute 3-bit ALU control input

  • Later will describe main control unit (CU)

  • Common to use several levels of control. Reduces size of CU. May increase speed of CU.


Control

  • ALU performs differently depending on instruction class:

    • Load/store: use ALU to compute memory address (addition)

    • R-type: performs one of 5 actions depending on value of the 6-bit function field.

    • Branch: ALU subtracts


Control

ALU control unit inputs:

  • function field of instruction

  • 2-bit control field called ALUOp. Sent by the main control unit.

  • ALUOp determined by instruction type:

    00 = lw, sw: means ADD
    01 = beq: means SUB
    10 = arithmetic: means use the function code

  • ALU control unit outputs:

    • 3-bit signal to the ALU


    Control

    Design question: Why do we have a funct field in the R-type instruction?

    Why not just more op-codes?

    The available op-codes are already used up

    Since the R-type instruction doesn’t need all 32 bits for the opcode and register fields, there is extra space that can be used for the funct field


    Control

    • ALU control input (3 bit control line to ALU)

      000  AND
      001  OR
      010  add
      110  subtract
      111  set-on-less-than

    • Why is the code for subtract 110 and not 011?


    Control

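    [Figure: the instruction from memory supplies bits 31-26 (opcode) to the main control unit, which sends the 2-bit ALUOp to the ALU control unit. Bits 5-0 (funct) also go to the ALU control unit, which sends the 3-bit ALU control code to the ALU.]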


    Control

    • Below: how to set ALU control inputs based on the 2-bit ALUOp control and the 6-bit function code.


    ALU Control hardware implementation

    • Many ways to implement the mapping from 2-bit ALUOp field and 6-bit funct field to the 3 ALU operation control bits.

    • Only need a small number of the 64 possible values of the funct field

    • funct field only used when ALUOp bits equal 10.

    • So can design small logic that recognizes a subset of possible values and sets the ALU control bits.
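
Before reducing the mapping to gates, it can help to see it as a plain lookup. The sketch below is a behavioral model only, not the hardware. It assumes the standard MIPS funct encodings for add, sub, and, or and slt (only the add encoding appears in these slides), and the names FUNCT_TO_ALU and alu_control are invented for illustration.

```python
# funct values of the five R-type operations (standard MIPS encodings,
# assumed here) mapped to the 3-bit ALU control codes listed earlier.
FUNCT_TO_ALU = {
    0b100000: 0b010,  # add
    0b100010: 0b110,  # subtract
    0b100100: 0b000,  # AND
    0b100101: 0b001,  # OR
    0b101010: 0b111,  # set-on-less-than
}


def alu_control(alu_op: int, funct: int) -> int:
    """Return the 3-bit ALU control code for a 2-bit ALUOp and 6-bit funct."""
    if alu_op == 0b00:          # lw, sw: address calculation, always add
        return 0b010
    if alu_op == 0b01:          # beq: compare by subtracting
        return 0b110
    if alu_op == 0b10:          # R-type: the funct field decides
        return FUNCT_TO_ALU[funct]
    raise ValueError("ALUOp 11 is not used")


print(format(alu_control(0b10, 0b101010), "03b"))
```

Only five of the 64 possible funct values appear in the table, which is why the real logic can be so small.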


    ALU Control implementation

    • Technique:

      • Look for unique bits that will identify a particular output function.

    • Example:

      • Branch eq has ALUOp = 01. No other code has a 1 in the least significant bit.

      • Thus the 1 in the lsb uniquely identifies the branch and thus uniquely identifies ALU code 110.

      • The bit in the msb of ALUOp is a don’t care condition in this case.

      • See next slide


    ALU Control implementation

    • Example 2:

      • A 1 in the msb of ALUOp uniquely identifies the arithmetic function.

      • Thus the lsb of ALUOp is a don’t care

      • Also, funct bits 5 and 4 are always 10, thus don’t help recognize the ALU control bits and can be ignored.


    ALU Control implementation

    • Describe it using a truth table (can turn into gates).

    • Notes:

      • Inputs are the ALUOp and funct code field

      • Only show entries for which ALU control must have a specific value, e.g. ALUOp never takes the value 11, so the table contains 1X and X1 rather than 10 and 01

      • When funct field used, the first two bits (F5, F4) are always 10, so they are don’t care terms


    ALU Control implementation

    • Implementing the truth table. Call the output bits Op2, Op1, Op0.

    • Truth table for Op2 = 1 (leftmost bit of Operation field in previous table)

    • Only show entries for which output is 1.

    • Truth table above shows input combinations for which the ALU control is 010, 001, 110, 111.

    • Combinations 011, 100, and 101 are not used.

    • Don’t care about funct code if the ALUOp field is not 10.


    ALU Control implementation

    • Truth table for Op2 = 1 (leftmost bit of Operation field in previous table)


    ALU Control implementation

    Truth table for Op1 = 1 (middle bit of Operation field in bottom table)


    ALU Control implementation

    • Truth table for Op0 = 1 (rightmost bit of Operation field in bottom table)
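
The per-bit tables did not survive in this transcript, so the equations below are one possible minimization, worked out from the don't-care terms described earlier rather than copied from the slides. They assume the standard MIPS funct encodings (add 100000, sub 100010, and 100100, or 100101, slt 101010). The function name alu_control_gates is hypothetical, and each argument is a single bit (0 or 1).

```python
def alu_control_gates(alu_op1, alu_op0, f3, f2, f1, f0):
    """Gate-level sketch of the ALU control; returns (Op2, Op1, Op0)."""
    op2 = alu_op0 | (alu_op1 & f1)        # beq, or R-type with F1 set (sub, slt)
    op1 = (alu_op1 ^ 1) | (f2 ^ 1)        # NOT ALUOp1 OR NOT F2
    op0 = alu_op1 & (f3 | f0)             # R-type with F3 or F0 set (slt, or)
    return op2, op1, op0


# (ALUOp1, ALUOp0, F3, F2, F1, F0) -> expected 3-bit ALU control
cases = {
    (0, 0, 0, 0, 0, 0): (0, 1, 0),  # lw/sw  -> add
    (0, 1, 0, 0, 0, 0): (1, 1, 0),  # beq    -> subtract
    (1, 0, 0, 0, 0, 0): (0, 1, 0),  # add
    (1, 0, 0, 0, 1, 0): (1, 1, 0),  # sub
    (1, 0, 0, 1, 0, 0): (0, 0, 0),  # and
    (1, 0, 0, 1, 0, 1): (0, 0, 1),  # or
    (1, 0, 1, 0, 1, 0): (1, 1, 1),  # slt
}
for inputs, expected in cases.items():
    assert alu_control_gates(*inputs) == expected
```

F5 and F4 do not appear as inputs at all, which reflects the note that they are don't-care terms.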


    ALU control implementation


    Full Control Implementation

    • Have now designed the ALU control

    • Must design the full control logic


    Control

    • 3 instruction formats:

    R-type       0          rs      rt      rd      shamt   funct
                 31-26      25-21   20-16   15-11   10-6    5-0

    Load/Store   35 or 43   rs      rt      address
                 31-26      25-21   20-16   15-0

    Branch       4          rs      rt      address
                 31-26      25-21   20-16   15-0


    Control

    • Opcode is always in bits 31-26. Refer to this as Op[5-0]

    • The two registers to read are always rs and rt, at positions 25-21 and 20-16. True for R-type, branch equal, and store.

    • Base register for load and store is always in bit positions 25-21 (rs)

    • The 16-bit offset for branch equal, load, store always in bit positions 15-0

    • Destination register is in one of two places:

      • Load: positions 20-16

      • R-type: positions 15-11 (rd)

      • Need a mux to select this

    • Can now add labels, muxes, control lines to datapath. See next slide.
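
The destination-register mux can be modelled in a few lines. This is a sketch only: the select signal is called reg_dst here, after the RegDst name commonly used for this mux in the standard textbook datapath, which is an assumption since the slides do not name it. The function name write_register is also made up.

```python
def write_register(instr: int, reg_dst: int) -> int:
    """Model the mux that picks the Write register number.

    reg_dst = 0 selects bits 20-16 (rt, used by loads);
    reg_dst = 1 selects bits 15-11 (rd, used by R-type).
    """
    rt = (instr >> 16) & 0x1F
    rd = (instr >> 11) & 0x1F
    return rd if reg_dst else rt


# add $8, $17, $18: R-type, destination is rd = 8
add_word = 0x02324020
# lw $9, 4($10): opcode 35, base register rs = 10, destination rt = 9
lw_word = (35 << 26) | (10 << 21) | (9 << 16) | 4

print(write_register(add_word, 1), write_register(lw_word, 0))
```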


    Control

    Added: instruction labels, an extra mux (for the Write register number input of the register file), the ALU control block, write signals for state elements, a read signal for data memory, and control signals for the multiplexors.


    Control

    • The control unit can set all but one of the control signals based on the opcode

    • PCSrc is set if instruction is branch on equal and Zero output of ALU is 1.

    • Control units and signals added to datapath on next slide

    • Table on next slide defines how the control signals should be set for each opcode.
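
The table itself is missing from this transcript. As a stand-in, the sketch below uses the settings usually given for this single-cycle design. The signal names other than ALUOp and PCSrc (RegDst, ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite, Branch) are assumptions taken from the standard textbook datapath, not from these slides, and None marks a don't-care. Opcodes 0, 35, 43 and 4 are the R-type, load, store and branch opcodes listed earlier.

```python
# opcode -> control signal settings (None = don't care)
CONTROL = {
    0:  dict(RegDst=1,    ALUSrc=0, MemtoReg=0,    RegWrite=1,
             MemRead=0, MemWrite=0, Branch=0, ALUOp=0b10),   # R-format
    35: dict(RegDst=0,    ALUSrc=1, MemtoReg=1,    RegWrite=1,
             MemRead=1, MemWrite=0, Branch=0, ALUOp=0b00),   # lw
    43: dict(RegDst=None, ALUSrc=1, MemtoReg=None, RegWrite=0,
             MemRead=0, MemWrite=1, Branch=0, ALUOp=0b00),   # sw
    4:  dict(RegDst=None, ALUSrc=0, MemtoReg=None, RegWrite=0,
             MemRead=0, MemWrite=0, Branch=1, ALUOp=0b01),   # beq
}


def pc_src(opcode: int, zero: int) -> int:
    """PCSrc is the one signal that also depends on the ALU's Zero output."""
    return CONTROL[opcode]["Branch"] & zero


print(pc_src(4, 1), pc_src(4, 0), pc_src(0, 1))
```

Every entry except PCSrc is a pure function of the opcode, which is what lets the main control unit be simple combinational logic.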


    Control

    Datapath with control lines and control units


    Control: use of the datapath

    • R-type instructions. Four steps (all are done in one clock cycle)

      • Signals stabilize in the circuit in roughly the order of these 4 steps

      • Example: add $t1, $t2, $t3

      • Step 1: Fetch and increment PC.


    Control: use of the datapath

    • Step 2: Two registers are read ($t2 and $t3).

    • Main CU computes setting of control lines.


    Control: use of the datapath

    • Step 3: ALU operates on data using the function code bits.


    Control: use of the datapath

    • Step 4: Result from ALU written into register file (register $t1).


    Control: use of the datapath

    • Execution of a load word: lw $t1, offset($t2)

      • Step 1: instruction fetched and PC incremented

      • Step 2: A register ($t2) value is read

      • Step 3: The ALU computes the sum of the register value and the sign-extended lower 16 bits of the instruction (the offset)

      • Step 4: the sum from ALU used as the address for data memory

      • Step 5: The data from memory written into the register file. Register destination given by bits 20-16 of the instruction ($t1)
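
Steps 2 to 5 of the load can be traced in software. This is a rough functional model, not a timing model. The register file is a plain list, data memory is a dictionary keyed by byte address, and the names sign_extend16 and load_word are invented for illustration.

```python
def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def load_word(instr: int, regs: list, memory: dict) -> int:
    """Execute lw rt, offset(rs); returns the memory address that was used."""
    rs = (instr >> 21) & 0x1F                       # step 2: base register number
    rt = (instr >> 16) & 0x1F                       # destination, bits 20-16
    address = (regs[rs] + sign_extend16(instr)) & 0xFFFFFFFF  # step 3: ALU add
    regs[rt] = memory[address]                      # steps 4 and 5
    return address


regs = [0] * 32
regs[10] = 0x1000                 # $t2 is register 10
memory = {0x0FFC: 42}
# lw $t1, -4($t2): opcode 35, rs = 10, rt = 9, offset = 0xFFFC (-4)
instr = (35 << 26) | (10 << 21) | (9 << 16) | 0xFFFC
address = load_word(instr, regs, memory)
print(hex(address), regs[9])
```

A negative offset is used on purpose, since it is the case where forgetting the sign extension gives a wrong address.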


    Control: use of the datapath

    • The load word instruction ( lw $t1, offset($t2) )


    Control: use of the datapath

    • Branch equal instruction beq $t1, $t2, offset

      • Step 1: instruction fetched, PC incremented

      • Step 2: two registers read ($t1, $t2)

      • Step 3: ALU subtracts. PC + 4 added to the sign-extended lower 16 bits of the instruction (offset) shifted left by two. Result is the branch target

      • Zero result used by CU to decide which adder result to store into the PC

      • Diagram next slide.
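
The branch decision and target arithmetic from step 3 can be sketched as follows. The function name next_pc_beq is hypothetical, and the two register values are passed in directly rather than read from a register file.

```python
def next_pc_beq(pc: int, instr: int, a: int, b: int) -> int:
    """Return the next PC for beq, given the two register values a and b."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    offset = instr & 0xFFFF
    if offset & 0x8000:                 # sign-extend the 16-bit field
        offset -= 0x10000
    target = (pc_plus_4 + (offset << 2)) & 0xFFFFFFFF   # word offset -> bytes
    zero = int(((a - b) & 0xFFFFFFFF) == 0)             # ALU subtracts
    return target if zero else pc_plus_4                # the PCSrc mux


beq = (4 << 26) | (9 << 21) | (10 << 16) | 3    # beq $t1, $t2, +3 words
print(hex(next_pc_beq(0x00400000, beq, 7, 7)))  # taken
print(hex(next_pc_beq(0x00400000, beq, 7, 8)))  # not taken
```

Both adder results are always computed, and the Zero output only chooses between them.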


    Control: use of the datapath

    • Branch equal instruction beq $t1, $t2, offset


    Finalizing Control

    • Instruction formats and resulting control signals

    • Encoding of the instruction formats:


    Finalizing Control: a sideways truth table


    Finalizing Control


    Finalizing Control

    • Implementing jumps

      • Similar to branch but computes target differently and is not conditional

      • Low-order 2 bits always 00

      • Next lower 26 bits come from the 26-bit immediate field

      • Upper 4 bits of the address come from the PC of the jump instruction + 4

    Jump         2          address
                 31-26      25-0


    Finalizing Control

    • Implementing jumps

      • To implement, store into PC the concatenation of

        • Upper 4 bits of current PC + 4 (bits 31-28)

        • The 26-bit immediate field of the jump instruction

        • The bits 00

      • Need an additional multiplexor to select among:

        • Incremented PC

        • Branch target PC

        • Or the jump target PC

    • Need additional control signal called jump. Asserted when opcode is 2.
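
The concatenation is easy to get wrong by one shift, so a small sketch may help. The function name jump_target is made up, and the example addresses are arbitrary.

```python
def jump_target(pc: int, instr: int) -> int:
    """Build the 32-bit jump address from PC + 4 and the 26-bit immediate."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    upper4 = pc_plus_4 & 0xF0000000      # bits 31-28 of PC + 4
    imm26 = instr & 0x03FFFFFF           # bits 25-0 of the instruction
    return upper4 | (imm26 << 2)         # shifting left appends the bits 00


j = (2 << 26) | 0x0100000                # opcode 2, immediate 0x0100000
print(hex(jump_target(0x0040000C, j)))
```

Shifting the immediate left by two is what supplies the low-order 00 bits, so no separate step is needed for them.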


    Finalizing Control


    Exercise

    • Add an instruction that adds a register value with a memory value.

      • How must the datapath change to accommodate this instruction?

      • What new control signals are necessary?


    Exercise: R + M -> R

    Problem: what is the address size? Must sign extend.


    Exercise


    Our Simple Control Structure

    • All of the logic is combinational

    • We wait for everything to settle down, and the right thing to be done

      • ALU might not produce “right answer” right away

      • we use write signals along with clock to determine when to write

    • Cycle time determined by length of the longest path

    We are ignoring some details like setup and hold times


    Single Cycle: why it is not used

    • Bottom line: it’s inefficient.

    • Every clock cycle must have same length for every instruction.

      • Clock cycle determined by the longest possible path in machine

      • In this instruction set: load instruction

      • Load uses 5 functional units in series:

        • Instruction memory

        • Register file

        • ALU

        • Data memory

        • Register file

    • Thus cycles per instruction (CPI) is 1.


    Single Cycle: performance

    • Assume the operation time for the major function units in an implementation is:

      • Memory units: 200 ps

      • ALU/adders: 100 ps

      • Register file (read or write): 50 ps

      • Muxes, CU, PC accesses, sign extension, wires: no delay

    • Instruction mix:

      • 25% loads

      • 10% stores

      • 45% R-format

      • 15% branches

      • 5% jumps


    Single Cycle: performance

    • Compare:

      • An implementation in which every instruction operates in 1 clock cycle of a fixed length

      • An implementation where every instruction takes 1 clock cycle using a variable-length clock. Clock is only as long as it needs to be.

      • Latter approach is not really practical.

    • Comparing execution times:

      CPU execution time = Instruction count X CPI X Clock cycle time


    Single Cycle: performance

    • Need clock cycle time for the two implementations. Instruction count and CPI are the same.

    • Critical path for the different instructions:


    Single Cycle: performance

    • Using these critical paths, can compute the required length for each instruction class:
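
The table of per-class lengths is missing from this transcript. The sketch below recomputes it from the unit delays given earlier (memory 200 ps, ALU 100 ps, register file 50 ps). The load path is the one listed in the slides; the other unit sequences are assumptions, chosen as the usual critical paths for this datapath.

```python
# Delay of each functional unit in picoseconds (from the slide's assumptions)
UNIT_PS = {"imem": 200, "regread": 50, "alu": 100, "dmem": 200, "regwrite": 50}

# Functional units on the critical path of each instruction class
PATHS = {
    "R-format": ["imem", "regread", "alu", "regwrite"],
    "lw":       ["imem", "regread", "alu", "dmem", "regwrite"],
    "sw":       ["imem", "regread", "alu", "dmem"],
    "beq":      ["imem", "regread", "alu"],
    "j":        ["imem"],
}

latency_ps = {name: sum(UNIT_PS[u] for u in path) for name, path in PATHS.items()}
for name, ps in latency_ps.items():
    print(f"{name:9s}{ps:4d} ps")
```

The load comes out longest because it is the only class that uses all five units in series.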


    Single Cycle: performance

    • Time of single clock cycle implementation:

      • Clock cycle of the single-cycle implementation is determined by the longest instruction (600 ps).

      • Since CPI = 1 and the longest instruction (lw) takes 600 ps:

        CPU execution time = Instruction count X 1 X 600 ps


    Single Cycle: performance

    • Average time per instruction with variable clock:

      CPU clock cycle = 600 x 25% + 550 x 10% + 400 x 45% + 350 x 15% + 200 x 5%

      = 447.5 ps

      Average CPU execution time = Instruction count X 1 X 447.5 ps


    Single Cycle: performance

    • Variable clock implementation has shorter average clock cycle, so is faster.

    • Performance ratio:

      CPU performance variable clock / CPU performance single clock

      = CPU execution time single clock / CPU execution time variable clock

      = (IC x CPU clock cycle single clock) / (IC x CPU clock cycle variable clock)

      = CPU clock cycle single clock / CPU clock cycle variable clock

      = 600 / 447.5 = 1.34
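
The arithmetic can be re-checked with a few lines of Python. The per-class times and the instruction mix are the ones used in the slides; the variable names are illustrative, and percentages are kept as integers so the average is computed exactly.

```python
# Required cycle length per instruction class, in picoseconds
latency_ps = {"lw": 600, "sw": 550, "R-format": 400, "beq": 350, "j": 200}
# Instruction mix in percent
mix_percent = {"lw": 25, "sw": 10, "R-format": 45, "beq": 15, "j": 5}

single_clock_ps = max(latency_ps.values())        # fixed clock: longest class
variable_clock_ps = sum(latency_ps[k] * mix_percent[k] for k in mix_percent) / 100
ratio = single_clock_ps / variable_clock_ps       # IC and CPI cancel out

print(single_clock_ps, variable_clock_ps, round(ratio, 2))
```

Instruction count and CPI are the same for both designs, so the ratio depends only on the two clock cycle times.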


    Where we are headed

    • Single Cycle Problems:

      • what if we had a more complicated instruction like floating point?

      • wasteful of area

      • implementing a variable-speed clock for each instruction class is extremely difficult. Overhead larger than any advantage gained.

    • One Solution:

      • use a “smaller” cycle time

      • have different instructions take different numbers of cycles

      • a “multicycle” datapath


    Where we are headed

    • Differences between the single-cycle datapath and the multi-cycle datapath:

      • A single memory unit is used for both instructions and data

      • There is a single ALU, rather than an ALU and two adders

      • One or more registers are added after every major functional unit to hold the output of that unit until the value is used in a subsequent clock cycle.

