ECE243

CPU


IMPLEMENTING A SIMPLE CPU

  • How are machine instructions implemented?

  • What components are there?

  • How are they connected and controlled?


MINI ISA:

  • every instruction is 1-byte wide

    • data and address values are also 1-byte wide

  • address space

    • byte addressable (every byte has an address)

    • 8 addr bits => 256 byte locations

  • 4 registers:

    • k0..k3

  • PC (resets to $80)

  • Condition codes:

    • Z (zero), N (negative)

    • these are used by branches


Some Definitions:

  • IMM3: a 3-bit signed immediate, 2 parts:

    • 1 sign bit: sign(IMM3)

    • 2 bit value: value(IMM3)

  • IMM4: a 4-bit signed immediate

  • IMM5: a 5-bit unsigned immediate

  • R1, R2: register variables

    • represent one of k0..k3

  • SE8(X):

    • means sign-extend value X to 8 bits

  • NOTE: ALL INSTS DO THIS LAST:

    • PC = PC + 1


Mini ISA Instructions

load R1 (R2):

R1 = mem[R2]

  PC = PC + 1

store R1 (R2):

mem[R2] = R1

  PC = PC + 1

add R1 R2

R1 = R1+ R2

IF (R1 == 0) Z = 1 ELSE Z = 0

IF (R1< 0) N = 1 ELSE N = 0

  PC = PC + 1

sub R1 R2

R1= R1 - R2

IF (R1 == 0) Z = 1 ELSE Z = 0

IF (R1< 0) N = 1 ELSE N = 0

PC = PC + 1


Mini ISA Instructions

nand R1 R2

R1= R1 bitwise-NAND R2

IF (R1 == 0) Z = 1 ELSE Z = 0

IF (R1< 0) N = 1 ELSE N = 0

   PC = PC + 1

ori IMM5

K1 = K1 bitwise-OR IMM5

IF (K1 == 0) Z = 1 ELSE Z = 0

IF (K1 < 0) N = 1 ELSE N = 0

  PC = PC + 1

shift R1 IMM3

IF (sign(IMM3)) R1 =R1 << value(IMM3)

ELSE R1 = R1 >> value(IMM3)

IF (R1 == 0) Z = 1 ELSE Z = 0

IF (R1< 0) N = 1 ELSE N = 0

  PC = PC + 1


Mini ISA Instructions

bz IMM4

IF (Z == 1) PC = PC + SE8(IMM4)

    PC = PC + 1

bnz IMM4

IF (Z == 0) PC = PC + SE8(IMM4)

    PC = PC + 1

bpz IMM4

IF (N == 0) PC= PC + SE8(IMM4)

   PC = PC + 1
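
The instruction semantics above can be sketched as a small interpreter. This is a minimal illustration, not the course's reference model: it works on already-decoded instructions (no binary encoding), the class name MiniCPU and the step() interface are made up for this example, and the right shift is assumed to be logical because the slides do not say.

```python
from dataclasses import dataclass, field

MASK = 0xFF  # all data and address values are 8 bits wide


def se8(value: int, bits: int) -> int:
    """Sign-extend a `bits`-wide two's-complement value to 8 bits."""
    sign = 1 << (bits - 1)
    return ((value ^ sign) - sign) & MASK


@dataclass
class MiniCPU:
    k: list = field(default_factory=lambda: [0, 0, 0, 0])  # k0..k3
    mem: list = field(default_factory=lambda: [0] * 256)   # data memory
    pc: int = 0x80                                         # PC resets to $80
    z: int = 0
    n: int = 0

    def _set_flags(self, result: int) -> None:
        self.z = 1 if result == 0 else 0
        self.n = (result >> 7) & 1  # bit 7 is the sign bit

    def step(self, op: str, a: int = 0, b: int = 0) -> None:
        """Execute one decoded instruction.

        a and b are register indices or raw immediate bit patterns,
        depending on the instruction (hypothetical calling convention).
        """
        if op == "load":
            self.k[a] = self.mem[self.k[b]]
        elif op == "store":
            self.mem[self.k[b]] = self.k[a]
        elif op in ("add", "sub", "nand"):
            x, y = self.k[a], self.k[b]
            r = {"add": x + y, "sub": x - y, "nand": ~(x & y)}[op] & MASK
            self.k[a] = r
            self._set_flags(r)
        elif op == "ori":
            self.k[1] = (self.k[1] | (a & 0x1F)) & MASK  # always targets k1
            self._set_flags(self.k[1])
        elif op == "shift":
            amount = b & 0b011      # value(IMM3)
            if b & 0b100:           # sign(IMM3) set: shift left
                r = (self.k[a] << amount) & MASK
            else:                   # assumed logical right shift
                r = self.k[a] >> amount
            self.k[a] = r
            self._set_flags(r)
        elif op in ("bz", "bnz", "bpz"):
            taken = {"bz": self.z == 1, "bnz": self.z == 0, "bpz": self.n == 0}[op]
            if taken:
                self.pc = (self.pc + se8(a, 4)) & MASK
        else:
            raise ValueError(f"unknown instruction: {op}")
        self.pc = (self.pc + 1) & MASK  # every instruction does this last
```

For example, after MiniCPU().step("ori", 5) register k1 holds 5 and the PC has advanced from $80 to $81.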


ENCODINGS: Inst(opcode)

  • Load(0000), store(0010), add(0100), sub(0110), nand(1000):

[Figure: bit-field layout over bits 7..0]

  • Ori:

[Figure: bit-field layout over bits 7..0]


ENCODINGS: Inst(opcode)

  • Shift:

[Figure: bit-field layout over bits 7..0]

  • BZ(0101), BNZ(1001), BPZ(1101):

[Figure: bit-field layout over bits 7..0]


DESIGNING A CPU

  • Two main components:

    • datapath and control

  • datapath:

    • registers, functional units, muxes, wires

    • must be able to perform all steps of every inst

  • control:

    • a finite state machine (FSM)

    • commands the datapath

    • performs: fetch, decode, read, execute, write, get next inst


ECE243

CPU: basic components


REGISTERS

[Figure: register REG with 8-bit input in, 8-bit output out, and REGWrite and clock inputs]

  • REGISTERS

    • can always read

    • we assume falling-edge-triggered

    • in is stored if REGWrite=1 on falling clock edge

    • we won’t normally draw the clock input


MUXES

[Figure: two-input mux with 8-bit inputs 0 and 1, a select input, and an 8-bit output out]

  • ‘select’ signal chooses which input to route to output


REGISTER FILE

[Figure: Reg FILE (k0,k1,k2,k3) with 2-bit index inputs R1, R2, and Rwrite, 8-bit data input in, REGWrite and clock inputs, and 8-bit outputs Out1 and Out2]

  • Out1 is the value of reg indexed by R1

  • Out2 is the value of reg indexed by R2

  • if REGWrite is 1 when clock goes low

    • then the value on ‘in’ is written to reg indexed by Rwrite
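
The register file behaviour described above can be modelled in a few lines. This is a sketch under assumptions: the class and method names are invented for illustration, and the falling clock edge is represented by an explicit method call.

```python
class RegisterFile:
    """Four 8-bit registers k0..k3 with two read ports and one write port."""

    def __init__(self):
        self.regs = [0, 0, 0, 0]

    def read(self, r1: int, r2: int) -> tuple:
        """Reads are always available: returns (Out1, Out2)."""
        return self.regs[r1 & 0b11], self.regs[r2 & 0b11]

    def falling_edge(self, reg_write: int, rwrite: int, data_in: int) -> None:
        """Call once per falling clock edge; writes only if REGWrite is 1."""
        if reg_write == 1:
            self.regs[rwrite & 0b11] = data_in & 0xFF
```

Keeping the write in its own method mirrors the hardware: Out1 and Out2 change only after a clock edge on which REGWrite was asserted.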


ALU (arithmetic logic unit)

[Figure: ALU with 8-bit inputs In0 and In1, 3-bit ALUop input, 8-bit output out, and flag outputs Z and N]

  • ALUop:

    • add = 000

    • sub = 001

    • or = 010

    • nand = 011

    • shift = 100

  • Z = nor(out7,out6,out5…out0)

  • N = out bit 7 (implies negative---sign bit)
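
A behavioural sketch of this ALU follows. The ALUop codes and the Z/N definitions come from the slide; the function name, the choice to pass the raw IMM3 pattern on In1 for shifts, and the logical right shift are assumptions made for illustration.

```python
ALU_ADD, ALU_SUB, ALU_OR, ALU_NAND, ALU_SHIFT = 0b000, 0b001, 0b010, 0b011, 0b100


def alu(aluop: int, in0: int, in1: int) -> tuple:
    """Return (out, Z, N) for 8-bit inputs in0 and in1."""
    if aluop == ALU_ADD:
        out = in0 + in1
    elif aluop == ALU_SUB:
        out = in0 - in1
    elif aluop == ALU_OR:
        out = in0 | in1
    elif aluop == ALU_NAND:
        out = ~(in0 & in1)
    elif aluop == ALU_SHIFT:
        amount = in1 & 0b011                 # value(IMM3)
        # sign(IMM3) set means shift left, otherwise shift right
        out = in0 << amount if in1 & 0b100 else in0 >> amount
    else:
        raise ValueError(f"unsupported ALUop: {aluop:03b}")
    out &= 0xFF                              # keep 8 bits
    z = 1 if out == 0 else 0                 # NOR of all output bits
    n = (out >> 7) & 1                       # output bit 7
    return out, z, n
```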


MEMORY

  • our CPU has two memories for simplicity:

    • instruction memory and data memory

    • known as a “Harvard architecture”


INSTRUCTION MEM

[Figure: INST MEM with 8-bit input addr and 8-bit output Iout]

  • is read only

  • Iout is set to the value indexed by the address


DATA MEMORY

[Figure: DATA MEM with 8-bit addr, 8-bit Din, 8-bit Dout, and MEMRead, MEMWrite, and clock inputs]

  • can read or write

    • but only one in a given clock cycle

  • on falling clock edge:

    • if MEMWrite==1: value on Din is stored at addr

    • if MEMRead==1: value at addr is output on Dout


SE8(x): SIGN-EXTEND TO 8 BITS

[Figure: wiring from input bits I3..I0 to output bits O7..O0]

  • assuming 4-bit input

  • Recall: want:

    • SE8(0100) -> 00000100

    • SE8(1100) -> 11111100

  • In bits i3,i2,i1,i0; out bits o7…o0
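
The wiring can be expressed directly in code: the low four output bits copy the input, and the top four output bits all copy i3. The function name is chosen for illustration only.

```python
def se8_from_4(x: int) -> int:
    """Sign-extend a 4-bit value: o3..o0 = i3..i0, o7..o4 = copies of i3."""
    x &= 0b1111
    i3 = (x >> 3) & 1
    upper = 0b11110000 if i3 else 0b00000000
    return upper | x
```

This reproduces the two cases on the slide: 0100 becomes 00000100 and 1100 becomes 11111100.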


ZE8(x): ZERO EXTEND TO 8 bits

[Figure: wiring from input bits I4..I0 and a constant 0 to output bits O7..O0]

  • assuming 5-bit input

  • Recall: want

    • ZE8(00100) -> 00000100

    • ZE8(11100) -> 00011100

  • In bits i4,i3,i2,i1,i0; out bits o7…o0
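
In code, zero extension of a 5-bit input amounts to keeping the low five bits and leaving o7..o5 at 0. The function name is an illustrative choice.

```python
def ze8_from_5(x: int) -> int:
    """Zero-extend a 5-bit value: o4..o0 = i4..i0, o7..o5 = 0."""
    return x & 0b11111
```

Unlike sign extension, the top input bit is not replicated, so 11100 stays positive as 00011100.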


ECE243

CPU: Single Cycle Implementation


SINGLE CYCLE DATAPATH

[Figure: clock timeline showing Inst1 followed by Inst2, each taking 1 cycle]

  • each instruction executes entirely

    • in one cycle of the cpu clock

  • registers are triggered by the falling edge

    • new values begin propagating through datapath

    • some values may be temporarily incorrect

  • the clock period is large enough to ensure:

    • that all values correct before next falling edge


FETCH

  • needed by every instruction

    • i.e., every instruction must be fetched

[Figure: PC (with PCwrite) drives the 8-bit addr input of INST MEM, which outputs the 8-bit inst]


PC = PC + 1

[Figure: fetch datapath for PC = PC + 1, with PC, PCwrite, INST MEM, addr, and inst]


BRANCHES: BZ IMM4

  • (if branch is taken does: PC = PC + SE8(IMM4) + 1)

[Figure: instruction format with IMM4 and opcode fields over bits 7..0; datapath with PC, PCwrite, INST MEM, addr, inst, and a +1 adder]


ADD add R1 R2

  • does r1 = r1 + r2

  • same datapath for sub and nand

[Figure: instruction bits i7..i0 with fields R1, R2, and opcode 0 1 0 0; datapath labels include PC, PCwrite, PCsel, INST MEM, SE8, IMM4, and two adders]


SHIFT: SHIFT R1 IMM3

[Figure: instruction bits i7..i0 with fields IMM3, R1, and opcode 0 1 1; datapath labels include PC, PCwrite, PCsel, INST MEM, SE8, IMM4, REG FILE (R1, R2, Rw, in, Out1, Out2, REGwrite), and ALU (ALUop, Z, N)]


ORI: ORI IMM5

  • does: k1 <- k1 bitwise-or IMM5

[Figure: instruction bits i7..i0 with fields IMM5 and opcode 1 1 1; datapath as for shift, with ZE8 on the immediates and an ALU2 select added]


Store: Store R1 (R2)

  • does: mem[r2] = r1

[Figure: instruction bits i7..i0 with fields R1, R2, and opcode; datapath as for ori, with an R1sel select added]


Load: Load R1 (R2)

  • does: r1 = mem[r2]

[Figure: instruction bits i7..i0 with fields R1, R2, and opcode; datapath as for store, with Data MEM (addr, Din, MEMread, MEMwrite) added]


Final Datapath!

[Figure: complete single-cycle datapath with PC, INST MEM, REG FILE, ALU, Data MEM, SE8 and ZE8 units, and the control signals PCwrite, PCsel, REGwrite, R1sel, RFin, ALUop, ALU2, MEMread, and MEMwrite]


DESIGNING THE CONTROL UNIT

[Figure: CTRL block with inputs opcode, Z, and N, and control-signal outputs such as PCsel]

  • CONTROL SIGNALS TO GENERATE:

    • PCsel, PCwrite, REGwrite, MEMread, MEMwrite, R1sel, ALUop, ALU2, RFin


Control Signals: load R1 (R2)

[Figure: final single-cycle datapath with its control signals, for load R1 (R2)]


Control Signals: store R1 (R2)

[Figure: final single-cycle datapath with its control signals, for store R1 (R2)]


Control Signals: add R1 R2

[Figure: final single-cycle datapath with its control signals, for add R1 R2]


Control Signals: sub R1 R2

[Figure: final single-cycle datapath with its control signals, for sub R1 R2]


Control Signals: nand R1 R2

[Figure: final single-cycle datapath with its control signals, for nand R1 R2]


Control Signals: ori IMM5

[Figure: final single-cycle datapath with its control signals, for ori IMM5]


Control Signals: shift R1 IMM3

[Figure: final single-cycle datapath with its control signals, for shift R1 IMM3]


Control Signals: bz IMM4

[Figure: final single-cycle datapath with its control signals, for bz IMM4]


Control Signals: bnz IMM4

[Figure: final single-cycle datapath with its control signals, for bnz IMM4]


Control Signals: bpz IMM4

[Figure: final single-cycle datapath with its control signals, for bpz IMM4]


All Control Signals


Building Control Logic: MemRead


Building Control Logic: PCSel


ECE243

CPU: Multicycle Implementation


A Multicycle Datapath


Key Difference #1: Only 1 Memory


Key Difference #2: Only 1 ALU


Key Difference #3: Temp Regs

what benefit are tmp regs / multicycle?

critical path is long => large clock period

smaller critical paths => shorter clock period

let’s examine these one at a time


IR: Instruction Register

holds inst encoding


MDR: Memory Data Register

holds the value returned from Memory


R1 and R2

hold values from the register file


ALUout

holds the result calculated by the ALU


Cycle by Cycle Operation


All Insts Cycle1: Fetch and Increment PC

IR ← mem[PC]; PC ← PC + 1;

increment PC

fetch next inst into the IR


All Insts Cycle2: Decoding Inst & Reading Reg File

R1 ← Kx; R2 ← Ky

Note: not all insts need R1 and R2


Add, Sub, Nand Cycle3: Calculate

ALUout ← R1 op R2


Add, Sub, Nand Cycle4: Write to Reg File

Kx ← ALUout


Shift Cycle3: Calculate

ALUout ← R1 op IMM3


Shift Cycle4: Write to Reg File

Kx ← ALUout


ORI Cycle3: Read K1 from Reg File

R1 ← k1


ORI Cycle4: Calculate

ALUout ← R1 op IMM5


ORI Cycle5: Write to Reg File

k1 ← ALUout


Load Cycle3: addr to Mem, value into MDR

MDR ← mem[R2]


Load Cycle4: write value into reg file

Kx ← MDR


Store Cycle3: addr to Mem, value to Mem

mem[R2] ← R1


Branches Cycle3

PC ← PC + SE8(IMM4) (only if the branch is taken)


Summary

Example: total time to execute one of each instruction:

Single cycle: 1*4 + 1*4 + 1*1 = 9 cycles; 9 cycles / 1 MHz = 9 µs

Multicycle: 3*4 + 4*4 + 1*5 = 33 cycles; 33 cycles / 4 MHz = 8.25 µs
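
The arithmetic in this comparison can be checked with a short script. It uses the slide's own tally (four 3-cycle, four 4-cycle, and one 5-cycle instruction) and the slide's clock rates; the function name and the dictionary layout are illustrative choices.

```python
def total_time_us(cycle_counts: dict, clock_mhz: float) -> tuple:
    """cycle_counts maps cycles-per-instruction to number of instructions.

    Returns (total cycles, total time in microseconds).
    """
    cycles = sum(cpi * count for cpi, count in cycle_counts.items())
    return cycles, cycles / clock_mhz  # 1 MHz = 1 cycle per microsecond


single_cycle = total_time_us({1: 9}, clock_mhz=1.0)
multicycle = total_time_us({3: 4, 4: 4, 5: 1}, clock_mhz=4.0)
print(single_cycle, multicycle)
```

The multicycle design needs more cycles in total, but each cycle is shorter, so with these numbers it still finishes slightly sooner.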


Implementing Multicycle Control


Control: An FSM

  • need a state transition diagram

  • how many states are there?

  • how many bits to represent state?
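
One way to approach these questions is to count one state per distinct cycle in the cycle-by-cycle slides. The list below is an assumption about how states are shared (add/sub/nand share their states, and the three branches share one); a different sharing choice would change the count.

```python
# Hypothetical state list: one entry per distinct cycle described in the slides.
STATES = [
    "fetch",        # cycle 1, all instructions
    "decode",       # cycle 2, all instructions
    "alu_calc", "alu_write",              # add, sub, nand: cycles 3-4
    "shift_calc", "shift_write",          # shift: cycles 3-4
    "ori_read", "ori_calc", "ori_write",  # ori: cycles 3-5
    "load_mem", "load_write",             # load: cycles 3-4
    "store_mem",                          # store: cycle 3
    "branch",                             # bz, bnz, bpz: cycle 3
]


def state_bits(num_states: int) -> int:
    """Minimum number of bits needed to give each state a unique code."""
    return max(1, (num_states - 1).bit_length())


print(len(STATES), state_bits(len(STATES)))
```

Under this assumption there are 13 states, which need 4 bits.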


Multicycle Control as an FSM


Multicycle Control Hardware

[Figure: Ctrl logic block with inputs Z, N, IR (IR:3..0), and Current_state from the 4-bit State Register; it produces Next_state and control outputs such as Pcwrite, Pcsel, and ALUop]


ECE243

CPU: Adding a New Instruction


EXAMPLE QUESTION:ADDING A NEW INSTRUCTION

  • Implement a post-increment load:

  • load r1, (r2)+

    Does: RF[r1] = MEM[RF[r2]]

    RF[r2] = RF[r2] + 1

    r2 is permanently changed to be r2+1


Implementing: RF[r1] = MEM[RF[r2]]; RF[r2] = RF[r2] + 1

Recall: load r1, (r2)

IR= mem[PC] , PC = PC + 1

R1 = RF[r1], R2 = RF[r2]

MDR = mem[R2]

RF[r1] = MDR


Modifying the Datapath

RF[r2] = RF[r2] + 1
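
One possible step sequence for the post-increment load is sketched below as plain register transfers. This is an illustrative answer, not necessarily the intended one: it assumes the ALU can add 1 to R2 in the same cycle as the memory read, that ALUout can be written back to RF[r2] in an extra fifth cycle, and that r1 and r2 name different registers.

```python
def post_increment_load(rf: list, mem: list, r1: int, r2: int) -> None:
    """Steps after fetch for: load r1, (r2)+ (hypothetical schedule)."""
    # Cycle 2: read the register file into the temp register
    R2 = rf[r2]
    # Cycle 3: memory read, with the ALU computing R2 + 1 in parallel
    MDR = mem[R2]
    ALUout = (R2 + 1) & 0xFF
    # Cycle 4: RF[r1] = MDR
    rf[r1] = MDR
    # Cycle 5: RF[r2] = ALUout
    rf[r2] = ALUout
```

With this schedule the new instruction takes five cycles, one more than the ordinary load.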


ECE243

CPU: Pipelining


A Fast-Food Sandwich Shop

[Figure: a cook works through five steps: take order, select bun, add ingredients, wrap and bag, cash and change]


With One Cook

[Figure: one cook performs all five steps (take order, select bun, add ingredients, wrap and bag, cash and change) for customer1]

  • one customer is serviced at a time


Like the single-cycle CPU

[Figure: single-cycle datapath executing add k1, k2]

  • one instruction flows through at a time


With Two Cooks?

[Figure: two cooks share the five steps: take order, select bun, add ingredients, wrap and bag, cash and change]


Pipelining

  • Like an assembly line

  • Doesn’t change the interface or result

    • improves performance


Pipelining a CPU (rough idea)

[Figure: the single-cycle datapath, marked up to show the rough idea of pipelining]


Pipelining Details:

[Figure: the single-cycle datapath]


With Three Cooks?

[Figure: three cooks share the five steps: take order, select bun, add ingredients, wrap and bag, cash and change]


Pipelining a CPU (rough idea)

[Figure: the single-cycle datapath, marked up to show the rough idea of pipelining]


Visualizing Pipelining

[Figure: pipeline stages Fetch (inst mem), Decode (reg file), Execute (ALU and data mem)]


Visualizing Pipelining (again)

[Figure: pipeline stages Fetch (inst mem), Decode (reg file), Execute (ALU and data mem)]


Fast Food Hazards

[Figure: three cooks on the five steps, serving customer3, customer2, and customer1]

What if: c1 and c2 are friends, c2 has no money, and c2 needs to know how much change c1 will get before ordering (to ensure c2 can afford his order)?


Fast Food Hazards

[Figure: three cooks on the five steps, serving customer2 and customer1]


CPU Hazards

[Figure: pipeline stages Fetch (inst mem), Decode (reg file), Execute (ALU and data mem)]

  • called a data hazard

  • must be observed to ensure correct execution

  • there are two solutions to data hazards


Solution1: Stalling

[Figure: pipeline stages Fetch (inst mem), Decode (reg file), Execute (ALU and data mem)]


How to insert bubbles

  • option1: hardware stalls the pipeline

    • need extra logic to do so

    • happens ‘automatically’ for any code

  • option2: compiler inserts “no-ops”

    • a no-op is an instruction that does nothing

    • ex: add r0,r0,r0 (NIOS)

    • compiler must do it right, or the results will be wrong!

  • example: inserting a bubble with a no-op:

    add k1, k2

    noop

    add k3, k1
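
The compiler-side option can be sketched as a pass over the instruction list. This is a simplified illustration with several assumptions: instructions are (op, dst, src) tuples that read both registers and write dst (as add, sub, and nand do), and a single no-op between dependent neighbours is enough, as in the example above.

```python
NOOP = ("noop",)


def insert_noops(program: list) -> list:
    """Insert a no-op wherever an instruction reads the register
    written by the instruction immediately before it."""
    out = []
    prev_written = None
    for inst in program:
        if inst == NOOP:
            out.append(inst)
            prev_written = None  # a no-op writes nothing
            continue
        op, dst, src = inst
        if prev_written is not None and prev_written in (dst, src):
            out.append(NOOP)     # bubble so the earlier write completes first
        out.append(inst)
        prev_written = dst
    return out


print(insert_noops([("add", "k1", "k2"), ("add", "k3", "k1")]))
```

For the two-instruction example this yields add k1, k2; noop; add k3, k1, while a pair with no shared register is left unchanged.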


Solution2: Forwarding Lines

[Figure: pipeline stages Fetch (inst mem), Decode (reg file), Execute (ALU and data mem)]

  • add “forwarding” logic

    • to pass values directly between stages


Control Hazards

  • cpu predicts each branch is not taken

  • Better: predict taken

    • why?---loops are common, usually taken

  • More advanced: remember what each branch did last time

  • “branch predictor”:

    • a table that remembers what each branch did the last time

    • uses this to make a prediction next time
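
A last-outcome predictor of this kind can be sketched as a table keyed by branch address. The class name, the dictionary-based table, and the default prediction for branches not yet seen are assumptions for illustration; real predictors use fixed-size hardware tables.

```python
class LastOutcomePredictor:
    """Predicts that each branch does what it did the last time."""

    def __init__(self, default_taken: bool = False):
        self.table = {}                  # branch address -> last outcome
        self.default_taken = default_taken

    def predict(self, pc: int) -> bool:
        return self.table.get(pc, self.default_taken)

    def update(self, pc: int, taken: bool) -> None:
        self.table[pc] = taken


def count_mispredictions(predictor, pc: int, outcomes: list) -> int:
    """Run one branch through a sequence of actual outcomes."""
    misses = 0
    for taken in outcomes:
        if predictor.predict(pc) != taken:
            misses += 1
        predictor.update(pc, taken)
    return misses


# A loop branch taken three times, then falling through on exit.
print(count_mispredictions(LastOutcomePredictor(), 0x90, [True, True, True, False]))
```

In this run the predictor misses on the first iteration (nothing remembered yet) and again on loop exit.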


Some Real CPU Pipelines

21264 Pipeline (Alpha)

Microprocessor Report 10/28/96

Pentium IV’s Pipeline:

TC nxt IP, TC fetch, Drv, Alloc, Rename, Que, Sch, Sch, Sch, Disp, Disp, RF, RF, Ex, Flgs, BrCk, Drv


ECE243

CPU: Alternate Architectures


ANOTHER MULTICYCLE CPU

[Figure: bus-based datapath. An internal bus connects PC, IR (Imm3,4,5), MAR, MDR, Y, the ALU (ALUop, Select, constant 1, Z), and Regs k0..k3; MEM has addr, Din, Dout, MEMRead, and MEMWrite; CONTROL sends control signals to all components]


SOME CONTROL SIGNALS

  • PCout:

    • write PC value to bus

  • PCin:

    • read bus value into PC

  • MDRinBus:

    • read value from bus into MDR

  • MDRinMem:

    • write value from Dout of MEM into MDR

  • MDRoutBus:

    • write value from MDR onto bus


Ex: Ctrl: Add k1, k2 # k1 = k1 + k2

[Figure: the bus-based datapath (internal bus, CONTROL, PC, IR, MAR, MDR, Y, ALU, Regs k0..k3, MEM)]


CHARACTERIZATION OF ISAs

  • attribute #1:

    • number of explicit operands

  • Attribute #2:

    • are registers general purpose?

  • Attribute #3:

    • Can an operand be a memory location?

  • Attribute #4:

    • RISC vs CISC

  • Attribute #5:

    • Relation between instructions and data


att1: num of explicit operands

  • focus on calculation instructions (add, sub, …)

  • running example: A = B + C (C-code)

    • assume A, B, C are memory locations

  • 0 operands:

    • e.g., stack based (like first calculator CPUs)

    • push and pop operations, refer to top of stack

  • 1 operand:

    • e.g., accumulator based

    • accumulator is a reg inside cpu

    • instructions use accum as destination

  • 2 operands:

    • e.g., 68k, ia32

  • 3 operands:

    • e.g., MIPS, SPARC, PowerPC

  • How many operands is NIOS?
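
For the 0-operand (stack-based) case, the running example A = B + C can be traced with a toy stack machine. The instruction names push, pop, and add and the dictionary standing in for memory are assumptions for illustration, not a real ISA.

```python
def run_stack_program(program: list, memory: dict) -> dict:
    """Execute a tiny stack program; add takes no explicit operands."""
    stack = []
    for op, *args in program:
        if op == "push":        # push the value at a memory location
            stack.append(memory[args[0]])
        elif op == "pop":       # pop the top of stack into a memory location
            memory[args[0]] = stack.pop()
        elif op == "add":       # operands are implicit: the top two entries
            rhs = stack.pop()
            lhs = stack.pop()
            stack.append(lhs + rhs)
        else:
            raise ValueError(f"unknown operation: {op}")
    return memory


memory = {"A": 0, "B": 2, "C": 3}
run_stack_program([("push", "B"), ("push", "C"), ("add",), ("pop", "A")], memory)
print(memory["A"])
```

The add itself names no operands; only push and pop refer to memory locations.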


Att2: are regs general purpose?

  • if yes:

    • you can use any register for any purpose

    • special registers are by convention only

  • if no:

    • some registers have hardwired purposes

    • ex: in 68k, A7 is hardwired to be stack pointer

    • used implicitly for jsr, rts, link instructions

  • Are NIOS registers general purpose?


Att3: operand = mem location?

  • with respect to calculation insts (add, sub)

  • if yes:

    • one operand can be in memory, the other in a register

    • maybe: can also write result to memory

  • if no:

    • called a load/store architecture

    • only load/store insts can get/put memory values to/from regs

  • Can a NIOS operand be a mem location?


Att4: RISC vs CISC

  • Are there instructions with many steps?

    • a vague and debatable question

  • CISC: complex instruction set computer

    • Many, complex instructions

    • can be hard to pipeline!

    • ex: 68k, x86, PowerPC?

  • RISC: reduced instruction set computer

    • Fewer, simple instructions

    • easy to pipeline

    • ex: MIPS, Alpha, PowerPC?

  • Which is NIOS?

  • Quandary: x86 is a CISC

    • but pentiumIV has a 20-stage pipeline!

    • How’d they do it?


Att5: Relation bet. insts & data

  • SISD: single instruction, single data

    • everything we have seen so far

    • an inst only writes one reg/memory location

  • SIMD: single instruction, multiple data

    • one instruction tells CPU to operate on an array of regs or memory locations

    • ex: multimedia extensions: MMX, SSE (Intel); 3DNow! (AMD); AltiVec (PowerPC)

    • ex: IBM/Sony/toshiba Cell processor (vector processor)

  • MIMD: multiple instruction, multiple data

    • ex: Cluster of workstations, SMP servers, multicores, hyperthreading

  • Which is NIOS?

