
Computer Architecture: A Constructive Approach

Pipelined Implementations of SMIPS

Joel Emer

Computer Science & Artificial Intelligence Lab.

Massachusetts Institute of Technology

http://csg.csail.mit.edu/6.S078

Two-Cycle SMIPS: Analysis

[Figure: two-cycle SMIPS datapath. Labels: PC, +4, Inst Memory, fr, stage, Decode, Execute, Register File, Data Memory; the Fetch and Execute phases are marked.]

In any given clock cycle, a lot of the hardware is unused!

Pipelined execution of instructions to increase the throughput
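As a rough illustration of the throughput claim, the cycle counts can be worked out for an idealized case. This is a sketch in Python rather than BSV, and it assumes no stalls or hazards at all; the function names are hypothetical and do not come from the lecture.

```python
def cycles_two_cycle(n):
    """Two-cycle SMIPS: each instruction uses one fetch cycle and one execute cycle."""
    return 2 * n


def cycles_two_stage_pipeline(n):
    """Ideal two-stage pipeline: one cycle to fill, then one instruction completes per cycle."""
    return n + 1 if n > 0 else 0


# Compare the two designs for a few program lengths.
for n in (1, 10, 100):
    print(n, cycles_two_cycle(n), cycles_two_stage_pipeline(n))
```

For long programs the ideal pipeline approaches twice the throughput of the two-cycle design. The hazards discussed in the following slides are what keep a real pipeline from reaching that bound.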

Instruction pipelining

[Figure: pipeline with stages f0, f1, f2. Labels: x, inQ, outQ, pipeline registers sReg1 and sReg2, Valid/Invalid.]

  • Much more complicated than arithmetic pipelines, e.g., IFFT
  • The entities in an instruction pipeline are not independent of each other
    • This causes pipeline stalls or requires other fancy tricks to avoid stalls

Hazards in instruction pipelining
  • Structural hazard: Two instructions in the pipeline may require the same resource at the same time, e.g., contention for memory
  • Data hazard: An instruction in the pipeline may affect the state of the machine (rf, dMem or pc) – the next instruction must be fully cognizant of this change
  • Control hazard: An instruction in the pipeline may determine the next instruction to be executed, e.g., branches

Notice that none of these hazards are present in the IFFT pipeline.


The power of computers comes from the fact that the instructions in a program are not independent of each other, so we must deal with hazards

Two-Cycle SMIPS

[Figure: two-cycle SMIPS datapath. Labels: PC, +4, Inst Memory, ir, stage, Decode, Execute, Register File, Data Memory.]

Register ir holds a fetched instruction, and register stage remembers which stage (fetch/execute) we are in

ir: The instruction register
  • You may recall from our earlier discussion of pipelining (e.g., IFFT) that the intermediate (pipeline) registers may not contain any meaningful data
    • We can associate a (Valid/Invalid) bit with ir
    • Equivalently, we can think of a pipeline register as a one-element FIFO
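The equivalence between a Valid/Invalid bit and a one-element FIFO can be sketched as follows. This is a sequential Python model, not BSV. The class and method names are hypothetical, loosely mirroring the enq/first/deq/notEmpty methods used on ir in the later slides, and the model does not capture same-cycle enq and deq.

```python
class PipeReg:
    """One-element FIFO modelled as a data slot plus a Valid/Invalid bit."""

    def __init__(self):
        self.valid = False   # Invalid: the register holds no meaningful data
        self.data = None

    def not_empty(self):
        return self.valid

    def enq(self, x):
        # Enqueue is only legal when the register is Invalid (FIFO empty).
        if self.valid:
            raise RuntimeError("enq on a full pipeline register")
        self.data, self.valid = x, True

    def first(self):
        # Reading is only legal when the register is Valid (FIFO non-empty).
        if not self.valid:
            raise RuntimeError("first on an empty pipeline register")
        return self.data

    def deq(self):
        # Dequeue simply clears the valid bit; the stale data is ignored.
        if not self.valid:
            raise RuntimeError("deq on an empty pipeline register")
        self.valid = False
```

A typical use is `ir.enq(inst)` in the fetch step, followed by `ir.first()` and `ir.deq()` in the execute step once `ir.not_empty()` holds.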

Two-stage SMIPS (Harvard)

[Figure: two-stage SMIPS datapath (Harvard). Labels: PC, +4, Inst Memory, ir, Decode, Execute, Register File, Data Memory.]

Let us assume we keep fetching instructions from pc, pc+4, pc+8, … and correct the pc when a control hazard is detected

Two-stage pipeline SMIPS (Harvard) – first attempt

module mkProc(Proc);
  Reg#(Addr) pc <- mkRegU;
  RFile rf <- mkRFile;
  Memory mem <- mkTwoPortedMemory;
  let iMem = mem.iport; let dMem = mem.dport;
  PipeReg#(TypeFetch2Execute) ir <- mkPipeReg;

  rule doProc;
    let inst <- iMem(MemReq{op: Ld, addr: pc, data: ?});
    ir.enq(TypeFetch2Execute{pc: pc, inst: inst});
    let next_pc = pc + 4;

Two-stage pipeline SMIPS (Harvard) – first attempt

    if(ir.notEmpty) begin
      let irpc = ir.first.pc;
      let inst = ir.first.inst;
      let dInst = decode(inst);
      Data rVal1 = rf.rd1(dInst.rSrc1);
      Data rVal2 = rf.rd2(dInst.rSrc2);
      let eInst = exec(dInst, rVal1, rVal2, irpc);
      if(memType(eInst.iType))
        eInst.data <- dMem(MemReq{
          op: eInst.iType==Ld ? Ld : St,
          addr: eInst.addr, data: eInst.data});
      if(regWriteType(eInst.iType))
        rf.wr(eInst.rDst, eInst.data);
      if(eInst.brTaken) next_pc = eInst.addr;
      ir.deq;
    end
    pc <= next_pc;

oops!

Control Hazard

[Figure: two-stage SMIPS datapath with an Epoch register. Labels: PC, +4, Epoch, Inst Memory, ir, Decode, Execute, Register File, Data Memory.]

Not only must we set the pc, we must also throw away the fetched instruction in ir

A new Epoch register can keep track of whether the instruction in ir is from the current epoch
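The effect of the epoch tag can be seen in a small behavioural model. This is a Python sketch, not the lecture's BSV. It assumes a single epoch bit, a toy instruction set with only `nop` and `jump`, and a hypothetical `run` function. Each fetched instruction is tagged with the epoch at fetch time, and the execute stage drops any instruction whose tag no longer matches.

```python
def run(program, start_pc, cycles, use_epoch):
    """Simulate a two-stage pipeline and return the pcs of the instructions executed."""
    pc, epoch = start_pc, False
    ir = None            # pipeline register: None = Invalid, else (pc, epoch, inst)
    executed = []
    for _ in range(cycles):
        # Fetch stage: reads pc and epoch as they were at the start of the cycle.
        fetched = (pc, epoch, program.get(pc, ("nop",)))
        next_pc, next_epoch = pc + 4, epoch
        # Execute stage: handles the instruction fetched in the previous cycle.
        if ir is not None:
            ir_pc, ir_epoch, inst = ir
            if not use_epoch or ir_epoch == epoch:
                executed.append(ir_pc)
                if inst[0] == "jump":
                    next_pc = inst[1]        # redirect the fetch stage
                    next_epoch = not epoch   # start a new epoch
            # Otherwise the instruction is from an old epoch and is dropped.
        ir, pc, epoch = fetched, next_pc, next_epoch
    return executed


program = {0: ("nop",), 4: ("jump", 100), 8: ("nop",), 100: ("nop",), 104: ("nop",)}
print(run(program, 0, 6, use_epoch=False))  # [0, 4, 8, 100, 104]
print(run(program, 0, 6, use_epoch=True))   # [0, 4, 100, 104]
```

Without the epoch check, the instruction at pc 8 is executed even though the jump at pc 4 was taken. With the check, that wrong-path instruction is discarded at the cost of one bubble.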

Two-Stage SMIPS (unpipelined) - 1

module mkProc(Proc);
  RFile rf <- mkRFile;
  Memory mem <- mkMemory;
  FIFO#(FBundle) fr <- mkFIFO;
  FIFOF#(Addr) nextPc <- mkFIFOF;
  let exec <- mkSMIPSExecutionCombinational;

  rule doFetch;
    Addr pc = nextPc.first;
    nextPc.deq;
    let instResp <- mem.side(MemReq{op:Ld, addr:pc, data:?});
    // No epoch register exists in this unpipelined version, so the field is left unspecified.
    fr.enq(FBundle{predpc:pc, epoch:?, instResp:instResp});
  endrule

Two-Stage SMIPS (Unpipelined) - 2

  rule doDecodeExecuteWb;
    let pcPlus4 = fr.first.predpc + 4;
    let decInst = decode(fr.first.instResp, pcPlus4);
    Data src1 = rf.rd1(decInst.op1);
    Data src2 = rf.rd2(decInst.op2);
    Instr inst = unpack(fr.first.instResp);
    let execInst = exec.exec(decInst, src1, src2);
    if(execInst.itype==Ld || execInst.itype==St) begin
      execInst.data <- mem.side(MemReq{op:execInst.itype==Ld ? Ld : St,
                                       addr:execInst.addr, data:execInst.data});
    end
    // Taken branch: the next pc is the branch target; otherwise fall through.
    nextPc.enq(execInst.cond ? execInst.addr : pcPlus4);
    if((execInst.itype == Alu || execInst.itype == Ld) && execInst.rdst != 0)
      rf.wr(execInst.rdst, execInst.data);
    fr.deq;
  endrule

Two-Stage SMIPS (Pipelined) - 1

module mkProc(Proc);
  Reg#(Bool) fetchEpoch <- mkReg(False);
  Reg#(Bool) execEpoch <- mkReg(True);
  Reg#(Addr) fetchPc <- mkRegU;
  RFile rf <- mkRFile;
  Memory mem <- mkMemory;
  FIFO#(FBundle) fr <- mkFIFO;
  FIFOF#(Addr) nextPc <- mkFIFOF;
  let exec <- mkSMIPSExecutionCombinational;

Two-Stage SMIPS (Pipelined) - 2

  rule doFetch;
    Addr pc = ?;
    Bool epoch = fetchEpoch;
    if(nextPc.notEmpty) begin
      pc = nextPc.first;
      epoch = !fetchEpoch;
      nextPc.deq;
    end
    else
      pc = fetchPc + 4;
    fetchPc <= pc;
    fetchEpoch <= epoch;
    let instResp <- mem.side(MemReq{op:Ld, addr:pc, data:?});
    fr.enq(FBundle{predpc:pc, epoch:epoch, instResp:instResp});
  endrule

Two-Stage SMIPS (Pipelined) - 3

  rule doDecodeExecuteWb;
    if(fr.first.epoch==execEpoch) begin
      let pcPlus4 = fr.first.predpc + 4;
      let decInst = decode(fr.first.instResp, pcPlus4);
      Data src1 = rf.rd1(decInst.op1);
      Data src2 = rf.rd2(decInst.op2);
      Instr inst = unpack(fr.first.instResp);
      let execInst = exec.exec(decInst, src1, src2);
      if(execInst.itype==Ld || execInst.itype==St) begin
        execInst.data <- mem.side(MemReq{op:execInst.itype==Ld ? Ld : St,
                                         addr:execInst.addr, data:execInst.data});
      end
      if(execInst.cond) begin
        nextPc.enq(execInst.addr);
        execEpoch <= !execEpoch;
      end

Two-Stage SMIPS (Pipelined) - 4

      if((execInst.itype == Alu || execInst.itype == Ld) &&
         execInst.rdst != 0)
        // The two-stage design has no wbr FIFO, so write back directly from execInst.
        rf.wr(execInst.rdst, execInst.data);
      fr.deq;
    end
    else
      fr.deq;
  endrule

3-Stage SMIPS (Pipelined) - 1

  rule doDecodeExecuteWb;
    let wbStall = wbStallReg;
    if(wbRind.notEmpty) begin
      wbRind.deq;
      wbStall = False;
    end
    if(fr.first.epoch==execEpoch) begin
      let pcPlus4 = fr.first.predpc + 4;
      let decInst = decode(fr.first.instResp, pcPlus4, cp0_statsEn, cp0_fromhost, cp0_tohost);
      Bool stall = wbStall;
      if(!stall) begin
        Data src1 = rf.rd1(decInst.op1);
        Data src2 = rf.rd2(decInst.op2);
        //execute
        Instr inst = unpack(fr.first.instResp);
        $display("E: Procinst: %d", inst);
        traceFull("Proc", "exec", inst);
        let execInst = exec.exec(decInst, src1, src2);
        if(execInst.itype==Ld || execInst.itype==St) begin
          execInst.data <- mem.side(MemReq{op:execInst.itype==Ld ? Ld : St, addr:execInst.addr, data:execInst.data});
        end
        if(execInst.cond) begin
          nextPc.enq(execInst.addr);
          execEpoch <= !execEpoch;
        end
        if((execInst.itype == Alu || execInst.itype == Ld) && execInst.rdst != 0) begin
          wbr.enq(WBBundle{itype:execInst.itype, rdst:execInst.rdst, data:execInst.data});
          wbStall = True;
        end
        fr.deq;
        if(decInst.wr_stats) cp0_statsEn <= unpack(truncate(src2));
        if(decInst.wr_host) cp0_tohost <= src2;
        if(cp0_statsEn) num_inst <= num_inst + 1;
      end
    end
    else
      fr.deq;

3-Stage SMIPS (Pipelined) - 2

      if(!stall) begin
        Data src1 = rf.rd1(decInst.op1);
        Data src2 = rf.rd2(decInst.op2);
        Instr inst = unpack(fr.first.instResp);
        let execInst = exec.exec(decInst, src1, src2);
        if(execInst.itype==Ld || execInst.itype==St) begin
          execInst.data <- mem.side(MemReq{op:execInst.itype==Ld ? Ld : St,
                                           addr:execInst.addr, data:execInst.data});
        end
        if(execInst.cond) begin
          nextPc.enq(execInst.addr);
          execEpoch <= !execEpoch;
        end

3-Stage SMIPS (Pipelined) - 3

        if((execInst.itype == Alu || execInst.itype == Ld) && execInst.rdst != 0) begin
          wbr.enq(WBBundle{itype:execInst.itype, rdst:execInst.rdst, data:execInst.data});
          wbStall = True;
        end
        fr.deq;
      end
    end
    else
      fr.deq;
    wbStallReg <= wbStall;
  endrule

Two-stage Pipelined SMIPS Princeton Architecture

[Figure: two-stage pipelined SMIPS datapath (Princeton). Labels: PC, +4, Epoch, ir, Decode, Execute, Register File, and a single Memory.]

Just like the Harvard design except for an additional structural hazard when a memory-type instruction is in the execute phase
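The cost of this structural hazard can be estimated with a small model. This is a Python sketch under simplifying assumptions (straight-line code, no branches, a single memory port) and the function name is hypothetical. Whenever the execute stage holds a load or store, the fetch stage cannot use the port in that cycle and a bubble enters the pipeline.

```python
def cycles_princeton(kinds):
    """Cycles to run a straight-line program on a two-stage pipeline with one memory port.

    kinds is a list of "alu" or "mem" entries, one per instruction.
    """
    ir = None        # pipeline register: None = Invalid, else the instruction kind
    fetch_idx = 0    # index of the next instruction to fetch
    done = 0         # instructions that have completed execution
    cycles = 0
    while done < len(kinds):
        cycles += 1
        # Loads and stores in the execute stage occupy the single memory port.
        port_busy = ir == "mem"
        if ir is not None:
            done += 1
        # Fetch may use the port only if execute is not using it this cycle.
        if not port_busy and fetch_idx < len(kinds):
            ir = kinds[fetch_idx]
            fetch_idx += 1
        else:
            ir = None    # bubble
    return cycles


print(cycles_princeton(["alu", "alu", "alu"]))   # 4
print(cycles_princeton(["alu", "mem", "alu"]))   # 5
```

In this model, each memory instruction that is followed by another instruction costs one extra cycle relative to the Harvard design.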

Pipelined SMIPS (Princeton)

module mkProc(Proc);
  Reg#(Addr) pc <- mkRegU;
  Reg#(Bool) epoch <- mkReg(True);
  RFile rf <- mkRFile;
  Memory mem <- mkOnePortedMemory;
  let uMem = mem.port;
  // Element type matches the value enqueued below; it is assumed to carry pc, epoch and inst.
  PipeReg#(TypeFetch2Execute) ir <- mkPipeReg;
  Wire#(Bool) dAcc <- mkDWire(False);

  rule doProc;
    let next_pc = pc;
    if(!dAcc) begin
      let inst <- uMem(MemReq{op:Ld, addr:pc, data:?});
      ir.enq(TypeFetch2Execute{pc:pc, epoch:epoch, inst:inst});
      next_pc = pc + 4;
    end

Pipelined SMIPS (Princeton)

    if(ir.notEmpty) begin
      let irpc = ir.first.pc;
      let inst = ir.first.inst;
      if(ir.first.epoch==epoch) begin
        let dInst = decode(inst);
        Data rVal1 = rf.rd1(dInst.rSrc1);
        Data rVal2 = rf.rd2(dInst.rSrc2);
        let eInst = exec(dInst, rVal1, rVal2, irpc);
        if(memType(eInst.iType)) begin
          eInst.data <- uMem(MemReq{
            op: eInst.iType==Ld ? Ld : St,
            addr: eInst.addr, data: eInst.data});
          // dAcc is a Wire, so it is written with <= rather than =.
          dAcc <= True;
        end


Pipelined SMIPS (Princeton)

        if(regWriteType(eInst.iType))
          rf.wr(eInst.rDst, eInst.data);
        if(eInst.brTaken) begin
          next_pc = eInst.addr;
          // epoch is a register, so it is updated with <= rather than =.
          epoch <= !epoch;
        end
      end
      ir.deq;
    end
    pc <= next_pc;
  endrule
endmodule

next time -- Data Hazards
