lec 8 pipelining n.
Download
Skip this Video
Loading SlideShow in 5 Seconds..
Lec 8: Pipelining PowerPoint Presentation
Download Presentation
Lec 8: Pipelining

Loading in 2 Seconds...

play fullscreen
1 / 27

Lec 8: Pipelining - PowerPoint PPT Presentation


  • 114 Views
  • Uploaded on

Lec 8: Pipelining. Kavita Bala CS 3410, Fall 2008 Computer Science Cornell University. Basic Pipelining. Five stage “RISC” load-store architecture Instruction fetch (IF) get instruction from memory Instruction Decode (ID) translate opcode into control signals and read regs Execute (EX)

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Lec 8: Pipelining' - elton


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
lec 8 pipelining

Lec 8: Pipelining

Kavita Bala

CS 3410, Fall 2008

Computer Science

Cornell University

basic pipelining
Basic Pipelining

Five stage “RISC” load-store architecture

  • Instruction fetch (IF)
    • get instruction from memory
  • Instruction Decode (ID)
    • translate opcode into control signals and read regs
  • Execute (EX)
    • perform ALU operation
  • Memory (MEM)
    • Access memory if load/store
  • Writeback (WB)
    • update register file

Following slides thanks to Sally McKee

pipelined implementation
Pipelined Implementation
  • Break the execution of the instruction into cycles (five, in this case)
  • Design a separate stage for the execution performed during each cycle
  • Build pipeline registers (latches) to communicate between the stages

Slides thanks to Sally McKee

stage 1 fetch and decode
Stage1: Fetch and Decode
  • Design a datapath that can fetch an instruction from memory every cycle
    • Use PC to index memory to read instruction
    • Increment the PC (assume no branches for now)
  • Write everything needed to complete execution to the pipeline register (IF/ID)
    • The next stage will read this pipeline register
    • Note that pipeline register must be edge triggered

Slides thanks to Sally McKee

slide5

M

U

X

1

PC + 1

incr

Instruction

bits

IF / ID

Pipeline register

Rest of pipelined datapath

Instruction

Memory/

Cache

PC

en

en

Slides thanks to Sally McKee

stage 2 decode
Stage2: Decode
  • Reads the IF/ID pipeline register, decodes instruction, and reads register file (specified by regA and regB of instruction bits)
    • Decode can be easy, just pass on the opcode and let later stages figure out their own control signals for the instruction
  • Write everything needed to complete execution to the pipeline register (ID/EX)
    • Pass on the offset field and destination register specifiers (or simply pass on the whole instruction!)
    • Pass on PC+1 even though decode didn’t use it
slide7

PC + 1

PC + 1

Contents

Of regA

Destreg

Contents

Of regB

Data

Instruction

bits

Control

Signals

ID / EX

Pipeline register

IF / ID

Pipeline register

regA

Register File

regB

Rest of pipelined datapath

Stage 1: Inst Fetch datapath

en

Slides thanks to Sally McKee

stage 3 execute
Stage3: Execute
  • Design a datapath that performs the proper ALU operation for the instruction specified and values present in the ID/EX pipeline register
    • The inputs are the contents of regA and either the contents of regB or the offset field in the instruction
    • Also, calculate PC+1+offset, in case this is a branch
  • Write everything needed to complete execution to the pipeline register (EX/Mem)
    • ALU result, contents of regB and PC+1+offset
    • Instruction bits for opcode and destReg specifiers
slide9

Alu

Result

Contents

Of regA

Rest of pipelined datapath

Contents

Of regB

M

U

X

contents

of regB

A

L

U

Control

Signals

ID / EX

Pipeline register

EX/Mem

Pipeline register

PC + 1

+ offset

Magic

PC + 1

Stage 2: Decode datapath

Control

Signals

Slides thanks to Sally McKee

stage 4 memory operation
Stage4: Memory Operation
  • Design a datapath that performs the proper memory operation for the instruction specified and values present in the EX/Mem pipeline register
    • ALU result contains address for ld and st instructions
    • Opcode bits control memory R/W and enable signals
  • Write everything needed to complete execution to the pipeline register (Mem/WB)
    • ALU result and MemData
    • Instruction bits for opcode and destReg specifiers
slide11

This goes back to the MUX

before the PC in stage 1

MUX control

for PC input

Alu

Result

Alu

Result

Rest of pipelined datapath

Control

Signals

Mem/WB

Pipeline register

EX/Mem

Pipeline register

PC+1

+offset

Data Memory

Stage 3: Execute datapath

contents

of regB

Memory

Read Data

en R/W

Control

Signals

Slides thanks to Sally McKee

stage 5 write back
Stage5: Write Back
  • Design a datapath that completes the execution of this instruction, writing to the register file if required
    • Write MemData to destReg for ld instruction
    • Write ALU result to destReg for arithmetic/logic instructions
    • Opcode bits also control register write enable signal

Slides thanks to Sally McKee

slide13

This goes back to data

input of register file

M

U

X

bits 0-2

This goes back to the

destination register specifier

M

U

X

bits 15-17

register write enable

Alu

Result

Memory

Read Data

Stage 4: Memory datapath

Control

Signals

Mem/WB

Pipeline register

Slides thanks to Sally McKee

sample code simple
Sample Code (Simple)
  • Assume eight-register machine
  • Run the following code on a pipelined datapath

add 3 1 2 ; reg 3 = reg 1 + reg 2

nand 6 4 5 ; reg 6 = ~(reg 4 & reg 5)

lw 4 20 (2) ; reg 4 = Mem[reg2+20]

add 5 2 5 ; reg 5 = reg 2 + reg 5

sw 7 12(3) ; Mem[reg3+12] = reg 7

Slides thanks to Sally McKee

slide15

+

A

L

U

M

U

X

1

target

PC+1

PC+1

0

R0

R1

regA

ALU

result

R2

Register file

regB

valA

M

U

X

PC

Inst

mem

Data

mem

instruction

R3

ALU

result

mdata

R4

valB

R5

R6

M

U

X

data

R7

offset

dest

valB

Bits 0-2

dest

dest

dest

Bits 15-17

M

U

X

Bits 21-23

op

op

op

IF/

ID

ID/

EX

EX/

Mem

Mem/

WB

Slides thanks to Sally McKee

slide16

+

A

L

U

M

U

X

1

0

0

0

0

R0

0

36

R1

0

9

R2

Register file

0

M

U

X

PC

Inst

mem

Data

mem

nop

12

R3

0

0

18

R4

7

0

R5

41

R6

M

U

X

data

22

R7

0

dest

0

Initial

State

Bits 0-2

0

0

0

Bits 15-17

M

U

X

Bits 21-23

nop

nop

nop

IF/

ID

ID/

EX

EX/

Mem

Mem/

WB

Slides thanks to Sally McKee

slide17

+

A

L

U

add 3 1 2

M

U

X

1

0

1

0

0

R0

0

36

R1

0

9

R2

Register file

0

M

U

X

PC

Inst

mem

Data

mem

add 3 1 2

12

R3

0

0

18

R4

7

0

R5

41

R6

M

U

X

data

22

R7

0

dest

0

Fetch:

add 3 1 2

Bits 0-2

0

0

0

Bits 15-17

M

U

X

Bits 21-23

nop

nop

nop

IF/

ID

ID/

EX

EX/

Mem

Mem/

WB

Time: 1

Slides thanks to Sally McKee

slide18

+

A

L

U

nand 6 4 5 add 3 1 2

M

U

X

1

0

2

1

0

R0

0

36

R1

1

0

9

R2

Register file

2

36

M

U

X

PC

Inst

mem

Data

mem

nand 6 4 5

12

R3

0

0

18

R4

7

9

R5

41

R6

M

U

X

data

22

R7

3

dest

0

Fetch:

nand 6 4 5

Bits 0-2

3

0

0

Bits 15-17

M

U

X

Bits 21-23

add

nop

nop

IF/

ID

ID/

EX

EX/

Mem

Mem/

WB

Time: 2

Slides thanks to Sally McKee

slide19

+

A

L

U

lw 4 20(2) nand 6 4 5 add 3 1 2

M

U

X

1

4

3

2

0

R0

0

36

R1

4

0

36

9

R2

Register file

5

18

M

U

X

PC

Inst

mem

Data

mem

lw 4 20(2)

12

R3

45

0

18

R4

9

7

7

R5

41

R6

M

U

X

data

22

R7

6

dest

9

Fetch:

lw 4 20(2)

Bits 0-2

3

6

3

0

Bits 15-17

M

U

X

Bits 21-23

nand

add

nop

IF/

ID

ID/

EX

EX/

Mem

Mem/

WB

Time: 3

Slides thanks to Sally McKee

slide20

+

A

L

U

add 5 2 5 lw 4 20(2) nand 6 4 5 add 3 1 2

M

U

X

1

8

4

3

0

R0

0

36

R1

2

45

18

9

R2

Register file

4

9

M

U

X

PC

Inst

mem

Data

mem

add 5 2 5

12

R3

-3

0

18

R4

45

7

7

18

R5

41

R6

M

U

X

data

22

R7

20

dest

7

Fetch:

add 5 2 5

Bits 0-2

3

6

4

6

3

Bits 15-17

M

U

X

Bits 21-23

lw

nand

add

IF/

ID

ID/

EX

EX/

Mem

Mem/

WB

Time: 4

Slides thanks to Sally McKee

slide21

+

A

L

U

sw 7 12(3) add 5 2 5 lw 4 20 (2) nand 4 5 6 add

M

U

X

1

23

5

4

0

R0

0

45

36

R1

2

-3

9

9

R2

Register file

5

9

M

U

X

PC

Inst

mem

Data

mem

sw 7 12(3)

45

R3

29

0

18

R4

-3

7

7

R5

41

R6

M

U

X

data

22

R7

20

5

dest

18

Fetch:

sw 7 12(3)

Bits 0-2

6

3

4

5

4

6

Bits 15-17

M

U

X

Bits 21-23

add

lw

nand

IF/

ID

ID/

EX

EX/

Mem

Mem/

WB

Time: 5

Slides thanks to Sally McKee

slide22

+

A

L

U

sw 7 12(3) add 5 2 5 lw 4 20(2) nand

M

U

X

1

9

5

0

R0

0

-3

36

R1

3

29

9

9

R2

Register file

7

45

M

U

X

PC

Inst

mem

Data

mem

45

R3

16

99

18

R4

29

7

7

22

R5

-3

R6

M

U

X

data

22

R7

12

dest

7

No more

instructions

Bits 0-2

4

6

5

7

5

4

Bits 15-17

M

U

X

Bits 21-23

sw

add

lw

IF/

ID

ID/

EX

EX/

Mem

Mem/

WB

Time: 6

Slides thanks to Sally McKee

slide23

+

A

L

U

sw 7 12(3) add 5 2 5 lw

M

U

X

1

15

0

R0

0

36

R1

16

45

9

R2

Register file

M

U

X

PC

Inst

mem

Data

mem

45

R3

99

57

0

99

R4

16

7

R5

-3

R6

M

U

X

data

22

R7

12

dest

22

No more

instructions

Bits 0-2

5

4

7

7

5

Bits 15-17

M

U

X

Bits 21-23

sw

add

IF/

ID

ID/

EX

EX/

Mem

Mem/

WB

Time: 7

Slides thanks to Sally McKee

slide24

+

A

L

U

sw 7 12(3) add

M

U

X

1

0

R0

16

36

R1

57

9

R2

Register file

M

U

X

PC

Inst

mem

Data

mem

45

R3

0

99

22

R4

57

16

R5

-3

R6

M

U

X

data

22

R7

dest

22

No more

instructions

Bits 0-2

5

7

Bits 15-17

M

U

X

Bits 21-23

sw

IF/

ID

ID/

EX

EX/

Mem

Mem/

WB

Time: 8

Slides thanks to Sally McKee

slide25

+

A

L

U

sw

M

U

X

1

0

R0

36

R1

9

R2

Register file

M

U

X

PC

Inst

mem

Data

mem

45

R3

99

R4

16

R5

-3

R6

M

U

X

data

22

R7

dest

No more

instructions

Bits 0-2

Bits 15-17

M

U

X

Bits 21-23

IF/

ID

ID/

EX

EX/

Mem

Mem/

WB

Time: 9

Slides thanks to Sally McKee

time graphs
Time Graphs

Time: 1 2 3 4 5 6 7 8 9

add

nand

lw

add

sw

fetch decode execute memory writeback

fetch decode execute memory writeback

fetch decode execute memory writeback

fetch decode execute memory writeback

fetch decode execute memory writeback

Slides thanks to Sally McKee

pipelining recap
Pipelining Recap
  • Powerful technique for masking latencies
    • Logically, instructions execute one at a time
    • Physically, instructions execute in parallel
      • Instruction level parallelism
  • Decouples the processor model from the implementation
    • Interface vs. implementation
  • BUT dependencies between instructions complicate the implementation