Instruction Set Issues




PowerPoint Slideshow about ' Instruction Set Issues' - trella


Presentation Transcript
Instruction Set Issues
  • MIPS easy
    • Instructions are committed only at the MEM-to-WB transition
  • Other architectures are more difficult
    • Instructions may update state early
    • FP more difficult
    • Memory updating ops (e.g. string moves)
Instruction Set Issues (cont.)
  • Difficult architectural features
    • “Odd” bits of state (e.g. condition codes)
      • May need saving/restoring on exceptions
    • Implicitly set condition codes
      • Complicate branch resolution
      • Explicit setting helps here (still a RAW hazard)
    • Multicycle operations
      • Widely differing execution times, lots of potential data hazards, etc.
Instruction Set Issues
  • VAX suffers from many of these problems
  • Solution: pipeline the microcode
  • Intel 32-bit 80x86 processors since 1995 use a similar approach
A.5. Handling Multicycle Operations
  • MIPS: FP operations
    • Long latency (EX repeated)
    • Several functional units
    • Structural hazards
    • Data hazards
DLX: FP Design
  • Four functional units:
    • Integer ALU
      • as before
    • FP multiplier
      • also used for integer multiplication
    • FP adder
      • addition, subtraction and conversion
    • FP divider
      • also used for integer division
Hazards
  • Divides
    • Structural hazard
  • Multiple register writes possible in a cycle
  • Out-of-order completion
    • WAW hazards
    • Exception-handling complications
  • RAW hazards increase
Potential RAW Hazards
  • Example (SPARC syntax):

ldd [%fp-8], %f4

fmuld %f4, %f6, %f0

faddd %f0, %f8, %f2

std %f2, [%fp-16]
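The dependence chain above can be traced with a small model. This is a minimal sketch, assuming in-order single issue, full forwarding, and illustrative latencies (load 1, multiply 6, add 3 extra cycles before a dependent instruction may issue). The latency table and the function name raw_stalls are assumptions for illustration, not values taken from the slides, and the model ignores the fact that a store may need its data slightly later than other consumers.

```python
# Assumed result latencies: extra cycles a dependent instruction must wait
# beyond the normal one-per-cycle issue rate. Illustrative values only.
LATENCY = {"ldd": 1, "fmuld": 6, "faddd": 3, "std": 0}

# (opcode, source registers, destination register or None)
PROGRAM = [
    ("ldd",   ["%fp"],        "%f4"),
    ("fmuld", ["%f4", "%f6"], "%f0"),
    ("faddd", ["%f0", "%f8"], "%f2"),
    ("std",   ["%f2", "%fp"], None),
]

def raw_stalls(program):
    """Return the RAW stall cycles suffered by each instruction."""
    ready = {}   # register -> earliest cycle a consumer may issue
    issue = 0    # cycle in which the previous instruction issued
    stalls = []
    for op, srcs, dest in program:
        earliest = issue + 1  # next in-order issue slot
        need = max([ready.get(r, 0) for r in srcs] + [earliest])
        stalls.append(need - earliest)
        issue = need
        if dest is not None:
            ready[dest] = issue + 1 + LATENCY[op]
    return stalls

if __name__ == "__main__":
    print(raw_stalls(PROGRAM))
```

Under these assumed latencies every instruction after the load stalls on its predecessor, which is why long-latency FP units raise the RAW stall count.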

Multiple Writes
  • Up to four instructions may need to write in the same cycle
  • Solution
    • Track writes in ID
    • Stall at instruction issue
      • Simpler: all stalls occur at one point
  • Alternatively:
    • Stall at MEM or WB
      • Stall the instruction with the shorter latency (letting the longer one proceed may clear RAW hazards)
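Tracking writes in ID can be sketched as a reservation table for the register write port. This is a minimal sketch, assuming a single write port and that each instruction knows in ID how many cycles remain until it writes. The function name and the example distances are hypothetical.

```python
def issue_with_write_port_check(cycles_to_write):
    """cycles_to_write: for each instruction in program order, the number of
    cycles between leaving ID and writing the register file.
    Returns the cycle in which each instruction leaves ID."""
    reserved = set()  # future cycles in which the write port is already claimed
    cycle = 0
    issued = []
    for distance in cycles_to_write:
        cycle += 1                           # next in-order issue slot
        while cycle + distance in reserved:  # write port conflict detected in ID
            cycle += 1                       # stall one cycle and retry
        reserved.add(cycle + distance)
        issued.append(cycle)
    return issued

if __name__ == "__main__":
    # One long operation followed by three shorter ones (assumed distances).
    print(issue_with_write_port_check([7, 4, 3, 3]))
```

Because the check happens before issue, all interlock logic stays in one stage, at the cost of stalling instructions that might otherwise have made progress.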
WAW Hazards
  • Example:

faddd %f4, %f6, %f2

… ! Integer op

ldd [%fp-8], %f2

WAW Hazards (cont.)
  • Rare
    • Compiler scheduling may result in unlikely instruction sequences, so must be caught
  • Solutions:
    • Stall issue of ldd
    • Prevent write by faddd
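The first solution, stalling the issue of the later instruction, can be sketched as follows. This is a minimal sketch, assuming the hardware records the cycle in which each pending register write will land. The function name, the pending_writes map, and the latencies in the example are assumptions for illustration.

```python
def issue_cycle_avoiding_waw(pending_writes, dest, latency, cycle):
    """Return the earliest cycle >= cycle at which an instruction may issue
    so that its write to dest lands after any earlier pending write.

    pending_writes: dict mapping register -> cycle of the pending write
    latency: cycles from issue to register write for the new instruction
    """
    earlier_write = pending_writes.get(dest)
    if earlier_write is None:
        return cycle  # no earlier writer in flight, no WAW hazard
    # The new write happens at issue + latency, which must exceed earlier_write.
    return max(cycle, earlier_write - latency + 1)

if __name__ == "__main__":
    # faddd issued in cycle 1 and will write %f2 in cycle 5 (assumed latency 4).
    pending = {"%f2": 5}
    # ldd wants to issue in cycle 3 and writes one cycle after issue (assumed).
    print(issue_cycle_avoiding_waw(pending, "%f2", 1, 3))
```

The alternative on the slide, preventing the write by faddd, would instead cancel the pending entry, since its result is overwritten before any instruction reads it.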
Maintaining Precise Exceptions
  • Out-of-order completion:

fdivd %f2, %f4, %f0

faddd %f10, %f8, %f10

fsubd %f12, %f14, %f12

  • faddd and fsubd complete long before fdivd
  • fsubd may raise an exception after faddd has completed but before fdivd has
    • The exception is no longer precise
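The out-of-order completion in this example can be checked with a short model. This is a minimal sketch, assuming one instruction issues per cycle and that an exception is raised in the cycle the faulting instruction finishes. The latencies (24 for divide, 4 for add and subtract) and the function name are illustrative assumptions.

```python
def state_at_exception(program, faulting_index):
    """program: list of (name, latency); instruction i issues in cycle i + 1.
    Returns (completed, pending): the earlier instructions that have and have
    not finished by the cycle in which the faulting instruction finishes."""
    finish = [i + 1 + latency for i, (_, latency) in enumerate(program)]
    fault_cycle = finish[faulting_index]
    completed = [program[i][0] for i in range(faulting_index)
                 if finish[i] <= fault_cycle]
    pending = [program[i][0] for i in range(faulting_index)
               if finish[i] > fault_cycle]
    return completed, pending

if __name__ == "__main__":
    program = [("fdivd", 24), ("faddd", 4), ("fsubd", 4)]
    print(state_at_exception(program, 2))
```

A non-empty pending list at the fault means an earlier instruction is still in flight while a later one has already changed state, which is exactly the imprecise situation described above.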
Maintaining Precise Exceptions
  • It may be very difficult to handle exceptions precisely
    • E.g. the add has destroyed one of its operands!
  • Four solutions:
    • Accept imprecise exceptions
      • Problematic: VM & IEEE FP essentially require precise exceptions
      • Allow switching between precise and imprecise modes
Maintaining Precise Exceptions
  • Solutions (cont.)
    • Buffer results until earlier instructions complete
      • Buffers may grow very large, and extensive forwarding required
      • History files: restore original register values
      • Future files: store new register values
    • Software executes intervening instructions to get “up to date” before returning from exception
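The history-file idea can be sketched in a few lines. This is a minimal sketch, assuming each instruction carries a program-order sequence number and that WAW hazards are already prevented, so writes to any one register occur in program order. The class and method names are hypothetical and do not describe any particular processor.

```python
class HistoryFile:
    """Register file that logs old values so later writes can be undone."""

    def __init__(self, registers):
        self.regs = dict(registers)
        self.log = []  # entries of (sequence number, register, old value)

    def write(self, seq, reg, value):
        # Save the value being overwritten before updating the register.
        self.log.append((seq, reg, self.regs.get(reg)))
        self.regs[reg] = value

    def rollback(self, faulting_seq):
        """Undo every write made by the faulting instruction or a later one."""
        for seq, reg, old in reversed(self.log):
            if seq >= faulting_seq:
                self.regs[reg] = old
        self.log = [entry for entry in self.log if entry[0] < faulting_seq]

if __name__ == "__main__":
    hf = HistoryFile({"%f8": 2.0, "%f10": 1.5})
    hf.write(1, "%f10", 3.5)  # faddd (seq 1) overwrites its own operand
    hf.rollback(0)            # fdivd (seq 0) faults afterwards
    print(hf.regs["%f10"])
```

A future file takes the opposite approach: new values are held aside and copied into the architectural registers only once all earlier instructions have completed.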
Maintaining Precise Exceptions
  • Solutions (cont.)
    • Hybrid scheme
      • Instructions are only issued when it is certain that preceding instructions will not cause an exception
      • May require stalling the pipeline
Performance of the MIPS FP Pipeline
  • Structural Hazards (divide unit)
    • Very low: 0-2 cycles per FP operation
  • RAW hazards
    • Divide: 12-24 cycles, average 14.2
    • Add: 0.7-2.3 cycles, average 1.7
    • In general, about 0.5 × latency
Overall MIPS FP Performance
  • Stalls per instruction
    • 0.65-1.21 cycles
    • Average: 0.87
    • 82% from FP RAW hazards
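The averages above can be combined into a rough breakdown. This is a minimal sketch, assuming an ideal CPI of 1.0 with all remaining cycles attributed to the stalls quoted on the slide. The ideal-CPI figure and the variable names are assumptions, not values from the slides.

```python
STALLS_PER_INSTRUCTION = 0.87  # average from the slide
RAW_SHARE = 0.82               # fraction of stalls due to FP RAW hazards
IDEAL_CPI = 1.0                # assumption: one instruction per cycle without stalls

def raw_stall_cycles(stalls=STALLS_PER_INSTRUCTION, share=RAW_SHARE):
    """Stall cycles per instruction attributable to FP RAW hazards."""
    return stalls * share

def effective_cpi(stalls=STALLS_PER_INSTRUCTION, ideal=IDEAL_CPI):
    """Cycles per instruction once the average stalls are added."""
    return ideal + stalls

if __name__ == "__main__":
    print(round(raw_stall_cycles(), 2), round(effective_cpi(), 2))
```

On these figures roughly 0.71 of the 0.87 stall cycles per instruction come from FP RAW hazards.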
A.6. Putting It All Together: MIPS R4000 Pipeline
  • 64-bit instruction set
  • Eight stage pipeline
    • superpipelining
    • IF + IS: instruction fetch
    • RF: decode/register fetch
    • EX: execution
    • DF + DS + TC: data cache access
    • WB: write back
MIPS R4000 Pipeline
  • Performance
    • Load delay: two cycles
    • Branch delay: three cycles
      • Delayed branch (one cycle)
      • Predict-not-taken strategy, with annulling
  • Increased forwarding requirements
    • Three stages between EX and WB now
MIPS R4000 Pipeline
  • Floating Point
    • Three functional units
      • Divider, multiplier, adder
      • Shared components (8 sub-units)
    • Latency: 2–112 cycles
    • Initiation rate: 1–111 cycles
    • Complicated stall handling
MIPS R4000 Pipeline
  • Performance:
    • CPI between 1.2 and 2.8 for SPEC92 benchmarks
    • Average: 2.0
      • Integer: 1.54
      • FP: 2.48
    • Integer apps: mainly branch delays
    • FP apps: mainly FP data hazard stalls (RAW)