Superscalar Microprocessors

Robert Hock, 4/23/02


Presentation Transcript


  1. Superscalar Microprocessors • Robert Hock 4/23/02

  2. Superscalar Microprocessors • Topics Covered • Superscalar Processor Overview • MIPS R10000 • Intel IA-32 • PowerPC

  3. What does superscalar mean? • Definition: • Superscalar machines are able to issue multiple instructions for each clock cycle from a conventional linear instruction stream

  4. In English This Time • A superscalar processor can execute instructions out of program order to keep its execution units busy. Instructions take different numbers of cycles to complete, which introduces latency into program execution. By pipelining these instructions, the processor can work on several of them at once and complete them out of order.

  5. How Does it Work? • Instructions are introduced in sequence • These instructions are scheduled dynamically by the hardware • More than one instruction can be issued each clock cycle • The number of instructions issued is also set dynamically by the hardware
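
The behaviour on this slide can be sketched as a toy cycle-by-cycle model. This is only an illustration under stated assumptions, not a model of any real processor: the `Instr` class, the `simulate` function, the register names, and the latencies are all hypothetical, and each destination register is assumed to be written only once (as if renaming had already happened).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    dest: str          # register written by the instruction
    srcs: tuple        # registers read by the instruction
    latency: int = 1   # cycles until the result is available


def simulate(program, width=2):
    """Return how many instructions were issued in each clock cycle."""
    # Work out which earlier instructions each instruction depends on.
    last_writer = {}
    producers = []
    for i, ins in enumerate(program):
        producers.append([last_writer[s] for s in ins.srcs if s in last_writer])
        last_writer[ins.dest] = i

    done_at = {}                          # instruction index -> cycle its result is ready
    waiting = list(range(len(program)))   # instructions enter in program order
    issued_per_cycle = []
    cycle = 0
    while waiting:
        # The "hardware" picks up to `width` instructions whose inputs are ready.
        ready = [i for i in waiting
                 if all(p in done_at and done_at[p] <= cycle for p in producers[i])]
        issued = ready[:width]
        for i in issued:
            done_at[i] = cycle + program[i].latency
            waiting.remove(i)
        issued_per_cycle.append(len(issued))
        cycle += 1
    return issued_per_cycle


program = [
    Instr("r1", ("a", "b"), latency=2),
    Instr("r2", ("r1",)),          # needs r1
    Instr("r3", ("c", "d")),       # independent
    Instr("r4", ("r3", "r2")),     # needs r3 and r2
    Instr("r5", ("e",)),           # independent
]

if __name__ == "__main__":
    print(simulate(program, width=2))
```

With a width of 2, the number of instructions issued changes from cycle to cycle depending on which operands are ready, which is the "set dynamically by the hardware" point made above.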

  6. Phases of the Superscalar Pipeline • Fetch • Pre-fetch • Decode • Rename • Issue • Execute • Complete • Reorder • Commit • Retire • Write-Back

  7. Fetch & Decode • Fetching & Decoding can be done faster than Execution • Processor Fetches & Decodes more instructions than it Commits, because it discards instructions from mispredicted branch paths

  8. Pre-Fetch & Pre-Decoding • Pre-Decoding is done when instructions are transferred from memory to the cache • The Pre-Decoded instruction is simpler than the original • The Decoder can decode this format faster than the original

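  9. Renaming • Renaming assigns physical registers to take the place of the logical registers named in the instructions • This removes false dependencies between instructions that merely reuse the same logical register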
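
A minimal sketch of renaming, assuming a simple free list and a logical-to-physical map. The `rename` function, the tuple format `(dest, src, ...)`, and the `p0`, `p1`, ... names are hypothetical. Physical registers are never returned to the free list here, which real hardware does at commit time.

```python
def rename(program, num_physical=16):
    """Rewrite logical register names as physical ones.

    program: list of tuples (dest, src1, src2, ...) using logical names.
    Raises IndexError if the free list runs out (no recycling in this sketch).
    """
    free = [f"p{i}" for i in range(num_physical)]
    mapping = {}    # logical register -> physical register currently holding it
    renamed = []
    for dest, *srcs in program:
        phys_srcs = []
        for s in srcs:
            if s not in mapping:          # first use of a live-in value
                mapping[s] = free.pop(0)
            phys_srcs.append(mapping[s])  # read the current mapping
        mapping[dest] = free.pop(0)       # every write gets a fresh register
        renamed.append((mapping[dest], *phys_srcs))
    return renamed


program = [
    ("r1", "r2", "r3"),   # r1 = r2 op r3
    ("r4", "r1", "r5"),   # reads r1 (true dependence)
    ("r1", "r6", "r7"),   # writes r1 again (false dependence on the first two)
]

if __name__ == "__main__":
    for before, after in zip(program, rename(program)):
        print(before, "->", after)
```

After renaming, the two writes to `r1` target different physical registers, so the third instruction no longer has to wait for the second one to read the old value.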

  10. Issue • Waiting instructions are analyzed to find instructions beyond the current instruction that can be executed independently • This is “Look-Ahead” capability • Instructions can be issued in-order or out-of-order
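
The difference between in-order and out-of-order issue can be sketched as a selection over a window of waiting instructions. This is a simplified illustration: the `select` function, the window format, and the register names are hypothetical, and false dependencies are assumed to have been removed by renaming.

```python
def select(window, ready_regs, width, in_order):
    """Pick up to `width` instructions to issue this cycle.

    window: list of (name, source_registers) in program order.
    ready_regs: set of registers whose values are already available.
    """
    chosen = []
    for name, srcs in window:
        if len(chosen) == width:
            break
        if all(s in ready_regs for s in srcs):
            chosen.append(name)
        elif in_order:
            break   # in-order issue stalls at the first blocked instruction
        # out-of-order issue looks ahead past the blocked instruction
    return chosen


window = [
    ("load", ["r1"]),
    ("add", ["r9"]),         # r9 is still being computed
    ("mul", ["r2", "r3"]),
    ("sub", ["r4"]),
]
ready = {"r1", "r2", "r3", "r4"}

if __name__ == "__main__":
    print(select(window, ready, width=3, in_order=True))
    print(select(window, ready, width=3, in_order=False))
```

In this example the in-order policy issues only `load`, while the look-ahead policy skips the blocked `add` and also issues `mul` and `sub`.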

  11. Execute • An instruction executes in a single cycle or takes multiple cycles, depending on the operation • After Execution, the Completion phase is reached

  12. Reorder • The Reorder logic determines whether the instruction was on a predicted branch path, and whether that prediction was correct • Execution exceptions are marked

  13. Commit • An executed instruction is committed when: • All previous instructions required by the program have already been committed • No interrupt has occurred • Any branch prediction the instruction was executed under has turned out to be correct

  14. Retire • An instruction is Retired when either: • The instruction has been committed • The instruction has been removed because of a branch misprediction or an exception
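
The Reorder, Commit, and Retire slides can be illustrated with a small reorder buffer. This is a sketch under simplifying assumptions, not any particular processor's design: the `ReorderBuffer` class and its method names are hypothetical, interrupts and exceptions are left out, and a mispredicted branch is assumed to still be in the buffer when it is resolved.

```python
from collections import deque


class ReorderBuffer:
    def __init__(self):
        self.entries = deque()   # oldest instruction at the left

    def dispatch(self, name):
        """Add an instruction in program order; returns its entry."""
        entry = {"name": name, "done": False}
        self.entries.append(entry)
        return entry

    def commit(self):
        """Commit finished instructions from the head, in program order."""
        committed = []
        while self.entries and self.entries[0]["done"]:
            committed.append(self.entries.popleft()["name"])
        return committed

    def squash_after(self, branch_entry):
        """Remove every instruction younger than a mispredicted branch."""
        squashed = []
        while self.entries and self.entries[-1] is not branch_entry:
            squashed.append(self.entries.pop()["name"])
        squashed.reverse()
        return squashed


if __name__ == "__main__":
    rob = ReorderBuffer()
    add = rob.dispatch("add")
    beq = rob.dispatch("beq")
    mul = rob.dispatch("mul")
    sub = rob.dispatch("sub")

    mul["done"] = True
    print(rob.commit())           # mul finished early but cannot commit yet
    add["done"] = True
    print(rob.commit())           # add commits; beq is still pending
    print(rob.squash_after(beq))  # beq was mispredicted: younger work is discarded
```

An instruction that finishes early still waits for all older instructions, and work on a mispredicted path leaves the buffer without ever being committed.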

  15. Write-Back • As the name implies, the instruction's final result is written back

  16. MIPS R10000 Overview • 64-bit architecture (MIPS IV instruction set) • Can decode 4 instructions per cycle • Has 5 execution pipelines • Uses dynamic scheduling and out-of-order execution • Executes speculatively past predicted branches

  17. MIPS R10000 Pipeline Diagram

  18. R10000 Functional Units • Integer ALU1 • Integer ALU2 • Load/Store Unit • Float Adder • Float Multiply

  19. R10000 Pipeline Stages • Stage 1: Fetch 4 instructions per cycle • Stage 2: 4 instructions are decoded and renamed; only 1 branch instruction can be decoded per cycle • Stage 3: Decoded instructions are issued

  20. R10000 Pipeline Stages (cont) • Stages 4-6 (dependent on instruction) • Float Multiply (3 stage pipeline) • Float Adder (3 stage pipeline) • Integer ALU1 (1 stage pipeline) • Integer ALU2 (1 stage pipeline)

  21. Intel IA-32 Overview • 32-bit instruction set • 3-way superscalar • 12 stage pipeline • “Optimized” (out-of-order) scheduling, which necessitates retiring instructions in program order

  22. IA-32 Functional Units • Integer • Float • Load • Store1 • Store2 • Jump • MMX (Multimedia Instructions)

  23. IA-32 Pipeline Stages • Stages 1-5: Fetch and Predecode • Stages 6 & 7: Decode • Stage 8: Renaming

  24. IA-32 Pipeline (cont) • Stages 9 & 10: Issue • Stage 11: Execution • Stage 12: Retirement

  25. IA-32 Latencies (clock cycles) • Integer Arithmetic: 1 • Integer Mult: 4 • Float Add: 3 • Float Mult: 5 • Load & Store: 3 • MMX Arithmetic: 1 • MMX Mult: 3
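
To show how these latencies add up, the sketch below compares a chain of dependent operations with the same operations run independently. The dictionary keys and function names are hypothetical, the numbers are taken from the slide, and the model ignores issue limits, cache misses, and every other source of stalls.

```python
# Latencies in clock cycles, as listed on the slide.
LATENCY = {
    "int_add": 1,
    "int_mul": 4,
    "fp_add": 3,
    "fp_mul": 5,
    "load": 3,
    "store": 3,
    "mmx_add": 1,
    "mmx_mul": 3,
}


def chain_latency(ops):
    """Cycles for operations where each one needs the previous result."""
    return sum(LATENCY[op] for op in ops)


def independent_latency(ops):
    """Cycles for operations with no dependencies, all started together
    (assumes enough functional units are free)."""
    return max(LATENCY[op] for op in ops)


if __name__ == "__main__":
    ops = ["load", "fp_mul", "fp_add"]
    print(chain_latency(ops))        # 3 + 5 + 3 cycles
    print(independent_latency(ops))  # limited by the slowest operation
```

A dependent load, multiply, add sequence costs the sum of the three latencies, while independent operations are limited only by the slowest one, which is why finding independent instructions matters so much to a superscalar design.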

  26. PowerPC 750 Overview • 32-bit RISC Processor with a 64-bit data bus • 32-bit addressing

  27. Functional Units • Float (3 Stage Pipeline) • Branch • Load/Store • Single Cycle Integer • Multi Cycle Integer

  28. PowerPC Pipeline • Fetch • Issue • Integer OP (+3 Depth) • Load OP (+7 Depth) • Store OP (+5 Depth) • Float OP (+6 Depth)

  29. Conclusion • While the R10000 and PowerPC are truly RISC based, the IA-32 has its roots in the CISC world. • The IA-32 has a deeper pipeline, allowing for higher clock speeds, which allows for increased sales. This is despite the fact that it delivers only mediocre performance.

  30. Conclusion (cont) • For intensive numerical computation and 3D rendering, the MIPS R10000 is superior. • For everyday applications that require low voltage and low heat, the PowerPC line has an edge. • For the home user, the IA-32 will be sufficient until the AMD 64-bit Hammer line is introduced.

  31. For More Information • http://www.mips.com • http://www.intel.com • http://www.ibm.com • http://e-www.motorola.com/
