
10/27: Lecture Topics


jess

Presentation Transcript


  1. 10/27: Lecture Topics • Survey results • Current Architectural Trends • Operating Systems Intro • What is an OS? • Issues in operating systems

  2. Superscalar Pipelines • Superscalar pipelines can execute multiple instructions at once • 2+ instructions in any stage of the pipeline • Some processors allow 8 instructions to be issued at once • Most programs can only take advantage of 1 or 2 issue slots

  3. Out-of-Order Execution • Allows the processor to execute any instruction whose operands are ready, rather than waiting in program order • Enables more issue slots to be filled • Often out-of-order execution, but in-order commit • that is, write back results in the order they should have occurred • Note: IA-64 is in-order

  4. Longer Pipelines • Pipelines are getting longer • original RISC pipelines had 5 stages • pipelines now have up to 20 stages • Allows the clock cycle to be very fast • Okay as long as you can accurately predict branches (or get rid of them)
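
     The dependence of long pipelines on branch prediction can be sketched with a simple cost model: effective CPI = base CPI + (fraction of branches) x (misprediction rate) x (penalty in cycles). The function name, the 20% branch fraction, and the assumption that a misprediction costs about (pipeline depth - 1) cycles are illustrative choices, not figures from the lecture.

```python
def effective_cpi(branch_fraction, mispredict_rate, penalty_cycles, base_cpi=1.0):
    """Average cycles per instruction when each mispredicted branch
    wastes penalty_cycles cycles (a simplified, idealized model)."""
    return base_cpi + branch_fraction * mispredict_rate * penalty_cycles


# Compare a 5-stage and a 20-stage pipeline at two predictor accuracies.
# Assumption: 20% of instructions are branches, and a misprediction
# flushes roughly (depth - 1) stages of work.
for depth in (5, 20):
    for accuracy in (0.80, 0.95):
        cpi = effective_cpi(0.20, 1.0 - accuracy, depth - 1)
        print(f"depth={depth:2d} accuracy={accuracy:.0%} CPI={cpi:.2f}")
```

     Under these assumptions the deeper pipeline loses far more to each misprediction, which is why the faster clock only pays off when prediction accuracy is high.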

  5. Speculation • Prediction • better branch predictors (95% accurate) • predict many levels of branches • predict variable values • predict load addresses • Simultaneously execute both paths of a branch • Execute instructions even if there could be a dependency • sw after lw could be the same address, but probably not • let the sw execute and then fix it if you were wrong
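
     As one concrete example of branch prediction, here is a minimal sketch of a 2-bit saturating-counter predictor. The lecture does not name a specific predictor, so this scheme, the class name `TwoBitPredictor`, and the table size are assumptions chosen for illustration.

```python
class TwoBitPredictor:
    """Table of 2-bit saturating counters indexed by branch address.
    Counter values 0-1 predict not taken, 2-3 predict taken."""

    def __init__(self, table_size=16):
        self.table_size = table_size
        self.counters = [1] * table_size  # start at "weakly not taken"

    def _index(self, pc):
        return pc % self.table_size

    def predict(self, pc):
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def accuracy(predictor, pc, outcomes):
    """Fraction of outcomes predicted correctly, training as we go."""
    correct = 0
    for taken in outcomes:
        if predictor.predict(pc) == taken:
            correct += 1
        predictor.update(pc, taken)
    return correct / len(outcomes)


# A loop-closing branch: taken 9 times, then not taken once, repeated 10 times.
loop_outcomes = ([True] * 9 + [False]) * 10
print(accuracy(TwoBitPredictor(), 0x40, loop_outcomes))
```

     The two-bit counter tolerates the single not-taken outcome at each loop exit without flipping its prediction, so it mispredicts only once per loop after warming up.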

  6. Predicated Execution • Predicated execution allows conditional moves and conditional adds instead of only conditional branches • Avoids branches, which are bad because pipelines are so long • Almost everything in IA-64 is predicated (many 1-bit predicate registers) • HW problem with movn and movz was an example of this
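
     The idea behind conditional moves can be sketched by modeling the MIPS `slt`, `movn`, and `movz` instructions as small Python functions and computing a maximum with no conditional branch in the modeled instruction sequence. The helper names `max_cmov`, `select_mask`, and `max_mask` are hypothetical; Python's own conditional expressions stand in for what the hardware does in a single instruction.

```python
def slt(rs, rt):
    """Set-on-less-than: 1 if rs < rt, else 0."""
    return 1 if rs < rt else 0


def movn(rd, rs, rt):
    """Move-if-not-zero: new value of rd is rs when rt != 0, else rd is unchanged."""
    return rs if rt != 0 else rd


def movz(rd, rs, rt):
    """Move-if-zero: new value of rd is rs when rt == 0, else rd is unchanged."""
    return rs if rt == 0 else rd


def max_cmov(a, b):
    """max(a, b) as a straight-line sequence: slt, move, movn."""
    t = slt(a, b)               # t = 1 when a < b
    result = a                  # unconditional move
    result = movn(result, b, t) # overwrite with b only when t != 0
    return result


def select_mask(t, a, b):
    """Branch-free select using arithmetic: returns b if t == 1, a if t == 0."""
    return a ^ ((a ^ b) & -t)


def max_mask(a, b):
    return select_mask(slt(a, b), a, b)


print(max_cmov(3, 7), max_cmov(7, 3), max_mask(-2, -5))
```

     Both versions always execute the same instructions regardless of the data, so there is no branch for the pipeline to mispredict.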

  7. VLIW • Long Instruction Words (LIW) and Very Long Instruction Words (VLIW) • each instruction contains multiple smaller instructions that execute in parallel • (V)LIW instructions can be 128 to 1024 bits long and contain 3 to 16 instructions • It's the compiler's job to find independent instructions to execute
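
     The compiler's job of finding independent instructions can be sketched as a greedy, in-order packer that closes the current bundle whenever the next instruction conflicts with one already in it. This is a simplified illustration rather than a real VLIW scheduler (it never reorders instructions); the `Instr` type, the 3-wide bundle, and the register names are assumptions.

```python
from typing import NamedTuple, Tuple


class Instr(NamedTuple):
    text: str               # assembly-like text, for display only
    dest: str               # register written
    srcs: Tuple[str, ...]   # registers read


def conflicts(earlier, later):
    """True if `later` cannot share a bundle with `earlier`."""
    return (earlier.dest in later.srcs      # read-after-write
            or earlier.dest == later.dest   # write-after-write
            or later.dest in earlier.srcs)  # write-after-read (conservative)


def pack_bundles(program, width=3):
    """Pack instructions, in program order, into bundles of at most `width`
    mutually independent instructions."""
    bundles, current = [], []
    for instr in program:
        if len(current) == width or any(conflicts(p, instr) for p in current):
            bundles.append(current)
            current = []
        current.append(instr)
    if current:
        bundles.append(current)
    return bundles


program = [
    Instr("add r1, r2, r3", "r1", ("r2", "r3")),
    Instr("sub r4, r5, r6", "r4", ("r5", "r6")),
    Instr("mul r7, r1, r4", "r7", ("r1", "r4")),
    Instr("add r8, r2, r5", "r8", ("r2", "r5")),
    Instr("or  r9, r7, r8", "r9", ("r7", "r8")),
]

for n, bundle in enumerate(pack_bundles(program)):
    print(n, " | ".join(i.text for i in bundle))
```

     A real compiler would also reorder instructions and account for functional-unit types and latencies; the point here is only that independence has to be established before execution.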

  8. Register Windows • Saving registers on the stack during procedure calls hurts performance • Register windows use a stack of registers that are allocated to each procedure as it needs them • [Figure: stacked register windows labeled Baz(), Bar(), Foo()]
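
     A minimal sketch of the register-window idea, assuming a fixed number of non-overlapping windows and a spill to memory only when they run out. The class name, window count, and window size are hypothetical, and real designs such as SPARC additionally overlap caller and callee windows to pass arguments.

```python
class RegisterWindows:
    """Each procedure call gets a fresh window of registers. Windows are
    saved to memory only when the register file is full."""

    def __init__(self, n_windows=4, window_size=8):
        self.n_windows = n_windows
        self.window_size = window_size
        self.resident = [[0] * window_size]  # windows in the register file, oldest first
        self.spilled = []                    # windows saved to memory
        self.spills = 0
        self.fills = 0

    def call(self):
        if len(self.resident) == self.n_windows:
            # Register file is full: save the oldest window to memory.
            self.spilled.append(self.resident.pop(0))
            self.spills += 1
        self.resident.append([0] * self.window_size)

    def ret(self):
        if len(self.resident) + len(self.spilled) == 1:
            raise RuntimeError("return from outermost procedure")
        self.resident.pop()
        if not self.resident:
            # The caller's window was spilled earlier: bring it back.
            self.resident.append(self.spilled.pop())
            self.fills += 1

    def write(self, reg, value):
        self.resident[-1][reg] = value

    def read(self, reg):
        return self.resident[-1][reg]


rw = RegisterWindows(n_windows=2)
rw.write(0, 11)   # Foo() uses its window
rw.call()         # Foo() calls Bar(): no memory traffic
rw.write(0, 22)
rw.call()         # Bar() calls Baz(): file is full, Foo()'s window spills
rw.write(0, 33)
rw.ret()          # back in Bar()
print(rw.read(0), rw.spills, rw.fills)
```

     As long as the call depth stays within the number of windows, calls and returns touch no memory at all; the cost appears only on overflow and underflow.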

  9. Smarter Compilers • VLIW requires good compilers • Predicated execution and speculation need help from the compiler • Old architectures had instructions to emulate high-level constructs (bad) • New architectures provide many general instructions and instruction options • IA-64 will keep compiler writers busy for a decade

  10. Multiple CPUs on a Chip • Chip multiprocessors • multiple simple CPUs, but share a cache • can run multiple programs simultaneously • single programs are no faster • like a multiprocessor machine but cheaper • Simultaneous Multithreading (SMT) • more complex CPUs • like chip multiprocessors + superscalar + out-of-order • also improves single program performance • developed at UW • memory bandwidth is an issue for both

  11. Funky Hardware on a Chip • We can squeeze more and more transistors on a chip • What do we do with them? • Bigger caches (boring) • Put programmable hardware on the CPU • FPGAs can be (re)programmed quickly • hardware can run up to 1000X faster than software • Graphics-specific hardware • Instruction co-processors • Simultaneously run two copies of all programs to avoid hardware glitches

  12. Low Power • CPUs are being put in everything, even devices that have very small batteries (tiny sensors) • Need to make CPUs that use very little power (only as much as they need) • reduce the CPU clock frequency • allow the OS to turn off part of the chip • Transmeta is building chips that emulate Intel x86, but with less power

  13. Time to Market • It used to be solely about being the fastest • Now being adequate is enough • Being the first technology to fill a need is the most important
