Microprocessors

amy

Presentation Transcript


  1. Microprocessors Introduction to PowerPC Architecture History & Interesting Tidbits

  2. Outline • Motorola has a long tradition as a leading provider of embedded technologies and has produced revolutionary microprocessor and microcontroller solutions • Motorola continues to build on that tradition of leadership and innovation with the ever-expanding family of microprocessors that implement the PowerPC instruction set architecture • In these slides, we’ll take a look at just how the PowerPC got to be where it is today.

  3. Background of POWER1 • Part of IBM’s first attempt at making a real workstation • POWER – Performance Optimization With Enhanced RISC • IBM redefined RISC to mean Reduced Instruction Set Cycle • Unlike classic RISC design, the POWER1 would be a complex processor • This meant more high level instructions and more memory-data processors • This goes against initial RISC philosophy!

  4. POWER1 Branch unit • Had three main units – branch, integer, and floating point • Branch unit unusually complex • Contained a program counter, condition code (CC) register, and loop register • CC register – 8 fields • First 2 reserved for fixed & float ops • The 7th for vector operations • and the rest could be set separately

  5. POWER1 Branch unit cont. • Loop register is a counter for ‘decrement and branch on zero’ loops with no branch penalty • Branch unit could dispatch multiple instructions while itself executing a program control op (up to four ops at once, and out of order) • This made it one of the first superscalar CPUs!
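
The loop register described above can be modeled in a few lines. This is only a behavioral sketch, not the POWER1 hardware: the function name `counted_loop` and its arguments are made up for illustration, and it assumes the usual pattern where the counter is loaded once, decremented after each pass, and the loop exits when it reaches zero.

```python
def counted_loop(count, body):
    """Run body(i) `count` times using a decrement-and-test counter."""
    if count < 1:
        # Real hardware would wrap around; the sketch simply rejects this case.
        raise ValueError("count must be at least 1")
    ctr = count            # load the loop register once, outside the loop
    trips = 0
    while True:
        body(trips)        # loop body
        trips += 1
        ctr -= 1           # decrement the loop register...
        if ctr == 0:       # ...and fall out of the loop when it reaches zero
            break
    return trips


seen = []
counted_loop(3, seen.append)   # seen is now [0, 1, 2]
```

Because the branch unit owns the counter, the decrement and test do not occupy the integer unit, which is where the "no branch penalty" claim comes from.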

  6. Integer/Float units • Thirty-two 32-bit registers for the integer unit and all load/store operations • Register R0 treated as a constant zero for some instructions • Used an MQ register for extended precision multiply/divides • Similar to the MIPS HI/LO registers • Thirty-two 64-bit registers for floating point unit • Performed only double precision operations • Used a condition bit to catch float errors (no exceptions!)

  7. MQ register • On the POWER1 the MQ register is 32 bits wide, matching the integer registers • During a multiply instruction, MQ receives the low-order half of the 64-bit product • During a divide instruction, MQ supplies the low-order half of the dividend and receives the remainder • It also acts as an extension register for the extended shift and rotate instructions

  8. PowerPC • Born out of a desire to produce a version of the POWER that would succeed both the Motorola 68000 & Intel 80x86 • Most notable changes: • Elimination of the MQ register • Replaced by separate upper and lower half instructions (able to execute simultaneously) • Some complex instructions were removed • Emulated in the new PowerPC • Support for 32-bit floating point
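
As a rough illustration of the "separate upper and lower half instructions", the sketch below models a signed 32 x 32 multiply whose 64-bit product is read back as two independent 32-bit halves. The function names are borrowed from the PowerPC mnemonics `mullw` (low word) and `mulhw` (high word); the helper `to_signed32` is an assumption of this sketch, and nothing here models timing or the pipeline.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x):
    """Interpret the low 32 bits of x as a two's-complement signed value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def mullw(a, b):
    """Low 32 bits of the signed 64-bit product."""
    return (to_signed32(a) * to_signed32(b)) & MASK32


def mulhw(a, b):
    """High 32 bits of the signed 64-bit product."""
    # Python's >> on negative ints is an arithmetic shift, so the sign carries.
    return ((to_signed32(a) * to_signed32(b)) >> 32) & MASK32


hi, lo = mulhw(0x10000, 0x10000), mullw(0x10000, 0x10000)   # 2**32 -> hi=1, lo=0
```

Since neither half depends on a shared special register, the two operations can be scheduled independently, which is the point the slide makes about dropping MQ.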

  9. PowerPC 601 (G1) • Meant to bridge the POWER1 and PowerPC features • Geared towards consumers using workstations rather than high-end systems • Essentially the same as the POWER1 except for a unified 32K cache (rather than separate I/D caches) • Held onto many of the ‘legacy’ instructions from the POWER1

  10. The POWER2 is RISCy • The big selling point of the POWER2 was its ability to handle six instructions at one time • However, it came with the caveat “under ideal conditions” • They couldn’t be just any old instructions -- to maintain that performance, the POWER2 had to mix exactly two integer instructions, two floating-point instructions, and two branch or condition-code instructions
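
The "ideal conditions" caveat can be made concrete with a toy issue model. This is a simplified sketch: the table `ISSUE_SLOTS` and the function `issuable` are hypothetical names, "branch" stands in for the branch or condition-code class, and data dependencies and instruction order are ignored entirely.

```python
from collections import Counter

# Per-cycle slots taken from the slide: two integer, two floating-point,
# and two branch or condition-code instructions.
ISSUE_SLOTS = {"int": 2, "float": 2, "branch": 2}


def issuable(window):
    """Count how many instructions in `window` fit into one cycle's slots."""
    used = Counter()
    issued = 0
    for kind in window:
        if used[kind] < ISSUE_SLOTS.get(kind, 0):
            used[kind] += 1
            issued += 1
    return issued


ideal = ["int", "int", "float", "float", "branch", "branch"]
all_integer = ["int"] * 6
# issuable(ideal) is 6, while issuable(all_integer) is only 2
```

An all-integer stretch of code therefore sees a third of the headline rate in this model, which is why the six-instruction figure rarely held outside carefully mixed code.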

  11. POWER2 cont. • Other additions to the POWER2 were: • Quad-word load and store instructions • Hardware square root instruction • New instructions for conversion of floating-point values to integers • Like the POWER1, this was targeted at high-end systems, leaving average users to use the PowerPC

  12. PowerPC 603 (G2) • Separated the load/store ops from the integer unit • Split the branch unit into a fetch/branch unit, a dispatch unit, and a completion/exception unit • Added a ‘rename’ buffer in the dispatch unit for speculative execution using renamed integer & float registers

  13. The little processor that couldn’t • Strategy for reducing the size of the 603 – • Use a split cache design (instead of a more complex unified cache) • Remove “unused” or “legacy” instructions • Reduced the cost and the power, so 603’s could be made much cheaper, and at higher speeds. • Had a slight performance penalty (per MHz) but the chips could be made at higher speeds -- which would more than make up for it. • A good idea, but marketing can be unpredictable

  14. 603’s Marketing Blunder • The 603 was compared to the 601 and other high end machines • MHz per dollar, the 603 beat out the 601 • But simply comparing MHz to MHz, the 601 was considerably faster • So buyers got the impression that they were getting ripped off • A case of mistaken expectations!

  15. 603 – The Energizer processor • Despite initial marketing problems, this processor became prolific and had far more variants than any other PowerPC • 603e (603+ / Stretch) – used to solve cache size problems • 603ev, 603p (Valiant), 603r, 603er (Goldeneye) – manufacturing optimization

  16. PowerPC 604 • The G2 processors were split into two different families (the 603's and the 604's). • The 604's were meant to be the ‘bad boys’ of the desktop - power and cost were not as important as pure blinding speed. • Unlike the 601 and 603, the 604 can issue as many as 4 instructions simultaneously

  17. PowerPC 604 float support • 604's also had tweaks to improve their ability to run inside of their larger L2 cache • Floating point units can become very dependent on cache and memory performance • The results: • 20% faster than the 603 at integer • roughly 70% faster in floating point • Just over twice as fast as the Pentiums of the same time

  18. Dynamic Branch Prediction • Processors take big performance penalties if they can't preload the cache • Being able to accurately "guess" the most likely used path can help keep the cache "preloaded" and increase processor performance • The 604 was the first mainstream processor to use "Dynamic Branch Prediction" • This greatly increased performance
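
To show what "dynamic" prediction means in practice, here is a minimal sketch of a textbook two-bit saturating-counter predictor. The slides do not say which scheme the 604 used, so this is a generic illustration rather than a description of that chip; the class `TwoBitPredictor`, the helper `accuracy`, and the starting state are all assumptions.

```python
class TwoBitPredictor:
    """One 2-bit saturating counter per branch address (0..3)."""

    def __init__(self):
        self.counters = {}   # branch address -> counter, default 1 (weakly not taken)

    def predict(self, pc):
        # Counter values 2 and 3 mean "predict taken".
        return self.counters.get(pc, 1) >= 2

    def update(self, pc, taken):
        c = self.counters.get(pc, 1)
        self.counters[pc] = min(c + 1, 3) if taken else max(c - 1, 0)


def accuracy(predictor, pc, outcomes):
    """Fraction of outcomes predicted correctly, learning as it goes."""
    correct = 0
    for taken in outcomes:
        if predictor.predict(pc) == taken:
            correct += 1
        predictor.update(pc, taken)
    return correct / len(outcomes)


# A loop branch taken nine times, then falling through once.
loop_branch = [True] * 9 + [False]
rate = accuracy(TwoBitPredictor(), 0x1000, loop_branch)
```

The predictor adapts to the branch's recent history at run time, unlike a static rule such as "backward branches are always taken".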

  19. G3’s – The Next Generation • Initially, the plan had been to create a new chip ‘solely’ based on the 604 • But after the highly successful second generation of PowerPC's, IBM and Motorola decided to split out development and create more processors

  20. 740 (Arthur) • The first was the 603 derivative • This processor got some changes to the core (the way it executes instructions) • Optimized the processor for the Macintosh OS • This of course resulted in a large performance boost, even more so than the boosts offered by the new backside cache • The 740 was fast, extremely small and efficient • It was outperforming Pentium II's while using less than 1/5th the amount of power and size

  21. 750 (Typhoon) • A variant of the 740 that has a fast method of access to the L2 ‘backside’ cache • Allows higher performance • L2 cache runs much faster than most -- and at speeds up to the clock rate of the main processor • Cache system really speeds things up, but requires more electronics (and pins) than the 740 • So while the chip cost isn't much more, the added cache can drive the cost of the system up (and increase the total power usage). • Still has very good performance per cost

  22. Hardware Aside • Aluminum has long been the standard material used for semiconductor wiring • IBM managed to use copper technology in their G3’s • The result? • Enhanced chip performance • Reduced die size and power consumption • 750 first created with standard aluminum design operating at up to 300 MHz • Applying IBM's copper manufacturing process to the same chip, the 750 featured speeds of at least 400MHz - a 33 percent performance improvement for the same chip!
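
The 33 percent figure follows directly from the two clock rates quoted above. The helper name `percent_gain` is made up for this sketch, and it assumes performance scales linearly with clock speed, which is the slide's implicit assumption.

```python
def percent_gain(old_mhz, new_mhz):
    """Relative increase from old_mhz to new_mhz, in percent."""
    return (new_mhz - old_mhz) / old_mhz * 100


gain = percent_gain(300, 400)   # about 33.3
```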

  23. Make room for 4th Generation • The 603-derived G3 performed very well with its backside cache, was very cheap to make, and scaled easily by adding more (or faster) L2 cache • Apple killed the clones and focused its product lines, which reduced demand for so many different high-end desktop PPCs • As a result, the 604-derived G3s (code named Habanero) and some of the other flavors (like ones with better MP support) were scrapped in favor of focusing on the G4s • This makes sense: those processors would not have come out until about the same time as the G4s anyway, and splitting into that many development efforts would have wasted money

  24. G4 • In direct response to Intel’s MMX instructions, AltiVec extensions were added to the G4 PowerPC • AltiVec adds a new set of 128-bit registers • Separate vector execution unit & instruction set supported by branch unit • Allows multimedia instructions to be executed in parallel with both int and float ops • Added an additional VRSAVE register to track which vector registers are being used • Reduces the # of registers needed to be saved
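
The idea behind VRSAVE can be sketched as a bitmask with one bit per vector register. The layout below assumes 32 vector registers and PowerPC-style numbering in which bit 0 is the most significant bit; the function names `mark_used` and `registers_to_save` are hypothetical and only illustrate how a context switch could skip unused registers.

```python
NUM_VECTOR_REGS = 32   # assumed register count for this sketch


def mark_used(vrsave, reg):
    """Return vrsave with the bit for vector register `reg` set."""
    if not 0 <= reg < NUM_VECTOR_REGS:
        raise ValueError("vector register out of range")
    return vrsave | (1 << (NUM_VECTOR_REGS - 1 - reg))


def registers_to_save(vrsave):
    """List the vector registers a context switch would need to save."""
    return [r for r in range(NUM_VECTOR_REGS)
            if vrsave & (1 << (NUM_VECTOR_REGS - 1 - r))]


mask = mark_used(mark_used(0, 0), 31)   # a routine that touches v0 and v31
live = registers_to_save(mask)          # only these two need saving
```

With 128-bit registers, saving 2 registers instead of 32 cuts the saved state from 512 bytes to 32 bytes in this example.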

  25. G4 cont. • Supports a 2 Megabyte L2 cache, which can help performance over the previous 1 MB L2 limit. • The MPX bus (used on the G4) is asynchronous and allows for up to 4 outstanding accesses at the same time • The result is up to a 3 fold performance increase for memory bound operations. • This is why specs can be so deceptive. Without changing the speed of the bus at all, Apple/Motorola made it up to 3 times faster!

  26. Conclusion • Obviously, the PowerPC architecture will play a part in embedded technology for years to come (due to low cost & energy) • As far as personal computers and workstations go, the PowerPCs generally outperform their Pentium counterparts • However, much of what’s holding the PowerPC back is consumer obsession with MHz

  27. MHz vs. Mega Bucks • “Only weeks ago, Motorola announced at a semiconductor conference that it would soon start shipping G4 processors operating close to the 1GHz mark. During his conference call, Jobs indicated that Apple would be working closely with Motorola to bridge the MHz gap, and introduce faster chips into the G4 systems. And in a rare preview of the future, Jobs indicated that new, faster G4 systems would begin shipping within the next 6 months.” - G4 Store Special Report

  28. Works Cited • http://www.g4store.com/news/ • http://www.mackido.com/Hardware/ • http://developer.apple.com/technotes/ • http://www.byte.com/art/9401/sec7/art2.htm • http://www3.sk.sympatico.ca/jbayko/cpu5.html • http://www.mot.com/SPS/PowerPC/
