Topic 5 Processor Development AH Computing Computer Architecture
SQA arrangements • Description of the evolution of the following microprocessor architectures: the Power PC series, the Intel X86 series and the Intel IA-64 in terms, where appropriate, of the following features and techniques: • increasing clock speeds • data bus widths • pipelining • superscalar processing • branch prediction • speculative loading of data and executing of instructions • predication • the number and function of registers used • SIMD • RISC • CISC • Explanation of the relationship between these developments and system performance.
Introduction • From the 1980s, microprocessor architecture developed rapidly, as a result of • Increasing miniaturisation of microelectronic circuitry, which means that more and more complex chip designs have become possible and economically viable • The pressure from software developers to design microprocessors with ever increasing performance
Introduction • The first microprocessors were not general purpose processors but were designed for specific applications
Development of Intel: Intel 4004 (1971) • the first complete CPU on one chip • the first commercially available microprocessor, used in calculators, data terminals, numeric control systems etc. • 16 4-bit general purpose (GP) registers • 1KByte of data memory and 4Kbytes of instruction memory • Clock speed of 740 kHz • 45 instructions
Intel 8080 (1974) • 16-bit address bus, 8-bit data bus • PC was 16 bits long • 7 8-bit GP registers • Used in the first personal computer, the Altair 8800 • Others…Zilog Z-80, Motorola/MOS 6502
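The link between address bus width and the amount of memory a processor can address can be sketched in a few lines. This is a minimal illustration, assuming byte-addressable memory with one location per address; the function name `addressable_bytes` is made up for this example.

```python
def addressable_bytes(address_bus_bits: int) -> int:
    """Number of distinct addresses an address bus of this width can select.

    Assumes one byte per address (byte-addressable memory).
    """
    if address_bus_bits < 0:
        raise ValueError("bus width cannot be negative")
    return 2 ** address_bus_bits


# The 8080's 16-bit address bus gives 2^16 locations, i.e. 64 KB.
locations = addressable_bytes(16)
print(locations, "bytes =", locations // 1024, "KB")
```

Each extra address line doubles the addressable memory, which is why widening the address bus from 16 to 24 or 32 bits was such a significant step in later processors.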
Processor Development Look at the evolution of families of processors • Power PC • Intel X86 • Intel IA-64
Processor Development Compare the following features and techniques • Increasing clock speeds • Data bus widths • Pipelining • Superscalar processing • Branch prediction • Speculative loading of data • Predication • The number and function of registers used • SIMD • RISC • CISC
Pentium • Intel introduced superscalar architecture to the Pentium processor • 2 integer arithmetic and logic units • 1 Floating Point unit • 8 80-bit
Development of registers X86 8086 80286
Development of registers X86 Pentium 3
Summary of X86 The X86 series of microprocessors can be characterised as having: • a relatively small number of registers (8 GP, 8 FP and 8 SIMD) • a large instruction set • instructions of varying length • many addressing modes • These characteristics are typical of CISC (complex instruction set computer) architecture. Other CISC based processors include the IBM 370 and the VAX11/780.
Questions (Scholar page 128) • Sketch a graph of the increase in clock speeds from the 8086 to the Pentium processor • Which of the X86 processors was the first to use pipelining to improve performance? • How many registers has the (a) 8086, (b) 80286, (c) 80486 (d) Pentium • Which X86 chip was the first to have a superscalar architecture? • The X86 series are considered to be CISC processors. Justify this claim.
Background • Improvements in processor capability and operating systems led to the birth of the Wintel PC • Wintel is a portmanteau of Windows and Intel. It usually means a computer based on an Intel x86 compatible processor and running the Microsoft Windows operating system. • Still dominates the laptop and desktop market
Motorola • At the same time, Motorola was developing its own family of microprocessors, the 68000 series • These were developed as 32-bit processors from the start • As a result, Apple was able to develop its Macintosh computers with a true graphical OS from the start
Motorola 68000 (1979) • Same time as Intel 8086 • 8MHz clock speed • 32-bit architecture • 16-bit data bus, 24-bit address bus • 16 32-bit registers (8 data, 8 address) • No segment registers required as direct addressing used • Used pre-fetching to speed up execution
Motorola 68020 (1984) • 32-bit data and address buses • Pipeline had 3 stages • 256-byte instruction cache added
Motorola 68040 (1991) • 32-bit data and address buses • Pipeline had 6 stages • Floating point unit added • 4Kbyte caches for data and programs added
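The benefit of a longer pipeline can be estimated with a simple cycle count. The sketch below is an idealised model only: it assumes every stage takes exactly one clock cycle and that there are no stalls, hazards or branch penalties. The function names are illustrative.

```python
def cycles_unpipelined(n_instructions: int, n_stages: int) -> int:
    """Each instruction passes through every stage before the next one starts."""
    return n_instructions * n_stages


def cycles_pipelined(n_instructions: int, n_stages: int) -> int:
    """Ideal pipeline: the first instruction takes n_stages cycles to fill
    the pipeline, then one instruction completes on every following cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + n_instructions - 1


# 100 instructions on a 6-stage pipeline, as in the 68040
print(cycles_unpipelined(100, 6))  # 600
print(cycles_pipelined(100, 6))    # 105
```

In this ideal case the speed-up approaches the number of stages as the instruction count grows. Real programs fall short of that because branches and data dependencies force the pipeline to stall or be flushed.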
Motorola 68060 (1994) • Superscalar – 3 execution units, 2 integer and 1 FP • 10 stage pipelines • 8Kbyte caches for data and programs
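A rough feel for superscalar issue can be had from a toy model with 2 integer units and 1 floating point unit, matching the unit counts on this slide. The sketch assumes in-order issue, single-cycle instructions and no data dependencies, none of which holds exactly for a real chip; `cycles_to_issue` and the instruction labels are made up for this example.

```python
def cycles_to_issue(instructions, int_units=2, fp_units=1):
    """Count cycles needed to issue a list of "int"/"fp" instructions in order.

    Each cycle, instructions are issued in program order until the next
    one finds no free execution unit of its type.
    """
    if int_units < 1 or fp_units < 1:
        raise ValueError("need at least one unit of each type")
    if any(kind not in ("int", "fp") for kind in instructions):
        raise ValueError('instructions must be "int" or "fp"')

    cycles = 0
    i = 0
    while i < len(instructions):
        free = {"int": int_units, "fp": fp_units}
        while i < len(instructions) and free[instructions[i]] > 0:
            free[instructions[i]] -= 1
            i += 1
        cycles += 1
    return cycles


program = ["int", "int", "fp", "int", "fp", "fp"]
print(cycles_to_issue(program))  # 3 cycles, against 6 if issued one at a time
```

The gain depends heavily on the instruction mix: a run of floating point instructions in this model still issues only one per cycle, because there is a single FP unit.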
Motorola series • Used in Sun workstations, Apple Macintosh computers, and later Atari computers • No longer in use in main computer market • Still used in embedded systems
Main Characteristics of Motorola series In the final years of the 68000 processors, Apple, Motorola and IBM defined a specification for open system software and hardware, and Motorola and IBM designed the first PowerPC chip to meet this specification.
PowerPC • Acronym for “performance optimised with enhanced RISC” • Compared with CISC-based X86 • More registers • A smaller, but more efficient, instruction set • Fewer addressing modes
PowerPC • First chip 601 in 1993 • 32-bit chip with a 64-bit data bus • Clock speed of 60MHz • Up to 4 GB of memory • Superscalar architecture: 3 independent execution units (integer, floating point and branch processing) – each with a 6 stage pipeline
Used in the Xbox 360 • Used in the Nintendo Wii
Power PC overview • Used in • Controllers in cars • Networking – routers and servers • Honda’s Asimo • Vehicle-Management Computer for the F-35 fighter jet • PlayStation 3, Wii, Nintendo GameCube
All Power PC processors have • two sets of 32 programmer accessible GP registers (64 bits wide) • And a small number of special purpose registers
Direct addressing for Load, Store and Branch instructions. All other instructions address internal registers Comparison of X86 with PowerPC
Summary of table • clock speeds have increased by a factor of 50 in 10 years • bus speeds have increased by a factor of 20 • the complexity (no. of transistors) has increased by a factor of 20 • on chip cache has increased • new features have been added.
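The overall factors in this summary can be converted into an average yearly growth rate. This is a sketch only, assuming steady compound growth across the whole period, which real product releases do not follow exactly; the function name is illustrative.

```python
def annual_growth_factor(total_factor: float, years: float) -> float:
    """Average yearly multiplier g such that g ** years == total_factor."""
    if total_factor <= 0 or years <= 0:
        raise ValueError("total_factor and years must be positive")
    return total_factor ** (1 / years)


clock = annual_growth_factor(50, 10)  # clock speed: factor of 50 in 10 years
bus = annual_growth_factor(20, 10)    # bus speed: factor of 20 in 10 years
print(f"clock speed: about {100 * (clock - 1):.0f}% per year")
print(f"bus speed:   about {100 * (bus - 1):.0f}% per year")
```

On these figures, clock speed grew by roughly 48% a year and bus speed by roughly 35% a year, so the gap between processor speed and bus speed widened over the decade.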
Clock speeds • PowerPC chips had clock speeds lower than CISC based designs • But more efficient RISC based technology gave a better performance. • Clock speed alone cannot be used to compare processors
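One common way to see why clock speed alone is misleading is the relation: execution time = instruction count × average cycles per instruction (CPI) ÷ clock rate. The figures below are hypothetical, chosen only to illustrate the point; they are not measurements of any real X86 or PowerPC chip.

```python
def execution_time(instruction_count: int, cpi: float, clock_hz: float) -> float:
    """Seconds needed to run a program on a processor.

    instruction_count: instructions the program executes
    cpi:               average clock cycles per instruction
    clock_hz:          clock rate in cycles per second
    """
    return instruction_count * cpi / clock_hz


# Hypothetical processors running the same one-million-instruction program
cpu_a = execution_time(1_000_000, cpi=2.0, clock_hz=200e6)  # faster clock, higher CPI
cpu_b = execution_time(1_000_000, cpi=1.2, clock_hz=150e6)  # slower clock, lower CPI

print(f"CPU A: {cpu_a * 1000:.1f} ms")
print(f"CPU B: {cpu_b * 1000:.1f} ms")
```

Here the processor with the slower clock finishes first because it needs fewer cycles per instruction. In practice the instruction count also differs between architectures, since a RISC program usually needs more, simpler instructions, so all three terms have to be considered together.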
Questions (Page 133) • Which 3 companies cooperated in the design of the PowerPC specification? • What was the first PowerPC chip released, and when? • The 601 chip can be described as superscalar. How is this justified? • How many programmer accessible registers are there in all PowerPC chips? • Compare the X86 and PowerPC architectures in terms of • a) instruction set • b) instruction length • c) addressing modes • What new feature did the G3 chip have which improved performance? • Which was the first PowerPC chip to have SIMD instructions? • a) 601 • b) 604e • c) G3 • d) G4 • e) G5 • Why is clock speed not a good way of comparing a Windows PC with an Apple Macintosh? • Other than in Apple computers, what are PowerPC chips used for?
Answers • Q10: Apple, Motorola, IBM • Q11: the 601 in 1993 • Q12: it has 3 independent processing units - the floating point unit (FPU), the integer ALU, and the system unit • Q13: 2 sets of 32 registers, each 64 bits wide • Q14: • a) similar - X86 has 235 different instructions, PowerPC has 225 • b) X86 has varied instruction lengths (1-11 bytes), the PowerPC instructions are all exactly 4 bytes • c) the X86 has 11 addressing modes, the PowerPC has only 2 • Q15: L2 "backside" cache on chip • Q16: d) G4 • Q17: because the Mac uses the more efficient RISC architecture, a Mac with a lower clock speed may outperform a Windows PC with a higher clock speed • Q18: IBM servers, Nintendo Game Cube, and a range of embedded applications
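The difference in instruction length noted in Q14 b) affects how easily a processor can find where each instruction starts. The sketch below is illustrative only: the list of instruction lengths is invented, and the function names are not taken from any real decoder.

```python
def fixed_length_offset(index: int, length: int = 4) -> int:
    """Byte offset of instruction number `index` when every instruction
    has the same length (4 bytes, as for PowerPC)."""
    return index * length


def variable_length_offset(lengths: list[int], index: int) -> int:
    """Byte offset of instruction number `index` when lengths vary.

    Every earlier instruction must be examined first, because its
    length decides where the next one begins.
    """
    return sum(lengths[:index])


# Hypothetical X86-style instruction lengths, each between 1 and 11 bytes
lengths = [1, 3, 2, 6, 1]

print(fixed_length_offset(3))              # 12
print(variable_length_offset(lengths, 3))  # 6
```

With fixed-length instructions the start of any instruction is a single multiplication, which makes it simpler to fetch and decode several instructions at once. With variable lengths, the boundaries are only known after earlier instructions have been at least partly decoded.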
Intel IA-64 • The X86 series reached its peak with the Pentium 3, Pentium 4 and Athlon processors. • These are essentially CISC processors, using pipelining and superscalar processing, but with some RISC-like features. In 1994, Intel and HP began work on designing a new 64-bit architecture to replace the X86 series.
EPIC • The IA-64 design combines RISC and CISC features, and is described as EPIC - explicitly parallel instruction computing. There are 4 key features to the design: • instruction level parallelism - the compiler creates code which uses the many parallel execution units of the processor • use of VLIW - very long instruction words • use of predication - executing both branches of a program, then discarding the "not chosen" branch results • use of speculative loading - use of large fast cache to load data and instructions in advance of when they will be required
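Predication, as described in the list above, can be mimicked in software by computing both outcomes and then keeping only the one selected by the condition. This is a conceptual sketch in Python, not IA-64 code; the function names are illustrative, and a real processor does this with predicate registers rather than arithmetic.

```python
def branching_abs_diff(a: int, b: int) -> int:
    """Conventional version: the processor must choose one path to follow."""
    if a > b:
        return a - b
    else:
        return b - a


def predicated_abs_diff(a: int, b: int) -> int:
    """Predicated version: both paths are executed, one result is discarded."""
    p = int(a > b)          # predicate: 1 if the "then" path is chosen, else 0
    then_result = a - b     # executed regardless of the predicate
    else_result = b - a     # executed regardless of the predicate
    # Keep the chosen result; the other is multiplied by 0 and so discarded.
    return p * then_result + (1 - p) * else_result


print(branching_abs_diff(3, 10), predicated_abs_diff(3, 10))
```

The predicated version does more arithmetic, but it contains no branch for the processor to mispredict, so the pipeline never has to be flushed. That trade-off only pays off when spare execution units are available to run both paths in parallel.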