Instruction Set Principles

najwa

Presentation Transcript


  1. Instruction Set Principles
     • An ISA should reflect application characteristics:
        • Desktop computing is compute-intensive, so it focuses on features favoring integer and FP operations;
        • Server computing is data-intensive, focusing on integers and character strings (yet FP operations are still standard in servers);
        • Embedded computing is time-sensitive and memory- and power-conscious, so it focuses on code density, real-time behavior, and media data streams.

  2. Instruction Set Principles
     • Taxonomy of ISA:
        • Stack: both operands are implicit on the top of the stack, a data structure in which items are accessed in a last-in, first-out fashion.
        • Accumulator: one operand is implicit in the accumulator, a special-purpose register.
        • General-Purpose Register: all operands are explicit, in specified registers or memory locations. Depending on where operands are specified and stored, there are three ISA groups:
           • Register-Memory: one operand in a register and one in memory. Examples: IBM 360/370, Intel 80x86 family, Motorola 68000;
           • Memory-Memory: both operands are in memory. Example: VAX.
           • Register-Register (load-store): all operands, except those in load and store instructions, are in registers. Examples: SPARC (Sun Microsystems), MIPS, Precision Architecture (HP), PowerPC (IBM), Alpha (DEC).

  3. Instruction Set Principles CA+B Taxonomy of ISA: Examples (a) Stack (e) Memory-Memory (d) Reg-Reg/Load-Store (b) Accumulator (c) Register-Memory TOS Reg. Set Reg. Set Stack Accumulator ALU ALU ALU ALU ALU Memory Memory Memory Memory Memory Push A Push B Add Pop C Load A Add B Store C Load R1,A Add R1,B Store R1,C Load R1,A Load R2,B Add R3,R1,R2 Store R3,C Add C,A,B or Add A,B

  4. Instruction Set Principles • Comparisons:

  5. Instruction Set Principles
     • Addressing Memory: how memory addresses are specified and interpreted is important, since all data are initially in memory.
     • Interpreting Memory Addresses
        • All computers, except DSPs, are byte-addressed, providing access to bytes, half-words (2 bytes), words (4 bytes), and double words (8 bytes)
        • Ordering bytes within a larger object: 8 bytes in a double word
           • Little Endian: bytes are numbered 7 6 5 4 3 2 1 0 (the lowest address holds the least-significant byte)
           • Big Endian: bytes are numbered 0 1 2 3 4 5 6 7 (the lowest address holds the most-significant byte)
        • Byte ordering can be a problem when exchanging data between computers with different ordering conventions
        • Alignment of bytes: an access to an object of size s bytes at byte address A is aligned if A mod s = 0. Memory is typically aligned on a multiple of a word or double-word boundary
        • Misalignment causes extra memory accesses and HW costs
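The byte-ordering and alignment rules above can be checked with a few lines of Python. This is a small sketch using the built-in `int.to_bytes`; the helper name `is_aligned` and the sample value are assumptions made for illustration.

```python
# One 8-byte double word laid out in memory under both conventions.
value = 0x0102030405060708

little = value.to_bytes(8, "little")  # lowest address holds the least-significant byte
big = value.to_bytes(8, "big")        # lowest address holds the most-significant byte

def is_aligned(address, size):
    """An access of `size` bytes at byte address `address` is aligned if address mod size == 0."""
    return address % size == 0

# Reading little-endian bytes as if they were big-endian yields a different number,
# which is the data-exchange problem the slide mentions.
misread = int.from_bytes(little, "big")
```

For instance, `is_aligned(8, 8)` holds for a double word at address 8, while a 4-byte word at address 6 is misaligned.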

  6. Instruction Set Principles
     • Addressing Modes: how an ISA specifies the address of an object to be accessed (fig. 2.6-2.7)
        • Operands: can be found in registers, in memory locations, and in the instructions themselves (the instruction stream)
        • Effective Address: the actual memory address used when an operand is in a memory location
        • PC-Relative Addressing: addressing modes that depend on the program counter
        • Immediates/Literals: considered memory addressing modes, even though the value they access is in the instruction stream
        • Displacement Mode: the range of displacements must be chosen judiciously (via quantitative studies, fig. 2.8)
        • Immediate/Literal Mode: must decide the level of support (all operations or a subset) and the range of values (fig. 2.9-2.10)
        • Modulo/Circular Mode for DSPs: handling an infinite, continuous stream of data relies on circular buffers
        • Bit-Reverse Mode: used exclusively for FFTs
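As a rough sketch of what three of these modes compute, the functions below model the address arithmetic only. All function names are hypothetical, and real hardware performs these steps inside the address-generation unit rather than in software.

```python
def displacement_ea(base_register_value, displacement):
    """Displacement mode: effective address = base register + constant offset."""
    return base_register_value + displacement

def circular_next(index, step, buffer_length):
    """Modulo/circular mode: the index wraps around at the end of the buffer."""
    return (index + step) % buffer_length

def bit_reverse(index, bits):
    """Bit-reverse mode: reverse the low `bits` bits of the index (FFT reordering)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result

# Access order for an 8-point FFT using a 3-bit reversed index.
fft_order = [bit_reverse(i, 3) for i in range(8)]
```

With an 8-entry circular buffer, stepping by 3 from index 6 wraps to index 1, so a continuous stream can be processed without ever running off the end of the buffer.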

  7. Instruction Set Principles
     • Type and Size of Operands: in all modern computers the opcode encoding designates the operand type, whereas older machines used tags to indicate types
     • Desktop and Server architectures:
        • Character: 8-bit, usually in ASCII
        • 16-bit Unicode: used in Java, and gaining popularity
        • Integers: almost universally represented as two's complement binary numbers: short integer (half-word), integer (word), long integer (double-word)
        • Single-precision (1-word) and double-precision (2-word) floating point: the IEEE floating-point standard, IEEE 754
     • Architectures supporting business applications:
        • Packed decimal/binary-coded decimal: 4 bits encode each of the values 0-9 and two decimal digits are packed into each byte, giving results that exactly match decimal numbers (some decimal fractions have no exact representation in binary)
     • The frequency of access to each type helps determine which types are most important to support efficiently (fig. 2.12)
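The packed-decimal encoding described above can be sketched as follows. The helper names are assumptions, and the sketch handles non-negative integers only (real packed-decimal formats also carry a sign).

```python
def to_packed_bcd(n):
    """Pack a non-negative integer into BCD: one decimal digit per 4 bits, two per byte."""
    digits = str(n)
    if len(digits) % 2:
        digits = "0" + digits  # pad to a whole number of bytes
    return bytes((int(digits[i]) << 4) | int(digits[i + 1])
                 for i in range(0, len(digits), 2))

def from_packed_bcd(data):
    """Decode packed BCD bytes back into an integer."""
    n = 0
    for byte in data:
        n = n * 100 + (byte >> 4) * 10 + (byte & 0x0F)
    return n

packed = to_packed_bcd(1234)

# The motivation for decimal formats: 0.1 and 0.2 have no exact binary
# representation, so their binary floating-point sum is not exactly 0.3.
binary_is_inexact = (0.1 + 0.2 != 0.3)
```

Here 1234 occupies two bytes, 0x12 and 0x34, so the hexadecimal dump of the encoded value reads the same as the decimal number.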

  8. Instruction Set Principles
     • Operands for Media and Signal Processing:
        • Graphics applications deal with 2D and 3D images
           • Vertex: a data structure with four components, usually 32-bit floating-point values, for representing 3D images: x-coordinate, y-coordinate, z-coordinate, w-coordinate (color or hidden surfaces)
           • Pixel: consists of four 8-bit channels: R (red), G (green), B (blue), and A (transparency of the surface or pixel)
        • DSPs add a unique data type:
           • Fixed point: the binary point sits just to the right of the sign bit, so the value represents a fraction between -1 and +1
           • Blocked floating point: one exponent is often shared among many fixed-point variables. Because a fixed-point word carries no exponent, the DSP programmer keeps the exponent in a separate variable and ensures that each result is shifted left or right to keep the values aligned.
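A minimal sketch of the fixed-point format described above, assuming a 16-bit word (1 sign bit and 15 fraction bits, commonly called Q15). The word width and the helper names are assumptions; the slide does not specify them.

```python
Q15_SCALE = 1 << 15            # 15 fraction bits to the right of the sign bit
Q15_MIN, Q15_MAX = -32768, 32767

def to_q15(x):
    """Convert a real number in [-1, +1) to a 16-bit fixed-point integer, clamping at the ends."""
    return max(Q15_MIN, min(Q15_MAX, round(x * Q15_SCALE)))

def from_q15(q):
    """Interpret a fixed-point integer as a fraction between -1 and +1."""
    return q / Q15_SCALE

def q15_mul(a, b):
    """Multiply two fixed-point values; the shift puts the binary point back in place."""
    return (a * b) >> 15

half = to_q15(0.5)
quarter = q15_mul(half, half)
```

The shift in `q15_mul` is the kind of realignment the blocked floating-point bullet refers to: no exponent travels with the word, so the programmer is responsible for keeping the binary point where it belongs.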

  9. Instruction Set Principles
     • Operations in the Instruction Set (fig. 2.15):
        • Rule of thumb: the most widely executed instructions are the simple operations of an instruction set (fig. 2.16)
     • Operations for Media and Signal Processing: less precision and narrower data widths suffice because of the tolerance of human perception
        • Partitioned add: four 16-bit adds performed on a single 64-bit ALU in a single cycle (SIMD or vector instructions, fig. 2.17)
        • Paired operations: one instruction can launch two 32-bit operations on operands found side by side in a double-precision register
        • Saturating arithmetic: because of real-time requirements, a DSP does not take an exception on overflow; instead the result is replaced by the largest representable number
        • Multiply-accumulate (MAC): key to dot-product operations for vector and matrix multiplies (MACs/second is the primary peak performance metric for DSPs)
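The three media operations above can be modeled in software as a rough sketch. The function names are hypothetical; the 16-bit lanes and 64-bit word come from the slide, while the signed 16-bit saturation range and the clamping of negative overflow to the most negative value are assumptions.

```python
def saturating_add16(a, b):
    """Signed 16-bit add that clamps on overflow instead of wrapping or trapping."""
    return max(-32768, min(32767, a + b))

def partitioned_add16(x, y):
    """Four independent unsigned 16-bit adds inside one 64-bit word.
    A carry out of one lane is discarded and never reaches the next lane."""
    result = 0
    for lane in range(4):
        shift = 16 * lane
        lane_sum = ((x >> shift) & 0xFFFF) + ((y >> shift) & 0xFFFF)
        result |= (lane_sum & 0xFFFF) << shift
    return result

def mac_dot(xs, ys):
    """Dot product as a chain of multiply-accumulate steps: acc = acc + x * y."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y
    return acc

packed_sum = partitioned_add16(0x0001_FFFF_0002_0003, 0x0001_0001_0002_0003)
```

In `packed_sum` the third lane computes 0xFFFF + 1 and wraps to 0 without disturbing its neighbor, which is what separates a partitioned add from an ordinary 64-bit add.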

  10. Instruction Set Principles
     • Instructions for Control Flow
        • There are four types of control-flow change (fig. 2.19):
           • Conditional branch: 75% integer and 82% FP
              • How to specify branch conditions? (fig. 2.21-2.22)
           • Jump (or unconditional branch): 6% integer and 10% FP
           • Procedure calls and procedure returns: 19% integer and 8% FP
              • Caller saving vs. callee saving
        • Addressing Modes for Control Flow Instructions:
           • PC-relative: advantageous when targets are near the branch instruction, and has the desirable property of position independence (fig. 2.20)
           • Register indirect jumps: if the target is not known at compile time, PC-relative addressing cannot be used; instead, a register specifies the target dynamically
              • Case or switch statements: in most languages
              • Virtual functions or methods: in OO languages
              • Higher-order functions or function pointers: in C or C++
              • Dynamically shared libraries
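The two control-flow addressing modes can be illustrated with a short sketch. The names below are hypothetical, and the jump table stands in for a register whose contents are only known at run time.

```python
def pc_relative_target(pc, offset):
    """PC-relative branch: the target is encoded as a distance from the branch itself."""
    return pc + offset

# Position independence: if the whole program is loaded `delta` bytes higher,
# the same encoded offset still reaches the (moved) target.
delta = 0x4000
original_target = pc_relative_target(0x1000, 0x20)
relocated_target = pc_relative_target(0x1000 + delta, 0x20)

# Register-indirect jump modeled as a switch statement: the target is looked up
# in a table at run time instead of being fixed in the instruction.
def case_zero():
    return "zero"

def case_one():
    return "one"

jump_table = [case_zero, case_one]

def dispatch(selector):
    target = jump_table[selector]  # "load the target address into a register"
    return target()                # "jump through the register"
```

The same table-lookup pattern underlies virtual method calls and calls through function pointers, where the target is likewise unknown until the program runs.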

  11. Instruction Set Principles
     • Encoding an Instruction Set: there are three choices
        • Variable: allows virtually all addressing modes to be used with all operations, enabling the smallest code representation
           • Examples: VAX and Intel 80x86 (1-5 operands, each with 10 addressing modes)
           • Format: Operation and # of operands | Address specifier 1 | Address field 1 | ... | Address specifier n | Address field n
        • Fixed: load-store ISA, with only one memory operand and only one or two addressing modes, so the addressing mode can be encoded as part of the opcode
           • Examples: Alpha, ARM, MIPS, PowerPC, SPARC, SuperH
           • Largest code size
           • Format: Operation | Address field 1 | Address field 2 | Address field 3
        • Hybrid: IBM 360/370, MIPS16, Thumb, TI TMS320C54x (fig. 2.23)
     • Competing forces: number and size of registers and addressing modes, code size, and ease of pipelining

  12. Instruction Set Principles
     • The Role of Compilers:
        • The Structure of Recent Compilers: multi-phased (fig. 2.24)
        • Difficulties: each phase must make gross assumptions about the abilities of later phases, hence the phase-ordering problem. For instance, an early phase cannot guarantee that registers will be allocated where they are most desirable.
           • Example: global common subexpression elimination replaces multiple computations of the same expression with a single computation and a temporary location that stores the value. If this temporary is not allocated to a register, the slow memory accesses may negate the gain from the optimization!
        • Register Allocation: plays a central role in compiler optimization, both in speeding up the code and in making other optimizations useful.
           • graph coloring (≥16 general-purpose registers) for simple cases, and heuristics for more complicated cases
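Register allocation by graph coloring can be sketched with a simple greedy heuristic. This is an illustrative approximation, not the allocator of any particular compiler; the function name, the interference graph, and the variable names are all assumptions.

```python
def color_registers(interference, num_regs):
    """Greedy coloring of an interference graph.
    interference maps each variable to the set of variables live at the same time.
    Returns variable -> register number, or None when the variable must be spilled."""
    assignment = {}
    # Heuristic: handle the most constrained variables (highest degree) first.
    order = sorted(interference, key=lambda v: len(interference[v]), reverse=True)
    for var in order:
        taken = {assignment[n] for n in interference[var] if n in assignment}
        free = [r for r in range(num_regs) if r not in taken]
        assignment[var] = free[0] if free else None  # None means spill to memory
    return assignment

# Hypothetical program: a, b, c are all live together; d overlaps only with a.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}

with_three = color_registers(graph, 3)
with_two = color_registers(graph, 2)
```

With three registers every variable gets one; with two, one of the mutually interfering variables is spilled, which is the case where a temporary introduced by common subexpression elimination ends up in memory and costs more than it saves.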

  13. Instruction Set Principles
     • Impact of Optimizations on Performance:
        • Major types of optimizations and examples in each class
        • Change in instruction count for the programs lucas and mcf from SPEC2000 as compiler optimization levels vary:
           • Level 0: unoptimized;
           • Level 1: local optimizations, code scheduling, and local register allocation;
           • Level 2: global optimizations, loop transformations, and global register allocation; and
           • Level 3: procedure integration

  14. Instruction Set Principles
     • The Impact of Compiler Technology on the Architect's Decisions:
        • How are variables allocated and addressed?
        • How many registers are needed to allocate variables appropriately?
        • Three areas in which data are allocated:
           • stack: grows on procedure calls and shrinks on returns; holds activation records; register allocation is most effective for stack-allocated objects;
           • global data area: statically declared objects such as arrays or aggregate data structures; difficult, if not impossible, to allocate to registers if the objects are aliased;
           • heap: dynamic objects, accessed through pointers and typically non-scalar; register allocation is almost impossible because of the pointers
        • Because of aliasing, a compiler must be conservative: it is impossible to know what a pointer may refer to or, conversely, what an object is referred to by.
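Why aliasing forces the compiler to be conservative can be shown with a small sketch, in which one-element lists stand in for memory locations reached through pointers. The function names are hypothetical.

```python
def update_reload(p, q):
    """Conservative code: re-read p[0] from memory after the store through q."""
    p[0] = 1
    q[0] = 2
    return p[0]

def update_cached(p, q):
    """Aggressive code: keep the value stored to p[0] in a 'register' and reuse it."""
    reg = 1
    p[0] = reg
    q[0] = 2
    return reg  # stale if q refers to the same location as p

x, y = [0], [0]
distinct_reload = update_reload(x, y)
distinct_cached = update_cached(x, y)

z = [0]
aliased_reload = update_reload(z, z)   # p and q are the same object
aliased_cached = update_cached(z, z)
```

When the two arguments are distinct objects both versions agree, but when they alias, the cached version returns a stale value. Since the compiler generally cannot prove that two pointers differ, it has to emit the reloading version.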

  15. Instruction Set Principles
     • How the Architect Can Help the Compiler Writer:
        • Guiding principle for the compiler designer: make the frequent cases fast and the rare cases correct.
        • Other guidelines:
           • Regularity: orthogonality (independence among the three components of an ISA: operation, data type, and addressing mode) helps the compiler make decisions early and correctly;
           • Provide primitives, not solutions: support for HLLs should not be language-dependent;
           • Simplify trade-offs among alternatives: help the compiler writer understand the costs of the various alternatives (optimizing objectives);
           • Provide instructions that bind the quantities known at compile time as constants
        • It is better to err on the side of simplicity: less is more!

  16. Instruction Set Principles
     • The MIPS Architecture:
        • MIPS is a simple 64-bit load-store architecture.
        • 32 64-bit general-purpose registers:
           • R0, R1, … R31 integer registers; the value of R0 is always 0.
        • 32 64-bit floating-point registers:
           • F0, F1, … F31 floating-point registers;
        • Data types:
           • 8-bit bytes, 16-bit half words, 32-bit words, and 64-bit double words for integers;
           • 32-bit single precision and 64-bit double precision for floating point.
        • Addressing modes:
           • Register; immediate and displacement with a 16-bit field.
        • Byte-addressable memory, with a mode bit that lets software select either Big Endian or Little Endian
        • Instruction encoding: fixed
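As a sketch of how a fixed 32-bit encoding with a 16-bit displacement field fits together, the code below packs and unpacks an instruction using the standard MIPS I-type layout (6-bit opcode, two 5-bit register fields, 16-bit immediate). That layout and the `lw` opcode value 0x23 come from the MIPS architecture rather than from this slide, and the helper names are assumptions.

```python
def encode_i_type(opcode, rs, rt, immediate):
    """Pack an I-type instruction: opcode(6) | rs(5) | rt(5) | immediate(16)."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)

def decode_i_type(word):
    """Unpack an I-type instruction and sign-extend the 16-bit immediate."""
    opcode = (word >> 26) & 0x3F
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    immediate = word & 0xFFFF
    if immediate & 0x8000:       # negative displacement
        immediate -= 0x10000
    return opcode, rs, rt, immediate

LW_OPCODE = 0x23  # load word

# lw R1, -4(R2): load into R1 from the address in R2 minus 4.
word = encode_i_type(LW_OPCODE, rs=2, rt=1, immediate=-4)
fields = decode_i_type(word)
```

Because every field sits at a fixed bit position, decoding is a handful of shifts and masks, which is part of what makes fixed encodings easy to pipeline.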

  17. Instruction Set Principles • The MIPS Instruction Format:

  18. Instruction Set Principles • The MIPS Operations: • Load and store instructions

  19. Instruction Set Principles • The MIPS Operations: • ALU instructions

  20. Instruction Set Principles • The MIPS Operations: • Control flow instructions

  21. Instruction Set Principles –MIPS Example

  22. Instruction Set Principles –MIPS/DLX Example

  23. Instruction Set Principles –MIPS Example

  24. Instruction Set Principles –MIPS Example

  25. Instruction Set Principles –MIPS Example

  41. Instruction Set Principles –MIPS/DLX Example