
Crosscutting Issues: The Rôle of Compilers



Presentation Transcript


  1. Crosscutting Issues: The Rôle of Compilers • Architects must be aware of current compiler technology • [Diagram labels: Architecture, Compiler]

  2. Modern Compilers • Front End • High-level Optimisations • E.g. procedure inlining, loop transformations • Global Optimiser • Register allocation • Code Generator • Machine dependent optimisations

  3. Compiler Technology • Multiple passes complicate matters • E.g. common subexpression elimination must assume that a register will be allocated for the temporary value • E.g. Procedure inlining before size is known • Register allocation is critical • Uses graph colouring techniques • Requires at least 16 registers to be effective
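
The graph-colouring idea on the slide above can be sketched as follows. This is a simplified greedy colouring, not the algorithm of any particular compiler; the function name, the interference graph and the variable names are all hypothetical. Production allocators add simplification and spill-cost heuristics that are omitted here.

```python
def colour_registers(interference, num_registers):
    """Greedily colour an interference graph.

    interference maps each variable to the set of variables that are
    live at the same time (and so cannot share a register).
    Returns (assignment, spilled): a dict of variable -> register number,
    and a list of variables that did not fit and must live in memory.
    """
    assignment = {}
    spilled = []
    # Visit the most constrained variables first (highest degree),
    # breaking ties by name so the result is deterministic.
    for var in sorted(interference, key=lambda v: (-len(interference[v]), v)):
        taken = {assignment[n] for n in interference[var] if n in assignment}
        free = [r for r in range(num_registers) if r not in taken]
        if free:
            assignment[var] = free[0]
        else:
            spilled.append(var)
    return assignment, spilled


# Hypothetical example: a, b and c are all live together; d overlaps only c.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

enough, no_spills = colour_registers(graph, 3)
too_few, spills = colour_registers(graph, 2)
```

With three registers every variable gets one; with two, one variable has to be spilled, which is the pressure that makes a small register file a problem for this technique.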

  4. Architectural Issues • How are variables allocated and addressed? • Stack: local variables, scalars • Global data area: global variables, constants, arrays • Heap: dynamic objects, not scalars • How many registers are needed? • Integer: 26 registers • FP: 20 registers

  5. Aiding Compiler Writers • Architectures should: • Be regular (orthogonal instruction set) • Provide primitives, not solutions • Simplify trade-offs among alternatives • Not require run-time interpretation of data known at compile-time • Counter-example: VAX CALLS • Keep it simple!

  6. Compiler Support for Multimedia Instructions • SIMD instructions act on multiple smaller data items in a large “word” • Solutions, not primitives! • Too few registers! • Data types not found in programming languages! • Result: Only used by low-level graphics libraries.

  7. Multimedia Instructions • These SIMD instructions act like a “mini-vector” architecture • E.g. MMX in 64 bits • 8 × 8-bit vectors • 4 × 16-bit vectors • 2 × 32-bit vectors • SSE: 128 bits • Much more limited than genuine vector processors
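
A minimal sketch of how a partitioned add treats one 64-bit word as independent lanes, as described on the slide above. The function names are hypothetical and a Python integer stands in for the 64-bit register; this illustrates the lane idea only and does not model any real MMX instruction.

```python
def split_lanes(word, lane_bits, total_bits=64):
    """Split a word into lanes, least significant lane first."""
    mask = (1 << lane_bits) - 1
    return [(word >> shift) & mask for shift in range(0, total_bits, lane_bits)]


def join_lanes(lanes, lane_bits):
    """Pack lanes (least significant first) back into one word."""
    mask = (1 << lane_bits) - 1
    word = 0
    for index, lane in enumerate(lanes):
        word |= (lane & mask) << (index * lane_bits)
    return word


def packed_add(a, b, lane_bits, total_bits=64):
    """Add lane by lane; a carry out of one lane never reaches the next."""
    mask = (1 << lane_bits) - 1
    lanes_a = split_lanes(a, lane_bits, total_bits)
    lanes_b = split_lanes(b, lane_bits, total_bits)
    return join_lanes([(x + y) & mask for x, y in zip(lanes_a, lanes_b)], lane_bits)


# The same bit patterns give different results for different lane widths.
as_bytes = packed_add(0x01FF, 0x0001, 8)       # 0xFF + 0x01 wraps within its lane
as_halfwords = packed_add(0x01FF, 0x0001, 16)  # the carry stays inside the 16-bit lane
```

The lane width is chosen by the instruction, not by the data, which is one reason these operations map poorly onto the types of ordinary programming languages.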

  8. Putting It All Together: MIPS • 64-bit load/store design • RISC features: • GPR, load-store architecture • Small, simple instruction set • Designed for efficient pipelining (fixed length instructions) • Efficient compiler target

  9. MIPS • 32 64-bit integer registers • R0…R31 • R0 fixed: 0 • 32 64-bit or 32-bit floating point registers • Supports “paired single” operations

  10. MIPS Data Types • Integer: • Bytes, 16-bit halfwords, 32-bit words, 64-bit double words • Operations are all 64-bit • Floating point: • 32-bit and 64-bit

  11. MIPS Addressing Modes • Only immediate and displacement • 16-bit displacements/immediates • Register-indirect: set displacement = 0 • 16-bit absolute: use R0 • Byte addressable with 64-bit addresses • Big-endian or little-endian • Alignment required
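
The slide above notes that displacement addressing also covers register-indirect and absolute addressing. A small sketch, with a hypothetical register file and helper names, shows why: the effective address is always base register plus sign-extended 16-bit displacement.

```python
MASK64 = (1 << 64) - 1


def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed quantity."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_address(registers, base, displacement):
    """Base register plus sign-extended displacement, wrapped to 64 bits."""
    return (registers[base] + sign_extend16(displacement)) & MASK64


regs = [0] * 32       # R0 is fixed at zero
regs[4] = 0x1000      # hypothetical base address held in R4

displacement_mode = effective_address(regs, 4, 8)        # displacement: R4 + 8
register_indirect = effective_address(regs, 4, 0)        # displacement = 0
absolute_mode = effective_address(regs, 0, 0x0100)       # base = R0
negative_offset = effective_address(regs, 4, 0xFFF8)     # displacement = -8
```

One address calculation in hardware therefore gives the compiler three addressing modes to choose from.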

  12. MIPS Instructions • Three instruction formats (field widths in bits): • I-type: opcode (6), rs (5), rt (5), immediate (16) • R-type: opcode (6), rs (5), rt (5), rd (5), shamt (5), funct (6) • J-type: opcode (6), offset (26)
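
As an illustration of the three formats, the sketch below packs the fields into 32-bit words using the widths given on the slide. The helper names are hypothetical; the opcode and funct values in the examples are the standard MIPS encodings for ADD, ADDI and J, with register numbers 8 ($t0), 17 ($s1) and 18 ($s2).

```python
def encode_i(opcode, rs, rt, immediate):
    """I-type: opcode(6) rs(5) rt(5) immediate(16)."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)


def encode_r(opcode, rs, rt, rd, shamt, funct):
    """R-type: opcode(6) rs(5) rt(5) rd(5) shamt(5) funct(6)."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_j(opcode, offset):
    """J-type: opcode(6) offset(26)."""
    return (opcode << 26) | (offset & 0x3FFFFFF)


def opcode_of(word):
    """Every format keeps the opcode in the same top six bits."""
    return (word >> 26) & 0x3F


add_word = encode_r(0, 17, 18, 8, 0, 0x20)   # add  $t0, $s1, $s2
addi_word = encode_i(8, 17, 8, -1)           # addi $t0, $s1, -1
jump_word = encode_j(2, 0x100)               # j with a 26-bit offset field of 0x100

print(f"{add_word:#010x} {addi_word:#010x} {jump_word:#010x}")
```

Because every instruction is 32 bits and the opcode, rs and rt fields sit in fixed positions, decoding can begin before the format is fully known, which is part of what makes the design easy to pipeline.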

  13. MIPS Operations • Load-store • ALU operations • Add, subtract, multiply, divide, and, or, xor, LUI (load upper immediate), shifts • Control transfer • Set conditions • Branch (reg = 0, reg ≠ 0, reg1 = reg2, reg1 ≠ reg2), jump, jump-and-link (call) • Conditional move • Floating point • Paired single operations • Multiply-add (DSP)

  14. MIPS: Instruction Usage • Integer applications: • Load, add, branch, store, or, compare • FP applications: • Add (int), load (int), load, multiply, add, store • See Figure 2.34.

  15. Another View: Trimedia Media Processor • Embedded processor for multimedia applications • E.g. set-top boxes (decoders, etc.) and TVs • Very different architecture • 128 32-bit registers (FP or int) • Partitioned (SIMD) instructions • 2’s complement and saturating arithmetic • VLIW architecture
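
To make the distinction between 2’s complement and saturating arithmetic concrete, here is a minimal sketch on signed 8-bit values. The function names are hypothetical and the code says nothing about how the Trimedia itself implements these operations.

```python
def wrap_add8(a, b):
    """Two's complement add of signed 8-bit values: overflow wraps around."""
    total = (a + b) & 0xFF
    return total - 0x100 if total & 0x80 else total


def saturating_add8(a, b):
    """Saturating add of signed 8-bit values: overflow clamps to the limits."""
    return max(-128, min(127, a + b))


wrapped_high = wrap_add8(100, 100)            # wraps to a negative value
clamped_high = saturating_add8(100, 100)      # stays at the maximum
wrapped_low = wrap_add8(-100, -100)           # wraps to a positive value
clamped_low = saturating_add8(-100, -100)     # stays at the minimum
```

For pixel or audio samples the clamped result is usually the wanted one: an over-bright pixel should stay white rather than wrap to dark.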

  16. Trimedia: VLIW Approach • Compiler can group up to five instructions for simultaneous execution • Must be independent • Use NOPs if there are insufficient independent instructions • Large program size • Trimedia uses memory compression • Programs are 2-3 times larger than MIPS (even with compression)!
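
The NOP padding described above can be sketched with a deliberately naive in-order bundler. Everything here is an assumption for illustration: the instruction tuples, the register names and the conservative dependence test are hypothetical, and a real VLIW compiler would reorder instructions to fill more slots.

```python
NOP = ("nop", None, ())


def pack_bundles(instructions, width=5):
    """Pack (name, dest, sources) tuples, in order, into fixed-width bundles.

    A new bundle is started whenever the next instruction depends on one
    already in the current bundle, or the bundle is full. Unused slots are
    filled with NOPs.
    """
    bundles, current = [], []
    written, read = set(), set()
    for instr in instructions:
        _, dest, sources = instr
        conflict = (
            any(src in written for src in sources)  # reads a value produced in this bundle
            or dest in written                      # writes the same register twice
            or dest in read                         # overwrites a register still being read
        )
        if current and (conflict or len(current) == width):
            bundles.append(current + [NOP] * (width - len(current)))
            current, written, read = [], set(), set()
        current.append(instr)
        if dest is not None:
            written.add(dest)
        read.update(sources)
    if current:
        bundles.append(current + [NOP] * (width - len(current)))
    return bundles


# Hypothetical dependent chain: two loads feed an add, then a multiply, then a store.
program = [
    ("load", "r1", ("r10",)),
    ("load", "r2", ("r11",)),
    ("add", "r3", ("r1", "r2")),
    ("mul", "r4", ("r3", "r3")),
    ("store", None, ("r4", "r12")),
]

bundles = pack_bundles(program)
nop_count = sum(slot == NOP for bundle in bundles for slot in bundle)
```

Five useful instructions end up occupying four bundles of five slots, so most of the encoded program is NOPs; this is the code-size cost that memory compression tries to recover.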

  17. Fallacies and Pitfalls • Pitfall: Designing a “high-level” instruction set to support HLL’s • Seldom provide an exact match • Often too general (VAX CALLS)

  18. Fallacies and Pitfalls • Fallacy: There is such a thing as a typical program • Programs vary very significantly • Pitfall: Designing an architecture to reduce code size without considering compilers • Compilers have much greater impact on code size • Start with densest compiled code

  19. Fallacies and Pitfalls • Pitfall: Expecting good compiled performance for DSPs • Hand-tuned assembler is faster and more compact • Fallacy: An architecture with flaws cannot be successful • 80x86! • Segments, accumulators, stack-based FP

  20. Fallacies and Pitfalls • Fallacy: You can design a flawless architecture • All designs have trade-offs • VAX code size more important than easy decoding • Early RISCs: delayed branches • Address space

  21. 2.15. Concluding Remarks • 1960’s: Stack architectures • Matched the compiler technology of the day • 1970’s: CISC era • Tried to support HLL features in hardware • Today: RISC era • Simple, load-store architectures

  22. Concluding Remarks • Trends in the 1990’s: • Move to 64 bits • Conditional instructions • Eliminating branches • Optimisation of cache access (prefetch instructions) • Support for multimedia • Faster floating point

  23. The Future • Trend towards VLIW architectures • Increased use of conditional execution • Blending of general-purpose and DSP architectures • Emulating 80x86 architecture
