Lecture 1: Introduction

CprE 581 Computer Systems Architecture, Fall 2005

Zhao Zhang


Traditional “Computer Architecture”

The term architecture is used here to describe the attribute of a system as seen by the programmer, i.e., the conceptual structure and functional behavior as distinct from the organization of the data flow and controls, the logic design, and the physical implementation.

  • Gene Amdahl, IBM Journal R&D, April 1964


Contemporary “Computer Architecture”

  • Instruction set architecture

  • Microarchitecture:

    • Pipeline structures

    • Cache memories

  • Implementations

    • Logic design and synthesis


Fundamentals

  • Technology trends

  • Performance evaluation methodologies

  • Instruction Set Architecture


Technology Drives for High-Performance

VLSI technology: faster transistors and larger transistor budget


CPU Performance

For a sequential program:

CPU time = #Inst × CPI × Clock cycle time

To improve performance:

  • Shorten the clock cycle time (faster clock)

  • Reduce #Inst

  • Reduce CPI (equivalently, increase IPC)
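
The CPU time relation above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the instruction count, CPI values, and 2 GHz clock are hypothetical numbers chosen for the example, not figures from the lecture.

```python
def cpu_time_seconds(inst_count: int, cpi: float, clock_hz: float) -> float:
    """CPU time = #Inst x CPI x clock cycle time, where cycle time = 1 / clock rate."""
    cycle_time = 1.0 / clock_hz
    return inst_count * cpi * cycle_time


# Hypothetical workload: 1 billion instructions on a 2 GHz clock.
baseline = cpu_time_seconds(1_000_000_000, cpi=1.5, clock_hz=2e9)
# Same program and clock, but CPI reduced from 1.5 to 1.0.
improved = cpu_time_seconds(1_000_000_000, cpi=1.0, clock_hz=2e9)

print(f"baseline: {baseline:.3f} s, improved: {improved:.3f} s, "
      f"speedup: {baseline / improved:.2f}x")
```

With these assumed numbers, lowering CPI from 1.5 to 1.0 cuts CPU time from 0.75 s to 0.5 s. Because the three factors multiply, the same speedup could come from a 1.5x faster clock or from executing one third fewer instructions.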


How to use one billion transistors?

  • Bit-level parallelism

    • Move from 32-bit to 64-bit

  • Instruction-level parallelism

    • Deep pipeline

    • Execute multiple instructions per cycle

  • Program locality

    • Large caches, more branch prediction resources

  • Thread-level parallelism


Instruction-Level Parallelism

Pipeline + Multi-issue

[Figure: pipeline diagram showing five instructions, each passing through the IF, ID, EX, MEM, and WB stages]
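
As a rough illustration of why pipelining and multi-issue help, the sketch below counts cycles for an idealized five-stage pipeline. It assumes no stalls, hazards, or cache misses, which real programs do not satisfy; the function name and parameters are made up for this example.

```python
import math


def ideal_cycles(num_inst: int, stages: int = 5, issue_width: int = 1) -> int:
    """Cycles to finish num_inst instructions in an ideal pipeline.

    The first issue group takes `stages` cycles to pass through the
    pipeline; every later group finishes one cycle after the previous one.
    """
    if num_inst <= 0:
        return 0
    groups = math.ceil(num_inst / issue_width)
    return stages + groups - 1


unpipelined = 5 * 5                                     # each instruction runs alone
pipelined = ideal_cycles(5, stages=5, issue_width=1)    # overlapped, one per cycle
dual_issue = ideal_cycles(5, stages=5, issue_width=2)   # overlapped, two per cycle

print(unpipelined, pipelined, dual_issue)
```

For five instructions this gives 25 cycles without pipelining, 9 with a single-issue pipeline, and 7 with dual issue. For long instruction streams the fill time becomes negligible and throughput approaches `issue_width` instructions per cycle.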


Instruction-Level Parallelism

for (i = 0; i < N; i++)
    X[i] = a * X[i];

// Let R3 = &X[0], R4 = &X[N], and F0 = a
LOOP: L.D   F2, 0(R3)      // F2 = X[i]
      MUL.D F2, F2, F0     // F2 = a * X[i]
      S.D   F2, 0(R3)      // X[i] = F2
      DADDI R3, R3, 8      // advance to the next 8-byte element
      BNE   R3, R4, LOOP   // repeat until R3 reaches &X[N]

Which instructions can execute in parallel?

How should those instructions be scheduled?
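
One way to approach the two questions above is to list the true (read-after-write) dependences in the loop body. The sketch below is a simplified model, not a scheduler: it looks at one iteration only and ignores name dependences (for example, the pointer update overwrites R3 after the load and store have read it) and memory dependences. The tuple encoding of the instructions is an assumption made for this example.

```python
# Each entry: (description, destination register or None, source registers).
# Register names follow the loop above; the store and branch write no register.
LOOP_BODY = [
    ("load X[i]",         "F2", ["R3"]),
    ("multiply by a",     "F2", ["F2", "F0"]),
    ("store X[i]",        None, ["F2", "R3"]),
    ("advance pointer",   "R3", ["R3"]),
    ("branch if not end", None, ["R3", "R4"]),
]


def raw_dependences(instrs):
    """Return (producer, consumer, register) triples for read-after-write dependences."""
    deps = []
    last_writer = {}  # register -> index of the most recent instruction writing it
    for i, (_, dest, srcs) in enumerate(instrs):
        for reg in srcs:
            if reg in last_writer:
                deps.append((last_writer[reg], i, reg))
        if dest is not None:
            last_writer[dest] = i
    return deps


for producer, consumer, reg in raw_dependences(LOOP_BODY):
    print(f"{LOOP_BODY[consumer][0]!r} needs {reg} from {LOOP_BODY[producer][0]!r}")
```

The result is two chains: load, multiply, store through F2, and pointer update, branch through R3. The chains share no true dependence, which is the kind of independence that a dynamically scheduled processor or a compiler can exploit.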


Instruction-Level Parallelism

Find independent instructions through dependence analysis

  • Hardware approaches => Dynamically scheduled superscalar

    • Most commonly used today: Intel Pentium, AMD, Sun UltraSparc, and MIPS families

  • Software approaches => (1) Statically scheduled superscalar, or (2) VLIW


Modern Superscalar Processors

Examples: Intel Pentium, IBM Power/PowerPC, Sun UltraSparc, SGI MIPS …

  • Multi-issue and deep pipelining

  • Dynamic scheduling and speculative execution

  • High bandwidth L1 caches and large L2/L3 caches


Modern Superscalar Processor

Challenges: Complexity!!!

  • How

  • Understand how it brings high performance

    • Will see weird designs

    • Will use Verilog, simulation to help understanding

  • Keep the big picture in mind


Modern Superscalar Processor

Maintain register data flow

  • Register renaming

  • Instruction scheduling

Maintain control flow

  • Branch prediction

  • Speculative execution and recovery

Maintain memory data flow

  • Load and store queues

  • Memory dependence speculation
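
To make register renaming concrete, here is a minimal sketch under simplifying assumptions: an unlimited supply of physical registers (real hardware uses a finite free list and reclaims registers) and instructions reduced to a destination plus a source list. The `rename` function and the `P0, P1, ...` naming are hypothetical, chosen only for this illustration.

```python
from itertools import count


def rename(instrs, arch_regs):
    """Map every destination to a fresh physical register.

    instrs: list of (dest or None, [sources]) using architectural names.
    Returns the same list rewritten with physical register names.
    """
    fresh = (f"P{n}" for n in count())
    mapping = {reg: next(fresh) for reg in arch_regs}  # initial architectural state
    renamed = []
    for dest, srcs in instrs:
        new_srcs = [mapping[s] for s in srcs]  # sources read the current mapping
        new_dest = None
        if dest is not None:
            new_dest = next(fresh)             # a fresh name removes WAR/WAW hazards
            mapping[dest] = new_dest
        renamed.append((new_dest, new_srcs))
    return renamed


# Load, multiply, pointer update, then the next iteration's load.
program = [("F2", ["R3"]), ("F2", ["F2", "F0"]), ("R3", ["R3"]), ("F2", ["R3"])]
renamed_program = rename(program, arch_regs=["F0", "F2", "R3"])
for before, after in zip(program, renamed_program):
    print(before, "->", after)
```

After renaming, the second load writes a different physical register than the first load and the multiply, so it no longer has to wait for them just because they all name F2. Only the true dependences remain.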


Memory System Performance

Memory Stall CPI

= Misses per instruction × Miss penalty

= Fraction of memory instructions × Miss rate × Miss penalty

Assume 20% memory instructions, a 2% miss rate, and a 400-cycle miss penalty. What is the memory stall CPI?
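
The question above can be worked directly from the formula. The sketch below plugs in the slide's numbers; the base CPI of 1.0 used in the last step is an added assumption, included only to show how large the stall term is by comparison.

```python
def memory_stall_cpi(mem_inst_fraction: float, miss_rate: float,
                     miss_penalty_cycles: float) -> float:
    """Memory stall CPI = fraction of memory instructions x miss rate x miss penalty."""
    return mem_inst_fraction * miss_rate * miss_penalty_cycles


stall_cpi = memory_stall_cpi(0.20, 0.02, 400)
print(f"memory stall CPI = {stall_cpi:.2f}")

# Assumed base CPI (not from the slide) to put the stall term in context.
base_cpi = 1.0
stall_share = stall_cpi / (base_cpi + stall_cpi)
print(f"share of cycles spent on memory stalls = {stall_share:.0%}")
```

The stall term is 0.20 × 0.02 × 400 = 1.6 cycles per instruction. With the assumed base CPI of 1.0, more than half of all cycles would be memory stalls, even though only 0.4% of instructions miss.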


Memory System Performance

  • A typical memory hierarchy today:

    Proc/Regs → L1-Cache → L2-Cache → L3-Cache (optional) → Memory → Disk, Tape, etc.

    (levels closer to the processor are faster; levels farther away are bigger)

  • Here we focus on L1/L2/L3 caches, virtual memory and main memory


Cache Design

Many applications are memory-bound

  • CPU speed increases fast; memory speed cannot keep up

Cache hierarchy: exploits program locality

  • Basic principles of cache designs

  • Hardware cache optimizations

  • Application cache optimizations

  • Prefetching techniques

Will also cover virtual memory
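
To illustrate how a cache exploits program locality, the sketch below simulates a small direct-mapped cache on two access patterns. The cache size (64 lines of 64 bytes) and the address traces are assumptions for this example; real caches add associativity, replacement policies, and write handling.

```python
def hit_rate(addresses, num_lines: int = 64, line_bytes: int = 64) -> float:
    """Simulate a direct-mapped cache and return the fraction of accesses that hit."""
    tags = [None] * num_lines          # one stored tag per cache line
    hits = 0
    for addr in addresses:
        block = addr // line_bytes     # which memory block holds this byte
        index = block % num_lines      # the single line that block can occupy
        tag = block // num_lines       # distinguishes blocks sharing that line
        if tags[index] == tag:
            hits += 1
        else:
            tags[index] = tag          # miss: fetch the block, evicting the old one
    return hits / len(addresses)


# Sequential 8-byte accesses: the first touch of each 64-byte line misses,
# and the next seven accesses hit in the same line (spatial locality).
sequential = [i * 8 for i in range(4096)]
# A stride equal to the line size touches a new line on every access.
strided = [i * 64 for i in range(4096)]

print(f"sequential hit rate: {hit_rate(sequential):.3f}")
print(f"strided hit rate:    {hit_rate(strided):.3f}")
```

The sequential trace hits 7 times out of 8, while the strided trace never hits. The gap between these two patterns is what application-level cache optimizations and prefetching try to close.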


High Performance Storage Systems

What limits the performance of web servers? Storage!

  • Storage technology trends

  • RAID: Redundant array of inexpensive disks
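
The redundancy in parity-based RAID levels rests on XOR: a parity block is the XOR of the data blocks, so any one lost block can be rebuilt from the survivors. The sketch below shows only that idea. The block contents and function name are made up for the example, and striping and parity placement are left out.

```python
from functools import reduce


def parity(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data blocks, one per disk, plus a parity block on a fourth disk.
data = [b"disk0 data", b"disk1 data", b"disk2 data"]
parity_block = parity(data)

# Suppose disk 1 fails: XOR the surviving blocks with the parity block.
rebuilt = parity([data[0], data[2], parity_block])
print(rebuilt == data[1])
```

This works because XOR-ing a value with itself cancels out, leaving only the missing block. It tolerates one failure per parity group; a second simultaneous failure in the same group cannot be recovered this way.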


Multiprocessor Systems

Must exploit thread-level parallelism for further performance improvement

Shared-memory multiprocessors: Cooperating programs see the same memory address space

How to build them?

  • Cache coherence

  • Memory consistency


Emerging Techniques

  • Low power design

  • Multicore and multithreaded processors

  • Secure processor

  • Reliable design


Why Study Computer Architecture

As a hardware designer/researcher – know how to design processors, caches, storage, graphics, interconnects, and so on

As a system designer – know how to build a computer system using the best components available

As a software designer – know how to get the best performance from the hardware


Class Web Site

www.ece.iastate.edu/~zzhang/cpre585/

  • Syllabus

  • Schedule

  • Homework assignments

  • Readings

WebCT: Grades, Assignments and Discussions

