0368 2159 lecture 1

What Is Computer Structure?






Computer Structure 0368-2159

Lecture 1: Introduction

Nathan Intrator and Yehuda Afek

Teaching assistants: Hillel Avni, Noa Ben-Amos


What is computer structure?

Hardware - transistors

Logic circuits

Computer architecture


What we will cover today:

  • Introduction : Computer Architecture

  • Administrative Matters

  • History

  • From conductors and electricity to basic binary operations in a computer

    • Electrical voltage

    • Conductors

    • Silicon: a semiconductor

    • Transistor

    • Binary operations in electronic components



Computing Devices Then…

EDSAC, University of Cambridge, UK, 1949



Computing Devices Now

Sensor Nets

Cameras

Games

Set-top boxes

Media Players

Laptops

Servers

Robots

Routers

Smart phones

Automobiles

Supercomputers


Computer structure: what is it?


Motherboard


First Pacemaker, 1957



The paradigm (Patterson)

Every Computer Scientist should master the “AAA”

  • Architecture

  • Algorithms

  • Applications



Computer Architecture: GOAL

Fast, Effective and Cheap

The goal of Computer Architecture

  • To build “cost effective systems”

    • How do we calculate the cost of a system?

    • How do we evaluate the effectiveness of the system?

  • To optimize the system

    • What are the optimization points?

      Fact: most computer systems still use the von Neumann principle of operation, even though, internally, they are very different from the computers of that time.



Anatomy: 5 components of any Computer (since 1946)

Personal Computer

  • Computer

    • Processor

      • Control (“brain”)

      • Datapath (“brawn”)

    • Memory (where programs, data live when running)

  • Devices

    • Input: Keyboard, Mouse

    • Output: Display, Printer

    • Disk (where programs, data live when not running)


Computer System Structure

[Block diagram. Labels: CPU, Cache, CPU BUS, Bridge, Mem BUS, Memory, I/O BUS; adapters on the I/O bus: SCSI/IDE adapter, LAN adapter, USB hub, graphics adapter; attached items: SCSI bus, hard disk, scanner, keyboard, mouse, LAN, video buffer]



The Instruction Set: a Critical Interface

software

instruction set

hardware


What is “Computer Architecture”?

Computer Architecture =

  • Instruction Set Architecture +

  • Machine Organization + …

  • = Engineering + Architecture


What are “Machine Structures”?

Computer Structure

  • Coordination of many levels (layers) of abstraction

Layers, from top to bottom:

  • Application (ex: browser)

  • Software: Compiler, Assembler, Operating System (Linux, Win, ..)

  • Instruction Set Architecture

  • Hardware: Processor, Memory, I/O system

  • Datapath & Control

  • Digital Design

  • Circuit Design

  • transistors

  • Physics


Levels of Representation

High Level Language Program

    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;

↓ Compiler

Assembly Language Program

    lw $15, 0($2)
    lw $16, 4($2)
    sw $16, 0($2)
    sw $15, 4($2)

↓ Assembler

Machine Language Program

    0000 1001 1100 0110 1010 1111 0101 1000
    1010 1111 0101 1000 0000 1001 1100 0110
    1100 0110 1010 1111 0101 1000 0000 1001
    0101 1000 0000 1001 1100 0110 1010 1111

↓ Machine Interpretation

Control Signal Specification

    ALUOP[0:3] <= InstReg[9:11] & MASK



Computer Architecture’s Changing Definition

  • 1950s to 1960s Computer Architecture Course

    • Computer Arithmetic

  • 1970s to mid 1980s Computer Architecture Course

    • Instruction Set Design, especially ISA appropriate for compilers

  • 1990s Computer Architecture Course

    • Design of CPU, memory system, I/O system, Multi-processors, Networks

  • 2000s Computer Architecture Course:

    • Special purpose architectures, Functionally reconfigurable, Special considerations for low power/mobile processing

  • 2005 – future (?): Multi-processors, Parallelism

    • Synchronization, Speed-up, How to Program ??? !!!


Forces on Computer Architecture

[Diagram: forces acting on Computer Architecture]

  • Technology

  • Programming Languages

  • Applications

  • Operating Systems

  • History

  • Cleverness


Computers in the News: Sony Playstation 2000

As reported in Microprocessor Report, Vol 13, No. 5:

  • Emotion Engine: 6.2 GFLOPS, 75 million polygons per second

  • Graphics Synthesizer: 2.4 Billion pixels per second

  • Claim: Toy Story realism brought to games!

The Playstation 3 will deliver nearly 2 teraflops overall performance, said Ken Kutaragi, president and group CEO of Sony Computer Entertainment



Ray Kurzweil: By 2029 reverse engineer the Human Brain

http://singules-atarityhub.com/2010/01/25/kurzweil-discusses-the-future-of-brain-computer-interfac-x-prize-lab-video/


Where are We Going??

Course topics: Arithmetic; Single/multicycle Datapaths; Pipelining (IFetch, Dcd, Exec, Mem, WB); Memory Systems; I/O

[Plot: Performance (log scale, 1 to 1000) vs. Time (1980 to 2000). µProc (CPU): 60%/yr. (2X/1.5yr), “Moore’s Law”. DRAM: 9%/yr. (2X/10 yrs). Processor-Memory Performance Gap: grows 50% / year]

Computer Structure


A slide from one of the lectures toward the end of the semester



Course Administration

  • Instructors:

    Nathan Intrator ([email protected])

  • TA: Kiril Solovey ([email protected])

    http://cs.tau.ac.il/~nin/Courses/CompStruct/CompStruct.htm

    http://virtual.tau.ac.il

    Books:

  • V. C. Hamacher, Z. G. Vranesic, S. G. Zaky, Computer Organization. McGraw-Hill, 1982

  • H. Taub, Digital Circuits and Microprocessors. McGraw-Hill, 1982

  • Digital Systems (מערכות ספרתיות), published by the Open University

  • Hennessy and Patterson, Computer Organization and Design: The Hardware/Software Interface, Morgan Kaufmann, 1998


Grading

Grade:

  • Final exam 80%

  • Exercises 20%

    6-7 exercises



Architecture & Microarchitecture Elements

  • Architecture:

    • Registers data width (8/16/32/64)

    • Instruction set

    • Addressing modes

    • Addressing methods (Segmentation, Paging, etc...)

  • Microarchitecture (µArch):

    • Physical memory size

    • Caches size and structure

    • Number of execution units, number of execution pipelines

    • Branch prediction

    • TLB

  • Timing is considered µArch (though it is user visible!)

  • Processors with the same arch may have different µArch



Compatibility

  • Backward compatibility

    • New hardware can run existing software

    • Example: Pentium 4 can run software originally written for Pentium III, Pentium II, Pentium , 486, 386, 286

  • Forward compatibility

    • New software can run on existing (old) hardware

    • Example: new software written with MMXTM must still run on older Pentium processors which do not support MMXTM

    • Less important than backward compatibility

  • New ideas: architecture independent

    • JIT – just in time compiler: Java and .NET

    • Binary translation


How do we compare different systems?


Benchmarks – Programs for Evaluating Processor Performance

  • Toy Benchmarks

    • 10-100 line programs

    • e.g.: sieve, puzzle, quicksort

  • Synthetic Benchmarks

    • Attempt to match average frequencies of real workloads

    • e.g., Winstone, Dhrystone

  • Real programs

    • e.g., gcc, spice

  • SPEC: System Performance Evaluation Cooperative

    • SPECint (8 integer programs)

    • and SPECfp (10 floating point)


CPI – to compare systems with same instruction set architecture (ISA)

$$\mathrm{CPI} = \frac{\#\text{cycles required to execute the program}}{\#\text{instructions executed in the program}}$$

  • The CPU is synchronous - it works according to a clock signal.

    • Clock cycle is measured in nsec ($10^{-9}$ of a second).

    • Clock rate (= 1/clock cycle) is measured in MHz ($10^{6}$ cycles/second).

  • CPI - cycles per instruction

    • Average #cycles per instruction (in a given program)

    • IPC (= 1/CPI): instructions per cycle

  • Clock rate is mainly affected by technology, CPI by the architecture

  • CPI breakdown: how many cycles (on average) the program spends for different causes; e.g., in executing, memory I/O etc.
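
The definitions above can be tried out on numbers. The following is a minimal sketch, not part of the original slides: the function names and the cycle counts are made up for illustration, and the breakdown simply divides each cause's cycles by the instruction count so that the parts sum to the total CPI.

```python
def cpi(total_cycles: int, instruction_count: int) -> float:
    """Average cycles per instruction for one program run."""
    if instruction_count <= 0:
        raise ValueError("instruction_count must be positive")
    return total_cycles / instruction_count


def ipc(total_cycles: int, instruction_count: int) -> float:
    """Instructions per cycle, the reciprocal of CPI."""
    return 1.0 / cpi(total_cycles, instruction_count)


def cpi_breakdown(cycles_by_cause: dict, instruction_count: int) -> dict:
    """Split CPI into per-cause contributions that sum to the total CPI."""
    return {cause: cycles / instruction_count
            for cause, cycles in cycles_by_cause.items()}


# Hypothetical run: 1,000,000 instructions taking 1,500,000 cycles in total.
cycles = {"execute": 1_000_000, "memory": 400_000, "io": 100_000}
total = sum(cycles.values())

print(cpi(total, 1_000_000))            # 1.5
print(cpi_breakdown(cycles, 1_000_000))
```

In this made-up run, memory stalls account for 0.4 of the 1.5 cycles per instruction, which is the kind of information a CPI breakdown is meant to expose.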


CPU Time

  • CPU Time

    • The time required by the CPU to execute a given program:

      $$\text{CPU Time} = \text{clock cycle} \times \#\text{cyc} = \text{clock cycle} \times \mathrm{CPI} \times \mathrm{IC}$$

      where IC is the number of instructions executed.

  • Our goal: minimize CPU Time

    • Minimize clock cycle: more MHz (process, circuit, µArch)

    • Minimize CPI: µArch (e.g.: more execution units)

    • Minimize IC: architecture (e.g.: MMX™ technology)

  • Speedup due to enhancement E
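
The CPU time relation above is easy to check numerically. This is a small sketch with assumed values (2 million instructions, CPI 1.5, a 500 MHz clock); the function names are illustrative and the numbers are not taken from the lecture.

```python
def cpu_time(instruction_count: float, cpi: float, clock_rate_hz: float) -> float:
    """CPU time in seconds: IC * CPI * clock cycle, where clock cycle = 1 / clock rate."""
    clock_cycle = 1.0 / clock_rate_hz
    return instruction_count * cpi * clock_cycle


def speedup(time_old: float, time_new: float) -> float:
    """How many times faster the new configuration is than the old one."""
    return time_old / time_new


# Hypothetical baseline: 2M instructions, CPI of 1.5, 500 MHz clock (about 6 ms).
base = cpu_time(2_000_000, 1.5, 500e6)

# Each of the three levers from the slide, applied on its own:
faster_clock = cpu_time(2_000_000, 1.5, 1000e6)   # halve the clock cycle
lower_cpi = cpu_time(2_000_000, 1.0, 500e6)       # better microarchitecture
fewer_instr = cpu_time(1_000_000, 1.5, 500e6)     # richer instructions

for name, t in [("clock", faster_clock), ("CPI", lower_cpi), ("IC", fewer_instr)]:
    print(name, round(speedup(base, t), 2))
```

Because the three factors multiply, halving any one of them halves the CPU time, provided the change does not make the other two worse.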


Amdahl’s Law

Suppose that enhancement E accelerates a fraction F of the task by a factor S, and the remainder of the task is unaffected, then:

$$\text{ExTime}_{new} = \text{ExTime}_{old} \times \left[ (1 - \text{Fraction}_{enhanced}) + \frac{\text{Fraction}_{enhanced}}{\text{Speedup}_{enhanced}} \right]$$

$$\text{Speedup}_{overall} = \frac{\text{ExTime}_{old}}{\text{ExTime}_{new}} = \frac{1}{(1 - \text{Fraction}_{enhanced}) + \dfrac{\text{Fraction}_{enhanced}}{\text{Speedup}_{enhanced}}}$$


Amdahl’s Law: Example

  • Floating point instructions improved to run 2X; but only 10% of actual instructions are FP

$$\text{ExTime}_{new} = \text{ExTime}_{old} \times (0.9 + 0.1/2) = 0.95 \times \text{ExTime}_{old}$$

$$\text{Speedup}_{overall} = \frac{1}{0.95} = 1.053$$

Corollary:

Make The Common Case Fast
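
Amdahl's law can be written as a one-line function. The sketch below is illustrative (the function name is an assumption, not from the slides); it reproduces the floating point example and shows why the corollary holds: even an infinitely fast FP unit cannot push the overall speedup past 1/0.9.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when a fraction of the run time is sped up by a given factor."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if speedup_enhanced <= 0:
        raise ValueError("speedup_enhanced must be positive")
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


# The example above: 10% of the work is floating point and runs 2X faster.
print(round(amdahl_speedup(0.10, 2.0), 3))   # 1.053

# Upper bound for the same 10%: an enormous FP speedup still gives about 1.11.
print(round(amdahl_speedup(0.10, 1e9), 3))

# Speeding up the common case (90% of the work) by the same 2X pays off far more.
print(round(amdahl_speedup(0.90, 2.0), 3))
```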



Instruction Set Design

software

instruction set

hardware

The ISA is what the user and the compiler see

The ISA is what the hardware needs to implement


Why is the ISA important?

  • Code size

    • Long instructions may take more time to be fetched

    • Requires large memory (important in small devices, e.g., cell phones)

  • Number of instructions (IC)

    • Reducing IC reduces execution time (assuming same CPI and frequency)

  • Code “simplicity”

    • Simple HW implementation which leads to higher frequency and lower power

    • Code optimization can better be applied to “simple code”



The impact of the ISA

RISC vs CISC



CISC Processors

  • CISC - Complex Instruction Set Computer

  • The idea: a high level machine language

  • Characteristic

    • Many instruction types, with many addressing modes

    • Some of the instructions are complex:

      • Perform complex tasks

      • Require many cycles

    • ALU operations directly on memory

      • Usually uses limited number of registers

    • Variable length instructions

      • Common instructions get short codes → save code length

  • Example: x86


CISC Drawbacks

  • Compilers do not take advantage of the complex instructions and the complex indexing methods

  • Implementing complex instructions and complex addressing modes

    → complicates the processor

    → slows down the simple, common instructions

    → contradicts Amdahl’s law corollary:

    Make The Common Case Fast

  • Variable length instructions are a real pain in the neck:

    • It is difficult to decode several instructions in parallel

      • As long as an instruction is not decoded, its length is unknown

        → It is unknown where the instruction ends

        → It is unknown where the next instruction starts

    • An instruction may not fit into the “right behavior” of the memory hierarchy (will be discussed in the next lectures)

  • Examples: VAX, x86 (!?!)



RISC Processors

  • RISC - Reduced Instruction Set Computer

  • The idea: simple instructions enable fast hardware

  • Characteristic

    • A small instruction set, with only a few instruction formats

    • Simple instructions

      • execute simple tasks

      • require a single cycle (with pipeline)

    • A few indexing methods

    • ALU operations on registers only

      • Memory is accessed using Load and Store instructions only.

      • Many orthogonal registers

      • Three address machine: Add dst, src1, src2

    • Fixed length instructions

  • Examples: MIPS™, SPARC™, Alpha™, PowerPC™


RISC Processors (Cont.)

  • Simple architecture → Simple micro-architecture

    • Simple, small and fast control logic

    • Simpler to design and validate

    • Room for on die caches: instruction cache + data cache

      • Parallelize data and instruction access

    • Shorten time-to-market

  • Using a smart compiler

    • Better pipeline usage

    • Better register allocation

  • Existing RISC processors are not “pure” RISC

    • e.g., support division which takes many cycles


RISC and Amdahl’s Law (Example)

  • In comparison to the CISC architecture:

    • 10% of the static code, which executes 90% of the dynamic instructions, has the same CPI

    • 90% of the static code, which is only 10% of the dynamic instructions, has a CPI that increases by 60%

    • The number of instructions executed increases by 50%

    • The speed of the processor is doubled

      • This was true for the time the RISC processors were invented

  • We get

  • And then
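
The results that followed "We get" and "And then" on this slide are not present in the text. The sketch below is one plausible way to work the numbers, under stated assumptions: the 60% increase is read as a CPI increase for the rarely executed code, "speed of the processor is doubled" is read as a halved clock cycle, and CPU time is taken as IC × CPI × clock cycle. All names are illustrative.

```python
def risc_cpi_ratio() -> float:
    """CPI of the RISC design relative to CISC, weighted by dynamic instruction mix."""
    common = 0.9 * 1.0   # 90% of dynamic instructions: same CPI (assumed reading)
    rare = 0.1 * 1.6     # 10% of dynamic instructions: CPI up by 60% (assumed reading)
    return common + rare


def risc_time_ratio() -> float:
    """CPU time of RISC relative to CISC: IC ratio * CPI ratio * clock cycle ratio."""
    ic_ratio = 1.5            # 50% more instructions executed
    clock_cycle_ratio = 0.5   # clock rate doubled, so the cycle is halved (assumption)
    return ic_ratio * risc_cpi_ratio() * clock_cycle_ratio


print(round(risc_cpi_ratio(), 3))         # CPI grows by about 6%
print(round(risc_time_ratio(), 3))        # relative CPU time, below 1 means faster
print(round(1.0 / risc_time_ratio(), 2))  # overall speedup under these assumptions
```

Under this reading the extra instructions and the slightly higher CPI are more than paid for by the faster clock, which matches the point the slide is building toward.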


So, what is better, RISC or CISC?

  • Today CISC architectures (x86) run as fast as RISC (or even faster)

  • The main reasons are:

    • CISC instructions are translated into RISC instructions (µcode)

    • CISC architectures use a “RISC-like engine”

  • We will discuss these kinds of solutions later on in this course.


Technology Trends: Microprocessor Complexity

[Plot: transistors per chip, with these labeled points]

  • Itanium 2: 410 million

  • Athlon (K7): 22 million

  • Alpha 21264: 15 million

  • Alpha 21164: 9.3 million

  • PowerPC 620: 6.9 million

  • PentiumPro: 5.5 million

  • Sparc Ultra: 5.2 million

2X transistors/chip every 1.5 years, called “Moore’s Law”


Technology Trends: Processor Performance

[Plot: performance measure vs. year; trend 1.54X/yr; labeled point: Intel P4 2000 MHz (Fall 2001)]


Technology Trends: Memory Capacity (Single-Chip DRAM)

year    size (Mbit)
1980    0.0625
1983    0.25
1986    1
1989    4
1992    16
1996    64
1998    128
2000    256
2002    512

  • Now 1.4X/yr, or 2X every 2 years.

  • 8000X since 1980!
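
The growth figures quoted for this table can be derived from its entries. A minimal sketch, assuming compound annual growth between two table rows; the dictionary copies the table, and the helper name is made up for illustration.

```python
# Single-chip DRAM capacity in Mbit, copied from the table above.
dram_mbit = {
    1980: 0.0625, 1983: 0.25, 1986: 1, 1989: 4, 1992: 16,
    1996: 64, 1998: 128, 2000: 256, 2002: 512,
}


def annual_growth(year_from: int, year_to: int, table: dict = dram_mbit) -> float:
    """Compound growth factor per year between two years present in the table."""
    years = year_to - year_from
    return (table[year_to] / table[year_from]) ** (1.0 / years)


total_growth = dram_mbit[2002] / dram_mbit[1980]
print(total_growth)                          # 8192.0, which the slide rounds to 8000X

print(round(annual_growth(1998, 2002), 2))   # recent rate, about 1.4X per year
print(round(annual_growth(1980, 1992), 2))   # earlier rate, 4X every 3 years
```

The earlier rows grow 4X every three years, while the last rows only double every two years, which is the slowdown the "Now 1.4X/yr" bullet points at.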


Technology Trends Imply Dramatic Change

  • Processor

    • Logic capacity: about 30% per year

    • Clock rate: about 20% per year

  • Memory

    • DRAM capacity: about 60% per year (4x every 3 years)

    • Memory speed: about 10% per year

    • Cost per bit: improves about 25% per year

  • Disk

    • Capacity: about 60% per year

    • Total data use: 100% per 9 months!

  • Network Bandwidth

    • Bandwidth increasing more than 100% per year!


1980-2003, CPU--DRAM Speed gap

[Plot: Performance (1/latency), log scale from 10 to 10000, vs. Year (1980 to 2005); CPU and DRAM curves, with the CPU curve annotated “The power wall”]

  • CPU: 60% per yr, 2X in 1.5 yrs

  • DRAM: 9% per yr, 2X in 10 yrs

  • Gap grew 50% per year

Q. How do architects address this gap?

A. Put smaller, faster “cache” memories between CPU and DRAM.
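
The size of the gap follows from compounding the two annual rates. This is a rough sketch assuming both rates stay constant over the whole period, which the real curves do not do exactly; the function names are illustrative.

```python
def relative_performance(annual_rate: float, years: int) -> float:
    """Performance after a number of years, relative to year 0, at a constant annual rate."""
    return (1.0 + annual_rate) ** years


def cpu_dram_gap(years: int, cpu_rate: float = 0.60, dram_rate: float = 0.09) -> float:
    """Ratio of CPU performance to DRAM performance, both normalized to 1 at year 0."""
    return relative_performance(cpu_rate, years) / relative_performance(dram_rate, years)


# One year of divergence: 1.60 / 1.09, about 1.47, which the slide rounds to 50%.
print(round(cpu_dram_gap(1), 2))

# The same rates compounded over one and two decades.
for years in (10, 20):
    print(years, round(cpu_dram_gap(years)))
```

After two decades at these rates the ratio is in the thousands, which is why a single DRAM access ends up costing the CPU hundreds of cycles and why caches are placed in between.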


Dimensions

[Figure: length scale from 1 cm down to 1 Å (1 cm, 1 mm, 0.1 mm, 10 µm, 1 µm, 0.1 µm, 10 nm, 1 nm, 1 Å), with these landmarks]

  • Chip size (1 cm)

  • Diameter of Human Hair (25 µm)

  • 1996 devices (0.35 µm)

  • Deep UV Wavelength (0.248 µm)

  • 2001 devices (0.18 µm)

  • 2005: 0.12 × $10^{-6}$ = 1.2 × $10^{-7}$

  • 2006: 0.04 × $10^{-6}$

  • 2007 devices (0.01 µm)

  • X-ray Wavelength (0.6 nm)

  • Silicon atom radius (1.17 Å)

Demo


Computer architecture in the coming years

  • Past: energy / power consumption was a non-issue.

  • Today: Power Wall. Power is expensive. Transistors are free.

  • Past: performance improved through instruction-level parallelism, smart compilers, and single-CPU architectures (pipelining, superscalar, out-of-order execution, speculation)

  • Today: ILP Wall. Hardware improvements aimed at performance no longer pay off.

  • Past: multiplication was slow, memory access was fast.

  • Today: Memory Wall. Multiplication is fast, memory accesses are slow.

    (200 clock cycles for DRAM, 4 cycles for a multiplication)

  • Past: single-processor performance 2X every 1.5 years.

  • Today: all of the above: perhaps 2X every 5 years??

    But 2X processors (cores) every two years. Today: 4 to 40 cores per processor


Physics / Transistor’s History

  • Audion (Triode), 1906: Lee De Forest

  • First point contact transistor (germanium), 1947: John Bardeen and Walter Brattain, Bell Laboratories


History

  • First integrated circuit (germanium), 1958: Jack S. Kilby, Texas Instruments. Contained five components, three types: transistors, resistors and capacitors

  • Intel Pentium II, 1997. Clock: 233MHz. Number of transistors: 7.5 M. Gate Length: 0.35


Annual Sales

  • $10^{18}$ transistors manufactured in 2003 alone

    • 100 million for every human on the planet


Integrated Circuits (2003 state-of-the-art)

  • Primarily Crystalline Silicon

  • 1mm - 25mm on a side

  • 2003 - feature size ~ 0.13µm = 0.13 x $10^{-6}$ m

  • 100 - 400M transistors (25 - 100M “logic gates”)

  • 3 - 10 conductive layers

  • “CMOS” (complementary metal oxide semiconductor) - most common.

Bare Die

Chip in Package

  • Package provides:

    • spreading of chip-level signal paths to board-level

    • heat dissipation.

  • Ceramic or plastic with gold wires.



Printed Circuit Boards

  • fiberglass or ceramic

  • 1-20 conductive layers

  • 1-20in on a side

  • IC packages are soldered down.



nMOS Transistor

  • Four terminals: gate, source, drain, body

  • Gate – oxide – body stack looks like a capacitor

    • Gate and body are conductors

    • SiO2 (oxide) is a very good insulator

    • Called metal – oxide – semiconductor (MOS) capacitor

    • Even though gate is no longer made of metal

[Figure labels: Off, On]



nMOS Operation

  • Body is commonly tied to ground (0 V)

  • When the gate is at a low voltage:

    • P-type body is at low voltage

    • Source-body and drain-body diodes are OFF

    • No current flows, transistor is OFF


nMOS Operation Cont.

  • When the gate is at a high voltage:

    • Positive charge on gate of MOS capacitor

    • Negative charge attracted to body

    • Inverts a channel under gate to n-type

    • Now current can flow through n-type silicon from source through channel to drain, transistor is ON


pMOS Transistor

  • Similar, but doping and voltages reversed

    • Body tied to high voltage (VDD)

    • Gate low: transistor ON

    • Gate high: transistor OFF

    • Bubble indicates inverted behavior
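
The on/off behavior described for nMOS and pMOS transistors can be modeled as ideal switches. The sketch below is a simplification for illustration only (no voltages, thresholds, or timing; names are made up): it shows how one pMOS pulling up to VDD and one nMOS pulling down to ground form an inverter.

```python
def nmos_conducts(gate: bool) -> bool:
    """Ideal nMOS switch: conducts when the gate is high."""
    return gate


def pmos_conducts(gate: bool) -> bool:
    """Ideal pMOS switch: conducts when the gate is low (the inverted behavior)."""
    return not gate


def cmos_inverter(a: bool) -> bool:
    """Output of an inverter built from one pMOS (to VDD) and one nMOS (to GND)."""
    pull_up = pmos_conducts(a)     # connects the output to VDD (logic 1)
    pull_down = nmos_conducts(a)   # connects the output to GND (logic 0)
    # Exactly one network conducts, so the output is always driven and never shorted.
    assert pull_up != pull_down
    return pull_up


for a in (False, True):
    print(int(a), "->", int(cmos_inverter(a)))
```

The complementary pairing is the key design choice: for every input, one transistor is on and the other is off, so the output is always connected to exactly one of the two rails.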



Example: Inverter



Example: NAND3

  • Horizontal N-diffusion and p-diffusion strips

  • Vertical polysilicon gates

  • Metal1 VDD rail at top

  • Metal1 GND rail at bottom

  • 32 λ by 40 λ



CMOS Inverter





Multiplexers

  • 2:1 multiplexer chooses between two inputs





Transmission Gate Mux

  • Nonrestoring mux uses two transmission gates

    • Only 4 transistors
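
A 2:1 multiplexer and its transmission-gate form can be sketched as follows. This is an idealized logic-level model, not a circuit simulation: a transmission gate is modeled as passing its input when enabled and leaving the output undriven (`None`) otherwise. The select convention (S = 1 chooses D1) and all names are assumptions.

```python
from itertools import product


def mux2(s: int, d0: int, d1: int) -> int:
    """2:1 multiplexer: output follows d1 when s is 1, otherwise d0."""
    return d1 if s else d0


def transmission_gate(enable: bool, value: int):
    """Ideal transmission gate: passes the value when enabled, else leaves the node undriven."""
    return value if enable else None


def tgate_mux2(s: int, d0: int, d1: int) -> int:
    """Mux built from two transmission gates with complementary enables."""
    outputs = [
        transmission_gate(not s, d0),   # passes d0 when s = 0
        transmission_gate(bool(s), d1), # passes d1 when s = 1
    ]
    driven = [v for v in outputs if v is not None]
    assert len(driven) == 1             # exactly one gate drives the output
    return driven[0]


# Both versions agree on every input combination.
for s, d0, d1 in product((0, 1), repeat=3):
    assert mux2(s, d0, d1) == tgate_mux2(s, d0, d1)
print("all 8 cases agree")
```

Each transmission gate is an nMOS and a pMOS in parallel, which is where the count of four transistors comes from; the output is passed through rather than regenerated, hence "nonrestoring".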


What we learned today

  • Computer Architecture: integrates several levels, from programming languages to logic design.

  • Instruction Set Architecture (ISA)

  • Amdahl’s law

  • Moore’s law

  • Processor (CPU) --- Memory speed gap

  • History

  • Transistors. What, and how.

  • From transistors to logic design

