מבנה מחשבים (Computer Structure) 0368-2159, Lecture 1: Introduction
Lecturers: Nathan Intrator and Yehuda Afek
Teaching assistants: Hillel Avni, Noa Ben-Amos

Presentation Transcript
What is Computer Structure?

Hardware: transistors

Logic circuits

Computer architecture

What will we talk about today:
  • Introduction: Computer Architecture
  • Administrative Matters
  • History
  • From conductors and electricity to basic binary operations in the computer
    • Electric voltage
    • Conductors
    • Silicon: a semiconductor
    • The transistor
    • Binary operations in electronic components
Computing Devices Then…

EDSAC, University of Cambridge, UK, 1949

Computing Devices Now

Sensor Nets

Cameras

Games

Set-top boxes

Media Players

Laptops

Servers

Robots

Routers

Smart phones

Automobiles

Supercomputers

The paradigm (Patterson)

Every Computer Scientist should master the “AAA”

  • Architecture
  • Algorithms
  • Applications
Computer Architecture: GOAL

Fast, Effective and Cheap

The goal of Computer Architecture

  • To build “cost effective systems”
    • How do we calculate the cost of a system ?
    • How do we evaluate the effectiveness of the system?
  • To optimize the system
    • What are the optimization points ?

Fact: most computer systems still follow the von Neumann principle of operation, even though, internally, they are very different from the computers of that time.

Anatomy: 5 components of any Computer (since 1946)

Personal Computer:

  • Processor
    • Control ("brain")
    • Datapath ("brawn")
  • Memory (where programs and data live when running)
  • Devices
    • Input (e.g., keyboard, mouse)
    • Output (e.g., display, printer)
    • Disk (where programs and data live when not running)

Computer System Structure

[Block diagram] The CPU and its cache sit on the CPU bus; a bridge connects the CPU bus to the memory bus (memory) and to the I/O bus. On the I/O bus: a SCSI/IDE adapter (SCSI bus: hard disk, scanner), a LAN adapter (LAN), a USB hub (keyboard, mouse), and a graphics adapter (video buffer).

The Instruction Set: a Critical Interface

software  ↔  instruction set  ↔  hardware

What is "Computer Architecture"?

Computer Architecture =

  • Instruction Set Architecture +
  • Machine Organization + …
  • = engineering + architecture
What are “Machine Structures”?

מבנה מחשבים (Computer Structure)

  • Coordination of many levels (layers) of abstraction:
    • Application (e.g., browser)
    • Operating System (Linux, Win, ...)
    • Compiler, Assembler                      (Software)
    • Instruction Set Architecture
    • Processor, Memory, I/O system            (Hardware)
    • Datapath & Control
    • Digital Design
    • Circuit Design
    • Transistors
    • Physics

Levels of Representation

High Level Language Program (e.g., C):

temp = v[k];
v[k] = v[k+1];
v[k+1] = temp;

↓ Compiler

Assembly Language Program:

lw $15, 0($2)
lw $16, 4($2)
sw $16, 0($2)
sw $15, 4($2)

↓ Assembler

Machine Language Program:

0000 1001 1100 0110 1010 1111 0101 1000
1010 1111 0101 1000 0000 1001 1100 0110
1100 0110 1010 1111 0101 1000 0000 1001
0101 1000 0000 1001 1100 0110 1010 1111

↓ Machine Interpretation

Control Signal Specification:

ALUOP[0:3] <= InstReg[9:11] & MASK

Computer Architecture’s Changing Definition
  • 1950s to 1960s Computer Architecture Course
    • Computer Arithmetic
  • 1970s to mid 1980s Computer Architecture Course
    • Instruction Set Design, especially ISA appropriate for compilers
  • 1990s Computer Architecture Course
    • Design of CPU, memory system, I/O system, Multi-processors, Networks
  • 2000s Computer Architecture Course:
    • Special purpose architectures, Functionally reconfigurable, Special considerations for low power/mobile processing
  • 2005 – future (?): Multi-processors, Parallelism
    • Synchronization, Speed-up, How to Program??? !!!
Forces on Computer Architecture

Forces acting on Computer Architecture: Technology, Applications, Programming Languages, Operating Systems, History, and Cleverness.

Computers in the News: Sony Playstation 2000

As reported in Microprocessor Report, Vol. 13, No. 5:

  • Emotion Engine: 6.2 GFLOPS, 75 million polygons per second
  • Graphics Synthesizer: 2.4 billion pixels per second
  • Claim: Toy Story realism brought to games!

"The Playstation 3 will deliver nearly 2 teraflops overall performance," said Ken Kutaragi, president and group CEO of Sony Computer Entertainment.

Ray Kurzweil: By 2029 reverse engineer the Human Brain

http://singules-atarityhub.com/2010/01/25/kurzweil-discusses-the-future-of-brain-computer-interfac-x-prize-lab-video/

Where are We Going??

Course topics ahead: arithmetic, single/multicycle datapaths, pipelining (IFetch, Dcd, Exec, Mem, WB), memory systems, I/O.

[Plot: processor vs. memory performance, 1980-2000, log scale] "Moore's Law": µProc performance grows ~60%/yr (2X/1.5 yr); DRAM performance grows ~9%/yr (2X/10 yrs); the processor-memory performance gap grows ~50% per year.

Course Administration
  • Instructors:

Nathan Intrator ([email protected])

http://cs.tau.ac.il/~nin/Courses/CompStruct/CompStruct.htm

http://virtual.tau.ac.il

Books:

  • V. C. Hamacher, Z. G. Vranesic, S. G. Zaky, Computer Organization. McGraw-Hill, 1982
  • H. Taub, Digital Circuits and Microprocessors. McGraw-Hill, 1982
  • מערכות ספרתיות (Digital Systems), published by the Open University
  • Hennessy and Patterson, Computer Organization and Design: The Hardware/Software Interface. Morgan Kaufmann, 1998
Grading

Grade composition:

  • Final exam: 80%
  • Exercises: 20%

6-7 exercises

Architecture & Microarchitecture Elements
  • Architecture:
    • Registers data width (8/16/32/64)
    • Instruction set
    • Addressing modes
    • Addressing methods (Segmentation, Paging, etc...)
  • Microarchitecture (µarch):
    • Physical memory size
    • Caches size and structure
    • Number of execution units, number of execution pipelines
    • Branch prediction
    • TLB
  • Timing is considered µarch (though it is user visible!)
  • Processors with the same architecture may have a different µarch
Compatibility
  • Backward compatibility
    • New hardware can run existing software
    • Example: Pentium 4 can run software originally written for Pentium III, Pentium II, Pentium , 486, 386, 286
  • Forward compatibility
    • New software can run on existing (old) hardware
    • Example: new software written with MMX™ must still run on older Pentium processors which do not support MMX™
    • Less important than backward compatibility
  • New ideas: architecture independent
    • JIT – just in time compiler: Java and .NET
    • Binary translation
Benchmarks – Programs for Evaluating Processor Performance
  • Toy Benchmarks
    • 10-100 line programs
    • e.g.: sieve, puzzle, quicksort
  • Synthetic Benchmarks
    • Attempt to match average frequencies of real workloads
    • e.g., Whetstone, Dhrystone
  • Real programs
    • e.g., gcc, spice
  • SPEC: System Performance Evaluation Cooperative
    • SPECint (8 integer programs)
    • and SPECfp (10 floating point)
CPI – to compare systems with same instruction set architecture (ISA)

CPI = (#cycles required to execute the program) / (#instructions executed in the program)

  • The CPU is synchronous - it works according to a clock signal.
    • The clock cycle is measured in nsec (10⁻⁹ of a second).
    • The clock rate (= 1/clock cycle) is measured in MHz (10⁶ cycles/second).
  • CPI - cycles per instruction
    • Average #cycles per instruction (in a given program)
    • IPC (= 1/CPI): instructions per cycle
  • Clock rate is mainly affected by technology, CPI by the architecture
  • CPI breakdown: how many cycles (on average) the program spends on different causes, e.g., executing, memory, I/O, etc.
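To make the definition concrete, here is a minimal C sketch; the cycle and instruction counts are made-up numbers, not measurements from the course:

#include <stdio.h>

/* CPI = #cycles required to execute the program /
 *       #instructions executed in the program;  IPC = 1 / CPI */
int main(void) {
    double cycles       = 3.0e9;  /* hypothetical: 3 billion cycles       */
    double instructions = 2.0e9;  /* hypothetical: 2 billion instructions */

    double cpi = cycles / instructions;
    double ipc = 1.0 / cpi;

    printf("CPI = %.2f, IPC = %.2f\n", cpi, ipc);  /* CPI = 1.50, IPC = 0.67 */
    return 0;
}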
CPU Time
  • CPU Time
    • The time required by the CPU to execute a given program:

CPU Time = clock cycle × #cycles = clock cycle × CPI × IC

  • Our goal: minimize CPU Time
    • Minimize the clock cycle: more MHz (process, circuit, µarch)
    • Minimize CPI: µarch (e.g., more execution units)
    • Minimize IC: architecture (e.g., MMX™ technology)
  • Speedup due to enhancement E
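A short worked sketch in C of the product above; the 500 MHz clock, CPI of 2, and instruction count are hypothetical values chosen only to show how the three factors combine:

#include <stdio.h>

/* CPU Time = clock cycle * CPI * IC, where clock cycle = 1 / clock rate */
int main(void) {
    double clock_rate = 500e6;   /* hypothetical 500 MHz processor */
    double cpi        = 2.0;     /* hypothetical average CPI       */
    double ic         = 1.0e9;   /* hypothetical instruction count */

    double clock_cycle = 1.0 / clock_rate;          /* 2 ns */
    double cpu_time    = clock_cycle * cpi * ic;    /* 4 s  */

    printf("CPU time = %.2f seconds\n", cpu_time);

    /* Halving the CPI (e.g., more execution units) halves the CPU time: */
    printf("With CPI = 1.0: %.2f seconds\n", clock_cycle * 1.0 * ic);
    return 0;
}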
Amdahl’s Law

Suppose that enhancement E accelerates a fraction F of the task by a factor S, and the remainder of the task is unaffected. Then:

ExTime_new = ExTime_old × [(1 - Fraction_enhanced) + Fraction_enhanced / Speedup_enhanced]

Speedup_overall = ExTime_old / ExTime_new = 1 / [(1 - Fraction_enhanced) + Fraction_enhanced / Speedup_enhanced]
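A minimal C sketch of the law (the helper name amdahl_speedup is mine, not from the slides):

#include <stdio.h>

/* Overall speedup when a fraction f of the execution time is
 * accelerated by a factor s (Amdahl's Law):
 *   speedup = 1 / ((1 - f) + f / s)                            */
static double amdahl_speedup(double f, double s) {
    return 1.0 / ((1.0 - f) + f / s);
}

int main(void) {
    printf("%.3f\n", amdahl_speedup(0.10, 2.0));   /* 1.053: the FP example on the next slide       */
    printf("%.3f\n", amdahl_speedup(0.10, 1e12));  /* ~1.111: even a huge speedup of only 10% of the
                                                      task is capped at 1/(1-f)                     */
    return 0;
}

The second call shows why the corollary below ("Make The Common Case Fast") follows: speeding up a small fraction of the work, no matter by how much, is capped at 1/(1-F).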

Amdahl’s Law: Example

  • Floating point instructions improved to run 2X; but only 10% of actual instructions are FP

ExTime_new = ExTime_old × (0.9 + 0.1/2) = 0.95 × ExTime_old

Speedup_overall = 1 / 0.95 = 1.053

Corollary:

Make The Common Case Fast

Instruction Set Design

software  ↔  instruction set  ↔  hardware

  • The ISA is what the user and the compiler see
  • The ISA is what the hardware needs to implement

Why is the ISA important?
  • Code size
    • long instructions may take more time to be fetched
    • Requires large memory (important in small devices, e.g., cell phones)
  • Number of instructions (IC)
    • Reducing IC reduces execution time (assuming the same CPI and frequency)
  • Code “simplicity”
    • Simple HW implementation which leads to higher frequency and lower power
    • Code optimization can better be applied to “simple code”
CISC Processors
  • CISC - Complex Instruction Set Computer
  • The idea: a high level machine language
  • Characteristics
    • Many instruction types, with many addressing modes
    • Some of the instructions are complex:
      • Perform complex tasks
      • Require many cycles
    • ALU operations directly on memory
      • Usually uses limited number of registers
    • Variable length instructions
      • Common instructions get short codes → saves code length
  • Example: x86
CISC Drawbacks
  • Compilers do not take advantage of the complex instructions and the complex indexing methods
  • Implementing complex instructions and complex addressing modes
    → complicates the processor
    → slows down the simple, common instructions
    → contradicts Amdahl's law corollary: Make The Common Case Fast
  • Variable length instructions are a real pain in the neck:
    • It is difficult to decode several instructions in parallel
      • As long as an instruction is not decoded, its length is unknown
        → it is unknown where the instruction ends
        → it is unknown where the next instruction starts

    • An instruction may not fit into the "right behavior" of the memory hierarchy (will be discussed in the next lectures)
  • Examples: VAX, x86 (!?!)
RISC Processors
  • RISC - Reduced Instruction Set Computer
  • The idea: simple instructions enable fast hardware
  • Characteristics
    • A small instruction set, with only a few instruction formats
    • Simple instructions
      • execute simple tasks
      • require a single cycle (with pipeline)
    • A few indexing methods
    • ALU operations on registers only
      • Memory is accessed using Load and Store instructions only.
      • Many orthogonal registers
      • Three address machine: Add dst, src1, src2
    • Fixed length instructions
  • Examples: MIPSTM, SparcTM, AlphaTM, PowerPCTM
RISC Processors (Cont.)
  • Simple architecture → simple micro-architecture
    • Simple, small and fast control logic
    • Simpler to design and validate
    • Room for on die caches: instruction cache + data cache
      • Parallelize data and instruction access
    • Shorten time-to-market
  • Using a smart compiler
    • Better pipeline usage
    • Better register allocation
  • Existing RISC processors are not "pure" RISC
    • e.g., support division which takes many cycles
RISC and Amdahl's Law (Example)
  • In comparison to the CISC architecture:
    • 10% of the static code, which executes 90% of the dynamic instructions, has the same CPI
    • 90% of the static code, which is only 10% of the dynamic instructions, increases by 60%
    • The number of instructions being executed increases by 50%
    • The speed of the processor is doubled
      • This was true at the time the RISC processors were invented
  • We get
  • And then
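The formulas behind "We get" and "And then" did not survive in this transcript. As an illustration only, the C sketch below works through one possible reading of the assumptions above, taking the 60% increase to apply to the CPI of the 10% of dynamic instructions; the resulting numbers are mine, not the slide's.

#include <stdio.h>

/* One possible reading of the slide's assumptions (not the original
 * slide's derivation): relative to CISC,
 *   - 90% of the dynamic instructions keep the same CPI,
 *   - 10% of the dynamic instructions have a CPI 60% higher,
 *   - the instruction count grows by 50%,
 *   - the clock is twice as fast.                                   */
int main(void) {
    double cpi_ratio   = 0.9 * 1.0 + 0.1 * 1.6;  /* = 1.06            */
    double ic_ratio    = 1.5;                    /* 50% more insts    */
    double cycle_ratio = 0.5;                    /* clock doubled     */

    /* CPU time = clock cycle * CPI * IC, so the time ratio is the
     * product of the three ratios.                                  */
    double time_ratio = cycle_ratio * cpi_ratio * ic_ratio;
    printf("RISC/CISC time ratio = %.3f, speedup = %.2f\n",
           time_ratio, 1.0 / time_ratio);        /* ~0.795, ~1.26    */
    return 0;
}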
So, what is better, RISC or CISC?
  • Today CISC architectures (x86) run as fast as RISC (or even faster)
  • The main reasons are:
    • They translate CISC instructions into RISC-like instructions (ucode)
    • CISC architectures use a "RISC-like engine"
  • We will discuss these kinds of solutions later in this course.
Technology Trends: Microprocessor Complexity

Transistor counts:

  • Sparc Ultra: 5.2 million
  • PentiumPro: 5.5 million
  • PowerPC 620: 6.9 million
  • Alpha 21164: 9.3 million
  • Alpha 21264: 15 million
  • Athlon (K7): 22 million
  • Itanium 2: 410 million

"Moore's Law": 2X transistors/chip every 1.5 years.

Technology Trends: Processor Performance

[Plot: performance measure vs. year] Processor performance has grown about 1.54X per year; e.g., Intel P4 at 2000 MHz (Fall 2001).

Technology Trends: Memory Capacity(Single-Chip DRAM)

year    size (Mbit)
1980    0.0625
1983    0.25
1986    1
1989    4
1992    16
1996    64
1998    128
2000    256
2002    512

  • Now 1.4X/yr, or 2X every 2 years.
  • 8000X since 1980!
Technology Trends Imply Dramatic Change
  • Processor
    • Logic capacity: about 30% per year
    • Clock rate: about 20% per year
  • Memory
    • DRAM capacity: about 60% per year (4x every 3 years)
    • Memory speed: about 10% per year
    • Cost per bit: improves about 25% per year
  • Disk
    • Capacity: about 60% per year
    • Total data use: 100% per 9 months!
  • Network Bandwidth
    • Bandwidth increasing more than 100% per year!
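A tiny C sketch that checks how the yearly rates quoted above and on the neighboring slides compound over time:

#include <stdio.h>
#include <math.h>

int main(void) {
    /* DRAM capacity: ~60% per year -> about 4x every 3 years.      */
    printf("1.60^3   = %.2f\n", pow(1.60, 3));    /* ~4.10 */

    /* Memory capacity "now": 1.4X/yr -> about 2X every 2 years.    */
    printf("1.40^2   = %.2f\n", pow(1.40, 2));    /* ~1.96 */

    /* CPU performance: 60%/yr -> roughly 2X every 1.5 years.       */
    printf("1.60^1.5 = %.2f\n", pow(1.60, 1.5));  /* ~2.02 */
    return 0;
}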
1980-2003, CPU-DRAM Speed Gap ("the power wall")

  • CPU: 60% per year (2X in 1.5 yrs)
  • DRAM: 9% per year (2X in 10 yrs)
  • The gap grew 50% per year

Q. How do architects address this gap?

A. Put smaller, faster "cache" memories between CPU and DRAM.

[Plot: performance (1/latency) vs. year, 1980-2005, log scale: CPU vs. DRAM]

Dimensions

Feature size: 2005: 0.12×10⁻⁶ m (0.12 µm); 2006: 0.04×10⁻⁶ m (0.04 µm)

[Scale from 1 cm down to 1 Å]
  • Chip size: ~1 cm
  • Diameter of human hair: 25 µm
  • 1996 devices: 0.35 µm
  • Deep UV wavelength: 0.248 µm
  • 2001 devices: 0.18 µm
  • 2007 devices: 0.01 µm
  • X-ray wavelength: 0.6 nm
  • Silicon atom radius: 1.17 Å

Demo

Computer architecture in the coming years
  • In the past: energy / power consumption was a non-issue.
  • Today: the Power Wall. Power is expensive; transistors are free.
  • In the past: performance improved through machine-instruction-level parallelism, smart compilers, and single-CPU architectures (pipelining, superscalar, out-of-order execution, speculation).
  • Today: the ILP Wall. Hardware improvements for performance no longer pay off.
  • In the past: multiplication was slow, memory access was fast.
  • Today: the Memory Wall. Multiplication is fast, memory accesses are slow.

(200 clock cycles for a DRAM access, 4 cycles for a multiply)

  • In the past: single-processor performance doubled every 1.5 years.
  • Today: with all of the above, maybe 2X every 5 years??

But 2X processors (cores) every two years. Today: 4 to 40 cores per processor.

Physics / Transistor’s History

1906

1947

Audion (Triode), 1906

Lee De Forest

First point contact transistor (germanium), 1947

John Bardeen and Walter Brattain

Bell Laboratories

History

1958

1997

First integrated circuit (germanium), 1958

Jack S. Kilby, Texas Instruments

Contained five components of three types: transistors, resistors, and capacitors

Intel Pentium II, 1997

Clock: 233MHz

Number of transistors: 7.5 M

Gate Length: 0.35 µm

Annual Sales
  • 10¹⁸ transistors manufactured in 2003 alone
    • 100 million for every human on the planet
Integrated Circuits (2003 state-of-the-art)

  • Primarily crystalline silicon
  • 1 mm - 25 mm on a side
  • 2003 feature size ~ 0.13 µm = 0.13 × 10⁻⁶ m
  • 100 - 400M transistors (25 - 100M "logic gates")
  • 3 - 10 conductive layers
  • "CMOS" (complementary metal oxide semiconductor) - most common

Bare die → chip in package

  • Package provides:
    • spreading of chip-level signal paths to board-level
    • heat dissipation.
  • Ceramic or plastic with gold wires.
Printed Circuit Boards
  • fiberglass or ceramic
  • 1-20 conductive layers
  • 1-20in on a side
  • IC packages are soldered down.
nMOS Transistor
  • Four terminals: gate, source, drain, body
  • Gate – oxide – body stack looks like a capacitor
    • Gate and body are conductors
    • SiO2 (oxide) is a very good insulator
    • Called metal – oxide – semiconductor (MOS) capacitor
    • Even though the gate is no longer made of metal

[Figure: nMOS transistor symbol in its Off and On states]

nMOS Operation
  • Body is commonly tied to ground (0 V)
  • When the gate is at a low voltage:
    • P-type body is at low voltage
    • Source-body and drain-body diodes are OFF
    • No current flows, transistor is OFF


nMOS Operation Cont.
  • When the gate is at a high voltage:
    • Positive charge on gate of MOS capacitor
    • Negative charge attracted to body
    • Inverts a channel under gate to n-type
    • Now current can flow through n-type silicon from source through channel to drain, transistor is ON


pMOS Transistor
  • Similar, but doping and voltages reversed
    • Body tied to high voltage (VDD)
    • Gate low: transistor ON
    • Gate high: transistor OFF
    • Bubble indicates inverted behavior
Example: NAND3
  • Horizontal N-diffusion and p-diffusion strips
  • Vertical polysilicon gates
  • Metal1 VDD rail at top
  • Metal1 GND rail at bottom
  • 32 λ by 40 λ
Multiplexers
  • 2:1 multiplexer chooses between two inputs
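A small C sketch of the 2:1 multiplexer's logic function (the function name mux2 is mine; inputs are treated as single bits):

#include <stdio.h>

/* 2:1 multiplexer: y = d1 when s = 1, y = d0 when s = 0.
 * Gate-level form: y = (s AND d1) OR ((NOT s) AND d0).
 * Inputs are assumed to be single bits (0 or 1).          */
static unsigned mux2(unsigned s, unsigned d0, unsigned d1) {
    return (s & d1) | ((1u ^ s) & d0);
}

int main(void) {
    /* Print the full truth table to check the behavior. */
    for (unsigned s = 0; s <= 1; s++)
        for (unsigned d0 = 0; d0 <= 1; d0++)
            for (unsigned d1 = 0; d1 <= 1; d1++)
                printf("s=%u d0=%u d1=%u -> y=%u\n", s, d0, d1,
                       mux2(s, d0, d1));
    return 0;
}

The same sum-of-products form is what the gate-level mux and the transmission-gate mux on the next slide realize in hardware.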
Transmission Gate Mux
  • Nonrestoring mux uses two transmission gates
    • Only 4 transistors
What we learned today
  • Computer Architecture: integrates several levels, from programming languages to logic design.
  • Instruction Set Architecture (ISA)
  • Amdahl’s law
  • Moor’s law
  • Processor (CPU) --- Memory speed gap
  • History
  • Transistors. What, and how.
  • From transistors to logic design