Microprocessors


CMOS transistor on silicon

[Figure: an IC package and the IC inside it, with a cross-section of a CMOS transistor on the silicon substrate showing the source, gate, oxide, channel, and drain. The transistor conducts if gate=1.]

  • Transistor

    • The basic electrical component in digital systems

    • Acts as an on/off switch

    • Voltage at “gate” controls whether current flows from source to drain

    • Don’t confuse this “gate” with a logic gate


CMOS transistor implementations

[Figure: an nMOS transistor (conducts if gate=1) and a pMOS transistor (conducts if gate=0), followed by transistor-level circuits for an inverter (F = x'), a NAND gate (F = (xy)'), and a NOR gate (F = (x+y)').]

  • Complementary Metal Oxide Semiconductor

  • We refer to logic levels

    • Typically 0 is 0V, 1 is 5V

  • Two basic CMOS types

    • nMOS conducts if gate=1

    • pMOS conducts if gate=0

    • Hence “complementary”

  • Basic gates

    • Inverter, NAND, NOR
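The complementary behaviour of nMOS and pMOS can be sketched with a switch-level model. This is only an illustration: it assumes the usual CMOS arrangement (a pMOS pull-up network to logic 1 and an nMOS pull-down network to logic 0), and the function names are hypothetical, not taken from the slides.

```python
def nmos_conducts(gate: int) -> bool:
    """An nMOS transistor conducts when its gate is 1."""
    return gate == 1


def pmos_conducts(gate: int) -> bool:
    """A pMOS transistor conducts when its gate is 0."""
    return gate == 0


def _resolve(pull_up: bool, pull_down: bool) -> int:
    # In a well-formed CMOS gate exactly one network conducts.
    assert pull_up != pull_down, "output would be floating or shorted"
    return 1 if pull_up else 0


def inverter(x: int) -> int:
    # One pMOS to logic 1, one nMOS to logic 0.
    return _resolve(pmos_conducts(x), nmos_conducts(x))


def nand(x: int, y: int) -> int:
    # Assumed topology: pMOS in parallel (pull-up), nMOS in series (pull-down).
    pull_up = pmos_conducts(x) or pmos_conducts(y)
    pull_down = nmos_conducts(x) and nmos_conducts(y)
    return _resolve(pull_up, pull_down)


def nor(x: int, y: int) -> int:
    # Assumed topology: pMOS in series (pull-up), nMOS in parallel (pull-down).
    pull_up = pmos_conducts(x) and pmos_conducts(y)
    pull_down = nmos_conducts(x) or nmos_conducts(y)
    return _resolve(pull_up, pull_down)
```

Because the pull-up and pull-down networks are complements of each other, the internal assertion never fires for any input combination.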


Basic logic gates

Gate      Equation
Driver    F = x
Inverter  F = x'
AND       F = x y
OR        F = x + y
XOR       F = x ⊕ y
NAND      F = (x y)'
NOR       F = (x + y)'
XNOR      F = x ⊙ y

x y   AND  OR  XOR  NAND  NOR  XNOR
0 0    0   0    0    1     1    1
0 1    0   1    1    1     0    0
1 0    0   1    1    1     0    0
1 1    1   1    0    0     0    1


Combinational logic design

A) Problem description

y is 1 if a is 1, or b and c are 1. z is 1 if b or c is 1, but not both, or if all are 1.

B) Truth table

Inputs     Outputs
a b c      y z
0 0 0      0 0
0 0 1      0 1
0 1 0      0 1
0 1 1      1 0
1 0 0      1 0
1 0 1      1 1
1 1 0      1 1
1 1 1      1 1

C) Output equations

y = a'bc + ab'c' + ab'c + abc' + abc

z = a'b'c + a'bc' + ab'c + abc' + abc

D) Minimized output equations (from Karnaugh maps over a and bc)

y = a + bc

z = ab + b'c + bc'

E) Logic Gates

[Figure: gate-level circuit with inputs a, b, c and outputs y, z.]
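One way to check the minimization is to compare the minimized equations (y = a + bc, z = ab + b'c + bc') against the problem description for all eight input combinations. The sketch below does that; the function names are illustrative only.

```python
from itertools import product


def spec(a: int, b: int, c: int) -> tuple:
    """Outputs taken directly from the problem description."""
    y = int(a == 1 or (b == 1 and c == 1))
    z = int((b != c) or (a == 1 and b == 1 and c == 1))
    return y, z


def minimized(a: int, b: int, c: int) -> tuple:
    """Outputs from the minimized equations."""
    nb, nc = 1 - b, 1 - c          # b' and c'
    y = a | (b & c)                # y = a + bc
    z = (a & b) | (nb & c) | (b & nc)  # z = ab + b'c + bc'
    return y, z


def equations_match() -> bool:
    """True if both formulations agree on every input combination."""
    return all(spec(*bits) == minimized(*bits)
               for bits in product((0, 1), repeat=3))
```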


Combinational components

  • n-bit, m x 1 Multiplexor (inputs I0 ... I(m-1), select S0 ... S(log m), output O)

    • O = I0 if S=0..00, I1 if S=0..01, ..., I(m-1) if S=1..11

  • log n x n Decoder (inputs I0 ... I(log n -1), outputs O0 ... O(n-1))

    • O0 = 1 if I=0..00, O1 = 1 if I=0..01, ..., O(n-1) = 1 if I=1..11

    • With enable input e: all O's are 0 if e=0

  • n-bit Adder (inputs A, B, outputs sum, carry)

    • sum = A+B (first n bits), carry = (n+1)'th bit of A+B

    • With carry-in input Ci: sum = A + B + Ci

  • n-bit Comparator (inputs A, B, outputs less, equal, greater)

    • less = 1 if A<B, equal = 1 if A=B, greater = 1 if A>B

  • n bit, m function ALU (inputs A, B, select S0 ... S(log m), output O)

    • O = A op B, op determined by S

    • May have status outputs carry, zero, etc.
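The behaviour of these combinational components can be modelled in a few lines. The following is a behavioural sketch only (integers stand in for n-bit buses, and all function names are hypothetical); it says nothing about how the components are built from gates.

```python
def multiplexor(inputs: list, select: int) -> int:
    """m x 1 multiplexor: output is the input chosen by the select value."""
    return inputs[select]


def decoder(i: int, n: int, enable: int = 1) -> list:
    """Decoder with n outputs: only output number i is 1.

    With enable=0, all outputs are 0.
    """
    return [1 if (enable and k == i) else 0 for k in range(n)]


def adder(a: int, b: int, n: int, carry_in: int = 0) -> tuple:
    """n-bit adder: returns (sum, carry).

    sum is the low n bits of a + b + carry_in; carry is the (n+1)'th bit.
    """
    total = a + b + carry_in
    return total & ((1 << n) - 1), (total >> n) & 1


def comparator(a: int, b: int) -> tuple:
    """n-bit comparator: returns (less, equal, greater)."""
    return int(a < b), int(a == b), int(a > b)
```

For example, `adder(200, 100, 8)` yields a sum of 44 with carry 1, because 300 does not fit in 8 bits.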


Sequential components

  • n-bit Register (inputs I, load, clear; output Q)

    • Q = 0 if clear=1, I if load=1 and clock=1, Q(previous) otherwise.

  • n-bit Shift register (inputs I, shift; output Q)

    • Q = lsb

    • Content shifted

    • I stored in msb

  • n-bit Counter (output Q)

    • Q = 0 if clear=1, Q(prev)+1 if count=1 and clock=1.
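As a rough behavioural sketch of these sequential components, each call to `tick` below stands for one rising clock edge. This is a simplification (clear is treated as sampled on the clock edge), and the class and method names are hypothetical.

```python
class Register:
    """n-bit register: loads I when load=1, resets to 0 when clear=1."""

    def __init__(self, n: int):
        self.mask = (1 << n) - 1
        self.q = 0

    def tick(self, i: int = 0, load: int = 0, clear: int = 0) -> int:
        if clear:
            self.q = 0
        elif load:
            self.q = i & self.mask
        return self.q  # otherwise Q keeps its previous value


class ShiftRegister:
    """n-bit shift register: shifts right, I enters at the msb, Q is the lsb."""

    def __init__(self, n: int):
        self.n = n
        self.bits = 0

    def tick(self, i: int = 0, shift: int = 0) -> int:
        if shift:
            self.bits = (self.bits >> 1) | ((i & 1) << (self.n - 1))
        return self.bits & 1  # Q = lsb


class Counter:
    """n-bit counter: increments when count=1, wraps around at 2**n."""

    def __init__(self, n: int):
        self.mask = (1 << n) - 1
        self.q = 0

    def tick(self, count: int = 0, clear: int = 0) -> int:
        if clear:
            self.q = 0
        elif count:
            self.q = (self.q + 1) & self.mask
        return self.q
```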


Gated R-S Latch (clocked S-R flip-flop)

Enb = 1, latch closed (outputs unchanged)

Enb = 0, enabled (outputs depend on inputs)


J-K Flip-flop

How to eliminate the forbidden state?

Idea: use output feedback to guarantee that R and S are never both one

J, K both one yields toggle

Characteristic Equation:

Q+ = Q K' + Q' J
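With the complements written out, the J-K characteristic equation is Q+ = Q K' + Q' J. A minimal sketch of that next-state function (the name `jk_next` is illustrative) makes the four behaviours easy to check: hold, set, reset, and toggle.

```python
def jk_next(q: int, j: int, k: int) -> int:
    """Next state of a J-K flip-flop: Q+ = Q K' + Q' J."""
    return (q & (1 - k)) | ((1 - q) & j)
```

With J=K=0 the state is held, J=1/K=0 sets it, J=0/K=1 resets it, and J=K=1 toggles it, so no input combination is forbidden.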


Sequential logic design

A) Problem Description

You want to construct a clock divider. Slow down your pre-existing clock so that you output a 1 for every four clock cycles.

B) State Diagram

[Figure: four states 0, 1, 2, 3 in a ring. Each state advances to the next when a=1 and stays put when a=0. Output x=0 in states 0, 1, and 2, and x=1 in state 3.]

C) Implementation Model

[Figure: a combinational logic block with inputs a, Q1, Q0 and outputs x, I1, I0, plus a state register that stores I1, I0 and feeds them back as Q1, Q0.]

D) State Table (Moore-type)

Inputs       Outputs
Q1 Q0 a      I1 I0 x
0  0  0      0  0  0
0  0  1      0  1  0
0  1  0      0  1  0
0  1  1      1  0  0
1  0  0      1  0  0
1  0  1      1  1  0
1  1  0      1  1  1
1  1  1      0  0  1

  • Given this implementation model

    • Sequential logic design quickly reduces to combinational logic design


Sequential logic design (cont.)

E) Minimized Output Equations (from Karnaugh maps over a and Q1Q0)

I1 = Q1'Q0a + Q1a' + Q1Q0'

I0 = Q0a' + Q0'a

x = Q1Q0

F) Combinational Logic

[Figure: gate-level circuit computing x, I1, and I0 from a, Q1, and Q0.]
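The minimized equations I1 = Q1'Q0a + Q1a' + Q1Q0', I0 = Q0a' + Q0'a, and x = Q1Q0 can be exercised with a small simulation. The sketch below models the state register as two variables updated once per clock cycle; the function names are hypothetical.

```python
def combinational_logic(q1: int, q0: int, a: int) -> tuple:
    """Returns (I1, I0, x) from the minimized output equations."""
    n1, n0, na = 1 - q1, 1 - q0, 1 - a   # Q1', Q0', a'
    i1 = (n1 & q0 & a) | (q1 & na) | (q1 & n0)
    i0 = (q0 & na) | (n0 & a)
    x = q1 & q0
    return i1, i0, x


def run_divider(cycles: int, a: int = 1) -> list:
    """Clock the state register `cycles` times and collect output x."""
    q1 = q0 = 0                          # start in state 0
    outputs = []
    for _ in range(cycles):
        i1, i0, x = combinational_logic(q1, q0, a)
        outputs.append(x)                # Moore output depends on state only
        q1, q0 = i1, i0                  # state register loads I1, I0
    return outputs
```

With a held at 1, `run_divider(8)` produces a 1 on every fourth cycle; with a held at 0 the machine stays in state 0 and x stays 0.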


Basic Architecture

[Figure: a processor containing a control unit (controller, PC, IR) and a datapath (ALU, registers) linked by control/status signals, connected to memory and I/O.]

  • Control unit and datapath

    • Note similarity to single-purpose processor

  • Key differences

    • Datapath is general

    • Control unit doesn’t store the algorithm – the algorithm is “programmed” into the memory


Datapath Operations

  • Load

    • Read memory location into register

  • ALU operation

    • Input certain registers through ALU, store back in register

  • Store

    • Write register to memory location

[Figure: the processor block diagram with the values 10 and 11 shown in the datapath registers and in memory, and a +1 operation in the ALU.]


Control Unit

[Figure: the processor block diagram with registers R0 and R1. Memory holds the program at addresses 100 to 102 and data at addresses 500 and 501:]

    100  load R0, M[500]
    101  inc R1, R0
    102  store M[501], R1
    ...
    500  10
    501  ...

  • Control unit: configures the datapath operations

    • Sequence of desired operations (“instructions”) stored in memory – “program”

  • Instruction cycle – broken into several sub-operations, each one clock cycle, e.g.:

    • Fetch: Get next instruction into IR

    • Decode: Determine what the instruction means

    • Fetch operands: Move data from memory to datapath register

    • Execute: Move data through the ALU

    • Store results: Write data from register to memory


Control Unit Sub-Operations

  • Fetch

    • Get next instruction into IR

    • PC: program counter, always points to next instruction

    • IR: holds the fetched instruction

[Figure: the processor during fetch. PC holds 100 and IR receives load R0, M[500] from memory address 100.]


Control Unit Sub-Operations

  • Decode

    • Determine what the instruction means

[Figure: the processor during decode. IR holds load R0, M[500].]


Control Unit Sub-Operations

  • Fetch operands

    • Move data from memory to datapath register

[Figure: the processor during operand fetch. The value 10 at memory address 500 is moved into a datapath register.]


Control Unit Sub-Operations

  • Execute

    • Move data through the ALU

    • This particular instruction does nothing during this sub-operation

[Figure: the processor during execute. IR still holds load R0, M[500] and the datapath register holds 10.]


Control Unit Sub-Operations

  • Store results

    • Write data from register to memory

    • This particular instruction does nothing during this sub-operation

[Figure: the processor during store results. IR still holds load R0, M[500] and the datapath register holds 10.]


Instruction Cycles

[Figure: timing diagram for the instruction at PC=100 (load R0, M[500]). It passes through Fetch, Decode, Fetch ops, Exec., and Store results on successive clk cycles, and the datapath register receives 10.]


Instruction Cycles

[Figure: timing diagram extended to PC=101 (inc R1, R0). After the five sub-operations for PC=100, the same five sub-operations run for PC=101; the ALU applies +1 to 10 and the result 11 is placed in a register.]


Instruction Cycles

[Figure: timing diagram extended to PC=102 (store M[501], R1). After the instruction cycles for PC=100 and PC=101, the store instruction writes 11 to memory address 501.]
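The three instruction cycles can be traced with a toy model of the processor. This is a sketch under stated assumptions: instructions are stored as Python tuples rather than encoded bits, `inc R1, R0` is taken to mean R1 = R0 + 1 (consistent with the values 10 and 11 in the figures), and M[501] is given an initial value of 0 for illustration.

```python
def run_program() -> tuple:
    """Run the three-instruction example and return (registers, memory)."""
    memory = {
        100: ("load", "R0", 500),    # load R0, M[500]
        101: ("inc", "R1", "R0"),    # inc R1, R0
        102: ("store", 501, "R1"),   # store M[501], R1
        500: 10,
        501: 0,                      # assumed initial value
    }
    regs = {"R0": 0, "R1": 0}
    pc = 100
    for _ in range(3):
        ir = memory[pc]              # Fetch: get next instruction into IR
        pc += 1                      # PC now points to the next instruction
        op, dst, src = ir            # Decode: determine what it means
        if op == "load":
            regs[dst] = memory[src]      # Fetch operands: memory to register
        elif op == "inc":
            regs[dst] = regs[src] + 1    # Execute: move data through the ALU
        elif op == "store":
            memory[dst] = regs[src]      # Store results: register to memory
    return regs, memory
```

After the third cycle, R0 holds 10, R1 holds 11, and M[501] holds 11.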


Architectural Considerations

  • N-bit processor

    • N-bit ALU, registers, buses, memory data interface

    • Embedded: 8-bit, 16-bit, 32-bit common

    • Desktop/servers: 32-bit, even 64

  • PC size determines address space


Architectural Considerations

  • Clock frequency

    • Inverse of clock period

    • The clock period must be longer than the longest register-to-register delay in the entire processor

    • Memory access is often the longest
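The relationship between delays and clock frequency can be shown with a small calculation. The delay figures used in the usage note are invented purely for illustration, and the function name is hypothetical.

```python
def max_clock_frequency_hz(delays_ns: dict) -> float:
    """Upper bound on clock frequency given register-to-register delays.

    The clock period can be no shorter than the longest delay,
    and frequency is the inverse of the period.
    """
    min_period_ns = max(delays_ns.values())
    return 1e9 / min_period_ns
```

For example, with assumed delays of 4 ns through the ALU, 2 ns for the register file, and 10 ns for a memory access, the memory access sets the period at 10 ns, which caps the clock at 100 MHz.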


Pipelining: Increasing Instruction Throughput

[Figure: non-pipelined dish cleaning, where dishes 1 to 8 are each washed and then dried before the next dish is started, versus pipelined dish cleaning, where dish 2 is washed while dish 1 is dried, and so on.]

[Figure: pipelined instruction execution, where instructions 1 to 8 flow through the stages Fetch-instr., Decode, Fetch ops., Execute, and Store res., each instruction one stage behind the previous one.]
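The throughput gain can be estimated with a simple count of time slots. This sketch assumes every stage takes exactly one time unit and the pipeline never stalls, which real processors do not guarantee; the function names are illustrative.

```python
def non_pipelined_time(items: int, stages: int) -> int:
    """Each item finishes all stages before the next item starts."""
    return items * stages


def pipelined_time(items: int, stages: int) -> int:
    """The first item takes `stages` units; each later item adds one unit."""
    if items == 0:
        return 0
    return stages + items - 1
```

For the dish example (8 dishes, 2 stages) this gives 16 time units without pipelining and 9 with it; for 8 instructions through 5 stages it gives 40 versus 12.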


Superscalar and VLIW Architectures

  • Performance can be improved by:

    • Faster clock (but there’s a limit)

    • Pipelining: slice up instruction into stages, overlap stages

    • Multiple ALUs to support more than one instruction stream

      • Superscalar

        • Scalar: non-vector operations

        • Fetches instructions in batches, executes as many as possible

          • May require extensive hardware to detect independent instructions

      • VLIW: each word in memory has multiple independent instructions

        • Relies on the compiler to detect and schedule instructions

        • Currently growing in popularity


Two Memory Architectures

[Figure: Princeton architecture, where the processor connects to a single memory holding both program and data, versus Harvard architecture, where the processor connects to separate program memory and data memory.]

  • Princeton

    • Fewer memory wires

  • Harvard

    • Simultaneous program and data memory access


Cache Memory

[Figure: processor, cache, and memory in a chain. The cache uses fast/expensive technology, usually on the same chip as the processor; the memory uses slower/cheaper technology, usually on a different chip.]

  • Memory access may be slow

  • Cache is small but fast memory close to processor

    • Holds copy of part of memory

    • Hits and misses
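Hits and misses can be counted with a toy cache model. The slides do not say how the cache is organised, so this sketch assumes the simplest case: a direct-mapped cache with one word per line, where an address maps to line `address % num_lines`. All names are hypothetical.

```python
def simulate_direct_mapped(addresses: list, num_lines: int) -> tuple:
    """Return (hits, misses) for a sequence of memory addresses."""
    lines = [None] * num_lines       # address currently cached in each line
    hits = misses = 0
    for addr in addresses:
        index = addr % num_lines     # assumed mapping of address to line
        if lines[index] == addr:
            hits += 1                # copy is already in the cache
        else:
            misses += 1              # fetch from memory, replace the line
            lines[index] = addr
    return hits, misses
```

Repeated accesses to the same few addresses hit, while two addresses that map to the same line (such as 0 and 8 with four lines) evict each other.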


Programmer’s View

  • Programmer doesn’t need detailed understanding of architecture

    • Instead, needs to know what instructions can be executed

  • Two levels of instructions:

    • Assembly level

    • Structured languages (C, C++, Java, etc.)

  • Most development today done using structured languages

    • But, some assembly level programming may still be necessary

    • Drivers: portion of program that communicates with and/or controls (drives) another device

      • Often have detailed timing considerations, extensive bit manipulation

      • Assembly level may be best for these


Assembly-Level Instructions

[Figure: a program as a sequence of instructions (Instruction 1, Instruction 2, Instruction 3, Instruction 4, ...), each consisting of an opcode, operand1, and operand2.]

  • Instruction Set

    • Defines the legal set of instructions for that processor

      • Data transfer: memory/register, register/register, I/O, etc.

      • Arithmetic/logical: move register through ALU and back

      • Branches: determine next PC value when not just PC+1


A Simple (Trivial) Instruction Set

Assembly instruct.   First byte   Second byte   Operation
MOV Rn, direct       0000  Rn     direct        Rn = M(direct)
MOV direct, Rn       0001  Rn     direct        M(direct) = Rn
MOV @Rn, Rm          0010  Rn     Rm            M(Rn) = Rm
MOV Rn, #immed.      0011  Rn     immediate     Rn = immediate
ADD Rn, Rm           0100  Rn     Rm            Rn = Rn + Rm
SUB Rn, Rm           0101  Rn     Rm            Rn = Rn - Rm
JZ Rn, relative      0110  Rn     relative      PC = PC + relative (only if Rn is 0)

The first four bits of each instruction are the opcode; the remaining bits are the operands.
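A sketch of how such instructions might be packed into two bytes follows. The opcode values come from the table; the exact bit positions are an assumption (opcode in the high four bits of the first byte, Rn in the low four bits, the remaining operand filling the second byte), and the mnemonic keys and function name are hypothetical.

```python
OPCODES = {
    "MOV_LOAD": 0b0000,       # MOV Rn, direct
    "MOV_STORE": 0b0001,      # MOV direct, Rn
    "MOV_INDIRECT": 0b0010,   # MOV @Rn, Rm
    "MOV_IMMEDIATE": 0b0011,  # MOV Rn, #immed.
    "ADD": 0b0100,            # ADD Rn, Rm
    "SUB": 0b0101,            # SUB Rn, Rm
    "JZ": 0b0110,             # JZ Rn, relative
}


def encode(op: str, rn: int, operand: int) -> bytes:
    """Pack one instruction into two bytes (assumed layout, see lead-in)."""
    if not 0 <= rn < 16:
        raise ValueError("register number must fit in four bits")
    first = (OPCODES[op] << 4) | rn
    second = operand & 0xFF   # negative relative offsets wrap to 8 bits
    return bytes([first, second])
```

Under this assumed layout, `MOV R1, #10` becomes the bytes 0x31 0x0A.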


Addressing Modes

Addressing mode     Operand field      Register-file contents   Memory contents
Immediate           Data
Register-direct     Register address   Data
Register indirect   Register address   Memory address           Data
Direct              Memory address                              Data
Indirect            Memory address                              Memory address, then Data
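The five addressing modes differ only in how many lookups separate the operand field from the data. The sketch below models the register file and memory as dictionaries; the mode strings and function name are illustrative.

```python
def fetch_operand(mode: str, operand: int, registers: dict, memory: dict) -> int:
    """Resolve an operand field to its data under the given addressing mode."""
    if mode == "immediate":
        return operand                       # operand field is the data
    if mode == "register-direct":
        return registers[operand]            # register holds the data
    if mode == "register-indirect":
        return memory[registers[operand]]    # register holds a memory address
    if mode == "direct":
        return memory[operand]               # operand field is a memory address
    if mode == "indirect":
        return memory[memory[operand]]       # memory holds another address
    raise ValueError(f"unknown addressing mode: {mode}")
```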


Sample Programs

C program:

    int total = 0;
    for (int i=10; i!=0; i--)
        total += i;
    // next instructions...

Equivalent assembly program:

          MOV R0, #0;    // total = 0
          MOV R1, #10;   // i = 10
          MOV R2, #1;    // constant 1
          MOV R3, #0;    // constant 0
    Loop: JZ R1, Next;   // Done if i=0
          ADD R0, R1;    // total += i
          SUB R1, R2;    // i--
          JZ R3, Loop;   // Jump always
    Next: // next instructions...

  • Try some others

    • Handshake: Wait until the value of M[254] is not 0, set M[255] to 1, wait until M[254] is 0, set M[255] to 0 (assume those locations are ports).

    • (Harder) Count the occurrences of zero in an array stored in memory locations 100 through 199.
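To experiment with programs like these, a tiny interpreter for the register-only part of the instruction set is enough. The sketch below runs the summation program from this slide. It assumes that the PC is incremented at fetch, so a relative jump is measured from the following instruction; that is why `JZ R1, Next` uses +3 and `JZ R3, Loop` uses -4. The tuple format and names are hypothetical.

```python
# The sample program, one tuple per instruction: (operation, Rn, operand).
SUM_PROGRAM = [
    ("MOV_IMM", 0, 0),    # 0: MOV R0, #0    total = 0
    ("MOV_IMM", 1, 10),   # 1: MOV R1, #10   i = 10
    ("MOV_IMM", 2, 1),    # 2: MOV R2, #1    constant 1
    ("MOV_IMM", 3, 0),    # 3: MOV R3, #0    constant 0
    ("JZ", 1, 3),         # 4: Loop: JZ R1, Next   (5 + 3 = 8)
    ("ADD", 0, 1),        # 5: ADD R0, R1    total += i
    ("SUB", 1, 2),        # 6: SUB R1, R2    i--
    ("JZ", 3, -4),        # 7: JZ R3, Loop   (8 - 4 = 4), jump always
]


def run(program: list) -> list:
    """Execute until the PC leaves the program; return the register file."""
    regs = [0] * 16
    pc = 0
    while 0 <= pc < len(program):
        op, rn, arg = program[pc]
        pc += 1                          # PC points to the next instruction
        if op == "MOV_IMM":
            regs[rn] = arg
        elif op == "ADD":
            regs[rn] = regs[rn] + regs[arg]
        elif op == "SUB":
            regs[rn] = regs[rn] - regs[arg]
        elif op == "JZ":
            if regs[rn] == 0:
                pc += arg                # PC = PC + relative
        else:
            raise ValueError(f"unknown operation: {op}")
    return regs
```

Running `run(SUM_PROGRAM)` leaves 55 in R0 (the sum 10 + 9 + ... + 1) and 0 in R1. The suggested exercises would additionally need the memory-access instructions, which this sketch omits.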


Application-Specific Instruction-Set Processors (ASIPs)

  • General-purpose processors

    • Sometimes too general to be effective in demanding application

      • e.g., video processing – requires huge video buffers and operations on large arrays of data, inefficient on a GPP

    • But single-purpose processor has high NRE, not programmable

  • ASIPs – targeted to a particular domain

    • Contain architectural features specific to that domain

      • e.g., embedded control, digital signal processing, video processing, network processing, telecommunications, etc.

    • Still programmable


A Common ASIP: Microcontroller

  • For embedded control applications

    • Reading sensors, setting actuators

    • Mostly dealing with events (bits): data is present, but not in huge amounts

    • e.g., VCR, disk drive, digital camera (assuming SPP for image compression), washing machine, microwave oven

  • Microcontroller features

    • On-chip peripherals

      • Timers, analog-digital converters, serial communication, etc.

      • Tightly integrated for programmer, typically part of register space

    • On-chip program and data memory

    • Direct programmer access to many of the chip’s pins

    • Specialized instructions for bit-manipulation and other low-level operations


Another Common ASIP: Digital Signal Processors (DSP)

  • For signal processing applications

    • Large amounts of digitized data, often streaming

    • Data transformations must be applied fast

    • e.g., cell-phone voice filter, digital TV, music synthesizer

  • DSP features

    • Several instruction execution units

    • Multiply-accumulate single-cycle instruction, other instrs.

    • Efficient vector operations – e.g., add two arrays

      • Vector ALUs, loop buffers, etc.
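A multiply-accumulate step is the inner operation of most filters, which is why DSPs execute it in a single cycle. The sketch below computes one filter output as a sum of products; it illustrates the operation only and says nothing about any particular DSP's instruction set. The function name is hypothetical.

```python
def fir_output(coefficients: list, samples: list) -> int:
    """One filter output: the sum of coefficient * sample products."""
    acc = 0
    for c, s in zip(coefficients, samples):
        acc += c * s   # one multiply-accumulate per filter tap
    return acc
```

On a general-purpose processor each loop iteration costs a multiply, an add, and loop overhead; a DSP with a single-cycle multiply-accumulate and loop buffers can approach one tap per cycle.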


Trend: Even More Customized ASIPs

  • In the past, microprocessors were acquired as chips

  • Today, we increasingly acquire a processor as Intellectual Property (IP)

    • e.g., synthesizable VHDL model

  • Opportunity to add custom datapath hardware and a few custom instructions, or delete a few instructions

    • Can have significant performance, power and size impacts

    • Problem: need compiler/debugger for customized ASIP

      • Remember, most development uses structured languages

      • One solution: automatic compiler/debugger generation

        • e.g., www.tensillica.com

      • Another solution: retargetable compilers

        • e.g., www.improvsys.com (customized VLIW architectures)


Programmer Considerations

  • Program and data memory space

    • Embedded processors often very limited

      • e.g., 64 Kbytes program, 256 bytes of RAM (expandable)

  • Registers: How many are there?

    • Only a direct concern for assembly-level programmers

  • I/O

    • How to communicate with external signals?

  • Interrupts


Selecting a Microprocessor

  • Issues

    • Technical: speed, power, size, cost

    • Other: development environment, prior expertise, licensing, etc.

  • Speed: how to evaluate a processor’s speed?

    • Clock speed – but instructions per cycle may differ

    • Instructions per second – but work per instr. may differ

    • Dhrystone: Synthetic benchmark, developed in 1984. Dhrystones/sec.

      • MIPS: 1 MIPS = 1757 Dhrystones per second (based on Digital’s VAX 11/780). A.k.a. Dhrystone MIPS. Commonly used today.

        • So, 750 MIPS = 750*1757 = 1,317,750 Dhrystones per second

    • SPEC: set of more realistic benchmarks, but oriented to desktops

    • EEMBC – EDN Embedded Benchmark Consortium, www.eembc.org

      • Suites of benchmarks: automotive, consumer electronics, networking, office automation, telecommunications
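Returning to the Dhrystone MIPS figure in the list above, the conversion is a single multiplication by the VAX 11/780 baseline of 1757 Dhrystones per second. A minimal sketch (function names are illustrative):

```python
VAX_11_780_DHRYSTONES_PER_SECOND = 1757   # defines 1 Dhrystone MIPS


def mips_to_dhrystones_per_second(mips: float) -> float:
    """Convert a Dhrystone MIPS rating to Dhrystones per second."""
    return mips * VAX_11_780_DHRYSTONES_PER_SECOND


def dhrystones_per_second_to_mips(dhrystones: float) -> float:
    """Convert a measured Dhrystones-per-second score to Dhrystone MIPS."""
    return dhrystones / VAX_11_780_DHRYSTONES_PER_SECOND
```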


General Purpose Processors

Sources: Intel, Motorola, MIPS, ARM, TI, and IBM Website/Datasheet; Embedded Systems Programming, Nov. 1998


Microprocessor Architecture Overview

  • If you are using a particular microprocessor, now is a good time to review its architecture


Microcontroller catalogue


Microcontroller packaging