embedded system hw n.
Download
Skip this Video
Download Presentation
Embedded System HW

Loading in 2 Seconds...

play fullscreen
1 / 131

Embedded System HW - PowerPoint PPT Presentation


  • 118 Views
  • Uploaded on

Embedded System HW. Why use microprocessors?. Alternatives: field-programmable gate arrays (FPGAs), custom logic, etc. (dedicated Single-purpose Processor or HW Logic) Microprocessors are often very efficient: can use same logic to perform many different functions.

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Embedded System HW' - eden-richardson


Download Now An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
why use microprocessors
Why use microprocessors?
  • Alternatives: field-programmable gate arrays (FPGAs), custom logic, etc. (dedicated Single-purpose Processor or HW Logic)
  • Microprocessors are often very efficient: can use same logic to perform many different functions.
  • Microprocessors simplify the design of families of products.
the performance paradox
The performance paradox
  • Microprocessors use much more logic to implement a function than does custom logic.
  • But microprocessors are often at least as fast:
    • heavily pipelined;
    • large design teams;
    • aggressive VLSI technology.
power
Power
  • Custom logic is a clear winner for low power devices.
  • Modern microprocessors offer features to help control power consumption.
  • Software design techniques can help reduce power consumption.
microprocessor varieties
Microprocessor varieties
  • Microcontroller: includes I/O devices, on-board memory.
  • Digital signal processor (DSP): microprocessor optimized for digital signal processing.
  • Typical embedded word sizes: 8-bit, 16-bit, 32-bit.
embedded processors
Embedded Processors
  • 임베디드 프로세서
    • 원래는 마이크로컨트롤러를 의미
    • 마이크로컨트롤러를 확장한 개념으로도 사용
    • CPU 코어, 메모리, 주변 장치, 입출력장치에 다양한 종류의 네트워크 장치가 추가되는 형태

Netsilicon NET+ARM Embedded Processor

many types of programmable processors
Many Types of Programmable Processors
  • Past
    • Microprocessor
    • Microcontroller
    • DSP
    • Graphics Processor
  • Now / Future
    • Network Processor
    • Sensor Processor
    • Cryptoprocessor
    • Game Processor
    • Wearable Processor
    • Mobile Processor
application specific instruction processors asips
Application-Specific InstructionProcessors (ASIPs)
  • Processors with instruction-sets tailored to specific applications or application domains
      • instruction-set generation as part of synthesis
  • Pluses:
      • customization yields lower area, power etc.
  • Minuses:
      • higher h/w & s/w development overhead
          • design, compilers, debuggers
          • higher time to market
reconfigurable soc
Reconfigurable SoC

Triscend’s A7 CSoC

Other Examples

Atmel’s FPSLIC(AVR + FPGA)

Altera’s Nios(configurable RISC on a PLD)

von neumann architecture
von Neumann architecture
  • Memory holds data, instructions.
  • Central processing unit (CPU) fetches instructions from memory.
    • Separate CPU and memory distinguishes programmable computer.
  • CPU registers help out: program counter (PC), instruction register (IR), general-purpose registers, etc.
cpu memory
CPU + memory

memory

CPU

PC

200

address

ADD r5,r1,r3

ADD r5,r1,r3

IR

200

data

harvard architecture
Harvard architecture

address

CPU

data memory

PC

data

address

program memory

data

von neumann vs harvard
von Neumann vs. Harvard
  • Harvard can’t use self-modifying code.
  • Harvard allows two simultaneous memory fetches.
  • Most DSPs use Harvard architecture for streaming data:
    • greater memory bandwidth;
    • more predictable bandwidth.
risc vs cisc
RISC vs. CISC
  • Complex instruction set computer (CISC):
    • many addressing modes;
    • many operations.
  • Reduced instruction set computer (RISC):
    • load/store;
    • pipelinable instructions.
slide16
CISC 프로세서
  • Intel 계열 마이크로프로세서의 종류 및 역사
instruction set characteristics
Instruction set characteristics
  • Fixed vs. variable length.
  • Addressing modes.
  • Number of operands.
  • Types of operands.
programming model
Programming model
  • Programming model: registers visible to the programmer.
  • Some registers are not visible (IR).
multiple implementations
Multiple implementations
  • Successful architectures have several implementations:
    • varying clock speeds;
    • different bus widths;
    • different cache sizes;
    • etc.
arm architecture
ARM Architecture
  • ARM versions.
  • ARM assembly language.
  • ARM programming model.
arm versions
ARM versions
  • ARM architecture has been extended over several versions.
  • We will concentrate on ARMv5
armv6 improvement
ARMv6 Improvement
  • Memory management
  • Multiprocessing
  • Multimedia support: SIMD capability
introduction
Introduction
  • To allow very small, yet high-performance implementations
  • RISC
    • Large uniform register file
    • Load/store architecture
    • Simple addressing modes
    • Uniform and fixed-length instr fields
    • Auto-increment and auto-decrement addr modes
    • Conditional execution of all instrcutions
arm assembly language
ARM assembly language
  • Fairly standard assembly language:

LDR r0,[r8] ; a comment

label ADD r4,r0,r1

arm data types
ARM data types
  • Byte :
  • Halfword : 16 bits
    • Must be aligned to two-byte boundaries
  • Word : 32 bits
    • Must be aligned to four-byte boundaries
  • ARM addresses canbe 32 bits long.
  • Address refers to byte.
    • Address 4 starts at byte 4.
  • Can be configured at power-up as either little- or bit-endian mode.
processor modes
Processor modes
  • User: usr – Normal program execution modes
  • FIQ: fiq – Supports a high-speed data transfer or channel process
  • IRQ: irq – Used for general-purpose interrupt handling
  • Supervisor: svc – A protected mode for OS
  • Abort: abt – Implements VM and/or memory protection
  • Undefined: und – Supports software emulation of HW coprocessors
  • System: sys – Runs privileged OS tasks
  • fiq, irq, svc, abt, und – exception modes
registers

N Z C V

Registers

r0

r8

r1

r9

0

31

r2

r10

CPSR

r3

r11

r4

r12

r5

r13

r6

r14

r7

r15 (PC)

Link register

unbanked registers

banked registers

endianness
Endianness
  • Relationship between bit and byte/word ordering defines endianness:

bit 31

bit 0

bit 0

bit 31

byte 3

byte 2

byte 1

byte 0

byte 0

byte 1

byte 2

byte 3

little-endian

big-endian

arm status bits
ARM status bits
  • Every arithmetic, logical, or shifting operation may set CPSR (current program statues register) bits:
    • N (negative), Z (zero), C (carry), V (overflow).
  • Examples:
    • -1 + 1 = 0: NZCV = 0110.
    • 231-1+1 = -231: NZCV = 0101.
arm data processing operand addressing
ARM data processing – operand addressing
  • Instruction syntax
    • <opcode>{<cond>}{S} <Rd>, <Rn>, <shifter-operand>
  • <shifter-operand> has 11 options
condition field
Condition field
  • Almost all ARM instrs. – conditionally executed
arm data processing operand addressing1

31

31

28

28

25

25

21

21

19

19

16

16

12

12

7

7

5

5

4

4

3

3

0

0

cond

cond

000

000

opcode

opcode

S

S

Rn

Rn

Rd

Rd

shift amount

shift

shift

0

1

Rm

Rm

Rs

0

31

28

25

21

19

16

12

7

5

4

3

0

cond

001

opcode

S

Rn

Rd

rotate

immediate-8

ARM data processing – operand addressing

Data processing immediate shift

Data processing register shift

Data processing 32-bit immediate

shifter operand
Shifter operand
  • Immediate
    • 8-bit constant and a 4-bit rotate (0,2,4,8,…,30)
      • mov r0, #0
      • add r9, r9,#1
  • Register operand
      • mov r2, r0
  • Shifted register operand
    • ASR, LSL, LSR, ROR, RRX (by one bit)
      • mov r2, r0, LSL#2 ; shift r0 left by 2, write to r2 (r2=r0x4)
      • sub r10,r9,r8, LSR #4 ; r10 = r9 - r8/16
      • sov r10,r9,r8, ROR r3 ; r10 = r9 - (r8 rotated by value of r3)
arm data processing
AND

EOR

SUB : Rd:= Rn - shifter operand

RSB : Rd:= shifter operand - Rn

ADD

ADC (with carry)

SBC

RSC (reverse SBC)

TST : update flags after Rn AND shifter operand

TEQ

CMP

CMN: copmare negated

ORR (logical OR)

MOV

BIC

MVN (mov not)

ARM data-processing
arm data processing1
ARM data-processing
  • Shift, Rotate ? – shifter-operand
    • LSL, LSR : logical shift left/right
    • ASR : arithmetic shift left/right
    • ROR : rotate right
    • RRX : rotate right extended with C
data operation varieties
Data operation varieties
  • Logical shift:
    • fills with zeroes.
  • Arithmetic shift:
    • fills with sign extension
  • RRX performs 33-bit rotate, including C bit from CPSR above sign bit.
load and store instructions
Load and Store instructions
  • Two types
    • 32-bit word or an 8-bit unsigned byte
    • Load and store halfword and load signed byte
  • Addressing modes
    • Base register
      • Any one of GPR (including the PC)
    • Offset
      • Three format
addressing modes
Addressing modes
  • Offset
    • Immediate: unsigned number (12 bits or 8 bits)
    • Register: GPR (not the PC)
    • Scaled register: shifted by an immediate value
      • LSL, LSR, ASR, ROR, RRX
  • Three ways to form the memory address
      • EA := Base register + or – Offset
    • Offset
    • Pre-indexed
    • Post-indexed
addressing modes1
Addressing modes
  • Base-plus-offset addressing:

LDR r0,[r1,#16]

    • Loads from location r1+16
  • Pre-indexing increments base register:

LDR r0,[r1,#16]!

  • Post-indexing fetches, then does offset:

LDR r0,[r1],#16

    • Loads r0 from r1, then adds 16 to r1.
load and store
LDR

LDRB

LDRH

LDRSB (signed byte)

LDRSH (signed halfw)

STR

STRB

STRH

Load and store
examples
Examples

LDR R1, [R0] ; load R1 from the address in R0

LDR R8, [R3, #4] ; EA = [R3] + 4

LDR R8, [R3, #-4] ; EA = [R3] – 4

STRB R10, [R7, -R4] ; EA = [R7] – [R4]

LDR R11, [R3, R5, LSL#2] ; EA = [R3] + ([R5]x4)

LDR R3, [R9], #4 ; EA = [R9], R9 = [R9] +4 post-indexed

LDR R1, [R0, #2] ! ; EA = [R0]+2, R0=[R0]+2 pre-indexed

LDR R0, [PC, #40] ; load R0 from PC+0x40 (= address of the ; instruction +8 + 0x40)

load and store multiple
Load and store multiple
  • Addressing modes
    • IA : increment after
    • IB : increment before
    • DA: decrement after
    • DB: decrement before
load and store multiple1
Load and store multiple
  • LDM
  • STM
  • Examples
    • LDMIA r0, {r5 – r8} ; load multiple r5-r8 from ; the address in r0
    • STMDA r1!, {r2, r5, r7 – r9, r11} ; update r1
branch instructions
Branch instructions
  • Conditional branch forwards or backwards up to 32 MB
    • Sign-extending the 24-bit imm_data to 32 bits
    • Shifting the result left two bits
    • Adding this to the PC (the addr of branch +8)
    • Approximately ± 32MB
  • B, BL
examples1
Examples

B label

BCC label ; branch if carry flag is clear

BEQ label ; if zero flag is set

MOV PC, #0 ; branch to location zero

BL func ; subroutine call

MOV PC,LR ; return

MOV LR, PC

LDR PC, =func ;

arm adr pseudo op
ARM ADR pseudo-op
  • Cannot refer to an address directly in an instruction.
  • Generate value by performing arithmetic on PC.
  • ADR pseudo-op generates instruction required to calculate address:

ADR r1,FOO

examples2
Examples

start MOV r0, #10

ADR r4, start; => SUB r4,pc,#0xc

start = pc - 4 - 8 = pc - 12 = pc - 0xc

example c assignments
Example: C assignments
  • C:

x = (a + b) - c;

  • Assembler:

ADR r4,a ; get address for a

LDR r0,[r4] ; get value of a

ADR r4,b ; get address for b, reusing r4

LDR r1,[r4] ; get value of b

ADD r3,r0,r1 ; compute a+b

ADR r4,c ; get address for c

LDR r2[r4] ; get value of c

c assignment cont d
C assignment, cont’d.

SUB r3,r3,r2 ; complete computation of x

ADR r4,x ; get address for x

STR r3[r4] ; store value of x

example c assignment
Example: C assignment
  • C:

y = a*(b+c);

  • Assembler:

ADR r4,b ; get address for b

LDR r0,[r4] ; get value of b

ADR r4,c ; get address for c

LDR r1,[r4] ; get value of c

ADD r2,r0,r1 ; compute partial result

ADR r4,a ; get address for a

LDR r0,[r4] ; get value of a

c assignment cont d1
C assignment, cont’d.

MUL r2,r2,r0 ; compute final value for y

ADR r4,y ; get address for y

STR r2,[r4] ; store y

example c assignment1
Example: C assignment
  • C:

z = (a << 2) | (b & 15);

  • Assembler:

ADR r4,a ; get address for a

LDR r0,[r4] ; get value of a

MOV r0,r0,LSL 2 ; perform shift

ADR r4,b ; get address for b

LDR r1,[r4] ; get value of b

AND r1,r1,#15 ; perform AND

ORR r1,r0,r1 ; perform OR

c assignment cont d2
C assignment, cont’d.

ADR r4,z ; get address for z

STR r1,[r4] ; store value for z

example if statement
Example: if statement
  • C:

if (a < b) { x = 5; y = c + d; } else x = c - d;

  • Assembler:

; compute and test condition

ADR r4,a ; get address for a

LDR r0,[r4] ; get value of a

ADR r4,b ; get address for b

LDR r1,[r4] ; get value for b

CMP r0,r1 ; compare a < b

BGE fblock ; if a>= b, branch to false block

if statement cont d
If statement, cont’d.

; true block

MOV r0,#5 ; generate value for x

ADR r4,x ; get address for x

STR r0,[r4] ; store x

ADR r4,c ; get address for c

LDR r0,[r4] ; get value of c

ADR r4,d ; get address for d

LDR r1,[r4] ; get value of d

ADD r0,r0,r1 ; compute y

ADR r4,y ; get address for y

STR r0,[r4] ; store y

B after ; branch around false block

if statement cont d1
If statement, cont’d.

; false block

fblock ADR r4,c ; get address for c

LDR r0,[r4] ; get value of c

ADR r4,d ; get address for d

LDR r1,[r4] ; get value for d

SUB r0,r0,r1 ; compute a-b

ADR r4,x ; get address for x

STR r0,[r4] ; store value of x

after ...

example conditional instruction implementation
Example: Conditional instruction implementation

; true block

MOVLT r0,#5 ; generate value for x

ADRLT r4,x ; get address for x

STRLT r0,[r4] ; store x

ADRLT r4,c ; get address for c

LDRLT r0,[r4] ; get value of c

ADRLT r4,d ; get address for d

LDRLT r1,[r4] ; get value of d

ADDLT r0,r0,r1 ; compute y

ADRLT r4,y ; get address for y

STRLT r0,[r4] ; store y

conditional instruction implementation cont d
Conditional instruction implementation, cont’d.

; false block

ADRGE r4,c ; get address for c

LDRGE r0,[r4] ; get value of c

ADRGE r4,d ; get address for d

LDRGE r1,[r4] ; get value for d

SUBGE r0,r0,r1 ; compute a-b

ADRGE r4,x ; get address for x

STRGE r0,[r4] ; store value of x

example fir filter
Example: FIR filter
  • C:

for (i=0, f=0; i<N; i++)

f = f + c[i]*x[i];

  • Assembler

; loop initiation code

MOV r0,#0 ; use r0 for I

MOV r8,#0 ; use separate index for arrays

ADR r2,N ; get address for N

LDR r1,[r2] ; get value of N

MOV r2,#0 ; use r2 for f

fir filter cont d
FIR filter, cont’.d

ADR r3,c ; load r3 with base of c

ADR r5,x ; load r5 with base of x

; loop body

loop LDR r4,[r3,r8] ; get c[i]

LDR r6,[r5,r8] ; get x[i]

MUL r4,r4,r6 ; compute c[i]*x[i]

ADD r2,r2,r4 ; add into running sum

ADD r8,r8,#4 ; add one word offset to array index

ADD r0,r0,#1 ; add 1 to i

CMP r0,r1 ; exit?

BLT loop ; if i < N, continue

nested subroutine calls
Nested subroutine calls
  • Nesting/recursion requires coding convention:

f1 LDR r0,[r13] ; load arg into r0 from stack

; call f2()

STR r14,[r13]! ; store f1’s return adrs

STR r0,[r13]! ; store arg to f2 on stack

BL f2 ; branch and link to f2

; return from f1()

SUB r13,#4 ; pop f2’s arg off stack

LDR r15,[r13]! ; restore register and return

summary
Summary
  • Load/store architecture
  • Most instructions are RISCy, operate in single cycle.
    • Some multi-register operations take longer.
  • All instructions can be executed conditionally.
reference manuals
Reference Manuals
  • MPC850 Family User Manual
  • PowerPC Programming Environment Manual
    • Course Home Page http://calab.kaist.ac.kr/~maeng/cs310/micro02.htm
    • Motorola Home Page

http://e-www.motorola.com

overview
Overview
  • Versatile, one-chip, integrated communication processor
    • Embedded PowerPC core
    • Versatile memory controller
    • Communication processor module (CPM)
      • Serial communication controllers (SCCs)
      • One USB
      • Etc.
embedded powerpc core
Embedded PowerPC core
  • Single issue, 32-bit version
  • Branch folding and prediction
  • 2-K byte I-cache, 1K byte D-cache
    • 2-way set-associative
    • Physical
  • MMUs with 8-entry TLBs
  • 4K, 16K, 256K, 512K, and 8MB page sizes
other features
Other Features
  • Dynamic data bus sizing : 8-, 16-, 32-bit
  • CPU clock : 0-80MHz
  • System Integration Unit (SIU)
  • Memory Controller
  • General Purpose timer
  • CPM, SCCs, SMCs, etc.
powerpc instruction set
PowerPC instruction set
  • Overview
  • Operand Conventions
  • PowerPC Registers and programming model
  • Addressing Modes
  • Instruction Set
  • Cache model
  • Exception Model
  • Memory management model
powerpc architecture1
PowerPC Architecture
  • Motorola, IBM, Apple computer
  • Power Architecture: RS/6000 family
  • 64-bit architecture with a 32-bit subset
  • Three Levels of the architecture
    • Flexibility – degrees of SW compatibility
      • UISA (User instruction set architecture)
      • VEA (Virtual environment architecture)
      • OEA (Operating environment architecture)
features not defined by the powerpc architecture
Features not defined by the PowerPC Architecture
  • For flexibility
  • System bus interface signals
  • Cache design
  • The number and the nature of execution units
  • Other internal micro-architecture issues
endianness1
Endianness
  • Relationship between bit and byte/word ordering defines endianness:

bit 31

bit 0

bit 0

bit 31

byte 3

byte 2

byte 1

byte 0

byte 0

byte 1

byte 2

byte 3

little-endian

big-endian

ARM, Intel

PowerPC, IBM,

Motorola

powerpc programming model register set
PowerPC programming model - Register Set
  • User Model – UISA (32-bit architecture)

Condition register

GPR0(32)

FGPR0(64)

CR(32)

GPR1(32)

FGPR1(64)

FP status and control

register

GPR31(32)

FPSCR(32)

FGPR31(64)

Count register

XER register

Link register

XER(32)

LR(64/32)

CTR(64/32)

condition registers cr
Condition Registers (CR)
  • For testing and branching

CR0

CR1

CR2

CR3

CR4

CR5

CR6

CR7

0

31

FP

Condition register CRn Field – Compare Instruction

For all integer instrs.

Bit0: Negative(LT)

Bit1: Positive(GT)

Bit2: Zero (EQ)

Bit3: Summary Overflow(SO)

back

link register lr count register ctr
Link Register (LR), Count Register (CTR)

bclrx (bc to link register)

Branch with link update

addressing modes2
Addressing Modes
  • Effective Address Calculation
    • Register indirect with immediate index mode
    • Register indirect with index mode
    • Register indirect mode
instruction formats
Instruction Formats
  • 4 bytes long and word-aligned
  • Bits 0-5 always specify the primary opcode
    • Extended opcode
instruction set
Instruction set
  • Integer
  • Floating-point
  • Load and store
  • Flow control
  • Processor control
  • Memory synchronization
  • Memory control
  • External control
integer instructions
Integer Instructions
  • Arithmetic, compare, logical, rotate and shift
  • Integer arithmetic, shift, rotate, and string move
    • May update or read values from the XER
    • The CR may be updated if the Rc bit is set.
      • addic - addic.
integer compare
Integer Compare
  • Algebraically, logically
  • crfD can be omitted if the result is to be placed in CR0
  • crfD field : the target CR
  • The L bit has no effect on 32-bit operations
rotate and shift instructions
Rotate and Shift Instructions
  • SH: specify the number of bits to rotate
  • MB: mask start
  • ME: mask stop
load and store1
Load and Store
  • Integer load and store
  • Integer load and store with byte-reverse
  • Integer load and store multiple
  • FP load and store
  • Memory synchronization
branch and flow control
Branch and Flow Control
  • EA calculation
    • Branch relative
    • Branch conditional to relative address
    • Branch to absolute address
    • Branch conditional to absolute address
    • Branch conditional to link register
    • Branch conditional to count register
example
Example
  • Test and Set

loop:lwarx r5,0,r3 # load and reserve

cmpwi r5,0 # done if word

bne $+12 # not equal to 0

stwcx. r4,0,r3 # try to store non-zero

bne- loop # loop if lost reservation

summary1
Summary
  • UISA, VEA, OEA
    • Register set
  • Fixed size instruction - RISC
  • Load and store architecture
    • 3 addressing modes
  • Condition Register Update – Rc field
    • 8 condition registers
  • Branch addressing modes
    • BO, BI fields
    • Relative, absolute, LR, CTR