Embedded System HW


Why use microprocessors?

  • Alternatives: field-programmable gate arrays (FPGAs), custom logic, etc. (dedicated Single-purpose Processor or HW Logic)

  • Microprocessors are often very efficient: can use same logic to perform many different functions.

  • Microprocessors simplify the design of families of products.


The performance paradox

  • Microprocessors use much more logic to implement a function than does custom logic.

  • But microprocessors are often at least as fast:

    • heavily pipelined;

    • large design teams;

    • aggressive VLSI technology.


Power

  • Custom logic is a clear winner for low power devices.

  • Modern microprocessors offer features to help control power consumption.

  • Software design techniques can help reduce power consumption.


Microprocessor varieties

  • Microcontroller: includes I/O devices, on-board memory.

  • Digital signal processor (DSP): microprocessor optimized for digital signal processing.

  • Typical embedded word sizes: 8-bit, 16-bit, 32-bit.


Embedded Processors

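  • Embedded processor

    • Originally meant a microcontroller

    • Also used for a broader concept that extends the microcontroller

    • Typically a CPU core, memory, peripherals, and I/O devices, with various kinds of network devices added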

Netsilicon NET+ARM Embedded Processor


Many Types of Programmable Processors

  • Past

    • Microprocessor

    • Microcontroller

    • DSP

    • Graphics Processor

  • Now / Future

    • Network Processor

    • Sensor Processor

    • Cryptoprocessor

    • Game Processor

    • Wearable Processor

    • Mobile Processor


Application-Specific Instruction-Set Processors (ASIPs)

  • Processors with instruction-sets tailored to specific applications or application domains

    • instruction-set generation as part of synthesis

  • Pluses:

    • customization yields lower area, power etc.

  • Minuses:

    • higher h/w & s/w development overhead

      • design, compilers, debuggers

      • higher time to market


    Reconfigurable SoC

    • Triscend’s A7 CSoC

    • Other examples:

      • Atmel’s FPSLIC (AVR + FPGA)

      • Altera’s Nios (configurable RISC on a PLD)


    Instruction Sets


    von Neumann architecture

    • Memory holds data, instructions.

    • Central processing unit (CPU) fetches instructions from memory.

      • A separate CPU and memory distinguish a programmable computer.

    • CPU registers help out: program counter (PC), instruction register (IR), general-purpose registers, etc.


    CPU + memory

    [Figure: the CPU's PC drives address 200 to memory; memory location 200 holds ADD r5,r1,r3, which returns on the data lines into the IR.]


    Harvard architecture

    [Figure: the CPU (with its PC) connects to a data memory and to a separate program memory, each through its own address and data lines.]


    von Neumann vs. Harvard

    • Harvard can’t use self-modifying code.

    • Harvard allows two simultaneous memory fetches.

    • Most DSPs use Harvard architecture for streaming data:

      • greater memory bandwidth;

      • more predictable bandwidth.


    RISC vs. CISC

    • Complex instruction set computer (CISC):

      • many addressing modes;

      • many operations.

    • Reduced instruction set computer (RISC):

      • load/store;

      • pipelinable instructions.


    CISC processors

    • Types and history of Intel-family microprocessors


    CISC - History: evolution of packaging technology


    CISC - History


    Instruction set characteristics

    • Fixed vs. variable length.

    • Addressing modes.

    • Number of operands.

    • Types of operands.


    Programming model

    • Programming model: registers visible to the programmer.

    • Some registers are not visible (IR).


    Multiple implementations

    • Successful architectures have several implementations:

      • varying clock speeds;

      • different bus widths;

      • different cache sizes;

      • etc.


    Advanced RISC Machines (1990)

    (Acorn and Apple Computer)

    ARM Architecture


    ARM Architecture

    • ARM versions.

    • ARM assembly language.

    • ARM programming model.


    ARM versions

    • ARM architecture has been extended over several versions.

    • We will concentrate on ARMv5


    Evolution of the ARM architecture versions


    ARMv6 Improvement

    • Memory management

    • Multiprocessing

    • Multimedia support: SIMD capability


    Evolution of the ARM architecture

    ARM11


    Introduction

    • To allow very small, yet high-performance implementations

    • RISC

      • Large uniform register file

      • Load/store architecture

      • Simple addressing modes

      • Uniform, fixed-length instruction fields

      • Auto-increment and auto-decrement addressing modes

      • Conditional execution of all instructions


    ARM assembly language

    • Fairly standard assembly language:

      LDR r0,[r8] ; a comment

      label ADD r4,r0,r1


    Programming Model


    ARM data types

    • Byte : 8 bits

    • Halfword : 16 bits

      • Must be aligned to two-byte boundaries

    • Word : 32 bits

      • Must be aligned to four-byte boundaries

    • ARM addresses can be 32 bits long.

    • Address refers to byte.

      • Address 4 starts at byte 4.

    • Can be configured at power-up as either little- or big-endian mode.


    Processor modes

    • User: usr – Normal program execution mode

    • FIQ: fiq – Supports a high-speed data transfer or channel process

    • IRQ: irq – Used for general-purpose interrupt handling

    • Supervisor: svc – A protected mode for OS

    • Abort: abt – Implements VM and/or memory protection

    • Undefined: und – Supports software emulation of HW coprocessors

    • System: sys – Runs privileged OS tasks

    • fiq, irq, svc, abt, und – exception modes


    Registers

    [Figure: ARM register set. Sixteen 32-bit registers (bits 31 to 0), r0 to r15, grouped into unbanked and banked registers; r14 is the link register and r15 is the PC. The CPSR holds the N, Z, C, V flags.]


    Endianness

    • Relationship between bit and byte/word ordering defines endianness:

    [Figure: one 32-bit word in both orderings. Little-endian: bit 31 ... bit 0 holding byte 3, byte 2, byte 1, byte 0. Big-endian: bit 0 ... bit 31 holding byte 0, byte 1, byte 2, byte 3.]
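The two byte orderings can be made concrete with a small C sketch. It is only an illustration: the function names `store_le32` and `store_be32` are hypothetical, and "byte 0" is taken to mean the byte at the lowest address.

```c
#include <stdint.h>

/* Little-endian: byte 0 (lowest address) holds the least significant byte. */
void store_le32(uint8_t buf[4], uint32_t w) {
    for (int i = 0; i < 4; i++)
        buf[i] = (uint8_t)(w >> (8 * i));
}

/* Big-endian: byte 0 (lowest address) holds the most significant byte. */
void store_be32(uint8_t buf[4], uint32_t w) {
    for (int i = 0; i < 4; i++)
        buf[i] = (uint8_t)(w >> (8 * (3 - i)));
}
```

Storing 0x11223344 gives the byte sequence 44 33 22 11 in little-endian order and 11 22 33 44 in big-endian order.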


    ARM status bits

    • Every arithmetic, logical, or shifting operation may set CPSR (current program status register) bits:

      • N (negative), Z (zero), C (carry), V (overflow).

    • Examples:

      • -1 + 1 = 0: NZCV = 0110.

      • 2^31 - 1 + 1 = -2^31: NZCV = 1001 (N is set because the result is negative, V because the signed sum overflowed).
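As a rough check of how the four flags follow from a 32-bit addition, here is a minimal C sketch. The function name `add_nzcv` and the packing of the flags into one 4-bit value are assumptions made for illustration only.

```c
#include <stdint.h>

/* Returns the flags of a + b packed as N Z C V (N is bit 3, V is bit 0). */
unsigned add_nzcv(uint32_t a, uint32_t b) {
    uint32_t r = a + b;                 /* result wraps modulo 2^32 */
    unsigned n = r >> 31;               /* N: sign bit of the result */
    unsigned z = (r == 0);              /* Z: result is zero */
    unsigned c = (r < a);               /* C: unsigned carry out of bit 31 */
    /* V: operands have the same sign but the result's sign differs. */
    unsigned v = ((~(a ^ b) & (a ^ r)) >> 31) & 1u;
    return (n << 3) | (z << 2) | (c << 1) | v;
}
```

With these definitions, `add_nzcv(0xFFFFFFFF, 1)` gives 0110 (the -1 + 1 case) and `add_nzcv(0x7FFFFFFF, 1)` gives 1001 (the 2^31 - 1 + 1 case).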


    ARM data processing – operand addressing

    • Instruction syntax

      • <opcode>{<cond>}{S} <Rd>, <Rn>, <shifter-operand>

    • <shifter-operand> has 11 options


    Condition field

    • Almost all ARM instructions can be conditionally executed.


    ARM data processing – operand addressing

    Data processing immediate shift

      bits:   31-28  27-25  24-21   20  19-16  15-12  11-7          6-5    4  3-0
      field:  cond   000    opcode  S   Rn     Rd     shift amount  shift  0  Rm

    Data processing register shift

      bits:   31-28  27-25  24-21   20  19-16  15-12  11-8  7  6-5    4  3-0
      field:  cond   000    opcode  S   Rn     Rd     Rs    0  shift  1  Rm

    Data processing 32-bit immediate

      bits:   31-28  27-25  24-21   20  19-16  15-12  11-8    7-0
      field:  cond   001    opcode  S   Rn     Rd     rotate  immediate-8


    Shifter operand

    • Immediate

      • 8-bit constant and a 4-bit rotate (rotate right by 0, 2, 4, 6, …, 30)

        • mov r0, #0

        • add r9, r9, #1

    • Register operand

      • mov r2, r0

    • Shifted register operand

      • ASR, LSL, LSR, ROR, RRX (by one bit)

        • mov r2, r0, LSL #2 ; shift r0 left by 2, write to r2 (r2 = r0 x 4)

        • sub r10, r9, r8, LSR #4 ; r10 = r9 - r8/16

        • sub r10, r9, r8, ROR r3 ; r10 = r9 - (r8 rotated by value of r3)
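The immediate form means only some 32-bit constants can appear directly in an instruction. The following C sketch tests whether a value can be written as an 8-bit constant rotated right by an even amount; the names `ror32` and `arm_imm_encodable` are hypothetical helpers, not part of any toolchain.

```c
#include <stdbool.h>
#include <stdint.h>

/* Rotate x right by n bits; n is reduced modulo 32. */
uint32_t ror32(uint32_t x, unsigned n) {
    n &= 31u;
    return n ? (x >> n) | (x << (32u - n)) : x;
}

/* True if value equals some 8-bit constant rotated right by 0, 2, ..., 30. */
bool arm_imm_encodable(uint32_t value) {
    for (unsigned rot = 0; rot < 32; rot += 2) {
        /* Undo a rotate-right by rot: rotate left by rot (right by 32 - rot). */
        uint32_t imm8 = ror32(value, (32u - rot) & 31u);
        if (imm8 <= 0xFFu)
            return true;
    }
    return false;
}
```

For example, 0xFF000000 is encodable (0xFF rotated right by 8), while 0x101 is not, because its set bits span nine bit positions.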


    ARM data-processing

    • AND

    • EOR

    • SUB : Rd := Rn - shifter operand

    • RSB : Rd := shifter operand - Rn

    • ADD

    • ADC (with carry)

    • SBC

    • RSC (reverse SBC)

    • TST : update flags after Rn AND shifter operand

    • TEQ

    • CMP

    • CMN : compare negated

    • ORR (logical OR)

    • MOV

    • BIC

    • MVN (move NOT)


    ARM data-processing

    • Shift and rotate operations available in the shifter operand:

      • LSL, LSR : logical shift left/right

      • ASR : arithmetic shift right

      • ROR : rotate right

      • RRX : rotate right extended with C


    Data operation varieties

    • Logical shift:

      • fills with zeroes.

    • Arithmetic shift:

      • fills with sign extension

    • RRX performs 33-bit rotate, including C bit from CPSR above sign bit.
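The differences between these operations are easy to see in C. This is a sketch under stated assumptions: the `arm_*` names are hypothetical, shift amounts are assumed to be in the range 1 to 31, and the carry flag is modeled as a plain variable.

```c
#include <stdint.h>

/* Logical shifts fill the vacated bits with zeroes. */
uint32_t arm_lsl(uint32_t x, unsigned n) { return x << n; }
uint32_t arm_lsr(uint32_t x, unsigned n) { return x >> n; }

/* Arithmetic shift right fills the vacated bits with the sign bit. */
uint32_t arm_asr(uint32_t x, unsigned n) {
    uint32_t shifted = x >> n;
    if (x & 0x80000000u)
        shifted |= ~(0xFFFFFFFFu >> n);  /* set the top n bits */
    return shifted;
}

/* Rotate right: bits leaving at the bottom re-enter at the top. */
uint32_t arm_ror(uint32_t x, unsigned n) {
    return (x >> n) | (x << (32u - n));
}

/* RRX: 33-bit rotate right by one bit through the carry flag.
   *carry holds C on entry and receives the bit rotated out. */
uint32_t arm_rrx(uint32_t x, unsigned *carry) {
    uint32_t result = (x >> 1) | ((uint32_t)(*carry & 1u) << 31);
    *carry = x & 1u;
    return result;
}
```

For 0x80000000 shifted right by 4, the logical shift gives 0x08000000 and the arithmetic shift gives 0xF8000000.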


    Load and Store instructions

    • Two types

      • 32-bit word or an 8-bit unsigned byte

      • Load and store halfword and load signed byte

    • Addressing modes

      • Base register

        • Any general-purpose register (including the PC)

      • Offset

        • Three formats


    Addressing modes

    • Offset

      • Immediate: unsigned number (12 bits or 8 bits)

      • Register: GPR (not the PC)

      • Scaled register: shifted by an immediate value

        • LSL, LSR, ASR, ROR, RRX

    • Three ways to form the memory address (EA := base register ± offset):

      • Offset

      • Pre-indexed

      • Post-indexed


    Addressing modes

    • Base-plus-offset addressing:

      LDR r0,[r1,#16]

      • Loads from location r1+16

    • Pre-indexing increments base register:

      LDR r0,[r1,#16]!

      • Adds 16 to r1, then loads r0 from the new r1.

    • Post-indexing fetches, then does offset:

      LDR r0,[r1],#16

      • Loads r0 from r1, then adds 16 to r1.
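The three forms differ only in when, and whether, the base register is updated. A minimal C model, assuming word-aligned byte addresses into a word array (the `ldr_*` function names are hypothetical):

```c
#include <stdint.h>

/* LDR r0,[r1,#off]   : base register is left unchanged. */
uint32_t ldr_offset(const uint32_t *mem, uint32_t *base, int32_t off) {
    return mem[(*base + (uint32_t)off) / 4];
}

/* LDR r0,[r1,#off]!  : base is updated first, then used as the address. */
uint32_t ldr_pre(const uint32_t *mem, uint32_t *base, int32_t off) {
    *base += (uint32_t)off;
    return mem[*base / 4];
}

/* LDR r0,[r1],#off   : old base is used as the address, then updated. */
uint32_t ldr_post(const uint32_t *mem, uint32_t *base, int32_t off) {
    uint32_t value = mem[*base / 4];
    *base += (uint32_t)off;
    return value;
}
```

Post-indexing is the natural fit for walking an array: each load fetches the current element and leaves the base pointing at the next one.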


    Load and store

    • LDR

    • LDRB

    • LDRH

    • LDRSB (signed byte)

    • LDRSH (signed halfword)

    • STR

    • STRB

    • STRH


    Examples

    LDR R1, [R0] ; load R1 from the address in R0

    LDR R8, [R3, #4] ; EA = [R3] + 4

    LDR R8, [R3, #-4] ; EA = [R3] - 4

    STRB R10, [R7, -R4] ; EA = [R7] - [R4]

    LDR R11, [R3, R5, LSL #2] ; EA = [R3] + ([R5] x 4)

    LDR R3, [R9], #4 ; EA = [R9], then R9 = [R9] + 4 (post-indexed)

    LDR R1, [R0, #2]! ; EA = [R0] + 2, and R0 = [R0] + 2 (pre-indexed)

    LDR R0, [PC, #0x40] ; load R0 from PC + 0x40 (= address of this instruction + 8 + 0x40)


    Load and store multiple

    • Addressing modes

      • IA : increment after

      • IB : increment before

      • DA: decrement after

      • DB: decrement before


    Load and store multiple

    • LDM

    • STM

    • Examples

      • LDMIA r0, {r5-r8} ; load r5 through r8 from the address in r0

      • STMDA r1!, {r2, r5, r7-r9, r11} ; store the listed registers and update r1


    Branch instructions

    • Conditional branch forwards or backwards up to 32 MB

      • Sign-extending the 24-bit imm_data to 32 bits

      • Shifting the result left two bits

      • Adding this to the PC (the addr of branch +8)

      • Approximately ± 32MB

    • B, BL
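The target calculation described above can be followed step by step in C. This is a sketch only; `arm_branch_target` is a hypothetical helper name, and `imm24` stands for the 24-bit field of the branch instruction.

```c
#include <stdint.h>

uint32_t arm_branch_target(uint32_t branch_addr, uint32_t imm24) {
    /* Step 1: sign-extend the 24-bit field to 32 bits. */
    uint32_t offset = imm24 & 0x00FFFFFFu;
    if (offset & 0x00800000u)
        offset |= 0xFF000000u;
    /* Step 2: shift left two bits (word offset to byte offset). */
    offset <<= 2;
    /* Step 3: add to the PC, which reads as the branch address + 8. */
    return branch_addr + 8u + offset;
}
```

A signed 24-bit word offset covers 2^23 words in each direction, which is where the roughly ±32 MB range comes from.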


    Examples

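    B label ; unconditional branch

    BCC label ; branch if carry flag is clear

    BEQ label ; branch if zero flag is set

    MOV PC, #0 ; branch to location zero

    BL func ; subroutine call

    MOV PC, LR ; return

    MOV LR, PC ; save return address (PC reads as this instruction + 8)

    LDR PC, =func ; call func through a full 32-bit address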


    ARM ADR pseudo-op

    • Cannot refer to an address directly in an instruction.

    • Generate value by performing arithmetic on PC.

    • ADR pseudo-op generates instruction required to calculate address:

      ADR r1,FOO


    Examples

    start MOV r0, #10

    ADR r4, start ; => SUB r4,pc,#0xc

    start = pc - 4 - 8 = pc - 12 = pc - 0xc (ADR sits 4 bytes after start, and pc reads as the ADR address + 8)


    Example: C assignments

    • C:

      x = (a + b) - c;

    • Assembler:

      ADR r4,a ; get address for a

      LDR r0,[r4] ; get value of a

      ADR r4,b ; get address for b, reusing r4

      LDR r1,[r4] ; get value of b

      ADD r3,r0,r1 ; compute a+b

      ADR r4,c ; get address for c

      LDR r2,[r4] ; get value of c


    C assignment, cont’d.

    SUB r3,r3,r2 ; complete computation of x

    ADR r4,x ; get address for x

    STR r3,[r4] ; store value of x


    Example: C assignment

    • C:

      y = a*(b+c);

    • Assembler:

      ADR r4,b ; get address for b

      LDR r0,[r4] ; get value of b

      ADR r4,c ; get address for c

      LDR r1,[r4] ; get value of c

      ADD r2,r0,r1 ; compute partial result

      ADR r4,a ; get address for a

      LDR r0,[r4] ; get value of a


    C assignment, cont’d.

    MUL r2,r2,r0 ; compute final value for y

    ADR r4,y ; get address for y

    STR r2,[r4] ; store y


    Example: C assignment

    • C:

      z = (a << 2) | (b & 15);

    • Assembler:

      ADR r4,a ; get address for a

      LDR r0,[r4] ; get value of a

      MOV r0,r0,LSL #2 ; perform shift

      ADR r4,b ; get address for b

      LDR r1,[r4] ; get value of b

      AND r1,r1,#15 ; perform AND

      ORR r1,r0,r1 ; perform OR


    C assignment, cont’d.

    ADR r4,z ; get address for z

    STR r1,[r4] ; store value for z


    Example: if statement

    • C:

      if (a < b) { x = 5; y = c + d; } else x = c - d;

    • Assembler:

      ; compute and test condition

      ADR r4,a ; get address for a

      LDR r0,[r4] ; get value of a

      ADR r4,b ; get address for b

      LDR r1,[r4] ; get value for b

      CMP r0,r1 ; compare a < b

      BGE fblock ; if a>= b, branch to false block


    If statement, cont’d.

    ; true block

    MOV r0,#5 ; generate value for x

    ADR r4,x ; get address for x

    STR r0,[r4] ; store x

    ADR r4,c ; get address for c

    LDR r0,[r4] ; get value of c

    ADR r4,d ; get address for d

    LDR r1,[r4] ; get value of d

    ADD r0,r0,r1 ; compute y

    ADR r4,y ; get address for y

    STR r0,[r4] ; store y

    B after ; branch around false block


    If statement, cont’d.

    ; false block

    fblock ADR r4,c ; get address for c

    LDR r0,[r4] ; get value of c

    ADR r4,d ; get address for d

    LDR r1,[r4] ; get value for d

    SUB r0,r0,r1 ; compute c-d

    ADR r4,x ; get address for x

    STR r0,[r4] ; store value of x

    after ...


    Example: Conditional instruction implementation

    ; true block

    MOVLT r0,#5 ; generate value for x

    ADRLT r4,x ; get address for x

    STRLT r0,[r4] ; store x

    ADRLT r4,c ; get address for c

    LDRLT r0,[r4] ; get value of c

    ADRLT r4,d ; get address for d

    LDRLT r1,[r4] ; get value of d

    ADDLT r0,r0,r1 ; compute y

    ADRLT r4,y ; get address for y

    STRLT r0,[r4] ; store y


    Conditional instruction implementation, cont’d.

    ; false block

    ADRGE r4,c ; get address for c

    LDRGE r0,[r4] ; get value of c

    ADRGE r4,d ; get address for d

    LDRGE r1,[r4] ; get value for d

    SUBGE r0,r0,r1 ; compute c-d

    ADRGE r4,x ; get address for x

    STRGE r0,[r4] ; store value of x


    Example: FIR filter

    • C:

      for (i=0, f=0; i<N; i++)

      f = f + c[i]*x[i];

    • Assembler

      ; loop initiation code

      MOV r0,#0 ; use r0 for i

      MOV r8,#0 ; use separate index for arrays

      ADR r2,N ; get address for N

      LDR r1,[r2] ; get value of N

      MOV r2,#0 ; use r2 for f


    FIR filter, cont’d.

    ADR r3,c ; load r3 with base of c

    ADR r5,x ; load r5 with base of x

    ; loop body

    loop LDR r4,[r3,r8] ; get c[i]

    LDR r6,[r5,r8] ; get x[i]

    MUL r4,r4,r6 ; compute c[i]*x[i]

    ADD r2,r2,r4 ; add into running sum

    ADD r8,r8,#4 ; add one word offset to array index

    ADD r0,r0,#1 ; add 1 to i

    CMP r0,r1 ; exit?

    BLT loop ; if i < N, continue
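To see how the assembly maps back to C, the sketch below mirrors the register usage of the loop. It is an illustration, not the original source: the function name `fir` and the 32-bit integer element type are assumptions. Because the assembly tests the exit condition at the bottom of the loop, the body always runs once, so the sketch assumes n >= 1.

```c
#include <stdint.h>

int32_t fir(const int32_t *c, const int32_t *x, int32_t n) {
    int32_t f = 0;        /* r2: running sum */
    int32_t i = 0;        /* r0: loop counter */
    uint32_t offset = 0;  /* r8: byte offset shared by both arrays */
    do {
        f += c[offset / 4] * x[offset / 4];  /* LDR, LDR, MUL, ADD */
        offset += 4;                         /* one word per element */
        i++;                                 /* ADD r0,r0,#1 */
    } while (i < n);                         /* CMP r0,r1 ; BLT loop */
    return f;
}
```

Keeping a separate byte offset avoids multiplying the index by 4 on every iteration, which is why the assembly carries both r0 and r8.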


    Nested subroutine calls

    • Nesting/recursion requires coding convention:

      f1 LDR r0,[r13] ; load arg into r0 from stack

      ; call f2()

      STR r14,[r13,#-4]! ; push f1’s return address (full descending stack assumed)

      STR r0,[r13,#-4]! ; push arg to f2 on stack

      BL f2 ; branch and link to f2

      ; return from f1()

      ADD r13,r13,#4 ; pop f2’s arg off stack

      LDR r15,[r13],#4 ; pop return address into PC and return


    Summary

    • Load/store architecture

    • Most instructions are RISCy, operate in single cycle.

      • Some multi-register operations take longer.

    • All instructions can be executed conditionally.


    MPC850


    Reference Manuals

    • MPC850 Family User Manual

    • PowerPC Programming Environment Manual

      • Course Home Page http://calab.kaist.ac.kr/~maeng/cs310/micro02.htm

      • Motorola Home Page

        http://e-www.motorola.com


    Overview

    • Versatile, one-chip, integrated communication processor

      • Embedded PowerPC core

      • Versatile memory controller

      • Communication processor module (CPM)

        • Serial communication controllers (SCCs)

        • One USB

        • Etc.


    Embedded PowerPC core

    • Single issue, 32-bit version

    • Branch folding and prediction

    • 2-Kbyte I-cache, 1-Kbyte D-cache

      • 2-way set-associative

      • Physically addressed

    • MMUs with 8-entry TLBs

    • 4K, 16K, 256K, 512K, and 8MB page sizes


    Other Features

    • Dynamic data bus sizing : 8-, 16-, 32-bit

    • CPU clock : 0-80MHz

    • System Integration Unit (SIU)

    • Memory Controller

    • General Purpose timer

    • CPM, SCCs, SMCs, etc.


    PowerPC Architecture


    PowerPC instruction set

    • Overview

    • Operand Conventions

    • PowerPC Registers and programming model

    • Addressing Modes

    • Instruction Set

    • Cache model

    • Exception Model

    • Memory management model


    PowerPC Architecture

    • Motorola, IBM, Apple Computer

    • Power Architecture: RS/6000 family

    • 64-bit architecture with a 32-bit subset

    • Three Levels of the architecture

      • Flexibility – degrees of SW compatibility

        • UISA (User instruction set architecture)

        • VEA (Virtual environment architecture)

        • OEA (Operating environment architecture)


    Features not defined by the PowerPC Architecture

    • For flexibility

    • System bus interface signals

    • Cache design

    • The number and the nature of execution units

    • Other internal micro-architecture issues


    Endianness

    • Relationship between bit and byte/word ordering defines endianness:

    [Figure: one 32-bit word in both orderings. Little-endian (ARM, Intel): bit 31 ... bit 0 holding byte 3, byte 2, byte 1, byte 0. Big-endian (PowerPC, IBM, Motorola): bit 0 ... bit 31 holding byte 0, byte 1, byte 2, byte 3.]


    Programming Model – Registers


    PowerPC programming model - Register Set

    • User Model – UISA (32-bit architecture)

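      • General-purpose registers: GPR0 to GPR31 (32 bits each)

      • Floating-point registers: FGPR0 to FGPR31 (64 bits each)

      • Condition register: CR (32 bits)

      • FP status and control register: FPSCR (32 bits)

      • XER register: XER (32 bits)

      • Link register: LR (64/32 bits)

      • Count register: CTR (64/32 bits)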


    Condition Registers (CR)

    • For testing and branching

    [Figure: the 32-bit CR, bits 0 to 31, divided into eight 4-bit fields CR0 to CR7; one field carries the label FP.]

    • Condition register CRn field, as set by a compare instruction (for all integer instructions):

      • Bit 0: Negative (LT)

      • Bit 1: Positive (GT)

      • Bit 2: Zero (EQ)

      • Bit 3: Summary Overflow (SO)
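A small C sketch shows how a signed word compare fills one 4-bit CR field. The function name `cr_field_cmpw` is hypothetical, and the summary-overflow bit is passed in as a parameter because it is copied from elsewhere rather than computed by the compare. Bit 0 is the most significant bit of the field, following PowerPC bit numbering.

```c
#include <stdint.h>

/* Returns the field as a 4-bit value in the order LT GT EQ SO. */
unsigned cr_field_cmpw(int32_t a, int32_t b, unsigned so) {
    unsigned lt = (a < b);   /* bit 0: first operand is less */
    unsigned gt = (a > b);   /* bit 1: first operand is greater */
    unsigned eq = (a == b);  /* bit 2: operands are equal */
    return (lt << 3) | (gt << 2) | (eq << 1) | (so & 1u);
}
```

Exactly one of LT, GT, and EQ is set after a compare, so a conditional branch only needs to test a single bit of the field.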



    XER Register (XER)



    XER Register (XER), cont’d


    Link Register (LR), Count Register (CTR)

    bclrx (bc to link register)

    Branch with link update


    Counter Register

    • Loop count


    VEA Register Set – Time Base


    OEA Register Set


    Machine State Register (MSR)


    Addressing Modes

    • Effective Address Calculation

      • Register indirect with immediate index mode

      • Register indirect with index mode

      • Register indirect mode


    Register Indirect with Immediate Index Addressing



    Register Indirect with Index



    Register Indirect



    Instruction Formats

    • 4 bytes long and word-aligned

    • Bits 0-5 always specify the primary opcode

      • Extended opcode


    Instruction set

    • Integer

    • Floating-point

    • Load and store

    • Flow control

    • Processor control

    • Memory synchronization

    • Memory control

    • External control


    Integer Instructions

    • Arithmetic, compare, logical, rotate and shift

    • Integer arithmetic, shift, rotate, and string move

      • May update or read values from the XER

      • The CR may be updated if the Rc bit is set.

        • e.g., addic (CR unchanged) vs. addic. (result recorded in CR0)


    Integer Compare

    • Algebraically, logically

    • crfD can be omitted if the result is to be placed in CR0

    • crfD field : the target CR

    • The L bit has no effect on 32-bit operations


    Integer compare, cont’d


    Integer Logical


    Integer Logical, cont’d


    Rotate and Shift Instructions

    • SH: specify the number of bits to rotate

    • MB: mask start

    • ME: mask stop
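How SH, MB, and ME work together is easiest to see for a rotate-then-mask instruction such as rlwinm. The C sketch below models it under the PowerPC convention that bit 0 is the most significant bit; the helper names `rotl32` and `ppc_mask` are hypothetical.

```c
#include <stdint.h>

/* Rotate x left by sh bits; sh is reduced modulo 32. */
uint32_t rotl32(uint32_t x, unsigned sh) {
    sh &= 31u;
    return sh ? (x << sh) | (x >> (32u - sh)) : x;
}

/* Ones from bit mb through bit me (bit 0 = MSB); wraps around if mb > me. */
uint32_t ppc_mask(unsigned mb, unsigned me) {
    uint32_t from_mb = 0xFFFFFFFFu >> mb;         /* bits mb..31 */
    uint32_t to_me   = 0xFFFFFFFFu << (31u - me); /* bits 0..me  */
    return (mb <= me) ? (from_mb & to_me) : (from_mb | to_me);
}

/* rlwinm: rotate left by SH, then AND with the MB..ME mask. */
uint32_t rlwinm(uint32_t rs, unsigned sh, unsigned mb, unsigned me) {
    return rotl32(rs, sh) & ppc_mask(mb, me);
}
```

For example, `rlwinm(x, 8, 24, 31)` extracts the most significant byte of x, and `rlwinm(x, 2, 0, 29)` is a left shift by two bits.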


    Integer Rotate


    Integer Shift


    Load and Store

    • Integer load and store

    • Integer load and store with byte-reverse

    • Integer load and store multiple

    • FP load and store

    • Memory synchronization


    Branch and Flow Control

    • EA calculation

      • Branch relative

      • Branch conditional to relative address

      • Branch to absolute address

      • Branch conditional to absolute address

      • Branch conditional to link register

      • Branch conditional to count register


    Branch Relative


    Branch conditional to relative


    Branch to Absolute


    Branch conditional to absolute


    Branch conditional to LR


    Branch conditional to count register


    Conditional Branch control


    Branch Instructions


    CR logical Instructions


    Trap, System Linkage


    Processor Control


    Memory Synchronization


    Example

    • Test and Set

      loop: lwarx r5,0,r3 # load and reserve

      cmpwi r5,0 # done if word

      bne $+12 # not equal to 0

      stwcx. r4,0,r3 # try to store non-zero

      bne- loop # loop if lost reservation
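The same test-and-set behavior can be expressed portably with C11 atomics. This is an analogous sketch rather than a translation: it uses a compare-and-exchange instead of an explicit reservation, and the function name `test_and_set` is hypothetical. On PowerPC, compilers typically implement such operations with a lwarx/stwcx. loop like the one above.

```c
#include <stdatomic.h>
#include <stdint.h>

/* If *word is 0, atomically store new_value into it.
   Returns the value the word held beforehand (like r5 above). */
uint32_t test_and_set(_Atomic uint32_t *word, uint32_t new_value) {
    uint32_t expected = 0;
    /* On failure, expected is overwritten with the current contents. */
    atomic_compare_exchange_strong(word, &expected, new_value);
    return expected;
}
```

A caller acquires the lock when the return value is 0; any other return value means the word was already set and was left unchanged.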


    Summary

    • UISA, VEA, OEA

      • Register set

    • Fixed size instruction - RISC

    • Load and store architecture

      • 3 addressing modes

    • Condition Register Update – Rc field

      • 8 condition registers

    • Branch addressing modes

      • BO, BI fields

      • Relative, absolute, LR, CTR

