Embedded System Architecture Software Design: Using ARM

Day #3,#4,#5 Modules Outline


Course Introduction

  • Day #3

    • Simple RISC Assembly Language

    • ARM Assembly Language

    • ARM Developer Suite hands-on practice

  • Day #4

    • ARM Instruction Set

    • Important ASM Programming Skills

    • ARM/THUMB/C Interworking

  • Day #5

    • ARM Exception Handler

    • Build ARM ROM Image

    • Use NET-Start! ucLinux BSP


Overview of the Embedded System Product Design Flow


References

  • Steve Furber, ARM System-on-Chip Architecture, 2nd ed.

  • Seal, ARM Architecture Reference Manual, 2nd ed.

  • ARM Developer Suite: Getting Started

  • ARM Developer Suite: Developer Guide

  • ARM Developer Suite: Assembler Guide

  • http://www.uclinux.org/

  • 2002 Embedded System Development Experience

  • Building Powerful Platforms with Windows CE

  • Software Engineering: A Practitioner's Approach, 3rd ed.

  • Professional Symbian Programming


Embedded System Architecture Software Design: Using ARM

Module #3-1: Simple RISC Assembly Concept


RISC (Reduced Instruction Set Computer) vs. CISC (Complex Instruction Set Computer)

RISC:

  • Hardware instruction decode logic

  • Pipelined execution

  • Single-cycle execution

CISC:

  • Large microcode ROMs to decode instructions

  • Allows little pipelining

  • Many cycles to complete a single instruction

RISC advantages:

  • A smaller die size

  • A shorter development time

  • Higher performance

RISC drawback:

  • Poor code density


MU0: A Simple Processor

Hardware unit   Function
PC              Program counter
ACC             Accumulator
ALU             Arithmetic logic unit
IR              Instruction register


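MU0 Instruction Set and Datapath

Instruction   Opcode   Effect
LDA S         0000     ACC = mem[S]
STO S         0001     mem[S] = ACC
ADD S         0010     ACC = ACC + mem[S]
SUB S         0011     ACC = ACC - mem[S]
JMP S         0100     PC = S
JGE S         0101     if ACC >= 0, PC = S
JNE S         0110     if ACC != 0, PC = S
STP           0111     stop

Instruction format: a 4-bit opcode followed by a 12-bit address S.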


Instruction Execution Example

  • ADD 0x16A

    ACC:=ACC+mem[0x16A]


Operation Example

C function:

main()
{
    C = A + B;
}

MU0 machine instructions:

LDA 0x100
ADD 0x104
STO 0x108
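The fetch, decode, and execute behavior of MU0 can be sketched as a small C simulator. This is only an illustration: the type `Mu0`, the functions `mu0_encode`, `mu0_run`, and `mu0_add_example`, the 12-bit byte-addressed memory, and the trailing STP are assumptions added for the sketch, not part of the slides. A, B, and C are assumed to live at 0x100, 0x104, and 0x108, as in the three-instruction example.

```c
#include <assert.h>
#include <stdint.h>

/* Opcodes from the MU0 instruction table (upper 4 bits of each word). */
enum { LDA = 0, STO = 1, ADD = 2, SUB = 3, JMP = 4, JGE = 5, JNE = 6, STP = 7 };

#define MEM_BYTES 4096u            /* assumed: 12-bit byte address space */

typedef struct {
    uint16_t pc;                   /* program counter (byte address) */
    int16_t  acc;                  /* accumulator */
    uint16_t mem[MEM_BYTES / 2];   /* 16-bit words, indexed by address / 2 */
} Mu0;

/* Pack a 4-bit opcode and a 12-bit address S into one instruction word. */
uint16_t mu0_encode(unsigned op, unsigned s)
{
    assert(op <= (unsigned)STP && s < MEM_BYTES);
    return (uint16_t)((op << 12) | s);
}

/* Fetch, decode, and execute until STP or max_steps; returns steps executed. */
unsigned mu0_run(Mu0 *m, unsigned max_steps)
{
    unsigned steps = 0;
    while (steps < max_steps) {
        uint16_t ir = m->mem[m->pc / 2];              /* fetch into IR */
        unsigned op = (unsigned)ir >> 12;
        unsigned s  = ir & 0x0FFFu;
        m->pc = (uint16_t)((m->pc + 2u) % MEM_BYTES);
        steps++;
        switch (op) {
        case LDA: m->acc = (int16_t)m->mem[s / 2]; break;
        case STO: m->mem[s / 2] = (uint16_t)m->acc; break;
        case ADD: m->acc = (int16_t)(m->acc + (int16_t)m->mem[s / 2]); break;
        case SUB: m->acc = (int16_t)(m->acc - (int16_t)m->mem[s / 2]); break;
        case JMP: m->pc = (uint16_t)s; break;
        case JGE: if (m->acc >= 0) m->pc = (uint16_t)s; break;
        case JNE: if (m->acc != 0) m->pc = (uint16_t)s; break;
        default:  return steps;                       /* STP halts */
        }
    }
    return steps;
}

/* Runs LDA 0x100 / ADD 0x104 / STO 0x108 and returns the value stored at C. */
int mu0_add_example(int16_t a, int16_t b)
{
    Mu0 m = {0};
    m.mem[0] = mu0_encode(LDA, 0x100);
    m.mem[1] = mu0_encode(ADD, 0x104);
    m.mem[2] = mu0_encode(STO, 0x108);
    m.mem[3] = mu0_encode(STP, 0);    /* assumption: added so the sketch halts */
    m.mem[0x100 / 2] = (uint16_t)a;
    m.mem[0x104 / 2] = (uint16_t)b;
    mu0_run(&m, 100);
    return (int16_t)m.mem[0x108 / 2];
}
```

Calling `mu0_add_example(3, 4)` loads A into ACC, adds B, and stores the sum at C. The same `mu0_run` loop can be used to trace other MU0 programs by filling `mem` with different encoded instructions.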


Exercise: MU0 Microprocessor Operation

0x000 LDA 0x100

0x002 SUB 0x104

0x004 STO 0x100

0x006 JNE 0x000

0x008 STP

Describe what this program does, how the register values change, and how the data flows. Then write the same program in C.


Embedded System Architecture Software Design: Using ARM

Module #3-2: ARM Assembly Language


ARM7TDMI Data Flow

e.g. r3 := r4 + (r4 << 2)

ADD r3, r4, r4, LSL #2

[Figure: ARM7TDMI datapath, with the A bus and B bus labeled]


ARM Registers

  • 30 general-purpose 32-bit registers

  • 1 Program Counter (PC)

  • 1 Current Program Status Register (CPSR)

  • 5 Saved Program Status Registers (SPSR)

[Figure: register bank r0-r12, r13 (sp), r14 (lr), r15 (pc), and cpsr, shown for User, FIQ, IRQ, SVC, Abort, and Undefined modes]


Program Status Register

  • CPSR: Current Program Status Register

  • SPSR: Saved Program Status Register

Bit layout (all other bits are undefined):

  Bit     31  30  29  28  27  24  7  6  5  4-0
  Field   N   Z   C   V   Q   J   I  F  T  mode

  • Condition code flags

    • N: Negative result from ALU

    • Z: Zero result from ALU

    • C: ALU operation carried out

    • V: ALU operation overflowed

  • Q: Sticky overflow flag

    • Architecture 5TE only

    • QADD, QSUB…

  • J: Processor in Jazelle state

    • Architecture 5TEJ only

  • Interrupt disable bits

    • I: disables IRQ

    • F: disables FIQ

  • T bit

    • Architecture xT only

    • T=0: ARM state

    • T=1: Thumb state

  • Mode bits: specify the processor mode

    • 10000 User

    • 10001 FIQ

    • 10010 IRQ

    • 10011 SVC

    • 10111 Abort

    • 11011 Undef

    • 11111 System
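As a rough illustration of the bit layout, the following C sketch unpacks a PSR value into its fields. The names `PsrFields`, `psr_decode`, and `psr_mode_name` are hypothetical helpers invented for this example; only the bit positions and mode encodings come from the slide.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    int n, z, c, v;          /* condition code flags, bits 31-28 */
    int irq_disabled;        /* I, bit 7 */
    int fiq_disabled;        /* F, bit 6 */
    int thumb;               /* T, bit 5 */
    unsigned mode;           /* mode bits 4-0 */
} PsrFields;

/* Split a 32-bit PSR value into the fields listed on the slide. */
PsrFields psr_decode(uint32_t psr)
{
    PsrFields f;
    f.n = (int)((psr >> 31) & 1u);
    f.z = (int)((psr >> 30) & 1u);
    f.c = (int)((psr >> 29) & 1u);
    f.v = (int)((psr >> 28) & 1u);
    f.irq_disabled = (int)((psr >> 7) & 1u);
    f.fiq_disabled = (int)((psr >> 6) & 1u);
    f.thumb = (int)((psr >> 5) & 1u);
    f.mode = psr & 0x1Fu;
    return f;
}

/* Map the 5 mode bits to the mode names from the slide. */
const char *psr_mode_name(unsigned mode)
{
    assert(mode <= 0x1Fu);
    switch (mode) {
    case 0x10: return "User";
    case 0x11: return "FIQ";
    case 0x12: return "IRQ";
    case 0x13: return "SVC";
    case 0x17: return "Abort";
    case 0x1B: return "Undef";
    case 0x1F: return "System";
    default:   return "Reserved";
    }
}
```

For example, `psr_decode(0x600000D3)` describes a processor in SVC mode, ARM state, with both interrupts disabled and the Z and C flags set.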


Program Counter (R15)

  • In ARM state:

    • All ARM instructions are four bytes long (one 32-bit word) and are always aligned on a word boundary.

    • The PC value is stored in bits [31:2], with bits [1:0] undefined.

  • In Thumb state:

    • All instructions are 16 bits wide and halfword aligned.

    • The PC value is stored in bits [31:1], with bit [0] undefined.

  • In Jazelle state:

    • All instructions are 8 bits wide.

    • The processor performs a word access to read 4 instructions at once.


    Link Register (R14)

    • Register 14 is the Link Register (LR).

    • This register holds the address of the next instruction after a Branch and Link (BL) instruction, which is the instruction used to make a subroutine call.

    • At all other times, R14 can be used as a general-purpose register.


    Other Registers (R0-R13)

    • The remaining 14 registers have no special hardware purpose.

    • Their uses are defined purely by software.

    • By convention, ARM assembly language uses R13 as the stack pointer (SP).

    • C and C++ compilers always use R13 as the stack pointer.


    Structure of ARM Assembly Language Module

    • AREA sectionname{,attr}{,attr}…: starts a new code or data section.

      • CODE: the section contains machine instructions.

      • READONLY: the section should not be written to.

      • Other attributes: DATA, NOINIT, READWRITE, …

    • ENTRY: declares an entry point to the program.

    • Labels: symbolic names for addresses.

    • END: declares the end of the source file.


    Calling Subroutines Using BL

    • BL destination

      • destination is the label on the first instruction of the subroutine.

    • BL does the following:

      • places the return address in the link register (R14)

      • sets the PC to the address of the subroutine.

    • In the subroutine

      • we can use "MOV pc, lr" to return.

    • By convention, R0-R3 are used to pass parameters.


    Calling Subroutines Example

    ; name this block of code
    ; mark first instruction to execute
    ; Set up parameters
    ; Call subroutine
    ; angel_SWIreason_ReportException
    ; ADP_Stopped_ApplicationExit
    ; ARM semihosting SWI
    ; Subroutine code
    ; Return from subroutine.
    ; Mark end of file


    Constant Data Types

    • Numbers: numeric constants are accepted in three forms:

      • Decimal, for example, 123

      • Hexadecimal, for example, 0x7B

      • n_XXX, where:

        • n is a base between 2 and 9

        • XXX is a number in that base.

    • Boolean: the constants TRUE and FALSE must be written as {TRUE} and {FALSE}.

    • Characters: character constants consist of opening and closing single quotes ('X') enclosing either a single character or an escaped character, using the standard C escape characters.

    • Strings: strings consist of opening and closing double quotes ("XXXX"). If double quotes or dollar signs are used within a string as literal text characters, they must be represented by a pair of the appropriate character.

      • For example, you must use $$ if you require a single $ in the string. The standard C escape sequences can be used within string constants.


    Conditional ARM Instructions

    Almost all ARM instructions can be conditionally executed.

    e.g.

    ADDS r0, r1, r2

    ADDEQ r0, r1, r2

    A conditional instruction executes if the N, Z, C, and V flags in the CPSR satisfy the condition specified in the instruction; otherwise it acts as a NOP.

    Instruction format: XXXCC, where XXX is the instruction mnemonic and CC is the condition.


    Conditional Execution

    • Almost every ARM instruction can be executed conditionally on the state of the ALU status flags in the CPSR.

    • Add an S suffix to an ARM data-processing instruction to make it update the ALU status flags in the CPSR.

      • E.g. ADDS r0, r1, r2 ; r0 = r1 + r2, and update the ALU status flags in the CPSR.

    • In ARM state, you can:

      • update the ALU status flags in the CPSR on the result of a data operation

      • execute several other data operations without updating the flags

      • execute following instructions or not, according to the state of the flags updated in the first operation.

    • In Thumb state:

      • most data operations always update the flags

      • conditional execution can only be achieved using the conditional branch instruction (B).

    • Do not use the S suffix with CMP, CMN, TST, or TEQ. These comparison instructions always update the flags.


    ALU Status Flags in the CPSR

    • N: Set when the result of the operation was negative.

    • Z: Set when the result of the operation was zero.

    • C: Set when the operation resulted in a carry. A carry occurs:

      • if the result of an addition is greater than or equal to 2^32

      • if the result of a subtraction is positive or zero

      • or as the result of an inline barrel shifter operation in a move or logical instruction.

    • V: Set when the operation caused overflow.

      • Overflow occurs if the result of an add, subtract, or compare is greater than or equal to 2^31, or less than -2^31.

    • Q: ARM architecture v5E only. Sticky flag.

      • Used to detect saturation in special saturating arithmetic instructions (e.g. QADD, QSUB, QDADD, and QDSUB),

      • Or overflow in certain multiply instructions (SMLAxy and SMLAWy)
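The flag rules for 32-bit addition and subtraction can be sketched in C as follows. This is a minimal model for illustration; `AluResult`, `alu_adds`, and `alu_subs` are hypothetical names, and the sketch covers only the N, Z, C, and V behavior of ADDS and SUBS (or CMP), not the shifter carry or the Q flag.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t result;
    int n, z, c, v;
} AluResult;

/* Model of ADDS: C is the carry out of bit 31, V is signed overflow. */
AluResult alu_adds(uint32_t a, uint32_t b)
{
    AluResult r;
    uint64_t wide = (uint64_t)a + b;           /* keep the 33rd bit */
    r.result = (uint32_t)wide;
    r.n = (int)(r.result >> 31);
    r.z = (r.result == 0);
    r.c = (int)(wide >> 32);                   /* unsigned sum >= 2^32 */
    /* Overflow: operands share a sign, and the result's sign differs. */
    r.v = (int)((~(a ^ b) & (a ^ r.result)) >> 31);
    assert(r.c == (r.result < a));             /* carry means the sum wrapped */
    return r;
}

/* Model of SUBS/CMP: C is set when no borrow occurs (a >= b, unsigned). */
AluResult alu_subs(uint32_t a, uint32_t b)
{
    AluResult r;
    r.result = a - b;
    r.n = (int)(r.result >> 31);
    r.z = (r.result == 0);
    r.c = (a >= b);
    /* Overflow: operands differ in sign, and the result's sign differs from a. */
    r.v = (int)(((a ^ b) & (a ^ r.result)) >> 31);
    assert(r.z == (a == b));
    return r;
}
```

Adding 1 to 0x7FFFFFFF, for instance, sets N and V but not C, while adding 1 to 0xFFFFFFFF sets Z and C but not V.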


    Conditional Code Suffixes
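For reference, the standard ARM condition suffixes can be expressed as tests on the N, Z, C, and V flags. The C sketch below is an illustration based on the ARM architecture's condition code definitions rather than on the slide itself; the enum `Cond` and the function `cond_passed` are hypothetical names, and each flag argument is expected to be 0 or 1.

```c
#include <assert.h>

/* Standard ARM condition codes, in encoding order (0 to 14). */
typedef enum {
    COND_EQ, COND_NE, COND_CS, COND_CC, COND_MI, COND_PL, COND_VS, COND_VC,
    COND_HI, COND_LS, COND_GE, COND_LT, COND_GT, COND_LE, COND_AL
} Cond;

/* Returns 1 if an instruction carrying this suffix would execute. */
int cond_passed(Cond cond, int n, int z, int c, int v)
{
    assert((int)cond <= (int)COND_AL);
    switch (cond) {
    case COND_EQ: return z;              /* equal */
    case COND_NE: return !z;             /* not equal */
    case COND_CS: return c;              /* carry set / unsigned >= (HS) */
    case COND_CC: return !c;             /* carry clear / unsigned < (LO) */
    case COND_MI: return n;              /* negative */
    case COND_PL: return !n;             /* positive or zero */
    case COND_VS: return v;              /* overflow */
    case COND_VC: return !v;             /* no overflow */
    case COND_HI: return c && !z;        /* unsigned > */
    case COND_LS: return !c || z;        /* unsigned <= */
    case COND_GE: return n == v;         /* signed >= */
    case COND_LT: return n != v;         /* signed < */
    case COND_GT: return !z && n == v;   /* signed > */
    case COND_LE: return z || n != v;    /* signed <= */
    case COND_AL: return 1;              /* always */
    }
    return 0;
}
```

After `CMP r0, r1` with equal operands the flags are Z=1 and C=1, so EQ, CS, LS, GE, and LE pass while NE, HI, GT, and LT do not.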


    Conditional Code Examples

    • ADD r0, r1, r2 ; r0 = r1 + r2, don't update flags

    • ADDS r0, r1, r2 ; r0 = r1 + r2, and update flags

    • ADDCSS r0, r1, r2 ; if C flag set then r0 = r1 + r2, and update flags

    • CMP r0, r1 ; update flags based on r0 - r1

    • Example code sequence:

            MOV R0, #0
      LOOP  ADD R0, R0, #1
            CMP R0, #10
            BNE LOOP
            SUB R1, R1, R0


    Writing Efficient, Compact Code with Conditional Instructions


    Exercise

    Write the following program in ARM assembly and evaluate its execution cost in clock cycles.

    A branch takes 3 cycles; every other instruction takes 1.

    Note: only the CMP, SUB, and B instructions, together with condition codes, are needed.

    while (r1 != r2)
    {
        if (r1 > r2)
            r1 = r1 - r2;
        else
            r2 = r2 - r1;
    }
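One possible way to check a solution is to model the cycle count in C. The sketch below assumes a four-instruction loop built from conditional execution (shown in the comment) and assumes that a branch whose condition fails costs 1 cycle like any other instruction; both are assumptions for illustration, and `GcdCost` and `gcd_conditional` are hypothetical names. Inputs are assumed to be positive and below 2^31 so that signed and unsigned comparisons agree.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t gcd;        /* final value of r1 (equal to r2) */
    unsigned cycles;     /* estimated clock cycles */
} GcdCost;

/*
 * Models this assumed loop:
 *   loop  CMP   r1, r2
 *         SUBGT r1, r1, r2
 *         SUBLT r2, r2, r1
 *         BNE   loop
 */
GcdCost gcd_conditional(uint32_t r1, uint32_t r2)
{
    GcdCost out;
    unsigned cycles = 0;
    assert(r1 > 0 && r2 > 0);
    for (;;) {
        /* CMP sets the flags once; the SUBs do not change them. */
        int gt = r1 > r2, lt = r1 < r2, ne = r1 != r2;
        cycles += 1;                 /* CMP */
        if (gt) r1 -= r2;
        cycles += 1;                 /* SUBGT, 1 cycle whether or not it executes */
        if (lt) r2 -= r1;
        cycles += 1;                 /* SUBLT */
        if (ne) {
            cycles += 3;             /* taken branch */
        } else {
            cycles += 1;             /* assumed cost of a branch not taken */
            break;
        }
    }
    out.gcd = r1;
    out.cycles = cycles;
    return out;
}
```

Under these assumptions each pass through the loop costs 6 cycles and the final pass costs 4, so the total depends only on the number of subtractions needed.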


    Embedded System Architecture Software Design: Using ARM

    Module #3-3: ARM Developer Suite Practice


    ARM ADS 1.2

    Others:

    • C & C++ libraries

    • ARM Firmware Suite

    • ARM Applications Library

    • RealMonitor: a real-time debug monitor


    Implementation and Integration: Command Line, Makefile, CodeWarrior


    Pre-configured Project Stationery Files

    • Debug

      • This build target is configured to build output binaries that are fully debuggable, at the expense of optimization.

    • Release

      • This build target is configured to build output binaries that are fully optimized, at the expense of debug information.

    • DebugRel

      • This build target is configured to build output binaries that provide adequate optimization and give a good debug view.


    Possible Development Environments


    Reference

    • ARM Developer Suite Version 1.2 Getting Started

    • Follow Chapter 3 to practice using ADS.


    Embedded System Architecture Software Design: Using ARM

    Module #3-4: ARM Instruction Set


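    Features of the ARM Instruction Set

    • All instructions are 32 bits long

      • ADD r0, r1, r2 ; r0 := r1 + r2

    • Most instructions execute in a single cycle

    • Instructions can be executed conditionally

    • Load/store architecture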


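    Thumb Instruction Set

    • Thumb instructions are 16 bits long

      • Optimized for code density: about 65% of the ARM code size

      • Well suited to systems with small memory

      • Thumb instructions support a subset of the ARM instruction set's functionality

      • The processor must switch to Thumb state at run time

        ADDS r1, r1, #3 (ARM)

        ADD r1, #3 (Thumb)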


    Jazelle

    • Jazelle technology lets an ARM processor execute 8-bit Java bytecode

      • The hardware can support up to 95% of the bytecodes

      • About five times faster than a typical software JVM


    ARM Instruction Set Categories

    • Branch instructions

    • Data-processing instructions

    • Load and store instructions

    • Status register transfer instructions

    • Coprocessor instructions

    • Exception-generating instructions.


    Branch Instructions

    • B: Branch

    • BL: Branch with link

      • Stores the return address in r14

      • e.g.

                  CMP   r2, #0
                  BLEQ  function
        function
                  MOV   pc, r14


    Branch Instruction Encoding

    • The range of the branch instruction is +/- 32 Mbytes

    • 'L': the branch and link variant.

    Assembly format:

    B{L}{<cond>} <target address>

    BX{<cond>} Rm


    Branch Instructions Example

    • C

      if (a == 0) function1(1);
      else …

      function1() {
          function2();
          …
      }

      function2() {
          return;
      }

    • ASM

      function1
              STMFD r13!, {r0-r4, r14}
              BL    function2
              LDMFD r13!, {r0-r4, pc}
      function2
              MOV   pc, r14


    Data-processing Instructions Encoding

    Assembly format:

    <op>{<cond>}{S} Rd, Rn, #<32-bit immediate>

    <op>{<cond>}{S} Rd, Rn, Rm, {<shift>}


    Data Processing Opcode

    Assembly format:

    <op>{<cond>}{S} Rd, Rn, #<32-bit immediate>

    <op>{<cond>}{S} Rd, Rn, Rm, {<shift>}

    Opcode [24:21]  Mnemonic  Meaning                       Effect
    0000            AND       Logical bit-wise AND          Rd := Rn AND Op2
    0001            EOR       Logical bit-wise exclusive OR Rd := Rn EOR Op2
    0010            SUB       Subtract                      Rd := Rn - Op2
    0011            RSB       Reverse subtract              Rd := Op2 - Rn
    0100            ADD       Add                           Rd := Rn + Op2
    0101            ADC       Add with carry                Rd := Rn + Op2 + C
    0110            SBC       Subtract with carry           Rd := Rn - Op2 + C - 1
    0111            RSC       Reverse subtract with carry   Rd := Op2 - Rn + C - 1
    1000            TST       Test                          Scc on Rn AND Op2
    1001            TEQ       Test equivalence              Scc on Rn EOR Op2
    1010            CMP       Compare                       Scc on Rn - Op2
    1011            CMN       Compare negated               Scc on Rn + Op2
    1100            ORR       Logical bit-wise OR           Rd := Rn OR Op2
    1101            MOV       Move                          Rd := Op2
    1110            BIC       Bit clear                     Rd := Rn AND NOT Op2
    1111            MVN       Move negated                  Rd := NOT Op2


    Example Data-processing Instructions

    • Arithmetic operations

      • ADD r0, r1, r2 ; r0 = r1 + r2

      • SUB r0, r1, r2 ; r0 = r1 - r2

      • RSB r0, r1, r2 ; r0 = r2 - r1

    • Bit-wise logical operations

      • AND r0, r1, r2 ; r0 = r1 & r2

      • ORR r0, r1, r2 ; r0 = r1 | r2

      • EOR r0, r1, r2 ; r0 = r1 xor r2

      • BIC r0, r1, r2 ; r0 = r1 and not r2 (bit clear)


    Example Data-processing Instructions (cont.)

    • Register movement operations

      • MOV r0, r2 ; r0 = r2

      • MVN r0, r2 ; r0 = not r2

    • Comparison operations (set condition code bits N, Z, C, V)

      • CMP r1, r2 ; set cc on r1 - r2

    • Immediate operands

      • ADD r3, r3, #1 ; r3 = r3 + 1

      • AND r8, r7, #&ff ; r8 = r7[7:0]

      • The & prefix denotes base 16.


    Shifter

    • LSL: Logical Shift Left (each bit position multiplies by 2)

    • LSR: Logical Shift Right (each bit position divides an unsigned value by 2)

    • ASR: Arithmetic Shift Right

    • ROR: Rotate Right
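The four shift types can be modeled in C as in the sketch below. The names `ShiftOp`, `barrel_shift`, and `times_five` are hypothetical, and the shift amount is restricted to 1 through 31 so that every C shift is well defined; the hardware's special cases for amounts of 0 and 32 are outside this sketch.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { SH_LSL, SH_LSR, SH_ASR, SH_ROR } ShiftOp;

/* Apply one barrel-shifter operation to a 32-bit operand. */
uint32_t barrel_shift(uint32_t value, ShiftOp op, unsigned amount)
{
    assert(amount >= 1 && amount <= 31);
    switch (op) {
    case SH_LSL:
        return value << amount;
    case SH_LSR:
        return value >> amount;                 /* zero fill */
    case SH_ASR:                                /* sign fill */
        return (value >> amount) |
               ((value & 0x80000000u) ? ~(0xFFFFFFFFu >> amount) : 0u);
    case SH_ROR:
        return (value >> amount) | (value << (32 - amount));
    }
    return value;
}

/* A shifted second operand folds a multiply into one add: x + (x << 2) = 5x. */
uint32_t times_five(uint32_t x)
{
    return x + barrel_shift(x, SH_LSL, 2);
}
```

Because the shifted operand feeds straight into the ALU, a multiplication by a constant such as 5 can be written as a single add of the value and its shifted copy.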


    Shifter Applications

    e.g. #1

    ADD r3, r2, r1, LSL #3 ; r3 := r2 + 8*r1

    e.g. #2

    r0 = r1*5

    → r0 = r1 + (r1*4)

    → ADD r0, r1, r1, LSL #2


    Multiply Instruction Binary Encoding

    Assembly format:

    MUL{<cond>}{S} Rd, Rm, Rs

    MLA{<cond>}{S} Rd, Rm, Rs, Rn

    <mul>{<cond>}{S} RdHi, RdLo, Rm, Rs

    RdHi: the most significant 32 bits of the 64-bit number

    RdLo: the least significant 32 bits of the 64-bit number

    Opcode [23:21]  Mnemonic  Meaning                               Effect
    000             MUL       Multiply (32-bit result)              Rd := (Rm*Rs)[31:0]
    001             MLA       Multiply-accumulate (32-bit result)   Rd := (Rm*Rs + Rn)[31:0]
    100             UMULL     Unsigned multiply long                RdHi:RdLo := Rm*Rs
    101             UMLAL     Unsigned multiply-accumulate long     RdHi:RdLo += Rm*Rs
    110             SMULL     Signed multiply long                  RdHi:RdLo := Rm*Rs
    111             SMLAL     Signed multiply-accumulate long       RdHi:RdLo += Rm*Rs


    Count Leading Zeros Instruction (v5T only)

    Assembly format:

    CLZ{<cond>} Rd, Rm

    • Sets Rd to the number of zero bits above the most significant 1 in Rm. If Rm = 0, Rd = 32.

    • E.g.

      MOV r0, #&100

      CLZ r1, r0

      r1 = 23
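A software equivalent of CLZ makes the definition concrete: it counts the zero bits above the most significant 1, so 0x100 (only bit 8 set) yields 31 - 8 = 23. The function name `clz32` is a hypothetical helper for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Count the zero bits above the most significant 1; returns 32 for zero. */
unsigned clz32(uint32_t x)
{
    unsigned count = 0;
    if (x == 0)
        return 32;
    while ((x & 0x80000000u) == 0) {   /* shift left until bit 31 is set */
        x <<= 1;
        count++;
    }
    assert(count <= 31);
    return count;
}
```

A typical use is normalization: shifting a value left by `clz32(x)` places its most significant 1 in bit 31.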


    Exercise

    • Write an ARM assembly program that includes a subroutine to multiply a value by 10.

    • Use the ADS environment.

    • Assume an ARM core without a hardware multiplier.

      main()
      {
          x = 5;
          y = mul_ten(x);
      }

      int mul_ten(int x)
      {
          return 10 * x;
      }
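Without a multiplier, multiplication by 10 can be built from shifts and adds. The C sketch below mirrors one possible two-instruction ARM sequence, shown in the comments; it is one approach among several, and the unsigned type and overflow guard are assumptions made to keep the C well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply by 10 using only shift and add: 10x = (x + 4x) * 2. */
uint32_t mul_ten(uint32_t x)
{
    uint32_t five;
    assert(x <= UINT32_MAX / 10);   /* guard against wraparound */
    five = x + (x << 2);            /* ADD r0, r0, r0, LSL #2 ; r0 = 5x  */
    return five << 1;               /* MOV r0, r0, LSL #1     ; r0 = 10x */
}
```

With `x = 5` the intermediate value is 25 and the result is 50, matching `10 * x` in the C version of the exercise.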


    Single Word and Unsigned Byte Data Transfer Instruction Binary Encoding

    Assembly format:

    LDR|STR{<cond>}{B} Rd, [Rn, <offset>]{!} ; Pre-indexed form

    LDR|STR{<cond>}{B} Rd, [Rn], <offset> ; Post-indexed form

    LDR|STR{<cond>}{B} Rd, LABEL ; PC-relative form


    Load and Store Examples

    • Single register load and store

      LDR r0, [r1] ; r0 := mem32[r1]

      STR r0, [r1] ; mem32[r1] := r0

    • Base plus offset addressing

      • Pre-indexing

        LDR r0, [r1, #4] ; r0 := mem32[r1+4]

      • Auto-indexing

        LDR r0, [r1, #4]! ; r0 := mem32[r1+4], r1 := r1+4

      • Post-indexing

        LDR r0, [r1], #4 ; r0 := mem32[r1], r1 := r1+4

      • PC-relative

                  LDR  r1, UART_ADD ; UART address into r1
                  STRB r0, [r1]     ; store data to UART
        UART_ADD  DCD  &10000000    ; address literal
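The difference between the three base-plus-offset forms is which address is used for the access and whether the base register is updated. The C sketch below models this on a small word array; `LoadResult`, `demo_mem`, and the `ldr_*` functions are hypothetical names for illustration, and addresses are byte offsets into the array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t rd;   /* value loaded into the destination register */
    uint32_t rn;   /* base register value after the instruction */
} LoadResult;

/* Example memory: four words at byte addresses 0, 4, 8, and 12. */
static const uint32_t demo_mem[4] = { 10, 20, 30, 40 };

static uint32_t load_word(const uint32_t *mem, size_t words, uint32_t addr)
{
    assert(addr % 4 == 0 && addr / 4 < words);
    return mem[addr / 4];
}

/* LDR rd, [rn, #off]   : access rn+off, base unchanged. */
LoadResult ldr_pre(const uint32_t *mem, size_t words, uint32_t rn, uint32_t off)
{
    LoadResult r = { load_word(mem, words, rn + off), rn };
    return r;
}

/* LDR rd, [rn, #off]!  : access rn+off, then write rn+off back to the base. */
LoadResult ldr_auto(const uint32_t *mem, size_t words, uint32_t rn, uint32_t off)
{
    LoadResult r = { load_word(mem, words, rn + off), rn + off };
    return r;
}

/* LDR rd, [rn], #off   : access rn, then add off to the base. */
LoadResult ldr_post(const uint32_t *mem, size_t words, uint32_t rn, uint32_t off)
{
    LoadResult r = { load_word(mem, words, rn), rn + off };
    return r;
}
```

Starting from a base of 0 with an offset of 4, the pre-indexed and auto-indexed forms both load the second word, while the post-indexed form loads the first word; only the last two leave the base at 4.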


    Half-word and Signed Byte Data Transfer Instruction Binary Encoding

    Assembly format:

    LDR|STR{<cond>}H|SH|SB Rd, [Rn, <offset>]{!} ; Pre-indexed form

    LDR|STR{<cond>}H|SH|SB Rd, [Rn], <offset> ; Post-indexed form

    • An unsigned value is zero-extended to 32 bits when loaded;

    • A signed value is extended to 32 bits by replicating the most significant bit of the data.


    Half-word Load/Store Example

          ADR   r1, ARRAY1     ; half-word array start
          ADR   r2, ARRAY2     ; word array start
          ADR   r3, ENDARR1    ; ARRAY1 end + 2
    LOOP  LDRSH r0, [r1], #2   ; get signed half-word
          STR   r0, [r2], #4   ; save word
          CMP   r1, r3         ; check for end of array
          BLT   LOOP           ; if not finished, loop


    Exercise: String Copy

    • Write an assembly program that copies a string.

    • Use the ADS environment.

      A = "Hello, this is a sunny day!"

      B = " "


    Multiple Register Data Transfer Instruction Binary Encoding

    Assembly format:

    LDM|STM{<cond>}<add mode> Rn{!}, <registers>

    <add mode>

    IA: Increment after.

    IB: Increment before.

    DA: Decrement after.

    DB: Decrement before.

    In a non-user mode, the CPSR may be restored by:

    LDM{<cond>}<add mode> Rn{!}, <registers, PC>^

    Full or empty: The stack pointer can either point to the last item in the stack (a full stack), or the next free space on the stack (an empty stack).


    Example Addressing Mode for LDM/STM


    ISR Example

    • e.g. Interrupt handler

      __irq void IRQHandler(void)
      {
          volatile unsigned int *base = (unsigned int *) 0x80000000;

          if (*base == 1)
              C_int_handler_1();
          *(base + 1) = 0;
      }

    IRQHandler PROC
            STMFD sp!, {r0-r4, r12, lr}
            MOV   r4, #0x80000000
            LDR   r0, [r4, #0]
            SUB   sp, sp, #4
            CMP   r0, #1
            BLEQ  C_int_handler_1
            MOV   r0, #0
            STR   r0, [r4, #4]
            ADD   sp, sp, #4
            LDMFD sp!, {r0-r4, r12, lr}
            SUBS  pc, lr, #4


    Swap Memory and Register Instruction Binary Encoding

    Assembly format:

    SWP{<cond>}{B} Rd, Rm, [Rn]


    SWP Example

    ADR  r0, SEMAPHORE
    SWPB r1, r1, [r0] ; exchange byte


    Status Register to General Register Transfer Instruction Binary Encoding

    Assembly format:

    MRS{<cond>} Rd, CPSR|SPSR

    E.g.

    MRS r0, CPSR ; move the CPSR to r0

    MRS r3, SPSR ; move the SPSR to r3

    Note:

    The SPSR form should not be used in User or System mode.


    Transfer to Status Register Instruction Binary Encoding

    Assembly format:

    MSR{<cond>} CPSR_f|SPSR_f, #<32-bit immediate>

    MSR{<cond>} CPSR_<field>|SPSR_<field>, Rm

    <field>

    c: the control field, PSR[7:0]

    x: the extension field, PSR[15:8]

    s: the status field, PSR[23:16]

    f: the flags field, PSR[31:24]


    MSR Example

    • Set the N, Z, C, and V flags:

      MSR CPSR_f, #&f0000000 ; set all the flags

    • Set the C flag, preserving N, Z, and V:

      MRS r0, CPSR           ; move the CPSR to r0
      ORR r0, r0, #&20000000 ; set bit 29 of r0
      MSR CPSR_f, r0         ; move back to the CPSR


    Exercise: Switching the ARM Operating Mode

    • Write a piece of code that switches the ARM processor from Supervisor mode to IRQ mode.

    • Use the ADS environment.

      Bit     31  30  29  28  27  24  7  6  5  4-0
      Field   N   Z   C   V   Q   J   I  F  T  mode

    • Mode bits: specify the processor mode

       10000 User

       10001 FIQ

       10010 IRQ

       10011 SVC

       10111 Abort

       11011 Undef

       11111 System
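The bit manipulation behind a mode switch is a read-modify-write of the five mode bits. The C sketch below shows only that arithmetic; `cpsr_with_mode` is a hypothetical helper, and the ARM sequence in the comment is one possible outline, given as an assumption rather than as the exercise's official answer.

```c
#include <assert.h>
#include <stdint.h>

#define MODE_MASK 0x1Fu      /* mode bits 4-0 */
#define MODE_IRQ  0x12u      /* 10010 */
#define MODE_SVC  0x13u      /* 10011 */

/*
 * Compute the new CPSR value for a mode change, leaving all other bits
 * untouched. A possible ARM equivalent (assumed outline):
 *   MRS r0, CPSR
 *   BIC r0, r0, #0x1F
 *   ORR r0, r0, #0x12
 *   MSR CPSR_c, r0
 */
uint32_t cpsr_with_mode(uint32_t cpsr, uint32_t mode)
{
    assert((mode & ~MODE_MASK) == 0);
    return (cpsr & ~MODE_MASK) | mode;
}
```

Starting from a CPSR of 0x600000D3 (SVC mode, interrupts disabled), `cpsr_with_mode(cpsr, MODE_IRQ)` gives 0x600000D2: only the mode field changes.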


    Coprocessor Instructions

    • There are 3 types:

      • Coprocessor data operations

        • CDP: initiate a coprocessor data processing operation

      • Coprocessor Register transfers

        • MRC: Move to ARM register from coprocessor register

        • MCR: Move to coprocessor register from ARM register

      • Coprocessor Memory transfer

        • LDC: load coprocessor register from memory

        • STC: store from coprocessor register to memory


    Exception-generating & Semaphore Instructions

    • SWI

      • Used to cause a Software Interrupt exception to occur

        SWI{<cond>} <immed_24>

        SWI 0x123456

    • BKPT

      • Used for software breakpoints in ARM architecture 5 or above. Causes a Prefetch Abort exception to occur.

        BKPT <immediate>


    Summary of ARM Architectures

    Core                                   Architecture
    ARM1                                   v1
    ARM2                                   v2
    ARM2as, ARM3                           v2a
    ARM6, ARM600, ARM610                   v3
    ARM7, ARM700, ARM710                   v3
    ARM7TDMI, ARM710T, ARM720T, ARM740T    v4T
    StrongARM, ARM8, ARM810                v4
    ARM9TDMI, ARM920T, ARM940T             v4T
    ARM9E-S, XScale microarchitecture      v5TE
    ARM10TDMI, ARM1020E                    v5TE
    ARM926EJ-S, ARM1026EJ-S                v5TEJ


    Reference

    • S. Furber, ARM system-on-chip Architecture, 2nd ed. Addison-Wesley

    • Seal. ARM architecture reference manual, 2nd ed. Addison-Wesley

    • ARM Development Suite User Guide


    Embedded System Architecture Software Design: Using ARM

    Module #3-5: Important ARM ASM Programming Skills


    Load Constant into Register

    • Direct loading with MOV and MVN

    • Loading with LDR Rd,=const


    Direct Load Constant into Register

    • MOV{cond}{S} Rd, Operand2

      • Loads an immediate constant into a register

      • E.g.

        • MOV R1, #0x18 ; R1 = 0x18

      • Can load any 8-bit constant, giving a range of 0x00 to 0xFF

    • MVN: loads the bitwise complement of these values. The numerical values are -(n+1).

    Assembler error message: Immediate n out of range for this operation.
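In the ARM data-processing encoding, an immediate operand is an 8-bit value rotated right by an even number of bits, which is why some 32-bit constants beyond 0xFF can still be loaded with MOV or MVN. The C sketch below checks that rule; it is an illustration based on the architecture's immediate encoding rather than on the slide, and `arm_immediate_ok`, `classify_constant`, and `LoadKind` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { LOAD_MOV, LOAD_MVN, LOAD_LITERAL_POOL } LoadKind;

static uint32_t rotl32(uint32_t x, unsigned n)
{
    assert(n < 32);
    return n ? (x << n) | (x >> (32 - n)) : x;
}

/* True if c equals some 8-bit value rotated right by an even amount. */
int arm_immediate_ok(uint32_t c)
{
    unsigned rot;
    for (rot = 0; rot < 32; rot += 2) {
        /* Undo a rotate-right by rot; the result must fit in 8 bits. */
        if (rotl32(c, rot) <= 0xFFu)
            return 1;
    }
    return 0;
}

/* Decide how a constant could be loaded: MOV, MVN, or via a literal pool. */
LoadKind classify_constant(uint32_t c)
{
    if (arm_immediate_ok(c))
        return LOAD_MOV;
    if (arm_immediate_ok(~c))
        return LOAD_MVN;
    return LOAD_LITERAL_POOL;
}
```

Under this rule 0x18 and 0xFF000000 are valid MOV immediates, 0xFFFFFFFF is reachable as MVN #0, and a value such as 0x55555555 needs another mechanism, for example a load from a literal pool.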


    Loading with LDR Rd,=const

    • The LDR Rd,=const pseudo-instruction can construct any 32-bit numeric constant in a single instruction

    • The LDR pseudo-instruction generates the most efficient code for a specific constant:

      • If the constant can be constructed with a MOV or MVN instruction, the assembler generates the appropriate instruction.

      • If the constant cannot be constructed with a MOV or MVN instruction, the assembler:

        • Places the value in a literal pool.

        • Generates an LDR instruction with a program-relative address that reads the constant from the literal pool.

    • e.g.:

      LDR Rn,[pc,#offset to literal pool]

      ;load register n with one word from the address [pc+offset]

      Literal Pool: A portion of memory embedded in the code to hold constant values.


    LDR & Literal Pool Example

    ; c:\ARM\ADSv1_2\Examples\asm\loadcon.s

            AREA  Loadcon, CODE, READONLY
            ENTRY
    start   BL    func1
            BL    func2
    stop    MOV   r0, #0x18          ; angel_SWIreason_ReportException
            LDR   r1, =0x20026       ; ADP_Stopped_ApplicationExit
            SWI   0x123456           ; ARM semihosting SWI
    func1
            LDR   r0, =42            ; => MOV R0, #42
            LDR   r1, =0x55555555    ; => LDR R1, [PC, #offset to Literal Pool 1]
            LDR   r2, =0xFFFFFFFF    ; => MVN R2, #0
            MOV   pc, lr
            LTORG                    ; Literal Pool 1 contains literal 0x55555555
    func2
            LDR   r3, =0x55555555    ; => LDR R3, [PC, #offset to Literal Pool 1]
            ; LDR r4, =0x66666666    ; If this is uncommented it fails, because
                                     ; Literal Pool 2 is out of reach
            MOV   pc, lr
    LargeTable
            SPACE 4200               ; Starting at the current location, reserves
                                     ; a 4200-byte area of memory cleared to zero
            END                      ; Literal Pool 2 is empty


    Loading Addresses into Registers

    • Direct loading with ADR and ADRL

    • Loading addresses with LDR Rd, =label.


    Direct Loading with ADR

    • The assembler converts an ADR Rn, label pseudo-instruction by generating:

      • A single ADD or SUB instruction that loads the address, if it is in range

      • An error message if the address cannot be reached in a single instruction.

      • The offset range is 255 bytes for an offset to a non-word-aligned address, and 1020 bytes (255 words) for an offset to a word-aligned address.

      • E.g.

        • ADR r2, Label+1000

        • ADR r2, Label+211


    Direct Loading with ADRL

    • The assembler converts an ADRL Rn, label pseudo-instruction by generating:

      • Two data-processing instructions that load the address, if it is in range

      • An error message if the address cannot be constructed in two instructions

      • The range of an ADRL pseudo-instruction is 64 KB for a non-word-aligned address and 256 KB for a word-aligned address.

      • E.g.

        • ADRL r2, Label+4300


    ADR and ADRL Example

    ; c:\ARM\ADSv1_2\Examples\asm\adrlabel.s

            AREA  adrlabel, CODE, READONLY
            ENTRY
    Start   BL    func
    stop    MOV   r0, #0x18          ; angel_SWIreason_ReportException
            LDR   r1, =0x20026       ; ADP_Stopped_ApplicationExit
            SWI   0x123456           ; ARM semihosting SWI
            LTORG                    ; Create a literal pool
    func    ADR   r0, Start          ; => SUB r0, PC, #offset to Start
            ADR   r1, DataArea       ; => ADD r1, PC, #offset to DataArea
            ; ADR r2, DataArea+4300  ; This would fail because the offset
                                     ; cannot be expressed by operand2 of an ADD
            ADRL  r2, DataArea+4300  ; => ADD r2, PC, #offset1
                                     ;    ADD r2, r2, #offset2
            MOV   pc, lr             ; Return
    DataArea
            SPACE 8000               ; Starting at the current location, reserves
                                     ; an 8000-byte area of memory cleared to zero
            END


    Loading Addresses with LDR Rd,=label

    • Can load any 32-bit constant into a register

    • The assembler converts an LDR r0,=label pseudo-instruction by:

      • Placing the address of label in a literal pool (a portion of memory embedded in the code to hold constant values).

      • Generating a program-relative LDR instruction that reads the address from the literal pool, for example:

        LDR Rn, [pc, #offset to literal pool]

        ; load register n with one word from the address [pc + offset]


    Example for LDR Rd, =label

    ; c:\ARM\ADSv1_2\examples\asm\ldrlabel.s

            AREA  LDRlabel, CODE, READONLY
            ENTRY
    start   BL    func1
            BL    func2
    stop    MOV   r0, #0x18
            LDR   r1, =0x20026
            SWI   0x123456
    func1
            LDR   r0, =start         ; => LDR R0, [PC, #offset into Literal Pool 1]
            LDR   r1, =Darea+12      ; => LDR R1, [PC, #offset into Literal Pool 1]
            LDR   r2, =Darea+6000    ; => LDR R2, [PC, #offset into Literal Pool 1]
            MOV   pc, lr
            LTORG                    ; Literal Pool 1
    func2
            LDR   r3, =Darea+6000    ; => LDR R3, [PC, #offset into Literal Pool 1]
                                     ;    (sharing with previous literal)
            ; LDR r4, =Darea+6004    ; If uncommented, produces an error as
                                     ; Literal Pool 2 is out of range
            MOV   pc, lr
    Darea
            SPACE 8000               ; Literal Pool 2 is out of range of the
                                     ; LDR instructions above
            END


    Exercise: Implement a Jump Table

    Trace this program and draw a flow chart.

    ; c:\ARM\ADSv1_2\examples\asm\jump.s

            AREA  Jump, CODE, READONLY
            CODE32
    num     EQU   2                  ; Number of entries in jump table
            ENTRY
    start   MOV   r0, #0             ; Set up the three parameters
            MOV   r1, #3
            MOV   r2, #2
            BL    arithfunc          ; Call the function
    stop    MOV   r0, #0x18
            LDR   r1, =0x20026
            SWI   0x123456
    arithfunc                        ; Label the function
            CMP   r0, #num           ; Treat function code as unsigned integer
            MOVHS pc, lr             ; If code is >= num then simply return
            ADR   r3, JumpTable      ; Load address of jump table
            LDR   pc, [r3, r0, LSL #2] ; Jump to the appropriate routine
            LTORG
    JumpTable
            DCD   DoAdd
            DCD   DoSub
    DoAdd
            ADD   r0, r1, r2         ; Operation 0
            MOV   pc, lr             ; Return
    DoSub
            SUB   r0, r1, r2         ; Operation 1
            MOV   pc, lr             ; Return
            END                      ; Mark the end of the file
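The same dispatch pattern can be written in C with an array of function pointers, which may help when tracing the assembly. This is a rough equivalent for comparison, not a translation produced by any tool; the names `ArithFn`, `jump_table`, and `NUM_ENTRIES` are assumptions for the sketch, and the out-of-range case returns the function code unchanged because the assembly returns with r0 untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t (*ArithFn)(int32_t, int32_t);

static int32_t do_add(int32_t a, int32_t b) { return a + b; }   /* DoAdd */
static int32_t do_sub(int32_t a, int32_t b) { return a - b; }   /* DoSub */

/* JumpTable: one entry per operation, indexed by the function code. */
static const ArithFn jump_table[] = { do_add, do_sub };

#define NUM_ENTRIES (sizeof jump_table / sizeof jump_table[0])

/* code plays the role of r0 on entry; a and b play the roles of r1 and r2. */
int32_t arithfunc(uint32_t code, int32_t a, int32_t b)
{
    if (code >= NUM_ENTRIES)          /* CMP r0, #num / MOVHS pc, lr */
        return (int32_t)code;         /* r0 is returned unchanged */
    assert(jump_table[code] != NULL);
    return jump_table[code](a, b);    /* LDR pc, [r3, r0, LSL #2] */
}
```

The index is scaled by the pointer size automatically in C, which corresponds to the `LSL #2` (multiply by 4) applied to r0 in the assembly.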


    Exercise: String Copy

    Trace this program and draw a flow chart.

    ; c:\ARM\ADSv1_2\examples\asm\strcopy.s

            AREA  StrCopy, CODE, READONLY
            ENTRY
    start   LDR   r1, =srcstr        ; Point to first string
            LDR   r0, =dststr        ; Point to second string
            BL    strcopy            ; Call subroutine to do copy
    stop    MOV   r0, #0x18
            LDR   r1, =0x20026
            SWI   0x123456
    strcopy
            LDRB  r2, [r1], #1       ; Load byte and update address
            STRB  r2, [r0], #1       ; Store byte and update address
            CMP   r2, #0             ; Check for zero terminator
            BNE   strcopy            ; Keep going if not
            MOV   pc, lr

            AREA  Strings, DATA, READWRITE
    srcstr  DCB   "First string-source", 0
    dststr  DCB   "Second string-destination", 0
            END


    Load and Store Multiple Register Instructions

    • An efficient way of moving the contents of several registers to and from memory

    • Used for block copy and for stack operations at subroutine entry and exit

    • The advantages compared with single load/store instructions include:

      • Smaller code size

      • Only a single instruction fetch overhead

      • On uncached ARM processors, the first word of data transferred by a load or store multiple is always a nonsequential memory cycle, but all subsequent words transferred can be sequential memory cycles.


    LDM and STM Instructions

    • The load (or store) multiple instruction loads (stores) any subset of the 16 general-purpose registers from (to) memory, using a single instruction.

    • Syntax:

      • LDM|STM{cond} address-mode Rn{!}, reg-list{^}

      • !: Specifies base register write-back. If this is specified, the address in the base register is updated after the transfer.

      • ^: Specifies that the CPSR is restored from the SPSR. It must be used only from a privileged mode (i.e. other than User mode).

    • E.g.

      • LDMIA r0!, {r2-r9}

      • STMIA r1, {r0-r10}


    Addressing mode for ldm stm

    Addressing Mode for LDM/STM

    • IA: Increment after

    • IB: Increment before

    • DA: Decrement after

    • DB: Decrement before

    • Full or empty: The stack pointer can either point to the last item in the stack (a full stack), or the next free space on the stack (an empty stack).

    • Note: The ARM-Thumb Procedure Call Standard (ATPCS), and the ARM and Thumb C and C++ compilers, always use a full descending stack.
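
    The four addressing modes differ only in where the block of words starts relative to the base register and in which direction the base moves on write back. The sketch below is a host-side C model of that arithmetic, not ARM code; the names ldm_mode, ldm_lowest_address and ldm_writeback are hypothetical and chosen for illustration. With a full descending stack, STMFD corresponds to DB and LDMFD to IA.

    ```c
    #include <assert.h>
    #include <stdint.h>

    /* Addressing modes of LDM/STM (illustrative names). */
    typedef enum { MODE_IA, MODE_IB, MODE_DA, MODE_DB } ldm_mode;

    /* Lowest address touched when n_regs words are transferred from base.
     * Registers are always stored in ascending order from this address. */
    uint32_t ldm_lowest_address(ldm_mode mode, uint32_t base, unsigned n_regs)
    {
        assert(n_regs >= 1 && n_regs <= 16);
        switch (mode) {
        case MODE_IA: return base;                    /* increment after   */
        case MODE_IB: return base + 4;                /* increment before  */
        case MODE_DA: return base - 4 * n_regs + 4;   /* decrement after   */
        default:      return base - 4 * n_regs;       /* decrement before  */
        }
    }

    /* Value written to the base register when '!' is specified. */
    uint32_t ldm_writeback(ldm_mode mode, uint32_t base, unsigned n_regs)
    {
        if (mode == MODE_IA || mode == MODE_IB)
            return base + 4 * n_regs;
        return base - 4 * n_regs;
    }
    ```

    For example, LDMIA r0!, {r2-r9} with r0 = 0x1000 reads eight words starting at 0x1000 and leaves r0 = 0x1020.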


    Example addressing mode for ldm stm1

    Example Addressing mode for LDM/STM


    Stacking registers for nested subroutines

    Stacking Registers for Nested Subroutines

    Subroutine
        STMFD   sp!, {r5-r7,lr}     ; Push work registers and lr
        ……
        ; code (Note: your code may now use r5, r6, r7 and lr)
        ….
        BL      xxxx_xxxx
        ……
        ; code
        ……
        LDMFD   sp!, {r5-r7,pc}     ; Pop work registers and pc


    Using macro

    Using Macro

            MACRO
    $Label  TestAndBranch   $dest, $reg, $cc
    $Label  CMP     $reg, #0
            B$cc    $dest
            MEND

    This macro can be invoked as follows:

    test    TestAndBranch   NonZero, r0, NE
    NonZero

    After substitution this becomes:

    test    CMP     r0, #0
            BNE     NonZero
    NonZero


    Embedded System Architecture and Software Design ---using ARM

      Module #4-1: ARM-Thumb Interworking


    Outline

    Outline


    Arm thumb procedure call standard atpcs

    ARM-Thumb Procedure Call Standard (ATPCS)

    • To ensure that separately compiled or assembled subroutines can work together.

    • Register roles and names

    • The stack

      • A full, descending stack

      • Eight-byte alignment at all external interfaces.


    Parameter passing

    Parameter Passing

    • Nonvariadic routine: A routine with a fixed number of arguments.

      • The first four integer arguments are allocated to r0-r3 in order.

      • Remaining parameters are allocated to the stack in order.

    • Variadic routine: A routine with a variable number of arguments.

      • a1-a4, a1 first.

      • The stack, lowest address first.

        #include <stdarg.h>

        int ShowVar(char *szTypes, ...)
        {
            va_list vl;
            va_start(vl, szTypes);
            ...
            va_end(vl);
            return 0;
        }

        int main(void)
        {
            /* "fcsi": one float, one char, one string, one int */
            ShowVar("fcsi", 32.4f, 'a', "Test string", 4);
            return 0;
        }


    When to use interworking

    When to Use Interworking

    • Code density

      • Thumb state has better code density

    • Speed consideration

      • Running in ARM state has better efficiency than Thumb state.

    • Functionality

      • Thumb instructions are less flexible than their ARM equivalents.

      • Some operations are not possible in thumb state.

        • e.g. enabling or disabling interrupts requires a state change to ARM.

    • Exception handling

      • The processor automatically enters ARM state when a processor exception occurs.

      • This means that the first part of an exception handler must be coded with ARM instructions, even if it re-enters Thumb state to carry out the main processing of the exception.

    • Standalone Thumb programs

      • A Thumb-capable ARM processor always starts in ARM state.


    Non interworking function call

    Non-interworking Function Call

    • Implementing a function call usually requires two steps:

      • Store the return address in the link register (LR)

      • Branch to the address of the required function

    void mouse()
    {
        monkey();
    }

    void monkey()
    {
        return;
    }

    ; mouse()
        BL      monkey

    ; monkey()
        MOV     pc, lr


    Arm thumb interworking function call

    ARM/Thumb Interworking Function Call

    • Use BX or BLX

    • BX does not store return address in LR automatically.

      • -> return from function by BX LR to original state.

    • v5T supports BLX


    Blx example

    BLX Example

    • A call to Thumb subroutine:

          CODE32              ; ARM code follows
          …..
          BLX     TSUB        ; call Thumb routine
          ….
          CODE16              ; start of Thumb code
      TSUB
          ….                  ; Thumb subroutine
          BX      r14         ; return


    Arm thumb interworking example

    ARM/Thumb Interworking Example

        AREA    AddReg, CODE, READONLY  ; Name this block of code.
        ENTRY                           ; Mark first instruction to call.
    main
        ADR     r0, ThumbProg + 1       ; Generate branch target address
                                        ; and set bit 0, hence arrive
                                        ; at target in Thumb state.
        BX      r0                      ; Branch exchange to ThumbProg
        CODE16                          ; Subsequent instructions are Thumb code
    ThumbProg
        MOV     r2, #2                  ; Load r2 with value 2
        MOV     r3, #3                  ; Load r3 with value 3
        ADD     r2, r2, r3              ; r2 = r2 + r3
        ADR     r0, ARMProg
        BX      r0
        CODE32                          ; Subsequent instructions are ARM code.
    ARMProg
        MOV     r4, #4
        MOV     r5, #5
        ADD     r4, r4, r5
    Stop
        MOV     r0, #0x18               ; angel_SWIreason_ReportException
        LDR     r1, =0x20026            ; ADP_Stopped_ApplicationExit
        SWI     0x123456                ; ARM semihosting SWI
        END                             ; Mark end of this file
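
    The listing relies on bit 0 of the BX operand to select the instruction set: ThumbProg + 1 enters Thumb state, while the plain address of ARMProg returns to ARM state. A minimal host-side C sketch of that decoding follows; the type bx_target and the function decode_bx are hypothetical names, and the model ignores the case of an ARM target that is not word aligned.

    ```c
    #include <stdbool.h>
    #include <stdint.h>

    /* Result of decoding the register operand of BX (illustrative type). */
    typedef struct {
        uint32_t address;   /* address at which execution continues        */
        bool     thumb;     /* true: Thumb state, false: ARM state         */
    } bx_target;

    /* Bit 0 of the operand selects the state and is not part of the address. */
    bx_target decode_bx(uint32_t rm)
    {
        bx_target t;
        t.thumb   = (rm & 1u) != 0;
        t.address = rm & ~1u;
        return t;
    }
    ```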


    Arm thumb interworking 1

    Exercise: ARM/Thumb Interworking #1

    • Trace code:

      • \ADSv1_2\example\asm\thumbsub.mcp

      • Monitor CPSR with PC


    C functions interworking

    C Functions & Interworking

    • Compile C to run in Thumb state and support interworking

      • tcc -c -g -O1 -apcs /interwork thumbmain.c

    • Compile C to run in ARM state and support interworking

      • armcc -c -g -O1 -apcs /interwork armsub.c

    • Link

      • armlink thumbmain.o armsub.o -o thumbtoarm.axf -info veneers

        C source        ; By armcc -apcs /interwork     ; By tcc -apcs /interwork
        void xxx()      xxx                             xxx
        {                   STMFD sp!, {r4-r11, lr}         PUSH {r4-r7, lr}
            …               …                               …
            func();         BL func                         BL func
            …               …                               …
        }                   LDMFD sp!, {r4-r11, lr}         POP {r4-r7}
                            BX lr                           POP {r3}
                                                            BX r3


    Arm thumb interworking 2

    Exercise: ARM/Thumb Interworking #2

    \ADSv1_2\examples\interwork\*.c

    • ARM (main) code calling a Thumb subroutine

      armcc -c -g -O1 -apcs /interwork armmain.c

      tcc -c -g -O1 -apcs /interwork thumbsub.c

      armlink armmain.o thumbsub.o -o armtothumb.axf -info veneers

    • Thumb (main) code calling an ARM subroutine

      tcc -c -g -O1 -apcs /interwork thumbmain.c

      armcc -c -g -O1 -apcs /interwork armsub.c

      armlink thumbmain.o armsub.o -o thumbtoarm.axf -info veneers

    • Run the code in AXD


    Mixing c assembler

    Mixing C & Assembler

    • Inline Assembler

    • C calls ASM function

    • ASM call C function


    Inline assembler

    Inline Assembler

    • The ARM C compilers support inline assembly language with the __asm specifier.

    • The ARM C++ compilers support the asm syntax proposed in the ANSI C++ Standard, with the restriction that the string literal must be a single string. e.g:

      • asm("instruction[;instruction]");

    • ARM C++ also supports the C compiler __asm syntax.

      __asm
      {
          instruction [; instruction]
          [instruction]
      }


    Restriction of inline assembler

    Restriction of Inline Assembler

    • Not supported:

      • LDR Rn, =expression

      • Label expressions

      • ADR, ADRL

      • & cannot be used to denote hexadecimal constants (use the 0x prefix instead)

      • BX, BLX

    • Cannot write to PC


    String copy example

    String Copy Example

    #include <stdio.h>

    void my_strcpy(const char *src, char *dst)
    {
        int ch;
        __asm
        {
        loop:
    #ifndef __thumb
            // ARM version
            LDRB    ch, [src], #1
            STRB    ch, [dst], #1
    #else
            // Thumb version
            LDRB    ch, [src]
            ADD     src, #1
            STRB    ch, [dst]
            ADD     dst, #1
    #endif
            CMP     ch, #0
            BNE     loop
        }
    }

    int main(void)
    {
        const char *a = "Hello world!";
        char b[20];
        __asm
        {
            MOV     R0, a
            MOV     R1, b
            BL      my_strcpy, {R0, R1}
        }
        printf("Original string: %s\n", a);
        printf("Copied string: %s\n", b);
        return 0;
    }


    Some issues to use inline assembler

    Some Issues to Use Inline Assembler

    • Use r0-r3, ip, lr and CPSR with caution.

      • E.g.

        int funct(int x)
        {
            ….
            __asm
            {
                add r0, r0, #1      // wrong: we can't assume x is in r0
                add x, x, #1        // correct usage
            }
        }

    • Don't save and restore physical registers; the compiler does this where needed.

      • E.g.

        int funct(int x)
        {
            ….
            __asm
            {
                stmfd sp!, {r0}     // save r0 (unnecessary)
                add r0, x, #1
                eor x, r0, x
                ldmfd sp!, {r0}     // restore r0 (unnecessary)
            }
        }


    Asm calls c example

    ASM calls C example

    int g(int a, int b, int c, int d, int e)
    {
        return a+b+c+d+e;
    }

    ; int f(int i) {return g(i,2*i,3*i,4*i,5*i);}
        EXPORT  f
        AREA    f, CODE, READONLY
        IMPORT  g                   ; i is in r0
        STR     lr, [sp, #-4]!      ; preserve lr
        ADD     r1, r0, r0          ; compute 2*i (2nd param)
        ADD     r2, r1, r0          ; compute 3*i (3rd param)
        ADD     r3, r1, r2          ; compute 5*i
        STR     r3, [sp, #-4]!      ; 5th param on stack
        ADD     r3, r1, r1          ; compute 4*i (4th param)
        BL      g                   ; branch to C function
        ADD     sp, sp, #4          ; remove 5th param
        LDR     pc, [sp], #4        ; return
        END


    Access c global variables

    Access C Global Variables

    • Use “IMPORT”

    • Use “LDR/STR”

    • E.g.

          AREA    global_variable, CODE, READONLY
          EXPORT  asmsub
          IMPORT  glob_var
      asmsub
          LDR     r1, =glob_var   ; read address of glob_var
                                  ; into r1 from literal pool
          LDR     r0, [r1]
          ADD     r0, r0, #10
          STR     r0, [r1]
          MOV     pc, lr
          END


    C call asm example install directory example asm as strtest c and scopy s

    C Calls ASM Example (strtest.c and scopy.s in install_directory\example\asm)

        AREA    SCopy, CODE, READONLY
        EXPORT  strcopy
    strcopy                     ; r0 points to destination string.
                                ; r1 points to source string.
        LDRB    r2, [r1], #1    ; Load byte and update address.
        STRB    r2, [r0], #1    ; Store byte and update address.
        CMP     r2, #0          ; Check for zero terminator.
        BNE     strcopy         ; Keep going if not.
        MOV     pc, lr          ; Return.
        END

    #include <stdio.h>

    extern void strcopy(char *d, const char *s);

    int main()
    {
        const char *srcstr = "First string - source";
        char dststr[] = "Second string - destination";
        /* dststr is an array since we're going to change it */

        printf("Before copying:\n");
        printf(" %s\n %s\n", srcstr, dststr);
        strcopy(dststr, srcstr);
        printf("After copying:\n");
        printf(" %s\n %s\n", srcstr, dststr);
        return (0);
    }


    Inline irq controller

    Exercise: Inline IRQ Controller

    • Write two inline assembler subroutines, one to disable interrupts and one to enable them.

      __inline void enable_IRQ(void)
      {
          …..
      }

      __inline void disable_IRQ(void)
      {
          …..
      }

    • Call them from the C main function.


    Embedded System Architecture and Software Design ---using ARM

    Module #4-2: ARM Exception Handler


    Exception type vector address

    Exception Type & Vector Address


    Exception handling by arm core

    Exception Handling by ARM Core

    Priority        Exception
    Highest  1      Reset
             2      Data Abort
             3      FIQ
             4      IRQ
             5      Prefetch Abort
    Lowest   6      Undefined instruction, SWI

    • When an exception occurs, the banked versions of R14 and the SPSR for the exception mode are used to save state as follows:

      R14_<exception_mode> = return link
      SPSR_<exception_mode> = CPSR
      CPSR[4:0] = exception mode number
      CPSR[5] = 0             /* Execute in ARM state */
      if <exception_mode> == Reset or FIQ then
          CPSR[6] = 1         /* Disable fast interrupts */
      /* else CPSR[6] is unchanged */
      CPSR[7] = 1             /* Disable normal interrupts */
      PC = exception vector address

    • Summary:

      • Copy the return address into r14_mode and save CPSR into SPSR_mode

      • Change to the appropriate exception mode

      • PC is forced to the exception vector (0x00 to 0x1C)
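
    The CPSR updates listed above can be checked on a host machine with a small C model. This is only a sketch of the bit manipulation, not of the processor itself; the function name cpsr_on_exception and the disable_fiq flag (set for Reset and FIQ entry) are assumptions made for illustration.

    ```c
    #include <stdint.h>

    /* CPSR[4:0] mode numbers and control bits. */
    enum { MODE_FIQ = 0x11, MODE_IRQ = 0x12, MODE_SVC = 0x13,
           MODE_ABT = 0x17, MODE_UND = 0x1B };
    #define T_BIT 0x20u   /* CPSR[5]: Thumb state   */
    #define F_BIT 0x40u   /* CPSR[6]: FIQ disable   */
    #define I_BIT 0x80u   /* CPSR[7]: IRQ disable   */

    /* Returns the CPSR value after exception entry.
     * disable_fiq must be nonzero for Reset and FIQ, zero otherwise. */
    uint32_t cpsr_on_exception(uint32_t cpsr, uint32_t mode, int disable_fiq)
    {
        cpsr = (cpsr & ~0x1Fu) | mode;   /* CPSR[4:0] = exception mode number   */
        cpsr &= ~T_BIT;                  /* CPSR[5] = 0, execute in ARM state   */
        if (disable_fiq)
            cpsr |= F_BIT;               /* CPSR[6] = 1 only for Reset and FIQ  */
        cpsr |= I_BIT;                   /* CPSR[7] = 1, always                 */
        return cpsr;
    }
    ```

    For instance, an IRQ taken from User mode in Thumb state (CPSR low byte 0x30) leaves the low byte at 0x92: IRQ mode, ARM state, IRQ disabled, FIQ unchanged.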


    Reset exception

    Reset Exception

    1. R14_svc = UNPREDICTABLE value

    2. SPSR_svc = UNPREDICTABLE value

    3. CPSR[4:0] = 0b10011/*enter supervisory mode*/

    4. CPSR[5] = 0/*Execute in ARM state*/

    5. CPSR[6] = 1/*Disable fast interrupts*/

    6. CPSR[7] = 1/*Disable normal interrupts*/

    7. If high vectors configured then

    PC = 0xFFFF0000

    else

    PC = 0x00000000


    Undefined instruction exception

    Undefined Instruction Exception

    • Occurs when the ARM executes a coprocessor instruction and no coprocessor responds.

    • Attempt to execute an instruction that is UNDEFINED.

    • Can be used for software emulation of a coprocessor in a system that does not have physical coprocessor.

    • Action performed:

      • R14_und = address of next instruction after the undefined instruction

      • SPSR_und = CPSR

      • CPSR[4:0] = 0b11011/*enter Undefined mode*/

      • CPSR[5] = 0/*Execute in ARM state*/

      • CPSR[6] /*unchanged*/

      • CPSR[7] = 1/*Disable normal interrupts*/

      • If high vectors configured then

        PC = 0xFFFF0004

        else

        PC = 0x00000004


    Software interrupt

    Software Interrupt

    • Software Interrupt Instruction (SWI) enters supervisor mode.

    • Action performed:

      • R14_svc = address of next instruction after the SWI instruction

      • SPSR_svc = CPSR

      • CPSR[4:0] = 0b10011/*enter supervisor mode*/

      • CPSR[5] = 0/*Execute in ARM state*/

      • CPSR[6] /*unchanged*/

      • CPSR[7] = 1/*Disable normal interrupts*/

      • If high vectors configured then

        PC = 0xFFFF0008

        else

        PC = 0x00000008


    Return from undefined instruction software interrupt exceptions

    Return from Undefined Instruction&Software Interrupt Exceptions

    • Simple

      MOVS    pc, lr

    • In exception handler

      STMFD   sp!, {reglist, lr}
      ….
      LDMFD   sp!, {reglist, pc}^

      ^: Automatically restores CPSR from SPSR.


    Prefetch abort exception

    Prefetch Abort Exception

    • If the processor attempts to fetch an instruction from an illegal address, the instruction is marked as invalid in the pipeline.

    • When the invalid instruction reaches the execute stage, a Prefetch Abort exception is generated.

      • R14_abt = address of aborted instruction +4

      • SPSR_abt = CPSR

      • CPSR[4:0] = 0b10111/*enter Abort mode*/

      • CPSR[5] = 0/*Execute in ARM state*/

      • CPSR[6] /*unchanged*/

      • CPSR[7] = 1/*Disable normal interrupts*/

      • If high vectors configured then

        PC = 0xFFFF000C

        else

        PC = 0x0000000C


    Return from prefetch abort exception

    Return from Prefetch Abort Exception

    • Simple

      SUBS    pc, lr, #4

    • Exception handler

      SUB     lr, lr, #4
      STMFD   sp!, {reglist, lr}
      …..
      LDMFD   sp!, {reglist, pc}^


    Data abort

    Data Abort

    • Signaled by the memory system.

    • Action performed:

      • R14_abt = address of aborted instruction +8

      • SPSR_abt = CPSR

      • CPSR[4:0] = 0b10111/*enter Abort mode*/

      • CPSR[5] = 0/*Execute in ARM state*/

      • CPSR[6] /*unchanged*/

      • CPSR[7] = 1/*Disable normal interrupts*/

      • If high vectors configured then

        PC = 0xFFFF0010

        else

        PC = 0x00000010


    Return from data abort exception

    Return from Data Abort Exception

    • Simple

      SUBS    pc, lr, #8

    • Exception handler

      SUB     lr, lr, #8
      STMFD   sp!, {reglist, lr}
      ….
      LDMFD   sp!, {reglist, pc}^


    Fiq exception handling by arm

    FIQ Exception Handling by ARM

    When an FIQ is detected, the following actions are performed:

    R14_fiq = address of next instruction to be executed +4

    SPSR_fiq = CPSR

    CPSR[4:0] = 0b10001/*enter FIQ mode*/

    CPSR[5] = 0/*Execute in ARM state*/

    CPSR[6] = 1 /*Disable fast interrupts*/

    CPSR[7] = 1/*Disable normal interrupts*/

    If high vectors configured then

    PC = 0xFFFF001C

    else

    PC = 0x0000001C

    To return after servicing the interrupt, use:

    SUBS PC, R14,#4


    Fiq exception handler by programmer

    FIQ Exception Handler by Programmer

    • Your FIQ handler

      SUB     lr, lr, #4
      STMFD   sp!, {r0-r4, lr}
      ….
      ….
      LDMFD   sp!, {r0-r4, pc}^


    Irq exception

    IRQ Exception

    • Action performed:

      • R14_irq = address of next instruction to be executed +4

      • SPSR_irq = CPSR

      • CPSR[4:0] = 0b10010/*enter IRQ mode*/

      • CPSR[5] = 0/*Execute in ARM state*/

      • CPSR[6] /*unchanged*/

      • CPSR[7] = 1/*Disable normal interrupts*/

      • If high vectors configured then

        PC = 0xFFFF0018

        else

        PC = 0x00000018

      • Return

        • SUBS pc, lr, #4
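
    The per-exception slides can be condensed into one lookup table: vector address (low vectors), mode entered, and the amount subtracted from lr on return. The following C sketch is a host-side summary under that reading; exception_info, find_exception and return_address are hypothetical names, and the Reset entry carries a placeholder adjustment because Reset never returns.

    ```c
    #include <stddef.h>
    #include <stdint.h>

    typedef struct {
        const char *name;
        uint32_t    vector;      /* low-vector address                        */
        uint32_t    mode;        /* value written to CPSR[4:0]                */
        uint32_t    lr_adjust;   /* subtracted from lr to form the return pc  */
    } exception_info;

    static const exception_info exceptions[] = {
        { "Reset",                 0x00, 0x13, 0 },  /* no return; placeholder */
        { "Undefined instruction", 0x04, 0x1B, 0 },  /* MOVS pc, lr            */
        { "SWI",                   0x08, 0x13, 0 },  /* MOVS pc, lr            */
        { "Prefetch Abort",        0x0C, 0x17, 4 },  /* SUBS pc, lr, #4        */
        { "Data Abort",            0x10, 0x17, 8 },  /* SUBS pc, lr, #8        */
        { "IRQ",                   0x18, 0x12, 4 },  /* SUBS pc, lr, #4        */
        { "FIQ",                   0x1C, 0x11, 4 },  /* SUBS pc, lr, #4        */
    };

    /* Look up an exception by vector address; NULL for the reserved slot. */
    const exception_info *find_exception(uint32_t vector)
    {
        size_t i;
        for (i = 0; i < sizeof exceptions / sizeof exceptions[0]; i++)
            if (exceptions[i].vector == vector)
                return &exceptions[i];
        return NULL;
    }

    /* Address at which execution resumes after the handler returns. */
    uint32_t return_address(const exception_info *e, uint32_t lr)
    {
        return lr - e->lr_adjust;
    }
    ```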


    Implement a swi handler

    Implement a SWI Handler

    • \ADSv1_2\examples\swi\*

      main.c

      Installs the SWI vector in the exception table, then calls SWIs 0, 1, 2 and 3 via __swi().

      ahandle.s

      Top-level SWI handler written in assembler. Identifies ARM and Thumb SWIs, then passes them to chandle.c for processing.

      chandle.c

      Second-level SWI handler, called from ahandle.s.

      SWIs 0, 1, 2 and 3 execute some simple arithmetic.

      swi.h

      Contains the definitions of __swi(0), __swi(1), __swi(2) and __swi(3).


    Calling swi from an application

    Calling SWI from An Application

    • In assembly, set up required register values then issue SWI.

      MOV     r0, #65

      SWI     0x0         ; call SWI 0x0 with parameter value in r0

    • In C/C++, declare the SWI as a __swi function, and call it.

      • __swi(0) void my_swi(int);

      • my_swi(65);

    • Provided that:

      • Any arguments are passed in r0-r3 only.

      • Any results are returned in r0-r3 only.

        • If there are 2-4 return values, they must be returned in a structure.

          • Declare the function with the __value_in_regs directive.


    Swi function declare usage

    SWI Function Declare&Usage

    // from example\SWI\swi.h

    __swi(0) int multiply_two(int, int);

    __swi(1) int add_two(int, int);

    __swi(2) int add_multiply_two(int, int, int, int);

    struct four_results

    {

    int a;

    int b;

    int c;

    int d;

    };

    __swi(3) __value_in_regs struct four_results many_operations(int, int, int, int);

    // calling example

    struct four_results res_3;

    res_3 = many_operations(1, 2, 3, 4);

    add_two(1, 2);


    Parameters passing

    Parameters Passing

    • We can pass values in and out of a SWI handler written in C, provided that the top-level handler passes the stack pointer into the C function as the second parameter.

      MOV     r1, sp

      BL      C_SWI_Handler

      • Parameter0 = reg[0]

      • Parameter1 = reg[1]

      • Parameter2 = reg[2]

      • Parameter3 = reg[3]

    • Write back

      • reg[0] = updated_value_0

      • reg[1] = updated_value_1

      • reg[2] = updated_value_2

      • reg[3] = updated_value_3
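
    As a rough illustration of this convention, the sketch below shows a C handler that receives the SWI number and a pointer to the stacked r0-r3, reads its parameters from reg[], and writes the result back to reg[0]. The two-parameter signature and the operations for SWI 0 and 1 (multiply and add, following the names in swi.h) are assumptions for illustration; on a host machine the stacked registers are simulated with an ordinary array.

    ```c
    /* Second-level SWI handler sketch.
     * number: SWI number extracted by the top-level handler (in r0).
     * reg:    pointer to the stacked r0-r3 (passed in r1 via MOV r1, sp). */
    void C_SWI_Handler(unsigned number, unsigned *reg)
    {
        switch (number) {
        case 0:                             /* assumed: multiply two values */
            reg[0] = reg[0] * reg[1];       /* result returned in r0        */
            break;
        case 1:                             /* assumed: add two values      */
            reg[0] = reg[0] + reg[1];
            break;
        default:                            /* unknown SWI: leave registers */
            break;
        }
    }
    ```

    Because the handler writes into the stacked copies, the values are loaded back into r0-r3 when the top-level handler restores the registers on return.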


    Identify swi

    Identify SWI

    • When a SWI handler is entered, it must establish which SWI is being called.

      • Load the SWI instruction that caused the exception.

        • LDR r0, [lr, #-4]

      • Extract the SWI number by clearing bits 31-24.

        • BIC r0, r0, #0xFF000000


    Using swi in supervisory mode

    Using SWI in Supervisor Mode

    • Calling a SWI while already in Supervisor mode corrupts lr_svc and spsr_svc.

    • Therefore, we must save lr_svc and spsr_svc before a SWI is called.

      MRS     r0, spsr        ; Get spsr
      STMFD   sp!, {r0}       ; Store spsr onto stack
      ….
      LDMFD   sp!, {r0}       ; Get spsr from stack
      MSR     spsr_cf, r0     ; Restore spsr


    Identify swi from arm thumb

    Identify SWI from Arm/Thumb

    …..

    T_bit   EQU 0x20

    SWI_Handler
        STMFD   sp!, {r0-r3, r12, lr}   ; Store registers
        MOV     r1, sp                  ; Set pointer to parameters
        MRS     r0, spsr                ; Get spsr
        STMFD   sp!, {r0}               ; Store spsr onto stack
        TST     r0, #T_bit              ; Occurred in Thumb state?
        LDRNEH  r0, [lr,#-2]            ; Yes: Load halfword and…
        BICNE   r0, r0, #0xFF00         ; …extract comment field
        LDREQ   r0, [lr,#-4]            ; No: Load word and…
        BICEQ   r0, r0, #0xFF000000     ; …extract comment field
        ; r0 now contains SWI number
        ; r1 now contains pointer to stacked registers
        BL      C_SWI_Handler           ; Call main part of handler

    ….
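
    The two BIC instructions simply mask off everything except the comment field: 24 bits in an ARM SWI and 8 bits in a Thumb SWI. A minimal C equivalent, with the hypothetical helper names swi_number_arm and swi_number_thumb, can be used to check the masks on a host machine.

    ```c
    #include <stdint.h>

    /* ARM SWI: bits 31-24 hold the condition and opcode, bits 23-0 the number. */
    uint32_t swi_number_arm(uint32_t instruction)
    {
        return instruction & 0x00FFFFFFu;   /* same effect as BIC #0xFF000000 */
    }

    /* Thumb SWI: bits 15-8 hold the opcode, bits 7-0 the number. */
    uint32_t swi_number_thumb(uint16_t instruction)
    {
        return instruction & 0x00FFu;       /* same effect as BIC #0xFF00     */
    }
    ```

    For example, the instruction word 0xEF123456 (SWI 0x123456) yields 0x123456.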


    Simple c swi handler

    Simple C SWI Handler

    void C_SWI_Handler(unsigned swi_num)
    {
        switch (swi_num)
        {
        case 0:
            ….
            break;
        case 1:
            ….
            break;
        case 2:
            ….
            break;
        ….
        }
    }


    Install an exception handler during development work

    Install an Exception Handler During Development Work

    unsigned Install_Handler(unsigned routine, unsigned *vector)
    {
        unsigned vec, old_vec;

        vec = (routine - (unsigned)vector - 8) >> 2;  // -8 for prefetching, >>2 for word offset
        if (vec & 0xff000000)                         // is the branch offset out of range?
        {
            printf("Handler greater than 32MBytes from vector");
        }
        vec = 0xea000000 | vec;                       // 0xea000000 is the opcode of the branch instruction
        old_vec = *vector;
        *vector = vec;
        return (old_vec);
    }

    Usage:

    unsigned *swi_vec = (unsigned *)0x08;
    extern void SWI_Handler(void);

    Install_Handler((unsigned) SWI_Handler, swi_vec);
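
    The encoding step can be tried on a host machine by separating it from the write to the vector table. The sketch below uses a hypothetical helper, encode_branch, that takes plain addresses instead of pointers. It handles forward branches only, and its range check is slightly stricter than the one on the slide: it also rejects offsets that would set the sign bit of the 24-bit field.

    ```c
    #include <stdbool.h>
    #include <stdint.h>

    /* Builds "B routine" as it would be placed at vector_addr.
     * Returns false if the forward offset does not fit in the branch field. */
    bool encode_branch(uint32_t routine, uint32_t vector_addr,
                       uint32_t *instruction)
    {
        /* pc reads as vector_addr + 8 when the branch executes;
         * the offset field counts words, hence the shift by 2. */
        uint32_t offset = (routine - vector_addr - 8u) >> 2;

        if (offset & 0xFF800000u)       /* must fit in 23 bits (forward only) */
            return false;

        *instruction = 0xEA000000u | offset;   /* 0xEA = B, condition always  */
        return true;
    }
    ```

    For a handler at 0x8000 installed in the SWI vector at 0x08, the offset is (0x8000 - 0x08 - 8) >> 2 = 0x1FFC, giving the instruction 0xEA001FFC.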


    Exercise: Add a SWI Function

    • Use ADS and AXD to trace the example program in \example\SWI\

    • Add one more operation of your own, swi(4), which returns (regs[0]+regs[1])*10.


    C interrupt handlers

    C Interrupt Handlers

    __irq void IRQHandler(void)
    {
        volatile unsigned int *base = (unsigned int *) 0x80000000;

        if (*base == 1)
        {
            C_int_handler();
        }
        *(base+1) = 0;
    }

    Note: __irq does not provide reentrant processing


    Compiled irq

    Compiled __irq

    ; without __irq
    IRQHandler PROC
        STMFD   sp!, {r4,lr}
        MOV     r4, #0x80000000
        LDR     r0, [r4,#0]
        CMP     r0, #1
        BLEQ    C_int_handler
        MOV     r0, #0
        STR     r0, [r4,#4]
        LDMFD   sp!, {r4,pc}
        ENDP

    ; with __irq
    IRQHandler PROC
        STMFD   sp!, {r0-r4,r12,lr}
        MOV     r4, #0x80000000
        LDR     r0, [r4,#0]
        SUB     sp, sp, #4
        CMP     r0, #1
        BLEQ    C_int_handler
        MOV     r0, #0
        STR     r0, [r4,#4]
        ADD     sp, sp, #4
        LDMFD   sp!, {r0-r4,r12,lr}
        SUBS    pc, lr, #4
        ENDP


    Reentrant interrupt handlers

    Reentrant Interrupt Handlers

    • The steps needed to safely re-enable interrupts in an IRQ handler are:

      1. Construct return address and save on the IRQ stack.

      2. Save the work registers and spsr_IRQ.

      3. Clear the source of the interrupt.

      4. Switch to System mode and re-enable interrupts.

      5. Save User mode link register and non callee-saved registers.

      6. Call the C interrupt handler function.

      7. When the C interrupt handler returns, restore User mode registers and disable interrupts.

      8. Switch to IRQ mode, disabling interrupts.

      9. Restore work registers and spsr_IRQ.

      10. Return from the IRQ.


    Reentrant example

    Reentrant Example

        AREA    INTERRUPT, CODE, READONLY
        IMPORT  C_irq_handler
    IRQ
        SUB     lr, lr, #4          ; construct the return address
        STMFD   sp!, {lr}           ; and push the adjusted lr_IRQ
        MRS     r14, SPSR           ; copy spsr_IRQ to r14
        STMFD   sp!, {r12,r14}      ; save work regs and spsr_IRQ
        ; Add instruction to clear the interrupt here
        ; then re-enable interrupts.
        MSR     CPSR_c, #0x1F       ; switch to SYS mode, FIQ and IRQ enabled.
                                    ; USR mode registers are now current.
        STMFD   sp!, {r0-r3, lr}    ; save lr_USR and non-callee saved registers
        BL      C_irq_handler       ; branch to C IRQ handler.
        LDMFD   sp!, {r0-r3, lr}    ; restore registers
        MSR     CPSR_c, #0x92       ; switch to IRQ mode and disable IRQs. FIQ is still enabled.
        LDMFD   sp!, {r12,r14}      ; restore work regs and spsr_IRQ
        MSR     SPSR_cf, r14
        LDMFD   sp!, {pc}^          ; return from IRQ.
        END


    Embedded System Architecture and Software Design ---using ARM

    Module #4-3: Build ARM ROM Image


    System startup

    System Startup

    After reset, the processor starts executing code from address 0x0 in:

    • SVC mode

    • Interrupts disabled

    • ARM state

      System initialization

      1. Initializing the execution environment, i.e. exception vectors, stacks, memory system, I/O, etc.

      2. Initializing the C library and application (C variables for example).

    Exception


    System memory mapping

    System Memory Mapping

    • ROM at 0x0

      • Simple

      • Slow to handle exceptions

    • RAM at 0x0

      • Complex

      • Fast to handle exceptions


    General process for remap

    General Process for Remap

    • Power on to fetch the RESET vector at 0x0 (from the aliased copy of ROM).

    • Execute the RESET vector:

      LDR PC, =0x0C000004

      This causes a jump to the real address of the next ROM instruction

    • Write to the REMAP register and set REMAP = 1.

    • Complete the rest of the initialization code as described in Initializing the system.


    Initialize system

    Initialize System

    • Reset vector is at 0x0

    • Initialization process = Reset handler, to

      • Set up exception vectors

      • Initialize the memory system

      • Initialize the stack pointer registers

      • Initialize any critical I/O devices

      • Change processor mode if necessary

      • Change processor state if necessary


    Set up exception vectors

    Set up Exception Vectors

    • If ROM is at address 0x0

      • Vectors consist of a sequence of hard-coded instructions to branch to the handlers.

    • If ROM is located elsewhere

      • The vectors must be initialized dynamically by the initialization code.

      • Typically, copy the vector table from ROM to RAM


    Example to set up exception vectors

    Example to Set up Exception Vectors

        ENTRY

        LDR     PC, Reset_Addr
        LDR     PC, Undefined_Addr
        LDR     PC, SWI_Addr
        LDR     PC, Prefetch_Addr
        LDR     PC, Abort_Addr
        NOP                             ; Reserved vector
        LDR     PC, IRQ_Addr
        LDR     PC, FIQ_Addr

        IMPORT  Reset_Handler           ; In init.s
        IMPORT  IRQ_Handler             ; In init.s

    Reset_Addr      DCD     Reset_Handler
    Undefined_Addr  DCD     Undefined_Handler
    SWI_Addr        DCD     SWI_Handler
    Prefetch_Addr   DCD     Prefetch_Handler
    Abort_Addr      DCD     Abort_Handler
                    DCD     0           ; Reserved vector
    IRQ_Addr        DCD     IRQ_Handler
    FIQ_Addr        DCD     FIQ_Handler

    ; *****************
    ; Exception Handlers
    ; *****************
    Undefined_Handler
        B       Undefined_Handler
    SWI_Handler
        B       SWI_Handler
    Prefetch_Handler
        B       Prefetch_Handler
    Abort_Handler
        B       Abort_Handler
    ; IRQ_Handler
    ;   B       IRQ_Handler
    FIQ_Handler
        B       FIQ_Handler

        END


    Initialize the stack pointer registers

    Initialize The Stack Pointer Registers

    • Sp_SVC

      • Must always be initialized.

    • Sp_IRQ

      • Initialize it if IRQ interrupt used.

      • Initialize before interrupts are enabled.

    • Sp_FIQ

      • Initialize it if FIQ interrupt used.

      • Initialize before interrupts are enabled.

    • Sp_ABT

      • Initialize for Data and Prefetch Abort handling

    • Sp_UND

      • Initialize for Undefined Instruction handling.

      • Initialize sp_ABT and sp_UND for debugging purposes.

    • Set up the stack pointer sp_USR when changing to User mode to start executing the application.


    Change processor mode state

    Change Processor Mode & State

    Mode_USR    EQU 0x10
    Mode_FIQ    EQU 0x11
    Mode_IRQ    EQU 0x12
    Mode_SVC    EQU 0x13
    Mode_ABT    EQU 0x17
    Mode_UNDEF  EQU 0x1B
    Mode_SYS    EQU 0x1F

        MSR     CPSR_c, #Mode_IRQ:OR:I_Bit:OR:F_Bit

    • Mode bits

      • Specify the processor mode

      • 10000 User

      • 10001 FIQ

      • 10010 IRQ

      • 10011 SVC

      • 10111 Abort

      • 11011 Undef

      • 11111 System


    Example to initialize the stack pointer registers

    Example to Initialize The Stack Pointer Registers

    Mode_IRQ    EQU 0x12
    I_Bit       EQU 0x80
    F_Bit       EQU 0x40
    RAM_Limit   EQU 0x1000000           ; For 16MByte SDRAM
    SVC_Stack   EQU RAM_Limit
    IRQ_Stack   EQU RAM_Limit-1024
        ….
        ; ---Initialize stack pointer registers
        ; Enter IRQ mode and set up the IRQ stack pointer
        MSR     CPSR_c, #Mode_IRQ:OR:I_Bit:OR:F_Bit
        LDR     SP, =IRQ_Stack
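
    The immediate written to CPSR_c is just the mode number ORed with the interrupt-disable bits. The following host-side C sketch reproduces that arithmetic so the constants can be checked; the function name cpsr_control is hypothetical, while the constant names mirror the EQU definitions above.

    ```c
    #include <stdint.h>

    /* Processor mode numbers, CPSR[4:0]. */
    enum {
        Mode_USR = 0x10, Mode_FIQ = 0x11, Mode_IRQ = 0x12, Mode_SVC = 0x13,
        Mode_ABT = 0x17, Mode_UNDEF = 0x1B, Mode_SYS = 0x1F
    };
    #define I_Bit 0x80u   /* when set, IRQ is disabled */
    #define F_Bit 0x40u   /* when set, FIQ is disabled */

    /* Value for "MSR CPSR_c, #..." given a mode and the two disable flags. */
    uint32_t cpsr_control(uint32_t mode, int irq_disabled, int fiq_disabled)
    {
        uint32_t value = mode;
        if (irq_disabled)
            value |= I_Bit;
        if (fiq_disabled)
            value |= F_Bit;
        return value;
    }
    ```

    Mode_IRQ with both interrupts disabled gives 0xD2, the value used when setting up the IRQ stack; Mode_IRQ with only IRQ disabled gives 0x92, and Mode_SYS with both enabled gives 0x1F, the two constants used in the reentrant handler example.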


    Scatter loading

    Scatter Loading

    • Enable us to specify the memory map of an image to armlink.

    • For more information Refer to:

      • ARM Developer Suite Version 1.2 Linker and Utilities Guide


    Example of scatter loading script

    Example of Scatter Loading Script

    ROM_LOAD 0x0
    {
        ROM_EXEC 0x0
        {
            vectors.o (Vect, +First)
            * (+RO)                     ; All other read-only code is placed after vectors.o
        }
        RAM 0x28000000                  ; Contains RW and ZI data regions
        {
            * (+RW, +ZI)
        }
        HEAP +0 UNINIT                  ; Heap grows from this address
        {
            heap.o (+ZI)
        }
        STACKS 0x28080000 UNINIT        ; Stack grows downward from this address
        {
            stack.o (+ZI)
        }
        UART0 0x16000000 UNINIT
        {
            uart.o (+ZI)
        }
    }


    Example of scatter loading script rom ram remap

    Example of Scatter Loading Script ROM/RAM Remap

    FLASH 0x24000000 0x4000000
    {
        FLASH 0x24000000 0x4000000
        {
            init.o (init, +First)
            * (+RO)
        }
        32bitRAM 0x0000
        {
            vectors.o (Vect, +First)
            * (+RW,+ZI)
        }
        HEAP +0 UNINIT
        {
            heap.o (+ZI)
        }
        STACKS 0x40000 UNINIT
        {
            stack.o (+ZI)
        }
        UART0 0x16000000 UNINIT
        {
            uart.o (+ZI)
        }
    }


    Initialize application

    Initialize Application

    • Initialize nonzero writable data by copying the initializing values (the RW region, from ROM to RAM) to the writable data region.

    • Set all writable data of the ZI region to zero.

    • When the compiler compiles a function called main(), it generates a reference to the symbol __main to force the linker to include the basic C run-time system from the ANSI C library.

      (The symbol __main is marked as an entry point.)
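
    The first two steps amount to one block copy and one block clear. The C sketch below shows the idea on a host machine with ordinary buffers standing in for ROM and RAM; the function name init_rw_zi and its parameters are hypothetical, and in a real image the C library start-up code performs this work using linker-generated region addresses.

    ```c
    #include <stddef.h>
    #include <string.h>

    /* rw_load: where the RW initial values are stored (ROM, load address).
     * rw_exec: where the RW data must live at run time (RAM).
     * zi_base: start of the zero-initialized region (RAM). */
    void init_rw_zi(void *rw_exec, const void *rw_load, size_t rw_size,
                    void *zi_base, size_t zi_size)
    {
        if (rw_exec != rw_load)                 /* nothing to copy if already in place */
            memcpy(rw_exec, rw_load, rw_size);  /* copy RW initial values to RAM       */
        memset(zi_base, 0, zi_size);            /* clear the ZI region                 */
    }
    ```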


    User initial stackheap

    __user_initial_stackheap()

    In /Retarget.c

    #include <rt_misc.h>

    __value_in_regs struct __initial_stackheap __user_initial_stackheap(
            unsigned R0, unsigned SP, unsigned R2, unsigned SL)
    {
        struct __initial_stackheap config;

        config.heap_base = 0x00060000;
        config.stack_base = SP;
        return config;
    }


    Build trace a rom image

    Exercise: Build & Trace a ROM Image

    • Try example\embedded\embed\embed.mcp

    • Building an image by ADS

    • Trace code by AXD


    Case study build an image for wiscore evm

    Case Study:Build An Image for Wiscore EVM

    • Application:

      • 7-segment LED test

      • Set up Timer 0: Interrupt to flash LED 0

      • While loop detect switch & change 7-segment LED number


    S3c4510b initial memory map

    S3C4510B Initial Memory Map

    Figure4-2 Initial System Memory Map (After Reset)


    S3c4510b special register base

    S3C4510B Special Register Base

    • SYSCFG register determines:

      • Start address of the System Manager’s special registers

      • Start address of internal SRAM.

        • (The total special register space in the system memory map is fixed at 64 K bytes.)

    [15:6] Internal SRAM base pointer

    This 10-bit address becomes the upper address of SRAM, A25 through A16. The remaining SRAM address, A15 through A0, is filled with zeros.

    [25:16] Special register bank base pointer

    The resolution of this value is 64K. Therefore, to place the start address at 1800000H (24M), use this formula:

    Setting value = (1800000H/64K)<<16.
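
    The formula can be worked through in C: 1800000H / 64K = 180H, and 180H << 16 = 1800000H, so for a 64K-aligned address the field value in bits [25:16] has the same bit pattern as the address. The sketch below also derives the SRAM field in bits [15:6] from address bits A25-A16, following the description above. The function names are hypothetical, and other SYSCFG bits are not modelled.

    ```c
    #include <stdint.h>

    /* Bits [25:16]: special register bank base, in units of 64K. */
    uint32_t syscfg_special_reg_field(uint32_t start_address)
    {
        return (start_address / 0x10000u) << 16;
    }

    /* Bits [15:6]: internal SRAM base, taken from address bits A25-A16. */
    uint32_t syscfg_sram_field(uint32_t sram_base)
    {
        return ((sram_base >> 16) & 0x3FFu) << 6;
    }
    ```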


    Net start evm memory mapping

    NET-Start! EVM Memory Mapping

    Figure 3-1 NET-Start! Memory Map

    Figure 3-2 Memory Usage


    Net start 7 segment led leds

    NET-Start! 7-segment LED&LEDs


    Gpio assignments

    GPIO Assignments

    Figure 2-8 NET-Start! GPIO assignment

    Note:

    The GPIO_IN1 is used by the bootstrap loader. Selecting the ON position will force the bootstrap loader to run in the diagnostic mode. Otherwise, the embedded Linux will be booted up after system power on.


    S3c4510b special registers for i o port

    S3C4510B Special Registers for I/O Port

    Figure 12-1. I/O Port Function Diagram


    S3c4510b iopmod

    S3C4510B IOPMOD


    S3c4510b iopcon

    S3C4510B IOPCON


    S3c4510b iopdata

    S3C4510B IOPDATA

    Figure 12-4. I/O Port Data Register (IOPDATA)


    Example code to control led

    Example Code to Control LED

    #define SYS_BASE   0x03ff0000
    #define IOPMOD     ((volatile unsigned *)(SYS_BASE + 0x5000))
    #define IOPCON     ((volatile unsigned *)(SYS_BASE + 0x5004))
    #define IOPDATA    ((volatile unsigned *)(SYS_BASE + 0x5008))

    #define GPIO_LED0  0x00010000
    #define GPIO_LED1  0x00020000

    void SetLED(int n, int on)
    {
        /* IOPDATA bit mask for each LED */
        static const unsigned ctrl[] = {GPIO_LED0, GPIO_LED1};

        /* ignore LED numbers outside the table */
        if (0 <= n && n < (int)(sizeof(ctrl) / sizeof(ctrl[0])))
            *IOPDATA = on ? (*IOPDATA | ctrl[n]) : (*IOPDATA & ~ctrl[n]);
    }
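
    Before writes to IOPDATA can drive the LEDs, the two pins have to be configured as outputs in IOPMOD. The sketch below shows one way to do that and to update a single LED bit. It assumes that a 1 in IOPMOD selects output mode, and it takes the register addresses as parameters (a choice made here, not in the slide) so the logic can be tried on a host machine. The names `led_init` and `led_set` are hypothetical.

```c
#include <stdint.h>

#define GPIO_LED0 0x00010000u   /* LED masks taken from the slide */
#define GPIO_LED1 0x00020000u

/* Assumption: a 1 bit in IOPMOD puts the matching pin in output mode. */
void led_init(volatile uint32_t *iopmod)
{
    *iopmod |= GPIO_LED0 | GPIO_LED1;
}

/* Set or clear the data bit of LED n; an out-of-range n is ignored. */
void led_set(volatile uint32_t *iopdata, int n, int on)
{
    static const uint32_t mask[] = { GPIO_LED0, GPIO_LED1 };

    if (n < 0 || n >= (int)(sizeof mask / sizeof mask[0]))
        return;
    if (on)
        *iopdata |= mask[n];
    else
        *iopdata &= ~mask[n];
}
```

    On the target, the callers would pass the IOPMOD and IOPDATA addresses (SYS_BASE + 0x5000 and SYS_BASE + 0x5008). Whether a set bit lights the LED depends on how the board wires it.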


    S3c4510b timer

    S3C4510B Timer


    S3c4510b special register tmod

    S3C4510B Special Register TMOD


    S3c4510b special register tdata

    S3C4510B Special Register TDATA

    TIMER DATA REGISTERS

    The timer data registers, TDATA0 and TDATA1, contain a value that specifies the time-out duration for each timer. The formula for calculating the time-out duration is: (Timer data +1) cycles.
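
    The (Timer data + 1) formula can be turned around to find the register value for a wanted period. The helper below is a sketch that assumes the timer counts at a known clock frequency passed in by the caller. The function name is hypothetical.

```c
#include <stdint.h>

/* TDATA value for a time-out of `ms` milliseconds at a timer clock of
   `clk_hz`. Time-out = (TDATA + 1) cycles, so one is subtracted from
   the cycle count. The result is truncated to 32 bits without an
   overflow check. */
uint32_t tdata_for_period(uint32_t clk_hz, uint32_t ms)
{
    uint64_t cycles = (uint64_t)clk_hz * ms / 1000u;

    return cycles ? (uint32_t)(cycles - 1u) : 0u;
}
```

    For example, a 50 MHz clock and a 500 ms period give 24,999,999, which is one less than the 25,000,000 cycles the time-out should last.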


    S3c4510b special register tcnt

    S3C4510B Special Register TCNT

    TIMER COUNT REGISTERS

    The timer count registers, TCNT0 and TCNT1, contain the current timer 0 and 1 count value, respectively, during normal operation.


    Example code to control timer

    Example Code to Control Timer

    #define TMOD    (*(volatile unsigned *)(SYS_BASE + 0x6000))
    #define TDATA0  (*(volatile unsigned *)(SYS_BASE + 0x6004))

    void Init_Timer()
    {
        TDATA0 = 0x17D7840;   // 25,000,000 MCLK cycles per time-out (0.5 s if MCLK is 50 MHz)
        TMOD = 0x1;           // enable timer0 in interval mode
    }


    S3c4510b interrupt controller

    S3C4510B Interrupt Controller


    S3c4510b special register intmod

    S3C4510B Special Register INTMOD


    S3c4510b special register intpnd

    S3C4510B Special Register INTPND

    INTERRUPT PENDING REGISTER

    The interrupt pending register, INTPND, contains interrupt pending bits for each interrupt source. This register has to be cleared at the top of an interrupt service routine.


    S3c4510b special register intmsk

    S3C4510B Special Register INTMSK


    S3c4510b special register intoffset

    S3C4510B Special Register INTOFFSET

    • INTERRUPT OFFSET REGISTER

      • The interrupt offset register, INTOFFSET, contains the interrupt offset address of the interrupt, which has the highest priority among the pending interrupts.

      • The content of the interrupt offset address is “bit position value of the interrupt source <<2”.

      • If all interrupt pending bits are “0” when you read this register, the return value is “0x00000054”.

    0x00000054 >> 2 = 21 = total number of interrupt sources
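
    Decoding INTOFFSET as described above takes one shift and one range check. The sketch below assumes the 21 sources implied by the 0x00000054 "nothing pending" value. The function name is hypothetical.

```c
#include <stdint.h>

#define INT_SOURCES 21u   /* 0x00000054 >> 2, as stated on the slide */

/* Convert an INTOFFSET reading to an interrupt source number (0..20).
   Returns -1 when the reading means that no interrupt is pending. */
int int_source_from_offset(uint32_t intoffset)
{
    uint32_t n = intoffset >> 2;

    return (n < INT_SOURCES) ? (int)n : -1;
}
```

    A reading of 0x28 decodes to source 10, and the idle value 0x54 decodes to -1.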


    Example code for timer0 interrupt control

    Example Code for Timer0 Interrupt Control

    #define INTMOD     (*(volatile unsigned *)(SYS_BASE + 0x4000))
    #define INTPND     (*(volatile unsigned *)(SYS_BASE + 0x4004))
    #define INTMSK     (*(volatile unsigned *)(SYS_BASE + 0x4008))
    #define INTOFFSET  (*(volatile unsigned *)(SYS_BASE + 0x4024))

    void interrupt_InitMask()
    {
        INTMSK = 0x1FFBFF;   // enable Timer0 interrupt (only bit 10 unmasked)
        INTPND = 0x1FFFFF;   // clear any pending interrupts (write 1 to clear)
    }

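    IRQ handler.s

        STMFD   sp!, {r1-r3}      ; save work registers
        LDR     r3, =INTOFFSET
        LDR     r2, [r3]          ; r2 = offset of the highest-priority pending interrupt
        MOV     r2, r2, LSR #2    ; r2 = interrupt source number
        MOV     r1, #0x1
        MOV     r1, r1, LSL r2    ; r1 = pending-bit mask for that source
        LDR     r3, =INTPND
        STR     r1, [r3]          ; clear the pending bit
        LDMFD   sp!, {r1-r3}      ; restore work registers

    Example: Timer0 IRQ

    INTOFFSET = 0b101000

    r2 = 0b101000 >> 2 = 0b1010 = 10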
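
    The constant written to INTMSK and the value the handler stores to INTPND both follow from the source number 10 of Timer0. The sketch below derives them. It assumes that INTMSK bits 0 to 20 each mask one source when set, and that the global mask bit above them stays 0. The helper names are hypothetical.

```c
#include <stdint.h>

#define INT_SOURCE_BITS 0x001FFFFFu   /* bits 0..20, one per interrupt source */
#define INT_TIMER0      10            /* 0b101000 >> 2, from the Timer0 example */

/* INTMSK value that leaves exactly one source unmasked (0 = enabled). */
uint32_t intmsk_enable_only(int source)
{
    return INT_SOURCE_BITS & ~(1u << source);
}

/* Value the handler stores to INTPND to acknowledge one source. */
uint32_t intpnd_ack_mask(int source)
{
    return 1u << source;
}
```

    With `INT_TIMER0`, the first helper returns 0x1FFBFF, the value used in interrupt_InitMask(), and the second returns 0x400, the value the handler builds in r1.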


    Use net start bootloader

    Use Net-Start! Bootloader

    • Commands:

      • HELP – Help

      • UNIT – Set access unit

      • DUMP – Memory dump

      • COPY – Memory copy

      • FILL – Fill memory

      • POKE – Poke memory

      • PEEK – Peek memory

      • TX – Send XMODEM file

      • RX – Receive XMODEM file

      • GO – Execute binary

      • INFO – Print system information

      • SWITCH – Switch mode

    • Load image:

      • Select SW2 to OFF position

      • Press RESET button

      • Terminal settings: baud rate 19200, 8N1

      • RUN>rx 0x10000

      • RUN>copy 0x10000 0x20000 0x1810000

      • RUN>go 0x1810000


    Timer 2

    Exercise: Timer 2

    • Based on the previous case study, add Timer 2 functionality so that Timer 2 controls another LED.


    Reference2

    Reference

    • ARM Developer Suite Version 1.2 Developer Guide

    • ARM Developer Suite Version 1.2 Linker and Utilities Guide

    • Wiscore NET-Start! User Guide

    • SAMSUNG S3C4510B User Manual

