Embedded Processor Architecture and Programming. 王建民, Institute of Information Science, Academia Sinica, July 2008. Contents: Introduction, Computer Architecture, ARM Architecture, Development Tools, GNU Development Tools, ARM Instruction Set, ARM Assembly Language, ARM Assembly Programming, GNU ARM ToolChain, Interrupts and Monitor. Lecture 3: ARM Architecture.

Embedded Processor Architecture and Programming

王建民

Institute of Information Science, Academia Sinica

July 2008

Contents
  • Introduction
  • Computer Architecture
  • ARM Architecture
  • Development Tools
  • GNU Development Tools
  • ARM Instruction Set
  • ARM Assembly Language
  • ARM Assembly Programming
  • GNU ARM ToolChain
  • Interrupts and Monitor
Outline
  • Overview
  • ARM Architecture
  • ARM Processor Core
Introduction to ARM
  • Advanced RISC Machines
    • Founded in November 1990
    • Spun out of Acorn Computers
  • Designs the ARM range of RISC processor cores
  • Licenses ARM core designs to semiconductor partners who fabricate and sell to their customers.
    • ARM does not fabricate silicon itself
  • Also develops technologies to assist with the design-in of the ARM architecture
    • Software tools, boards, debug hardware, application software, bus architectures, peripherals, etc.
Why ARM here?
  • ARM cores are the most widely licensed, and therefore the most widespread, processor cores in the world.
  • Used especially in portable devices due to low power consumption and reasonable performance (MIPS/watt)
  • Several interesting extensions are available or in development, such as the Thumb instruction set and the Jazelle Java machine
History of the ARM Architecture
  • Architectures 1, 2 and 3
    • Early ARM architectures
  • Architecture 4
    • Halfword and signed halfword / byte support
    • System mode
    • Cores: SA-110, SA-1110
  • Architecture 4T
    • Thumb instruction set
    • Cores: ARM7TDMI, ARM9TDMI, ARM720T, ARM940T
  • Architecture 5TE
    • Improved ARM/Thumb interworking
    • CLZ
    • Saturated maths
    • DSP multiply-accumulate instructions
    • Cores: ARM1020E, XScale, ARM9E-S, ARM966E-S
  • Architecture 5TEJ
    • Jazelle: Java bytecode execution
    • Cores: ARM9EJ-S, ARM926EJ-S, ARM7EJ-S, ARM1026EJ-S
  • Architecture 6
    • SIMD instructions
    • Multi-processing
    • V6 memory architecture (VMSA)
    • Unaligned data support
    • Core: ARM1136EJ-S

Example ARM-based System

[Figure: block diagram of an ARM core with nIRQ and nFIQ inputs, an interrupt controller, peripherals, I/O, 32 bit RAM, 16 bit RAM and 8 bit ROM]
AMBA
  • AMBA
    • Advanced Microcontroller Bus Architecture
  • ADK
    • Complete AMBA Design Kit
  • ACT
    • AMBA Compliance Testbench
  • PrimeCell
    • AMBA compliant peripherals

[Figure: AMBA system block diagram. Labels: system bus (AHB or ASB), peripheral bus (APB), bridge, arbiter, reset, ARM, TIC, timer, remap/pause, decoder, interrupt controller, on-chip RAM, bus interface, external bus interface, external ROM, external RAM]
The RealView Product Families
  • Compilation Tools
    • ARM Developer Suite (ADS): compilers (C/C++, ARM & Thumb), linker and utilities
    • RealView Compilation Tools (RVCT)
  • Debug Tools
    • AXD (part of ADS)
    • Trace Debug Tools
    • Multi-ICE
    • Multi-Trace
    • RealView Debugger (RVD)
    • RealView ICE (RVI)
    • RealView Trace (RVT)
  • Platforms
    • ARMulator (part of ADS)
    • Integrator™ Family
    • RealView ARMulator ISS (RVISS)

ARM Debug Architecture
  • EmbeddedICE Logic
    • Provides breakpoints and processor/system access
  • JTAG interface (ICE)
    • Converts debugger commands to JTAG signals
  • Embedded Trace Macrocell (ETM)
    • Compresses real-time instruction and data access trace
    • Contains ICE features (trigger & filter logic)
  • Trace port analyzer (TPA)
    • Captures trace in a deep buffer

[Figure: debug setup. Labels: debugger (+ optional trace tools), Ethernet, JTAG port, trace port, TAP controller, EmbeddedICE Logic, ETM, ARM core]

Outline
  • Overview
  • ARM Architecture
  • ARM Processor Core
ARM Architecture
  • 32-bit RISC-processor core
    • Fixed length 32-bit instructions
    • 3-address instruction format
    • Load/store architecture
    • Pipelined execution (ARM7: 3 stages)
  • Cache (depending on the implementation)
  • Bus structure
    • Von Neumann-type bus structure (ARM7)
    • Harvard-type bus structure (ARM9)
  • Coprocessor support
  • Simple structure, hence a reasonably good speed/power consumption ratio
ARM Features
  • Operating states
    • ARM: 32-bit ARM instruction set
    • Thumb: 16-bit Thumb instruction set
    • Jazelle cores can also execute Java bytecode
  • Memory formats
    • Little-endian
    • Big-endian
  • 6 data types
  • 7 operating modes
  • 37 integer registers, each 32 bits wide
  • Exception support
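
The two memory formats differ only in the order of bytes within a word. As a minimal sketch (the value 0x12345678 is an arbitrary illustration, not taken from the slides), the following prints the byte stored at each address offset under both formats:

```python
# Byte layout of one 32-bit word under the two memory formats.
WORD = 0x12345678  # arbitrary example value (an assumption for illustration)

little = WORD.to_bytes(4, "little")  # least significant byte at the lowest address
big = WORD.to_bytes(4, "big")        # most significant byte at the lowest address

for offset in range(4):
    print(f"addr+{offset}: little={little[offset]:#04x} big={big[offset]:#04x}")
```

The same word therefore reads back differently byte by byte, which matters whenever data is exchanged between systems configured for different formats.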
Data Types
  • The ARM is a 32-bit architecture.
  • When used in relation to the ARM:
    • Byte means 8 bits
    • Halfword means 16 bits (two bytes), aligned on 2-byte boundary
    • Word means 32 bits (four bytes), aligned on 4-byte boundary
  • Both signed and unsigned data types are supported.
  • ARM coprocessor supports floating point values.
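
The alignment rules above amount to requiring that an address be a multiple of the data size. A small sketch, where the helper name `is_aligned` and the sample addresses are hypothetical:

```python
# Sizes in bytes of the data types named above.
SIZES = {"byte": 1, "halfword": 2, "word": 4}

def is_aligned(address: int, kind: str) -> bool:
    """Return True if `address` is a multiple of the size of `kind`."""
    return address % SIZES[kind] == 0

# Hypothetical sample addresses.
print(is_aligned(0x1002, "halfword"))  # on a 2-byte boundary
print(is_aligned(0x1002, "word"))      # not on a 4-byte boundary
```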
Processor Modes
  • The ARM has seven basic operating modes:
    • User: unprivileged mode under which most tasks run
    • FIQ: entered when a high priority (fast) interrupt is raised
    • IRQ: entered when a low priority (normal) interrupt is raised
    • Supervisor: entered on reset and when a Software Interrupt instruction is executed
    • Abort: used to handle memory access violations
    • Undef: used to handle undefined instructions
    • System: privileged mode using the same registers as user mode
      • Not in ARM Architectures 1, 2 or 3
Privileged Modes
  • Most programs operate in User mode.
  • Modes other than User mode are collectively known as privileged modes.
  • Privileged modes are used to service interrupts or exceptions, or to access protected resources.
  • Privileged modes have more rights to memory systems and coprocessor.
Registers
  • ARM has 37 registers, all of which are 32 bits long.
    • 1 dedicated program counter
    • 1 dedicated current program status register
    • 5 dedicated saved program status registers
    • 30 general purpose registers
  • The current processor mode governs which of several banks is accessible. Each mode can access
    • a particular set of r0-r12 registers
    • the stack pointer, r13 (sp), and the link register, r14 (lr)
    • the program counter, r15 (pc)
    • the current program status register, cpsr
  • Privileged modes (except System) can also access
    • a particular spsr (saved program status register)
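
The total of 37 can be checked by counting the banked copies. The sketch below assumes the usual banking scheme (FIQ has private r8-r14; IRQ, Supervisor, Abort and Undef each have private r13 and r14; System shares the User registers); the variable names are illustrative only:

```python
# Private (banked) general-purpose registers per exception mode.
# System mode is omitted because it shares the User mode registers.
BANKED = {
    "FIQ": 7,         # r8-r14
    "IRQ": 2,         # r13, r14
    "Supervisor": 2,  # r13, r14
    "Abort": 2,       # r13, r14
    "Undef": 2,       # r13, r14
}
USER_GENERAL = 15  # r0-r14 as seen in User mode

general = USER_GENERAL + sum(BANKED.values())  # general purpose registers
spsr = len(BANKED)                             # one SPSR per exception mode
total = general + 1 + 1 + spsr                 # plus the pc and the cpsr

print(general, spsr, total)
```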
ARM Register Set
  • Current visible registers (Abort mode shown)
    • r0-r12
    • r13 (sp), r14 (lr)
    • r15 (pc)
    • cpsr, spsr
  • Banked out registers
    • User: r13 (sp), r14 (lr)
    • FIQ: r8-r12, r13 (sp), r14 (lr), spsr
    • IRQ: r13 (sp), r14 (lr), spsr
    • SVC: r13 (sp), r14 (lr), spsr
    • Undef: r13 (sp), r14 (lr), spsr

Register Organization Summary
  • User: r0-r12, r13 (sp), r14 (lr), r15 (pc), cpsr
  • FIQ: User mode r0-r7, r15, and cpsr; own r8-r12, r13 (sp), r14 (lr), spsr
  • IRQ: User mode r0-r12, r15, and cpsr; own r13 (sp), r14 (lr), spsr
  • SVC: User mode r0-r12, r15, and cpsr; own r13 (sp), r14 (lr), spsr
  • Undef: User mode r0-r12, r15, and cpsr; own r13 (sp), r14 (lr), spsr
  • Abort: User mode r0-r12, r15, and cpsr; own r13 (sp), r14 (lr), spsr
  • Thumb state
    • Low registers: r0-r7
    • High registers: r8-r15

Note: System mode uses the User mode register set

Example: User to FIQ Mode
  • User mode, before the exception
    • Registers in use: r0-r12, r13 (sp), r14 (lr), r15 (pc), cpsr
    • Banked out: r8_fiq-r14_fiq, spsr_fiq
  • FIQ mode, after the exception
    • Registers in use: r0-r7, r8_fiq-r14_fiq, r15 (pc), cpsr, spsr_fiq
    • Banked out: User mode r8-r12, r13 (sp), r14 (lr)
  • Return address calculated from User mode PC value and stored in FIQ mode LR
  • User mode CPSR copied to FIQ mode SPSR
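
Register banking can be pictured as a lookup from (mode, register number) to a physical register. The sketch below is only a model: the function name `physical_register` and the `rN_mode` naming of banked copies are conventions chosen here for illustration.

```python
# Register numbers that have a private copy in each exception mode.
# User and System are absent because they use the unbanked set.
BANKED_REGS = {
    "FIQ": set(range(8, 15)),  # r8-r14
    "IRQ": {13, 14},
    "SVC": {13, 14},
    "Abort": {13, 14},
    "Undef": {13, 14},
}

def physical_register(mode: str, n: int) -> str:
    """Name the physical register that rN refers to while in `mode`."""
    if not 0 <= n <= 15:
        raise ValueError("register number must be in 0..15")
    if n in BANKED_REGS.get(mode, set()):
        return f"r{n}_{mode.lower()}"
    return f"r{n}"

print(physical_register("User", 8), physical_register("FIQ", 8))
```

Because r8 to r14 resolve to private copies in FIQ mode, a FIQ handler can use them without first saving the interrupted program's values.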

Access Registers using Instructions
  • The currently accessible registers are not subdivided further.
    • All instructions can access r0-r14 directly.
    • Most instructions also allow use of the PC.
  • Specific instructions to allow access to CPSR and SPSR.
  • When in a privileged mode, it is also possible to load / store the (banked out) user mode registers to or from memory.
    • See later for details.
Program Status Registers 1
  • The program status registers
    • Condition code flags: hold information about the most recently performed ALU operation.
    • Interrupt disable bits: control the enabling and disabling of interrupts.
    • T-bit: reflects the operating state.
    • Mode bits: set the processor operating mode.
    • Reserved bits: unused.
  • To maintain compatibility with future ARM processors, you must not alter any of the reserved bits.
Program Status Registers 2
  • Bit layout
    • Bits [31:28]: N, Z, C, V
    • Bit [27]: Q
    • Bit [24]: J
    • Bits [23:8]: undefined
    • Bit [7]: I
    • Bit [6]: F
    • Bit [5]: T
    • Bits [4:0]: mode
    • Fields: f = [31:24], s = [23:16], x = [15:8], c = [7:0]
  • Condition code flags
    • N = Negative result from ALU
    • Z = Zero result from ALU
    • C = ALU operation Carried out
    • V = ALU operation oVerflowed
  • Sticky Overflow flag (Q flag)
    • Architecture 5TEJ only
    • Indicates if saturation has occurred
  • J bit
    • Architecture 5TEJ only
    • J = 1: Processor in Jazelle state
  • Interrupt Disable bits.
    • I = 1: Disables the IRQ.
    • F = 1: Disables the FIQ.
  • T Bit
    • Architecture xT only
    • T = 0: Processor in ARM state
    • T = 1: Processor in Thumb state
  • Mode bits
    • Specify the processor mode
Condition Flags
  • Negative (N = 1)
    • Logical instruction: no meaning
    • Arithmetic instruction: bit 31 of the result has been set; indicates a negative number in signed operations
  • Zero (Z = 1)
    • Logical instruction: result is all zeroes
    • Arithmetic instruction: result of operation was zero
  • Carry (C = 1)
    • Logical instruction: after a shift operation, a '1' was left in the carry flag
    • Arithmetic instruction: result was greater than 32 bits
  • oVerflow (V = 1)
    • Logical instruction: no meaning
    • Arithmetic instruction: result was greater than 31 bits; indicates a possible corruption of the sign bit in signed numbers
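
For an arithmetic instruction, the four flags can be derived from a 32-bit addition as sketched below. This is an illustrative model of flag setting for addition only, not a description of the ALU hardware; the function name `add_flags` is hypothetical.

```python
MASK = 0xFFFFFFFF  # 32-bit word mask

def add_flags(a: int, b: int):
    """Add two 32-bit values and return (result, N, Z, C, V)."""
    a &= MASK
    b &= MASK
    full = a + b                 # unbounded sum, may need 33 bits
    result = full & MASK
    n = result >> 31             # bit 31 of the result
    z = int(result == 0)         # result is all zeroes
    c = int(full > MASK)         # result was greater than 32 bits
    # Signed overflow: both operands share a sign that the result lacks.
    v = ((a ^ result) & (b ^ result)) >> 31
    return result, n, z, c, v

print(add_flags(0x7FFFFFFF, 1))  # largest positive value plus one
print(add_flags(0xFFFFFFFF, 1))  # unsigned wrap-around to zero
```

The first call sets N and V (the signed result changed sign), while the second sets Z and C (the unsigned sum did not fit in 32 bits).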

Mode Bits

M[4:0] Processor Mode

10000 User

10001 FIQ

10010 IRQ

10011 Supervisor

10111 Abort

11011 Undefined

11111 System
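
Given the encodings above, a program status register value can be decoded as sketched here. The sketch assumes the standard bit positions (N, Z, C, V in bits 31 to 28; I, F, T in bits 7, 6, 5; mode in bits 4 to 0); the function name `decode_cpsr` and the sample value are hypothetical.

```python
# Mode encodings from the table above.
MODES = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "Supervisor",
    0b10111: "Abort",
    0b11011: "Undefined",
    0b11111: "System",
}

def decode_cpsr(cpsr: int) -> dict:
    """Split a 32-bit status register value into its named fields."""
    return {
        "N": (cpsr >> 31) & 1,
        "Z": (cpsr >> 30) & 1,
        "C": (cpsr >> 29) & 1,
        "V": (cpsr >> 28) & 1,
        "I": (cpsr >> 7) & 1,   # 1 = IRQ disabled
        "F": (cpsr >> 6) & 1,   # 1 = FIQ disabled
        "T": (cpsr >> 5) & 1,   # 1 = Thumb state
        "mode": MODES.get(cpsr & 0x1F, "unknown"),
    }

print(decode_cpsr(0x600000D3))  # hypothetical sample value
```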

Program Counter (r15)
  • When the processor is executing in ARM state:
    • All instructions are 32 bits wide.
    • All instructions must be word aligned.
    • pc value is stored in bits [31:2] with bits [1:0] undefined.
  • When the processor is executing in Thumb state:
    • All instructions are 16 bits wide.
    • All instructions must be halfword aligned.
    • pc value is stored in bits [31:1] with bit [0] undefined.
  • When the processor is executing in Jazelle state:
    • All instructions are 8 bits wide.
    • Processor performs a word access to read 4 instructions at once.
Link Register (r14)
  • r14 is used as the subroutine link register (LR). When a Branch with Link operation is performed, it stores the return address, calculated from the PC.
  • Thus to return from a linked branch
    • MOV r15, r14

or

    • MOV pc, lr
Exception Handling 1
  • Exceptions arise whenever the normal flow of a program has to be halted temporarily.
  • When an exception occurs, the ARM:
    • Stores the return address in LR_<mode>
    • Copies CPSR into SPSR_<mode>
    • Sets appropriate CPSR bits
      • Change to ARM state
      • Change to exception mode
      • Disable interrupts (if appropriate)
    • Sets PC to fetch the next instruction from the relevant vector address
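
The entry sequence above can be modelled on a toy CPU state. In this sketch the dictionary layout, the function name `enter_exception`, and the sample addresses are all assumptions made for illustration; it also assumes that IRQs are disabled on every exception entry and FIQs only when the caller asks for it.

```python
I_BIT = 1 << 7     # IRQ disable bit
F_BIT = 1 << 6     # FIQ disable bit
T_BIT = 1 << 5     # Thumb state bit
MODE_MASK = 0x1F   # mode bits M[4:0]

def enter_exception(cpu, mode, mode_bits, vector, return_address,
                    disable_fiq=False):
    """Apply the exception-entry steps to a toy CPU state, in place."""
    cpu["lr"][mode] = return_address             # store return address in LR_<mode>
    cpu["spsr"][mode] = cpu["cpsr"]              # copy CPSR into SPSR_<mode>
    cpsr = cpu["cpsr"] & ~(T_BIT | MODE_MASK)    # change to ARM state, clear old mode
    cpsr |= mode_bits | I_BIT                    # change to exception mode, disable IRQ
    if disable_fiq:
        cpsr |= F_BIT                            # disable FIQ where appropriate
    cpu["cpsr"] = cpsr
    cpu["pc"] = vector                           # fetch next from the vector address

# Hypothetical example: an IRQ arrives while running in User mode.
cpu = {"cpsr": 0b10000, "pc": 0x8000, "lr": {}, "spsr": {}}
enter_exception(cpu, "irq", 0b10010, 0x18, return_address=0x8008)
print(hex(cpu["cpsr"]), hex(cpu["pc"]), hex(cpu["lr"]["irq"]))
```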
The Vector Table

Address  Exception
0x00     Reset
0x04     Undefined Instruction
0x08     Software Interrupt
0x0C     Prefetch Abort
0x10     Data Abort
0x14     (Reserved)
0x18     IRQ
0x1C     FIQ

Vector table can be at 0xFFFF0000 on ARM720T and on ARM9/10 family devices
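
A vector address is the table base plus a fixed offset per exception (Reset at 0x00 up to FIQ at 0x1C). The sketch below shows the lookup; the function name `vector_address` and the `high_vectors` flag are hypothetical names for illustration.

```python
# Offset of each exception's vector from the base of the table.
VECTOR_OFFSETS = {
    "Reset": 0x00,
    "Undefined Instruction": 0x04,
    "Software Interrupt": 0x08,
    "Prefetch Abort": 0x0C,
    "Data Abort": 0x10,
    "IRQ": 0x18,
    "FIQ": 0x1C,
}

def vector_address(exception: str, high_vectors: bool = False) -> int:
    """Return the address the PC is set to when `exception` is taken."""
    base = 0xFFFF0000 if high_vectors else 0x00000000
    return base + VECTOR_OFFSETS[exception]

print(hex(vector_address("IRQ")), hex(vector_address("FIQ", high_vectors=True)))
```

Since FIQ occupies the last entry, its handler can start directly at 0x1C without an intervening branch.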

Exception Handling 2
  • Exceptions are always entered in ARM state.
  • After the exception has been processed, the control normally flows back to the original task.
  • To return, the exception handler needs to:
    • Clear the disable interrupt flags that were set on entry
    • Restore CPSR from SPSR_<mode>
    • Restore PC from LR_<mode>
  • The last two steps must happen atomically as part of a single instruction.
Exception Handling 3

Exception  Return instruction
BL         MOV PC, R14
SWI        MOVS PC, R14_svc
UDEF       MOVS PC, R14_und
PABT       SUBS PC, R14_abt, #4
FIQ        SUBS PC, R14_fiq, #4
IRQ        SUBS PC, R14_irq, #4
DABT       SUBS PC, R14_abt, #8
RESET      Not applicable
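
Each return instruction above subtracts a fixed amount from the banked link register. A minimal sketch of that arithmetic, where the function name `return_pc` and the sample link register value are hypothetical:

```python
# Amount subtracted from LR_<mode> by each return instruction in the table.
RETURN_ADJUST = {
    "SWI": 0,
    "UDEF": 0,
    "PABT": 4,
    "FIQ": 4,
    "IRQ": 4,
    "DABT": 8,
}

def return_pc(exception: str, lr: int) -> int:
    """PC value written on return: LR_<mode> minus the adjustment."""
    return (lr - RETURN_ADJUST[exception]) & 0xFFFFFFFF

lr = 0x8008  # hypothetical link register value
for name in RETURN_ADJUST:
    print(name, hex(return_pc(name, lr)))
```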

Quiz #1
  • What registers are used to store the program counter and link register?
  • What is r13 often used to store?
  • Which mode or modes have the fewest registers available? How many, and why?
Outline
  • Overview
  • ARM Architecture
  • ARM Processor Core
ARM7TDMI Organization
  • Register Bank
    • 2 read ports and 1 write port
    • In addition, 1 read port and 1 write port for PC
  • Barrel Shifter
  • ALU
  • Address Register and Incrementer
  • Data Register
  • Instruction Decoder and Control Logic
Pipelined Execution
  • When cycle = 3, PC = 208
    • ADD instruction (addr = 200 = PC - 8) is in the execute stage
    • SUB instruction (addr = 204 = PC - 4) is in the decode stage
    • MOV instruction (addr = 208 = PC) is in the fetch stage

Cycle        1       2       3        4        5        6        7
PC           200     204     208      20C      210      214      218
200  ADD     Fetch   Decode  Execute
204  SUB             Fetch   Decode   Execute
208  MOV                     Fetch    Decode   Execute
20C  AND                              Fetch    Decode   Execute
210  ORR                                       Fetch    Decode   Execute

3-Stage Pipeline
  • Three instructions are in progress simultaneously, each at a different stage
  • For data processing instructions
    • Latency = 3 cycles
    • Throughput = 1 instruction / cycle
  • When accessing PC, PC = address of the instruction being executed + 8
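
The relation between the PC and the instruction being executed can be sketched for an ideal 3-stage pipeline with no stalls. The function name `pipeline_snapshot` is hypothetical, and the model assumes one 4-byte ARM instruction is fetched every cycle.

```python
def pipeline_snapshot(first_address: int, cycle: int) -> dict:
    """Addresses held in each stage at a 1-based cycle, assuming no stalls."""
    fetch = first_address + 4 * (cycle - 1)  # one 32-bit instruction per cycle
    snapshot = {"pc": fetch, "fetch": fetch}
    if cycle >= 2:
        snapshot["decode"] = fetch - 4       # fetched in the previous cycle
    if cycle >= 3:
        snapshot["execute"] = fetch - 8      # fetched two cycles earlier
    return snapshot

state = pipeline_snapshot(0x200, 3)
print({stage: hex(addr) for stage, addr in state.items()})
```

Once the pipeline is full, the executing instruction always sits 8 bytes behind the PC, which is why code that reads the PC sees its own address plus 8.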
Data Processing Instructions
  • Operations
    • Arithmetic operations: ADD, SUB, …
    • Logic operations: AND, ORR, …
    • Register operations: MOV, CMP, …
  • Operands
    • Register-Register
    • Register-Immediate
  • All operations can be executed in a single clock cycle.
Multi-Cycle Instructions
  • Data Transfer Instructions: LDR and STR
    • 1st cycle: Compute a memory address similar to a data processing instruction.
    • 2nd cycle: Load data from memory to read data register or store data to memory
    • 3rd cycle: Transfer data from read data register to Register Bank for LDR
  • Branch Instructions: BL
    • 1st cycle similar to address calculation
    • 2nd cycle saves return address
    • 3rd cycle adjusts the value in link register
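
As a rough sketch of the cost of multi-cycle instructions, the total time for a short sequence can be estimated as two cycles to fill the pipeline plus the execute cycles of each instruction. This additive model is an assumption that ignores any other stalls; the per-instruction counts follow the cycle lists above (STR needs only the first two, LDR and BL all three), and the names are hypothetical.

```python
# Execute-stage cycles per instruction kind, following the lists above.
EXECUTE_CYCLES = {
    "data": 1,  # single-cycle data processing instruction
    "STR": 2,   # address calculation, data transfer
    "LDR": 3,   # address calculation, data transfer, write to register bank
    "BL": 3,    # target calculation, save return address, adjust link register
}

def total_cycles(kinds) -> int:
    """Cycles for a non-empty sequence: fetch and decode of the first, then all executes."""
    return 2 + sum(EXECUTE_CYCLES[kind] for kind in kinds)

print(total_cycles(["data", "STR", "data", "data", "data"]))
print(total_cycles(["data", "LDR", "data"]))
```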
Pipelining for STR
  • Memory access once in every cycle
  • Data path used once in every cycle
  • The decoder generates control signals for the data path in the next cycle(s)

Stages per instruction, in program order, over cycles 1 to 8:
ADD  Fetch, Decode, Execute
STR  Fetch, Decode, Addr. calc., Data xfer
AND  Fetch, Decode, Execute
MOV  Fetch, Decode, Execute
CMP  Fetch, Decode, Execute

Pipelining for BL

Stages per instruction, in program order, over cycles 1 to 8:
ADD  Fetch, Decode, Execute
BL   Fetch, Decode, Target calc., Link return, Adjust
?    Fetch, Decode
??   Fetch
AND  Fetch, Decode, Execute
MOV  Fetch, Decode, Execute