Explore hardware architecture in embedded systems including processor selection, memory layout, and I/O devices. Learn about RISC vs. CISC, 8051 architecture, and Nuvoton N76E003 systems.
EMBEDDED SYSTEMS UNIT 2 Arjun Gour & Dr. Limaye ETC Department.
EMBEDDED SYSTEM ARCHITECTURE Hardware and software architecture, Processor selection for Embedded System, Memory Architecture and IO devices, Interrupt Service Mechanism, Context switching, Device Drivers. SVPCET
Hardware Architecture • Hardware architecture describes the arrangement of various components: registers, data paths (buses), processing elements (ALU, FPU, encryptor), memory, control unit. • It describes the system at a high level of abstraction. It contains behavioral and interrelationship information. Implementation details are not shown. • Instruction set determines processing capability. • RISC (Reduced Instruction Set Computer) • CISC (Complex Instruction Set Computer) • Processing elements (word length 8/16/32/64) • Simple: Addition/ Subtraction • Medium: Multiply, divide • Complex: Floating point, trigonometric, special purpose
RISC Vs CISC • RISC • Fewer instructions • Instruction pipelining • One instruction per clock cycle • ALU operands are registers only, memory operations are load/store • CISC • More instructions • Generally no instruction pipelining • Many clock cycles for one instruction • ALU operands can be either register or memory
RISC • Large number of registers are available • More Code for a task • Fixed length instructions • Less silicon • Hardwired control unit • Generally with Harvard memory Architecture • CISC • Limited number of registers are available • Less Code for Task • Variable length instructions • More silicon • Microprogrammed control • Can be Harvard or Von-Neumann Architecture
8051 Architecture
The figure shows the 8051 architecture. The classic version contains: • 12 MHz clock, processor instruction cycle time of 1 microsec (12 clock periods per machine cycle). • 8-bit ALU, internal bus width 8 bit. • CISC architecture. • Special bit manipulation instructions. • Program counter with initial default reset value 0x0000. • Stack pointer with initial default reset value 0x07. • Simple architecture with no floating point processor, no cache, no memory management, no instruction pipelining and no DMA controller.
• Harvard memory architecture: program memory and data memory have separate address spaces and separate control signals. • On-chip RAM is 128 bytes (the 8052 version has 256 bytes). 32 bytes of RAM are also used as four banks of registers. Each bank has 8 registers. • There are special function registers (SFRs). They are PSW (processor status word), A (Accumulator), B register, SP (Stack pointer) and registers of serial IOs, timers, ports and interrupt handler. • Two external interrupts INT0 and INT1. • Four ports P0, P1, P2, P3 of 8 bits each. • It has no PWM and provides no support for DAC. It has no MODEM, no watchdog timer, no ADC.
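The register banks described above occupy internal RAM addresses 0x00 to 0x1F, and the active bank is chosen by the RS1:RS0 bits (bits 4 and 3) of PSW. The following is a minimal sketch of that mapping, written for a PC rather than for an 8051 compiler; the function name `reg_address` is an assumption made for illustration.

```c
#include <stdint.h>

/* PSW bit positions of the bank-select bits on the 8051 */
#define PSW_RS0 (1u << 3)
#define PSW_RS1 (1u << 4)

/* Internal RAM address of register Rn (n = 0..7) for a given PSW value.
 * reg_address is a hypothetical helper, not part of any 8051 library. */
static uint8_t reg_address(uint8_t psw, uint8_t n)
{
    uint8_t bank = (uint8_t)((psw >> 3) & 0x03u);   /* RS1:RS0 -> bank 0..3 */
    return (uint8_t)(bank * 8u + (n & 0x07u));      /* banks occupy 0x00..0x1F */
}
```

With PSW = 0x00 (bank 0), R0 sits at address 0x00; with both RS bits set (bank 3), R7 sits at 0x1F. The reset value SP = 0x07 means the first push lands at 0x08, which is R0 of bank 1, so a program that uses bank 1 usually moves the stack pointer first.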
Advanced architectures : ARM (Details in Unit 3) • History • ARM was developed in 1985 by Acorn Computer Group in the United Kingdom. • Acorn introduced the first RISC processor in 1987, targeting low-cost PCs. • In 1990, Acorn formed Advanced RISC Machines. • ARM, which initially stood for Acorn RISC Machine but later changed to Advanced RISC Machine, defines a 32-bit RISC architecture.
Advanced features • RISC • Auto increment addressing modes • ALU and barrel shifter in same operation • Coprocessors, FPU, MMU • Pipelining (ARM 7 - 3 stage, ARM 9 – 5 stage) • Harvard memory architecture for cache (Arm 9) • 3 instruction sets • ARM 32 bit 58 instr. 40% faster than thumb • THUMB 16 bits 30 instr 30% memory saving • JAZELLE Java virtual machine instr(95%)
Sharc • From Analog Devices ADSP TS203SABP • Two 32 bit ALUs (J, K) for address generation • Two computational blocks (X, Y) • Super Harvard: 3 buses (IJK). Each bus has 128 bit data, 32 bit address • 1 MB on chip RAM, 16 GB off chip • Two sets of 16 registers for fast context switching. • Tiger Sharc – Highest performance. 24 MB on chip RAM, 250 MHz. Used in cellular base stations.
TMS320C64X DSP • 256 bit instruction word (VLIW) • 8 instructions in parallel • 8 execution units: L1, S1, M1, D1, L2, S2, M2, D2 (M = multiplier; L, S, D = ALU) • Two register files A, B. Each file has 32 registers of 32 bits • Fetch unit – 2 instr • Emulation – debugging • L1, L2 cache
Memory architecture • Memory buses • Von Neumann (Princeton) • Harvard • Super Harvard • Types of memory • ROM, EPROM, EEPROM, Flash, SRAM, DRAM • Endianness • Little • Big • DMA • MMU • Cache • Scratch pad memory (Tightly coupled memory-ARM)
Memory bus architecture • Harvard architecture • Separate buses, separate address spaces, and separate data paths for program and data memory (8051, TMS320) • Super Harvard: more than one bus for data • Von Neumann (Princeton) architecture • Common memory for program and data and a single bus (8085) • Mixed • Common main memory but D cache and I cache (ARM)
Compare Harvard and Princeton • Princeton • Less hardware • Less memory bandwidth • Can easily load programs from disk • Can modify program • Harvard • More hardware • More memory bandwidth • Needs a special mechanism to load program • Cannot modify program
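Little endian and big endian • The terms originated in Jonathan Swift’s novel, “Gulliver’s Travels”. • These are two ways of ordering multibyte data in memory. • Little endian = LSB (least significant byte) at lower address • Big endian = MSB (most significant byte) at lower address • Consider data 0x01234567 stored at address 100 in memory. • Little endian: address 100 = 67, 101 = 45, 102 = 23, 103 = 01 • Big endian: address 100 = 01, 101 = 23, 102 = 45, 103 = 67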
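Is it little or big?
#include <stdio.h>

/* Print n bytes starting at 'start', lowest address first. */
void show_mem_rep(const unsigned char *start, int n)
{
    int i;
    for (i = 0; i < n; i++)
        printf(" %.2x", start[i]);   /* unsigned char avoids sign extension */
    printf("\n");
}

int main(void)
{
    int i = 0x01234567;
    show_mem_rep((const unsigned char *)&i, sizeof(i));
    getchar();   /* wait for a key so the console stays open */
    return 0;
}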
Program output • When above program is run on little endian machine, gives • “67 45 23 01” as output • If it is run on big endian machine, gives • “01 23 45 67” as output.
Simple program
#include <stdio.h>

int main(void)
{
    unsigned int i = 1;      /* 4-byte int in memory: 01 00 00 00 or 00 00 00 01 */
    char *c = (char *)&i;    /* points at the lowest-addressed byte */
    if (*c)
        printf("Little endian");   /* LSB first = 01 */
    else
        printf("Big endian");      /* MSB first = 00 */
    getchar();
    return 0;
}
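When data is exchanged between machines of different endianness, a common approach is to fix the byte order of the stored or transmitted data and build it with shifts, so the code does not depend on the host. The following is a minimal sketch under that assumption; `store_be32` and `load_be32` are hypothetical names chosen for illustration.

```c
#include <stdint.h>

/* Store a 32-bit value MSB first (big endian), whatever the host byte order. */
static void store_be32(uint8_t out[4], uint32_t v)
{
    out[0] = (uint8_t)(v >> 24);   /* most significant byte at lowest address */
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;           /* least significant byte at highest address */
}

/* Rebuild the value from four big-endian bytes. */
static uint32_t load_be32(const uint8_t in[4])
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}
```

Because only shifts and masks are used, the same source gives the same byte sequence on little endian and big endian processors.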
Types of memory • EPROM (obsolete, program) • EEPROM (non volatile data) • Flash ROM (program, bootloader) • SRAM (internal, on chip) • DRAM (external, off chip) • Cache: to reduce the time needed to access (read or write) memory, a local copy of a portion of memory may be kept in a small but especially fast memory called cache. • TCM Tightly coupled memory (ARM) • As fast as cache but always mapped at a fixed physical address. Deterministic fast access
Cache memory • Why do we use cache memory? • It provides a way of achieving a low average access time with minimum cost and silicon area and power consumption. • It does this by having a two level hierarchy of memory types. SRAM (Fast, High power, more silicon, simple control) and DRAM (Slow, less power, less silicon, needs refreshing)
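The "low average access time" of the two level hierarchy can be estimated from the cache hit ratio. The following is a minimal sketch that assumes a simple model in which a hit costs the cache access time and a miss costs the main memory access time; the function name and the example numbers are illustrative assumptions, not measured values.

```c
/* Average access time of a cache + main memory pair (simple model).
 * hit_ratio:  fraction of accesses served by the cache, 0.0 .. 1.0
 * t_cache_ns: access time of the fast memory (e.g. SRAM)
 * t_main_ns:  access time of the slow memory (e.g. DRAM)
 * avg_access_ns is a hypothetical helper name. */
static double avg_access_ns(double hit_ratio, double t_cache_ns, double t_main_ns)
{
    return hit_ratio * t_cache_ns + (1.0 - hit_ratio) * t_main_ns;
}
```

With an assumed 95% hit ratio, a 10 ns cache and a 100 ns main memory, the average comes to about 14.5 ns, which is close to the cache speed even though most of the memory is slow DRAM.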
Scratch pad memory • What is the difference between scratch pad memory (TCM) and cache? • Scratch pad memory is allotted a known address space. We need to instruct the linker to load the desired portions of code and data there. It provides guaranteed low access time storage. Useful for time critical code. • Cache does not have an address space. It is transparent to the software designer. It automatically copies main memory locations to itself. It provides a low average access time, but there is no guarantee of finding a given piece of code in it at the desired time.
ROM types • Mask ROM • Programmed by IC foundry. NRE cost is high. Updating is not possible. Per unit cost is low and no programming required. Used in very high volume products • OTP (One time programmable) • ID on credit card, smart card, boot program • EPROM (UV erasable EPROM-obsolete) • EEPROM (Electrically Erasable PROM) • Non volatile data. High write time. Can be serial • Flash • Similar to EEPROM but erases 1 sector at a time • Boot program or whole program, memory sticks
RAM types • SRAM (Static RAM) • No refresh required, fast, more silicon, flip flop • On chip memory, cache • DRAM (Dynamic RAM) obsolete • Charge stored on capacitor • Off chip bulk memory • EDO (Extended Data Output) DRAM • Data retained on output while new read cycle starts. Allows zero wait state for Pentium. • SDRAM (Synchronous DRAM) • Synchronizes fetch with CPU clock. Faster. 1 GHz.
RAM types contd. • RDRAM (Rambus DRAM) • Uses pipelining and accesses a block of 4 consecutive words. Achieves 1.6 billion bytes/sec • Parameterized distributed RAM. RAM is distributed in various system subunits like IO buffers and transceivers. It provides buffering of data at subunits before it is fetched by the main processor. It facilitates faster input from IO compared to CPU bus access. • Parameterized block RAM. A specific block of RAM is dedicated to a subunit, e.g. MAC unit. Used when the internal bus is slow compared to the subunit. (Scratchpad)
Processor selection • To meet real-time deadlines, the processor should operate at sufficient clock speed. • Higher computing performance is possible with • (a) pipelined superscalar architectures, • (b) pre-fetch cache unit, caches, register files and MMU and • (c) RISC architecture. • Register windows provide fast context switching. E.g. 8051 has 4 register banks. SPARC has 7 • Power efficient • Auto shut down, sleep mode, wake up • Cache disable
Processor selection contd. • Burst mode for external memory access • Atomic operation unit to avoid shared data problem. E.g. LOCK prefix of 8086. • Endianness • Intel Little endian • Motorola Big endian • ARM flexible
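The shared data problem mentioned above arises when a read-modify-write on a shared variable is interrupted half way. The following is a minimal sketch of an atomic counter using C11 `<stdatomic.h>`, assuming a toolchain that supports it; the names `event_count`, `on_event` and `take_events` are hypothetical.

```c
#include <stdatomic.h>

/* Counter shared between a producer (e.g. an interrupt handler or task)
 * and a consumer (e.g. the main loop). Starts at zero. */
static atomic_uint event_count;

/* Producer side: count one event. The increment is an indivisible
 * read-modify-write, so it cannot be torn by a context switch. */
static void on_event(void)
{
    atomic_fetch_add(&event_count, 1u);
}

/* Consumer side: read the count and reset it to zero in one step. */
static unsigned take_events(void)
{
    return atomic_exchange(&event_count, 0u);
}
```

On x86 processors a compiler typically implements such operations with LOCK-prefixed instructions; on processors without hardware support, the same effect is usually obtained by briefly disabling interrupts around the update.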
Selection examples Case 1: Chocolate vending, data acquisition, robots, ECG recorder. Low performance processor is sufficient. Instr. cycle = 0.5 to 1 µs. 8-bits. All peripherals and memory on-chip. Choice: 8051, 68HC11, ATMEGA328 Case 2: 2 Mbps router, voice/video compression/decompression, adaptive cruise control with string stability and network gateway. Medium performance processor is required. Instruction cycle time 10-40 ns (25-100 MHz). 32-bit RISC, FPU. Pipelined. Off chip RAM, ROM, cache, MMU required. Choice: 80x86, 80860 & 80960 (Intel RISC, obsolete), Intel XScale based on ARM
Selection examples contd • Case 3: 100 Mbps routers, switches, multichannel encryption/ decryption. High performance processor is required. Instr. Cycle = .5-1 ns (1-2GHz) is required. 32-bit RISC, FPU. Pipelined superscalar. Off chip RAM, ROM, cache, MMU, DMA required. Harvard. Choice: ARM7, SuperSparc • Case 4: Real time audio video processing, mobile phone. Very high performance processor is required. Instr. Cycle = .5-1 ns (1-2GHz) is required. 64-bit RISC, FPU, DSP. Pipelined superscalar. Off chip RAM, ROM, cache, MMU,DMA required. Harvard. Choice: ARM9,11, TMS320C67xx
Software architecture (simple) • Details in unit 5 • Bare metal system • No operating system. No command interface because it runs only one application for lifetime. Suitable for simple single threaded systems. • Components • Initializing (Start up code). Connecting internal peripherals (Timer, ADC, DAC, UART) to pins. Setting GPIO mode (In/out), Initializing RAM data, SP • Libraries (UART, I2C, SPI, LCD, WiFi) • Application
Software architecture (advanced) • Start up code • RTOS Kernel • Starting, suspending, switching and stopping tasks • Inter task communication • Memory management (Protection and Virtual memory) • Interrupt service routines • Libraries e.g. math.h • Communication (TCP/IP stack) • File system (Arranges secondary storage into files) • Device drivers (Serial, LAN, Graphic LCD) • Application (Set of tasks)
Software Architecture
IO devices IO devices are a means of communication between a microcontroller and the external world. A port connects to the processor using an address decoder and the system buses. The processor uses the addresses of the port registers for programming the port functions or modes, reading port status and for writing or reading bytes. The address may be IO mapped or memory mapped. IO ports can be broadly classified as serial or parallel.
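With memory mapped IO, a port register is read and written like an ordinary variable at a fixed address, usually through a `volatile` pointer so the compiler does not optimize the access away. The following is a minimal sketch; the macro `REG32` and the stand-in variable `fake_port_reg` are assumptions that let the code run on a PC instead of real hardware.

```c
#include <stdint.h>

/* Access a 32-bit memory mapped register through its address. */
#define REG32(addr) (*(volatile uint32_t *)(addr))

/* Stand-in for a real port register so the sketch can run on a PC.
 * On a target, PORT_DATA would use the address from the device manual. */
static uint32_t fake_port_reg;
#define PORT_DATA REG32(&fake_port_reg)

/* Write a value to the port register. */
static void port_write(uint32_t value)
{
    PORT_DATA = value;
}

/* Read the current value of the port register. */
static uint32_t port_read(void)
{
    return PORT_DATA;
}
```

IO mapped ports, by contrast, live in a separate address space and need special instructions (IN and OUT on x86), so they cannot be reached through a plain C pointer.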
Types of IO ports • Synchronous Serial • CD, HDD, audio, video, network, SPI, I2C, CAN, SDIO (Secure Digital IO- SD cards based on SPI), modem, Bluetooth, USB. Details in unit 4. • Clock and data may be on different lines (SPI,I2C ) • Or same line (modem, HDD) we need clock data separator • Asynchronous Serial – UART, modem • Parallel GPIO (8255) • LED, Switch, Matrix Keyboard, 16 X2 LCD, ADC, DAC, PWM
Serial bus specifications • Transmission Direction • Simplex (transmits in one direction only) • Half duplex (Both directions but one at a time) • Full duplex (Simultaneously transmits in both directions) • Bus interface • RS232 Point to Point, single ended +/- 12 V, 20 ft, 20 Kb/s • RS422 Point to Point, differential +/- 6 V, 4000 ft, 10 Mb/s • RS485 Multi drop +/- 6 V, 4000 ft, 10 Mb/s
UART COM port UART uses a shift register for serial/ parallel conversion. CPU can also directly use single bit IO, e.g. SID & SOD. DTE- Data terminal equipment DCE- Data Comm equipment DCD – Data Carrier Detect DSR – Data Set ready (Powered) DTR – Data Term Ready (Powered) RTS – Request To Send CTS – Clear To Send TxD – Transmit Data RxD – Received Data
UART frame No clock sent. Clock is generated by receiver using START and known baud rate. Watch for negative edge on RxD. When received, wait for ½ bit period. Sample RxD 8 times with a delay of 1 bit period in between. For ASCII, bit 7 is parity. For binary data, either no parity or an extra bit is added. Stop bits can be one or two. Stop bit is needed to accommodate mismatch between transmit and receive clocks. In 8.5 bit periods, sampling edge is allowed to drift ½ period (5.9%, i.e. about 2.9% at each end). 64 x internal clock adds uncertainty of 1/64 bit periods to sync (0.18%). Why 8051 clk = 11.0592 MHz? Because 11.0592 MHz / (12 x 32) = 28800, which divides exactly into the standard baud rates. Short form specs – 9600, 8, e, 1
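The choice of an 11.0592 MHz crystal can be checked with a little arithmetic. The following is a minimal sketch that assumes the usual 8051 setup of Timer 1 in mode 2 as the baud rate generator with SMOD = 0, where baud = f_osc / (12 x 32 x divisor) and TH1 = 256 - divisor; the function names are hypothetical.

```c
#include <stdint.h>

/* Integer Timer 1 divisor closest to the requested baud rate (SMOD = 0).
 * The reload value would be TH1 = 256 - divisor. */
static uint32_t baud_divisor(uint32_t f_osc_hz, uint32_t baud)
{
    uint32_t base = f_osc_hz / (12u * 32u);   /* baud rate for divisor = 1 */
    return (base + baud / 2u) / baud;         /* rounded to nearest integer */
}

/* Baud rate actually produced by an integer divisor. */
static double actual_baud(uint32_t f_osc_hz, uint32_t divisor)
{
    return (double)f_osc_hz / (12.0 * 32.0 * (double)divisor);
}

/* Baud rate error in percent when the nearest integer divisor is used. */
static double baud_error_percent(uint32_t f_osc_hz, uint32_t baud)
{
    double got = actual_baud(f_osc_hz, baud_divisor(f_osc_hz, baud));
    return 100.0 * (got - (double)baud) / (double)baud;
}
```

For 9600 baud, an 11.0592 MHz crystal gives a divisor of exactly 3 (TH1 = 253 = 0xFD) and zero error. A 12 MHz crystal also rounds to a divisor of 3 but produces about 10417 baud, an error of roughly 8.5%, well outside the drift budget of a few percent described in the frame timing above.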
Interfacing (Connecting IO device to system) Generally embedded systems use microcontrollers, so system bus does not come out of the chip. We use GPIO pins or dedicated peripheral pins (SPI, ADC, UART) for interfacing. In ARM based controllers, IC pins are shared between GPIO and dedicated peripherals. We need to initialize connection between desired peripherals to pins. When using a microprocessor, we have to use system bus e.g. ISA, PCI, AMBA etc. The bus consists of address, data and control signals. The device data is connected to data bus through tri state buffers and enabled with WR and RD control signals.
ARM LPC2148 IO pin configuration In 8051, there are 32 GPIO pins, but some have alternate functions like timer I/O, UART (TxD, RxD), interrupts etc. ARM based controllers have many peripheral units and a limited number of IO pins. So every pin has a basic function of GPIO but can also have alternate functions like UART, SPI, I2C, ADC, DAC etc. The peripherals are connected to pins through the “Pin connect block”. LPC2148 (ARM7TDMI from NXP) has two 32 bit ports, P0 and P1. There are 3 Function Select Registers in LPC2148 that control the pin connect block. 1. PINSEL0 configures PORT0 pins P0.0 to P0.15. 2. PINSEL1 configures PORT0 pins P0.16 to P0.31. 3. PINSEL2 configures PORT1 pins P1.16 to P1.31.
Pin connect block PINSELx register bits are organized as 16 pairs of bits. Each pair selects one out of 4 peripherals that connects to a pin. “00” always represents GPIO. On reset, PINSELx(31:0) = 0, IOxDIR = 0 (All GPIO in)
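Since each pin owns one pair of bits in a PINSELx register, selecting a function means clearing that pair and writing the new 2-bit code without disturbing the other 15 pins. The following is a minimal sketch that works on a plain value so it can be tried on a PC; `pinsel_with` is a hypothetical helper, not an NXP library function.

```c
#include <stdint.h>

/* Return a PINSELx value with the 2-bit field of one pin set to 'func'.
 * pinsel:    current register value
 * pin_index: position 0..15 of the pin inside this PINSEL register
 * func:      function code 0..3, where 0 selects GPIO */
static uint32_t pinsel_with(uint32_t pinsel, unsigned pin_index, unsigned func)
{
    unsigned shift = (pin_index & 15u) * 2u;    /* each pin uses 2 bits */
    pinsel &= ~((uint32_t)3u << shift);         /* clear the pin's bit pair */
    pinsel |= (uint32_t)(func & 3u) << shift;   /* write the new function code */
    return pinsel;
}
```

On the target this would be used as `PINSEL0 = pinsel_with(PINSEL0, 0, 1);` to select function 01 on P0.0. Which peripheral each code connects to a given pin has to be taken from the LPC2148 user manual.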
Addresses of GPIO registers (LPC2148)
Register   Address     Comment
PINSEL0    E002 C000   0 at reset = GPIO
PINSEL1    E002 C004   0 at reset = GPIO
PINSEL2    E002 C014   0 at reset = GPIO
IO0PIN     E002 8000   Write 0/1 data
IO1PIN     E002 8010   Write 0/1 data
IO0SET     E002 8004   Write 1 to set bits
IO1SET     E002 8014   Write 1 to set bits
IO0CLR     E002 800C   Write 1 to reset bits
IO1CLR     E002 801C   Write 1 to reset bits
IO0DIR     E002 8008   0 at reset = input
IO1DIR     E002 8018   0 at reset = input
Bit manipulation • Set bit 3 of port 0 (P0.3) • IO0SET = (1 << 3); • IO0PIN |= (1 << 3); • Reset bit 25 of port 1 (P1.25) • IO1CLR = (1 << 25); • IO1PIN &= ~(1 << 25);
ARM Interfacing Switch and LED Write a C program such that when the switch is open (high), the LED glows, and when the switch is closed (low), the LED is off. Note that the LED is active low.
C program
#include <lpc214x.h>   /* register definitions; header name depends on the toolchain */

int main(void)
{
    /* PINSEL0 = 0x00000000; not needed: the reset value already selects GPIO for P0.0 to P0.15 */
    IO0DIR = 0x00000001;            /* P0.0 = output (LED), P0.1 = input (switch) */

    while (1)
    {
        if (IO0PIN & (1 << 1))      /* switch open: P0.1 reads HIGH */
        {
            IO0CLR = 0x00000001;    /* drive P0.0 low: LED on (active low) */
        }
        else                        /* switch closed: P0.1 reads LOW */
        {
            IO0SET = 0x00000001;    /* drive P0.0 high: LED off */
        }
    }
}
Twisted nematic liquid crystals • In normal state, crystals are twisted. Direction of polarization changes by 90°. When voltage is applied, molecules align with field and untwist.
Principle of LCD (OFF pixel) • large bright light source • A horizontal polarizing filter. • Only horizontal light • A transistor switches off this pixel by switching on the electricity flowing through its liquid crystals. That makes the crystals straighten out • Light still vibrating horizontally. • A vertical polarizing filter • No light reaches the screen at this point. In other words, this pixel is dark.
Principle of LCD (ON pixel) • The bright light at the back • The horizontal polarizing filter • Only horizontal waves • A transistor switches on this pixel by switching off the electricity. The twisted crystals rotate light waves by 90° • Light vibrating vertically. • The vertical polarizing filter • The pixel is lit up. A red, blue, or green filter gives the pixel its color.
Types of LCD display panels • There is a controller chip inside LCD module which accepts commands from microcontroller and drives the LCD pixels. • Controller can be • Alphanumeric (16 x 2) • Displays alphabetical characters from ASCII input • Graphical (128 x 64) • Commands to set pixel, draw line, circles etc. • Controller interface may be parallel or SPI