4446 Design of Microprocessor-Based Systems

Memory Interface
Dr. Esam Al_Qaralleh
CE Department, Princess Sumaya University for Technology

Connections Between CPU and Memory
[Figure: the 8088 connected to memory through the address bus, the data bus, and control signals]
• What are the control signals from the microprocessor to memory? What are the control signals from memory to the microprocessor?
• Address and data signals should be buffered
• Buffers on the address bus increase its driving capability
• Bi-directional buffers control the direction of data transfer on the data bus
• D latches de-multiplex the signals on AD[7:0] (and A[19:16])

Timing Diagram of a Memory Operation
• Example: the 8088 sends address 70C12H to memory in a memory read operation; assume that data 30H is read
[Figure: timing diagram over clock states T1-T4 showing CLK, ALE, A[19:16]/S3-S6 (7H, then status), A[15:8] (0CH), AD[7:0] (12H, then 30H), DT/R, DEN, IO/M, WR, and RD. Block diagram: D latches enabled by ALE hold Addr[19:16] = 7H and Addr[7:0] = 12H, a buffer passes Addr[15:8] = 0CH, and a transceiver controlled by DT/R and DEN carries D[7:0] = 30H between memory and the 8088.]
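The address split in this example can be checked with a short Python sketch. The function name `split_address` and the dictionary keys are illustrative and do not come from the slides; only the 20-bit address width and the three bus groups are taken from the example.

```python
def split_address(addr: int) -> dict:
    """Split a 20-bit 8088 physical address into its three bus groups."""
    if not 0 <= addr < (1 << 20):
        raise ValueError("8088 physical addresses are 20 bits wide")
    return {
        "A19_16": (addr >> 16) & 0xF,  # multiplexed with S3-S6, latched on ALE
        "A15_8": (addr >> 8) & 0xFF,   # dedicated address pins, buffered
        "AD7_0": addr & 0xFF,          # multiplexed with data, latched on ALE
    }


fields = split_address(0x70C12)
print({name: hex(value) for name, value in fields.items()})
```

For 70C12H the three groups are 7H, 0CH, and 12H, matching the values shown on the buses in the timing diagram.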

11.3 Bus Buffering
[Figure: buffered 8088 system. 8282 latches (STB, OE), strobed by ALE, and a 74LS244 buffer (G1, G2) drive the address bus; an 8286 transceiver (OE, T), controlled by DEN and DT/R, drives the data bus; a memory address decoder and an I/O address decoder select memory chips 1..n and I/O chips 1..m.]
• Some high address lines and IO/M are used to identify the accessed chip
• Some low address lines identify the accessed byte inside the chip (more lines for memory, few for I/O)
• WR and RD go to each chip
• The decoder drives CE (Chip Enable) or CS (Chip Select) to activate each chip
• READY provides enough access time for the selected chip

Memory Chips
• The number of address pins is related to the number of memory locations.
• Common sizes today are 1K to 256M locations (10 to 28 address pins).
• The data pins are typically bi-directional in read-write memories.
• The number of data pins is related to the size of the memory location.
• For example, an 8-bit wide (byte-wide) memory device has 8 data pins.
• A catalog listing of 1K X 8 indicates a byte-addressable memory with 1K locations, 8 Kbits in total.
• Each memory device has at least one chip select (CS), chip enable (CE), or select (S) pin that enables the memory device.
• Each memory device has at least one control pin.
• For ROMs, an output enable (OE) or gate (G) pin is present.
• The OE pin enables and disables a set of tristate buffers.
• For RAMs, either a read-write (R/W) pin or a write enable (WE) and a read enable (OE) pin are present.
• For dual control pin devices, the two control pins must never both be 0 (active) at the same time.
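The relation between address pins, locations, and total capacity can be sketched in a few lines of Python. The helper names `address_pins` and `capacity_bits` are hypothetical, chosen only for this illustration.

```python
def address_pins(locations: int) -> int:
    """Smallest number of address pins able to select `locations` words."""
    if locations < 1:
        raise ValueError("a memory needs at least one location")
    return (locations - 1).bit_length()


def capacity_bits(locations: int, width: int) -> int:
    """Total storage in bits of a `locations` X `width` device."""
    return locations * width


K, M = 2**10, 2**20
print(address_pins(1 * K))        # pins for a 1K-location device
print(address_pins(256 * M))      # pins for a 256M-location device
print(capacity_bits(1 * K, 8))    # bits stored by a 1K X 8 device
```

A 1K X 8 device therefore needs 10 address pins and 8 data pins, and stores 8192 bits (1K bytes).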

Memory Address Decoding
• The processor can usually address a memory space that is much larger than the memory space covered by an individual memory chip.
• In order to splice a memory device into the address space of the processor, decoding is necessary.
• For example, the 8088 issues 20-bit addresses for a total of 1MB of memory address space.
• However, the BIOS on a 2716 EPROM has only 2KB of memory and 11 address pins.
• A decoder can be used to decode the remaining 9 address lines and allow the EPROM to be placed in any 2KB section of the 1MB address space.
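A minimal sketch of what such a decoder computes, assuming the EPROM is placed at the top of memory (base address FF800H). That base address and the name `eprom_selected` are assumptions made for the illustration; the slides only state that any 2KB-aligned section can be chosen.

```python
ADDRESS_BITS = 20       # 8088 physical address width
CHIP_ADDRESS_BITS = 11  # 2716 EPROM: 2K locations, pins A0-A10
DECODED_BITS = ADDRESS_BITS - CHIP_ADDRESS_BITS  # lines left for the decoder


def eprom_selected(addr: int, base: int = 0xFF800) -> bool:
    """True when the upper 9 address bits of `addr` match those of `base`."""
    if base % (1 << CHIP_ADDRESS_BITS) != 0:
        raise ValueError("base must sit on a 2KB boundary")
    return (addr >> CHIP_ADDRESS_BITS) == (base >> CHIP_ADDRESS_BITS)


print(DECODED_BITS)             # address lines handled by the decoder
print(2 ** DECODED_BITS)        # number of possible 2KB positions
print(eprom_selected(0xFFFFF))  # inside the assumed EPROM window
print(eprom_selected(0xFF7FF))  # just below the window
```

The lower 11 address lines go straight to the EPROM; only the upper 9 take part in the comparison.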

Memory Address Decoding

Memory Map: Full Address Decoding
[Figure: 8088 memory map with full address decoding, showing two 256KB RAM chips (CS1, CS2), a 74LS138 decoder (inputs A, B, C; enables E1, E2, E) driving 16KB EPROM chips (CS3-CS10), and the MRDC/MWTC, RD/WR, and OE control connections. The recoverable annotations are listed below.]
• 2^20 = 1,048,576 different byte addresses = 1 Mbyte (00000 to FFFFF)
• IO/M = 0 => memory map
• A 256 Kbyte = 2^18 RAM chip has 18 address lines, A0 - A17
• A19, A18 assigned to 00 => CS active for every address from 00000 to 3FFFF
• A19 = 1, A18 = 0, A17 = 0 activate the decoder; A16, A15, A14 select one EPROM chip (80000-83FFF up to 9C000-9FFFF); A0-A13 address the bytes inside each 16KB chip
• A19 = A18 = ... = A14 = 1 select the EPROM at FC000-FFFFF
• All the address lines are used by the decoder or the memory chip => each byte is uniquely addressed = full address decoding

Decoding Circuits • NAND gate decoders are not often used. • 3-to-8 Line Decoder (74LS138) is more common.

Memory Address Decoding
• Using the full memory addressing space
[Figure: a 32KB memory chip receives Addr[14:0]; Addr[19:15] and IO/M are decoded to generate its CS. The chip occupies 30000H-37FFFH within the 00000H-FFFFFH address space.]
Addr[19:0]
Highest address 37FFFH: 0011 0111 1111 1111 1111
Lowest address 30000H: 0011 0000 0000 0000 0000
• The upper 5 address lines (Addr[19:15]) do not change. They set the base address
• The lower 15 address lines (Addr[14:0]) select one of the 2^15 (32768) locations inside the RAMs
• Can we design a decoder such that the first address of the 32KB memory is 37124H?
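The decoder for this 32KB block can be modelled as a comparison of the upper five address bits, gated by IO/M. The sketch below uses hypothetical names (`chip_select`, `is_valid_base`) and assumes a simple fixed-pattern decoder, which is the kind the slide describes.

```python
SIZE = 32 * 1024   # 32KB block, addressed by Addr[14:0]
OFFSET_BITS = 15


def chip_select(addr: int, io_m: int, base: int = 0x30000) -> bool:
    """CS is active for memory cycles (IO/M = 0) whose Addr[19:15] match the base."""
    return io_m == 0 and (addr >> OFFSET_BITS) == (base >> OFFSET_BITS)


def is_valid_base(base: int) -> bool:
    """A fixed-pattern decoder can only place the block on a 32KB boundary."""
    return base % SIZE == 0


print(chip_select(0x30000, io_m=0), chip_select(0x37FFF, io_m=0))
print(chip_select(0x38000, io_m=0), chip_select(0x30000, io_m=1))
print(is_valid_base(0x30000), is_valid_base(0x37124))
```

Under this assumption 37124H cannot serve as the first address: its lower 15 bits are not all zero, so the block would straddle two different Addr[19:15] patterns.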

Memory Address Decoding
• Design a 1MB memory system consisting of multiple memory chips
• Solution 1:
[Figure: four 256KB chips share Addr[17:0]; a 2-to-4 decoder with inputs Addr[19] and Addr[18], whose CS input is driven by IO/M, generates the four chip selects.]

Memory Address Decoding
• Design a 1MB memory system consisting of multiple memory chips
• Solution 2:
[Figure: four 256KB chips share Addr[19:2]; a 2-to-4 decoder with inputs Addr[1] and Addr[0], whose CS input is driven by IO/M, generates the four chip selects.]

Memory Address Decoding
• Design a 1MB memory system consisting of multiple memory chips
• Solution 3: It is a bad design, but still works!
[Figure: four 256KB chips share Addr[19:18], Addr[16:7], and Addr[5:0]; a 2-to-4 decoder with inputs Addr[17] and Addr[6], whose CS input is driven by IO/M, generates the four chip selects.]
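Solutions 1 to 3 differ only in which two address lines feed the 2-to-4 decoder. The sketch below models all three with one hypothetical function, `decode`, and assumes that the higher-numbered line is the more significant decoder input (the slides do not state the input order). The remaining 18 lines are packed, in order, into the chip's internal address.

```python
def decode(addr: int, sel_hi: int, sel_lo: int) -> tuple:
    """Return (chip, offset) when address bits sel_hi and sel_lo drive the decoder."""
    chip = (((addr >> sel_hi) & 1) << 1) | ((addr >> sel_lo) & 1)
    offset, out_bit = 0, 0
    for bit in range(20):
        if bit in (sel_hi, sel_lo):
            continue  # this line goes to the decoder, not to the chips
        offset |= ((addr >> bit) & 1) << out_bit
        out_bit += 1
    return chip, offset


SOLUTIONS = {1: (19, 18), 2: (1, 0), 3: (17, 6)}

for number, (hi, lo) in SOLUTIONS.items():
    chips = [decode(addr, hi, lo)[0] for addr in range(4)]
    print(f"Solution {number}: addresses 0..3 map to chips {chips}")
```

With Solution 1 consecutive addresses stay in one chip, while Solution 2 spreads them across all four chips. Any pair of lines yields a working system because every address still maps to exactly one (chip, offset) pair, which is why Solution 3 works despite being awkward.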

Memory Address Decoding
• Design a 1MB memory system consisting of multiple memory chips
• Solution 4:
[Figure: two 256KB chips and one 512KB chip share the low address lines Addr[17:0]; their CS signals are generated from Addr[19], Addr[18], and IO/M.]

Memory Address Decoding
• Exercise Problem:
• A 64KB memory chip is used to build a memory system with a starting address of 70000H. A block of memory locations in the memory chip is damaged.
[Figure: the 64KB chip (0000H-FFFFH) is mapped to 70000H-7FFFFH in the 1M addressing space. The damaged block, chip locations 3210H-3317H, appears at 73210H-73317H; the block to replace spans 73200H-733FFH.]

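Memory Address Decoding
[Figure: solution circuit. A[19], A[18], A[17], A[16], and IO/M are decoded to generate the CS of the 64KB chip, which receives A[15:0]; A[15] down to A[9] are additionally decoded to generate the CS of a 512B replacement chip, which receives A[8:0].]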
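The chip-select logic of this exercise can be sketched as follows, assuming the 512B replacement chip covers 73200H-733FFH and takes priority over the 64KB chip inside that window. The function name `select` and the returned labels are hypothetical.

```python
BASE_64K = 0x70000      # start of the 64KB chip in the 1M address space
BASE_512 = 0x73200      # start of the 512B replacement block


def select(addr: int, io_m: int = 0):
    """Return (chip, offset) for a memory access, or None if nothing is selected."""
    if io_m != 0:
        return None                                 # I/O cycle: memory stays idle
    in_64k = (addr >> 16) == (BASE_64K >> 16)       # A[19:16] = 0111
    in_512 = (addr >> 9) == (BASE_512 >> 9)         # A[19:9] match 73200H
    if in_64k and in_512:
        return ("512B", addr & 0x1FF)               # replacement chip, A[8:0]
    if in_64k:
        return ("64KB", addr & 0xFFFF)              # original chip, A[15:0]
    return None


for address in (0x731FF, 0x73210, 0x73317, 0x73400, 0x80000):
    print(hex(address), select(address))
```

The replaced window is larger than the damaged block because a decoder can only carve out power-of-two blocks on matching boundaries; 512 bytes starting at 73200H is the smallest such block containing 73210H-73317H.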

Memory Address Decoding
• Exercise Problem:
• A 2MB memory chip with a damaged block (from 0DCF12H to 103745H) is used to build a 1MB memory system for an 8088-based computer
[Figure: the 2MB chip spans 000000H-1FFFFFH. The damaged block lies between 0DCF12H and 103745H, so the two undamaged 512K blocks 000000H-07FFFFH and 180000H-1FFFFFH are used. The processor's A[19:0] drive the chip's A[19:0], and the processor's A[19] also drives the chip's A[20].]
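One reading of this exercise is that the processor's A[19] is wired to both A[19] and A[20] of the chip, so that only the lowest and the highest 512K blocks of the chip are ever reached. The sketch below checks that reading; the name `chip_address` is hypothetical and the wiring is an interpretation of the figure labels.

```python
DAMAGED = range(0x0DCF12, 0x103745 + 1)  # damaged chip locations, inclusive


def chip_address(cpu_addr: int) -> int:
    """Map a 20-bit 8088 address to a 21-bit chip address with chip A[20] = CPU A[19]."""
    if not 0 <= cpu_addr < (1 << 20):
        raise ValueError("8088 physical addresses are 20 bits wide")
    a19 = (cpu_addr >> 19) & 1
    return (a19 << 20) | cpu_addr


for cpu_addr in (0x00000, 0x7FFFF, 0x80000, 0xFFFFF):
    mapped = chip_address(cpu_addr)
    print(hex(cpu_addr), "->", hex(mapped), "damaged" if mapped in DAMAGED else "ok")
```

The lower half of the address space lands in 000000H-07FFFFH and the upper half in 180000H-1FFFFFH, so no access reaches the damaged block.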

Memory Address Decoding
• Partial decoding
• Example:
• Build a 32KB memory system by using four 8KB memory chips
• The starting address of the 32KB memory system is 30000H

Chip      Low address (Addr[19:0])             High address (Addr[19:0])
Chip #4   0011 0110 0000 0000 0000 (36000H)    0011 0111 1111 1111 1111 (37FFFH)
Chip #3   0011 0100 0000 0000 0000 (34000H)    0011 0101 1111 1111 1111 (35FFFH)
Chip #2   0011 0010 0000 0000 0000 (32000H)    0011 0011 1111 1111 1111 (33FFFH)
Chip #1   0011 0000 0000 0000 0000 (30000H)    0011 0001 1111 1111 1111 (31FFFH)

Memory Map: Partial Address Decoding
[Figure: 8088 memory map with partial address decoding, showing 64KB RAM chips, a 74LS138 decoder driving 16KB EPROM chips, and the base and mirror images each device occupies in the 00000-FFFFF map. The recoverable annotations are listed below.]
• IO/M = 0 => memory map
• A 64 Kbyte = 2^16 RAM chip has 16 address lines, A0 - A15
• A19, A18 assigned to 00 => CS active for every address from 00000 to 3FFFF
• A16, A17 not used => four images of the same chip (00000-0FFFF, 10000-1FFFF, 20000-2FFFF, 30000-3FFFF)
• A19 = 1, A17 = 0 activate the decoder (A18 is not used); A16, A15, A14 select one EPROM chip => images at 80000-9FFFF and C0000-DFFFF
• A18 = ... = A14 = 1 select the EPROM (A19 is not used) => images at 7C000-7FFFF and FC000-FFFFF
• Some address lines are not used by the decoder or the memory chip => mirror images = partial address decoding

Memory Address Decoding
• Implementation of partial decoding
[Figure: four 8KB chips share Addr[12:0]; a 2-to-4 decoder with inputs Addr[14] and Addr[13], enabled by IO/M, generates the four chip selects. Addr[19:15] are not connected.]
• With the above decoding scheme, what happens if the processor accesses locations 02117H, 32117H, and 9A117H?
• If two 16KB memory chips are used to implement the 32KB memory system, what is the partial decoding circuit?
• What are the advantages and disadvantages of partial decoding circuits?
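The first question can be explored with a small model of this partial decoder. The sketch assumes Addr[14] is the more significant decoder input and numbers the chips 1 to 4 from the lowest address upward; the function name `partial_decode` is hypothetical.

```python
def partial_decode(addr: int, io_m: int = 0):
    """Return (chip number, offset) for the partial decoder; Addr[19:15] are ignored."""
    if io_m != 0:
        return None                       # decoder disabled during I/O cycles
    chip = ((addr >> 13) & 0b11) + 1      # Addr[14:13] pick chip #1..#4
    offset = addr & 0x1FFF                # Addr[12:0] go to the selected chip
    return chip, offset


for address in (0x02117, 0x32117, 0x9A117):
    chip, offset = partial_decode(address)
    print(f"{address:05X}H -> chip #{chip}, offset {offset:04X}H")

print(2 ** 5)  # images of the 32KB block, one per value of the unused Addr[19:15]
```

All three addresses reach the same byte (offset 0117H in chip #2), because they differ only in the address lines the decoder ignores.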

Generating Wait States
• Wait states are inserted into memory read or write cycles if slow memories are used in computer systems
• The Ready signal is used to indicate whether wait states are needed
[Figure: the 8088 connects to memory over the address and data buses; the address decoder output feeds a delay circuit built from D flip-flops (D, Q, clr inputs, clocked by clk) that generates Ready.]

Generating Wait States (Timing)

Memory System

Introduction
• To store a single bit, we can use
  • Flip flops or latches
• Larger memories can be built by
  • Using a 2D array of these 1-bit devices
    • “Horizontal” expansion to increase word size
    • “Vertical” expansion to increase number of words
• Dynamic RAMs use a tiny capacitor to store a bit
• Design concepts are mostly independent of the actual technique used to store a bit of data

Memory Design with D Flip Flops
• An example
  • 4 X 3 memory design
  • Uses 12 D flip flops in a 2D array
  • Uses a 2-to-4 decoder to select a row (i.e. a word)
  • Multiplexers are used to gate the appropriate output
  • A single WRITE (WR) signal serves as both the write and the read signal
    • Zero indicates a write operation
    • One indicates a read operation
  • Two address lines are needed to select one of four words of 3 bits each
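A behavioural sketch of this 4 X 3 memory is given below. It models only the externally visible behaviour (row selection by address, WR = 0 for write, WR = 1 for read), not the individual flip-flops or multiplexers; the class name `Memory4x3` and its method are hypothetical.

```python
class Memory4x3:
    """Four words of three bits each, selected by a 2-bit address."""

    WORDS = 4
    WIDTH = 3

    def __init__(self):
        self.rows = [0] * self.WORDS      # each entry stands for one row of 3 flip-flops

    def access(self, address: int, wr: int, data_in: int = 0):
        """WR = 0 stores data_in in the addressed word; WR = 1 returns the word."""
        if not 0 <= address < self.WORDS:
            raise ValueError("only two address lines are available")
        if wr == 0:
            self.rows[address] = data_in & ((1 << self.WIDTH) - 1)
            return None
        return self.rows[address]


memory = Memory4x3()
memory.access(2, wr=0, data_in=0b101)          # write 101 into word 2
print(format(memory.access(2, wr=1), "03b"))   # read word 2 back
print(format(memory.access(0, wr=1), "03b"))   # word 0 was never written
```

Note that the model, like the design, has separate data-in and data-out paths.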

Memory Design with D Flip Flops (cont’d)

Memory Design with D Flip Flops
• Problems with the design
  • Requires separate data in and out lines
    • Cannot use the bidirectional data bus
  • Cannot use this design as a building block to design larger memories
    • To do this, we need a chip select input
• We need techniques to connect multiple devices to a bus

Techniques to Connect to a Bus
• Three techniques
  • Use multiplexers
    • Example: we used multiplexers in the last memory design
    • We cannot use MUXs to support bidirectional buses
  • Use open collector outputs
    • Special devices that facilitate connection of several outputs together
  • Use tri-state buffers
    • Most commonly used devices

Techniques to Connect to a Bus
[Figure: open collector technique]

Techniques to Connect to a Bus
[Figure: tri-state buffers]

Techniques to Connect to a Bus
[Figure: two example tri-state buffer chips]

Building a Memory Block
[Figure: a 4 X 3 memory design using D flip-flops]

Building Larger Memories
[Figure: 2 X 16 memory module using 74373 chips]

Designing Larger Memories
• Issues involved
  • Selection of a memory chip
  • Example: To design a 64M X 32 memory, we could use
    • Eight 64M X 4 chips in a 1 X 8 array (i.e., a single row)
    • Eight 32M X 8 chips in a 2 X 4 array
    • Eight 16M X 16 chips in a 4 X 2 array
• Designing an M X N memory with D X W chips
  • Number of chips = (M × N) / (D × W)
  • Number of rows = M / D
  • Number of columns = N / W
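The three formulas can be checked against the 64M X 32 example with a short sketch. The function name `chip_array` is hypothetical, and the sketch assumes that the chip dimensions divide the target dimensions evenly.

```python
def chip_array(m: int, n: int, d: int, w: int) -> tuple:
    """Return (rows, columns, chips) for an M X N memory built from D X W chips."""
    if m % d or n % w:
        raise ValueError("chip dimensions must divide the memory dimensions")
    rows, columns = m // d, n // w
    return rows, columns, rows * columns


MEG = 2**20
for depth, width in ((64 * MEG, 4), (32 * MEG, 8), (16 * MEG, 16)):
    rows, columns, chips = chip_array(64 * MEG, 32, depth, width)
    print(f"{depth // MEG}M X {width}: {rows} X {columns} array, {chips} chips")
```

Each of the three chip choices needs eight chips; only the shape of the array changes.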

Designing Larger Memories
[Figure: 64M X 32 memory using 16M X 16 chips]

Memory Mapping
[Figure: full mapping]

Memory Mapping (cont’d)
[Figure: partial mapping]

Interleaved Memory
• In our memory designs
  • A block of contiguous memory addresses is mapped to a module
  • One advantage
    • Incremental expansion
  • Disadvantage
    • Successive accesses take more time
    • Not possible to hide memory latency
• Interleaved memories
  • Improve access performance
    • Allow overlapped memory access
    • Use multiple banks and access all banks simultaneously
  • Addresses are spread over banks
    • Not mapped to a single memory module

Interleaved Memory (cont’d)
• The n-bit address is divided into two parts of r and m bits: n = r + m
• Normal memory
  • Higher order r bits identify a module
  • Lower order m bits identify a location in the module
  • Called high-order interleaving
• Interleaved memory
  • Lower order r bits identify a module
  • Higher order m bits identify a location in the module
  • Called low-order interleaving
• Memory modules are referred to as memory banks
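The two address splits can be compared with a small sketch. The function names and the choice of r = 2 and m = 4 (four modules of sixteen locations) are assumptions made only for the illustration.

```python
def high_order(addr: int, r: int, m: int) -> tuple:
    """Normal memory: the upper r bits pick the module, the lower m bits the location."""
    return addr >> m, addr & ((1 << m) - 1)


def low_order(addr: int, r: int, m: int) -> tuple:
    """Interleaved memory: the lower r bits pick the bank, the upper m bits the location."""
    return addr & ((1 << r) - 1), addr >> r


R, M = 2, 4  # assumed sizes: 4 modules, 16 locations each
for addr in range(6):
    print(addr, "high-order:", high_order(addr, R, M), "low-order:", low_order(addr, R, M))
```

With high-order interleaving, consecutive addresses stay in the same module; with low-order interleaving they rotate through the banks, which is what allows overlapped access.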

Interleaved Memory (cont’d)

Interleaved Memory (cont’d)
• Two possible implementations
  • Synchronized access organization
    • Upper m bits are presented to all banks simultaneously
    • Data are latched into output registers (MDR)
    • During the data transfer, the next m bits are presented to initiate the next cycle
  • Independent access organization
    • The synchronized design does not efficiently support non-sequential access patterns
    • Allows pipelined access even for arbitrary addresses
    • Each memory bank has a memory address register (MAR)
    • No need for MDR

Interleaved Memory (cont’d)
[Figure: synchronized access organization]

Interleaved Memory (cont’d)
[Figure: interleaved memory allows pipelined access to memory]

Interleaved Memory (cont’d)
[Figure: independent access organization]

Interleaved Memory (cont’d)
• Number of banks
  • M = memory access time in cycles
  • To provide one word per cycle
    • Number of banks ≥ M
• Drawbacks of interleaved memory
  • Involves complex design
    • Example: Need MDR or MAR
  • Reduced fault-tolerance
    • One bank failure leads to failure of the whole memory
  • Cannot be expanded incrementally
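The rule that the number of banks should be at least M can be illustrated with a simplified timing model. The sketch assumes sequential addresses, low-order interleaving, one request issued per cycle at most, and a bank that stays busy for the full access time; the function name is hypothetical.

```python
def cycles_for_sequential_reads(words: int, banks: int, access_time: int) -> int:
    """Cycle at which the last of `words` sequential reads completes."""
    bank_free = [0] * banks   # earliest cycle at which each bank accepts a request
    next_issue = 0            # earliest cycle at which the next request can be issued
    finish = 0
    for addr in range(words):
        bank = addr % banks                       # low-order interleaving
        start = max(next_issue, bank_free[bank])  # wait if the bank is still busy
        finish = start + access_time
        bank_free[bank] = finish
        next_issue = start + 1
    return finish


M = 4  # assumed access time in cycles
for banks in (1, 2, 4, 8):
    print(banks, "banks:", cycles_for_sequential_reads(8, banks, M), "cycles for 8 words")
```

In this model, once the number of banks reaches M, one word is delivered per cycle after the initial latency, and adding more banks gives no further gain for sequential accesses.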

1. Static RAM (SRAM)
• Essentially uses flip-flops to store charge (transistor circuit)
• As long as power is present, transistors do not lose charge (no refresh)
• Very fast (no sense circuitry to drive nor charge depletion)
• Complex construction
• Large bit circuit
• Expensive
• Used for cache RAM because of its speed and because a large volume is not needed

Static RAM Structure
• Six transistors per bit (flip flop)
[Figure: SRAM cell built from cross-coupled "NOT" gates, annotated with example 0/1 values]

2. Dynamic RAM (DRAM)
• Bits stored as charge in capacitors
• Simpler construction
• Smaller per bit
• Less expensive
• Slower than SRAM
• Typical application is main memory
• Essentially analogue: the level of charge determines the value

Dynamic RAM Structure
• One transistor and one capacitor per bit
• A ‘High’ voltage at Y allows current to flow from X to Z or from Z to X
[Figure: DRAM cell with transistor terminals X, Y, Z and a storage capacitor]

SRAM vs. DRAM

                  Static Random Access Memory (SRAM)              Dynamic Random Access Memory (DRAM)
Storage element   [figure]                                        [figure]
Advantages        Fast; no refreshing operations                  High density and less expensive
Disadvantages     Large silicon area; expensive                   Slow; requires refreshing operations
Applications      High-speed memory applications, such as cache   Main memories in computer systems
