1 / 115

Digital Integrated Circuits A Design Perspective

Digital Integrated Circuits A Design Perspective. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic. Semiconductor Memories. December 20, 2002. Chapter Overview. Memory Classification. Memory Architectures. The Memory Core. Periphery. Reliability Case Studies.

len-wade
Download Presentation

Digital Integrated Circuits A Design Perspective

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Digital Integrated CircuitsA Design Perspective Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic Semiconductor Memories December 20, 2002

  2. Chapter Overview • Memory Classification • Memory Architectures • The Memory Core • Periphery • Reliability • Case Studies

  3. Memory Classification • Size • Speed (timing) • Technology (CMOS, bipolar) • Function (ROM, RWM) • Volatility • Access (random, serial) • Application

  4. Semiconductor Memory Classification Non-Volatile Read-WriteMemory Read-Write Memory Read-Only Memory Random Non-Random EPROM Mask-Programmed Access Access 2 E PROM Programmable (PROM) FLASH FIFO SRAM LIFO DRAM Shift Register CAM

  5. Memory Timing: Definitions

  6. Memory Timing: Definitions • Read-access time: delay between the read request and the moment the data is available at the output. • Write-access time: delay between the write request and the writing of the input data into the memory. • Cycle time: the minimum time required between successive reads (or writes).

  7. A Typical Memory Hierarchy • By taking advantage of the principle of locality: • Present the user with as much memory as is available in the cheapest technology. • Provide access at the speed offered by the fastest technology. On-Chip Components Control eDRAM Secondary Memory (Disk) Instr Cache Second Level Cache (SRAM) ITLB Main Memory (DRAM) Datapath Data Cache RegFile DTLB Speed (ns): .1’s 1’s 10’s 100’s 1,000’s Size (bytes): 100’s K’s 10K’s M’s T’s Cost: highest lowest

  8. Memory Architecture • A memory of N words, each one M bits wide, needs one select bit to identify the word. • If the number of words is high (f.i. N=106 or N=220) we need too many select lines. • A memory word is selected from a decoder where encode addresses enter: N=2k (in the exemple k=20).

  9. Decoder reduces the number of select signals K = log N 2 Memory Architecture: Decoders M bits M bits S S 0 0 Word 0 Word 0 S 1 Word 1 Word 1 A 0 S Storage Storage 2 Word 2 Word 2 A cell cell 1 words A N K 1 Decoder - S N 2 - - Word N 2 Word N 2 - S N 1 - - Word N 1 Word N 1 - K log N = 2 Input-Output Input-Output ( M bits) ( M bits) Intuitive architecture for N x M memory Too many select signals: N words == N select signals

  10. Array-Structured Memory Architecture Problem: ASPECT RATIO or HEIGHT >> WIDTH Amplify swing to rail-to-rail amplitude Selects appropriate word

  11. Array-Structured Memory Architecture • The select line that enables a sigle row of cells is the word line. • The line that connects the cells in a single column to the I/O circuits is the bit line. • Reduced swing requires sense amplifiers.

  12. Hierarchical Memory Architecture Advantages: 1. Shorter wires within blocks 2. Block address activates only 1 block => power savings

  13. Clock -address -address Z X generator buffer buffer Predecoder and block selector Bit line load Transfer gate Column decoder Sense amplifier and write driver CS, WE I/O x1/x4 -address -address Y X buffer buffer controller buffer buffer Block Diagram of 4 Mbit SRAM 128 K Array Block 0 Subglobal row decoder Subglobal row decoder Global row decoder Block 30 Block 31 Block 1 Local row decoder [Hirose90]

  14. Contents-Addressable Memory I/O Buffers I/O Buffers Commands Commands Validity Bits 9 2 Priority Encoder Validity Bits Address Decoder 9 2 Priority Encoder Address Decoder

  15. DRAM Timing SRAM Timing Self-timed Multiplexed Adressing Memory Timing: Approaches

  16. Read-Only Memory Cells BL BL BL VDD WL WL WL 1 BL BL BL WL WL WL 0 GND Diode ROM MOS ROM 1 MOS ROM 2

  17. MOS OR ROM BL [0] BL [1] BL [2] BL [3] WL [0] V DD WL [1] WL [2] V DD WL [3] V bias Pull-down loads

  18. MOS NOR ROM V DD Pull-up devices WL [0] GND WL [1] WL [2] GND WL [3] BL [0] BL [1] BL [2] BL [3]

  19. MOS NOR ROM Layout Cell (9.5l x 7l) Programmming using the Active Layer Only Polysilicon Metal1 Diffusion Metal1 on Diffusion

  20. MOS NOR ROM Layout Cell (11l x 7l) Programmming using the Contact Layer Only Polysilicon Metal1 Diffusion Metal1 on Diffusion

  21. MOS NAND ROM V DD Pull-up devices BL [0] BL [1] BL [2] BL [3] WL [0] WL [1] WL [2] WL [3] All word lines high by default with exception of selected row

  22. No contact to VDD or GND necessary; drastically reduced cell size Loss in performance compared to NOR ROM MOS NAND ROM Layout Cell (8l x 7l) Programmming using the Metal-1 Layer Only Polysilicon Diffusion Metal1 on Diffusion

  23. NAND ROM Layout Cell (5l x 6l) Programmming using Implants Only Polysilicon Threshold-alteringimplant Metal1 on Diffusion

  24. V DD BL r word WL C bit c word Equivalent Transient Model for MOS NOR ROM Model for NOR ROM • Word line parasitics • Wire capacitance and gate capacitance • Wire resistance (polysilicon) • Bit line parasitics • Resistance not dominant (metal) • Drain and Gate-Drain capacitance

  25. Equivalent Transient Model for MOS NAND ROM V DD Model for NAND ROM BL C L r bit c bit r word WL c word • Word line parasitics • Similar to NOR ROM • Bit line parasitics • Resistance of cascaded transistors dominates • Drain/Source and complete gate capacitance

  26. Decreasing Word Line Delay

  27. Precharged MOS NOR ROM V f DD pre Precharge devices WL [0] GND WL [1] WL [2] GND WL [3] BL [0] BL [1] BL [2] BL [3] PMOS precharge device can be made as large as necessary, but clock driver becomes harder to design.

  28. D G S Non-Volatile MemoriesThe Floating-gate transistor (FAMOS) Floating gate Gate Source Drain t ox t ox + +_ n n p Substrate Schematic symbol Device cross-section

  29. 20 V 0 V 5 V 20 V 0 V 5 V 10 V 5 V 5 V 2.5 V 2 2 S D S D S D Avalanche injection Removing programming voltage leaves charge trapped Programming results in higher V . T Floating-Gate Transistor Programming

  30. A “Programmable-Threshold” Transistor

  31. FLOTOX EEPROM Gate Floating gate I Drain Source V 20 – 30 nm -10 V GD 10 V 1 1 n n Substrate p 10 nm Fowler-Nordheim I-V characteristic FLOTOX transistor

  32. V DD EEPROM Cell BL WL Absolute threshold control is hard Unprogrammed transistor might be depletion  2 transistor cell

  33. Flash EEPROM Control gate Floating gate erasure Thin tunneling oxide 1 1 n source n drain programming p- substrate Many other options …

  34. Cross-sections of NVM cells Flash EPROM Courtesy Intel

  35. Basic Operations in a NOR Flash Memory―Erase

  36. Basic Operations in a NOR Flash Memory―Write

  37. Basic Operations in a NOR Flash Memory―Read

  38. NAND Flash Memory Word line(poly) Unit Cell Source line (Diff. Layer) Courtesy Toshiba

  39. Select transistor Word lines Active area STI Bit line contact Source line contact NAND Flash Memory Courtesy Toshiba

  40. Characteristics of State-of-the-art NVM

  41. Read-Write Memories (RAM) • STATIC (SRAM) Data stored as long as supply is applied Large (6 transistors/cell) Fast Differential • DYNAMIC (DRAM) Periodic refresh required Small (1-3 transistors/cell) Slower Single Ended

  42. Q 6-transistor CMOS SRAM Cell WL V DD M M 2 4 Q M M 6 5 M M 1 3 BL BL

  43. CMOS SRAM Analysis (Read Q=1) • Both bit lines are precharged high. • The word line is actvated: M5 and M6 are ON. • The read operation consists of leaving BL=1 and of discharging BL through M1- M5. • The resistance of M5 must be larger than that of M1 to avoid the change of state of the cell (read disturb). • Alternatively, the bit lines can be precharged at VDD/2.

  44. CMOS SRAM Analysis (Read) WL V DD M BL 4 BL Q 0 = M Q 1 = 6 M 5 V V V M DD DD DD 1 C C bit bit

  45. CMOS SRAM Analysis (Read) 1.2 1 0.8 0.6 Voltage Rise (V) 0.4 0.2 Voltage rise [V] 0 0 0.5 1 1.2 1.5 2 2.5 3 Cell Ratio (CR)

  46. CMOS SRAM Analysis (Write Q=0) • When Q=1, a 0 is written in the cell by setting BL=0 e BL=1. • Due to the read-disturb condition the data has to be written through M6. • Q must become lower than the threshold of M1.

  47. WL V DD M 4 M Q 0 = 6 M 5 Q 1 = M 1 V DD BL 1 BL 0 = = CMOS SRAM Analysis (Write)

  48. CMOS SRAM Analysis (Write) PR=W4/W6

  49. Cell Sizing • Keeping cell size minimized is critical for large caches • Minimum sized pull down fets (M1 and M3) • Requires minimum width and longer than minimum channel length pass transistors (M5 and M6) to ensure proper CR • But sizing of the pass transistors increases capacitive load on the word lines and limits the current discharged on the bit lines both of which can adversely affect the speed of the read cycle • Minimum width and length pass transistors • Boost the width of the pull downs (M1 and M3) • Reduces the loading on the word lines and increases the storage capacitance in the cell – both are good! – but cell size may be slightly larger

  50. VDD M2 M4 Q Q M1 M3 GND M5 M6 WL BL BL 6T-SRAM — Layout

More Related