Digital integrated circuits a design perspective
This presentation is the property of its rightful owner.
Sponsored Links
1 / 114

Digital Integrated Circuits A Design Perspective PowerPoint PPT Presentation


  • 56 Views
  • Uploaded on
  • Presentation posted in: General

Digital Integrated Circuits A Design Perspective. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic. Semiconductor Memories. December 20, 2002. Chapter Overview. Memory Classification. Memory Architectures. The Memory Core. Periphery. Reliability Case Studies.

Download Presentation

Digital Integrated Circuits A Design Perspective

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Digital integrated circuits a design perspective

Digital Integrated CircuitsA Design Perspective

Jan M. Rabaey

Anantha Chandrakasan

Borivoje Nikolic

Semiconductor

Memories

December 20, 2002


Chapter overview

Chapter Overview

  • Memory Classification

  • Memory Architectures

  • The Memory Core

  • Periphery

  • Reliability

  • Case Studies


Semiconductor memory classification

Semiconductor Memory Classification

Non-Volatile

Read-WriteMemory

Read-Write Memory

Read-Only Memory

Random

Non-Random

EPROM

Mask-Programmed

Access

Access

2

E

PROM

Programmable (PROM)

FLASH

FIFO

SRAM

LIFO

DRAM

Shift Register

CAM


Memory timing definitions

Memory Timing: Definitions

Read-access is from read request time to the time data is available on the output

Write-access is from the write request to the final writing of the input data to the memory


Memory architecture decoders

Decoder reduces the number of selected signals

K = log

N

2

Memory Architecture: Decoders

M

bits

M

bits

S

S

0

0

Word 0

Word 0

S

1

Word 1

Word 1

A

0

S

Storage

Storage

2

Word 2

Word 2

A

cell

cell

1

N words

Decoder

A

K

1

2

S

N

2

-

2

Word

N

2

Word

N

2

2

S

N

1

2

-

Word

N

1

Word

N

1

2

K

log

N

=

2

Input-Output

Input-Output

(

M

bits)

(

M

bits)

Intuitive architecture for N x M memory

Too many select signals:

N words == N select signals


Array structured memory architecture

Array-Structured Memory Architecture

Problem: ASPECT RATIO or HEIGHT >> WIDTH

Amplify swing to

rail-to-rail amplitude

Selects appropriate

word


Hierarchical memory architecture

Hierarchical Memory Architecture

Advantages:

1. Shorter wires within blocks

2. Block address activates only 1 block => power savings


Block diagram of 4 mbit sram

Block Diagram of 4 Mbit SRAM

X – row address

Y – column address

Z – block address

Clock

-address

-address

Z

X

generator

buffer

buffer

Predecoder and block selector

Bit line load

Block 1

128 K Array Block 0

Global row decoder

Subglobal row decoder

Subglobal row decoder

Block 31

Block 30

Transfer gate

Local row decoder

Column decoder

Sense amplifier and write driver

CS, WE

I/O

x1/x4

-address

-address

Y

X

buffer

buffer

controller

buffer

buffer

[Hirose90]


Contents addressable memory

Contents-Addressable Memory

Data (64 bits)

Every row that satisfies the validity

check (matches the mask bits) will

activate the validity bits

I/O Buffers

Comparand

Data pattern to be matched

Commands

Mask

Priority encoder passes

Only the valid rows and

selects the one with the

highest address in case

of the multiple matches

CAM Array

ProrityEncoder

Address Decoder

R/W Address (9 bits)

Control Logic

9

2

words

64 bits


Memory timing approaches

DRAM Timing

SRAM Timing

Self-timed

Multiplexed Addressing

Memory Timing: Approaches

Multiplexed addressing

Smaller number of pins needed since row

and column addresses are sent sequentially

Strobe signals RAS and CAS are used to

read them

Complete addressing

Complete address is presented at once

with sensing circuits to detect transitions

on the address buss


Read only memory cells

BL

BL

BL

VDD

WL

WL

WL

1

BL

BL

BL

WL

WL

WL

0

GND

Diode ROM

MOS ROM 1

MOS ROM 2

Read-Only Memory Cells

Bit line (BL) is resistively clamped to the ground, so its default value is 0

Diode disadvantage – no electrical isolation between bit and word lines

BL is resistively clamped

to VDD, so its default

value is 1


Mos or rom

MOS OR ROM

BL

[0]

BL

[1]

BL

[2]

BL

[3]

WL

[0]

V

DD

WL

[1]

WL

[2]

V

DD

WL

[3]

V

bias

Pull-down loads


Mos nor rom

MOS NOR ROM

V

DD

Pull-up devices

WL

[0]

GND

WL

[1]

WL

[2]

GND

WL

[3]

BL

[0]

BL

[1]

BL

[2]

BL

[3]


Mos nor rom layout

MOS NOR ROM Layout

Cell (9.5l x 7l)

  • Programming using the

  • Active Layer Only

  • diffusion is added to

  • create transistors

GND

Polysilicon

GND

Metal1

Diffusion

Metal1 on Diffusion

VDD

VDD

VDD

VDD

Connected to VDD through pMOS


Mos nor rom layout1

MOS NOR ROM Layout

Cell (11l x 7l)

GND

  • Programming using

  • the Contact Layer Only

  • contacts are added

  • to create transistors

Polysilicon

Metal1

GND

Diffusion

Metal1 on Diffusion

VDD

VDD

VDD

VDD

Connected to VDD through pMOS


Mos nand rom

MOS NAND ROM

V

DD

Pull-up devices

BL

[0]

BL

[1]

BL

[2]

BL

[3]

WL

[0]

WL

[1]

WL

[2]

WL

[3]

All word lines high by default with exception of selected row


Mos nand rom layout

No horizontal contact to GND necessary;

drastically reduced cell size

Loss in performance compared to NOR ROM

MOS NAND ROM Layout

Cell (8l x 7l)

Programming using

the Metal-1 Layer Only

Polysilicon

Diffusion

Metal1 on Diffusion

Add metal to eliminate transistor (short circuit)


Nand rom layout

NAND ROM Layout

Cell (5l x 6l)

Programming using

Implants Only

Polysilicon

Threshold-alteringimplant

Implant switches transistor on permanently

thus eliminating it in ROM

this results in a very small layout area

(2 times smaller than NOR cell)

Metal1 on Diffusion


Equivalent transient model for mos nor rom

V

DD

BL

r

word

WL

C

bit

c

word

Equivalent Transient Model for MOS NOR ROM

Model for NOR ROM

  • Word line parasitics

    • Wire capacitance and gate capacitance

    • Wire resistance (polysilicon)

  • Bit line parasitics

    • Resistance not dominant (metal)

    • Drain and Gate-Drain capacitance


Equivalent transient model for mos nand rom

Equivalent Transient Model for MOS NAND ROM

V

DD

Model for NAND ROM

BL

C

L

r

bit

c

bit

r

word

WL

c

word

  • Word line parasitics

    • Similar to NOR ROM

  • Bit line parasitics

    • Resistance of cascaded transistors dominates

    • Drain/Source and complete gate capacitance


Decreasing word line delay

Decreasing Word Line Delay


Precharged mos nor rom

Precharged MOS NOR ROM

V

f

DD

pre

Precharge devices

WL

[0]

GND

WL

[1]

WL

[2]

GND

WL

[3]

BL

[0]

BL

[1]

BL

[2]

BL

[3]

PMOS precharge device can be made as large as necessary,

but clock driver becomes harder to design.


Non volatile memories the floating gate avalanche injection mos transistor famos

D

G

S

Non-Volatile MemoriesThe Floating-gate Avalanche injection MOS transistor (FAMOS)

Floating gate

Gate

Source

Drain

t

ox

t

ox

+

+_

n

n

p

Substrate

Schematic symbol

Device cross-section


Floating gate transistor programming

20 V

0 V

5 V

20 V

0 V

5 V

10 V

5 V

5 V

2.5 V

-

-

S

D

S

D

S

D

Avalanche injection

Removing programming

voltage leaves charge trapped

Programming results in

higher

V

.

T

Floating-Gate Transistor Programming

Hot electrons go through

the oxide and are trapped

reducing floating gate voltage

Typically new threshold is around 7 V so 5 V supply is not sufficient to turn transistor on, so the device is disabled


A programmable threshold transistor

A “Programmable-Threshold” Transistor

Shifted threshold effectively

switches transistor off permanently

Shift in the threshold depends on the

charge injected onto the floating gate

Injected charges are well insulated by silicon dioxide and can

stay there with power off for many years.

Floating gate used in almost all nonvolatile memories (EPROM, EEPROM, Flash)


Erasable programmable read only memory eprom

Erasable-Programmable Read-Only Memory (EPROM)

  • EPROM is erased by shining ultraviolet light

    through a transparent package window

  • UV makes the oxide slightly conductive by

    generation of electron-hole pairs

  • The erasure process is slow (seconds to minutes)

  • Programming takes 5-10 msec

  • Erasing can be repeated up to a thousand times

  • Threshold is difficult to control after many erasures

    so special on-chip circuitry is required to control it

  • High current required during programming


Floating gate tunneling oxide flotox transistor eeprom

Floating Gate Tunneling Oxide FLOTOX Transistor EEPROM

Gate

Floating gate

I

Drain

Source

V

20

30 nm

-10 V

GD

10 V

1

1

n

n

Substrate

p

10 nm

Fowler-Nordheim

tunneling I-V characteristic

FLOTOX transistor

Smaller voltage is required to program and

programming is reversible by changing the voltage sign


Eeprom cell

V

DD

EEPROM Cell

BL

  • Absolute threshold control

  • is hard

  • Programmed transistor might

  • be in depletion mode hard to

  • turn off by word-line signal

  • 2 transistor cell

  • Design is larger than

    EPROM device

  • Thin oxide is hard to make

  • Erase-program can be

    repeated 105 times

WL


Flash eeprom

Control gate

Floating gate

Erasure

12 V on source

Thin tunneling oxide ~10 nm

+

+

n

source

n

drain

Programming

12 V on gate

p-

substrate

Flash EEPROM

  • Flash EPROM combines density of EPROM with versatility of EEPROM

  • Programming performed by avalanche hot-electron injection (fast 1-10msec)

  • Erasure of the complete chip is done using tunneling (careful control >100ms)

  • No extra access transistor needed


Cross sections of nvm cells

Cross-sections of NVM cells

Flash

EPROM

Courtesy Intel


Basic operations in a nor flash memory erase

Basic Operations in a NOR Flash Memory―Erase

All transistors erased

Read back and repeat if

erase is still needed


Basic operations in a nor flash memory write

Basic Operations in a NOR Flash Memory―Write

Programming requires 12V ant

the gate and 6V at the drain

with 0V source


Basic operations in a nor flash memory read

Basic Operations in a NOR Flash Memory―Read

  • In read operation the programmed

  • transistor stores 1 as it is switched

  • off permanently

  • NOR Flash memories have

  • Fast random read time

  • Slow erasure and programming time

  • Need precise control of thresholds


Nand flash memory

Word line(poly)

Unit Cell

Source line

(Diff. Layer)

NAND Flash Memory

High dielectric material –

oxide-nitride-oxide for large CFG

Select transistors

Large storage density lower cost

Fast programming 200-400 nsec

Fast serial access

Courtesy Toshiba


Nand flash memory1

Select transistor

Word lines

Active area

STI

Bit line contact

Source line contact

NAND Flash Memory

All contacts between word line

are eliminated resulting in 40%

smaller cell than NOR structure

During erasure all transistor are

depletion mode obtained by setting 20V at the source

During program the selected word line is set to 20V to store “1” increasing its threshold

During read both select transistors are enabled and read proceeds as in the NAND ROM

Courtesy Toshiba


New nonvolatile memories

New Nonvolatile Memories

  • FRAM – Ferroelectric RAM

    • Uses programmable capacitors

    • Dielectric cristals polarize under electric field

    • Very high density, many read/write cycles,

    • Low power

  • MRAM – Magnetoresistive RAM

    • Similar to magnetic core memories

    • Spin electronic or tunneling magnetic resistance


Characteristics of state of the art nvm

Characteristics of State-of-the-art NVM


Read write memories ram

Read-Write Memories (RAM)

  • STATIC (SRAM)

Data stored as long as supply is applied

Large (6 transistors/cell)

Fast

Differential

  • DYNAMIC (DRAM)

Periodic refresh required

Small (1-3 transistors/cell)

Slower

Single Ended


6 transistor cmos sram cell

Q

6-transistor CMOS SRAM Cell

WL

V

DD

M

M

2

4

Q

M

M

6

5

M

M

1

3

BL

BL


Cmos sram analysis read

CMOS SRAM Analysis (Read)

WL

V

DD

M

BL

4

BL

Q

0

=

M

Q

1

=

6

M

5

V

V

V

M

DD

DD

DD

1

C

C

bit

bit

Initially BL and BLbar are precharged. Assume that cell stores 1

When the WL goes high BLbar is discharged low

Must keep Qbar voltage rise below the nMOS threshold (0.4V)

to avoid flipping the cell


Cmos sram analysis read1

CMOS SRAM Analysis (Read)

1.2

Cell ratio

1

0.8

0.6

0.4

Design for

cell ratio in the green zone

0.2

0

0

0.5

1.2

1

1.5

2

2.5

3

Voltage Rise (V)

Cell Ratio (CR)


Cmos sram analysis write

WL

V

DD

M

4

M

Q

0

=

6

M

5

Q

1

=

M

1

V

DD

BL

1

BL

0

=

=

CMOS SRAM Analysis (Write)


Cmos sram analysis write1

CMOS SRAM Analysis (Write)

Must pull down cell voltage VQbelow the nMOS threshold (0.4V)


6t sram layout

VDD

M2

M4

Q

Q

M1

M3

GND

M5

M6

WL

BL

BL

6T-SRAM — Layout


Resistance load sram cell

WL

V

DD

R

R

L

L

Q

Q

M

M

3

4

BL

BL

M

M

1

2

Static power dissipation -- Want R

large

L

Bit lines precharged to V

to address t

problem

DD

p

Resistance-load SRAM Cell

For large resistor values use

undoped polysilicon with

sheet resistance in Tohm/sq

An alternative solution uses low

quality parasitic thin-film PMOS

(TFT) with OFF current 10-13A


Resistance load sram cell1

WL

V

DD

R

R

L

L

Q

Q

M

M

3

4

BL

BL

M

M

1

2

Static power dissipation -- Want R

large

L

Bit lines precharged to V

to address t

problem

DD

p

Resistance-load SRAM Cell

For large resistor values use

undoped polysilicon with

sheet resistance in Tohm/sq

An alternative solution uses low

quality parasitic thin-film PMOS

(TFT) with OFF current 10-13A


Static cam memory cell

Bit

Bit

Bit

Bit

Bit

Bit

Word

M8

M9

M4

M5

CAM

•••

CAM

M6

M7

Word

S

S

Word

int

CAM

•••

CAM

M3

M2

Match

M1

Wired-NOR Match Line

Static CAM Memory Cell

Incoming data on

Bit and Bit_bar

are compared with

the stored data

S and S_bar

Match line is

precharged high

•••

•••

If there is a match the internal signal int is grounded and match stays high

Otherwise int is pulled high and match goes low


Sram characteristics

SRAM Characteristics


3 transistor dram cell

BL

1

BL

2

WWL

WWL

RWL

RWL

M

3

V

V

X

M

X

2

DD

T

1

M

2

V

DD

BL

1

C

S

V

D

BL

2

V

V

2

DD

T

No constraints on device ratios

Reads are non-destructive

Value stored at node X when writing a “1” = V

-V

WWL

Tn

3-Transistor DRAM Cell


3t dram layout

BL2

BL1

GND

RWL

M3

M2

WWL

M1

3T-DRAM — Layout


1 transistor dram cell

C

S

)------------

V

V

D

=

V

=

(V

V

BL

PRE

BIT

PRE

C

+

C

S

BL

1-Transistor DRAM Cell

Storage capacitance

Bit-line is precharged

Write: C

is charged or discharged by asserting WL and BL.

S

Read: Charge redistribution takes places between bit line and storage capacitance

Voltage swing is small; typically around 250 mV.


Dram cell observations

DRAM Cell Observations

  • 1T DRAM requires a sense amplifier for each bit line, due to charge redistribution read-out.

  • DRAM memory cells are single ended in contrast to SRAM cells.

  • The read-out of the 1T DRAM cell is destructive; read and refresh operations are necessary for correct operation.

  • Unlike 3T cell, 1T cell requires presence of an extra capacitance that must be explicitly included in the design.

  • When writing a “1” into a DRAM cell, a threshold voltage is lost. This charge loss can be circumvented by bootstrapping the word lines to a higher value thanVDD


Sense amp operation

V

V

(1)

BL

V

PRE

D

V

(1)

V

(0)

t

Sense amp activated

Word line activated

Sense Amp Operation


1 t dram cell

Metal word line

SiO

2

Poly

Field Oxide

Diffused

+

+

n

n

bit line

Inversion layer

Poly

Polysilicon

induced by

Polysilicon

plate

plate bias

gate

1-T DRAM Cell

Capacitor

M

word

1

line

Cross-section

Layout

Uses polysilicon-diffusion capacitance

Expensive in area


Poly diffusion capacitor 1t dram

Poly-diffusion capacitor 1T-DRAM


Advanced 1t dram cells

Advanced 1T DRAM Cells

Word line

Capacitor dielectric layer

Cell plate

Insulating Layer

Cell Plate Si

Capacitor Insulator

Transfer gate

Isolation

Refilling Poly

Storage electrode

Storage Node Poly

Si Substrate

2nd Field Oxide

Stacked-capacitor Cell

Trench Cell


Advanced 1t dram cells1

Advanced 1T DRAM Cells

Word line

Capacitor dielectric layer

Cell plate

Insulating Layer

Cell Plate Si

Capacitor Insulator

Transfer gate

Isolation

Refilling Poly

Storage electrode

Storage Node Poly

Si Substrate

2nd Field Oxide

Stacked-capacitor Cell

Trench Cell


Cam in cache memory

CAM

SRAM

ARRAY

ARRAY

Input Drivers

Sense Amps / Input Drivers

Address

Tag

Hit

R/W

Data

CAM in Cache Memory

Hit Logic

Address Decoder

Cash memory is used to store frequently accessed data to lower the memory

access time and power. In cash CAM stores addresses and SRAM stores data.

Once the address of requested data matches the one in CAM Hit signal goes

high and data is read from SRAM otherwise the external slow memory

must be read


Periphery

Periphery

  • Decoders

  • Sense Amplifiers

  • Input/Output Buffers

  • Control / Timing Circuitry


Row decoders

Row Decoders

Collection of 2M complex logic gates

Organized in regular and dense fashion

(N)AND Decoder - followed by inverter

NOR Decoder


Hierarchical decoders

WL

1

WL

0

A

A

A

A

A

A

A

A

A

A

A

A

A

A

A

A

0

1

0

1

0

1

0

1

2

3

2

3

2

3

2

3

A

A

A

A

A

A

A

A

1

0

0

1

3

2

2

3

NAND decoder using

2-input pre-decoders

Hierarchical Decoders

Multi-stage implementation improves performance


Dynamic decoders

V

DD

V

DD

V

DD

V

DD

Dynamic Decoders

Precharge devices

GND

GND

WL

3

WL

3

WL

2

WL

2

WL

1

WL

1

WL

0

WL

0

V

A

A

A

A

f

DD

0

0

1

1

A

A

A

A

f

0

0

1

1

2-input NAND decoder

Identical to NAND ROM

2-input NOR decoder

Identical to NOR ROM


4 input pass transistor based column decoder

BL

BL

BL

BL

0

1

2

3

S

0

A

0

S

1

S

2

A

S

1

3

D

4-input pass-transistor based Column Decoder

2-input NOR decoder

Advantages: speed (tpd does not add to overall memory access time)

Only one extra transistor in signal path

Disadvantage: Large transistor count


4 to 1 tree based column decoder

4-to-1 tree based Column Decoder

BL

BL

BL

BL

0

1

2

3

A

0

A

0

A

1

A

1

D

Number of devices drastically reduced

Delay increases quadratically with # of sections; prohibitive for large decoders

Solutions:

buffers

progressive sizing

combination of tree and pass transistor approaches


Decoder for circular shift register

V

V

V

V

V

V

DD

DD

DD

DD

DD

DD

WL

WL

WL

0

1

2

f

f

f

f

f

f

R

R

R

f

f

f

f

f

f

V

DD

Decoder for circular shift-register

In serial access memories read or write address changes sequentially

Only one WLi is active (a pointer) and shifts by one with clock f

R is used for reset


Sense amplifiers

D

make

V as small

×

D

C

V

as possible

t

=

----------------

p

I

av

large

small

Sense Amplifiers

Idea: Use Sense Amplifer

small

s.a.

transition

input

output


Differential sense amplifier

Differential Sense Amplifier

V

DD

M

M

3

4

y

Out

M

M

bit

bit

1

2

M

SE

Directly applicable toSRAMs. Gain

Asense=-gm (ro2||ro4)

gm is transconductance

of the input transistors

5


Differential sensing sram

V

V

DD

DD

PC

BL

BL

V

V

DD

DD

EQ

M

M

-

y

y

3

4

WL

i

M

M

-

-

1

2

x

x

x

x

SE

SE

M

5

SE

SRAM cell

i

V

DD

Diff.

-

x

x

Output

Sense

y

Amp

SE

Output

(a) SRAM sensing scheme

(b) two stage differential amplifier

Differential Sensing ― SRAM

Precharge bit lines by pulling PC_bar low

Disable precharge to read

Set SE to activate the sense amplifier

Inputs from memory


Latch based sense amplifier dram

Latch-Based Sense Amplifier (DRAM)

EQ

BL

BL

V

DD

SE

SE

Initialized in its meta-stable point with EQ

Once adequate voltage gap is created, sense amp enabled with SE

Positive feedback quickly forces output to a stable operating point.


Charge redistribution amplifier

Charge-Redistribution Amplifier

V

ref

BL

V

V

L

S

M

1

Vin

C

small

C

M

M

large

2

3

Transient Response

Concept

Vs prechrged to VDD and VL to Vref-Vth

M1 is cut off

When M2 pulls down M1 conducts and Vs

is quickly lowered to equalize VL


Charge redistribution amplifier eprom

Charge-Redistribution Amplifier―EPROM

V

DD

Load

M

SE

4

Out

Cascode

C

out

M

device

V

3

casc

C

col

Column

decoder

M

WLC

2

BL

C

EPROM

M

BL

1

WL

array


Single to differential conversion

WL

BL

-

x

x

Diff.

+

V

S.A.

Cell

ref

-

Output

Single-to-Differential Conversion

How to make a good Vref for differential amplifier?

Must deliver x_bar smaller than x if “1” is stored

and larger than x if “0” is stored


Open bitline architecture with dummy cells

EQ

L

L

L

V

R

R

L

1

0

DD

0

1

SE

BLL

BLR

C

C

C

C

C

C

S

S

S

SE

S

S

S

Dummy cell

Dummy cell

Open bitline architecture with dummy cells

  • Bit line divided into left and right halves to reduce capacitance

  • Dummy cells are added for reference

  • When EQ signal is raised BLL and BLR are precharged to VDD/2

  • and L and L_bar are enabled charging Dummy cell to VDD/2

    • When reading Li word activate both Li and L

    • When reading Ri word activate both Ri and L_bar


Dram read process with dummy cell

DRAM Read Process with Dummy Cell

3

3

2

2

BL

BL

V

V

1

1

BL

BL

0

0

0

1

2

3

0

1

2

3

t

(ns)

t

(ns)

reading 0

reading 1

3

EQ

WL

2

SE

V

1

0

0

1

2

3

t

(ns)

control signals


Voltage down converter

IR

V

IR

DD

IL- IR

M

drive

IL

V

V

REF

DL

Equivalent Model

V

bias

V

REF

-

M

drive

+

V

DL

Voltage Down Converter

Memories require different level

of voltages

Boosted word-line voltage VDD+Vtn

Precharge half VDD voltage

Reduced internal VDD power supply

Negative substrate bias voltage

This circuit delivers the reference

voltage to VDL

When VDL<VREF => PMOS drive gate is discharged increasing VDL

When VDL>VREF=> PMOS drive gate is charged increasing VDL


Charge pump

-

-

Charge Pump

Transistors M1 and M2 are connected as diodes.

The charges stored at capacitor Q=Cpump(VDD-VT)

When B rises above Vload by more than threshold the Vload increases

Maximum load voltage Vload rises to 2*(VDD-VT) – higher than VDD


Dram timing

DRAM Timing

Total of 24 timing constraints must be observed


Rdram architecture

Bus

Clocks

k

Data

k

l

*

memory

network

bus

array

mux/demux

Column

packet dec.

demux

Row

packet dec.

demux

RDRAM Architecture

Very high speed packet transfer protocol is used to transfer

large amount of data

Narrow bus uses several clock cycles to transfer the data


Address transition detection

V

DD

DELAY

t

A

d

0

ATD

ATD

DELAY

t

A

d

1

DELAY

t

A

1

d

N

2

Address Transition Detection

SRAM memories are triggered byevents detected by ATD circuits


Reliability and yield

Reliability and Yield


Trends in dram parameters

1000

C

D

(fF)

V

smax

(mV)

Q

100

S

(fC)

C

S

(fF)

10

V

DD

(V)

Q

C

V

/

2

=

S

S

DD

V

Q

/

(

C

C

)

=

+

smax

S

S

D

4K

64K

1M

16M

256M

4G

64G

Memory Capacity (bits

/

chip)

Trends in DRAM Parameters

  • bit line capacitance

  • voltage swing

  • storage charge

  • storage capacitance

  • power supply voltage

  • are reduced in

  • new technologies

From [Itoh01]


Open bit line architecture cross coupling

Open Bit-line Architecture —Cross Coupling

Word line selected

creates charge redistribution

Dummy line selected to

compensate charge redistribution

EQ

WL

WL

WL

WL

WL

WL

1

1

0

D

D

0

C

C

WBL

WBL

BL

BL

C

Sense

C

BL

BL

Amplifier

C

C

C

C

C

C

Charge redistribution

If both sides of memory array were symmetrical then the injected bit line

noise would be compensated as a common mode signal for sense amplifiers


Folded bitline architecture

WL

WL

WL

WL

WL

WL

C

D

1

1

0

0

D

WBL

C

y

BL

x

BL

Sense

C

C

C

C

C

C

Amplifier

EQ

x

y

BL

C

BL

C

WBL

Folded-Bitline Architecture

Better matching of BL BL_bar capacitances, so the cross-coupling

noise can be suppressed


Transposed bitline architecture

C

cross

BL

BL

SA

BL

BL

‘’

(a) Straightforward bit-line routing

C

cross

BL

BL

SA

BL

BL

‘’

(b) Transposed bit-line architecture

equalizes cross coupling noise in BL and BL_bar

Transposed-Bitline Architecture


Sources of power dissipation in memories

V

DD

I

C

V

f

I

CHIP

=

S

D

+S

DD

i

i

DCP

nC

V

f

m

DE

INT

selected

mi

act

C

V

f

PT

INT

I

DCP

n

m(n

1)i

-

non-selected

ROW

hld

ARRAY

DEC

mC

V

f

DE

INT

PERIPHERY

COLUMN DEC

V

SS

Sources of Power Dissipation in Memories

iact active selector current

ihld data retention current

CDE decoder capacitance

CPT peripheral capacitance

IDCP static peripheral current

VINT internal supply voltage

From [Itoh00]


Noise sources in 1t dram

substrate

BL

Adjacent BL

C

WBL

a

-particles

WL

leakage

C

S

electrode

C

cross

Noise Sources in 1T DRam

Capacitance Cs must be above 30fF

otherwise a single a-particle

can destroy its charge

Free neutrons from cosmic rays

carry 10 time more charges than

alpha particles

Memories covered with polymide to

protect against alpha radiation

Error correction codes used to

correct most failures


Alpha particles

Alpha-particles

-particle

a

WL

V

DD

BL

SiO

2

+

n

-

+

-

+

-

-

+

+

-

Alpha particle have energy 8-9 MeV

And penetrates silicon up to 10mm depth

+

-

+

1 Particle generates ~ 1 Million electron-hole pairs

This is comparable with the 50 fF capacitance storage at 3.5V


Memory yield

Memory Yield

Yield curves at different stages of process maturity

(from [Veendrick92])

Memory yield problem is fought using redundancy and error correction


Redundancy

Redundancy

Row

Address

Redundant

rows

Fuse

:

Bank

Redundant

columns

Memory

Array

Row Decoder

Column

Column Decoder

Address


Error correcting codes

e.g. B3 Wrong

with

1

= 3

1

0

Error-Correcting Codes

Example: Hamming Codes


Redundancy and error correction

Redundancy and Error Correction


Redundancy and error correction1

Redundancy and Error Correction


Data retention in sram

1.30u

0.13 m CMOS

m

1.10u

900n

700n

Ileakage

500n

Factor 7

300n

0.18 m CMOS

m

100n

0.00

.600

1.20

1.80

VDD

Data Retention in SRAM

  • SRAM leakage increases with technology scaling yet for 64-Gb memory leakage current should be less than 3.5 aA at 25C

  • Reduce leakage by turning

    off unused memory blocks (cashes)

  • Increase thresholds by using

    body biasing

  • Increase resistance in the leakage path

  • Lower the supply voltage

  • Lower the junction temperature

  • Scale down the refresh period

(A)


Suppressing leakage in sram

Suppressing Leakage in SRAM

V

DD

V

V

low-threshold transistor

DD

DDL

sleep

V

sleep

DD,int

V

DD,int

SRAM

SRAM

SRAM

cell

cell

cell

SRAM

SRAM

SRAM

cell

cell

cell

V

SS,int

sleep

Inserting Extra Resistance

Reducing the supply voltage


Data retention in dram

Data Retention in DRAM

Data retention

(standby) current

Active current

Estimated current distribution for DRAM generations

From [Itoh00]


Case studies

Case Studies

  • Programmable Logic Array

  • SRAM

  • Flash Memory


Pla versus rom

PLA versus ROM

  • Programmable Logic Array

structured approach to random logic

“two level logic implementation”

NOR-NOR (product of sums)

NAND-NAND (sum of products)

IDENTICAL TO ROM!

  • Main difference

ROM: fully populated

PLA: one element per minterm

Note: Importance of PLA’s has drastically reduced

1.

slow

2.

better software techniques (mutli-level logic

synthesis) for random logic design


Programmable logic array

Programmable Logic Array

Pseudo-NMOS PLA

V

DD

GND

GND

GND

GND

GND

GND

GND

V

X

X

X

X

X

X

f

f

0

0

1

1

2

2

0

1

DD

AND-plane

OR-plane


Dynamic pla

Dynamic PLA

f

AND

V

GND

DD

f

OR

f

OR

f

AND

V

f

f

GND

X

X

X

X

X

X

0

1

0

0

1

1

2

2

DD

AND-plane

OR-plane


Clock signal generation for self timed dynamic pla

f

f

f

AND

Dummy AND row

f

AND

t

t

pre

eval

f

Dummy AND row

f

OR

AND

f

OR

(a) Clock signals

(b) Timing generation circuitry

Clock Signal Generation for self-timed dynamic PLA

  • Self-timing is recommended in PLA for maximum performance

  • Dummy AND row in PLA estimates maximum loading condition

    for the required precharge time

  • Worst case discharge time is estimated in a similar way


Pla layout

PLA Layout


4 mbit sram hierarchical word line architecture

4 Mbit SRAMHierarchical Word-line Architecture

To improve performance and reduce power consumption in large memories

the word line is hierarchically divided and sections are activated as needed


Bit line circuitry

Bit-line Circuitry

Block

Bit-line

ATD

select

load

BEQ

Local

WL

Memory cell

B

/

T

B

/

T

CD

CD

CD

I

/

O

I/O line

I

/

O

Sense amplifier


Sense amplifier and waveforms

Address

ATD

BEQ

Vdd

I/O Lines

GND

SEQ

Vdd

SA, SA

GND

DATA

Data-cut

Sense Amplifier (and Waveforms)

I

/

O

I

/

O

SEQ

Block

select

ATD

BS

SA

BS

SA

SEQ

SEQ

SEQ

SEQ

DATA

De

i

BS


1 gbit flash memory

1 Gbit Flash Memory

From [Nakamura02]


Writing flash memory

Writing Flash Memory

8

10

6

10

Number of cells

4

Read level (4.5 V)

10

2

10

0

10

0V

1V

2V

3V

4V

Vt of memory cells

Final Distribution

Evolution of thresholds

During erasure all bits are programmed to become depletion devices

It takes four cycles of write/erase to establish all device threshold > 0.8 V

From [Nakamura02]


125 mm 2 1gbit nand flash memory

125mm2 1Gbit NAND Flash Memory

32 word lines x 1024 blocks

10.7mm

Charge pump

2kB Page buffer & cache

16896 bit lines

11.7mm

From [Nakamura02]


125 mm 2 1gbit nand flash memory1

125mm2 1Gbit NAND Flash Memory

  • Technology 0.13m p-sub CMOS triple-well

    1poly, 1polycide, 1W, 2Al

  • Cell size 0.077m2

  • Chip size 125.2mm2

  • Organization 2112 x 8b x 64 page x 1k block

  • Power supply 2.7V-3.6V

  • Cycle time 50ns

  • Read time25s

  • Program time 200s / page

  • Erase time 2ms / block

From [Nakamura02]


Semiconductor memory trends up to the 90 s

Semiconductor Memory Trends(up to the 90’s)

Memory Size as a function of time: x 4 every three years


Semiconductor memory trends updated

Semiconductor Memory Trends(updated)

From [Itoh01]


Semiconductor memory trends

Semiconductor Memory Trends

  • There was an apparent shift in memory market due to

    the increase of flash memory use for video

    and other personalized storage devices other

    than personal computers

  • SOC technology implements systems integrated with

    memories and processors on a single die

  • Challenges of making even bigger and denser

    memories with lower market incentives responsible

    for the slow down in the memory chip capacities


Trends in memory cell area

Trends in Memory Cell Area

From [Itoh01]


Semiconductor memory trends1

Semiconductor Memory Trends

Technology feature size for different SRAM generations


  • Login