Variable retention time

Variable retention time (VRT) is a reliability issue in dynamic random-access memory (DRAM) characterized by unpredictable fluctuations in the retention time of memory cells, that is, the duration for which a cell can reliably store data without being refreshed.^[1] If a cell's retention time becomes shorter than the refresh interval, memory errors may occur, potentially resulting in system crashes or silent data corruption. VRT-affected bits that go undetected during product testing may pose a significant risk to device reliability. To mitigate the impact of VRT^[2] and soft errors, DRAM manufacturers have implemented error-correcting code (ECC) mechanisms directly within the memory chips. This approach has become a standard feature in DDR5 SDRAM.^[3]

Possible sources of VRT bits include high-voltage gate stress,^[4] exposure to high-energy particle radiation,^[5] and high-temperature stress.^[6]

Background

In dynamic random-access memory (DRAM), each bit of data is stored in a memory cell composed of a capacitor and a transistor. The amount of electrical charge stored in the capacitor determines whether the cell represents a binary "1" or "0". These cells are densely packed into integrated circuits, accompanied by control logic that manages data access. Because charge inherently leaks from the capacitors over time, DRAM cells must be periodically refreshed to maintain data integrity. A refresh rewrites the contents of each cell at regular intervals to prevent data loss.^[7]

Overview

The amount of time a cell can reliably store data without being refreshed is called the cell's retention time ( $tRET$ ). In the case of a constant leakage current ( $I_{D}$ ), $tRET$ can be approximated as

$$tRET\approx {\frac {C\cdot \Delta V}{I_{D}}}$$

where $C$ is the storage node capacitance and $C\cdot \Delta V$ is the amount of charge loss required to cause a failure.^[8] In modern devices, at operating temperatures, $I_{D}$ is dominated by generation current due to defects in the cell's access transistor. Variability in defect configuration is responsible for a wide spread in the value of the leakage current, and therefore of $tRET$ , across different memory cells.^[9]
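The approximation above can be checked with a few lines of arithmetic. The following is a minimal sketch, not a device model: the capacitance, voltage margin, leakage currents, and the 64 ms refresh interval are assumed round numbers chosen for illustration, and the function names are hypothetical.

```python
def retention_time(capacitance_f, delta_v, leakage_a):
    """Approximate retention time in seconds: tRET ~ C * dV / I_D."""
    if leakage_a <= 0:
        raise ValueError("leakage current must be positive")
    return capacitance_f * delta_v / leakage_a


def fails_refresh(t_ret_s, refresh_interval_s):
    """A cell loses data if its retention time is shorter than the refresh interval."""
    return t_ret_s < refresh_interval_s


# Assumed, illustrative values (not taken from the cited sources).
C_STORAGE = 10e-15        # storage node capacitance: 10 fF
DELTA_V = 0.5             # voltage loss that causes a failure: 0.5 V
REFRESH_INTERVAL = 64e-3  # assumed refresh interval: 64 ms

for leakage in (1e-15, 10e-15, 100e-15):  # 1 fA, 10 fA, 100 fA
    t_ret = retention_time(C_STORAGE, DELTA_V, leakage)
    status = "FAILS" if fails_refresh(t_ret, REFRESH_INTERVAL) else "ok"
    print(f"I_D = {leakage:.0e} A -> tRET = {t_ret:.3f} s ({status})")
```

Because $tRET$ is inversely proportional to $I_{D}$ , a tenfold increase in leakage shortens the retention time tenfold, which is why a spread in leakage across cells translates directly into a spread in retention time.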

Only a few cells actually have $tRET$ approaching the refresh interval.^[10] To improve yield and reliability, DRAM chips include redundant rows or columns that can be used to replace faulty rows, columns, or single cells, including those with retention times shorter than the refresh interval.^[11] However, this technique is less effective against VRT cells, which may begin to fail only after faulty cell replacement has been performed, typically at the die level.^[11]

Physics

Figure: Physical models for VRT. Structural modification of the defect may alter its energy level; electrical charge trapped in the gate oxide may significantly affect the electric field at the defect site.

At the microscopic level, defects located in the bulk or at the Si/SiO₂ interface of the access transistor are believed to be the primary source of the leakage responsible for the discharge of the storage capacitor.^[9] According to Shockley–Read–Hall (SRH) theory, the generation rate depends on the trap energy level ( $E_{T}$ ), the free carrier concentration, and the temperature.^[12] For defects located in the depletion region, where free carrier concentrations are typically negligible and the generation rate is maximized, the current can be approximated as:

$$I_{SRH}\simeq {\frac {qn_{i}\sigma v_{th}}{2\cosh \left({\frac {|E_{T}-E_{i}|}{kT}}\right)}}$$

where $E_{i}$ is the intrinsic Fermi energy, $n_{i}$ is the intrinsic carrier concentration in silicon, $\sigma$ is the capture cross section, which determines the probability of carrier capture and emission (assumed to be equal for electrons and holes for simplicity), and $v_{th}$ is the thermal velocity of carriers.^[10] Here $q$ is the elementary charge, $k$ is the Boltzmann constant, and $T$ is the absolute temperature. Large electric fields ( $F$ ) are known to enhance $\sigma$ , resulting in increased generation current. Incorporating this effect, the total leakage current can be expressed as

$$I=(1+\Gamma (F,E_{T}))\cdot I_{SRH}(E_{T})$$

where $\Gamma (F,E_{T})$ is the field enhancement factor, a positive quantity that becomes significant under strong electric fields.^[13]
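To get a feel for how strongly the leakage depends on the trap energy and on field enhancement, the two expressions above can be evaluated numerically. This is a rough sketch under stated assumptions: the values used for $n_{i}$ , $\sigma$ , $v_{th}$ , the trap offsets, and $\Gamma$ are illustrative round numbers, not data from the cited works, and $\Gamma$ is passed in as a plain number because its functional form is not given here.

```python
import math

Q = 1.602176634e-19      # elementary charge, C
K_B_EV = 8.617333262e-5  # Boltzmann constant, eV/K


def srh_current(e_t_minus_e_i_ev, temp_k, n_i_cm3, sigma_cm2, v_th_cm_s):
    """Generation current (A) of one depletion-region trap, per the SRH approximation."""
    x = abs(e_t_minus_e_i_ev) / (K_B_EV * temp_k)
    return Q * n_i_cm3 * sigma_cm2 * v_th_cm_s / (2.0 * math.cosh(x))


def total_current(i_srh, gamma):
    """Apply the field enhancement factor: I = (1 + Gamma) * I_SRH."""
    if gamma < 0:
        raise ValueError("the field enhancement factor is a positive quantity")
    return (1.0 + gamma) * i_srh


# Assumed, order-of-magnitude parameters for silicon near room temperature.
TEMP_K = 300.0
N_I = 1e10     # intrinsic carrier concentration, cm^-3
SIGMA = 1e-15  # capture cross section, cm^2
V_TH = 1e7     # thermal velocity, cm/s

i_midgap = srh_current(0.0, TEMP_K, N_I, SIGMA, V_TH)
i_shifted = srh_current(0.1, TEMP_K, N_I, SIGMA, V_TH)  # trap 0.1 eV off midgap

print(f"midgap trap:       {i_midgap:.2e} A")
print(f"0.1 eV off midgap: {i_shifted:.2e} A (ratio {i_midgap / i_shifted:.1f}x)")
print(f"midgap, Gamma = 4: {total_current(i_midgap, 4.0):.2e} A")
```

The hyperbolic cosine in the denominator makes the current largest for traps at midgap and sharply lower for traps away from it, so a modest change in $E_{T}$ or in the local field can shift the leakage, and with it the retention time, by a large factor.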

Generation current may fluctuate over time, displaying a random telegraph noise (RTN) pattern,^[14] with transition rates that have an Arrhenius dependence on temperature. To explain the origin of these instabilities, two main theoretical models have been proposed. One model attributes VRT to structural modifications of the defect, which cause changes in the trap energy level.^[15] The other model suggests that VRT arises from modulation of the local electric field, attributed to changes in the charge state of nearby defects, often located in the gate oxide.^[16]^[17] Both models have been supported by experimental evidence,^[18]^[15] suggesting that VRT may originate from different physical phenomena.^[19]
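The telegraph-like switching described above can be illustrated with a two-state random process whose transition rates follow an Arrhenius law. The sketch below is a toy model, not a fit to measured data: the attempt frequency, activation energies, and temperatures are assumed values, and the function names are hypothetical. State 0 stands for the low-leakage (long retention) configuration and state 1 for the high-leakage (short retention) one.

```python
import math
import random

K_B_EV = 8.617333262e-5  # Boltzmann constant, eV/K


def arrhenius_rate(attempt_hz, e_a_ev, temp_k):
    """Transition rate (1/s) with Arrhenius temperature dependence."""
    return attempt_hz * math.exp(-e_a_ev / (K_B_EV * temp_k))


def simulate_rtn(rate_0_to_1, rate_1_to_0, duration_s, rng):
    """Simulate a two-state telegraph process.

    Returns a list of (time_s, state) entries: the initial state at t = 0
    followed by one entry per transition that occurs before duration_s.
    """
    if rate_0_to_1 <= 0 or rate_1_to_0 <= 0:
        raise ValueError("transition rates must be positive")
    t, state = 0.0, 0
    events = [(t, state)]
    while True:
        rate = rate_0_to_1 if state == 0 else rate_1_to_0
        t += rng.expovariate(rate)  # exponentially distributed dwell time
        if t >= duration_s:
            break
        state = 1 - state
        events.append((t, state))
    return events


# Assumed parameters, for illustration only.
ATTEMPT_HZ = 1e13
E_A_UP, E_A_DOWN = 0.90, 0.85  # activation energies, eV

for temp_k in (300.0, 358.0):  # roughly 27 degC and 85 degC
    up = arrhenius_rate(ATTEMPT_HZ, E_A_UP, temp_k)
    down = arrhenius_rate(ATTEMPT_HZ, E_A_DOWN, temp_k)
    events = simulate_rtn(up, down, 3600.0, random.Random(1))
    print(f"T = {temp_k:.0f} K: {len(events) - 1} transitions in one simulated hour")
```

In this toy model the dwell times are long compared with a single retention test at low temperature, so a cell observed in state 0 during the test can later switch to state 1, which is the behavior that makes VRT bits hard to catch.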

Mitigation

Considerable effort has been spent to mitigate the effects of VRT, including modifications to the fabrication process and the introduction of error correction mechanisms.^[20]^[21]

Screening and in-DRAM ECC

There are no efficient mechanisms to screen VRT bits during production testing. Most manufacturers have dealt with the problem by increasing the average retention time and by enforcing larger test screen margins, which involves replacing possibly faulty cells with spare rows and columns.^[8] However, starting from sub-20 nm nodes, it became increasingly costly to screen and manage the growing number of defective cells, due to the sharply increasing area overhead required to fit adequate redundant resources.^[20]
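The idea of a test screen margin, and the reason VRT bits can slip past it, can be sketched as follows. This is a simplified illustration, not a description of any manufacturer's test flow: the retention times, the 64 ms refresh interval, the margin factor, and the function names are all assumptions.

```python
def screen_cells(measured_tret_s, refresh_interval_s, margin):
    """Return indices of cells to replace with spares.

    A cell is flagged when its measured retention time is below the
    guard-banded limit margin * refresh_interval_s (margin >= 1).
    """
    if margin < 1:
        raise ValueError("margin must be at least 1")
    limit = margin * refresh_interval_s
    return [i for i, t in enumerate(measured_tret_s) if t < limit]


def field_failures(field_tret_s, refresh_interval_s, repaired):
    """Indices of unrepaired cells whose retention time in the field is too short."""
    return [
        i for i, t in enumerate(field_tret_s)
        if t < refresh_interval_s and i not in repaired
    ]


REFRESH_INTERVAL = 64e-3  # assumed refresh interval: 64 ms

# Retention times (s) seen at test time; values are made up for illustration.
at_test = [2.0, 0.05, 0.1, 1.5]
repaired = screen_cells(at_test, REFRESH_INTERVAL, margin=2.0)

# Later, cell 3 (a hypothetical VRT cell) switches to a short-retention state.
in_field = [2.0, 0.05, 0.1, 0.04]
escaped = field_failures(in_field, REFRESH_INTERVAL, repaired)

print("replaced at test:", repaired)
print("failing in the field:", escaped)
```

A larger margin catches cells that are only slightly above the limit (cell 2 here), but it cannot catch a cell that looked healthy at test time and changed state afterwards, and every extra flagged cell consumes spare rows or columns.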

In-DRAM ECC, coupled with traditional redundant sparing, was identified as the most effective solution,^[20] and became a JEDEC standard for DDR5 SDRAM.^[3] This technique involves dividing memory data into codewords and encoding the information with extra parity bits to enable the detection and correction of errors. This makes it possible to address faulty bits that were not identified as such during testing, such as VRT bits.^[20]
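The principle of adding parity bits to a codeword so that a single flipped bit can be located and corrected can be shown with the classic Hamming(7,4) code. This is only a small teaching example: production in-DRAM ECC uses much longer codewords, and the specific code used inside DDR5 devices is not described here.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit codeword laid out as [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct up to one flipped bit; return (data_bits, syndrome).

    The syndrome is the 1-based position of the corrected bit, or 0 if
    no error was detected.
    """
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3
    if syndrome:
        c[syndrome - 1] ^= 1  # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]], syndrome


data = [1, 0, 1, 1]
stored = hamming74_encode(data)

# Emulate a retention failure: one stored bit flips before the next read.
corrupted = list(stored)
corrupted[4] ^= 1

recovered, position = hamming74_decode(corrupted)
print("stored:   ", stored)
print("corrupted:", corrupted)
print("recovered:", recovered, "(corrected position", position, ")")
```

A code of this kind corrects one error per codeword, so it covers an isolated VRT bit that escaped testing but not two failing bits that land in the same codeword.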

The key difference from the more traditional ECC DRAM lies in where the extra bits are stored. In in-DRAM ECC, parity bits are stored in the same chip, and error correction occurs inside the chip, making it transparent to the memory controller. In ECC DRAM, an extra chip is added to the DIMM to store the extra bits, providing detection and correction of data transfer errors.^[22]

Physical treatments

Researchers have investigated passivation strategies to reduce the number of active defects in the silicon. High-temperature hydrogen annealing has been shown to strongly reduce VRT,^[6] as confirmed by later experiments that highlighted a correlation with the reduction of charge pumping current, a metric typically used to assess the defect density at the Si/SiO₂ interface in MOSFETs.^[23]

Fluorine implantation was reported to reduce VRT in older technologies.^[15] This method was later observed to reduce the number of cells with gate-induced drain leakage current, a leakage mechanism induced by the presence of large electric fields at the gate-drain overlap region of MOSFETs.^[24] Samsung researchers found that the number of VRT errors can be reduced by changes in the process steps for the formation of the metal gate in a 1z nm process.^[21]
