
2011 | Book

Nanoscale Memory Repair


About this Book

Yield and reliability of memories have degraded with device and voltage scaling in the nanoscale era, due to ever-increasing hard/soft errors and device-parameter variations. This book systematically describes these yield and reliability issues in terms of mathematics and engineering, as well as an array of repair techniques, based on the authors’ long careers in developing memories and low-voltage CMOS circuits. Nanoscale Memory Repair gives a detailed explanation of the various yield models and calculations, as well as the practical logic and circuit techniques that are critical for higher yield and reliability.

Table of Contents

Frontmatter
Chapter 1. An Introduction to Repair Techniques
With the larger capacities, smaller feature sizes, and lower operating voltages of memory-rich CMOS LSIs (Fig. 1.1), various kinds of “errors (faults)” have become prominent, and the techniques for repairing them have become more important. The errors are categorized as hard/soft errors, timing/voltage margin errors, and speed-relevant errors. Hard/soft errors and timing/voltage margin errors, which occur within a chip, are most prominent in the memory array, because the array comprises memory cells having the smallest size and largest circuit count in the chip. In particular, coping with margin errors is becoming increasingly important, and thus an emerging issue for low-voltage nanoscale LSIs, since these errors increase rapidly with device and voltage scaling. Raising the operating voltage would be one of the simplest ways to tackle the issue, but this approach is not acceptable due to the intolerable increase in power dissipation. Speed-relevant errors, which are prominent at low-voltage operation, include speed-degradation errors of the chip itself and intolerably wide chip-to-chip speed-variation errors caused by the ever-larger interdie design-parameter variation. For the LSI industry to flourish and proliferate, solutions based on in-depth investigation of these errors are crucial.
Masashi Horiguchi, Kiyoo Itoh
Chapter 2. Redundancy
When designing a redundancy circuit, estimating its advantages and disadvantages is indispensable. Introducing redundancy into a memory chip improves yield and reduces fabrication cost. However, it also incurs the following penalties. First, the spare memory cells that replace faulty cells, the programmable devices that memorize faulty addresses, and the control circuitry increase the chip size. Second, the time required to judge whether the input address is faulty is added to the access time. Third, special process steps to fabricate the programmable devices, and test time to store faulty addresses into them, are required. Therefore, redundancy-circuit design requires a trade-off between yield improvement and these penalties. Estimating the yield improvement requires a fault-distribution model. There are two representative models, the Poisson-distribution model and the negative-binomial model, which are often used for the yield analysis of memory LSIs. The “replacement” of normal memory elements by spare elements requires checking whether the accessed address includes faulty elements and, if so, inhibiting the faulty element from being activated and activating a spare element instead. These procedures should be realized with as small a penalty as possible. One of the major issues for replacement is memory-array division. Memory arrays are often divided into subarrays for the sake of access-time reduction, power reduction, and signal-to-noise-ratio enhancement. There are two choices for memories with array division: (1) a faulty element in a subarray is replaced only by a spare element in the same subarray (intrasubarray replacement), and (2) a faulty element in a subarray may be replaced by a spare element in another subarray (intersubarray replacement). The former has a smaller access penalty, while the latter realizes higher replacement efficiency. It is also possible for an entire subarray to be replaced by a spare subarray.
The devices for memorizing faulty addresses, and the testing needed to find an effective replacement, are also important issues for redundancy.
Masashi Horiguchi, Kiyoo Itoh
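The two fault-distribution models named in the abstract can be sketched numerically. The following is a minimal illustration, not code from the book; the function names and the clustering parameter `alpha` are our own conventions:

```python
from math import exp, factorial, gamma

def poisson_yield(lam, repairable=0):
    """Yield = P(faults per chip <= repairable) under the Poisson model.
    lam: average number of faults per chip."""
    return sum(lam**i * exp(-lam) / factorial(i)
               for i in range(repairable + 1))

def neg_binomial_yield(lam, alpha, repairable=0):
    """Yield under the negative-binomial (clustered-fault) model.
    alpha: clustering parameter; alpha -> infinity recovers Poisson."""
    p = lam / alpha
    return sum(gamma(alpha + i) / (factorial(i) * gamma(alpha))
               * p**i / (1 + p)**(alpha + i)
               for i in range(repairable + 1))

# At one fault per chip on average, redundancy that repairs up to
# 2 faults lifts the Poisson yield from exp(-1) ~ 0.37 to ~ 0.92.
```

With no repair, the negative-binomial model predicts a higher yield than the Poisson model for the same average fault count, because clustered faults leave more chips entirely fault-free.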
Chapter 3. Error Checking and Correction (ECC)
One of the key issues in designing on-chip error checking and correction (ECC) circuits is the selection of an error-correcting code, i.e., the scheme for expressing data as code words, as described in Sect. 1.2. The error-correcting codes proposed for the on-chip ECC of memory LSIs are linear codes. Linear codes have the feature that data are described by vectors, and that the coding (adding check bits) and decoding (checking and correcting errors) procedures are described by operations between vectors and matrices. Therefore, mathematical knowledge of linear algebra is required. For an ordinary binary memory, a binary linear code is used, where each element of the vectors and matrices is “0” or “1.” For a nonbinary (multilevel) memory, where multiple bits are stored in a memory cell, a nonbinary linear code is more suitable, where each element may take values other than “0” and “1.” What values should be used in addition to “0” and “1”? The answer is given by an algebraic structure called a Galois field.
Masashi Horiguchi, Kiyoo Itoh
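As a concrete instance of a binary linear code, here is a minimal single-error-correcting (7,4) Hamming decoder over GF(2). This is a sketch for illustration only; the codes actually chosen for on-chip ECC depend on the memory organization:

```python
import numpy as np

# Parity-check matrix H of the (7,4) Hamming code over GF(2).
# Column j holds the 3-bit binary representation of j+1, so the
# syndrome of a single-bit error equals the (1-based) error position.
H = np.array([[int(c) for c in format(j, "03b")] for j in range(1, 8)]).T

def syndrome(word):
    """Syndrome of a 7-bit word as an integer (0 = no error detected)."""
    s = H @ word % 2  # matrix-vector product over GF(2)
    return int("".join(str(b) for b in s), 2)

def correct(word):
    """Return a copy of `word` with any single-bit error corrected."""
    pos = syndrome(word)
    fixed = word.copy()
    if pos:
        fixed[pos - 1] ^= 1  # flip the erroneous bit
    return fixed
```

For example, flipping one bit of the all-zero code word yields a nonzero syndrome that points directly at the flipped position, and `correct` restores the code word.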
Chapter 4. Combination of Redundancy and Error Correction
The effects of redundancy and of error checking and correction (ECC) were discussed separately in the previous chapters. However, combining redundancy and ECC repairs more faults than simply adding their individual effects. This synergistic effect, first reported in 1990 [1], is especially effective for repairing random-bit faults. Repairing many random-bit faults by redundancy alone is not effective: since the replacement unit is usually a row or a column, each bit fault requires its own spare row/column, except when two or more bit faults happen to lie on the same row/column. ECC, in contrast, can potentially repair many random-bit faults. In practice, however, ECC using a single-error-correcting code repairs far fewer bit faults, because the probability of a “fault collision” (two or more faults located in one code word) cannot be neglected, as described in Sect. 3.6. By combining redundancy and ECC, most bit faults are repaired by ECC and the few fault collisions are repaired by redundancy, resulting in a dramatic increase in repairable faults [1–3].
Masashi Horiguchi, Kiyoo Itoh
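The synergy can be illustrated with a toy Monte Carlo model of our own (the uniform-fault assumption and all parameters are ours, not the book's): single-error-correcting ECC fixes every code word holding exactly one fault, and spare rows fix the remaining "collided" words.

```python
import random
from collections import Counter

def repair_rate(n_words, faults, spares, trials=2000, seed=1):
    """Fraction of simulated chips that are fully repairable when
    SEC ECC fixes isolated bit faults and `spares` spare rows fix
    words holding two or more faults (fault collisions)."""
    rng = random.Random(seed)
    ok = 0
    for _ in range(trials):
        # Drop `faults` random bit faults onto `n_words` code words.
        hits = Counter(rng.randrange(n_words) for _ in range(faults))
        collisions = sum(1 for c in hits.values() if c >= 2)
        ok += collisions <= spares
    return ok / trials
```

Even a handful of spares repairs the rare collisions that defeat ECC alone, so the combined repair rate dominates either technique by itself.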
Chapter 5. Reduction Techniques for Margin Errors of Nanoscale Memories
The ever-increasing margin error, which results from reduced timing and voltage margins with device scaling at a given operating voltage VDD, must be sufficiently reduced. Reducing the minimum operating VDD (i.e., Vmin) is the key to reducing the error, as described in Chap. 1. However, this has been severely hampered by the low-voltage scaling limitations [1–5] that are the major problems in the nanoscale era. The problems stem from two device parameters that are unscalable as long as conventional devices and circuits are used. The first is the high value of the lowest necessary threshold voltage Vt (i.e., Vt0) of MOSFETs needed to keep the subthreshold leakage low. Although many intensive attempts to lower Vt0 while suppressing leakage have been made since the late 1980s [4–6], Vt0 is still not low enough to reduce VDD to the sub-1-V region.
Masashi Horiguchi, Kiyoo Itoh
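Why Vt0 resists scaling follows from the exponential subthreshold current. A back-of-the-envelope sketch, where the swing of ~100 mV/decade is a typical textbook assumption rather than a figure from this book:

```python
def leakage_multiplier(delta_vt_mv, swing_mv=100.0):
    """Subthreshold leakage grows ~10x for every `swing_mv` of Vt
    reduction, since I_leak is proportional to 10**(-Vt / S)."""
    return 10 ** (delta_vt_mv / swing_mv)

# Lowering Vt by 200 mV multiplies subthreshold leakage by ~100,
# which is why Vt0 cannot simply track VDD scaling.
```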
Chapter 6. Reduction Techniques for Speed-Relevant Errors of Nanoscale Memories
There are two kinds of speed-relevant errors regarded as emerging problems in the low-voltage nanoscale CMOS era: the speed-degradation error and the interdie speed-variation error. The errors become more prominent as the operating voltage VDD approaches the unscalable and high value of the lowest necessary average Vt (i.e., Vt0 = 0.2–0.4 V; see Fig. 5.5). The speed-degradation error can occur for an average chip in a wafer whenever an average MOSFET in the chip slows down with the reduced gate-overdrive voltage (= VDD − Vt0) and the resultant speed of the chip does not meet the speed target. Note that even in the absence of any variation (i.e., interdie and intradie variations), all chips can be faulty due to slow speed. The solutions are to develop new higher-speed designs and higher-speed MOSFETs to offset the reduced gate-overdrive voltage, as described later.
Masashi Horiguchi, Kiyoo Itoh
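The speed-degradation mechanism (gate delay growing as VDD approaches Vt0) can be sketched with the alpha-power-law MOSFET model; the exponent value of 1.3 is a common short-channel assumption, not a figure taken from the book:

```python
def relative_gate_delay(vdd, vt, alpha=1.3):
    """Alpha-power-law model: delay ~ VDD / (VDD - Vt)**alpha.
    Delay diverges as the gate-overdrive voltage VDD - Vt shrinks."""
    if vdd <= vt:
        raise ValueError("VDD must exceed Vt for the gate to switch")
    return vdd / (vdd - vt) ** alpha

# With Vt0 = 0.4 V, halving VDD from 1.2 V to 0.6 V increases the
# relative delay several-fold, even with no parameter variation.
```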
Backmatter
Metadata
Title
Nanoscale Memory Repair
Authors
Masashi Horiguchi
Kiyoo Itoh
Copyright Year
2011
Publisher
Springer New York
Electronic ISBN
978-1-4419-7958-2
Print ISBN
978-1-4419-7957-5
DOI
https://doi.org/10.1007/978-1-4419-7958-2
