RAM parity


RAM parity checking is the storing of a redundant parity bit representing the parity (odd or even) of a small amount of computer data (typically one byte) stored in random-access memory, and the subsequent comparison of the stored and the computed parity to detect whether a data error has occurred.


The parity bit was originally stored in additional individual memory chips. With the introduction of plug-in memory modules such as SIMMs and DIMMs, modules became available in non-parity and parity versions; the parity versions have an extra bit per byte, storing 9 bits for every 8 bits of actual data.
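The store-then-compare scheme described above can be sketched in a few lines of Python. This is an illustrative model only, not how any particular memory controller is built; the function names `parity_bit`, `store` and `check` are hypothetical, and even parity is assumed as the default.

```python
def parity_bit(byte: int, even: bool = True) -> int:
    """Return the parity bit for an 8-bit value.

    With even parity, the bit is chosen so that the 9-bit word
    (8 data bits plus parity) contains an even number of ones.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected an 8-bit value")
    ones = bin(byte).count("1")
    bit = ones & 1            # 1 if the data holds an odd number of ones
    return bit if even else bit ^ 1


def store(byte: int) -> tuple:
    """Model a write: keep the data byte together with its parity bit."""
    return (byte, parity_bit(byte))


def check(word: tuple) -> bool:
    """Model a read: recompute parity and compare it with the stored bit."""
    byte, stored_parity = word
    return parity_bit(byte) == stored_parity


if __name__ == "__main__":
    word = store(0b00111000)
    print(check(word))                      # intact data passes the check
    corrupted = (word[0] ^ 0b00000001, word[1])
    print(check(corrupted))                 # a single flipped bit fails it
```

The nine stored bits correspond to the "9 bits for every 8 bits of actual data" of a parity module.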

History

A 30-pin SIMM memory module with nine one-bit-wide memory chips. The ninth chip is used to store parity.

Early computers sometimes required the use of parity RAM, and parity-checking could not be disabled. A parity error typically caused the machine to halt, with loss of unsaved data; this is usually a better option than saving corrupt data. Logic parity RAM, also known as fake parity RAM, is non-parity RAM that can be used in computers that require parity RAM. Logic parity RAM recalculates an always-valid parity bit each time a byte is read from memory, instead of storing the parity bit when the memory is written to; the calculated parity bit, which will not reveal if the data has been corrupted (hence the name "fake parity"), is presented to the parity-checking logic. It is a means of using cheaper 8-bit RAM in a system designed to use only 9-bit parity RAM.
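The difference between true parity RAM and logic parity RAM can be illustrated with a small model. The classes `ParityModule` and `LogicParityModule` and the function `host_check` are hypothetical names invented for this sketch, and even parity is assumed.

```python
def parity_bit(byte: int) -> int:
    """Even-parity bit for an 8-bit value (assumed convention)."""
    return bin(byte & 0xFF).count("1") & 1


class ParityModule:
    """True parity RAM: the parity bit is stored when the byte is written."""

    def __init__(self, size: int):
        self.data = [0] * size
        self.parity = [0] * size

    def write(self, addr: int, byte: int) -> None:
        self.data[addr] = byte
        self.parity[addr] = parity_bit(byte)

    def read(self, addr: int) -> tuple:
        return self.data[addr], self.parity[addr]


class LogicParityModule:
    """Logic ("fake") parity RAM: only 8 bits are stored per byte."""

    def __init__(self, size: int):
        self.data = [0] * size

    def write(self, addr: int, byte: int) -> None:
        self.data[addr] = byte

    def read(self, addr: int) -> tuple:
        byte = self.data[addr]
        # Parity is regenerated from whatever is read, so it always matches.
        return byte, parity_bit(byte)


def host_check(module, addr: int) -> bool:
    """The computer's parity-checking logic: True means no error reported."""
    byte, presented_parity = module.read(addr)
    return parity_bit(byte) == presented_parity


if __name__ == "__main__":
    for module in (ParityModule(4), LogicParityModule(4)):
        module.write(0, 0x38)
        module.data[0] ^= 0x01          # simulate a corrupted data bit
        print(type(module).__name__, host_check(module, 0))
```

In this model the corrupted byte fails the check on the true parity module but passes on the logic parity module, which is why the latter offers no protection.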

Memory errors

In the 1970s and 1980s, RAM reliability was often less than perfect. In particular, the 4116 DRAMs that were an industry standard from 1975 to 1983 had a considerable failure rate, because they required three supply voltages (-5 V, +5 V, and +12 V), which resulted in high operating temperatures. By the mid-1980s, these had given way to single-voltage DRAMs such as the 4164 and 41256, with improved reliability. However, RAM did not achieve modern standards of reliability until the 1990s. Since then errors have become less visible as simple parity RAM has fallen out of use: either they go undetected, or they are corrected invisibly by ECC RAM. Modern RAM is believed, with much justification, to be reliable, and error-detecting RAM has largely fallen out of use for non-critical applications. By the mid-1990s, most DRAM had dropped parity checking as manufacturers felt confident that it was no longer necessary. Some machines that support parity or ECC allow checking to be enabled or disabled in the BIOS, permitting cheaper non-parity RAM to be used. If parity RAM is used, the chipset will usually use it to implement error correction, rather than halting the machine on a single-bit parity error.

However, as discussed in the article on ECC memory, errors, while not everyday events, are not negligibly infrequent. Even in the absence of manufacturing defects, naturally occurring radiation causes random errors; tests on Google's many servers found that memory errors were not rare events, and that the incidence of memory errors and the range of error rates across different DIMMs were much higher than previously reported. [1]

Error correction

Simple go/no-go parity checking requires that the memory have extra, redundant bits beyond those needed to store the data; but if extra bits are available, they can be used to correct, as well as detect, errors. Earlier memory types (FPM and EDO), as used in, for example, the IBM PC/AT, were available in versions that supported either no checking or parity checking [2]; in earlier computers that used individual RAM chips rather than DIMM or SIMM modules, extra chips were used to store parity bits. If the computer detected a parity error, it would display a message to that effect and stop. The SDRAM and DDR modules that replaced the earlier types are usually available either without error checking or with ECC (full correction, not just parity). [2]

An example of a single-bit error that would be ignored by a system with no error checking, would halt a machine with parity checking, or would be invisibly corrected by ECC: a single bit is stuck at 1 due to a faulty chip, or is changed to 1 by background or cosmic radiation. A spreadsheet storing numbers in ASCII format is loaded, and the character "8" is stored in the byte that contains the stuck bit as its eighth (least significant) bit. Another change is then made to the spreadsheet and it is saved. As a result, the "8" (00111000 binary) has silently become a "9" (00111001).
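The stuck-bit example above can be reproduced directly. This is a minimal sketch that assumes even parity and models the fault as a bitwise OR with a mask; `even_parity` and `read_with_stuck_bit` are hypothetical helper names.

```python
def even_parity(byte: int) -> int:
    """Even-parity bit for an 8-bit value (assumed convention)."""
    return bin(byte & 0xFF).count("1") & 1


def read_with_stuck_bit(stored: int, stuck_mask: int = 0b00000001) -> int:
    """Model a cell whose least significant bit always reads back as 1."""
    return stored | stuck_mask


written = ord("8")                          # 0b00111000
parity_at_write = even_parity(written)      # parity computed from the good data
read_back = read_with_stuck_bit(written)    # 0b00111001

# Without checking, the wrong character is simply used.
print(chr(read_back))                       # prints 9

# With parity checking, the mismatch is noticed when the byte is read.
error_detected = even_parity(read_back) != parity_at_write
print(error_detected)                       # prints True
```

Parity only reports that the byte is wrong; it cannot tell which bit changed, so the machine can halt but cannot repair the data.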

If the stored parity is different from the parity computed from the stored data, at least one bit must have been changed due to data corruption. Undetected memory errors can have results ranging from undetectable and without consequence, to permanent corruption of stored data or a machine crash. In the case of the home PC, where data integrity is often perceived to be of little importance (certainly true for, say, games and web browsing, less so for Internet banking and home finances), non-parity memory is an affordable option. However, if data integrity is required, parity memory will halt the computer and prevent the corrupt data from affecting results or stored data, at the cost of losing intermediate unstored data and preventing use until any faulty RAM is replaced. For the expense of some computational overhead, of negligible impact with modern fast computers, detected errors can be corrected; this is increasingly important on networked machines serving many users.
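A parity mismatch proves that at least one bit has changed, but the reverse does not hold: a single parity bit cannot detect an even number of flipped bits in the same byte. The short sketch below illustrates this limitation, assuming even parity; the helper names are hypothetical.

```python
def even_parity(byte: int) -> int:
    """Even-parity bit for an 8-bit value (assumed convention)."""
    return bin(byte & 0xFF).count("1") & 1


def parity_detects(original: int, corrupted: int) -> bool:
    """True if parity stored for `original` disagrees with `corrupted`."""
    return even_parity(original) != even_parity(corrupted)


original = 0b00111000
one_flip = original ^ 0b00000001     # one bit changed
two_flips = original ^ 0b00000011    # two bits changed

print(parity_detects(original, one_flip))    # prints True
print(parity_detects(original, two_flips))   # prints False
```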

ECC type RAM

RAM with ECC or Error Correction Code can detect and correct errors. As with parity RAM, additional information needs to be stored and more processing needs to be done, making ECC RAM more expensive and a little slower than non-parity and logic parity RAM. This type of ECC memory is especially useful for any application where reliability or uptime is a concern: failing bits in a memory word are detected and corrected on the fly with no impact to the application. The occurrence of the error is typically logged by the operating system for analysis by a technical resource. In the case where the error is persistent, server downtime can be scheduled to replace the failing memory unit.
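To show how extra stored bits can locate and fix an error rather than merely detect it, the sketch below uses a Hamming(7,4) code, which protects 4 data bits with 3 check bits. This code is chosen here only because it is small enough to read; the article does not say which code any given ECC module uses, and real ECC memory generally works on much wider words. The function names are hypothetical.

```python
def hamming74_encode(nibble: int) -> list:
    """Encode 4 data bits into a 7-bit Hamming codeword.

    Positions are numbered 1..7; check bits sit at positions 1, 2 and 4,
    data bits at positions 3, 5, 6 and 7.
    """
    if not 0 <= nibble <= 0xF:
        raise ValueError("expected a 4-bit value")
    d = [(nibble >> i) & 1 for i in range(4)]   # d[0] is the lowest bit
    code = [0] * 8                              # index 0 is unused
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]       # covers positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]       # covers positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]       # covers positions 4, 5, 6, 7
    return code[1:]


def hamming74_decode(bits: list) -> tuple:
    """Return (data nibble, syndrome), correcting a single flipped bit.

    The syndrome is 0 for a clean word; otherwise it is the 1-based
    position of the bit that was corrected.
    """
    code = [0] + list(bits)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos        # XOR of the positions of all set bits
    if syndrome:
        code[syndrome] ^= 1        # flip the faulty bit back
    d = [code[3], code[5], code[6], code[7]]
    nibble = sum(bit << i for i, bit in enumerate(d))
    return nibble, syndrome


if __name__ == "__main__":
    word = hamming74_encode(0b1000)
    word[4] ^= 1                   # corrupt position 5 (list index 4)
    print(hamming74_decode(word))  # the original data and the fixed position
```

The returned syndrome plays the role of the logged error report: the data is delivered intact, while the position of the corrected bit remains available for later analysis. This small code corrects only a single flipped bit per word; two flipped bits would be miscorrected.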

Wang lawsuit

In 1991, Wang won a judgment against Toshiba and NEC over its patents on SIMMs, based in part on its claim of using a ninth RAM chip for parity. [3] [4] In response, SIMMs with three chips, rather than nine separate one-bit chips, became popular under the theory that they did not infringe. However, the switch from nine-chip SIMMs to three-chip SIMMs caused some compatibility issues. [5] In 1992, Wang also sued Mitsubishi, but did not ultimately prevail in that case because the courts determined in 1997 that a licensing agreement was in place. [6] By the end of the 1990s, DIMMs had supplanted SIMMs in the marketplace, and DIMMs were not subject to Wang's lawsuits.

See also

Related Research Articles

<span class="mw-page-title-main">DDR SDRAM</span> Type of computer memory

Double Data Rate Synchronous Dynamic Random-Access Memory is a double data rate (DDR) synchronous dynamic random-access memory (SDRAM) class of memory integrated circuits used in computers. DDR SDRAM, also retroactively called DDR1 SDRAM, has been superseded by DDR2 SDRAM, DDR3 SDRAM, DDR4 SDRAM and DDR5 SDRAM. None of its successors are forward or backward compatible with DDR1 SDRAM, meaning DDR2, DDR3, DDR4 and DDR5 memory modules will not work on DDR1-equipped motherboards, and vice versa.

<span class="mw-page-title-main">Error detection and correction</span> Techniques that enable reliable delivery of digital data over unreliable communication channels

In information theory and coding theory with applications in computer science and telecommunication, error detection and correction (EDAC) or error control are techniques that enable reliable delivery of digital data over unreliable communication channels. Many communication channels are subject to channel noise, and thus errors may be introduced during transmission from the source to a receiver. Error detection techniques allow detecting such errors, while error correction enables reconstruction of the original data in many cases.

<span class="mw-page-title-main">Dynamic random-access memory</span> Type of computer memory

Dynamic random-access memory is a type of random-access semiconductor memory that stores each bit of data in a memory cell, usually consisting of a tiny capacitor and a transistor, both typically based on metal–oxide–semiconductor (MOS) technology. While most DRAM memory cell designs use a capacitor and transistor, some only use two transistors. In the designs where a capacitor is used, the capacitor can either be charged or discharged; these two states are taken to represent the two values of a bit, conventionally called 0 and 1. The electric charge on the capacitors gradually leaks away; without intervention the data on the capacitor would soon be lost. To prevent this, DRAM requires an external memory refresh circuit which periodically rewrites the data in the capacitors, restoring them to their original charge. This refresh process is the defining characteristic of dynamic random-access memory, in contrast to static random-access memory (SRAM) which does not require data to be refreshed. Unlike flash memory, DRAM is volatile memory, since it loses its data quickly when power is removed. However, DRAM does exhibit limited data remanence.

<span class="mw-page-title-main">DIMM</span> Computer memory module

A DIMM, or Dual In-Line Memory Module, is a popular type of memory module used in computers. It is a printed circuit board with one or both sides holding DRAM chips and pins. The vast majority of DIMMs are standardized through JEDEC standards, although there are proprietary DIMMs. DIMMs come in a variety of speeds and sizes, but are generally one of two lengths: PC modules, which are 133.35 mm (5.25 in), and laptop modules (SO-DIMM), which are about half that size at 67.60 mm (2.66 in).

<span class="mw-page-title-main">SIMM</span> Computer memory module

A SIMM is a type of memory module used in computers from the early 1980s to the early 2000s. It is a printed circuit board with random-access memory attached to one or both sides. It differs from a dual in-line memory module (DIMM), the predominant form of memory module since the late 1990s, in that the contacts on a SIMM are redundant on both sides of the module. SIMMs were standardised under the JEDEC JESD-21C standard.

<span class="mw-page-title-main">DDR2 SDRAM</span> Second generation of double-data-rate synchronous dynamic random-access memory

Double Data Rate 2 Synchronous Dynamic Random-Access Memory is a double data rate (DDR) synchronous dynamic random-access memory (SDRAM) interface. It is a JEDEC standard (JESD79-2); first published in September 2003. DDR2 succeeded the original DDR SDRAM specification, and was itself succeeded by DDR3 SDRAM in 2007. DDR2 DIMMs are neither forward compatible with DDR3 nor backward compatible with DDR.

<span class="mw-page-title-main">Data corruption</span> Errors in computer data that introduce unintended changes to the original data

Data corruption refers to errors in computer data that occur during writing, reading, storage, transmission, or processing, which introduce unintended changes to the original data. Computer, transmission, and storage systems use a number of measures to provide end-to-end data integrity, or lack of errors.

Memory scrubbing consists of reading from each computer memory location, correcting bit errors with an error-correcting code (ECC), and writing the corrected data back to the same location.

In electronics and computing, a soft error is a type of error where a signal or datum is wrong. Errors may be caused by a defect, usually understood either to be a mistake in design or construction, or a broken component. A soft error is also a signal or datum which is wrong, but is not assumed to imply such a mistake or breakage. After observing a soft error, there is no implication that the system is any less reliable than before. One cause of soft errors is single event upsets from cosmic rays.

Registered memory is computer memory that has a register between the DRAM modules and the system's memory controller. A registered memory module places less electrical load on a memory controller compared to an unregistered one. Registered memory allows a computer system to remain stable with a higher number of memory modules than it would have otherwise.

Double Data Rate 3 Synchronous Dynamic Random-Access Memory is a type of synchronous dynamic random-access memory (SDRAM) with a high bandwidth interface, and has been in use since 2007. It is the higher-speed successor to DDR and DDR2 and predecessor to DDR4 synchronous dynamic random-access memory (SDRAM) chips. DDR3 SDRAM is neither forward nor backward compatible with any earlier type of random-access memory (RAM) because of different signaling voltages, timings, and other factors.

<span class="mw-page-title-main">ECC memory</span> Self-correcting computer data storage

Error correction code memory is a type of computer data storage that uses an error correction code (ECC) to detect and correct n-bit data corruption which occurs in memory.

Chipkill is IBM's trademark for a form of advanced error checking and correcting (ECC) computer memory technology that protects computer memory systems from any single memory chip failure as well as multi-bit errors from any portion of a single memory chip. One simple scheme to perform this function scatters the bits of a Hamming code ECC word across multiple memory chips, such that the failure of any single memory chip will affect only one ECC bit per word. This allows memory contents to be reconstructed despite the complete failure of one chip. Typical implementations use more advanced codes, such as a BCH code, that can correct multiple bits with less overhead.

<span class="mw-page-title-main">Memory module</span>

In computing, a memory module or RAM stick is a printed circuit board on which memory integrated circuits are mounted.

Double Data Rate 4 Synchronous Dynamic Random-Access Memory is a type of synchronous dynamic random-access memory with a high bandwidth interface.

<span class="mw-page-title-main">Random-access memory</span> Form of computer data storage

Random-access memory is a form of electronic computer memory that can be read and changed in any order, typically used to store working data and machine code. A random-access memory device allows data items to be read or written in almost the same amount of time irrespective of the physical location of data inside the memory, in contrast with other direct-access data storage media, where the time required to read and write data items varies significantly depending on their physical locations on the recording medium, due to mechanical limitations such as media rotation speeds and arm movement.

A memory rank is a set of DRAM chips connected to the same chip select, which are therefore accessed simultaneously. In practice all DRAM chips share all of the other command and control signals, and only the chip select pins for each rank are separate.

In the design of modern computers, memory geometry describes the internal structure of random-access memory. Memory geometry is of concern to consumers upgrading their computers, since older memory controllers may not be compatible with later products. Memory geometry terminology can be confusing because of the number of overlapping terms.

<span class="mw-page-title-main">DDR5 SDRAM</span> Fifth generation of double-data-rate synchronous dynamic random-access memory

Double Data Rate 5 Synchronous Dynamic Random-Access Memory is the latest type of synchronous dynamic random-access memory. Compared to its predecessor DDR4 SDRAM, DDR5 was planned to reduce power consumption, while doubling bandwidth. The standard, originally targeted for 2018, was released on July 14, 2020.

Row hammer is a computer security exploit that takes advantage of an unintended and undesirable side effect in dynamic random-access memory (DRAM) in which memory cells interact electrically between themselves by leaking their charges, possibly changing the contents of nearby memory rows that were not addressed in the original memory access. This circumvention of the isolation between DRAM memory cells results from the high cell density in modern DRAM, and can be triggered by specially crafted memory access patterns that rapidly activate the same memory rows numerous times.

References

  1. CNET News. "Google: Computer memory flakier than expected".
  2. crucial.com FAQ. "Are ECC and parity the same thing? If not what's the difference?" Archived 2012-04-01 at the Wayback Machine.
  3. "Wang Stops Toshiba and NEC infringing its patents". techmonitor.ai. 1991-10-10. Retrieved 2024-05-03.
  4. Wang Laboratories, Inc. v. Toshiba Corp. (United States District Court for the Eastern District of Virginia, 1993-06-28).
  5. Thompson, Robert (2003-07-24). PC Hardware in a Nutshell. O'Reilly Media. p. 245. ISBN 9780596552343.
  6. Wang Laboratories Inc. v. Mitsubishi Electronics America Inc. (United States Court of Appeals, Federal Circuit, 1997-01-03).