Single-event upset

A single-event upset in the flight computers of this Airbus A330 during Qantas Flight 72 on 7 October 2008 is suspected to have resulted in an aircraft upset that nearly ended in a crash after the computers experienced several malfunctions.

A single-event upset (SEU), also known as a single-event error (SEE), is a change of state caused by a single ionizing particle (such as an ion, electron, or photon) striking a sensitive node in a live microelectronic device, such as a microprocessor, semiconductor memory, or power transistor. The state change results from the free charge created by ionization in or near a sensitive node of a logic element (for example, a memory "bit"). The resulting error in device output or operation is called an SEU or a soft error.


The SEU itself is not considered permanently damaging to the transistors' or circuits' functionality, unlike the case of single-event latch-up (SEL), single-event gate rupture (SEGR), or single-event burnout (SEB). These are all examples of a general class of radiation effects in electronic devices called single-event effects (SEEs).

History

Single-event upsets were first described during above-ground nuclear testing, from 1954 to 1957, when many anomalies were observed in electronic monitoring equipment. Further problems were observed in space electronics during the 1960s, although it was difficult to separate soft failures from other forms of interference. In 1972, a Hughes satellite experienced an upset where the communication with the satellite was lost for 96 seconds and then recaptured. Scientists Dr. Edward C. Smith, Al Holman, and Dr. Dan Binder explained the anomaly as a single-event upset (SEU) and published the first SEU paper in the IEEE Transactions on Nuclear Science journal in 1975. [2] In 1978, the first evidence of soft errors from alpha particles in packaging materials was described by Timothy C. May and M.H. Woods. In 1979, James Ziegler of IBM, along with W. Lanford of Yale, first described the mechanism whereby a sea-level cosmic ray could cause a single-event upset in electronics. 1979 also saw the world's first heavy ion "single-event effects" test at a particle accelerator facility, conducted at Lawrence Berkeley National Laboratory's 88-Inch Cyclotron and Bevatron. [3]

Cause

Terrestrial SEUs arise when cosmic particles collide with atoms in the atmosphere, creating cascades (showers) of neutrons and protons that may in turn interact with electronic circuits. Semiconductor devices built at deep sub-micron geometries are susceptible to these particles even within the atmosphere.

In space, high-energy ionizing particles exist as part of the natural background, referred to as galactic cosmic rays (GCRs). Solar particle events and high-energy protons trapped in the Earth's magnetosphere (Van Allen radiation belts) exacerbate this problem. The high energies associated with the phenomenon in the space particle environment generally render increased spacecraft shielding useless in terms of eliminating SEUs and catastrophic single-event phenomena (e.g. destructive latch-up). Secondary atmospheric neutrons generated by cosmic rays can also have sufficiently high energy for producing SEUs in electronics on aircraft flights over the poles or at high altitudes. Trace amounts of radioactive elements in chip packages also lead to SEUs.

Testing for SEU sensitivity

The sensitivity of a device to SEU can be empirically estimated by placing a test device in a particle stream at a cyclotron or other particle accelerator facility. This particular test methodology is especially useful for predicting the SER (soft error rate) in known space environments but can be problematic for estimating terrestrial SER from neutrons. In this case, a large number of parts must be evaluated, possibly at different altitudes, to find the actual rate of upset.
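As a rough illustration of how accelerator data of this kind are commonly reduced, the sketch below divides the observed upset count by the particle fluence to get an upset cross-section, then multiplies by an environment flux to estimate an upset rate. The function names and every number in the usage lines are hypothetical and chosen only for illustration. A real prediction would also have to account for the particle energy spectrum, angle of incidence, and device-to-device variation.

```python
def cross_section_cm2(upsets: int, fluence_per_cm2: float) -> float:
    """Upset cross-section: observed upsets per unit particle fluence (cm^2 per device)."""
    if fluence_per_cm2 <= 0:
        raise ValueError("fluence must be positive")
    return upsets / fluence_per_cm2


def upset_rate_per_hour(sigma_cm2: float, flux_per_cm2_hour: float) -> float:
    """Expected upsets per device-hour in an environment with the given particle flux."""
    return sigma_cm2 * flux_per_cm2_hour


def rate_to_fit(rate_per_hour: float) -> float:
    """Convert a per-hour rate to FIT (failures per 1e9 device-hours)."""
    return rate_per_hour * 1e9


if __name__ == "__main__":
    # Hypothetical beam run: 250 upsets observed after 1e10 particles/cm^2.
    sigma = cross_section_cm2(upsets=250, fluence_per_cm2=1e10)
    # Assumed environment flux of 13 particles/cm^2/hour (illustrative value only).
    rate = upset_rate_per_hour(sigma, flux_per_cm2_hour=13.0)
    print(f"cross-section: {sigma:.2e} cm^2 per device")
    print(f"upset rate:    {rate:.2e} per device-hour ({rate_to_fit(rate):.0f} FIT)")
```

Because the rate scales linearly with flux in this simple model, the same cross-section can be reused for different environments by swapping the flux value.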

Another way to empirically estimate SEU tolerance is to use a chamber shielded from radiation, with a known radiation source, such as Caesium-137.

When testing microprocessors for SEU, the software used to exercise the device must also be evaluated to determine which sections of the device were activated when SEUs occurred.

SEUs and circuit design

By definition, SEUs do not destroy the circuits involved, but they can cause errors. In space-based microprocessors, one of the most vulnerable portions is often the first- and second-level cache memories, because these must be very small and very fast, which means that they do not hold much charge. These caches are often disabled when terrestrial designs are configured to survive SEUs. Another point of vulnerability is the state machine in the microprocessor control, because of the risk of entering "dead" states (states with no exits). However, these circuits must drive the entire processor, so they have relatively large transistors to provide relatively large electric currents and are not as vulnerable as one might think.

Another vulnerable processor component is RAM, and more specifically the static RAM (SRAM) used in cache memories. SRAM memories are usually designed with transistor sizes close to the minimum allowed by the technology in order to fit the maximum number of bits per unit area. Small transistor sizes and high bit density make memories one of the components most susceptible to SEUs. [4] To ensure resilience to SEUs, error-correcting memory is often used, together with circuitry that periodically reads the memory (when reading leads to correction) or scrubs it (when reading does not lead to correction), so that errors are removed before they overwhelm the error-correcting circuitry.
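The idea of error-correcting memory combined with scrubbing can be sketched in software. The example below is a minimal illustration and not a description of any particular device: it assumes a Hamming(7,4) single-error-correcting code (real memories typically use wider codes), and the names `encode`, `syndrome`, `correct`, `decode`, and `scrub` are hypothetical.

```python
def encode(nibble: int) -> int:
    """Encode 4 data bits as a 7-bit Hamming(7,4) codeword.

    Codeword positions are numbered 1..7 (position p is stored at bit p-1).
    Parity bits sit at positions 1, 2, 4; data bits at positions 3, 5, 6, 7.
    """
    bits = [0] * 8  # index 0 unused
    for i, pos in enumerate((3, 5, 6, 7)):
        bits[pos] = (nibble >> i) & 1
    bits[1] = bits[3] ^ bits[5] ^ bits[7]
    bits[2] = bits[3] ^ bits[6] ^ bits[7]
    bits[4] = bits[5] ^ bits[6] ^ bits[7]
    return sum(bits[pos] << (pos - 1) for pos in range(1, 8))


def syndrome(word: int) -> int:
    """XOR of the positions of all set bits: 0 if valid, else the flipped position."""
    s = 0
    for pos in range(1, 8):
        if (word >> (pos - 1)) & 1:
            s ^= pos
    return s


def correct(word: int) -> int:
    """Return the codeword with a single-bit error (if any) flipped back."""
    s = syndrome(word)
    return word ^ (1 << (s - 1)) if s else word


def decode(word: int) -> int:
    """Correct the word, then extract the 4 data bits."""
    w = correct(word)
    return sum(((w >> (pos - 1)) & 1) << i for i, pos in enumerate((3, 5, 6, 7)))


def scrub(memory: list[int]) -> int:
    """Read every word and write back the corrected value; return the number of fixes."""
    fixed = 0
    for addr, word in enumerate(memory):
        good = correct(word)
        if good != word:
            memory[addr] = good
            fixed += 1
    return fixed


if __name__ == "__main__":
    memory = [encode(n) for n in range(16)]
    memory[5] ^= 1 << 2   # simulated upset: flip one stored bit
    memory[9] ^= 1 << 6   # a second upset, in a different word
    print("words repaired by scrub:", scrub(memory))
    print("data intact:", [decode(w) for w in memory] == list(range(16)))
```

A code of this kind corrects only one flipped bit per word. If a second upset hits the same word before the first is repaired, the word is miscorrected, which is why the scrub pass has to run often enough relative to the expected upset rate.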

In digital and analog circuits, a single event may cause one or more voltage pulses (i.e. glitches) to propagate through the circuit, in which case it is referred to as a single-event transient (SET). Since the propagating pulse is not technically a change of "state" as in a memory SEU, one should differentiate between SET and SEU. If a SET propagates through digital circuitry and results in an incorrect value being latched in a sequential logic unit, it is then considered an SEU.
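The distinction between a transient and an upset can be made concrete with a simplified timing model. The sketch below assumes an edge-triggered flip-flop that samples at times 0, T, 2T, and so on, with a small sampling window around each edge, and treats a SET as latched only if the pulse overlaps one of those windows. The model, the function names, and the parameters are all illustrative assumptions, and the window is assumed to be shorter than the clock period.

```python
import math


def set_is_latched(pulse_start: float, pulse_width: float, clock_period: float,
                   setup: float = 0.0, hold: float = 0.0) -> bool:
    """True if the pulse overlaps the sampling window [edge - setup, edge + hold]
    of some clock edge, where edges occur at integer multiples of clock_period."""
    pulse_end = pulse_start + pulse_width
    # Earliest clock edge whose window has not already closed when the pulse starts.
    k = math.ceil((pulse_start - hold) / clock_period)
    edge = k * clock_period
    # The pulse is captured only if it is still present when that window opens.
    return edge - setup <= pulse_end


def latch_probability(pulse_width: float, clock_period: float,
                      setup: float = 0.0, hold: float = 0.0) -> float:
    """Approximate capture probability for a pulse arriving at a uniformly random time."""
    return min(1.0, (pulse_width + setup + hold) / clock_period)


if __name__ == "__main__":
    T = 10.0  # hypothetical clock period, arbitrary time units
    print(set_is_latched(pulse_start=3.0, pulse_width=2.0, clock_period=T))  # ends before the edge at t=10
    print(set_is_latched(pulse_start=9.0, pulse_width=2.0, clock_period=T))  # spans the edge at t=10
    print(latch_probability(pulse_width=2.0, clock_period=T))
```

In this model a transient that dies out between clock edges never becomes an SEU, while a faster clock (a smaller `clock_period`) raises the chance that the same pulse is captured.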

Hardware problems can also occur for related reasons. Under certain circumstances (depending on circuit design, process design, and particle properties), a "parasitic" thyristor inherent to CMOS designs can be activated, effectively causing an apparent short circuit from power to ground. This condition is referred to as latch-up, and in the absence of constructional countermeasures it often destroys the device through thermal runaway. Most manufacturers design to prevent latch-up and test their products to ensure that latch-up does not occur from atmospheric particle strikes. To prevent latch-up in space, epitaxial substrates, silicon on insulator (SOI), or silicon on sapphire (SOS) are often used to further reduce or eliminate the susceptibility.

Notable SEU

See also


References

  1. Neutron-Induced Single Event Upset (SEU) FAQ, Microsemi Corporation, retrieved October 7, 2018. "The cause has been traced to errors in an onboard computer suspected to have been induced by cosmic rays."
  2. Binder; Smith; Holman (1975). "Satellite Anomalies from Galactic Cosmic Rays". IEEE Transactions on Nuclear Science. NS-22 (6): 2675–2680. Bibcode:1975ITNS...22.2675B. doi:10.1109/TNS.1975.4328188. S2CID 3032512.
  3. Petersen; Koga; Shoga; Pickel; Price (2013). "The Single Event Revolution". IEEE Transactions on Nuclear Science. 60 (3).
  4. Torrens, G.; Alheyasat, A.; Alorda, B.; Barcelo, S.; Segura, J.; Bota, S. A. (2020). "Transistor Width Effect on the Power Supply Voltage Dependence of α-SER in CMOS 6T SRAM". IEEE Transactions on Nuclear Science. 67 (5): 811–817. Bibcode:2020ITNS...67..811T. doi:10.1109/TNS.2020.2983586. ISSN 0018-9499. S2CID 216198845.
  5. Ian Johnston (17 February 2017). "Cosmic particles can change elections and cause planes to fall through the sky, scientists warn". Independent. Retrieved 5 September 2018.
  6. The Invisible Neutron Threat (2012), Target 4 Flight Path 30L Publications, Los Alamos National Laboratory.

Further reading

General SEU
SEU in programmable logic devices
SEU in microprocessors
SEU related masters theses and doctoral dissertations