Physics of failure

Physics of failure is a technique under the practice of reliability design that leverages the knowledge and understanding of the processes and mechanisms that induce failure to predict reliability and improve product performance.

Overview

The concept of Physics of Failure, also known as Reliability Physics, involves the use of degradation algorithms that describe how physical, chemical, mechanical, thermal, or electrical mechanisms evolve over time and eventually induce failure. While the concept of Physics of Failure is common in many structural fields, [2] the specific branding evolved from an attempt to better predict the reliability of early generation electronic parts and systems.

The beginning

Within the electronics industry, the major driver for the implementation of Physics of Failure was the poor performance of military weapon systems during World War II. [3] During the subsequent decade, the United States Department of Defense funded extensive efforts to improve the reliability of electronics, [4] with the initial work focused on after-the-fact or statistical methodology. [5] Unfortunately, the rapid evolution of electronics, with new designs, new materials, and new manufacturing processes, tended to quickly invalidate approaches and predictions derived from older technology. In addition, the statistical approach tended to lead to expensive and time-consuming testing. The need for different approaches led to the birth of Physics of Failure at the Rome Air Development Center (RADC). [6] Under the auspices of the RADC, the first Physics of Failure in Electronics Symposium was held in September 1962. [7] The goal of the program was to relate the fundamental physical and chemical behavior of materials to reliability parameters. [8]

Early history – integrated circuits

The initial focus of physics of failure techniques tended to be limited to degradation mechanisms in integrated circuits, primarily because the rapid evolution of the technology created a need to capture and predict performance several generations ahead of existing products.

One of the first major successes of predictive physics of failure was a formula [9] developed by James Black of Motorola to describe the behavior of electromigration. Electromigration occurs when momentum transfer from conducting electrons dislodges metal atoms in a conductor and moves them in the direction of electron flow, at a rate proportional to the current density. Black used this knowledge, in combination with experimental findings, to describe the mean time to failure due to electromigration as

MTTF = A J^(−n) exp(Ea / kT)

where A is a constant based on the cross-sectional area of the interconnect, J is the current density, Ea is the activation energy (e.g. 0.7 eV for grain boundary diffusion in aluminum), k is the Boltzmann constant, T is the temperature, and n is a scaling factor (usually set to 2 according to Black).
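As an illustrative sketch, Black's relation can be evaluated numerically. The constant A and the operating conditions below are assumed values for demonstration, not data from any real process:

```python
import math

K_BOLTZMANN_EV = 8.617e-5  # Boltzmann constant, eV/K

def black_mttf(a_const, current_density, ea_ev, temp_k, n=2.0):
    """Black's equation: MTTF = A * J**(-n) * exp(Ea / (k*T)).

    a_const         -- empirical constant tied to interconnect cross-section (assumed)
    current_density -- J, current density in A/cm^2
    ea_ev           -- activation energy in eV (0.7 eV for grain-boundary
                       diffusion in aluminum)
    temp_k          -- absolute temperature in kelvin
    n               -- current-density exponent (2 per Black)
    """
    return a_const * current_density ** (-n) * math.exp(ea_ev / (K_BOLTZMANN_EV * temp_k))

# Hotter interconnects and higher current densities both shorten predicted life.
assert black_mttf(1e14, 1e6, 0.7, 400.0) < black_mttf(1e14, 1e6, 0.7, 350.0)
assert black_mttf(1e14, 2e6, 0.7, 350.0) < black_mttf(1e14, 1e6, 0.7, 350.0)
```

In practice A and n are fit from accelerated stress tests, after which the same expression extrapolates to use conditions.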

Physics of failure is typically designed to predict wearout, or an increasing failure rate, but this initial success by Black focused on predicting behavior during operational life, or a constant failure rate. This is because electromigration in traces can be designed out by following design rules, while electromigration at vias is primarily an interfacial effect, which tends to be defect- or process-driven.

Leveraging this success, additional physics-of-failure based algorithms have been derived for the three other major degradation mechanisms (time dependent dielectric breakdown [TDDB], hot carrier injection [HCI], and negative bias temperature instability [NBTI]) in modern integrated circuits (equations shown below). More recent work has attempted to aggregate these discrete algorithms into a system-level prediction. [10]

TDDB: τ = τo(T) exp[ G(T)/ Eox] [11] where τo(T) = 5.4×10−7 exp(−Ea / kT), G(T) = 120 + 5.8/kT, and Eox is the electric field across the oxide.

HCI: λHCI = A3 exp(−β/VD) exp(−Ea / kT) [12] where λHCI is the failure rate of HCI, A3 is an empirical fitting parameter, β is an empirical fitting parameter, VD is the drain voltage, Ea is the activation energy of HCI, typically −0.2 to −0.1 eV, k is the Boltzmann constant, and T is absolute temperature.

NBTI: λ = A εox^m VT μp exp(−Ea / kT) [13] where A is determined empirically by normalizing the above equation, m = 2.9, VT is the thermal voltage, μp is the surface mobility constant, Ea is the activation energy of NBTI, k is the Boltzmann constant, and T is the absolute temperature.
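As a numerical sketch of the TDDB expression above, the model can be coded directly; the activation energy used for τo is an illustrative assumption (the text does not specify it), and the oxide field is taken in MV/cm so that G(T) and Eox share units:

```python
import math

K_EV = 8.617e-5  # Boltzmann constant, eV/K

def tddb_time_to_breakdown(e_ox, temp_k, ea_ev=0.28):
    """1/E-style TDDB model: tau = tau0(T) * exp(G(T) / Eox).

    e_ox   -- electric field across the oxide, MV/cm (assumed units)
    temp_k -- absolute temperature, K
    ea_ev  -- activation energy for tau0 (illustrative value)
    """
    kt = K_EV * temp_k
    tau0 = 5.4e-7 * math.exp(-ea_ev / kt)  # tau0(T) = 5.4e-7 * exp(-Ea/kT)
    g = 120.0 + 5.8 / kt                   # G(T) = 120 + 5.8/kT, in MV/cm
    return tau0 * math.exp(g / e_ox)

# A stronger oxide field accelerates dielectric breakdown.
assert tddb_time_to_breakdown(10.0, 300.0) < tddb_time_to_breakdown(8.0, 300.0)
```

At 300 K the field-acceleration term G(T) evaluates to roughly 344 MV/cm, consistent with values commonly quoted for the 1/E breakdown model.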

Next stage – electronic packaging

The resources and successes with integrated circuits, and a review of some of the drivers of field failures, subsequently motivated the reliability physics community to initiate physics of failure investigations into package-level degradation mechanisms. An extensive amount of work was performed to develop algorithms that could accurately predict the reliability of interconnects. Specific interconnects of interest resided at 1st level (wire bonds, solder bumps, die attach), 2nd level (solder joints), and 3rd level (plated through holes).

Just as the integrated circuit community had four major successes with physics of failure at the die level, the component packaging community had four major successes arise from its work in the 1970s and 1980s. These were:

Peck: [14] Predicts time to failure of wire bond / bond pad connections when exposed to elevated temperature / humidity

TTF = A (RH)^(−n) f(V) exp(Ea / kBT)

where A is a constant, RH is the relative humidity, n is an empirical exponent (Peck reported a value near 2.7), f(V) is a voltage function (often cited as voltage squared), Ea is the activation energy, kB is the Boltzmann constant, and T is absolute temperature.
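Because the constant A and the voltage function cancel when two conditions are compared, Peck's model is most often used to compute an acceleration factor between a test environment and field use. A minimal sketch, using the commonly cited n ≈ 2.7 and Ea ≈ 0.79 eV as assumed parameter values:

```python
import math

K_B_EV = 8.617e-5  # Boltzmann constant, eV/K

def peck_ttf(a_const, rh_percent, n, ea_ev, temp_k, voltage_factor=1.0):
    """Peck model sketch: TTF = A * RH**(-n) * f(V) * exp(Ea / (kB*T))."""
    return (a_const * rh_percent ** (-n) * voltage_factor
            * math.exp(ea_ev / (K_B_EV * temp_k)))

# Acceleration factor of an 85 C / 85% RH test over 40 C / 50% RH field use;
# A and f(V) cancel in the ratio, so placeholder values of 1.0 suffice.
af = peck_ttf(1.0, 50.0, 2.7, 0.79, 313.15) / peck_ttf(1.0, 85.0, 2.7, 0.79, 358.15)
assert af > 1.0
```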

Engelmaier: [15] Predicts time to failure of solder joints exposed to temperature cycling

Nf = ½ [F LD Δα ΔT / (2 εf h)]^(1/c)

where εf is a fatigue ductility coefficient, c is a time- and temperature-dependent constant, F is an empirical constant, LD is the distance from the neutral point, Δα is the coefficient of thermal expansion mismatch, ΔT is the change in temperature, and h is the solder joint thickness.
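A sketch of the Engelmaier relation follows; all parameter values are illustrative assumptions, and c is negative, as it is in practice, so larger cyclic strain yields fewer predicted cycles:

```python
def engelmaier_cycles(ef, c, f_const, l_d, d_alpha, d_temp, h):
    """Engelmaier sketch: Nf = 0.5 * (F*LD*dalpha*dT / (2*ef*h)) ** (1/c).

    Lengths (l_d, h) must share one unit; d_alpha is the CTE mismatch in 1/K.
    """
    cyclic_strain = f_const * l_d * d_alpha * d_temp / (2.0 * ef * h)
    return 0.5 * cyclic_strain ** (1.0 / c)

# Larger temperature swings predict fewer cycles to failure (since c < 0).
assert engelmaier_cycles(0.325, -0.442, 1.0, 10.0, 14e-6, 100.0, 0.1) < \
       engelmaier_cycles(0.325, -0.442, 1.0, 10.0, 14e-6, 50.0, 0.1)
```

The predicted cycle count is then compared against the expected number of thermal cycles over the product's service life.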

Steinberg: [16] Predicts time to failure of solder joints exposed to vibration

Z = 9.8 Grms / fn^2, with Grms = [(π/2) PSD fn Q]^(1/2), and Zc = 0.00022 B / (c h r √L)

where Z is maximum displacement, PSD is the power spectral density (g^2/Hz), fn is the natural frequency of the CCA, Q is transmissibility (assumed to be the square root of the natural frequency), Zc is the critical displacement (20 million cycles to failure), B is the length of the PCB edge parallel to the component located at the center of the board, c is a component packaging constant, h is PCB thickness, r is a relative position factor, and L is component length.
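A sketch comparing the board's dynamic displacement to Steinberg's critical limit; units follow Steinberg's convention (inches, Hz), and every input value below is an illustrative assumption:

```python
import math

def dynamic_displacement(psd, fn):
    """RMS displacement (inches) of a board at natural frequency fn (Hz).

    Grms follows Miles' equation with transmissibility Q = sqrt(fn),
    as assumed in the text; psd is the input PSD in g^2/Hz.
    """
    q = math.sqrt(fn)
    g_rms = math.sqrt((math.pi / 2.0) * psd * fn * q)
    return 9.8 * g_rms / fn ** 2

def critical_displacement(b, c, h, r, l):
    """Critical displacement Zc (inches) for ~20 million stress reversals:
    Zc = 0.00022 * B / (c * h * r * sqrt(L))."""
    return 0.00022 * b / (c * h * r * math.sqrt(l))

# Stiffer boards (higher natural frequency) deflect less under the same PSD.
assert dynamic_displacement(0.04, 400.0) < dynamic_displacement(0.04, 200.0)
```

A design passes this screen when the computed displacement stays below Zc for the component of interest.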

IPC-TR-579: [17] Predicts time to failure of plated through holes exposed to temperature cycling

where a is the coefficient of thermal expansion (CTE), T is temperature, E is the elastic modulus, h is board thickness, d is hole diameter, t is plating thickness, the subscripts E and Cu denote the corresponding board and copper properties respectively, Su is the ultimate tensile strength of the plated copper, Df is its ductility, and De is the strain range.

Each of the equations above combines knowledge of the degradation mechanism with test experience into a first-order model that allows the design or reliability engineer to predict time-to-failure behavior from information on the design architecture, materials, and environment.

Recent work

More recent work in the area of physics of failure has focused on predicting the time to failure of new materials (e.g., lead-free solder, [18] [19] high-K dielectrics [20] ) and software programs, [21] using the algorithms for prognostic purposes, [22] and integrating physics of failure predictions into system-level reliability calculations. [23]

Limitations

There are some limitations to the use of physics of failure in design assessments and reliability prediction. The first is that physics of failure algorithms typically assume a 'perfect design'. Attempting to understand the influence of defects can be challenging, and it often limits Physics of Failure (PoF) predictions to end-of-life behavior (as opposed to infant mortality or useful operating life). In addition, some companies have so many use environments (personal computers, for example) that performing a PoF assessment for each potential combination of temperature, vibration, humidity, power cycling, etc. would be onerous and potentially of limited value.


References

  1. JEDEC JEP148, April 2004, Reliability Qualification of Semiconductor Devices Based on Physics of Failure Risk and Opportunity Assessment
  2. http://www.iagtcommittee.com/downloads/08-3-1%20Prakash%20Patnaik%20-%20Life%20Evaluation%20and%20Extension%20Program.pdf, Gas Turbine Materials/Components Life Evaluation & Extension Programs, Dr. Prakash Patnaik, Director SMPL, National Research Council Canada, Institute for Aerospace Research, Ottawa, Canada, 21 October 2008
  3. http://theriac.org/DeskReference/PDFs/2011Q1/2011Q1-article2.pdf, A Short History of Reliability.
  4. R. Lusser, Unreliability of Electronics – Cause and Cure, Redstone Arsenal, Huntsville, AL, DTIC Document
  5. J. Spiegel and E.M. Bennett, Military System Reliability: Department of Defense Contributions, IRE Transactions on Reliability and Quality Control, Dec. 1960, Volume: RQC-9 Issue:3
  6. George H. Ebel, Reliability Physics in Electronics: A Historical View, IEEE Transactions on Reliability, Vol. 47, No. 3-SP, September 1998, p. SP-379
  7. This would eventually evolve into the current International Reliability Physics Symposium (IRPS)
  8. Vaccaro “Reliability and the physics of failure program at RADC”, Physics of Failure in Electronics, 1963, pp. 4–10; Spartan.
  9. James Black, Mass Transport of Aluminum by Momentum Exchange with Conducting Electrons, 6th Annual Reliability Physics Symposium, November 1967
  10. http://www.dfrsolutions.com/uploads/publications/ICWearout_Paper.pdf, E. Wyrwas, L. Condra, and A. Hava, Accurate Quantitative Physics-of-Failure Approach to Integrated Circuit Reliability, IPC APEX Expo, Las Vegas, NV, April 2011
  11. Schuegraf and Hu, "A Model for Gate Oxide Breakdown", IEEE Trans. Electron Dev., May 1994.
  12. Takeda, E. Suzuki, N. "An empirical model for device degradation due to Hot-Carrier Injection", IEEE Electron Device Letters, Vol 4, Num 4, 1983, pp. 111–113.
  13. Chen, Y.F. Lin, M.H. Chou, C.H. Chang, W.C. Huang, S.C. Chang, Y.J. Fu, K.Y. "Negative Bias Temperature Instability (NBTI) in Deep Sub-micron p+-gate pMOSFETS", 2000 IRW Final Report, p98-101
  14. Peck, D.S.; "New concerns about integrated circuit reliability", Electron Devices, IEEE Transactions on, vol. 26, no. 1, pp. 38–43, Jan 1979
  15. Engelmaier, W.; "Fatigue Life of Leadless Chip Carrier Solder Joints During Power Cycling", Components, Hybrids, and Manufacturing Technology, IEEE Transactions on, vol. 6, no. 3, pp. 232–237, Sep 1983
  16. D. S. Steinberg, Vibration Analysis For Electronic Equipment, John Wiley & Sons Inc., New York, first ed. 1973, second ed. 1988, third ed. 2000
  17. IPC-TR-579, Round Robin Reliability Evaluation of Small Diameter Plated-Through Holes in Printed Wiring Boards, September 1988
  18. http://www.dfrsolutions.com/uploads/publications/2006_Blattau_IPC_working.pdf, N. Blattau and C. Hillman "An Engelmaier model for leadless ceramic chip devices with Pb-free solder", J. Reliab. Inf. Anal. Cntr., vol. First Quarter, p. 7, 2007.
  19. O. Salmela, K. Andersson, A. Perttula, J. Sarkka and M. Tammenmaa "Modified Engelmaier's model taking account of different stress levels", Microelectron. Reliab., vol. 48, p. 773, 2008
  20. Raghavan, N.; Prasad, K.; "Statistical outlook into the physics of failure for copper low-k intra-metal dielectric breakdown", Reliability Physics Symposium, 2009 IEEE International, vol., no., pp. 819–824, 26–30 April 2009
  21. Bukowski, J.V.; Johnson, D.A.; Goble, W.M.; "Software-reliability feedback: a physics-of-failure approach", Reliability and Maintainability Symposium, 1992. Proceedings., Annual, vol., no., pp. 285–289, 21–23 Jan 1992
  22. NASA.gov NASA Prognostic Center of Excellence
  23. http://www.dfrsolutions.com/uploads/publications/2010_01_RAMS_Paper.pdf, McLeish, J.G.; "Enhancing MIL-HDBK-217 reliability predictions with physics of failure methods", Reliability and Maintainability Symposium (RAMS), 2010 Proceedings – Annual, vol., no., pp. 1–6, 25–28 Jan. 2010