Fault (technology)

In engineering, a fault is a defect or problem in a system that causes it to fail or behave abnormally.

The ISO document 10303-226 defines fault as an abnormal condition or defect at the component, equipment, or sub-system level which may lead to a failure.

The United States Glossary of Telecommunication Terms defines fault for telecommunications as:

  1. An accidental condition that causes a functional unit to fail to perform its required function. See § Random fault.
  2. A defect that causes a reproducible or catastrophic malfunction. A malfunction is considered reproducible if it occurs consistently under the same circumstances. See § Systematic fault.
  3. In power systems, an unintentional short circuit, or partial short circuit, between energized conductors or between an energized conductor and ground. A distinction can be made between symmetric and asymmetric faults. See Fault (power engineering).

Random fault

A random fault occurs as a result of wear or other deterioration.

Since deterioration progresses somewhat randomly, it is not possible to predict when a particular unit will develop a fault. However, the rate at which a particular fault occurs among a large number of units can often be predicted with considerable accuracy.
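
The contrast between unpredictable individual units and a predictable overall rate can be illustrated with a small simulation. This is only a sketch under stated assumptions: the exponential lifetime model (a constant failure rate), the mean life of 10,000 hours, and the function names are all hypothetical choices made for illustration, not figures from any real product.

```python
import math
import random


def simulate_failure_times(n_units, mean_life_hours, seed=0):
    """Draw one random time-to-fault per unit.

    Assumption: lifetimes follow an exponential distribution, i.e. a
    constant failure rate. Real wear-out behaviour is often more complex.
    """
    rng = random.Random(seed)
    return [rng.expovariate(1.0 / mean_life_hours) for _ in range(n_units)]


def fraction_failed_by(times, hours):
    """Fraction of units that developed a fault within the given time."""
    return sum(t <= hours for t in times) / len(times)


# Hypothetical fleet: 100,000 units with a mean life of 10,000 hours.
times = simulate_failure_times(100_000, mean_life_hours=10_000.0)

# Individual units differ widely, so no single unit is predictable.
shortest, longest = min(times), max(times)

# The fleet-wide fraction failed within 1,000 hours is close to theory.
observed = fraction_failed_by(times, 1_000.0)
expected = 1.0 - math.exp(-1_000.0 / 10_000.0)  # about 0.095

print(f"shortest life: {shortest:.1f} h, longest life: {longest:.1f} h")
print(f"observed fraction failed: {observed:.4f}, expected: {expected:.4f}")
```

The spread between the shortest and longest simulated lifetimes is very large, while the observed fraction of faulty units stays close to the value the model predicts.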

Manufacturers often accept random faults as a risk if the probability of their occurring is negligible.

A fault can happen in virtually any object or appliance, most commonly in electronics and machinery.

For example, an Xbox 360 console can deteriorate over time as dust builds up in its fans. This can cause the console to overheat, report an error, and shut down.

Systematic fault

A systematic fault results from an error in design, such that every copy has the same fault. Sometimes a systematic fault remains undetected for a long time even if many copies are in use. The fault might be triggered when conditions change, and could then cause every copy to fail at the same time.

Software can have faults, also known as bugs, but since software cannot deteriorate, all software faults are systematic.[citation needed]
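
A minimal sketch of how a systematic fault behaves in software is given below. The function and its design error are hypothetical, invented purely for illustration: the code assumes that February always has 28 days, so every copy works until conditions change (a leap year) and then every copy fails in the same way.

```python
from datetime import date


def day_of_year_buggy(d: date) -> int:
    """Return the day number within the year.

    Hypothetical design error: the table assumes February always has
    28 days, so results after February are wrong in leap years.
    """
    days_before_month = [0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334]
    return days_before_month[d.month - 1] + d.day


# Five "deployed copies" of the same design share the same fault.
copies = [day_of_year_buggy] * 5

normal_day = date(2023, 3, 1)  # not a leap year: the fault stays hidden
leap_day = date(2024, 3, 1)    # leap year: the fault is triggered

# Compare each copy with the standard library's own day-of-year value.
ok_results = [f(normal_day) == normal_day.timetuple().tm_yday for f in copies]
bad_results = [f(leap_day) == leap_day.timetuple().tm_yday for f in copies]

print("correct in 2023:", ok_results)
print("correct in 2024:", bad_results)
```

Unlike a random fault, nothing here depends on chance or wear: the same input produces the same wrong result on every copy, every time.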

Related Research Articles

In computing, a segmentation fault or access violation is a fault, or failure condition, raised by hardware with memory protection, notifying the operating system (OS) that software has attempted to access a restricted area of memory. On standard x86 computers, this is a form of general protection fault. In response, the operating system kernel will usually perform some corrective action, generally passing the fault on to the offending process by sending the process a signal. Processes can in some cases install a custom signal handler, allowing them to recover on their own, but otherwise the OS default signal handler is used, generally causing abnormal termination of the process, and sometimes a core dump.

Software testing: Checking software against a standard

Software testing is the act of checking whether software satisfies expectations.

A software bug is an error, flaw or fault in the design, development, or operation of computer software that causes it to produce an incorrect or unexpected result, or to behave in unintended ways. The process of finding and correcting bugs is termed "debugging" and often uses formal techniques or tools to pinpoint bugs. Since the 1950s, some computer systems have been designed to detect or auto-correct various software errors during operations.

Fault commonly refers to:

Debugger: Computer program used to test and debug other programs

A debugger or debugging tool is a computer program used to test and debug other programs. The main use of a debugger is to run the target program under controlled conditions that permit the programmer to track its execution and monitor changes in computer resources that may indicate malfunctioning code. Typical debugging facilities include the ability to run or halt the target program at specific points, display the contents of memory, CPU registers or storage devices, and modify memory or register contents in order to enter selected test data that might be a cause of faulty program execution.

Crash (computing): When a computer program stops functioning properly and self-terminates

In computing, a crash, or system crash, occurs when a computer program such as a software application or an operating system stops functioning properly and exits. On some operating systems or individual applications, a crash reporting service will report the crash and any details relating to it, usually to the developer(s) of the application. If the program is a critical part of the operating system, the entire system may crash or hang, often resulting in a kernel panic or fatal system error.

Glitch: Short-lived fault in a computer system

A glitch is a short-lived fault in a system, such as a transient fault that corrects itself, making it difficult to troubleshoot. The term is particularly common in the computing and electronics industries, in circuit bending, as well as among players of video games. More generally, all types of systems including human organizations and nature experience glitches.

Ariane flight V88: Failed maiden flight of Ariane 5, 1996

Ariane flight V88 was the failed maiden flight of the Arianespace Ariane 5 rocket, vehicle no. 501, on 4 June 1996. It carried the Cluster spacecraft, a constellation of four European Space Agency research satellites.

In systems engineering, dependability is a measure of a system's availability, reliability, maintainability, and in some cases, other characteristics such as durability, safety and security. In real-time computing, dependability is the ability to provide services that can be trusted within a time-period. The service guarantees must hold even when the system is subject to attacks or natural failures.

In computing, a page fault is an exception that the memory management unit (MMU) raises when a process accesses a memory page without proper preparations. Accessing the page requires a mapping to be added to the process's virtual address space. In addition, the actual page contents may need to be loaded from a backing store, such as a disk. The MMU detects the page fault, but the operating system's kernel handles the exception by making the required page accessible in physical memory or by denying an illegal memory access.

Troubleshooting is a form of problem solving, often applied to repair failed products or processes on a machine or a system. It is a logical, systematic search for the source of a problem in order to solve it, and make the product or process operational again. Troubleshooting is needed to identify the symptoms. Determining the most likely cause is a process of elimination—eliminating potential causes of a problem. Finally, troubleshooting requires confirmation that the solution restores the product or process to its working state.

Reliability engineering is a sub-discipline of systems engineering that emphasizes the ability of equipment to function without failure. Reliability describes the ability of a system or component to function under stated conditions for a specified period. Reliability is closely related to availability, which is typically described as the ability of a component or system to function at a specified moment or interval of time.

Game testing, also called quality assurance (QA) testing within the video game industry, is a software testing process for quality control of video games. The primary function of game testing is the discovery and documentation of software defects. Interactive entertainment software testing is a highly technical field requiring computing expertise, analytic competence, critical evaluation skills, and endurance. In recent years the field of game testing has come under fire for being extremely strenuous and unrewarding, both financially and emotionally.

Xbox 360 technical problems: Errors on Xbox 360

The Xbox 360 video game console is subject to a number of technical problems and failures that can render it unusable. However, many of the issues can be identified by a series of glowing red lights flashing on the face of the console; the three flashing red lights, nicknamed the "Red Ring of Death" or "RRoD", are the most infamous. Other issues also arise with the console, such as discs becoming scratched in the drive and "bricking" of consoles due to dashboard updates. Since its release on November 22, 2005, many articles have appeared in the media reporting the Xbox 360's failure rates, with the latest estimate by warranty provider SquareTrade being 23.7% in 2009, and the highest estimate being 54.2%, from a Game Informer survey.

The term downtime is used to refer to periods when a system is unavailable. The unavailability is the proportion of a time-span that a system is unavailable or offline. This is usually a result of the system failing to function because of an unplanned event, or because of routine maintenance.
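
The definition of unavailability as a proportion of a time span can be written out as a short calculation. This is a sketch only: the function name and the example figures (8.76 hours of downtime in a year of 8,760 hours) are hypothetical and chosen for illustration.

```python
def unavailability(downtime_hours: float, period_hours: float) -> float:
    """Proportion of a time span during which a system was unavailable."""
    if period_hours <= 0:
        raise ValueError("period_hours must be positive")
    if not 0 <= downtime_hours <= period_hours:
        raise ValueError("downtime_hours must lie between 0 and period_hours")
    return downtime_hours / period_hours


# Hypothetical example: 8.76 hours of downtime over one 8,760-hour year.
u = unavailability(8.76, 8760.0)
availability = 1.0 - u

print(f"unavailability: {u:.4%}, availability: {availability:.4%}")
```

In this example the system is unavailable for about 0.1% of the year, which corresponds to an availability of about 99.9%.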

Blue screen of death: Error screen displayed after a fatal system error on a computer running Microsoft Windows or ReactOS

The blue screen of death is a critical error screen displayed by the Microsoft Windows and ReactOS operating systems in the event of a fatal system error. It indicates a system crash, in which the operating system has reached a critical condition where it can no longer operate safely.

In computer programming and software development, debugging is the process of finding and resolving bugs within computer programs, software, or systems.

ISO 26262, titled "Road vehicles – Functional safety", is an international standard for functional safety of electrical and/or electronic systems that are installed in serial production road vehicles, defined by the International Organization for Standardization (ISO) in 2011, and revised in 2018.

Condition monitoring of transformers in electrical engineering is the process of acquiring and processing data related to various parameters of transformers to determine their state of quality and predict their failure. This is done by observing the deviation of the transformer parameters from their expected values. Transformers are the most critical assets of electrical transmission and distribution systems, and their failures could cause power outages, personal and environmental hazards, and expensive rerouting or purchase of power from other suppliers. Identifying a transformer which is near failure can allow it to be replaced under controlled conditions at a non-critical time and avoid a system failure.

In engineering, a bug is a design defect in an engineered system that causes an undesired result.