Human reliability

In the field of human factors and ergonomics, human reliability (also known as human performance or HU) is the probability that a human performs a task to a sufficient standard. [1] Reliability of humans can be affected by many factors such as age, physical health, mental state, attitude, emotions, personal propensity for certain mistakes, and cognitive biases.

Human reliability is important to the resilience of socio-technical systems, and has implications for fields like manufacturing, medicine and nuclear power. Attempts made to decrease human error and increase reliability in human interaction with technology include user-centered design and error-tolerant design.

Factors affecting human performance

Human error, human performance, and human reliability are especially important to consider when work is performed in a complex and high-risk environment. [2]

Under performance-shaping factors such as psychological stress, cognitive load, and fatigue, people often cope by falling back on heuristics, which can introduce biases such as confirmation bias, the availability heuristic, and frequency bias.

Human reliability analysis

A variety of methods exist for human reliability analysis (HRA). [3] [4] Two general classes of methods are those based on probabilistic risk assessment (PRA) and those based on a cognitive theory of control.

PRA-based techniques

One method for analyzing human reliability is a straightforward extension of probabilistic risk assessment (PRA): just as equipment can fail in a power plant, a human operator can commit errors. In both cases, an analysis (functional decomposition for equipment and task analysis for humans) articulates a level of detail at which failure or error probabilities can be assigned. This is the basic idea behind the Technique for Human Error Rate Prediction (THERP). [5] THERP is intended to generate human error probabilities that can be incorporated into a PRA. The Accident Sequence Evaluation Program (ASEP) human reliability procedure is a simplified form of THERP; an associated computational tool is the Simplified Human Error Analysis Code (SHEAN). [6] More recently, the US Nuclear Regulatory Commission has published the Standardized Plant Analysis Risk – Human Reliability Analysis (SPAR-H) method to take account of the potential for human error. [7] [8]
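
As a rough illustration of how such a task decomposition feeds a PRA, the sketch below (Python) combines subtask human error probabilities into an overall task failure probability under a single performance-shaping multiplier. The subtasks, nominal probabilities, and multiplier are invented for this example, and the calculation is only a schematic of the idea, not THERP's actual procedure or data tables.

```python
# Illustrative only: the nominal human error probabilities (HEPs) and the
# performance-shaping-factor multiplier are invented for this sketch;
# THERP itself uses published data tables and HRA event trees.

subtasks = {
    "read procedure step": 0.003,     # hypothetical nominal HEP per subtask
    "select correct switch": 0.001,
    "verify indicator state": 0.010,
}

stress_multiplier = 2.0  # assumed multiplier for, e.g., moderately high stress

def task_failure_probability(heps, psf=1.0):
    """P(task fails) = 1 - product of subtask success probabilities,
    assuming independent subtasks and one multiplicative PSF."""
    p_success = 1.0
    for nominal in heps.values():
        adjusted = min(nominal * psf, 1.0)
        p_success *= 1.0 - adjusted
    return 1.0 - p_success

p = task_failure_probability(subtasks, stress_multiplier)
print(f"Overall task failure probability: {p:.4f}")
# This figure would then enter the plant PRA in the same way as a
# component failure probability.
```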

Cognitive control based techniques

Erik Hollnagel has developed this line of thought in his work on the Contextual Control Model (COCOM) [9] and the Cognitive Reliability and Error Analysis Method (CREAM). [10] COCOM models human performance as a set of control modes: strategic (based on long-term planning), tactical (based on procedures), opportunistic (based on present context), and scrambled (random). It also proposes a model of how transitions between these control modes occur, based on factors including the human operator's estimate of the outcome of the action (success or failure), the time remaining to accomplish the action (adequate or inadequate), and the number of simultaneous goals of the human operator at that time. CREAM is a human reliability analysis method that is based on COCOM.
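
The transition logic can be made concrete with a small sketch. The rules and thresholds below are invented for illustration and are only a schematic reading of COCOM; Hollnagel describes the transitions qualitatively rather than as a fixed lookup.

```python
# Schematic COCOM-style selection of a control mode from three factors:
# the outcome of the previous action, the adequacy of remaining time,
# and the number of simultaneous goals. Thresholds are hypothetical.

def control_mode(last_action_succeeded: bool, time_adequate: bool, n_goals: int) -> str:
    if not time_adequate and not last_action_succeeded:
        return "scrambled"      # little time, things going wrong: near-random action
    if not time_adequate or n_goals > 2:
        return "opportunistic"  # act on whatever the present context affords
    if not last_action_succeeded:
        return "tactical"       # fall back on procedures and known rules
    return "strategic"          # enough time and success to plan long-term

print(control_mode(last_action_succeeded=True, time_adequate=True, n_goals=1))   # strategic
print(control_mode(last_action_succeeded=False, time_adequate=False, n_goals=3)) # scrambled
```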

Related techniques in safety engineering and reliability engineering include failure mode and effects analysis, HAZOP, fault tree analysis, and SAPHIRE (Systems Analysis Programs for Hands-on Integrated Reliability Evaluations).

Human Factors Analysis and Classification System (HFACS)

The Human Factors Analysis and Classification System (HFACS) was developed initially as a framework to understand the role of human error in aviation accidents. [11] [12] It is based on James Reason's Swiss cheese model of human error in complex systems. HFACS distinguishes between the "active failures" of unsafe acts, and "latent failures" of preconditions for unsafe acts, unsafe supervision, and organizational influences. These categories were developed empirically on the basis of many aviation accident reports.

"Unsafe acts" are performed by the human operator "on the front line" (e.g., the pilot, the air traffic controller, or the driver). Unsafe acts can be either errors (in perception, decision making or skill-based performance) or violations. Violations, or the deliberate disregard for rules and procedures, can be routine or exceptional. Routine violations occur habitually and are usually tolerated by the organization or authority. Exceptional violations are unusual and often extreme. For example, driving 60 mph in a 55-mph speed limit zone is a routine violation, while driving 130 mph in the same zone is exceptional.

There are two types of preconditions for unsafe acts: those that relate to the human operator's internal state and those that relate to the human operator's practices or ways of working. Adverse internal states include those related to physiology (e.g., illness) and mental state (e.g., mental fatigue, distraction). A third aspect of 'internal state' is a mismatch between the operator's ability and the task demands.

There are four types of unsafe supervision: inadequate supervision; planned inappropriate operations; failure to correct a known problem; and supervisory violations.

Organizational influences include those related to resources management (e.g., inadequate human or financial resources), organizational climate (structures, policies, and culture), and organizational processes (such as procedures, schedules, oversight).
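
To make the four-level structure concrete, the sketch below encodes the HFACS levels and categories described above as a simple lookup that could be used to tag findings from an accident report. The category labels follow the text above; the example findings and their tags are hypothetical.

```python
# HFACS levels and categories as summarized in this article; the "findings"
# below are invented examples, not data from any real accident report.

HFACS = {
    "unsafe acts": ["errors", "violations"],
    "preconditions for unsafe acts": [
        "adverse internal states", "practices or ways of working",
        "ability-task mismatch"],
    "unsafe supervision": [
        "inadequate supervision", "planned inappropriate operations",
        "failure to correct a known problem", "supervisory violations"],
    "organizational influences": [
        "resource management", "organizational climate", "organizational processes"],
}

findings = [
    ("pilot skipped a checklist item under time pressure", ("unsafe acts", "errors")),
    ("crew scheduled beyond duty limits",
     ("unsafe supervision", "planned inappropriate operations")),
]

for text, (level, category) in findings:
    assert category in HFACS[level], f"unknown HFACS category: {category}"
    print(f"{level} / {category}: {text}")
```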

Footnotes

  1. Calixto, Eduardo (2016), "Chapter 5 - Human Reliability Analysis", in Calixto, Eduardo (ed.), Gas and Oil Reliability Engineering (Second Edition), Boston: Gulf Professional Publishing, pp. 471–552, ISBN 978-0-12-805427-7, retrieved 2023-12-18.
  2. DOE-HDBK-1028-2009: https://www.standards.doe.gov/standards-documents/1000/1028-BHdbk-2009-v1/@@images/file
  3. Kirwan and Ainsworth, 1992
  4. Kirwan, 1994
  5. Swain & Guttmann, 1983
  6. Simplified Human Error Analysis Code (Wilson, 1993)
  7. SPAR-H
  8. Gertman et al., 2005
  9. Hollnagel, 1993
  10. Hollnagel, 1998
  11. Shappell and Wiegmann, 2000
  12. Wiegmann and Shappell, 2003

Related Research Articles

Safety engineering is an engineering discipline which assures that engineered systems provide acceptable levels of safety. It is strongly related to industrial engineering/systems engineering, and the subset system safety engineering. Safety engineering assures that a life-critical system behaves as needed, even when components fail.

Fault tree analysis (FTA) is a type of failure analysis in which an undesired state of a system is examined. This analysis method is mainly used in safety engineering and reliability engineering to understand how systems can fail, to identify the best ways to reduce risk and to determine event rates of a safety accident or a particular system level (functional) failure. FTA is used in the aerospace, nuclear power, chemical and process, pharmaceutical, petrochemical and other high-hazard industries; but is also used in fields as diverse as risk factor identification relating to social service system failure. FTA is also used in software engineering for debugging purposes and is closely related to cause-elimination technique used to detect bugs.

Task analysis is a fundamental tool of human factors engineering. It entails analyzing how a task is accomplished, including a detailed description of both manual and mental activities, task and element durations, task frequency, task allocation, task complexity, environmental conditions, necessary clothing and equipment, and any other unique factors involved in or required for one or more people to perform a given task.

The term workload can refer to several different yet related entities.

Safety culture is the element of organizational culture which is concerned with the maintenance of safety and compliance with safety standards. It is informed by the organization's leadership and the beliefs, perceptions and values that employees share in relation to risks within the organization, workplace or community. Safety culture has been described in a variety of ways: notably, the National Academies of Science and the Association of Land Grant and Public Universities have published summaries on this topic in 2014 and 2016.

Human error is an action that has been done but that was "not intended by the actor; not desired by a set of rules or an external observer; or that led the task or system outside its acceptable limits". Human error has been cited as a primary cause and contributing factor in disasters and accidents in industries as diverse as nuclear power, aviation, space exploration, and medicine. Prevention of human error is generally seen as a major contributor to reliability and safety of (complex) systems. Human error is one of the many contributing causes of risk events.

The system safety concept calls for a risk management strategy based on identification and analysis of hazards and the application of remedial controls using a systems-based approach. This is different from traditional safety strategies, which rely on control of the conditions and causes of an accident based either on epidemiological analysis or on investigation of individual past accidents. The concept of system safety is useful in demonstrating adequacy of technologies when difficulties are faced with probabilistic risk analysis. The underlying principle is one of synergy: a whole is more than the sum of its parts. A systems-based approach to safety requires the application of scientific, technical and managerial skills to hazard identification, hazard analysis, and the elimination, control, or management of hazards throughout the life-cycle of a system, program, project, activity or product. "Hazop" is one of several techniques available for identification of hazards.

Human Cognitive Reliability Correlation (HCR) is a technique used in the field of Human Reliability Assessment (HRA) for the purposes of evaluating the probability of a human error occurring throughout the completion of a specific task. From such analyses measures can then be taken to reduce the likelihood of errors occurring within a system and therefore lead to an improvement in the overall levels of safety. There exist three primary reasons for conducting an HRA: error identification, error quantification and error reduction. As there exist a number of techniques used for such purposes, they can be split into one of two classifications: first generation techniques and second generation techniques. First generation techniques work on the basis of the simple dichotomy of 'fits/doesn't fit' in the matching of the error situation in context with related error identification and quantification, and second generation techniques are more theory based in their assessment and quantification of errors. HRA techniques have been utilised in a range of industries including healthcare, engineering, nuclear, transportation and business sectors; each technique has varying uses within different disciplines.

Tecnica Empirica Stima Errori Operatori (TESEO) is a technique in the field of Human Reliability Assessment (HRA) that evaluates the probability of a human error occurring throughout the completion of a specific task. From such analyses measures can then be taken to reduce the likelihood of errors occurring within a system and therefore lead to an improvement in the overall levels of safety. There exist three primary reasons for conducting an HRA: error identification, error quantification and error reduction. As there exist a number of techniques used for such purposes, they can be split into one of two classifications: first generation techniques and second generation techniques. First generation techniques work on the basis of the simple dichotomy of 'fits/doesn't fit' in the matching of the error situation in context with related error identification and quantification, and second generation techniques are more theory based in their assessment and quantification of errors. HRA techniques have been utilised in a range of industries including healthcare, engineering, nuclear, transportation and business sectors; each technique has varying uses within different disciplines.

The Technique for human error-rate prediction (THERP) is a technique that is used in the field of Human Reliability Assessment (HRA) to evaluate the probability of human error occurring throughout the completion of a task. From such an analysis, some corrective measures could be taken to reduce the likelihood of errors occurring within a system. The overall goal of THERP is to apply and document probabilistic methodological analyses to increase safety during a given process. THERP is used in fields such as error identification, error quantification and error reduction.

Human error assessment and reduction technique (HEART) is a technique used in the field of human reliability assessment (HRA), for the purposes of evaluating the probability of a human error occurring throughout the completion of a specific task. From such analyses measures can then be taken to reduce the likelihood of errors occurring within a system and therefore lead to an improvement in the overall levels of safety. There exist three primary reasons for conducting an HRA: error identification, error quantification, and error reduction. As there exist a number of techniques used for such purposes, they can be split into one of two classifications: first-generation techniques and second generation techniques. First generation techniques work on the basis of the simple dichotomy of 'fits/doesn't fit' in the matching of the error situation in context with related error identification and quantification and second generation techniques are more theory based in their assessment and quantification of errors. HRA techniques have been used in a range of industries including healthcare, engineering, nuclear, transportation, and business sectors. Each technique has varying uses within different disciplines.

A Technique for Human Event Analysis (ATHEANA) is a technique used in the field of human reliability assessment (HRA). The purpose of ATHEANA is to evaluate the probability of human error while performing a specific task. From such analyses, preventative measures can then be taken to reduce human errors within a system and therefore lead to improvements in the overall level of safety.

The healthcare error proliferation model is an adaptation of James Reason’s Swiss Cheese Model designed to illustrate the complexity inherent in the contemporary healthcare delivery system and the attribution of human error within these systems. The healthcare error proliferation model explains the etiology of error and the sequence of events typically leading to adverse outcomes. This model emphasizes the role organizational and external cultures contribute to error identification, prevention, mitigation, and defense construction.

Cognitive bias mitigation is the prevention and reduction of the negative effects of cognitive biases – unconscious, automatic influences on human judgment and decision making that reliably produce reasoning errors.

Human factors are the physical or cognitive properties of individuals, or social behavior which is specific to humans, and which influence functioning of technological systems as well as human-environment equilibria. The safety of underwater diving operations can be improved by reducing the frequency of human error and the consequences when it does occur. Human error can be defined as an individual's deviation from acceptable or desirable practice which culminates in undesirable or unexpected results. Human factors include both the non-technical skills that enhance safety and the non-technical factors that contribute to undesirable incidents that put the diver at risk.

[Safety is] An active, adaptive process which involves making sense of the task in the context of the environment to successfully achieve explicit and implied goals, with the expectation that no harm or damage will occur. – G. Lock, 2022

Dive safety is primarily a function of four factors: the environment, equipment, individual diver performance and dive team performance. The water is a harsh and alien environment which can impose severe physical and psychological stress on a diver. The remaining factors must be controlled and coordinated so the diver can overcome the stresses imposed by the underwater environment and work safely. Diving equipment is crucial because it provides life support to the diver, but the majority of dive accidents are caused by individual diver panic and an associated degradation of the individual diver's performance. – M.A. Blumenberg, 1996

Aviation accident analysis is performed to determine the cause of errors once an accident has happened. In the modern aviation industry, it is also used to analyze databases of past accidents in order to prevent future accidents. Many models are used not only for accident investigation but also for educational purposes.

David D. Woods is an American safety systems researcher who studies human coordination and automation issues in a wide range of safety-critical fields such as nuclear power, aviation, space operations, critical care medicine, and software services. He is one of the founding researchers of the fields of cognitive systems engineering and resilience engineering.

Resilience engineering is a subfield of safety science research that focuses on understanding how complex adaptive systems cope when encountering a surprise. The term resilience in this context refers to the capabilities that a system must possess in order to deal effectively with unanticipated events. Resilience engineering examines how systems build, sustain, degrade, and lose these capabilities.

Cognitive systems engineering (CSE) is a field of study that examines the intersection of people, work, and technology, with a focus on safety-critical systems. The central tenet of cognitive systems engineering is that it views a collection of people and technology as a single unit that is capable of cognitive work, which is called a joint cognitive system.

Dr. Alan D. Swain III was a human factors engineer who specialized in weapons systems and nuclear power plants. He was a Distinguished Member of Technical Staff at Sandia National Laboratories, where he developed the technique for human error-rate prediction (THERP). According to a bibliometrics analysis performed in 2020, Swain is the most highly cited author in the field of human reliability analysis.
