Human reliability

In the field of human factors and ergonomics, human reliability (also known as human performance or HU) is the probability that a human performs a task to a sufficient standard. [1] Reliability of humans can be affected by many factors such as age, physical health, mental state, attitude, emotions, personal propensity for certain mistakes, and cognitive biases.

Human reliability is important to the resilience of socio-technical systems, and has implications for fields like manufacturing, medicine and nuclear power. Attempts made to decrease human error and increase reliability in human interaction with technology include user-centered design and error-tolerant design.

Common traps of human nature

People tend to overestimate their ability to maintain control when they are doing work. The common characteristics of human nature addressed below are especially accentuated when work is performed in a complex work environment. [2]

Stress – The problem with stress is that it can accumulate and overpower a person, thus becoming detrimental to performance.

Avoidance of mental strain – Humans are reluctant to engage in lengthy concentrated thinking, as it requires high levels of attention for extended periods. This is sometimes called the cognitive miser tendency.

To reduce mental effort and expedite decision-making, people often fall back on mental biases, or shortcuts.

Limited working memory – The mind's short-term memory is the “workbench” for problem solving and decision-making.

Limited attention resources – The limited ability to concentrate on two or more activities challenges the ability to process information needed to solve problems.

Mind-set – People tend to focus more on what they want to accomplish (a goal) and less on what needs to be avoided because human beings are primarily goal-oriented by nature. As such, people tend to “see” only what the mind expects, or wants, to see.

Difficulty seeing one's own error – Individuals, especially when working alone, are particularly susceptible to missing errors.

Limited perspective – Humans cannot see all there is to see. The inability of the human mind to perceive all facts pertinent to a decision challenges problem-solving.

Susceptibility to emotional/social factors – Anger and embarrassment adversely influence team and individual performance.

Fatigue – People get tired. Physical, emotional, and mental fatigue can lead to error and poor judgment.

Presenteeism – The practice, or workplace culture, of employees continuing to work for appearance's sake despite reduced productivity or other negative consequences.

Analysis techniques

A variety of methods exist for human reliability analysis (HRA). [3] [4] Two general classes of methods are those based on probabilistic risk assessment (PRA) and those based on a cognitive theory of control.

PRA-based techniques

One method for analyzing human reliability is a straightforward extension of probabilistic risk assessment (PRA): in the same way that equipment can fail in a power plant, so can a human operator commit errors. In both cases, an analysis (functional decomposition for equipment, task analysis for humans) articulates the level of detail at which failure or error probabilities can be assigned. This is the basic idea behind the Technique for Human Error Rate Prediction (THERP). [5] THERP is intended to generate human error probabilities that would be incorporated into a PRA. The Accident Sequence Evaluation Program (ASEP) human reliability procedure is a simplified form of THERP; an associated computational tool is the Simplified Human Error Analysis Code (SHEAN). [6] More recently, the US Nuclear Regulatory Commission has published the Standardized Plant Analysis Risk – Human Reliability Analysis (SPAR-H) method to take account of the potential for human error. [7] [8]
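
The arithmetic behind this approach can be illustrated with a short sketch. The Python fragment below assumes a task that has already been decomposed into steps by task analysis; the step names, nominal human error probabilities (HEPs), and performance shaping factor (PSF) multipliers are hypothetical illustrations rather than values taken from the THERP handbook.

    # A minimal THERP-style calculation over a hypothetical task breakdown.
    steps = [
        # (description, nominal HEP, PSF multiplier, e.g. for stress)
        ("read the procedure step",    0.003, 2.0),
        ("select the correct control", 0.001, 2.0),
        ("verify the indicator",       0.010, 1.0),
    ]

    def adjusted_hep(nominal, psf):
        # Scale the nominal HEP by the PSF multiplier, never exceeding 1.0.
        return min(nominal * psf, 1.0)

    # Assuming independent steps and that an error on any step fails the task,
    # the task failure probability is one minus the product of step successes.
    p_success = 1.0
    for _, nominal, psf in steps:
        p_success *= 1.0 - adjusted_hep(nominal, psf)

    print(f"Estimated task failure probability: {1.0 - p_success:.4f}")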

Cognitive control based techniques

Erik Hollnagel has developed this line of thought in his work on the Contextual Control Model (COCOM) [9] and the Cognitive Reliability and Error Analysis Method (CREAM). [10] COCOM models human performance as a set of control modes – strategic (based on long-term planning), tactical (based on procedures), opportunistic (based on present context), and scrambled (random) – and proposes a model of how transitions between these control modes occur. This model of control mode transition consists of a number of factors, including the human operator's estimate of the outcome of the action (success or failure), the time remaining to accomplish the action (adequate or inadequate), and the number of simultaneous goals of the human operator at that time. CREAM is a human reliability analysis method that is based on COCOM.
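
A simplified sketch of this idea is given below. The four control modes come from the description above, but the specific transition rules in the code are illustrative assumptions, not Hollnagel's published transition model.

    from enum import Enum

    class ControlMode(Enum):
        STRATEGIC = "strategic"          # long-term planning
        TACTICAL = "tactical"            # following procedures
        OPPORTUNISTIC = "opportunistic"  # driven by the present context
        SCRAMBLED = "scrambled"          # effectively random action

    def next_mode(last_action_succeeded, time_adequate, simultaneous_goals):
        # Choose the next control mode from the three factors named in the
        # text: outcome of the previous action, time remaining, and the
        # number of simultaneous goals.
        if not time_adequate and not last_action_succeeded:
            return ControlMode.SCRAMBLED
        if not time_adequate or simultaneous_goals > 3:
            return ControlMode.OPPORTUNISTIC
        if last_action_succeeded and simultaneous_goals <= 1:
            return ControlMode.STRATEGIC
        return ControlMode.TACTICAL

    print(next_mode(last_action_succeeded=True, time_adequate=True, simultaneous_goals=1))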

Related techniques in safety engineering and reliability engineering include failure mode and effects analysis, hazop, fault tree, and SAPHIRE (Systems Analysis Programs for Hands-on Integrated Reliability Evaluations).

Human Factors Analysis and Classification System (HFACS)

The Human Factors Analysis and Classification System (HFACS) was developed initially as a framework to understand the role of human error in aviation accidents. [11] [12] It is based on James Reason's Swiss cheese model of human error in complex systems. HFACS distinguishes between the "active failures" of unsafe acts, and "latent failures" of preconditions for unsafe acts, unsafe supervision, and organizational influences. These categories were developed empirically on the basis of many aviation accident reports.

"Unsafe acts" are performed by the human operator "on the front line" (e.g., the pilot, the air traffic controller, or the driver). Unsafe acts can be either errors (in perception, decision making or skill-based performance) or violations. Violations, or the deliberate disregard for rules and procedures, can be routine or exceptional. Routine violations occur habitually and are usually tolerated by the organization or authority. Exceptional violations are unusual and often extreme. For example, driving 60 mph in a 55-mph speed limit zone is a routine violation, while driving 130 mph in the same zone is exceptional.

There are two types of preconditions for unsafe acts: those that relate to the human operator's internal state and those that relate to the human operator's practices or ways of working. Adverse internal states include those related to physiology (e.g., illness) and mental state (e.g., mentally fatigued, distracted). A third aspect of 'internal state' is really a mismatch between the operator's ability and the task demands.

Four types of unsafe supervision are: inadequate supervision; planned inappropriate operations; failure to correct a known problem; and supervisory violations.

Organizational influences include those related to resource management (e.g., inadequate human or financial resources), organizational climate (structures, policies, and culture), and organizational processes (such as procedures, schedules, oversight).
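
Because HFACS is a fixed taxonomy, its levels and categories can be written down directly as a data structure against which accident findings are tagged. The sketch below arranges the categories named above as a nested Python structure; the layout itself is an illustrative assumption, since HFACS is a classification framework rather than a piece of software.

    HFACS = {
        "unsafe acts": {
            "errors": ["perception", "decision making", "skill-based performance"],
            "violations": ["routine", "exceptional"],
        },
        "preconditions for unsafe acts": {
            "adverse internal states": ["physiological (e.g., illness)",
                                        "mental (e.g., fatigue, distraction)"],
            "ability-task mismatch": [],
            "operator practices": [],
        },
        "unsafe supervision": [
            "inadequate supervision",
            "planned inappropriate operations",
            "failure to correct a known problem",
            "supervisory violations",
        ],
        "organizational influences": [
            "resource management",
            "organizational climate",
            "organizational processes",
        ],
    }

    # Hypothetical example: tag one finding from an accident report.
    finding = {"level": "unsafe acts", "category": "violations", "type": "routine"}
    assert finding["type"] in HFACS[finding["level"]][finding["category"]]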

Footnotes

  1. Calixto, Eduardo (2016), "Chapter 5 – Human Reliability Analysis", in Calixto, Eduardo (ed.), Gas and Oil Reliability Engineering (Second Edition), Boston: Gulf Professional Publishing, pp. 471–552, ISBN 978-0-12-805427-7, retrieved 2023-12-18
  2. DOE-HDBK-1028-2009, https://www.standards.doe.gov/standards-documents/1000/1028-BHdbk-2009-v1/@@images/file
  3. Kirwan and Ainsworth, 1992
  4. Kirwan, 1994
  5. Swain & Guttmann, 1983
  6. Simplified Human Error Analysis Code (Wilson, 1993)
  7. SPAR-H
  8. Gertman et al., 2005
  9. Hollnagel, 1993
  10. Hollnagel, 1998
  11. Shappell and Wiegmann, 2000
  12. Wiegmann and Shappell, 2003

Related Research Articles

<span class="mw-page-title-main">Safety engineering</span> Engineering discipline which assures that engineered systems provide acceptable levels of safety

Safety engineering is an engineering discipline which assures that engineered systems provide acceptable levels of safety. It is strongly related to industrial engineering and systems engineering, and to the subdiscipline of system safety engineering. Safety engineering assures that a life-critical system behaves as needed, even when components fail.

<span class="mw-page-title-main">Fault tree analysis</span> Failure analysis system used in safety engineering and reliability engineering

Fault tree analysis (FTA) is a type of failure analysis in which an undesired state of a system is examined. This analysis method is mainly used in safety engineering and reliability engineering to understand how systems can fail, to identify the best ways to reduce risk, and to determine event rates of a safety accident or a particular system-level (functional) failure. FTA is used in the aerospace, nuclear power, chemical and process, pharmaceutical, petrochemical and other high-hazard industries, but is also used in fields as diverse as risk factor identification relating to social service system failure. FTA is also used in software engineering for debugging purposes and is closely related to the cause-elimination technique used to detect bugs.
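
As a worked illustration of the calculation a fault tree supports, the sketch below computes a top-event probability from basic-event probabilities through AND and OR gates, assuming the basic events are independent; the tree and the numbers are hypothetical.

    from math import prod

    def and_gate(ps):
        # All inputs must occur: multiply the probabilities.
        return prod(ps)

    def or_gate(ps):
        # At least one input occurs: one minus the product of the complements.
        return 1.0 - prod(1.0 - p for p in ps)

    # Hypothetical top event: cooling is lost if the pump fails, or if both
    # redundant power supplies fail.
    p_pump, p_power_a, p_power_b = 1e-3, 1e-2, 1e-2

    p_power_lost = and_gate([p_power_a, p_power_b])
    p_top = or_gate([p_pump, p_power_lost])

    print(f"P(top event) = {p_top:.2e}")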

Task analysis is a fundamental tool of human factors engineering. It entails analyzing how a task is accomplished, including a detailed description of both manual and mental activities, task and element durations, task frequency, task allocation, task complexity, environmental conditions, necessary clothing and equipment, and any other unique factors involved in or required for one or more people to perform a given task.

The term workload can refer to several different yet related entities.

<span class="mw-page-title-main">Safety culture</span> Attitude, beliefs, perceptions and values that employees share in relation to risks in the workplace

Safety culture is the collection of the beliefs, perceptions and values that employees share in relation to risks within an organization, such as a workplace or community. Safety culture is a part of organizational culture, and has been described in a variety of ways; notably the National Academies of Science and the Association of Land Grant and Public Universities have published summaries on this topic in 2014 and 2016.

Situational awareness or situation awareness (SA) is the understanding of an environment, its elements, and how it changes with respect to time or other factors. Situational awareness is important for effective decision making in many environments. It is formally defined as:

“the perception of the elements in the environment within a volume of time and space, the comprehension of their meaning, and the projection of their status in the near future”.

Human error is an action that has been done but that was "not intended by the actor; not desired by a set of rules or an external observer; or that led the task or system outside its acceptable limits". Human error has been cited as a primary cause or contributing factor in disasters and accidents in industries as diverse as nuclear power, aviation, space exploration, and medicine. Prevention of human error is generally seen as a major contributor to reliability and safety of (complex) systems. Human error is one of the many contributing causes of risk events.

The system safety concept calls for a risk management strategy based on identification and analysis of hazards and the application of remedial controls using a systems-based approach. This is different from traditional safety strategies, which rely on control of the conditions and causes of an accident, based either on epidemiological analysis or on investigation of individual past accidents. The concept of system safety is useful in demonstrating the adequacy of technologies when difficulties are faced with probabilistic risk analysis. The underlying principle is one of synergy: a whole is more than the sum of its parts. A systems-based approach to safety requires the application of scientific, technical and managerial skills to hazard identification, hazard analysis, and the elimination, control, or management of hazards throughout the life-cycle of a system, program, project, activity or product. "Hazop" is one of several techniques available for the identification of hazards.

<span class="mw-page-title-main">Human Factors Analysis and Classification System</span> Method to identify causes of accidents and analysis to plan preventive training

The Human Factors Analysis and Classification System (HFACS) identifies the human causes of an accident and offers tools for analysis as a way to plan preventive training. It was developed by Dr Scott Shappell of the Civil Aviation Medical Institute and Dr Doug Wiegmann of the University of Illinois at Urbana-Champaign in response to a trend that showed some form of human error was a primary causal factor in 80% of all flight accidents in the Navy and Marine Corps.

Human Cognitive Reliability Correlation (HCR) is a technique used in the field of human reliability assessment (HRA) to evaluate the probability of a human error occurring throughout the completion of a specific task. From such analyses, measures can then be taken to reduce the likelihood of errors occurring within a system and therefore lead to an improvement in the overall levels of safety. There are three primary reasons for conducting an HRA: error identification, error quantification and error reduction. The many techniques used for these purposes can be split into two classifications: first generation techniques and second generation techniques. First generation techniques work on the basis of the simple dichotomy of 'fits/doesn't fit' in matching the error situation in context with related error identification and quantification, while second generation techniques are more theory-based in their assessment and quantification of errors. HRA techniques have been used in a range of industries, including healthcare, engineering, nuclear power, transportation and business; each technique has varying uses within different disciplines.

The technique for human error-rate prediction (THERP) is a technique used in the field of human reliability assessment (HRA), for the purposes of evaluating the probability of a human error occurring throughout the completion of a specific task. From such analyses, measures can then be taken to reduce the likelihood of errors occurring within a system and therefore lead to an improvement in the overall levels of safety. There exist three primary reasons for conducting an HRA: error identification, error quantification, and error reduction.

A Technique for Human Event Analysis (ATHEANA) is a technique used in the field of human reliability assessment (HRA). The purpose of ATHEANA is to evaluate the probability of human error while performing a specific task. From such analyses, preventative measures can then be taken to reduce human errors within a system and therefore lead to improvements in the overall level of safety.

The healthcare error proliferation model is an adaptation of James Reason’s Swiss Cheese Model designed to illustrate the complexity inherent in the contemporary healthcare delivery system and the attribution of human error within these systems. The healthcare error proliferation model explains the etiology of error and the sequence of events typically leading to adverse outcomes. This model emphasizes the role organizational and external cultures contribute to error identification, prevention, mitigation, and defense construction.

Cognitive bias mitigation is the prevention and reduction of the negative effects of cognitive biases – unconscious, automatic influences on human judgment and decision making that reliably produce reasoning errors.

Human factors are the physical or cognitive properties of individuals, or social behavior specific to humans, that influence the functioning of technological systems as well as human-environment equilibria. The safety of underwater diving operations can be improved by reducing the frequency of human error and the consequences when it does occur. Human error can be defined as an individual's deviation from acceptable or desirable practice which culminates in undesirable or unexpected results.

Dive safety is primarily a function of four factors: the environment, equipment, individual diver performance and dive team performance. The water is a harsh and alien environment which can impose severe physical and psychological stress on a diver. The remaining factors must be controlled and coordinated so the diver can overcome the stresses imposed by the underwater environment and work safely. Diving equipment is crucial because it provides life support to the diver, but the majority of dive accidents are caused by individual diver panic and an associated degradation of the individual diver's performance. - M.A. Blumenberg, 1996

Automation bias is the propensity for humans to favor suggestions from automated decision-making systems and to ignore contradictory information made without automation, even if it is correct. Automation bias stems from the social psychology literature that found a bias in human-human interaction that showed that people assign more positive evaluations to decisions made by humans than to a neutral object. The same type of positivity bias has been found for human-automation interaction, where the automated decisions are rated more positively than neutral. This has become a growing problem for decision making as intensive care units, nuclear power plants, and aircraft cockpits have increasingly integrated computerized system monitors and decision aids to mostly factor out possible human error. Errors of automation bias tend to occur when decision-making is dependent on computers or other automated aids and the human is in an observatory role but able to make decisions. Examples of automation bias range from urgent matters like flying a plane on automatic pilot to such mundane matters as the use of spell-checking programs.

David D. Woods is an American safety systems researcher who studies human coordination and automation issues in a wide range of safety-critical fields such as nuclear power, aviation, space operations, critical care medicine, and software services. He is one of the founding researchers of the fields of cognitive systems engineering and resilience engineering.

Resilience engineering is a subfield of safety science research that focuses on understanding how complex adaptive systems cope when encountering a surprise. The term resilience in this context refers to the capabilities that a system must possess in order to deal effectively with unanticipated events. Resilience engineering examines how systems build, sustain, degrade, and lose these capabilities.

Cognitive systems engineering (CSE) is a field of study that examines the intersection of people, work, and technology, with a focus on safety-critical systems. The central tenet of cognitive systems engineering is that it views a collection of people and technology as a single unit that is capable of cognitive work, which is called a joint cognitive system.

Dr. Alan D. Swain III was a human factors engineer who specialized in weapons systems and nuclear power plants. He was a Distinguished Member of Technical Staff at Sandia National Laboratories, where he developed the technique for human error-rate prediction (THERP). According to a bibliometrics analysis performed in 2020, Swain is the most highly cited author in the field of human reliability analysis.
