Event tree analysis

Last updated

Event tree analysis (ETA) is a forward, top-down, logical modeling technique for both success and failure that explores responses through a single initiating event and lays a path for assessing probabilities of the outcomes and overall system analysis. [1] This analysis technique is used to analyze the effects of functioning or failed systems given that an event has occurred. [2]

Contents

ETA is a powerful tool that will identify all consequences of a system that have a probability of occurring after an initiating event that can be applied to a wide range of systems including: nuclear power plants, spacecraft, and chemical plants. This technique may be applied to a system early in the design process to identify potential issues that may arise, rather than correcting the issues after they occur. [3] With this forward logic process, use of ETA as a tool in risk assessment can help to prevent negative outcomes from occurring, by providing a risk assessor with the probability of occurrence. ETA uses a type of modeling technique called "event tree", which branches events from one single event using Boolean logic.

History

The name "Event Tree" was first introduced during the WASH-1400 nuclear power plant safety study (circa 1974), where the WASH-1400 team needed an alternate method to fault tree analysis due to the fault trees being too large. Though not using the name event tree, the UKAEA first introduced ETA in its design offices in 1968, initially to try to use whole plant risk assessment to optimize the design of a 500MW Steam-Generating Heavy Water Reactor. This study showed ETA condensed the analysis into a manageable form. [1] ETA was not initially developed during WASH-1400, this was one of the first cases in which it was thoroughly used. The UKAEA study used the assumption that protective systems either worked or failed, with the probability of failure per demand being calculated using fault trees or similar analysis methods. ETA identifies all sequences which follow an initiating event. Many of these sequences can be eliminated from the analysis because their frequency or effect are too small to affect the overall result. A paper presented at a CREST symposium in Munich, Germany, in 1971 indicated how this was done[ citation needed ]. The conclusions of the US EPA study of the Draft WASH-1400 [3] acknowledges the role of Ref 1 and its criticism of the Maximum Credible Accident approach used by AEC. MCA sets the reliability target for the containment but those for all other safety systems are set by smaller but more frequent accidents and would be missed by MCA.

In 2009 a risk analysis was conducted on underwater tunnel excavation under the Han River in Korea using an earth pressure balance type tunnel boring machine. ETA was used to quantify risk, by providing the probability of occurrence of an event, in the preliminary design stages of the tunnel construction to prevent any injuries or fatalities because tunnel construction in Korea has the highest injury and fatality rates within the construction category. [4]

Theory

Performing a probabilistic risk assessment starts with a set of initiating events that change the state or configuration of the system. [3] An initiating event is an event that starts a reaction, such as the way a spark (initiating event) can start a fire that could lead to other events (intermediate events) such as a tree burning down, and then finally an outcome, for example, the burnt tree no longer provides apples for food. Each initiating event leads to another event and continuing through this path, where each intermediate event's probability of occurrence may be calculated by using fault tree analysis, until an end state is reached (the outcome of a tree no longer providing apples for food). [3] Intermediate events are commonly split into a binary (success/failure or yes/no) but may be split into more than two as long as the events are mutually exclusive, meaning that they can not occur at the same time. If a spark is the initiating event there is a probability that the spark will start a fire or will not start a fire (binary yes or no) as well as the probability that the fire spreads to a tree or does not spread to a tree. End states are classified into groups that can be successes or severity of consequences. An example of a success would be that no fire started and the tree still provided apples for food while the severity of consequence would be that a fire did start and we lose apples as a source of food. Loss end states can be any state at the end of the pathway that is a negative outcome of the initiating event. The loss end state is highly dependent upon the system, for example if you were measuring a quality process in a factory a loss or end state would be that the product has to be reworked or thrown in the trash. Some common loss end states: [3]

Event tree diagram example Event Tree Diagram.JPG
Event tree diagram example

Methodology

The overall goal of event tree analysis is to determine the probability of possible negative outcomes that can cause harm and result from the chosen initiating event. It is necessary to use detailed information about a system to understand intermediate events, accident scenarios, and initiating events to construct the event tree diagram. The event tree begins with the initiating event where consequences of this event follow in a binary (success/failure) manner. Each event creates a path in which a series of successes or failures will occur where the overall probability of occurrence for that path can be calculated. The probabilities of failures for intermediate events can be calculated using fault tree analysis and the probability of success can be calculated from 1 = probability of success (ps) + probability of failure (pf). [3] For example, in the equation 1 = (ps) + (pf) if we know that pf=.1 from fault tree analysis then through simple algebra we can solve for ps where ps = (1) - (pf) then we would have ps = (1) - (.1) and ps=.9.

The event tree diagram models all possible pathways from the initiating event. The initiating event starts at the left side as a horizontal line that branches vertically. the vertical branch is representative of the success/failure of the initiating event. At the end of the vertical branch a horizontal line is drawn on each the top and the bottom representing the success or failure of the first event where a description (usually success or failure) is written with a tag that represents the path such as 1s where s is a success and 1 is the event number similarly with 1f where 1 is the event number and f denotes a failure (see attached diagram). This process continues until the end state is reached. When the event tree diagram has reached the end state for all pathways the outcome probability equation is written. [1] [3]

Steps to perform an event tree analysis: [1] [3]

  1. Define the system: Define what needs to be involved or where to draw the boundaries.
  2. Identify the accident scenarios: Perform a system assessment to find hazards or accident scenarios within the system design.
  3. Identify the initiating events: Use a hazard analysis to define initiating events.
  4. Identify intermediate events: Identify countermeasures associated with the specific scenario.
  5. Build the event tree diagram
  6. Obtain event failure probabilities: If the failure probability can not be obtained use fault tree analysis to calculate it.
  7. Identify the outcome risk: Calculate the overall probability of the event paths and determine the risk.
  8. Evaluate the outcome risk: Evaluate the risk of each path and determine its acceptability.
  9. Recommend corrective action: If the outcome risk of a path is not acceptable develop design changes that change the risk.
  10. Document the ETA: Document the entire process on the event tree diagrams and update for new information as needed.

Mathematical concepts

1 = (probability of success) + (probability of failure)

The probability of success can be derived from the probability of failure.

Overall path probability = (probability of event 1) × (probability of event 2) × ... × (probability of event n)

In risk analysis

The event tree analysis can be used in risk assessments by determining the probability that is used to determine risk when multiply by the hazard of events. Event Tree Analysis makes it easy to see what pathway creating the biggest probability of failure for a specific system. It is common to find single-point failures that do not have any intervening events between the initiating event and a failure. With Event Tree Analysis single-point failure can be targeted to include an intervening step that will reduce the overall probability of failure and thus reducing the risk of the system. The idea of adding an intervening event can happen anywhere in the system for any pathway that generates too great of a risk, the added intermediate event can reduce the probability and thus reduce the risk.

Advantages

Limitations

Software

Though ETA can be relatively simple, software can be used for more complex systems to build the diagram and perform calculations more quickly with reduction of human errors in the process. There are many types of software available to assist in conducting an ETA. In nuclear industry, RiskSpectrum PSA software is widely used which has both event tree analysis and fault tree analysis. Professional-grade free software solutions are also widely available. SCRAM is an example open-source tool that implements the Open-PSA Model Exchange Format open standard for probabilistic safety assessment applications.

See also

Related Research Articles

<span class="mw-page-title-main">Risk management</span> Identification, evaluation and control of risks

Risk management is the identification, evaluation, and prioritization of risks followed by coordinated and economical application of resources to minimize, monitor, and control the probability or impact of unfortunate events or to maximize the realization of opportunities.

<span class="mw-page-title-main">Safety engineering</span> Engineering discipline which assures that engineered systems provide acceptable levels of safety

Safety engineering is an engineering discipline which assures that engineered systems provide acceptable levels of safety. It is strongly related to industrial engineering/systems engineering, and the subset system safety engineering. Safety engineering assures that a life-critical system behaves as needed, even when components fail.

<span class="mw-page-title-main">Fault tree analysis</span> Failure analysis system used in safety engineering and reliability engineering

Fault tree analysis (FTA) is a type of failure analysis in which an undesired state of a system is examined. This analysis method is mainly used in safety engineering and reliability engineering to understand how systems can fail, to identify the best ways to reduce risk and to determine event rates of a safety accident or a particular system level (functional) failure. FTA is used in the aerospace, nuclear power, chemical and process, pharmaceutical, petrochemical and other high-hazard industries; but is also used in fields as diverse as risk factor identification relating to social service system failure. FTA is also used in software engineering for debugging purposes and is closely related to cause-elimination technique used to detect bugs.

SAPHIRE is a probabilistic risk and reliability assessment software tool. SAPHIRE stands for Systems Analysis Programs for Hands-on Integrated Reliability Evaluations. The system was developed for the U.S. Nuclear Regulatory Commission (NRC) by the Idaho National Laboratory.

Failure mode and effects analysis is the process of reviewing as many components, assemblies, and subsystems as possible to identify potential failure modes in a system and their causes and effects. For each component, the failure modes and their resulting effects on the rest of the system are recorded in a specific FMEA worksheet. There are numerous variations of such worksheets. An FMEA can be a qualitative analysis, but may be put on a quantitative basis when mathematical failure rate models are combined with a statistical failure mode ratio database. It was one of the first highly structured, systematic techniques for failure analysis. It was developed by reliability engineers in the late 1950s to study problems that might arise from malfunctions of military systems. An FMEA is often the first step of a system reliability study.

Reliability engineering is a sub-discipline of systems engineering that emphasizes the ability of equipment to function without failure. Reliability describes the ability of a system or component to function under stated conditions for a specified period of time. Reliability is closely related to availability, which is typically described as the ability of a component or system to function at a specified moment or interval of time.

Probabilistic risk assessment (PRA) is a systematic and comprehensive methodology to evaluate risks associated with a complex engineered technological entity or the effects of stressors on the environment.

A hazard analysis is used as the first step in a process used to assess risk. The result of a hazard analysis is the identification of different types of hazards. A hazard is a potential condition and exists or not. It may, in single existence or in combination with other hazards and conditions, become an actual Functional Failure or Accident (Mishap). The way this exactly happens in one particular sequence is called a scenario. This scenario has a probability of occurrence. Often a system has many potential failure scenarios. It also is assigned a classification, based on the worst case severity of the end condition. Risk is the combination of probability and severity. Preliminary risk levels can be provided in the hazard analysis. The validation, more precise prediction (verification) and acceptance of risk is determined in the risk assessment (analysis). The main goal of both is to provide the best selection of means of controlling or eliminating the risk. The term is used in several engineering specialties, including avionics, chemical process safety, safety engineering, reliability engineering and food safety.

<span class="mw-page-title-main">ARP4761</span>

ARP4761, Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne Systems and Equipment is an Aerospace Recommended Practice from SAE International. In conjunction with ARP4754, ARP4761 is used to demonstrate compliance with 14 CFR 25.1309 in the U.S. Federal Aviation Administration (FAA) airworthiness regulations for transport category aircraft, and also harmonized international airworthiness regulations such as European Aviation Safety Agency (EASA) CS–25.1309.

Failure mode effects and criticality analysis (FMECA) is an extension of failure mode and effects analysis (FMEA).

Attack trees are conceptual diagrams showing how an asset, or target, might be attacked. Attack trees have been used in a variety of applications. In the field of information technology, they have been used to describe threats on computer systems and possible attacks to realize those threats. However, their use is not restricted to the analysis of conventional information systems. They are widely used in the fields of defense and aerospace for the analysis of threats against tamper resistant electronics systems. Attack trees are increasingly being applied to computer control systems. Attack trees have also been used to understand threats to physical systems.

<span class="mw-page-title-main">Event chain methodology</span> Network analysis technique

Event chain methodology is a network analysis technique that is focused on identifying and managing events and relationship between them that affect project schedules. It is an uncertainty modeling schedule technique. Event chain methodology is an extension of quantitative project risk analysis with Monte Carlo simulations. It is the next advance beyond critical path method and critical chain project management. Event chain methodology tries to mitigate the effect of motivational and cognitive biases in estimating and scheduling. It improves accuracy of risk assessment and helps to generate more realistic risk adjusted project schedules.

An event tree is an inductive analytical diagram in which an event is analyzed using Boolean logic to examine a chronological series of subsequent events or consequences. For example, event tree analysis is a major component of nuclear reactor safety engineering.

The technique for human error-rate prediction (THERP) is a technique used in the field of human reliability assessment (HRA), for the purposes of evaluating the probability of a human error occurring throughout the completion of a specific task. From such analyses measures can then be taken to reduce the likelihood of errors occurring within a system and therefore lead to an improvement in the overall levels of safety. There exist three primary reasons for conducting an HRA: error identification, error quantification and error reduction. As there exist a number of techniques used for such purposes, they can be split into one of two classifications: first-generation techniques and second-generation techniques. First-generation techniques work on the basis of the simple dichotomy of ‘fits/doesn’t fit’ in matching an error situation in context with related error identification and quantification. Second generation techniques are more theory-based in their assessment and quantification of errors. ‘HRA techniques have been utilised for various applications in a range of disciplines and industries including healthcare, engineering, nuclear, transportation and business.

Influence Diagrams Approach (IDA) is a technique used in the field of Human reliability Assessment (HRA), for the purposes of evaluating the probability of a human error occurring throughout the completion of a specific task. From such analyses measures can then be taken to reduce the likelihood of errors occurring within a system and therefore lead to an improvement in the overall levels of safety. There exist three primary reasons for conducting an HRA; error identification, error quantification and error reduction. As there exist a number of techniques used for such purposes, they can be split into one of two classifications; first generation techniques and second generation techniques. First generation techniques work on the basis of the simple dichotomy of ‘fits/doesn’t fit’ in the matching of the error situation in context with related error identification and quantification and second generation techniques are more theory based in their assessment and quantification of errors. ‘HRA techniques have been utilised in a range of industries including healthcare, engineering, nuclear, transportation and business sector; each technique has varying uses within different disciplines.

<span class="mw-page-title-main">Single point of failure</span> A part whose failure will disrupt the entire system

A single point of failure (SPOF) is a part of a system that, if it fails, will stop the entire system from working. SPOFs are undesirable in any system with a goal of high availability or reliability, be it a business practice, software application, or other industrial system.

ISO 26262, titled "Road vehicles – Functional safety", is an international standard for functional safety of electrical and/or electronic systems that are installed in serial production road vehicles, defined by the International Organization for Standardization (ISO) in 2011, and revised in 2018.

<span class="mw-page-title-main">Reliability block diagram</span>

A reliability block diagram (RBD) is a diagrammatic method for showing how component reliability contributes to the success or failure of a redundant. RBD is also known as a dependence diagram (DD).

<span class="mw-page-title-main">Network theory in risk assessment</span>

A network is an abstract structure capturing only the basics of connection patterns and little else. Because it is a generalized pattern, tools developed for analyzing, modeling and understanding networks can theoretically be implemented across disciplines. As long as a system can be represented by a network, there is an extensive set of tools – mathematical, computational, and statistical – that are well-developed and if understood can be applied to the analysis of the system of interest.

A bow-tie diagram, when used in the field of pure risk, is a partial and simplified model of the process leading to adverse Consequences. A process model of this nature is of use in risk/safety science education and practice as the constituent terms can be defined objectively and comprehensively.

References

  1. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 Clemens, P.L.; Rodney J. Simmons (March 1998). "System Safety and Risk Management". NIOSH Instructional Module, A Guide for Engineering Educators. Cincinnati,OH: National Institute for Occupational Safety and Health: IX-3–IX-7.
  2. Wang, John et al. (2000). What Every Engineer Should Know About Risk Engineering and Management, p. 69. , p. 69, at Google Books
  3. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 Ericson, Clifton A. (2005). Hazard Analysis Techniques for System Safety. John Wiley & Sons, Inc.
  4. Hong, Eun-Soo; In-Mo Lee; Hee-Soon Shin; Seok-Woo Nam; Jung-Sik Kong (2009). "Quantitative risk evaluation based on event tree analysis technique: Application to the design of shield TBM". Tunneling and Underground Space Technology. 24 (3): 269–277. doi:10.1016/j.tust.2008.09.004.