A failure reporting, analysis, and corrective action system (FRACAS) is a system, sometimes implemented in software, that provides a process for reporting, classifying, and analyzing failures, and for planning corrective actions in response to those failures. It is typically used in an industrial environment to collect data and to record and analyze system failures. A FRACAS may manage multiple failure reports and produce a history of failures and corrective actions. It records the problems related to a product or process, together with their associated root causes and failure analyses, to assist in identifying and implementing corrective actions.
The FRACAS method [1] was developed by the US government and first introduced for use by the US Navy and all Department of Defense agencies in 1985. The FRACAS process is a closed loop of failure reporting, analysis, and corrective action.
Common FRACAS outputs may include part number, part name, OEM, field MTBF, MTBR, MTTR, spares consumption, reliability growth, and failure or incident distribution by type, location, part number, serial number, symptom, etc.
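As a rough illustration of how two of these outputs could be derived from closed failure reports, the sketch below computes MTBF and MTTR for one part number. It reads MTBF as mean time between failures and MTTR as mean time to repair. The FailureRecord schema, the field_metrics helper, and the sample values are hypothetical and are not taken from any FRACAS standard.

```python
from dataclasses import dataclass


@dataclass
class FailureRecord:
    """One closed failure report (hypothetical schema, not a FRACAS standard)."""
    part_number: str
    repair_hours: float  # time spent restoring the unit to service


def field_metrics(records, operating_hours):
    """Return (MTBF, MTTR) in hours for the given failure records.

    MTBF = total operating hours / number of failures
    MTTR = total repair hours / number of failures
    """
    failures = len(records)
    if failures == 0:
        # No failures observed: MTBF is unbounded and there is nothing to repair.
        return float("inf"), 0.0
    mtbf = operating_hours / failures
    mttr = sum(r.repair_hours for r in records) / failures
    return mtbf, mttr


# Illustrative data: three failures of one part over 1500 operating hours.
records = [
    FailureRecord("PN-100", 2.0),
    FailureRecord("PN-100", 4.0),
    FailureRecord("PN-100", 3.0),
]
mtbf, mttr = field_metrics(records, operating_hours=1500.0)
print(mtbf, mttr)  # 500.0 3.0
```

In practice the operating hours would come from usage logs rather than a single constant, and records would be filtered by part number before the call.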
Safety engineering is an engineering discipline which assures that engineered systems provide acceptable levels of safety. It is strongly related to industrial engineering/systems engineering, and the subset system safety engineering. Safety engineering assures that a life-critical system behaves as needed, even when components fail.
Fault tree analysis (FTA) is a type of failure analysis in which an undesired state of a system is examined. This analysis method is mainly used in safety engineering and reliability engineering to understand how systems can fail, to identify the best ways to reduce risk, and to determine event rates of a safety incident or a particular system-level (functional) failure. FTA is used in the aerospace, nuclear power, chemical and process, pharmaceutical, petrochemical, and other high-hazard industries, but also in fields as diverse as risk factor identification relating to social service system failure. FTA is also used in software engineering for debugging purposes and is closely related to the cause-elimination technique used to detect bugs.
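One way to see how a fault tree yields an event probability is to combine basic-event probabilities through AND and OR gates. The sketch below assumes the basic events are statistically independent, which real systems often violate. The tree, the event names, and the probabilities are invented for illustration.

```python
from math import prod


def and_gate(probabilities):
    """Output event occurs only if all (independent) input events occur."""
    return prod(probabilities)


def or_gate(probabilities):
    """Output event occurs if at least one (independent) input event occurs."""
    return 1.0 - prod(1.0 - p for p in probabilities)


# Hypothetical tree: the top event "loss of cooling" occurs if power is lost
# OR both redundant pumps fail.
p_pump_a = 0.01
p_pump_b = 0.01
p_power = 0.001

p_both_pumps = and_gate([p_pump_a, p_pump_b])
p_top = or_gate([p_power, p_both_pumps])
print(f"{p_top:.6f}")  # 0.001100
```

The redundant pumps contribute far less to the top event than the single power supply, which is the kind of comparison the analysis is meant to expose.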
Hazard analysis and critical control points, or HACCP, is a systematic preventive approach to food safety that identifies biological, chemical, and physical hazards in production processes that can cause the finished product to be unsafe, and designs measures to reduce these risks to a safe level. In this manner, HACCP attempts to avoid hazards rather than attempting to inspect finished products for the effects of those hazards. The HACCP system can be used at all stages of a food chain, from food production and preparation through packaging and distribution. The Food and Drug Administration (FDA) and the United States Department of Agriculture (USDA) require HACCP programs for juice and meat as an effective approach to food safety and protecting public health. Meat HACCP systems are regulated by the USDA, while seafood and juice are regulated by the FDA. All other food companies in the United States that are required to register with the FDA under the Public Health Security and Bioterrorism Preparedness and Response Act of 2002, as well as firms outside the US that export food to the US, are transitioning to mandatory hazard analysis and risk-based preventive controls (HARPC) plans.
In science and engineering, root cause analysis (RCA) is a method of problem solving used for identifying the root causes of faults or problems. It is widely used in IT operations, manufacturing, telecommunications, industrial process control, accident analysis (e.g., in aviation, rail transport, or nuclear plants), medical diagnosis, the healthcare industry (e.g., for epidemiology), etc. Root cause analysis is a form of inductive inference (first create a theory, or root, based on empirical evidence, or causes) and deductive inference (test the theory, i.e., the underlying causal mechanisms, with empirical data).
A cycle count is a perpetual inventory auditing procedure in which a regularly repeated sequence of checks is performed on a subset of inventory. Cycle counts contrast with traditional physical inventory in that a traditional physical inventory ceases operations at a facility while all items are counted. Cycle counts are less disruptive to daily operations, provide an ongoing measure of inventory accuracy and procedure execution, and can be tailored to focus on items that have higher value, have higher movement volume, or are critical to business processes. Although some say that cycle counting should only be performed in facilities with a high degree of inventory accuracy, cycle counting is a means of achieving and sustaining high degrees of accuracy. Cycle counting can be used to identify root causes of problems in control processes and then monitor the effectiveness of the actions taken to eliminate those root causes. In contrast, identifying root causes of inventory errors and agreeing on actions to eliminate them, to the point of perfecting control processes, is virtually impossible with traditional inventory audit approaches.
Failure mode and effects analysis (FMEA) is the process of reviewing as many components, assemblies, and subsystems as possible to identify potential failure modes in a system and their causes and effects. For each component, the failure modes and their resulting effects on the rest of the system are recorded in a specific FMEA worksheet. There are numerous variations of such worksheets. An FMEA can be a qualitative analysis, but may be put on a quantitative basis when mathematical failure rate models are combined with a statistical failure mode ratio database. It was one of the first highly structured, systematic techniques for failure analysis. It was developed by reliability engineers in the late 1950s to study problems that might arise from malfunctions of military systems. An FMEA is often the first step of a system reliability study.
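The quantitative variant can be sketched as a worksheet in which each row carries a failure mode ratio, so that a component's failure rate is apportioned among its modes. This is a minimal sketch, not a standard worksheet layout: the WorksheetRow fields, the mode_failure_rates helper, and all numbers are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class WorksheetRow:
    """One row of a hypothetical FMEA worksheet."""
    component: str
    failure_mode: str
    effect: str
    mode_ratio: float  # fraction of the component's failures that take this mode


def mode_failure_rates(rows, component_rates):
    """Map (component, failure mode) to its share of the component failure rate."""
    return {
        (row.component, row.failure_mode): row.mode_ratio * component_rates[row.component]
        for row in rows
    }


rows = [
    WorksheetRow("valve", "stuck open", "tank overfills", 0.25),
    WorksheetRow("valve", "stuck closed", "no flow to process", 0.75),
]
component_rates = {"valve": 4e-6}  # failures per hour, illustrative value only

rates = mode_failure_rates(rows, component_rates)
for (component, mode), rate in rates.items():
    print(f"{component} / {mode}: {rate:.2e} per hour")
```

The mode ratios for one component should sum to 1 if the worksheet lists every mode. A fuller tool would check that before computing rates.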
A sentinel event is "any unanticipated event in a healthcare setting that results in death or serious physical or psychological injury to a patient, not related to the natural course of the patient's illness". Sentinel events can be caused by major mistakes and negligence on the part of a healthcare provider, and are closely investigated by healthcare regulatory authorities. Sentinel events are identified under The Joint Commission (TJC) accreditation policies to aid root cause analysis and to assist in the development of preventive measures. The Joint Commission tracks events in a database to ensure that events are adequately analyzed and that undesirable trends or decreases in performance are caught early and mitigated.
Reliability engineering is a sub-discipline of systems engineering that emphasizes the ability of equipment to function without failure. Reliability is defined as the probability that a product, system, or service will perform its intended function adequately for a specified period of time, or will operate in a defined environment without failure. Reliability is closely related to availability, which is typically described as the ability of a component or system to function at a specified moment or interval of time.
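The two quantities can be made concrete with a small sketch. It assumes a constant failure rate (the exponential model), which is only one of several reliability models, and it uses the common steady-state ratio MTBF / (MTBF + MTTR) for availability. Neither choice is prescribed by the definitions above, and the function names and input values are illustrative.

```python
import math


def reliability(t_hours, mtbf_hours):
    """Probability of running t_hours without failure, assuming a constant failure rate."""
    return math.exp(-t_hours / mtbf_hours)


def steady_state_availability(mtbf_hours, mttr_hours):
    """Long-run fraction of time the item is able to function."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


print(round(reliability(100.0, 1000.0), 4))  # 0.9048
print(round(steady_state_availability(1000.0, 10.0), 4))  # 0.9901
```

The example shows why the two measures differ: an item with a 90% chance of surviving a 100-hour mission can still be available about 99% of the time if repairs are quick.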
Failure analysis is the process of collecting and analyzing data to determine the cause of a failure, often with the goal of determining corrective actions or liability. According to Bloch and Geitner, "machinery failures reveal a reaction chain of cause and effect… usually a deficiency commonly referred to as the symptom…". Failure analysis can save money, lives, and resources if done correctly and acted upon. It is an important discipline in many branches of manufacturing industry, such as the electronics industry, where it is a vital tool used in the development of new products and for the improvement of existing products. The failure analysis process relies on collecting failed components for subsequent examination of the cause or causes of failure using a wide array of methods, especially microscopy and spectroscopy. Nondestructive testing (NDT) methods are valuable because the failed products are unaffected by analysis, so inspection sometimes starts using these methods.
Software assurance (SwA) is a critical process in software development that ensures the reliability, safety, and security of software products. It involves a variety of activities, including requirements analysis, design reviews, code inspections, testing, and formal verification. One crucial component of software assurance is secure coding practices, which follow industry-accepted standards and best practices, such as those outlined by the Software Engineering Institute (SEI) in their CERT Secure Coding Standards (SCS).
An incident is an event that could lead to loss of, or disruption to, an organization's operations, services, or functions. Incident management (IcM) is a term describing the activities of an organization to identify, analyze, and correct hazards to prevent a future recurrence. Within a structured organization, these incidents are normally dealt with by an incident response team (IRT), an incident management team (IMT), or an Incident Command System (ICS). Without effective incident management, an incident can disrupt business operations, information security, IT systems, employees, customers, or other vital business functions.
Failure mode effects and criticality analysis (FMECA) is an extension of failure mode and effects analysis (FMEA).
Eight Disciplines Methodology (8D) is a method or model developed at Ford Motor Company to approach and resolve problems, typically employed by quality engineers or other professionals. Focused on product and process improvement, its purpose is to identify, correct, and eliminate recurring problems. It establishes a permanent corrective action based on statistical analysis of the problem and on the origin of the problem by determining the root causes. Although it originally comprised eight stages, or "disciplines", it was later augmented by an initial planning stage. 8D follows the logic of the PDCA cycle.
Accident analysis is a process carried out to determine the cause or causes of an accident so as to prevent further accidents of a similar kind. It is part of accident investigation or incident investigation. These analyses may be performed by a range of experts, including forensic scientists, forensic engineers, or health and safety advisers. Accident investigators, particularly those in the aircraft industry, are colloquially known as "tin-kickers". Health and safety and patient safety professionals prefer using the term "incident" in place of the term "accident". Its retrospective nature means that accident analysis is primarily an exercise of directed explanation: it is conducted using the theories or methods the analyst has to hand, which direct the way in which the events, aspects, or features of accident phenomena are highlighted and explained. These analyses are also invaluable in determining ways to prevent future incidents. By determining root causes, they provide insight into which failures led to the incident.
Operational intelligence (OI) is a category of real-time, dynamic business analytics that delivers visibility and insight into data, streaming events, and business operations. OI solutions run queries against streaming data feeds and event data to deliver analytic results as operational instructions. OI gives organizations the ability to make decisions and immediately act on these analytic insights through manual or automated actions.
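A continuous query over a stream can be approximated by a sliding time window that turns an analytic result into an instruction. The sketch below is a toy stand-in for an OI product, not a description of one: the SlidingWindowAlert class, the 60-second window, the threshold, and the "throttle" instruction are all hypothetical.

```python
from collections import deque


class SlidingWindowAlert:
    """Count events in the last window_seconds; signal when the count reaches threshold."""

    def __init__(self, window_seconds, threshold):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self._timestamps = deque()

    def observe(self, timestamp):
        """Feed one event timestamp (seconds, non-decreasing); return an instruction or None."""
        self._timestamps.append(timestamp)
        # Drop events that have fallen out of the window. The event just added
        # is always inside the window, so the deque never becomes empty here.
        while self._timestamps[0] <= timestamp - self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.threshold:
            return "throttle"  # hypothetical operational instruction
        return None


monitor = SlidingWindowAlert(window_seconds=60, threshold=3)
results = [monitor.observe(t) for t in (0, 10, 20, 100)]
print(results)  # [None, None, 'throttle', None]
```

The third event triggers the instruction because three events fall within one minute. By the fourth event the earlier ones have expired, so the signal clears.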
Corrective and preventive action (CAPA) consists of improvements to an organization's processes taken to eliminate causes of non-conformities or other undesirable situations. It is usually a set of actions that laws or regulations require an organization to take in manufacturing, documentation, procedures, or systems to rectify and eliminate recurring non-conformance. Non-conformance is identified after systematic evaluation and analysis of the root cause of the non-conformance. Non-conformance may be a market complaint or customer complaint, a failure of machinery or a quality management system, or a misinterpretation of written instructions to carry out work. The corrective and preventive action is designed by a team that includes quality assurance personnel and personnel involved in the actual observation point of non-conformance. It must be systematically implemented and observed for its ability to eliminate further recurrence of such non-conformance. The eight disciplines problem-solving method, or 8D framework, can be used as an effective method of structuring a CAPA.
Event correlation is a technique for making sense of a large number of events and pinpointing the few events that are really important in that mass of information. This is accomplished by looking for and analyzing relationships between events.
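A very simple form of correlation groups events that share a source and occur close together in time, so that a burst of related alarms collapses into one group. This is only one of many possible correlation rules. The correlate function, the tuple layout, and the sample events below are assumptions made for illustration.

```python
def correlate(events, window_seconds):
    """Group events that share a source and follow the group's previous event closely.

    events: iterable of (timestamp, source, message) tuples.
    Returns a list of groups, each a list of events in time order.
    """
    groups = []
    open_group = {}  # source -> the group currently collecting its events
    for event in sorted(events):
        timestamp, source, _ = event
        group = open_group.get(source)
        if group is not None and timestamp - group[-1][0] <= window_seconds:
            group.append(event)
        else:
            # Too long since the last event from this source: start a new group.
            group = [event]
            groups.append(group)
            open_group[source] = group
    return groups


events = [
    (0, "router-1", "link down"),
    (2, "router-1", "BGP session lost"),
    (3, "db-7", "disk latency high"),
    (5, "router-1", "packet loss"),
    (300, "router-1", "link down"),
]
groups = correlate(events, window_seconds=30)
sizes = [len(g) for g in groups]
print(sizes)  # [3, 1, 1]
```

Five raw events reduce to three groups, and the first group points to a single underlying router problem rather than three separate ones.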
The system safety concept calls for a risk management strategy based on identification and analysis of hazards and application of remedial controls using a systems-based approach. This differs from traditional safety strategies, which rely on control of the conditions and causes of an accident based either on epidemiological analysis or on investigation of individual past accidents. The concept of system safety is useful in demonstrating the adequacy of technologies when difficulties are faced with probabilistic risk analysis. The underlying principle is one of synergy: a whole is more than the sum of its parts. A systems-based approach to safety requires the application of scientific, technical, and managerial skills to hazard identification, hazard analysis, and the elimination, control, or management of hazards throughout the life cycle of a system, program, project, activity, or product. "Hazop" is one of several techniques available for identification of hazards.
In requirements engineering, requirements elicitation is the practice of researching and discovering the requirements of a system from users, customers, and other stakeholders. The practice is also sometimes referred to as "requirement gathering".
PTC Windchill is a family of Product Lifecycle Management (PLM) software products offered by PTC. In 2004, as part of its expansion in the area of collaboration tools, PTC arranged to offer "a hosted version of Windchill to small- and medium-sized customers." As of 2011, PTC's products were being used by over 1.1 million users worldwide.