Failure reporting, analysis, and corrective action system

Last updated

A failure reporting, analysis, and corrective action system (FRACAS) is a system, sometimes carried out using software, that provides a process for reporting, classifying, analyzing failures, and planning corrective actions in response to those failures. It is typically used in an industrial environment to collect data, record and analyze system failures. A FRACAS system may attempt to manage multiple failure reports and produces a history of failure and corrective actions. FRACAS records the problems related to a product or process and their associated root causes and failure analyses to assist in identifying and implementing corrective actions.

The FRACAS method [1] was developed by the US Govt. and first introduced for use by the US Navy and all department of defense agencies in 1985. The FRACAS process is a closed loop with the following steps:

  1. Failure Reporting (FR). The failures and the faults related to a system, a piece of equipment, a piece of software or a process are formally reported through a standard form (Defect Report, Failure Report).
  2. Analysis (A). Perform analysis in order to identify the root cause of failure.
  3. Corrective Actions (CA). Identify, implement and verify corrective actions to prevent further recurrence of the failure.

Common FRACAS outputs may include: Part Number, Part Name, OEM, Field MTBF, MTBR, MTTR, spares consumption, reliability growth, failure/incidents distribution by type, location, part no., serial no, symptom, etc.

Related Research Articles

<span class="mw-page-title-main">Safety engineering</span> Engineering discipline which assures that engineered systems provide acceptable levels of safety

Safety engineering is an engineering discipline which assures that engineered systems provide acceptable levels of safety. It is strongly related to industrial engineering/systems engineering, and the subset system safety engineering. Safety engineering assures that a life-critical system behaves as needed, even when components fail.

<span class="mw-page-title-main">Fault tree analysis</span> Failure analysis system used in safety engineering and reliability engineering

Fault tree analysis (FTA) is a type of failure analysis in which an undesired state of a system is examined. This analysis method is mainly used in safety engineering and reliability engineering to understand how systems can fail, to identify the best ways to reduce risk and to determine event rates of a safety accident or a particular system level (functional) failure. FTA is used in the aerospace, nuclear power, chemical and process, pharmaceutical, petrochemical and other high-hazard industries; but is also used in fields as diverse as risk factor identification relating to social service system failure. FTA is also used in software engineering for debugging purposes and is closely related to cause-elimination technique used to detect bugs.

<span class="mw-page-title-main">Hazard analysis and critical control points</span> Systematic preventive approach to food safety

Hazard analysis and critical control points, or HACCP, is a systematic preventive approach to food safety from biological, chemical, and physical hazards in production processes that can cause the finished product to be unsafe and designs measures to reduce these risks to a safe level. In this manner, HACCP attempts to avoid hazards rather than attempting to inspect finished products for the effects of those hazards. The HACCP system can be used at all stages of a food chain, from food production and preparation processes including packaging, distribution, etc. The Food and Drug Administration (FDA) and the United States Department of Agriculture (USDA) require mandatory HACCP programs for juice and meat as an effective approach to food safety and protecting public health. Meat HACCP systems are regulated by the USDA, while seafood and juice are regulated by the FDA. All other food companies in the United States that are required to register with the FDA under the Public Health Security and Bioterrorism Preparedness and Response Act of 2002, as well as firms outside the US that export food to the US, are transitioning to mandatory hazard analysis and risk-based preventive controls (HARPC) plans.

In science and engineering, root cause analysis (RCA) is a method of problem solving used for identifying the root causes of faults or problems. It is widely used in IT operations, manufacturing, telecommunications, industrial process control, accident analysis, medicine, healthcare industry, etc. Root cause analysis is a form of inductive and deductive inference.

Failure mode and effects analysis is the process of reviewing as many components, assemblies, and subsystems as possible to identify potential failure modes in a system and their causes and effects. For each component, the failure modes and their resulting effects on the rest of the system are recorded in a specific FMEA worksheet. There are numerous variations of such worksheets. An FMEA can be a qualitative analysis, but may be put on a quantitative basis when mathematical failure rate models are combined with a statistical failure mode ratio database. It was one of the first highly structured, systematic techniques for failure analysis. It was developed by reliability engineers in the late 1950s to study problems that might arise from malfunctions of military systems. An FMEA is often the first step of a system reliability study.

A sentinel event is "any unanticipated event in a healthcare setting that results in death or serious physical or psychological injury to a patient, not related to the natural course of the patient's illness". Sentinel events can be caused by major mistakes and negligence on the part of a healthcare provider, and are closely investigated by healthcare regulatory authorities. Sentinel events are identified under The Joint Commission (TJC) accreditation policies to help aid in root cause analysis and to assist in development of preventive measures. The Joint Commission tracks events in a database to ensure events are adequately analyzed, and that undesirable trends or decreases in performance are caught early and mitigated.

Reliability engineering is a sub-discipline of systems engineering that emphasizes the ability of equipment to function without failure. Reliability describes the ability of a system or component to function under stated conditions for a specified period of time. Reliability is closely related to availability, which is typically described as the ability of a component or system to function at a specified moment or interval of time.

Business analysis is a professional discipline focused on identifying business needs and determining solutions to business problems. Solutions may include a software-systems development component, process improvements, or organizational changes, and may involve extensive analysis, strategic planning and policy development. A person dedicated to carrying out these tasks within an organization is called a business analyst or BA.

Failure analysis is the process of collecting and analyzing data to determine the cause of a failure, often with the goal of determining corrective actions or liability. According to Bloch and Geitner, ”machinery failures reveal a reaction chain of cause and effect… usually a deficiency commonly referred to as the symptom…”. Failure analysis can save money, lives, and resources if done correctly and acted upon. It is an important discipline in many branches of manufacturing industry, such as the electronics industry, where it is a vital tool used in the development of new products and for the improvement of existing products. The failure analysis process relies on collecting failed components for subsequent examination of the cause or causes of failure using a wide array of methods, especially microscopy and spectroscopy. Nondestructive testing (NDT) methods are valuable because the failed products are unaffected by analysis, so inspection sometimes starts using these methods.

Software assurance (SwA) is a critical process in software development that ensures the reliability, safety, and security of software products. It involves a variety of activities, including requirements analysis, design reviews, code inspections, testing, and formal verification. One crucial component of software assurance is secure coding practices, which follow industry-accepted standards and best practices, such as those outlined by the Software Engineering Institute (SEI) in their CERT Secure Coding Standards (SCS).

An incident is an event that could lead to loss of, or disruption to, an organization's operations, services or functions. Incident management (IcM) is a term describing the activities of an organization to identify, analyze, and correct hazards to prevent a future re-occurrence. These incidents within a structured organization are normally dealt with by either an incident response team (IRT), an incident management team (IMT), or Incident Command System (ICS). Without effective incident management, an incident can disrupt business operations, information security, IT systems, employees, customers, or other vital business functions.

Failure mode effects and criticality analysis (FMECA) is an extension of failure mode and effects analysis (FMEA).

In software engineering, software system safety optimizes system safety in the design, development, use, and maintenance of software systems and their integration with safety-critical hardware systems in an operational environment.

Eight Disciplines Methodology (8D) is a method or model developed at Ford Motor Company used to approach and to resolve problems, typically employed by quality engineers or other professionals. Focused on product and process improvement, its purpose is to identify, correct, and eliminate recurring problems. It establishes a permanent corrective action based on statistical analysis of the problem and on the origin of the problem by determining the root causes. Although it originally comprised eight stages, or 'disciplines', it was later augmented by an initial planning stage. 8D follows the logic of the PDCA cycle. The disciplines are:

<span class="mw-page-title-main">Accident analysis</span> Process to determine the causes of accidents to prevent recurrence

Accident analysis is a process carried out in order to determine the cause or causes of an accident so as to prevent further accidents of a similar kind. It is part of accident investigation or incident investigation. These analyses may be performed by a range of experts, including forensic scientists, forensic engineers or health and safety advisers. Accident investigators, particularly those in the aircraft industry, are colloquially known as "tin-kickers". Health and safety and patient safety professionals prefer using the term "incident" in place of the term "accident". Its retrospective nature means that accident analysis is primarily an exercise of directed explanation; conducted using the theories or methods the analyst has to hand, which directs the way in which the events, aspects, or features of accident phenomena are highlighted and explained. These analyses are also invaluable in determining ways to prevent future incidents from occurring. They provide good insight by determining root causes, into what failures occurred that lead to the incident.

Operational intelligence (OI) is a category of real-time dynamic, business analytics that delivers visibility and insight into data, streaming events and business operations. OI solutions run queries against streaming data feeds and event data to deliver analytic results as operational instructions. OI provides organizations the ability to make decisions and immediately act on these analytic insights, through manual or automated actions.

Corrective and preventive action consists of improvements to an organization's processes taken to eliminate causes of non-conformities or other undesirable situations. It is usually a set of actions, laws or regulations required by an organization to take in manufacturing, documentation, procedures, or systems to rectify and eliminate recurring non-conformance. Non-conformance is identified after systematic evaluation and analysis of the root cause of the non-conformance. Non-conformance may be a market complaint or customer complaint or failure of machinery or a quality management system, or misinterpretation of written instructions to carry out work. The corrective and preventive action is designed by a team that includes quality assurance personnel and personnel involved in the actual observation point of non-conformance. It must be systematically implemented and observed for its ability to eliminate further recurrence of such non-conformation. The Eight disciplines problem solving method, or 8D framework, can be used as an effective method of structuring a CAPA.

Event correlation is a technique for making sense of a large number of events and pinpointing the few events that are really important in that mass of information. This is accomplished by looking for and analyzing relationships between events.

The system safety concept calls for a risk management strategy based on identification, analysis of hazards and application of remedial controls using a systems-based approach. This is different from traditional safety strategies which rely on control of conditions and causes of an accident based either on the epidemiological analysis or as a result of investigation of individual past accidents. The concept of system safety is useful in demonstrating adequacy of technologies when difficulties are faced with probabilistic risk analysis. The underlying principle is one of synergy: a whole is more than sum of its parts. Systems-based approach to safety requires the application of scientific, technical and managerial skills to hazard identification, hazard analysis, and elimination, control, or management of hazards throughout the life-cycle of a system, program, project or an activity or a product. "Hazop" is one of several techniques available for identification of hazards.

PTC Windchill is a family of Product Lifecycle Management (PLM) software products that is offered by PTC. In 2004, as part of their expansion in the area of collaboration tools, they arranged having "a hosted version of Windchill to small- and medium-sized customers." As of 2011, products from its marketer, PTC, were being used by over 1.1 million users worldwide.

References

  1. MIL-HDBK-2155 at ASSIST