Fault reporting

Fault reporting is a maintenance concept that increases operational availability and that reduces operating cost by three mechanisms:

Fault reporting is a prerequisite for condition-based maintenance. [1]

Active redundancy can be integrated with fault reporting to reduce the down time to a few minutes per year.
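To put "a few minutes per year" in perspective, the following minimal sketch converts annual down time into an availability fraction. The function name and the five-minute figure are illustrative assumptions, not values taken from the cited sources.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def availability_from_downtime(downtime_minutes_per_year: float) -> float:
    """Return the fraction of a year the system is up, given annual down time."""
    if not 0 <= downtime_minutes_per_year <= MINUTES_PER_YEAR:
        raise ValueError("down time must lie between 0 and one year")
    return 1.0 - downtime_minutes_per_year / MINUTES_PER_YEAR


# Hypothetical figure: five minutes of down time per year.
five_minute_availability = availability_from_downtime(5)
```

Under this assumption, five minutes of down time per year corresponds to an availability of roughly 99.999%.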

History

Formal maintenance philosophies are required by organizations whose primary responsibility is to ensure systems are ready when expected, such as space agencies and the military. [2]

Labor-intensive planned maintenance began during the rise of the Industrial Revolution and depends upon periodic diagnostic evaluation based upon calendar dates, distance, or use. The intent is to detect when maintenance is required before critical equipment fails during use, which would cause inconvenience and safety issues.

The electronic revolution allowed inexpensive sensors and controls to be integrated into most equipment. That includes diagnostic indicators, fluid sensors, temperature sensors, ignition sensors, exhaust monitoring, voltage sensors, and similar monitoring equipment that indicates when maintenance is required. Sensor displays are often located in inaccessible locations that cannot be observed during normal operation. Labor-intensive periodic maintenance is often required to inspect indicators.

Some organizations have eliminated most labor-intensive periodic maintenance and diagnostic down time by implementing designs that bring all sensor status to fault indicators near users.

Principle

Maintenance requires three actions: fault discovery, fault isolation, and fault recovery.

Fault discovery requires diagnostic maintenance, which requires system down time and labor costs.

Down time and cost requirements associated with diagnostics are eliminated for every item that satisfies the following criteria.

Implementation

Fault reporting is an optional feature that can be forwarded to remote displays using a simple configuration setting in most modern computing equipment. The system-level severities appropriate for condition-based maintenance are critical, alert, and emergency, which indicate software termination due to failure. Reporting of specific failures, such as an interface failure, can be integrated into applications linked with these reporting systems. There is no development cost if these reporting features are incorporated into designs.
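The severity names above match the three most severe levels of the syslog protocol (RFC 5424), so the following sketch assumes syslog-style numbering, in which lower numbers are more severe. It shows one possible way to select only the events worth forwarding to a remote display. The names `Severity`, `should_forward`, and `select_fault_reports` are hypothetical, and the source does not state which logging system is used.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Syslog-style severities (assumed); lower value means more severe."""
    EMERGENCY = 0
    ALERT = 1
    CRITICAL = 2
    ERROR = 3
    WARNING = 4
    NOTICE = 5
    INFORMATIONAL = 6
    DEBUG = 7


def should_forward(severity: Severity) -> bool:
    """True for the levels the text lists: critical, alert, and emergency."""
    return severity <= Severity.CRITICAL


def select_fault_reports(events):
    """Keep only (severity, message) pairs that qualify for fault reporting."""
    return [(sev, msg) for sev, msg in events if should_forward(sev)]


# Hypothetical event stream from one piece of equipment.
events = [
    (Severity.NOTICE, "scheduled self-test passed"),
    (Severity.CRITICAL, "interface failure on port 2"),
    (Severity.WARNING, "fan speed above nominal"),
    (Severity.EMERGENCY, "control process terminated"),
]
reports = select_fault_reports(events)
```

In this sketch only the critical and emergency events survive the filter; routine notices and warnings stay in the local log.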

Other kinds of fault reporting involve painting green, yellow, and red zones onto temperature gages, pressure gages, flow gages, vibration sensors, strain gages, and similar sensors. Remote viewing can be implemented using a video camera.
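The painted zones have a direct software equivalent. The sketch below classifies a reading into a green, yellow, or red zone, assuming that higher readings are worse and that the two thresholds are supplied by the equipment designer. The function name and the example thresholds are hypothetical.

```python
def classify_reading(value: float, yellow_at: float, red_at: float) -> str:
    """Map a sensor reading to a zone, assuming higher readings are worse."""
    if yellow_at >= red_at:
        raise ValueError("yellow threshold must be below red threshold")
    if value >= red_at:
        return "red"
    if value >= yellow_at:
        return "yellow"
    return "green"


# Hypothetical coolant temperature gage: yellow from 90, red from 110.
zone = classify_reading(95.0, yellow_at=90.0, red_at=110.0)
```

A sensor for which low readings are the hazard, such as oil pressure, would need the comparisons reversed.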

Benefits

The historical approach to fault discovery is periodic diagnostic testing. Fault reporting eliminates the following operational availability penalty associated with that testing.

Fault reporting eliminates maintenance costs associated with manual diagnostic testing.

Labor is eliminated in redundant designs by using the fault discovery and fault isolation functions to automatically reconfigure equipment for degraded operation.
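As a rough illustration of automatic reconfiguration, the sketch below removes a faulted unit from a redundant group and reports whether the group is operating normally, in degraded mode, or has failed. The class name, the unit names, and the `required` parameter are hypothetical; real equipment would drive this from its own fault discovery and isolation logic.

```python
class RedundantSet:
    """Tracks which units of a redundant group are still in service."""

    def __init__(self, unit_names, required=1):
        self.in_service = list(unit_names)
        self.isolated = []
        self.required = required  # minimum units needed to keep operating

    def report_fault(self, unit_name):
        """Take a faulted unit out of service and return the new group status."""
        if unit_name in self.in_service:
            self.in_service.remove(unit_name)
            self.isolated.append(unit_name)
        return self.status()

    def status(self):
        """Return 'normal', 'degraded', or 'failed' for the whole group."""
        if len(self.in_service) < self.required:
            return "failed"
        return "degraded" if self.isolated else "normal"


# Hypothetical pair of pumps where one is enough to keep running.
pumps = RedundantSet(["pump_a", "pump_b"], required=1)
after_first_fault = pumps.report_fault("pump_a")
```

Here the first fault leaves the group in degraded operation with `pump_b` still in service, so no immediate manual action is needed to keep the system running.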

Maintenance savings can be re-allocated to upgrades and improvements that increase organizational competitiveness.

Problems

Faults that do not trigger a sustained requirement for fault isolation and fault recovery actions should not be displayed for management action.

For example, lighting up a fault indicator in situations where human intervention is not required induces breakage by prompting maintenance personnel to perform work when nothing is broken.

Another example is that enabling fault reporting for Internet packet delivery failures increases network loading when the network is already busy, which can cause a total network outage.
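One way to avoid displaying transient or non-actionable faults is to require that a fault persist across several consecutive samples and that it actually call for human intervention. The sketch below illustrates that rule; the function name, the default of three samples, and the `needs_human_action` flag are assumptions made for illustration.

```python
def should_display(samples, needs_human_action, min_consecutive=3):
    """Return True only for a sustained fault that needs human intervention.

    samples is a list of booleans, most recent last, where True means the
    fault was present in that sample.
    """
    if min_consecutive < 1:
        raise ValueError("min_consecutive must be at least 1")
    if not needs_human_action:
        return False
    if len(samples) < min_consecutive:
        return False
    # The fault must be present in every one of the most recent samples.
    return all(samples[-min_consecutive:])


# A dropped packet that recovers on its own is never shown.
transient = should_display([False, True, False], needs_human_action=True)
# A fault present in the last three samples is shown.
sustained = should_display([False, True, True, True], needs_human_action=True)
```

Because the check is evaluated locally, suppressing transient faults in this way also avoids sending extra report traffic over an already busy network.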

References

  1. "Opnav Instruction 4790.16: Condition Based Maintenance". US Navy Operations. Archived from the original on 2013-02-15. Retrieved 2012-08-15.
  2. "Opnav Instruction 4790.7: Maintenance Policy for United States Navy Ships". US Navy Operations. Archived from the original on 2013-02-15. Retrieved 2012-08-15.