Critical system

A critical system is a system which must be highly reliable and retain this reliability as it evolves without incurring prohibitive costs. [1]

There are four types of critical systems: safety critical, mission critical, business critical and security critical. [1]

Description

For such systems, trusted methods and techniques must be used for development. Consequently, critical systems are usually developed using well-tried techniques rather than newer ones that have not been subject to extensive practical experience. Developers of critical systems are naturally conservative, preferring older techniques whose strengths and weaknesses are understood over new techniques which may appear better but whose long-term problems are unknown. [2]

Expensive software engineering techniques that are not cost-effective for non-critical systems may sometimes be used for critical systems development. For example, formal mathematical methods of software development have been successfully used for safety and security critical systems. One reason these formal methods are used is that they help reduce the amount of testing required. For critical systems, the costs of verification and validation are usually very high, often more than 50% of the total system development costs. [2]
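
As a minimal sketch of how a proof can stand in for testing, consider the following Lean 4 fragment. It is illustrative only and not taken from the cited sources; the clamp function and its permitted range are hypothetical. The theorem establishes a simple range-safety property for every possible input, something no finite test suite can do.

    -- Hypothetical actuator command, clamped to a permitted range [0, 100].
    def clamp (x : Int) : Int :=
      if x ≤ 0 then 0
      else if 100 ≤ x then 100
      else x

    -- Proved once for *every* input; testing could only sample finitely many.
    theorem clamp_safe (x : Int) : 0 ≤ clamp x ∧ clamp x ≤ 100 := by
      unfold clamp
      split
      · omega
      · split <;> omega

Once clamp_safe is accepted by the proof checker, this particular property needs no test cases at all, which is one way formal development reduces the testing burden.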

Classification

A critical system is distinguished by the consequences associated with system or function failure. In addition, critical systems are classified as either fail-operational or fail-safe, according to the tolerance they must exhibit to failures. [3]
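
The distinction can be made concrete with a short sketch. The following Python fragment is illustrative only and not drawn from the cited sources; names such as SensorFault and fail_operational_step are hypothetical. A fail-safe controller reacts to a fault by forcing the system into a safe state, whereas a fail-operational controller masks the fault, here by majority voting over redundant sensor channels.

    from collections import Counter

    class SensorFault(Exception):
        """Raised by a sensor channel that has detected its own failure."""

    def fail_safe_step(read_sensor, command_safe_state):
        """Fail-safe policy: on any fault, drive the system into a safe
        state (e.g. close the valve, apply the brakes) and stop."""
        try:
            return read_sensor()
        except SensorFault:
            command_safe_state()
            raise

    def fail_operational_step(sensor_channels):
        """Fail-operational policy: majority-vote over redundant channels
        so that one failed channel does not interrupt the mission."""
        readings = []
        for read in sensor_channels:
            try:
                readings.append(read())
            except SensorFault:
                continue  # tolerate the faulty channel
        if len(readings) < 2:
            raise SensorFault("too few healthy channels to vote")
        value, _count = Counter(readings).most_common(1)[0]
        return value

    def failed_channel():
        raise SensorFault("channel self-test failed")

    # One of three redundant channels is down; the vote still returns 42.
    print(fail_operational_step([lambda: 42, lambda: 42, failed_channel]))

A fail-safe design would instead call command_safe_state and halt normal operation as soon as its single channel failed.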

Safety critical

Safety-critical systems deal with scenarios that may lead to loss of life, serious personal injury, or damage to the natural environment. Examples of safety-critical systems are a control system for a chemical manufacturing plant or for an aircraft, the controller of a driverless metro train, the controller of a nuclear plant, etc. [2] [1] [3]

Mission critical

Mission-critical systems are designed to avoid any failure to meet the overall system or project objectives, or one of the goals for which the system was designed. Examples of mission-critical systems are the navigational system of a spacecraft, the software controlling the baggage handling system of an airport, etc. [2] [1] [3]

Business critical

Business-critical systems are designed to avoid significant tangible or intangible economic costs, e.g., loss of business or damage to reputation, often caused by an interruption of service when the system becomes unusable. Examples of business-critical systems are the customer accounting system of a bank, a stock-trading system, the ERP system of a company, an Internet search engine, etc. [2] [1] [3]

Security critical

Security-critical systems deal with scenarios in which sensitive data may be lost through theft or accident. [1]

Notes

  1. Hinchey, Mike; Coyle, Lorcan (2010). "Evolving Critical Systems: A Research Agenda for Computer-Based Systems". 2010 17th IEEE International Conference and Workshops on Engineering of Computer Based Systems. pp. 430–435. doi:10.1109/ECBS.2010.56. hdl:10344/2085. ISBN 978-1-4244-6537-8. S2CID 17986471.
  2. "Mission Critical vs. Business Critical: HUH?". ActiveState ActiveBlog. 16 March 2010.
  3. Bozzano, Marco; Villafiorita, Adolfo (2010). Design and Safety Assessment of Critical Systems. Austin, Texas: Auerbach Publications. p. 298. ISBN 9781439803318.
