Mission critical

A mission critical factor of a system is any factor (component, equipment, personnel, process, procedure, software, etc.) that is essential to business operation or to an organization. Failure or disruption of a mission critical factor seriously affects business operations or the organization, and can even cause social turmoil and catastrophes. [1]

Mission critical systems

A mission critical system is a system that is essential to the survival of a business or organization. When a mission critical system fails or is interrupted, business operations are significantly impacted. Mission essential equipment and mission critical applications are also referred to as mission critical systems. [2] Examples of mission critical systems include online banking systems, railway and aircraft operating and control systems, electric power systems, and many other computer systems whose failure adversely affects business and society.

A good example of a mission critical system is a navigational system for a spacecraft. The difference between mission critical and business critical lies in the scale of the adverse impact and the very real possibility of loss of life, serious injury, or financial loss. [3] [4]

There are four types of critical systems: mission critical, business critical, safety critical and security critical. The key difference between a safety critical system and a mission critical system is that the failure of a safety critical system may result in serious environmental damage, injury, or loss of life, while the failure of a mission critical system may result in the failure of a goal-directed activity. [5] An example of a safety critical system is a chemical manufacturing plant control system. Mission critical system and business critical system are similar terms, but a fault in a business critical system affects only a single company or organization and may partially halt its activity for hours or days. The failure of a security critical system may lead to the loss of sensitive data through theft or accidental loss. All four types are grouped under the general term critical system. [6] [5]

As a rule in crisis management, if a triage-type decision is made in which certain components must be eliminated or delayed, e.g. because of resource or personnel constraints, mission critical ones must not be among them.
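
The triage rule can be sketched in code. This is a minimal illustration, not an established procedure: the `Component` type, the `triage` function, the cost units, and the choice to fit the cheapest non-critical items first are all assumptions made for the example. The only behaviour taken from the rule itself is that mission critical components are never deferred.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    cost: int                # hypothetical resource units
    mission_critical: bool


def triage(components, budget):
    """Split components into (kept, deferred) under a resource budget.

    Mission critical components are always kept. If the budget cannot
    cover them, the function raises instead of deferring one of them.
    """
    critical = [c for c in components if c.mission_critical]
    required = sum(c.cost for c in critical)
    if required > budget:
        raise ValueError("budget cannot cover the mission critical components")

    kept = list(critical)
    deferred = []
    remaining = budget - required
    # Assumption: fit the cheapest non-critical components first.
    others = sorted(
        (c for c in components if not c.mission_critical),
        key=lambda c: c.cost,
    )
    for component in others:
        if component.cost <= remaining:
            kept.append(component)
            remaining -= component.cost
        else:
            deferred.append(component)
    return kept, deferred
```

With a budget of 8 units and hypothetical components costing 5 (mission critical), 4 and 2, the sketch keeps the mission critical component and the 2-unit one, and defers the 4-unit one.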

Examples

Every functioning business or organization has mission critical systems. [7] A failed filtration system stops a water filtration company from operating, so for that company the filtration system is mission critical. If a gas supply system goes down, many restaurants and bakeries have to shut down until it is restored, so for them the gas system is mission critical. Many other systems are mission critical in the same way: their malfunction seriously affects other industries or organizations.

An aircraft is highly dependent on its navigation system. Air navigation is accomplished by several methods. Dead reckoning uses visual checkpoints together with distance and time calculations. The flight computer helps pilots calculate the time and distance to the checkpoints they set. Radio navigation aids (NAVAIDs) let pilots navigate more accurately than dead reckoning alone and are especially useful in low visibility. Pilots also use GPS, which relies on 24 U.S. Department of Defense satellites to provide precise data on position, speed, and track. [8]

If two-way radio communication fails, pilots must follow the procedures in Title 14 of the Code of Federal Regulations (14 CFR) part 91. Pneumatic system failure, the associated loss of altitude, and other unfamiliar situations may cause stress and loss of situational awareness. In such cases, the pilot should use the navigation instruments to obtain more information about the situation. A malfunction of the navigation system would therefore be mission critical and would have serious consequences. [9] [10] [11]

Nuclear reactor safety system

A nuclear reactor is a system that controls and contains a sustained nuclear chain reaction. It is usually used to generate electricity, but can also be used for research and for producing medical isotopes. [12] Nuclear reactors are among the systems of greatest concern for public safety worldwide, because a reactor malfunction can cause a serious disaster. [1] A nuclear reactor is controlled by stopping, decreasing, or increasing the chain reaction inside it. While the reactor is operating, the chain reaction is controlled by varying the water level in the vertical cylinder and by moving adjuster rods. Sensitive detectors constantly monitor temperatures, pressure, and reactor power levels. [13]

History

Mission critical systems are the core of a business, and their failure causes serious financial and reputational damage. Today, as companies develop and the world becomes a more web-based community, the scope of mission critical has widened. Mission critical computing, however, has been evolving since the pre-Web era (before 1995). On the entirely text-based pre-Web internet, Gopher was one of the ASCII-based end-user programs. During this era, mission critical systems were mainly transactional applications: business process management software, ERP, and airline reservation systems were typically mission critical. These applications ran on dedicated systems in the data center. They had a limited number of end users, who usually accessed them via terminals and personal computers. [14]

The pre-Web era was followed by the Web era (1995-2010). The scope of mission critical grew to include electronic devices and web applications. More people gained access to the internet and electronic devices, so a larger number of end users could reach a growing number of mission critical applications. Customers came to expect unlimited availability and stronger security from the devices they used. Businesses also became more web-based, which correspondingly increased financial crime and fraud. This wider scope of mission critical drove stronger security and the growth of the security industry. Between 1995 and 2010, the number of web users worldwide grew from 16 million to 1.7 billion, which shows the increase in global reliance on web systems. [15]

The Web era was followed by the consumerization era (2010 and beyond). The scope of mission critical has widened further with the growth of social, mobile, and customer-facing applications. The consumerization of IT increased, the number of organizations grew, and web and IT services became more widely available. Social business, customer service, and customer support applications have grown greatly, expanding the scope of mission critical further. According to Gartner, mobile development projects will outnumber native PC projects by a ratio of 4:1. Today's mission critical therefore covers everything crucial to customer-facing service, business operations, employee productivity, and finance. Customers' expectations have risen, and even a small disruption can cause a tremendous loss to a business. It was estimated that Amazon could have lost as much as $1,100 per second in net sales during an outage, and that a five-minute outage cost Google more than $545,000 [ citation needed ]. A failure of a mission critical system, even a short outage, can be very costly because of the reputational damage. Longer periods of downtime can cause even more serious problems for industries and organizations. [16]
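
The downtime figures quoted above can be put on a common scale with simple arithmetic. The sketch below assumes, purely for illustration, that losses accrue at a constant rate for the whole outage; the function names are hypothetical, and the inputs are the estimates given in the text rather than independently verified data.

```python
def downtime_cost(loss_per_second: float, outage_seconds: float) -> float:
    """Estimated loss for an outage, assuming a constant loss rate."""
    if loss_per_second < 0 or outage_seconds < 0:
        raise ValueError("rate and duration must be non-negative")
    return loss_per_second * outage_seconds


def implied_loss_per_second(total_loss: float, outage_seconds: float) -> float:
    """Average loss rate implied by a total loss over an outage."""
    if outage_seconds <= 0:
        raise ValueError("outage duration must be positive")
    return total_loss / outage_seconds


FIVE_MINUTES = 5 * 60  # seconds

# Amazon estimate from the text: up to $1,100 per second in net sales.
amazon_five_minute_loss = downtime_cost(1_100, FIVE_MINUTES)

# Google estimate from the text: more than $545,000 over five minutes.
google_rate = implied_loss_per_second(545_000, FIVE_MINUTES)
```

Under the constant-rate assumption, the Amazon estimate corresponds to up to $330,000 over five minutes, and the Google estimate implies an average of roughly $1,817 per second.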

Safety & security

Mission critical systems should be kept highly secure in every industry or organization that uses them. Industries therefore use various security systems to avoid mission critical failures. Companies that rely on mainframes or workstations all depend on databases and process control, so these are mission critical for them. Hospital patient records, call centers, stock exchanges, data storage centers, flight control towers, and many other operations that depend on communication systems and computers are considered mission critical and should be protected against system shutdown. No company or industry can rule out unexpected or extraordinary problems that could shut down a mission critical system. Using safety systems to guard against this is therefore considered a very important part of the business. [1]

Transport Layer Security (TLS)

Transport Layer Security (TLS; formerly Secure Sockets Layer, SSL) is the standard networking security protocol that manages client and server authentication and encrypts communication. It is commonly used by online transaction websites such as PayPal [17] and Bank of America, [18] whose systems, if taken down or hacked, would cause serious problems for society and for the companies themselves. TLS uses public-key and symmetric-key encryption together to secure the connection between two machines. It is typically used by mail services and by client machines that communicate over the internet. To use this technology, a web server requires a digital certificate, which is obtained by answering several questions about the identity of the website and generating a pair of cryptographic keys (a public key and a private key). Organizations using this technology may also have to pay an annual fee.
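
From the client side, the certificate check and encrypted connection described above can be sketched with Python's standard `ssl` module. This is an illustrative sketch only: the helper names `build_client_context` and `negotiated_protocol` are hypothetical, the TLS 1.2 minimum is an assumption made for the example, and any host name passed in is a placeholder supplied by the caller.

```python
import socket
import ssl


def build_client_context() -> ssl.SSLContext:
    """Return a client-side context that verifies the server certificate."""
    # create_default_context() turns on certificate and host name checks
    # and loads the system's trusted certificate authorities.
    context = ssl.create_default_context()
    # Assumption for this sketch: refuse anything older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def negotiated_protocol(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Connect to host, complete the TLS handshake, and report the version."""
    context = build_client_context()
    with socket.create_connection((host, port), timeout=timeout) as raw_sock:
        # The handshake authenticates the server through its certificate
        # (public-key cryptography) and agrees on symmetric session keys
        # that encrypt the rest of the traffic.
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            return tls_sock.version() or "unknown"
```

Calling `negotiated_protocol` with a reachable host name needs network access and raises `ssl.SSLCertVerificationError` if the server's certificate cannot be validated, which is the behaviour a client of a mission critical service would normally want.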

Shutdown systems

Nuclear power plants need safety systems to avoid mission-critical failures. The worst possible consequence is a leak of radioactive materials (U-235 or Pu-239). One system for avoiding mission critical failures in nuclear power plants is the shutdown system. It takes two forms: rod control and safety injection control. When a problem occurs in the plant, the rod control shutdown system automatically drops the rods and stops the chain reaction. The safety injection control immediately injects liquid when the system detects a problem in the reactor, which also stops the chain reaction. Both systems usually operate automatically, but can also be activated manually. [13] [19]

Real time and mission critical

Real time and mission critical are often confused, but they are not the same concept.

Real time

Real time is the responsiveness of a computer that continually keeps up to date with external processes and must process information within a specified time, or else serious consequences may result. [20] Video games are an example of real time: the computer renders them so rapidly that the user hardly notices any delay. Each frame must be rendered in a short time to maintain the experience of interactivity. [ citation needed ] Rendering speed may vary between computer systems. [21]

Types of real time systems

  • A hard real-time system must not miss its deadline, or serious consequences can result. Its timing is non-negotiable, and a result delivered after the deadline counts as a wrong answer. An example of a hard real-time system is a car airbag. [22]
  • A soft real-time system has a looser deadline. The system can cope and continue to function normally even if a deadline is missed, although its usefulness still depends on fast processing. An example of soft real time is typing: a delay annoys the user, but the result is still correct. [23]
  • A non-real-time system has no fixed or absolute deadlines. However, throughput can still be very important. [23]
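
The three categories differ in how a missed deadline is treated, which can be sketched as a small classification function. The names `Kind` and `evaluate`, the outcome labels, and the millisecond figures in the usage note are assumptions made for illustration and do not come from any standard.

```python
from enum import Enum
from typing import Optional


class Kind(Enum):
    HARD = "hard real-time"
    SOFT = "soft real-time"
    NON = "non-real-time"


def evaluate(kind: Kind, elapsed_ms: float, deadline_ms: Optional[float]) -> str:
    """Classify a completed task by how its timing is judged.

    Returns "ok", "degraded" (late but still correct), or "failure"
    (late, so the result counts as wrong).
    """
    # Non-real-time work has no deadline to miss.
    if kind is Kind.NON or deadline_ms is None:
        return "ok"
    if elapsed_ms <= deadline_ms:
        return "ok"
    # A missed hard deadline is a wrong answer; a missed soft deadline
    # only reduces the quality of service.
    return "failure" if kind is Kind.HARD else "degraded"
```

For example, with a hypothetical 30 ms deadline, a hard real-time task that takes 40 ms evaluates to "failure", while a soft real-time task with the same timing evaluates to "degraded".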

Differences

A real-time system fails if a specific deadline is not met, whereas a mission critical system is one whose failure has catastrophic consequences. The two often go hand in hand, since a real-time system can be mission critical, but they are related rather than identical concepts. [22] [23]

Mission critical personnel and mission critical systems planning

Social survival

From the perspective of social function (i.e. keeping society's life support structure and overall structure intact), the mission critical aspects are those that provide society's basic needs. Such basic needs are often said to include food (including food production and distribution), water, clothing (not an immediate need in an emergency), sanitation (sewage is an immediate need, but disposal of physical waste is not an immediate need in an emergency), housing or shelter, energy (not immediate) and health needs (not immediate in a healthy population). This list is not exhaustive. Longer-term needs might include communications and transport in a developed population. Each need maps clearly onto mission critical personnel: food production requires farmers, food distribution requires transportation personnel, water requires water-infrastructure maintenance personnel (a long-term requirement if the existing water infrastructure has been maintained to a high standard), clothing requires people to maintain clothes production infrastructure, and similarly for sanitation. In emergencies, housing or shelter requires someone to build the shelter and, if necessary, maintain it over the long term. Health needs are met by doctors, nurses and surgeons. Any implicit use of infrastructure also requires personnel to maintain that infrastructure: food transportation requires not only drivers for food trucks but also, over the long term, highways maintenance personnel to maintain the roads, traffic infrastructure and road signs, which in turn requires power supply personnel to provide power for traffic lights, and so on.

In this light, mission-critical systems have a complex dependency network. Analysing the interconnected dependencies between different aspects of a mission-critical system can be useful in planning, or simply for gaining an accurate picture of how mission criticality is organised in complex systems. Such analysis makes it possible to identify choke points, the points at which a mission-critical system (or set of systems) is vulnerable in one sense or another. Ideas from human resources and human resources planning (such as Gantt charts for project management) are also relevant.
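
One way to look for choke points is to treat the dependencies as a directed graph and count how many other items transitively rely on each node. The sketch below uses the food transport chain from the text as sample data; the `DEPENDS_ON` mapping, the function name, and the "most transitive dependents" criterion are assumptions made for illustration, not an established method of mission-critical planning.

```python
from collections import defaultdict

# Hypothetical encoding of the example chain: each key needs every
# item in its list in order to function.
DEPENDS_ON = {
    "food distribution": ["truck drivers", "roads"],
    "roads": ["highway maintenance", "traffic lights"],
    "traffic lights": ["power supply"],
    "food production": ["farmers"],
}


def transitive_dependents(depends_on):
    """Map each node to the set of items that directly or indirectly need it."""
    # Invert the edges: for every need, record who requires it.
    required_by = defaultdict(set)
    for item, needs in depends_on.items():
        for need in needs:
            required_by[need].add(item)

    nodes = set(depends_on) | set(required_by)
    result = {}
    for node in nodes:
        seen = set()
        stack = [node]
        # Depth-first walk along the "is required by" edges.
        while stack:
            current = stack.pop()
            for dependent in required_by.get(current, ()):
                if dependent not in seen:
                    seen.add(dependent)
                    stack.append(dependent)
        result[node] = seen
    return result


dependents = transitive_dependents(DEPENDS_ON)
# Under this criterion, the candidate choke point is the node that the
# largest number of other items ultimately rely on.
choke_point = max(dependents, key=lambda name: len(dependents[name]))
```

In this sample data the power supply comes out as the candidate choke point, because traffic lights, roads and food distribution all ultimately depend on it.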

Mission criticality depends upon the timescale associated with basic needs or other factors deemed mission critical. Over the medium term (often taken to be 10 years) and the long term (which can extend to 50-60 years), the planning for mission-critical systems will clearly differ from short-term mission-critical systems planning. Mission-critical personnel can be considered part of the mission-critical systems planning paradigm, but they require a different approach from the technological or mechanical aspects of mission-critical systems (i.e. they require human resources planning).

Attributes of mission critical personnel

Psychometrics enables the determination and characterisation of various psychological aspects of mission-critical personnel (e.g. their IQ in the case of highly skilled work such as nuclear physics). Some jobs require physical standards (for instance, in the army) or physical dexterity (e.g. surgeons). There exist methods and means of characterising the types of skills, qualities and other attributes that certain mission-critical job roles require. These can be used as benchmarks for determining whether an individual is well suited to a particular mission-critical job role, or what assistance a less qualified (or even less capable) individual would need to perform a role that might be beyond their abilities (such measures might have to be taken in emergency situations).

References

  1. "Mission Critical: Overview, Examples, FAQ". Investopedia.
  2. "What is a Mission Critical System? - Definition from Techopedia". 20 March 2017.
  3. "Mission Critical vs. Business Critical: HUH?". Activestate ActiveBlog. 16 March 2010.
  4. "Business-critical systems". king-ict.com. Archived from the original on 2014-12-25.
  5. "Critical systems". ifs.host.cs.st-andrews.ac.uk.
  6. Hinchey, Mike; Coyle, Lorcan (2010). Evolving Critical Systems: a Research Agenda for Computer-Based Systems (PDF). 2010 17th IEEE International Conference and Workshops on Engineering of Computer Based Systems. pp. 430–435. doi:10.1109/ECBS.2010.56.
  7. "What is mission critical operations?" (PDF). NCMCO.
  8. "How do pilots navigate". aviation.about.com. Archived from the original on 2015-12-08. Retrieved 2015-11-05.
  9. "Emergency Operation". Instrument Flying Handbook.
  10. "Code of Federal Regulation Section 91". faa.gov.
  11. "Aeronautical Decision Making". Risk Management Handbook.
  12. Touran, Nick. "What is a nuclear reactor?". What is nuclear?.
  13. "Nuclear Power Plant Safety Systems". www.cnsc-ccsn.gc.ca.
  14. "The new scope of mission-critical computing". Computer World. 17 September 2013.
  15. Steven J. Vaughan-Nichols (17 April 2011). "Before the Web: the Internet in 1991". zdnet.com.
  16. "The new scope of mission-critical computing". Computer World. 17 September 2013.
  17. "What is PayPal and How Does it Work | PayPal US". www.paypal.com.
  18. "Privacy & Security Glossary of Terms from Bank of America". Bank of America.
  19. "Secure Socket Layers (SSL) Definition". Margaret Rouse. November 2014.
  20. John Huntington; Margaret Rouse (5 April 2006). "Real time". whatis.techtarget.com.
  21. "Real-time definition". techterms.com. 8 January 2007.
  22. Fernando S. Schlindwein (March 2004). "EG717—Real time DSP". le.ac.uk.
  23. "Real Time vs Mission Critical". c2.com. 27 April 2010.