Failing badly

Last updated

Failing badly and failing well are concepts in systems security and network security (and engineering in general) describing how a system reacts to failure. The terms have been popularized by Bruce Schneier, a cryptographer and security consultant. [1] [2]

Contents

Failing badly

A system that fails badly is one that has a catastrophic result when failure occurs. A single point of failure can thus bring down the whole system. Examples include:

Failing well

A system that fails well is one that compartmentalizes or contains its failure. Examples include:

Designing a system to 'fail well' has also been alleged to be a better use of limited security funds than the typical quest to eliminate all potential sources of errors and failure. [4]

See also

Related Research Articles

<span class="mw-page-title-main">Network address translation</span> Technique for making connections between IP address spaces

Network address translation (NAT) is a method of mapping an IP address space into another by modifying network address information in the IP header of packets while they are in transit across a traffic routing device. The technique was originally used to bypass the need to assign a new address to every host when a network was moved, or when the upstream Internet service provider was replaced, but could not route the network's address space. It has become a popular and essential tool in conserving global address space in the face of IPv4 address exhaustion. One Internet-routable IP address of a NAT gateway can be used for an entire private network.

<span class="mw-page-title-main">Load balancing (computing)</span> Set of techniques to improve the distribution of workloads across multiple computing resources

In computing, load balancing is the process of distributing a set of tasks over a set of resources, with the aim of making their overall processing more efficient. Load balancing can optimize response time and avoid unevenly overloading some compute nodes while other compute nodes are left idle.

<span class="mw-page-title-main">Power outage</span> Loss of electric power to an area

A power outage is the loss of the electrical power network supply to an end user.

<span class="mw-page-title-main">Cascading failure</span> Systemic risk of failure

A cascading failure is a failure in a system of interconnected parts in which the failure of one or few parts leads to the failure of other parts, growing progressively as a result of positive feedback. This can occur when a single part fails, increasing the probability that other portions of the system fail. Such a failure may happen in many types of systems, including power transmission, computer networking, finance, transportation systems, organisms, the human body, and ecosystems.

<span class="mw-page-title-main">Fatigue (material)</span> Initiation and propagation of cracks in a material due to cyclic loading

In materials science, fatigue is the initiation and propagation of cracks in a material due to cyclic loading. Once a fatigue crack has initiated, it grows a small amount with each loading cycle, typically producing striations on some parts of the fracture surface. The crack will continue to grow until it reaches a critical size, which occurs when the stress intensity factor of the crack exceeds the fracture toughness of the material, producing rapid propagation and typically complete fracture of the structure.

<span class="mw-page-title-main">Fracture mechanics</span> Study of propagation of cracks in materials

Fracture mechanics is the field of mechanics concerned with the study of the propagation of cracks in materials. It uses methods of analytical solid mechanics to calculate the driving force on a crack and those of experimental solid mechanics to characterize the material's resistance to fracture.

In software architecture, publish–subscribe is a messaging pattern where publishers categorize messages into classes that are received by subscribers. This is contrasted to the typical messaging pattern model where publishers send messages directly to subscribers.

<span class="mw-page-title-main">Reverse proxy</span> Type of proxy server

In computer networks, a reverse proxy or surrogate server is a proxy server that appears to any client to be an ordinary web server, but in reality merely acts as an intermediary that forwards the client's requests to one or more ordinary web servers. Reverse proxies help increase scalability, performance, resilience, and security, but they also carry a number of risks.

<span class="mw-page-title-main">Link aggregation</span> Using multiple network connections in parallel to increase capacity and reliability

In computer networking, link aggregation is the combining of multiple network connections in parallel by any of several methods. Link aggregation increases total throughput beyond what a single connection could sustain, and provides redundancy where all but one of the physical links may fail without losing connectivity. A link aggregation group (LAG) is the combined collection of physical ports.

<span class="mw-page-title-main">Redundancy (engineering)</span> Duplication of critical components to increase reliability of a system

In engineering and systems theory, redundancy is the intentional duplication of critical components or functions of a system with the goal of increasing reliability of the system, usually in the form of a backup or fail-safe, or to improve actual system performance, such as in the case of GNSS receivers, or multi-threaded computer processing.

Fault tolerance is the ability of a system to maintain proper operation despite failures or faults in one or more of its components. This capability is essential for high-availability, mission-critical, or even life-critical systems.

A cybersecurity regulation comprises directives that safeguard information technology and computer systems with the purpose of forcing companies and organizations to protect their systems and information from cyberattacks like viruses, worms, Trojan horses, phishing, denial of service (DOS) attacks, unauthorized access and control system attacks. While cybersecurity regulations aim to minimize cyber risks and enhance protection, the uncertainty arising from frequent changes or new regulations can significantly impact organizational response strategies.

High availability (HA) is a characteristic of a system that aims to ensure an agreed level of operational performance, usually uptime, for a higher than normal period.

<span class="mw-page-title-main">Hard disk drive failure</span> Electromechanical malfunctioning

A hard disk drive failure occurs when a hard disk drive malfunctions and the stored information cannot be accessed with a properly configured computer.

<span class="mw-page-title-main">Failure</span> Not meeting a desired or intended objective

Failure is the social concept of not meeting a desirable or intended objective, and is usually viewed as the opposite of success. The criteria for failure depends on context, and may be relative to a particular observer or belief system. One person might consider a failure what another person considers a success, particularly in cases of direct competition or a zero-sum game. Similarly, the degree of success or failure in a situation may be differently viewed by distinct observers or participants, such that a situation that one considers to be a failure, another might consider to be a success, a qualified success or a neutral situation.

<span class="mw-page-title-main">Single point of failure</span> A part whose failure will disrupt the entire system

A single point of failure (SPOF) is a part of a system that, if it fails, will stop the entire system from working. SPOFs are undesirable in any system with a goal of high availability or reliability, be it a business practice, software application, or other industrial system.

<span class="mw-page-title-main">Structural fracture mechanics</span> Field of structural engineering

Structural fracture mechanics is the field of structural engineering concerned with the study of load-carrying structures that includes one or several failed or damaged components. It uses methods of analytical solid mechanics, structural engineering, safety engineering, probability theory, and catastrophe theory to calculate the load and stress in the structural components and analyze the safety of a damaged structure.

<span class="mw-page-title-main">Structural integrity and failure</span> Ability of a structure to support a designed structural load without breaking

Structural integrity and failure is an aspect of engineering that deals with the ability of a structure to support a designed structural load without breaking and includes the study of past structural failures in order to prevent failures in future designs.

A cyberattack occurs when there is an unauthorized action against computer infrastructure that compromises the confidentiality, integrity, or availability of its content.

An Application Defined Network (ADN) is a style of enterprise data network that uses virtual networks and security components to provide a dedicated logical network for applications. This allows customized security and network policies to be created to meet the requirements of that specific application. ADN technology allows for simple physical architectures with fewer devices, less device configuration and integration. ADN solutions simplify businesses' needs to securely deploy multiple applications across the enterprise footprint and partner networks, regardless of where the application resides. They provide policy-based, application-specific delivery to corporate data centers, cloud services and third-party networks securely and cost-effectively. Some ADN solutions integrate 3G or 4G wireless backup services to enable a second internet connection when connectivity is lost on the primary access connection. The ADN design provides an application-to-application (A2A) based model that evolves enterprise networks beyond the site-to-site (S2S) private model.

References

  1. 1 2 Homeland Insecurity Archived 2011-09-28 at the Wayback Machine , Atlantic Monthly , September 2002
  2. David Hillson (29 March 2011). The Failure Files: Perspectives on Failure. Triarchy Press. p. 146. ISBN   9781908009302.
  3. 1 2 Eric Vanderburg (February 18, 2013). "Fail Secure – The right way to fail". PC Security World. Archived from the original on October 27, 2014. Retrieved November 11, 2014.
  4. Failing Well with Information Security Archived 2008-10-14 at the Wayback Machine - Young, William; Apogee Ltd Consulting, 2003