Circuit breaker design pattern

Last updated December 16, 2024

The Circuit Breaker is a design pattern commonly used in software development to improve system resilience and fault tolerance. Circuit breaker pattern can prevent cascading failures particularly in distributed systems.^[1] In distributed systems, the Circuit Breaker pattern can be used to monitor service health and can detect failures dynamically. Unlike timeout-based methods, which can lead to delayed error responses or the premature failure of healthy requests, the Circuit Breaker pattern can proactively identify unresponsive services and can prevent repeated attempts. This approach can enhance the user experience. ^[2]

Challenges

According to Marc Brooker, circuit breakers can misinterpret a partial failure as total system failure and inadvertently bring down the entire system. In particular, sharded systems and cell-based architectures are vulnerable to this issue. A workaround is that the server indicates to the client which specific part is overloaded and the client uses a corresponding mini circuit breaker. However, this workaround can be complex and expensive.^[4]^[5]

Different states of circuit breaker

Closed
Open
Half-open

Closed state

When everything is normal, the circuit breakers remain closed, and all the request passes through to the services. If the number of failures increases beyond the threshold, the circuit breaker trips and goes into an open state.

Open state

In this state circuit breaker returns an error immediately without even invoking the services. The Circuit breakers move into the half-open state after a timeout period elapses. Usually, it will have a monitoring system where the timeout will be specified.

Half-open state

In this state, the circuit breaker allows a limited number of requests from the service to pass through and invoke the operation. If the requests are successful, then the circuit breaker will go to the closed state. However, if the requests continue to fail, then it goes back to open state.

Related Research Articles

Distributed computing is a field of computer science that studies distributed systems, defined as computer systems whose inter-communicating components are located on different networked computers.

In digital computers, an interrupt is a request for the processor to interrupt currently executing code, so that the event can be processed in a timely manner. If the request is accepted, the processor will suspend its current activities, save its state, and execute a function called an interrupt handler to deal with the event. This interruption is often temporary, allowing the software to resume normal activities after the interrupt handler finishes, although the interrupt could instead indicate a fatal error.

In computing, load balancing is the process of distributing a set of tasks over a set of resources, with the aim of making their overall processing more efficient. Load balancing can optimize response time and avoid unevenly overloading some compute nodes while other compute nodes are left idle.

Message-oriented middleware (MOM) is software or hardware infrastructure supporting sending and receiving messages between distributed systems. Message-oriented middleware is in contrast to streaming-oriented middleware where data is communicated as a sequence of bytes with no explicit message boundaries. Note that streaming protocols are almost always built above protocols using discrete messages such as frames (Ethernet), datagrams (UDP), packets (IP), cells (ATM), et al.

In software engineering, service-oriented architecture (SOA) is an architectural style that focuses on discrete services instead of a monolithic design. SOA is a good choice for system integration. By consequence, it is also applied in the field of software design where services are provided to the other components by application components, through a communication protocol over a network. A service is a discrete unit of functionality that can be accessed remotely and acted upon and updated independently, such as retrieving a credit card statement online. SOA is also intended to be independent of vendors, products and technologies.

REST is a software architectural style that was created to guide the design and development of the architecture for the World Wide Web. REST defines a set of constraints for how the architecture of a distributed, Internet-scale hypermedia system, such as the Web, should behave. The REST architectural style emphasises uniform interfaces, independent deployment of components, the scalability of interactions between them, and creating a layered architecture to promote caching to reduce user-perceived latency, enforce security, and encapsulate legacy systems.

In computer science, message passing is a technique for invoking behavior on a computer. The invoking program sends a message to a process and relies on that process and its supporting infrastructure to then select and run some appropriate code. Message passing differs from conventional programming where a process, subroutine, or function is directly invoked by name. Message passing is key to some models of concurrency and object-oriented programming.

In telecommunications and related engineering, the term timeout or time-out has several meanings, including:

Fault tolerance is the ability of a system to maintain proper operation despite failures or faults in one or more of its components. This capability is essential for high-availability, mission-critical, or even life-critical systems.

Replication in computing involves sharing information so as to ensure consistency between redundant resources, such as software or hardware components, to improve reliability, fault-tolerance, or accessibility.

Event-driven architecture (EDA) is a software architecture paradigm concerning the production and detection of events. Event-driven architectures are evolutionary in nature and provide a high degree of fault tolerance, performance, and scalability. However, they are complex and inherently challenging to test. EDAs are good for complex and dynamic workloads.

In computer science, state machine replication (SMR) or state machine approach is a general method for implementing a fault-tolerant service by replicating servers and coordinating client interactions with server replicas. The approach also provides a framework for understanding and designing replication management protocols.

In software engineering, a monolithic application is a single unified software application that is self-contained and independent from other applications, but typically lacks flexibility. There are advantages and disadvantages of building applications in a monolithic style of software architecture, depending on requirements. Monolith applications are relatively simple and have a low cost but their shortcomings are lack of elasticity, fault tolerance and scalability. Alternative styles to monolithic applications include multitier architectures, distributed computing and microservices. Despite their popularity in recent years, monolithic applications are still a good choice for applications with small team and little complexity. However, once it becomes too complex, you can consider refactoring it into microservices or a distributed application. Note that a monolithic application deployed on a single machine, may be performant enough for your current workload but it's less available, less durable, less changeable, less fine-tuned and less scalable than a well designed distributed system.

In computer science, fault injection is a testing technique for understanding how computing systems behave when stressed in unusual ways. This can be achieved using physical- or software-based means, or using a hybrid approach. Widely studied physical fault injections include the application of high voltages, extreme temperatures and electromagnetic pulses on electronic components, such as computer memory and central processing units. By exposing components to conditions beyond their intended operating limits, computing systems can be coerced into mis-executing instructions and corrupting critical data.

A clustered file system (CFS) is a file system which is shared by being simultaneously mounted on multiple servers. There are several approaches to clustering, most of which do not employ a clustered file system. Clustered file systems can provide features like location-independent addressing and redundancy which improve reliability or reduce the complexity of the other parts of the cluster. Parallel file systems are a type of clustered file system that spread data across multiple storage nodes, usually for redundancy or performance.

Domain-driven design (DDD) is a major software design approach, focusing on modeling software to match a domain according to input from that domain's experts. DDD is against the idea of having a single unified model; instead it divides a large system into bounded contexts, each of which have their own model.

Software fault tolerance is the ability of computer software to continue its normal operation despite the presence of system or hardware faults. Fault-tolerant software has the ability to satisfy requirements despite failures.

Raft is a consensus algorithm designed as an alternative to the Paxos family of algorithms. It was meant to be more understandable than Paxos by means of separation of logic, but it is also formally proven safe and offers some additional features. Raft offers a generic way to distribute a state machine across a cluster of computing systems, ensuring that each node in the cluster agrees upon the same series of state transitions. It has a number of open-source reference implementations, with full-specification implementations in Go, C++, Java, and Scala. It is named after Reliable, Replicated, Redundant, And Fault-Tolerant.

In software engineering, a microservice architecture is an architectural pattern that arranges an application as a collection of loosely coupled, fine-grained services, communicating through lightweight protocols. A microservice-based architecture enables teams to develop and deploy their services independently, reduce code interdependency and increase readability and modularity within a codebase. This is achieved by reducing several dependencies in the codebase, allowing developers to evolve their services with limited restrictions, and reducing additional complexity. Consequently, organizations can develop software with rapid growth and scalability, as well as implement off-the-shelf services more easily. These benefits come with the cost of needing to maintain a decoupled structure within the codebase, which means its initial implementation is more complex than that of a monolithic codebase. Interfaces need to be designed carefully and treated as APIs.

Cell-based Architecture (CBA) is a software design paradigm that structures applications as a collection of small, self-contained units called "cells." Each cell encapsulates specific functionality along with its own data, logic, and state, enabling independent development, deployment, and scaling. CBA is a holistic approach that combines aspects of application architecture, deployment architecture, and organizational (people/team) architecture. This integration aligns technical design with team structures, promoting modularity, agility, and efficient collaboration across development processes.

References

↑ Machine Learning in Microservices Productionizing Microservices Architecture for Machine Learning Solutions. Packt Publishing. 2023. ISBN 9781804612149.
↑ Richards, Mark. Microservices AntiPatterns and Pitfalls. O'Reilly.
↑ Kubernetes Native Microservices with Quarkus and MicroProfile. Manning. 2022. ISBN 9781638357155.
↑ Understanding Distributed Systems. ISBN 9781838430214.
↑ "Will circuit breakers solve my problems?".

External links

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[1] Machine Learning in Microservices Productionizing Microservices Architecture for Machine Learning Solutions. Packt Publishing. 2023. ISBN 9781804612149.

[:1-2] Richards, Mark. Microservices AntiPatterns and Pitfalls. O'Reilly.

[3] Kubernetes Native Microservices with Quarkus and MicroProfile. Manning. 2022. ISBN 9781638357155.

[4] Understanding Distributed Systems. ISBN 9781838430214.

[5] "Will circuit breakers solve my problems?".

[1]

[2]

[3]

[4]

[5]