DBOS

Last updated

DBOS (Database-Oriented Operating System) is a database-oriented operating system meant to simplify and improve the scalability, security and resilience of large-scale distributed applications. [1] [2] It started in 2020 as a joint open source project with MIT, Stanford and Carnegie Mellon University, after a brainstorm between Michael Stonebraker and Matei Zaharia on how to scale and improve scheduling and performance of millions of Apache Spark tasks. [2]

Contents

The basic idea is to run a multi-node multi-core, transactional, highly-available distributed database, such as VoltDB, as the only application for a microkernel, and then to implement scheduling, messaging, file systems and other operating system services on top of the database.

The architectural philosophy is described by this quote from the abstract of their initial preprint:

All operating system state should be represented uniformly as database tables, and operations on this state should be made via queries from otherwise stateless tasks. This design makes it easy to scale and evolve the OS without whole-system refactoring, inspect and debug system state, upgrade components without downtime, manage decisions using machine learning, and implement sophisticated security features. [3]

Stonebraker claims a variety of security benefits, from a "smaller, less porous attack surface", to the ability to log and analyze how the system state changes in real-time due to the transactional nature of the OS. [1] Recovery from a severe bug or an attack can be as simple as rolling back the database to a previous state. And since the database is already distributed, the complexity of orchestration systems like Kubernetes can be avoided.

A prototype was built with competitive performance to existing systems. [4]

DBOS Cloud

In March of 2024, DBOS Cloud became the first commercial service from DBOS Inc. It provides transactional Functions as a Service (FaaS), and is positioned as a competitor to serverless computing architectures like AWS Lambda. DBOS Cloud is currently based on FoundationDB, a fast ACID NoSQL database, running on the Firecracker microVM service from AWS. It provides built-in support for features like multinode scaling and a "time-traveler" debugger that can help track down elusive heisenbugs and works in Visual Studio Code. Another feature is reliable execution, allowing a program to continue running even if the operating system needs to be restarted, and ensuring that no work is repeated. [5]

Firecracker runs on stripped down Linux microkernel via a stripped down KVM hypervisor, so parts of the Linux kernel are still under the covers, but work is ongoing to eliminate them. [6]

DBOS Cloud has been tested running across 1,000 cores running applications. The first API provided is for TypeScript, via the open-source DBOS Transact framework. [6] It provides a runtime with built-in reliable message delivery and idempotency. [7]

Holger Mueller of Constellation Research wondered how well DBOS the company can scale. “Will a small team at DBOS be able to run an OS, database, observability, workflow and cyber stack as good as the combination of the best of breed vendors?” [8]

See also

Related Research Articles

<span class="mw-page-title-main">Microkernel</span> Kernel that provides fewer services than a traditional kernel

In computer science, a microkernel is the near-minimum amount of software that can provide the mechanisms needed to implement an operating system (OS). These mechanisms include low-level address space management, thread management, and inter-process communication (IPC).

Mach is an operating system kernel developed at Carnegie Mellon University by Richard Rashid and Avie Tevanian to support operating system research, primarily distributed and parallel computing. Mach is often considered one of the earliest examples of a microkernel. However, not all versions of Mach are microkernels. Mach's derivatives are the basis of the operating system kernel in GNU Hurd and of Apple's XNU kernel used in macOS, iOS, iPadOS, tvOS, and watchOS.

<span class="mw-page-title-main">IBM Db2</span> Relational model database server

Db2 is a family of data management products, including database servers, developed by IBM. It initially supported the relational model, but was extended to support object–relational features and non-relational structures like JSON and XML. The brand name was originally styled as DB2 until 2017, when it changed to its present form.

A database transaction symbolizes a unit of work, performed within a database management system against a database, that is treated in a coherent and reliable way independent of other transactions. A transaction generally represents any change in a database. Transactions in a database environment have two main purposes:

  1. To provide reliable units of work that allow correct recovery from failures and keep a database consistent even in cases of system failure. For example: when execution prematurely and unexpectedly stops in which case many operations upon a database remain uncompleted, with unclear status.
  2. To provide isolation between programs accessing a database concurrently. If this isolation is not provided, the programs' outcomes are possibly erroneous.
<span class="mw-page-title-main">FOSDEM</span> Annual event in Brussels centered on free and open source software development

Free and Open source Software Developers' European Meeting (FOSDEM) is a non-commercial, volunteer-organized European event centered on free and open-source software development. It is aimed at developers and anyone interested in the free and open-source software movement. It aims to enable developers to meet and to promote the awareness and use of free and open-source software.

K42 is a discontinued open-source research operating system (OS) for cache-coherent 64-bit multiprocessor systems. It was developed primarily at IBM Thomas J. Watson Research Center in collaboration with the University of Toronto and University of New Mexico. The main focus of this OS is to address performance and scalability issues of system software on large-scale, shared memory, non-uniform memory access (NUMA) multiprocessing computers.

In computing, a solution stack or software stack is a set of software subsystems or components needed to create a complete platform such that no additional software is needed to support applications. Applications are said to "run on" or "run on top of" the resulting platform.

<span class="mw-page-title-main">Amazon Elastic Compute Cloud</span> Cloud computing platform

Amazon Elastic Compute Cloud (EC2) is a part of Amazon's cloud-computing platform, Amazon Web Services (AWS), that allows users to rent virtual computers on which to run their own computer applications. EC2 encourages scalable deployment of applications by providing a web service through which a user can boot an Amazon Machine Image (AMI) to configure a virtual machine, which Amazon calls an "instance", containing any software desired. A user can create, launch, and terminate server-instances as needed, paying by the second for active servers – hence the term "elastic". EC2 provides users with control over the geographical location of instances that allows for latency optimization and high levels of redundancy. In November 2010, Amazon switched its own retail website platform to EC2 and AWS.

<span class="mw-page-title-main">Kernel (operating system)</span> Core of a computer operating system

A kernel is a computer program at the core of a computer's operating system that always has complete control over everything in the system. The kernel is also responsible for preventing and mitigating conflicts between different processes. It is the portion of the operating system code that is always resident in memory and facilitates interactions between hardware and software components. A full kernel controls all hardware resources via device drivers, arbitrates conflicts between processes concerning such resources, and optimizes the utilization of common resources e.g. CPU & cache usage, file systems, and network sockets. On most systems, the kernel is one of the first programs loaded on startup. It handles the rest of startup as well as memory, peripherals, and input/output (I/O) requests from software, translating them into data-processing instructions for the central processing unit.

<span class="mw-page-title-main">Michael Stonebraker</span> American computer scientist (born 1943)

Michael Ralph Stonebraker is an American computer scientist specializing in database systems. Through a series of academic prototypes and commercial startups, Stonebraker's research and products are central to many relational databases. He is also the founder of many database companies, including Ingres Corporation, Illustra, Paradigm4, StreamBase Systems, Tamr, Vertica and VoltDB, and served as chief technical officer of Informix. For his contributions to database research, Stonebraker received the 2014 Turing Award, often described as "the Nobel Prize for computing."

<span class="mw-page-title-main">Vertica</span> Software company

Vertica is an analytic database management software company. Vertica was founded in 2005 by the database researcher Michael Stonebraker with Andrew Palmer as the founding CEO. Ralph Breslauer and Christopher P. Lynch served as CEOs later on.

Redis is a source-available, in-memory storage, used as a distributed, in-memory key–value database, cache and message broker, with optional durability. Because it holds all data in memory and because of its design, Redis offers low-latency reads and writes, making it particularly suitable for use cases that require a cache. Redis is the most popular NoSQL database, and one of the most popular databases overall. Redis is used in companies like Twitter, Airbnb, Tinder, Yahoo, Adobe, Hulu, Amazon and OpenAI.

<span class="mw-page-title-main">Ion Stoica</span> Romanian–American computer scientist

Ion Stoica is a Romanian–American computer scientist specializing in distributed systems, cloud computing and computer networking. He is a professor of computer science at the University of California, Berkeley and co-director of AMPLab. He co-founded Conviva and Databricks with other original developers of Apache Spark.

<span class="mw-page-title-main">OpenShift</span> Cloud computing software

OpenShift is a family of containerization software products developed by Red Hat. Its flagship product is the OpenShift Container Platform — a hybrid cloud platform as a service built around Linux containers orchestrated and managed by Kubernetes on a foundation of Red Hat Enterprise Linux. The family's other products provide this platform through different environments: OKD serves as the community-driven upstream, Several deployment methods are available including self-managed, cloud native under ROSA, ARO and RHOIC on AWS, Azure, and IBM Cloud respectively, OpenShift Online as software as a service, and OpenShift Dedicated as a managed service.

NewSQL is a class of relational database management systems that seek to provide the scalability of NoSQL systems for online transaction processing (OLTP) workloads while maintaining the ACID guarantees of a traditional database system.

<span class="mw-page-title-main">Apache Mesos</span> Software to manage computer clusters

Apache Mesos is an open-source project to manage computer clusters. It was developed at the University of California, Berkeley.

<span class="mw-page-title-main">AWS Lambda</span> Serverless computing platform

AWS Lambda is an event-driven, serverless Function as a Service (FaaS) provided by Amazon as a part of Amazon Web Services. It is designed to enable developers to run code without provisioning or managing servers. It executes code in response to events and automatically manages the computing resources required by that code. It was introduced on November 13, 2014.

<span class="mw-page-title-main">Genode</span> Free and open-source software operating system

Genode is a free and open-source software operating system (OS) framework consisting of a microkernel abstraction layer and a set of user space components. The framework is notable as one of the few open-source operating systems not derived from a proprietary OS, such as Unix. The characteristic design philosophy is that a small trusted computing base is of primary concern in a security-oriented OS.

<span class="mw-page-title-main">Valkey</span> Open source in-memory key–value database

Valkey is an open-source in-memory storage, used as a distributed, in-memory key–value database, cache and message broker, with optional durability. Because it holds all data in memory and because of its design, Valkey offers low-latency reads and writes, making it particularly suitable for use cases that require a cache. Valkey is a successor to Redis, the most popular NoSQL database, and one of the most popular databases overall. Valkey or its predecessor Redis are used in companies like Twitter, Airbnb, Tinder, Yahoo, Adobe, Hulu, Amazon and OpenAI.

References

  1. 1 2 Werner, John. "Put The OS In The Database: Performance, Cybersecurity And Endurance In The Cloud". Forbes. Retrieved 2023-12-27.
  2. 1 2 Clark, Lindsay. "Postgres pioneer promises to upend the database once more". www.theregister.com. Retrieved 2023-12-27.
  3. Cafarella, Michael; DeWitt, David; Gadepally, Vijay; Kepner, Jeremy; Kozyrakis, Christos; Kraska, Tim; Stonebraker, Michael; Zaharia, Matei (2020-07-21), DBOS: A Proposal for a Data-Centric Operating System, arXiv: 2007.11112
  4. Skiadopoulos, Athinagoras; Li, Qian; Kraft, Peter; Kaffes, Kostis; Hong, Daniel; Mathew, Shana; Bestor, David; Cafarella, Michael; Gadepally, Vijay; Graefe, Goetz; Kepner, Jeremy; Kozyrakis, Christos; Kraska, Tim; Stonebraker, Michael; Suresh, Lalith (2021-09-01). "DBOS: a DBMS-oriented operating system". Proceedings of the VLDB Endowment. 15 (1): 21–30. doi:10.14778/3485450.3485454. hdl: 1721.1/143731 . ISSN   2150-8097. S2CID   245827586.
  5. Wayne Williams (2024-03-17). "'What if the operating system is the problem': Linux was never created for the cloud — so engineers developed DBOS, a new operating system that is part OS, part database". TechRadar. Retrieved 2024-04-22.
  6. 1 2 Morgan, Timothy Prickett (2024-03-12). "The Cloud Outgrows Linux, And Sparks A New Operating System". The Next Platform. Retrieved 2024-04-22.
  7. dbos-inc/dbos-transact, DBOS, Inc., 2024-04-22, retrieved 2024-04-22
  8. Ghoshal, Anirban (2024-03-12). "DBOS Cloud overturns database-on-OS conventions for speed". InfoWorld. Retrieved 2024-04-22.