Couchbase Server

Developer(s) Couchbase, Inc.
Initial release August 2010
Stable release 7.6.2 / July 19, 2024
Written in C++, Erlang, C, [1] Go, Java
Type Multi-model database, distributed key-value database, document-oriented database, JSON database
License BSL 1.1, [2] freemium
Website www.couchbase.com

Couchbase Server, originally known as Membase, is a source-available, [2] distributed (shared-nothing architecture) multi-model NoSQL document-oriented database software package optimized for interactive applications. These applications may serve many concurrent users by creating, storing, retrieving, aggregating, manipulating and presenting data. To support these application needs, Couchbase Server is designed to provide easy-to-scale key-value or JSON document access with low latency and high sustained throughput. It is designed to be clustered, from a single machine to very large-scale deployments spanning many machines.

Couchbase Server provided client protocol compatibility with memcached, [3] but added disk persistence, data replication, live cluster reconfiguration, rebalancing and multitenancy with data partitioning.
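
The combination of data partitioning and live rebalancing can be illustrated with a minimal sketch. It assumes keys are hashed to a fixed number of partitions and that a separate map assigns partitions to nodes; the CRC32 hash, the partition count, and the names `partition_for`, `assign_partitions` and `node_for` are illustrative assumptions, not Couchbase's actual mapping.

```python
import zlib

NUM_PARTITIONS = 16  # illustrative value, not taken from Couchbase


def partition_for(key):
    """Map a key to a partition; this never changes when nodes come or go."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def assign_partitions(nodes):
    """Spread partitions round-robin over the given nodes."""
    return {p: nodes[p % len(nodes)] for p in range(NUM_PARTITIONS)}


def node_for(key, partition_map):
    """Look up the node that currently owns the key's partition."""
    return partition_map[partition_for(key)]


two_nodes = assign_partitions(["node-a", "node-b"])
three_nodes = assign_partitions(["node-a", "node-b", "node-c"])

# Rebalancing only changes the partition-to-node map, so only the
# partitions whose owner changed need their data moved.
moved = [p for p in range(NUM_PARTITIONS) if two_nodes[p] != three_nodes[p]]
```

Because clients only need the current partition map to find a key, a cluster modelled this way can be reconfigured by publishing a new map once the affected partitions have been copied.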

Product history

Membase was developed by several leaders of the memcached project, who had founded a company, NorthScale, to develop a key-value store with the simplicity, speed, and scalability of memcached, but also the storage, persistence and querying capabilities of a database. The original Membase source code was contributed by NorthScale and project co-sponsors Zynga and Naver Corporation (then known as NHN) to a new project on membase.org in June 2010. [4]

On February 8, 2011, the Membase project founders and Membase, Inc. announced a merger with CouchOne (a company with many of the principal players behind CouchDB) with an associated project merger. The merged company was called Couchbase, Inc. In January 2012, Couchbase released Couchbase Server 1.8. In September of 2012, Orbitz said it had changed some of its systems to use Couchbase. [5] In December of 2012, Couchbase Server 2.0 (announced in July 2011) was released and included a new JSON document store, indexing and querying, incremental MapReduce and replication across data centers. [6] [7]

Architecture

Every Couchbase node consists of a data service, index service, query service, and cluster manager component. Starting with the 4.0 release, the three services can be distributed to run on separate nodes of the cluster if needed. In the parlance of Eric Brewer's CAP theorem, Couchbase is normally a CP-type system, meaning it provides consistency and partition tolerance, but it can also be set up as an AP system with multiple clusters.

Cluster manager

The cluster manager supervises the configuration and behavior of all the servers in a Couchbase cluster. It configures and supervises inter-node behavior like managing replication streams and re-balancing operations. It also provides metric aggregation and consensus functions for the cluster, and a RESTful cluster management interface. The cluster manager uses the Erlang programming language and the Open Telecom Platform.

Replication and fail-over

Data replication within the nodes of a cluster can be controlled with several parameters. In December of 2012, support was added for replication between different data centers. [6]

Data manager

The data manager stores and retrieves documents in response to data operations from applications. By default, it acknowledges a write to the client and then writes the data to disk asynchronously. In version 1.7 and later, applications can optionally ensure data is written to more than one server or to disk before a write is acknowledged to the client. Parameters define item ages that affect when data is persisted, and how maximum memory and migration from main memory to disk are handled. It supports working sets greater than a memory quota per "node" or "bucket". External systems can subscribe to filtered data streams, supporting, for example, full text search indexing, data analytics or archiving. [8]
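
The difference between the default asynchronous write and the optional stronger guarantees can be sketched as a toy simulation. Everything here is hypothetical: `ToyNode`, `write`, and the `replicate_to` and `persist_to_disk` parameters are illustrative names, and the model ignores networking, failures and background replication.

```python
from collections import deque


class ToyNode:
    """One server with an in-memory store and a queue of pending disk writes."""

    def __init__(self, name):
        self.name = name
        self.memory = {}
        self.disk = {}
        self.disk_queue = deque()

    def write_to_memory(self, key, value):
        self.memory[key] = value
        self.disk_queue.append((key, value))

    def flush(self):
        """Drain the queue to 'disk'; stands in for a background writer."""
        while self.disk_queue:
            key, value = self.disk_queue.popleft()
            self.disk[key] = value


def write(active, replicas, key, value, replicate_to=0, persist_to_disk=False):
    """Apply a write and report what was guaranteed when it was acknowledged."""
    if replicate_to > len(replicas):
        raise ValueError("not enough replicas for the requested durability")
    active.write_to_memory(key, value)
    # Only the replicas the caller chose to wait for are updated before the
    # acknowledgement; later background replication is omitted from this toy.
    for replica in replicas[:replicate_to]:
        replica.write_to_memory(key, value)
    if persist_to_disk:
        active.flush()
    return {"in_memory_copies": 1 + replicate_to, "on_disk": key in active.disk}
```

Calling `write(active, replicas, "k", "v")` returns as soon as the value is in the active node's memory, while passing `replicate_to=1` or `persist_to_disk=True` trades latency for a stronger guarantee at acknowledgement time.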

Data format

A document is the most basic unit of data manipulation in Couchbase Server. Documents are stored in JSON document format with no predefined schemas. Non-JSON documents (binary, serialized values, XML, etc.) can also be stored in Couchbase Server.

Object-managed cache

Couchbase Server includes a built-in multi-threaded object-managed cache that implements memcached-compatible APIs such as get, set, delete, append, and prepend.
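
The semantics of these operations can be sketched with a small in-process model. `ToyCache` is a hypothetical class written for illustration; it is not a Couchbase or memcached client, and it assumes the memcached convention that append and prepend only succeed on an existing key.

```python
class ToyCache:
    """Dictionary-backed model of get, set, delete, append and prepend."""

    def __init__(self):
        self._items = {}

    def set(self, key, value):
        # Store the value unconditionally, replacing any existing one.
        self._items[key] = value
        return True

    def get(self, key):
        # A miss is reported as None.
        return self._items.get(key)

    def delete(self, key):
        if key in self._items:
            del self._items[key]
            return True
        return False

    def append(self, key, suffix):
        # Fails on a missing key rather than creating it.
        if key not in self._items:
            return False
        self._items[key] = self._items[key] + suffix
        return True

    def prepend(self, key, prefix):
        if key not in self._items:
            return False
        self._items[key] = prefix + self._items[key]
        return True
```

For example, after `set("k", b"b")`, `append("k", b"c")` and `prepend("k", b"a")`, a `get("k")` on this model returns `b"abc"`.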

Storage engine

Couchbase Server has a tail-append storage design intended to make it immune to data corruption, OOM killers or sudden loss of power. Data is written to the data file in an append-only manner, which enables Couchbase to perform mostly sequential writes for updates and provides an optimized access pattern for disk I/O.
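
The append-only idea can be illustrated with a minimal sketch, assuming length-prefixed JSON records and an in-memory index of offsets. The `AppendOnlyStore` class and its record layout are hypothetical and do not reflect Couchbase's actual file format.

```python
import io
import json
import struct


class AppendOnlyStore:
    """Toy append-only key-value file: updates never overwrite old bytes."""

    def __init__(self, stream=None):
        # Any seekable binary stream works; BytesIO stands in for a data file.
        self.stream = stream if stream is not None else io.BytesIO()
        self.index = {}  # key -> offset of the latest record for that key

    def put(self, key, value):
        record = json.dumps({"key": key, "value": value}).encode("utf-8")
        self.stream.seek(0, io.SEEK_END)
        offset = self.stream.tell()
        # 4-byte big-endian length prefix followed by the record body.
        self.stream.write(struct.pack(">I", len(record)) + record)
        self.index[key] = offset
        return offset

    def get(self, key):
        self.stream.seek(self.index[key])
        (length,) = struct.unpack(">I", self.stream.read(4))
        return json.loads(self.stream.read(length))["value"]

    def compact(self):
        """Copy only the latest record per key into a fresh store."""
        fresh = AppendOnlyStore()
        for key in self.index:
            fresh.put(key, self.get(key))
        return fresh
```

In this model an update is a sequential write at the end of the file, and an interrupted write can only damage the tail, never an earlier record. The cost is that stale records accumulate until a compaction step such as `compact` rewrites the live data.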

Performance

A performance benchmark done by Altoros in 2012 compared Couchbase Server with other technologies. [9] In 2012, Cisco Systems published a benchmark that measured the latency and throughput of Couchbase Server with a mixed workload. [10]

Licensing and support

Couchbase Server is a packaged version of Couchbase's open source software technology. It is available in a community edition, which does not include recent bug fixes, under an Apache 2.0 license, [11] and in an edition for commercial use. [12] Couchbase Server builds are available for Ubuntu, Debian, Red Hat, SUSE, Oracle Linux, Microsoft Windows and macOS operating systems.

Couchbase provides supported software development kits for the programming languages .NET, PHP, Ruby, Python, C, Node.js, Java, Go, and Scala.

SQL++

A query language called SQL++ (formerly called N1QL) is used for manipulating the JSON data in Couchbase, just as SQL manipulates data in an RDBMS. It has SELECT, INSERT, UPDATE, DELETE, and MERGE statements to operate on JSON data. It was initially announced in March 2015 as "SQL for documents". [13]

The SQL++ data model is non-first normal form (N1NF) with support for nested attributes and domain-oriented normalization. The SQL++ data model is also a proper superset and generalization of the relational model.

Example

{"email":"testme@example.org","friends":[{"name":"Pavan"},{"name":"Ravi"}]}
Like query
SELECT * FROM `bucket` WHERE email LIKE "%@example.org";
Array query
SELECT * FROM `bucket` WHERE ANY x IN friends SATISFIES x.name = "Pavan" END;
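
To show what the two queries select, the following sketch applies equivalent filters to a list of documents in plain Python. It does not use a Couchbase SDK; the second document and the helper names `like_suffix` and `any_friend_named` are hypothetical additions for illustration.

```python
import json

documents = [
    json.loads(
        '{"email":"testme@example.org",'
        '"friends":[{"name":"Pavan"},{"name":"Ravi"}]}'
    ),
    # Hypothetical second document that neither query should match.
    {"email": "other@example.com", "friends": [{"name": "Ravi"}]},
]


def like_suffix(docs, suffix):
    """Equivalent of: WHERE email LIKE "%<suffix>"."""
    return [d for d in docs if d.get("email", "").endswith(suffix)]


def any_friend_named(docs, name):
    """Equivalent of: WHERE ANY x IN friends SATISFIES x.name = <name> END."""
    return [
        d for d in docs
        if any(friend.get("name") == name for friend in d.get("friends", []))
    ]


like_matches = like_suffix(documents, "@example.org")
array_matches = any_friend_named(documents, "Pavan")
```

Both filters select only the example document above: the first because its email ends in `@example.org`, the second because at least one element of its `friends` array has the name `Pavan`.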

Couchbase Mobile

Couchbase Mobile / Couchbase Lite is a mobile database providing data replication. [14]

Couchbase Lite (originally TouchDB) provides native libraries for offline-first NoSQL databases with built-in peer-to-peer or client-server replication mechanisms. [15] Sync Gateway manages secure access and synchronization of data between Couchbase Lite and Couchbase Server. [16]

Couchbase Lite added support for vector search in version 3.2, [17] allowing cloud-to-edge support for vector search in mobile applications.

Uses

Couchbase began as an evolution of Memcached, a high-speed data cache, and can be used as a drop-in replacement for Memcached, providing high availability for Memcached applications without code changes. [18]

Couchbase is used to support applications where a flexible data model, easy scalability, and consistent high performance are required, such as tracking real-time user activity or providing a store of user preferences or online applications. [19]

Couchbase Mobile, which stores data locally on devices (usually mobile devices) is used to create “offline-first” applications that can operate when a device is not connected to a network and synchronize with Couchbase Server once a network connection is re-established. [20]

The Catalyst Lab at Northwestern University uses Couchbase Mobile to support the Evo application, a healthy lifestyle research program where data is used to help participants improve dietary quality, physical activity, stress, or sleep. [21]

Amadeus uses Couchbase with Apache Kafka to support their “open, simple, and agile” strategy to consume and integrate data on loyalty programs for airline and other travel partners. High scalability is needed when disruptive travel events create a need to recognize and compensate high value customers. [22]

Starting in 2012, Couchbase played a role in LinkedIn's caching systems, including backend caching for recruiter and jobs products, counters for security defense mechanisms, and internal applications. [23]

Alternatives

For caching, Couchbase competes with Memcached and Redis. For document databases, Couchbase competes with other document-oriented database systems. It is commonly compared with MongoDB, Amazon DynamoDB, Oracle RDBMS, DataStax, Google Bigtable, MariaDB, IBM Cloudant, Redis Enterprise, SingleStore, and MarkLogic. [24] [25]

References

  1. Damien Katz (January 8, 2013). "The Unreasonable Effectiveness of C". Retrieved September 30, 2016.
  2. "Couchbase Adopts BSL License". The Couchbase Blog. 26 March 2021.
  3. "NewProtocols - memcached - Klingon - Memcached - Google Project Hosting". 2011-08-22. Retrieved 2013-06-04.
  4. Shashank Tiwari (31 August 2011). Professional NoSQL. John Wiley & Sons. pp. 15–16. ISBN 9781118167809.
  5. "Balancing Oracle and open source at Orbitz". GigaOM. September 21, 2012. Retrieved September 19, 2016.
  6. Andrew Brust (December 12, 2012). "Couchbase 2.0 released; implements JSON document store". ZDNet.
  7. Derrick Harris (July 29, 2011). "Couchbase goes 2.0, pushes SQL for NoSQL". GigaOm. Archived from the original on October 2, 2016. Retrieved September 19, 2016.
  8. Trond Norbye (March 15, 2010). "Want to know what your memcached servers are doing? Tap them". Couchbase blog.
  9. Frank Weigel (October 30, 2012). "Benchmarking Couchbase". Couchbase. Retrieved September 30, 2016.
  10. "Cisco and Solarflare Achieve Dramatic Latency Reduction for Interactive Web Applications with Couchbase, a NoSQL Database" (PDF). Cisco Systems. June 18, 2012. Archived from the original (PDF) on August 13, 2012. Retrieved October 7, 2016.
  11. "Couchbase Open Source Projects". Couchbase web site. Retrieved October 7, 2016.
  12. "Couchbase Server Editions". Couchbase. Archived from the original on 2012-12-27. Retrieved 2012-12-07.
  13. Andy Slater (March 24, 2015). "Ssssh! don't tell anyone but Couchbase is a serious contender: Couchbase Live Europe 2015". Retrieved February 13, 2018.
  14. "DB-Engines: Couchbase including Mobile". DB-Engines. Archived from the original on 2013-07-29. Retrieved 29 June 2021.
  15. "Lite | Couchbase". www.couchbase.com. Retrieved 11 May 2020.
  16. "Sync Gateway Couchbase". DB-Engines. Archived from the original on 2013-07-29. Retrieved 29 June 2021.
  17. "Vector Search | Couchbase". www.couchbase.com.
  18. Jaquier, Yannick (2016-09-27). "Couchbase server as a Memcached cluster (part 2)". IT World. Retrieved 2022-02-09.
  19. "Introduction to Couchbase - NoSQL Document Database". Today Software Magazine. Retrieved 2022-02-09.
  20. "Couchbase Mobile". DEV Community. 6 February 2022. Retrieved 2022-02-09.
  21. "How Northwestern's Catalyst Lab scales healthy behavior program with Couchbase". VentureBeat. 2021-12-31. Retrieved 2022-02-09.
  22. "Amadeus Loyalty wins the Couchbase Community Award under the Cloud Computing Category". Amadeus IT Group. January 20, 2022.
  23. Michael Kehoe (December 6, 2017). "Couchbase Ecosystem at LinkedIn". engineering.linkedin.com. Retrieved 2022-02-09.
  24. Gartner, Inc. "Top Couchbase Competitors and Alternatives - Gartner 2022 - Cloud Database Management Systems". Gartner. Retrieved 2022-02-09.
  25. "MongoDB to Couchbase: An Introduction to Developers and Experts - DZone Database". dzone.com. Retrieved 2022-02-09.