Apache CouchDB

Original author(s)	Damien Katz, Jan Lehnardt, Naomi Slater, Christopher Lenz, J. Chris Anderson, Paul Davis, Adam Kocoloski, Jason Davies, Benoît Chesneau, Filipe Manana, Robert Newson
Developer(s)	Apache Software Foundation
Initial release	2005; 19 years ago
Stable release	3.3.3 / 4 December 2023; 8 months ago
Repository	github.com/apache/couchdb
Written in	Erlang, JavaScript, C, C++
Operating system	Cross-platform
Type	Document-oriented database
License	Apache License 2.0
Website	couchdb.apache.org

Last updated August 05, 2024

Apache CouchDB is an open-source document-oriented NoSQL database, implemented in Erlang.

Unlike a relational database, a CouchDB database does not store data and relationships in tables. Instead, each database is a collection of independent documents. Each document maintains its own data and self-contained schema. An application may access multiple databases, such as one stored on a user's mobile phone and another on a server. Document metadata contains revision information, making it possible to merge any differences that may have occurred while the databases were disconnected.

CouchDB implements a form of multiversion concurrency control (MVCC) so it does not lock the database file during writes. Conflicts are left to the application to resolve. Resolving a conflict generally involves first merging data into one of the documents, then deleting the stale one.^[3]
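The revision check behind this behaviour can be illustrated with a small in-memory model. The sketch below is a toy, not CouchDB's implementation: the `TinyStore` class, its `put` method, and the revision format (a counter plus a short content hash) are assumptions made for illustration. Every write names the revision it is based on, and a write based on a stale revision is rejected rather than made to wait on a lock.

```python
import hashlib
import json


class ConflictError(Exception):
    """Raised when a write is based on a revision that is no longer current."""


class TinyStore:
    """Toy in-memory document store with revision checks (illustrative only)."""

    def __init__(self):
        self._docs = {}  # doc_id -> (rev, body)

    def get(self, doc_id):
        rev, body = self._docs[doc_id]
        return {"_id": doc_id, "_rev": rev, **body}

    def put(self, doc_id, body, rev=None):
        current = self._docs.get(doc_id)
        current_rev = current[0] if current else None
        # The caller must present the revision it read; otherwise the write is stale.
        if rev != current_rev:
            raise ConflictError(f"{doc_id}: expected {current_rev}, got {rev}")
        generation = 1 if current_rev is None else int(current_rev.split("-")[0]) + 1
        payload = json.dumps(body, sort_keys=True).encode("utf-8")
        new_rev = f"{generation}-{hashlib.sha256(payload).hexdigest()[:8]}"
        self._docs[doc_id] = (new_rev, dict(body))
        return new_rev


store = TinyStore()
rev1 = store.put("page", {"text": "draft"})
rev2 = store.put("page", {"text": "edited by A"}, rev=rev1)
try:
    # Writer B still holds rev1, so this update is rejected.
    store.put("page", {"text": "edited by B"}, rev=rev1)
    outcome = "accepted"
except ConflictError:
    outcome = "conflict"
```

In this model writer B would have to re-read the document, merge its change into the current revision, and retry, which mirrors the application-level resolution described above.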

Other features include document-level ACID semantics with eventual consistency, (incremental) MapReduce, and (incremental) replication. One of CouchDB's distinguishing features is multi-master replication, which allows it to scale across machines to build high-performance systems. A built-in Web application called Fauxton (formerly Futon) helps with administration.

History

Couch is an acronym for cluster of unreliable commodity hardware.^[4] The CouchDB project was created in April 2005 by Damien Katz, a former Lotus Notes developer at IBM. He self-funded the project for almost two years and released it as an open-source project under the GNU General Public License.

In February 2008, it became an Apache Incubator project and was offered under the Apache License instead.^[5] A few months later, it graduated to a top-level project.^[6] This led to the release of the first stable version in July 2010.^[7]

In early 2012, Katz left the project to focus on Couchbase Server.^[8]

Since Katz's departure, the Apache CouchDB project has continued, releasing 1.2 in April 2012 and 1.3 in April 2013. In July 2013, the CouchDB community merged the codebase for BigCouch, Cloudant's clustered version of CouchDB, into the Apache project.^[9] The BigCouch clustering framework is included in the current release of Apache CouchDB.^[10]

Version 2.0.0 introduced native clustering and the Mango Query Server, which provides a simple JSON-based way to perform CouchDB queries without JavaScript or MapReduce. The same version introduced Fauxton, a new built-in web interface that replaced the older Futon.^[11]
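To give a sense of what a JSON-based query looks like, the sketch below builds a Mango-style query body and evaluates its selector against a few in-memory documents. The `matches` helper is hypothetical and covers only implicit equality and a handful of comparison operators on top-level fields; a real server evaluates selectors itself when the body is posted to a database's `_find` endpoint. The sample documents are invented for the example.

```python
import operator

# Small subset of comparison operators, mapped to Python equivalents.
OPERATORS = {
    "$eq": operator.eq,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
}


def matches(doc, selector):
    """Return True if doc satisfies every condition in the selector (toy subset)."""
    for field, condition in selector.items():
        if field not in doc:
            return False
        if not isinstance(condition, dict):
            condition = {"$eq": condition}  # a bare value means equality
        for op, operand in condition.items():
            if not OPERATORS[op](doc[field], operand):
                return False
    return True


# Query body in the shape accepted by the _find endpoint.
query = {"selector": {"type": "page", "rating": {"$gte": 4}}, "limit": 10}

docs = [
    {"_id": "a", "type": "page", "rating": 5},
    {"_id": "b", "type": "page", "rating": 3},
    {"_id": "c", "type": "comment"},
]
hits = [doc["_id"] for doc in docs if matches(doc, query["selector"])][: query["limit"]]
```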

Main features

ACID Semantics: CouchDB provides ACID semantics.^[12] It does this by implementing a form of Multi-Version Concurrency Control, meaning that CouchDB can handle a high volume of concurrent readers and writers without conflict.
Built for Offline: CouchDB can replicate to devices (such as smartphones) that may go offline, and it synchronizes the data once the device is back online.
Distributed Architecture with Replication: CouchDB was designed with bi-directional replication (or synchronization) and off-line operation in mind. That means multiple replicas can have their own copies of the same data, modify it, and then sync those changes at a later time.
Document Storage: CouchDB stores data as "documents", as one or more field/value pairs expressed as JSON. Field values can be simple things like strings, numbers, or dates; but ordered lists and associative arrays can also be used. Every document in a CouchDB database has a unique id and there is no required document schema.
Eventual Consistency: CouchDB guarantees eventual consistency to be able to provide both availability and partition tolerance.
Map/Reduce Views and Indexes: The stored data is structured using views. In CouchDB, each view is constructed by a JavaScript function that acts as the Map half of a map/reduce operation. The function takes a document and emits zero or more key/value rows derived from it. CouchDB can index views and keep those indexes updated as documents are added, removed, or updated.
HTTP API: All items have a unique URI that gets exposed via HTTP. It uses the HTTP methods POST, GET, PUT and DELETE for the four basic CRUD (Create, Read, Update, Delete) operations on all resources.
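The replication behaviour listed above (independent copies that are edited offline and synchronized later) can be sketched with a toy model. Everything here is an assumption made for illustration, including the `Replica` class, the `replicate` and `sync` helpers, and the rule used to pick a winning revision; it is not CouchDB's replication protocol. Each replica keeps every revision with a pointer to its parent, so two edits made from the same parent show up as two leaf revisions after a sync.

```python
import hashlib
import json


class Replica:
    """Toy replica that stores each revision together with its parent revision."""

    def __init__(self):
        self.revs = {}  # doc_id -> {rev: (parent_rev, body)}

    def write(self, doc_id, body, parent=None):
        history = self.revs.setdefault(doc_id, {})
        generation = 1 if parent is None else int(parent.split("-")[0]) + 1
        payload = json.dumps([parent, body], sort_keys=True).encode("utf-8")
        rev = f"{generation}-{hashlib.sha256(payload).hexdigest()[:8]}"
        history[rev] = (parent, body)
        return rev

    def leaves(self, doc_id):
        """Revisions that no other revision names as its parent."""
        history = self.revs.get(doc_id, {})
        parents = {parent for parent, _ in history.values()}
        return sorted(rev for rev in history if rev not in parents)

    def winner(self, doc_id):
        # Deterministic choice so that every replica picks the same leaf.
        return max(self.leaves(doc_id), key=lambda rev: (int(rev.split("-")[0]), rev))


def replicate(source, target):
    """Copy every revision the target does not have yet (one direction)."""
    for doc_id, history in source.revs.items():
        target_history = target.revs.setdefault(doc_id, {})
        for rev, entry in history.items():
            target_history.setdefault(rev, entry)


def sync(a, b):
    """Bi-directional replication."""
    replicate(a, b)
    replicate(b, a)


phone, server = Replica(), Replica()
base = server.write("note", {"text": "hello"})
sync(phone, server)

# Both sides edit the same document while disconnected.
phone_rev = phone.write("note", {"text": "hello from phone"}, parent=base)
server_rev = server.write("note", {"text": "hello from server"}, parent=base)
sync(phone, server)

conflicting = phone.leaves("note")  # two leaves: the edits conflict
```

After the second sync both replicas hold the same two leaf revisions and agree on the same winner, so an application can merge the losing edit into the winning document at its leisure.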

CouchDB also offers a built-in administration interface accessible via Web called Fauxton.^[13]

Use cases and production deployments

The replication and synchronization capabilities of CouchDB make it well suited to mobile devices, where a network connection is not guaranteed and the application must keep working offline.

CouchDB is well suited for applications with accumulating, occasionally changing data, on which pre-defined queries are to be run and where versioning is important (CRM and CMS systems, for example). Master-master replication is an especially interesting feature, allowing easy multi-site deployments.^[14]

Users

Users of CouchDB include:

CERN uses CouchDB as the database for the Data Management System at the Large Hadron Collider.^[15]
The Red Cross uses the application iDAT for completing casework electronically in disaster areas. Here CouchDB is used as a multi-node, peer-to-peer, offline-first database.^[16]
IBM Cloud services are based at a fundamental level on CouchDB.^[17]
United Airlines uses CouchDB for the in-flight entertainment systems in over 3,000 planes.^[18]^[19]
Amadeus IT Group, for some of their back-end systems.^[citation needed]
Credit Suisse, for internal use in its commodities department for their marketplace framework.^[20]^[better source needed]
Meebo, for their social platform (Web and applications).^[citation needed] Meebo was acquired by Google and most products were shut down on July 12, 2012.^[21]
npm uses CouchDB as a replicating database for their package registry.^[22]
Sophos, for some of their back-end systems.^[citation needed]
BBC, for a dynamic CMS platform.^[23]
Canonical began using it in 2009 for its synchronization service "Ubuntu One",^[24] but stopped using it in November 2011.^[25]
CANAL+, for the international on-demand platform at CANAL+ Overseas.
Protogrid, as the storage back-end for their rapid application development framework.^[26]

Data manipulation: documents and views

CouchDB manages a collection of JSON documents. The documents are organised via views. Views are defined with aggregate functions and filters, and are computed in parallel, much like MapReduce.

Views are generally stored in the database and their indexes are updated continuously. CouchDB supports a view system using external socket servers and a JSON-based protocol.^[27] As a consequence, view servers have been developed in a variety of languages (JavaScript is the default, but there are also PHP, Ruby, Python and Erlang).
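The idea of a view can be sketched in a few lines. The example below is written in Python for illustration only (CouchDB's default view server runs JavaScript), and the names `map_tags`, `build_view`, and `reduce_sum`, as well as the sample documents, are hypothetical. A map function may emit any number of key/value rows per document; the rows are kept sorted by key, and a reduce step aggregates the values that share a key.

```python
from collections import defaultdict

docs = [
    {"_id": "a", "type": "page", "rating": 5, "tags": ["db", "erlang"]},
    {"_id": "b", "type": "page", "rating": 3, "tags": ["db"]},
    {"_id": "c", "type": "comment", "text": "nice"},
]


def map_tags(doc):
    """Map step: emit one (tag, rating) row per tag; other documents emit nothing."""
    if doc.get("type") == "page":
        for tag in doc["tags"]:
            yield tag, doc["rating"]


def build_view(documents, map_fn):
    """Run the map function over every document and keep the rows sorted by key."""
    rows = []
    for doc in documents:
        for key, value in map_fn(doc):
            rows.append((key, doc["_id"], value))
    rows.sort()
    return rows


def reduce_sum(rows):
    """Reduce step: sum the emitted values for each key."""
    totals = defaultdict(int)
    for key, _doc_id, value in rows:
        totals[key] += value
    return dict(totals)


rows = build_view(docs, map_tags)
totals = reduce_sum(rows)
```

This sketch rebuilds the whole index on every call; a real view index is updated incrementally as documents change, as the text above notes.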

Accessing data via HTTP

Applications interact with CouchDB via HTTP. The following demonstrates a few examples using cURL, a command-line utility. These examples assume that CouchDB is running on localhost (127.0.0.1) on port 5984.

Action	Request	Response
Accessing server information	curl http://127.0.0.1:5984/	{"couchdb":"Welcome","version":"1.1.0"}
Creating a database named wiki	curl -X PUT http://127.0.0.1:5984/wiki	{"ok":true}
Attempting to create a second database named wiki	curl -X PUT http://127.0.0.1:5984/wiki	{"error":"file_exists","reason":"The database could not be created, the file already exists."}
Retrieve information about the wiki database	curl http://127.0.0.1:5984/wiki	{"db_name":"wiki","doc_count":0,"doc_del_count":0,"update_seq":0,"purge_seq":0,"compact_running":false,"disk_size":79,"instance_start_time":"1272453873691070","disk_format_version":5}
Delete the database wiki	curl -X DELETE http://127.0.0.1:5984/wiki	{"ok":true}
Create a document, asking CouchDB to supply a document id	curl -X POST -H "Content-Type: application/json" --data '{ "text" : "Wikipedia on CouchDB", "rating": 5 }' http://127.0.0.1:5984/wiki	{"ok":true,"id":"123BAC","rev":"946B7D1C"}
Get a list of databases	curl http://127.0.0.1:5984/_all_dbs	["_replicator","_users","wiki"]
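The same requests can be prepared from a program. The following is a minimal sketch using only Python's standard library and the host and port assumed in the table above; the helper names `couch_request` and `send` are hypothetical. Building a request does not contact the server; only `send` does, and it is left uncalled here. Recent CouchDB releases generally require credentials, which this sketch omits.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:5984"  # same host and port as in the table above


def couch_request(method, path, body=None):
    """Build (but do not send) an HTTP request for a CouchDB resource."""
    data = None
    headers = {"Accept": "application/json"}
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


def send(request):
    """Send a prepared request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Equivalents of three rows of the table.
create_db = couch_request("PUT", "/wiki")
create_doc = couch_request("POST", "/wiki", {"text": "Wikipedia on CouchDB", "rating": 5})
list_dbs = couch_request("GET", "/_all_dbs")
```

With a server running, `send(create_db)` would be expected to return a body like the `{"ok":true}` shown in the table.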

Open source components

CouchDB includes a number of other open source projects as part of its default package.

Component	Description	License
Erlang	Erlang is a general-purpose concurrent programming language and runtime system. The sequential subset of Erlang is a functional language with strict evaluation, single assignment, and dynamic typing	Apache 2.0 (release 18.0 and later); Erlang Public License (earlier releases)
ICU	International Components for Unicode (ICU) is an open-source project of mature C/C++ and Java libraries for Unicode support, software internationalization and software globalization	Unicode License
jQuery	jQuery is a lightweight cross-browser JavaScript library that emphasizes interaction between JavaScript and HTML	MIT License
OpenSSL	OpenSSL is an open-source implementation of the SSL and TLS protocols. The core library (written in the C programming language) implements the basic cryptographic functions and provides various utility functions	Apache 1.0 and the four-clause BSD License
SpiderMonkey	SpiderMonkey is a performant JavaScript engine maintained by the Mozilla Foundation. It contains an interpreter, a JIT compiler and a garbage collector	MPL 2.0

References

[1] "Release 3.3.3". 4 December 2023. Retrieved 19 December 2023.
[2] Apache Software Foundation. "Apache CouchDB". Retrieved 15 April 2012.
[3] Smith, Jason. "What is the CouchDB replication protocol? Is it like Git?". StackOverflow. Stack Exchange. Retrieved 14 April 2012.
[4] "Exploring CouchDB". Developer Works. IBM. March 31, 2009. Retrieved September 30, 2016.
[5] Apache mailing list announcement on mail-archives.apache.org
[6] Re: Proposed Resolution: Establish CouchDB TLP on mail-archives.apache.org
[7] "CouchDB NoSQL Database Ready for Production Use". Archived 2010-11-15 at the Wayback Machine. Article from PC World of July 2010.
[8] Katz, Damien. "The future of CouchDB". Retrieved 15 April 2012.
[9] Slater, Noah (25 July 2013). "Welcome BigCouch". Retrieved 25 July 2013.
[10] "2.0". 20 September 2016. Retrieved 13 January 2017.
[11] "1.8. 2.0.x Branch — Apache CouchDB® 3.3 Documentation". docs.couchdb.org. Retrieved 2024-08-04.
[12] CouchDB, Technical Overview. Archived October 20, 2011, at the Wayback Machine.
[13] "couchdb-fauxton". GitHub. apache. Retrieved 2 May 2023.
[14] Cassandra vs MongoDB vs CouchDB vs Redis vs Riak vs HBase comparison from Kristóf Kovács
[15] "Why Large Hadron Collider Scientists are Using CouchDB". ReadWrite. 2010-08-26. Retrieved 2022-03-29.
[16] iDAT, Red Cross Code, 2021-07-31. Retrieved 2022-03-29.
[17] "Database-Deep-Dives-CouchDB". www.ibm.com. 19 July 2019. Retrieved 2022-03-29.
[18] "Database-Deep-Dives-CouchDB". www.ibm.com. 19 July 2019. Retrieved 2022-03-29.
[19] "United Airlines Streamlines Operations With Couchbase | Case Study". www.couchbase.com. Retrieved 2022-03-29.
[20] "CouchDB in the wild". Archived 2017-07-20 at the Wayback Machine. Article on the product's website, a list of software projects and websites using CouchDB.
[21] Cutler, Kim-Mai (9 June 2012). "Meebo Gets The Classic Google Acq-hire Treatment: Most Products To Shut Down Soon". TechCrunch. AOL Inc. Retrieved 7 January 2016.
[22] "npm-registry-couchapp". GitHub. npm. 17 June 2015. Retrieved 7 January 2016.
[23] CouchDB at the BBC as a fault tolerant, scalable, multi-data center key-value store
[24] Email from Elliot Murphy (Canonical) to the CouchDB-Devel list. Archived 2011-05-05 at the Wayback Machine.
[25] Canonical Drops CouchDB From Ubuntu One (Slashdot)
[26] "Protogrid - Über uns".
[27] View Server Documentation on wiki.apache.org. Archived 2008-10-20 at the Wayback Machine.

Bibliography

Anderson, J. Chris; Slater, Noah; Lehnardt, Jan (November 15, 2009), CouchDB: The Definitive Guide (1st ed.), O'Reilly Media, p. 300, ISBN 978-0-596-15816-3
Lennon, Joe (December 15, 2009), Beginning CouchDB (1st ed.), Apress, p. 300, ISBN 978-1-4302-7237-3, archived from the original on December 5, 2010, retrieved November 1, 2009
Holt, Bradley (March 7, 2011), Writing and Querying MapReduce Views in CouchDB (1st ed.), O'Reilly Media, p. 76, ISBN 978-1-4493-0312-9
Holt, Bradley (April 11, 2011), Scaling CouchDB (1st ed.), O'Reilly Media, p. 72, ISBN 978-1-4493-0343-3
Brown, MC (October 31, 2011), Getting Started with CouchDB (1st ed.), O'Reilly Media, p. 50, ISBN 978-1-4493-0755-4
Thompson, Mick (August 2, 2011), Getting Started with GEO, CouchDB, and Node.js (1st ed.), O'Reilly Media, p. 64, ISBN 978-1-4493-0752-3

External links

Official website
CouchDB: The Definitive Guide
Complete HTTP API Reference
Simple PHP5 library to communicate with CouchDB
Asynchronous CouchDB client for Java
Asynchronous CouchDB client for Scala
Lehnardt, Jan (2008). "Couch DB at 10,000 feet". Erlang eXchange 2008. Archived from the original on 9 November 2012. Retrieved 15 April 2012.
Lehnardt, Jan (2009). "CouchDB for Erlang Developers". Erlang Factory London 2009. Archived from the original on 19 June 2011. Retrieved 15 April 2012.
Katz, Damien (January 2009). "CouchDB and Me". RubyFringe. InfoQ. Archived from the original on 27 April 2011. Retrieved 15 April 2012.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

