Developer(s) | MongoDB Inc. |
---|---|
Initial release | February 11, 2009 [1] |
Stable release | |
Repository | |
Written in | C++, JavaScript, Python |
Operating system | Windows Vista and later, Linux, OS X 10.7 and later, Solaris, [3] FreeBSD [4] |
Available in | English |
Type | Document-oriented database |
License | Server Side Public License or proprietary |
Website | mongodb |
MongoDB is a source-available, cross-platform, document-oriented database program. Classified as a NoSQL database product, MongoDB uses JSON-like documents with optional schemas. Released in February 2009 by 10gen (now MongoDB Inc.), it supports features like sharding, replication, and ACID transactions (from version 4.0). MongoDB Atlas, its managed cloud service, operates on AWS, Google Cloud Platform, and Microsoft Azure. Current versions are licensed under the Server Side Public License (SSPL). MongoDB is a member of the MACH Alliance.
The American software company 10gen began developing MongoDB in 2007 as a component of a planned platform-as-a-service product. In 2009, the company shifted to an open-source development model and began offering commercial support and other services. In 2013, 10gen changed its name to MongoDB Inc. [5]
On October 20, 2017, MongoDB became a publicly traded company, listed on NASDAQ as MDB with an IPO price of $24 per share. [6]
On November 8, 2018, with the stable release 4.0.4, the software's license changed from AGPL 3.0 to SSPL. [7] [8]
On October 30, 2019, MongoDB teamed with Alibaba Cloud to offer Alibaba Cloud customers a MongoDB-as-a-service solution. Customers can use the managed offering from Alibaba's global data centers. [9]
Version | Release date | Feature notes | Refs |
---|---|---|---|
1.0 | August 2009 | | [10] |
1.2 | December 2009 | | [11] |
1.4 | March 2010 | | [12] |
1.6 | August 2010 | | [13] |
1.8 | March 2011 | | [14] |
2.0 | September 2011 | | [15] |
2.2 | August 2012 | | [16] |
2.4 | March 2013 | | [17] |
2.6 | April 8, 2014 | | [18] |
3.0 | March 3, 2015 | | [19] |
3.2 | December 8, 2015 | | [20] |
3.4 | November 29, 2016 | | [21] |
3.6 | November 2017 | | [22] |
4.0 | June 2018 | | [23] |
4.2 | August 2019 | | [24] |
4.4 | July 2020 | | [25] |
4.4.5 | April 2021 | | [25] |
4.4.6 | May 2021 | | [25] |
5.0 | July 13, 2021 | | [26] [27] [28] |
6.0 | July 2022 | | [29] |
7.0 | August 15, 2023 | | [30] |
8.0 | October 2, 2024 | | [31] |
MongoDB supports field, range query and regular-expression searches. [32] Queries can return specific fields of documents and also include user-defined JavaScript functions. Queries can also be configured to return a random sample of results of a given size.
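The three kinds of search can be illustrated with a small in-memory matcher. This is a minimal sketch and does not use a MongoDB driver: the helper names `matches` and `find` are hypothetical, only the `$gte`, `$lt` and `$regex` operators are handled, and the projection is simplified to a plain list of field names.

```python
import re

def matches(doc, query):
    """Return True if doc satisfies every condition in query (illustrative only)."""
    for field, cond in query.items():
        value = doc.get(field)
        if isinstance(cond, dict):
            # Operator form, e.g. {"age": {"$gte": 40, "$lt": 80}}
            for op, arg in cond.items():
                if op == "$gte":
                    ok = value is not None and value >= arg
                elif op == "$lt":
                    ok = value is not None and value < arg
                elif op == "$regex":
                    ok = isinstance(value, str) and re.search(arg, value) is not None
                else:
                    raise ValueError(f"unsupported operator: {op}")
                if not ok:
                    return False
        elif value != cond:
            # Plain field match, e.g. {"city": "London"}
            return False
    return True

def find(docs, query, projection=None):
    """Filter docs by query; optionally keep only the fields named in projection."""
    results = []
    for doc in docs:
        if matches(doc, query):
            if projection is None:
                results.append(dict(doc))
            else:
                results.append({k: doc[k] for k in projection if k in doc})
    return results
```

For example, `find(people, {"name": {"$regex": "^A"}}, projection=["name"])` would return only the `name` field of every document whose name starts with "A".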
Fields in a MongoDB document can be indexed with primary and secondary indices.
MongoDB provides high availability with replica sets. [33] A replica set consists of two or more copies of the data. Each replica-set member may act in the role of primary or secondary replica at any time. All writes and reads are done on the primary replica by default. Secondary replicas maintain a copy of the data of the primary using built-in replication. When a primary replica fails, the replica set automatically conducts an election process to determine which secondary should become the primary. Secondaries can optionally serve read operations, but that data is only eventually consistent by default.
If the replicated MongoDB deployment only has a single secondary member, a separate daemon called an arbiter must be added to the set. It has the single responsibility of resolving the election of the new primary. [34] As a consequence, an ideal distributed MongoDB deployment requires at least three separate servers, even in the case of just one primary and one secondary. [34]
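The need for a third member follows from majority voting: a primary can only be elected by more than half of the voting members. The arithmetic can be sketched as follows; the function names are hypothetical and the model ignores priorities, timeouts and other election details.

```python
def majority(voting_members):
    """Smallest number of votes that is more than half of the voting members."""
    return voting_members // 2 + 1

def can_elect_primary(voting_members, reachable_members):
    """A new primary can be elected only if a majority of voters can be reached."""
    return reachable_members >= majority(voting_members)

# Two data-bearing members, no arbiter: losing one leaves 1 of 2 votes.
two_member_set_survives = can_elect_primary(voting_members=2, reachable_members=1)

# Same set plus an arbiter: losing one data member leaves 2 of 3 votes.
three_member_set_survives = can_elect_primary(voting_members=3, reachable_members=2)
```

Here `two_member_set_survives` is `False` and `three_member_set_survives` is `True`, which is why the arbiter is placed on a separate third server.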
MongoDB scales horizontally using sharding. [35] The user chooses a shard key, which determines how the data in a collection will be distributed. The data is split into ranges (based on the shard key) and distributed across multiple shards, which are masters with one or more replicas. Alternatively, the shard key can be hashed to map to a shard – enabling an even data distribution.
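The difference between the two strategies can be sketched in a few lines. This is an illustration under simplifying assumptions, not MongoDB's actual routing code: the function names are hypothetical, the range boundaries are fixed by hand, and the hashed variant maps the hash to a shard with a simple modulo.

```python
import bisect
import hashlib

def range_shard(key, boundaries):
    """Pick a shard by key range.

    boundaries is a sorted list of split points; shard i holds keys in
    [boundaries[i-1], boundaries[i]), with open-ended first and last shards.
    """
    return bisect.bisect_right(boundaries, key)

def hashed_shard(key, num_shards):
    """Pick a shard from a hash of the key, spreading adjacent keys apart."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

With boundaries `[1000, 2000]`, monotonically increasing keys such as 0, 1, 2, ... all land on shard 0 until the first split point is reached, whereas `hashed_shard` scatters the same keys across all shards. The trade-off is that a range query on the shard key can target a few shards under range sharding but has to consult every shard under hashing.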
MongoDB can run over multiple servers, balancing the load or duplicating data to keep the system functional in case of hardware failure.
MongoDB can be used as a file system, called GridFS, with load-balancing and data-replication features over multiple machines for storing files.
This function, called a grid file system, [36] is included with MongoDB drivers. MongoDB exposes functions for file manipulation and content to developers. GridFS can be accessed using the mongofiles utility or plugins for Nginx [37] and lighttpd. [38] GridFS divides a file into parts, or chunks, and stores each of those chunks as a separate document. [39]
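The chunking step can be sketched without a database. The snippet below is a simplified illustration, not the driver implementation: the function names are hypothetical, the chunk documents carry only the `files_id`, `n` and `data` fields, and the default chunk size of 255 KiB is an assumption taken from the GridFS default.

```python
def split_into_chunks(file_id, data, chunk_size=255 * 1024):
    """Split a byte string into numbered chunk documents."""
    return [
        {"files_id": file_id, "n": n, "data": data[offset:offset + chunk_size]}
        for n, offset in enumerate(range(0, len(data), chunk_size))
    ]

def reassemble(chunks):
    """Rebuild the file by concatenating chunks in sequence order."""
    ordered = sorted(chunks, key=lambda chunk: chunk["n"])
    return b"".join(chunk["data"] for chunk in ordered)
```

Because each chunk is an ordinary document, the chunks of one file are replicated and distributed like any other data, and a reader can fetch a byte range by requesting only the chunks that cover it.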
MongoDB provides three ways to perform aggregation: the aggregation pipeline, the map-reduce function and single-purpose aggregation methods. [40]
Map-reduce can be used for batch processing of data and aggregation operations. However, according to MongoDB's documentation, the aggregation pipeline provides better performance for most aggregation operations. [41]
The aggregation framework enables users to obtain results similar to those returned by queries that include the SQL GROUP BY clause. Aggregation operators can be strung together to form a pipeline, analogous to Unix pipes. The aggregation framework includes the $lookup operator, which can join documents from multiple collections, as well as statistical operators such as standard deviation.
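The pipeline idea, where each stage consumes the output of the previous one, can be sketched in plain Python. The stage helpers below are hypothetical stand-ins that loosely mirror filtering, grouping and sorting stages; they are not the MongoDB operators themselves.

```python
from collections import defaultdict

def match(predicate):
    """Stage that keeps only the documents satisfying predicate."""
    def stage(docs):
        return [d for d in docs if predicate(d)]
    return stage

def group_sum(key, field):
    """Stage that groups by key and sums field, like SQL GROUP BY with SUM."""
    def stage(docs):
        totals = defaultdict(int)
        for d in docs:
            totals[d[key]] += d[field]
        return [{"_id": k, "total": v} for k, v in totals.items()]
    return stage

def sort_by(field, descending=False):
    """Stage that orders documents by field."""
    def stage(docs):
        return sorted(docs, key=lambda d: d[field], reverse=descending)
    return stage

def run_pipeline(docs, stages):
    """Feed the documents through each stage in order, like a Unix pipe."""
    for stage in stages:
        docs = stage(docs)
    return docs
```

A call such as `run_pipeline(orders, [match(lambda d: d["status"] == "paid"), group_sum("customer", "amount"), sort_by("total", descending=True)])` corresponds roughly to `SELECT customer, SUM(amount) ... WHERE status = 'paid' GROUP BY customer ORDER BY 2 DESC`.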
JavaScript can be used in queries and aggregation functions (such as MapReduce), and can also be sent directly to the database to be executed.
MongoDB supports fixed-size collections called capped collections. This type of collection maintains insertion order and, once the specified size has been reached, behaves like a circular queue.
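The circular-queue behavior can be sketched with a bounded deque. This is a simplified model: the class name is hypothetical, and the limit here is a document count, whereas a real capped collection is bounded primarily by its size in bytes.

```python
from collections import deque

class CappedCollection:
    """Fixed-capacity collection that discards the oldest document when full."""

    def __init__(self, max_docs):
        # deque with maxlen drops items from the left once capacity is reached
        self._docs = deque(maxlen=max_docs)

    def insert(self, doc):
        self._docs.append(doc)

    def find(self):
        """Return documents in insertion order, oldest first."""
        return list(self._docs)
```

Inserting five documents into `CappedCollection(max_docs=3)` leaves only the last three, still in the order they were inserted, which makes the structure a natural fit for logs and other rolling windows of recent data.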
MongoDB supports multi-document ACID transactions since the 4.0 release in June 2018. [42]
The MongoDB Community Edition is free and available for Windows, Linux and macOS. [43]
MongoDB Enterprise Server is the commercial edition of MongoDB and is available as part of the MongoDB Enterprise Advanced subscription. [44]
MongoDB is also available as an on-demand, fully managed service. MongoDB Atlas runs on AWS, Microsoft Azure and Google Cloud Platform. [45]
On March 10, 2022, MongoDB warned its users in Russia and Belarus that their data stored on the MongoDB Atlas platform would be destroyed as a result of American sanctions related to the Russo-Ukrainian War. [46]
MongoDB has official drivers for major programming languages and development environments. [47] There are also a large number of unofficial or community-supported drivers for other programming languages and frameworks.
The primary interface to the database has been the mongo shell. MongoDB Compass was introduced as the native GUI in MongoDB 3.2. There are also products and third-party projects that offer user interfaces for administration and data viewing. [48]
As of October 2018, MongoDB is released under the Server Side Public License (SSPL), a non-free license developed by the project. It replaces the GNU Affero General Public License, and is nearly identical to the GNU General Public License version 3, but requires that those making the software publicly available as part of a "service" must make the service's entire source code (insofar as a user would be able to recreate the service themselves) available under this license. By contrast, the AGPL only requires the source code of the licensed software to be provided to users when the software is conveyed over a network. [49] [50] The SSPL was submitted for certification to the Open Source Initiative but later withdrawn. [51] In January 2021, the Open Source Initiative stated that SSPL is not an open source license. [52] The language drivers are available under an Apache License. In addition, MongoDB Inc. offers proprietary licenses for MongoDB. The last versions licensed as AGPL version 3 are 4.0.3 (stable) and 4.1.4. [53]
MongoDB has been removed from the Debian, Fedora and Red Hat Enterprise Linux distributions because of the licensing change. Fedora determined that the SSPL version 1 is not a free software license because it is "intentionally crafted to be aggressively discriminatory" towards commercial users. [54] [55]
Because of MongoDB's default security configuration, which allowed any user full access to the database, data from tens of thousands of MongoDB installations has been stolen. Furthermore, many MongoDB servers have been held for ransom. [56] [57] In September 2017, Davi Ottenheimer, head of product security at MongoDB, stated that measures had been taken to defend against these risks. [58]
From the MongoDB 2.6 release onward, the binaries for the official MongoDB RPM and DEB packages bind to localhost by default. From MongoDB 3.6, this default behavior was extended to all MongoDB packages across all platforms. As a result, all networked connections to the database are denied unless explicitly configured by an administrator. [59]
In some failure scenarios in which an application can access two distinct MongoDB processes that cannot access each other, it is possible for MongoDB to return stale reads. It is also possible for MongoDB to roll back writes that have been acknowledged. [60] The issue was addressed in version 3.4.0, released in November 2016, [61] and applied to earlier releases from v3.2.12 onward. [62]
Before version 2.2, locks were implemented on a per-server-process basis. With version 2.2, locks were implemented at the database level. [63] Beginning with version 3.0, [64] pluggable storage engines are available, and each storage engine may implement locks differently. [64] With MongoDB 3.0, locks are implemented at the collection level for the MMAPv1 storage engine, [65] while the WiredTiger storage engine uses an optimistic concurrency protocol that effectively provides document-level locking. [66] Even with versions prior to 3.0, one approach to increase concurrency is to use sharding. [67] In some situations, reads and writes will yield their locks. If MongoDB predicts that a page is unlikely to be in memory, operations will yield their lock while the pages load. The use of lock yielding expanded greatly in version 2.2. [68]
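Optimistic concurrency, in general terms, means that writers do not take a lock up front; a write is instead rejected if the document changed since it was read, and the operation is retried. The sketch below illustrates that general technique with a version counter. It is not WiredTiger's implementation, and the class and function names are hypothetical.

```python
class VersionConflict(Exception):
    """Raised when a document changed between the read and the write."""

class Store:
    def __init__(self):
        self._docs = {}  # doc_id -> (version, value)

    def read(self, doc_id):
        """Return (version, value); missing documents are version 0."""
        return self._docs.get(doc_id, (0, None))

    def write(self, doc_id, expected_version, value):
        """Write only if nobody else has written since expected_version was read."""
        current_version, _ = self.read(doc_id)
        if current_version != expected_version:
            raise VersionConflict(doc_id)
        self._docs[doc_id] = (current_version + 1, value)

def update_with_retry(store, doc_id, update, max_attempts=5):
    """Apply update(value) to a document, retrying on conflicts."""
    for _ in range(max_attempts):
        version, value = store.read(doc_id)
        try:
            store.write(doc_id, version, update(value))
            return
        except VersionConflict:
            continue  # another writer won; re-read and try again
    raise VersionConflict(doc_id)
```

Because a conflict is detected per document, two writers touching different documents never block each other, which is the practical effect the text describes as document-level locking.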
Until version 3.3.11, MongoDB could not perform collation-based sorting and was limited to bytewise comparison via memcmp, which would not provide correct ordering for many non-English languages when used with a Unicode encoding. The issue was fixed on August 23, 2016.
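The effect of bytewise comparison is easy to reproduce. The sketch below sorts the same words by their raw UTF-8 bytes and by a crude language-aware key; the second key (strip accents, ignore case) is only a rough stand-in for real collation rules, and both function names are hypothetical.

```python
import unicodedata

def bytewise_key(s):
    """Compare strings by their UTF-8 bytes, as memcmp would."""
    return s.encode("utf-8")

def simple_collation_key(s):
    """Approximate a dictionary ordering: drop accents, ignore case."""
    decomposed = unicodedata.normalize("NFD", s)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return (base.casefold(), s)

words = ["zebra", "Zebra", "apple", "\u00e9clair"]

bytewise_order = sorted(words, key=bytewise_key)
# ['Zebra', 'apple', 'zebra', 'éclair']

collated_order = sorted(words, key=simple_collation_key)
# ['apple', 'éclair', 'Zebra', 'zebra']
```

Under bytewise ordering every uppercase ASCII letter sorts before every lowercase one, and any accented letter sorts after "z", because its UTF-8 encoding starts with a byte above the ASCII range.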
Prior to MongoDB 4.0, queries against an index were not atomic. Documents that were updated while a query was running could be missed. [69] The introduction of the snapshot read concern in MongoDB 4.0 eliminated this risk. [70]
MongoDB claimed that version 3.6.4 had passed "the industry's toughest data safety, correctness, and consistency tests" by Jepsen, and that "MongoDB offers among the strongest data consistency, correctness, and safety guarantees of any database available today." [71] Jepsen, which describes itself as a "distributed systems safety research company," disputed both claims on Twitter, saying, "In that report, MongoDB lost data and violated causal by default." In its May 2020 report on MongoDB version 4.2.6, Jepsen wrote that MongoDB had only mentioned tests that version 3.6.4 had passed, and that version 4.2.6 had introduced more problems. [72] Jepsen's test summary reads in part:
Jepsen evaluated MongoDB version 4.2.6, and found that even at the strongest levels of read and write concern, it failed to preserve snapshot isolation. Instead, Jepsen observed read skew, cyclic information flow, duplicate writes, and internal consistency violations. Weak defaults meant that transactions could lose writes and allow dirty reads, even downgrading requested safety levels at the database and collection level. Moreover, the snapshot read concern did not guarantee snapshot unless paired with write concern majority—even for read-only transactions. These design choices complicate the safe use of MongoDB transactions. [73]
On May 26, Jepsen updated the report to say: "MongoDB identified a bug in the transaction retry mechanism which they believe was responsible for the anomalies observed in this report; a patch is scheduled for 4.2.8." [73] The issue has been patched as of that version, and "Jepsen criticisms of the default write concerns have also been addressed, with the default write concern now elevated to the majority concern (w:majority) from MongoDB 5.0." [74]
MongoDB Inc. hosts an annual developer conference that has been called MongoDB World or MongoDB.live. [75]
Year | Dates | City | Venue | Notes |
---|---|---|---|---|
2014 [76] | June 23–25 | New York | Sheraton Times Square Hotel | |
2015 [77] | June 1–2 | New York | Sheraton Times Square Hotel | |
2016 [78] | June 28–29 | New York | New York Hilton Midtown | |
2017 [79] | June 20–21 | Chicago | Hyatt Regency Chicago | First year not in New York City |
2018 [80] | June 26–27 | New York | New York Hilton Midtown | |
2019 [81] | June 17–19 | New York | New York Hilton Midtown | |
2020 [82] | May 4–6 | Online | | In-person event canceled and conference held entirely online because of the COVID-19 pandemic |
2021 [83] | July 13–14 | Online | | Conference held online because of the COVID-19 pandemic |
2022 [84] | June 7–9 | New York | Javits Center | |