Cloud database

A cloud database is a database that typically runs on a cloud computing platform, with access to it provided as a service. There are two common deployment models: users can run databases on the cloud independently, using a virtual machine image, or they can purchase access to a database service maintained by a cloud database provider. Of the databases available on the cloud, some are SQL-based and some use a NoSQL data model.

Database services take care of the scalability and high availability of the database, and they make the underlying software stack transparent to the user. [1]

Deployment models

There are two primary methods to run a database on a cloud platform:

Virtual machine image
Cloud platforms allow users to purchase virtual-machine instances for a limited time, and one can run a database on such virtual machines. Users can either upload their own machine image with a database installed on it, or use ready-made machine images that already include an optimized installation of a database. [2]
Database-as-a-service (DBaaS)
With a database-as-a-service model, users pay fees to a cloud provider for services and computing resources, reducing the amount of money and effort needed to develop and manage databases. [2] Users are given tools to create and manage database instances and to control users. Some cloud providers also offer tools to manage database structures and data. [3] Many cloud providers offer both relational (Amazon RDS, SQL Server) and NoSQL (MongoDB, Amazon DynamoDB) databases. [3] This is a type of software as a service (SaaS).

Architecture and common characteristics

Data model

The design and development of typical systems use data management and relational databases as their key building blocks. Advanced queries expressed in SQL work well with the strict relationships that relational databases impose on information. However, relational database technology was not initially designed or developed for use over distributed systems. This issue has been addressed by adding clustering enhancements to relational databases, although some basic tasks, such as data synchronization, still require complex and expensive protocols. [5]

Modern relational databases have shown poor performance on data-intensive systems; therefore, the idea of NoSQL has been adopted within database management systems for cloud-based systems. [6] NoSQL storage has no requirement for fixed table schemas, and the use of join operations is avoided. "The NoSQL databases have proven to provide efficient horizontal scalability, good performance, and ease of assembly into cloud applications." [7] Data models relying on simplified relay algorithms have also been employed in data-intensive cloud mapping applications unique to virtual frameworks. [8]
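The difference between a fixed relational schema queried with joins and a schemaless, denormalized record can be sketched as follows. This is only an illustration, assuming Python's standard sqlite3 module as a stand-in for a relational database and plain dictionaries as a stand-in for a document store; the table names, field names, and sample data are hypothetical.

```python
import sqlite3

# Relational side: every row must fit a fixed table schema, and related
# data living in separate tables is combined with a join.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        item TEXT NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Ada');
    INSERT INTO orders VALUES (10, 1, 'disk'), (11, 1, 'cable');
""")
relational_items = [
    row[0]
    for row in conn.execute(
        "SELECT o.item FROM orders AS o "
        "JOIN customers AS c ON c.id = o.customer_id "
        "WHERE c.name = ? ORDER BY o.id",
        ("Ada",),
    )
]
conn.close()

# Document side: each record is self-contained, so no join is needed,
# and records in the same collection may carry different fields.
documents = {
    "customer:1": {"name": "Ada", "orders": [{"item": "disk"}, {"item": "cable"}]},
    "customer:2": {"name": "Lin", "orders": [], "loyalty_tier": "gold"},
}
document_items = [order["item"] for order in documents["customer:1"]["orders"]]
```

Both approaches return the same list of items for the same customer. The document form trades the join for duplicated or nested data, which is one reason such stores are easier to spread across machines.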

It is also important to differentiate between relational cloud databases and non-relational, or NoSQL, cloud databases: [9]

SQL databases
SQL databases are one type of database which can run in the cloud, either in a virtual machine or as a service, depending on the vendor. While SQL databases are easily vertically scalable, horizontal scalability poses a challenge that cloud database services based on SQL have started to address. [10]
NoSQL databases
NoSQL databases are another type of database which can run in the cloud. NoSQL databases are built to service heavy read/write loads and can scale up and down easily, [11] and therefore they are more natively suited to running in the cloud. However, most contemporary applications are built around an SQL data model, so working with NoSQL databases often requires a complete rewrite of application code. [12]
Some SQL databases have developed NoSQL capabilities including JSON, binary JSON (e.g. BSON or similar variants), and key-value store data types.
A multi-model database with relational and non-relational capabilities provides a standard SQL interface to users and applications and thus facilitates the usage of such databases for contemporary applications built around an SQL data model. Native multi-model databases support multiple data models with one core and a unified query language to access all data models.
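The idea of an SQL database that also serves key-value and JSON data can be sketched as below. This is a simplified illustration rather than how any particular product implements it: it assumes Python's standard sqlite3 and json modules, stores each document as JSON text in an ordinary column, and uses hypothetical names (`kv`, `put`, `get`).

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# One ordinary SQL table acts as a key-value store for JSON documents.
conn.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, doc TEXT NOT NULL)")


def put(key, document):
    """Store a JSON-serializable document under a key, replacing any old value."""
    conn.execute(
        "INSERT OR REPLACE INTO kv (key, doc) VALUES (?, ?)",
        (key, json.dumps(document)),
    )


def get(key):
    """Return the decoded document for a key, or None if the key is absent."""
    row = conn.execute("SELECT doc FROM kv WHERE key = ?", (key,)).fetchone()
    return None if row is None else json.loads(row[0])


# Documents with different shapes share one table; access still goes
# through standard SQL statements.
put("user:1", {"name": "Ada", "tags": ["admin"]})
put("user:2", {"name": "Lin"})
put("user:2", {"name": "Lin", "tags": []})  # overwrites the earlier value
```

Applications that already speak SQL can keep their existing connection and query tooling while storing loosely structured records alongside relational tables.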

Vendors

The following table lists notable database vendors with a cloud database offering, classified by their deployment model – machine image vs. database as a service – and data model, SQL vs. NoSQL.

Cloud database vendors by deployment and data model
Virtual Machine Deployment | Database as a Service
SQL Data Model
NoSQL Data Model

See also

Related Research Articles

IBM Db2: Relational model database server

Db2 is a family of data management products, including database servers, developed by IBM. It initially supported the relational model, but was extended to support object–relational features and non-relational structures like JSON and XML. The brand name was originally styled as DB2 until 2017, when it changed to its present form.

In computing, a solution stack or software stack is a set of software subsystems or components needed to create a complete platform such that no additional software is needed to support applications. Applications are said to "run on" or "run on top of" the resulting platform.

Amazon Elastic Compute Cloud: Cloud computing platform

Amazon Elastic Compute Cloud (EC2) is a part of Amazon.com's cloud-computing platform, Amazon Web Services (AWS), that allows users to rent virtual computers on which to run their own computer applications. EC2 encourages scalable deployment of applications by providing a web service through which a user can boot an Amazon Machine Image (AMI) to configure a virtual machine, which Amazon calls an "instance", containing any software desired. A user can create, launch, and terminate server-instances as needed, paying by the second for active servers – hence the term "elastic". EC2 provides users with control over the geographical location of instances that allows for latency optimization and high levels of redundancy. In November 2010, Amazon switched its own retail website platform to EC2 and AWS.

Navicat: SQL database management software

Navicat is a series of graphical database management and development software produced by CyberTech Ltd. for MySQL, MariaDB, Redis, MongoDB, Oracle, SQLite, PostgreSQL and Microsoft SQL Server. It has an Explorer-like graphical user interface and supports multiple database connections for local and remote databases. Its design is made to meet the needs of a variety of audiences, from database administrators and programmers to various businesses/companies that serve clients and share information with partners.

Microsoft Azure SQL Database: Managed cloud database

Microsoft Azure SQL Database is a managed cloud database (PaaS) provided as part of Microsoft Azure services. The service handles database management functions for cloud based Microsoft SQL Servers including upgrading, patching, backups, and monitoring without user involvement.

Google App Engine is a cloud computing platform as a service for developing and hosting web applications in Google-managed data centers. Applications are sandboxed and run across multiple servers. App Engine offers automatic scaling for web applications—as the number of requests increases for an application, App Engine automatically allocates more resources for the web application to handle the additional demand.

Rackspace Cloud: Cloud computing platform

The Rackspace Cloud is a set of cloud computing products and services billed on a utility computing basis from the US-based company Rackspace. Offerings include Cloud Storage, virtual private server, load balancers, databases, backup, and monitoring.

A database shard, or simply a shard, is a horizontal partition of data in a database or search engine. Each shard is held on a separate database server instance, to spread load.
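A minimal sketch of how keys might be routed to shards is shown below. It assumes simple hash-modulo routing and uses in-memory dictionaries in place of separate database server instances; `shard_for` and `ShardedStore` are hypothetical names chosen for illustration.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ShardedStore:
    """Toy horizontally partitioned store; each dict stands in for one server."""

    def __init__(self, shard_count: int):
        self.shards = [dict() for _ in range(shard_count)]

    def put(self, key: str, value):
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str, default=None):
        return self.shards[shard_for(key, len(self.shards))].get(key, default)


store = ShardedStore(4)
for i in range(100):
    store.put(f"user:{i}", i)
```

Each key is always read from the shard it was written to, so load is spread without a central lookup table. With plain modulo routing, changing the number of shards remaps most keys, which is why schemes such as consistent hashing are often preferred when shards are added or removed.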

Cloud computing: Form of shared Internet-based computing

Cloud computing is the on-demand availability of computer system resources, especially data storage and computing power, without direct active management by the user. Large clouds often have functions distributed over multiple locations, each of which is a data center. Cloud computing relies on sharing of resources to achieve coherence and typically uses a pay-as-you-go model, which can help in reducing capital expenses but may also lead to unexpected operating expenses for users.

Microsoft Azure: Cloud computing platform by Microsoft

Microsoft Azure, often referred to as Azure, is a cloud computing platform run by Microsoft. It offers access, management, and the development of applications and services through global data centers. It also provides a range of capabilities, including software as a service (SaaS), platform as a service (PaaS), and infrastructure as a service (IaaS). Microsoft Azure supports many programming languages, tools, and frameworks, including Microsoft-specific and third-party software and systems.

NoSQL is an approach to database design that focuses on providing a mechanism for storage and retrieval of data that is modeled in means other than the tabular relations used in relational databases. Instead of the typical tabular structure of a relational database, NoSQL databases house data within one data structure. Since this non-relational database design does not require a schema, it offers rapid scalability to manage large and typically unstructured data sets. NoSQL systems are also sometimes called "Not only SQL" to emphasize that they may support SQL-like query languages or sit alongside SQL databases in polyglot-persistent architectures.

A graph database (GDB) is a database that uses graph structures for semantic queries with nodes, edges, and properties to represent and store data. A key concept of the system is the graph. The graph relates the data items in the store to a collection of nodes and edges, the edges representing the relationships between the nodes. The relationships allow data in the store to be linked together directly and, in many cases, retrieved with one operation. Graph databases hold the relationships between data as a priority. Querying relationships is fast because they are perpetually stored in the database. Relationships can be intuitively visualized using graph databases, making them useful for heavily inter-connected data.

Redis: Open-source in-memory key–value database

Redis is an open-source in-memory data store, used as a distributed, in-memory key–value database, cache and message broker, with optional durability. Because it holds all data in memory and because of its design, Redis offers low-latency reads and writes, making it particularly suitable for use cases that require a cache. Redis is the most popular NoSQL database, and one of the most popular databases overall. Redis is used in companies like Twitter, Airbnb, Tinder, Yahoo, Adobe, Hulu, Amazon and OpenAI.

Amazon Relational Database Service is a distributed relational database service by Amazon Web Services (AWS). It is a web service running "in the cloud" designed to simplify the setup, operation, and scaling of a relational database for use in applications. Administration processes like patching the database software, backing up databases and enabling point-in-time recovery are managed automatically. Scaling storage and compute resources can be performed by a single API call to the AWS control plane on-demand. AWS does not offer an SSH connection to the underlying virtual machine as part of the managed service.

Apache Drill: Open-source software framework

Apache Drill is an open-source software framework that supports data-intensive distributed applications for interactive analysis of large-scale datasets. Built chiefly by contributions from developers from MapR, Drill is inspired by Google's Dremel system. Tomer Shiran is the founder of the Apache Drill project. It was designated an Apache Software Foundation top-level project in December 2014.

mLab: American software company, 2011–2019

mLab was a cloud database service which hosted fully-managed MongoDB databases. mLab ran on cloud providers such as Amazon Web Services, Google Cloud, Microsoft Azure, as well as various platform-as-a-service providers.

Amazon Aurora is a relational database service developed and offered by Amazon Web Services beginning in October 2014. Aurora is available as part of the Amazon Relational Database Service (RDS).

Azure Cosmos DB is a globally distributed, multi-model database service offered by Microsoft. It is designed to provide high availability, scalability, and low-latency access to data for modern applications. Unlike traditional relational databases, Cosmos DB is a NoSQL database, which means it can handle unstructured and semi-structured data in addition to structured data.

Serverless computing is a cloud computing execution model in which the cloud provider allocates machine resources on demand, taking care of the servers on behalf of their customers. "Serverless" is a misnomer in the sense that servers are still used by cloud service providers to execute code for developers. However, developers of serverless applications are not concerned with capacity planning, configuration, management, maintenance, fault tolerance, or scaling of containers, VMs, or physical servers. Serverless computing does not hold resources in volatile memory; computing is rather done in short bursts with the results persisted to storage. When an app is not in use, there are no computing resources allocated to the app. Pricing is based on the actual amount of resources consumed by an application. It can be a form of utility computing.

Amazon DocumentDB is a managed proprietary NoSQL database service that supports document data structures, with some compatibility with MongoDB version 3.6 and version 4.0. As a document database, Amazon DocumentDB can store, query, and index JSON data. It is available on Amazon Web Services. In March 2023, AWS introduced some compatibility with MongoDB 5.0, although the service lacks time series collection support.

References

  1. Hwang, G.; Fu, S. (May 2016). "Proof of Violation for Trust and Accountability of Cloud Database Systems". 2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid). pp. 425–433. doi:10.1109/CCGrid.2016.27. ISBN 978-1-5090-2453-7. S2CID 18373753.
  2. Chao, Lee (2014). Cloud database development and management. Boca Raton: Taylor & Francis. ISBN 978-1-4665-6506-7. OCLC 857081580.
  3. McHaney, Roger (2021). Cloud technologies: an overview of cloud computing technologies for managers. Hoboken, NJ. ISBN 978-1-119-76951-4. OCLC 1196822611.
  4. Sakr, Sherif (June 2014). "Cloud-hosted databases: technologies, challenges and opportunities". Cluster Computing. 17 (2): 487–502. doi:10.1007/s10586-013-0290-7. ISSN 1386-7857. S2CID 254370104.
  5. A. Anjomshoaa and A. Tjoa, "How the cloud computing paradigm could shape the future of enterprise information processing", Proceedings of the 13th International Conference on Information Integration and Web-based Applications and Services - iiWAS'11, pp. 7-10, 2011.
  6. S. Cass, "Designing for the Cloud", MIT Technology Review, 2009. [Online]. Available: https://www.technologyreview.com/s/414090/designing-for-the-cloud/. Retrieved 2016-10-04.
  7. "NoSQL", Wikipedia, 2016. Retrieved 2016-10-04.
  8. Modi, A (2017). "Live migration of virtual machines with their local persistent storage in a data intensive cloud". International Journal of High Performance Computing and Networking. 10 (1): 134. doi:10.1504/IJHPCN.2017.083213.
  9. https://docs.microsoft.com/en-us/azure/architecture/data-guide/big-data/non-relational-data Article in 'Microsoft Azure'
  10. Dave Rosenberg, Are databases in the cloud really all that different?, CNET, Retrieved 2011-11-6
  11. Agrawal, Rakesh; et al. (2008). "The Claremont report on database research" (PDF). SIGMOD Record. 37 (3): 9–19. CiteSeerX 10.1.1.211.5963. doi:10.1145/1462571.1462573. ISSN 0163-5808. S2CID 666280.
  12. Ken North, "SQL, NoSQL or SomeSQL?", Dr. Dobb's, Retrieved 2011-11-9.
  13. Deploy your database applications and projects on the cloud, IBM.com, Retrieved 2011-9-1
  14. Chris Kanaracus, "Ingres rolls out cloud database offerings", InfoWorld, Retrieved 2011-8-28.
  15. "Amazon Web Services Announces Two New Database Services – AWS Database Migration Service and Amazon RDS for MariaDB". Archived 2017-06-01 at the Wayback Machine. Amazon Press Releases, retrieved 2015-11-17.
  16. "MariaDB Enterprise Cluster + MariaDB MaxScale". Archived 2016-12-04 at the Wayback Machine. Microsoft Azure, retrieved 2015-11-17.
  17. "Running MySQL on Amazon EC2 with EBS (Elastic Block Store)", Amazon Web Services, retrieved 2011-11-20.
  18. Swoyer, Stephen. "NuoDB: A Database for the Cloud." TDWI. Nov. 13, 2012. Retrieved Nov. 26, 2012
  19. "Amazon Machine Images - Oracle Database 11g Release 2 (11.2.0.1) Enterprise Edition - 64 Bit". Archived 2011-10-16 at the Wayback Machine. Amazon Web Services, Retrieved 2011-11-9.
  20. "Oracle Database in the Cloud", Oracle.com, Retrieved 2011-11-9.
  21. Chris Kanaracus, "EnterpriseDB Adding New Cloud Option for PostgreSQL Database", PCWorld, retrieved 2011-8-28.
  22. "AWS | SAP HANA". Amazon Web Services, Inc. Retrieved 2016-07-07.
  23. "SAP Solutions". Microsoft Azure. Retrieved 2016-07-07.
  24. "SAP HANA Enterprise Cloud". hana.sap.com. Archived from the original on 2016-08-15.
  25. "Clustrix Enters the Rackspace Partner Program". Yahoo! Finance. Archived from the original on 2016-04-14.
  26. Tony Baer, "Cockroach DB introduces a serverless tier", ZDNet.com, Retrieved 2021-12-13.
  27. EnterpriseDB#cite note-10
  28. "Cloud SQL - MySQL Relational Database Service". Retrieved 2016-11-28.
  29. "Announcing Heroku PostgreSQL Database Add-on", Heroku Blog, Retrieved 2011-11-9.
  30. Noel Yuhanna, "SQL Azure Raises The Bar On Cloud Databases", Forrester, Retrieved 2011-11-9.
  31. Pethuru, Raj (2014-03-31). Handbook of Research on Cloud Infrastructures for Big Data Analytics. IGI Global. ISBN   9781466658653.
  32. Klint Finley, "7 Cloud-Based Database Services". Archived 2011-11-09 at the Wayback Machine. ReadWriteWeb, Retrieved 2011-11-9.
  33. "Setting up Cassandra in the Cloud". Archived 2015-11-13 at the Wayback Machine. Cassandra Wiki, Retrieved 2011-11-10.
  34. "Google Cloud Platform Blog: Click to Deploy Apache Cassandra on Google Compute Engine". Retrieved 2016-11-28.
  35. Archived 2019-04-11 at the Wayback Machine.
  36. "Clusterpoint Database Virtual Box VM Installation Guide". Archived 2015-03-10 at archive.today. Clusterpoint, Retrieved 2015-03-08.
  37. "Amazon Machine Images, CouchDB 0.10.x 32 bit Ubuntu" [permanent dead link], Amazon Web Services, Retrieved 2011-11-10.
  38. "CouchDB Cloud Hosting on Google Cloud Platform". Retrieved 2016-11-28.
  39. "Amazon Machine Image, Hadoop AMI" [permanent dead link], Amazon Web Services, Retrieved 2011-11-10.
  40. "Cloud Dataproc: Managed Spark & Managed Hadoop Service". Retrieved 2016-11-28.
  41. "Hadoop at Rackspace", http://www.rackspace.com/blog/cloud-big-data-platform-limited-availability/. Archived 2014-03-02 at the Wayback Machine. Rackspace Big Data Platforms, Retrieved 2014-02-24.
  42. "MarkLogic Developer 8 (HVM) on AWS Marketplace". aws.amazon.com. Retrieved 2016-03-31.
  43. marklogic.com. "Flexible Deployment" (PDF). Retrieved 2016-11-28.
  44. "MongoDB on Amazon EC2", MongoDB.org, Retrieved 2011-11-10.
  45. "Deploying MongoDB on Google Compute Engine". Retrieved 2016-11-28.
  46. "MongoDB on Azure". Archived 2012-10-31 at the Wayback Machine. MongoDB.org, Retrieved 2011-11-10.
  47. "Easily Scale MongoDB at Rackspace". Archived 2014-03-02 at the Wayback Machine. Managed MongoDB ObjectRocket by Rackspace, Retrieved 2014-02-24.
  48. "Neo4J in the Cloud". Archived 2011-09-25 at the Wayback Machine. Neo4J Wiki, Retrieved 2011-11-10.
  49. "Announcing Neo4J on Windows Azure", Neo4J Blog, Retrieved 2011-11-10.
  50. Adrian Bridgwater, "ScyllaDB's real-time NoSQL database tapped by 'super app'", Computerworld, Retrieved 2012-12-27.
  51. Andrew Brust, "Cloudant Makes NoSQL as a Service Bigger", ZDNet, Retrieved 2012-5-22.
  52. "DataStax Astra DB: DataStax managed services powered by Apache Cassandra". DataStax. Retrieved 2022-03-07.
  53. "Bigtable: Scalable NoSQL Database Service". Retrieved 2016-11-28.
  54. "Datastore: NoSQL Schemaless Database". Retrieved 2016-11-28.
  55. "MongoDB Atlas: Hosted MongoDB as a Service". Retrieved 2016-08-30.
  56. "NoSQL Database Cloud Service". Oracle Cloud. Retrieved 2017-11-29.