Oracle NoSQL Database

Oracle NoSQL DB
Developer(s): Oracle Corporation
Initial release: September 2011
Stable release: 22.2 / 19 August 2022
Written in: Java
Available in: English
Type: NoSQL
License: Apache License 2.0 (CE) and proprietary (EE)
Website: oracle.com/technetwork/database/database-technologies/nosqldb/

Oracle NoSQL Database is a NoSQL-type distributed key-value database from Oracle Corporation. [1] [2] [3] [4] It provides transactional semantics for data manipulation, horizontal scalability, and simple administration and monitoring.


Oracle NoSQL Database Cloud Service is a managed cloud service for applications that require low latency, flexible data models, and elastic scaling for dynamic workloads.

Developers can focus on application development and data store requirements rather than on managing back-end servers, storage expansion, cluster deployments, topology, software installation, patches and upgrades, backup, operating systems, and availability. The service scales to meet dynamic application workloads and throughput requirements.

Users create tables to store their application data and perform database operations. A NoSQL table is similar to a relational table with additional properties including provisioned write units, read units, and storage capacity. Users provision the throughput and storage capacity in each table based on anticipated workloads. NoSQL Database resources are allocated and scaled accordingly to meet workload requirements. Users are billed hourly based on the capacity provisioned.

NoSQL Database supports a tabular model. Each row is identified by a unique key and has a value of arbitrary length, which is interpreted by the application. The application can manipulate (insert, delete, update, read) a single row in a transaction. The application can also perform an iterative, non-transactional scan of all the rows in the database.

Licensing

Oracle Corporation distributes the Oracle NoSQL Database in three editions:

Oracle NoSQL Database is licensed using a freemium model: open-source versions of Oracle NoSQL Community Edition are available, but end-users can purchase additional features and support via the Oracle Store. [5]

Oracle NoSQL Database drivers, [6] licensed pursuant to the Apache 2.0 License, are used with both the community and enterprise editions. [7]

Main features

Architecture

Oracle NoSQL Database is built upon the Oracle Berkeley DB Java Edition high-availability storage engine. It adds services to provide a distributed, highly available key/value store, suited for large-volume, latency-sensitive applications. [8]

Sharding and replication

Oracle NoSQL Database is a client-server, sharded, shared-nothing system. The data in each shard are replicated on each of the nodes that comprise the shard. The major key for a record is hashed to identify the shard that the record belongs to. Oracle NoSQL Database is designed to support changing the number of shards dynamically in response to availability of additional hardware. If the number of shards changes, key-value pairs are redistributed across the new set of shards dynamically, without requiring a system shutdown and restart. A shard is made up of a single electable master node to serve read and write requests, and several replicas (usually two or more) that can serve read requests. Replicas are kept up to date using streaming replication. Each change on the master node is committed locally to disk and also propagated to the replicas.
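The hashing step described above can be illustrated with a minimal Python sketch. It is not Oracle's implementation: the MD5 hash, the fixed partition count, the modulo assignment of partitions to shards, and the function names are all assumptions chosen for illustration. The point it shows is that a key's hash bucket stays fixed while the bucket-to-shard assignment can change when shards are added.

```python
import hashlib

NUM_PARTITIONS = 12  # hypothetical fixed number of hash buckets


def partition_for(major_key: str) -> int:
    """Hash the major key to a fixed partition (bucket) number."""
    digest = hashlib.md5(major_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


def shard_for(major_key: str, num_shards: int) -> int:
    """Map the key's partition to one of the current shards.

    Only this assignment changes when the number of shards changes;
    the partition computed from the key stays the same.
    """
    return partition_for(major_key) % num_shards


if __name__ == "__main__":
    for key in ("user/1", "user/2", "order/77"):
        print(key, partition_for(key), shard_for(key, 3), shard_for(key, 4))
```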

High availability and fault-tolerance

Oracle NoSQL Database provides single-master, multi-replica database replication. [9] Transactional data is delivered to all replica nodes with flexible durability policies per transaction. If the master node fails, a Paxos-based automated failover election process minimizes downtime. As soon as the failed node is repaired, it rejoins the shard, is brought up to date, and then becomes available for processing read requests. Thus, Oracle NoSQL Database applications can tolerate failures of nodes within a shard and also multiple failures of nodes in distinct shards.

Proper placement of masters and replicas on server hardware (racks and interconnect switches) by Oracle NoSQL Database is intended to increase availability on commodity servers.

Transparent load balancing

Oracle NoSQL Database Driver [10] partitions the data in real time and evenly distributes it across the storage nodes. It is network topology and latency-aware, routing read and write operations to the most appropriate storage node in order to optimize load distribution and performance.
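As a rough illustration of this routing idea, the following Python sketch sends writes to the shard's master and reads to the node with the lowest observed latency. The `Node` type, its fields, and the `route` function are hypothetical names invented for this example; the real driver's selection logic is not described in this article and is likely more involved.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Node:
    name: str
    is_master: bool
    latency_ms: float  # assumed to be measured by the client


def route(nodes: List[Node], is_write: bool) -> Node:
    """Pick a node of one shard for a request."""
    if is_write:
        # Writes must go to the single master of the shard.
        return next(n for n in nodes if n.is_master)
    # Reads may be served by any node; prefer the closest one.
    return min(nodes, key=lambda n: n.latency_ms)


if __name__ == "__main__":
    shard = [
        Node("rn1", is_master=True, latency_ms=4.0),
        Node("rn2", is_master=False, latency_ms=1.5),
        Node("rn3", is_master=False, latency_ms=9.0),
    ]
    print(route(shard, is_write=True).name)
    print(route(shard, is_write=False).name)
```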

Administration and system monitoring

Oracle NoSQL Database's administration service can be accessed from a web console or a command-line interface. This service supports functionality such as the ability to configure, start, stop and monitor a storage node, without requiring configuration files, shell scripts, or explicit database operations. It allows Java Management Extensions (JMX) or Simple Network Management Protocol (SNMP) agents to be available for monitoring. This allows management clients to poll information about the status, performance metrics and operational parameters of a storage node and its managed services. [11]

Elastic configuration

"Elasticity" refers to dynamic online expansion of the deployed cluster. [12] Adding storage nodes increases capacity, performance and reliability. Oracle NoSQL Database includes a topology planning feature, with which an administrator can modify the configuration of a NoSQL database while the database is online. The administrator can:

  • Increase data distribution: by increasing the number of shards in the cluster, which increases write throughput.
  • Increase replication factor: by assigning additional replication nodes to each shard, which increases read throughput and system availability.
  • Rebalance data store: by modifying the capacity of storage nodes, the system can be rebalanced, re-allocating replication nodes to storage nodes [13] as appropriate.

Administrators can move replication nodes and/or partitions from over-utilized nodes onto underutilized storage nodes or vice versa.
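The effect of moving replication nodes between storage nodes can be sketched as a simple load-evening loop. This is only an illustration under assumed inputs: the per-node counts, the stopping rule (a difference of at most one), and the function name `rebalance` are hypothetical and do not describe the product's topology planner.

```python
from typing import Dict, List, Tuple


def rebalance(load: Dict[str, int]) -> List[Tuple[str, str]]:
    """Return (source, target) moves that even out replication nodes.

    `load` maps a storage node name to the number of replication
    nodes it currently hosts. The input dictionary is not modified.
    """
    load = dict(load)
    moves: List[Tuple[str, str]] = []
    while True:
        busiest = max(load, key=load.get)
        idlest = min(load, key=load.get)
        if load[busiest] - load[idlest] <= 1:
            return moves
        # Move one replication node from the busiest to the idlest node.
        load[busiest] -= 1
        load[idlest] += 1
        moves.append((busiest, idlest))


if __name__ == "__main__":
    print(rebalance({"sn1": 4, "sn2": 0}))
```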

Multi-zone deployment

Oracle NoSQL Database supports multiple zones to intelligently allocate replication of processes and data, in order to improve reliability during hardware, network and power-related failure modes. There are two types of zones. Primary zones contain nodes that can serve as masters or replicas and are typically connected by fast interconnects. Secondary zones contain nodes that can only serve as replicas. Secondary zones can be used to provide low-latency read access to data at a distant location, or to offload read-only workloads such as analytics, report generation and data exchange for improved workload management.

JSON data format

Oracle NoSQL Database supports Avro [14] data serialization, which provides a compact, schema-based binary data format. Schemas are defined using JSON. Oracle NoSQL Database supports schema evolution.

Configurable Smart Topology

System administrators indicate how much capacity is available on a given storage node, allowing more capable nodes to host multiple replication nodes. Once the system knows about the capacity for the storage nodes in a configuration, it automatically allocates replication nodes intelligently. This is intended for better load balancing, better use of system resources and minimizing system impact in the event of storage node failure. Smart Topology supports data centers, ensuring that a full set of replicas is initially allocated to each data center.
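A capacity-aware allocation of replication nodes to storage nodes might look like the greedy Python sketch below. It is an assumption-laden illustration, not the Smart Topology algorithm: the function name `allocate`, the rule of placing each shard's replicas on distinct storage nodes, and the "most free capacity first" ordering are all choices made for this example.

```python
from typing import Dict, List


def allocate(capacity: Dict[str, int], num_shards: int,
             replication_factor: int) -> Dict[int, List[str]]:
    """Place each shard's replication nodes on distinct storage nodes.

    `capacity` maps a storage node name to how many replication nodes
    it can host. Nodes with the most free capacity are used first.
    """
    free = dict(capacity)
    placement: Dict[int, List[str]] = {}
    for shard in range(num_shards):
        candidates = sorted(
            (sn for sn in free if free[sn] > 0),
            key=lambda sn: (-free[sn], sn),
        )
        if len(candidates) < replication_factor:
            raise ValueError("not enough storage node capacity")
        chosen = candidates[:replication_factor]
        for sn in chosen:
            free[sn] -= 1
        placement[shard] = chosen
    return placement


if __name__ == "__main__":
    print(allocate({"sn1": 2, "sn2": 2, "sn3": 2}, num_shards=2,
                   replication_factor=3))
```

Keeping the replicas of one shard on different storage nodes means that losing a single storage node removes at most one replica of any shard.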

Online rolling upgrade

Oracle NoSQL Database provides facilities to perform a rolling upgrade, allowing a system administrator to upgrade cluster nodes while the database remains available. [15]

Fault tolerance

Oracle NoSQL Database is configurable to be either C/P or A/P in CAP. [16] In particular, if writes are configured to be performed synchronously to all replicas, it is C/P in CAP i.e. a partition or node failure causes the system to be unavailable for writes. If replication is performed asynchronously, and reads are configured to be served from any replica, it is A/P in CAP i.e. the system is always available, but there is no guarantee of consistency.
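The trade-off between the two configurations can be made concrete with a small Python sketch. The `AckPolicy` enum and `write_is_accepted` function are hypothetical names for this example and are not the product's API; they model only the two cases named above.

```python
from enum import Enum


class AckPolicy(Enum):
    ALL = "all"    # synchronous: every replica must acknowledge
    NONE = "none"  # asynchronous: the master does not wait for replicas


def write_is_accepted(policy: AckPolicy, reachable_replicas: int,
                      total_replicas: int) -> bool:
    """Decide whether a write can be acknowledged to the client."""
    if policy is AckPolicy.ALL:
        # C/P behaviour: one unreachable replica blocks writes.
        return reachable_replicas == total_replicas
    # A/P behaviour: always accept, replicas may lag behind.
    return True


if __name__ == "__main__":
    # One of two replicas is cut off by a network partition.
    print(write_is_accepted(AckPolicy.ALL, 1, 2))
    print(write_is_accepted(AckPolicy.NONE, 1, 2))
```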

Database features

Table data model

Release 3.0 introduced a tabular data structure, which simplifies application data modeling by leveraging existing schema design concepts. The table model is layered on top of the distributed key-value structure, inheriting all its advantages and simplifying application design by enabling integration with familiar SQL-based applications.

Secondary index

Indexing based only on the primary key limits the number of low-latency access paths. Sometimes applications need non-primary-key access paths to support specific requirements. Oracle NoSQL Database supports secondary indexes on any value field. [17]
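The idea behind a secondary index can be shown with a minimal in-memory Python sketch: alongside the rows keyed by primary key, a second map goes from an indexed field value to the primary keys that carry it. The `Table` class and its methods are hypothetical and unrelated to the product's API.

```python
from collections import defaultdict
from typing import Any, Dict, List, Set


class Table:
    """Rows keyed by primary key, plus one secondary index on a field."""

    def __init__(self, indexed_field: str) -> None:
        self.rows: Dict[int, Dict[str, Any]] = {}
        self.indexed_field = indexed_field
        self.index: Dict[Any, Set[int]] = defaultdict(set)

    def put(self, key: int, row: Dict[str, Any]) -> None:
        old = self.rows.get(key)
        if old is not None:
            # Remove the stale index entry before re-indexing the row.
            self.index[old[self.indexed_field]].discard(key)
        self.rows[key] = row
        self.index[row[self.indexed_field]].add(key)

    def find_by_index(self, value: Any) -> List[Dict[str, Any]]:
        """Look up rows by the indexed field without scanning all rows."""
        return [self.rows[k] for k in sorted(self.index.get(value, ()))]


if __name__ == "__main__":
    users = Table(indexed_field="city")
    users.put(1, {"name": "Ann", "city": "Oslo"})
    users.put(2, {"name": "Bob", "city": "Lima"})
    users.put(3, {"name": "Cy", "city": "Oslo"})
    print(users.find_by_index("Oslo"))
```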

Large object support

Oracle NoSQL Database EE provides stream-based APIs for reading and writing large objects (LOBs) such as audio and video files, without having to materialize the entire file in memory. This is intended to decrease the latency of operations across mixed workloads of objects of varying sizes. [18]
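The general technique of stream-based access, as opposed to loading a whole object into memory, can be sketched in Python with standard file-like objects. This is a generic illustration; `copy_stream` and the chunk size are assumptions and do not represent the product's LOB API.

```python
import io
from typing import BinaryIO


def copy_stream(src: BinaryIO, dst: BinaryIO,
                chunk_size: int = 64 * 1024) -> int:
    """Copy a large object chunk by chunk and return the bytes copied.

    At most `chunk_size` bytes are held in memory at any time.
    """
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            return total
        dst.write(chunk)
        total += len(chunk)


if __name__ == "__main__":
    source = io.BytesIO(b"\x00" * 200_000)  # stands in for a video file
    target = io.BytesIO()
    print(copy_stream(source, target))
```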

ACID compliant transaction

Oracle NoSQL Database provides ACID compliant transactions for full create, read, update and delete (CRUD) operations, with adjustable durability and consistency transaction guarantees. A sequence of operations can operate as a single atomic unit as long as all the affected records share the same major key path. [19]
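The restriction to a shared major key path can be illustrated with a short Python sketch. Keys are modeled here as `(major, minor)` tuples and the store as a plain dictionary; these shapes, the operation tuples, and the function name `execute_atomically` are assumptions for the example, not the product's API. The batch is applied to a staged copy so that either every operation takes effect or none does.

```python
from typing import Any, Dict, List, Tuple

Key = Tuple[str, str]                 # (major key, minor key), assumed shape
Operation = Tuple[str, Key, Any]      # ("put" | "delete", key, value)


def execute_atomically(store: Dict[Key, Any],
                       operations: List[Operation]) -> None:
    """Apply all operations or none; all keys must share one major key."""
    majors = {key[0] for _, key, _ in operations}
    if len(majors) > 1:
        raise ValueError("all operations must share one major key")
    staged = dict(store)  # work on a copy so a failure leaves store intact
    for kind, key, value in operations:
        if kind == "put":
            staged[key] = value
        elif kind == "delete":
            del staged[key]  # raises KeyError if the key is absent
        else:
            raise ValueError(f"unknown operation: {kind}")
    store.clear()
    store.update(staged)


if __name__ == "__main__":
    db: Dict[Key, Any] = {}
    execute_atomically(db, [
        ("put", ("user/1", "name"), "Ann"),
        ("put", ("user/1", "email"), "ann@example.com"),
    ])
    print(db)
```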

Integration

Oracle NoSQL Database includes support for Java, C, Python, C# and REST APIs, which allow the application developer to perform CRUD operations. These libraries include Avro support, so that developers can serialize and deserialize key-value records interchangeably between C and Java applications. [20]

Oracle RESTful Services

Oracle NoSQL Database supports Oracle REST Data Services (ORDS). [21] This allows customers to build a REST-based application that can access data in either Oracle Database or Oracle NoSQL Database.

GeoJSON

Oracle NoSQL Database supports spatial queries on GeoJSON data that complies with RFC 7946, including spatial functions and indexing for GeoJSON data.
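To show what a spatial predicate over GeoJSON data involves, the Python sketch below computes the great-circle distance between two GeoJSON `Point` objects and tests whether one lies within a given radius of the other. RFC 7946 orders coordinates as longitude, then latitude. The haversine formula, the spherical Earth radius, and the function names are choices made for this example and say nothing about how the database evaluates its own spatial functions.

```python
import math
from typing import Any, Dict

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, spherical approximation


def distance_m(a: Dict[str, Any], b: Dict[str, Any]) -> float:
    """Haversine distance in metres between two GeoJSON Point objects."""
    lon1, lat1 = a["coordinates"][:2]  # RFC 7946: [longitude, latitude]
    lon2, lat2 = b["coordinates"][:2]
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    h = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def is_near(point: Dict[str, Any], center: Dict[str, Any],
            max_distance_m: float) -> bool:
    """True if `point` lies within `max_distance_m` of `center`."""
    return distance_m(point, center) <= max_distance_m


if __name__ == "__main__":
    origin = {"type": "Point", "coordinates": [0.0, 0.0]}
    north = {"type": "Point", "coordinates": [0.0, 1.0]}
    print(round(distance_m(origin, north)))
    print(is_near(north, origin, 200_000))
```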

Apache Hadoop

The KVAvroInputFormat and KVInputFormat [22] classes are available to read data from Oracle NoSQL Database natively into Hadoop MapReduce jobs. One use for these classes is to read NoSQL database records into Oracle Loader for Hadoop. [23]

Oracle integration

Oracle Big Data SQL and Hive

Oracle Big Data SQL is a common SQL access layer to data stored in Hadoop, HDFS, Hive and Oracle NoSQL Database. This allows customers to query Oracle NoSQL Database data from Hive or Oracle Database. Users can run MapReduce jobs against data stored in an Oracle NoSQL Database that is configured for secure access. The latest release also supports both primitive and complex data types.

Oracle Database

Oracle NoSQL Database EE supports external tables, which allow Oracle NoSQL data to be fetched from Oracle Database using SQL statements such as SELECT and SELECT COUNT(*). Once NoSQL data is exposed through external tables, it can be accessed via standard JDBC drivers and visualized through enterprise business intelligence tools.

Other Oracle products

Oracle Event Processing (OEP) provides read access to Oracle NoSQL Database via the NoSQL Database cartridge. Once the cartridge is configured, CQL queries can be used. Oracle Semantic Graph includes a Jena Adapter for Oracle NoSQL Database [24] to store large volumes of RDF data (as triples or quads). This adapter enables fast access to graph data stored in Oracle NoSQL Database via SPARQL queries. Integration with Oracle Coherence allows Oracle NoSQL Database to be used as a cache for Oracle Coherence applications, allowing applications to directly access cached data from it.

Enterprise security

Oracle NoSQL Database EE supports OS-independent, cluster-wide, password-based user authentication and Oracle Wallet integration, providing greater protection [25] from unauthorized access to sensitive data. Additionally, session-level Secure Sockets Layer (SSL) encryption and network port restrictions improve protection from network intrusion.

Performance

The Oracle NoSQL Database team has worked with several key Oracle partners, including Intel and Cisco, [26] performing Yahoo! Cloud Serving Benchmarks (YCSB) on various hardware configurations, and published its results. For example, in 2012 Oracle reported that Oracle NoSQL Database exceeded 1 million mixed YCSB Ops/Sec. [27]


References

  1. "Oracle NoSQL Database Technical Overview". www.oracle.com.
  2. "Oracle NoSQL Database Performance Tests".
  3. Wayner, Peter (16 November 2011). "First look: Oracle NoSQL Database".
  4. Wolfe, Alexander. "Do You Know NoSQL?". Forbes.
  5. "Oracle Store". shop.oracle.com.
  6. "Oracle NoSQL Database Downloads". www.oracle.com.
  7. "Oracle NoSQL Database 12c Release 2 (12.2.4.5)". docs.oracle.com.
  8. "Oracle NoSQL Database White Paper" (PDF).
  9. "Chapter 1. Introduction to Oracle NoSQL Database". docs.oracle.com.
  10. "Intelligent Drivers".
  11. "Oracle NoSQL Database Administration". www.oracle.com. Retrieved 2019-04-15.
  12. "Elastic Expansion". www.oracle.com.
  13. "Storage Nodes".
  14. "Chapter 8. Avro Bindings". docs.oracle.com.
  15. "Rolling Upgrade". www.oracle.com.
  16. Abadi, Daniel (4 October 2011). "DBMS Musings: Overview of the Oracle NoSQL Database".
  17. "Oracle NoSQL Database 3.0 Supports Table Data Model and Secondary Indexing". InfoQ.
  18. "Large Object Support". www.oracle.com.
  19. "Oracle NoSQL Database transactions". www.oracle.com.
  20. "Oracle NoSQL Database API". www.oracle.com.
  21. Compare: "Overview (Oracle REST Data Services Plugin API)". download.oracle.com. Oracle Corporation. Retrieved 2015-11-30. This document describes how to develop and deploy plugins that integrate with the Oracle REST Data Services (ORDS) runtime.
  22. "Oracle NoSQL Database API". docs.oracle.com.
  23. "Using Oracle NoSQL Database with Hadoop". www.oracle.com.
  24. "Oracle Semantic Technologies Downloads". www.oracle.com.
  25. "Oracle NoSQL Database 3.0 Ups Security and Performance". www.dbta.com. April 2, 2014.
  26. "Cisco Data Center" (PDF). Cisco.
  27. "Oracle NoSQL Database Exceeds 1 Million Mixed YCSB Ops/Sec". Archived 2015-05-20 at the Wayback Machine.