A spatial database is a general-purpose database (usually a relational database) that has been enhanced to include spatial data that represents objects defined in a geometric space, along with tools for querying and analyzing such data.
Most spatial databases allow the representation of simple geometric objects such as points, lines and polygons. Some spatial databases handle more complex structures such as 3D objects, topological coverages, linear networks, and triangulated irregular networks (TINs). While typical databases have developed to manage various numeric and character types of data, such databases require additional functionality to process spatial data types efficiently, and developers have often added geometry or feature data types.
A geographic database (or geodatabase) is a georeferenced spatial database used for storing and manipulating geographic data (or geodata, i.e., data associated with a location on Earth), especially in geographic information systems (GIS). Almost all current relational and object-relational database management systems have spatial extensions, and some GIS software vendors have developed their own spatial extensions to database management systems.
The Open Geospatial Consortium (OGC) developed the Simple Features specification (first released in 1997) [1] and sets standards for adding spatial functionality to database systems. [2] The SQL/MM Spatial ISO/IEC standard is part of the SQL Multimedia and Application Packages (SQL/MM) standard and extends Simple Features. [3]
The core functionality added by a spatial extension to a database is one or more spatial datatypes, which allow spatial data to be stored as attribute values in a table. [4] Most commonly, a single spatial value is a geometric primitive (point, line, polygon, etc.) based on the vector data model. The datatypes in most spatial databases are based on the OGC Simple Features specification for representing geometric primitives. Some spatial databases also support the storage of raster data. Because all geographic locations must be specified according to a spatial reference system, spatial databases must also allow for the tracking and transformation of coordinate systems. In many systems, when a spatial column is defined in a table, it also includes a choice of coordinate system, chosen from a list of available systems that is stored in a lookup table.
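As a rough illustration of a spatial value stored alongside its coordinate system, the following Python sketch formats simple geometries as Well-Known Text (WKT), the text encoding defined by Simple Features. The `Geometry` class, the `to_wkt` helper, and the `SPATIAL_REF_SYS` dictionary are hypothetical names chosen for this example and are not the API of any particular database.

```python
from dataclasses import dataclass
from typing import List, Tuple

Coord = Tuple[float, float]


@dataclass
class Geometry:
    """A geometry value tagged with a spatial reference identifier (SRID)."""
    kind: str            # "POINT", "LINESTRING" or "POLYGON"
    coords: List[Coord]  # vertex list; a polygon ring must repeat its first vertex
    srid: int            # key into a lookup table of coordinate systems


def to_wkt(geom: Geometry) -> str:
    """Format the geometry as Well-Known Text (single-ring polygons only)."""
    pairs = ", ".join(f"{x:g} {y:g}" for x, y in geom.coords)
    if geom.kind == "POINT":
        return f"POINT ({pairs})"
    if geom.kind == "LINESTRING":
        return f"LINESTRING ({pairs})"
    if geom.kind == "POLYGON":
        return f"POLYGON (({pairs}))"
    raise ValueError(f"unsupported geometry type: {geom.kind}")


# Hypothetical stand-in for the lookup table of available coordinate systems.
SPATIAL_REF_SYS = {4326: "WGS 84 (longitude/latitude)"}

city = Geometry("POINT", [(2.35, 48.85)], srid=4326)
parcel = Geometry("POLYGON", [(0, 0), (1, 0), (1, 1), (0, 0)], srid=4326)

print(to_wkt(city), "in", SPATIAL_REF_SYS[city.srid])
print(to_wkt(parcel))
```

Real systems usually store geometries in a binary encoding and convert between coordinate systems on request; this sketch only shows how a value and its reference system travel together.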
The second major functionality extension in a spatial database is the addition of spatial capabilities to the query language (e.g., SQL); these give the spatial database the same query, analysis, and manipulation operations that are available in traditional GIS software. In most relational database management systems, this functionality is implemented as a set of new functions that can be used in SQL SELECT statements. Several types of operations are specified by the Open Geospatial Consortium standard, including measurement (e.g., distance and area), geoprocessing (e.g., intersection), and predicates (true/false tests of spatial relationships such as containment).
Some databases support only simplified or modified sets of these operations, especially in cases of NoSQL systems like MongoDB and CouchDB.
A spatial index is used by a spatial database to optimize spatial queries. Database systems use indices to quickly look up values by sorting data values in a linear (e.g. alphabetical) order; however, this way of indexing data is not optimal for spatial queries in two- or three-dimensional space. Instead, spatial databases use a spatial index designed specifically for multi-dimensional ordering. [5] Common spatial index methods include R-trees, quadtrees, k-d trees, and grid indexes.
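To illustrate the idea behind one simple spatial index, a uniform grid, the following is a minimal Python sketch. It is not how any particular database implements its index; the `GridIndex` class and its methods are hypothetical, and coordinates are assumed to be planar. Points are bucketed by grid cell, so a rectangle query inspects only the cells that the rectangle overlaps instead of scanning every row.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Point = Tuple[float, float]
Cell = Tuple[int, int]


class GridIndex:
    """Bucket points into square cells of a fixed size."""

    def __init__(self, cell_size: float) -> None:
        self.cell_size = cell_size
        self.cells: Dict[Cell, List[Tuple[str, Point]]] = defaultdict(list)

    def _cell(self, x: float, y: float) -> Cell:
        # Floor division maps a coordinate to the index of its cell.
        return (int(x // self.cell_size), int(y // self.cell_size))

    def insert(self, key: str, point: Point) -> None:
        self.cells[self._cell(*point)].append((key, point))

    def query(self, xmin: float, ymin: float, xmax: float, ymax: float) -> Set[str]:
        """Return keys of points inside the rectangle (edges inclusive)."""
        cx0, cy0 = self._cell(xmin, ymin)
        cx1, cy1 = self._cell(xmax, ymax)
        hits: Set[str] = set()
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                # Only candidates from overlapping cells get the exact test.
                for key, (x, y) in self.cells.get((cx, cy), []):
                    if xmin <= x <= xmax and ymin <= y <= ymax:
                        hits.add(key)
        return hits


index = GridIndex(cell_size=10.0)
index.insert("a", (1.0, 1.0))
index.insert("b", (15.0, 15.0))
index.insert("c", (95.0, 95.0))
print(sorted(index.query(0.0, 0.0, 20.0, 20.0)))
```

The two-step pattern shown here, a cheap filter on index cells followed by an exact test on the surviving candidates, is also the general shape of queries against tree-based spatial indexes.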
A spatial query is a special type of database query supported by spatial databases, including geodatabases. The queries differ from non-spatial SQL queries in several important ways. Two of the most important are that they allow for the use of geometry data types such as points, lines and polygons and that these queries consider the spatial relationship between these geometries.
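A query that depends on the spatial relationship between geometries can be sketched in plain Python as follows. The example filters point features by their distance from an origin, roughly what a distance function does inside a spatial database. The feature names and the `within_distance` helper are invented for illustration, and the coordinates are assumed to be planar, so Euclidean distance applies.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def distance(a: Point, b: Point) -> float:
    """Euclidean distance between two planar points."""
    return math.hypot(b[0] - a[0], b[1] - a[1])


def within_distance(features: Dict[str, Point], origin: Point, radius: float) -> List[str]:
    """Names of features no farther than `radius` from `origin`, sorted by name."""
    return sorted(
        name for name, pt in features.items() if distance(pt, origin) <= radius
    )


# Hypothetical point layer.
shops = {"bakery": (1.0, 1.0), "florist": (4.0, 5.0), "garage": (9.0, 9.0)}
nearby = within_distance(shops, origin=(1.0, 1.0), radius=5.0)
print(nearby)
```

For longitude/latitude data a real system would compute distance on the ellipsoid or transform to a projected coordinate system first; that step is deliberately left out of this sketch.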
The function names for queries differ across geodatabases. The following are a few of the functions built into PostGIS, a free geodatabase that is a PostgreSQL extension (the term 'geometry' refers to a point, line, box or other two- or three-dimensional shape): [7]
Function prototype: functionName (parameter(s)) : return type
Thus, a spatial join between a points layer of cities and a polygon layer of countries could be performed in a spatially-extended SQL statement as:
SELECT * FROM cities, countries WHERE ST_Contains(countries.shape, cities.shape)
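What such a join computes can be sketched in Python with a nested loop and a ray-casting point-in-polygon test. This is an illustrative assumption about the logic, not the algorithm PostGIS uses: the layer contents are invented, polygons are single rings without holes, and points lying exactly on a boundary are not handled with the rigor of a real containment predicate.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Ring = List[Point]


def point_in_polygon(point: Point, ring: Ring) -> bool:
    """Ray casting: count crossings of a ray running from the point toward +x."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # The edge is relevant only if it straddles the horizontal line at y.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def spatial_join(cities: Dict[str, Point], countries: Dict[str, Ring]) -> List[Tuple[str, str]]:
    """Pair every city with each country polygon that contains it."""
    return [
        (city, country)
        for city, pt in cities.items()
        for country, ring in countries.items()
        if point_in_polygon(pt, ring)
    ]


# Hypothetical layers: two square countries and three cities.
countries = {
    "Westland": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "Eastland": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
cities = {"Alba": (3, 4), "Borea": (15, 5), "Cyra": (25, 5)}

pairs = spatial_join(cities, countries)
print(pairs)
```

A database would normally avoid the full nested loop by first using a spatial index to find candidate polygons whose bounding boxes contain each point.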
The Intersect vector overlay operation (a core element of GIS software) could be replicated as:
SELECT ST_Intersection(veg.shape, soil.shape) AS int_poly, veg.*, soil.* FROM veg, soil WHERE ST_Intersects(veg.shape, soil.shape)
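The structure of this overlay, a predicate that selects intersecting pairs and a constructor that builds the shared geometry, can be sketched in Python under a strong simplifying assumption: every shape is an axis-aligned rectangle. Real overlay operations work on arbitrary polygons and are considerably more involved. The layer contents and the `intersects` and `intersection` helpers are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def intersects(a: Box, b: Box) -> bool:
    """True if the two rectangles share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def intersection(a: Box, b: Box) -> Optional[Box]:
    """The rectangle common to both inputs, or None if they are disjoint."""
    if not intersects(a, b):
        return None
    return (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))


# Hypothetical layers standing in for the veg and soil tables.
veg: Dict[str, Box] = {"forest": (0, 0, 6, 6), "meadow": (6, 0, 12, 6)}
soil: Dict[str, Box] = {"clay": (4, 2, 10, 8), "sand": (20, 20, 30, 30)}

# One output row per intersecting pair, carrying both names and the shared area.
overlay: List[Tuple[str, str, Optional[Box]]] = [
    (v_name, s_name, intersection(v_box, s_box))
    for v_name, v_box in veg.items()
    for s_name, s_box in soil.items()
    if intersects(v_box, s_box)
]
print(overlay)
```

As in the SQL statement, the predicate filters the candidate pairs and the intersection is computed only for pairs that pass it.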
DBS | License | Distributed | Spatial objects | Spatial functions | PostgreSQL interface | UMN MapServer interface | Documentation | Modifiable | HDFS |
---|---|---|---|---|---|---|---|---|---|
Apache Drill | Apache License 2.0 | yes | yes | yes - Drill Geospatial Functions Documentation | yes | no | Official Documentation | ANSI SQL | yes |
ArangoDB | Apache License 2.0 | yes | yes | yes - capabilities overview query language functions | no | no | official documentation | AQL | no |
GeoMesa | Apache License 2.0 | yes | yes (Simple Features) | yes (JTS) | no (can be built with GeoTools) | no | parts of the functions, a few examples | all kinds of tasks are solvable with Simple Feature Access in the Java Virtual Machine and Apache Spark | yes |
H2 (H2GIS) | LGPL 3 (since v1.3), GPL 3 before | no | yes (custom, no raster) | Simple Feature Access and custom functions for H2Network | yes | no | yes (homepage) | SQL | no |
Ingres | GPL or proprietary | yes (if extension is installed) | yes (custom, no raster) | Geometry Engine, Open Source [21] | no | with MapScript | just briefly | with C and OME | no |
Neo4J-spatial [22] | GNU Affero General Public License | no | yes (Simple Features) | yes (contain, cover, covered by, cross, disjoint, intersect, intersect window, overlap, touch, within and within distance) | no | no | just briefly | fork of JTS | no |
PostgreSQL with PostGIS | GNU General Public License | no | yes (Simple Features and raster) | yes (Simple Feature Access and raster functions) | yes | yes | detailed | SQL, in connection with R | no |
Postgres-XL with PostGIS | Mozilla Public License and GNU General Public License | yes | yes (Simple Features and raster) | yes (Simple Feature Access and raster functions) | yes | yes | PostGIS: yes, Postgres-XL: briefly | SQL, in connection with R or Tcl or Python | no |
Rasdaman | server GPL, client LGPL, enterprise proprietary | yes | just raster | raster manipulation with rasql | yes | with Web Coverage Service or Web Processing Service | detailed wiki | user-defined functions in enterprise edition | no |
RethinkDB | AGPL | yes | yes | | no | no | official documentation [23] | forking | no |
PostgreSQL, also known as Postgres, is a free and open-source relational database management system (RDBMS) emphasizing extensibility and SQL compliance. PostgreSQL features transactions with atomicity, consistency, isolation, durability (ACID) properties, automatically updatable views, materialized views, triggers, foreign keys, and stored procedures. It is supported on all major operating systems, including Linux, FreeBSD, OpenBSD, macOS, and Windows, and handles a range of workloads from single machines to data warehouses or web services with many concurrent users.
Ingres Database is a proprietary SQL relational database management system intended to support large commercial and government applications.
PostGIS is an open source software program that adds support for geographic objects to the PostgreSQL object-relational database. PostGIS follows the Simple Features for SQL specification from the Open Geospatial Consortium (OGC).
A GIS file format is a standard for encoding geographical information into a computer file, as a specialized type of file format for use in geographic information systems (GIS) and other geospatial applications. Since the 1970s, dozens of formats have been created based on various data models for various purposes. They have been created by government mapping agencies, GIS software vendors, standards bodies such as the Open Geospatial Consortium, informal user communities, and even individual developers.
TerraLib is an open-source geographic information system (GIS) software library. It extends object-relational database management systems (DBMS) to handle spatiotemporal data types.
The following tables compare general and technical information for a number of relational database management systems. Please see the individual products' articles for further information. Unless otherwise specified in footnotes, comparisons are based on the stable versions without any add-ons, extensions or external programs.
A GIS software program is a computer program to support the use of a geographic information system, providing the ability to create, store, manage, query, analyze, and visualize geographic data, that is, data representing phenomena for which location is important. The GIS software industry encompasses a broad range of commercial and open-source products that provide some or all of these capabilities within various information technology architectures.
ArcSDE is a server-software sub-system that aims to enable the usage of Relational Database Management Systems for spatial data. The spatial data may then be used as part of a geodatabase.
ArcGIS is a family of client, server and online geographic information system (GIS) software developed and maintained by Esri.
In computing, GiST or Generalized Search Tree, is a data structure and API that can be used to build a variety of disk-based search trees. GiST is a generalization of the B+ tree, providing a concurrent and recoverable height-balanced search tree infrastructure without making any assumptions about the type of data being stored, or the queries being serviced. GiST can be used to easily implement a range of well-known indexes, including B+ trees, R-trees, hB-trees, RD-trees, and many others; it also allows for easy development of specialized indexes for new data types. It cannot be used directly to implement non-height-balanced trees such as quad trees or prefix trees (tries), though like prefix trees it does support compression, including lossy compression. GiST can be used for any data type that can be naturally ordered into a hierarchy of supersets. Not only is it extensible in terms of data type support and tree layout, it allows the extension writer to support any query predicates that they choose.
Simple Features is a set of standards that specify a common storage and access model of geographic features made of mostly two-dimensional geometries used by geographic databases and geographic information systems. It is formalized by both the Open Geospatial Consortium (OGC) and the International Organization for Standardization (ISO).
JTS Topology Suite is an open-source Java software library that provides an object model for Euclidean planar linear geometry together with a set of fundamental geometric functions. JTS is primarily intended to be used as a core component of vector-based geomatics software such as geographical information systems. It can also be used as a general-purpose library providing algorithms in computational geometry.
Oracle Spatial and Graph, formerly Oracle Spatial, is a free option component of the Oracle Database. The spatial features in Oracle Spatial and Graph aid users in managing geographic and location-data in a native type within an Oracle database, potentially supporting a wide range of applications — from automated mapping, facilities management, and geographic information systems (AM/FM/GIS), to wireless location services and location-enabled e-business. The graph features in Oracle Spatial and Graph include Oracle Network Data Model (NDM) graphs used in traditional network applications in major transportation, telcos, utilities and energy organizations and RDF semantic graphs used in social networks and social interactions and in linking disparate data sets to address requirements from the research, health sciences, finance, media and intelligence communities.
An object-based spatial database is a spatial database that stores locations as objects. The object-based spatial model treats the world as a surface littered with recognizable objects, which exist independently of their locations.
A geographic data model, geospatial data model, or simply data model in the context of geographic information systems, is a mathematical and digital structure for representing phenomena over the Earth. Generally, such data models represent various aspects of these phenomena by means of geographic data, including spatial locations, attributes, change over time, and identity. For example, the vector data model represents geography as collections of points, lines, and polygons, and the raster data model represents geography as cell matrices that store numeric values. Data models are implemented throughout the GIS ecosystem, including the software tools for data management and spatial analysis, data stored in a variety of GIS file formats, specifications and standards, and specific designs for GIS installations.
SpatiaLite is a spatial extension to SQLite, providing vector geodatabase functionality. It is similar to PostGIS, Oracle Spatial, and SQL Server with spatial extensions, although SQLite/SpatiaLite are not based on a client-server architecture. Instead, they adopt a simpler personal architecture: the whole SQL engine is embedded directly within the application, and a complete database is an ordinary file that can be freely copied and transferred from one computer or operating system to another without any special precaution.
The Dimensionally Extended 9-Intersection Model (DE-9IM) is a topological model and a standard used to describe the spatial relations of two regions, in geometry, point-set topology, geospatial topology, and fields related to computer spatial analysis. The spatial relations expressed by the model are invariant to rotation, translation and scaling transformations.
An array database management system or array DBMS provides database services specifically for arrays, that is: homogeneous collections of data items, sitting on a regular grid of one, two, or more dimensions. Often arrays are used to represent sensor, simulation, image, or statistics data. Such arrays tend to be Big Data, with single objects frequently ranging into Terabyte and soon Petabyte sizes; for example, today's earth and space observation archives typically grow by Terabytes a day. Array databases aim at offering flexible, scalable storage and retrieval on this information category.
GeoSPARQL is a standard for representation and querying of geospatial linked data for the Semantic Web from the Open Geospatial Consortium (OGC). The definition of a small ontology based on well-understood OGC standards is intended to provide a standardized exchange basis for geospatial RDF data which can support both qualitative and quantitative spatial reasoning and querying with the SPARQL database query language.
A Geodatabase is a proprietary GIS file format developed in the late 1990s by Esri to represent, store, and organize spatial datasets within a geographic information system. A geodatabase is both a logical data model and the physical implementation of that logical model in several proprietary file formats released during the 2000s. The geodatabase design is based on the spatial database model for storing spatial data in relational and object-relational databases. Given the dominance of Esri in the GIS industry, the term "geodatabase" is used by some as a generic trademark for any spatial database, regardless of platform or design.
[...] 1997 [...] OGC released the OpenGIS Simple Features Specification, which specifies the interface that enables diverse systems to communicate in terms of 'simple features' which are based on 2D geometry. The supported geometry types include points, lines, linestrings, curves, and polygons. Each geometric object is associated with a Spatial Reference System, which describes the coordinate space in which the geometric object is defined.