Spatial database

A spatial database is a general-purpose database (usually a relational database) that has been enhanced to include spatial data that represents objects defined in a geometric space, along with tools for querying and analyzing such data.

Most spatial databases allow the representation of simple geometric objects such as points, lines and polygons. Some spatial databases handle more complex structures such as 3D objects, topological coverages, linear networks, and triangulated irregular networks (TINs). Conventional databases were developed to manage numeric and character data, so they need additional functionality to process spatial data efficiently; developers have typically provided it by adding geometry or feature data types.

A geographic database (or geodatabase) is a georeferenced spatial database used for storing and manipulating geographic data (or geodata, i.e., data associated with a location on Earth), [note 1] especially in geographic information systems (GIS). Almost all current relational and object-relational database management systems have spatial extensions, and some GIS software vendors have developed their own spatial extensions to database management systems.

The Open Geospatial Consortium (OGC) developed the Simple Features specification (first released in 1997) [1] and sets standards for adding spatial functionality to database systems. [2] SQL/MM Spatial, an ISO/IEC standard, is the spatial part of the SQL Multimedia and Application Packages (SQL/MM) standard and extends Simple Features. [3]

Characteristics

The core functionality added by a spatial extension to a database is one or more spatial datatypes, which allow for the storage of spatial data as attribute values in a table. [4] Most commonly, a single spatial value would be a geometric primitive (point, line, polygon, etc.) based on the vector data model. The datatypes in most spatial databases are based on the OGC Simple Features specification for representing geometric primitives. Some spatial databases also support the storage of raster data. Because all geographic locations must be specified according to a spatial reference system, spatial databases must also allow for the tracking and transformation of coordinate systems. In many systems, when a spatial column is defined in a table, it also includes a choice of coordinate system, chosen from a list of available systems that is stored in a lookup table.
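
As a rough illustration of a spatial value stored as an attribute, the following Python sketch models a geometry together with a spatial reference identifier and prints it in the well-known text (WKT) notation used by Simple Features. The `Geometry` class and the `SPATIAL_REF_SYS` dictionary are hypothetical names invented for this example and do not correspond to any particular database; the two identifiers shown are the EPSG codes for WGS 84 and WGS 84 / Pseudo-Mercator.

```python
from dataclasses import dataclass
from typing import List, Tuple

Coord = Tuple[float, float]


def _fmt(c: Coord) -> str:
    # WKT separates the x and y of one coordinate with a space.
    return f"{c[0]:g} {c[1]:g}"


@dataclass
class Geometry:
    """Hypothetical stand-in for a spatial column value."""
    kind: str            # "POINT", "LINESTRING" or "POLYGON"
    coords: List[Coord]  # for POLYGON: a single closed exterior ring
    srid: int            # key into the reference-system lookup table

    def to_wkt(self) -> str:
        body = ", ".join(_fmt(c) for c in self.coords)
        if self.kind == "POLYGON":
            body = f"({body})"  # each polygon ring gets its own parentheses
        return f"{self.kind} ({body})"


# Hypothetical lookup table of available coordinate systems, keyed by SRID.
SPATIAL_REF_SYS = {
    4326: "WGS 84 (longitude/latitude)",
    3857: "WGS 84 / Pseudo-Mercator",
}

city = Geometry("POINT", [(13.4, 52.5)], srid=4326)
square = Geometry("POLYGON", [(0, 0), (4, 0), (4, 4), (0, 4), (0, 0)], srid=4326)

print(city.to_wkt(), "in", SPATIAL_REF_SYS[city.srid])
print(square.to_wkt())
```

A real spatial database would additionally validate the geometry and transform coordinates between reference systems; the sketch only shows how a geometry and its coordinate-system choice travel together.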

The second major functionality extension in a spatial database is the addition of spatial capabilities to the query language (e.g., SQL); these give the spatial database the same query, analysis, and manipulation operations that are available in traditional GIS software. In most relational database management systems, this functionality is implemented as a set of new functions that can be used in SQL SELECT statements. Several types of operations are specified by the Open Geospatial Consortium standard:

Some databases support only simplified or modified sets of these operations, especially in cases of NoSQL systems like MongoDB and CouchDB.

Spatial index

A spatial index is used by a spatial database to optimize spatial queries. Database systems use indices to quickly look up values by sorting data values in a linear (e.g. alphabetical) order; however, this way of indexing data is not optimal for spatial queries in two- or three-dimensional space. Instead, spatial databases use a spatial index designed specifically for multi-dimensional ordering. [5] Common spatial index methods include:
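
A uniform grid is one of the simplest ways to index points in two dimensions. The Python sketch below is a minimal illustration under that assumption, not the structure used by any specific database: `GridIndex`, its `cell_size` parameter and the sample keys are all hypothetical. Points are bucketed by grid cell, so a bounding-box query only inspects the cells that overlap the box instead of scanning every row.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Point = Tuple[float, float]
Cell = Tuple[int, int]


class GridIndex:
    """Hypothetical fixed-size grid index over 2D points."""

    def __init__(self, cell_size: float) -> None:
        self.cell_size = cell_size
        self.cells: Dict[Cell, List[Tuple[str, Point]]] = defaultdict(list)

    def _cell(self, x: float, y: float) -> Cell:
        # Map a coordinate to the integer grid cell that contains it.
        return (int(x // self.cell_size), int(y // self.cell_size))

    def insert(self, key: str, point: Point) -> None:
        self.cells[self._cell(*point)].append((key, point))

    def query_box(self, xmin: float, ymin: float,
                  xmax: float, ymax: float) -> Set[str]:
        cx0, cy0 = self._cell(xmin, ymin)
        cx1, cy1 = self._cell(xmax, ymax)
        hits: Set[str] = set()
        # Visit only the cells overlapping the query box ...
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                # ... then filter the candidates in each cell exactly.
                for key, (x, y) in self.cells.get((cx, cy), []):
                    if xmin <= x <= xmax and ymin <= y <= ymax:
                        hits.add(key)
        return hits


index = GridIndex(cell_size=10.0)
index.insert("a", (1.0, 1.0))
index.insert("b", (12.0, 3.0))
index.insert("c", (55.0, 70.0))

print(sorted(index.query_box(0, 0, 15, 5)))
```

The two-phase pattern (a cheap candidate lookup followed by an exact test) is the general idea behind spatial indexing; tree-based indexes apply it with adaptive rather than fixed partitions.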

Spatial query

A spatial query is a special type of database query supported by spatial databases, including geodatabases. The queries differ from non-spatial SQL queries in several important ways. Two of the most important are that they allow for the use of geometry data types such as points, lines and polygons and that these queries consider the spatial relationship between these geometries.

The function names for queries differ across geodatabases. The following are a few of the functions built into PostGIS, a free geodatabase implemented as a PostgreSQL extension (the term 'geometry' refers to a point, line, box or other two- or three-dimensional shape): [7]

Function prototype: functionName (parameter(s)) : return type

Thus, a spatial join between a points layer of cities and a polygon layer of countries could be performed in a spatially-extended SQL statement as:

SELECT *
FROM cities
JOIN countries
  ON ST_Contains(countries.shape, cities.shape);
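
To make the semantics of such a join concrete, the following Python sketch pairs each city point with the country polygon that contains it, using a ray-casting point-in-polygon test. It is a simplified illustration only: the `contains` and `spatial_join` functions and the sample data are hypothetical, polygons are single rings without holes, and points lying exactly on a boundary are not treated the way a database predicate such as ST_Contains would treat them.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Polygon = List[Point]  # one exterior ring, vertices in order, no holes


def contains(polygon: Polygon, point: Point) -> bool:
    """Ray casting: count how many edges a ray going right from the point crosses."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point matter.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # an odd number of crossings means inside
    return inside


def spatial_join(cities: Dict[str, Point],
                 countries: Dict[str, Polygon]) -> List[Tuple[str, str]]:
    # Nested loop join: test every city against every country.
    return [(city, country)
            for city, pt in cities.items()
            for country, poly in countries.items()
            if contains(poly, pt)]


countries = {
    "A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "B": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
cities = {"p": (2, 3), "q": (15, 5), "r": (30, 30)}

print(spatial_join(cities, countries))
```

City `r` falls outside both polygons and therefore drops out of the result, just as an unmatched row drops out of an inner join. A database would normally avoid the nested loop by consulting a spatial index first.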

The Intersect vector overlay operation (a core element of GIS software) could be replicated as:

SELECT ST_Intersection(veg.shape, soil.shape) AS int_poly, veg.*, soil.*
FROM veg
JOIN soil
  ON ST_Intersects(veg.shape, soil.shape);
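
The overlay can be sketched in Python under a strong simplifying assumption: every shape is an axis-aligned rectangle, so the intersection is itself a rectangle. The `Rect` type, the `intersection` and `overlay` functions, and the sample layers are hypothetical; real overlay operations work on arbitrary polygons and, unlike this sketch, an intersects predicate is typically also true for shapes that merely touch.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Rect(NamedTuple):
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def intersection(a: Rect, b: Rect) -> Optional[Rect]:
    """Return the overlapping rectangle, or None if the overlap has no area."""
    xmin, ymin = max(a.xmin, b.xmin), max(a.ymin, b.ymin)
    xmax, ymax = min(a.xmax, b.xmax), min(a.ymax, b.ymax)
    if xmin < xmax and ymin < ymax:
        return Rect(xmin, ymin, xmax, ymax)
    return None


def overlay(veg: Dict[str, Rect],
            soil: Dict[str, Rect]) -> List[Tuple[str, str, Rect]]:
    # One output feature per overlapping (vegetation, soil) pair,
    # carrying both source names alongside the new shape.
    result = []
    for v_name, v_shape in veg.items():
        for s_name, s_shape in soil.items():
            piece = intersection(v_shape, s_shape)
            if piece is not None:
                result.append((v_name, s_name, piece))
    return result


veg = {"forest": Rect(0, 0, 6, 6), "grass": Rect(6, 0, 10, 6)}
soil = {"clay": Rect(4, 2, 8, 8)}

for row in overlay(veg, soil):
    print(row)
```

Each output row combines the attributes of both inputs with the geometry of their overlap, which is what the SELECT list in the SQL statement expresses.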

Spatial database management systems

List

Table of free systems especially for spatial data processing

| DBS | License | Distributed | Spatial objects | Spatial functions | PostgreSQL interface | UMN MapServer interface | Documentation | Modifiable | HDFS |
|---|---|---|---|---|---|---|---|---|---|
| Apache Drill | Apache License 2.0 | yes | yes | yes - Drill Geospatial Functions Documentation | yes | no | Official Documentation | ANSI SQL | yes |
| ArangoDB | Apache License 2.0 | yes | yes | yes - capabilities overview query language functions | no | no | official documentation | AQL | no |
| GeoMesa | Apache License 2.0 | yes | yes (Simple Features) | yes (JTS) | no (manufacturable with GeoTools) | no | parts of the functions, a few examples | with Simple Feature Access in Java Virtual Machine and Apache Spark are all kinds of tasks solvable | yes |
| H2 (H2GIS) | LGPL 3 (since v1.3), GPL 3 before | no | yes (custom, no raster) | Simple Feature Access and custom functions for H2Network | yes | no | yes (homepage) | SQL | no |
| Ingres | GPL or proprietary | yes (if extension is installed) | yes (custom, no raster) | Geometry Engine, Open Source [21] | no | with MapScript | just briefly | with C and OME | no |
| Neo4J-spatial [22] | GNU Affero General Public License | no | yes (Simple Features) | yes (contain, cover, covered by, cross, disjoint, intersect, intersect window, overlap, touch, within and within distance) | no | no | just briefly | fork of JTS | no |
| PostgreSQL with PostGIS | GNU General Public License | no | yes (Simple Features and raster) | yes (Simple Feature Access and raster functions) | yes | yes | detailed | SQL, in connection with R | no |
| Postgres-XL with PostGIS | Mozilla Public License and GNU General Public License | yes | yes (Simple Features and raster) | yes (Simple Feature Access and raster functions) | yes | yes | PostGIS: yes, Postgres-XL: briefly | SQL, in connection with R or Tcl or Python | no |
| Rasdaman | server GPL, client LGPL, enterprise proprietary | yes | just raster | raster manipulation with rasql | yes | with Web Coverage Service or Web Processing Service | detailed wiki | own defined function in enterprise edition | no |
| RethinkDB | AGPL | yes | yes | distance, getIntersecting, getNearest, includes, intersects | no | no | official documentation [23] | forking | no |

Notes

  1. The term "geodatabase" may also refer specifically to a set of proprietary spatial database formats, Geodatabase (Esri).

References

  1. McKee, Lance (2016). "OGC History (detailed)". OGC. Retrieved 2016-07-12. [...] 1997 [...] OGC released the OpenGIS Simple Features Specification, which specifies the interface that enables diverse systems to communicate in terms of 'simple features' which are based on 2D geometry. The supported geometry types include points, lines, linestrings, curves, and polygons. Each geometric object is associated with a Spatial Reference System, which describes the coordinate space in which the geometric object is defined.
  2. OGC Homepage
  3. Kresse, Wolfgang; Danko, David M., eds. (2010). Springer handbook of geographic information (1. ed.). Berlin: Springer. pp. 82–83. ISBN 9783540726807.
  4. Yue, P.; Tan, Z. "DM-03 - Relational DBMS and their Spatial Extensions". GIS&T Body of Knowledge. UCGIS. Retrieved 5 January 2023.
  5. Zhang, X.; Du, Z. "DM-66 Spatial Indexing". GIS&T Body of Knowledge. UCGIS. Retrieved 5 January 2023.
  6. Güting, Ralf Hartmut; Schneider, Markus (2005). Moving Objects Databases. Morgan Kaufmann. p. 262. ISBN 9780120887996.
  7. "PostGIS Function Reference". PostGIS Manual. OSGeo. Retrieved 4 January 2023.
  8. Drill Geospatial Function Documentation
  9. "Geo queries | Elasticsearch Guide [7.15]| Elastic".
  10. H2 geometry type documentation
  11. H2 create spatial index documentation
  12. "GeoSpatial – MonetDB". 4 March 2014.
  13. "MySQL 5.5 Reference Manual - 12.17.1. Introduction to MySQL Spatial Support". Archived from the original on 2013-04-30. Retrieved 2013-05-01.
  14. OpenLink Software. "9.34. Geometry Data Types and Spatial Index Support". Retrieved October 24, 2018.
  15. OpenLink Software (2018-10-23). "New Releases of Virtuoso Enterprise and Open Source Editions". Retrieved October 24, 2018.
  16. "OGC Certified PostGIS".
  17. "Command reference – Redis".
  18. "SAP Help Portal" (PDF).
  19. "RTREE". tarantool.org. Archived from the original on 2014-12-13.
  20. "HP Vertica Place". 2 December 2015.
  21. "GEOS".
  22. "Neo4j Spatial is a library of utilities for Neo4j that facilitates the enabling of spatial operations on data. In particular you can add spatial indexes to already located data, and perform spatial". GitHub. 2019-02-18.
  23. "ReQL command reference - RethinkDB".