Filename extension | .gdb (file), .geodatabase (mobile) |
---|---|
Developed by | Esri |
Initial release | December 1999 |
Latest release | 11 (2022) |
Type of format | database |
Container for | spatial database including vector and raster data |
Open format? | no |
Free format? | no |
A Geodatabase is a proprietary GIS file format developed in the late 1990s by Esri (a GIS software vendor) to represent, store, and organize spatial datasets within a geographic information system. [1] [2] A geodatabase is both a logical data model and the physical implementation of that logical model in several proprietary file formats released during the 2000s. [3] The geodatabase design is based on the spatial database model for storing spatial data in relational and object-relational databases. [4] Given the dominance of Esri in the GIS industry, the term "geodatabase" is used by some as a generic trademark for any spatial database, regardless of platform or design.
The origin of the geodatabase was in the mid-1990s during the emergence of the first spatial databases. One early approach to integrating relational databases and GIS was the use of server middleware, a third-party program that stores the spatial data in database tables in a custom format, and translates it dynamically into a logical model that can be understood by the client software. In 1996, Esri purchased an early middleware product called Spatial DataBase Engine and rebranded it ArcSDE. Initially, ArcSDE stored and delivered simple vector datasets that looked very similar to shapefiles, but the need for a more robust data model emerged as Esri's Shapefile format became a de facto standard for vector spatial data, even as its shortcomings limited its use in enterprise applications. At the same time, the Arc/INFO coverage format was becoming obsolete after 20 years, unable to handle growing expectations of GIS users. [5] Another motivating factor was that even though several relational database vendors were introducing their own spatial extensions (with the notable exception of Esri's preferred Microsoft SQL Server), their structures and interfaces varied and Esri wanted its users to see all spatial data in the same apparent structure regardless of how it was stored internally. [6] : 240
At the end of 1999, Esri introduced the Geodatabase model as the native format used in its new ArcGIS software (branded Version 8.0 to maintain continuity with Arc/INFO). [7] Initially, it could be implemented as a multiuser geodatabase in ArcSDE on a server or as a personal geodatabase stored locally. [8] : 12 Support for topology rules, linear referencing, and survey data was added in 2003 (with ArcGIS 8.3). [9] [10] [11] Network data was added to the geodatabase in 2005 (ArcGIS 9.1), [12] and vector terrain (TIN, LIDAR) in 2006 (ArcGIS 9.2). [13] Also at the 9.2 release, ArcSDE was subsumed into ArcGIS Server and the multiuser database format was rebranded the enterprise geodatabase.
Due to shortcomings in the personal geodatabase format (especially file size limitations in Microsoft Access), Esri developed a more robust custom file format, released in 2006 (ArcGIS 9.2) as the file geodatabase. [13] It also released a product called the workgroup geodatabase, which included the free Microsoft SQL Server Express for smaller multi-user applications and has since been discontinued. [14] Eventually, the middleware components for reading and writing the geodatabase spatial database structure were incorporated into ArcGIS Desktop, eliminating the need for ArcSDE to be running on the server. The most recent addition is the mobile geodatabase format, introduced in 2020 (ArcGIS Pro 2.7), which uses SQLite as the backend to store the entire geodatabase as a single file. This replaces the personal geodatabase, which is no longer supported. [15]
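Because a mobile geodatabase is a single SQLite file, its tables can in principle be listed with any SQLite client. The following is a minimal sketch using Python's standard `sqlite3` module. The function names are hypothetical, and the assumption that system tables in a mobile geodatabase carry a `GDB_` prefix is borrowed from the system table names used elsewhere in the geodatabase model, not confirmed for every release.

```python
import sqlite3
from pathlib import Path


def list_tables(path):
    """Return the names of all tables in an SQLite file, sorted by name."""
    # Build a file: URI and open it read-only, so that inspecting a
    # geodatabase can never modify it.
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


def split_system_tables(names, prefix="GDB_"):
    """Split table names into (system, user) lists.

    Assumption: system tables start with the given prefix.
    """
    system = [n for n in names if n.startswith(prefix)]
    user = [n for n in names if not n.startswith(prefix)]
    return system, user
```

Calling `list_tables("example.geodatabase")` (a placeholder path) would return the table names; reading geometry columns may require additional tooling that this sketch does not cover.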
Geodatabases, being a common format for GIS datasets, have applications anywhere GIS is widely employed. These applications are so basic that researchers often do not mention their use in studies. There are several fields where their use is extensively documented, including public health, crime analysis, and resource management.
Since John Snow famously identified the source of a cholera outbreak, spatial data has been central to epidemiology and public health. [16] [17] In recent years, information that is relevant to public health has increased exponentially. [18] Leveraged correctly, this data can allow for a rapid response to emerging diseases. To accomplish this, geodatabases are employed extensively to organize data and allow for the identification of space-time patterns. [17] [18] Examples of the use of geodatabases to manage epidemiological data include linking environmental and health data to find patterns. [19] They were used extensively to organize data related to West Nile virus epidemics and the COVID-19 pandemic. [20] [21] This use includes analyzing the misinformation and infodemic surrounding COVID-19. [22]
Geospatial data around resource management is extremely complex. Factors such as the forest, water, and mineral resources being managed are obvious; however, governance and socioeconomic factors also play a large role. [3] [23] It is common practice to employ geodatabases to manage these diverse datasets. [23] They have also been used in organized Early Detection Rapid Response (EDRR) efforts to treat invasive plant species to protect environmental resources. [24]
In 1995, the United States Census Bureau made the Topologically Integrated Geographic Encoding and Referencing (TIGER) Mapping Service available to the public, facilitating desktop and Web GIS by hosting US boundary data. [25] This data availability, facilitated through the internet, silently revolutionized cartography by providing the world with authoritative boundary files free of charge. Today, these files, which contain up-to-date boundaries for U.S. states, counties, and more, are provided to the public in prepackaged geodatabases. [26]
To the user, a geodatabase looks like a collection of datasets, including some containing geographic data and some auxiliary elements that add functionality to the data. This user view is identical, regardless of how the geodatabase is stored (although enterprise geodatabases add some functions).
Datasets contain geographic data. A geodatabase can contain spatially referenced data in vector or raster formats, or non-spatially referenced data in tabular format. [27] [28] Each dataset contains information about any number of individual items, but typically all of the items in a dataset are of the same theme (e.g., temperature measurements, roads in a city) and have the same set of properties.
A number of elements can be included that are generally dependent on one or more datasets, adding functionality such as quality control. Some of these are called controller datasets.
Since its first introduction in 1999, the geodatabase has been available on a number of platforms to meet various project needs.
GDB_Items
: a "table of contents" for all of the elements of the geodatabase as the user will see them, pointing to the corresponding physical tables
GDB_ItemTypes
: the type of dataset of each table (table, feature class, etc.)
GDB_ItemRelationships
: information about groupings of tables, such as feature datasets
GDB_ItemRelationshipTypes
: lookup table of types of item relationships
GDB_DBTune
: general parameters for the geodatabase
GDB_SpatialRefs
: a list of the spatial reference systems used in the datasets
GDB_SystemCatalog
: a list of all tables, including data and system tables

a########.gdbtable
: a table (system table, data table, feature class, raster) consisting of rows with geometry and/or attribute columns
a########.gdbtablx
: a lookup list of the byte offset of each row in the data table
a########.gdbindexes
: a list of all the indexes for a data table
a########.name.atx
: an attribute index for a data table, listing the rows in the sorted order of the selected attribute column. A single data table can have multiple indexes.
a########.spx
: a spatial index for a feature class table to speed up shape access, using a gridded spatial index.
a########.cdf
: a compressed version of one of the above files
a00000001.* - a00000008.*
: system tables, as in the enterprise geodatabase (GDB_SystemCatalog, GDB_SpatialRefs, GDB_DBTune, etc.)

The ability to read and write the geodatabase format is not limited to Esri products; other software can also read and write it.
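As a rough illustration of the file naming pattern in the list above, the sketch below groups the files found in a file geodatabase folder (a directory whose name ends in `.gdb`) by table identifier. The function names and the `ROLES` descriptions are illustrative only. Treating the eight characters after the leading `a` as hexadecimal digits is an assumption, and the sketch inspects file names only; it does not parse the proprietary file contents.

```python
import re
from collections import defaultdict
from pathlib import Path

# "a" + 8 characters, an optional index name, then a known extension.
# Assumption: the 8 characters are hexadecimal digits.
FILE_RE = re.compile(
    r"^(a[0-9a-fA-F]{8})(?:\.(.+))?\.(gdbtable|gdbtablx|gdbindexes|atx|spx|cdf)$"
)

# Short descriptions paraphrased from the list above.
ROLES = {
    "gdbtable": "table rows (geometry and/or attribute columns)",
    "gdbtablx": "byte offset of each row",
    "gdbindexes": "list of indexes for the table",
    "atx": "attribute index",
    "spx": "spatial index",
    "cdf": "compressed file",
}


def inventory(gdb_dir):
    """Map each table id (e.g. 'a00000009') to its sorted file extensions."""
    tables = defaultdict(list)
    for path in Path(gdb_dir).iterdir():
        match = FILE_RE.match(path.name)
        if match:  # skip lock files and anything else that does not fit
            tables[match.group(1)].append(match.group(3))
    return {tid: sorted(exts) for tid, exts in sorted(tables.items())}


def is_system_table(table_id):
    """True for a00000001 through a00000008, the system tables listed above."""
    return 1 <= int(table_id[1:], 16) <= 8
```

For each identifier returned by `inventory`, `ROLES` can be used to print what each companion file is for, and `is_system_table` separates system tables from user data.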