Spatial ETL

Spatial extract, transform, load (spatial ETL), also known as geospatial transformation and load (GTL), provides the data processing functionality of traditional extract, transform, load (ETL) software, but with a primary focus on the ability to manage spatial data (which may also be called GIS, geographic, or map data). [1]

A spatial ETL system may translate data directly from one format to another, or via an intermediate format; the latter is more common when the data is to be transformed.
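As a rough illustration of translation via an intermediate format, the sketch below reads point records from CSV text into neutral feature dictionaries and then writes them out as a GeoJSON FeatureCollection. The column names (`name`, `lon`, `lat`), the function names, and the sample data are assumptions made for this example; a real spatial ETL tool supports far more formats and geometry types.

```python
import csv
import io
import json


def extract_csv_points(text):
    """Extract: read CSV rows into neutral feature dicts (the intermediate form).

    Assumes hypothetical 'lon' and 'lat' columns; all other columns
    are carried along as attributes.
    """
    features = []
    for row in csv.DictReader(io.StringIO(text)):
        lon = float(row.pop("lon"))
        lat = float(row.pop("lat"))
        features.append({"geometry": (lon, lat), "attributes": dict(row)})
    return features


def load_geojson(features):
    """Load: serialise intermediate features as a GeoJSON FeatureCollection."""
    return json.dumps({
        "type": "FeatureCollection",
        "features": [
            {
                "type": "Feature",
                # GeoJSON stores positions as [longitude, latitude].
                "geometry": {"type": "Point", "coordinates": list(f["geometry"])},
                "properties": f["attributes"],
            }
            for f in features
        ],
    })


# Illustrative sample data, not taken from any real dataset.
SAMPLE_CSV = "name,lon,lat\nWell A,-123.1,49.25\nWell B,-122.9,49.2\n"

if __name__ == "__main__":
    print(load_geojson(extract_csv_points(SAMPLE_CSV)))
```

Because the reader and the writer only share the intermediate feature structure, a transformation step could be placed between them without either side needing to know about the other format.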

Transform

The transformation phase of a spatial ETL process supports a variety of functions; some are similar to those of standard ETL, while others are unique to spatial data. [2]

Spatial data commonly consists of a geographic element and related attribute data; therefore spatial ETL transformations are often described as being either geometric transformations – transformation of the geographic element – or attribute transformations – transformations of the related attribute data.
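The distinction between the two kinds of transformation can be sketched in a few lines. In the example below, the geometric transformation reprojects a longitude/latitude point to spherical Web Mercator coordinates, and the attribute transformation renames fields according to a mapping. The feature layout (a dict with `geometry` and `attributes` keys) and all function names are assumptions chosen for illustration, not the interface of any particular product.

```python
import math

# Sphere radius used by Web Mercator (equal to the WGS 84 semi-major axis), in metres.
EARTH_RADIUS_M = 6378137.0


def to_web_mercator(lon_deg, lat_deg):
    """Geometric transformation: lon/lat in degrees to Web Mercator metres."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4 + math.radians(lat_deg) / 2)
    )
    return (x, y)


def rename_attributes(attributes, mapping):
    """Attribute transformation: rename keys found in mapping, keep the rest."""
    return {mapping.get(key, key): value for key, value in attributes.items()}


def transform(feature, mapping):
    """Apply both transformations to one feature and return a new feature."""
    lon, lat = feature["geometry"]
    return {
        "geometry": to_web_mercator(lon, lat),
        "attributes": rename_attributes(feature["attributes"], mapping),
    }


if __name__ == "__main__":
    # Hypothetical input feature.
    well = {"geometry": (-123.1, 49.25), "attributes": {"NAME": "Well A", "id": 7}}
    print(transform(well, {"NAME": "name"}))
```

The two functions are independent: the geometry can be reprojected without touching the attributes, and vice versa, which mirrors how the two categories are usually described.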

Common geospatial transformations

Additional features

Desirable features of a spatial ETL application are:

Spatial ETL uses

Spatial ETL has a number of distinct uses:

Spatial ETL – origins and history

Although ETL tools for processing non-spatial data have existed for some time, ETL tools that can manage the unique characteristics of spatial data only emerged in the early 1990s.

Spatial ETL tools emerged in the GIS industry to enable interoperability (or the exchange of information) between the industry's diverse array of mapping applications and associated proprietary formats. However, spatial ETL tools are also becoming increasingly important in the realm of management information systems as a tool to help organizations integrate spatial data with their existing non-spatial databases, and also to leverage their spatial data assets to develop more competitive business strategies.

Traditionally, GIS applications could read or import only a limited number of spatial data formats and offered few specialist transformation tools; the approach was to import the data and then transform or analyze it step by step within the GIS application itself. By contrast, spatial ETL does not require the user to import or view the data, and generally carries out its tasks in a single predefined process.

With the push to achieve greater interoperability within the GIS industry, many existing GIS applications now incorporate spatial ETL tools within their products; the ArcGIS Data Interoperability Extension is one example. [3]

See also

References

  1. "What is ETL… and How Can it Turn You into a Geospatial Rock Star?". XYHt. Archived from the original on 6 November 2022. Retrieved 5 November 2022.
  2. Miller, Harvey; Han, Jiawei, eds. (27 May 2009). Geographic Data Mining and Knowledge Discovery. CRC Press. p. 63. ISBN 9781420073980.
  3. "Spatial ETL tools". Esri. Archived from the original on 11 April 2023. Retrieved 11 April 2023.