Well-known text representation of geometry

Well-known text (WKT) is a text markup language for representing vector geometry objects. A binary equivalent, known as well-known binary (WKB), is used to transfer and store the same information in a more compact form that is convenient for computer processing but not human-readable. The formats were originally defined by the Open Geospatial Consortium (OGC) and described in their Simple Feature Access specification. [1] The current standard definition is in ISO/IEC 13249-3:2016. [2]

Geometric objects

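WKT can represent distinct geometric objects, including points, line strings, polygons, their multipart counterparts (MultiPoint, MultiLineString, MultiPolygon), geometry collections, triangles, TINs, and polyhedral surfaces.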

Coordinates for geometries may be 2D (x, y), 3D (x, y, z), 2D with a measure value (x, y, m), or 4D (x, y, z, m), where the m value is part of a linear referencing system. Three-dimensional geometries are designated by a "Z" after the geometry type, geometries with a linear referencing system by an "M", and geometries with both by "ZM". Empty geometries that contain no coordinates can be specified by using the symbol EMPTY after the type name.
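The dimension tags described above can be read off the start of a WKT string without parsing any coordinates. The following is a minimal sketch, assuming the tag is separated from the type name by whitespace; the function name `describe_wkt_header` is hypothetical and not part of any standard or library.

```python
def describe_wkt_header(wkt: str) -> dict:
    """Report the type name, Z/M flags, and EMPTY flag of a WKT string.

    Hypothetical helper: only the text before the first "(" is inspected,
    so coordinates are never parsed or validated.
    """
    tokens = wkt.split("(", 1)[0].upper().split()
    if not tokens:
        raise ValueError("empty WKT string")
    geom_type, rest = tokens[0], tokens[1:]
    is_empty = "EMPTY" in rest
    tags = [t for t in rest if t != "EMPTY"]
    tag = tags[0] if tags else ""
    if tag not in ("", "Z", "M", "ZM"):
        raise ValueError(f"unexpected dimension tag: {tag!r}")
    return {
        "type": geom_type,
        "has_z": "Z" in tag,
        "has_m": "M" in tag,
        "empty": is_empty,
    }


print(describe_wkt_header("POINT ZM (1 1 5 60)"))
print(describe_wkt_header("MULTIPOLYGON EMPTY"))
```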

WKT geometries are used throughout OGC specifications and are present in applications that implement these specifications. For example, PostGIS contains functions that can convert geometries to and from a WKT representation, making them human readable.

The OGC standard definition requires a polygon to be topologically closed. It also states that if the exterior linear ring of a polygon is defined in a counterclockwise direction, then the polygon is seen from the "top". Any interior linear rings should be defined in the opposite direction to the exterior ring, in this case clockwise. [3]
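Ring closure and orientation can be checked numerically. The sketch below uses the shoelace formula for the signed area, assuming 2D coordinates with the y axis pointing up; the helper names are illustrative and do not come from the standard.

```python
def signed_area(ring):
    """Shoelace formula: positive when the ring runs counterclockwise."""
    return 0.5 * sum(
        x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(ring, ring[1:])
    )


def is_closed(ring):
    """A linear ring repeats its first point as its last point."""
    return len(ring) >= 4 and ring[0] == ring[-1]


def is_counterclockwise(ring):
    return signed_area(ring) > 0


# Rings of the "polygon with hole" example used in this article.
exterior = [(35, 10), (45, 45), (15, 40), (10, 20), (35, 10)]
interior = [(20, 30), (35, 35), (30, 20), (20, 30)]

print(is_closed(exterior), is_counterclockwise(exterior))
print(is_closed(interior), is_counterclockwise(interior))
```

For these two rings the exterior has a positive signed area and the interior a negative one, which matches the orientation convention described above.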

Geometry primitives (2D)

Type                 Example
Point                POINT (30 10)
LineString           LINESTRING (30 10, 10 30, 40 40)
Polygon              POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))
Polygon (with hole)  POLYGON ((35 10, 45 45, 15 40, 10 20, 35 10), (20 30, 35 35, 30 20, 20 30))

Multipart geometries (2D)

Type                      Example
MultiPoint                MULTIPOINT ((10 40), (40 30), (20 20), (30 10))
MultiPoint                MULTIPOINT (10 40, 40 30, 20 20, 30 10)
MultiLineString           MULTILINESTRING ((10 10, 20 20, 10 40), (40 40, 30 30, 40 20, 30 10))
MultiPolygon              MULTIPOLYGON (((30 20, 45 40, 10 40, 30 20)), ((15 5, 40 10, 10 20, 5 10, 15 5)))
MultiPolygon (with hole)  MULTIPOLYGON (((40 40, 20 45, 45 30, 40 40)), ((20 35, 10 30, 10 10, 30 5, 45 20, 20 35), (30 20, 20 15, 20 25, 30 20)))
GeometryCollection        GEOMETRYCOLLECTION (POINT (40 10), LINESTRING (10 10, 20 20, 10 40), POLYGON ((40 40, 20 45, 45 30, 40 40)))
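The two MULTIPOINT spellings in the examples above describe the same geometry. As a rough illustration, the sketch below reads either form into a list of coordinate tuples; `parse_multipoint` is a hypothetical helper that handles only 2D MULTIPOINT strings and is not a general WKT parser.

```python
def parse_multipoint(wkt: str):
    """Parse a 2D MULTIPOINT string, with or without per-point parentheses."""
    text = wkt.strip()
    prefix = "MULTIPOINT"
    if not text.upper().startswith(prefix):
        raise ValueError("not a MULTIPOINT")
    # Parentheses carry no extra information here, so drop them.
    body = text[len(prefix):].replace("(", " ").replace(")", " ")
    if body.strip().upper() == "EMPTY":
        return []
    return [tuple(float(v) for v in pair.split()) for pair in body.split(",")]


nested = parse_multipoint("MULTIPOINT ((10 40), (40 30), (20 20), (30 10))")
flat = parse_multipoint("MULTIPOINT (10 40, 40 30, 20 20, 30 10)")
print(nested == flat)
```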

The following are some other examples of geometric WKT strings: (Note: Each item below is an individual geometry.)

GEOMETRYCOLLECTION(POINT(4 6),LINESTRING(4 6,7 10))
POINT ZM (1 1 5 60)
POINT M (1 1 80)
POINT EMPTY
MULTIPOLYGON EMPTY
TRIANGLE((0 0 0,0 1 0,1 1 0,0 0 0))
TIN (((0 0 0, 0 0 1, 0 1 0, 0 0 0)), ((0 0 0, 0 1 0, 1 1 0, 0 0 0)))
POLYHEDRALSURFACE Z ( PATCHES
    ((0 0 0, 0 1 0, 1 1 0, 1 0 0, 0 0 0)),
    ((0 0 0, 0 1 0, 0 1 1, 0 0 1, 0 0 0)),
    ((0 0 0, 1 0 0, 1 0 1, 0 0 1, 0 0 0)),
    ((1 1 1, 1 0 1, 0 0 1, 0 1 1, 1 1 1)),
    ((1 1 1, 1 0 1, 1 0 0, 1 1 0, 1 1 1)),
    ((1 1 1, 1 1 0, 0 1 0, 0 1 1, 1 1 1))
  )

Well-known binary

Well-known binary (WKB) representations are typically shown as hexadecimal strings.

The first byte indicates the byte order for the data: 00 for big-endian and 01 for little-endian.

The next 4 bytes are a 32-bit unsigned integer for the geometry type, as described below:

Geometry types and WKB integer codes

Type                2D    Z     M     ZM
Geometry            0000  1000  2000  3000
Point               0001  1001  2001  3001
LineString          0002  1002  2002  3002
Polygon             0003  1003  2003  3003
MultiPoint          0004  1004  2004  3004
MultiLineString     0005  1005  2005  3005
MultiPolygon        0006  1006  2006  3006
GeometryCollection  0007  1007  2007  3007
CircularString      0008  1008  2008  3008
CompoundCurve       0009  1009  2009  3009
CurvePolygon        0010  1010  2010  3010
MultiCurve          0011  1011  2011  3011
MultiSurface        0012  1012  2012  3012
Curve               0013  1013  2013  3013
Surface             0014  1014  2014  3014
PolyhedralSurface   0015  1015  2015  3015
TIN                 0016  1016  2016  3016
Triangle            0017  1017  2017  3017
Circle              0018  1018  2018  3018
GeodesicString      0019  1019  2019  3019
EllipticalCurve     0020  1020  2020  3020
NurbsCurve          0021  1021  2021  3021
Clothoid            0022  1022  2022  3022
SpiralCurve         0023  1023  2023  3023
CompoundSurface     0024  1024  2024  3024
BrepSolid           -     1025  -     -
AffinePlacement     102   1102  -     -
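The codes in the table follow a regular pattern: the thousands digit encodes the dimensionality (0 for 2D, 1 for Z, 2 for M, 3 for ZM) and the remainder identifies the base type. A small sketch of that decomposition follows; the function name and the abbreviated lookup table are illustrative only, and unlisted base types are reported by number.

```python
# Abbreviated lookup: only the first seven base types from the table.
BASE_TYPES = {
    1: "Point",
    2: "LineString",
    3: "Polygon",
    4: "MultiPoint",
    5: "MultiLineString",
    6: "MultiPolygon",
    7: "GeometryCollection",
}
DIMENSIONS = {0: "2D", 1: "Z", 2: "M", 3: "ZM"}


def describe_type_code(code: int):
    """Split a WKB geometry type code into (base type, dimensionality)."""
    dim, base = divmod(code, 1000)
    if dim not in DIMENSIONS:
        raise ValueError(f"unsupported dimensionality in code {code}")
    return BASE_TYPES.get(base, f"type {base}"), DIMENSIONS[dim]


print(describe_type_code(3001))
print(describe_type_code(1003))
```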

Each geometry type has its own data structure, such as a count of points or linear rings, followed by the coordinates as 64-bit double-precision floating-point numbers.

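For example, the geometry POINT(2.0 4.0) is represented as 000000000140000000000000004010000000000000, where:

00: 1-byte integer, byte order (big-endian)
00000001: 4-byte integer, geometry type 1 (2D Point)
4000000000000000: 8-byte double, x coordinate 2.0
4010000000000000: 8-byte double, y coordinate 4.0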
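The layout of this example can be checked with Python's standard struct module. The following is a minimal sketch that decodes only 2D points (type code 1), with a leading byte of 00 read as big-endian and 01 as little-endian; `decode_wkb_point` is a hypothetical name, not a library function.

```python
import struct


def decode_wkb_point(hex_string: str):
    """Decode a hex-encoded WKB 2D Point into an (x, y) tuple."""
    data = bytes.fromhex(hex_string)
    byte_order = data[0]
    if byte_order == 0:
        prefix = ">"  # big-endian
    elif byte_order == 1:
        prefix = "<"  # little-endian
    else:
        raise ValueError(f"unknown byte order flag: {byte_order}")
    # Bytes 1-4 hold the geometry type as a 32-bit unsigned integer.
    (geom_type,) = struct.unpack_from(prefix + "I", data, 1)
    if geom_type != 1:
        raise ValueError("this sketch only handles 2D Point (type 1)")
    # Bytes 5-20 hold x and y as two 64-bit doubles.
    x, y = struct.unpack_from(prefix + "dd", data, 5)
    return x, y


print(decode_wkb_point("000000000140000000000000004010000000000000"))
```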

Format variations

EWKT and EWKB (Extended Well-Known Text/Binary)
A PostGIS-specific format that includes the spatial reference system identifier (SRID) and up to 4 ordinate values (XYZM). [4] [5] For example, SRID=4326;POINT(-44.3 60.1) locates a longitude/latitude coordinate in the WGS 84 coordinate reference system. It also supports circular curves, following elements named (but not fully defined) within the original WKT: CircularString, CompoundCurve, CurvePolygon and CompoundSurface. [6]
AGF Text (Autodesk Geometry Format)
An extension to the OGC standard (at the time) to include curved elements; most notably used in MapGuide. [7]
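The SRID prefix used by EWKT can be separated from the geometry text with plain string handling. The sketch below assumes the `SRID=<integer>;` prefix shown in the example above; `split_ewkt` is a hypothetical helper and not a PostGIS function.

```python
def split_ewkt(ewkt: str):
    """Split an EWKT string into (srid, wkt); srid is None when absent."""
    text = ewkt.strip()
    if text.upper().startswith("SRID="):
        srid_part, sep, wkt = text.partition(";")
        if not sep:
            raise ValueError("missing ';' after the SRID prefix")
        return int(srid_part[len("SRID="):]), wkt.strip()
    return None, text


print(split_ewkt("SRID=4326;POINT(-44.3 60.1)"))
print(split_ewkt("POINT(1 2)"))
```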

References

  1. Herring, John R., ed. (2011-05-28), OpenGIS® Implementation Standard for Geographic information – Simple feature access – Part 1: Common architecture, Open Geospatial Consortium, retrieved 2019-01-28
  2. Information technology – Database languages – SQL multimedia and application packages – Part 3: Spatial (5th ed.), ISO, 2016-01-15, retrieved 2019-01-28
  3. See the OGC Implementation Specification for geographic information – Simple Feature Access, section 6.1.11.1. http://www.opengeospatial.org/standards/sfa
  4. "Postgis/Postgis". GitHub. 6 October 2021.
  5. "ST_GeomFromEWKT". Retrieved 2022-11-25.
  6. "Chapter 4: Using PostGIS: Data Management and Queries". postgis.net. Retrieved 2021-07-30.
  7. "MapGuide API Reference: AGF Text". Retrieved 2023-09-14.