
Worldwide density of GeoNames entries in 2006

GeoNames is a user-editable geographical database accessible through various web services under a Creative Commons attribution license. The project was founded in late 2005. [1]


The GeoNames dataset differs from, but includes data from, [2] the US Government's similarly named GEOnet Names Server.

Database and web services

The GeoNames database contains over 25,000,000 geographical names corresponding to over 11,800,000 unique features. [3] All features are categorized into one of nine feature classes and further subcategorized into one of 645 feature codes. Beyond names of places in various languages, data stored include latitude, longitude, elevation, population, administrative subdivision and postal codes. All coordinates use the World Geodetic System 1984 (WGS84).
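As a rough illustration of the record structure described above, the following sketch models a single feature in Python. The `Feature` class and its field names are hypothetical and not part of GeoNames itself; the one-letter feature classes and their glosses follow the commonly documented GeoNames list, and the coordinates shown for Paris are illustrative values.

```python
from dataclasses import dataclass
from typing import Optional

# The nine GeoNames feature classes (one-letter codes) with short glosses.
FEATURE_CLASSES = {
    "A": "country, state, region",
    "H": "stream, lake",
    "L": "parks, area",
    "P": "city, village",
    "R": "road, railroad",
    "S": "spot, building, farm",
    "T": "mountain, hill, rock",
    "U": "undersea",
    "V": "forest, heath",
}


@dataclass(frozen=True)
class Feature:
    """Hypothetical in-memory model of one GeoNames feature."""
    geoname_id: int
    name: str
    latitude: float        # WGS84 decimal degrees
    longitude: float       # WGS84 decimal degrees
    feature_class: str     # one of the keys of FEATURE_CLASSES
    feature_code: str      # e.g. "PPLC" (capital of a political entity)
    population: Optional[int] = None
    elevation_m: Optional[int] = None

    def __post_init__(self) -> None:
        # Reject records that fall outside the class list or WGS84 ranges.
        if self.feature_class not in FEATURE_CLASSES:
            raise ValueError(f"unknown feature class: {self.feature_class!r}")
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError(f"latitude out of range: {self.latitude}")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError(f"longitude out of range: {self.longitude}")


# Illustrative record; the coordinate values are assumptions for the example.
paris = Feature(2988507, "Paris", 48.85341, 2.3488, "P", "PPLC")
```

Validating the class and coordinate ranges at construction time is one possible design choice; it catches malformed records early rather than at query time.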

Those data are accessible free of charge through a number of Web services and a daily database export. [4]
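The two access paths can be sketched as follows. This is a minimal example under stated assumptions: the column order is the one commonly documented for the main tab-delimited export, the `searchJSON` endpoint and its `q`, `maxRows` and `username` parameters are assumed from the public web service documentation, and the helper names `read_export` and `search_url` are hypothetical. The sample row is invented for illustration and is not real GeoNames data.

```python
import csv
import io
from typing import Dict, List
from urllib.parse import urlencode

# Column order assumed for rows of the main tab-delimited export.
COLUMNS = [
    "geonameid", "name", "asciiname", "alternatenames", "latitude",
    "longitude", "feature_class", "feature_code", "country_code", "cc2",
    "admin1_code", "admin2_code", "admin3_code", "admin4_code",
    "population", "elevation", "dem", "timezone", "modification_date",
]


def read_export(text: str) -> List[Dict[str, object]]:
    """Parse tab-delimited export text into dicts keyed by COLUMNS."""
    reader = csv.reader(io.StringIO(text), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
    rows: List[Dict[str, object]] = []
    for fields in reader:
        if len(fields) != len(COLUMNS):
            continue  # skip malformed lines rather than guess at their layout
        row: Dict[str, object] = dict(zip(COLUMNS, fields))
        row["latitude"] = float(fields[4])
        row["longitude"] = float(fields[5])
        rows.append(row)
    return rows


def search_url(query: str, username: str, max_rows: int = 10) -> str:
    """Build (but do not send) a request URL for the assumed search service."""
    params = urlencode({"q": query, "maxRows": max_rows, "username": username})
    return f"http://api.geonames.org/searchJSON?{params}"


# Made-up sample row for illustration only (not real GeoNames data).
SAMPLE = "\t".join([
    "1", "Exampleville", "Exampleville", "", "12.5", "-45.25", "P", "PPL",
    "XX", "", "", "", "", "", "0", "", "10", "Etc/UTC", "2020-01-01",
]) + "\n"

rows = read_export(SAMPLE)
```

The URL builder deliberately stops short of sending a request, since the web services require a registered user name.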

Wiki interface

The core of the GeoNames database is derived from official public sources, whose quality may vary. Through a wiki interface, users can manually edit and enhance the database by correcting names, updating locations, adding new features, and refining existing entries. [5]

Semantic Web integration

Each GeoNames feature is represented as a web resource identified by a stable URI. Through content negotiation, this URI provides access either to the HTML wiki page or to an RDF description of the feature that uses elements of the GeoNames ontology. [6] This ontology describes the properties of GeoNames features using the Web Ontology Language, while the feature classes and codes are described in the SKOS language. Through the Wikipedia article URLs linked in the RDF descriptions, GeoNames data are linked to DBpedia data and other RDF Linked Data.
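Content negotiation of this kind can be sketched with the Python standard library. The URI pattern `https://sws.geonames.org/<id>/`, the media type `application/rdf+xml` and the helper name `feature_request` are assumptions made for illustration; the request object is only built here, not sent.

```python
from urllib.request import Request


def feature_request(geoname_id: int, want_rdf: bool = True) -> Request:
    """Build a request for one feature URI with the desired Accept header.

    The URI pattern and media types are assumptions for this sketch.
    """
    uri = f"https://sws.geonames.org/{geoname_id}/"
    accept = "application/rdf+xml" if want_rdf else "text/html"
    return Request(uri, headers={"Accept": accept})


# Same URI, two representations: RDF for machines, HTML for readers.
rdf_request = feature_request(2988507)
html_request = feature_request(2988507, want_rdf=False)

# To actually fetch a representation (requires network access):
#   from urllib.request import urlopen
#   body = urlopen(rdf_request).read()
```

The point of the design is that the identifier stays the same while the `Accept` header selects the representation.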

Accuracy and improvements

As in other crowdsourcing schemes, the GeoNames edit interface allows anyone to sign in and edit the database. False information can therefore be entered and can remain undetected, especially for places that are not accessed frequently. Ahlers (2013) studies these inaccuracies and classifies them into loss of coordinate granularity (e.g., due to truncation and low-resolution geocoding in some cases), wrong feature codes, near-identical places, and places located outside their designated countries. Manually correcting these inaccuracies is both tedious and error-prone (due to the database size) and may require experts.
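Two of these error classes lend themselves to simple automated checks. The sketch below is a toy heuristic, not the method used by Ahlers (2013): it flags coordinates that carry very few decimal places and pairs of same-named places that lie within a small distance of each other. The function names and the thresholds (`min_places`, `max_km`) are arbitrary assumptions.

```python
import math
from decimal import Decimal
from typing import Tuple


def decimal_places(value: str) -> int:
    """Number of digits after the decimal point in a coordinate string."""
    exponent = Decimal(value).as_tuple().exponent
    return max(0, -int(exponent))


def is_coarse(lat: str, lon: str, min_places: int = 2) -> bool:
    """Flag coordinates whose latitude and longitude both look truncated."""
    return (decimal_places(lat) < min_places
            and decimal_places(lon) < min_places)


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres on a sphere of radius 6371 km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def near_identical(a: Tuple[str, float, float],
                   b: Tuple[str, float, float],
                   max_km: float = 1.0) -> bool:
    """True if two (name, lat, lon) entries share a name and nearly a place."""
    same_name = a[0].casefold() == b[0].casefold()
    return same_name and haversine_km(a[1], a[2], b[1], b[2]) <= max_km
```

Coordinates are passed to `is_coarse` as strings so that trailing zeros, which carry precision information, are not lost in float conversion.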

Few published works address resolving such inaccuracies automatically. Singh and Rafiei (2018) study the problem of automatically detecting the scope of locations in a geographical database and its applications in identifying inconsistencies and improving the quality of the database. Computing the boundary information can help detect inconsistencies such as near-identical places and locations, such as cities, placed under the wrong parents, such as provinces or countries. Singh and Rafiei show that the boundary information derived in their work can move more than 20% of locations in GeoNames to better positions in the spatial hierarchy, and that the accuracy of those moves is over 90%.

Related Research Articles


The Semantic Web, sometimes known as Web 3.0, is an extension of the World Wide Web through standards set by the World Wide Web Consortium (W3C). The goal of the Semantic Web is to make Internet data machine-readable.

The Resource Description Framework (RDF) is a method to describe and exchange graph data. It was originally designed as a data model for metadata by the World Wide Web Consortium (W3C). It provides a variety of syntax notations and formats, of which the most widely used is Turtle.


The Geography Markup Language (GML) is the XML grammar defined by the Open Geospatial Consortium (OGC) to express geographical features. GML serves as a modeling language for geographic systems as well as an open interchange format for geographic transactions on the Internet. Key to GML's utility is its ability to integrate all forms of geographic information, including not only conventional "vector" or discrete objects, but coverages and sensor data.

The Web Ontology Language (OWL) is a family of knowledge representation languages for authoring ontologies. Ontologies are a formal way to describe taxonomies and classification networks, essentially defining the structure of knowledge for various domains: the nouns representing classes of objects and the verbs representing relations between the objects.

A geocode is a code that represents a geographic entity. It is a unique identifier of the entity, to distinguish it from others in a finite set of geographic entities. In general the geocode is a human-readable and short identifier.


Geotagging, or GeoTagging, is the process of adding geographical identification metadata to various media such as a geotagged photograph or video, websites, SMS messages, QR Codes or RSS feeds, and is a form of geospatial metadata. These data usually consist of latitude and longitude coordinates, though they can also include altitude, bearing, distance, accuracy data, and place names, and perhaps a time stamp.

SPARQL is an RDF query language, that is, a semantic query language for databases, able to retrieve and manipulate data stored in Resource Description Framework (RDF) format. It was made a standard by the RDF Data Access Working Group (DAWG) of the World Wide Web Consortium, and is recognized as one of the key technologies of the semantic web. On 15 January 2008, SPARQL 1.0 was acknowledged by W3C as an official recommendation, and SPARQL 1.1 followed in March 2013.

A web resource is any identifiable resource present on or connected to the World Wide Web. Resources are identified using Uniform Resource Identifiers (URIs). In the Semantic Web, web resources and their semantic properties are described using the Resource Description Framework (RDF).

The GEOnet Names Server (GNS), sometimes also referred to in official documentation as Geographic Names Data or geonames in domain and email addresses, is a service that provides access to the database of geographic feature names and locations, covering locations outside the US, maintained by the United States National Geospatial-Intelligence Agency (NGA) and the US Board on Geographic Names (BGN). The database is the official repository for the US Federal Government on foreign place-name decisions approved by the BGN. Approximately 20,000 of the database's features are updated monthly. Names are not deleted from the database, "except in cases of obvious duplication". The database contains search aids such as spelling variations and non-Roman script spellings in addition to its primary information about location, administrative division, and quality. The accuracy of the database has been criticised.

A semantic wiki is a wiki that has an underlying model of the knowledge described in its pages. Regular, or syntactic, wikis have structured text and untyped hyperlinks. Semantic wikis, on the other hand, provide the ability to capture or identify information about the data within pages, and the relationships between pages, in ways that can be queried or exported like a database through semantic queries.

Simple Features is a set of standards that specify a common storage and access model of geographic features made of mostly two-dimensional geometries used by geographic databases and geographic information systems. It is formalized by both the Open Geospatial Consortium (OGC) and the International Organization for Standardization (ISO).

Simple Knowledge Organization System (SKOS) is a W3C recommendation designed for representation of thesauri, classification schemes, taxonomies, subject-heading systems, or any other type of structured controlled vocabulary. SKOS is part of the Semantic Web family of standards built upon RDF and RDFS, and its main objective is to enable easy publication and use of such vocabularies as linked data.

Oracle Spatial and Graph, formerly Oracle Spatial, is a free option component of the Oracle Database. The spatial features in Oracle Spatial and Graph aid users in managing geographic and location-data in a native type within an Oracle database, potentially supporting a wide range of applications — from automated mapping, facilities management, and geographic information systems (AM/FM/GIS), to wireless location services and location-enabled e-business. The graph features in Oracle Spatial and Graph include Oracle Network Data Model (NDM) graphs used in traditional network applications in major transportation, telcos, utilities and energy organizations and RDF semantic graphs used in social networks and social interactions and in linking disparate data sets to address requirements from the research, health sciences, finance, media and intelligence communities.


In RDF, a blank node is a node in an RDF graph representing a resource for which a URI or literal is not given. The resource represented by a blank node is also called an anonymous resource. According to the RDF standard a blank node can only be used as subject or object of an RDF triple.

The Great Britain Historical GIS is a spatially enabled database that documents and visualises the changing human geography of the British Isles, although it is primarily focused on the subdivisions of the United Kingdom, mainly over the 200 years since the first census in 1801. The project is currently based at the University of Portsmouth, and is the provider of the website A Vision of Britain through Time.


In computing, linked data is structured data which is interlinked with other data so it becomes more useful through semantic queries. It builds upon standard Web technologies such as HTTP, RDF and URIs, but rather than using them to serve web pages only for human readers, it extends them to share information in a way that can be read automatically by computers. Part of the vision of linked data is for the Internet to become a global database.

The FAO geopolitical ontology is an ontology developed by the Food and Agriculture Organization of the United Nations (FAO) to describe, manage and exchange data related to geopolitical entities such as countries, territories, regions and other similar areas.

GeoSPARQL is a model for representing and querying geospatial linked data for the Semantic Web. It is standardized by the Open Geospatial Consortium as OGC GeoSPARQL. The definition of a small ontology based on well-understood OGC standards is intended to provide a standardized exchange basis for geospatial RDF data which can support both qualitative and quantitative spatial reasoning and querying with the SPARQL database query language.

In geographic information systems, toponym resolution is the process of relating a toponym, i.e. the mention of a place, to an unambiguous spatial footprint of the same place.

Shapes Constraint Language (SHACL) is a World Wide Web Consortium (W3C) standard language for describing Resource Description Framework (RDF) graphs. SHACL has been designed to enhance the semantic and technical interoperability layers of ontologies expressed as RDF graphs.


  1. "Marc Wick: Geek of the Week". Simple Talk. 2009-05-06. Retrieved 2020-07-01.
  2. "Datasources used by GeoNames in the GeoNames Gazetteer". Retrieved 2020-08-20.
  3. "GeoNames web site". Retrieved 2018-09-08.
  4. "GeoNames API". ProgrammableWeb. Archived from the original on 2018-11-26. Retrieved 2018-09-08.
  5. "How can I help ?". GeoNames Forum. GeoNames. Retrieved 2018-08-11.
  6. "GeoNames ontology". Retrieved 2013-12-15.

Further reading