Spatial Archive and Interchange Format


The Spatial Archive and Interchange Format (SAIF, pronounced safe) was defined in the early 1990s as a self-describing, extensible format designed to support interoperability and storage of geospatial data.


SAIF dataset

SAIF has two major components that together define SAIFtalk. The first is the Class Syntax Notation (CSN), a data definition language used to define a dataset's schema. The second is the Object Syntax Notation (OSN), a data language used to represent the object data adhering to that schema. [1] The CSN and OSN are contained in the same physical file, along with a directory at the beginning of the file. Because both notations use ASCII text and a straightforward syntax, they can be parsed easily and read directly by users and developers. A SAIF dataset, with a .saf or .zip extension, is compressed using the zip archive format.
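Because a SAIF dataset is an ordinary zip archive, it can in principle be inspected with general-purpose tools. The following is a minimal sketch using Python's standard zipfile module. The archive is built in memory, and the member name and its content are placeholders invented for illustration; they do not reflect the actual internal layout of a SAIF dataset.

```python
import io
import zipfile

# Build a stand-in archive in memory. The member name "placeholder.txt" and
# its content are hypothetical; a real .saf file has its own internal layout.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("placeholder.txt", "placeholder ASCII content")

# Check that the bytes really form a zip archive before opening them.
buffer.seek(0)
is_zip = zipfile.is_zipfile(buffer)

# List the members and read the first one back as ASCII text.
buffer.seek(0)
with zipfile.ZipFile(buffer) as archive:
    member_names = archive.namelist()
    text = archive.read(member_names[0]).decode("ascii")

print(is_zip, member_names, text)
```

With a real dataset, the in-memory buffer would be replaced by the path to the .saf or .zip file.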

Schema definition

SAIF defines 285 classes (including enumerations) in the Class Syntax Notation, covering the definitions of high-level features, geometric types, topological relationships, temporal coordinates and relationships, geodetic coordinate system components and metadata. These can be considered as forming a base schema. Using CSN, a user defines a new schema to describe the features in a given dataset. The classes belonging to the new schema are defined in CSN as subclasses of existing SAIF classes or as new enumerations.

For example, a class ForestStand::MySchema could be defined with attributes such as age and species, and specified as a subclass of GeographicObject, a feature class defined in the SAIF standard. Every user-defined class must belong to a schema, either one defined by the user or one that already exists. Different schemas can exist in the same dataset, and objects defined under one schema can reference those specified in another.
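The subclassing relationship described above can be sketched in Python. This is an analogy only, not CSN syntax: the geometry field, the SCHEMA constant and the qualified_name helper are assumptions introduced for illustration, and the GeographicObject class here is a simplified stand-in for the SAIF class of that name.

```python
from dataclasses import dataclass
from typing import ClassVar, List, Tuple


@dataclass
class GeographicObject:
    """Simplified stand-in for the SAIF base feature class."""
    # Hypothetical field: a plain list of (x, y) coordinates.
    geometry: List[Tuple[float, float]]


@dataclass
class ForestStand(GeographicObject):
    """User-defined class, analogous to ForestStand::MySchema."""
    age: int = 0
    species: str = ""

    # Hypothetical constant naming the schema this class belongs to.
    SCHEMA: ClassVar[str] = "MySchema"

    def qualified_name(self) -> str:
        # Mirrors the ClassName::SchemaName form used in the text.
        return f"{type(self).__name__}::{self.SCHEMA}"


stand = ForestStand(
    geometry=[(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)],
    age=80,
    species="Douglas fir",
)
print(stand.qualified_name())
```

The subclass inherits the geometry of its parent and adds only the attributes specific to forest stands, which is the pattern the base schema is meant to support.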

Inheritance

SAIF supports multiple inheritance, although in practice schemas commonly used single inheritance only. [1]

Object referencing

Object referencing can be used as a means of breaking up large monolithic structures. More significantly, it allows an object to be defined once and then referenced any number of times. For example, a section of the land-water interface could form part of a coastline as well as part of a municipal boundary and part of a marine park boundary. This geometric feature can be defined once and given an object reference, which is then used when the geometries of the coastline, municipality and marine park are specified.
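The define-once, reference-many idea can be illustrated with a small Python sketch. It is not OSN syntax: the Arc and Feature classes, the registry dictionary and the identifier "arc-001" are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Arc:
    """A shared piece of geometry with its own object identifier."""
    object_id: str
    points: List[Tuple[float, float]]


@dataclass
class Feature:
    """A feature whose boundary is given as references, not copies."""
    name: str
    boundary_refs: List[str]


# Hypothetical registry mapping object identifiers to geometry objects.
registry: Dict[str, Arc] = {}


def register(arc: Arc) -> str:
    """Store the arc once and return the reference used to cite it."""
    registry[arc.object_id] = arc
    return arc.object_id


def resolve(feature: Feature) -> List[Arc]:
    """Look up the geometry objects a feature refers to."""
    return [registry[ref] for ref in feature.boundary_refs]


# The land-water interface is defined a single time...
shore_ref = register(Arc("arc-001", [(0.0, 0.0), (5.0, 2.0), (9.0, 3.0)]))

# ...and referenced by three different features.
coastline = Feature("coastline", [shore_ref])
municipality = Feature("municipality", [shore_ref])
marine_park = Feature("marine park", [shore_ref])

print(len(registry), resolve(coastline)[0] is resolve(marine_park)[0])
```

Because all three features resolve to the same stored object, a correction to the shared geometry is made in one place only.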

Multimedia

Multimedia objects can also be objects in a SAIF dataset and referenced accordingly. For example, image and sound files associated with a given location could be included.

The primary advantage of SAIF was that it was inherently extensible, following object-oriented principles. This meant that data transfers from one GIS environment to another did not need to be reduced to the lowest common denominator between the two systems. Instead, data could be extracted from a dataset defined by the first GIS, transformed into an intermediary, namely the semantically rich SAIF model, and from there transformed into a model and format applicable to the second GIS.

This notion of model-to-model transformation was deemed to be realistic only with an object-oriented approach. It was recognized that scripts to carry out such transformations could in fact add information content. When Safe Software developed the Feature Manipulation Engine (FME), it was in large measure with the express purpose of supporting such transformations. The FMEBC was a freely available software application that supported a wide range of transformations using SAIF as the hub. The FME was developed as a commercial offering in which the intermediary could be held in memory instead of as a SAIF dataset.
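The hub arrangement described above, with the intermediary held in memory, can be sketched as a reader and a writer that meet at a common feature model. This is a simplified illustration and not how FME is implemented: the HubFeature class, the two record layouts and all function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class HubFeature:
    """Hypothetical in-memory intermediary standing in for the rich hub model."""
    feature_class: str
    attributes: Dict[str, object]
    geometry: List[Tuple[float, float]]


def read_system_a(record: dict) -> HubFeature:
    """Reader for an invented source layout with upper-case keys."""
    attributes = {
        key.lower(): value
        for key, value in record.items()
        if key not in ("TYPE", "COORDS")
    }
    return HubFeature(record["TYPE"], attributes, list(record["COORDS"]))


def write_system_b(feature: HubFeature) -> dict:
    """Writer for an invented target layout."""
    return {
        "layer": feature.feature_class,
        "props": dict(feature.attributes),
        "vertices": list(feature.geometry),
    }


def translate(
    record: dict,
    reader: Callable[[dict], HubFeature],
    writer: Callable[[HubFeature], dict],
) -> dict:
    """Source format -> hub model -> target format."""
    return writer(reader(record))


source = {
    "TYPE": "ForestStand",
    "COORDS": [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)],
    "AGE": 80,
    "SPECIES": "fir",
}
result = translate(source, read_system_a, write_system_b)
print(result)
```

With this design each system needs only one reader and one writer against the hub, and any enrichment step can be applied to the hub model between the two.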

History

The SAIF project was established as a means of addressing interoperability between different geographic information systems. Exchange formats of particular prominence at the time included DIGEST (Digital Geographic Information Exchange Standard) and SDTS (Spatial Data Transfer Specification, later accepted as the Spatial Data Transfer Standard). These were considered too inflexible and difficult to use. Consequently, the Government of British Columbia decided to develop SAIF and to put it forward as a national standard in Canada.

SAIF became a Canadian national standard in 1993 with the approval of the Canadian General Standards Board. The last version of SAIF, published in January 1995, is designated as CGIS-SAIF Canadian Geomatics Interchange Standard: Spatial Archive and Interchange Format: Formal Definition (Release 3.2), [2] issue CAN/CGSB-171.1-95, catalogue number P29-171-001-1995E.

The work on the SAIF modeling paradigm and the CSN classes was carried out principally by Mark Sondheim, Henry Kucera and Peter Friesen, all with the British Columbia government at the time. Dale Lutz and Don Murray of Safe Software developed the Object Syntax Notation and the Reader and Writer software that became part of the Feature Manipulation Engine.

SAIF was brought to the attention of Michael Stonebraker and Kenn Gardels of the University of California at Berkeley, and then to those working on the initial version of the Open Geospatial Interoperability Specification (OGIS), the first efforts of what became the Open Geospatial Consortium (OGC). A series of 18 submissions to the ISO SQL Multimedia working group also helped tie SAIF to the original ISO work on geospatial features.

Today SAIF is of historical interest only. It is significant as a precursor to the Geography Markup Language and as the formative element in the development of the widely used Feature Manipulation Engine. [3]

References

  1. "Spatial Archive and Interchange Format: Formal Definition Release 3.2". Geographic Data BC. Archived from the original on 2010-11-21. Retrieved 2010-11-24.
  2. "CGIS-SAIF Canadian geomatics interchange standard: Spatial archive and interchange format: Formal definition (Release 3.2)". Prepared by the Ministry of the Environment, British Columbia. Catalogue number P29-171-001-1995E. Government of Canada Publications. July 2002.
  3. "Freeing the Data". XYHt. Retrieved 5 November 2022.
