Topic map

Topic Maps is a standard for the representation and interchange of knowledge, with an emphasis on the findability of information. Topic maps were originally developed in the late 1990s as a way to represent back-of-the-book index structures so that multiple indexes from different sources could be merged. The developers soon realized that, with a little additional generalization, they could create a meta-model with potentially far wider application. The standard is formally known as ISO/IEC 13250:2003.

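A topic map represents information using topics (which stand for subjects), associations (relationships between topics), and occurrences (information resources relevant to a particular topic).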

Topic maps are similar to concept maps and mind maps in many respects, though of the three only topic maps are an ISO standard. Topic maps are a form of semantic web technology similar to RDF.

Ontology and merging

Topics, associations, and occurrences can all be typed, where the types must be defined by the creator or creators of the topic map(s). The set of allowed type definitions is known as the ontology of the topic map.
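The three typed constructs can be sketched as plain data classes. This is only an illustrative in-memory model, not the ISO data model or any library's API: the class layout, the field names (`type_id`, `roles`), the helper `undeclared_types`, and all sample identifiers are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Topic:
    """A topic stands for one subject; `type_id` is the id of its type topic."""
    id: str
    names: List[str] = field(default_factory=list)
    type_id: Optional[str] = None


@dataclass
class Occurrence:
    """Links a topic to an information resource relevant to it."""
    topic_id: str
    type_id: str
    resource: str


@dataclass
class Association:
    """Relates topics; `roles` maps a role-type id to the id of the player."""
    type_id: str
    roles: Dict[str, str]


# In this sketch the ontology is simply the set of topics used as types.
ontology = {
    t.id: t
    for t in [
        Topic("person", ["Person"]),
        Topic("opera", ["Opera"]),
        Topic("composed-by", ["Composed by"]),
        Topic("composer", ["Composer"]),
        Topic("work", ["Work"]),
        Topic("article", ["Article"]),
    ]
}

puccini = Topic("puccini", ["Giacomo Puccini"], type_id="person")
tosca = Topic("tosca", ["Tosca"], type_id="opera")
written = Association("composed-by", {"composer": "puccini", "work": "tosca"})
about = Occurrence("tosca", "article", "https://example.org/tosca.html")


def undeclared_types(items, ontology):
    """Return the type ids used by `items` that the ontology does not define."""
    used = set()
    for item in items:
        if item.type_id is not None:
            used.add(item.type_id)
        if isinstance(item, Association):
            used.update(item.roles)  # role types must be declared too
    return sorted(used - set(ontology))


print(undeclared_types([puccini, tosca, written, about], ontology))  # []
```

Treating types as ordinary topics keeps the ontology inside the same structure as the data it describes, which is what allows ontologies to be merged like any other topic map.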

Topic maps explicitly support the merging of identity between multiple topics or topic maps. Furthermore, because ontologies are themselves topic maps, they can also be merged, which allows information from diverse sources to be integrated automatically into a coherent new topic map. Features such as subject identifiers (URIs given to topics) and PSIs (published subject indicators) are used to control merging between differing taxonomies. Scoping on names provides a way to organise the various names given to a particular topic by different sources.
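Identity-based merging can be illustrated with a minimal sketch. It assumes a simplified rule, namely that two topics are merged whenever they share at least one subject identifier; the `Topic` class, the `merge_topics` function, and the example URIs are hypothetical and not taken from the standard or from any implementation.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Topic:
    names: Set[str] = field(default_factory=set)
    subject_identifiers: Set[str] = field(default_factory=set)


def merge_topics(topics: List[Topic]) -> List[Topic]:
    """Merge topics that share at least one subject identifier.

    Merging is transitive: if A shares an identifier with B, and B with C,
    all three end up as a single topic.
    """
    merged: List[Topic] = []
    for topic in topics:
        # Work on a copy so the input topics are left untouched.
        current = Topic(set(topic.names), set(topic.subject_identifiers))
        remaining = []
        for other in merged:
            if current.subject_identifiers & other.subject_identifiers:
                current.names |= other.names
                current.subject_identifiers |= other.subject_identifiers
            else:
                remaining.append(other)
        merged = remaining + [current]
    return merged


source_a = Topic({"Puccini"}, {"http://psi.example.org/puccini"})
source_b = Topic(
    {"Giacomo Puccini"},
    {"http://psi.example.org/puccini", "http://example.com/people/gp"},
)
source_c = Topic({"Tosca"}, {"http://psi.example.org/tosca"})

result = merge_topics([source_a, source_b, source_c])
print(len(result))              # 2
print(sorted(result[0].names))  # ['Giacomo Puccini', 'Puccini']
```

The merged topic keeps every name and identifier from its sources, so a later topic map that uses either identifier will merge with it as well.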

Current standard

The work of standardizing topic maps (ISO/IEC 13250) took place under the umbrella of the ISO/IEC JTC 1/SC 34/WG 3 committee (ISO/IEC Joint Technical Committee 1, Subcommittee 34, Working Group 3 – Document description and processing languages – Information Association). [1] [2] [3] However, WG 3 was disbanded and maintenance of ISO/IEC 13250 was assigned to WG 8.

The Topic Maps (ISO/IEC 13250) reference model and data model standards are defined independently of any specific serialization or syntax.

Data format

The specification is summarized in the abstract as follows: "This specification provides a model and grammar for representing the structure of information resources used to define topics, and the associations (relationships) between topics. Names, resources, and relationships are said to be characteristics of abstract subjects, which are called topics. Topics have their characteristics within scopes: i.e. the limited contexts within which the names and resources are regarded as their name, resource, and relationship characteristics. One or more interrelated documents employing this grammar is called a topic map."
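The idea of scope can be illustrated with names. The sketch below assumes a simplified reading of scope: a name is valid in a context if every theme in its scope is part of that context, and a name with an empty scope is valid everywhere. The `Name` class, the `names_in_context` function, and the theme labels are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Set


@dataclass(frozen=True)
class Name:
    value: str
    scope: FrozenSet[str] = frozenset()  # empty scope: valid in every context


def names_in_context(names: Iterable[Name], context: Set[str]) -> List[str]:
    """Return the names valid in `context`, most specifically scoped first."""
    valid = [n for n in names if n.scope <= context]
    valid.sort(key=lambda n: len(n.scope), reverse=True)
    return [n.value for n in valid]


names = [
    Name("Germany"),
    Name("Deutschland", frozenset({"german"})),
    Name("Allemagne", frozenset({"french"})),
]

print(names_in_context(names, {"german"}))  # ['Deutschland', 'Germany']
print(names_in_context(names, set()))       # ['Germany']
```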

XML serialization formats

Note that XTM 1.0 predates, and is therefore not compatible with, the more recent versions of the ISO/IEC 13250 standard.

Other formats

Other proposed or standardized serialization formats include:

The above standards have all been proposed or defined as part of ISO/IEC 13250. As described below, there are also other serialization formats, such as LTM and AsTMa=, that have not been put forward as standards.

Linear topic map notation (LTM) serves as a kind of shorthand for writing topic maps in plain text editors. This is useful for writing short personal topic maps or exchanging partial topic maps by email. The format can be converted to XTM.

Another format, AsTMa=, serves a similar purpose. It is much more compact for writing topic maps by hand, and it can likewise be converted to XTM. Alternatively, it can be used directly with the Perl module TM (which also supports LTM).

The data formats of XTM and LTM are similar to the W3C standards for RDF/XML or the older N3 notation. [4]

Topic Maps API

A de facto API standard called Common Topic Maps Application Programming Interface (TMAPI) was published in April 2004 and is supported by many Topic Maps implementations and vendors.

Query standard

In normal use it is often desirable to be able to run arbitrary queries against the data in a particular Topic Maps store. Many implementations provide a syntax for this (somewhat like 'SQL for Topic Maps'), but the syntax varies considerably between implementations. With this in mind, work has gone into defining a standardized syntax for querying topic maps, the Topic Maps Query Language (TMQL).

Constraint standards

It can also be desirable to define a set of constraints that can be used to guarantee or check the semantic validity of topic map data for a particular domain, somewhat like database constraints for topic maps. Constraints can express rules such as 'every document needs an author' or 'all managers must be human'. There are often implementation-specific ways of achieving these goals, but work has also gone into defining a standardized constraint language, the Topic Maps Constraint Language (TMCL).

TMCL is functionally similar to RDF Schema with Web Ontology Language (OWL). [4]
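A rule such as 'every document needs an author' can be checked with a few lines of code. The sketch below is not TMCL syntax and does not use any Topic Maps library; it is a hand-written check over a toy in-memory representation, and the type names (`document`, `authorship`), role names (`work`, `author`), and topic ids are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical toy store: topic id -> topic type id.
topic_types: Dict[str, str] = {
    "report-1": "document",
    "memo-7": "document",
    "alice": "person",
}

# Associations as (association type, {role type: id of the topic playing it}).
associations: List[Tuple[str, Dict[str, str]]] = [
    ("authorship", {"work": "report-1", "author": "alice"}),
]


def documents_without_author(topic_types, associations):
    """Return ids of 'document' topics that are not the work in any authorship."""
    authored = {
        roles["work"]
        for assoc_type, roles in associations
        if assoc_type == "authorship" and "work" in roles and "author" in roles
    }
    return sorted(
        topic_id
        for topic_id, type_id in topic_types.items()
        if type_id == "document" and topic_id not in authored
    )


print(documents_without_author(topic_types, associations))  # ['memo-7']
```

A declarative constraint language moves rules like this out of application code and into data that travels with the topic map.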

Earlier standards

The "Topic Maps" concept has existed for a long time. The HyTime standard dates back to at least 1992, and earlier versions of ISO 13250 than the current revision also exist. More information about such standards can be found at the ISO Topic Maps site.[citation needed]

RDF relationship

Some work has been undertaken to provide interoperability between the W3C's RDF/OWL/SPARQL family of semantic web standards and the ISO's family of Topic Maps standards, though the two have slightly different goals.[citation needed]

The semantic expressive power of Topic Maps is, in many ways, equivalent to that of RDF,[citation needed] but there are two major differences. First, Topic Maps provide a higher level of semantic abstraction: they provide a template of topics, associations and occurrences, while RDF only provides a template of two arguments linked by one relationship. Second, and as a consequence, Topic Maps allow n-ary relationships (hypergraphs) between any number of nodes, while RDF is limited to triples.[citation needed]
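The difference in arity can be made concrete with a small sketch. A single association with three role players is one statement in a topic map, whereas a triple-based model has to break it into several statements, typically hung off an intermediate node. The function `association_to_triples`, the predicate name `type`, and the sample identifiers are hypothetical, and the output is a list of plain tuples rather than real RDF.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def association_to_triples(
    assoc_id: str, assoc_type: str, roles: Dict[str, str]
) -> List[Triple]:
    """Decompose one n-ary association into triples about an intermediate node."""
    triples: List[Triple] = [(assoc_id, "type", assoc_type)]
    for role, player in sorted(roles.items()):
        triples.append((assoc_id, role, player))
    return triples


# One association with three role players (a single hyperedge).
employment = {
    "employee": "alice",
    "employer": "acme",
    "position": "engineer",
}

for triple in association_to_triples("assoc-1", "employment", employment):
    print(triple)
# ('assoc-1', 'type', 'employment')
# ('assoc-1', 'employee', 'alice')
# ('assoc-1', 'employer', 'acme')
# ('assoc-1', 'position', 'engineer')
```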

References

  1. ISO JTC1/SC34. "JTC 1/SC 34 – Document Description and Processing Languages". Archived from the original on 6 May 2014. Retrieved 25 December 2009.
  2. "Home of SC34/WG3 – Information Association". 3 June 2008. Retrieved 26 December 2009.
  3. ISO. "JTC 1/SC 34 – Document description and processing languages". ISO. Retrieved 25 December 2009.
  4. Lars Marius Garshol (2003). "Living With Topic Maps and RDF". Retrieved 21 February 2014.