ISO-TimeML

ISO-TimeML (ISO 24617-1:2009) is the International Organization for Standardization ISO/TC37 standard for time and event markup and annotation. [1] Its scope is the standardization of principles and methods for annotating temporal events in electronic documents and language.

Objectives of ISO-TimeML

The goals of ISO-TimeML are to provide a common model for creating and using temporal and event annotation as a means of managing time-related data within documents, and to enable later categorization and data extraction using this metadata.
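
As a rough illustration of the kind of data extraction such annotation enables, the sketch below parses a small inline-annotated sentence with Python's standard library. The fragment is hypothetical: the tag names (EVENT, TIMEX3, TLINK) and attribute names follow common TimeML usage, but the exact attribute set here is an assumption for illustration and is not quoted from ISO 24617-1.

```python
import xml.etree.ElementTree as ET

# Hypothetical TimeML-style fragment. Tag and attribute names are assumed
# for illustration and are not taken verbatim from ISO 24617-1.
SAMPLE = (
    '<TimeML>'
    'The committee <EVENT eid="e1" class="OCCURRENCE">approved</EVENT> '
    'the standard in '
    '<TIMEX3 tid="t1" type="DATE" value="2009-03">March 2009</TIMEX3>.'
    '<TLINK lid="l1" eventID="e1" relatedToTime="t1" relType="IS_INCLUDED"/>'
    '</TimeML>'
)


def extract_annotations(xml_text):
    """Return (events, times, links) found in an annotated fragment."""
    root = ET.fromstring(xml_text)
    # Map each event id to the annotated word.
    events = {e.get("eid"): e.text for e in root.iter("EVENT")}
    # Map each time expression id to its normalized value.
    times = {t.get("tid"): t.get("value") for t in root.iter("TIMEX3")}
    # Each link relates one event to one time expression.
    links = [
        (l.get("eventID"), l.get("relType"), l.get("relatedToTime"))
        for l in root.iter("TLINK")
    ]
    return events, times, links


if __name__ == "__main__":
    events, times, links = extract_annotations(SAMPLE)
    for event_id, relation, time_id in links:
        print(events[event_id], relation, times[time_id])
```

Once events, time expressions, and the links between them are available as plain data structures, documents can be filtered or categorized by time without re-reading the raw text.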

History of ISO-TimeML

ISO-TimeML was presented to the ISO for consideration as a standard in August 2007. The presentation outlined the preliminaries of ISO-TimeML and examined potential applications. [1] In the following year, ISO-TimeML was revised as the standard transitioned from a New Project (NP) to a Working Project (WP). [2] Voting on ISO-TimeML began in October 2008, and it was approved as an international standard by March 2009.

ISO-TimeML as one of the members of the ISO/TC37 family of standards

The ISO/TC37 standards are currently elaborated as high-level specifications and deal with word segmentation (ISO 24614), annotations (ISO 24611 a.k.a. MAF, ISO 24612 a.k.a. LAF, ISO 24615 a.k.a. SynAF, and ISO 24617-1 a.k.a. SemAF/Time), feature structures (ISO 24610), multimedia containers (ISO 24616 a.k.a. MLIF), and lexicons (ISO 24613 a.k.a. LMF). These standards are based on low-level specifications dedicated to constants, namely data categories (revision of ISO 12620), language codes (ISO 639), script codes (ISO 15924), country codes (ISO 3166) and Unicode (ISO 10646).

The two-level organization forms a coherent family of standards with the following common and simple rules:

Work group members

ISO-TimeML is joint work between ISO/TC 37/SC 4/WG 2 (TDG 3) and the TimeML Working Group, agreed on at the TDG 3 and LIRICS Working Group Meeting held at USC/ISI, Marina del Rey, CA, U.S.A., on 20-22 April 2006.

Proposed Project Leaders and Editors:

Further reading

References

  1. Kiyong Lee, Branimir Boguraev, Harry Bunt, Nancy Ide and James Pustejovsky (August 14, 2007). "ISO-TimeML and its Applications" (PDF). www.tc37sc4.org. Archived from the original (PDF) on 28 July 2011.
  2. James Pustejovsky (January 11, 2008). "ISO TC37/SC4 N471 rev01 - Notes to ISO Meeting" (PDF). www.tc37sc4.org. Archived from the original (PDF) on 13 June 2010. Retrieved 2 April 2018.