The Information Coding Classification (ICC) is a classification system covering almost all extant 6500 knowledge fields (knowledge domains). Its conceptualization goes beyond the scope of the well known library classification systems, such as Dewey Decimal Classification (DDC), Universal Decimal Classification (UDC), and Library of Congress Classification (LCC), by extending also to knowledge systems that so far have not afforded to classify literature. ICC actually presents a flexible universal ordering system for both literature and other kinds of information, set out as knowledge fields. From a methodological point of view, ICC differs from the above-mentioned systems along the following three lines:
Respective knowledge fields permit to step down by the same principle to a third and forth level, and even further to a fifth and sixth level. Finally, knowledge field subdivisions will have to conform to said digital position scheme. Hence, for a given knowledge field identical codes will mark identical categories under respective numbers of the coding system. This mnemotechnical aspect of the system helps memorizing and straightaway retrieving the whereabouts of respective interdisciplinary and transdisciplinary fields.
The first two hierarchical levels may be regarded as a top- or upper ontology for ontologies and other applications.
The terms of the first three hierarchical levels were set out in German and English in Wissensorganisation. Entwicklung, Aufgabe, Anwendung, Zukunft, [1] on pp. 82 to 100. It was published in 2014 and available so far only in German. In the meantime, also the French terms of the knowledge fields have been collected. Competence for maintenance and further development rests with the German Chapter of the International Society for Knowledge Organization (ISKO) e.V.
At the end of 1970, Prof. Alwin Diemer, Univ.of Düsseldorf proposed to Ingetraut Dahlberg to undertake a philosophical dissertation on The universal classification system of knowledge, its ontological, epistemological, and information theoretical foundations. Diemer had in mind an innovating ontological approach for such a system based on the whole spectrum of kinds of being and complying with epistemological requirements. The third requirement had already been taken up somehow in the Indian Colon Classification, yet it still called for explanations and additions. In 1974, the dissertation was published in German entitled Grundlagen universaler Wissensordnung. [2] It started with conceptual clarifications, and why and how the term „universal“ was linked to knowledge, including knowledge fields, such as commodity science, artefacts, statistics, patents, standardization, communication, utility services et al. In chapter 3, six universal classification systems (DDC, UDC, LCC, BC, CC and BBK) were presented, analyzed and compared.
While preparing the dissertation, Dahlberg started with elaborating the new universal system by first gleaning a lot of extant designations of knowledge fields from whatever available reference works. This was funded by the German Documentation Society (DGD) (1971-2) under the title of Order system of knowledge fields. In addition, the syllabuses of German universities and polytechniques were explored for relevant terms and documented (1975). Thereafter, it seemed necessary to add definitions from special dictionaries and encyclopediae; it soon appeared that the 12.500 terms included numerous synonyms, so that the whole collection boiled down to about 6.500 concept designations (Project Logstruktur, supported by the German Science Foundation (DFG) 1976-78).
The outcome of this work [2] was the formulation of 30 theses which ended up in 12 principles for the new system, published 40 years later under. [1] These principles refer not only to theoretical foundations but also to structure and other organizational aspects of the whole array of knowledge fields. In 1974, the digital position scheme for field subdivision had already been developed to allow for classifying classification literature in the bibliographical section of the first issue of the Journal International Classification. In 1977, the entire ICC was ready for presentation at a seminar in Bangalore, India. [3] A publication of the first three hierarchical levels appeared however only in 1982. [4] It was applied to the bibliography of classification systems and thesauri in vol.1 of the International Classification and Indexing Bibliography; [5] it has been updated. [1]
These were published in full length in the book Wissensorganisation. Entwicklung, Aufgabe, Anwendung, Zukunft [1] and the article Information Coding Classification. Geschichtliches, Prinzipien, Inhaltliches, [6] hence it suffices to just mention their topics with some necessary additions.
An example of its application is the structure of the classification system for knowledge organization literature Gliederung der Klassifikationsliteratur. (A simplified version with an additional introduction is given in, [1] p. 71)
The first two levels of ICC can be represented by following matrix.
The first hierarchical level of the 9 subject categories results from the first vertical array under codes 1-9. The second hierarchical level of subject categories is structured by the 9 functionally ordered form categories, listed in the first horizontal line under codes 01-09. Some exceptions are mentioned in principle 7.
For classifying web documents as conceived by Jens Hartmann, University of Karlsruhe, Prof.Walter Koch, University of Graz, has explored in his Institute for Applied Information Technology Research Society (AIT) the application of ICC to automatically classifying metadata of some 350.000 documents. This was facilitated by data generated within the framework of an EU-supported project "EuropeanaLocal". For this exploration, three ICC hierarchical levels have been used for some 5000 terms. The result is described in the report of Christoph Mak. [7] Prof.Koch regarded a classification degree of almost 50% as a good result, considering that only a shortened version of ICC had been used. In order to reach a better result one would have needed 1–2 years. Also an index of all terms with their codes could be achieved under these explorations.
Motivated by the work of an Italian research Group in Trento on Revising the Wordnet Domains Hierarchy: semantics, coverage and balancing, [8] by which the DDC codes were used, Prof. Ernesto William De Luca et al. showed in a study that for such case the use of ICC could lead to essentially better results. This was shown in two contributions: Including knowledge domains from the ICC into the Multilingual Lexical Linked Data Cloud (LLD) [9] and Die Multilingual Lexical Linked Data Cloud: Eine mögliche Zugangsoptimierung?, [10] in which the LLD was used in a meta-model which contains all resources with the possibility of retrieval and navigation of data from different aspects. By this, the existing work about many thousand knowledge fields (of ICC) can be combined with the Multilingual Lexical Linked Data Cloud, based on RDF/OWL representation of EuroWordNet and similar integrated lexical resources (MultiWordNet, MEMODATA and the Hamburg Metapher BD).
In October 2013, the computer scientist Hermann Bense, Dortmund, explored the possibilities for structuring the Semanic Web with ICC codes. He developed two approaches for a pictorial presentation of knowledge fields with their possible subdivisions. A graphic representation of those knowledge fields pertaining to the first two levels can be found under Ontology4. The inclusion of the third hierarchical level has been envisaged as the next step.
A library classification is a system of Organization of knowledge by which library resources are arranged and ordered systematically. Library classifications are a notational system that represents the order of topics in the classification and allows items to be stored in that order. Library classification systems group related materials together, typically arranged as a hierarchical tree structure. A different kind of classification system, called a faceted classification system, is also widely used, which allows the assignment of multiple classifications to an object, enabling the classifications to be ordered in many ways.
WordNet is a lexical database of semantic relations between words in more than 200 languages. WordNet links words into semantic relations including synonyms, hyponyms, and meronyms. The synonyms are grouped into synsets with short definitions and usage examples. WordNet can thus be seen as a combination and extension of a dictionary and thesaurus. While it is accessible to human users via a web browser, its primary use is in automatic text analysis and artificial intelligence applications. WordNet was first created in the English language and the English WordNet database and software tools have been released under a BSD style license and are freely available for download from that WordNet website.
A glossary also known as a vocabulary or clavis, is an alphabetical list of terms in a particular domain of knowledge with the definitions for those terms. Traditionally, a glossary appears at the end of a book and includes terms within that book that are either newly introduced, uncommon, or specialized. While glossaries are most commonly associated with non-fiction books, in some cases, fiction novels may come with a glossary for unfamiliar terms.
A faceted classification is a classification scheme used in organizing knowledge into a systematic order. A faceted classification uses semantic categories, either general or subject-specific, that are combined to create the full classification entry. Many library classification systems use a combination of a fixed, enumerative taxonomy of concepts with subordinate facets that further refine the topic.
A medical classification is used to transform descriptions of medical diagnoses or procedures into standardized statistical code in a process known as clinical coding. Diagnosis classifications list diagnosis codes, which are used to track diseases and other health conditions, inclusive of chronic diseases such as diabetes mellitus and heart disease, and infectious diseases such as norovirus, the flu, and athlete's foot. Procedure classifications list procedure code, which are used to capture interventional data. These diagnosis and procedure codes are used by health care providers, government health programs, private health insurance companies, workers' compensation carriers, software developers, and others for a variety of applications in medicine, public health and medical informatics, including:
In ontology, ontic is physical, real, or factual existence.
SNOMED CT or SNOMED Clinical Terms is a systematically organized computer-processable collection of medical terms providing codes, terms, synonyms and definitions used in clinical documentation and reporting. SNOMED CT is considered to be the most comprehensive, multilingual clinical healthcare terminology in the world. The primary purpose of SNOMED CT is to encode the meanings that are used in health information and to support the effective clinical recording of data with the aim of improving patient care. SNOMED CT provides the core general terminology for electronic health records. SNOMED CT comprehensive coverage includes: clinical findings, symptoms, diagnoses, procedures, body structures, organisms and other etiologies, substances, pharmaceuticals, devices and specimens.
Simple Knowledge Organization System (SKOS) is a W3C recommendation designed for representation of thesauri, classification schemes, taxonomies, subject-heading systems, or any other type of structured controlled vocabulary. SKOS is part of the Semantic Web family of standards built upon RDF and RDFS, and its main objective is to enable easy publication and use of such vocabularies as linked data.
Knowledge organization (KO), organization of knowledge, organization of information, or information organization is an intellectual discipline concerned with activities such as document description, indexing, and classification that serve to provide systems of representation and order for knowledge and information objects. According to The Organization of Information by Joudrey and Taylor, information organization:
examines the activities carried out and tools used by people who work in places that accumulate information resources for the use of humankind, both immediately and for posterity. It discusses the processes that are in place to make resources findable, whether someone is searching for a single known item or is browsing through hundreds of resources just hoping to discover something useful. Information organization supports a myriad of information-seeking scenarios.
Language resource management - Lexical markup framework, is the International Organization for Standardization ISO/TC37 standard for natural language processing (NLP) and machine-readable dictionary (MRD) lexicons. The scope is standardization of principles and methods relating to language resources in the contexts of multilingual communication.
The International Society for Knowledge Organization, or ISKO, is a professional association for scholars of knowledge organization, knowledge structures, classification studies, and information organization and structure. Founded in 1989, ISKO's mission is "to advance conceptual work in knowledge organization in all kinds of forms, and for all kinds of purposes, such as databases, libraries, dictionaries and the Internet." An interdisciplinary association, ISKO's worldwide membership draws from fields such as information science, philosophy, linguistics, library science, archive studies, science studies, and computer science. ISKO "promotes research, development and applications of knowledge organization systems that advance the philosophical, psychological and semantic approaches for ordering knowledge; provides the means of communication and networking on knowledge organization for its members; and functions as a connecting link between all institutions and national societies, working with problems related to the conceptual organization and processing of knowledge."
ISO 25964 is the international standard for thesauri, published in two parts as follows:
ISO 25964 Information and documentation - Thesauri and interoperability with other vocabulariesPart 1: Thesauri for information retrieval [published August 2011] Part 2: Interoperability with other vocabularies [published March 2013]
In library and information science documents are classified and searched by subject – as well as by other attributes such as author, genre and document type. This makes "subject" a fundamental term in this field. Library and information specialists assign subject labels to documents to make them findable. There are many ways to do this and in general there is not always consensus about which subject should be assigned to a given document. To optimize subject indexing and searching, we need to have a deeper understanding of what a subject is. The question: "what is to be understood by the statement 'document A belongs to subject category X'?" has been debated in the field for more than 100 years.
Taxonomy is the practice and science of categorization or classification.
In the context of information retrieval, a thesaurus is a form of controlled vocabulary that seeks to dictate semantic manifestations of metadata in the indexing of content objects. A thesaurus serves to minimise semantic ambiguity by ensuring uniformity and consistency in the storage and retrieval of the manifestations of content objects. ANSI/NISO Z39.19-2005 defines a content object as "any item that is to be described for inclusion in an information retrieval system, website, or other source of information". The thesaurus aids the assignment of preferred terms to convey semantic metadata associated with the content object.
Ingetraut Dahlberg was a German information scientist and philosopher who developed the universal Information Coding Classification covering some 6,500 subject fields. Her career spanned various roles in research, teaching, editing, and publishing. Dahlberg founded the journal International Classification as well as both the scientific Society for Classification and International Society for Knowledge Organization.
In natural language processing, linguistics, and neighboring fields, Linguistic Linked Open Data (LLOD) describes a method and an interdisciplinary community concerned with creating, sharing, and (re-)using language resources in accordance with Linked Data principles. The Linguistic Linked Open Data Cloud was conceived and is being maintained by the Open Linguistics Working Group (OWLG) of the Open Knowledge Foundation, but has been a point of focal activity for several W3C community groups, research projects, and infrastructure efforts since then.
Claudio Tommaso Gnoli is an Italian information scientist, son of Franco Adolfo Giorgio Gnoli and Francesca Vittoria Bruni. He is mainly interested in knowledge organization (KO), in particular in such ontological views as emergentist evolutionism, the theory of levels of reality and General System Theory, as philosophical foundations for both existing and new KO systems, and in testing the potential of a classification by phenomena as opposed to disciplines.
OntoLex is the short name of a vocabulary for lexical resources in the web of data (OntoLex-Lemon) and the short name of the W3C community group that created it.
The term classification can apply to one or all of: