Information Coding Classification

The Information Coding Classification (ICC) is a classification system covering almost all of the roughly 6,500 extant knowledge fields (knowledge domains). Its conception goes beyond the scope of the well-known library classification systems, such as the Dewey Decimal Classification (DDC), the Universal Decimal Classification (UDC), and the Library of Congress Classification (LCC), by extending also to knowledge systems that have so far not provided for the classification of literature. ICC thus presents a flexible universal ordering system for both literature and other kinds of information, set out as knowledge fields. From a methodological point of view, ICC differs from the above-mentioned systems along the following three lines:

  1. Its main classes are not based on disciplines but on nine live stages of development, the so-called ontical levels.
  2. It subdivides them into hierarchical steps by nine further categories, which makes decimal number coding possible.
  3. The content of a knowledge field is marked out via a digital position scheme, in which the first hierarchical step refers to the nine ontical levels (object areas as subject categories) and the second hierarchical step refers to nine functionally ordered form categories.

The respective knowledge fields can be subdivided by the same principle to a third and fourth level, and even further to a fifth and sixth level. Knowledge field subdivisions must likewise conform to the digital position scheme. Hence, across knowledge fields, identical digits at a given position of the code mark identical categories. This mnemonic aspect of the system helps in memorizing and quickly locating interdisciplinary and transdisciplinary fields.
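The positional nature of such codes can be illustrated with a minimal Python sketch. It assumes that a code is written as a plain string of decimal digits with one digit per hierarchical level; the function names and the sample codes are hypothetical and chosen only for illustration, not taken from the published ICC tables.

```python
def hierarchy_path(code: str) -> list[str]:
    """Return the code of every hierarchical level, from the top level down."""
    if not code.isdigit():
        raise ValueError(f"expected a string of decimal digits, got {code!r}")
    # Each prefix of the code identifies one broader level of the hierarchy.
    return [code[:depth] for depth in range(1, len(code) + 1)]


def same_category(code_a: str, code_b: str, level: int) -> bool:
    """True if both codes carry the same digit at the given level (1-based)."""
    if level < 1 or level > min(len(code_a), len(code_b)):
        return False
    return code_a[level - 1] == code_b[level - 1]


# Sample codes are invented for illustration only.
print(hierarchy_path("4312"))        # ['4', '43', '431', '4312']
print(same_category("43", "63", 2))  # True: same digit at the second position
print(same_category("43", "63", 1))  # False: different first digit
```

Because every position carries a fixed meaning, comparing single digits is enough to tell whether two fields share a category at a given level.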

The first two hierarchical levels may be regarded as a top- or upper ontology for ontologies and other applications.

The terms of the first three hierarchical levels were set out in German and English in Wissensorganisation. Entwicklung, Aufgabe, Anwendung, Zukunft, [1] on pp. 82 to 100. The book was published in 2014 and is so far available only in German. In the meantime, the French terms of the knowledge fields have also been collected. Responsibility for maintenance and further development rests with the German Chapter of the International Society for Knowledge Organization (ISKO) e.V.

Historical development

At the end of 1970, Prof. Alwin Diemer of the University of Düsseldorf proposed that Ingetraut Dahlberg undertake a philosophical dissertation on The universal classification system of knowledge, its ontological, epistemological, and information theoretical foundations. Diemer had in mind an innovative ontological approach for such a system, based on the whole spectrum of kinds of being and complying with epistemological requirements. The third, information-theoretical requirement had already been taken up to some extent in the Indian Colon Classification, yet it still called for explanations and additions. In 1974, the dissertation was published in German under the title Grundlagen universaler Wissensordnung. [2] It started with conceptual clarifications and with why and how the term "universal" was linked to knowledge, including knowledge fields such as commodity science, artefacts, statistics, patents, standardization, communication, utility services and others. In chapter 3, six universal classification systems (DDC, UDC, LCC, BC, CC and BBK) were presented, analyzed and compared.

While preparing the dissertation, Dahlberg began elaborating the new universal system by first gathering a large number of extant designations of knowledge fields from whatever reference works were available. This was funded by the German Documentation Society (DGD) in 1971–72 under the title Order system of knowledge fields. In addition, the syllabuses of German universities and polytechnics were explored for relevant terms and documented (1975). Thereafter, it seemed necessary to add definitions from special dictionaries and encyclopedias; it soon appeared that the 12,500 terms included numerous synonyms, so that the whole collection boiled down to about 6,500 concept designations (Project Logstruktur, supported by the German Science Foundation (DFG), 1976–78).

The outcome of this work [2] was the formulation of 30 theses, which resulted in 12 principles for the new system, published 40 years later in [1]. These principles refer not only to theoretical foundations but also to the structure and other organizational aspects of the whole array of knowledge fields. By 1974, the digital position scheme for field subdivision had already been developed, to allow for classifying classification literature in the bibliographical section of the first issue of the journal International Classification. In 1977, the entire ICC was ready for presentation at a seminar in Bangalore, India. [3] The first three hierarchical levels, however, were published only in 1982. [4] ICC was applied to the bibliography of classification systems and thesauri in vol. 1 of the International Classification and Indexing Bibliography; [5] it has since been updated. [1]

Governing principles

These were published in full in the book Wissensorganisation. Entwicklung, Aufgabe, Anwendung, Zukunft [1] and in the article Information Coding Classification. Geschichtliches, Prinzipien, Inhaltliches; [6] hence it suffices here to mention their topics with some necessary additions.

The nine ontical levels (subject categories) are:

  1. Structure and form
  2. Matter and energy
  3. Cosmos and earth
  4. Biosphere
  5. Anthroposphere
  6. Sociosphere
  7. Material products (economics and technology)
  8. Intellectual products (knowledge and information)
  9. Spiritual products (products of mind and culture)

The nine functionally ordered form categories of the digital position scheme are:

  1. General area: problems, theories, principles (axiom and structure)
  2. Object area: objects, kinds, parts, properties of objects
  3. Activity area: methods, processes, activities
  4. Field properties or first characterization
  5. Persons or secondary characterization
  6. Societies or tertiary characterization
  7. Influences from outside
  8. Applications of the field to other fields
  9. Field information and synthesizing tasks

The digital position scheme, called Systematifier, has also been used for structuring the entire system via the categories figuring on the upper zero level.
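As a minimal sketch of how the two lists above combine, the following Python lookup decodes a two-digit code into its ontical level and its form category. It assumes that the first digit selects the level and the second digit selects the form category, as described for the digital position scheme; the function name is hypothetical, the labels are shortened from the lists above, and exceptions to the scheme are not modelled.

```python
# Labels copied (in shortened form) from the two lists above.
ONTICAL_LEVELS = {
    1: "Structure and form",
    2: "Matter and energy",
    3: "Cosmos and earth",
    4: "Biosphere",
    5: "Anthroposphere",
    6: "Sociosphere",
    7: "Material products",
    8: "Intellectual products",
    9: "Spiritual products",
}

FORM_CATEGORIES = {
    1: "General area",
    2: "Object area",
    3: "Activity area",
    4: "Field properties or first characterization",
    5: "Persons or secondary characterization",
    6: "Societies or tertiary characterization",
    7: "Influences from outside",
    8: "Applications of the field to other fields",
    9: "Field information and synthesizing tasks",
}


def decode(code: str) -> tuple[str, str]:
    """Split a two-digit code into (ontical level, form category)."""
    if len(code) != 2 or not code.isdigit() or "0" in code:
        raise ValueError(f"expected two digits in the range 1-9, got {code!r}")
    level, form = int(code[0]), int(code[1])
    return ONTICAL_LEVELS[level], FORM_CATEGORIES[form]


print(decode("43"))  # ('Biosphere', 'Activity area')
```

The sketch returns only the two category labels; it does not supply the names of the knowledge fields that ICC assigns to the individual codes.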

An example of its application is the structure of the classification system for knowledge organization literature, Gliederung der Klassifikationsliteratur. (A simplified version with an additional introduction is given in [1], p. 71.)

In matrix form

The first two levels of ICC can be represented by the following matrix.

[Figure: ICC as a matrix]

The first hierarchical level of the 9 subject categories results from the first vertical array under codes 1-9. The second hierarchical level of subject categories is structured by the 9 functionally ordered form categories, listed in the first horizontal line under codes 01-09. Some exceptions are mentioned in principle 7.
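The layout of such a matrix can be sketched in a few lines of Python. The sketch assumes that the code of each cell is simply the row digit (subject category, 1 to 9) followed by the column digit (form category, 1 to 9); the function name is hypothetical, and the exceptions mentioned in principle 7 are not modelled.

```python
def icc_matrix() -> list[list[str]]:
    """Build a 9 x 9 grid of two-digit codes: rows are levels, columns are forms."""
    return [[f"{row}{col}" for col in range(1, 10)] for row in range(1, 10)]


# Print one row per ontical level, with the nine form categories as columns.
for row in icc_matrix():
    print(" ".join(row))
```

The first printed row runs from 11 to 19 and the last from 91 to 99, giving 81 cells in total.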

Research

Exploration of automatic classification

For classifying web documents as conceived by Jens Hartmann, University of Karlsruhe, Prof. Walter Koch, University of Graz, explored at his institute, the Applied Information Technology Research Society (AIT), the application of ICC to the automatic classification of the metadata of some 350,000 documents. This was facilitated by data generated within the framework of the EU-supported project "EuropeanaLocal". For this exploration, three ICC hierarchical levels with some 5,000 terms were used. The result is described in the report of Christoph Mak. [7] Prof. Koch regarded a classification degree of almost 50% as a good result, considering that only a shortened version of ICC had been used. Reaching a better result would have taken 1–2 years. An index of all terms with their codes was also produced in the course of this exploration.
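What a "classification degree" measures can be illustrated with a deliberately naive Python sketch. It is not the method used in the study described above, which is documented only in the cited report; it merely assumes that a document counts as classified when at least one term from a term index occurs in its metadata. The function names, the sample metadata and the term-to-code pairs are all invented placeholders, not real ICC assignments.

```python
import re


def classify(metadata: str, term_index: dict[str, str]) -> list[str]:
    """Return the sorted codes of all index terms that occur in the metadata."""
    words = set(re.findall(r"\w+", metadata.lower()))
    # Only single-word terms are matched in this simplified sketch.
    return sorted({code for term, code in term_index.items() if term in words})


def classification_degree(documents: list[str], term_index: dict[str, str]) -> float:
    """Share of documents that received at least one code."""
    if not documents:
        return 0.0
    classified = sum(1 for doc in documents if classify(doc, term_index))
    return classified / len(documents)


# Invented placeholder terms and codes, for illustration only.
term_index = {"botany": "431", "geology": "321"}
documents = [
    "Atlas of botany",
    "Old town hall photograph",
    "Handbook of geology methods",
    "Untitled",
]
print(classification_degree(documents, term_index))  # 0.5
```

In this toy data set two of the four records match an index term, which gives a classification degree of 50%.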

Data Linkage

Motivated by the work of an Italian research group in Trento on Revising the Wordnet Domains Hierarchy: semantics, coverage and balancing, [8] in which DDC codes were used, Prof. Ernesto William De Luca et al. showed in a study that in such a case the use of ICC could lead to substantially better results. This was shown in two contributions: Including knowledge domains from the ICC into the Multilingual Lexical Linked Data Cloud (LLD) [9] and Die Multilingual Lexical Linked Data Cloud: Eine mögliche Zugangsoptimierung?, [10] in which the LLD was used in a meta-model that contains all resources and allows data to be retrieved and navigated from different aspects. In this way, the existing work on the many thousands of ICC knowledge fields can be combined with the Multilingual Lexical Linked Data Cloud, based on the RDF/OWL representation of EuroWordNet and similar integrated lexical resources (MultiWordNet, MEMODATA and the Hamburg Metapher BD).

Semantic Web structuring

In October 2013, the computer scientist Hermann Bense, Dortmund, explored the possibilities for structuring the Semantic Web with ICC codes. He developed two approaches for a pictorial presentation of knowledge fields with their possible subdivisions. A graphic representation of the knowledge fields pertaining to the first two levels can be found under Ontology4. The inclusion of the third hierarchical level has been envisaged as the next step.

Some potential applications of ICC in its present form

  1. Roughly structuring documents, especially bibliographies and reference works.
  2. Structuring personal repertories, e.g. a Who's Who such as Who's Who in Classification and Indexing. [11]
  3. Supporting the compilation of statistics by knowledge fields, e.g. statistics on university professors, academies, institutions, or teachers in special education.
  4. Publishing houses could adopt ICC codes for their products to help later sorting by knowledge fields.
  5. As a standard classification, ICC may be used in many cases, especially in industry, knowledge management and knowledge engineering.
  6. With the definition of all its terms, a lexicon of knowledge fields could be published. This could also be used for such lexica in other languages. [12]
  7. ICC could be used, for example, to compare ongoing scientific activities on a European or worldwide scale.
  8. ICC can also be an appropriate tool for switching between extant universal classification systems. [13]
  9. ICC can also be a suitable "hang-up system" for special classification systems, e.g. for special terminological concept systems.
  10. ICC in its three hierarchies, with the corresponding explanations, might also be used in higher education to give students an overview of knowledge fields and an understanding of the relationships in the whole of human knowledge.
  11. Similar to the Unified Medical Language System (UMLS) for medicine, such a Unified System of Knowledge Fields could be made available in many languages and thus promote a global understanding of knowledge fields.
  12. The alphabetical index to all knowledge field concepts could be used for comparisons with other such indexes to help in finding the fields missing from the different universal classification systems.
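The index comparison mentioned in the last item can be sketched as a simple set difference in Python. The sketch assumes that both indexes are available as plain lists of field names and that names match after trimming whitespace and ignoring case; real indexes would also need synonym handling and translation. The function names and the sample field names are hypothetical.

```python
def normalize(name: str) -> str:
    """Make field names comparable by trimming whitespace and ignoring case."""
    return name.strip().casefold()


def missing_fields(icc_index: list[str], other_index: list[str]) -> list[str]:
    """Return the ICC field names that have no counterpart in the other index."""
    known = {normalize(name) for name in other_index}
    return sorted({name for name in icc_index if normalize(name) not in known})


# Sample field names are illustrative only.
print(missing_fields(["Logic", "Mathematics", "Statistics"],
                     ["mathematics ", "LOGIC"]))  # ['Statistics']
```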

References

  1. Ingetraut Dahlberg (2014), Deutsche Sektion der Internationalen Gesellschaft für Wissensorganisation e.V. (ISKO) (ed.), "Wissensorganisation. Entwicklung, Aufgabe, Anwendung, Zukunft", Textbooks for Knowledge Organization (in German), Würzburg: Ergon Verlag, vol. 3, pp. 1–175, ISBN 978-3-95650-065-7
  2. Ingetraut Dahlberg (1974), Deutsche Gesellschaft für Dokumentation e.V. (ed.), Grundlagen universaler Wissensordnung. Probleme und Möglichkeiten eines universalen Klassifikationssystems des Wissens (in German), Pullach bei München: Verlag Dokumentation, ISBN 978-3111412672 (still available second-hand, otherwise as print on demand from De Gruyter)
  3. Ingetraut Dahlberg (1978), Sarada Ranganathan Endowment for Library Science (ed.), Ontical Structures and Universal Classification, Bangalore
  4. Ingetraut Dahlberg (1982), "ICC – Information Coding Classification. Principles, structure and application possibilities", International Classification, vol. 2, pp. 98–103
  5. Ingetraut Dahlberg, ed. (1982), "International Classification and Indexing Bibliography (ICIB 1): Classification systems and thesauri 1950-1982", Knowledge Organization International Journal, Frankfurt: INDEKS Verlag, ISSN 0943-7444
  6. Ingetraut Dahlberg (2010), Marlies Ockenfeld (ed.), "Information Coding Classification. Geschichtliches, Prinzipien, Inhaltliches", Information, Wissenschaft & Praxis (in German), De Gruyter, vol. 61, no. 8, pp. 449–454, ISSN 1619-4292
  7. Christian Mak (2011), "Kategorisierung des Datenbestandes der EuropeanaLocal-Österreich anhand der ICC", Bericht des Instituts "Angewandte Informationstechnik Forschungsgesellschaft MBH" (AIT) (in German), Graz
  8. Luisa Bentivogli; Pamela Forner; Bernardo Magnini; Emanuele Pianta (2004), "Revising WordNet Domains Hierarchy: Semantics, Coverage, and Balancing" (PDF), Proceedings of COLING 2004 Workshop on "Multilingual Linguistic Resources", Geneva, Switzerland, pp. 101–108
  9. Ernesto William DeLuca; et al. (2014), "Including Knowledge Domains from the ICC into the Multilingual Lexical Linked Data Cloud", Knowledge Organization in the 21st Century. Between Historical Patterns and Future Prospects. Proc. 13th Int. ISKO Conf., Kraków, Poland, pp. 258–365
  10. Ernesto William DeLuca; et al. (2014), "Die Multilingual Lexical Linked Data Cloud: Eine mögliche Zugangsoptimierung?", Information, Wissenschaft & Praxis (in German), De Gruyter, vol. 65, no. 4-5, pp. 279–287, ISSN 1619-4292
  11. Ingetraut Dahlberg, ed. (1983), Who's Who in Classification and Indexing, Frankfurt: INDEKS Verlag
  12. Ingetraut Dahlberg (2012), "A systematic new lexicon of all knowledge fields based on the Information Coding Classification.", Knowledge Organization 39, No. 2, pp. 142–150
  13. Ingetraut Dahlberg (1996), Green, R. (ed.), "Library catalogs and the Internet. Switching for future subject access.", Knowledge Organization & Change. Proc. 4th Int. ISKO Conf., Washington, D.C., Frankfurt: INDEKS Verlag, pp. 155–165