Gellish English dictionary

Last updated April 03, 2019

The Gellish English Dictionary-Taxonomy is an example of an open-source "smart" electronic dictionary in which concepts are arranged in a subtype-supertype hierarchy, thus forming a taxonomy. The dictionary-taxonomy is machine readable and compliant with the guidelines of ISO 16354. Besides being an English (business-technical) dictionary, it also defines the semantics of Gellish English, a computer-interpretable structured subset of natural English for data storage and data exchange. The dictionary-taxonomy differs from conventional dictionaries in several additional capabilities, which is why it is called "smart." Specifically, it satisfies the following criteria:^[1]

It contains definitions per concept, with each concept represented by a unique identifier (UID; a natural number). Ordinary dictionaries, by contrast, usually provide several different definitions of a single term, leaving it unclear whether those are alternative definitions of the same concept or definitions of different concepts. Thus a smart dictionary explicitly distinguishes homonyms (the same term for different concepts) and explicitly specifies which terms are used as true synonyms, including abbreviations, codes, etc.
It is itself defined in the form of Gellish Data Tables that together form a Gellish Database.
It is completely arranged as a taxonomy, which is a subtype-supertype hierarchy of concepts. This means that each concept is defined as an explicit subtype of one or more supertype concepts by specialization relations. This enables inheritance of definitions, knowledge, and requirements. An example of some definitions and hierarchy is the following:

Left hand UID	Left hand term	Relation type UID	Relation type phrase	Right hand UID	Right hand term	Definition
131737	line shaft pump	1146	is a specialization of	130058	centrifugal pump	is a centrifugal pump that has ...
130058	centrifugal pump	1146	is a specialization of	130206	pump	is a pump that ...
130206	pump	1146	is a specialization of	730006	equipment item	is an equipment item that ...
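
The inheritance that the specialization relations enable can be illustrated with a minimal Python sketch. It uses only the three rows of the table above; the function and variable names are illustrative assumptions and are not part of Gellish or of any Gellish tool.

```python
# Rows copied from the table above:
# (left hand UID, left hand term, relation type UID, right hand UID, right hand term)
ROWS = [
    (131737, "line shaft pump", 1146, 130058, "centrifugal pump"),
    (130058, "centrifugal pump", 1146, 130206, "pump"),
    (130206, "pump", 1146, 730006, "equipment item"),
]

SPECIALIZATION_UID = 1146  # UID of "is a specialization of" in the table


def build_supertypes(rows):
    """Index the direct supertypes of each concept and the term of each UID."""
    supertypes = {}
    names = {}
    for left_uid, left_term, rel_uid, right_uid, right_term in rows:
        names[left_uid] = left_term
        names[right_uid] = right_term
        if rel_uid == SPECIALIZATION_UID:
            supertypes.setdefault(left_uid, []).append(right_uid)
    return supertypes, names


def all_supertypes(uid, supertypes):
    """Return all direct and indirect supertype UIDs, nearest first."""
    seen = []
    queue = list(supertypes.get(uid, []))
    while queue:
        current = queue.pop(0)
        if current in seen:
            continue  # a concept may be reachable via more than one supertype
        seen.append(current)
        queue.extend(supertypes.get(current, []))
    return seen


SUPERTYPES, NAMES = build_supertypes(ROWS)
print([NAMES[uid] for uid in all_supertypes(131737, SUPERTYPES)])
# prints ['centrifugal pump', 'pump', 'equipment item']
```

Because a concept may have more than one supertype, the sketch keeps a list of direct supertypes per concept rather than a single parent.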

It also includes concepts with multi-term names, for example line shaft centrifugal pump. Ordinary dictionaries define only the separate terms, from which the meaning of the multi-term name cannot always be inferred.
It defines kinds of relations, also called relation types or fact types, as being special kinds of concepts. Such kinds of facts do not appear in ordinary dictionaries, because they are word-based and do not recognize standard phrases. However, fact types represent semantic concepts that are used by everybody to make sentences. Because these relation types are standardized and unambiguously defined, Gellish English becomes computer interpretable. The relation types in the dictionary-taxonomy include kinds that are specific for the expression of facts that represent knowledge, requirements, definitions, and information about individual things. The relation types have names and synonyms that consist of standardized phrases. For example (for clarity, the UIDs are not shown):

Left hand term	Relation type phrase	Right hand term
is a part of	is a specialization of	relation between individual things
can have as part a	is a specialization of	relation between kinds of things
shall have as part a	is a specialization of	can be a part of a

The phrase "is a part of" is a standard phrase for composition relations that can be used to express facts that relate parts to wholes. The other standard phrase, "can have as part a," denotes a concept that can be used to express the knowledge that a whole of a particular kind can have a component of a particular kind as a part. The "shall have as part a" relation is intended to be used to express requirements. All phrases also have inverse expressions, such as "has as part" and "can be a part of a," which denote the same concepts (relation types) but require the related objects in the inverse sequence. The following table illustrates the use of these relation types in Gellish English expressions.

Left hand term	Relation type phrase	Right hand term
B1	is a part of	P-101
B1	is classified as a	bearing
P-101	is classified as a	pump type A
pump	can have as part a	bearing
pump type A	shall have as part a	bearing

Note: All elements in the last two expressions above (pump, can/shall have as part a, and bearing) are "names" of standard English concepts that are defined in the Gellish English Dictionary-Taxonomy or, as in the case of "pump type A," in a proprietary extension.
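
As a rough illustration of how such expressions might be processed, the following Python sketch parses the tab-separated rows of the table above into triples and rewrites inverse phrases to their base phrase by swapping the related objects. The two inverse phrases are the ones named in the text; the function names and the choice of which phrase counts as the "base" are assumptions made for this example.

```python
# The expressions from the table above, one tab-separated expression per line.
EXPRESSIONS = (
    "B1\tis a part of\tP-101\n"
    "B1\tis classified as a\tbearing\n"
    "P-101\tis classified as a\tpump type A\n"
    "pump\tcan have as part a\tbearing\n"
    "pump type A\tshall have as part a\tbearing\n"
)

# Inverse phrase -> base phrase, limited to the two pairs named in the text.
INVERSE_PHRASES = {
    "has as part": "is a part of",
    "can be a part of a": "can have as part a",
}


def normalize(left, phrase, right):
    """Rewrite an inverse phrase to its base phrase and swap the objects."""
    if phrase in INVERSE_PHRASES:
        return (right, INVERSE_PHRASES[phrase], left)
    return (left, phrase, right)


def parse(text):
    """Parse tab-separated expressions into normalized triples."""
    facts = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        left, phrase, right = line.split("\t")
        facts.append(normalize(left, phrase, right))
    return facts


FACTS = parse(EXPRESSIONS)
# "P-101 has as part B1" expresses the same fact as "B1 is a part of P-101".
print(normalize("P-101", "has as part", "B1") == FACTS[0])  # prints True
```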

It includes explicit relations between concepts, using the above-mentioned standardized relation types. Those relations express knowledge about the concepts, which means that the dictionary-taxonomy is also an ontology. If a user of an application system classifies an individual object by a concept in the dictionary-taxonomy, then a computer system can infer that the knowledge and requirements about the concept apply to that individual object, and it can make that information available to the user. This also includes the knowledge and requirements that are available about the supertypes of that concept, because such knowledge and requirements are inherited by the subtypes of those supertype concepts, in accordance with the subtype-supertype hierarchy (the taxonomy).
It enables automatic translation and search in databases in other languages. This is possible because each concept is represented by a language-independent UID, whereas multiple terms (names) in various languages are allowed to denote the same concept. This makes it possible for facts that are expressed in one language to be automatically presented by a computer in any other language for which a smart dictionary is available. It also makes it possible to answer queries in one language using database systems that contain facts that are expressed in other languages and then present the results in the language in which the query was formulated.
It can be extended by private and proprietary concepts and terms. For example, company specific codes and proprietary knowledge can be added as required. Guidelines for the "proper definition of a concept" are provided in the documentation.
It is computer interpretable and system independent.
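
The inference described in the criteria above (knowledge and requirements about a kind, and about its supertypes, apply to an individual classified by that kind) can be sketched as follows. The facts about P-101, pump, pump type A, and bearing come from the example expressions; the fact that "pump type A" is a specialization of "pump" is an assumption added for the example, and all function names are hypothetical.

```python
FACTS = [
    ("P-101", "is classified as a", "pump type A"),
    # Assumption for this sketch: the text does not state this relation explicitly.
    ("pump type A", "is a specialization of", "pump"),
    ("pump", "can have as part a", "bearing"),
    ("pump type A", "shall have as part a", "bearing"),
]

# Relation types that express knowledge and requirements about kinds of things.
KIND_LEVEL_PHRASES = ("can have as part a", "shall have as part a")


def kinds_of(individual, facts):
    """Return the classifying kinds of an individual plus all their supertypes."""
    kinds = [r for l, p, r in facts
             if l == individual and p == "is classified as a"]
    i = 0
    while i < len(kinds):  # the list grows as supertypes are discovered
        for l, p, r in facts:
            if l == kinds[i] and p == "is a specialization of" and r not in kinds:
                kinds.append(r)
        i += 1
    return kinds


def applicable_facts(individual, facts):
    """Return the knowledge and requirements that apply to the individual."""
    kinds = kinds_of(individual, facts)
    return [(l, p, r) for l, p, r in facts
            if l in kinds and p in KIND_LEVEL_PHRASES]


for fact in applicable_facts("P-101", FACTS):
    print(fact)
# prints ('pump', 'can have as part a', 'bearing')
# prints ('pump type A', 'shall have as part a', 'bearing')
```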

The Gellish English Dictionary-Taxonomy is available as a collection of standardized Gellish Data Tables. Each of those tables has the same standard column definitions. Thus the whole dictionary-taxonomy can be treated as if it were one table.
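
Because every table shares the same columns, several tables can be read and concatenated as if they were one. The sketch below assumes tab-separated text with only the six column headers shown in the example table earlier; actual Gellish Data Tables define more columns, and the table contents and function names here are illustrative only.

```python
import csv
import io

# Simplified column set, taken from the example table in this article.
HEADER = [
    "Left hand UID", "Left hand term", "Relation type UID",
    "Relation type phrase", "Right hand UID", "Right hand term",
]

TABLE_1 = "\t".join(HEADER) + "\n" + (
    "131737\tline shaft pump\t1146\tis a specialization of\t130058\tcentrifugal pump\n"
)
TABLE_2 = "\t".join(HEADER) + "\n" + (
    "130058\tcentrifugal pump\t1146\tis a specialization of\t130206\tpump\n"
    "130206\tpump\t1146\tis a specialization of\t730006\tequipment item\n"
)


def read_rows(text):
    """Read one tab-separated table and check that it has the standard columns."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    if reader.fieldnames != HEADER:
        raise ValueError("unexpected columns: %r" % (reader.fieldnames,))
    return list(reader)


def merge(*tables):
    """Concatenate the rows of several tables that share the same columns."""
    rows = []
    for table in tables:
        rows.extend(read_rows(table))
    return rows


MERGED = merge(TABLE_1, TABLE_2)
print(len(MERGED))  # prints 3
```

Checking the header before merging keeps a table with a different layout from silently corrupting the combined result.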

The Gellish English Dictionary is freely available under open-source conditions (through one of the open source licenses) via the SourceForge Web site. Further documentation is available on the Gellish official website.

Related Research Articles

A semantic network, or frame network is a knowledge base that represents semantic relations between concepts in a network. This is often used as a form of knowledge representation. It is a directed or undirected graph consisting of vertices, which represent concepts, and edges, which represent semantic relations between concepts, mapping or connecting semantic fields.

In programming language theory, subtyping is a form of type polymorphism in which a subtype is a datatype that is related to another datatype by some notion of substitutability, meaning that program elements, typically subroutines or functions, written to operate on elements of the supertype can also operate on elements of the subtype. If S is a subtype of T, the subtyping relation is often written S <: T, to mean that any term of type S can be safely used in a context where a term of type T is expected. The precise semantics of subtyping crucially depends on the particulars of what "safely used in a context where" means in a given programming language. The type system of a programming language essentially defines its own subtyping relation, which may well be trivial should the language support no conversion mechanisms.

In database design, object-oriented programming and design, has-a is a composition relationship where one object "belongs to" another object, and behaves according to the rules of ownership. In simple words, has-a relationship in an object is called a member field of an object. Multiple has-a relationships will combine to form a possessive hierarchy.

In knowledge representation, object-oriented programming and design, is-a is a subsumption relationship between abstractions, wherein one class A is a subclass of another class B. In other words, type A is a subtype of type B when A's specification implies B's specification. That is, any object that satisfies A's specification also satisfies B's specification, because B's specification is weaker.

Substitutability is a principle in object-oriented programming stating that, in a computer program, if S is a subtype of T, then objects of type T may be replaced with objects of type S without altering any of the desirable properties of the program. More formally, the Liskov substitution principle (LSP) is a particular definition of a subtyping relation, called (strong) behavioral subtyping, that was initially introduced by Barbara Liskov in a 1987 conference keynote address titled Data abstraction and hierarchy. It is a semantic rather than merely syntactic relation, because it intends to guarantee semantic interoperability of types in a hierarchy, object types in particular. Barbara Liskov and Jeannette Wing described the principle succinctly in a 1994 paper as follows:

Subtype Requirement: Let $\phi(x)$ be a property provable about objects $x$ of type $T$. Then $\phi(y)$ should be true for objects $y$ of type $S$, where $S$ is a subtype of $T$.

A modeling language is any artificial language that can be used to express information or knowledge or systems in a structure that is defined by a consistent set of rules. The rules are used for interpretation of the meaning of components in the structure.

An information model in software engineering is a representation of concepts and the relationships, constraints, rules, and operations to specify data semantics for a chosen domain of discourse. Typically it specifies relations between kinds of things, but may also include relations with individual things. It can provide sharable, stable, and organized structure of information requirements or knowledge for the domain context.

A classification scheme is the product of arranging things into kinds of things (classes) or into groups of classes.

The top type in the type theory of mathematics, logic, and computer science, commonly abbreviated as top or by the down tack symbol (⊤), is the universal type, sometimes called the universal supertype because all other types in any given type system are subtypes of top. In most cases it is the type that contains every possible object in the type system of interest. It contrasts with the bottom type, or universal subtype, of which every other type is a supertype and which in most cases contains no members at all.

The ISO 15926 is a standard for data integration, sharing, exchange, and hand-over between computer systems.

Gellish is a formal language that is natural language independent, although its concepts have 'names' and definitions in various natural languages. Any natural language variant, such as Gellish Formal English, is a controlled natural language. Information and knowledge can be expressed in such a way that it is computer-interpretable, as well as system-independent and natural language independent. Each natural language variant is a structured subset of that natural language and is suitable for information modeling and knowledge representation in that particular language. All expressions, concepts and individual things are represented in Gellish by (numeric) unique identifiers. This enables software to translate expressions from one formal natural language to any other formal natural language.
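
The UID-based translation described above can be sketched in a few lines of Python. Only the UID 130206 for "pump" comes from this article; the other UIDs are placeholders, and the Dutch terms and phrase are assumed for illustration and are not taken from any Gellish dictionary. The sketch also assumes that each term is unique within a language, which does not hold when homonyms are present.

```python
# UID -> term per language.
TERMS = {
    130206: {"en": "pump", "nl": "pomp"},        # UID from the article; Dutch term assumed
    900001: {"en": "bearing", "nl": "lager"},    # placeholder UID; Dutch term assumed
    900002: {"en": "can have as part a",
             "nl": "kan als deel hebben een"},   # placeholder UID; Dutch phrase assumed
}


def to_uids(expression, lang):
    """Replace each term of an expression by its language-independent UID."""
    lookup = {names[lang]: uid for uid, names in TERMS.items()}
    return tuple(lookup[term] for term in expression)


def render(uids, lang):
    """Present a UID-based expression using the terms of the given language."""
    return tuple(TERMS[uid][lang] for uid in uids)


def translate(expression, source, target):
    """Translate an expression by going through the UIDs."""
    return render(to_uids(expression, source), target)


print(translate(("pump", "can have as part a", "bearing"), "en", "nl"))
# prints ('pomp', 'kan als deel hebben een', 'lager')
```

Since the stored fact consists only of UIDs, the same mechanism would let a query phrased in one language match facts that were entered in another.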

EXPRESS is a standard data modeling language for product data. EXPRESS is formalized in the ISO Standard for the Exchange of Product model STEP, and standardized as ISO 10303-11.

Data exchange is the process of taking data structured under a source schema and transforming it into data structured under a target schema, so that the target data is an accurate representation of the source data. Data exchange allows data to be shared between different computer programs.

A machine-readable dictionary (MRD) is a dictionary stored as machine (computer) data instead of being printed on paper. It is an electronic dictionary and lexical database.

Knowledge modeling is a process of creating a computer interpretable model of knowledge or standard specifications about a kind of process and/or about a kind of facility or product. The resulting knowledge model can only be computer interpretable when it is expressed in some knowledge representation language or data structure that enables the knowledge to be interpreted by software and to be stored in a database or data exchange file.
Knowledge-based engineering or knowledge-aided design is a process of computer-aided usage of such knowledge models for the design of products, facilities or processes. The design of products or facilities then uses the knowledge model to guide the creation of the facility or product that needs to be designed. In other words, it uses knowledge about a kind of object to create a product model of an (imaginary) individual object. Similarly, the design of a particular process implies the creation of a process model, a design activity that can be guided by the knowledge contained in a knowledge model about such a kind of process. The resulting process model, product model or facility model is typically also stored in a database.

A facility information model is an information model of an individual facility that is integrated with data and documents about the facility. The facility can be any large facility that is designed, fabricated, constructed and installed, operated, maintained and modified; for example, a complete infrastructural network, a process plant, a building, a highway, a ship or an airplane. The difference with a product model is that a product model is typically a model about a kind of product expressed as a data structure, whereas a facility information model typically is an integration of 1000–10,000 components and their properties and relations and 10,000–50,000 documents. A facility information model is intended for users that search for data and documents about the components of the facility and their operation.

A semantic data model (SDM) is a high-level semantics-based database description and structuring formalism for databases. This database model is designed to capture more of the meaning of an application environment than is possible with contemporary database models. An SDM specification describes a database in terms of the kinds of entities that exist in the application environment, the classifications and groupings of those entities, and the structural interconnections among them. SDM provides a collection of high-level modeling primitives to capture the semantics of an application environment. By accommodating derived information in a database structural specification, SDM allows the same information to be viewed in several ways; this makes it possible to directly accommodate the variety of needs and processing requirements typically present in database applications. The design of the present SDM is based on our experience in using a preliminary version of it. SDM is designed to enhance the effectiveness and usability of database systems. An SDM database description can serve as a formal specification and documentation tool for a database; it can provide a basis for supporting a variety of powerful user interface facilities; it can serve as a conceptual database model in the database design process; and it can be used as the database model for a new kind of database management system.^[5]

Generic data models are generalizations of conventional data models. They define standardised general relation types, together with the kinds of things that may be related by such a relation type.

Contemporary ontologies share many structural similarities, regardless of the language in which they are expressed. Most ontologies describe individuals (instances), classes (concepts), attributes, and relations.

References

[1] "Gellish: A Product Modeling Language". Retrieved 2010-04-30.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
