Gellish English dictionary

The Gellish English Dictionary-Taxonomy is an example of an open-source “smart” electronic dictionary, in which concepts are arranged in a subtype-supertype hierarchy, thus forming a taxonomy. The dictionary-taxonomy is machine readable and compliant with the guidelines of ISO 16354. Besides being an English (business-technical) dictionary, it also defines the semantics of Gellish English, a computer-interpretable structured subset of natural English for data storage and data exchange. The dictionary-taxonomy differs from conventional dictionaries in several additional capabilities, which is why it is called "smart." This means that it satisfies the following criteria: [1]

Taxonomy is the practice and science of classification. The word is also used as a count noun: a taxonomy, or taxonomic scheme, is a particular classification. The word has its roots in the Greek τάξις (taxis) and νόμος (nomos). Originally, taxonomy referred only to the classification of organisms or a particular classification of organisms. In a wider, more general sense, it may refer to a classification of things or concepts, as well as to the principles underlying such a classification. Taxonomy differs from meronomy, which deals with the classification of parts of a whole.

Open source is a term denoting that a product includes permission to use its source code, design documents, or content. It most commonly refers to the open-source model, in which open-source software or other products are released under an open-source license as part of the open-source-software movement. Use of the term originated with software, but has expanded beyond the software sector to cover other open content and forms of open collaboration.

Electronic dictionary: a dictionary whose data exists in digital form and can be accessed through a number of different media

An electronic dictionary is a dictionary whose data exists in digital form and can be accessed through a number of different media. Electronic dictionaries can be found in several forms, including software installed on tablet or desktop computers, mobile apps, web applications, and as a built-in function of E-readers. They may be free or require payment.

| Left hand UID | Left hand term | Relation type UID | Relation type phrase | Right hand UID | Right hand term | Definition |
|---|---|---|---|---|---|---|
| 131737 | line shaft pump | 1146 | is a specialization of | 130058 | centrifugal pump | is a centrifugal pump that has ... |
| 130058 | centrifugal pump | 1146 | is a specialization of | 130206 | pump | is a pump that ... |
| 130206 | pump | 1146 | is a specialization of | 730006 | equipment item | is an equipment item that ... |

| Left hand term | Relation type phrase | Right hand term |
|---|---|---|
| is a part of | is a specialization of | relation between individual things |
| can have as part a | is a specialization of | relation between kinds of things |
| shall have as part a | is a specialization of | can be a part of a |
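
Because every row in the first table above links a concept to its direct supertype, the full chain of supertypes can be derived by following those links. The following is a minimal Python sketch of that idea, not the dictionary's own software: the function names `build_index` and `supertypes` are hypothetical, and only the UIDs and terms are taken from the table.

```python
# Rows copied from the first table above:
# (left UID, left term, relation type UID, right UID, right term).
ROWS = [
    (131737, "line shaft pump", 1146, 130058, "centrifugal pump"),
    (130058, "centrifugal pump", 1146, 130206, "pump"),
    (130206, "pump", 1146, 730006, "equipment item"),
]

SPECIALIZATION_UID = 1146  # "is a specialization of"


def build_index(rows):
    """Return (names, parents): UID -> term, and UID -> direct supertype UIDs."""
    names, parents = {}, {}
    for left_uid, left_term, rel_uid, right_uid, right_term in rows:
        names[left_uid] = left_term
        names[right_uid] = right_term
        if rel_uid == SPECIALIZATION_UID:
            parents.setdefault(left_uid, []).append(right_uid)
    return names, parents


def supertypes(uid, parents):
    """Return all direct and indirect supertypes of uid, nearest first."""
    result, queue = [], list(parents.get(uid, []))
    while queue:
        current = queue.pop(0)
        if current not in result:  # guard against duplicates and cycles
            result.append(current)
            queue.extend(parents.get(current, []))
    return result


names, parents = build_index(ROWS)
chain = [names[uid] for uid in supertypes(131737, parents)]
print(chain)  # ['centrifugal pump', 'pump', 'equipment item']
```

Working on UIDs rather than on terms keeps the traversal independent of the names used for the concepts.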

The phrase "is a part of" is a standard phrase for composition relations that can be used to express facts that relate parts to wholes. The other standard phrase, "can have as part a," denotes a concept that can be used to express the knowledge that a whole of a particular kind can have as a part a component of a particular kind. The "shall have as part a" relation is intended to be used to express requirements. All phrases also have inverse expressions, such as "has as part" and "can be a part of a," which denote the same concepts (relation types) but require the related objects in the inverse sequence. The following table illustrates the use of these relation types in Gellish English expressions.

| Left hand term | Relation type phrase | Right hand term |
|---|---|---|
| B1 | is a part of | P-101 |
| B1 | is classified as a | bearing |
| P-101 | is classified as a | pump type A |
| pump | can have as part a | bearing |
| pump type A | shall have as part a | bearing |

Note: All elements in the last two expressions above (pump, can/shall have as part a, and bearing) are "names" of standard English concepts that are defined in the Gellish English Dictionary-Taxonomy or in a proprietary extension, as is the case for "pump type A."
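
To illustrate how a "shall have as part a" requirement could be verified against facts about individual things, the following Python sketch checks the five expressions from the table above. It is an illustrative simplification under stated assumptions: the helper names are hypothetical, expressions are held as plain term triples rather than UIDs, and subtypes of the required kind are not taken into account.

```python
# The five expressions from the table above, as (left, phrase, right) triples.
FACTS = [
    ("B1", "is a part of", "P-101"),
    ("B1", "is classified as a", "bearing"),
    ("P-101", "is classified as a", "pump type A"),
    ("pump", "can have as part a", "bearing"),
    ("pump type A", "shall have as part a", "bearing"),
]


def lefts(facts, phrase, right):
    """Left-hand terms of all facts with the given phrase and right-hand term."""
    return {l for l, p, r in facts if p == phrase and r == right}


def rights(facts, left, phrase):
    """Right-hand terms of all facts with the given left-hand term and phrase."""
    return {r for l, p, r in facts if l == left and p == phrase}


def unmet_requirements(facts):
    """Return (individual, required part kind) pairs that lack a matching part."""
    unmet = []
    for whole_kind, phrase, part_kind in facts:
        if phrase != "shall have as part a":
            continue
        # Every individual classified by the kind must have such a part.
        for whole in sorted(lefts(facts, "is classified as a", whole_kind)):
            parts = lefts(facts, "is a part of", whole)
            if not any(part_kind in rights(facts, part, "is classified as a")
                       for part in parts):
                unmet.append((whole, part_kind))
    return unmet


print(unmet_requirements(FACTS))      # []
print(unmet_requirements(FACTS[1:]))  # [('P-101', 'bearing')]
```

The second call drops the fact that B1 is a part of P-101, so the requirement on pump type A is reported as unmet for P-101.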

Ontology: the study of the nature of being, becoming, existence or reality, as well as the basic categories of being and their relations

Ontology is the philosophical study of being. More broadly, it studies concepts that directly relate to being, in particular becoming, existence, reality, as well as the basic categories of being and their relations. Traditionally listed as a part of the major branch of philosophy known as metaphysics, ontology often deals with questions concerning what entities exist or may be said to exist and how such entities may be grouped, related within a hierarchy, and subdivided according to similarities and differences.

The Gellish English Dictionary-Taxonomy is available as a collection of standardized Gellish Data Tables. Each of those tables has the same standard column definitions. Thus the whole dictionary-taxonomy can be treated as if it were one table.
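
Since all tables share the same column definitions, combining them amounts to concatenating their rows. The Python sketch below shows this under simplifying assumptions: the column list is reduced to the six columns shown earlier (the actual Gellish Data Table layout is not reproduced here), and the split into `pumps_table` and `equipment_table` is hypothetical.

```python
from itertools import chain

# Simplified column set, assumed for illustration only.
COLUMNS = ("left_uid", "left_term", "rel_uid", "rel_phrase", "right_uid", "right_term")

# Two hypothetical tables that follow the same column definitions.
pumps_table = [
    (131737, "line shaft pump", 1146, "is a specialization of", 130058, "centrifugal pump"),
    (130058, "centrifugal pump", 1146, "is a specialization of", 130206, "pump"),
]
equipment_table = [
    (130206, "pump", 1146, "is a specialization of", 730006, "equipment item"),
]


def merge_tables(*tables):
    """Concatenate tables that share COLUMNS into one list of row dictionaries."""
    merged = []
    for row in chain.from_iterable(tables):
        if len(row) != len(COLUMNS):
            raise ValueError(f"row does not match the standard columns: {row!r}")
        merged.append(dict(zip(COLUMNS, row)))
    return merged


dictionary = merge_tables(pumps_table, equipment_table)
print(len(dictionary))               # 3
print(dictionary[2]["right_term"])   # equipment item
```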

The Gellish English Dictionary is freely available under open-source conditions (through one of the open source licenses) via the SourceForge Web site. Further documentation is available on the Gellish official website.

An open-source license is a type of license for computer software and other products that allows the source code, blueprint or design to be used, modified and/or shared under defined terms and conditions. This allows end users and commercial companies to review and modify the source code, blueprint or design for their own customization, curiosity or troubleshooting needs. Open-source licensed software is mostly available free of charge, though this does not necessarily have to be the case. Licenses that permit only non-commercial redistribution, or modification of the source code for personal use only, are generally not considered open-source licenses. However, open-source licenses may have some restrictions, particularly regarding acknowledgment of the origin of the software, such as a requirement to preserve the names of the authors and a copyright statement within the code, or a requirement to redistribute the licensed software only under the same license. One popular set of open-source software licenses comprises those approved by the Open Source Initiative (OSI) based on its Open Source Definition (OSD).

Related Research Articles

Semantic network: a knowledge representation scheme that uses a directed graph to encode knowledge

A semantic network, or frame network, is a knowledge base that represents semantic relations between concepts in a network. This is often used as a form of knowledge representation. It is a directed or undirected graph consisting of vertices, which represent concepts, and edges, which represent semantic relations between concepts, mapping or connecting semantic fields.

In programming language theory, subtyping is a form of type polymorphism in which a subtype is a datatype that is related to another datatype by some notion of substitutability, meaning that program elements, typically subroutines or functions, written to operate on elements of the supertype can also operate on elements of the subtype. If S is a subtype of T, the subtyping relation is often written S <: T, to mean that any term of type S can be safely used in a context where a term of type T is expected. The precise semantics of subtyping crucially depends on the particulars of what "safely used in a context where" means in a given programming language. The type system of a programming language essentially defines its own subtyping relation, which may well be trivial should the language support no conversion mechanisms.

In database design, object-oriented programming and design, has-a is a composition relationship where one object "belongs to" another object, and behaves according to the rules of ownership. In simple words, has-a relationship in an object is called a member field of an object. Multiple has-a relationships will combine to form a possessive hierarchy.

In knowledge representation, object-oriented programming and design, is-a is a subsumption relationship between abstractions, wherein one class A is a subclass of another class B . In other words, type A is a subtype of type B when A’s specification implies B’s specification. That is, any object that satisfies A’s specification also satisfies B’s specification, because B’s specification is weaker.

Substitutability is a principle in object-oriented programming stating that, in a computer program, if S is a subtype of T, then objects of type T may be replaced with objects of type S without altering any of the desirable properties of the program. More formally, the Liskov substitution principle (LSP) is a particular definition of a subtyping relation, called (strong) behavioral subtyping, that was initially introduced by Barbara Liskov in a 1987 conference keynote address titled Data abstraction and hierarchy. It is a semantic rather than merely syntactic relation, because it intends to guarantee semantic interoperability of types in a hierarchy, object types in particular. Barbara Liskov and Jeannette Wing described the principle succinctly in a 1994 paper as follows:

Subtype Requirement: Let φ(x) be a property provable about objects x of type T. Then φ(y) should be true for objects y of type S where S is a subtype of T.
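
The subtype requirement can be illustrated with a small Python sketch. The classes `Pump` and `CentrifugalPump` and the function `total_volume` are hypothetical names chosen for this example; the property relied on is that `volume_moved` never returns a negative value for a non-negative number of hours, and the subtype preserves it.

```python
class Pump:
    """Supertype T: any pump with a fixed capacity in cubic metres per hour."""

    def __init__(self, capacity_m3_per_h: float):
        self.capacity_m3_per_h = capacity_m3_per_h

    def volume_moved(self, hours: float) -> float:
        """Property callers rely on: the result is not negative for hours >= 0."""
        return self.capacity_m3_per_h * hours


class CentrifugalPump(Pump):
    """Subtype S: adds detail but keeps the behaviour promised by Pump."""

    def __init__(self, capacity_m3_per_h: float, impeller_count: int = 1):
        super().__init__(capacity_m3_per_h)
        self.impeller_count = impeller_count


def total_volume(pumps: list[Pump], hours: float) -> float:
    """Written against Pump only; it also works for every well-behaved subtype."""
    return sum(pump.volume_moved(hours) for pump in pumps)


mixed = [Pump(10.0), CentrifugalPump(25.0, impeller_count=2)]
print(total_volume(mixed, 2.0))  # 70.0
```

A subtype that overrode `volume_moved` to return negative values would still type-check, but it would break the property and therefore violate the requirement.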

A modeling language is any artificial language that can be used to express information or knowledge or systems in a structure that is defined by a consistent set of rules. The rules are used for interpretation of the meaning of components in the structure.

Information model: a representation of conceptual relationships between things

An information model in software engineering is a representation of concepts and the relationships, constraints, rules, and operations to specify data semantics for a chosen domain of discourse. Typically it specifies relations between kinds of things, but may also include relations with individual things. It can provide sharable, stable, and organized structure of information requirements or knowledge for the domain context.

A classification scheme is the product of arranging things into kinds of things (classes) or into groups of classes.

The top type in the type theory of mathematics, logic, and computer science, commonly abbreviated as top or by the down tack symbol (⊤), is the universal type, sometimes called the universal supertype because all other types in any given type system are subtypes of top. In most cases it is the type that contains every possible object in the type system of interest. It contrasts with the bottom type, or universal subtype, of which every other type is a supertype and which in most cases contains no members at all.

ISO 15926 is a standard for data integration, sharing, exchange, and hand-over between computer systems.

Gellish is a formal language that is natural language independent, although its concepts have 'names' and definitions in various natural languages. Any natural language variant, such as Gellish Formal English, is a controlled natural language. Information and knowledge can be expressed in such a way that they are computer-interpretable, as well as system-independent and natural language independent. Each natural language variant is a structured subset of that natural language and is suitable for information modeling and knowledge representation in that particular language. All expressions, concepts and individual things are represented in Gellish by (numeric) unique identifiers. This enables software to translate expressions from one formal natural language to any other formal natural language.
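
The role of unique identifiers in translation can be sketched in a few lines of Python. This is an illustration only: the UIDs are those shown for "pump", "centrifugal pump" and "is a specialization of" in the dictionary excerpt, whereas the Dutch names and the `render` function are assumptions made for this example and are not taken from any Gellish dictionary.

```python
# Names per language for the same UIDs.
# The English names come from the dictionary excerpt; the Dutch names are assumed.
NAMES = {
    "en": {130058: "centrifugal pump", 130206: "pump", 1146: "is a specialization of"},
    "nl": {130058: "centrifugaalpomp", 130206: "pomp", 1146: "is een specialisatie van"},
}


def render(expression, language):
    """Render a (left UID, relation UID, right UID) triple in the given language."""
    left_uid, rel_uid, right_uid = expression
    names = NAMES[language]
    return f"{names[left_uid]} {names[rel_uid]} {names[right_uid]}"


# The expression itself is stored without reference to any natural language.
expression = (130058, 1146, 130206)
print(render(expression, "en"))  # centrifugal pump is a specialization of pump
print(render(expression, "nl"))  # centrifugaalpomp is een specialisatie van pomp
```

Because the stored expression holds only identifiers, adding another language requires only another name table, not a change to the expression.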

EXPRESS (data modeling language)

EXPRESS is a standard data modeling language for product data. EXPRESS is formalized in the ISO Standard for the Exchange of Product model data (STEP, ISO 10303) and standardized as ISO 10303-11.

Data exchange is the process of taking data structured under a source schema and transforming it into data structured under a target schema, so that the target data is an accurate representation of the source data. Data exchange allows data to be shared between different computer programs.

A machine-readable dictionary (MRD) is a dictionary stored as machine (computer) data instead of being printed on paper. It is an electronic dictionary and lexical database.

Knowledge modeling is a process of creating a computer interpretable model of knowledge or standard specifications about a kind of process and/or about a kind of facility or product. The resulting knowledge model can only be computer interpretable when it is expressed in some knowledge representation language or data structure that enables the knowledge to be interpreted by software and to be stored in a database or data exchange file.
Knowledge-based engineering or knowledge-aided design is a process of computer-aided usage of such knowledge models for the design of products, facilities or processes. The design of products or facilities then uses the knowledge model to guide the creation of the facility or product that needs to be designed. In other words, it uses knowledge about a kind of object to create a product model of an (imaginary) individual object. Similarly, the design of a particular process implies the creation of a process model, a design activity that can be guided by the knowledge contained in a knowledge model about such a kind of process. The resulting process model, product model or facility model is typically also stored in a database.

A facility information model is an information model of an individual facility that is integrated with data and documents about the facility. The facility can be any large facility that is designed, fabricated, constructed and installed, operated, maintained and modified; for example, a complete infrastructural network, a process plant, a building, a highway, a ship or an airplane. The difference with a product model is that a product model is typically a model about a kind of product expressed as a data structure, whereas a facility information model typically is an integration of 1000–10,000 components and their properties and relations and 10,000–50,000 documents. A facility information model is intended for users that search for data and documents about the components of the facility and their operation.

Semantic data model

Semantic data model is a high-level semantics-based database description and structuring formalism for databases. This database model is designed to capture more of the meaning of an application environment than is possible with contemporary database models. An SDM specification describes a database in terms of the kinds of entities that exist in the application environment, the classifications and groupings of those entities, and the structural interconnections among them. SDM provides a collection of high-level modeling primitives to capture the semantics of an application environment. By accommodating derived information in a database structural specification, SDM allows the same information to be viewed in several ways; this makes it possible to directly accommodate the variety of needs and processing requirements typically present in database applications. The design of the present SDM is based on our experience in using a preliminary version of it. SDM is designed to enhance the effectiveness and usability of database systems. An SDM database description can serve as a formal specification and documentation tool for a database; it can provide a basis for supporting a variety of powerful user interface facilities, it can serve as a conceptual database model in the database design process; and, it can be used as the database model for a new kind of database management system.[5]

Generic data model

Generic data models are generalizations of conventional data models. They define standardised general relation types, together with the kinds of things that may be related by such a relation type.

Contemporary ontologies share many structural similarities, regardless of the language in which they are expressed. Most ontologies describe individuals (instances), classes (concepts), attributes, and relations.

References

  1. "Gellish: A Product Modeling Language". Retrieved 2010-04-30.