Sublanguage

A sublanguage is a subset of a language. Sublanguages occur in natural languages, computer programming languages, and relational databases.

In natural language

In informatics, natural language processing, and machine translation, a sublanguage is the language of a restricted domain, particularly a technical domain. In mathematical terms, "a subset of the sentences of a language forms a sublanguage of that language if it is closed under some operations of the language: e.g., if when two members of a subset are operated on, as by and or because, the resultant is also a member of that subset". [1] [2] [3] This is a specific term for what most linguistic study refers to as a language variety or register. [4]

In computer languages

The term sublanguage has also sometimes been used to denote a computer language that is a subset of another language. A sublanguage may be restricted syntactically (it accepts a subgrammar of the original language), and/or semantically (the set of possible outcomes for any given program is a subset of the possible outcomes in the original language).
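As a minimal sketch of a syntactically restricted sublanguage, the following Python snippet accepts only a small arithmetic subgrammar of Python expressions. The function name `in_sublanguage` and the choice of allowed constructs are assumptions made for illustration and do not come from any standard definition.

```python
import ast

# Syntax-tree node types permitted in this hypothetical arithmetic sublanguage.
ALLOWED_NODES = (
    ast.Expression, ast.BinOp, ast.UnaryOp, ast.Constant,
    ast.Add, ast.Sub, ast.Mult, ast.Div, ast.USub,
)


def in_sublanguage(source: str) -> bool:
    """Return True if `source` is a Python expression built only from
    numeric literals, + - * /, unary minus, and parentheses."""
    try:
        tree = ast.parse(source, mode="eval")
    except SyntaxError:
        # Not even a sentence of the full language.
        return False
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            return False
        if isinstance(node, ast.Constant):
            value = node.value
            # Only plain numbers are allowed; bool is a subclass of int, so exclude it.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                return False
    return True
```

Every string accepted by `in_sublanguage` is also a valid Python expression, while valid Python such as `x + 1` or `len('a')` is rejected, which is what makes the accepted set a syntactic subset of the original language.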

Examples

For instance, ALGOL 68S was a subset of ALGOL 68 designed to make it possible to write a single-pass compiler for this sublanguage.

SQL (Structured Query Language) statements can be classified in various ways [5] and are commonly grouped into sublanguages: a data query language (DQL), a data definition language (DDL), a data control language (DCL), and a data manipulation language (DML). [6]
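The grouping of SQL statements into sublanguages can be sketched as a lookup on a statement's leading keyword. The mapping below follows one common convention and is an assumption for illustration. It is not taken from the SQL standard, it covers only a handful of keywords, and the helper name `classify` is hypothetical.

```python
# One common (not authoritative) assignment of leading keywords to sublanguages.
SUBLANGUAGE_BY_KEYWORD = {
    "SELECT": "DQL",
    "CREATE": "DDL", "ALTER": "DDL", "DROP": "DDL",
    "GRANT": "DCL", "REVOKE": "DCL",
    "INSERT": "DML", "UPDATE": "DML", "DELETE": "DML",
}


def classify(statement: str) -> str:
    """Return the sublanguage label for an SQL statement based on its
    first keyword, or "unknown" if the keyword is not in the table."""
    words = statement.split()
    if not words:
        return "unknown"
    return SUBLANGUAGE_BY_KEYWORD.get(words[0].upper(), "unknown")
```

For example, `classify("select * from t")` yields `"DQL"` and `classify("GRANT SELECT ON t TO alice")` yields `"DCL"`. A keyword-based lookup is deliberately crude: as noted elsewhere in the literature, some authors count SELECT as part of the DML, so the table would need adjusting to match a different convention.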

In relational database theory

In relational database theory, the term "sublanguage", first used for this purpose by E. F. Codd in 1970, refers to a computer language used to define or manipulate the structure and contents of a database held in a relational database management system (RDBMS). Typical sublanguages associated with modern RDBMSs are QBE (Query by Example) and SQL (Structured Query Language). In 1985, Codd encapsulated his thinking in twelve rules that a database management system must satisfy in order to be considered truly relational. [7] [8] The fifth rule is known as the comprehensive data sublanguage rule, and states:

A relational system may support several languages and various modes of terminal use (for example, the fill-in-the-blanks mode). However, there must be at least one language whose statements are expressible, per some well-defined syntax, as character strings, and that is comprehensive in supporting all of the following items:
  • Data definition
  • View definition
  • Data manipulation (interactive and by program)
  • Integrity constraints
  • Authorization
  • Transaction boundaries (begin, commit, and rollback)
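As a rough illustration of several items in this list, the sketch below sends plain character-string SQL statements to SQLite through Python's standard `sqlite3` module. The table, view, and function names are hypothetical. Authorization is left out because SQLite does not implement GRANT or REVOKE, so this sketch does not show a fully comprehensive sublanguage in Codd's sense.

```python
import sqlite3


def run_demo():
    # isolation_level=None puts the connection in autocommit mode, so the
    # transaction boundaries below are exactly the strings we send.
    conn = sqlite3.connect(":memory:", isolation_level=None)

    # Data definition, with an integrity constraint (CHECK).
    conn.execute(
        "CREATE TABLE account ("
        " id INTEGER PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    # View definition.
    conn.execute(
        "CREATE VIEW funded AS SELECT id, balance FROM account WHERE balance > 0"
    )

    # Data manipulation inside explicit transaction boundaries.
    conn.execute("BEGIN")
    conn.execute("INSERT INTO account (id, balance) VALUES (1, 100)")
    conn.execute("INSERT INTO account (id, balance) VALUES (2, 0)")
    conn.execute("COMMIT")

    # An update that would violate the CHECK constraint is rejected,
    # and the surrounding transaction is rolled back.
    conn.execute("BEGIN")
    try:
        conn.execute("UPDATE account SET balance = balance - 150 WHERE id = 1")
        conn.execute("COMMIT")
        constraint_rejected = False
    except sqlite3.IntegrityError:
        conn.execute("ROLLBACK")
        constraint_rejected = True

    funded_rows = conn.execute("SELECT id, balance FROM funded").fetchall()
    conn.close()
    return constraint_rejected, funded_rows
```

Every operation here, from schema definition to rollback, is expressed as a character string in a single language, which is the property the rule asks for.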

References

  1. Harris, Zellig (1988). Language and Information. New York: Columbia University Press.
  2. Kittredge, Richard; Lehrberger, John (1982). Sublanguage: Studies of language in restricted semantic domains. Berlin: Walter de Gruyter.
  3. Sager, Naomi; Nhàn, Ngô Thanh (2002). "The computability of strings, transformations, and sublanguage". In Nevin, Bruce E; Johnson, Stephen M (eds.). The Legacy of Zellig Harris (PDF). Amsterdam/Philadelphia: John Benjamins. pp. 79–120. Retrieved 22 September 2020.
  4. Karlgren, Jussi (1993). "Sublanguages and Registers – A Note On Terminology" (PDF). Interacting with Computers. 5 (3): 348–350. doi:10.1016/0953-5438(93)90015-L. Retrieved 22 September 2020.
  5. SQL-92, 4.22 SQL-statements, 4.22.1 Classes of SQL-statements "There are at least five ways of classifying SQL-statements:", 4.22.2, SQL statements classified by function "The following are the main classes of SQL-statements:"; SQL:2003 4.11 SQL-statements, and later revisions.
  6. Chatham, Mark (2012). Structured Query Language By Example - Volume I: Data Query Language. p. 8. ISBN 978-1-29119951-2.
  7. Codd, E. F. (October 14, 1985). "Is Your DBMS Really Relational?". Computerworld.
  8. Codd, E. F. (October 21, 1985). "Does Your DBMS Run By The Rules?". Computerworld.