Free-form language

Last updated

In computer programming, a free-form language is a programming language in which the positioning of characters on the page in program text is insignificant. Program text does not need to be placed in specific columns as on old punched card systems, and frequently ends of lines are insignificant. Whitespace characters are used only to delimit tokens, and have no other significance. Free-form languages allow a greater degree of flexibility and have fewer syntactic rules to learn, which could lower the entry barrier for beginners. [1]

Most free-form languages descend from ALGOL, including C, Pascal, and Perl. Lisp languages are free-form, although they do not descend from ALGOL. Rexx and its dialects ooRexx and NetRexx are mostly free-form, though in some cases whitespace characters are concatenation operators. SQL, though not a full programming language, is also free-form.

Most free-form languages are also structured programming languages, which is sometimes thought to go along with the free-form syntax: Earlier imperative programming languages such as Fortran 77 used particular columns for line numbers, which many structured languages do not use or need.

Structured languages exist which are not free-form, such as ABC, Curry, Haskell, Python and others. Many of these use some variant of the off-side rule, in which indentation, rather than keywords or braces, is used to group blocks of code.

See also

Related Research Articles

<span class="mw-page-title-main">Programming language</span> Language for communicating instructions to a machine

A programming language is a system of notation for writing computer programs.

<span class="mw-page-title-main">Plain text</span> Term for computer data consisting only of unformatted characters of readable material

In computing, plain text is a loose term for data that represent only characters of readable material but not its graphical representation nor other objects. It may also include a limited number of "whitespace" characters that affect simple arrangement of text, such as spaces, line breaks, or tabulation characters. Plain text is different from formatted text, where style information is included; from structured text, where structural parts of the document such as paragraphs, sections, and the like are identified; and from binary files in which some portions must be interpreted as binary objects.

Rebol is a cross-platform data exchange language and a multi-paradigm dynamic programming language designed by Carl Sassenrath for network communications and distributed computing. It introduces the concept of dialecting: small, optimized, domain-specific languages for code and data, which is also the most notable property of the language according to its designer Carl Sassenrath:

Although it can be used for programming, writing functions, and performing processes, its greatest strength is the ability to easily create domain-specific languages or dialects

<span class="mw-page-title-main">Shell script</span> Script written for the shell, or command line interpreter, of an operating system

A shell script is a computer program designed to be run by a Unix shell, a command-line interpreter. The various dialects of shell scripts are considered to be scripting languages. Typical operations performed by shell scripts include file manipulation, program execution, and printing text. A script which sets up the environment, runs the program, and does any necessary cleanup or logging, is called a wrapper.

<span class="mw-page-title-main">Text editor</span> Computer software used to edit plain text documents

A text editor is a type of computer program that edits plain text. Such programs are sometimes known as "notepad" software. Text editors are provided with operating systems and software development packages, and can be used to change files such as configuration files, documentation files and programming language source code.

In computer science, Backus–Naur form is a notation used to describe the syntax of programming languages or other formal languages. It was developed by John Backus and Peter Naur. BNF can be described as a metasyntax notation for context-free grammars. Backus–Naur form is applied wherever exact descriptions of languages are needed, such as in official language specifications, in manuals, and in textbooks on programming language theory. BNF can be used to describe document formats, instruction sets, and communication protocols.

Lexical tokenization is conversion of a text into meaningful lexical tokens belonging to categories defined by a "lexer" program. In case of a natural language, those categories include nouns, verbs, adjectives, punctuations etc. In case of a programming language, the categories include identifiers, operators, grouping symbols and data types. Lexical tokenization is related to the type of tokenization used in Large language models (LLMs), but with two differences. First, lexical tokenization is usually based on a lexical grammar, whereas LLM tokenizers are usually probability-based. Second, LLM tokenizers perform a second step that converts the tokens into numerical values.

In computer programming, indentation style is a convention, a.k.a. style, governing the indentation of blocks of source code. An indentation style generally involves consistent width of whitespace before each line of a block, so that the lines of code appear to be related, and dictates whether to use space or tab characters for the indentation whitespace.

Pretty-printing is the application of any of various stylistic formatting conventions to text files, such as source code, markup, and similar kinds of content. These formatting conventions may entail adhering to an indentation style, using different color and typeface to highlight syntactic elements of source code, or adjusting size, to make the content easier for people to read, and understand. Pretty-printers for source code are sometimes called code formatters or beautifiers.

Non-English-based programming languages are programming languages that do not use keywords taken from or inspired by English vocabulary.

The off-side rule describes syntax of a computer programming language that defines the bounds of a code block via indentation.

<span class="mw-page-title-main">ALGOL 68</span> Programming language

ALGOL 68 is an imperative programming language that was conceived as a successor to the ALGOL 60 programming language, designed with the goal of a much wider scope of application and more rigorously defined syntax and semantics.

In computer programming, a one-pass compiler is a compiler that passes through the parts of each compilation unit only once, immediately translating each part into its final machine code. This is in contrast to a multi-pass compiler which converts the program into one or more intermediate representations in steps between source code and machine code, and which reprocesses the entire compilation unit in each sequential pass.

A whitespace character is a character data element that represents white space when text is rendered for display by a computer.

<span class="mw-page-title-main">Syntax (programming languages)</span> Set of rules defining correctly structured programs

In computer science, the syntax of a computer language is the rules that define the combinations of symbols that are considered to be correctly structured statements or expressions in that language. This applies both to programming languages, where the document represents source code, and to markup languages, where the document represents data.

This comparison of programming languages compares the features of language syntax (format) for over 50 computer programming languages.

<span class="mw-page-title-main">Scripting language</span> Programming language designed for scripting

In computing, a script is a relatively short and simple set of instructions that typically automate an otherwise manual process. The act of writing a script is called scripting. Scripting language or script language describes a programming language that it is used for scripting.

<span class="mw-page-title-main">Rexx</span> Command/scripting/programming language

Rexx is a programming language that can be interpreted or compiled. It was developed at IBM by Mike Cowlishaw. It is a structured, high-level programming language designed for ease of learning and reading. Proprietary and open source Rexx interpreters exist for a wide range of computing platforms; compilers exist for IBM mainframe computers.

In computer programming languages, an identifier is a lexical token that names the language's entities. Some of the kinds of entities an identifier might denote include variables, data types, labels, subroutines, and modules.

References

  1. Winkler, Till; Flatscher, Rony G. (2023). "Cognitive Load in Programming Education: Easing the Burden on beginners with REXX" (PDF). In Central European Conference on Information and Intelligent Systems. Faculty of Organization and Informatics Varazdin. pp. 171–178.