General-purpose macro processor

A general-purpose macro processor, or general-purpose preprocessor, is a macro processor that is not tied to or integrated with a particular language or piece of software.

A macro processor is a program that copies a stream of text from one place to another, making a systematic set of replacements as it does so. Macro processors are often embedded in other programs, such as assemblers and compilers. Sometimes they are standalone programs that can be used to process any kind of text.
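
The copy-and-replace behaviour described above can be illustrated with a minimal Python sketch. It is not modelled on any particular processor: the `expand` function, the dictionary of macro definitions, and the `max_depth` limit are all assumptions made for illustration, and macro names are assumed to be plain words.

```python
import re

def expand(text, macros, max_depth=20):
    """Copy `text`, replacing every macro name with its body.

    `macros` maps a name to its replacement text (hypothetical format).
    Bodies may mention other macros, so the text is rescanned until
    nothing changes or `max_depth` passes have been made.
    """
    if not macros:
        return text
    # Longest names first, so a longer name wins over one of its prefixes.
    names = sorted(macros, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")
    for _ in range(max_depth):
        # A function is used as the replacement so bodies are inserted literally.
        expanded = pattern.sub(lambda m: macros[m.group(1)], text)
        if expanded == text:
            return expanded
        text = expanded
    raise RecursionError("macro expansion did not terminate")

if __name__ == "__main__":
    definitions = {"GREETING": "Hello, NAME", "NAME": "world"}
    print(expand("GREETING!", definitions))
```

The pass limit is a simple guard against definitions that refer to themselves; real macro processors handle recursion in their own ways.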

Macro processors have been used for language expansion (defining new language constructs that can be expressed in terms of existing language components), for systematic text replacements that require decision making, and for text reformatting (e.g. conditional extraction of material from an HTML file).
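
As a rough illustration of conditional extraction, the Python sketch below keeps or drops blocks of lines depending on which flags are set. The `<!--if NAME-->` and `<!--endif-->` markers and the `extract` function are hypothetical, chosen only for this example; they are not the syntax of any processor listed in this article.

```python
def extract(lines, flags):
    """Return the lines that are not inside a disabled conditional block.

    A block opens with '<!--if NAME-->' and closes with '<!--endif-->'
    (both markers are assumptions for this sketch). A block is enabled
    when NAME is in `flags`; blocks may be nested.
    """
    kept = []
    enabled = []  # one boolean per currently open block
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("<!--if ") and stripped.endswith("-->"):
            name = stripped[len("<!--if "):-len("-->")].strip()
            enabled.append(name in flags)
        elif stripped == "<!--endif-->":
            if not enabled:
                raise ValueError("endif without a matching if")
            enabled.pop()
        elif all(enabled):
            # Copy the line only if every enclosing block is enabled.
            kept.append(line)
    if enabled:
        raise ValueError("unclosed if block")
    return kept

if __name__ == "__main__":
    page = [
        "<p>always shown</p>",
        "<!--if draft-->",
        "<p>draft note</p>",
        "<!--endif-->",
        "<p>end</p>",
    ]
    print("\n".join(extract(page, {"draft"})))
```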

Examples of general-purpose macro processors

| Name | Year | Description |
|---|---|---|
| GPM | 1960s | One of the earliest macro processors was GPM (the General Purpose Macrogenerator).[1] It was developed at the University of Cambridge, UK, in the mid-1960s, under the direction of Christopher Strachey. |
| ML/I | 1960s | One particularly important general-purpose macro processor was (and still is) ML/I (Macro Language One). It was developed as part of PhD research by a Cambridge postgraduate, Peter J. Brown. ML/I operates on a character stream and requires neither a special input format nor special flag characters to introduce macros. |
| STAGE2 | 1960s | A contemporary of ML/I was STAGE2,[2] part of William Waite's Mobile Programming System.[3] It too is a general-purpose macro processor, but it processes input a line at a time, matching each line against specified patterns. It is notable for being independent of character set, requiring only that the digits 0-9 are contiguous and in that order (a condition not met by some of the 6-bit and BCD character codes of the era). |
| M6 | 1960s | Early macro processor developed at AT&T Bell Laboratories by Douglas McIlroy, Robert Morris and Andrew Hall. It was influenced by GPM and TRAC. Implemented in FORTRAN IV,[4] it was ported to Version 2 Unix. |
| SNOBOL | 1960s | SNOBOL is a string processing language capable of most of the preprocessing that a macro processor can do. |
| XPOP | | XPOP was another attempt at a general macro processing language, by Mark Halpern at IBM in the 1960s. |
| TTM | 1968 | TTM is a recursive, interpretive language designed primarily for string manipulation, text editing, macro definition and expansion, and other applications generally classified as systems programming. It was developed in 1968 by Steven Caine and E. Kent Gordon at the California Institute of Technology. It is derived primarily from GAP[5] and GPM.[1] |
| GMP | 1970s | Another attempt was GMP (General Macro Processor), developed in the mid-1970s by M Boule in the DLB/GC department of the CII Company, following ideas from R.J. Chevance. Tested in association with Bordeaux I University, the first version ran on the SIRIS8/IRIS80 system. It was ported to mini6 systems and was the main component involved in system generation for this family of computers. The GMP processor used C2-Chomsky grammars to define the syntax of macros and an imperative language to carry out computations and perform macro expansion. |
| M4 | 1977 | m4 was designed for Unix by Brian Kernighan and Dennis Ritchie. |
| ELENA | | Described in Software: Practice and Experience, Vol. 14, pp. 519–531, Jun. 1984. |
| gema | 1995 | gema is a contextual macro processor based on pattern matching, written by David N. Gray. It replaces and enhances the concept of regular expressions with contexts, which roughly correspond to named sets of patterns. As a consequence, macros in gema closely resemble an EBNF description.[6] |
| GPP | 1996 | gpp is another general macro processor, written by Denis Auroux. It resembles a C preprocessor, but has more general semantics and allows for customized syntax (for instance, TeX, XHTML, and Prolog-like scripts are definable).[7] |
| M5 | 1999 | m5 is a general-purpose macro processor written by William A. Ward, Jr. Unlike many macro processors, m5 does not directly interpret its input. Instead it uses a two-pass approach in which the first pass translates the input to an awk program, and the second pass executes the awk program to produce the final output. |
| pyexpander | 2011 | pyexpander is a general-purpose macro processor based on the Python programming language. In addition to simple macro replacement, it allows evaluation of arbitrary Python expressions and execution of Python code, making pyexpander Turing complete.[8] |
| Text Assembler | 2014 | Text Assembler is a general-purpose text/macro processor based on the JavaScript programming language. Beyond simple macro replacement, it allows evaluating arbitrary JavaScript expressions and executing JavaScript code. It can also load JSON data models for more complex data-driven text processing tasks.[9] |
| PP | 2016 | PP is a text preprocessor designed for Pandoc (and more generally Markdown and reStructuredText). PP implements macros, literate programming, GraphViz, PlantUML and ditaa diagrams, and Bash, Cmd, PowerShell, Python and Haskell scripts.[10] |
| minimac | | minimac is a minimalist general-purpose macro processor. It operates as a character stream filter, recursively expanding macros as they are encountered. It is unusual for a macro processor in that it uses an explicit argument stack, and user functions are defined by concatenation (similar to the Forth language).[11] |
| aa_macro | 2017 | aa_macro is an open-source character-stream-based text processing language written in Python. Text is processed in a left-to-right, inside-to-outside manner. A selection of predefined built-in functions provides fundamental processing mechanisms that may be used directly or as elements of user-defined styles. The language is user extensible, and wtfm, an open-source web-based document preparation wrapper for the language, is available.[12][13] |
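
The line-at-a-time pattern matching described for STAGE2 can be sketched in Python as follows. This is only an illustration of the general idea, not STAGE2's actual notation: using `*` as a parameter marker in templates, `{0}`, `{1}` placeholders in macro bodies, and the `process` function itself are all assumptions. The sketch also leaves out the resubmission of generated lines for further matching.

```python
import re

def compile_template(template):
    """Turn a template such as 'LET * = *' into a regular expression.

    Each '*' (a hypothetical parameter marker) captures one argument;
    all other text must match literally.
    """
    literal_parts = [re.escape(part) for part in template.split("*")]
    return re.compile("^" + "(.+?)".join(literal_parts) + "$")

def process(lines, macros):
    """Match each input line against the templates, in order.

    `macros` is a list of (template, body) pairs. The first template
    that matches a line produces the output; lines that match no
    template are copied unchanged.
    """
    compiled = [(compile_template(t), body) for t, body in macros]
    output = []
    for line in lines:
        for pattern, body in compiled:
            match = pattern.match(line)
            if match:
                # Substitute captured arguments into the body by position.
                output.append(body.format(*match.groups()))
                break
        else:
            output.append(line)
    return output

if __name__ == "__main__":
    rules = [("LET * = *", "MOVE {1} TO {0}")]
    for out_line in process(["LET X = 5", "HALT"], rules):
        print(out_line)
```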

References

  1. Strachey, Christopher (October 1965). "A General Purpose Macrogenerator". Computer Journal. 8 (3): 225–241. doi:10.1093/comjnl/8.3.225.
  2. Waite, William M. (July 1970). "The mobile programming system: STAGE2". Communications of the ACM. New York, NY, USA: ACM. 13 (7): 415–421. doi:10.1145/362686.362691.
  3. Britten, Charles Randyl (2020-06-26). "Translation of 8080 Code to 8086 - Microsoft Translation of 8080 Code to 8086 and Other 16-Bit Processors". Archived from the original on 2021-07-23. Retrieved 2021-11-28. Stage2 was created by Prof William Waite at the University of Colorado in the late sixties as a major component of his mobile programming system, MPS. Stage2 uses a pattern matching algorithm to match input lines of text against a set of templates. Each template is the first line of a macro and when a match is recognized the code body of that macro is processed to produce output text, error messages, or create a constructed line that is submitted for further template matching. So the process is fully recursive and quite powerful in its capabilities for text transformation. In fact, it can be used to implement a programming language compiler.
  4. Cole, A. J. (1981). Macro Processors (2nd, revised ed.). CUP Archive. p. 254.
  5. Farber, D. J. (1964). 635 Assembly System - GAP. Bell Telephone Laboratories Computation Center.
  6. "Gema". gema.sourceforge.net. Retrieved 2017-04-21.
  7. "GPP". logological.org/gpp. Retrieved 2017-04-21.
  8. "Welcome to pyexpander's documentation! — pyexpander 1.7.0 documentation". pyexpander.sourceforge.net. Retrieved 2017-04-21.
  9. "akiwi - Text Assembler". www.akiwi.co.uk. Retrieved 2017-04-21.
  10. Delord, Christophe. "PP - A generic Preprocessor (with Pandoc in mind)". cdsoft.fr. Retrieved 2017-10-17.
  11. "minimac". freshmeat.net. Retrieved 2017-04-21.
  12. "aa_macro". Retrieved 2017-11-24.
  13. "wtfm". Retrieved 2017-11-24.