C--

Paradigm: imperative
Designed by: Simon Peyton Jones and Norman Ramsey
First appeared: 1998
Typing discipline: static, weak
Website: https://www.cs.tufts.edu/~nr/c--/index.html
Influenced by: C

C-- (pronounced C minus minus) is a C-like programming language, designed to be generated mainly by compilers for high-level languages rather than written by human programmers. It was created by functional programming researchers Simon Peyton Jones and Norman Ramsey. Unlike many other intermediate languages, it is represented in plain ASCII text, not bytecode or another binary format. [1] [2]

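There are two main branches: the original C--, whose version 2.0 was released in 2005, [3] and Cmm, the dialect used by the Glasgow Haskell Compiler (GHC). [4]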

Design

C-- is a "portable assembly language", [6] designed to ease the implementation of compilers that produce high-quality machine code. [7] This is done by delegating low-level code generation and program optimization to a C-- compiler. The language's syntax borrows heavily from C while omitting or changing standard C features such as variadic functions, pointer syntax, and aspects of C's type system, because they hamper essential features of C-- and make code generation harder.

The name of the language is an in-joke, indicating that C-- is a reduced form of C, in the same way that "C++" was chosen to connote an improved version of C. (In C, -- and ++ mean "decrement" and "increment", respectively.) [8]

Work on C-- began in the late 1990s. Because writing a custom code generator is a challenge in itself, and the compiler backends available to researchers at that time were complex and poorly documented, several projects had written compilers that generated C code (for instance, the original Modula-3 compiler). However, C is a poor choice for functional languages: it does not guarantee tail-call optimization, and it supports neither accurate garbage collection nor efficient exception handling. C-- is a tightly defined, simpler alternative to C that supports all of these. Its most innovative feature is a run-time interface that allows portable garbage collectors, exception handling systems, and other run-time features to be written so that they work with any C-- compiler.

The first version of C-- was released in April 1998 as an MSRA paper, [1] accompanied by a January 1999 paper on garbage collection. [2] A revised manual was posted in HTML form in May 1999. [9] Two sets of major changes proposed in 2000 by Norman Ramsey ("Proposed Changes") and Christian Lindig ("A New Grammar") led to C-- version 2, which was finalized around 2004 and officially released in 2005. [3]

Type system

The C-- type system is designed to reflect constraints imposed by hardware rather than conventions imposed by higher-level languages. A value stored in a register or memory may have only one type: bit-vector. However, bit-vector is a polymorphic type which comes in several widths, e.g. bits8, bits32, or bits64. A separate 32-or-64 bit family of floating-point types is supported. In addition to the bit-vector type, C-- provides a boolean type bool, which can be computed by expressions and used for control flow but cannot be stored in a register or memory. As in an assembly language, any higher type discipline, such as distinctions between signed, unsigned, float, and pointer, is imposed by the C-- operators or other syntactic constructs. C-- is not type-checked, nor does it enforce or check the calling convention. [3] :28
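The idea that a value is only a bit pattern, and that the operator decides how to read it, can be illustrated with a short sketch. The code below is Python, not C--, and the helper names (to_signed, lt_unsigned, lt_signed) are hypothetical illustrations rather than C-- operators. It assumes two's-complement encoding for the signed reading.

```python
def to_signed(bits, width):
    """Read a width-bit pattern as a two's-complement integer."""
    sign_bit = 1 << (width - 1)
    # Low bits count positively; the sign bit counts negatively.
    return (bits & (sign_bit - 1)) - (bits & sign_bit)


def lt_unsigned(a, b, width):
    """Less-than that treats both bit patterns as unsigned integers."""
    mask = (1 << width) - 1
    return (a & mask) < (b & mask)


def lt_signed(a, b, width):
    """Less-than that treats both bit patterns as signed integers."""
    return to_signed(a, width) < to_signed(b, width)


x = 0xFFFFFFFF  # one 32-bit pattern: 4294967295 if unsigned, -1 if signed
print(lt_unsigned(x, 1, 32))  # False
print(lt_signed(x, 1, 32))    # True
```

The same 32-bit value compares differently depending on which operation is applied to it, which is the sense in which the type discipline lives in the operators rather than in the stored value.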

C-- version 2 removes the distinction between bit-vector and floating-point types. These types can be annotated with a string "kind" tag to distinguish, among other things, a variable's integer vs float typing and its storage behavior (global or local). The former is useful on targets that have separate registers for integer and floating-point values. Special types for pointers and the native word were introduced, although they are mapped to a bit-vector with a target-dependent length. [3] :10

Example code

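The following C-- code calculates the sum and product of integers 1 through n [10] (n is received as an argument). It demonstrates two language features: a procedure can return multiple results, and a tail call is requested explicitly with the jump keyword.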

    /* Tail recursion */
    export sp;

    sp(bits32 n) {
        jump sp_help(n, 1, 1);
    }

    sp_help(bits32 n, bits32 s, bits32 p) {
        if n == 1 {
            return (s, p);
        } else {
            jump sp_help(n - 1, s + n, p * n);
        }
    }
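Because each jump in the example hands control to sp_help without returning to the caller, the helper behaves like a loop over its three arguments. The following Python sketch is not C-- and is not taken from the specification. It mirrors sp and sp_help only so that the accumulator logic can be checked by hand. The 32-bit mask is an assumption standing in for wrap-around arithmetic on bits32 values.

```python
MASK32 = 0xFFFFFFFF  # assumption: bits32 arithmetic wraps modulo 2**32


def sp_help(n, s, p):
    """Loop form of the tail-recursive helper; one iteration per jump."""
    while n != 1:
        # Mirrors `jump sp_help(n - 1, s + n, p * n)`: all three new values
        # are computed from the old ones before any of them is replaced.
        n, s, p = (n - 1) & MASK32, (s + n) & MASK32, (p * n) & MASK32
    return s, p  # mirrors `return (s, p)`: two results at once


def sp(n):
    """Sum and product of the integers 1 through n, for n >= 1."""
    return sp_help(n, 1, 1)


print(sp(5))  # (15, 120)
```

The accumulators start at 1 because the loop stops when n reaches 1 without folding that last value in: the initial s and p already account for it.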

Implementations

The specification page of C-- lists a few implementations of C--. The "most actively developed" compiler, Quick C--, was abandoned in 2013. [11]

Haskell

Some developers of C--, including Simon Peyton Jones, João Dias, and Norman Ramsey, work or have worked on GHC, whose development has led to extensions in the C-- language, forming the Cmm dialect which uses the C preprocessor for ergonomics. [4] [12]

GHC backends are responsible for further transforming C-- into executable code, via LLVM IR, slow C, or directly through the built-in native backend. [13] [14] [15] Despite the original intention, GHC does perform many of its generic optimizations on C--. As with other compiler IRs, the C-- representation can be dumped for debugging. [16] Target-specific optimizations are performed later by the backend.

Processing systems

As of 2023, most C-- processing systems are unmaintained, and their source code is no longer being released.

References

  1. Nordin, Thomas; Jones, Simon Peyton; Iglesias, Pablo Nogueira; Oliva, Dino (1998-04-23). "The C-- Language Reference Manual".
  2. Reig, Fermin; Ramsey, Norman; Jones, Simon Peyton (1999-01-01). "C--: a portable assembly language that supports garbage collection": 1–28.
  3. Ramsey, Norman; Jones, Simon Peyton. "The C-- Language Specification, Version 2.0" (PDF). Retrieved 11 December 2019.
  4. GHC Commentary: What the hell is a .cmm file?
  5. "An improved LLVM backend". April 2019.
  6. Oliva, Dino; Nordin, T.; Peyton Jones, Simon (1997-01-01). "C--: A Portable Assembly Language". Proceedings of the 1997 Workshop on Implementing Functional Languages via Microsoft.
  7. Jones, Simon Peyton; Nordin, Thomas; Oliva, Dino (1998). Clack, Chris; Hammond, Kevin; Davie, Tony (eds.). "C--: A portable assembly language". Implementation of Functional Languages. Berlin, Heidelberg: Springer: 1–19. doi:10.1007/BFb0055421. ISBN 978-3-540-68528-9.
  8. "Increment And Decrement Operators In C With Precedence". unstop.com. Retrieved 2024-06-20.
  9. Nordin, Thomas; Jones, Simon Peyton; Iglesias, Pablo Nogueira; Oliva, Dino (1999-05-23). "The C-- Language Reference Manual".
  10. Ramsey, Norman; Jones, Simon Peyton; Lindig, Christian (2005-02-23), The C-- Language Specification, version 2.0 (CVS Revision 1.128) (PDF), p. 7, retrieved 2023-06-22
  11. "C-- Downloads". www.cs.tufts.edu. Retrieved 11 December 2019.
  12. "5.10. GHC Backends — Glasgow Haskell Compiler 9.8.1 User's Guide". downloads.haskell.org. Retrieved 2024-06-20.
  13. GHC Backends
  14. "Opinion piece on GHC backends". andreaspk.github.io. August 25, 2019. Retrieved 2024-06-20.
  15. "Using the Glasgow Haskell Compiler (GHC)". ProgDoer. Retrieved 2024-06-20.
  16. Debugging compilers with optimization fuel
  17. "C-- Downloads". www.cs.tufts.edu. Retrieved 2024-06-20.