Clean (programming language)

Clean
Paradigm: functional
Designed by: Software Technology Research Group of Radboud University Nijmegen
First appeared: 1987
Stable release: 3.1 / 5 January 2022
Typing discipline: strong, static, dynamic
OS: Cross-platform
License: Simplified BSD [1]
Filename extensions: .icl, .dcl, .abc
Website: clean.cs.ru.nl
Influenced by: Lean, Miranda, Haskell
Influenced: Haskell, Idris [2]

Clean is a general-purpose purely functional programming language. Originally called the Concurrent Clean System [3] or the Clean System, [4] [5] it has been developed by a group of researchers from the Radboud University in Nijmegen since 1987. [6] [7] Although development of the language has slowed, some researchers are still working in the language. [8] In 2018, a spin-off company was founded that uses Clean. [9]

Features

Clean shares many properties and syntax with a younger sibling language, Haskell: referential transparency, list comprehension, guards, garbage collection, higher order functions, currying, and lazy evaluation. However, Clean deals with mutable state and input/output (I/O) through a uniqueness type system, in contrast to Haskell's use of monads. The compiler takes advantage of the uniqueness type system to generate more efficient code, because it knows that at any point during the execution of the program, only one reference can exist to a value with a unique type. Therefore, a unique value can be changed in place. [10]
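As a minimal sketch of how uniqueness typing threads state through a program, the following module writes to the console using the StdEnv functions stdio, fwrites, and fclose. The module name hello is only an assumption for illustration (it has to match the file name hello.icl). The * in *World marks the world as unique: each step consumes the previous world or console value and yields a new one, so no stale reference can be used afterwards.

```clean
module hello

import StdEnv

// Start receives the unique world and must hand back the updated world.
Start :: *World -> *World
Start world
    # (console, world) = stdio world                        // obtain the console as a unique file
    # console          = fwrites "Hello, world!\n" console  // the previous console value is consumed here
    # (_, world)       = fclose console world               // closing the file returns the world
    = world
```

Reusing an earlier console or world after it has been passed on would be rejected by the type checker, which is what allows the compiler to update such values in place.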

An integrated development environment (IDE) for Microsoft Windows is included in the Clean distribution.

Examples

Hello world:

    Start = "Hello, world!"

Factorial:

    fac :: Int -> Int
    fac 0 = 1
    fac n = n * fac (n-1)

    Start = fac 10

An equivalent definition using the library function prod:

    fac :: Int -> Int
    fac n = prod [1..n] // The product of the numbers 1 to n

    Start = fac 10

Fibonacci sequence:

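    fib :: Int -> Int
    fib 0 = 1
    fib 1 = 1
    fib n = fib (n - 2) + fib (n - 1)

    Start = fib 7

An alternative definition that builds the sequence as a lazy list:

    fibs :: Int Int -> [Int]
    fibs x_2 x_1 = [x_2 : fibs x_1 (x_2 + x_1)]

    fib :: Int -> Int
    fib n = (fibs 1 1) !! n

    Start = fib 7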

Infix operator:

    (^) infixr 8 :: Int Int -> Int
    (^) x 0 = 1
    (^) x n = x * x ^ (n-1)

The type declaration states that the function is a right associative infix operator with priority 8: this states that x*x^(n-1) is equivalent to x*(x^(n-1)) as opposed to (x*x)^(n-1). This operator is pre-defined in StdEnv, the Clean standard library.
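The effect of the fixity declaration can be checked with a small program that uses the predefined (^) from StdEnv rather than redefining it. This is only a sketch; the module name assoc is an assumption for illustration.

```clean
module assoc

import StdEnv

// (^) is infixr 8 and (*) has a lower priority, so:
//   2 ^ 3 ^ 2   groups as 2 ^ (3 ^ 2)
//   2 * 3 ^ 2   groups as 2 * (3 ^ 2)
Start :: (Int, Int, Int, Int)
Start = (2 ^ 3 ^ 2, (2 ^ 3) ^ 2, 2 * 3 ^ 2, (2 * 3) ^ 2)
```

The four components should evaluate to 512, 64, 18, and 36 respectively, showing that the unparenthesised forms differ from the left-grouped ones.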

How Clean works

Computing is based on graph rewriting and reduction. Constants such as numbers are graphs and functions are graph rewriting formulas. This, combined with compiling to native code, makes Clean programs which use high abstraction run relatively fast according to The Computer Language Benchmarks Game. [11] A 2008 benchmark showed that Clean native code performs similarly to the Glasgow Haskell Compiler (GHC), depending on the benchmark. [12]
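One visible consequence of the graph-based model is sharing. In the sketch below (module name sharing is an assumption), the local constant ones is, as far as the language definition describes local constants, a single node that refers back to itself, so the infinite list is represented by a small cyclic graph and take only unfolds as much of it as is demanded.

```clean
module sharing

import StdEnv

// ones is a local constant: one shared cons node whose tail points to itself.
Start :: [Int]
Start = take 5 ones
where
    ones = [1 : ones]
```

Because evaluation is lazy, the program terminates even though ones is conceptually infinite; the result is a list of five 1s.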

Compiling

Compilation of Clean to machine code is performed as follows:

  1. Source files (.icl) and definition files (.dcl) are translated into Core Clean, a basic variant of Clean, by the compiler frontend written in Clean.
  2. Core Clean is converted into Clean's platform-independent intermediate language (.abc), by the compiler backend written in Clean and C.
  3. Intermediate ABC code is converted to object code (.o) by the code generator written in C.
  4. Object code is linked with other files in the module and the runtime system and converted into a normal executable using the system linker (when available) or a dedicated linker written in Clean on Windows.

Earlier versions of the Clean compiler were written completely in C, thus avoiding bootstrapping issues.

The ABC machine

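The ABC code mentioned above is an intermediate representation for an abstract machine. Because machine code generation for ABC code is relatively straightforward, it is easy to support new architectures. The ABC machine is an imperative abstract graph rewriting machine. [13] It consists of a graph store to hold the Clean graph that is being rewritten and three stacks: the A(rgument)-stack, which holds references to nodes in the graph store; the B(asic value)-stack, which holds basic values such as integers, characters, and reals; and the C(ontrol)-stack, which holds return addresses for flow control.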

The runtime system, which is linked into every executable, builds a Start node in the graph store and pushes it on the A-stack. It then begins printing it, evaluating it as needed.

Running Clean in the browser

Although Clean is typically used to generate native executables, several projects have enabled applications in web browsers. The now abandoned SAPL project compiled Core Clean to JavaScript and did not use ABC code. Since 2019, an interpreter for ABC code, written in WebAssembly, is used instead. [14] [15]

Platforms

Clean is available for Microsoft Windows (IA-32 and X86-64), macOS (X86-64), and Linux (IA-32, X86-64, and AArch64). [citation needed]

Some libraries are not available on all platforms; ObjectIO, for example, is available only on Windows. Writing dynamics to files is likewise supported only on Windows. [citation needed]

The availability of Clean per platform varies with each version: [16] [17]

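| Version | Date | Linux IA-32 | Linux x86-64 | Linux AArch64 | macOS Motorola 68040 | macOS PowerPC | macOS x86-64 | Oracle Solaris SPARC | Windows IA-32 | Windows x86-64 | Miscellaneous |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 3.1 | 5 January 2022 | Yes | Yes | Yes | No | No | Yes | No | Yes | Yes | |
| 3.0 | 2 October 2018 | Yes | Yes | No | No | No | Yes | No | Yes | Yes | |
| 2.4 | 23 December 2011 | Yes | Yes | No | No | No | Yes | No | Yes | Yes | |
| 2.3 | 22 December 2010 | Yes | Yes | No | No | No | No | No | Yes | Yes | |
| 2.2 | 19 December 2006 | Yes | Yes | No | No | Yes | No | Yes | Yes | Yes | |
| 2.1.1 | 31 May 2005 | Yes | No | No | No | Yes | No | Yes | Yes | No | |
| 2.1.0 | 31 October 2003 | Yes | No | No | No | Yes | No | Yes | Yes | No | |
| 2.0.2 | 12 December 2002 | Yes | No | No | No | Yes | No | Yes | Yes | No | |
| 2.0.1 | 4 July 2002 | Yes | No | No | No | Yes | No | Yes | Yes | No | |
| 2.0 | 21 December 2001 | No | No | No | No | No | No | No | Yes | No | |
| 1.3.3 | 13 September 2000 | Yes | No | No | No | Yes | No | Yes | Yes | No | |
| 1.3.2 | 1 July 1999 | No | No | No | Yes | Yes | No | Yes | Yes | No | |
| 1.3.1 | January 1999 | Yes | No | No | No | Yes | No | Yes | Yes | No | |
| 1.3 | 22 May 1998 | Yes | No | No | No | Yes | No | Yes | Yes | No | |
| 1.2.4 | June 1997 | No | No | No | Yes | Yes | No | No | Yes | No | |
| 1.2.3 | May 1997 | No | No | No | Yes | Yes | No | No | Yes | No | |
| 1.2 | 13 January 1997 | No | No | No | Yes | Yes | No | No | No | No | |
| 1.1.3 | October 1996 | No | No | No | No | No | No | Yes | No | No | OS/2 (i80386) |
| 1.1.2 | September 1996 | Yes | No | No | No | No | No | Yes | No | No | SunOS 4 (SPARC) |
| 1.1 | March 1996 | Yes | No | No | Yes | No | No | No | No | No | |
| 1.0.2 | September 1995 | Yes | No | No | Yes | No | No | Yes | No | No | OS/2 (i80386); SunOS 4 (SPARC) |
| 1.0 | May 1995 | No | No | No | Yes | No | No | No | No | No | OS/2 (i80386) |
| 0.8.4 | 11 May 1993 | Yes | No | No | Yes | No | No | No | No | No | Experimental T800 transputer release |
| 0.8.3 | 26 February 1993 | No | No | No | Yes | No | No | No | No | No | |
| 0.8.1 | 19 October 1992 | No | No | No | Yes | No | No | No | No | No | |
| 0.8 | 13 July 1992 | No | No | No | Yes | No | No | No | No | No | OS/2 (i80386); SunOS 3–4 (SPARC) |
| 0.7 | May 1991 | No | No | No | Yes | No | No | No | No | No | SunOS 3–4 (SPARC) |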

Comparison to Haskell

The syntax of Clean is very similar to that of Haskell, with some notable differences. In general, Haskell has introduced more syntactic sugar than Clean: [10]

List comprehension
    Haskell: [x | x <- [1..10], isOdd x]
    Clean:   [x \\ x <- [1..10] | isOdd x]

Cons operator
    Haskell: x:xs
    Clean:   [x:xs]

Algebraic data type
    Haskell: data Tree a = Empty | Node (Tree a) a (Tree a)
    Clean:   :: Tree a = Empty | Node (Tree a) a (Tree a)

Class assertions and contexts
    Haskell: (Eq a, Eq b) => ...
    Clean:   ... | Eq a & Eq b

As-patterns
    Haskell: fun t@(Node l x r) = ...
    Clean:   fun t=:(Node l x r) = ...

if
    Haskell: if x > 10 then 10 else x
    Clean:   if (x > 10) 10 x
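To see several of the Clean forms from the comparison working together, here is a small sketch of a binary search tree. The names insertTree and treeToList and the module name compare are hypothetical, chosen to avoid clashing with functions that StdEnv already exports, and the class context uses Ord from StdEnv.

```clean
module compare

import StdEnv

// Algebraic data type, written with :: instead of Haskell's data keyword.
:: Tree a = Empty | Node (Tree a) a (Tree a)

// Class context after the type (| Ord a); t=: is the as-pattern.
insertTree :: a (Tree a) -> Tree a | Ord a
insertTree v Empty = Node Empty v Empty
insertTree v t=:(Node l x r)
    | v < x     = Node (insertTree v l) x r
    | x < v     = Node l x (insertTree v r)
    | otherwise = t                          // already present: return the node unchanged

// In-order traversal; [x : ...] is the cons form.
treeToList :: (Tree a) -> [a]
treeToList Empty        = []
treeToList (Node l x r) = treeToList l ++ [x : treeToList r]

// List comprehension with \\ and a | filter; if is an ordinary function.
Start :: [Int]
Start = treeToList (foldr insertTree Empty [if (x > 10) 10 x \\ x <- [1..15] | isOdd x])
```

The comprehension keeps the odd numbers from 1 to 15 and caps values above 10, and duplicates are dropped on insertion, so the program should yield the list 1, 3, 5, 7, 9, 10.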

References

  1. "Download Clean". Clean. Retrieved 23 July 2019.
  2. "Idris - Uniqueness Types". Retrieved 20 November 2018.
  3. "Clean 0.7: Readme". Archived from the original on 24 May 2019.
  4. "Clean 1.0: Readme". Archived from the original on 5 May 2019.
  5. "Clean 1.3: Readme". Archived from the original on 27 April 2019.
  6. "Radboud University Nijmegen: Department of Software Science: Software".
  7. "FAQ". Clean. Retrieved 26 November 2021.
  8. "Publications". Clean. Retrieved 26 November 2021.
  9. "Home". TOP Software Technology. Retrieved 26 November 2021.
  10. ftp://ftp.cs.ru.nl/pub/Clean/papers/2007/achp2007-CleanHaskellQuickGuide.pdf
  11. "Which programming languages are fastest?". Computer Language Benchmarks Game. Archived from the original on 28 June 2011.
  12. Jansen, Jan Martin; Koopman, Pieter; Plasmeijer, Rinus (2008). "From Interpretation to Compilation" (PDF). Retrieved 21 May 2016.
  13. Koopman, Pieter (10 December 1990). Functional Programs as Executable Specifications (PhD). Katholieke Universiteit Nijmegen. p. 35. ISBN 90-9003689-X.
  14. "Clean and iTasks / ABC Interpreter · GitLab". Clean and iTasks on GitLab. Retrieved 13 April 2023.
  15. Staps, Camil; van Groningen, John; Plasmeijer, Rinus (15 July 2021). "Lazy interworking of compiled and interpreted code for sandboxing and distributed systems". Proceedings of the 31st Symposium on Implementation and Application of Functional Languages. pp. 1–12. doi:10.1145/3412932.3412941. ISBN 9781450375627. S2CID 202751977.
  16. "Release history". Clean. Retrieved 7 January 2022.
  17. "Index of /Clean". Retrieved 7 January 2022.