Paradigm | Imperative, Functional |
---|---|
Designed by | David J. Pearce |
First appeared | June 2010 |
Stable release | 0.6.1 / June 27, 2022 |
Typing discipline | Strong, safe, structural, flow-sensitive |
License | BSD |
Website | whiley |
Influenced by | |
Java, C, Python, Rust |
Whiley is an experimental programming language that combines features from the functional and imperative paradigms, and supports formal specification through function preconditions, postconditions and loop invariants. [1] The language uses flow-sensitive typing also known as "flow typing."
The Whiley project began in 2009 in response to the "Verifying Compiler Grand Challenge" put forward by Tony Hoare in 2003. [2] The first public release of Whiley was in June, 2010. [3]
Primarily developed by David Pearce, Whiley is an open source project with contributions from a small community. The system has been used for student research projects and in teaching undergraduate classes. [4] It was supported between 2012 and 2014 by the Royal Society of New Zealand's Marsden Fund. [5]
The Whiley compiler generates code for the Java virtual machine and can interoperate with Java and other JVM-based languages.
The goal of Whiley is to provide a realistic programming language where verification is used routinely and without thought. The idea of such a tool has a long history, but was strongly promoted in the early 2000s through Hoare's Verifying Compiler Grand Challenge. The purpose of this challenge was to spur new efforts to develop a verifying compiler, roughly described as follows: [6]
A verifying compiler uses mathematical and logical reasoning to check the correctness of the programs that it compiles.
— Tony Hoare
The primary purpose of such a tool is to improve software quality by ensuring a program meets a formal specification. Whiley follows many attempts to develop such tools, including notable efforts such as SPARK/Ada, ESC/Java, Spec#, Dafny, Why3, [7] and Frama-C.
Most previous attempts to develop a verifying compiler focused on extending existing programming languages with constructs for writing specifications. For example, ESC/Java and the Java Modeling Language add annotations for specifying preconditions and postconditions to Java. Likewise, Spec# and Frama-C add similar constructs to the C# and C programming languages. However, these languages are known to contain numerous features which pose difficult or insurmountable problems for verification. [8] In contrast, the Whiley language was designed from scratch in an effort to avoid common pitfalls and make verification more tractable. [9]
The syntax of Whiley follows the general appearance of imperative or object-oriented languages. Indentation syntax is chosen over the use of braces to delineate statement blocks, given a strong resemblance to Python. However, the imperative look of Whiley is somewhat misleading as the language core is functional and pure.
Whiley distinguishes a function
(which is pure) from a method
(which may have side-effects). This distinction is necessary as it allows functions to be used in specifications. A familiar set of primitive data types is available including bool
, int
, arrays (e.g. int[]
) and records (e.g. {int x, int y}
). However, unlike most programming languages the integer data type, int
, is unbounded and does not correspond to a fixed-width representation such as 32-bit two's complement. Thus, an unconstrained integer in Whiley can take on any possible integer value, subject to the memory constraints of the host environment. This choice simplifies verification, as reasoning about modular arithmetic is a known and hard problem. Compound objects (e.g. arrays or records) are not references to values on the heap as they are in languages such as Java or C# but, instead, are immutable values.
Whiley takes an unusual approach to type checking referred to as "Flow Typing." Variables can have different static types at different points in a function or method. Flow typing is similar to occurrence typing as found in Racket. [10] To aid flow typing, Whiley supports union, intersection and negation types. [11] Union types are comparable to sum types found in functional languages like Haskell but, in Whiley, they are not disjoint. Intersection and negation types are used in the context of flow typing to determine the type of a variable on the true and false branches of a runtime type test. For example, suppose a variable x
of type T
and a runtime type test x is S
. On the true branch, the type of x
becomes T & S
whilst, on the false branch, it becomes T & !S
.
Whiley uses a structural rather than nominal type system. Modula-3, Go and Ceylon are examples of other languages which support structural typing in some form.
Whiley supports reference lifetimes similar to those found in Rust. Lifetimes can be given when allocating new objects to indicate when they can be safely deallocated. References to such objects must then include lifetime identifier to prevent dangling references. Every method has an implicit lifetime referred to by this
. A variable of type &this:T
represents a reference to an object of type T
which is deallocated with the enclosing method. Subtyping between lifetimes is determined by the outlives relation. Thus, &l1:T
is a subtype of &l2:T
if lifetime l1
statically outlives lifetime l2
. A lifetime whose scope encloses another is said to outlive it. Lifetimes in Whiley differ from Rust in that they do not currently include a concept of ownership.
Whiley has no built-in support for concurrency and no formal memory model to determine how reading/writing to shared mutable state should be interpreted.
The following example illustrates many of the interesting features in Whiley, including the use of postconditions, loop invariants, type invariants, union types and flow typing. The function is intended to return the first index of an integer item
in an array of integer items
. If no such index exists, then null
is returned.
// Define the type of natural numberstypenatis(intx)wherex>=0publicfunctionindexOf(int[]items,intitem)->(int|nullindex)// If int returned, element at this position matches itemensuresindexisint==>items[index]==item// If int returned, element at this position is first matchensuresindexisint==>no{iin0..index|items[i]==item}// If null returned, no element in items matches itemensuresindexisnull==>no{iin0..|items||items[i]==item}://nati=0//whilei<|items|// No element seen so far matches itemwhereno{jin0..i|items[j]==item}://ifitems[i]==item:returnii=i+1//returnnull
In the above, the function's declared return type is given the union type int|null
which indicates that either an int
value is returned or null
is returned. The function's postcondition is made of three ensures
clauses, each of which describe different properties that must hold of the returned index
. Flow typing is employed in these clauses through the runtime type test operator, is
. For example, in the first ensures
clause, the variable index
is retyped from int|null
to just int
on the right-hand side of the implication operator (i.e. ==>
).
The above example also illustrates the use of an inductive loop invariant. The loop invariant must be shown to hold on entry to the loop, for any given iteration of the loop and when the loop exits. In this case, the loop invariant states what is known about the elements of the items
examined so far — namely, that none of them matches the given item
. The loop invariant does not affect the meaning of the program and, in some sense, might be considered as unnecessary. However, the loop invariant is required to help the automated verifier using in the Whiley Compiler to prove this function meets its specification.
The above example also defines the type nat
with an appropriate type invariant. This type is used to declare variable i
and indicate that it can never hold a negative value. In this case, the declaration prevents the need for an additional loop invariant of the form where i >= 0
which would otherwise be necessary.
Whiley began in 2009 with the first public release, v0.2.27
following in June 2010 and v0.3.0
in September that year. The language has evolved slowly with numerous syntactical changes being made to-date. Versions prior v0.3.33
supported first-class string
and char
data types, but these were removed in favour of representing strings as constrained int[]
arrays. Likewise, versions prior to v0.3.35
supported first-class set (e.g. {int}
), dictionary (e.g. {int=>bool}
) and resizeable list [int]
), but these were dropped in favour of simple arrays (e.g. int[]
). Perhaps most controversial was the removal of the real
datatype in version v0.3.38
. Many of these changes were motivated by a desire to simplify the language and make compiler development more manageable.
Another important milestone in the evolution of Whiley came in version v0.3.40
with the inclusion of reference lifetimes, developed by Sebastian Schweizer as part of his Master's Thesis at the University of Kaiserslautern.
C is a general-purpose programming language. It was created in the 1970s by Dennis Ritchie and remains very widely used and influential. By design, C's features cleanly reflect the capabilities of the targeted CPUs. It has found lasting use in operating systems code, device drivers, and protocol stacks, but its use in application software has been decreasing. C is commonly used on computer architectures that range from the largest supercomputers to the smallest microcontrollers and embedded systems.
In computer science, control flow is the order in which individual statements, instructions or function calls of an imperative program are executed or evaluated. The emphasis on explicit control flow distinguishes an imperative programming language from a declarative programming language.
In computer programming, specifically when using the imperative programming paradigm, an assertion is a predicate connected to a point in the program, that always should evaluate to true at that point in code execution. Assertions can help a programmer read the code, help a compiler compile it, or help the program detect its own defects.
The syntax of the C programming language is the set of rules governing writing of software in C. It is designed to allow for programs that are extremely terse, have a close relationship with the resulting object code, and yet provide relatively high-level data abstraction. C was the first widely successful high-level language for portable operating-system development.
In computer science, a pointer is an object in many programming languages that stores a memory address. This can be that of another value located in computer memory, or in some cases, that of memory-mapped computer hardware. A pointer references a location in memory, and obtaining the value stored at that location is known as dereferencing the pointer. As an analogy, a page number in a book's index could be considered a pointer to the corresponding page; dereferencing such a pointer would be done by flipping to the page with the given page number and reading the text found on that page. The actual format and content of a pointer variable is dependent on the underlying computer architecture.
In computer programming, undefined behavior (UB) is the result of executing a program whose behavior is prescribed to be unpredictable, in the language specification of the programming language in which the source code is written. This is different from unspecified behavior, for which the language specification does not prescribe a result, and implementation-defined behavior that defers to the documentation of another component of the platform.
In computer science, a loop invariant is a property of a program loop that is true before each iteration. It is a logical assertion, sometimes checked with a code assertion. Knowing its invariant(s) is essential in understanding the effect of a loop.
In computer programming, specifically object-oriented programming, a class invariant is an invariant used for constraining objects of a class. Methods of the class should preserve the invariant. The class invariant constrains the state stored in the object.
In computer programming, foreach loop is a control flow statement for traversing items in a collection. foreach is usually used in place of a standard for loop statement. Unlike other for loop constructs, however, foreach loops usually maintain no explicit counter: they essentially say "do this to everything in this set", rather than "do this x times". This avoids potential off-by-one errors and makes code simpler to read. In object-oriented languages, an iterator, even if implicit, is often used as the means of traversal.
The syntax of Java is the set of rules defining how a Java program is written and interpreted.
In some programming languages, const is a type qualifier that indicates that the data is read-only. While this can be used to declare constants, const in the C family of languages differs from similar constructs in other languages in that it is part of the type, and thus has complicated behavior when combined with pointers, references, composite data types, and type-checking. In other languages, the data is not in a single memory location, but copied at compile time for each use. Languages which use it include C, C++, D, JavaScript, Julia, and Rust.
The Java Modeling Language (JML) is a specification language for Java programs, using Hoare style pre- and postconditions and invariants, that follows the design by contract paradigm. Specifications are written as Java annotation comments to the source files, which hence can be compiled with any Java compiler.
Generics are a facility of generic programming that were added to the Java programming language in 2004 within version J2SE 5.0. They were designed to extend Java's type system to allow "a type or method to operate on objects of various types while providing compile-time type safety". The aspect compile-time type safety required that parametrically polymorphic functions are not implemented in the Java virtual machine, since type safety is impossible in this case.
This article describes the syntax of the C# programming language. The features described are compatible with .NET Framework and Mono.
In computer programming, a variable-length array (VLA), also called variable-sized or runtime-sized, is an array data structure whose length is determined at runtime, instead of at compile time. In the language C, the VLA is said to have a variably modified data type that depends on a value.
In computer science, array is a data type that represents a collection of elements, each selected by one or more indices that can be computed at run time during program execution. Such a collection is usually called an array variable or array value. By analogy with the mathematical concepts vector and matrix, array types with one and two indices are often called vector type and matrix type, respectively. More generally, a multidimensional array type can be called a tensor type, by analogy with the physical concept, tensor.
Parallel Specification and Implementation Language (ParaSail) is an object-oriented parallel programming language. Its design and ongoing implementation is described in a blog and on its official website.
Java bytecode is the instruction set of the Java virtual machine (JVM), the language to which Java and other JVM-compatible source code is compiled. Each instruction is represented by a single byte, hence the name bytecode, making it a compact form of data.
In programming language theory, flow-sensitive typing is a type system where the type of an expression depends on its position in the control flow.
Dafny is an imperative and functional compiled language that compiles to other programming languages, such as C#, Java, JavaScript, Go and Python. It supports formal specification through preconditions, postconditions, loop invariants, loop variants, termination specifications and read/write framing specifications. The language combines ideas from the functional and imperative paradigms; it includes support for object-oriented programming. Features include generic classes, dynamic allocation, inductive datatypes and a variation of separation logic known as implicit dynamic frames for reasoning about side effects. Dafny was created by Rustan Leino at Microsoft Research after his previous work on developing ESC/Modula-3, ESC/Java, and Spec#.