Flix (programming language)

Last updated
Flix
Paradigm Multi-paradigm: functional, imperative, logic
Developer Aarhus University, open-source contributors
Typing discipline inferred, static, strong, structural
Platform JVM
License Apache License 2.0. [1]
Filename extensions .flix
Website flix.dev
Influenced by
F#, Go, Haskell, OCaml, Scala

Flix is a functional, imperative, and logic programming language developed at Aarhus University, with funding from the Independent Research Fund Denmark, [2] and by a community of open source contributors. [3] The Flix language supports algebraic data types, pattern matching, parametric polymorphism, currying, higher-order functions, extensible records, [4] channel and process-based concurrency, and tail call elimination. Two notable features of Flix are its type and effect system [5] and its support for first-class Datalog constraints. [6]

Contents

The Flix type and effect system supports Hindley-Milner-style type inference. The system separates pure and impure code: if an expression is typed as pure then it cannot produce an effect at run-time. Higher-order functions can enforce that they are given pure (or impure) function arguments. The type and effect system supports effect polymorphism [7] [8] which means that the effect of a higher-order function may depend on the effect(s) of its argument(s).

Flix supports Datalog programs as first-class values. A Datalog program value, i.e. a collection of Datalog facts and rules, can be passed to and returned from functions, stored in data structures, and composed with other Datalog program values. The minimal model of a Datalog program value can be computed and is itself a Datalog program value. In this way, Flix can be viewed as a meta programming language for Datalog. Flix supports stratified negation and the Flix compiler ensures stratification at compile-time. [9] Flix also supports an enriched form of Datalog constraints where predicates are given lattice semantics. [10] [11] [12] [13]

Overview

Flix is a programming language in the ML-family of languages. Its type and effect system is based on Hindley-Milner with several extensions, including row polymorphism and Boolean unification. The syntax of Flix is inspired by Scala and uses short keywords and curly braces. Flix supports uniform function call syntax which allows a function call f(x, y, z) to be written as x.f(y, z). The concurrency model of Flix is inspired by Go and based on channels and processes. A process is a light-weight thread that does not share (mutable) memory with another process. Processes communicate over channels which are bounded or unbounded queues of immutable messages.

While many programming languages support a mixture of functional and imperative programming, the Flix type and effect system tracks the purity of every expression making it possible to write parts of a Flix program in a purely functional style with purity enforced by the effect system.

Flix programs compile to JVM bytecode and are executable on the Java Virtual Machine (JVM). [14] The Flix compiler performs whole program compilation, eliminates polymorphism via monomorphization, [15] and uses tree shaking to remove unreachable code. Monomorphization avoids boxing of primitive values at the cost of longer compilation times and larger executable binaries. Flix has some support for interoperability with programs written in Java. [16]

Flix supports tail call elimination which ensures that function calls in tail position never consume stack space and hence cannot cause the call stack to overflow. [17] Since the JVM instruction set lacks explicit support for tail calls, such calls are emulated using a form of reusable stack frames. [18] Support for tail call elimination is important since all iteration in Flix is expressed through recursion.

The Flix compiler disallows most forms of unused or redundant code, including: unused local variables, unused functions, unused formal parameters, unused type parameters, and unused type declarations, such unused constructs are reported as compiler errors. [19] Variable shadowing is also disallowed. The stated rationale is that unused or redundant code is often correlated with erroneous code [20]

A Visual Studio Code extension for Flix is available. [21] The extension is based on the Language Server Protocol, a common interface between IDEs and compilers being developed by Microsoft.

Flix is open source software available under the Apache 2.0 License.

Examples

Hello world

The following program prints "Hello World!" when compiled and executed:

defmain():Unit&Impure=Console.printLine("Hello World!")

The type and effect signature of the main function specifies that it has no parameters, returns a value of type Unit, and that the function is impure. The main function is impure because it invokes printLine which is impure.

Algebraic data types and pattern matching

The following program fragment declares an algebraic data type (ADT) named Shape:

enumShape{caseCircle(Int),// has circle radiuscaseSquare(Int),// has side lengthcaseRectangle(Int,Int)// has height and width}

The ADT has three constructors: Circle, Square, and Rectangle.

The following program fragment uses pattern matching to destruct a Shape value:

defarea(s:Shape):Int=matchs{caseCircle(r)=>3*(r*r)caseSquare(w)=>w*wcaseRectangle(h,w)=>h*w}

Higher-order functions

The following program fragment defines a higher-order function named twice that when given a function f from Int to Int returns a function that applies f to its input two times:

deftwice(f:Int->Int):Int->Int=x->f(f(x))

We can use the function twice as follows:

twice(x->x+1)(0)

Here the call to twice(x -> x + 1) returns a function that will increment its argument two times. Thus the result of the whole expression is 0 + 1 + 1 = 2.

Parametric polymorphism

The following program fragment illustrates a polymorphic function that maps a function f: a -> b over a list of elements of type a returning a list of elements of type b:

defmap(f:a->b,l:List[a]):List[b]=matchl{caseNil=>Nilcasex::xs=>f(x)::map(f,xs)}

The map function recursively traverses the list l and applies f to each element constructing a new list.

Flix supports type parameter elision hence it is not required that the type parameters a and b are explicitly introduced.

Extensible records

The following program fragment shows how to construct a record with two fields x and y:

defpoint2d():{x:Int,y:Int}={x=1,y=2}

Flix uses row polymorphism to type records. The sum function below takes a record that has x and y fields (and possibly other fields) and returns the sum of the two fields:

defsum(r:{x:Int,y:Int|rest}):Int=r.x+r.y

The following are all valid calls to the sum function:

sum({x=1,y=2})sum({y=2,x=1})sum({x=1,y=2,z=3})

Notable features

Polymorphic effects

The Flix type and effect system separates pure and impure expressions. [5] [22] [23] A pure expression is guaranteed to be referentially transparent. A pure function always returns the same value when given the same argument(s) and cannot have any (observable) side-effects.

For example, the following expression is of type Int and is Pure:

1+2:Int&Pure

whereas the following expression is Impure:

Console.printLine("Hello World"):Unit&Impure

A higher-order function can specify that a function argument must be pure, impure, or that it is effect polymorphic.

For example, the definition of Set.exists requires that its function argument f is pure:

// The syntax a -> Bool is short-hand for a -> Bool & Puredefexists(f:a->Bool,xs:Set[a]):Bool=...

The requirement that f must be pure ensures that implementation details do not leak. For example, since f is pure it cannot be used to determine in what order the elements of the set are traversed. If f was impure such details could leak, e.g. by passing a function that also prints the current element, revealing the internal element order inside the set.

A higher-order function can also require that a function is impure.

For example, the definition of List.foreach requires that its function argument f is impure:

// The syntax a ~> Unit is short-hand for a -> Unit & Impuredefforeach(f:a~>Unit,xs:List[a]):Unit&Impure

The requirement that f must be impure ensures that the code makes sense: It would be meaningless to call List.foreach with a pure function since it always returns Unit.

The type and effect is sound, but not complete. That is, if a function is pure then it cannot cause an effect, whereas if a function is impure then it may, but not necessarily, cause an effect. For example, the following expression is impure even though it cannot produce an effect at run-time:

if(1==2)Console.printLine("Hello World!")else()

A higher-order function can also be effect polymorphic: its effect(s) can depend on its argument(s).

For example, the standard library definition of List.map is effect polymorphic: [24]

defmap(f:a->b&e,xs:List[a]):List[b]&e

The List.map function takes a function f from elements of type a to b with effect e. The effect of the map function is itself e. Consequently, if List.map is invoked with a pure function then the entire expression is pure whereas if it is invoked with an impure function then the entire expression is impure. It is effect polymorphic.

A higher-order function that takes multiple function arguments may combine their effects.

For example, the standard library definition of forward function composition >> is pure if both its function arguments are pure: [25]

def>>(f:a->b&e1,g:b->c&e2):a->c&(e1ande2)=x->g(f(x))

The type and effect signature can be understood as follows: The >> function takes two function arguments: f with effect e1 and g with effect e2. The effect of >> is effect polymorphic in the conjunction of e1 and e2. If both are pure (their effect is true) then the overall expression is pure (true). Otherwise it is impure.

The type and effect system allows arbitrary boolean expressions to control the purity of function arguments.

For example, it is possible to express a higher-order function h that accepts two function arguments f and g of which at most one is impure:

defh(f:a->b&e1,g:b->c&(note1ore2)):Unit

If h is called with a function argument f which is impure (false) then the second argument must be pure (true). Conversely, if f is pure (true) then g may be pure (true) or impure (false). It is a compile-time error to call h with two impure functions.

The type and effect system can be used to ensure that statement expressions are useful, i.e. that if an expression or function is evaluated and its result is discarded then it must have a side-effect. For example, compiling the program fragment below:

defmain():Unit&Impure=List.map(x->2*x,1::2::Nil);Console.printLine("Hello World")

causes a compiler error:

-- Redundancy Error -------------------------------------------------->> Useless expression: It has no side-effect(s) and its result is discarded.2 |     List.map(x -> 2 * x, 1 :: 2 :: Nil);        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^        useless expression.

because it is non-sensical to evaluate the pure expression List.map(x -> 2 * x, 1 :: 2 :: Nil) and then to discard its result. Most likely the programmer wanted to use the result (or alternatively the expression is redundant and could be deleted). Consequently, Flix rejects such programs.

First-class datalog constraints

Flix supports Datalog programs as first-class values. [6] [9] [26] A Datalog program is a logic program that consists of a collection of unordered facts and rules. Together, the facts and rules imply a minimal model, a unique solution to any Datalog program. In Flix, Datalog program values can be passed to and returned from functions, stored in data structures, composed with other Datalog program values, and solved. The solution to a Datalog program (the minimal model) is itself a Datalog program. Thus, it is possible to construct pipelines of Datalog programs where the solution, i.e. "output", of one Datalog program becomes the "input" to another Datalog program.

The following edge facts define a graph:

Edge(1,2).Edge(2,3).Edge(3,4).

The following Datalog rules compute the transitive closure of the edge relation:

Path(x,y):-Edge(x,y).Path(x,z):-Path(x,y),Edge(y,z).

The minimal model of the facts and rules is:

Edge(1,2).Edge(2,3).Edge(3,4).Path(1,2).Path(2,3).Path(3,4).Path(1,3).Path(1,4).Path(2,4).

In Flix, Datalog programs are values. The above program can be embedded in Flix as follows:

defmain():#{Edge(Int, Int), Path(Int, Int)} = letf=#{Edge(1,2).Edge(2,3).Edge(3,4).};letp=#{Path(x,y):-Edge(x,y).Path(x,z):-Path(x,y),Edge(y,z).};solvef<+>p

The local variable f holds a Datalog program value that consists of the edge facts. Similarly, the local variable p is a Datalog program value that consists of the two rules. The f <+> p expression computes the composition (i.e. union) of the two Datalog programs f and p. The solve expression computes the minimal model of the combined Datalog program, returning the edge and path facts shown above.

Since Datalog programs are first-class values, we can refactor the above program into several functions. For example:

defedges():#{Edge(Int, Int), Path(Int, Int)} = #{Edge(1,2).Edge(2,3).Edge(3,4).}defclosure():#{Edge(Int, Int), Path(Int, Int)} = #{Path(x,y):-Edge(x,y).Path(x,z):-Path(x,y),Edge(y,z).}defmain():#{Edge(Int, Int), Path(Int, Int)} = solve edges() <+> closure()

The un-directed closure of the graph can be computed by adding the rule:

Path(x,y):-Path(y,x).

We can modify the closure function to take a Boolean argument that determines whether we want to compute the directed or un-directed closure:

defclosure(directed:Bool):#{Edge(Int, Int), Path(Int, Int)} = letp1=#{Path(x,y):-Edge(x,y).Path(x,z):-Path(x,y),Edge(y,z).};letp2=#{Path(y,x):-Path(x,y).};if(directed)p1else(p1<+>p2)

Type-safe composition

The Flix type system ensures that Datalog program values are well-typed.

For example, the following program fragment does not type check:

letp1=Edge(123,456).;letp2=Edge("a","b").;p1<+>p2;

because in p1 the type of the Edge predicate is Edge(Int, Int) whereas in p2 it has type Edge(String, String). The Flix compiler rejects such programs as ill-typed.

Stratified negation

The Flix compiler ensures that every Datalog program value constructed at run-time is stratified. Stratification is important because it guarantees the existence of a unique minimal model in the presence of negation. Intuitively, a Datalog program is stratified if there is no recursion through negation, [27] i.e. a predicate cannot depend negatively on itself. Given a Datalog program, a cycle detection algorithm can be used to determine if it is stratified.

For example, the following Flix program contains an expression that cannot be stratified:

defmain():#{Male(String), Husband(String), Bachelor(String)} = letp1=Husband(x):-Male(x),notBachelor(x).;letp2=Bachelor(x):-Male(x),notHusband(x).;p1<+>p2// illegal, not stratified.

because the last expression constructs a Datalog program value whose precedence graph contains a negative cycle: the Bachelor predicate negatively depends on the Husband predicate which in turn (positively) depends on the Bachelor predicate.

The Flix compiler computes the precedence graph for every Datalog program valued expression and determines its stratification at compile-time. If an expression is not stratified, the program is rejected by the compiler.

The stratification is sound, but conservative. For example, the following program is unfairly rejected:

defmain():#{A(Int), B(Int)} = if(true)A(x):-A(x),notB(x).elseB(x):-B(x),notA(x).

The type system conservatively assumes that both branches of the if expression can be taken and consequently infers that there may be a negative cycle between the A and B predicates. Thus the program is rejected. This is despite the fact that at run-time the main function always returns a stratified Datalog program value.

Design philosophy

Flix is designed around a collection of stated principles: [28]

The principles also list several programming language features that have been deliberately omitted. In particular, Flix lacks support for:

Related Research Articles

In computer science, functional programming is a programming paradigm where programs are constructed by applying and composing functions. It is a declarative programming paradigm in which function definitions are trees of expressions that map values to other values, rather than a sequence of imperative statements which update the running state of the program.

ML is a functional programming language. It is known for its use of the polymorphic Hindley–Milner type system, which automatically assigns the data types of most expressions without requiring explicit type annotations, and ensures type safety; there is a formal proof that a well-typed ML program does not cause runtime type errors. ML provides pattern matching for function arguments, garbage collection, imperative programming, call-by-value and currying. While a general-purpose programming language, ML is used heavily in programming language research and is one of the few languages to be completely specified and verified using formal semantics. Its types and pattern matching make it well-suited and commonly used to operate on other formal languages, such as in compiler writing, automated theorem proving, and formal verification.

OCaml is a general-purpose, high-level, multi-paradigm programming language which extends the Caml dialect of ML with object-oriented features. OCaml was created in 1996 by Xavier Leroy, Jérôme Vouillon, Damien Doligez, Didier Rémy, Ascánder Suárez, and others.

In programming languages, a closure, also lexical closure or function closure, is a technique for implementing lexically scoped name binding in a language with first-class functions. Operationally, a closure is a record storing a function together with an environment. The environment is a mapping associating each free variable of the function with the value or reference to which the name was bound when the closure was created. Unlike a plain function, a closure allows the function to access those captured variables through the closure's copies of their values or references, even when the function is invoked outside their scope.

Standard ML (SML) is a general-purpose, modular, functional programming language with compile-time type checking and type inference. It is popular for writing compilers, for programming language research, and for developing theorem provers.

In computer programming, a type system is a logical system comprising a set of rules that assigns a property called a type to every term. Usually the terms are various language constructs of a computer program, such as variables, expressions, functions, or modules. A type system dictates the operations that can be performed on a term. For variables, the type system determines the allowed values of that term. Type systems formalize and enforce the otherwise implicit categories the programmer uses for algebraic data types, data structures, or other components.

In mathematics and computer science, a higher-order function (HOF) is a function that does at least one of the following:

In programming language theory and type theory, polymorphism is the use of a single symbol to represent multiple different types.

Type inference, sometimes called type reconstruction, refers to the automatic detection of the type of an expression in a formal language. These include programming languages and mathematical type systems, but also natural languages in some branches of computer science and linguistics.

In computer programming, a parameter or a formal argument is a special kind of variable used in a subroutine to refer to one of the pieces of data provided as input to the subroutine. These pieces of data are the values of the arguments with which the subroutine is going to be called/invoked. An ordered list of parameters is usually included in the definition of a subroutine, so that, each time the subroutine is called, its arguments for that call are evaluated, and the resulting values can be assigned to the corresponding parameters.

In computer programming, a default argument is an argument to a function that a programmer is not required to specify. In most programming languages, functions may take one or more arguments. Usually, each argument must be specified in full. Later languages allow the programmer to specify default arguments that always have a value, even if one is not specified when calling the function.

In computer programming, a thunk is a subroutine used to inject a calculation into another subroutine. Thunks are primarily used to delay a calculation until its result is needed, or to insert operations at the beginning or end of the other subroutine. They have many other applications in compiler code generation and modular programming.

In functional programming, continuation-passing style (CPS) is a style of programming in which control is passed explicitly in the form of a continuation. This is contrasted with direct style, which is the usual style of programming. Gerald Jay Sussman and Guy L. Steele, Jr. coined the phrase in AI Memo 349 (1975), which sets out the first version of the Scheme programming language. John C. Reynolds gives a detailed account of the numerous discoveries of continuations.

Datalog is a declarative logic programming language. While it is syntactically a subset of Prolog, Datalog generally uses a bottom-up rather than top-down evaluation model. This difference yields significantly different behavior and properties from Prolog. It is often used as a query language for deductive databases. Datalog has been applied to problems in data integration, networking, program analysis, and more.

In a programming language, an evaluation strategy is a set of rules for evaluating expressions. The term is often used to refer to the more specific notion of a parameter-passing strategy that defines the kind of value that is passed to the function for each parameter and whether to evaluate the parameters of a function call, and if so in what order. The notion of reduction strategy is distinct, although some authors conflate the two terms and the definition of each term is not widely agreed upon.

In computer science, a type class is a type system construct that supports ad hoc polymorphism. This is achieved by adding constraints to type variables in parametrically polymorphic types. Such a constraint typically involves a type class T and a type variable a, and means that a can only be instantiated to a type whose members support the overloaded operations associated with T.

In computer programming, a pure function is a function that has the following properties:

  1. the function return values are identical for identical arguments, and
  2. the function has no side effects.

In computer programming, an anonymous function is a function definition that is not bound to an identifier. Anonymous functions are often arguments being passed to higher-order functions or used for constructing the result of a higher-order function that needs to return a function. If the function is only used once, or a limited number of times, an anonymous function may be syntactically lighter than using a named function. Anonymous functions are ubiquitous in functional programming languages and other languages with first-class functions, where they fulfil the same role for the function type as literals do for other data types.

Soufflé is an open source parallel logic programming language, influenced by Datalog. Soufflé includes both an interpreter and a compiler that targets parallel C++. Soufflé has been used to build static analyzers, disassemblers, and tools for binary reverse engineering. Soufflé is considered by academic researchers to be high-performance and "state of the art," and is often used in benchmarks in academic papers.

Logic programming is a programming paradigm that includes languages based on formal logic, including Datalog and Prolog. This article describes the syntax and semantics of the purely declarative subset of these languages. Confusingly, the name "logic programming" also refers to a specific programming language that roughly corresponds to the declarative subset of Prolog. Unfortunately, the term must be used in both senses in this article.

References

  1. "Apache License 2.0". 27 July 2022 via GitHub.
  2. "Forskningsprojekter". Danmarks Frie Forskningsfond (in Danish).
  3. "Flix Authors". GitHub. 27 July 2022.
  4. Leijen, Daan. "Extensible records with scoped labels". Trends in Functional Programming.
  5. 1 2 Madsen, Magnus; van de Pol, Jaco (13 November 2020). "Polymorphic Types and Effects with Boolean Unification". Proceedings of the ACM on Programming Languages. 4 (OOPSLA): 1–29. doi: 10.1145/3428222 . S2CID   227044242.
  6. 1 2 Madsen, Magnus; Lhoták, Ondřej (13 November 2020). "Fixpoints for the Masses: Programming with First-class Datalog Constraints". Proceedings of the ACM on Programming Languages. 4 (OOPSLA): 125:1–125:28. doi: 10.1145/3428193 . S2CID   227107960.
  7. Lucassen, J. M.; Gifford, D. K. (1988). "Polymorphic effect systems". Proceedings of the 15th ACM SIGPLAN-SIGACT symposium on Principles of programming languages - POPL '88. pp. 47–57. doi: 10.1145/73560.73564 . ISBN   0897912527. S2CID   13015611.
  8. Leijen, Daan (5 June 2014). "Koka: Programming with Row Polymorphic Effect Types". Electronic Proceedings in Theoretical Computer Science. 153: 100–126. arXiv: 1406.2061 . doi:10.4204/EPTCS.153.8. S2CID   14902937.
  9. 1 2 "Programming Flix - Fixpoints". flix.dev.
  10. Madsen, Magnus; Yee, Ming-Ho; Lhoták, Ondřej (August 2016). "From Datalog to flix: a declarative language for fixed points on lattices". ACM SIGPLAN Notices. 51 (6): 194–208. doi:10.1145/2980983.2908096.
  11. Madsen, Magnus; Lhoták, Ondřej (2018). "Safe and sound program analysis with Flix". Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis. pp. 38–48. doi:10.1145/3213846.3213847. ISBN   9781450356992. S2CID   49427988.
  12. Keidel, Sven; Erdweg, Sebastian (10 October 2019). "Sound and reusable components for abstract interpretation". Proceedings of the ACM on Programming Languages. 3 (OOPSLA): 1–28. doi: 10.1145/3360602 . S2CID   203631644.
  13. Gong, Qing. Extending Parallel Datalog with Lattice. Pennsylvania State University.
  14. Yee, Ming-Ho (2016-09-15). Implementing a Functional Language for Flix. University of Waterloo.
  15. "Monomorphise". mlton.org.
  16. "Programming Flix - Interoperability". flix.dev.
  17. Madsen, Magnus; Zarifi, Ramin; Lhoták, Ondřej (2018). "Tail call elimination and data representation for functional languages on the Java virtual machine". Proceedings of the 27th International Conference on Compiler Construction. pp. 139–150. doi:10.1145/3178372.3179499. ISBN   9781450356442. S2CID   3432962.
  18. Tauber, Tomáš; Bi, Xuan; Shi, Zhiyuan; Zhang, Weixin; Li, Huang; Zhang, Zhenrui; Oliveira, Bruno C. D. S. (2015). "Memory-Efficient Tail Calls in the JVM with Imperative Functional Objects". Programming Languages and Systems. Lecture Notes in Computer Science. 9458: 11–28. doi:10.1007/978-3-319-26529-2_2. ISBN   978-3-319-26528-5.
  19. "Redundancies as Compile-Time Errors". flix.dev.
  20. Engler, D. (October 2003). "Using redundancies to find errors". IEEE Transactions on Software Engineering. 29 (10): 915–928. doi:10.1109/TSE.2003.1237172.
  21. "flix - Visual Studio Marketplace". marketplace.visualstudio.com.
  22. "Programming Flix - Effects". flix.dev.
  23. "Rust Internals - Flix Polymorphic Effects". 15 November 2020.
  24. "The Flix API - List". api.flix.dev.
  25. "The Flix API - Prelude". api.flix.dev.
  26. Arntzenius, Michael; Krishnaswami, Neel (January 2020). "Seminaïve evaluation for a higher-order functional language". Proceedings of the ACM on Programming Languages. 4 (POPL): 1–28. doi: 10.1145/3371090 . S2CID   208305062.
  27. Minker, Jack. Foundations of deductive databases and logic programming. Morgan Kaufmann.
  28. "The Flix Programming Language - Principles". flix.dev. Retrieved 28 August 2020.
  29. "Taming Impurity with Polymorphic Effects". flix.dev.