Flow-sensitive typing

Last updated

In programming language theory, flow-sensitive typing (also called flow typing or occurrence typing) is a type system where the type of an expression depends on its position in the control flow.

Contents

In statically typed languages, a type of an expression is determined by the types of the sub-expressions that compose it. However, in flow-sensitive typing, an expression's type may be updated to a more specific type if it follows an operation that validates its type. Validating operations can include type predicates, imperative updates, and control flow.

Examples

Ceylon

See the following example in Ceylon which illustrates the concept:

// Object? means the variable "name" is of type Object or else nullvoidhello(Object?name){if(isStringname){// "name" now has type String in this blockprint("Hello, ``name``!");// and it is possible to call String methods on the variableprint(" String.size is ``name.size``");}elseif(existsname){// "name" now has type Object in this blockprint("Hello, object ``name``!");}else{print("Hello, world!");}}hello(null);hello(1);hello("John Doe");

and which outputs:

Hello, world! Hello, object 1! Hello, John Doe!  String.size is 8

Kotlin

See this example in Kotlin:

funhello(obj:Any){// A type cast fails if `obj` is not a StringobjasString// Since the type cast did not fail, `obj` must be a String!vall=obj.lengthprintln("'$obj' is a string of length $l")}hello("Mooooo")

Benefits

This technique coupled with type inference reduces the need for writing type annotations for all variables or to do type casting, like is seen with dynamic languages that use duck typing. It reduces verbosity and makes for terser code, easier to read and modify.

It can also help language implementers provide implementations that execute dynamic languages faster by predicting the type of objects statically. [1]

Finally, it increases type safety and can prevent problems due to null pointers [ how? ], labeled by C.A.R. Hoare—the null reference inventor—as "the billion dollar mistake" [2]

Implementations

Typed Scheme, a type system for Scheme, was the first type system with this feature. [3] Its successor, Typed Racket (a dialect of Racket), still makes extensive use of occurrence typing. [4] Shortly after Typed Scheme, David J. Pearce independently reinvented flow-typing in Whiley. [5] [6]

Typed JavaScript observed that in "scripting" languages, flow-typing depends on more than conditional predicates; it also depends on state and control flow. [7] This style has since been adopted in languages like Ceylon, [8] TypeScript [9] and Facebook Flow. [10]

There are also a few languages that don't have union types but do have nullable types, that have a limited form of this feature that only applies to nullable types, such as C#, [11] Kotlin, [12] [13] and Lobster. [14]

Alternatives

Pattern matching reaches the same goals as flow-sensitive typing, namely reducing verbosity and making up for terser code, easier to read and modify. It achieves this is in a different way, it allows to match the type of a structure, extract data out of it at the same time by declaring new variable. As such, it reduces the ceremony around type casting and value extraction. Pattern matching works best when used in conjunction with algebraic data types because all the cases can be enumerated and statically checked by the compiler.

See this example mock for Java: [15]

inteval(Noden){returnswitch(n){// try to type cast "Node" into "IntNode", and create the variable "i" of type "int".// If that works, then return the value of "i"caseIntNode(inti)->i;// try to type cast "Node" into "NegNode", and create the variable "n" of type "Node".// If that works, then return the negation of evaluating the "n" nodecaseNegNode(Noden)->-eval(n);// try to type cast "Node" into "AddNode", and create the variables "left" and "right" of type "Node".// If that works, then return the addition of evaluating the "left" and "right" nodescaseAddNode(Nodeleft,Noderight)->eval(left)+eval(right);// try to type cast "Node" into "MulNode", and create the variables "left" and "right" of type "Node".// If that works, then return the multiplication of evaluating the "left" and "right" nodescaseMulNode(Nodeleft,Noderight)->eval(left)*eval(right);// no "default" because the compiler knows all the possible cases have been enumerated};}

In a statically typed language, the advantage of pattern matching over flow-sensitive typing is that the type of a variable always stays the same: it does not change depending on control flow. When writing down the pattern to be matched, a new variable is declared that will have the new type.

Related Research Articles

OCaml is a general-purpose, high-level, multi-paradigm programming language which extends the Caml dialect of ML with object-oriented features. OCaml was created in 1996 by Xavier Leroy, Jérôme Vouillon, Damien Doligez, Didier Rémy, Ascánder Suárez, and others.

In object-oriented (OO) and functional programming, an immutable object is an object whose state cannot be modified after it is created. This is in contrast to a mutable object, which can be modified after it is created. In some cases, an object is considered immutable even if some internally used attributes change, but the object's state appears unchanging from an external point of view. For example, an object that uses memoization to cache the results of expensive computations could still be considered an immutable object.

In computer programming, especially functional programming and type theory, an algebraic data type (ADT) is a kind of composite type, i.e., a type formed by combining other types.

In computer science, reflective programming or reflection is the ability of a process to examine, introspect, and modify its own structure and behavior.

In computer science, a tagged union, also called a variant, variant record, choice type, discriminated union, disjoint union, sum type or coproduct, is a data structure used to hold a value that could take on several different, but fixed, types. Only one of the types can be in use at any one time, and a tag field explicitly indicates which one is in use. It can be thought of as a type that has several "cases", each of which should be handled correctly when that type is manipulated. This is critical in defining recursive datatypes, in which some component of a value may have the same type as that value, for example in defining a type for representing trees, where it is necessary to distinguish multi-node subtrees and leaves. Like ordinary unions, tagged unions can save storage by overlapping storage areas for each type, since only one is in use at a time.

<span class="mw-page-title-main">Java syntax</span> Set of rules defining correctly structured program

The syntax of Java is the set of rules defining how a Java program is written and interpreted.

In mathematics and in computer programming, a variadic function is a function of indefinite arity, i.e., one which accepts a variable number of arguments. Support for variadic functions differs widely among programming languages.

Exception handling syntax is the set of keywords and/or structures provided by a computer programming language to allow exception handling, which separates the handling of errors that arise during a program's operation from its ordinary processes. Syntax for exception handling varies between programming languages, partly to cover semantic differences but largely to fit into each language's overall syntactic structure. Some languages do not call the relevant concept "exception handling"; others may not have direct facilities for it, but can still provide means to implement it.

<span class="mw-page-title-main">JavaScript syntax</span> Set of rules defining correctly structured programs

The syntax of JavaScript is the set of rules that define a correctly structured JavaScript program.

Haxe is a high-level cross-platform programming language and compiler that can produce applications and source code for many different computing platforms from one code-base. It is free and open-source software, released under the MIT License. The compiler, written in OCaml, is released under the GNU General Public License (GPL) version 2.

In object-oriented computer programming, a null object is an object with no referenced value or with defined neutral (null) behavior. The null object design pattern, which describes the uses of such objects and their behavior, was first published as "Void Value" and later in the Pattern Languages of Program Design book series as "Null Object".

This article describes the syntax of the C# programming language. The features described are compatible with .NET Framework and Mono.

The syntax and semantics of PHP, a programming language, form a set of rules that define how a PHP program can be written and interpreted.

The computer programming language, C#, introduces several new features in version 2.0. These include:

<span class="mw-page-title-main">Ceylon (programming language)</span>

Ceylon was an object-oriented, strongly statically typed programming language with an emphasis on immutability, created by Red Hat. Ceylon programs run on the Java virtual machine (JVM), and could be compiled to JavaScript. The language design focuses on source code readability, predictability, toolability, modularity, and metaprogrammability.

<span class="mw-page-title-main">Elm (programming language)</span> Functional programming language

Elm is a domain-specific programming language for declaratively creating web browser-based graphical user interfaces. Elm is purely functional, and is developed with emphasis on usability, performance, and robustness. It advertises "no runtime exceptions in practice", made possible by the Elm compiler's static type checking.

Kotlin is a cross-platform, statically typed, general-purpose high-level programming language with type inference. Kotlin is designed to interoperate fully with Java, and the JVM version of Kotlin's standard library depends on the Java Class Library, but type inference allows its syntax to be more concise. Kotlin mainly targets the JVM, but also compiles to JavaScript or native code via LLVM. Language development costs are borne by JetBrains, while the Kotlin Foundation protects the Kotlin trademark.

<span class="mw-page-title-main">Nim (programming language)</span> Programming language

Nim is a general-purpose, multi-paradigm, statically typed, compiled high-level systems programming language, designed and developed by a team around Andreas Rumpf. Nim is designed to be "efficient, expressive, and elegant", supporting metaprogramming, functional, message passing, procedural, and object-oriented programming styles by providing several features such as compile time code generation, algebraic data types, a foreign function interface (FFI) with C, C++, Objective-C, and JavaScript, and supporting compiling to those same languages as intermediate representations.

Whiley is an experimental programming language that combines features from the functional and imperative paradigms, and supports formal specification through function preconditions, postconditions and loop invariants. The language uses flow-sensitive typing also known as "flow typing."

This article compares the syntax for defining and instantiating an algebraic data type (ADT), sometimes also referred to as a tagged union, in various programming languages.

References

  1. Lukas Eder (11 December 2014). "The Inconvenient Truth About Dynamic vs. Static Typing". blog.jooq.org. Retrieved 11 March 2016.
  2. Tony Hoare (2009-08-25). "Null References: The Billion Dollar Mistake". InfoQ.com. I call it my billion-dollar mistake. It was the invention of the null reference in 1965. At that time, I was designing the first comprehensive type system for references in an object oriented language (ALGOL W). My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn't resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.
  3. "The Design and Implementation of Typed Scheme | Lambda the Ultimate, 2008". lambda-the-ultimate.org.
  4. "5 Occurrence Typing". docs.racket-lang.org.
  5. David J. Pearce (22 September 2010). "On Flow-Sensitive Types in Whiley". whiley.org. Archived from the original on 11 March 2016. Retrieved 11 March 2016.
  6. David J. Pearce (8 April 2012). "Whiley - Flow Typing". whiley.org. Archived from the original on 11 March 2016. Retrieved 11 March 2016.
  7. "Typing Local Control and State Using Flow Analysis" . Retrieved 14 November 2016.
  8. "Ceylon - Quick introduction - Typesafe null and flow-sensitive typing". ceylon-lang.org. Retrieved 11 March 2016.
  9. Ryan Cavanaugh (18 November 2014). "TypeScript 1.4 sneak peek: union types, type guards, and more". blogs.msdn.microsoft.com. Retrieved 11 March 2016.
  10. Avik Chaudhuri; Basil Hosmer; Gabriel Levi (18 November 2014). "Flow, a new static type checker for JavaScript". code.facebook.com. Retrieved 11 March 2016.
  11. "Design with nullable reference types". docs.microsoft.com.
  12. "Null Safety". kotlinlang.org. Retrieved 11 March 2016.
  13. "Type Checks and Casts". kotlinlang.org. Retrieved 11 March 2016.
  14. "The Lobster Type System". aardappel.github.io.
  15. Gavin Bierman and Brian Goetz (19 September 2023). "JEP 441: Pattern Matching for switch". openjdk.org. Retrieved 14 November 2023.