Strong and weak typing

Last updated

In computer programming, one of the many ways that programming languages are colloquially classified is whether the language's type system makes it strongly typed or weakly typed (loosely typed). However, there is no precise technical definition of what the terms mean and different authors disagree about the implied meaning of the terms and the relative rankings of the "strength" of the type systems of mainstream programming languages. [1] For this reason, writers who wish to write unambiguously about type systems often eschew the terms "strong typing" and "weak typing" in favor of specific expressions such as "type safety".

Contents

Generally, a strongly typed language has stricter typing rules at compile time, which implies that errors and exceptions are more likely to happen during compilation. Most of these rules affect variable assignment, function return values, procedure arguments and function calling. Dynamically typed languages (where type checking happens at run time) can also be strongly typed. In dynamically typed languages, values, rather than variables, have types.

A weakly typed language has looser typing rules and may produce unpredictable or even erroneous results or may perform implicit type conversion at runtime. [2] A different but related concept is latent typing.

History

In 1974, Barbara Liskov and Stephen Zilles defined a strongly-typed language as one in which "whenever an object is passed from a calling function to a called function, its type must be compatible with the type declared in the called function." [3] In 1977, K. Jackson wrote, "In a strongly typed language each data area will have a distinct type and each process will state its communication requirements in terms of these types." [4]

Definitions of "strong" or "weak"

A number of different language design decisions have been referred to as evidence of "strong" or "weak" typing. Many of these are more accurately understood as the presence or absence of type safety, memory safety, static type-checking, or dynamic type-checking.

"Strong typing" generally refers to use of programming language types in order to both capture invariants of the code, and ensure its correctness, and definitely exclude certain classes of programming errors. Thus there are many "strong typing" disciplines used to achieve these goals.

Implicit type conversions and "type punning"

Some programming languages make it easy to use a value of one type as if it were a value of another type. This is sometimes described as "weak typing".

For example, Aahz Maruch observes that "Coercion occurs when you have a statically typed language and you use the syntactic features of the language to force the usage of one type as if it were a different type (consider the common use of void* in C). Coercion is usually a symptom of weak typing. Conversion, on the other hand, creates a brand-new object of the appropriate type." [5]

As another example, GCC describes this as type-punning and warns that it will break strict aliasing. Thiago Macieira discusses several problems that can arise when type-punning causes the compiler to make inappropriate optimizations. [6]

There are many examples of languages that allow implicit type conversions, but in a type-safe manner. For example, both C++ and C# allow programs to define operators to convert a value from one type to another with well-defined semantics. When a C++ compiler encounters such a conversion, it treats the operation just like a function call. In contrast, converting a value to the C type void* is an unsafe operation that is invisible to the compiler.

Pointers

Some programming languages expose pointers as if they were numeric values, and allow users to perform arithmetic on them. These languages are sometimes referred to as "weakly typed", since pointer arithmetic can be used to bypass the language's type system.

Untagged unions

Some programming languages support untagged unions, which allow a value of one type to be viewed as if it were a value of another type.

Static type-checking

In Luca Cardelli's article Typeful Programming, [7] a "strong type system" is described as one in which there is no possibility of an unchecked runtime type error. In other writing, the absence of unchecked run-time errors is referred to as safety or type safety; Tony Hoare's early papers call this property security. [8]

Variation across programming languages

Some of these definitions are contradictory, others are merely conceptually independent, and still others are special cases (with additional constraints) of other, more "liberal" (less strong) definitions. Because of the wide divergence among these definitions, it is possible to defend claims about most programming languages that they are either strongly or weakly typed. For instance:

See also

Related Research Articles

C is a general-purpose computer programming language. It was created in the 1970s by Dennis Ritchie, and remains very widely used and influential. By design, C's features cleanly reflect the capabilities of the targeted CPUs. It has found lasting use in operating systems, device drivers, and protocol stacks, but its use in application software has been decreasing. C is commonly used on computer architectures that range from the largest supercomputers to the smallest microcontrollers and embedded systems.

<span class="mw-page-title-main">Common Lisp</span> Programming language standard

Common Lisp (CL) is a dialect of the Lisp programming language, published in American National Standards Institute (ANSI) standard document ANSI INCITS 226-1994 (S2018). The Common Lisp HyperSpec, a hyperlinked HTML version, has been derived from the ANSI Common Lisp standard.

<span class="mw-page-title-main">Programming language</span> Language for communicating instructions to a machine

A programming language is a system of notation for writing computer programs.

OCaml is a general-purpose, high-level, multi-paradigm programming language which extends the Caml dialect of ML with object-oriented features. OCaml was created in 1996 by Xavier Leroy, Jérôme Vouillon, Damien Doligez, Didier Rémy, Ascánder Suárez, and others.

Java and C++ are two prominent object-oriented programming languages. By many language popularity metrics, the two languages have dominated object-oriented and high-performance software development for much of the 21st century, and are often directly compared and contrasted. Java's syntax was based on C/C++.

In computer programming, the scope of a name binding is the part of a program where the name binding is valid; that is, where the name can be used to refer to the entity. In other parts of the program, the name may refer to a different entity, or to nothing at all. Scope helps prevent name collisions by allowing the same name to refer to different objects – as long as the names have separate scopes. The scope of a name binding is also known as the visibility of an entity, particularly in older or more technical literature—this is in relation to the referenced entity, not the referencing name.

In computer programming, a type system is a logical system comprising a set of rules that assigns a property called a type to every term. Usually the terms are various language constructs of a computer program, such as variables, expressions, functions, or modules. A type system dictates the operations that can be performed on a term. For variables, the type system determines the allowed values of that term. Type systems formalize and enforce the otherwise implicit categories the programmer uses for algebraic data types, data structures, or other components.

In computer programming, a reference is a value that enables a program to indirectly access a particular datum, such as a variable's value or a record, in the computer's memory or in some other storage device. The reference is said to refer to the datum, and accessing the datum is called dereferencing the reference. A reference is distinct from the datum itself.

In computer science, a dynamic programming language is a class of high-level programming languages which at runtime execute many common programming behaviours that static programming languages perform during compilation. These behaviors could include an extension of the program, by adding new code, by extending objects and definitions, or by modifying the type system. Although similar behaviors can be emulated in nearly any language, with varying degrees of difficulty, complexity and performance costs, dynamic languages provide direct tools to make use of them. Many of these features were first implemented as native features in the Lisp programming language.

In programming language theory and type theory, polymorphism is the use of a single symbol to represent multiple different types.

<span class="mw-page-title-main">Pointer (computer programming)</span> Object which stores memory addresses in a computer program

In computer science, a pointer is an object in many programming languages that stores a memory address. This can be that of another value located in computer memory, or in some cases, that of memory-mapped computer hardware. A pointer references a location in memory, and obtaining the value stored at that location is known as dereferencing the pointer. As an analogy, a page number in a book's index could be considered a pointer to the corresponding page; dereferencing such a pointer would be done by flipping to the page with the given page number and reading the text found on that page. The actual format and content of a pointer variable is dependent on the underlying computer architecture.

In computer science, type conversion, type casting, type coercion, and type juggling are different ways of changing an expression from one data type to another. An example would be the conversion of an integer value into a floating point value or its textual representation as a string, and vice versa. Type conversions can take advantage of certain features of type hierarchies or data representations. Two important aspects of a type conversion are whether it happens implicitly (automatically) or explicitly, and whether the underlying data representation is converted from one representation into another, or a given representation is merely reinterpreted as the representation of another data type. In general, both primitive and compound data types can be converted.

In computer science, type safety and type soundness are the extent to which a programming language discourages or prevents type errors. Type safety is sometimes alternatively considered to be a property of facilities of a computer language; that is, some facilities are type-safe and their usage will not result in type errors, while other facilities in the same language may be type-unsafe and a program using them may encounter type errors. The behaviors classified as type errors by a given programming language are usually those that result from attempts to perform operations on values that are not of the appropriate data type, e.g., adding a string to an integer when there's no definition on how to handle this case. This classification is partly based on opinion.

This article compares two programming languages: C# with Java. While the focus of this article is mainly the languages and their features, such a comparison will necessarily also consider some features of platforms and libraries. For a more detailed comparison of the platforms, see Comparison of the Java and .NET platforms.

In computing, late binding or dynamic linkage—though not an identical process to dynamically linking imported code libraries—is a computer programming mechanism in which the method being called upon an object, or the function being called with arguments, is looked up by name at runtime. In other words, a name is associated with a particular operation or object at runtime, rather than during compilation. The name dynamic binding is sometimes used, but is more commonly used to refer to dynamic scope.

The computer programming languages C and Pascal have similar times of origin, influences, and purposes. Both were used to design their own compilers early in their lifetimes. The original Pascal definition appeared in 1969 and a first compiler in 1970. The first version of C appeared in 1972.

Haxe is a high-level cross-platform programming language and compiler that can produce applications and source code for many different computing platforms from one code-base. It is free and open-source software, released under the MIT License. The compiler, written in OCaml, is released under the GNU General Public License (GPL) version 2.

This article compares a large number of programming languages by tabulating their data types, their expression, statement, and declaration syntax, and some common operating-system interfaces.

In computer programming, a constant is a value that is not altered by the program during normal execution. When associated with an identifier, a constant is said to be "named," although the terms "constant" and "named constant" are often used interchangeably. This is contrasted with a variable, which is an identifier with a value that can be changed during normal execution. To simplify, constants' values remains, while the values of variables varies, both hence their names. where as the constant variable of variation is the number that results two variables.

In programming languages, name resolution is the resolution of the tokens within program expressions to the intended program components.

References

  1. "What to know before debating type systems | Ovid [blogs.perl.org]". blogs.perl.org. Retrieved 2023-06-27.
  2. "CS1130. Transition to OO programming. – Spring 2012 --self-paced version". Cornell University, Department of Computer Science. 2005. Archived from the original on 2015-11-23. Retrieved 2015-11-23.{{cite web}}: CS1 maint: bot: original URL status unknown (link)
  3. Liskov, B; Zilles, S (1974). "Programming with abstract data types". ACM SIGPLAN Notices. 9 (4): 50–59. CiteSeerX   10.1.1.136.3043 . doi:10.1145/942572.807045.
  4. Jackson, K. (1977). "Parallel processing and modular software construction". Design and Implementation of Programming Languages . Lecture Notes in Computer Science. Vol. 54. pp. 436–443. doi:10.1007/BFb0021435. ISBN   3-540-08360-X.
  5. Aahz. "Typing: Strong vs. Weak, Static vs. Dynamic" . Retrieved 16 August 2015.
  6. "Type-punning and strict-aliasing - Qt Blog". Qt Blog. Retrieved 18 February 2020.
  7. Luca Cardelli, "Typeful programming"
  8. Hoare, C. A. R. 1974. Hints on Programming Language Design. In Computer Systems Reliability, ed. C. Bunyan. Vol. 20 pp. 505–534.
  9. InfoWorld. 1983-04-25. Retrieved 16 August 2015.
  10. Kernighan, Brian (1981). "Why Pascal is not my favorite programming language". Archived from the original on 2012-04-06. Retrieved 2011-10-22.
  11. "CLHS: Chapter 4" . Retrieved 16 August 2015.
  12. "CMUCL User's Manual: The Compiler". Archived from the original on 8 March 2016. Retrieved 16 August 2015.