K (programming language)

Paradigm: array, functional
Designed by: Arthur Whitney
Developer: Kx Systems
First appeared: 1993
Typing discipline: dynamic, strong
Website: kx.com
Influenced by: A+, APL, Scheme
Influenced: Q, Shakti

K is a proprietary array processing programming language developed by Arthur Whitney and commercialized by Kx Systems. The language serves as the foundation for kdb+, an in-memory, column-based database, and other related financial products. [1] The language, originally developed in 1993, is a variant of APL and contains elements of Scheme. Advocates of the language emphasize its speed, facility in handling arrays, and expressive syntax. [2]

History

Before developing K, Arthur Whitney had worked extensively with APL, first at I. P. Sharp Associates alongside Ken Iverson and Roger Hui, and later at Morgan Stanley developing financial applications. At Morgan Stanley, Whitney helped to develop A+, a variant of APL, to facilitate migrating APL applications from IBM mainframe computers to a network of Sun workstations. A+ had a smaller set of primitive functions and was designed for speed and to handle large sets of time series data. [3]

In 1993, Whitney left Morgan Stanley and developed the first version of the K language. At the same time he formed Kx Systems to commercialize the product and signed an exclusive contract with Union Bank of Switzerland (UBS). For the next four years he developed various financial and trading applications using K for UBS.

The contract ended in 1997 when UBS merged with Swiss Bank Corporation. In 1998, Kx Systems released kdb, a database built on K. kdb was an in-memory, column-oriented database and included ksql, a query language with an SQL-like syntax. Since then, several financial products have been developed with K and kdb. kdb/tick and kdb/taq were developed in 2001. kdb+, a 64-bit version of kdb, was released in 2003, and kdb+/tick and kdb+/taq were released in 2004. kdb+ included Q, a language that merged the functions of the underlying K language and ksql. [4]

Whitney released a derivative of K called Shakti in 2018. [5]

Overview

K shares key features with APL. They are both interpreted, interactive languages noted for concise and expressive syntax. They have simple rules of precedence based on right to left evaluation. The languages contain a rich set of primitive functions designed for processing arrays. These primitive functions include mathematical operations that work on arrays as whole data objects, and array operations, such as sorting or reversing the order of an array. In addition, the language contains special operators that combine with primitive functions to perform types of iteration and recursion. As a result, complex and extended transformations of a dataset can be expressed as a chain of sub-expressions, with each link performing a segment of the calculation and passing the results to the next link in the chain.

Like APL, the primitive functions and operators are represented by single or double characters; however, unlike APL, K restricts itself to the ASCII character set (as does another APL variant, J). To allow for this, the set of primitive functions for K is smaller and heavily overloaded, with each of the ASCII symbols representing two or more distinct functions or operations. In a given expression, the actual function referenced is determined by the context. As a result, K expressions can be opaque and difficult to parse for humans. For example, in the following contrived expression the exclamation point ! refers to three distinct functions:

2!!7!4

Reading from right to left, the first ! is modulo division, performed on 7 and 4, resulting in 3. The next ! is enumeration and lists the nonnegative integers less than 3, resulting in the list 0 1 2. The final ! is rotation: the list on the right is rotated two positions to the left, producing the final result of 2 0 1.
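The three steps can be mirrored in Python as a rough sketch. The helper names `enumerate_below` and `rotate_left` are hypothetical stand-ins for the K primitives described above; they are not part of K or of any Python library.

```python
def enumerate_below(n):
    """Stand-in for the enumeration use of ! : the integers 0 .. n-1."""
    return list(range(n))


def rotate_left(k, items):
    """Stand-in for the rotation use of ! : rotate the list left by k positions."""
    if not items:
        return []
    k %= len(items)
    return items[k:] + items[:k]


# Evaluate 2!!7!4 one step at a time, right to left.
step1 = 7 % 4                    # modulo division
step2 = enumerate_below(step1)   # enumeration
step3 = rotate_left(2, step2)    # rotation
print(step1, step2, step3)       # 3 [0, 1, 2] [2, 0, 1]
```

Each helper makes explicit a meaning that K selects implicitly from context, which is why the K expression is so much shorter and also harder to read.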

The second core distinction of K is that functions are first-class objects, a concept borrowed from Scheme. First-class functions can be used in the same contexts where a data value can be used. Functions can be specified as anonymous expressions and used directly with other expressions. Function expressions are specified in K using curly brackets. For example, in the following expression a quadratic is defined as an anonymous function and applied to each of the values 0, 1, 2, and 3:

{(3*x^2)+(2*x)+1}'!4
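As an illustrative sketch only, the same computation can be written in Python, with a list comprehension playing the role of the "each" application. The names `quadratic`, `values`, and `results` are chosen here for illustration.

```python
def quadratic(x):
    """Python counterpart of the anonymous K function {(3*x^2)+(2*x)+1}."""
    return 3 * x**2 + 2 * x + 1


values = list(range(4))                    # counterpart of !4, i.e. 0 1 2 3
results = [quadratic(x) for x in values]   # apply the function to each value
print(results)                             # [1, 6, 17, 34]
```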

In K, named functions are simply function expressions stored to a variable in the same way any data value is stored to a variable.

a:25
f:{(x^2)-1}

Functions can be passed as an argument to another function or returned as a result from a function.
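A minimal Python sketch of the same idea follows. The variables `a` and `f` mirror the K assignments above; `apply_twice` and `add_offset` are hypothetical helpers introduced only to show a function being passed in and a function being returned.

```python
a = 25                      # a data value bound to a name
f = lambda x: x**2 - 1      # a function value bound to a name in the same way


def apply_twice(g, x):
    """Takes a function as an argument and applies it two times."""
    return g(g(x))


def add_offset(n):
    """Returns a new function as its result."""
    return lambda x: x + n


print(f(a))                 # 624
print(apply_twice(f, 2))    # f(2) is 3, f(3) is 8
print(add_offset(a)(5))     # 30
```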

Examples

K is an interpreted language in which every statement is evaluated and its result displayed immediately. Literal expressions such as strings evaluate to themselves. Consequently, the "Hello, world" program is trivial:

"Hello world!" 

The following expression sorts a list of strings by length, longest first:

x@>#:'x

The expression is evaluated from right to left as follows:

  1. #:'x returns the length of each word in the list x.
  2. > returns the indices that would sort a list of values in descending order.
  3. @ uses the integer values on the right to index into the original list of strings.
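The three steps above can be sketched in Python. The helper `grade_down` is a hypothetical stand-in for the K primitive `>`, and the sample words are invented for illustration; how K orders strings of equal length is not addressed here, so the sample uses distinct lengths.

```python
def grade_down(values):
    """Indices that would sort values in descending order (stand-in for >)."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)


words = ["fig", "banana", "kiwi", "apple"]

lengths = [len(w) for w in words]          # step 1: length of each word
order = grade_down(lengths)                # step 2: indices for descending order
by_length = [words[i] for i in order]      # step 3: index into the original list

print(lengths)     # [3, 6, 4, 5]
print(order)       # [1, 3, 2, 0]
print(by_length)   # ['banana', 'apple', 'kiwi', 'fig']
```

Separating the grading step from the indexing step, as K does, means the same index list could be reused to reorder any other list of the same length.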

A function to determine if a number is prime can be written as:

{&/x!/:2_!x}

The function is evaluated from right to left:

  1. !x enumerates the nonnegative integers less than x.
  2. 2_ drops the first two elements of the enumeration (0 and 1).
  3. x!/: performs modulo division between the original integer and each value in the truncated list.
  4. &/ finds the minimum value of the list of modulo results.

If x is not prime, then one of the values returned by the modulo operation will be 0, which is consequently the minimal value of the list. If x is a prime greater than 2, then every value is nonzero and the minimal value will be 1, because x mod 2 is 1 for any prime greater than 2.
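The steps above can be followed in a short Python sketch. The function name `prime_flag` is hypothetical. For x below 3 there are no candidate divisors, and the sketch returns 1 in that case; that choice is an assumption made here and may not match what a K interpreter returns for an empty list.

```python
def prime_flag(x):
    """Minimum of x mod d over d = 2 .. x-1, mirroring {&/x!/:2_!x}.

    Returns 0 for composite x and 1 for prime x greater than 2.
    Assumption: returns 1 when there are no divisors to try (x < 3).
    """
    candidates = list(range(x))[2:]               # !x then 2_
    remainders = [x % d for d in candidates]      # x!/:
    return min(remainders, default=1)             # &/


print(prime_flag(7))    # 1, remainders are [1, 1, 3, 2, 1]
print(prime_flag(9))    # 0, because 9 mod 3 is 0
```

This is trial division over every smaller integer, so it favors brevity over efficiency, as the K original does.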

The same function can be used to list all of the prime numbers below R:

2_&{&/x!/:2_!x}'!R

The expression is evaluated from right to left:

  1. !R enumerates the nonnegative integers less than R.
  2. ' applies the prime number function on the left to each value of the enumeration. This returns a list of 0's and 1's.
  3. & returns the indices of the list where the value is 1.
  4. 2_ drops the first two of those indices (0 and 1), which are not prime.
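The four steps can be sketched in Python as follows. The names `prime_flag` and `primes_below` are hypothetical. The sketch assumes that the primality function yields 1 when there are no divisors to try (x below 3), which is what the final drop of 0 and 1 implies; actual K behavior for that empty case is not verified here.

```python
def prime_flag(x):
    """Minimum of x mod d over d = 2 .. x-1; assumed to be 1 when x < 3."""
    return min((x % d for d in range(2, x)), default=1)


def primes_below(r):
    """Mirror of 2_&{...}'!R: list the primes less than r."""
    numbers = list(range(r))                                 # step 1: !R
    flags = [prime_flag(x) for x in numbers]                 # step 2: each
    indices = [i for i, flag in enumerate(flags) if flag == 1]  # step 3: &
    return indices[2:]                                       # step 4: 2_


print(primes_below(20))   # [2, 3, 5, 7, 11, 13, 17, 19]
```

Because the enumeration stops at R-1, the result contains the primes strictly below R.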

K financial products

K is the foundation for a family of financial products. Kdb+ is an in-memory, column-based database with much of the same functionality as a relational database management system. The database supports SQL, SQL-92 and ksql, a query language with a syntax similar to SQL that is designed for column-based queries and array analysis.

Kdb+ is available for several operating systems, including Solaris, Linux, macOS, and Windows (32-bit or 64-bit).


References

  1. "Kx Systems".
  2. Iverson, Kenneth. "Notation as a Tool of Thought". Archived from the original on 2013-09-20. Retrieved 2015-02-23.
  3. "arthur bio and interview".
  4. Garland, Simon (December 28, 2004), Q Language Widening the Appeal of Vectors, Vector UK, archived from the original on January 1, 2007
  5. "Shakti".