Q (programming language from Kx Systems)

q
Paradigm: Array, functional
Designed by: Arthur Whitney
Developer: Kx Systems
First appeared: 2003 [1]
Stable release: 4.0 / March 17, 2020 [2]
Typing discipline: Dynamic, strong
Website: code.kx.com
Influenced by: A+, APL, Scheme, k

Q is a programming language for array processing, developed by Arthur Whitney. It is proprietary software, commercialized by Kx Systems. Q serves as the query language for kdb+, a disk-based and in-memory, column-based database. Kdb+ is based on the language k, a terse variant of the language APL. Q is a thin wrapper around k, providing a more readable, English-like interface. One use case is financial time series analysis, where inexact time matches are possible. For example, a bid can be matched with the most recent ask before it: the two timestamps differ slightly but are matched anyway. [3]

Overview

The fundamental building blocks of q are atoms, lists, and functions. Atoms are scalars and include the numeric, character, date, and time data types. Lists are ordered collections of atoms (or other lists), on which the higher-level data structures, dictionaries and tables, are internally constructed. A dictionary is a map of a list of keys to a list of values. A table is a transposed dictionary of symbol keys and equal-length lists (columns) as values. A keyed table, analogous to a table with a primary key placed on it, is a dictionary where the keys and values are arranged as two tables.

The following code demonstrates the relationships of the data structures. Expressions to evaluate appear prefixed with the q) prompt, with the output of the evaluation shown beneath:

q)`john / an atom of type symbol
`john
q)50 / an atom of type integer
50
q)`john`jack / a list of symbols
`john`jack
q)50 60 / a list of integers
50 60
q)`john`jack!50 60 / a list of symbols and a list of integers combined to form a dictionary
john| 50
jack| 60
q)`name`age!(`john`jack;50 60) / an arrangement termed a column dictionary
name| john jack
age | 50   60
q)flip `name`age!(`john`jack;50 60) / when transposed via the function "flip", the column dictionary becomes a table
name age
--------
john 50
jack 60
q)(flip (enlist `name)!enlist `john`jack)!flip (enlist `age)!enlist 50 60 / two equal length tables combined as a dictionary become a keyed table
name| age
----| ---
john| 50
jack| 60
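For readers more familiar with general-purpose languages, the relationships between these structures can be mimicked in Python. The following is only a rough analogue, assuming that plain Python lists and dicts stand in for q lists and dictionaries. The helper names `flip` and `keyed_table` are hypothetical and chosen for illustration; they do not reflect how q is implemented.

```python
# Rough Python analogue of the q structures above (illustrative only).
names = ["john", "jack"]   # stands in for a list of symbols
ages = [50, 60]            # stands in for a list of integers

# Dictionary: a list of keys mapped onto a list of values.
age_of = dict(zip(names, ages))

# Column dictionary: column names mapped to equal-length lists.
columns = {"name": names, "age": ages}


def flip(column_dict):
    """Transpose a column dictionary into a list of row dictionaries."""
    keys = list(column_dict)
    return [dict(zip(keys, row)) for row in zip(*column_dict.values())]


def keyed_table(key_table, value_table):
    """Pair each key row with the value row at the same position."""
    return {tuple(k.values()): v for k, v in zip(key_table, value_table)}


table = flip(columns)
keyed = keyed_table(flip({"name": names}), flip({"age": ages}))
```

Unlike this sketch, which materialises one dictionary per row, a q table remains a set of columns; the row-oriented form is used here only because it is the more familiar shape in Python.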

These entities are manipulated via functions, which include the built-in functions that come with Q (which are defined as K macros) and user-defined functions. Functions are a data type, and can be placed in lists, dictionaries and tables, or passed to other functions as parameters.

Examples

Like K, Q is interpreted, and the result of evaluating an expression is displayed immediately unless the expression is terminated with a semicolon. The Hello world program is thus trivial:

q)"Hello world!"
"Hello world!"

The following expression sorts a list of strings stored in the variable x descending by their lengths:

x@idesc count each x

The expression is evaluated from right to left as follows:

  1. "count each x" returns the length of each word in the list x.
  2. "idesc" returns the indices that would sort a list of values in descending order.
  3. "@" uses the integer values on the right to index into the original list of strings.
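The three steps above can be sketched in Python to make the "sort by index" idea explicit. This is an analogue under stated assumptions, not q itself: the function `idesc` below is a hypothetical Python stand-in named after the q primitive, and `sort_by_length_desc` is an illustrative name.

```python
def idesc(values):
    """Return the indices that would sort values in descending order."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)


def sort_by_length_desc(words):
    lengths = [len(w) for w in words]   # analogue of: count each x
    order = idesc(lengths)              # analogue of: idesc
    return [words[i] for i in order]    # analogue of: x@


result = sort_by_length_desc(["bb", "a", "dddd", "ccc"])
```

Computing the ordering as a list of indices, rather than sorting the strings directly, means the same ordering could be reused to reorder any other list of the same length.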

The factorial function can be implemented directly in Q as

{prd 1+til x}

or recursively as

{$[x=0;1;x*.z.s[x-1]]}

Note that in both cases the function implicitly takes a single argument called x. In general it is possible to use up to three implicit arguments, named x, y and z, or to give arguments local variable bindings explicitly.

In the direct implementation, the expression "til x" enumerates the integers from 0 to x-1, "1+" adds 1 to every element of the list and "prd" returns the product of the list.

In the recursive implementation, the syntax "$[condition; expr1; expr2]" is a ternary conditional: if the condition is true then expr1 is returned; otherwise expr2 is returned. The expression ".z.s" is loosely equivalent to 'this' in Java or 'self' in Python: it refers to the currently executing function, and enables functions in q to call themselves.
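As a point of comparison, both factorial definitions can be written in Python. This is a sketch that mirrors the q expressions step by step, assuming a non-negative integer argument; the function names are hypothetical.

```python
from math import prod


def factorial_direct(x):
    # range(x) plays the role of "til x", i + 1 the role of "1+",
    # and prod the role of "prd".
    return prod(i + 1 for i in range(x))


def factorial_recursive(x):
    # The conditional expression mirrors $[x=0;1;x*.z.s[x-1]];
    # Python calls the function by name where q uses .z.s.
    return 1 if x == 0 else x * factorial_recursive(x - 1)
```

In the direct version an argument of 0 yields an empty sequence, whose product is 1, so both forms agree on the base case.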

When x is an integer greater than 2, the following function returns 1 if x is prime, and 0 otherwise:

{min x mod 2_til x}

The function is evaluated from right to left:

  1. "til x" enumerates the non-negative integers less than x.
  2. "2_" drops the first two elements of the enumeration (0 and 1).
  3. "x mod" performs modulo division between the original integer and each value in the truncated list.
  4. "min" finds the minimum value in the list of modulo results.
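The same right-to-left evaluation can be traced in Python. This is a minimal sketch assuming x is an integer greater than 2, as in the text; the function name is hypothetical, and the behaviour for smaller inputs is not meant to match q.

```python
def is_prime_q_style(x):
    candidates = list(range(x))[2:]            # analogue of: 2_til x
    remainders = [x % d for d in candidates]   # analogue of: x mod
    return min(remainders)                     # analogue of: min


checks = [is_prime_q_style(n) for n in (3, 4, 7, 9)]
```

The result is 1 for an odd prime because the remainder on division by 2 is 1 and no other candidate divides x exactly, while any composite has at least one candidate that leaves a remainder of 0.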

The q programming language contains its own table query syntax called qSQL, which resembles traditional SQL but has important differences, mainly due to the fact that the underlying tables are oriented by column, rather than by row.

q)show t:([] name:`john`jack`jill`jane; age:50 60 50 20) / define a simple table and assign to "t"
name age
--------
john 50
jack 60
jill 50
jane 20
q)select from t where name like "ja*",age>50
name age
--------
jack 60
q)select rows:count i by age from t
age| rows
---| ----
20 | 1
50 | 2
60 | 1
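To illustrate what the two queries compute over a column-oriented table, here is a rough Python analogue. It assumes the table is held as a dict of equal-length column lists; the helper `select_where` is hypothetical and does not reflect how qSQL is evaluated internally.

```python
from collections import Counter
from fnmatch import fnmatchcase

# The table is stored by column, as in q.
t = {"name": ["john", "jack", "jill", "jane"], "age": [50, 60, 50, 20]}


def select_where(table, predicate):
    """Keep the rows for which predicate(row) is true, column by column."""
    n = len(next(iter(table.values())))
    keep = [i for i in range(n) if predicate({c: table[c][i] for c in table})]
    return {c: [table[c][i] for i in keep] for c in table}


# Analogue of: select from t where name like "ja*",age>50
matches = select_where(
    t, lambda row: fnmatchcase(row["name"], "ja*") and row["age"] > 50
)

# Analogue of: select rows:count i by age from t
rows_by_age = dict(sorted(Counter(t["age"]).items()))
```

The filter first finds the positions of the qualifying rows and then takes those positions from each column, which is the natural shape for a selection when the data is stored by column rather than by row.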

References

  1. "Q Language Widening the Appeal of Vectors". Archived from the original on January 1, 2007. Retrieved June 1, 2016.
  2. "Changes in 4.0" (Press release). Palo Alto: Kx Systems. March 17, 2020. Retrieved April 15, 2020.
  3. "Q Reference Card". Retrieved April 15, 2020.

Further reading