Code golf

Last updated January 24, 2025

Code golf is a type of recreational computer programming competition in which participants strive to achieve the shortest possible source code that solves a certain problem.^[1]^[2] Code golf challenges and tournaments may also be named with the programming language used (for example, Perl golf).

Etymology

The term "code golf" is derived from the similarity of its goal with that of conventional golf, where participants seek to achieve the lowest possible score, rather than the highest, as is the standard in most sports and game scoring systems. While conventional golf players try to minimize the number of club strokes needed to complete the course, code golfers strive to reduce the number of characters necessary to write the program.

History

The length of the shortest possible program that produces a given output (in any fixed programming language) is known as the Kolmogorov complexity of the output, and its mathematical study dates to the work of Andrey Kolmogorov in 1963. Code golf, however, can be more general than this, as it often specifies a general input-output transformation that must be performed rather than asking for a single output with no input.

Whilst the term "code golf" was apparently first used in 1999 with Perl,^[3] and later popularised through the use of Perl to write a program that performed RSA encryption,^[4] a similar informal competition is known to have been popular with earlier APL hackers. The challenging nature of aggressively optimizing for program size has itself long been recognized; for example, a 1962 coding manual for Regnecentralen's GIER computer notes that "it is a time-consuming sport to code with the least possible number of instructions" and recommends against it for practical programming.^[5] Today the term has grown to cover a wide variety of languages, which has even triggered the creation of dedicated golfing languages.

Dedicated golfing languages

Several new languages have been created specifically with code golfing in mind. Examples include GolfScript, Flogscript, Stuck, and Vyxal, which are Turing-complete languages that provide constructs for concisely expressing ideas in code. Because golfing languages compete for extreme brevity, their design sacrifices readability, which is important for practical production environments, and therefore they are often esoteric. Sometimes, however, a language is designed for a practical purpose but turns out to be suitable for code golf.

An example of GolfScript code to print 1000 digits of pi:^[6]

;'' 6666,-2%{2+.2/@*\/10.3??2*+}* `1000<~\;

Code golf websites include novel golfing languages created by users to win code golf challenges. Other popular languages include 05AB1E, Husk, Pyth, CJam and Jelly.

Types of code golf

Some code golf questions, such as those posed on general programming sites, may not require implementation in a specific programming language. However, this limits the style of problems that it is possible for the problem designers to pose (for example, by limiting the use of certain language features). In addition, the creation of such "open" questions has resulted in the design of code golf specific programming language dialects such as REBMU (a dialect of REBOL). Both online and live competitions may also include time limits.

Related Research Articles

In algorithmic information theory, the Kolmogorov complexity of an object, such as a piece of text, is the length of a shortest computer program that produces the object as output. It is a measure of the computational resources needed to specify the object, and is also known as algorithmic complexity, Solomonoff–Kolmogorov–Chaitin complexity, program-size complexity, descriptive complexity, or algorithmic entropy. It is named after Andrey Kolmogorov, who first published on the subject in 1963 and is a generalization of classical information theory.

In the computer science subfield of algorithmic information theory, a Chaitin constant or halting probability is a real number that, informally speaking, represents the probability that a randomly constructed program will halt. These numbers are formed from a construction due to Gregory Chaitin.

In computing, Common Gateway Interface (CGI) is an interface specification that enables web servers to execute an external program to process HTTP or HTTPS user requests.

A "Hello, World!" program is usually a simple computer program that emits to the screen a message similar to "Hello, World!". A small piece of code in most general-purpose programming languages, this program is used to illustrate a language's basic syntax. Such program is often the first written by a student of a new programming language, but such a program can also be used as a sanity check to ensure that the computer software intended to compile or run source code is correctly installed, and that its operator understands how to use it.

Perl is a high-level, general-purpose, interpreted, dynamic programming language. Though Perl is not officially an acronym, there are various backronyms in use, including "Practical Extraction and Reporting Language".

sed is a Unix utility that parses and transforms text, using a simple, compact programming language. It was developed from 1973 to 1974 by Lee E. McMahon of Bell Labs, and is available today for most operating systems. sed was based on the scripting features of the interactive editor ed and the earlier qed. It was one of the earliest tools to support regular expressions, and remains in use for text processing, most notably with the substitution command. Popular alternative tools for plaintext string manipulation and "stream editing" include AWK and Perl.

An esoteric programming language is a programming language designed to test the boundaries of computer programming language design, as a proof of concept, as software art, as a hacking interface to another language, or as a joke. The use of the word esoteric distinguishes them from languages that working developers use to write software. The creators of most esolangs do not intend them to be used for mainstream programming, although some esoteric features, such as visuospatial syntax, have inspired practical applications in the arts. Such languages are often popular among hackers and hobbyists.

In computer science, a preprocessor is a program that processes its input data to produce output that is used as input in another program. The output is said to be a preprocessed form of the input data, which is often used by some subsequent programs like compilers. The amount and kind of processing done depends on the nature of the preprocessor; some preprocessors are only capable of performing relatively simple textual substitutions and macro expansions, while others have the power of full-fledged programming languages.

A string literal or anonymous string is a literal for a string value in the source code of a computer program. Modern programming languages commonly use a quoted sequence of characters, formally "bracketed delimiters", as in x = "foo", where, "foo" is a string literal with value foo. Methods such as escape sequences can be used to avoid the problem of delimiter collision and allow the delimiters to be embedded in a string. There are many alternate notations for specifying string literals especially in complicated cases. The exact notation depends on the programming language in question. Nevertheless, there are general guidelines that most modern programming languages follow.

In computing, a polyglot is a computer program or script written in a valid form of multiple programming languages or file formats. The name was coined by analogy to multilingualism. A polyglot file is composed by combining syntax from two or more different formats.

Minimum Description Length (MDL) is a model selection principle where the shortest description of the data is the best model. MDL methods learn through a data compression perspective and are sometimes described as mathematical applications of Occam's razor. The MDL principle can be extended to other forms of inductive inference and learning, for example to estimation and sequential prediction, without explicitly identifying a single model of the data.

In algorithmic information theory, algorithmic probability, also known as Solomonoff probability, is a mathematical method of assigning a prior probability to a given observation. It was invented by Ray Solomonoff in the 1960s. It is used in inductive inference theory and analyses of algorithms. In his general theory of inductive inference, Solomonoff uses the method together with Bayes' rule to obtain probabilities of prediction for an algorithm's future outputs.

dc is a cross-platform reverse-Polish calculator which supports arbitrary-precision arithmetic. It was written by Lorinda Cherry and Robert Morris at Bell Labs. It is one of the oldest Unix utilities, preceding even the invention of the C programming language. Like other utilities of that vintage, it has a powerful set of features but terse syntax. Traditionally, the bc calculator program was implemented on top of dc.

A delimiter is a sequence of one or more characters for specifying the boundary between separate, independent regions in plain text, mathematical expressions or other data streams. An example of a delimiter is the comma character, which acts as a field delimiter in a sequence of comma-separated values. Another example of a delimiter is the time gap used to separate letters and words in the transmission of Morse code.

In some programming languages, eval, short for evaluate, is a function which evaluates a string as though it were an expression in the language, and returns a result; in others, it executes multiple lines of code as though they had been included instead of the line including the eval. The input to eval is not necessarily a string; it may be structured representation of code, such as an abstract syntax tree, or of special type such as code. The analog for a statement is exec, which executes a string as if it were a statement; in some languages, such as Python, both are present, while in other languages only one of either eval or exec is.

An intermediate representation (IR) is the data structure or code used internally by a compiler or virtual machine to represent source code. An IR is designed to be conducive to further processing, such as optimization and translation. A "good" IR must be accurate – capable of representing the source code without loss of information – and independent of any particular source or target language. An IR may take one of several forms: an in-memory data structure, or a special tuple- or stack-based code readable by the program. In the latter case it is also called an intermediate language.

Hard coding is the software development practice of embedding data directly into the source code of a program or other executable object, as opposed to obtaining the data from external sources or generating it at runtime.

Algorithmic information theory (AIT) is a branch of theoretical computer science that concerns itself with the relationship between computation and information of computably generated objects, such as strings or any other data structure. In other words, it is shown within algorithmic information theory that computational incompressibility "mimics" the relations or inequalities found in information theory. According to Gregory Chaitin, it is "the result of putting Shannon's information theory and Turing's computability theory into a cocktail shaker and shaking vigorously."

A webform, web form or HTML form on a web page allows a user to enter data that is sent to a server for processing. Forms can resemble paper or database forms because web users fill out the forms using checkboxes, radio buttons, or text fields. For example, forms can be used to enter shipping or credit card data to order a product, or can be used to retrieve search results from a search engine.

References

↑ Code Golf Stack Exchange. About code-golf. Retrieved 2021-12-21.
↑ "Introduction to Code-golf | ASSIST Software Romania" . Retrieved 2023-03-23.
↑ Greg Bacon (1999-05-28). "Re: Incrementing a value in a slice". Newsgroup: comp.lang.perl.misc. Usenet: 7imnti$mjh$1@info2.uah.edu . Retrieved 2011-07-12.
↑ Back, Adam. "RSA in 5 lines of perl" . Retrieved 2011-01-10.
↑ Andersen, Christian; Gram, Christian (1962). Lærebog i Kodning for GIER (PDF). Vol. 1 (3 ed.). Copenhagen: Regnecentralen. p. 104. Retrieved 2020-05-16.
↑ "GolfScript Examples" . Retrieved 2023-03-23.

External links

CodeGolf.StackExchange.com: Questions and answers on programming puzzles and code golf
Golf on the esoteric programming languages wiki
List of dedicated golfing languages
"A (mostly) complete list of languages designed for code-golfing" with URLs to their GitHub repos

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[1] Code Golf Stack Exchange. About code-golf. Retrieved 2021-12-21.

[2] "Introduction to Code-golf | ASSIST Software Romania" . Retrieved 2023-03-23.

[perl-golf-coined-3] Greg Bacon (1999-05-28). "Re: Incrementing a value in a slice". Newsgroup: comp.lang.perl.misc. Usenet: 7imnti$mjh$1@info2.uah.edu . Retrieved 2011-07-12.

[rsa-4] Back, Adam. "RSA in 5 lines of perl" . Retrieved 2011-01-10.

[GIER-5] Andersen, Christian; Gram, Christian (1962). Lærebog i Kodning for GIER (PDF). Vol. 1 (3 ed.). Copenhagen: Regnecentralen. p. 104. Retrieved 2020-05-16.

[6] "GolfScript Examples" . Retrieved 2023-03-23.

[1]

[2]

[3]

[4]

[5]

[6]