Pseudocode

In computer science, pseudocode is a description of the steps in an algorithm using a mix of programming-language conventions (such as the assignment operator, conditional operator, and loops) with informal, usually self-explanatory, notation for actions and conditions. [1] [2] Although pseudocode shares features with regular programming languages, it is intended for human reading rather than machine control. Pseudocode typically omits details that are essential for machine implementation of the algorithm. The programming language is augmented with natural language descriptions, where convenient, or with compact mathematical notation. Pseudocode is used because it is easier for people to understand than conventional programming language code, and because it gives an efficient, environment-independent description of the key principles of an algorithm. It is commonly used in textbooks and scientific publications to document algorithms, and in the planning of software and other algorithms.

No broad standard for pseudocode syntax exists, as a program in pseudocode is not an executable program; however, certain limited standards exist (such as for academic assessment). Pseudocode resembles skeleton programs, which can be compiled without errors. Flowcharts, drakon-charts and Unified Modeling Language (UML) charts can be thought of as graphical alternatives to pseudocode, but they need more space on paper. Languages such as HAGGIS bridge the gap between pseudocode and code written in programming languages.

Application

Textbooks and scientific publications related to computer science and numerical computation often use pseudocode to describe algorithms, so that all programmers can understand them, even if they do not all know the same programming languages. Textbooks usually include an accompanying introduction explaining the particular conventions in use. The level of detail of the pseudocode may in some cases approach that of formalized general-purpose languages.

A programmer who needs to implement a specific algorithm, especially an unfamiliar one, will often start with a pseudocode description, and then "translate" that description into the target programming language and modify it to interact correctly with the rest of the program. Programmers may also start a project by sketching out the code in pseudocode on paper before writing it in its actual language, as a top-down structuring approach, with a process of steps to be followed as a refinement.

Pseudocode is widely used in standardization. Notable examples are the MPEG standards, which make very heavy use of C-like pseudocode and cannot be understood without grasping the details of the code. [3]

Syntax

Pseudocode generally does not obey the syntax rules of any particular language; there is no systematic standard form. Some writers borrow the style and syntax of control structures from a conventional programming language, although this is discouraged. [4] [5] Syntax sources include Fortran, Pascal, BASIC, C, C++, Java, Lisp, and ALGOL. Variable declarations are typically omitted. Function calls and blocks of code, such as code contained within a loop, are often replaced by a one-line natural language sentence.

Depending on the writer, pseudocode may therefore vary widely in style, from a near-exact imitation of a real programming language at one extreme, to a description approaching formatted prose at the other.

An example of pseudocode (for the mathematical game fizz buzz)

Pascal style:

    procedure fizzbuzz;
      for i := 1 to 100 do
        print_number := true;
        if i is divisible by 3 then begin
          print "Fizz";
          print_number := false;
        end;
        if i is divisible by 5 then begin
          print "Buzz";
          print_number := false;
        end;
        if print_number, print i;
        print a newline;
      end

C style:

    fizzbuzz() {
      for (i = 1; i <= 100; i++) {
        print_number = true;
        if (i is divisible by 3) {
          print "Fizz";
          print_number = false;
        }
        if (i is divisible by 5) {
          print "Buzz";
          print_number = false;
        }
        if (print_number) print i;
        print a newline;
      }
    }

Python style:

    def fizzbuzz():
      for i in range(1,101):
        print_number = true
        if i is divisible by 3:
          print "Fizz"
          print_number = false
        if i is divisible by 5:
          print "Buzz"
          print_number = false
        if print_number: print i
        print a newline
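
For comparison, the fizz buzz pseudocode could be translated into runnable Python roughly as follows. This is only a sketch of one possible translation: the helper name fizzbuzz_line and the limit parameter are introduced here for illustration and are not part of the pseudocode.

```python
def fizzbuzz_line(i: int) -> str:
    """Build the output line for i using the same flag logic as the pseudocode."""
    parts = []
    print_number = True
    if i % 3 == 0:          # "i is divisible by 3"
        parts.append("Fizz")
        print_number = False
    if i % 5 == 0:          # "i is divisible by 5"
        parts.append("Buzz")
        print_number = False
    if print_number:
        parts.append(str(i))
    return "".join(parts)


def fizzbuzz(limit: int = 100) -> None:
    """Print one line per number from 1 to limit, as the pseudocode does."""
    for i in range(1, limit + 1):
        print(fizzbuzz_line(i))
```

Calling fizzbuzz() prints the 100 lines. The translation has to supply what the pseudocode leaves informal: the divisibility test becomes the % operator, and "print a newline" is folded into building one string per line.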

Mathematical style pseudocode

In numerical computation, pseudocode often consists of mathematical notation, typically from matrix and set theory, mixed with the control structures of a conventional programming language, and perhaps also natural language descriptions. This is a compact and often informal notation that can be understood by a wide range of mathematically trained people, and is frequently used as a way to describe mathematical algorithms. For example, the sum operator (capital-sigma notation) or the product operator (capital-pi notation) may represent a for-loop and a selection structure in one expression:

Return Σ_{k∈S} x_k
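
As an illustration of how a single sum expression can stand for a loop plus a selection, the following sketch expands a sum of x[k] restricted to an index set S. The function name selective_sum and the sample data are hypothetical, chosen only for this example.

```python
def selective_sum(x, S):
    """Loop-and-branch form of the sum of x[k] over all indices k in S."""
    total = 0
    for k in range(len(x)):   # the for-loop hidden in the sigma notation
        if k in S:            # the selection structure: keep only k in S
            total += x[k]
    return total


x = [10, 20, 30, 40]
S = {0, 2}
# The explicit loop agrees with the compact one-line form.
assert selective_sum(x, S) == sum(x[k] for k in S)
```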

Normally non-ASCII typesetting is used for the mathematical equations, for example by means of markup languages, such as TeX or MathML, or proprietary formula editors.

Mathematical style pseudocode is sometimes referred to as pidgin code, for example pidgin ALGOL (the origin of the concept), pidgin Fortran, pidgin BASIC, pidgin Pascal, pidgin C, and pidgin Lisp.

Common mathematical symbols

Type of operation    Symbol              Example
Assignment           ← or :=             c ← 2πr, c := 2πr
Comparison           =, ≠, <, >, ≤, ≥
Arithmetic           +, −, ×, /, mod
Floor/ceiling        ⌊, ⌋, ⌈, ⌉          a ← ⌊b⌋ + ⌈c⌉
Logical              and, or
Sums, products       Σ Π                 h ← Σ_{a∈A} 1/a
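
As a rough guide, the example expressions in the table map onto Python as sketched below. The variable names and sample values are assumptions made for illustration only.

```python
import math

r = 1.5
circumference = 2 * math.pi * r       # c ← 2πr

b, c = 2.7, 3.2
a = math.floor(b) + math.ceil(c)      # a ← ⌊b⌋ + ⌈c⌉, here 2 + 4

A = {1, 2, 4}
h = sum(1 / x for x in A)             # h ← Σ 1/a over all a in A

remainder = 17 % 5                    # 17 mod 5
```

The single-character symbols become library calls or operators: ⌊ ⌋ and ⌈ ⌉ become math.floor and math.ceil, Σ becomes sum over a generator expression, and mod becomes %.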

Example

Here follows a longer example of mathematical-style pseudocode, for the Ford–Fulkerson algorithm:

    algorithm ford-fulkerson is
        input: Graph G with flow capacity c,
               source node s,
               sink node t
        output: Flow f such that f is maximal from s to t

        (Note that f(u,v) is the flow from node u to node v, and c(u,v) is the flow capacity from node u to node v)

        for each edge (u, v) in G_E do
            f(u, v) ← 0
            f(v, u) ← 0

        while there exists a path p from s to t in the residual network G_f do
            let c_f be the flow capacity of the residual network G_f
            c_f(p) ← min{c_f(u, v) | (u, v) in p}
            for each edge (u, v) in p do
                f(u, v) ← f(u, v) + c_f(p)
                f(v, u) ← −f(u, v)

        return f
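
To show what the pseudocode leaves unspecified, here is a minimal Python sketch of the same idea. Several choices are assumptions of this sketch rather than part of the pseudocode: the graph is a dict of dicts, augmenting paths are found by breadth-first search (which makes this the Edmonds–Karp variant of Ford–Fulkerson), the function tracks residual capacities instead of the flow f, and it returns only the value of the maximum flow.

```python
from collections import deque


def max_flow(capacity, s, t):
    """Return the maximum flow value from s to t.

    capacity maps each node u to a dict {v: c(u, v)}.
    """
    if s == t:
        raise ValueError("source and sink must differ")

    # Residual capacities c_f(u, v); each edge also gets a reverse edge.
    residual = {s: {}, t: {}}
    for u, edges in capacity.items():
        residual.setdefault(u, {})
        for v, c in edges.items():
            residual.setdefault(v, {})
            residual[u][v] = residual[u].get(v, 0) + c
            residual[v].setdefault(u, 0)

    total = 0
    while True:
        # Breadth-first search for a path p from s to t in the residual network.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return total  # no augmenting path remains

        # c_f(p): the smallest residual capacity along the path.
        bottleneck = float("inf")
        v = t
        while parent[v] is not None:
            u = parent[v]
            bottleneck = min(bottleneck, residual[u][v])
            v = u

        # Push the bottleneck along p, updating forward and reverse edges.
        v = t
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck
```

For example, with capacity = {"s": {"a": 3, "b": 2}, "a": {"b": 1, "t": 2}, "b": {"t": 3}, "t": {}}, calling max_flow(capacity, "s", "t") returns 5. The phrase "there exists a path p" takes one line in the pseudocode but accounts for most of the code here, which is typical of the detail pseudocode omits.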

Machine compilation of pseudocode style languages

Natural language grammar in programming languages

Various attempts to bring elements of natural language grammar into computer programming have produced programming languages such as HyperTalk, Lingo, AppleScript, SQL, Inform, and to some extent Python. In these languages, parentheses and other special characters are replaced by prepositions, resulting in quite verbose code. These languages are typically dynamically typed, meaning that variable declarations and other boilerplate code can be omitted. Such languages may make it easier for a person without knowledge about the language to understand the code and perhaps also to learn the language. However, the similarity to natural language is usually more cosmetic than genuine. The syntax rules may be just as strict and formal as in conventional programming, and do not necessarily make development of the programs easier.

Mathematical programming languages

An alternative to using mathematical pseudocode (involving set theory notation or matrix operations) for documentation of algorithms is to use a formal mathematical programming language that is a mix of non-ASCII mathematical notation and program control structures. Then the code can be parsed and interpreted by a machine.

Several formal specification languages include set theory notation using special characters.

Some array programming languages include vectorized expressions and matrix operations as non-ASCII formulas, mixed with conventional control structures.

References

  1. Reisig 2007, p. 23, Pseudocode Programs and Their Semantics.
  2. An often-repeated definition of pseudocode since at least 2003 is "a detailed yet readable description of what a computer program or algorithm must do, expressed in a formally-styled natural language".
  3. Mitchell et al. 1996, p. 105.
  4. McConnell, Steve (2004). Code Complete. Pearson Education. p. 54. ISBN 978-0-7356-1967-8. "Avoid syntactic elements from the target programming language."
  5. Schneider; Gersting. Invitation to Computer Science, 8th Edition. "Keep statements language independent", as quoted in a Stack Exchange question.
