Programming by permutation

Last updated

Programming by permutation, sometimes called "programming by accident" or "shotgunning", is an approach to software development wherein a programming problem is solved by iteratively making small changes (permutations) and testing each change to see if it behaves as desired. This approach sometimes seems attractive when the programmer does not fully understand the code and believes that one or more small modifications may result in code that is correct.

This tactic is not productive when:

Programming by permutation gives little or no assurance about the quality of the code produced—it is the polar opposite of formal verification.

Programmers are often compelled to program by permutation when an API is insufficiently documented. This lack of clarity drives others to copy and paste from reference code which is assumed to be correct, but was itself written as a result of programming by permutation.

In some cases where the programmer can logically explain that exactly one out of a small set of variations must work, programming by permutation leads to correct code (which then can be verified) and makes it unnecessary to think about the other (wrong) variations.

Example

For example, the following code sample in C (intended to find and copy a series of digits from a larger string) has several problems:

#include<stdio.h>#include<string.h>#include<ctype.h>intmain(void){constchar*buffer="123abc";chardestination[10];inti=0;intj=0;intl=strlen(buffer);while(i<l){if(isdigit(buffer[i])){destination[j++]=buffer[i++];}++i;}destination[j]='\0';printf("%s\n",destination);}

First of all, it doesn't give the right answer. With the given starting string, it produces the output "13", when the correct answer is "123". A programmer who does not recognize the structural problems may seize on one statement, saying "ah, there's an extra increment". The line "++i" is removed; but testing the code results in an infinite loop. "Oops, wrong increment." The former statement is added back, and the line above it is changed to remove the post-increment of variable i:

if(isdigit(buffer[i])){destination[j++]=buffer[i];}

Testing the code now produces the correct answer, "123". The programmer sighs in contentment: "There, that did it. All finished now." Additional testing with various other input strings bears out this conclusion.

Of course, other problems remain. Because the programmer does not take the trouble to fully understand the code, they go unrecognized:

While the solution is correct for a limited set of input strings, it is not fully correct, and since the programmer did not bother to understand the code, the errors will not be discovered until a later testing stage, if at all.

Also known as "Trial and Error", "Generate and Test", "Poke and Hope", [1] "The Birdshot Method" and the "Million Monkeys Programming Style".

Related Research Articles

The Cyclone programming language is intended to be a safe dialect of the C language. Cyclone is designed to avoid buffer overflows and other vulnerabilities that are possible in C programs, without losing the power and convenience of C as a tool for system programming.

In computing, a segmentation fault or access violation is a fault, or failure condition, raised by hardware with memory protection, notifying an operating system (OS) the software has attempted to access a restricted area of memory. On standard x86 computers, this is a form of general protection fault. The operating system kernel will, in response, usually perform some corrective action, generally passing the fault on to the offending process by sending the process a signal. Processes can in some cases install a custom signal handler, allowing them to recover on their own, but otherwise the OS default signal handler is used, generally causing abnormal termination of the process, and sometimes a core dump.

OCaml is a general-purpose, high-level multi-paradigm programming language which extends the Caml dialect of ML with object-oriented features. OCaml was created in 1996 by Xavier Leroy, Jérôme Vouillon, Damien Doligez, Didier Rémy, Ascánder Suárez, and others.

In computer programming, an infinite loop is a sequence of instructions that, as written, will continue endlessly, unless an external intervention occurs. It may be intentional.

Defensive programming is a form of defensive design intended to develop programs that are capable of detecting potential security abnormalities and make predetermined responses. It ensures the continuing function of a piece of software under unforeseen circumstances. Defensive programming practices are often used where high availability, safety, or security is needed.

The syntax of the C programming language is the set of rules governing writing of software in the C language. It is designed to allow for programs that are extremely terse, have a close relationship with the resulting object code, and yet provide relatively high-level data abstraction. C was the first widely successful high-level language for portable operating-system development.

<span class="mw-page-title-main">Pointer (computer programming)</span> Object which stores memory addresses in a computer program

In computer science, a pointer is an object in many programming languages that stores a memory address. This can be that of another value located in computer memory, or in some cases, that of memory-mapped computer hardware. A pointer references a location in memory, and obtaining the value stored at that location is known as dereferencing the pointer. As an analogy, a page number in a book's index could be considered a pointer to the corresponding page; dereferencing such a pointer would be done by flipping to the page with the given page number and reading the text found on that page. The actual format and content of a pointer variable is dependent on the underlying computer architecture.

In computer programming, undefined behavior (UB) is the result of executing a program whose behavior is prescribed to be unpredictable, in the language specification to which the computer code adheres. This is different from unspecified behavior, for which the language specification does not prescribe a result, and implementation-defined behavior that defers to the documentation of another component of the platform.

<span class="mw-page-title-main">Code injection</span> Computer bug exploit caused by invalid data

Code injection is the exploitation of a computer bug that is caused by processing invalid data. The injection is used by an attacker to introduce code into a vulnerable computer program and change the course of execution. The result of successful code injection can be disastrous, for example, by allowing computer viruses or computer worms to propagate.

In computing, an uninitialized variable is a variable that is declared but is not set to a definite known value before it is used. It will have some value, but not a predictable one. As such, it is a programming error and a common source of bugs in software.

In some programming languages, const is a type qualifier that indicates that the data is read-only. While this can be used to declare constants, const in the C family of languages differs from similar constructs in other languages in being part of the type, and thus has complicated behavior when combined with pointers, references, composite data types, and type-checking. In other languages, the data is not in a single memory location, but copied at compile time on each use. Languages which utilize it include C, C++, D, JavaScript, Julia, and Rust.

The computer programming languages C and Pascal have similar times of origin, influences, and purposes. Both were used to design their own compilers early in their lifetimes. The original Pascal definition appeared in 1969 and a first compiler in 1970. The first version of C appeared in 1972.

In number theory, a narcissistic number in a given number base is a number that is the sum of its own digits each raised to the power of the number of digits.

A scanf format string is a control parameter used in various functions to specify the layout of an input string. The functions can then divide the string and translate into values of appropriate data types. String scanning functions are often supplied in standard libraries.Scanf is a function that reads formatted data from the standard input string, which is usually the keyboard and writes the results whenever called in the specified arguments.

C++11 is a version of the ISO/IEC 14882 standard for the C++ programming language. C++11 replaced the prior version of the C++ standard, called C++03, and was later replaced by C++14. The name follows the tradition of naming language versions by the publication year of the specification, though it was formerly named C++0x because it was expected to be published before 2010.

The Luhn mod N algorithm is an extension to the Luhn algorithm that allows it to work with sequences of values in any even-numbered base. This can be useful when a check digit is required to validate an identification string composed of letters, a combination of letters and digits or any arbitrary set of N characters where N is divisible by 2.

In computing, the producer-consumer problem is a family of problems described by Edsger W. Dijkstra since 1965.

<span class="mw-page-title-main">Secure coding</span> Software development methodology

Secure coding is the practice of developing computer software in such a way that guards against the accidental introduction of security vulnerabilities. Defects, bugs and logic flaws are consistently the primary cause of commonly exploited software vulnerabilities. Through the analysis of thousands of reported vulnerabilities, security professionals have discovered that most vulnerabilities stem from a relatively small number of common software programming errors. By identifying the insecure coding practices that lead to these errors and educating developers on secure alternatives, organizations can take proactive steps to help significantly reduce or eliminate vulnerabilities in software before deployment.

FastCode is an open source programming project aimed at providing enhanced runtime library routines for Embarcadero Delphi and C++ Builder. Since it was started in 2003 by Dennis Kjaer Christensen, it has contributed highly optimised functionality to the 32-bit Delphi runtime library (RTL). FastCode is unique among contributions to commercial compiler runtime libraries for its community-driven and open source nature.

The write is one of the most basic routines provided by a Unix-like operating system kernel. It writes data from a buffer declared by the user to a given device, such as a file. This is the primary way to output data from a program by directly using a system call. The destination is identified by a numeric code. The data to be written, for instance a piece of text, is defined by a pointer and a size, given in number of bytes.

References

  1. Catania, Anthony (1987). ""ever, although there are some algebraic skills, topics on factorization, rational."". Mathematics and Computer Education . University of Michigan. 21: 74.