Jump threading

Last updated

In computing, jump threading is a compiler optimization of one jump directly to a second jump. If the second condition is a subset or inverse of the first, it can be eliminated, or threaded through the first jump. [1] This is easily done in a single pass through the program, following acyclic chained jumps until the compiler arrives at a fixed point.

Contents

Benefits

The primary benefit of jump threading is the reduction of the amount of dynamically executed jumps. This makes way for further optimizations as there is a decrease in the number of conditionals, which will improve performance. On average one can expect 2-3 instructions being omitted as a result from a successful removal of a runtime branch. [2]

Examples

The following pseudocode demonstrates when a jump may be threaded.

   10. a = SomeNumber();    20. IF a > 10 GOTO 50    ...    50. IF a > 0 GOTO 100    ...

The jump on line 50 will always be taken if the jump on line 20 is taken. Therefore, for as long as line 100 is within the reachable range of the jump (or the size of the jump doesn't matter), the jump on line 20 may safely be modified to jump directly to line 100.

Another example shows jump threading of 2 partial overlap conditions:

voidbaz(boolx,booly,boolz){if(x&&y)bar();if(y||z)foo();}

The above can be transformed into:

voidbaz(boolx,booly,boolz){if(x&&y){bar();gotojmp;}if(y||z){jmp:foo();}}

If the first branch is taken, x and y are both true (logical conjunction), hence evaluation of expression y || z is not needed (logical disjunction). Therefore, a jump to label jmp is performed. [2]

See also

Related Research Articles

A one-instruction set computer (OISC), sometimes referred to as an ultimate reduced instruction set computer (URISC), is an abstract machine that uses only one instruction – obviating the need for a machine language opcode. With a judicious choice for the single instruction and given arbitrarily many resources, an OISC is capable of being a universal computer in the same manner as traditional computers that have multiple instructions. OISCs have been recommended as aids in teaching computer architecture and have been used as computational models in structural computing research. The first carbon nanotube computer is a 1-bit one-instruction set computer.

Programming style, also known as code style, is a set of rules or guidelines used when writing the source code for a computer program. It is often claimed that following a particular programming style will help programmers read and understand source code conforming to the style, and help to avoid introducing errors.

<span class="mw-page-title-main">C syntax</span> Set of rules defining correctly structured programs

The syntax of the C programming language is the set of rules governing writing of software in C. It is designed to allow for programs that are extremely terse, have a close relationship with the resulting object code, and yet provide relatively high-level data abstraction. C was the first widely successful high-level language for portable operating-system development.

<span class="mw-page-title-main">Conditional (computer programming)</span> Control flow statement that executes code according to some condition(s)

In computer science, conditionals are programming language commands for handling decisions. Specifically, conditionals perform different computations or actions depending on whether a programmer-defined Boolean condition evaluates to true or false. In terms of control flow, the decision is always achieved by selectively altering the control flow based on some condition . Although dynamic dispatch is not usually classified as a conditional construct, it is another way to select between alternatives at runtime. Conditional statements are the checkpoints in the programe that determines behaviour according to situation.

In computer science, a union is a value that may have any of several representations or formats within the same position in memory; that consists of a variable that may hold such a data structure. Some programming languages support special data types, called union types, to describe such values and variables. In other words, a union type definition will specify which of a number of permitted primitive types may be stored in its instances, e.g., "float or long integer". In contrast with a record, which could be defined to contain both a float and an integer; in a union, there is only one value at any given time.

In computer programming, the ternary conditional operator is a ternary operator that is part of the syntax for basic conditional expressions in several programming languages. It is commonly referred to as the conditional operator, ternary if, or inline if. An expression a ? b : c evaluates to b if the value of a is true, and otherwise to c. One can read it aloud as "if a then b otherwise c". The form a ? b : c is by far and large the most common, but alternative syntaxes do exist; for example, Raku uses the syntax a ?? b !! c to avoid confusion with the infix operators ? and !, whereas in Visual Basic .NET, it instead takes the form If(a, b, c).

In computer science, a tail call is a subroutine call performed as the final action of a procedure. If the target of a tail is the same subroutine, the subroutine is said to be tail recursive, which is a special case of direct recursion. Tail recursion is particularly useful, and is often easy to optimize in implementations.

In computer programming, unreachable code is part of the source code of a program which can never be executed because there exists no control flow path to the code from the rest of the program.

In computer science, function composition is an act or mechanism to combine simple functions to build more complicated ones. Like the usual composition of functions in mathematics, the result of each function is passed as the argument of the next, and the result of the last one is the result of the whole.

In computer programming, the term hooking covers a range of techniques used to alter or augment the behaviour of an operating system, of applications, or of other software components by intercepting function calls or messages or events passed between software components. Code that handles such intercepted function calls, events or messages is called a hook.

<span class="mw-page-title-main">JavaScript syntax</span> Set of rules defining correctly structured programs

The syntax of JavaScript is the set of rules that define a correctly structured JavaScript program.

The C and C++ programming languages are closely related but have many significant differences. C++ began as a fork of an early, pre-standardized C, and was designed to be mostly source-and-link compatible with C compilers of the time. Due to this, development tools for the two languages are often integrated into a single product, with the programmer able to specify C or C++ as their source language.

In compiler optimization, escape analysis is a method for determining the dynamic scope of pointers – where in the program a pointer can be accessed. It is related to pointer analysis and shape analysis.

C++11 is a version of the ISO/IEC 14882 standard for the C++ programming language. C++11 replaced the prior version of the C++ standard, called C++03, and was later replaced by C++14. The name follows the tradition of naming language versions by the publication year of the specification, though it was formerly named C++0x because it was expected to be published before 2010.

In computer programming, a branch table or jump table is a method of transferring program control (branching) to another part of a program using a table of branch or jump instructions. It is a form of multiway branch. The branch table construction is commonly used when programming in assembly language but may also be generated by compilers, especially when implementing optimized switch statements whose values are densely packed together.

setjmp.h is a header defined in the C standard library to provide "non-local jumps": control flow that deviates from the usual subroutine call and return sequence. The complementary functions setjmp and longjmp provide this functionality.

ALGOL 68-R was the first implementation of the Algorithmic Language ALGOL 68.

This article describes the syntax of the C# programming language. The features described are compatible with .NET Framework and Mono.

In programming languages, a label is a sequence of characters that identifies a location within source code. In most languages, labels take the form of an identifier, often followed by a punctuation character. In many high-level languages, the purpose of a label is to act as the destination of a GOTO statement. In assembly language, labels can be used anywhere an address can. Also in Pascal and its derived variations. Some languages, such as Fortran and BASIC, support numeric labels. Labels are also used to identify an entry point into a compiled sequence of statements.

Nemerle is a general-purpose, high-level, statically typed programming language designed for platforms using the Common Language Infrastructure (.NET/Mono). It offers functional, object-oriented, aspect-oriented, reflective and imperative features. It has a simple C#-like syntax and a powerful metaprogramming system.

References

  1. "Optimize Options - Using the GNU Compiler Collection (GCC)".
  2. 1 2 "A gentle introduction to jump threading optimizations | Red Hat Developer". developers.redhat.com. Retrieved 2023-01-08.