Dependence analysis

In compiler theory, dependence analysis produces execution-order constraints between statements or instructions. Broadly speaking, a statement S2 depends on S1 if S1 must be executed before S2. There are two classes of dependencies: control dependencies and data dependencies.

Dependence analysis determines whether it is safe to reorder or parallelize statements.

Control dependencies

A control dependency is a situation in which a program instruction executes only if a previous instruction evaluates in a way that allows its execution.

A statement S2 is control dependent on S1 (written S1 δᶜ S2) if and only if S2's execution is conditionally guarded by S1. Equivalently, S2 is control dependent on S1 if and only if S1 ∈ PDF(S2), where PDF(S) is the post-dominance frontier of statement S. The following is an example of such a control dependence:

S1       if x > 2 goto L1
S2       y := 3
S3   L1: z := y + 1

Here, S2 runs only if the predicate in S1 is false; if x > 2, control jumps directly to S3.
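
The post-dominance frontier can be computed directly from a control-flow graph. The following minimal Python sketch is not part of the original article; the encoding of the CFG as a dict of successor sets, and the node names matching the example above, are assumptions made here for illustration:

def postdominators(cfg, exit_node):
    # Iterative fixed point: a node post-dominates itself plus whatever
    # post-dominates all of its successors. Assumes every node other
    # than exit_node has at least one successor.
    pdom = {n: set(cfg) for n in cfg}
    pdom[exit_node] = {exit_node}
    changed = True
    while changed:
        changed = False
        for n in cfg:
            if n == exit_node:
                continue
            new = {n} | set.intersection(*(pdom[s] for s in cfg[n]))
            if new != pdom[n]:
                pdom[n] = new
                changed = True
    return pdom

def control_dependences(cfg, pdom):
    # S1 δᶜ S2 iff S1 ∈ PDF(S2): S2 post-dominates some successor of S1
    # but does not strictly post-dominate S1 itself.
    deps = set()
    for s1 in cfg:
        for succ in cfg[s1]:
            for s2 in pdom[succ]:
                if s2 == s1 or s2 not in pdom[s1]:
                    deps.add((s1, s2))
    return deps

# CFG of the example: S1 either falls through to S2 or jumps to S3 (L1).
cfg = {"S1": {"S2", "S3"}, "S2": {"S3"}, "S3": set()}
print(control_dependences(cfg, postdominators(cfg, "S3")))  # {('S1', 'S2')}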

Data dependencies

A data dependence arises from two statements which access or modify the same resource.

Flow (true) dependence

A statement S2 is flow dependent on S1 (written S1 δᶠ S2) if and only if S1 modifies a resource that S2 reads and S1 precedes S2 in execution. The following is an example of a flow dependence (RAW: Read After Write):

S1       x := 10
S2       y := x + c

Antidependence

A statement S2 is antidependent on S1 (written S1 δᵃ S2) if and only if S2 modifies a resource that S1 reads and S1 precedes S2 in execution. The following is an example of an antidependence (WAR: Write After Read):

S1       x := y + c
S2       y := 10

Here, S2 sets the value of y but S1 reads a prior value of y.

Output dependence

A statement S2 is output dependent on S1 (written S1 δᵒ S2) if and only if S1 and S2 modify the same resource and S1 precedes S2 in execution. The following is an example of an output dependence (WAW: Write After Write):

S1       x := 10
S2       x := 20

Here, S2 and S1 both set the variable x.
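
Output dependences and antidependences constrain ordering only through the reuse of a name, so a compiler can often eliminate them by renaming. A hedged Python sketch of the idea, not from the original article:

# After renaming, the two writes target distinct locations, so the
# WAW constraint between them disappears and they may be reordered.
x1 = 10   # was S1: x := 10
x2 = 20   # was S2: x := 20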

Input dependence

A statement S2 is input dependent on S1 (written S1 δⁱ S2) if and only if S1 and S2 read the same resource and S1 precedes S2 in execution. The following is an example of an input dependence (RAR: Read After Read):

S1       y := x + 3
S2       z := x + 5

Here, S1 and S2 both read the variable x. Since neither statement modifies x, this dependence does not prohibit reordering.
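
All four data dependences can be read off mechanically from the sets of variables each statement reads and writes. A minimal Python sketch, assuming S1 executes before S2 (the read/write-set encoding is an assumption made here for illustration):

def classify(s1_reads, s1_writes, s2_reads, s2_writes):
    # Each argument is a set of variable names; S1 precedes S2.
    deps = []
    if s1_writes & s2_reads:
        deps.append("flow (RAW)")
    if s1_reads & s2_writes:
        deps.append("anti (WAR)")
    if s1_writes & s2_writes:
        deps.append("output (WAW)")
    if s1_reads & s2_reads:
        deps.append("input (RAR)")
    return deps

# The antidependence example: S1: x := y + c, S2: y := 10
print(classify({"y", "c"}, {"x"}, set(), {"y"}))  # ['anti (WAR)']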

Loop dependencies

Computing dependences within loops, a significant and nontrivial problem, is tackled by loop dependence analysis, which extends the framework given here to dependences between loop iterations.
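
For example, whether a loop's iterations can safely run in parallel depends on whether any dependence crosses iterations. A small Python illustration, assumed here for exposition rather than taken from the original article:

a = [1] * 8
for i in range(1, len(a)):
    # Loop-carried flow dependence: iteration i reads the value that
    # iteration i-1 wrote, so the iterations must run in order.
    a[i] = a[i - 1] + 1

b = [1] * 8
for i in range(len(b)):
    # No cross-iteration dependence: each iteration touches only b[i],
    # so the iterations could run in parallel.
    b[i] = 2 * b[i]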
