Register allocation

Last updated

In compiler optimization, register allocation is the process of assigning local automatic variables and expression results to a limited number of processor registers.

Contents

Register allocation can happen over a basic block (local register allocation), over a whole function/procedure (global register allocation), or across function boundaries traversed via call-graph (interprocedural register allocation). When done per function/procedure the calling convention may require insertion of save/restore around each call-site.

Context

Principle

Different number of scalar registers in the most common architectures
Architecture 32 bit 64 bit
ARM 15 31
Intel x86 8 16
MIPS 32 32
POWER/PowerPC 32 32
RISC-V 16/32 32
SPARC 31 31


In many programming languages, the programmer may use any number of variables. The computer can quickly read and write registers in the CPU, so the computer program runs faster when more variables can be in the CPU's registers. [1] Also, sometimes code accessing registers is more compact, so the code is smaller, and can be fetched faster if it uses registers rather than memory. However, the number of registers is limited. Therefore, when the compiler is translating code to machine-language, it must decide how to allocate variables to the limited number of registers in the CPU. [2] [3]

Not all variables are in use (or "live") at the same time, so, over the lifetime of a program, a given register may be used to hold different variables. However, two variables in use at the same time cannot be assigned to the same register without corrupting one of the variables. If there are not enough registers to hold all the variables, some variables may be moved to and from RAM. This process is called "spilling" the registers. [4] Over the lifetime of a program, a variable can be both spilled and stored in registers: this variable is then considered as "split". [5] Accessing RAM is significantly slower than accessing registers [6] and so a compiled program runs slower. Therefore, an optimizing compiler aims to assign as many variables to registers as possible. A high "Register pressure" is a technical term that means that more spills and reloads are needed; it is defined by Braun et al. as "the number of simultaneously live variables at an instruction". [7]

In addition, some computer designs cache frequently-accessed registers. So, programs can be further optimized by assigning the same register to a source and destination of a move instruction whenever possible. This is especially important if the compiler is using an intermediate representation such as static single-assignment form (SSA). In particular, when SSA is not fully optimized it can artificially generate additional move instructions.

Components of register allocation

Register allocation consists therefore of choosing where to store the variables at runtime, i.e. inside or outside registers. If the variable is to be stored in registers, then the allocator needs to determine in which register(s) this variable will be stored. Eventually, another challenge is to determine the duration for which a variable should stay at the same location.

A register allocator, disregarding the chosen allocation strategy, can rely on a set of core actions to address these challenges. These actions can be gathered in several different categories: [8]

Move insertion
This action consists of increasing the number of move instructions between registers, i.e. make a variable live in different registers during its lifetime, instead of one. This occurs in the split live range approach.
Spilling
This action consists of storing a variable into memory instead of registers. [9]
Assignment
This action consists of assigning a register to a variable. [10]
Coalescing
This action consists of limiting the number of moves between registers, thus limiting the total number of instructions. For instance, by identifying a variable live across different methods, and storing it into one register during its whole lifetime. [9]

Many register allocation approaches optimize for one or more specific categories of actions.

Intel 386 registers Registers CPU i386.png
Intel 386 registers

Common problems raised in register allocation

Register allocation raises several problems that can be tackled (or avoided) by different register allocation approaches. Three of the most common problems are identified as follows:

Aliasing
In some architectures, assigning a value to one register can affect the value of another: this is called aliasing. For example, the x86 architecture has four general purpose 32-bit registers that can also be used as 16-bit or 8-bit registers. [11] In this case, assigning a 32-bit value to the eax register will affect the value of the al register.
Pre-coloring
This problem is an act to force some variables to be assigned to particular registers. For example, in PowerPC calling conventions, parameters are commonly passed in R3-R10 and the return value is passed in R3. [12]
NP-Problem
Chaitin et al. showed that register allocation is an NP-complete problem. They reduce the graph coloring problem to the register allocation problem by showing that for an arbitrary graph, a program can be constructed such that the register allocation for the program (with registers representing nodes and machine registers representing available colors) would be a coloring for the original graph. As Graph Coloring is an NP-Hard problem and Register Allocation is in NP, this proves the NP-completeness of the problem. [13]

Register allocation techniques

Register allocation can happen over a basic block of code: it is said to be "local", and was first mentioned by Horwitz et al. [14] As basic blocks do not contain branches, the allocation process is thought to be fast, because the management of control-flow graph merge points in register allocation reveals itself[ clarification needed ] a time-consuming operation. [15] However, this approach is thought not to produce as optimized code as the "global" approach, which operates over the whole compilation unit (a method or procedure for instance). [16]

Graph-coloring allocation

Graph-coloring allocation is the predominant approach to solve register allocation. [17] [18] It was first proposed by Chaitin et al. [4] In this approach, nodes in the graph represent live ranges (variables, temporaries, virtual/symbolic registers) that are candidates for register allocation. Edges connect live ranges that interfere, i.e., live ranges that are simultaneously live at at least one program point. Register allocation then reduces to the graph coloring problem in which colors (registers) are assigned to the nodes such that two nodes connected by an edge do not receive the same color. [19]

Using liveness analysis, an interference graph can be built. The interference graph, which is an undirected graph where the nodes are the program's variables, is used to model which variables cannot be allocated to the same register. [20]

Principle

The main phases in a Chaitin-style graph-coloring register allocator are: [18]

Chaitin et al.'s iterative graph coloring based register allocator Chaitin et al.'s iteravite graph coloring based register allocator.png
Chaitin et al.'s iterative graph coloring based register allocator
  1. Renumber: discover live range information in the source program.
  2. Build: build the interference graph.
  3. Coalesce: merge the live ranges of non-interfering variables related by copy instructions.
  4. Spill cost: compute the spill cost of each variable. This assesses the impact of mapping a variable to memory on the speed of the final program.
  5. Simplify: construct an ordering of the nodes in the inferences graph
  6. Spill Code: insert spill instructions, i.e. loads and stores to commute values between registers and memory.
  7. Select: assign a register to each variable.

Drawbacks and further improvements

The graph-coloring allocation has three major drawbacks. First, it relies on graph-coloring, which is an NP-complete problem, to decide which variables are spilled. Finding a minimal coloring graph is indeed an NP-complete problem. [21] Second, unless live-range splitting is used, evicted variables are spilled everywhere: store instructions are inserted as early as possible, i.e., just after variable definitions; load instructions are respectively inserted late, just before variable use. Third, a variable that is not spilled is kept in the same register throughout its whole lifetime. [22]

On the other hand, a single register name may appear in multiple register classes, where a class is a set of register names that are interchangeable in a particular role. Then, multiple register names may be aliases for a single hardware register. [23] Finally, graph coloring is an aggressive technique for allocating registers, but is computationally expensive due to its use of the interference graph, which can have a worst-case size that is quadratic in the number of live ranges. [24] The traditional formulation of graph-coloring register allocation implicitly assumes a single bank of non-overlapping general-purpose registers and does not handle irregular architectural features like overlapping registers pairs, special purpose registers and multiple register banks. [25]

One later improvement of Chaitin-style graph-coloring approach was found by Briggs et al.: it is called conservative coalescing. This improvement adds a criterion to decide when two live ranges can be merged. Mainly, in addition to the non-interfering requirements, two variables can only be coalesced if their merging will not cause further spilling. Briggs et al. introduces a second improvement to Chaitin's works which is biased coloring. Biased coloring tries to assign the same color in the graph-coloring to live range that are copy related. [18]

Linear scan

Linear scan is another global register allocation approach. It was first proposed by Poletto et al. in 1999. [26] In this approach, the code is not turned into a graph. Instead, all the variables are linearly scanned to determine their live range, represented as an interval. Once the live ranges of all variables have been figured out, the intervals are traversed chronologically. Although this traversal could help identifying variables whose live ranges interfere, no interference graph is being built and the variables are allocated in a greedy way. [24]

The motivation for this approach is speed; not in terms of execution time of the generated code, but in terms of time spent in code generation. Typically, the standard graph coloring approaches produce quality code, but have a significant overhead, [27] [28] the used graph coloring algorithm having a quadratic cost. [29] Owing to this feature, linear scan is the approach currently used in several JIT compilers, like the Hotspot client compiler, V8, Jikes RVM, [5] and the Android Runtime (ART). [30] The Hotspot server compiler uses graph coloring for its superior code. [31]

Pseudocode

This describes the algorithm as first proposed by Poletto et al., [32] where:

  • R is the number of available registers.
  • active is the list, sorted in order of increasing end point, of live intervals overlapping the current point and placed in registers.
LinearScanRegisterAllocation     active ← {}     for each live interval i, in order of increasing start point do         ExpireOldIntervals(i)         if length(active) = R then             SpillAtInterval(i)         else             register[i] ← a register removed from pool of free registers             add i to active, sorted by increasing end point  ExpireOldIntervals(i)for each interval jin active, in order of increasing end point doif endpoint[j] ≥ startpoint[i] thenreturn          remove j from active         add register[j] to pool of free registers  SpillAtInterval(i)     spill ← last interval in active     if endpoint[spill] > endpoint[i] then         register[i] ← register[spill]         location[spill] ← new stack location         remove spill from active         add i to active, sorted by increasing end point     else         location[i] ← new stack location

Drawbacks and further improvements

However, the linear scan presents two major drawbacks. First, due to its greedy aspect, it does not take lifetime holes into account, i.e. "ranges where the value of the variable is not needed". [33] [34] Besides, a spilled variable will stay spilled for its entire lifetime.

Shorter live ranges with SSA approach Shorter live ranges with SSA approach.png
Shorter live ranges with SSA approach

Many other research works followed up on the Poletto's linear scan algorithm. Traub et al., for instance, proposed an algorithm called second-chance binpacking aiming at generating code of better quality. [35] [36] In this approach, spilled variables get the opportunity to be stored later in a register by using a different heuristic from the one used in the standard linear scan algorithm. Instead of using live intervals, the algorithm relies on live ranges, meaning that if a range needs to be spilled, it is not necessary to spill all the other ranges corresponding to this variable.

Linear scan allocation was also adapted to take advantage from the SSA form: the properties of this intermediate representation simplify the allocation algorithm and allow lifetime holes to be computed directly. [37] First, the time spent in data-flow graph analysis, aimed at building the lifetime intervals, is reduced, namely because variables are unique. [38] It consequently produces shorter live intervals, because each new assignment corresponds to a new live interval. [39] [40] To avoid modeling intervals and liveness holes, Rogers showed a simplification called future-active sets that successfully removed intervals for 80% of instructions. [41]

Rematerialization

Coalescing

In the context of register allocation, coalescing is the act of merging variable-to-variable move operations by allocating those two variables to the same location. The coalescing operation takes place after the interference graph is built. Once two nodes have been coalesced, they must get the same color and be allocated to the same register, once the copy operation becomes unnecessary. [42]

Doing coalescing might have both positive and negative impacts on the colorability of the interference graph. [9] For example, one negative impact that coalescing could have on graph inference colorability is when two nodes are coalesced, as the result node will have a union of the edges of those being coalesced. [9] A positive impact of coalescing on inference graph colorability is, for example, when a node interferes with both nodes being coalesced, the degree of the node is reduced by one which leads to improving the overall colorability of the interference graph. [43]

There are several coalescing heuristics available: [44]

Aggressive coalescing
it was first introduced by Chaitin original register allocator. This heuristic aims at coalescing any non-interfering, copy-related nodes. [45] From the perspective of copy elimination, this heuristic has the best results. [46] On the other hand, aggressive coalescing could impact the colorability of the inference graph. [43]
Conservative Coalescing
it mainly uses the same heuristic as aggressive coalescing but it merges moves if, and only if, it does not compromise the colorability of the interference graph. [47]
Iterated coalescing
it removes one particular move at the time, while keeping the colorability of the graph. [48]
Optimistic coalescing
it is based on aggressive coalescing, but if the inference graph colorability is compromised, then it gives up as few moves as possible. [49]

Mixed approaches

Hybrid allocation

Some other register allocation approaches do not limit to one technique to optimize register's use. Cavazos et al., for instance, proposed a solution where it is possible to use both the linear scan and the graph coloring algorithms. [50] In this approach, the choice between one or the other solution is determined dynamically: first, a machine learning algorithm is used "offline", that is to say not at runtime, to build a heuristic function that determines which allocation algorithm needs to be used. The heuristic function is then used at runtime; in light of the code behavior, the allocator can then choose between one of the two available algorithms. [51]

Trace register allocation is a recent approach developed by Eisl et al. [3] [5] This technique handles the allocation locally: it relies on dynamic profiling data to determine which branches will be the most frequently used in a given control flow graph. It then infers a set of "traces" (i.e. code segments) in which the merge point is ignored in favor of the most used branch. Each trace is then independently processed by the allocator. This approach can be considered as hybrid because it is possible to use different register allocation algorithms between the different traces. [52]

Split allocation

Split allocation is another register allocation technique that combines different approaches, usually considered as opposite. For instance, the hybrid allocation technique can be considered as split because the first heuristic building stage is performed offline, and the heuristic use is performed online. [24] In the same fashion, B. Diouf et al. proposed an allocation technique relying both on offline and online behaviors, namely static and dynamic compilation. [53] [54] During the offline stage, an optimal spill set is first gathered using Integer Linear Programming. Then, live ranges are annotated using the compressAnnotation algorithm which relies on the previously identified optimal spill set. Register allocation is performed afterwards during the online stage, based on the data collected in the offline phase. [55]

In 2007, Bouchez et al. suggested as well to split the register allocation in different stages, having one stage dedicated to spilling, and one dedicated to coloring and coalescing. [56]

Comparison between the different techniques

Several metrics have been used to assess the performance of one register allocation technique against the other. Register allocation has typically to deal with a trade-off between code quality, i.e. code that executes quickly, and analysis overhead, i.e. the time spent determining analyzing the source code to generate code with optimized register allocation. From this perspective, execution time of the generated code and time spent in liveness analysis are relevant metrics to compare the different techniques. [57]

Once relevant metrics have been chosen, the code on which the metrics will be applied should be available and relevant to the problem, either by reflecting the behavior of real-world application, or by being relevant to the particular problem the algorithm wants to address. The more recent articles about register allocation uses especially the Dacapo benchmark suite. [58]

See also

Related Research Articles

<span class="mw-page-title-main">Gregory Chaitin</span> Argentine-American mathematician

Gregory John Chaitin is an Argentine-American mathematician and computer scientist. Beginning in the late 1960s, Chaitin made contributions to algorithmic information theory and metamathematics, in particular a computer-theoretic result equivalent to Gödel's incompleteness theorem. He is considered to be one of the founders of what is today known as algorithmic complexity together with Andrei Kolmogorov and Ray Solomonoff. Along with the works of e.g. Solomonoff, Kolmogorov, Martin-Löf, and Leonid Levin, algorithmic information theory became a foundational part of theoretical computer science, information theory, and mathematical logic. It is a common subject in several computer science curricula. Besides computer scientists, Chaitin's work draws attention of many philosophers and mathematicians to fundamental problems in mathematical creativity and digital philosophy.

An optimizing compiler is a compiler designed to generate code that is optimized in aspects such as minimizing program execution time, memory use, storage size, and power consumption. Optimization is generally implemented as a sequence of optimizing transformations, algorithms that transform code to produce semantically equivalent code optimized for some aspect.

<span class="mw-page-title-main">Control-flow graph</span> Graphical representation of a computer program or algorithm

In computer science, a control-flow graph (CFG) is a representation, using graph notation, of all paths that might be traversed through a program during its execution. The control-flow graph was discovered by Frances E. Allen, who noted that Reese T. Prosser used boolean connectivity matrices for flow analysis before.

<span class="mw-page-title-main">Interval graph</span> Intersection graph for intervals on the real number line

In graph theory, an interval graph is an undirected graph formed from a set of intervals on the real line, with a vertex for each interval and an edge between vertices whose intervals intersect. It is the intersection graph of the intervals.

In compiler design, static single assignment form is a type of intermediate representation (IR) where each variable is assigned exactly once. SSA is used in most high-quality optimizing compilers for imperative languages, including LLVM, the GNU Compiler Collection, and many commercial compilers.

<span class="mw-page-title-main">Dominator (graph theory)</span> When every path in a control-flow graph must go through one node to reach another

In computer science, a node d of a control-flow graph dominates a node n if every path from the entry node to n must go through d. Notationally, this is written as d dom n. By definition, every node dominates itself.

<span class="mw-page-title-main">Graph coloring</span> Methodic assignment of colors to elements of a graph

In graph theory, graph coloring is a special case of graph labeling; it is an assignment of labels traditionally called "colors" to elements of a graph subject to certain constraints. In its simplest form, it is a way of coloring the vertices of a graph such that no two adjacent vertices are of the same color; this is called a vertex coloring. Similarly, an edge coloring assigns a color to each edge so that no two adjacent edges are of the same color, and a face coloring of a planar graph assigns a color to each face or region so that no two faces that share a boundary have the same color.

The Fulkerson Prize for outstanding papers in the area of discrete mathematics is sponsored jointly by the Mathematical Optimization Society (MOS) and the American Mathematical Society (AMS). Up to three awards of $1,500 each are presented at each (triennial) International Symposium of the MOS. Originally, the prizes were paid out of a memorial fund administered by the AMS that was established by friends of the late Delbert Ray Fulkerson to encourage mathematical excellence in the fields of research exemplified by his work. The prizes are now funded by an endowment administered by MPS.

In computer science, rematerialization or remat is a compiler optimization which saves time by recomputing a value instead of loading it from memory. It is typically tightly integrated with register allocation, where it is used as an alternative to spilling registers to memory. It was conceived by Gregory Chaitin, Marc Auslander, Ashok Chandra, John Cocke, Martin Hopkins and Peter Markstein and implemented in the Pl.8 compiler for the 801 Minicomputer in the late 1970s. Later improvements were made by Preston Briggs, Keith D. Cooper, and Linda Torczon in 1992.

Chaitin's algorithm is a bottom-up, graph coloring register allocation algorithm that uses cost/degree as its spill metric. It is named after its designer, Gregory Chaitin. Chaitin's algorithm was the first register allocation algorithm that made use of coloring of the interference graph for both register allocations and spilling.

In computer science, instruction selection is the stage of a compiler backend that transforms its middle-level intermediate representation (IR) into a low-level IR. In a typical compiler, instruction selection precedes both instruction scheduling and register allocation; hence its output IR has an infinite set of pseudo-registers and may still be – and typically is – subject to peephole optimization. Otherwise, it closely resembles the target machine code, bytecode, or assembly language.

In software engineering, profiling is a form of dynamic program analysis that measures, for example, the space (memory) or time complexity of a program, the usage of particular instructions, or the frequency and duration of function calls. Most commonly, profiling information serves to aid program optimization, and more specifically, performance engineering.

In graph theory, the treewidth of an undirected graph is an integer number which specifies, informally, how far the graph is from being a tree. The smallest treewidth is 1; the graphs with treewidth 1 are exactly the trees and the forests. An example of graphs with treewidth at most 2 are the series–parallel graphs. The maximal graphs with treewidth exactly k are called k-trees, and the graphs with treewidth at most k are called partial k-trees. Many other well-studied graph families also have bounded treewidth.

In compiler optimization, escape analysis is a method for determining the dynamic scope of pointers – where in the program a pointer can be accessed. It is related to pointer analysis and shape analysis.

Thread Level Speculation (TLS), also known as Speculative Multi-threading, or Speculative Parallelization, is a technique to speculatively execute a section of computer code that is anticipated to be executed later in parallel with the normal execution on a separate independent thread. Such a speculative thread may need to make assumptions about the values of input variables. If these prove to be invalid, then the portions of the speculative thread that rely on these input variables will need to be discarded and squashed. If the assumptions are correct the program can complete in a shorter time provided the thread was able to be scheduled efficiently.

<span class="mw-page-title-main">Incremental computing</span> Software feature

Incremental computing, also known as incremental computation, is a software feature which, whenever a piece of data changes, attempts to save time by only recomputing those outputs which depend on the changed data. When incremental computing is successful, it can be significantly faster than computing new outputs naively. For example, a spreadsheet software package might use incremental computation in its recalculation features, to update only those cells containing formulas which depend on the changed cells.

<span class="mw-page-title-main">Greedy coloring</span> One-by-one assignment of colors to graph vertices

In the study of graph coloring problems in mathematics and computer science, a greedy coloring or sequential coloring is a coloring of the vertices of a graph formed by a greedy algorithm that considers the vertices of the graph in sequence and assigns each vertex its first available color. Greedy colorings can be found in linear time, but they do not, in general, use the minimum number of colors possible.

<span class="mw-page-title-main">History of compiler construction</span>

In computing, a compiler is a computer program that transforms source code written in a programming language or computer language, into another computer language. The most common reason for transforming source code is to create an executable program.

<span class="mw-page-title-main">Kathryn S. McKinley</span> American computer scientist

Kathryn S. McKinley is an American computer scientist noted for her research on compilers, runtime systems, and computer architecture. She is also known for her leadership in broadening participation in computing. McKinley was co-chair of CRA-W from 2011 to 2014.

<span class="mw-page-title-main">Sum coloring</span>

In graph theory, a sum coloring of a graph is a labeling of its vertices by positive integers, with no two adjacent vertices having equal labels, that minimizes the sum of the labels. The minimum sum that can be achieved is called the chromatic sum of the graph. Chromatic sums and sum coloring were introduced by Supowit in 1987 using non-graph-theoretic terminology, and first studied in graph theoretic terms by Ewa Kubicka in her 1989 doctoral thesis.

References

  1. Ditzel & McLellan 1982, p. 48.
  2. Runeson & Nyström 2003, p. 242.
  3. 1 2 Eisl et al. 2016, p. 14:1.
  4. 1 2 Chaitin et al. 1981, p. 47.
  5. 1 2 3 Eisl et al. 2016, p. 1.
  6. "Latency Comparison Numbers in computer/network". blog.morizyun.com. 6 January 2018. Retrieved 2019-01-08.
  7. Braun & Hack 2009, p. 174.
  8. Koes & Goldstein 2009, p. 21.
  9. 1 2 3 4 Bouchez, Darte & Rastello 2007b, p. 103.
  10. Colombet, Brandner & Darte 2011, p. 26.
  11. "Intel® 64 and IA-32 Architectures Software Developer's Manual, Section 3.4.1" (PDF). Intel. May 2019. Archived from the original (PDF) on 2019-05-25.
  12. "32-bit PowerPC function calling conventions".
  13. Bouchez, Darte & Rastello 2006, p. 4.
  14. Horwitz et al. 1966, p. 43.
  15. Farach & Liberatore 1998, p. 566.
  16. Eisl et al. 2017, p. 92.
  17. Eisl, Leopoldseder & Mössenböck 2018, p. 1.
  18. 1 2 3 Briggs, Cooper & Torczon 1992, p. 316.
  19. Poletto & Sarkar 1999, p. 896.
  20. Runeson & Nyström 2003, p. 241.
  21. Book 1975, p. 618–619.
  22. Colombet, Brandner & Darte 2011, p. 1.
  23. Smith, Ramsey & Holloway 2004, p. 277.
  24. 1 2 3 Cavazos, Moss & O’Boyle 2006, p. 124.
  25. Runeson & Nyström 2003, p. 240.
  26. Poletto & Sarkar 1999, p. 895.
  27. Poletto & Sarkar 1999, p. 902.
  28. Wimmer & Mössenböck 2005, p. 132.
  29. Johansson & Sagonas 2002, p. 102.
  30. The Evolution of ART - Google I/O 2016. Google. 25 May 2016. Event occurs at 3m47s.
  31. Paleczny, Vick & Click 2001, p. 1.
  32. Poletto & Sarkar 1999, p. 899.
  33. Eisl et al. 2016, p. 2.
  34. Traub, Holloway & Smith 1998, p. 143.
  35. Traub, Holloway & Smith 1998, p. 141.
  36. Poletto & Sarkar 1999, p. 897.
  37. Wimmer & Franz 2010, p. 170.
  38. Mössenböck & Pfeiffer 2002, p. 234.
  39. Mössenböck & Pfeiffer 2002, p. 233.
  40. Mössenböck & Pfeiffer 2002, p. 229.
  41. Rogers 2020.
  42. Chaitin 1982, p. 90.
  43. 1 2 Ahn & Paek 2009, p. 7.
  44. Park & Moon 2004, p. 736.
  45. Chaitin 1982, p. 99.
  46. Park & Moon 2004, p. 738.
  47. Briggs, Cooper & Torczon 1994, p. 433.
  48. George & Appel 1996, p. 212.
  49. Park & Moon 2004, p. 741.
  50. Eisl et al. 2017, p. 11.
  51. Cavazos, Moss & O’Boyle 2006, p. 124-127.
  52. Eisl et al. 2016, p. 4.
  53. Diouf et al. 2010, p. 66.
  54. Cohen & Rohou 2010, p. 1.
  55. Diouf et al. 2010, p. 72.
  56. Bouchez, Darte & Rastello 2007a, p. 1.
  57. Poletto & Sarkar 1999, p. 901-910.
  58. Blackburn et al. 2006, p. 169.
  59. Flajolet, Raoult & Vuillemin 1979.

Sources