Parameterized complexity

In computer science, parameterized complexity is a branch of computational complexity theory that focuses on classifying computational problems according to their inherent difficulty with respect to multiple parameters of the input or output. The complexity of a problem is then measured as a function of those parameters. This allows the classification of NP-hard problems on a finer scale than in the classical setting, where the complexity of a problem is only measured as a function of the number of bits in the input. This appears to have been first demonstrated in Gurevich, Stockmeyer & Vishkin (1984). The first systematic work on parameterized complexity was done by Downey & Fellows (1999).

Under the assumption that P ≠ NP, there exist many natural problems that require superpolynomial running time when complexity is measured in terms of the input size only, but that are computable in a time that is polynomial in the input size and exponential or worse in a parameter k. Hence, if k is fixed at a small value and the growth of the function over k is relatively small, then such problems can still be considered "tractable" despite their traditional classification as "intractable".

The existence of efficient, exact, and deterministic solving algorithms for NP-complete, or otherwise NP-hard, problems is considered unlikely, if input parameters are not fixed; all known solving algorithms for these problems require time that is exponential (so in particular superpolynomial) in the total size of the input. However, some problems can be solved by algorithms that are exponential only in the size of a fixed parameter while polynomial in the size of the input. Such an algorithm is called a fixed-parameter tractable (FPT) algorithm, because the problem can be solved efficiently (i.e., in polynomial time) for constant values of the fixed parameter.

Problems in which some parameter k is fixed are called parameterized problems. A parameterized problem that allows for such an FPT algorithm is said to be a fixed-parameter tractable problem and belongs to the class FPT, and the early name of the theory of parameterized complexity was fixed-parameter tractability.

Many problems have the following form: given an object x and a nonnegative integer k, does x have some property that depends on k? For instance, for the vertex cover problem, the parameter can be the number of vertices in the cover. In many applications, for example when modelling error correction, one can assume the parameter to be "small" compared to the total input size. Then it is challenging to find an algorithm that is exponential only in k, and not in the input size.

In this way, parameterized complexity can be seen as two-dimensional complexity theory. This concept is formalized as follows:

A parameterized problem is a language L ⊆ Σ* × ℕ, where Σ is a finite alphabet. The second component is called the parameter of the problem.
A parameterized problem L is fixed-parameter tractable if the question "(x, k) ∈ L?" can be decided in running time f(k) · |x|^O(1), where f is an arbitrary function depending only on k. The corresponding complexity class is called FPT.

For example, there is an algorithm that solves the vertex cover problem in O(1.2738^k + k·n) time, [1] where n is the number of vertices and k is the size of the vertex cover. This means that vertex cover is fixed-parameter tractable with the size of the solution as the parameter.
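
A much simpler classical algorithm already places vertex cover in FPT: a bounded search tree that branches on an endpoint of an uncovered edge and runs in O(2^k · (n + m)) time. The sketch below (with illustrative names; this is not the algorithm of [1]) illustrates the idea.

```python
def has_vertex_cover(edges, k):
    """Decide whether the graph given by `edges` has a vertex cover of size <= k.

    Bounded search tree: pick any uncovered edge (u, v); at least one endpoint
    must be in the cover, so branch on taking u or taking v. The recursion depth
    is at most k with two branches per level, giving O(2^k) nodes, each with
    O(n + m) work -- exponential only in the parameter k.
    """
    if not edges:       # no edges left: the vertices chosen so far cover everything
        return True
    if k == 0:          # edges remain but the budget is exhausted
        return False
    u, v = edges[0]     # an arbitrary uncovered edge
    edges_without_u = [(a, b) for (a, b) in edges if u not in (a, b)]
    edges_without_v = [(a, b) for (a, b) in edges if v not in (a, b)]
    return has_vertex_cover(edges_without_u, k - 1) or has_vertex_cover(edges_without_v, k - 1)

# Example: a 4-cycle has a vertex cover of size 2 but not of size 1.
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
assert has_vertex_cover(cycle, 2) and not has_vertex_cover(cycle, 1)
```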

Complexity classes

FPT

FPT contains the fixed parameter tractable problems, which are those that can be solved in time f(k) · |x|^O(1) for some computable function f. Typically, this function is thought of as single exponential, such as 2^O(k), but the definition admits functions that grow even faster. This is essential for a large part of the early history of this class. The crucial part of the definition is to exclude functions of the form f(n, k), such as n^k.

The class FPL (fixed parameter linear) is the class of problems solvable in time f(k) · |x| for some computable function f. [2] FPL is thus a subclass of FPT. An example is the Boolean satisfiability problem, parameterised by the number of variables. A given formula of size m with k variables can be checked by brute force in time O(2^k · m). A vertex cover of size k in a graph of order n can be found in time O(2^k · n), so the vertex cover problem is also in FPL.
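
A minimal sketch of the brute-force satisfiability check, assuming (for concreteness) that the formula is given in CNF as a list of clauses with signed-integer literals; the function and variable names are illustrative.

```python
from itertools import product

def sat_brute_force(clauses, k):
    """Decide satisfiability of a CNF formula over k variables by trying all
    2^k assignments. A literal is a signed integer: +i means variable i,
    -i means its negation. Each assignment is checked in time linear in the
    formula size m, for O(2^k * m) in total.
    """
    for assignment in product([False, True], repeat=k):
        def literal_true(lit):
            value = assignment[abs(lit) - 1]
            return value if lit > 0 else not value
        if all(any(literal_true(lit) for lit in clause) for clause in clauses):
            return True
    return False

# Example: (x1 or x2) and (not x1 or x2) and (not x2 or x3) over k = 3 variables.
assert sat_brute_force([[1, 2], [-1, 2], [-2, 3]], 3)
```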

An example of a problem that is thought not to be in FPT is graph coloring parameterised by the number of colors. It is known that 3-coloring is NP-hard, and an algorithm for graph k-coloring in time f(k) · n^O(1) for k = 3 would run in polynomial time in the size of the input. Thus, if graph coloring parameterised by the number of colors were in FPT, then P = NP.

There are a number of alternative definitions of FPT. For example, the running-time requirement can be replaced by f(k) + |x|^O(1). Also, a parameterised problem is in FPT if it has a so-called kernel. Kernelization is a preprocessing technique that reduces the original instance to its "hard kernel", a possibly much smaller instance that is equivalent to the original instance but has a size that is bounded by a function in the parameter.
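
As an illustration, here is a minimal sketch (with illustrative names) of the classical kernelization for vertex cover usually credited to Buss: any vertex of degree greater than k must be in every cover of size at most k, and once no such vertex remains, a yes-instance can have at most k² edges.

```python
def vertex_cover_kernel(edges, k):
    """Reduce a vertex-cover instance (edges, k) to an equivalent kernel.

    Rule: a vertex of degree > k must belong to every cover of size <= k
    (otherwise all of its > k neighbours would have to be chosen), so take it
    into the cover, delete it, and decrease the budget. When the rule no longer
    applies, a yes-instance has at most k^2 edges, so the remaining instance
    has size bounded by a function of k alone. Returns (edges', k'), or None
    if the instance is already recognised as a no-instance.
    """
    edges = list(edges)
    while True:
        degree = {}
        for u, v in edges:
            degree[u] = degree.get(u, 0) + 1
            degree[v] = degree.get(v, 0) + 1
        high_degree = [v for v, d in degree.items() if d > k]
        if not high_degree:
            break
        w = high_degree[0]                       # w is forced into the cover
        edges = [(a, b) for (a, b) in edges if w not in (a, b)]
        k -= 1
        if k < 0:
            return None
    if len(edges) > k * k:                       # too many edges for any size-k cover
        return None
    return edges, k
```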

FPT is closed under a parameterised notion of reductions called fpt-reductions. Such reductions transform an instance (x, k) of some problem into an equivalent instance (x′, k′) of another problem (with k′ ≤ g(k)) and can be computed in time f(k) · p(|x|), where p is a polynomial.
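
A minimal concrete example (names are illustrative): the standard parameter-preserving reduction from Independent Set to Clique simply complements the graph and keeps the parameter unchanged, so k′ = k and the reduction runs in polynomial time.

```python
def independent_set_to_clique(n, edges, k):
    """fpt-reduction from Independent Set to Clique.

    A set of k vertices is independent in G exactly when it forms a clique in
    the complement of G, so the instance (G, k) maps to (complement of G, k).
    The new parameter depends only on the old one (here k' = k) and the
    transformation takes polynomial time.
    """
    edge_set = set(frozenset(e) for e in edges)
    complement_edges = [(u, v) for u in range(n) for v in range(u + 1, n)
                        if frozenset((u, v)) not in edge_set]
    return n, complement_edges, k
```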

Obviously, FPT contains all polynomial-time computable problems. Moreover, it contains all optimisation problems in NP that allow an efficient polynomial-time approximation scheme (EPTAS).

W hierarchy

The W hierarchy is a collection of computational complexity classes. A parameterized problem is in the class W[i] if every instance (x, k) can be transformed (in fpt-time) to a combinatorial circuit that has weft at most i, such that (x, k) is a yes-instance if and only if there is a satisfying assignment to the inputs that assigns 1 to exactly k inputs. The weft is the largest number of logical units with fan-in greater than two on any path from an input to the output. The total number of logical units on the paths (known as depth) must be limited by a constant that holds for all instances of the problem.

Note that FPT = W[0] and W[i] ⊆ W[j] for all i ≤ j. The classes in the W hierarchy are also closed under fpt-reduction.

A complete problem for W[i] is Weighted i-Normalized Satisfiability: [3] given a Boolean formula written as an AND of ORs of ANDs of ... of possibly negated variables, with i layers of ANDs or ORs (and i − 1 alternations between AND and OR), can it be satisfied by setting exactly k variables to 1?
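
For intuition about why such problems are not known to be fixed-parameter tractable, the obvious algorithm tries all of the roughly n^k ways of choosing which k variables are set to 1. A minimal sketch for the CNF case (i = 2), with illustrative names:

```python
from itertools import combinations

def weighted_cnf_sat(clauses, n, k):
    """Check whether a CNF formula over variables 1..n has a satisfying
    assignment of Hamming weight exactly k, by trying all C(n, k) = n^O(k)
    choices of which k variables are set to 1. A literal is a signed integer:
    +i means variable i, -i means its negation.
    """
    for ones in combinations(range(1, n + 1), k):
        chosen = set(ones)
        def literal_true(lit):
            return (lit in chosen) if lit > 0 else (-lit not in chosen)
        if all(any(literal_true(lit) for lit in clause) for clause in clauses):
            return True
    return False

# Example: (x1 or x2) and (x3 or x4) has a satisfying assignment of weight 2.
assert weighted_cnf_sat([[1, 2], [3, 4]], n=4, k=2)
```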

Many natural computational problems occupy the lower levels, W[1] and W[2].

W[1]

Examples of W[1]-complete problems include

  • deciding if a given graph contains a clique of size k
  • deciding if a given graph contains an independent set of size k
  • deciding if a given nondeterministic single-tape Turing machine accepts within k steps ("short Turing machine acceptance" problem). This also applies to nondeterministic Turing machines with f(k) tapes and even f(k) many f(k)-dimensional tapes; however, with the tape alphabet size restricted to f(k), the problem becomes fixed-parameter tractable. Crucially, the branching of the Turing machine at each step is allowed to depend on n, the size of the input. In this way, the Turing machine may explore n^O(k) computation paths.

W[2]

Examples of W[2]-complete problems include

  • deciding if a given graph contains a dominating set of size k
  • deciding if a given nondeterministic multi-tape Turing machine accepts within k steps ("short multi-tape Turing machine acceptance" problem). Crucially, the branching is allowed to depend on n (like the W[1] variant), as is the number of tapes. An alternate W[2]-complete formulation allows only single-tape Turing machines, but the alphabet size may depend on n.

W[t]

W[t] can be defined using the family of Weighted Weft-t-Depth-d SAT problems for d ≥ t: W[t, d] is the class of parameterized problems that fpt-reduce to this problem, and W[t] is the union of the classes W[t, d] over all d ≥ t.

Here, Weighted Weft-t-Depth-d SAT is the following problem:

  • Input: A Boolean formula of depth at most d and weft at most t, and a number k. The depth is the maximal number of gates on any path from the root to a leaf, and the weft is the maximal number of gates of fan-in at least three on any path from the root to a leaf.
  • Question: Does the formula have a satisfying assignment of Hamming weight exactly k?

It can be shown that for t ≥ 2 the problem Weighted t-Normalize SAT is complete for W[t] under fpt-reductions. [4] Here, Weighted t-Normalize SAT is the following problem:

  • Input: A Boolean formula of depth at most t with an AND-gate on top, and a number k.
  • Question: Does the formula have a satisfying assignment of Hamming weight exactly k?

W[P]

W[P] is the class of problems that can be decided by a nondeterministic h(k) · |x|^O(1)-time Turing machine that makes at most O(f(k) · log n) nondeterministic choices in the computation on (x, k) (a k-restricted Turing machine) (Flum & Grohe 2006).

It is known that FPT is contained in W[P], and the inclusion is believed to be strict. However, resolving this issue would imply a solution to the P versus NP problem.

Other connections to unparameterised computational complexity are that FPT equals W[P] if and only if circuit satisfiability can be decided in time exp(o(n)) · m^O(1), or if and only if there is a computable, nondecreasing, unbounded function f such that all languages recognised by a nondeterministic polynomial-time Turing machine using f(n) · log n nondeterministic choices are in P.

W[P] can be loosely thought of as the class of problems where we have a set S of n items, and we want to find a subset of size k such that a certain property holds. We can encode a choice as a list of k integers, stored in binary. Since the highest any of these numbers can be is n, ⌈log₂ n⌉ bits are needed for each number. Therefore O(k · log n) total bits are needed to encode a choice. Therefore we can select a subset with O(k · log n) nondeterministic choices.
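
A minimal sketch of this encoding (names are illustrative), packing a size-k subset of {0, …, n − 1} into k · ⌈log₂ n⌉ bits and recovering it again:

```python
from math import ceil, log2

def encode_choice(subset, n):
    """Pack a list of k indices drawn from range(n) into a bit string of
    length k * ceil(log2 n); each index uses the same fixed number of bits."""
    bits_per_index = max(1, ceil(log2(n)))
    return "".join(format(i, f"0{bits_per_index}b") for i in subset)

def decode_choice(bitstring, n):
    """Recover the list of indices from the packed bit string."""
    bits_per_index = max(1, ceil(log2(n)))
    return [int(bitstring[p:p + bits_per_index], 2)
            for p in range(0, len(bitstring), bits_per_index)]

# Example: a subset of size k = 3 out of n = 100 items needs 3 * 7 = 21 bits.
choice = [5, 42, 99]
packed = encode_choice(choice, 100)
assert len(packed) == 21 and decode_choice(packed, 100) == choice
```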

XP

XP is the class of parameterized problems that can be solved in time n^f(k) for some computable function f. These problems are called slicewise polynomial, in the sense that each "slice" of fixed k has a polynomial algorithm, although possibly with a different exponent for each k. Compare this with FPT, which merely allows a different constant prefactor for each value of k. XP contains FPT, and it is known that this containment is strict by diagonalization.
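
For example, deciding whether a graph has a clique of size k by checking every set of k vertices takes about n^k · k² time: polynomial for each fixed k, but with an exponent that grows with k. A minimal sketch with illustrative names:

```python
from itertools import combinations

def has_clique(n, edges, k):
    """Check for a clique of size k among vertices 0..n-1 by brute force over
    all C(n, k) vertex subsets: an n^O(k)-time, "slicewise polynomial"
    algorithm that is polynomial for every fixed k, with an exponent that
    grows with k."""
    adjacent = set(frozenset(e) for e in edges)
    for subset in combinations(range(n), k):
        if all(frozenset((u, v)) in adjacent for u, v in combinations(subset, 2)):
            return True
    return False

# Example: a triangle with a pendant vertex contains a 3-clique but no 4-clique.
triangle_plus = [(0, 1), (1, 2), (0, 2), (2, 3)]
assert has_clique(4, triangle_plus, 3) and not has_clique(4, triangle_plus, 4)
```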

para-NP

para-NP is the class of parameterized problems that can be solved by a nondeterministic algorithm in time f(k) · |x|^O(1) for some computable function f. It is known that FPT = para-NP if and only if P = NP. [5]

A problem is para-NP-hard if it is NP-hard already for a constant value of the parameter. That is, there is a "slice" of fixed k that is NP-hard. A parameterized problem that is para-NP-hard cannot belong to the class XP, unless P = NP. A classic example of a para-NP-hard parameterized problem is graph coloring, parameterized by the number k of colors, which is already NP-hard for k = 3 (see Graph coloring#Computational complexity).

A hierarchy

The A hierarchy is a collection of computational complexity classes similar to the W hierarchy. However, while the W hierarchy is a hierarchy contained in NP, the A hierarchy more closely mimics the polynomial-time hierarchy from classical complexity. It is known that A[1] = W[1] holds.

Notes

  1. Chen, Kanj & Xia (2006)
  2. Grohe (1999)
  3. Downey, Rod G.; Fellows, Michael R. (August 1995). "Fixed-Parameter Tractability and Completeness I: Basic Results". SIAM Journal on Computing. 24 (4): 873–921. doi:10.1137/S0097539792228228. ISSN 0097-5397.
  4. Buss, Jonathan F.; Islam, Tarique (2006). "Simplifying the weft hierarchy". Theoretical Computer Science. 351 (3): 303–313. doi:10.1016/j.tcs.2005.10.002.
  5. Flum & Grohe (2006), p. 39.
