K-approximation of k-hitting set

Last updated

In computer science, k-approximation of k-hitting set is an approximation algorithm for weighted hitting set. The input is a collection S of subsets of some universe T and a mapping W from T to non-negative numbers called the weights of the elements of T. In k-hitting set the size of the sets in S cannot be larger than k. That is, . The problem is now to pick some subset T' of T such that every set in S contains some element of T', and such that the total weight of all elements in T' is as small as possible.

Computer science Study of the theoretical foundations of information and computation

Computer science is the study of processes that interact with data and that can be represented as data in the form of programs. It enables the use of algorithms to manipulate, store, and communicate digital information. A computer scientist studies the theory of computation and the practice of designing software systems.

In computer science and operations research, approximation algorithms are efficient algorithms that find approximate solutions to NP-hard optimization problems with provable guarantees on the distance of the returned solution to the optimal one. Approximation algorithms naturally arise in the field of theoretical computer science as a consequence of the widely believed P ≠ NP conjecture. Under this conjecture, a wide class of optimization problems cannot be solved exactly in polynomial time. The field of approximation algorithms, therefore, tries to understand how closely it is possible to approximate optimal solutions to such problems in polynomial time. In an overwhelming majority of the cases, the guarantee of such algorithms is a multiplicative one expressed as an approximation ratio or approximation factor i.e., the optimal solution is always guaranteed to be within a (predetermined) multiplicative factor of the returned solution. However, there are also many approximation algorithms that provide an additive guarantee on the quality of the returned solution. A notable example of an approximation algorithm that provides both is the classic approximation algorithm of Lenstra, Shmoys and Tardos for Scheduling on Unrelated Parallel Machines.

In mathematics, a set A is a subset of a set B, or equivalently B is a superset of A, if A is "contained" inside B, that is, all elements of A are also elements of B. A and B may coincide. The relationship of one set being a subset of another is called inclusion or sometimes containment. A is a subset of B may also be expressed as B includes A; or A is included in B.

Contents

The algorithm

For each set in S is maintained a price, , which is initially 0. For an element a in T, let S(a) be the collection of sets from S containing a. During the algorithm the following invariant is kept

We say that an element, a, from T is tight if . The main part of the algorithm consists of a loop: As long as there is a set in S that contains no element from T which is tight, the price of this set is increased as much as possible without violating the invariant above. When this loop exits, all sets contain some tight element. Pick all the tight elements to be the hitting set.

Correctness

The algorithm always terminates because in each iteration of the loop the price of some set in S is increased enough to make one more element from T tight. If it cannot increase any price, it exits. It runs in polynomial time because the loop will not make more iterations than the number of elements in the union of all the sets of S. It returns a hitting set, because when the loop exits, all sets in S contain a tight element from T, and the set of these tight elements are returned.

Note that for any hitting set T* and any prices where the invariant from the algorithm is true, the total weight of the hitting set is an upper limit to the sum over all members of T* of the sum of the prices of sets containing this element, that is: . This follows from the invariant property. Further, since the price of every set must occur at least once on the left hand side, we get . Especially, this property is true for the optimal hitting set.

Further, for the hitting set H returned from the algorithm above, we have . Since any price can appear at most k times in the left-hand side (since subsets of S can contain no more than k element from T) we get Combined with the previous paragraph we get , where T* is the optimal hitting set. This is exactly the guarantee that it is a k-approximation algorithm.

Relation to linear programming

This algorithm is an instance of the primal-dual method, also called the pricing method. The intuition is that it is dual to a linear programming algorithm. For an explanation see http://algo.inria.fr/seminars/sem00-01/vazirani.html.

In mathematics, a duality, generally speaking, translates concepts, theorems or mathematical structures into other concepts, theorems or structures, in a one-to-one fashion, often by means of an involution operation: if the dual of A is B, then the dual of B is A. Such involutions sometimes have fixed points, so that the dual of A is A itself. For example, Desargues' theorem is self-dual in this sense under the standard duality in projective geometry.

Linear programming

Linear programming is a method to achieve the best outcome in a mathematical model whose requirements are represented by linear relationships. Linear programming is a special case of mathematical programming.

Related Research Articles

In mathematical analysis and in probability theory, a σ-algebra on a set X is a collection Σ of subsets of X that includes X itself, is closed under complement, and is closed under countable unions.

Permutation Arrangements of a list or set

In mathematics, permutation is the act of arranging the members of a set into a sequence or order, or, if the set is already ordered, rearranging (reordering) its elements—a process called permuting. Permutations differ from combinations, which are selections of some members of a set regardless of order. For example, written as tuples, there are six permutations of the set {1,2,3}, namely: (1,2,3), (1,3,2), (2,1,3), (2,3,1), (3,1,2), and (3,2,1). These are all the possible orderings of this three-element set. Anagrams of words whose letters are different are also permutations: the letters are already ordered in the original word, and the anagram is a reordering of the letters. The study of permutations of finite sets is an important topic in the fields of combinatorics and group theory.

In order theory, a field of mathematics, an incidence algebra is an associative algebra, defined for every locally finite partially ordered set and commutative ring with unity. Subalgebras called reduced incidence algebras give a natural construction of various types of generating functions used in combinatorics and number theory.

Arithmetical hierarchy Hierarchy of complexity classes for formulas defining sets

In mathematical logic, the arithmetical hierarchy, arithmetic hierarchy or Kleene–Mostowski hierarchy classifies certain sets based on the complexity of formulas that define them. Any set that receives a classification is called arithmetical.

In mathematics, an invariant subspace of a linear mapping T : VV from some vector space V to itself is a subspace W of V that is preserved by T; that is, T(W) ⊆ W.

In computational complexity theory, the polynomial hierarchy is a hierarchy of complexity classes that generalize the classes P, NP and co-NP to oracle machines. It is a resource-bounded counterpart to the arithmetical hierarchy and analytical hierarchy from mathematical logic.

The set cover problem is a classical question in combinatorics, computer science, operations research, and complexity theory. It is one of Karp's 21 NP-complete problems shown to be NP-complete in 1972.

The spectrum of a linear operator that operates on a Banach space consists of all scalars such that the operator does not have a bounded inverse on . The spectrum has a standard decomposition into three parts:

In logic, a rule of inference is admissible in a formal system if the set of theorems of the system does not change when that rule is added to the existing rules of the system. In other words, every formula that can be derived using that rule is already derivable without that rule, so, in a sense, it is redundant. The concept of an admissible rule was introduced by Paul Lorenzen (1955).

Game theory is the branch of mathematics in which games are studied: that is, models describing human behaviour. This is a glossary of some terms of the subject.

In complexity theory, the Karp–Lipton theorem states that if the Boolean satisfiability problem (SAT) can be solved by Boolean circuits with a polynomial number of logic gates, then

In mathematics, more precisely in measure theory, an atom is a measurable set which has positive measure and contains no set of smaller positive measure. A measure which has no atoms is called non-atomic or atomless.

In mathematics, effective dimension is a modification of Hausdorff dimension and other fractal dimensions which places it in a computability theory setting. There are several horse Wang's variations of which the most common is effective Hausdorff dimension. Dimension, in mathematics, is a particular way of describing the size of an object. Hausdorff dimension generalizes the well-known integer dimensions assigned to points, lines, planes, etc. by allowing one to distinguish between objects of intermediate size between these integer-dimensional objects. For example, fractal subsets of the plane may have intermediate dimension between 1 and 2, as they are "larger" than lines or curves, and yet "smaller" than filled circles or rectangles. Effective dimension modifies Hausdorff dimension by requiring that objects with small effective dimension be not only small but also locatable in a computable sense. As such, objects with large Hausdorff dimension also have large effective dimension, and objects with small effective dimension have small Hausdorff dimension, but an object can have small Hausdorff but large effective dimension. An example is an algorithmically random point on a line, which has Hausdorff dimension 0 but effective dimension 1.

In computational learning theory, Rademacher complexity, named after Hans Rademacher, measures richness of a class of real-valued functions with respect to a probability distribution.

Uniform convergence in probability is a form of convergence in probability in statistical asymptotic theory and probability theory. It means that, under certain conditions, the empirical frequencies of all events in a certain event-family converge to their theoretical probabilities. Uniform convergence in probability has applications to statistics as well as machine learning as part of statistical learning theory.

Strength of a graph

In the branch of mathematics called graph theory, the strength of an undirected graph corresponds to the minimum ratio edges removed/components created in a decomposition of the graph in question. It is a method to compute partitions of the set of vertices and detect zones of high concentration of edges, and is analogous to graph toughness which is defined similarly for vertex removal.

In mathematics, low-rank approximation is a minimization problem, in which the cost function measures the fit between a given matrix and an approximating matrix, subject to a constraint that the approximating matrix has reduced rank. The problem is used for mathematical modeling and data compression. The rank constraint is related to a constraint on the complexity of a model that fits the data. In applications, often there are other constraints on the approximating matrix apart from the rank constraint, e.g., non-negativity and Hankel structure.

Wavelet Tree

The Wavelet Tree is a succinct data structure to store strings in compressed space. It generalizes the and operations defined on bitvectors to arbitrary alphabets.

The geometric set cover problem is the special case of the set cover problem in geometric settings. The input is a range space where is a universe of points in and is a family of subsets of called ranges, defined by the intersection of and geometric shapes such as disks and axis-parallel rectangles. The goal is to select a minimum-size subset of ranges such that every point in the universe is covered by some range in .

References

Jon Kleinberg American computer scientist

Jon Michael Kleinberg is an American computer scientist and the Tisch University Professor of Computer Science at Cornell University known for his work in algorithms and networks. He is a recipient of the Nevanlinna Prize by the International Mathematical Union.

Éva Tardos Hungarian mathematician

Éva Tardos is a Hungarian mathematician and the Jacob Gould Schurman Professor of Computer Science at Cornell University. Tardos's research interest is algorithms. Her work focuses on the design and analysis of efficient methods for combinatorial optimization problems on graphs or networks. She has done some work on network flow algorithms like approximation algorithms for network flows, cut, and clustering problems. Her recent work focuses on algorithmic game theory and simple auctions.

International Standard Book Number Unique numeric book identifier

The International Standard Book Number (ISBN) is a numeric commercial book identifier which is intended to be unique. Publishers purchase ISBNs from an affiliate of the International ISBN Agency.