In mathematics, a submodular set function (also known as a submodular function) is a set function that, informally, describes the relationship between a set of inputs and an output, where adding more of one input has a decreasing additional benefit (diminishing returns). The natural diminishing returns property which makes them suitable for many applications, including approximation algorithms, game theory (as functions modeling user preferences) and electrical networks. Recently, submodular functions have also found utility in several real world problems in machine learning and artificial intelligence, including automatic summarization, multi-document summarization, feature selection, active learning, sensor placement, image collection summarization and many other domains. [1] [2] [3] [4]
If is a finite set, a submodular function is a set function , where denotes the power set of , which satisfies one of the following equivalent conditions. [5]
A nonnegative submodular function is also a subadditive function, but a subadditive function need not be submodular. If is not assumed finite, then the above conditions are not equivalent. In particular a function defined by if is finite and if is infinite satisfies the first condition above, but the second condition fails when and are infinite sets with finite intersection.
A set function is monotone if for every we have that . Examples of monotone submodular functions include:
A submodular function that is not monotone is called non-monotone. In particular, a function is called non-monotone if it has the property that adding more elements to a set can decrease the value of the function. More formally, the function is non-monotone if there are sets in its domain s.t. and .
A non-monotone submodular function is called symmetric if for every we have that . Examples of symmetric non-monotone submodular functions include:
A non-monotone submodular function which is not symmetric is called asymmetric.
Often, given a submodular set function that describes the values of various sets, we need to compute the values of fractional sets. For example: we know that the value of receiving house A and house B is V, and we want to know the value of receiving 40% of house A and 60% of house B. To this end, we need a continuous extension of the submodular set function.
Formally, a set function with can be represented as a function on , by associating each with a binary vector such that when , and otherwise. A continuous extension of is a continuous function , that matches the value of on , i.e. .
Several kinds of continuous extensions of submodular functions are commonly used, which are described below.
This extension is named after mathematician László Lovász. [9] Consider any vector such that each . Then the Lovász extension is defined as
where the expectation is over chosen from the uniform distribution on the interval . The Lovász extension is a convex function if and only if is a submodular function.
Consider any vector such that each . Then the multilinear extension is defined as [10] [11] .
Intuitively, xi represents the probability that item i is chosen for the set. For every set S, the two inner products represent the probability that the chosen set is exactly S. Therefore, the sum represents the expected value of f for the set formed by choosing each item i at random with probability xi, independently of the other items.
Consider any vector such that each . Then the convex closure is defined as .
The convex closure of any set function is convex over .
Consider any vector such that each . Then the concave closure is defined as .
For the extensions discussed above, it can be shown that when is submodular. [12]
Submodular functions have properties which are very similar to convex and concave functions. For this reason, an optimization problem which concerns optimizing a convex or concave function can also be described as the problem of maximizing or minimizing a submodular function subject to some constraints.
The hardness of minimizing a submodular set function depends on constraints imposed on the problem.
Unlike the case of minimization, maximizing a generic submodular function is NP-hard even in the unconstrained setting. Thus, most of the works in this field are concerned with polynomial-time approximation algorithms, including greedy algorithms or local search algorithms.
Many of these algorithms can be unified within a semi-differential based framework of algorithms. [18]
Apart from submodular minimization and maximization, there are several other natural optimization problems related to submodular functions.
Submodular functions naturally occur in several real world applications, in economics, game theory, machine learning and computer vision. [4] [29] Owing to the diminishing returns property, submodular functions naturally model costs of items, since there is often a larger discount, with an increase in the items one buys. Submodular functions model notions of complexity, similarity and cooperation when they appear in minimization problems. In maximization problems, on the other hand, they model notions of diversity, information and coverage.
In computational mathematics, an iterative method is a mathematical procedure that uses an initial value to generate a sequence of improving approximate solutions for a class of problems, in which the i-th approximation is derived from the previous ones.
Linear programming (LP), also called linear optimization, is a method to achieve the best outcome in a mathematical model whose requirements and objective are represented by linear relationships. Linear programming is a special case of mathematical programming.
Big O notation is a mathematical notation that describes the limiting behavior of a function when the argument tends towards a particular value or infinity. Big O is a member of a family of notations invented by German mathematicians Paul Bachmann, Edmund Landau, and others, collectively called Bachmann–Landau notation or asymptotic notation. The letter O was chosen by Bachmann to stand for Ordnung, meaning the order of approximation.
A greedy algorithm is any algorithm that follows the problem-solving heuristic of making the locally optimal choice at each stage. In many problems, a greedy strategy does not produce an optimal solution, but a greedy heuristic can yield locally optimal solutions that approximate a globally optimal solution in a reasonable amount of time.
In the mathematical field of real analysis, the monotone convergence theorem is any of a number of related theorems proving the good convergence behaviour of monotonic sequences, i.e. sequences that are non-increasing, or non-decreasing. In its simplest form, it says that a non-decreasing bounded-above sequence of real numbers converges to its smallest upper bound, its supremum. Likewise, a non-increasing bounded-below sequence converges to its largest lower bound, its infimum. In particular, infinite sums of non-negative numbers converge to the supremum of the partial sums if and only if the partial sums are bounded.
In mathematics, Fatou's lemma establishes an inequality relating the Lebesgue integral of the limit inferior of a sequence of functions to the limit inferior of integrals of these functions. The lemma is named after Pierre Fatou.
In mathematics, a supermodular function is a function on a lattice that, informally, has the property of being characterized by "increasing differences." Seen from the point of set functions, this can also be viewed as a relationship of "increasing returns", where adding more elements to a subset increases its valuation. In economics, supermodular functions are often used as a formal expression of complementarity in preferences among goods. Supermodular functions are studied and have applications in game theory, economics, lattice theory, combinatorial optimization, and machine learning.
Convex optimization is a subfield of mathematical optimization that studies the problem of minimizing convex functions over convex sets. Many classes of convex optimization problems admit polynomial-time algorithms, whereas mathematical optimization is in general NP-hard.
In mathematics, a polymatroid is a polytope associated with a submodular function. The notion was introduced by Jack Edmonds in 1970. It is also a generalization of the notion of a matroid.
The Frank–Wolfe algorithm is an iterative first-order optimization algorithm for constrained convex optimization. Also known as the conditional gradient method, reduced gradient algorithm and the convex combination algorithm, the method was originally proposed by Marguerite Frank and Philip Wolfe in 1956. In each iteration, the Frank–Wolfe algorithm considers a linear approximation of the objective function, and moves towards a minimizer of this linear function.
Linear Programming Boosting (LPBoost) is a supervised classifier from the boosting family of classifiers. LPBoost maximizes a margin between training samples of different classes, and thus also belongs to the class of margin classifier algorithms.
In mathematics, the Lions–Lax–Milgram theorem is a result in functional analysis with applications in the study of partial differential equations. It is a generalization of the famous Lax–Milgram theorem, which gives conditions under which a bilinear function can be "inverted" to show the existence and uniqueness of a weak solution to a given boundary value problem. The result is named after the mathematicians Jacques-Louis Lions, Peter Lax and Arthur Milgram.
In mathematical optimization, linear-fractional programming (LFP) is a generalization of linear programming (LP). Whereas the objective function in a linear program is a linear function, the objective function in a linear-fractional program is a ratio of two linear functions. A linear program can be regarded as a special case of a linear-fractional program in which the denominator is the constant function 1.
The maximum coverage problem is a classical question in computer science, computational complexity theory, and operations research. It is a problem that is widely taught in approximation algorithms.
In mathematics, there are many kinds of inequalities involving matrices and linear operators on Hilbert spaces. This article covers some important operator inequalities connected with traces of matrices.
In mathematics, a subadditive set function is a set function whose value, informally, has the property that the value of function on the union of two sets is at most the sum of values of the function on each of the sets. This is thematically related to the subadditivity property of real-valued functions.
In the mathematical theory of probability, the drift-plus-penalty method is used for optimization of queueing networks and other stochastic systems.
In financial mathematics and stochastic optimization, the concept of risk measure is used to quantify the risk involved in a random outcome or risk position. Many risk measures have hitherto been proposed, each having certain characteristics. The entropic value at risk (EVaR) is a coherent risk measure introduced by Ahmadi-Javid, which is an upper bound for the value at risk (VaR) and the conditional value at risk (CVaR), obtained from the Chernoff inequality. The EVaR can also be represented by using the concept of relative entropy. Because of its connection with the VaR and the relative entropy, this risk measure is called "entropic value at risk". The EVaR was developed to tackle some computational inefficiencies of the CVaR. Getting inspiration from the dual representation of the EVaR, Ahmadi-Javid developed a wide class of coherent risk measures, called g-entropic risk measures. Both the CVaR and the EVaR are members of this class.
Quadratic pseudo-Boolean optimisation (QPBO) is a combinatorial optimization method for minimizing quadratic pseudo-Boolean functions in the form
The welfare maximization problem is an optimization problem studied in economics and computer science. Its goal is to partition a set of items among agents with different utility functions, such that the welfare – defined as the sum of the agents' utilities – is as high as possible. In other words, the goal is to find an item allocation satisfying the utilitarian rule.