Pseudo-Boolean function

In mathematics and optimization, a pseudo-Boolean function is a function of the form

f : B^n → ℝ,

where B = {0, 1} is a Boolean domain and n is a nonnegative integer called the arity of the function. A Boolean function is then a special case, where the values are also restricted to 0 or 1.

Representations

Any pseudo-Boolean function can be written uniquely as a multi-linear polynomial: [1] [2]

f(x_1, …, x_n) = a + Σ_i a_i x_i + Σ_{i<j} a_{ij} x_i x_j + Σ_{i<j<k} a_{ijk} x_i x_j x_k + ⋯

The degree of the pseudo-Boolean function is simply the degree of the polynomial in this representation.
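
The coefficients of this multi-linear representation can be recovered from the function values by Möbius inversion over subsets of the variables. Below is a minimal Python sketch with illustrative (non-standard) function names and a brute-force cost of 2^n evaluations:

```python
from itertools import product, combinations

def multilinear_coefficients(f, n):
    """Recover the coefficients a_S of the unique multi-linear polynomial
    f(x) = sum_S a_S * prod_{i in S} x_i via Moebius inversion over subsets."""
    coeffs = {}
    for k in range(n + 1):
        for S in combinations(range(n), k):
            a_S = 0
            for m in range(len(S) + 1):
                for T in combinations(S, m):
                    x = [1 if i in T else 0 for i in range(n)]
                    a_S += (-1) ** (len(S) - len(T)) * f(x)
            coeffs[S] = a_S
    return coeffs

def evaluate(coeffs, x):
    """Evaluate the multi-linear polynomial sum_S a_S * prod_{i in S} x[i]."""
    return sum(a * int(all(x[i] == 1 for i in S)) for S, a in coeffs.items())

# Example: logical OR, which equals x1 + x2 - x1*x2 as a polynomial.
f_or = lambda x: max(x)
coeffs = multilinear_coefficients(f_or, 2)
print(coeffs)  # {(): 0, (0,): 1, (1,): 1, (0, 1): -1}
assert all(evaluate(coeffs, list(x)) == f_or(list(x)) for x in product([0, 1], repeat=2))
```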

In many settings (e.g., in Fourier analysis of pseudo-Boolean functions), a pseudo-Boolean function is instead viewed as a function f that maps {−1, 1}^n to ℝ. Again in this case we can uniquely write f as a multi-linear polynomial:

f(x) = Σ_{I ⊆ [n]} f̂(I) Π_{i ∈ I} x_i,

where the f̂(I) are the Fourier coefficients of f and [n] = {1, …, n}.
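
For small n these Fourier coefficients can be computed by direct averaging over {−1, 1}^n. The sketch below (illustrative names, exponential brute force) simply follows the definition f̂(I) = 2^{−n} Σ_x f(x) Π_{i ∈ I} x_i:

```python
from itertools import product, combinations
from math import prod

def fourier_coefficients(f, n):
    """f_hat(I) = 2^{-n} * sum over x in {-1,1}^n of f(x) * prod_{i in I} x_i."""
    cube = list(product([-1, 1], repeat=n))
    return {
        I: sum(f(x) * prod(x[i] for i in I) for x in cube) / 2 ** n
        for k in range(n + 1)
        for I in combinations(range(n), k)
    }

# f(x) = x1 * x2 has a single nonzero Fourier coefficient, f_hat({1, 2}) = 1.
print(fourier_coefficients(lambda x: x[0] * x[1], 2))
# {(): 0.0, (0,): 0.0, (1,): 0.0, (0, 1): 1.0}
```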

Optimization

Minimizing (or, equivalently, maximizing) a pseudo-Boolean function is NP-hard. This can easily be seen by formulating, for example, the maximum cut problem as maximizing a pseudo-Boolean function. [3]
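
For instance, a cut in a graph can be counted by the quadratic pseudo-Boolean term x_i + x_j − 2 x_i x_j per edge, so maximizing the resulting polynomial solves maximum cut. A small illustrative Python sketch (brute-force maximization, hypothetical helper name):

```python
from itertools import product

def maxcut_objective(edges):
    """Pseudo-Boolean objective whose maximum over {0,1}^n is the maximum cut:
    edge (i, j) contributes x_i + x_j - 2*x_i*x_j, which is 1 exactly when the
    two endpoints receive different labels."""
    return lambda x: sum(x[i] + x[j] - 2 * x[i] * x[j] for i, j in edges)

# Triangle graph: any bipartition cuts at most 2 of the 3 edges.
f = maxcut_objective([(0, 1), (1, 2), (0, 2)])
print(max(f(x) for x in product([0, 1], repeat=3)))  # 2
```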

Submodularity

The submodular set functions can be viewed as a special class of pseudo-Boolean functions, which is equivalent to the condition

f(x ∧ y) + f(x ∨ y) ≤ f(x) + f(y) for all x, y ∈ B^n,

where x ∧ y and x ∨ y denote the componentwise minimum and maximum of x and y, respectively.

This is an important class of pseudo-Boolean functions, because they can be minimized in polynomial time. Note that minimization of a submodular function is polynomially solvable independently of the presentation form (e.g., as a pseudo-Boolean polynomial), whereas maximization of a submodular function is NP-hard; see Alexander Schrijver (2000).
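
For small n, the condition above can be checked exhaustively. A minimal Python sketch (illustrative names, exponential brute force):

```python
from itertools import product

def is_submodular(f, n):
    """Brute-force check of f(x ∧ y) + f(x ∨ y) <= f(x) + f(y) over B^n."""
    cube = list(product([0, 1], repeat=n))
    for x in cube:
        for y in cube:
            meet = tuple(min(a, b) for a, b in zip(x, y))   # componentwise AND
            join = tuple(max(a, b) for a, b in zip(x, y))   # componentwise OR
            if f(meet) + f(join) > f(x) + f(y):
                return False
    return True

# The cut function of a graph is a classic submodular example;
# the monomial x1*x2 is not (it is supermodular).
edges = [(0, 1), (1, 2)]
cut = lambda x: sum(1 for i, j in edges if x[i] != x[j])
print(is_submodular(cut, 3))                    # True
print(is_submodular(lambda x: x[0] * x[1], 2))  # False
```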

Roof Duality

If f is a quadratic polynomial, a concept called roof duality can be used to obtain a lower bound for its minimum value. [3] Roof duality may also provide a partial assignment of the variables, indicating some of the values of a minimizer to the polynomial. Several different methods of obtaining lower bounds were developed only to later be shown to be equivalent to what is now called roof duality. [3]

Quadratizations

If the degree of f is greater than 2, one can always employ reductions to obtain an equivalent quadratic problem with additional variables. One possible reduction is

−x_1 x_2 x_3 = min_{z ∈ B} z (2 − x_1 − x_2 − x_3).

There are other possibilities, for example,

−x_1 x_2 x_3 = min_{z ∈ B} z (−x_1 + x_2 + x_3) − x_1 x_2 − x_1 x_3 + x_1.
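
Both identities can be verified mechanically by enumerating all assignments of (x_1, x_2, x_3); a short check in Python:

```python
from itertools import product

# First reduction:  -x1*x2*x3 == min over z in B of  z*(2 - x1 - x2 - x3)
# Second reduction: -x1*x2*x3 == min over z in B of  z*(-x1 + x2 + x3) - x1*x2 - x1*x3 + x1
for x1, x2, x3 in product([0, 1], repeat=3):
    target = -x1 * x2 * x3
    first = min(z * (2 - x1 - x2 - x3) for z in (0, 1))
    second = min(z * (-x1 + x2 + x3) for z in (0, 1)) - x1 * x2 - x1 * x3 + x1
    assert target == first == second
print("both reductions reproduce -x1*x2*x3 on all of B^3")
```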

Different reductions lead to different results. Take for example the following cubic polynomial: [4]

Using the first reduction followed by roof duality, we obtain a lower bound of −3 and no indication of how to assign the three variables. Using the second reduction, we obtain the (tight) lower bound of −2 and the optimal assignment of every variable.

Polynomial Compression Algorithms

Consider a pseudo-Boolean function f as a mapping from {−1, 1}^n to ℝ, written as a multi-linear polynomial as above, and assume that each coefficient f̂(I) is integral. Then, for an integer k, the problem P of deciding whether f attains a value greater than or equal to k is NP-complete. It is proved in [5] that in polynomial time we can either solve P or reduce the number of variables to a quantity bounded in terms of k alone. Let r be the degree of the above multi-linear polynomial for f. Then [5] also proved that in polynomial time we can either solve P or reduce the number of variables to a quantity bounded in terms of r and k.
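
For concreteness, problem P itself (not the polynomial compression of [5]) can be stated and brute-forced as follows; the coefficient encoding and names below are illustrative only:

```python
from itertools import product
from math import prod

def decide_P(coeffs, n, k):
    """Decide whether f(x) >= k for some x in {-1,1}^n, where coeffs maps
    index sets I (tuples of variable indices) to integral Fourier coefficients."""
    f = lambda x: sum(a * prod(x[i] for i in I) for I, a in coeffs.items())
    return any(f(x) >= k for x in product([-1, 1], repeat=n))

# f(x) = 1 + x1*x2 - 2*x2*x3 attains the value 4 (e.g. x = (1, 1, -1)), but never 5.
coeffs = {(): 1, (0, 1): 1, (1, 2): -2}
print(decide_P(coeffs, 3, 4))  # True
print(decide_P(coeffs, 3, 5))  # False
```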

Notes

  1. Hammer, P. L.; Rosenberg, I.; Rudeanu, S. (1963). "On the determination of the minima of pseudo-Boolean functions". Studii și cercetări matematice (in Romanian) (14): 359–364. ISSN 0039-4068.
  2. Hammer, Peter L.; Rudeanu, Sergiu (1968). Boolean Methods in Operations Research and Related Areas. Springer. ISBN 978-3-642-85825-3.
  3. Boros, E.; Hammer, P. L. (2002). "Pseudo-Boolean Optimization". Discrete Applied Mathematics. 123 (1–3): 155–225. doi:10.1016/S0166-218X(01)00341-9.
  4. Kahl, F.; Strandmark, P. (2011). "Generalized Roof Duality for Pseudo-Boolean Optimization". International Conference on Computer Vision.
  5. Crowston, R.; Fellows, M.; Gutin, G.; Jones, M.; Rosamond, F.; Thomassé, S.; Yeo, A. (2011). "Simultaneously Satisfying Linear Equations over GF(2): MaxLin2 and Max-r-Lin2 Parameterized Above Average". Proc. of FSTTCS 2011. arXiv:1104.1135.
