In mathematics and optimization, a pseudo-Boolean function is a function of the form
$$f \colon \mathbf{B}^n \to \mathbb{R},$$
where B = {0, 1} is a Boolean domain and n is a nonnegative integer called the arity of the function. A Boolean function is then a special case, where the values are also restricted to 0 or 1.
Any pseudo-Boolean function can be written uniquely as a multi-linear polynomial: [1] [2]
$$f(\boldsymbol{x}) = a + \sum_i a_i x_i + \sum_{i<j} a_{ij} x_i x_j + \sum_{i<j<k} a_{ijk} x_i x_j x_k + \cdots$$
The degree of the pseudo-Boolean function is simply the degree of the polynomial in this representation.
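The coefficients of this representation can be recovered from the function's values by Möbius inversion over subsets of indices: $a_S = \sum_{T \subseteq S} (-1)^{|S| - |T|} f(\mathbf{1}_T)$, where $\mathbf{1}_T$ is the 0/1 indicator vector of $T$. The following Python sketch (the function names are ours, chosen for illustration) computes all coefficients of the multi-linear representation for small $n$:

```python
from itertools import combinations

def multilinear_coefficients(f, n):
    """Recover the coefficients a_S of the unique multi-linear polynomial
    of a pseudo-Boolean function f: {0,1}^n -> R via Moebius inversion:
        a_S = sum over T subset of S of (-1)^(|S|-|T|) * f(1_T).
    Exponential in n; intended only as an illustration."""
    def point(T):
        # 0/1 vector with ones exactly at the indices in T
        return tuple(1 if i in T else 0 for i in range(n))
    coeffs = {}
    for k in range(n + 1):
        for S in combinations(range(n), k):
            coeffs[S] = sum((-1) ** (len(S) - t) * f(point(T))
                            for t in range(len(S) + 1)
                            for T in combinations(S, t))
    return coeffs

# Example: f(x1, x2) = max(x1, x2) has multi-linear form x1 + x2 - x1*x2.
print(multilinear_coefficients(lambda x: max(x), 2))
# {(): 0, (0,): 1, (1,): 1, (0, 1): -1}
```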
In many settings (e.g., in Fourier analysis of pseudo-Boolean functions), a pseudo-Boolean function is viewed as a function $f$ that maps $\{-1,1\}^n$ to $\mathbb{R}$. Again in this case we can uniquely write $f$ as a multi-linear polynomial:
$$f(x) = \sum_{I \subseteq [n]} \hat{f}(I) \prod_{i \in I} x_i,$$
where $\hat{f}(I)$ are the Fourier coefficients of $f$ and $[n] = \{1, \ldots, n\}$.
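Here the Fourier coefficients admit an explicit formula, a consequence of the orthonormality of the monomials $\prod_{i \in I} x_i$ with respect to the uniform measure on $\{-1,1\}^n$:
$$\hat{f}(I) = \frac{1}{2^n} \sum_{x \in \{-1,1\}^n} f(x) \prod_{i \in I} x_i.$$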
Minimizing (or, equivalently, maximizing) a pseudo-Boolean function is NP-hard. This can easily be seen by formulating, for example, the maximum cut problem as maximizing a pseudo-Boolean function. [3]
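For instance, given a graph $G = (V, E)$, encode a cut by $x \in \mathbf{B}^V$, with $x_u = 1$ if and only if vertex $u$ lies on one fixed side of the cut. An edge $(u, v)$ is cut exactly when $x_u + x_v - 2 x_u x_v = 1$, so the maximum cut problem is the quadratic pseudo-Boolean maximization
$$\max_{x \in \mathbf{B}^V} \sum_{(u,v) \in E} (x_u + x_v - 2 x_u x_v).$$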
The submodular set functions can be viewed as a special class of pseudo-Boolean functions, which is equivalent to the condition
$$f(\boldsymbol{x}) + f(\boldsymbol{y}) \ge f(\boldsymbol{x} \wedge \boldsymbol{y}) + f(\boldsymbol{x} \vee \boldsymbol{y}), \qquad \forall \boldsymbol{x}, \boldsymbol{y} \in \mathbf{B}^n,$$
where $\wedge$ and $\vee$ denote componentwise minimum and maximum, respectively.
This is an important class of pseudo-Boolean functions, because they can be minimized in polynomial time. Note that minimization of a submodular function is polynomially solvable independently of the presentation form (e.g., as a pseudo-Boolean polynomial), whereas maximization of a submodular function is NP-hard (Alexander Schrijver, 2000).
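For small $n$, submodularity can be checked directly against the defining inequality; the following brute-force Python sketch (exponential in $n$, meant only as an illustration) does exactly that:

```python
from itertools import product

def is_submodular(f, n):
    """Brute-force check of f(x) + f(y) >= f(x meet y) + f(x join y)
    over all x, y in {0,1}^n (exponential; illustration only)."""
    points = list(product((0, 1), repeat=n))
    for x in points:
        for y in points:
            meet = tuple(a & b for a, b in zip(x, y))  # componentwise min
            join = tuple(a | b for a, b in zip(x, y))  # componentwise max
            if f(x) + f(y) < f(meet) + f(join):
                return False
    return True

# The cut function of a single edge, x1 + x2 - 2*x1*x2, is submodular:
print(is_submodular(lambda x: x[0] + x[1] - 2 * x[0] * x[1], 2))  # True
# The product x1*x2 is supermodular, hence not submodular:
print(is_submodular(lambda x: x[0] * x[1], 2))  # False
```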
If f is a quadratic polynomial, a concept called roof duality can be used to obtain a lower bound for its minimum value. [3] Roof duality may also provide a partial assignment of the variables, indicating some of the values of a minimizer of the polynomial. Several different methods for obtaining lower bounds were developed, only later to be shown equivalent to what is now called roof duality. [3]
If the degree of f is greater than 2, one can always employ reductions to obtain an equivalent quadratic problem with additional variables. One possible reduction is
$$-x_1 x_2 x_3 = \min_{z \in \mathbf{B}} z(2 - x_1 - x_2 - x_3).$$
There are other possibilities, for example,
$$-x_1 x_2 x_3 = \min_{z \in \mathbf{B}} z(-x_1 + x_2 + x_3) - x_1 x_2 - x_1 x_3 + x_1.$$
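Both identities can be verified mechanically by enumerating all assignments of the variables; a short Python check (written here purely as a sanity test):

```python
from itertools import product

# Verify both quadratizations of -x1*x2*x3 on all eight 0/1 assignments.
for x1, x2, x3 in product((0, 1), repeat=3):
    target = -x1 * x2 * x3
    first = min(z * (2 - x1 - x2 - x3) for z in (0, 1))
    second = min(z * (-x1 + x2 + x3) for z in (0, 1)) - x1*x2 - x1*x3 + x1
    assert target == first == second
print("both reductions agree with -x1*x2*x3")
```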
Different reductions lead to different results. Take for example the following cubic polynomial: [4]
$$f(x_1, x_2, x_3) = -2x_1 + x_2 - x_3 + 4x_1 x_2 + 4x_1 x_3 - 2x_2 x_3 - 2x_1 x_2 x_3.$$
Using the first reduction followed by roof duality, we obtain a lower bound of -3 and no indication on how to assign the three variables. Using the second reduction, we obtain the (tight) lower bound of -2 and the optimal assignment of every variable (which is $\mathbf{x} = (0, 1, 1)$).
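Since there are only eight assignments, the claimed optimum of the example above can be confirmed by brute force (using the polynomial exactly as stated):

```python
from itertools import product

def f(x1, x2, x3):
    return (-2*x1 + x2 - x3 + 4*x1*x2 + 4*x1*x3
            - 2*x2*x3 - 2*x1*x2*x3)

best = min(product((0, 1), repeat=3), key=lambda x: f(*x))
print(best, f(*best))  # (0, 1, 1) -2
```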
Consider a pseudo-Boolean function $f$ as a mapping from $\{0,1\}^n$ to $\mathbb{R}$, written as the multi-linear polynomial above, and assume that each coefficient is integral. For an integer $k$, the problem P of deciding whether $f(x) \ge k$ for some $x \in \{0,1\}^n$ is NP-complete. It is proved in [5] that in polynomial time we can either solve P or reduce the number of variables to a polynomial in $k$. Let $r$ be the degree of the above multi-linear polynomial for $f$. It is further proved in [5] that, for fixed $r$, in polynomial time we can either solve P or reduce the number of variables to a bound depending only on $k$ and $r$.