Quadratic unconstrained binary optimization

Last updated

Quadratic unconstrained binary optimization (QUBO), also known as unconstrained binary quadratic programming (UBQP), is a combinatorial optimization problem with a wide range of applications from finance and economics to machine learning. [1] QUBO is an NP hard problem, and for many classical problems from theoretical computer science, like maximum cut, graph coloring and the partition problem, embeddings into QUBO have been formulated. [2] [3] Embeddings for machine learning models include support-vector machines, clustering and probabilistic graphical models. [4] Moreover, due to its close connection to Ising models, QUBO constitutes a central problem class for adiabatic quantum computation, where it is solved through a physical process called quantum annealing. [5]

Contents

Definition

The set of binary vectors of a fixed length is denoted by , where is the set of binary values (or bits). We are given a real-valued upper triangular matrix , whose entries define a weight for each pair of indices within the binary vector. We can define a function that assigns a value to each binary vector through

Intuitively, the weight is added if both and have value 1. When , the values are added if , as for all .

The QUBO problem consists of finding a binary vector that is minimal with respect to , namely

In general, is not unique, meaning there may be a set of minimizing vectors with equal value w.r.t. . The complexity of QUBO arises from the number of candidate binary vectors to be evaluated, as grows exponentially in .

Sometimes, QUBO is defined as the problem of maximizing, which is equivalent to minimizing .

Properties

QUBO is scale invariant for positive factors , which leave the optimum unchanged:

In its general form, QUBO is NP-hard and cannot be solved efficiently by any polynomial-time algorithm. [6] However, there are polynomially-solvable special cases, where has certain properties, [7] for example:

QUBO can be solved using integer linear programming solvers like CPLEX or Gurobi Optimizer. This is possible since QUBO can be reformulated as a linear constrained binary optimization problem. To achieve this, substitute the product by an additional binary variable and add the constraints , and . Note that can also be relaxed to continuous variables within the bounds zero and one.

Applications

QUBO is a structurally simple, yet computationally hard optimization problem. It can be used to encode a wide range of optimization problems from various scientific areas. [9]

Cluster Analysis

Binary Clustering with QUBO
Qubo-clustering-1.svg
A bad cluster assignment.
Qubo-clustering-2.svg
A good cluster assignment.
Visual representation of a clustering problem with 20 points: Circles of the same color belong to the same cluster. Each circle can be understood as a binary variable in the corresponding QUBO problem.

As an illustrative example of how QUBO can be used to encode an optimization problem, we consider the problem of cluster analysis. Here, we are given a set of 20 points in 2D space, described by a matrix , where each row contains two cartesian coordinates. We want to assign each point to one of two classes or clusters, such that points in the same cluster are similar to each other. For two clusters, we can assign a binary variable to the point corresponding to the -th row in , indicating whether it belongs to the first () or second cluster (). Consequently, we have 20 binary variables, which form a binary vector that corresponds to a cluster assignment of all points (see figure).

One way to derive a clustering is to consider the pairwise distances between points. Given a cluster assignment , one of or evaluates to 1 if points and are in the same cluster. Similarly, one of or indicates that they are in different clusters. Let denote the Euclidean distance between points and . In order to define a cost function to minimize, when points and are in the same cluster we add their positive distance , and subtract it when they are in different clusters. This way, an optimal solution tends to place points which are far apart into different clusters, and points that are close into the same cluster. The cost function thus comes down to

From the second line, the QUBO parameters can be easily found by re-arranging to be:

Using these parameters, the optimal QUBO solution will correspond to an optimal cluster w.r.t. above cost function.

Connection to Ising models

QUBO is very closely related and computationally equivalent to the Ising model, whose Hamiltonian function is defined as

with real-valued parameters for all . The spin variables are binary with values from instead of . Moreover, in the Ising model the variables are typically arranged in a lattice where only neighboring pairs of variables can have non-zero coefficients. Applying the identity yields an equivalent QUBO problem: [2]

where

and using the fact that for a binary variable .

As the constant does not change the position of the optimum , it can be neglected during optimization and is only important for recovering the original Hamiltonian function value.

Related Research Articles

In mathematics, particularly linear algebra, an orthonormal basis for an inner product space V with finite dimension is a basis for whose vectors are orthonormal, that is, they are all unit vectors and orthogonal to each other. For example, the standard basis for a Euclidean space is an orthonormal basis, where the relevant inner product is the dot product of vectors. The image of the standard basis under a rotation or reflection is also orthonormal, and every orthonormal basis for arises in this fashion.

The Fock space is an algebraic construction used in quantum mechanics to construct the quantum states space of a variable or unknown number of identical particles from a single particle Hilbert space H. It is named after V. A. Fock who first introduced it in his 1932 paper "Konfigurationsraum und zweite Quantelung".

<span class="mw-page-title-main">Helmholtz free energy</span> Thermodynamic potential

In thermodynamics, the Helmholtz free energy is a thermodynamic potential that measures the useful work obtainable from a closed thermodynamic system at a constant temperature (isothermal). The change in the Helmholtz energy during a process is equal to the maximum amount of work that the system can perform in a thermodynamic process in which temperature is held constant. At constant temperature, the Helmholtz free energy is minimized at equilibrium.

In quantum mechanics, perturbation theory is a set of approximation schemes directly related to mathematical perturbation for describing a complicated quantum system in terms of a simpler one. The idea is to start with a simple system for which a mathematical solution is known, and add an additional "perturbing" Hamiltonian representing a weak disturbance to the system. If the disturbance is not too large, the various physical quantities associated with the perturbed system can be expressed as "corrections" to those of the simple system. These corrections, being small compared to the size of the quantities themselves, can be calculated using approximate methods such as asymptotic series. The complicated system can therefore be studied based on knowledge of the simpler one. In effect, it is describing a complicated unsolved system using a simple, solvable system.

In mathematics, the Hodge star operator or Hodge star is a linear map defined on the exterior algebra of a finite-dimensional oriented vector space endowed with a nondegenerate symmetric bilinear form. Applying the operator to an element of the algebra produces the Hodge dual of the element. This map was introduced by W. V. D. Hodge.

A conformal field theory (CFT) is a quantum field theory that is invariant under conformal transformations. In two dimensions, there is an infinite-dimensional algebra of local conformal transformations, and conformal field theories can sometimes be exactly solved or classified.

<span class="mw-page-title-main">Stark effect</span> Spectral line splitting in electrical field

The Stark effect is the shifting and splitting of spectral lines of atoms and molecules due to the presence of an external electric field. It is the electric-field analogue of the Zeeman effect, where a spectral line is split into several components due to the presence of the magnetic field. Although initially coined for the static case, it is also used in the wider context to describe the effect of time-dependent electric fields. In particular, the Stark effect is responsible for the pressure broadening of spectral lines by charged particles in plasmas. For most spectral lines, the Stark effect is either linear or quadratic with a high accuracy.

In statistics and information theory, a maximum entropy probability distribution has entropy that is at least as great as that of all other members of a specified class of probability distributions. According to the principle of maximum entropy, if nothing is known about a distribution except that it belongs to a certain class, then the distribution with the largest entropy should be chosen as the least-informative default. The motivation is twofold: first, maximizing entropy minimizes the amount of prior information built into the distribution; second, many physical systems tend to move towards maximal entropy configurations over time.

Functional data analysis (FDA) is a branch of statistics that analyses data providing information about curves, surfaces or anything else varying over a continuum. In its most general form, under an FDA framework, each sample element of functional data is considered to be a random function. The physical continuum over which these functions are defined is often time, but may also be spatial location, wavelength, probability, etc. Intrinsically, functional data are infinite dimensional. The high intrinsic dimensionality of these data brings challenges for theory as well as computation, where these challenges vary with how the functional data were sampled. However, the high or infinite dimensional structure of the data is a rich source of information and there are many interesting challenges for research and data analysis.

In mathematical optimization, the Karush–Kuhn–Tucker (KKT) conditions, also known as the Kuhn–Tucker conditions, are first derivative tests for a solution in nonlinear programming to be optimal, provided that some regularity conditions are satisfied.

In physics, a sigma model is a field theory that describes the field as a point particle confined to move on a fixed manifold. This manifold can be taken to be any Riemannian manifold, although it is most commonly taken to be either a Lie group or a symmetric space. The model may or may not be quantized. An example of the non-quantized version is the Skyrme model; it cannot be quantized due to non-linearities of power greater than 4. In general, sigma models admit (classical) topological soliton solutions, for example, the Skyrmion for the Skyrme model. When the sigma field is coupled to a gauge field, the resulting model is described by Ginzburg–Landau theory. This article is primarily devoted to the classical field theory of the sigma model; the corresponding quantized theory is presented in the article titled "non-linear sigma model".

In linear algebra, a frame of an inner product space is a generalization of a basis of a vector space to sets that may be linearly dependent. In the terminology of signal processing, a frame provides a redundant, stable way of representing a signal. Frames are used in error detection and correction and the design and analysis of filter banks and more generally in applied mathematics, computer science, and engineering.

<span class="mw-page-title-main">Wrapped normal distribution</span>

In probability theory and directional statistics, a wrapped normal distribution is a wrapped probability distribution that results from the "wrapping" of the normal distribution around the unit circle. It finds application in the theory of Brownian motion and is a solution to the heat equation for periodic boundary conditions. It is closely approximated by the von Mises distribution, which, due to its mathematical simplicity and tractability, is the most commonly used distribution in directional statistics.

Coherent states have been introduced in a physical context, first as quasi-classical states in quantum mechanics, then as the backbone of quantum optics and they are described in that spirit in the article Coherent states. However, they have generated a huge variety of generalizations, which have led to a tremendous amount of literature in mathematical physics. In this article, we sketch the main directions of research on this line. For further details, we refer to several existing surveys.

<span class="mw-page-title-main">Matrix completion</span>

Matrix completion is the task of filling in the missing entries of a partially observed matrix, which is equivalent to performing data imputation in statistics. A wide range of datasets are naturally organized in matrix form. One example is the movie-ratings matrix, as appears in the Netflix problem: Given a ratings matrix in which each entry represents the rating of movie by customer , if customer has watched movie and is otherwise missing, we would like to predict the remaining entries in order to make good recommendations to customers on what to watch next. Another example is the document-term matrix: The frequencies of words used in a collection of documents can be represented as a matrix, where each entry corresponds to the number of times the associated term appears in the indicated document.

Proximal gradientmethods for learning is an area of research in optimization and statistical learning theory which studies algorithms for a general class of convex regularization problems where the regularization penalty may not be differentiable. One such example is regularization of the form

In machine learning, the kernel embedding of distributions comprises a class of nonparametric methods in which a probability distribution is represented as an element of a reproducing kernel Hilbert space (RKHS). A generalization of the individual data-point feature mapping done in classical kernel methods, the embedding of distributions into infinite-dimensional feature spaces can preserve all of the statistical features of arbitrary distributions, while allowing one to compare and manipulate distributions using Hilbert space operations such as inner products, distances, projections, linear transformations, and spectral analysis. This learning framework is very general and can be applied to distributions over any space on which a sensible kernel function may be defined. For example, various kernels have been proposed for learning from data which are: vectors in , discrete classes/categories, strings, graphs/networks, images, time series, manifolds, dynamical systems, and other structured objects. The theory behind kernel embeddings of distributions has been primarily developed by Alex Smola, Le Song , Arthur Gretton, and Bernhard Schölkopf. A review of recent works on kernel embedding of distributions can be found in.

In mathematical physics, Clebsch–Gordan coefficients are the expansion coefficients of total angular momentum eigenstates in an uncoupled tensor product basis. Mathematically, they specify the decomposition of the tensor product of two irreducible representations into a direct sum of irreducible representations, where the type and the multiplicities of these irreducible representations are known abstractly. The name derives from the German mathematicians Alfred Clebsch (1833–1872) and Paul Gordan (1837–1912), who encountered an equivalent problem in invariant theory.

Quantum optimization algorithms are quantum algorithms that are used to solve optimization problems. Mathematical optimization deals with finding the best solution to a problem from a set of possible solutions. Mostly, the optimization problem is formulated as a minimization problem, where one tries to minimize an error which depends on the solution: the optimal solution has the minimal error. Different optimization techniques are applied in various fields such as mechanics, economics and engineering, and as the complexity and amount of data involved rise, more efficient ways of solving optimization problems are needed. Quantum computing may allow problems which are not practically feasible on classical computers to be solved, or suggest a considerable speed up with respect to the best known classical algorithm.

The two-dimensional critical Ising model is the critical limit of the Ising model in two dimensions. It is a two-dimensional conformal field theory whose symmetry algebra is the Virasoro algebra with the central charge . Correlation functions of the spin and energy operators are described by the minimal model. While the minimal model has been exactly solved, see also, e.g., the article on Ising critical exponents, the solution does not cover other observables such as connectivities of clusters.

References

  1. Kochenberger, Gary; Hao, Jin-Kao; Glover, Fred; Lewis, Mark; Lu, Zhipeng; Wang, Haibo; Wang, Yang (2014). "The unconstrained binary quadratic programming problem: a survey" (PDF). Journal of Combinatorial Optimization. 28: 58–81. doi:10.1007/s10878-014-9734-0. S2CID   16808394.
  2. 1 2 Glover, Fred; Kochenberger, Gary (2019). "A Tutorial on Formulating and Using QUBO Models". arXiv: 1811.11538 [cs.DS].
  3. Lucas, Andrew (2014). "Ising formulations of many NP problems". Frontiers in Physics. 2: 5. arXiv: 1302.5843 . Bibcode:2014FrP.....2....5L. doi: 10.3389/fphy.2014.00005 .
  4. Mücke, Sascha; Piatkowski, Nico; Morik, Katharina (2019). "Learning Bit by Bit: Extracting the Essence of Machine Learning" (PDF). LWDA. S2CID   202760166. Archived from the original (PDF) on 2020-02-27.
  5. Tom Simonite (8 May 2013). "D-Wave's Quantum Computer Goes to the Races, Wins". MIT Technology Review. Archived from the original on 24 September 2015. Retrieved 12 May 2013.
  6. A. P. Punnen (editor), Quadratic unconstrained binary optimization problem: Theory, Algorithms, and Applications, Springer, Springer, 2022.
  7. Çela, E., Punnen, A.P. (2022). Complexity and Polynomially Solvable Special Cases of QUBO. In: Punnen, A.P. (eds) The Quadratic Unconstrained Binary Optimization Problem. Springer, Cham. https://doi.org/10.1007/978-3-031-04520-2_3
  8. See Theorem 3.16 in Punnen (2022); note that the authors assume the maximization version of QUBO.
  9. Ratke, Daniel (2021-06-10). "List of QUBO formulations" . Retrieved 2022-12-16.