
Quadratic unconstrained binary optimization

From Wikipedia, the free encyclopedia

Quadratic unconstrained binary optimization (QUBO), also known as unconstrained binary quadratic programming (UBQP), is a combinatorial optimization problem with a wide range of applications from finance and economics to machine learning.[1] QUBO is an NP-hard problem, and for many classical problems from theoretical computer science, like maximum cut, graph coloring and the partition problem, embeddings into QUBO have been formulated.[2][3] Embeddings for machine learning models include support-vector machines, clustering and probabilistic graphical models.[4] Moreover, due to its close connection to Ising models, QUBO constitutes a central problem class for adiabatic quantum computation, where it is solved through a physical process called quantum annealing.[5]

Definition


Let $\mathbb{B} = \{0,1\}$ denote the set of binary values (or bits); then $\mathbb{B}^n$ is the set of binary vectors of fixed length $n \in \mathbb{N}$. Given a symmetric or upper triangular matrix $\boldsymbol{Q} \in \mathbb{R}^{n \times n}$, whose entries $Q_{ij}$ define a weight for each pair of indices $i, j \in \{1, \dots, n\}$, we can define the function $f_{\boldsymbol{Q}} : \mathbb{B}^n \rightarrow \mathbb{R}$ that assigns a value to each binary vector $\boldsymbol{x}$ through

$$f_{\boldsymbol{Q}}(\boldsymbol{x}) = \boldsymbol{x}^\intercal \boldsymbol{Q} \boldsymbol{x} = \sum_{i=1}^n \sum_{j=1}^n Q_{ij} x_i x_j.$$

Alternatively, the linear and quadratic parts can be separated as

$$f_{\boldsymbol{Q}', \boldsymbol{q}}(\boldsymbol{x}) = \boldsymbol{x}^\intercal \boldsymbol{Q}' \boldsymbol{x} + \boldsymbol{q}^\intercal \boldsymbol{x},$$

where $\boldsymbol{Q}' \in \mathbb{R}^{n \times n}$ and $\boldsymbol{q} \in \mathbb{R}^n$. This is equivalent to the previous definition through $\boldsymbol{Q} = \boldsymbol{Q}' + \operatorname{diag}[\boldsymbol{q}]$ using the $\operatorname{diag}$ operator, exploiting that $x = x \cdot x$ for all binary values $x$.

Intuitively, the weight $Q_{ij}$ is added if both $x_i = 1$ and $x_j = 1$. The QUBO problem consists of finding a binary vector $\boldsymbol{x}^*$ that minimizes $f_{\boldsymbol{Q}}$, i.e., $\forall \boldsymbol{x} \in \mathbb{B}^n:\ f_{\boldsymbol{Q}}(\boldsymbol{x}^*) \leq f_{\boldsymbol{Q}}(\boldsymbol{x})$.

In general, $\boldsymbol{x}^*$ is not unique, meaning there may be a set of minimizing vectors with equal value w.r.t. $f_{\boldsymbol{Q}}$. The complexity of QUBO arises from the number of candidate binary vectors to be evaluated, as $|\mathbb{B}^n| = 2^n$ grows exponentially in $n$.

Sometimes, QUBO is defined as the problem of maximizing $f_{\boldsymbol{Q}}$, which is equivalent to minimizing $f_{-\boldsymbol{Q}} = -f_{\boldsymbol{Q}}$.
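The definition above translates directly into code. The following Python sketch (with an illustrative hand-picked matrix; the function names are our own) evaluates $f_{\boldsymbol{Q}}$ and finds a minimizer by enumerating all $2^n$ binary vectors, which is only feasible for small $n$ and illustrates the exponential search space:

```python
import itertools
import numpy as np

def qubo_value(Q, x):
    """Evaluate f_Q(x) = x^T Q x for a binary vector x."""
    x = np.asarray(x)
    return float(x @ Q @ x)

def qubo_brute_force(Q):
    """Find a minimizing binary vector by checking all 2^n candidates."""
    n = Q.shape[0]
    best_x, best_val = None, float("inf")
    for bits in itertools.product((0, 1), repeat=n):
        val = qubo_value(Q, bits)
        if val < best_val:
            best_x, best_val = bits, val
    return best_x, best_val

# Small illustrative instance with an upper triangular matrix:
Q = np.array([[-1.0,  2.0],
              [ 0.0, -3.0]])
x_star, f_star = qubo_brute_force(Q)  # minimizer (0, 1) with value -3
```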

Properties


QUBO is scale-invariant for positive factors $\alpha > 0$, which leave the optimum $\boldsymbol{x}^*$ unchanged:

$$f_{\alpha \boldsymbol{Q}}(\boldsymbol{x}) = \boldsymbol{x}^\intercal (\alpha \boldsymbol{Q}) \boldsymbol{x} = \alpha (\boldsymbol{x}^\intercal \boldsymbol{Q} \boldsymbol{x}) = \alpha f_{\boldsymbol{Q}}(\boldsymbol{x}).$$

In its general form, QUBO is NP-hard and cannot be solved efficiently by any polynomial-time algorithm.[6] However, there are polynomially solvable special cases, where $\boldsymbol{Q}$ has certain properties,[7] for example:

  • If all coefficients are positive, the optimum is trivially $\boldsymbol{x}^* = (0, \dots, 0)^\intercal$. Similarly, if all coefficients are negative, the optimum is $\boldsymbol{x}^* = (1, \dots, 1)^\intercal$.
  • If $\boldsymbol{Q}$ is diagonal, the bits can be optimized independently, and the problem is solvable in $\mathcal{O}(n)$. The optimal variable assignments are simply $x_i^* = 1$ if $Q_{ii} < 0$, and $x_i^* = 0$ otherwise.
  • If all off-diagonal elements of $\boldsymbol{Q}$ are non-positive, the corresponding QUBO problem is solvable in polynomial time.[8]

QUBO can be solved using integer linear programming solvers like CPLEX or Gurobi Optimizer. This is possible since QUBO can be reformulated as a linear constrained binary optimization problem. To achieve this, substitute each product $x_i x_j$ by an additional binary variable $z_{ij} \in \mathbb{B}$ and add the constraints $x_i \geq z_{ij}$, $x_j \geq z_{ij}$ and $x_i + x_j - 1 \leq z_{ij}$. Note that $z_{ij}$ can also be relaxed to continuous variables within the bounds zero and one.
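Why these three constraints suffice can be verified exhaustively: for binary $x_i, x_j$ they admit exactly one feasible $z_{ij}$, namely the product $x_i x_j$. A minimal check (illustrative code, not tied to any particular solver):

```python
import itertools

def feasible_z(xi, xj):
    """All z in {0, 1} satisfying the linearization constraints
    z <= xi, z <= xj, and z >= xi + xj - 1."""
    return [z for z in (0, 1) if z <= xi and z <= xj and z >= xi + xj - 1]

# For every binary pair, the constraints pin z down to exactly x_i * x_j:
for xi, xj in itertools.product((0, 1), repeat=2):
    assert feasible_z(xi, xj) == [xi * xj]
```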

Applications


QUBO is a structurally simple, yet computationally hard optimization problem. It can be used to encode a wide range of optimization problems from various scientific areas.[9]

Maximum Cut


Given a graph $G = (V, E)$ with vertex set $V = \{1, \dots, n\}$ and edges $E \subseteq V \times V$, the maximum cut (max-cut) problem consists of finding two subsets $S, T \subseteq V$ with $T = V \setminus S$, such that the number of edges between $S$ and $T$ is maximized.

The more general weighted max-cut problem assumes edge weights $w_{ij} \geq 0\ \forall i, j \in V$, with $(i, j) \notin E \Rightarrow w_{ij} = 0$, and asks for a partition $S, T \subseteq V$ that maximizes the sum of edge weights between $S$ and $T$, i.e.,

$$\max_{S \subseteq V} \sum_{i \in S,\, j \notin S} w_{ij}.$$

By setting $w_{ij} = 1$ for all $(i, j) \in E$, this becomes equivalent to the original max-cut problem above, which is why we focus on this more general form in the following.

For every vertex $i \in V$ we introduce a binary variable $x_i$ with the interpretation $x_i = 0$ if $i \in S$ and $x_i = 1$ if $i \in T$. As $T = V \setminus S$, every $i$ is in exactly one set, meaning there is a 1:1 correspondence between binary vectors $\boldsymbol{x} \in \mathbb{B}^n$ and partitions of $V$ into two subsets.

We observe that, for any $i, j \in V$, the expression $x_i (1 - x_j) + (1 - x_i) x_j$ evaluates to 1 if and only if $i$ and $j$ are in different subsets, equivalent to logical XOR. Let $\boldsymbol{W} \in \mathbb{R}_+^{n \times n}$ with $W_{ij} = w_{ij}\ \forall i, j \in V$. By extending the above expression to matrix-vector form we find that

$$\boldsymbol{x}^\intercal \boldsymbol{W} (\boldsymbol{1} - \boldsymbol{x}) + (\boldsymbol{1} - \boldsymbol{x})^\intercal \boldsymbol{W} \boldsymbol{x} = -2 \boldsymbol{x}^\intercal \boldsymbol{W} \boldsymbol{x} + (\boldsymbol{W} \boldsymbol{1} + \boldsymbol{W}^\intercal \boldsymbol{1})^\intercal \boldsymbol{x}$$

is the sum of weights of all edges between $S$ and $T$, where $\boldsymbol{1} = (1, 1, \dots, 1)^\intercal \in \mathbb{R}^n$. As this is a quadratic function over $\boldsymbol{x}$, it is a QUBO problem whose parameter matrix we can read from the above expression as

$$\boldsymbol{Q} = 2 \boldsymbol{W} - \operatorname{diag}[\boldsymbol{W} \boldsymbol{1} + \boldsymbol{W}^\intercal \boldsymbol{1}],$$

after flipping the sign to make it a minimization problem.
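The construction above can be sketched in code. The following snippet (the triangle example and all names are our own) builds $\boldsymbol{Q} = 2\boldsymbol{W} - \operatorname{diag}[\boldsymbol{W}\boldsymbol{1} + \boldsymbol{W}^\intercal \boldsymbol{1}]$ from a weight matrix and confirms by brute force that minimizing the QUBO recovers a maximum cut:

```python
import itertools
import numpy as np

def maxcut_qubo_matrix(W):
    """QUBO matrix for weighted max-cut (minimization form):
    Q = 2W - diag(W 1 + W^T 1)."""
    row_plus_col = W.sum(axis=1) + W.sum(axis=0)
    return 2 * W - np.diag(row_plus_col)

def cut_weight(W, x):
    """Total weight of edges between S = {i : x_i = 0} and T = {i : x_i = 1}."""
    x = np.asarray(x)
    return float(x @ W @ (1 - x) + (1 - x) @ W @ x)

# Triangle graph with unit weights stored in the upper triangle:
W = np.array([[0, 1, 1],
              [0, 0, 1],
              [0, 0, 0]], dtype=float)
Q = maxcut_qubo_matrix(W)

# Minimizing x^T Q x maximizes the cut; brute force over all 2^3 vectors:
best = min(itertools.product((0, 1), repeat=3),
           key=lambda x: np.asarray(x) @ Q @ np.asarray(x))
# Any max-cut of the triangle has weight 2.
```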

Cluster Analysis

[Figure: Binary clustering with QUBO. Visual representation of a clustering problem with 20 points: circles of the same color belong to the same cluster. Each circle can be understood as a binary variable in the corresponding QUBO problem.]

Next, we consider the problem of cluster analysis, where we are given a set of $N$ points in $d$-dimensional space and want to assign each point to one of two classes or clusters, such that points in the same cluster are similar to each other. For this example we set $N = 20$ and $d = 2$. The data is given as a matrix $\boldsymbol{X} \in \mathbb{R}^{20 \times 2}$, where each row contains two Cartesian coordinates. For two clusters, we can assign a binary variable $x_i \in \mathbb{B}$ to the point corresponding to the $i$-th row in $\boldsymbol{X}$, indicating whether it belongs to the first ($x_i = 0$) or second cluster ($x_i = 1$). Consequently, we have 20 binary variables, which form a binary vector $\boldsymbol{x} \in \mathbb{B}^{20}$ that corresponds to a cluster assignment of all points (see figure).

One way to derive a clustering is to consider the pairwise distances between points. Given a cluster assignment $\boldsymbol{x}$, the expression $x_i x_j + (1 - x_i)(1 - x_j)$ evaluates to 1 if points $i$ and $j$ are in the same cluster. Similarly, $x_i (1 - x_j) + (1 - x_i) x_j = 1$ indicates that they are in different clusters. Let $d_{ij} > 0$ denote the Euclidean distance between points $i$ and $j$, i.e.,

$$d_{ij} = \lVert \boldsymbol{X}_i - \boldsymbol{X}_j \rVert = \sqrt{(\boldsymbol{X}_i - \boldsymbol{X}_j)^\intercal (\boldsymbol{X}_i - \boldsymbol{X}_j)},$$

where $\boldsymbol{X}_i$ is the $i$-th row of $\boldsymbol{X}$.

To define a cost function to minimize, we add the positive distance $d_{ij}$ when points $i$ and $j$ are in the same cluster, and subtract it when they are in different clusters. This way, an optimal solution tends to place points which are far apart into different clusters, and points that are close into the same cluster.

Let $\boldsymbol{D} \in \mathbb{R}^{N \times N}$ with $D_{ij} = d_{ij}/2$ for all $i, j \in \{1, \dots, N\}$. Given an assignment $\boldsymbol{x} \in \mathbb{B}^N$, such a cost function is given by

$$\begin{aligned} f(\boldsymbol{x}) &= \boldsymbol{x}^\intercal \boldsymbol{D} \boldsymbol{x} - \boldsymbol{x}^\intercal \boldsymbol{D} (\boldsymbol{1} - \boldsymbol{x}) - (\boldsymbol{1} - \boldsymbol{x})^\intercal \boldsymbol{D} \boldsymbol{x} + (\boldsymbol{1} - \boldsymbol{x})^\intercal \boldsymbol{D} (\boldsymbol{1} - \boldsymbol{x}) \\ &= 4 \boldsymbol{x}^\intercal \boldsymbol{D} \boldsymbol{x} - 4 \cdot \boldsymbol{1}^\intercal \boldsymbol{D} \boldsymbol{x} + \boldsymbol{1}^\intercal \boldsymbol{D} \boldsymbol{1}, \end{aligned}$$

where $\boldsymbol{1} = (1, 1, \dots, 1)^\intercal \in \mathbb{R}^N$.

From the second line we can see that this expression can be rearranged into a QUBO problem by defining

$$\boldsymbol{Q} = 4 \boldsymbol{D} - 4 \operatorname{diag}[\boldsymbol{D} \boldsymbol{1}]$$

and ignoring the constant term $\boldsymbol{1}^\intercal \boldsymbol{D} \boldsymbol{1}$. Using these parameters, a binary vector minimizing this QUBO instance $\boldsymbol{Q}$ will correspond to an optimal cluster assignment with respect to the above cost function.
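The whole pipeline can be sketched as follows, using a small four-point dataset of our own invention instead of the 20-point example, so that a brute-force check remains feasible:

```python
import itertools
import numpy as np

def clustering_qubo(X):
    """Build Q = 4D - 4 diag(D 1) with D_ij = ||X_i - X_j|| / 2."""
    diff = X[:, None, :] - X[None, :, :]       # pairwise coordinate differences
    D = np.linalg.norm(diff, axis=-1) / 2      # halved Euclidean distances
    return 4 * D - 4 * np.diag(D.sum(axis=1))

# Two well-separated pairs of points (illustrative data):
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
Q = clustering_qubo(X)

# Brute-force minimization recovers the natural clustering {0,1} vs {2,3}:
best = min(itertools.product((0, 1), repeat=4),
           key=lambda x: np.asarray(x) @ Q @ np.asarray(x))
```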

Connection to Ising models


QUBO is very closely related and computationally equivalent to the Ising model, whose Hamiltonian function is defined as

$$H(\boldsymbol{\sigma}) = \boldsymbol{\sigma}^\intercal \boldsymbol{J} \boldsymbol{\sigma} + \boldsymbol{h}^\intercal \boldsymbol{\sigma} = \sum_{i,j} J_{ij} \sigma_i \sigma_j + \sum_j h_j \sigma_j$$

with real-valued parameters $h_j, J_{ij}$ for all $i, j$. The spin variables $\sigma_j$ are binary with values from $\{-1, +1\}$ instead of $\mathbb{B}$. Note that this formulation is simplified, since, in a physics context, the $\sigma_i$ are typically Pauli operators, which are complex-valued matrices of size $2^n \times 2^n$, whereas here we treat them as binary variables. Many formulations of the Ising model Hamiltonian further assume that the variables are arranged in a lattice, where only neighboring pairs of variables $\langle i\ j \rangle$ can have non-zero coefficients; here, we simply assume that $J_{ij} = 0$ if $i$ and $j$ are not neighbors.

Applying the identity $\sigma = 1 - 2x$ yields an equivalent QUBO problem:[10]

$$\begin{aligned} &\boldsymbol{\sigma}^\intercal \boldsymbol{J} \boldsymbol{\sigma} + \boldsymbol{h}^\intercal \boldsymbol{\sigma} \\ &= (\boldsymbol{1} - 2\boldsymbol{x})^\intercal \boldsymbol{J} (\boldsymbol{1} - 2\boldsymbol{x}) + \boldsymbol{h}^\intercal (\boldsymbol{1} - 2\boldsymbol{x}) \\ &= 4 \boldsymbol{x}^\intercal \boldsymbol{J} \boldsymbol{x} - 4 \cdot \boldsymbol{1}^\intercal \boldsymbol{J} \boldsymbol{x} + \boldsymbol{1}^\intercal \boldsymbol{J} \boldsymbol{1} - 2 \boldsymbol{h}^\intercal \boldsymbol{x} + \boldsymbol{h}^\intercal \boldsymbol{1} \\ &= \boldsymbol{x}^\intercal (4\boldsymbol{J}) \boldsymbol{x} - (4 \boldsymbol{J}^\intercal \boldsymbol{1} + 2\boldsymbol{h})^\intercal \boldsymbol{x} + \underbrace{\boldsymbol{1}^\intercal \boldsymbol{J} \boldsymbol{1} + \boldsymbol{h}^\intercal \boldsymbol{1}}_{\text{const.}}, \end{aligned}$$

whose weight matrix $\boldsymbol{Q}$ is given by

$$\boldsymbol{Q} = 4\boldsymbol{J} - \operatorname{diag}[4 \boldsymbol{J}^\intercal \boldsymbol{1} + 2\boldsymbol{h}],$$

again ignoring the constant term, which does not affect the minimization. Using the identity $x = (1 - \sigma)/2$, a QUBO problem with matrix $\boldsymbol{Q}$ can be converted to an equivalent Ising model using the same technique, yielding

$$\boldsymbol{J} = \boldsymbol{Q}/4, \qquad \boldsymbol{h} = -(\boldsymbol{Q} \boldsymbol{1} + \boldsymbol{Q}^\intercal \boldsymbol{1})/4,$$

and a constant offset of $\boldsymbol{1}^\intercal \boldsymbol{Q} \boldsymbol{1} / 4$.[10]
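Both conversion directions can be sketched in code (a sketch under the assumption of a symmetric $\boldsymbol{J}$, as in the derivation above; all names are our own), with an exhaustive check that the two objectives agree on every state:

```python
import itertools
import numpy as np

def ising_to_qubo(J, h):
    """Convert Ising parameters (J, h) to a QUBO matrix via sigma = 1 - 2x:
    Q = 4J - diag(4 J^T 1 + 2h); the dropped constant is 1^T J 1 + h^T 1.
    Assumes J is symmetric."""
    n = J.shape[0]
    Q = 4 * J - np.diag(4 * (J.T @ np.ones(n)) + 2 * h)
    return Q, J.sum() + h.sum()

def qubo_to_ising(Q):
    """Convert a QUBO matrix to Ising parameters via x = (1 - sigma)/2:
    J = Q/4, h = -(Q 1 + Q^T 1)/4, with constant offset 1^T Q 1 / 4."""
    ones = np.ones(Q.shape[0])
    return Q / 4, -(Q @ ones + Q.T @ ones) / 4, Q.sum() / 4

# Check on a small random symmetric instance:
rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3))
J = (A + A.T) / 2          # symmetrize
h = rng.normal(size=3)
Q, offset = ising_to_qubo(J, h)
for bits in itertools.product((0, 1), repeat=3):
    x = np.array(bits)
    sigma = 1 - 2 * x      # map bit values {0,1} to spins {+1,-1}
    assert np.isclose(sigma @ J @ sigma + h @ sigma, x @ Q @ x + offset)
```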

References

  1. ^ Kochenberger, Gary; Hao, Jin-Kao; Glover, Fred; Lewis, Mark; Lu, Zhipeng; Wang, Haibo; Wang, Yang (2014). "The unconstrained binary quadratic programming problem: a survey" (PDF). Journal of Combinatorial Optimization. 28: 58–81. doi:10.1007/s10878-014-9734-0. S2CID 16808394.
  2. ^ Glover, Fred; Kochenberger, Gary (2019). "A Tutorial on Formulating and Using QUBO Models". arXiv:1811.11538 [cs.DS].
  3. ^ Lucas, Andrew (2014). "Ising formulations of many NP problems". Frontiers in Physics. 2: 5. arXiv:1302.5843 . Bibcode:2014FrP.....2....5L. doi:10.3389/fphy.2014.00005 .
  4. ^ Mücke, Sascha; Piatkowski, Nico; Morik, Katharina (2019). "Learning Bit by Bit: Extracting the Essence of Machine Learning" (PDF). LWDA. S2CID 202760166. Archived from the original (PDF) on 27 February 2020.
  5. ^ Tom Simonite (8 May 2013). "D-Wave's Quantum Computer Goes to the Races, Wins". MIT Technology Review. Archived from the original on 24 September 2015. Retrieved 12 May 2013.
  6. ^ A. P. Punnen (editor), The Quadratic Unconstrained Binary Optimization Problem: Theory, Algorithms, and Applications, Springer, 2022.
  7. ^ Çela, E., Punnen, A.P. (2022). Complexity and Polynomially Solvable Special Cases of QUBO. In: Punnen, A.P. (eds) The Quadratic Unconstrained Binary Optimization Problem. Springer, Cham. https://doi.org/10.1007/978-3-031-04520-2_3
  8. ^ See Theorem 3.16 in Punnen (2022); note that the authors assume the maximization version of QUBO.
  9. ^ Ratke, Daniel (10 June 2021). "List of QUBO formulations". Retrieved 16 December 2022.
  10. ^ a b Mücke, S. (2025). Quantum-Classical Optimization in Machine Learning. Shaker Verlag. https://d-nb.info/1368090214