I have a problem where we are given multivariate polynomials in variables like y_i and z_j and the goal is to express them as sums of products of terms of the form y_i-z_j with nonnegative integer coefficients (each of the factors of one of these products must specifically have a coefficient of 1 for one y variable and a coefficient of -1 for one z variable). For example, you might be given
(y_1 - z_1)*(y_3 - z_2) + (y_1 + y_2 - z_1 - z_2)*(y_1 - z_3) - (y_1 - z_2)*(y_1 - z_3)
and you would want to express this positively like
(y_1 - z_1)*(y_3 - z_2) + (y_2 - z_1)*(y_1 - z_3)
The same polynomial could be given to you in a different way, such as
y_1*y_2 + y_1*y_3 - y_1*z_1 - y_1*z_2 - y_2*z_3 - y_3*z_1 + z_1*z_2 + z_1*z_3
and you would have to pick this apart to get the expression above.
The goal, specifically, is if we were to consider new variables a_ij = y_i-z_j
, we want to express the polynomial with nonnegative integer coefficients in terms of these variables. Since the variables are not linearly independent, a change of variables won't necessarily work, and my code uses linear optimization.
After some discussion with a member of the community, a requirement that is perhaps easier to understand is that we want to express the polynomial as a sum of terms of a specific form. The form is
a*(y_i + y_j + ... - z_k - z_l + ...)*(y_m + y_n + ... - z_o - z_p + ...)*(...) + ...
where:
- The coefficient a is always a positive integer
- The sign of y terms is always positive
- The sign of z terms is always negative
- Within one factor, the number of y is equal to the number of z
- Within one factor, there must be at least one y and one z
For clarification, each of the y and z terms have unit coefficient (depending on the sign), but when we express it in this level of generality some of them can be the same, for example
2*y_1 - z_2 - z_3 = y_1 + y_1 - z_2 - z_3
My program expresses it with factors of exactly one y with exactly one z, so this doesn't come up.
It took me a year to even figure out how to do this at all. I have some working code now (it's actually available on PyPI, but only for solving a specific problem) that I think can be faster/more efficient in general.
The polynomial is passed as a symengine object. I want to use sympy, so I cast it to a string and have sympy then parse it. I create a vector expressed as a dict of monomials with integer coefficients, then I have to somehow create a set of basis vectors to then express the result positively with an integer optimization (using PuLP). I think this is part of where the speed and efficiency problem is.
I create the basis vectors by taking all y variables that occur in terms of the polynomial that have only y variables, and all z variables that occur in terms of the polynomial that have only z variables, and take products of differences of these. I have a conjecture that they are square-free in these terms, and so far that's actually been working out. But I end up with huge numbers of basis vectors for any sizeable polynomial, on the order of 20K+. I use xpress for smaller cases (5000 rows and columns), and use CBC for larger ones.
So the question is: is there a way to speed this up, perhaps by trimming down the number of basis vectors? Any other comments or suggestions for improvements on the code are welcome. I've had a good experience with math-y people lurking here before which is why I'm trying this site.
from symengine import *
import numpy as np
import pulp as pu
import itertools as it
import sympy
import re
n = 100
var2 = symarray('y',n)
var3 = symarray('z',n)
dimen = 0
monom_to_vec = {}
def poly_to_vec(poly,vec0=None):
global dimen, monom_to_vec
dc = poly.as_coefficients_dict()
mn = list(dc.keys())
if vec0 is None:
init_basevec(mn)
vec = np.zeros(dimen)
try:
for i in range(len(mn)-1,-1,-1):
cf = dc[mn[i]]
index = monom_to_vec[mn[i]]
if vec0 is not None and abs(vec0[index])<abs(cf):
return None
vec[index] = cf
except Exception:
return None
return vec
def init_basevec(monoms):
global dimen, monom_to_vec
monom_to_vec = {}
index = 0
for mn in monoms:
monom_to_vec[mn] = index
index+=1
dimen = index
def compute_positive_rep(val,var2=var2,var3=var3,neg=False):
notint = False
try:
int(val)
except Exception:
notint = True
if notint:
sval = str(val)
val = expand(val)
regex = re.compile("y_([0-9]+)")
regex2 = re.compile("z_([0-9]+)")
n1 = max([int(index) for index in regex.findall(sval)])+1
n2 = max([int(index) for index in regex2.findall(sval)])+1
var22 = sympy.symarray('y',n1)
var33 = sympy.symarray('z',n2)
val_poly = sympy.poly(sval,*var22[1:],*var33[1:])
if val_poly == sympy.sympify(0):
return 0
did = False
list0 = []
list1 = []
mn = val_poly.monoms()
L0 = [0 for i in range(1,n1)]
L1 = [0 for i in range(1,n2)]
mn1L = []
mnn1L = []
for mm0 in mn:
if list(mm0[:n1-1]) == L0:
mnn1L += [mm0]
if list(mm0[n1-1:]) == L1:
mn1L += [mm0]
base = []
base_vectors = []
base_monoms = []
vec = poly_to_vec(val)
set1 = set()
set2 = set()
full_varlist = []
for mnn1 in mnn1L:
for i in range(len(mnn1)):
if mnn1[i]!=0:
set2.add(i+1-n1+1)
for index1 in range(1,n1):
full_varlist += [[]]
for index2 in set2:
full_varlist[index1-1] += [var2[index1]-var3[index2]]
comblists = []
for mn1 in mn1L:
comblistmn1 = [1]
for i in range(n1-1):
if mn1[i]!=0:
arr = np.array(comblistmn1)
comblistmn12 = []
newset = it.combinations(full_varlist[i],mn1[i])
combs = list(newset)
for comb in combs:
term = np.prod(comb)
comblistmn12 += (arr*term).tolist()
comblistmn1 = comblistmn12
for b1 in comblistmn1:
vec0 = poly_to_vec(expand(b1),vec)
if vec0 is not None:
base_vectors += [vec0]
base_monoms += [b1]
vrs = [pu.LpVariable(name=f"a{i}",lowBound=0,cat='Integer') for i in range(len(base_vectors))]
lp_prob = pu.LpProblem('Problem',pu.LpMinimize)
lp_prob += 0
for i in range(len(vec)):
eq = 0
for j in range(len(base_vectors)):
if base_vectors[j][i]!=0:
eq += int(base_vectors[j][i])*vrs[j]
lp_prob += eq == int(vec[i])
try:
status = lp_prob.solve(pu.XPRESS_PY(msg=0))
except pu.PulpSolverError:
status = lp_prob.solve(pu.PULP_CBC_CMD(msg=0))
if status != 1:
print(f"Counterexample {val}")
exit(1)
val = 0
for i in range(len(base_vectors)):
x = vrs[i].value()
b1 = base_monoms[i]
if x!=0:
val += int(x)*b1
return val
print(compute_positive_rep((var2[1]-var3[5])*(var2[5]-var3[1]) + (var2[3] + var2[7] - var3[1] - var3[2])*(var2[2] + var2[3] - var3[7] - var3[8]) - (var2[3] - var3[1])*(var2[2] - var3[7])))
This script runs for me, and below is the output:
(y_5 - z_1)*(y_1 - z_5) + (y_7 - z_1)*(y_3 - z_8) + (y_7 - z_2)*(y_2 - z_7) + (y_3 - z_8)*(y_3 - z_2)
It has not directly manipulated the expression to cancel the negative term, but rather started from scratch to try to express this positively.
You may need to pip install xpress
if you don't have it already in your pulp
installation.
Included below is a more substantial example that takes a very long time on my machine. Replace the last line with this.
print(compute_positive_rep((var2[3] - var3[1])*((var2[1] + var2[3] + var2[4] + var2[6] - var3[1] - var3[3] - var3[5] - var3[6])*(((var2[6] - var3[1])*(var2[4] - var3[1])*(var2[1] + var2[3] - var3[1] - var3[2]) + (var2[3] - var3[2])*(var2[1] - var3[2])*(var2[4] + var2[6] - var3[1] - var3[2]))*((var2[4] - var3[3])*(var2[6] - var3[3]) + (var2[1] - var3[4])*(var2[3] - var3[4]) + (var2[4] + var2[6] - var3[1] - var3[3])*(var2[1] + var2[3] - var3[1] - var3[4])) - (var2[6] - var3[1])*(var2[3] - var3[1])*(var2[1] - var3[1])*(var2[4] - var3[1])*(var2[1] + var2[3] + var2[4] + var2[6] - var3[1] - var3[2] - var3[3] - var3[4])) - (var2[6] - var3[1])*(var2[3] - var3[1])*(var2[1] - var3[1])*(var2[4] - var3[1])*((var2[4] - var3[3])*(var2[6] - var3[3]) + (var2[1] - var3[4])*(var2[3] - var3[4]) + (var2[1] + var2[3] - var3[2] - var3[4])*(var2[4] + var2[6] - var3[2] - var3[3])) - (((var2[6] - var3[1])*(var2[4] - var3[1])*(var2[1] + var2[3] - var3[1] - var3[3]) + (var2[1] - var3[3])*(var2[3] - var3[3])*(var2[4] + var2[6] - var3[1] - var3[3]))*((var2[6] - var3[1])*(var2[4] - var3[1])*(var2[1] + var2[3] - var3[1] - var3[2]) + (var2[3] - var3[2])*(var2[1] - var3[2])*(var2[4] + var2[6] - var3[1] - var3[2])) - (var2[6] - var3[1])*(var2[3] - var3[1])*(var2[1] - var3[1])*(var2[4] - var3[1])*((var2[4] - var3[2])*(var2[6] - var3[2]) + (var2[1] - var3[3])*(var2[3] - var3[3]) + (var2[4] + var2[6] - var3[1] - var3[2])*(var2[1] + var2[3] - var3[1] - var3[3])))) - (-(var2[6] - var3[1])*(var2[3] - var3[1])*(var2[1] - var3[1])*(var2[4] - var3[1])*((var2[4] - var3[2])*(var2[6] - var3[2])*(var2[1] + var2[3] - var3[2] - var3[3]) + (var2[1] - var3[3])*(var2[3] - var3[3])*(var2[4] + var2[6] - var3[2] - var3[3])) + (var2[6] - var3[1])*(var2[3] - var3[1])*(var2[1] - var3[1])*(var2[4] - var3[1])*(var2[1] + var2[3] + var2[4] + var2[6] - var3[2] - var3[3] - var3[5] - var3[6])*((var2[4] - var3[3])*(var2[6] - var3[3]) + (var2[1] - var3[4])*(var2[3] - var3[4]) + (var2[1] + var2[3] - var3[2] - var3[4])*(var2[4] + var2[6] - var3[2] - var3[3])))))
1 Answer 1
A lot of the more technical terms (basis vectors, etc.) are lost on me, so I can only partially speak to the theory in this question.
my code uses linear optimization
Ish. From lp_prob += 0
(which should have instead used setObjective
) we can see that there actually isn't an optimization objective at all. This is a linear program, and qualifies as MILP, but is only constraint programming and not minimization.
Since this is for theoretical mathematics, a (shaky) argument could be made that you're better off constructing your LP matrices directly instead of through PuLP. Advantages are typically faster setup time, a more bare-metal representation of the problem construction, and easier graphing of the sparsity pattern; disadvantages are that it takes more code and relies more heavily on knowledge of linear algebra (especially: Kronecker products proliferate).
I want to use sympy, so I cast it to a string and have sympy then parse it
I think you can trim off several of your dependencies. I was able to hack through a solution with only scipy, numpy and sympy, which I consider more standard than symengine and xpress.
each of the y and z terms have unit coefficient (depending on the sign), but when we express it in this level of generality some of them can be the same
This is expressible by just limiting every factor to have unique variables, and allowing multiple factors.
My program expresses it with factors of exactly one y with exactly one z
This isn't strictly necessary. I demonstrate how to allow for anywhere from two variables to all variables on either side of each factor.
is there a way to speed this up, perhaps by trimming down the number of basis vectors?
Ill-defined. You haven't shown any profiling results (and I don't, either, other than measuring setup and solve times).
Included below is a more substantial example that takes a very long time on my machine.
That example is quite different from the first two, since it has cubic terms and so can't be solved with the pair-wise LP construction I showed. Oh well - perhaps that can still serve as inspiration.
Fiddly bits about the code:
Avoid global
side-effects.
except; return None
is not great - the exception is too broad, and anyway, better to just let the exception fall through and let the caller catch it (or perhaps re-throw a domain-specific exception), which is how Python exception handling was designed to work.
Run your code through a PEP8 linter. For one example, you don't have enough newlines between functions.
Add type hints to your function signatures.
Include a __main__
guard.
Example code
This is more general in one sense, as it accommodates for longer factor expressions, but less general in another sense as it can only handle order-2 polynomials.
import time
from typing import Iterator
import matplotlib.pyplot as plt
import numpy as np
import sympy
from scipy.optimize import milp, Bounds, LinearConstraint
import scipy.sparse as sp
'''
Factorize an expression to the form
a*(yi + yj + ... - zk - zl + ...)(ym + yn + ... - zo - zp + ...) + ...
where:
- the outer product's coefficient a is always a positive integer
- the sign of y terms is always positive
- the sign of z terms is always negative
- within one factor, the number of y is equal to the number of z
- within one factor, there must be at least one y and one z
LP Variables:
n_max_factors: non-negative outer factor integers
n_max_factors * 2 * n_vars: binary selection variables, +1 for y, -1 for z
n_max_factors * n_addends: addends per factor, left variable, and right variable
'''
n_max_factors = 4 # Estimate; will need increasing for larger problems
def parse(expr: str) -> tuple[
sympy.Expr, # The parsed expression object
list[sympy.Symbol], # All free symbols, sorted
int, # number of variables
int, # number of y variables
int, # number of z variables
int, # number of unique addends
]:
# Parse and immediately expand to SOP form
input_expr = sympy.expand(sympy.parse_expr(expr))
# Unique tuple of all symbols sorted by name and index
symbols = sorted(input_expr.free_symbols, key=lambda s: s.name)
n_vars = len(symbols) # Number of unique polynomial variables
n_y = sum(s.name.startswith('y') for s in symbols) # Length of y vector
n_z = sum(s.name.startswith('z') for s in symbols) # Length of z vector
# For each combination of the variables. The number of unique multiplicative terms.
n_addends = n_vars * n_vars
return input_expr, symbols, n_vars, n_y, n_z, n_addends
def flatten_factors(term: sympy.Expr) -> Iterator[sympy.Expr]:
"""
Flatten a multiplicative Sympy expression into an iterator of terms
"""
if isinstance(term, (sympy.Symbol, sympy.Number)):
yield term
elif isinstance(term, sympy.Pow):
yield from tuple(flatten_factors(term.args[0])) * term.args[1]
elif isinstance(term, sympy.Mul):
for arg in term.args:
yield from flatten_factors(arg)
else:
raise NotImplementedError()
def iterate_equation(
input_expr: sympy.Expr,
symbols: list[sympy.Symbol],
n_vars: int,
n_y: int,
n_z: int,
) -> tuple[
np.ndarray, # equation coefficients, square, dense
np.ndarray, # flattened upper triangular equation coefficients
np.ndarray, # signs for each variable, one dimension
np.ndarray, # matrix of signs for each variable pair
np.ndarray, # upper triangular indices, first dimension
np.ndarray, # upper triangular indices, second dimension
]:
# Equation coefficient matrix, dense, square
equation_coefficients = np.zeros((n_vars, n_vars))
for term in input_expr.args:
args = tuple(flatten_factors(term))
coef = next(
(s for s in args if isinstance(s, sympy.Number)),
1,
)
syms = sorted(
(s for s in args if isinstance(s, sympy.Symbol)),
key=lambda s: s.name,
)
if len(syms) != 2:
raise NotImplementedError()
i = symbols.index(syms[0])
j = symbols.index(syms[1])
equation_coefficients[i, j] = coef
# Flattened version of the upper triangular half of the equation matrix
equation_triu = equation_coefficients[np.triu_indices(n_vars)]
# Vector of signs over one dimension of the equation matrix
equation_sign_vector = np.concatenate((
np.ones(n_y), -np.ones(n_z),
))
# Square outer product matrix of signs, dense, square
equation_signs = np.outer(equation_sign_vector, equation_sign_vector)
trii, trij = np.triu_indices(n_vars)
return equation_coefficients, equation_triu, equation_sign_vector, equation_signs, trii, trij
def make_bounds(
equation_signs: np.ndarray,
equation_sign_vector: np.ndarray,
n_addends: int,
) -> Bounds:
# Bounds for selectors
sel_lower = np.tile(equation_sign_vector*0.5-0.5, n_max_factors*2)
sel_upper = np.tile(equation_sign_vector*0.5+0.5, n_max_factors*2)
# Bounds for addends
add_lower = np.zeros((n_max_factors, n_addends))
add_upper = np.zeros((n_max_factors, n_addends))
add_lower[:, equation_signs.ravel() == -1] = -np.inf
add_upper[:, equation_signs.ravel() == +1] = +np.inf
return Bounds(
lb=np.concatenate((
np.zeros(n_max_factors), # outer factor coefficients
sel_lower , # selectors
add_lower.ravel(),
)),
ub=np.concatenate((
np.full(shape=n_max_factors, fill_value=np.inf),
sel_upper,
add_upper.ravel(),
)),
)
def make_constraints(
equation_triu: np.ndarray,
equation_sign_vector: np.ndarray,
equation_signs: np.ndarray,
trii: np.ndarray,
trij: np.ndarray,
n_vars: int,
n_addends: int,
) -> tuple[
tuple[LinearConstraint, ...],
int,
]:
# Per factor, there must be at least two variables selected
constraint_factorcount = LinearConstraint(
A=sp.hstack((
sp.csc_array((n_max_factors*2, n_max_factors)),
sp.kron(
A=sp.eye(n_max_factors*2),
B=equation_sign_vector,
),
sp.csc_array((n_max_factors*2, n_max_factors * n_addends)),
), format='csc'),
lb=2,
)
# Per factor, the number of y and z variables must be equal
constraint_factorparity = LinearConstraint(
A=sp.hstack((
sp.csc_array((n_max_factors*2, n_max_factors)),
sp.kron(
A=sp.eye(n_max_factors*2),
B=np.ones(n_vars),
),
sp.csc_array((n_max_factors*2, n_max_factors * n_addends)),
), format='csc'),
lb=0,
ub=0,
)
# The sum of each addend over all factors is pinned at the equation value
A_equation = sp.lil_array((len(equation_triu), n_addends))
vidx = np.arange(len(equation_triu))
A_equation[vidx, trii*n_vars + trij] = 1
A_equation[vidx, trij*n_vars + trii] = 1
constraint_equation = LinearConstraint(
A=sp.hstack((
sp.csc_array((len(equation_triu), n_max_factors)),
sp.csc_array((len(equation_triu), n_max_factors*2*n_vars)),
sp.kron(
A=np.ones(n_max_factors),
B=A_equation,
),
), format='csc'),
lb=equation_triu,
ub=equation_triu,
)
# Each addend will never exceed its corresponding outer factor
# positive addend < factor -> factor - addend > 0
# negative addend > -factor -> factor + addend > 0
constraint_addends_factor = LinearConstraint(
A=sp.hstack((
sp.kron(
A=sp.eye(n_max_factors),
B=np.ones((n_addends, 1)),
),
sp.csc_array((n_max_factors * n_addends, n_max_factors*2*n_vars)),
sp.kron(
A=-sp.eye(n_max_factors),
B=sp.diags(equation_signs.ravel()),
)
), format='csc'),
lb=0,
)
'''
If either of a given selector in a pair is zero, the corresponding addend must be zero
addend == factor * sela * selb * eqsign
addend*asign <= M*asel*vsign -> M*asel*vsign - addend*asign >= 0
addend*asign <= M*bsel*vsign -> M*bsel*vsign - addend*asign >= 0
'''
M = 10*n_max_factors*np.abs(equation_triu).max() # rough estimate; must exceed all factors
A_add = sp.vstack((
sp.diags(-equation_signs.ravel()),
sp.lil_array((n_addends, n_addends)),
), format='lil')
i, j = np.indices((n_vars, n_vars))
A_add[
n_vars*n_vars + n_vars*i + j,
n_vars*j + i,
] = -equation_signs
constraint_addends_zero = LinearConstraint(
A=sp.hstack((
sp.csc_array((n_max_factors*2*n_addends, n_max_factors)),
sp.kron(
A=sp.kron(
A=sp.eye(n_max_factors * 2),
B=sp.diags(M * equation_sign_vector),
),
B=np.ones((n_vars, 1)),
),
sp.kron(
A=sp.eye(n_max_factors),
B=A_add,
),
), format='csc'),
lb=0,
)
'''
If both selectors in a pair are non-zero, the corresponding addend must equal the factor
addend*asign >= factor - M(1 - sl*sgnl) - M(1 - sr*sgnr)
2M >= factor + Msl*sgnl + Msr*sgnr - asign*addend
'''
constraint_addends_nonzero = LinearConstraint(
A=sp.hstack((
sp.kron(
A=sp.eye(n_max_factors),
B=np.ones((n_addends, 1)),
),
sp.kron(
A=sp.eye(n_max_factors),
B=sp.hstack((
sp.kron(
A=sp.diags(M * equation_sign_vector),
B=np.ones((n_vars, 1)),
),
sp.kron(
A=np.ones((n_vars, 1)),
B=sp.diags(M * equation_sign_vector),
)
)),
),
sp.kron(
A=sp.eye(n_max_factors),
B=sp.diags(-equation_signs.ravel()),
),
), format='csc'),
ub=2*M,
)
constraints = (
constraint_factorcount,
constraint_factorparity,
constraint_equation,
constraint_addends_factor,
constraint_addends_zero,
constraint_addends_nonzero,
)
n_rows = sum(
con.A.shape[0] for con in constraints
)
return constraints, n_rows
def solve(
bounds: Bounds,
constraints: tuple[LinearConstraint, ...],
n_vars: int,
n_addends: int,
) -> np.ndarray:
result = milp(
# This is constraint programming; there is no minimization objective
c=np.zeros(n_max_factors + n_max_factors*2*n_vars + n_max_factors*n_addends),
integrality=np.concatenate((
np.ones(n_max_factors), # Coefficients are integral
np.ones(n_max_factors * 2 * n_vars), # selectors are integral
np.zeros(n_max_factors * n_addends), # addends are continuous
)),
bounds=bounds,
constraints=constraints,
)
if not result.success:
raise ValueError(result.message)
print(result.message)
return result.x
def triu_fold(square: np.ndarray) -> np.ndarray:
"""
Convert a matrix so that its upper triangular is the sum of the transpose and the lower triangular is zero
"""
output = square + square.T
trii, trij = np.tril_indices(n=square.shape[0], k=-1)
output[trii, trij] = 0
output[np.diag_indices_from(output)] = np.diag(square)
return output
def process_results(
result_x: np.ndarray,
symbols: list[sympy.Symbol],
n_vars: int,
) -> tuple[
np.ndarray, # outer factor coefficients
np.ndarray, # binary selectors
np.ndarray, # term addends
np.ndarray, # term addends, sum
np.ndarray, # term addends, upper triangular
sympy.Expr, # final output expression
]:
coeffs, selectors, addends = np.split(
result_x.round().astype(int),
(n_max_factors, n_max_factors + n_max_factors*2*n_vars),
)
selectors = selectors.reshape((n_max_factors, 2, n_vars))
addends = addends.reshape((n_max_factors, n_vars, n_vars))
addends_total = addends.sum(axis=0)
addends_triu = triu_fold(addends_total)
expr = sum(
coeff
* factor_selectors[0, :].dot(symbols)
* factor_selectors[1, :].dot(symbols)
for coeff, factor_selectors in zip(coeffs, selectors)
)
return coeffs, selectors, addends, addends_total, addends_triu, expr
def print_results(
eqn: str,
coeffs: np.ndarray,
selectors: np.ndarray,
addends: np.ndarray,
addends_triu: np.ndarray,
equation_coefficients: np.ndarray,
expr: sympy.Expr,
verbose: bool = False,
) -> None:
if verbose:
print(f'Selectors {selectors.shape}:')
print(selectors.reshape((-1, addends.shape[1])))
print(f'Term addends, output {addends.shape}:')
print(addends)
print(f'Term addends, upper triangular {addends_triu.shape}:')
print(addends_triu)
print(f'Term addends, input {equation_coefficients.shape}:')
print(equation_coefficients.round().astype(int))
print('Outer factor coefficients:', coeffs)
print('Input expression:', eqn)
print('Final expression:', expr)
def test(
equation_coefficients: np.ndarray,
coeffs: np.ndarray,
selectors: np.ndarray,
addends_full: np.ndarray,
addends_triu: np.ndarray,
) -> None:
assert np.allclose(equation_coefficients, addends_triu)
addends_reco = np.zeros_like(addends_triu)
for i_factor, (coef, (lhs, rhs), addend_slice) in enumerate(zip(coeffs, selectors, addends_full)):
product = coef*np.outer(lhs, rhs)
assert np.allclose(addend_slice, product)
addends_reco += product
addends_reco = triu_fold(addends_reco)
assert np.allclose(addends_reco, addends_triu)
def eval_constraints(
constraints: tuple[LinearConstraint, ...],
result_x: np.ndarray,
) -> None:
for constraint in constraints:
evald = constraint.A @ result_x
print('Evaluated constraint vector:')
if evald.size % 10:
print(evald)
else:
print(evald.reshape((-1, 10)))
print('Lower bound:', constraint.lb)
print(evald >= constraint.lb)
print('Upper bound:', constraint.ub)
print(evald <= constraint.ub)
print()
def graph_constraints(
eqn: str,
constraints: tuple[LinearConstraint, ...],
) -> plt.Figure:
all_constraints = sp.vstack([
con.A for con in constraints
]).T.toarray() != 0
fig, ax = plt.subplots()
fig: plt.Figure
ax: plt.Axes
fig.tight_layout()
ax.imshow(all_constraints, cmap='binary')
ax.set_title('Sparsity pattern (transpose) to solve\n' + eqn)
return fig
def run(eqn: str) -> None:
start = time.perf_counter()
input_expr, symbols, n_vars, n_y, n_z, n_addends = parse(eqn)
(
equation_coefficients, equation_triu, equation_sign_vector, equation_signs, trii, trij,
) = iterate_equation(input_expr, symbols, n_vars, n_y, n_z)
bounds = make_bounds(equation_signs, equation_sign_vector, n_addends)
constraints, n_rows = make_constraints(
equation_triu, equation_sign_vector, equation_signs, trii, trij, n_vars, n_addends,
)
setup = time.perf_counter()
result = solve(bounds, constraints, n_vars, n_addends)
solved = time.perf_counter()
coeffs, selectors, addends_full, addends_total, addends_triu, expr = process_results(result, symbols, n_vars)
print(f'Problem size: {n_rows}x{result.size}')
print(f'Setup time: {(setup - start)*1e3:.1f} ms')
print(f'Solve time: {(solved - setup)*1e3:.1f} ms')
print_results(
eqn, coeffs, selectors, addends_total, addends_triu, equation_coefficients, expr,
)
print()
graph_constraints(eqn, constraints)
# eval_constraints(constraints, result)
test(equation_coefficients, coeffs, selectors, addends_full, addends_triu)
def main() -> None:
# In-question first example
run(
'(y_1 - z_1)*(y_3 - z_2) '
'+ (y_1 + y_2 - z_1 - z_2)*(y_1 - z_3) '
'- (y_1 - z_2)*(y_1 - z_3)'
)
# Example hard-coded in OP script
run(
'(y_1-z_5)*(y_5-z_1) '
'+ (y_3 + y_7 - z_1 - z_2)*(y_2 + y_3 - z_7 - z_8) '
'- (y_3 - z_1)*(y_2 - z_7)'
)
# From chat
run(
'2*y_1**2 + 3*y_1*y_3 + 3*y_1*y_4 + 2*y_1*y_5 + 3*y_1*y_6 + 3*y_1*y_7 - 4*y_1*z_1 - 4*y_1*z_2 - 3*y_1*z_3 - 3*y_1*z_4 - 2*y_1*z_5 - 2*y_1*z_6 + y_3**2 + 2*y_3*y_4 + y_3*y_5 + 2*y_3*y_6 + 2*y_3*y_7 - 3*y_3*z_1 - 3*y_3*z_2 - 2*y_3*z_3 - 2*y_3*z_4 - y_3*z_5 - y_3*z_6 + y_4**2 + y_4*y_5 +'
'2*y_4*y_6 + 2*y_4*y_7 - 3*y_4*z_1 - 3*y_4*z_2 - 2*y_4*z_3 - 2*y_4*z_4 - y_4*z_5 - y_4*z_6 + y_5*y_6 + y_5*y_7 - 2*y_5*z_1 - 2*y_5*z_2 - y_5*z_3 - y_5*z_4 + y_6**2 + 2*y_6*y_7 - 3*y_6*z_1 - 3*y_6*z_2 - 2*y_6*z_3 - 2*y_6*z_4 - y_6*z_5 - y_6*z_6 + y_7**2 - 3*y_7*z_1 - 3*y_7*z_2 - 2*y_7*z_3 - 2*y_7*z_4 - y_7*z_5'
'- y_7*z_6 + 2*z_1**2 + 4*z_1*z_2 + 3*z_1*z_3 + 3*z_1*z_4 + 2*z_1*z_5 + 2*z_1*z_6 + 2*z_2**2 + 3*z_2*z_3 + 3*z_2*z_4 + 2*z_2*z_5 + 2*z_2*z_6 + z_3**2 + 2*z_3*z_4 + z_3*z_5 + z_3*z_6 + z_4**2 + z_4*z_5 + z_4*z_6'
'+' # Missing from pasted expression, inferred from context
'(y_1 - z_1)*(y_7 - z_2) + (y_1 - z_1)*(y_1 - z_3) + (y_1 - z_1)*(y_6 - z_4) + 2*(y_1 - z_1)*(y_4 - z_6) + (y_3 - z_1)*(y_1 - z_2) + (y_3 - z_1)*(y_3 - z_4) + (y_4 - z_1)*(y_7 - z_1) + (y_4 - z_1)*(y_1 - z_2) + (y_4 - z_1)*(y_6 - z_2) + (y_4 - z_1)*(y_3 - z_4) + (y_6 - z_1)*(y_3 - z_3)'
'+ (y_6 - z_1)*(y_6 - z_3) + (y_6 - z_1)*(y_5 - z_5) + (y_7 - z_1)*(y_5 - z_1) + (y_7 - z_1)*(y_1 - z_5) + (y_1 - z_2)*(y_7 - z_4) + 2*(y_1 - z_2)*(y_5 - z_5) + (y_3 - z_2)*(y_6 - z_6) + (y_4 - z_2)*(y_4 - z_4) + (y_7 - z_2)*(y_3 - z_2) + (y_7 - z_2)*(y_6 - z_2) + (y_7 - z_2)*(y_6 - z_3) + (y_7 - z_2)*(y_3 - z_4) + 2*(y_1 - z_3)*(y_3 - z_2)'
'+ (y_1 - z_3)*(y_1 - z_4) + (y_4 - z_3)*(y_5 - z_3) + (y_7 - z_3)*(y_7 - z_6) + (y_1 - z_4)*(y_6 - z_3) + (y_4 - z_4)*(y_6 - z_5) + (y_4 - z_4)*(y_7 - z_6) + (y_5 - z_4)*(y_3 - z_4) + (y_4 - z_5)*(y_3 - z_3) + (y_1 - z_6)*(y_6 - z_2)'
)
# Last example in question; has cubics so cannot be solved in pair-factor form
'''
run(
'(y_3 - z_1)*((y_1 + y_3 + y_4 + y_6 - z_1 - z_3 - z_5 - z_6)*(((y_6 - z_1)*(y_4 - z_1)*(y_1 + y_3 - z_1 - z_2) '
'+ (y_3 - z_2)*(y_1 - z_2)*(y_4 + y_6 - z_1 - z_2))*((y_4 - z_3)*(y_6 - z_3) + (y_1 - z_4)*(y_3 - z_4) '
'+ (y_4 + y_6 - z_1 - z_3)*(y_1 + y_3 - z_1 - z_4)) '
'- (y_6 - z_1)*(y_3 - z_1)*(y_1 - z_1)*(y_4 - z_1)*(y_1 + y_3 + y_4 + y_6 - z_1 - z_2 - z_3 - z_4)) '
'- (y_6 - z_1)*(y_3 - z_1)*(y_1 - z_1)*(y_4 - z_1)*((y_4 - z_3)*(y_6 - z_3) + (y_1 - z_4)*(y_3 - z_4) '
'+ (y_1 + y_3 - z_2 - z_4)*(y_4 + y_6 - z_2 - z_3)) - (((y_6 - z_1)*(y_4 - z_1)*(y_1 + y_3 - z_1 - z_3) '
'+ (y_1 - z_3)*(y_3 - z_3)*(y_4 + y_6 - z_1 - z_3))*((y_6 - z_1)*(y_4 - z_1)*(y_1 + y_3 - z_1 - z_2) '
'+ (y_3 - z_2)*(y_1 - z_2)*(y_4 + y_6 - z_1 - z_2)) '
'- (y_6 - z_1)*(y_3 - z_1)*(y_1 - z_1)*(y_4 - z_1)*((y_4 - z_2)*(y_6 - z_2) + (y_1 - z_3)*(y_3 - z_3) '
'+ (y_4 + y_6 - z_1 - z_2)*(y_1 + y_3 - z_1 - z_3)))) '
'- (-(y_6 - z_1)*(y_3 - z_1)*(y_1 - z_1)*(y_4 - z_1)*((y_4 - z_2)*(y_6 - z_2)*(y_1 + y_3 - z_2 - z_3) '
'+ (y_1 - z_3)*(y_3 - z_3)*(y_4 + y_6 - z_2 - z_3)) '
'+ (y_6 - z_1)*(y_3 - z_1)*(y_1 - z_1)*(y_4 - z_1)*(y_1 + y_3 + y_4 + y_6 - z_2 - z_3 - z_5 - z_6)*((y_4 - z_3)*(y_6 - z_3) '
'+ (y_1 - z_4)*(y_3 - z_4) + (y_1 + y_3 - z_2 - z_4)*(y_4 + y_6 - z_2 - z_3)))'
)
'''
plt.show()
if __name__ == '__main__':
main()
Optimization terminated successfully. (HiGHS Status 7: Optimal)
Problem size: 613x196
Setup time: 62.9 ms
Solve time: 33.4 ms
Input expression: (y_1 - z_1)*(y_3 - z_2) + (y_1 + y_2 - z_1 - z_2)*(y_1 - z_3) - (y_1 - z_2)*(y_1 - z_3)
Final expression: (y_1 - z_1)*(y_3 - z_2) + (y_1 - z_3)*(y_2 - z_1)
Optimization terminated successfully. (HiGHS Status 7: Optimal)
Problem size: 1671x484
Setup time: 29.5 ms
Solve time: 21.3 ms
Input expression: (y_1-z_5)*(y_5-z_1) + (y_3 + y_7 - z_1 - z_2)*(y_2 + y_3 - z_7 - z_8) - (y_3 - z_1)*(y_2 - z_7)
Final expression: (y_1 - z_5)*(y_5 - z_1) + (y_2 - z_7)*(y_7 - z_2) + (y_3 - z_1)*(y_3 - z_8) + (y_3 - z_8)*(y_7 - z_2)
Optimization terminated successfully. (HiGHS Status 7: Optimal)
Problem size: 2398x676
Setup time: 142.7 ms
Solve time: 939.3 ms
Input expression: 2*y_1**2 + 3*y_1*y_3 + 3*y_1*y_4 + 2*y_1*y_5 + 3*y_1*y_6 + 3*y_1*y_7 - 4*y_1*z_1 - 4*y_1*z_2 - 3*y_1*z_3 - 3*y_1*z_4 - 2*y_1*z_5 - 2*y_1*z_6 + y_3**2 + 2*y_3*y_4 + y_3*y_5 + 2*y_3*y_6 + 2*y_3*y_7 - 3*y_3*z_1 - 3*y_3*z_2 - 2*y_3*z_3 - 2*y_3*z_4 - y_3*z_5 - y_3*z_6 + y_4**2 + y_4*y_5 +2*y_4*y_6 + 2*y_4*y_7 - 3*y_4*z_1 - 3*y_4*z_2 - 2*y_4*z_3 - 2*y_4*z_4 - y_4*z_5 - y_4*z_6 + y_5*y_6 + y_5*y_7 - 2*y_5*z_1 - 2*y_5*z_2 - y_5*z_3 - y_5*z_4 + y_6**2 + 2*y_6*y_7 - 3*y_6*z_1 - 3*y_6*z_2 - 2*y_6*z_3 - 2*y_6*z_4 - y_6*z_5 - y_6*z_6 + y_7**2 - 3*y_7*z_1 - 3*y_7*z_2 - 2*y_7*z_3 - 2*y_7*z_4 - y_7*z_5- y_7*z_6 + 2*z_1**2 + 4*z_1*z_2 + 3*z_1*z_3 + 3*z_1*z_4 + 2*z_1*z_5 + 2*z_1*z_6 + 2*z_2**2 + 3*z_2*z_3 + 3*z_2*z_4 + 2*z_2*z_5 + 2*z_2*z_6 + z_3**2 + 2*z_3*z_4 + z_3*z_5 + z_3*z_6 + z_4**2 + z_4*z_5 + z_4*z_6+(y_1 - z_1)*(y_7 - z_2) + (y_1 - z_1)*(y_1 - z_3) + (y_1 - z_1)*(y_6 - z_4) + 2*(y_1 - z_1)*(y_4 - z_6) + (y_3 - z_1)*(y_1 - z_2) + (y_3 - z_1)*(y_3 - z_4) + (y_4 - z_1)*(y_7 - z_1) + (y_4 - z_1)*(y_1 - z_2) + (y_4 - z_1)*(y_6 - z_2) + (y_4 - z_1)*(y_3 - z_4) + (y_6 - z_1)*(y_3 - z_3)+ (y_6 - z_1)*(y_6 - z_3) + (y_6 - z_1)*(y_5 - z_5) + (y_7 - z_1)*(y_5 - z_1) + (y_7 - z_1)*(y_1 - z_5) + (y_1 - z_2)*(y_7 - z_4) + 2*(y_1 - z_2)*(y_5 - z_5) + (y_3 - z_2)*(y_6 - z_6) + (y_4 - z_2)*(y_4 - z_4) + (y_7 - z_2)*(y_3 - z_2) + (y_7 - z_2)*(y_6 - z_2) + (y_7 - z_2)*(y_6 - z_3) + (y_7 - z_2)*(y_3 - z_4) + 2*(y_1 - z_3)*(y_3 - z_2)+ (y_1 - z_3)*(y_1 - z_4) + (y_4 - z_3)*(y_5 - z_3) + (y_7 - z_3)*(y_7 - z_6) + (y_1 - z_4)*(y_6 - z_3) + (y_4 - z_4)*(y_6 - z_5) + (y_4 - z_4)*(y_7 - z_6) + (y_5 - z_4)*(y_3 - z_4) + (y_4 - z_5)*(y_3 - z_3) + (y_1 - z_6)*(y_6 - z_2)
Final expression: (2*y_1 + 2*y_7 - 2*z_1 - 2*z_2)*(y_1 + y_3 + y_4 + y_5 + y_6 + y_7 - z_1 - z_2 - z_3 - z_4 - z_5 - z_6) + (y_1 + y_3 + y_4 + y_6 - z_1 - z_2 - z_3 - z_4)*(2*y_1 + 2*y_3 + 2*y_4 + 2*y_5 + 2*y_6 + 2*y_7 - 2*z_1 - 2*z_2 - 2*z_3 - 2*z_4 - 2*z_5 - 2*z_6)
-
\$\begingroup\$ Thanks for your answer. I haven't looked at it fully yet, but I'm hoping it generalizes to polynomials of higher degree. I'll take a look. \$\endgroup\$Matt Samuel– Matt Samuel2023年12月08日 00:03:32 +00:00Commented Dec 8, 2023 at 0:03
-
\$\begingroup\$ It will generalize to higher degrees (with code modifications), but it will scale very poorly. With an n-degree polynomial, every top-level term will need up to n factors, requiring n_vars**n selectors. \$\endgroup\$Reinderien– Reinderien2023年12月08日 02:29:27 +00:00Commented Dec 8, 2023 at 2:29
-
\$\begingroup\$ So the scaling is what is needed. The 5th degree polynomial example is more typical, and we're currently running degree 7 in our test runs. \$\endgroup\$Matt Samuel– Matt Samuel2023年12月08日 10:32:15 +00:00Commented Dec 8, 2023 at 10:32
-
\$\begingroup\$ Maybe a combination will solve that issue. \$\endgroup\$Matt Samuel– Matt Samuel2023年12月08日 12:04:14 +00:00Commented Dec 8, 2023 at 12:04
-
1\$\begingroup\$ I have an idea for an arbitrary-degree algorithm but it will take another while to implement. I can describe it in the chat. \$\endgroup\$Reinderien– Reinderien2023年12月08日 12:37:20 +00:00Commented Dec 8, 2023 at 12:37
Explore related questions
See similar questions with these tags.
expand
which will fully expand the result in terms of monomials with integer coefficients. The variables don't have anything substituted, these are abstract polynomials. It arbitrarily assigned indexes to the monomials and populates "monom_to_vec" which is a lookup translating the monomial to the index. Then, given an expanded polynomial, poly_to_vec will fill in those coordinates which are the integer coefficients of the monomials in a numpy array. So the value at the index in the array is the coefficient of the monomial. Is this clearer? \$\endgroup\$(y_5 - z_1)
with nonnegative coefficients. So(y_5 - z_1)*(y_1-z_2)+(y_5-z_1)*(y_3-z_5)
would be good, and the reason this is so hard is because you could also express the same polynomial as(y_5 - z_1)*(y_1-z_5)+(y_5-z_1)*(y_3-z_2)
. We're not looking for a unique expression or factorized, just positive in terms of terms like that. The assumption is that this is always possible (if it's not in a certain case, we don't care about that case). \$\endgroup\$(y_5 - z_1)*(y_1-z_2)+(y_5-z_1)*(y_3-z_5)
as(y_5 - z_1)*(y_1-z_2+y_3-z_5)
? \$\endgroup\$