Input: Some fixed $k>1$, vectors $x_i,y_i\in\mathbb R^k$ for 1ドル\le i\le n$.
Output: A subset $I\subset\{1,\dots,n\}$ of maximal size such that $(x_i-x_j)^T(y_i-y_j) \ge 0$ for all $i,j\in I$.
Question: Can this be computed in polynomial time in $n$?
Remarks:
- For $k=1$ this is equivalent to the problem of finding a longest increasing subsequence. Indeed, assuming that $x_1<\dots<x_n$, we search for a longest increasing subsequence of $y_1,\dots,y_n$. Such a subsequence can be found in $O(n\log n)$.
- The problem is related to the notion of a monotone operator $F:\mathbb R^k\to\mathbb R^k$. Monotonicity of $F$ means that $(x_1-x_2)^T(F(x_1)-F(x_2))\ge 0$ for all $x_1,x_2\in\mathbb R^k$.
- The problem can be formulated as a search for a maximum clique in the graph $G=(V,E)$ with vertices $V=\{1,\dots,n\}$ and edges $E = \{(i,j) \;:\; (x_i-x_j)^T(y_i-y_j)\ge 0 \}$. The general clique problem is NP-complete. However, it might be possible to exploit the special structure of $E$ (as shown in the first remark, this is possible when $k=1$).
I would appreciate any hint or comment on this problem.
-
1$\begingroup$ Do you want a maximal clique, or a maximum clique? In the former, we essentially cannot extend a clique to a bigger clique, which is easily found: choose an arbitrary vertex and add as many adjacent vertices such that you maintain a clique. Finding a maximum clique is $\mathsf{NP}$-complete because we are looking for the largest clique out of all possible cliques. $\endgroup$STanja– STanja2020年10月03日 19:28:33 +00:00Commented Oct 3, 2020 at 19:28
-
$\begingroup$ You are right, the wording of the remark was ambigous. I have now edited it to say "maximum" instead of "maximal". However, I think the problem statement was precise: We search a subst I "of maximal size". $\endgroup$Klaas– Klaas2020年10月03日 19:41:21 +00:00Commented Oct 3, 2020 at 19:41
-
1$\begingroup$ Just a small comment: if $k$ can depend on $n,ドル then it's NP-hard, since we can build a reduction from maximum clique: $k = n^2,ドル and $i * n + j$-th coordinate is 0ドル$ for all vertices except $i$ and $j$. If there is an edge between them, then the coordinate is 0ドル$ for both; otherwise, it's 1ドル$ fr one and $-1$ for another. $\endgroup$user114966– user1149662020年10月03日 21:49:17 +00:00Commented Oct 3, 2020 at 21:49
-
$\begingroup$ That's correct, I also thought about adding this as a further remark :-) Maybe asking the following question can lead to some insights: Let's suppose $k=2$. Can all possible graphs (i. e. all possible sets $E$) be realized by choosing $x_i, y_i$ appropriately? If not, what are the obstructions? $\endgroup$Klaas– Klaas2020年10月04日 10:39:43 +00:00Commented Oct 4, 2020 at 10:39
-
$\begingroup$ If you want k to be fixed, traditionally it shouldn't be listed as an input. It should instead be a fixed parameter of the problem. I might remove it from your list of inputs for clarity. $\endgroup$Zachary Vance– Zachary Vance2020年10月07日 19:29:16 +00:00Commented Oct 7, 2020 at 19:29
2 Answers 2
I stared at this a while now with $k=2$ and got nowhere.
- Let's take on $k=2$. To avoid confusion with x and y coordinates, I'll call the vectors $a_i, b_i$ instead.
- Lets define $D(i,j)=(a_j-a_i)^T(b_j-b_i) \ge 0$, and $R(i,j)$ to be the boolean relation which holds when $D(i,j)\ge0$.
- For $k=1$, R was a total ordering, which is what let us find increasing subsequences. For $k=2$, it's not even a partial ordering, which makes me think any solution will not resemble the $k=1$ case.
- Take $x_1,x_2,x_3=(0,0),(0,1),(1,0)$; $y_1, y_2, y_3=(0,0), (-6, 1), (-5,-5)$
- $V(1,2)=1$; $V(2,3)=1$; $V(1,3)=-10$
I'd suggest checking if this can be reduced to the longest common subsequence problem (I suspect no, because of the lack of ordering) or to/from some form of linear programming.
To nitpick here, coordinates should really be taken from $\mathbb Z$ bounded by some reasonable (say, polynomial) function for CS problems, not "$\mathbb R$". I suspect you can show it's NP-complete otherwise, say by reduction to the longest common subsequence problem, but in some way that's not really related to whatever your real problem is. That said, I didn't succeed at actually doing so.
Construction of this problem as follows will incur $O(kn^2)$. Analysis of the time complexity for the second (ILP) stage is more complex and I don't show it here, but it is only a function in $n$ and not $k$.
You can:
- Calculate $n(n-1)/2$ vector differences and dot products
- This will produce an $n(n-1)/2$ predicate vector $p$, then negated by $q=1-p$
- Solve an ILP with $n$ rows and $n$ columns (each for a binary assignment)
- Maximise the assignment sum.
For an example,
import time
import numpy as np
from scipy.optimize import Bounds, milp, LinearConstraint
k = 3 # dimensionality
n = 20 # number of vectors
rand = np.random.default_rng(seed=0)
xy = rand.integers(low=-9, high=+9, size=(2, n, k))
print(f'Trial size: n={n} k={k}')
# The product matrix diagonal will always be 0 and the matrix will always be symmetric.
# Calculate upper-triangular indices of the Cartesian product.
ui, uj = np.triu_indices(n=n, k=1)
# Difference over the Cartesian product between each
# row of x and each row of y, upper triangular only
dx, dy = xy[:, ui, :] - xy[:, uj, :]
# Tensor contraction for dot product on final (k) dimension to a vector of n(n-1)/2
products = np.einsum('ij,ij->i', dx, dy)
# n(n-1)/2 monotone predicate vector, inverted to an antipredicate
antipredicate = products < 0
'''
Choose the largest possible index subset in 1..n such that
every i,j combination fulfills the predicate.
Variables: n binary assignments.
'''
cost = np.full(shape=n, fill_value=-1) # Maximise number of assignments
integrality = np.ones(shape=n, dtype=np.uint8) # All integral
bounds = Bounds(lb=0, ub=1) # All binary
'''
Constraint matrix: for assignment vector a, for each row index i,
given the predicate p and antipredicate q=1-p,
qi1a1 + qi2a2... + ai(sumqi) + ...qinan <= sumqi
'''
A = np.zeros(shape=(n, n), dtype=np.float64)
A[ui, uj] = antipredicate
A[uj, ui] = antipredicate # fill transpose
ij = np.arange(n)
diagsums = A.sum(axis=1)
A[ij, ij] += diagsums
constraint = LinearConstraint(A=A, ub=diagsums)
t0 = time.perf_counter()
result = milp(c=cost, integrality=integrality, bounds=bounds, constraints=constraint)
t1 = time.perf_counter()
assert result.success, result.message
print(result.message)
print(f'Solution time: {1e3*(t1-t0):.1f} ms')
assign = result.x > 0.5
print('Subset assignments:')
print(assign.astype(int))
print('Monotone test:')
idx = np.flatnonzero(assign)
for i in idx:
x0, y0 = xy[:, i, :]
for j in idx:
if j > i:
x1, y1 = xy[:, j, :]
product = (x1 - x0).dot(y1 - y0)
print(product, end=' ')
assert product >= 0, 'Failed monotone test'
print()
Trial size: n=20 k=3
Optimization terminated successfully. (HiGHS Status 7: Optimal)
Solution time: 10.8 ms
Subset assignments:
[1 0 0 0 1 0 0 1 1 1 0 0 1 0 0 0 0 0 1 0]
Monotone test:
52 60 16 117 282 11 126 24 44 145 87 40 91 159 92 28 165 27 95 109 352
Explore related questions
See similar questions with these tags.