Problem Setup
Let's say we have a bipartite graph $(U, V)$ where the nodes of the graph are labelled with a positive integer and each edge $(u_i, v_i)$ has $u_i < v_i$. Our goal is to find a maximum matching $L$, with the smallest total weight of the edges out of all maximum matchings. The conditions can be defined as follows:
- Matching condition: The maximum degree of a node $u_i$ or $v_i$ is 1 (it can be 0 as well).
- Optimality condition 1: The required matching has the highest cardinality among all valid matchings.
- Optimality condition 2: An edge weight is defined as $W(u_i, v_i) = (v_i - u_i)$. The required matching has the lowest total edge weight among all the other matchings of the same cardinality.
So essentially, our first order goal is to match maximum pairs and second order goal is to ensure that a certain $v$ is matched to the best $u$ and vice-versa.
Here's what I've tried till now:
Approach 1 (Incorrect)
I tried to approach this as an assignment problem using the Hungarian algorithm. In this, I assume that the u's are the workers and the v's are the jobs and the cost of making a worker $u_i$ do a job $v_i$ is $(v_i - u_i)$ if $(u_i, v_i) \in L$ otherwise it is $\infty$. Let's assume we have the graph as an edge list, where each edge is the row of a dataframe with column p corresponding to $u$ and q corresponding to $v$. So, the implementation for this would be:
from scipy.optimize import linear_sum_assignment
import polars as pl
import numpy as np
# Assume the df is initialized with columns p and q as above
df = df.with_columns(weight=(pl.col("q")-pl.col("p")))
weights_df = df.pivot(
on='q',
index='p',
values="weight",
)
weights_arr = np.nan_to_num(weights_df.to_numpy()[:, 1:], nan=np.inf)
row_ind, col_ind = linear_sum_assignment(weights_arr)
matched_df = pl.DataFrame(
data={
"p": weights_df[row_ind, "p"],
"q": [
int(weights_df.columns[i + 1]) for i in col_ind
],
"is_matched": True,
},
)
df = df.join(
matched_df,
on=['p', 'q']
how="left",
coalesce=True,
).with_columns(pl.col("is_matched").fill_null(False))
This approach is wrong if let's say $E = [(1, 10), (1, 11)]$. In this case, no worker can be matched to every job and the scipy routine throws an error as the optimization objective is always $\infty$.
Approach 2 (Inefficient/Unsolvable)
Here, I tried to approach the problem through an integer programming lens. Let's say we introduce a label $l_i \in \{0, 1\}$ for each tuple, or equivalently a column l in the edge list which is True if an edge is present in $L$ otherwise False and w for the weights. Let us introduce two 'infinity-like' constants $E_1 >>> E_2 >>>$ (sum of all the weights) which will help us to reduce our 2-step objective to a single quadratic objective function:
$$
\min_{l \in \{0, 1\}^n}f(l) = E_1\sum_{(i, j) \in S_1}l_il_j + E_2\sum_{i}(1 - l_i) + \sum_{i} l_iw_i \\
S = \{(i, j) | (u_i = u_j) \lor (v_i = v_j) \}
$$
The first term in the objective ensures that 'illegal' pairs (pairs in which constraint 1 is violated, more than one matched 'v' for a 'u' or more than one matched 'u' for a 'v') never occur because their penalty is unforgivably high. The second term in the objective ensures maximum cardinality of a labeling. The third term ensures that a suboptimal pairing is not chosen where a closer pair can be made.
The problem with this approach is that solvers for such kind of an optimization problem are too slow and it feels like an overkill for something for which I hope an elegant flow-cut/matching/combinatorial optimization based solution exists.
Apologies if the notation is a little messy. How do I tackle this problem and prove optimality?
-
$\begingroup$ I have edited the question to make it closer to the notation followed by graph theory. Hope this makes it crystal clear. Thanks for your suggestions @D.W. $\endgroup$kaddy– kaddy2025年10月23日 07:11:43 +00:00Commented Oct 23 at 7:11
1 Answer 1
Let $C$ be a sufficiently large constant. Use $C-(v_i-u_i)$ as the edge weight, instead of $v_i-u_i$. Then find a maximum-weight matching in the resulting graph. This will be the solution to your problem.
It suffices to take $C$ larger than $|V| \times \max \{v_i : i \in V\}$. Then any matching containing $k$ edges will always have greater total weight than any matching containing $k-1$ edges, so the maximum matching in this graph is a maximum cardinality matching. Moreover, by construction of the edge weights, the maximum matching in this graph also resolves ties by minimizing the sum of the original edge weights.
In case one cares only about the implementation, one can use NetworkX's maximum-weighted-matching implementation with the maxcardinality parameter set to True. This would let you obtain a matching directly without modifying the weights.
Explore related questions
See similar questions with these tags.