-
Notifications
You must be signed in to change notification settings - Fork 6
[Rule] MaximumAcyclicAgreementForest to ILP #1048
Description
Companion rule for the model proposal in #1046.
Source / Target
- Source:
MaximumAcyclicAgreementForest(default variant) — see [Model] MaximumAcyclicAgreementForest #1046 - Target:
ILP/i32
Motivation
A direct integer-linear-programming formulation for MAAF gives an immediate solver path via the existing ILP solver backend, independent of the structural MAAF → MinimumFeedbackVertexSet route. The formulation is compact and natural: cut-indicator variables per tree edge plus block-ancestor indicator variables plus a topological-rank constraint enforcing acyclicity.
Reference
Wu, "Close lower and upper bounds for the minimum reticulate network of multiple phylogenetic trees", Bioinformatics 26(12), 2010. https://doi.org/10.1093/bioinformatics/btq198
See also Chen, Wang, "Algorithms for reticulate networks of multiple phylogenetic trees", IEEE/ACM TCBB 9(2), 2012 for a related ILP.
Reduction Algorithm
Let
-
Cut-indicator variables. For each non-root edge
$e \in E_1 \cup E_2$ , introduce a binary variable$y_e \in {0,1}$ :$y_e = 1$ iff$e$ is cut (removed from$T_t$ ). -
Block-ancestor variables. Let
$B = {1,\dots,n}$ be candidate block indices. For each ordered pair$(i, j) \in B \times B$ with$i \neq j$ , introduce a binary variable$z_{ij} \in {0,1}$ indicating the arc$i \to j$ in the block ancestor digraph. -
Rank variables. For each
$i \in B$ , introduce an integer variable$r_i \in {0, 1, \dots, n-1}$ . -
Partition-agreement constraints. For each pair of leaves
$(\ell, m) \in X \times X$ , let$P_t(\ell, m)$ be the unique edge path from$\ell$ to$m$ in$T_t$ . The constraint
$$\sum_{e \in P_1(\ell,m)} y_e ;=; 0 ;;\Longleftrightarrow;; \sum_{e \in P_2(\ell,m)} y_e ;=; 0$$
(linearized via auxiliary indicators$s_{\ell,m}^{(t)} \in {0,1}$ with$s_{\ell,m}^{(t)} \leq \sum_{e \in P_t(\ell,m)} y_e$ and$s_{\ell,m}^{(t)} \geq y_e$ for every$e \in P_t(\ell,m)$ , plus$s_{\ell,m}^{(1)} = s_{\ell,m}^{(2)}$ ) enforces that$\ell$ and$m$ lie in the same block of$F$ iff neither tree cuts their connecting path. -
Acyclicity constraints. For each
$(i, j) \in B \times B$ ,$i \neq j$ :$r_i - r_j + n \cdot z_{ij} \leq n - 1$ (standard Miller–Tucker–Zemlin-style rank constraint, infeasible precisely when the$z$ -digraph contains a cycle). -
Objective.
$$\text{minimize} ;; 1 + \sum_{e \in E_1} y_e.$$
(Each cut in$T_1$ produces one additional block; by the partition-agreement constraints the number of$T_1$ -cuts equals the number of$T_2$ -cuts and equals$|F| - 1$ .)
Return the resulting ILP as the target instance.
Size Overhead
Target size fields from pred show ILP → size_fields = [num_constraints, num_variables].
| Field | Formula |
|---|---|
num_variables |
4 * num_leaves + num_leaves ^ 2 |
num_constraints |
num_leaves ^ 2 |
(The
Validation Method
Closed-loop test: construct the source instance, apply the reduction, solve the target ILP (brute-force or ILP solver), lift back to a partition
Example (fully worked)
Source.
Construction.
Target solved. Optimal objective =
Lifted solution.
BibTeX
@article{Wu2010, author = {Wu, Yufeng}, title = {Close lower and upper bounds for the minimum reticulate network of multiple phylogenetic trees}, journal = {Bioinformatics}, volume = {26}, number = {12}, pages = {i140--i148}, year = {2010}, doi = {10.1093/bioinformatics/btq198}, }