I have a set of compute tasks I want to schedule. These tasks have dependencies, and a task may not be executed until all of its dependencies have been executed.
The problem can be represented as a directed acyclic graph:
The current scheduler ensures correctness and is capable of culling unneeded tasks such as k in the previous graph (assuming the end goal is i).
I am struggling to find an equivalent algorithm that also maximizes the number of tasks running in parallel (tasks on the same line below may be executed in any order):
a, b, j
c
d
e, f, h, g
i
Flattened form, where the tasks between parentheses may be executed in any order and represent parallel tasks:
(a, b, j), (c), (d), (e, f, h, g), (i)
The idea here is that I want to fill the compute engine with as much work as possible.
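For concreteness, the result I am after could be represented in Python as a list of sets, one set per parallel step (only the labels from the example above; the graph structure itself is omitted):

```python
# One set per parallel step; tasks within a set may run in any order.
schedule = [{"a", "b", "j"}, {"c"}, {"d"}, {"e", "f", "h", "g"}, {"i"}]
```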
1 Answer
From your initial DAG $G = (V, E)$, you start by reversing all edges to build a new DAG $G' = (V, E')$. Then do a BFS from $i$ in $G'$ to remove any unneeded tasks (the nodes not reached by the BFS). This step is $O(N)$ with $N = |V| + |E|$.
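A minimal Python sketch of this pruning step, under the assumption (not stated in the post) that the graph is a dict mapping every task to the set of tasks that depend on it, i.e. edges go from a dependency to its dependents:

```python
from collections import deque

def prune(graph, goal):
    # Reverse all edges: reversed_graph[v] is the set of direct dependencies of v.
    reversed_graph = {v: set() for v in graph}
    for dependency, dependents in graph.items():
        for dependent in dependents:
            reversed_graph[dependent].add(dependency)

    # BFS from the goal over the reversed edges; every task reached is needed.
    needed = {goal}
    queue = deque([goal])
    while queue:
        task = queue.popleft()
        for dep in reversed_graph[task]:
            if dep not in needed:
                needed.add(dep)
                queue.append(dep)

    # Keep only the needed tasks (and the edges between them) in the forward graph.
    return {u: {v for v in graph[u] if v in needed} for u in needed}
```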
From $G$, build the number of requirements of each task, $R(v)$ (its number of incoming edges in $G$), and at the same time the list $L$ of the tasks with no requirements. This also takes $O(N)$.
Then you iterate on this pseudo-algorithm:
    while (the root i is not executed)
        L' = empty list
        for v in L
            for each node v' with an edge in G from v to v'   (the tasks that depend directly on v)
                R(v') -= 1
                if R(v') == 0 then add v' to L'
        execute all of L in one parallel step
        L = L'
This is $O(N)$ overall, since each task is executed once and each edge is processed once.
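For reference, a minimal Python sketch of this loop, assuming the same dict-of-sets representation as above (edges from a dependency to its dependents); it is essentially Kahn's topological sort emitting one list per parallel step, and it returns the schedule rather than actually executing the tasks:

```python
def parallel_schedule(graph):
    # R[v] = number of unmet requirements (incoming edges) of task v.
    R = {v: 0 for v in graph}
    for dependents in graph.values():
        for v in dependents:
            R[v] += 1

    # L = tasks that are ready to run right now (no unmet requirements).
    L = [v for v, r in R.items() if r == 0]
    steps = []
    while L:
        steps.append(L)              # all of L can run in one parallel step
        next_ready = []
        for v in L:
            for w in graph[v]:       # tasks that directly depend on v
                R[w] -= 1
                if R[w] == 0:
                    next_ready.append(w)
        L = next_ready
    return steps
```

On a graph matching the question, `parallel_schedule(prune(graph, "i"))` would produce groups like `[['a', 'b', 'j'], ['c'], ['d'], ['e', 'f', 'h', 'g'], ['i']]`, with the order inside each group depending on iteration order.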
Thank you! I have added a Python implementation of this algorithm. – user10655827, Mar 19, 2019
No, j should be executed as soon as possible. The idea is to fill the compute engine with as much work as possible.