I have come across what seems to be a collapsed loop.
parallel-for i, j = 1...n
What would be the work and span/depth/critical path length be of loops like this? Would it be the same as that of the following:
parallel-for i = 1...n
parallel-for j = 1...n
EDIT:
Work = the asymptotic running time of the algorithm on a single core (total computational steps)
Span/depth/critical path length = the asymptotic running time of the algorithm on an infinite number of cores (the point beyond which more parallel processors won't give any performance gains).
1 Answer 1
The work $W(n)$ is the total number of nodes in your computation graph and the span $D(n)$ is the number of nodes on the longest path of that graph. $T_P(n)$ is the runtime of the algorithm using $P$ processors; $T_1(n) = W(n)$ and $T_\inf(n) = D(n)$.
Suppose your code has the following shape
parfor i in range(n):
parfor j in range(n):
f()
where $f() \in O(1)$ is some function (with side-effects). The keyword "parfor" means that every loop iteration is independent. There are $n^2$ independent iterations so in theory $W(n)\in O(n^2)$ and $T_\inf(n) = D(n)\in O(1)$.
However, the work of starting each $f()$ thread and waiting for its result must be accounted for. That is, we must implement the "parfor" keyword. Here is one way:
parfor_impl(f, lo, hi):
if hi - lo <= 1:
f()
else:
mid = (lo + hi) // 2
spawn parfor_impl(f, lo, mid)
spawn parfor_impl(f, mid, hi)
wait
where the keywords "spawn" and "wait" launches threads and waits for previously launched threads' completion respectively. This is a classical recursive scheme with $W_{parfor} \in O(n)$ and $D_{parfor} \in O(\log n)$. To get the total runtime (and span) we just add the runtime of the two nested loops $O(2\log n) = O(\log n)$. We do not multiply the runtime since the loops run in parallel.
I can highly recommend the "Intro to the Work-Span Model" chapter of Udacity's High Performance Computing course which delves deep into this topic.
Explore related questions
See similar questions with these tags.
parallel-for i, j, k = 1 to n do C[i, j, k] = A[i, k] * B[k, j]
$\endgroup$