Previous
Up
Next
7 Conclusion and future work
In join-calculus, a name definition, all receptors on that name and all
possible synchronizations on that name are grouped altogether in a single
join definition. This enables the compilation of synchronization
between concurrent or even distributed processes, using finite-state
automata.
In the distributed case, a message transport phase to the
machine that currently hosts the join definition (and hence the
automaton) is first performed. This strengthens our point of view
that the join-calculus is the core of a distributed programming
language that can be compiled in practice, mainly because it restricts
reception on a channel to statically known parts of the program. The
same argument applied to à la ML polymorphic typing
in [
6].
fib
afib
pat
qsort
count
join
32.0
14.5
37.2
9.9
16.4
jocaml
5.7
3.5
5.4
1.4
4.2
Bologna
11.9
6.2
9.4
16.8
5.3
Table 1: Some performance measures (wall-clock time, in seconds)
We performed experiments on a 200Mhz Pentium-Pro PC,
taking a few benchmarks as a set of sensible join programs:
fib is computing fib(27) using synchronous
channels,
afib is computing fib(20) using
asynchronous channels and continuation-passing style,
pat tests pattern-matching,
qsort sorts a
one-hundred elements list (lists are encoded as lists of messages),
while
count increments the counter of section
5.1
100000 times.
We compared the
join,
jocaml compilers and another
join implementation from the University of Bologna [
12].
Note that we cannot offer a full analysis of these performance figures
at the moment, however such data can be useful to other
developers (the benchmarks are available at
http://join.inria.fr/speed/).
Table
1 shows that
jocaml and the Bologna
implementation exhibit similar performance,
jocaml being
slightly faster, while
join is noticeably slower.
This is easily explained by core implementation choices:
the
join runtime performs bytecode interpretation in Caml,
while the
jocaml and Bologna bytecode
interpreters are written in C. Additionally,
jocaml benefits
from the finely tuned Objective Caml bytecode interpreter.
The table and other consideration also show that both the
join and the
jocaml pattern matching compilation schemes
prove adequate. In particular, none of the two schemes falls into the
pitfall associated to the compilation technique used, since all
figures show similar variations.
However,
join performs poorly in the
pat test and we cannot tell for the moment whether
pattern-matching compilation is responsible for it or not.
In the
join case, one can be afraid of code size, the technique
exposed in section
5.3 successfully avoids code size explosion
in practical cases. The
jocaml technique appears expensive in
runtime checks and thus
a priori produces slow code. We choose
such a scheme of implementing automata using generic code, because it
can be implemented simply by adding code to the Objective Caml
bytecode interpreter. Using bytecode specialized for automata
manipulation would have implied more important modifications of the
Objective Caml bytecode interpreter. Moreover, the
jocaml system
runs faster than the
join system, even for pure join programs,
showing that its weaker compilation of join definitions is more than
compensated by its total integration in the Objective Caml system.
Whether
jocaml performance would benefit significantly from
a more sophisticated implementation of pattern-matching automata or not
remains an open question.
Comparison with the Bologna implementation [
12] of the
join-calculus is also instructive. This system also produces
bytecode, which is interpreted by a C program. It proves faster than
join and slower that
jocaml on all the examples,
except for
qsort, where it is outperformed by both our
implementations.
Taking a glance at the Bologna source code reveals that it uses a
scheme very similar to the one of
jocaml, with two slight
differences. First, status is systematically encoded as an array of
integers. Second when a message arrives on a name
x with an empty
queue, all patterns are tested (whereas in
jocaml only the
patterns that contain
x are tested).
Performance of a given join system depends on many factors. In
particular, scheduling policy and message queue management have a
dramatic impact on it, which accounts for the poor performance of
the Bologna implementation on the
qsort test. Furthermore, a
policy that gives good results
on one benchmark can be defeated by another. For these reasons, we
cannot tell which pattern-matching compilation technique is the best
by comparing
different implementations.
Clearly, we now need to integrate all our compilation techniques in
the same system, in order to compare them more thoroughly and to
experiment further. However, the definition of reactivity status and
the automata of section
3 provide a sound
basis for these future investigations. Apart from future language
development and fully implementing the failure semantics of the
join-calculus, we also plan to investigate more on the implementation
of threads, attempting to minimize thread suspension and creation.
The chemical semantics does not fit compilation needs. For instance,
this very paper gets imprecise when it comes to relate pattern-matching
by automata and join-definition firing.
Or, name management by α-conversion at dilution-time does not
faithfully render closure-based implementations.
Thus, another important direction for future research is defining a
semantics for the implementations.
Such a semantics should be more deterministic than the chemical
semantics, it should also handle synchronism directly.
This would be a first step toward developing precise and proved semantical
analyses, we plan to follow [
13] for designing
both concrete and abstract semantics for the implementations.
Previous
Up
Next