I am a beginner with Erlang and am currently making my way through Joe Armstrong's awesome book. The following are my answers to exercises 5 and 6 of the concurrency chapter. The code can probably be much better, but at least it seems to work fine after some tests. I'm just happy to be able to understand this language. I'm definitely having to re-learn how to think about my programming logic, and I think that's a good thing.
5) Write a function that starts and monitors several worker processes. If any of the worker processes dies abnormally, restart it.
ex5_start(FList) ->
    spawn(fun() ->
        [bring_back(F) || F <- FList]
    end).

bring_back(FID) ->
    spawn(fun() ->
        {Pid, Ref} = spawn_monitor(FID),
        io:format("Pid = ~p~n",[Pid]),
        receive
            {'DOWN', Ref, process, Pid, normal} -> void;
            {'DOWN', Ref, process, Pid, _Why} -> bring_back(FID)
        after infinity -> void
        end
    end).
6) Write a function that starts and monitors several worker processes. If any of the worker processes dies abnormally, kill all the worker processes and restart them all.
ex6_start(Flist) when is_list(Flist) ->
    Size = length(Flist),
    spawn(fun() ->
        {Pid,Ref} = spawn_monitor(fun() ->
            start_and_monitor(Size,Flist),
            receive
            after infinity -> true
            end
        end),
        receive
            {'DOWN',Ref,process,Pid,_} ->
                io:format("All dead. Restarting.~n"),
                ex6_start(Flist)
        end
    end).

start_and_monitor(1,Flist) ->
    Fun = lists:nth(1,Flist),
    Pid = spawn_link(Fun),
    io:format("Pid ~p created.~n",[Pid]);
start_and_monitor(N,Flist) ->
    Fun = lists:nth(N,Flist),
    Pid = spawn_link(Fun),
    io:format("Pid ~p created.~n",[Pid]),
    start_and_monitor(N-1,Flist).
It would probably be better if you provided a full example, so that we could run and test your code without additional effort. – user110702, Dec 8, 2016 at 21:28
1 Answer
For the first problem:
You are using more processes than strictly needed, since you:
- spawn a process A that
- spawns a number of monitoring processes M that each
- spawn a worker W and monitor it.
If a worker fails, you spawn a new monitor 'M' and a new worker.
ex5_start(FList) ->
    spawn(fun() ->                           %% This is the spawn of process A
        [bring_back(F) || F <- FList]        %% This is the list of M's
    end).

bring_back(FID) ->
    spawn(fun() ->                           %% This is the spawning of an M
        {Pid, Ref} = spawn_monitor(FID),     %% This is the spawning of a W
        io:format("Pid = ~p~n",[Pid]),
        receive
            {'DOWN', Ref, process, Pid, normal} -> void;
            {'DOWN', Ref, process, Pid, _Why} ->
                %% New M and W
                bring_back(FID)
        after infinity -> void               %% you can skip this entire line
        end
    end).
An `after` clause with a hardcoded `infinity` timeout is never going to be executed, so you can skip it.
You can think of a simpler solution, where you `spawn_monitor` the workers and associate their `FID`s with a `Pid` and `Ref` (e.g. in a list comprehension), and then have a second phase in which, whenever a `DOWN` message arrives, you look up the relevant worker `Pid` and restart its `FID`. You don't even need a separate A process.
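The two-phase idea above could be sketched roughly as follows. This is only one possible reading of the suggestion, not a definitive solution: the module name `ex5_simple` and the helpers `start_worker/1` and `loop/1` are hypothetical, and the sketch assumes every element of `FList` is a zero-arity fun. It runs in the calling process, so there is no separate A process and no M processes.

```erlang
-module(ex5_simple).
-export([start/1]).

%% Phase 1: start and monitor every worker, remembering which fun
%% each monitor reference belongs to.
%% Phase 2: wait for 'DOWN' messages in the calling process.
start(FList) ->
    loop([start_worker(F) || F <- FList]).

%% Hypothetical helper: spawn one monitored worker.
start_worker(F) ->
    {Pid, Ref} = spawn_monitor(F),
    io:format("Pid = ~p~n", [Pid]),
    {Ref, F}.

%% Returns ok once every worker has exited normally.
loop([]) ->
    ok;
loop(Workers) ->
    receive
        {'DOWN', Ref, process, _Pid, Reason} ->
            case lists:keytake(Ref, 1, Workers) of
                {value, {Ref, _F}, Rest} when Reason =:= normal ->
                    %% Normal exit: forget this worker.
                    loop(Rest);
                {value, {Ref, F}, Rest} ->
                    %% Abnormal exit: restart only this worker.
                    loop([start_worker(F) | Rest]);
                false ->
                    %% 'DOWN' for a monitor we do not track: ignore it.
                    loop(Workers)
            end
    end.
```

Note that `ex5_simple:start/1` blocks the caller until all workers have exited normally. If that is not wanted, the call can be wrapped in a single `spawn`, which still uses far fewer processes than one monitor per worker.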
For the second problem:
Several things seem off:
- You are using an intermediate process to which all the others are linked. This intermediate process will block on this piece of code:
    receive after infinity -> true end
- You are not killing all processes on an error, just linking them all to die together with the intermediate process. This may not be what is needed if, e.g., some process is trapping exit signals.
- If all processes exit normally, the intermediate process will remain blocked forever.
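To illustrate the point about trapping exit signals, here is a small, self-contained sketch. The module name `trap_demo` and the function `run/0` are made up for this example. It shows a linked worker that traps exits surviving the crash of the intermediate process, while `exit(Pid, kill)` still terminates it.

```erlang
-module(trap_demo).
-export([run/0]).

run() ->
    Parent = self(),
    %% Intermediate process: links a worker to itself, then crashes on request.
    Middle = spawn(fun() ->
        spawn_link(fun() ->
            process_flag(trap_exit, true),
            Parent ! {worker, self()},
            %% The exit signal from the link arrives as a plain message.
            receive {'EXIT', _From, Reason} -> Parent ! {trapped, Reason} end,
            receive after infinity -> ok end
        end),
        receive crash -> exit(boom) end
    end),
    Worker = receive {worker, W} -> W end,
    Middle ! crash,
    receive {trapped, boom} -> ok end,
    %% The link alone did not take the worker down.
    true = is_process_alive(Worker),
    %% An explicit kill cannot be trapped.
    Ref = monitor(process, Worker),
    exit(Worker, kill),
    receive {'DOWN', Ref, process, Worker, killed} -> ok end.
```

So if "kill all the worker processes" is meant literally, relying on links is not enough; an explicit `exit(Pid, kill)` per worker is the safer reading.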
If you follow my suggestion for the previous question, it is easy to replace the second phase with a different one that does the correct thing (remember to also keep track of processes that have died normally).
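A possible second phase for this exercise might look like the sketch below. The names `ex6_simple`, `start_all/1`, `kill_all/1` and `loop/2` are hypothetical, and the sketch assumes that "restart them all" means restarting every fun in `FList`, including those whose workers had already exited normally; the exercise text leaves that open.

```erlang
-module(ex6_simple).
-export([start/1]).

start(FList) ->
    loop(start_all(FList), FList).

%% Returns a list of {Pid, Ref} pairs, one per worker.
start_all(FList) ->
    [spawn_monitor(F) || F <- FList].

%% Returns ok once every worker of the current round has exited normally.
loop([], _FList) ->
    ok;
loop(Workers, FList) ->
    receive
        {'DOWN', Ref, process, Pid, Reason} ->
            case lists:member({Pid, Ref}, Workers) of
                false ->
                    %% Not one of our current workers: ignore it.
                    loop(Workers, FList);
                true when Reason =:= normal ->
                    %% Keep track of workers that finished normally.
                    loop(lists:delete({Pid, Ref}, Workers), FList);
                true ->
                    %% Abnormal exit: kill the survivors, restart everything.
                    kill_all(lists:delete({Pid, Ref}, Workers)),
                    io:format("All dead. Restarting.~n"),
                    loop(start_all(FList), FList)
            end
    end.

kill_all(Workers) ->
    lists:foreach(fun({Pid, Ref}) ->
        %% Drop the monitor first so no stale 'DOWN' message arrives later.
        demonitor(Ref, [flush]),
        %% 'kill' cannot be trapped, unlike an exit signal from a link.
        exit(Pid, kill)
    end, Workers).
```

As with the first exercise, `start/1` blocks the caller; wrapping it in one `spawn` gives the non-blocking behaviour of the original `ex6_start/1`.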
Hi @aronisstav, sorry it took so long to reply, I was away last month. Thank you so much for the analysis. I'll take this in and see how I can improve the code. Hopefully I can repost it here soon. – mbr_uk, Jan 4, 2017 at 10:28