Killing and restarting processes

Question 1

I am a beginner with Erlang and am currently making my way through Joe Armstrong's awesome book. The following are my answers to exercises 5 and 6 of the concurrency chapter. The code could probably be much better, but at least it seems to work fine after some tests. I'm just happy to be able to understand this language; I'm definitely having to re-learn how to think about my programming logic, and I think that's a good thing.

5) Write a function that starts and monitors several worker processes. If any of the worker processes dies abnormally, restart it.

    ex5_start(FList) ->
        spawn(fun() ->
            [bring_back(F) || F <- FList]
        end).

    bring_back(FID) ->
        spawn(fun() ->
            {Pid, Ref} = spawn_monitor(FID),
            io:format("Pid = ~p~n", [Pid]),
            receive
                {'DOWN', Ref, process, Pid, normal} -> void;
                {'DOWN', Ref, process, Pid, _Why} -> bring_back(FID)
            after infinity -> void
            end
        end).

6) Write a function that starts and monitors several worker processes. If any of the worker processes dies abnormally, kill all the worker processes and restart them all.

    ex6_start(Flist) when is_list(Flist) ->
        Size = length(Flist),
        spawn(fun() ->
            {Pid, Ref} = spawn_monitor(fun() ->
                start_and_monitor(Size, Flist),
                receive
                after infinity -> true
                end
            end),
            receive
                {'DOWN', Ref, process, Pid, _} ->
                    io:format("All dead. Restarting.~n"),
                    ex6_start(Flist)
            end
        end).

    start_and_monitor(1, Flist) ->
        Fun = lists:nth(1, Flist),
        Pid = spawn_link(Fun),
        io:format("Pid ~p created.~n", [Pid]);
    start_and_monitor(N, Flist) ->
        Fun = lists:nth(N, Flist),
        Pid = spawn_link(Fun),
        io:format("Pid ~p created.~n", [Pid]),
        start_and_monitor(N - 1, Flist).

Question 2

It would probably be better if you provided a full example. That way we could run and test your code without additional effort.

Question 3

For the first problem:

You are using more processes than strictly needed, since you spawn a process A that spawns a number of monitoring processes M, each of which spawns a worker W and monitors it.

If a worker fails, you spawn a new monitor 'M' and a new worker.

    ex5_start(FList) ->
        spawn(fun() ->                               %% This is the spawn of process A
            [bring_back(F) || F <- FList]            %% This spawns one M per element
        end).

    bring_back(FID) ->
        spawn(fun() ->                               %% This is the spawning of an M
            {Pid, Ref} = spawn_monitor(FID),         %% This is the spawning of a W
            io:format("Pid = ~p~n", [Pid]),
            receive
                {'DOWN', Ref, process, Pid, normal} -> void;
                {'DOWN', Ref, process, Pid, _Why} ->
                    %% New M and W
                    bring_back(FID)
            after infinity -> void                   %% you can skip this entire line
            end
        end).

An after clause with a hardcoded infinity timeout is never going to be executed, so you can skip it.

You can think of a simpler solution, where you spawn_monitor the workers and associate their FIDs with a Pid and Ref (e.g. in a list comprehension), and then have a second phase in which, whenever a DOWN message arrives, you look up the relevant worker Pid and restart its FID. You don't even need a separate A process.
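The two-phase idea could be sketched roughly as follows. This is only one possible reading of the suggestion, not a reference solution: the module name `ex5` and the functions `start/1` and `loop/1` are made up for illustration, and the sketch assumes it is acceptable for `start/1` to block the caller until every worker has exited normally.

```erlang
-module(ex5).
-export([start/1]).

%% Phase 1: spawn and monitor every fun, remembering which fun
%% belongs to which {Pid, Ref}. Runs in the calling process, so
%% no separate process A is needed.
start(FList) ->
    Workers = [{spawn_monitor(F), F} || F <- FList],
    loop(Workers).

%% Phase 2: Workers is a list of {{Pid, Ref}, Fun} tuples.
loop([]) ->
    ok;                                   % every worker exited normally
loop(Workers) ->
    receive
        {'DOWN', Ref, process, Pid, Reason} ->
            case lists:keytake({Pid, Ref}, 1, Workers) of
                {value, {_, _F}, Rest} when Reason =:= normal ->
                    loop(Rest);           % normal exit: just forget it
                {value, {_, F}, Rest} ->
                    %% abnormal exit: restart the same fun
                    loop([{spawn_monitor(F), F} | Rest]);
                false ->
                    loop(Workers)         % not one of ours: ignore
            end
    end.
```

If the caller should not block, the call can be wrapped as `spawn(fun() -> ex5:start(FList) end)`, which still uses one supervising process in total rather than one per worker.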

For the second problem:

Several things seem off:

- You are using an intermediate process to which all the others are linked.
- This intermediate process will block on this piece of code:
```
    receive
    after infinity -> true
    end
```
- You are not killing all processes on an error, just linking them all so that they die together with the intermediate process. This may not be what is needed if, e.g., some process is trapping exit signals.
- If all processes exit normally, the intermediate process will remain blocked forever.
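To illustrate the point about trapping exits, here is a small demonstration, written as a sketch with hypothetical names (`trap_demo`, `run/0`). A linked worker that sets `trap_exit` receives the exit signal from the crashing intermediate process as an ordinary `{'EXIT', ...}` message and keeps running; only an explicit `exit(Pid, kill)` is guaranteed to stop it.

```erlang
-module(trap_demo).
-export([run/0]).

run() ->
    Parent = self(),
    %% Intermediate process: links to a worker, then crashes on request.
    Middle = spawn(fun() ->
        spawn_link(fun() ->
            process_flag(trap_exit, true),
            Parent ! {ready, self()},
            %% The link's exit signal arrives as a plain message.
            receive {'EXIT', _From, Reason} -> Parent ! {trapped, Reason} end,
            receive after infinity -> ok end
        end),
        receive crash -> exit(crashed) end
    end),
    Worker = receive {ready, W} -> W end,
    Middle ! crash,
    receive {trapped, crashed} -> ok end,
    StillAlive = is_process_alive(Worker),   % worker survived the link
    exit(Worker, kill),                      % kill cannot be trapped
    Ref = monitor(process, Worker),
    receive {'DOWN', Ref, process, Worker, _} -> ok end,
    {StillAlive, is_process_alive(Worker)}.
```

The first element of the returned tuple reports whether the worker outlived the intermediate process, the second whether it is still alive after the explicit kill.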

If you follow my suggestion for the previous question, it is easy to replace the second phase with a different one that does the correct thing (remember to also keep track of processes that have died normally).
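A possible second phase for exercise 6 might look like the sketch below. The names (`ex6`, `start/1`, `spawn_all/1`, `kill_all/1`) are hypothetical, and the sketch makes one interpretive assumption: after an abnormal exit, every fun in the list is restarted, including those whose process had already finished normally.

```erlang
-module(ex6).
-export([start/1]).

start(FList) ->
    loop(spawn_all(FList), FList).

spawn_all(FList) ->
    [spawn_monitor(F) || F <- FList].

%% Live holds {Pid, Ref} for every worker that is still running.
loop([], _FList) ->
    ok;                                    % all workers exited normally
loop(Live, FList) ->
    receive
        {'DOWN', Ref, process, Pid, Reason} ->
            case lists:member({Pid, Ref}, Live) of
                false ->
                    loop(Live, FList);     % not a current worker: ignore
                true when Reason =:= normal ->
                    loop(lists:delete({Pid, Ref}, Live), FList);
                true ->
                    %% abnormal exit: kill the survivors, restart everything
                    kill_all(lists:delete({Pid, Ref}, Live)),
                    loop(spawn_all(FList), FList)
            end
    end.

%% exit(Pid, kill) cannot be trapped. Waiting for each 'DOWN' makes
%% sure the old workers are gone before the new ones are started.
kill_all(Live) ->
    [begin
         exit(Pid, kill),
         receive {'DOWN', Ref, process, Pid, _} -> ok end
     end || {Pid, Ref} <- Live],
    ok.
```

Because the workers are monitored rather than linked, the supervising process itself is unaffected by their exits, and it terminates with `ok` once the list of live workers is empty instead of blocking forever.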

Question 4

Hi @aronisstav, sorry it took so long to reply, I was away last month. Thank you so much for the analysis. I'll take this in and see how I can improve the code. Hopefully I can repost it here soon.

aronisstav · Accepted Answer · 2016-12-09 07:25:35Z

