I have coded a very simple distributed system simulator in Python. It uses multiprocessing to assign tasks, and queues to communicate between processes.
The code is shown below.
from functions import *
import multiprocessing
import time
try:
    with open("config.txt") as f:
        lines = f.readlines()
    max_instances = int(lines[0].split(' ')[1])
except Exception, e:
    print "Exception while opening config.txt :", e
    print "Please make sure that\n1) The File is present in the current folder"
    print "2) It contains the value of MAX_NUMBER_OF_INSTANCES, space delimited"
    print "Download the file again if problem persists"
    exit(1)

class machine():
    'Class for the instance of a machine'
    q = [multiprocessing.Queue() for i in range(max_instances + 1)]
    # q[0] is unused
    count = multiprocessing.Value('i', 1)

    def __init__(self):
        self.mac_id = machine.count.value
        machine.count.value += 1

    def execute_func(self, func_name, *args):
        comm_str = str(func_name) + ' = multiprocessing.Process(name = "' + str(func_name) + '", target = ' + str(func_name) + ', args = ('
        comm_str += 'self,'
        for arg in args:
            if(type(arg) is str):
                comm_str += '"' + str(arg) + '",'
            else:
                comm_str += str(arg) + ','
        comm_str += '))'
        try:
            # create the new process
            exec(comm_str)
            # start the new process
            comm_str = str(func_name) + '.start()'
            exec(comm_str)
        except Exception, e:
            print "Exception in execute_func() of", self.get_machine_id(), ":", e
            print self.get_machine_id(), "was not able to run the function ", func_name
            print "Check your function name and parameters passed to execute_func() for", self.get_machine_id()

    def send(self, destination_id, message):
        # send message to the machine with machine_id destination_id
        mac_id = int(destination_id[8:])
        if(mac_id >= machine.count.value or mac_id <= 0):
            return -1
        # message is of the format "hello|2". Meaning message is "hello" from machine with id 2
        # However, the message received is processed and then returned back to the user
        message += '|' + str(self.get_id())
        machine.q[mac_id].put(message)
        return 1

    def recv(self):
        mac_id = self.get_id()
        if(mac_id >= machine.count.value or mac_id <= 0):
            return -1, -1
        message = machine.q[mac_id].get().split('|')
        # message received is returned with the format "hello" message from "machine_2"
        return message[0], 'machine_' + message[1]

    def get_id(self):
        return self.mac_id

    def get_machine_id(self):
        return "machine_" + str(self.get_id())

You assign a task to each machine instance you create. Each task is written as a function, and these functions must be kept in a file named functions.py in the same folder.
Suppose I want two machine instances: one sends the other ten numbers, and the other returns their sum. In that case, the functions would look something like this:
def machine1(id_var):
    print "machine instance started with id:", id_var.get_machine_id()
    # id_var.get_machine_id() is used to get the machine id
    for i in range(10):
        id_var.send("machine_2", str(i))
    message, sender = id_var.recv()
    print id_var.get_machine_id(), " got sum =", message, " from", sender

def machine2(id_var):
    print "machine instance started with id:", id_var.get_machine_id()
    # id_var.get_machine_id() is used to get the machine id
    total = 0
    for i in range(10):
        message, sender = id_var.recv()
        total += int(message)
    id_var.send("machine_1", str(total))

To run this, create a machine instance for each function and assign the proper function to it, for example:
from dss import *
m1 = machine()
m1.execute_func("machine1")
m2 = machine()
m2.execute_func("machine2")
This all works fine. I am already using this library to implement some pretty complex distributed load balancing algorithms.
I'm looking for a review of whether this is a good enough solution, and for suggestions of new features that should be added.
For more information, you can see the GitHub page.
2 Answers
I would make the following improvements:
- Follow PEP-8 and PEP-257 for writing code and docstrings in Python.
- Use the configparser module in Python to get config parameters.
- Use the logging module instead of print.
- Make the code compatible with both Python 2 and Python 3. That means changing every try ... except Exception, e and all the print statements.
Also, I noticed that your config.txt file is duplicated in two paths in your GitHub project.
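The configparser, logging, and exception-syntax suggestions could be combined roughly as in the sketch below. It targets Python 3 (in Python 2 the module is named ConfigParser and has no read_string). The `[simulator]` section, the `max_number_of_instances` key, the `SAMPLE_CONFIG` text, and the `load_max_instances` helper are all hypothetical: the existing space-delimited config.txt is not in INI format, so it would have to be converted before configparser could read it.

```python
import configparser
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("dss")

# Hypothetical INI-style replacement for config.txt; the section and key
# names are assumptions, not the project's actual format.
SAMPLE_CONFIG = """
[simulator]
max_number_of_instances = 10
"""


def load_max_instances(text):
    """Return max_number_of_instances parsed from INI text, or None on error."""
    parser = configparser.ConfigParser()
    try:
        parser.read_string(text)
        return parser.getint("simulator", "max_number_of_instances")
    except (configparser.Error, ValueError) as e:
        # "except ... as e" is accepted by Python 2.6+ and Python 3,
        # unlike the older "except Exception, e" form.
        log.error("Could not read configuration: %s", e)
        return None


max_instances = load_max_instances(SAMPLE_CONFIG)
log.info("max_instances = %s", max_instances)
```

With a real file, `parser.read("config.ini")` would take the place of `read_string`; the file name here is again only an assumption.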
Thanks for the valuable suggestions, I will certainly use all of them. :) – Haris, Mar 1, 2016 at 10:30
Avoid exec
It will take some major redesigning, but the resulting code will enjoy all of the benefits of not using exec listed above.
The multiprocessing module accepts first-class functions as arguments (for example, as the target of a Process); see the docs for further info.
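As a minimal sketch of that idea (not the redesign the author ended up with), the function object itself can be handed to multiprocessing.Process, so no command string has to be built or passed to exec. The names `add_numbers` and `make_process` are hypothetical. In the machine class, the instance would presumably be passed as the first argument in the same way.

```python
import multiprocessing


def add_numbers(queue, a, b):
    """Hypothetical task: put the sum of a and b on the queue."""
    queue.put(a + b)


def make_process(func, *args):
    """Build (but do not start) a process that will run func(*args)."""
    # The callable is passed directly as the target; its name is taken
    # from the function object instead of being spliced into a string.
    return multiprocessing.Process(name=func.__name__, target=func, args=args)


if __name__ == "__main__":
    results = multiprocessing.Queue()
    worker = make_process(add_numbers, results, 2, 3)
    worker.start()
    print(results.get())  # prints 5
    worker.join()
```

Because the caller passes a real function rather than its name, a misspelled task fails immediately with a NameError at the call site, and string arguments no longer need manual quoting.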
Thanks for the answer. I have redesigned it and removed the use of exec. You can check it on GitHub. If anything else can be changed to make it better, please do comment. :) – Haris, Apr 7, 2016 at 13:44