I would like the following code reviewed:
I am trying to create a local server which listens indefinitely and provides the latest information. The server runs another while loop which fetches the latest info every hour.
I have tried a multi-threaded approach, in which one thread is a socket server (runs indefinitely) and another thread (runs indefinitely with sleep) writes the latest info to a global variable. The global variable is then accessed by the server and sent to its clients.

from socket import *
import threading
import pandas as pd
import json
import time
import datetime
import thread

#Global Info variable
info = pd.DataFrame()

def getInfo():
    global info
    while 1:
        print "Information retrieval running"
        try:
            #This will bring the required info as pandas df
            info = retriveInfoFromDB()
        except:
            pass
        time.sleep(3600)

def runServer():
    global info
    address = ('localhost', 6005)
    server_socket = socket(AF_INET, SOCK_DGRAM)
    server_socket.bind(address)
    while(1):
        print "Listening"
        recv_data, addr = server_socket.recvfrom(2048)
        server_socket.sendto(info.to_json(path_or_buf = None, orient = 'records', date_format = 'epoch', double_precision = 10, force_ascii = True, date_unit = 'ms', default_handler = None), addr)

th1 = threading.Thread(target=runServer)
th2 = threading.Thread(target=getInfo)
th1.start()
th2.start()

2 Answers

def getInfo():
    global info
    while 1:

Don't use `while 1` to symbolize an infinite loop; it does not look very Pythonic. A better way to write this would be to use `True`.

        try:
            #This will bring the required info as pandas df
            info = retriveInfoFromDB()
        except:
            pass

Don't just `except`; you should be catching a specific type of exception. In this case, the exception to catch is whatever exception could possibly be raised from `retriveInfoFromDB`.
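
As a rough illustration of the narrower handler, here is a minimal sketch in Python 3 syntax. The question does not show what `retriveInfoFromDB` can raise, so `DatabaseError` and `retrieve_info_from_db` below are hypothetical stand-ins, not real names from the asker's code or from any library.

```python
import logging


class DatabaseError(Exception):
    """Hypothetical stand-in for whatever the real DB layer raises."""


def retrieve_info_from_db():
    # Hypothetical placeholder for the question's retriveInfoFromDB();
    # it always fails here so the handler below has something to catch.
    raise DatabaseError("connection refused")


def refresh(current_info):
    """Return fresh info, or keep the previous value if the fetch fails."""
    try:
        return retrieve_info_from_db()
    except DatabaseError as exc:
        # Only the expected failure is swallowed, and it is logged
        # instead of being silently discarded with `pass`.
        logging.warning("Info refresh failed: %s", exc)
        return current_info
```

Any other exception (a typo, a `KeyboardInterrupt`) still propagates, which is the point of not using a bare `except`.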

    server_socket.bind(address)
    while(1):
        print "Listening"

and

    global info
    while 1:
        print "Information retrieval running"

Why did you use different styling for the `while` loop in these two places? Yes, they should both be using `True` rather than `1`, but why did you put `()`s around one conditional and nothing around the other?
In my opinion, you should not be using `()`s because they make code look more cluttered and less Pythonic.
Be careful with these lines:

    server_socket = socket(AF_INET, SOCK_DGRAM)
    server_socket.bind(address)

These lines are both capable of raising an error.
The Python socket library comes with its own error called `socket.error`. It is split up into a tuple, `(errno, string)`, where `errno` is the error number raised by the system call and `string` is a descriptive error message for that `errno`.
To be safe with these lines, just encase them in a `try`/`except` statement:

import sys

# The question uses `from socket import *`, so socket.error is in scope as plain `error`.
try:
    server_socket = socket(AF_INET, SOCK_DGRAM)
    server_socket.bind(address)
except error as error_message:
    print "Error #%d: %s" % (error_message[0], error_message[1])
    sys.exit(-1)

At the end of the `except`, I added a function call to quit the program with an exit status of -1 (this tells external programs that there was an error in running this application).
I think it is a good idea to exit because, if you are having socket problems, the code is probably not going to work in the first place, so it would be best to just exit.
Note: an error might also arise from

    recv_data, addr = server_socket.recvfrom(2048)

If an error can come from here, I recommend that you catch it as I showed above, but I don't think you should have to quit the program for it.
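
A minimal sketch of that "catch but keep going" idea, written for Python 3 (where `socket.error` is an alias of `OSError`). The helper name `serve_once` and its `payload` argument are made up for illustration; the caller would invoke it inside the server loop.

```python
import socket


def serve_once(server_socket, payload):
    """Handle one datagram; return False if the receive or send failed."""
    try:
        _, addr = server_socket.recvfrom(2048)
        server_socket.sendto(payload, addr)
    except OSError as exc:
        # Report the problem and let the caller carry on with the
        # next datagram instead of exiting the whole program.
        print("Socket error: %s" % exc)
        return False
    return True
```

Whether to retry, back off, or give up after repeated failures is left to the calling loop.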
The variable names

    th1 = threading.Thread(target=runServer)
    th2 = threading.Thread(target=getInfo)

are not very descriptive. Yes, I know it is a thread; the variable name should not tell me that.
The variable name should tell me what the variable is going to be used for. For example, you could call `th1` something along the lines of `server_listener`.
I recommend a different approach to your multi-threading.
In this loop right here:

    while True:
        print "Listening"
        recv_data, addr = server_socket.recvfrom(2048)
        server_socket.sendto(info.to_json(path_or_buf = None, orient = 'records', date_format = 'epoch', double_precision = 10, force_ascii = True, date_unit = 'ms', default_handler = None), addr)

Things can get a little hung up with all the reading and the writing, especially reading and writing of that size.
I recommend opening up a new thread every time you have a new connection, and then closing it every time you are done reading from and writing to the connection.
That way, if your program gets hung up on one connection, the rest won't be affected.
Here is what I mean:

    while True:
        connection = threading.Thread(target=read_and_write_data)
        connection.start()

where `read_and_write_data` is

def read_and_write_data():
    global info
    global server_socket
    print "Listening"
    recv_data, addr = server_socket.recvfrom(2048)
    server_socket.sendto(info.to_json(path_or_buf = None, orient = 'records', date_format = 'epoch', double_precision = 10, force_ascii = True, date_unit = 'ms', default_handler = None), addr)

Sorry for the poor naming of that function.
To make splitting up into different threads per connection easier, I made the variables `address` and `server_socket` global. That way, different functions can easily access the same information; it would be a pain and unnecessary to re-make the socket every time a new thread was created.
And there is no need to worry about closing the thread: the thread will automatically close after its target has finished running.
Python's naming case for variables and functions is `snake_case`, not `camelCase`. Your functions `getInfo` and `runServer` should be `snake_case`. See PEP 8, the official Python style guide.
Putting it all together:

from socket import *
import sys
import threading
import pandas as pd
import json
import time
import datetime
import thread

#Global Info variable
info = pd.DataFrame()

# Module-level so every thread can share the same socket
address = ('localhost', 6005)
try:
    server_socket = socket(AF_INET, SOCK_DGRAM)
    server_socket.bind(address)
except error as error_message:  # `error` is socket.error, brought in by the star import
    print "Error #%d: %s" % (error_message[0], error_message[1])
    sys.exit(-1)

def get_info():
    # same code

def run_server():
    while True:
        connection = threading.Thread(target=read_and_write_data)
        connection.start()

def read_and_write_data():
    global info
    global server_socket
    print "Listening"
    recv_data, addr = server_socket.recvfrom(2048)
    server_socket.sendto(info.to_json(path_or_buf = None, orient = 'records', date_format = 'epoch', double_precision = 10, force_ascii = True, date_unit = 'ms', default_handler = None), addr)

server_listener = threading.Thread(target=run_server)
info_reader = threading.Thread(target=get_info)
server_listener.start()
info_reader.start()

Strongly disagree with the `run_server` refactoring. The code would run out of threads right away. – vnp, Jul 7, 2015 at 17:07
@vnp Are you sure? After the `read_and_write_data` call finishes, the thread will close and will be open for another use. – SirPython, Jul 7, 2015 at 17:13
Absolutely sure. `threading.Thread` doesn't block. The loop would keep creating threads indefinitely without waiting for `read_and_write_data` to finish. – vnp, Jul 7, 2015 at 17:17
@vnp I see what you are saying; thanks for pointing that out! Would the best option be to insert a `time.sleep`, or would it be better to abandon the idea altogether? – SirPython, Jul 7, 2015 at 17:23
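
One possible middle ground for the question raised in these comments, sketched in Python 3: keep the blocking `recvfrom` in the loop itself, so a thread is only created once a datagram has actually arrived, and hand just the reply to a short-lived worker. The names `send_reply` and `get_payload`, and the `max_requests` parameter (there only so a demo can terminate), are assumptions for illustration and do not come from either poster's code.

```python
import socket
import threading


def send_reply(server_socket, addr, payload):
    # Runs in its own thread; a slow send does not stall the receive loop.
    server_socket.sendto(payload, addr)


def run_server(server_socket, get_payload, max_requests=None):
    """Block on recvfrom, then hand each reply to a short-lived thread."""
    handled = 0
    while max_requests is None or handled < max_requests:
        # Blocks until a datagram arrives, so threads are created
        # one per request rather than in a tight loop.
        _, addr = server_socket.recvfrom(2048)
        worker = threading.Thread(
            target=send_reply,
            args=(server_socket, addr, get_payload()),
        )
        worker.daemon = True
        worker.start()
        handled += 1
```

For a single UDP `sendto` the extra thread may well be unnecessary; the sketch only shows how to avoid the unbounded thread creation, not that per-request threads are needed here.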
So @SirPython covered a lot.
I just wanted to mention: threads in Python are (most of the time) not executed in parallel because of the Global Interpreter Lock. So you'll have to use the `multiprocessing` module if you want true parallelism; Stackless Python is sometimes suggested for this, but it keeps the GIL as well.
Do mind that such a conversion is not magical and you would have to modify your code (especially the parts involving global variables). It looks like you don't really need your code to have great performance, but it's up to you.
Also, try to avoid global variables. They make your program unpredictable.
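
As one way to act on that last point, here is a minimal sketch (Python 3 compatible) that replaces the module-level `info` with a small holder object passed to both threads. The class name `InfoStore` and its methods are invented for this example.

```python
import threading


class InfoStore(object):
    """Holds the latest info behind a lock instead of a module-level global."""

    def __init__(self, initial=None):
        self._lock = threading.Lock()
        self._value = initial

    def set(self, value):
        # Called by the updater thread after each successful fetch.
        with self._lock:
            self._value = value

    def get(self):
        # Called by the server thread when building a reply.
        with self._lock:
            return self._value
```

Both thread targets would then take the store as an argument, for example `threading.Thread(target=get_info, args=(store,))`, which makes it explicit who reads and who writes the shared state.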