3

I have been banging my head against a socket issue for the last two weeks to no avail. I have a setup of 12 'client' machines and one server machine. The server is given a large task, splits it into 12 smaller tasks, and distributes them to the 12 clients. The clients churn away and, once they finish their tasks, are supposed to let the server know via socket communication. For some reason, this works only sporadically or not at all (both the server and the clients just sit in their while loops).

Here is the code on the server:

socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
socket.bind(('localhost', RandomPort))
socket.listen(0)
socket.settimeout(0.9)
[Give all the clients their tasks, then do the following:]
while 1:
    data = 'None'
    IP = [0,0]
    try:
        Client, IP = socket.accept()
        data = Client.recv(1024)
        if data == 'Done':
            Client.send('Thanks')
        for ClientIP in ClientIPList():
            if ClientIP == IP[0] and data == 'Done':
                FinishedCount += 1
                if FinishedCount == 12:
                    break
    except:
        pass

Here is the code on all the clients:

[Receive task from server and execute. Once finished, do the following:]
while 1:
    try:
        f = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        f.connect((IPofServer, RandomPort))
        f.settimeout(0.5)
        f.send('Done')
        data = f.recv(1024)
        if data == 'Thanks':
            f.shutdown(socket.SHUT_RDWR)
            f.close()
            break
    except:
        f.close()
        time.sleep(2+random.random()*5)

I have used Wireshark and found that the packets are flying around. Yet, the "FinishedCount" never seems to increase... Is there anything glaringly wrong that I have missed in setting this up? This is my first exposure to sockets....

Thank you all for your help in advance!

EDIT: I've made the following changes to the code:

On the server: socket.listen is now socket.listen(5)

asked Nov 18, 2011 at 14:45
6
  • 2
    Tip: you should never use uppercase as the first letter of a variable. That syntax should always be left for class names. Question: why is the client in a while 1 loop? Why not just send once to say "I'm done" then move on? Is the data not sending the first time? Commented Nov 18, 2011 at 15:00
  • I put it in a while loop because I'm worried about the connection being refused by the server because he's already handling other clients. It's my attempt at making sure we have a positive handshake... Commented Nov 18, 2011 at 15:07
  • If you can reduce the packets to a reasonable number, why not print out each 'data' as it's read by the server? If you have, for example, an extra newline, your equality test will never be true. It might be useful to see what data is actually being sent/received. Commented Nov 18, 2011 at 15:19
  • Hmm, interesting - if I print data, nothing ever actually prints. In fact, the server loop fails at socket.accept. Commented Nov 18, 2011 at 15:38
  • Just a sanity check -- are the client/server values for RandomPort the same? Commented Nov 18, 2011 at 16:07

5 Answers

3

Alright, this took me a while but I think I figured out what was causing this:

  1. glglgl's answer is correct - using 'localhost' causes the machine to only listen to itself and not to other machines on the network. This was the main culprit.
  2. Increasing the number allowed in the queue from 0 to 5 reduced the likelihood of getting a "connection refused" error on the client side.
  3. I made the mistake of assuming that socket connections in an infinite while loop can be shut down infinitely fast. However, having an infinite while loop on both sides sometimes caused a client to be counted twice because the while loops were not synchronized. This, of course, caused the 'client-agnostic' finishedCount to increase twice, which led the server to believe all clients were done when they weren't. Using chown's code (thank you chown!), this can be dealt with like this:

    def main():
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.bind((HOST, PORT))
        sock.listen(5)  # backlog raised from 0 to 5, see point 2
        FINISHEDCLIENTS = []  # IPs of clients that have already reported
        while 1:
            data = 'None'
            IP = [0, 0]
            try:
                client, ip = sock.accept()
                data = client.recv(1024)
                print "%s: Server received: '%s'" % (time.ctime(), data)
                if data == 'Done':
                    print "%s: Server sending: 'Thanks'" % time.ctime()
                    client.send('Thanks')
                    # count each known client only once, even if it retries
                    if ip[0] in CLIENT_IPS and ip[0] not in FINISHEDCLIENTS:
                        FINISHEDCLIENTS.append(ip[0])
                    if len(FINISHEDCLIENTS) == 12:
                        #raise MyException
                        break
            except Exception, e:
                print "%s: Server Exception - %s" % (time.ctime(), e)
    

    On the client side, I changed the code to this (where of course, RandomPort is the same as the one used in the server script above):

    SentFlag = 0
    data = 'no'
    while SentFlag == 0:
        try:
            f = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            f.connect((IPofServer, RandomPort))
            f.settimeout(20)
            f.send('Done')
            data = f.recv(1024)
            if data == 'Thanks':
                f.shutdown(socket.SHUT_RDWR)
                f.close()
                SentFlag = 1
        except:
            f.close()
            time.sleep(2*random.random())
    

PS: My understanding of .shutdown() vs. .close() is that .close() closes the connection but not necessarily the socket if it is engaged in another communication, whereas .shutdown() shuts down the socket no matter what else it is doing. I don't have any proof for this though.

I think that should do it - thank you all again for helping fix this code!
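
The double-counting fix in point 3 comes down to remembering which clients have already reported instead of bumping a counter. Here is a minimal sketch of just that bookkeeping, using a set rather than a list. The function name `record_finished` and the sample addresses are made up for illustration and are not part of the scripts above.

```python
def record_finished(finished, allowed_ips, ip):
    """Mark ip as finished; return True once every allowed client has reported."""
    if ip in allowed_ips:
        finished.add(ip)  # adding the same IP a second time changes nothing
    return len(finished) == len(allowed_ips)


allowed = {"10.10.1.11", "10.10.1.12"}  # hypothetical client addresses
finished = set()

print(record_finished(finished, allowed, "10.10.1.11"))  # False
print(record_finished(finished, allowed, "10.10.1.11"))  # False (a retry is not counted twice)
print(record_finished(finished, allowed, "10.10.1.99"))  # False (unknown sender is ignored)
print(record_finished(finished, allowed, "10.10.1.12"))  # True
```

Comparing against the number of allowed clients also avoids hard-coding 12 in the loop.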

answered Nov 21, 2011 at 15:49

2

Your server has two bugs:

First, this will break out of the inner for loop, not the while loop:

if FinishedCount == 12:
    break

Your while loop has no termination condition.

Second, this pattern:

try:
    ...
except:
    pass

Should never be used. You're swallowing up every single exception and ignoring it. That is bad practice, and will lead to bugs. It should be:

try:
    ...
except OneExceptionIWantToIgnore:
    pass
except:
    raise

Fix those two and get back to us with results.
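
To make both points concrete, here is a rough sketch of an accept loop that leaves via return instead of a nested break, and that ignores only the timeout from accept(). It is written in Python 3 syntax (where recv() returns bytes), not the Python 2 of the question. The names `wait_for_done`, `report_done`, `demo`, the idle-poll limit, and the loopback setup are all assumptions made for the example.

```python
import socket
import threading


def wait_for_done(server_sock, expected, max_idle_polls=50):
    """Accept connections until `expected` clients have sent b'Done'.

    Only a timeout from accept() is ignored; any other exception
    propagates so that real problems stay visible.
    """
    finished = 0
    idle_polls = 0
    while idle_polls < max_idle_polls:
        try:
            client, _addr = server_sock.accept()
        except socket.timeout:
            idle_polls += 1  # nobody connected during this poll
            continue
        with client:
            client.settimeout(2.0)
            if client.recv(1024) == b"Done":
                client.sendall(b"Thanks")
                finished += 1
        if finished == expected:
            return finished  # return leaves the function, hence every loop
    return finished


def report_done(port):
    """Connect to the local server, send b'Done' and return its reply."""
    with socket.create_connection(("127.0.0.1", port), timeout=2.0) as s:
        s.sendall(b"Done")
        return s.recv(1024)


def demo(n_clients=3):
    """Run the server loop against a few client threads on the loopback interface."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(5)
    server.settimeout(0.1)
    port = server.getsockname()[1]
    replies = []
    threads = [
        threading.Thread(target=lambda: replies.append(report_done(port)))
        for _ in range(n_clients)
    ]
    for t in threads:
        t.start()
    counted = wait_for_done(server, expected=n_clients)
    for t in threads:
        t.join()
    server.close()
    return counted, replies


print(demo())
```

With the timeout handled explicitly, a quiet period no longer looks like an error, and anything else (a refused bind, a reset connection) still raises.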

answered Nov 18, 2011 at 15:05

3 Comments

The except: raise is the same as not having an unconditional except: in there at all. But this is correct about the break not ending the while loop.
@chown This is true, but I find it to be a bit clearer/easier to understand, as it's explicit that the only exception to be ignored is the one specified.
Good catch about the break - I fixed that. Once I added the raise, I got the socket.timeout: timed out error. I tried setting the socket to non-blocking and that resulted in the [Error 11] Resource Temporarily unavailable error. I originally did the except the way it was to avoid the time-outs - I expect those to happen while the clients are not yet communicating with the server.
2

I believe the issue here is the use of RandomPort. Each client and the server need to be sending/receiving on the same port for this to work. Also, the for ClientIP in ClientIPList(): if ClientIP == IP[0] and data == 'Done': loop is a little redundant and unnecessary. It can be replaced with if ip[0] in clientIpList: and placed inside the if data == 'Done': above it.

A few other thoughts: never give a variable the same name as something you have imported (like socket = socket.socket(..)), because then you will not be able to use the imported library anymore. And unless the client and server are both running on the same system or within the same subnet, settimeout(0.5) is way too short.
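
A small sketch of the naming problem, in Python 3 syntax. The function name `shadowed` is made up, and the import is placed inside the function only so the demo does not disturb any other code.

```python
def shadowed():
    """Reproduce `socket = socket.socket(...)` and return the resulting error text."""
    import socket  # local import keeps the demo self-contained
    socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # name now points at a socket object
    try:
        # The module is no longer reachable under this name, so this lookup fails.
        socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    except AttributeError as exc:
        return str(exc)
    finally:
        socket.close()
    return None


print(shadowed())
```

Naming the object something like sock, as in the scripts in this answer, avoids the clash.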

I merged your code with some example code from the python socket documentation and came up with something that works that you should be able to easily adapt for your needs. Here are the scripts; the output from running the server and 12 clients is pasted below.

server.py:

#!/usr/bin/python
# server.py
import sys
import socket
import time
HOST = ''
PORT = 50008
CLIENT_IPS = ["10.10.1.11"]
## No longer necessary if the nested loop isn't needed
#class MyException(Exception):
#    pass
def main():
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((HOST, PORT))
    sock.listen(0)
    finishedCount = 0
    while 1:
        data = 'None'
        IP = [0, 0]
        try:
            client, ip = sock.accept()
            data = client.recv(1024)
            print "%s: Server recieved: '%s'" % (time.ctime(), data)
            if data == 'Done':
                print "%s: Server sending: 'Thanks'" % time.ctime()
                client.send('Thanks')
                if ip[0] in CLIENT_IPS:
                    finishedCount += 1
                    print "%s: Finished Count: '%d'" % (time.ctime(), finishedCount)
                    if finishedCount == 12:
                        #raise MyException
                        break
        except Exception, e:
            print "%s: Server Exception - %s" % (time.ctime(), e)
        #except MyException:
        #    print "%s: All clients accounted for. Server ending, goodbye!" % time.ctime()
        #    break
    # close down the socket, ignore closing exceptions
    try:
        sock.close()
    except:
        pass
    print "%s: All clients accounted for. Server ending, goodbye!" % time.ctime()
if __name__ == '__main__':
    sys.exit(main())

client.py:

#!/usr/bin/python
# client.py
import sys
import time
import socket
import random
HOST = '10.10.1.11'
PORT = 50008
def main(n):
    while 1:
        try:
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            s.connect((HOST, PORT))
            s.send('Done')
            print "%s: Client %d: Sending - 'Done'.." % (time.ctime(), n)
            data = s.recv(1024)
            print "%s: Client %d: Recieved - '%s'" % (time.ctime(), n, data)
            if data == 'Thanks':
                break
        except Exception, e:
            print "%s: Client %d: Exception - '%s'" % (time.ctime(), n, e)
            time.sleep(2 + random.random() * 5)
        finally:
            try:
                s.shutdown(socket.SHUT_RDWR)
            except:
                pass
            finally:
                try:
                    s.close()
                except:
                    pass
    print "%s: Client %d: Finished, goodbye!" % (time.ctime(), n)
if __name__ == '__main__':
    if len(sys.argv) > 1 and sys.argv[1].isdigit():
        sys.exit(main(int(sys.argv[1])))

Output from running 12 Clients:

[ 10:52 [email protected] ~/SO/python ]$ for x in {1..12}; do ./client.py $x && sleep 2; done
Fri Nov 18 10:52:44 2011: Client 1: Sending - 'Done'..
Fri Nov 18 10:52:44 2011: Client 1: Recieved - 'Thanks'
Fri Nov 18 10:52:44 2011: Client 1: Finished, goodbye!
Fri Nov 18 10:52:46 2011: Client 2: Sending - 'Done'..
Fri Nov 18 10:52:46 2011: Client 2: Recieved - 'Thanks'
Fri Nov 18 10:52:46 2011: Client 2: Finished, goodbye!
Fri Nov 18 10:52:48 2011: Client 3: Sending - 'Done'..
Fri Nov 18 10:52:48 2011: Client 3: Recieved - 'Thanks'
Fri Nov 18 10:52:48 2011: Client 3: Finished, goodbye!
Fri Nov 18 10:52:50 2011: Client 4: Sending - 'Done'..
Fri Nov 18 10:52:50 2011: Client 4: Recieved - 'Thanks'
Fri Nov 18 10:52:50 2011: Client 4: Finished, goodbye!
Fri Nov 18 10:52:52 2011: Client 5: Sending - 'Done'..
Fri Nov 18 10:52:52 2011: Client 5: Recieved - 'Thanks'
Fri Nov 18 10:52:52 2011: Client 5: Finished, goodbye!
Fri Nov 18 10:52:54 2011: Client 6: Sending - 'Done'..
Fri Nov 18 10:52:54 2011: Client 6: Recieved - 'Thanks'
Fri Nov 18 10:52:54 2011: Client 6: Finished, goodbye!
Fri Nov 18 10:52:56 2011: Client 7: Sending - 'Done'..
Fri Nov 18 10:52:56 2011: Client 7: Recieved - 'Thanks'
Fri Nov 18 10:52:56 2011: Client 7: Finished, goodbye!
Fri Nov 18 10:52:58 2011: Client 8: Sending - 'Done'..
Fri Nov 18 10:52:58 2011: Client 8: Recieved - 'Thanks'
Fri Nov 18 10:52:58 2011: Client 8: Finished, goodbye!
Fri Nov 18 10:53:01 2011: Client 9: Sending - 'Done'..
Fri Nov 18 10:53:01 2011: Client 9: Recieved - 'Thanks'
Fri Nov 18 10:53:01 2011: Client 9: Finished, goodbye!
Fri Nov 18 10:53:03 2011: Client 10: Sending - 'Done'..
Fri Nov 18 10:53:03 2011: Client 10: Recieved - 'Thanks'
Fri Nov 18 10:53:03 2011: Client 10: Finished, goodbye!
Fri Nov 18 10:53:05 2011: Client 11: Sending - 'Done'..
Fri Nov 18 10:53:05 2011: Client 11: Recieved - 'Thanks'
Fri Nov 18 10:53:05 2011: Client 11: Finished, goodbye!
Fri Nov 18 10:53:07 2011: Client 12: Sending - 'Done'..
Fri Nov 18 10:53:07 2011: Client 12: Recieved - 'Thanks'
Fri Nov 18 10:53:07 2011: Client 12: Finished, goodbye!
[ 10:53 [email protected] ~/SO/python ]$

Output from server running at the same time:

[ 10:52 [email protected] ~/SO/python ]$ ./server.py
Fri Nov 18 10:52:44 2011: Server recieved: 'Done'
Fri Nov 18 10:52:44 2011: Server sending: 'Thanks'
Fri Nov 18 10:52:44 2011: Finished Count: '1'
Fri Nov 18 10:52:46 2011: Server recieved: 'Done'
Fri Nov 18 10:52:46 2011: Server sending: 'Thanks'
Fri Nov 18 10:52:46 2011: Finished Count: '2'
Fri Nov 18 10:52:48 2011: Server recieved: 'Done'
Fri Nov 18 10:52:48 2011: Server sending: 'Thanks'
Fri Nov 18 10:52:48 2011: Finished Count: '3'
Fri Nov 18 10:52:50 2011: Server recieved: 'Done'
Fri Nov 18 10:52:50 2011: Server sending: 'Thanks'
Fri Nov 18 10:52:50 2011: Finished Count: '4'
Fri Nov 18 10:52:52 2011: Server recieved: 'Done'
Fri Nov 18 10:52:52 2011: Server sending: 'Thanks'
Fri Nov 18 10:52:52 2011: Finished Count: '5'
Fri Nov 18 10:52:54 2011: Server recieved: 'Done'
Fri Nov 18 10:52:54 2011: Server sending: 'Thanks'
Fri Nov 18 10:52:54 2011: Finished Count: '6'
Fri Nov 18 10:52:56 2011: Server recieved: 'Done'
Fri Nov 18 10:52:56 2011: Server sending: 'Thanks'
Fri Nov 18 10:52:56 2011: Finished Count: '7'
Fri Nov 18 10:52:58 2011: Server recieved: 'Done'
Fri Nov 18 10:52:58 2011: Server sending: 'Thanks'
Fri Nov 18 10:52:58 2011: Finished Count: '8'
Fri Nov 18 10:53:01 2011: Server recieved: 'Done'
Fri Nov 18 10:53:01 2011: Server sending: 'Thanks'
Fri Nov 18 10:53:01 2011: Finished Count: '9'
Fri Nov 18 10:53:03 2011: Server recieved: 'Done'
Fri Nov 18 10:53:03 2011: Server sending: 'Thanks'
Fri Nov 18 10:53:03 2011: Finished Count: '10'
Fri Nov 18 10:53:05 2011: Server recieved: 'Done'
Fri Nov 18 10:53:05 2011: Server sending: 'Thanks'
Fri Nov 18 10:53:05 2011: Finished Count: '11'
Fri Nov 18 10:53:07 2011: Server recieved: 'Done'
Fri Nov 18 10:53:07 2011: Server sending: 'Thanks'
Fri Nov 18 10:53:07 2011: Finished Count: '12'
Fri Nov 18 10:53:07 2011: All clients accounted for. Server ending, goodbye!
[ 10:53 [email protected] ~/SO/python ]$
answered Nov 18, 2011 at 18:29

3 Comments

I'm looking at the documentation for socket.close() and it explicitly states that calling .shutdown() has additional effects. Just .close() definitely doesn't "do it for you".
@AndréCaron Hmm, my mistake.. maybe I was thinking about stopping threads at the time... fixed answer.
This is nice, thank you chown. The socket naming was just conceptual - it's not the way I actually have it implemented. See the answer that I posted - I had to do a couple of additional edits.
0

Calling listen(0) sets no backlog, so you are much more likely to get a connection refused. The server-side socket is never closed, also. Get rid of the try/excepts for now so you can see what the real problems are. Handle explicit socket.error exceptions otherwise.
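
Something along these lines would cover the backlog and the missing close. It is only a sketch in Python 3 syntax: the helper names `open_listener` and `poll_once`, the default values, and the loopback address in the demo are assumptions, not taken from the question. In Python 3, socket.error is an alias of OSError and socket.timeout is a subclass of it.

```python
import socket


def open_listener(host="", port=0, backlog=5, timeout=0.2):
    """Create a listening TCP socket; the caller is responsible for closing it."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))   # port 0 asks the OS for any free port
        sock.listen(backlog)      # ask the OS to queue pending connections
        sock.settimeout(timeout)  # accept() gives up after `timeout` seconds
    except OSError:
        sock.close()              # do not leak the socket if setup fails
        raise
    return sock


def poll_once(listener):
    """Return (client, address), or None if nobody connected before the timeout."""
    try:
        return listener.accept()
    except socket.timeout:
        return None


# The with block closes the listening socket even if the body raises.
with open_listener(host="127.0.0.1") as listener:
    result = poll_once(listener)

print(result)             # None: no client connected during the timeout
print(listener.fileno())  # -1: the socket has been closed
```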

answered Nov 18, 2011 at 15:30

0

If you do

socket.bind(('localhost', RandomPort))

your server machine will only accept connections from itself, i.e. localhost.

Instead, do

socket.bind(('', RandomPort))

to listen on all interfaces.
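
You can see the difference by asking a bound socket which address it ended up on. A quick sketch in Python 3 syntax; `bound_address` is a made-up helper name, and 127.0.0.1 is used in place of 'localhost', which normally resolves to it.

```python
import socket


def bound_address(host):
    """Bind a throwaway TCP socket to `host` and report the address it is bound to."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0: any free port
        return sock.getsockname()[0]


print(bound_address("127.0.0.1"))  # 127.0.0.1, the loopback interface only
print(bound_address(""))           # 0.0.0.0, meaning every IPv4 interface
```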

answered Nov 18, 2011 at 20:07

1 Comment

Oh, and before I forget it - start getting familiar with AF_INET6 and/or getaddrinfo() for new applications.
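
A rough idea of what that could look like, in Python 3 syntax. The function name `listen_any_family` and its defaults are assumptions for the example; it simply tries each address that getaddrinfo() suggests for a passive (listening) socket, IPv4 or IPv6, until one can be bound.

```python
import socket


def listen_any_family(port=0, backlog=5):
    """Return a listening socket on the first wildcard address that can be bound."""
    last_error = None
    candidates = socket.getaddrinfo(
        None, port,
        family=socket.AF_UNSPEC,   # accept IPv4 and IPv6 results
        type=socket.SOCK_STREAM,
        flags=socket.AI_PASSIVE,   # wildcard addresses suitable for bind()
    )
    for family, socktype, proto, _canonname, sockaddr in candidates:
        try:
            sock = socket.socket(family, socktype, proto)
        except OSError as exc:     # e.g. address family not available here
            last_error = exc
            continue
        try:
            sock.bind(sockaddr)
            sock.listen(backlog)
            return sock
        except OSError as exc:
            last_error = exc
            sock.close()
    raise last_error or OSError("getaddrinfo returned no usable address")


listener = listen_any_family()
print(listener.family, listener.getsockname())
listener.close()
```

Which family comes first depends on the system configuration, so the printed address will vary from machine to machine.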
