4
\$\begingroup\$

Below is a piece of code which takes a list of IPs, pings the hosts and returns (prints) the list for "Alive" and "Dead". Any feedback on what can be done better is welcomed, mainly, speed and reliability! Would you accept it as production code? What about try/except blocks and logging?

import subprocess
from multiprocessing import Pool, cpu_count
hosts = ('127.0.0.1', 'google.com', '127.0.0.2', '10.1.1.1')
def ping(host):
 ret = subprocess.call(['ping', '-c', '5', '-W', '3', host], stdout=open('/dev/null', 'w'), stderr=open('/dev/null', 'w'))
 return ret == 0
host_list = {"Alive":[], "Dead":[]}
host_stat = {True: host_list["Alive"], False: host_list["Dead"]}
p = Pool(cpu_count())
[host_stat[success].append(host) for host, success in zip(hosts, p.map(ping, hosts))]
print host_stat
200_success
146k22 gold badges190 silver badges479 bronze badges
asked Jan 3, 2015 at 18:00
\$\endgroup\$
0

1 Answer 1

8
\$\begingroup\$

I’ll start with a meta comment:

  • Python (and production code) emphasis readability. There are several parts of your code which are quick difficult to read, or work out what they’re for (more details below). In production code, we don’t just want speed and reliability – we also want code that somebody else can debug in a year, who’s never seen it before. Try to write something that’s easy to follow.

Then a few style comments:

  • I prefer to list module imports (after grouping them per PEP 8’s recommendations) in alphabetical order. If you have a long list of module imports, this makes it much easier to quickly see whether a module has been imported, or find duplicate imports.

  • Quoting PEP 8 on line length:

    Limit all lines to a maximum of 79 characters.

    The line which defines ret is 126 characters long, which means I can't see it without scrolling off the screen. Ick.

More general comments:

  • Add comments and docstrings. I don’t know what any of this code does, and even if I work it out, I don’t know why you wrote it that way. For example, what does the ping function return, and why are you passing all those flags to the ping utility?

  • Use better variable names. A few examples:

    • I expect a variable named host_list to be a list, but it’s actually a dict.
    • In production code, I steer clear of single-character variable names like p. Fine in small scripts, but it’s a pain to grep for later.
    • I’d change the name of the ping function so it can be discussed unambiguously from the command-line ping utility.

    I wouldn’t accept these variable names in "production code".

  • Not everything is encapsulated in a function. The ping function provides a nice wrapper around the ping utility, but checking the list of hosts is handled at the main level. Wrap this in a function that takes a list of hosts, and returns a list of alive/dead. That makes it easier to reuse this code later.

    Then wrap the testing of this particular list of hosts in a main() function, and use the if __name__ == '__main__': main() construction. More portable.

  • You assume the list of host names is correct. If I drop in something that clearly isn’t a hostname (for example, '127.0.0.1 notahostname' into the hosts tuple), I get a non-zero return code from the subprocess call, but it’s not because that host is dead – it’s because it’s nonsensical.

    I would consider storing the return code from each host: this is a much more useful indicator of what’s happened. Alternatively, you could check that hostnames are valid before you pass them in – this Stack Overflow answer looks like it might be useful.

  • There seems to be duplication of data in host_list and host_stat. I’m really not sure what the point of having both of these dictionaries is. Why not just start with empty lists in host_stat?

  • That list comprehension is too complicated. Not only is it too long (86 chars), but it’s not immediately obvious what it does. Break it down across multiple lines (remember, Python emphasises readability).

  • Using /dev/null to throw away the output from subprocess is platform-specific. It’s better to use os.devnull, which should work on non-Unix systems. (from How to hide output of subprocess in Python 2.7)

Four woolier comments:

  • Are the keys and values of host_stat the right way round? Once you have a dict of host availability data, I’d imagine you’re going to go through the list of hosts one-at-a-time and check if they’re available. For that, surely having the keys be the hostnames makes more sense (but that’s just me).

  • Why is hosts a tuple? I'd make it a list instead. Also, break it across multiple lines (see below).

  • I don’t see the value of multiprocessing here. Perhaps if you’re doing it for lots of hostnames, but otherwise I wouldn’t bother.

  • What do you need logging for? The only thing that might be useful is the return code from ping, but since you’re throwing that away, I’m not sure what you think you’d like to log.

With those comments in mind, here's how I'd rewrite your script:

import os
import subprocess
def ping_return_code(hostname):
 """Use the ping utility to attempt to reach the host. We send 5 packets
 ('-c 5') and wait 3 milliseconds ('-W 3') for a response. The function
 returns the return code from the ping utility.
 """
 ret_code = subprocess.call(['ping', '-c', '5', '-W', '3', hostname],
 stdout=open(os.devnull, 'w'),
 stderr=open(os.devnull, 'w'))
 return ret_code
def verify_hosts(host_list):
 """For each hostname in the list, attempt to reach it using ping. Returns a
 dict in which the keys are the hostnames, and the values are the return
 codes from ping. Assumes that the hostnames are valid.
 """
 return_codes = dict()
 for hostname in host_list:
 return_codes[hostname] = ping_return_code(hostname)
 return return_codes
def main():
 hosts_to_test = [
 '127.0.0.1',
 'google.com',
 '127.0.0.2',
 '10.1.1.1'
 ]
 print verify_hosts(hosts_to_test)
if __name__ == '__main__':
 main()
answered Jan 3, 2015 at 18:58
\$\endgroup\$
5
  • \$\begingroup\$ Great comments! All, very pythonic and...common sense! Thank you! \$\endgroup\$ Commented Jan 3, 2015 at 19:10
  • \$\begingroup\$ It seems that you would not make use of multiprocessing module. My thought was to speed up the pinging process as the list might have hundreds of IPs. Any comment? \$\endgroup\$ Commented Jan 3, 2015 at 19:18
  • 3
    \$\begingroup\$ I would use return {hostname: ping_return_code(hostname) for hostname in host_list} for verify_hosts. My pants all get in knots if people don't use with for files, too, even if they are /dev/null. \$\endgroup\$ Commented Jan 3, 2015 at 19:35
  • 1
    \$\begingroup\$ @flamenco: If you have lots of hostnames and plenty of cores, that’s probably appropriate. It’s a tradeoff: the code will run faster, but probably be a bit more complicated. I usually come down on the side of simpler code, but there’s an argument to be made for the opposite. \$\endgroup\$ Commented Jan 3, 2015 at 19:53
  • 1
    \$\begingroup\$ @alexwlchan why use a list instead of a tuple for hosts_to_test? \$\endgroup\$ Commented Jan 5, 2015 at 11:11

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.