4
\$\begingroup\$

I am a novice when it comes to Python. I am working stuff related to Flowshop scheduling and so to generate benchmarks I used one by Taillard (Link to his homepage). The details are in the technical report.

Since I ported the code from C which I am not really good at please review my Python script:

# Format - random seed, number of jobs, number of machines
# Use a separate file and precede comments with a # symbol
from numpy import int32
import math
problem = list()
# Generate a random number uniformly between low and high
def unif(seed, low, high):
 """This function generates a random number between low\
 and high based on seed
 Seed is implemented as static for set of random values based on a series\
 of seed values"""
 if getattr(unif, "seed_value",0) == 0 or unif.seed_value is None:
 #create attribute seed_value which is static for method unif
 unif.seed_value = seed #inititalization if attribute does not exist
 # constants for random number
 a = 16807
 b = 127773
 c = 2836
 m = 2**(31) - 1
 k = math.floor(unif.seed_value / b)
 unif.seed_value = int32(a * (unif.seed_value % b) - k * c)
 # use numpy 32 bit int
 if unif.seed_value < 0: # 32 bit int overflow
 unif.seed_value = unif.seed_value + m
 value_0_1 = unif.seed_value/float(m)
 return int(low + math.floor(value_0_1 * (high - low + 1))) # random\
 # number generated
# generates a problem for given number of machines and jobs in problem tuple
def generate_flow_shop(p, output):
 """Generates a flow shop problem set for a set in problem tuple and\
 writes to file in output"""
 time_seed = int(problem[p][0])
 d = list() # list containing the problem
 for i in range( int(problem[p][2]) ):
 line = list()
 line = [ unif(time_seed, 1, 99) for j in range(int(problem[p][1])) ]
 # for j in range(0, int(problem[p][1])):
 # line.append(unif(time_seed, 1, 99))
 # 99 - max duration of op.
 d.append(line)
 unif.seed_value = None
 for line in d:
 datum = ' '.join(map(str,line))
 output.write("%s\n" % (datum))
def main():
 name = input("Enter filename to write the generated values:\n> ")
 seedfile = input("Enter filename of the seed values:\n> ")
 with open(seedfile) as in_file:
 for i in in_file:
 if line[0] == '#':
 continue
 else:
 line = i.strip().split()
 problem.append(tuple( map(int,line) ) )
 with open("%s" % (name), "w") as out_file:
 for i in range(0, len(problem)):
 out_file.write("%s %04d %s %s\n" %\
 ("p", i, problem[i][1], problem[i][2]))
 generate_flow_shop(i, out_file)
if __name__ == '__main__':
 main()

The link to the original C program is here.

This is the link to the output which is the expected output ie. the code is actually correct.

The main purpose of this module is to generate a set of values that can be used to test our implemented solutions ie. it is a generated problem set but not completely random but reproducible if the seeds entered are the same.

Note: The seedfile is a file that contains the values that are defined in the struct problem in the C program.

What I mainly need feedback on are:

  • Can the script be improved further with respect to use of Python idioms?
  • Since the generated file should again be parsed as input for testing, is writing to a file and then opening it again to test the scripts the best way? Is the written file format the best one for parsing again as input?
Peilonrayz
44.4k7 gold badges80 silver badges157 bronze badges
asked May 25, 2017 at 19:09
\$\endgroup\$

2 Answers 2

4
\$\begingroup\$
  • unif.seed_value is very un-Pythonic. Instead use a class.
  • Rather than using getattr(unif, "seed_value",0), you can make self.seed_value at object instantiation, and use that as the default, unless it's falsy, where it'd instead use seed.
  • math.floor is mostly the same as int. If you're using this on division, you can change it to floor divisions using //.
  • numpy.int32 is odd, it adds a dependency to numpy that shouldn't be needed. Instead you can use & or %, this also means that it doesn't throw an error if the number is greater than (1 << 31) - 1.
  • unif.seed_value/float(m) is the same as unif.seed_value/m in Python 3. In Python 2 it'd do floor division, unless one number is a float.
  • generate_flow_shop shouldn't need to take output, that should be implemented in main.
  • generate_flow_shop should take a single problem rather than the list of problems.
  • You can use tuple unpacking to give all values in the problem a name.
  • I heavily recommend str.format over the old % formatting operator. It has less bugs, and is enhanced in 3.6 with f-strings.
  • You can use _ as a 'throw away variable' in Python, it's somewhat a standard and lets developers ignore that variable, greatly increasing readability.

unif uses some fairly odd math, and so I decided to go for a more succinct approach. And so I'd change your code to: (note, it's untested)

class Random:
 def __init__(self):
 self.seed_value = None
 def unif(self, seed, low, high):
 seed_value = self.seed_value or seed
 a = 16807
 b = 127773
 c = 2836
 m = (1 << 31) - 1
 seed_value = (a * (seed_value % b) - (seed_value // b) * c)
 self.seed_value = (seed_value + (seed_value < 0)) & m
 return int(low + self.seed_value / m * (high - low + 1))
 def generate_flow_shop(self, problem):
 """Generates a flow shop problem set for a set in problem tuple and\
 writes to file in output"""
 time_seed, num_jobs, num_mach = problem
 r = [
 [
 self.unif(time_seed, 1, 99)
 for _ in range(num_jobs)
 ]
 for _ in range(num_mach)
 ]
 self.seed_value = None
 return r
def main():
 name = input("Enter filename to write the generated values:\n> ")
 seedfile = input("Enter filename of the seed values:\n> ")
 r = Random()
 with open(seedfile) as in_file:
 problems = [
 tuple(map(int, line.strip().split()))
 for line in in_file
 if line[0] != "#"
 ]
 with open(name, "w") as out_file:
 for i, problem in enumerate(problems):
 out_file.write("p {0:04} {1[1]} {1[2]}\n".format(i, problem)))
 data = r.generate_flow_shop(problem)
 out_file.write(''.join(' '.join(line) + '\n' for line in data))
if __name__ == '__main__':
 main()
answered May 25, 2017 at 21:10
\$\endgroup\$
2
  • \$\begingroup\$ The overflow part of the int which should be an error was incorporated as a feature in the program and so the values seem to differ because in your program. There are few trivial bugs in it like extra brackets and join used on int but nothing too big. How do I reproduce overflow without numpy int32? Because without it the generated values change. Other than that the code is blazing fast so thanks on the other suggestions. \$\endgroup\$ Commented May 26, 2017 at 6:26
  • \$\begingroup\$ I used this to imitate int32 overflow behaviour - if -(m+1) > seed_value or seed_value > m: seed_value %= (2*m-1) if seed_value > m: seed_value = m - seed_value \$\endgroup\$ Commented May 26, 2017 at 6:56
0
\$\begingroup\$

As far as styling goes- Triple quotation marks signifying a docstring end should be on a seperate line:

def func():
 """This docstring explains
 how the function should
 be used.
 """

Also, there is no need to escape newlines in docstrings. I suppose you could open the file in w+ or r+ mode (wr / rw);

with open(seedfile, "r+") as f:
 # magic
answered May 25, 2017 at 19:31
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.