I have started to learn Python and have chosen Conway's game of life as my first program. I would be interested in reading how to write more idiomatic Python. Also, what threw me off for some time was that everything is passed by reference and assignment of a list doesn't copy its values but copies the reference. Therefore, I have used the deepcopy function, but I am thinking that lists might be the wrong choice in this case. What would be a better choice in Python?
""" Implementation of LIFE """
import copy
# PARAMETERS
# Number of generations to simulate
N_GENERATIONS = 10
# Define the field. Dots (.) are dead cells, the letter "o" represents living cells
INITIAL_FIELD = \
"""
...................
...................
...................
...................
.ooooo.ooooo.ooooo.
...................
...................
...................
...................
"""
# FUNCTIONS
def print_field(field_copy, dead_cells=' ', living_cells='x'):
    """Pretty-print the current field."""
    field_string = "\n".join(["".join(x) for x in field_copy])
    field_string = field_string.replace('.', dead_cells)
    field_string = field_string.replace('o', living_cells)
    print(field_string)

def get_neighbours(field_copy, x, y):
    """Get all neighbours around a cell with position x and y
    and return them in a list."""
    n_rows = len(field_copy)
    n_cols = len(field_copy[0])
    if y == 0:
        y_idx = [y, y+1]
    elif y == n_rows - 1:
        y_idx = [y-1, y]
    else:
        y_idx = [y-1, y, y+1]
    if x == 0:
        x_idx = [x, x+1]
    elif x == n_cols - 1:
        x_idx = [x-1, x]
    else:
        x_idx = [x-1, x, x+1]
    neigbours = [field_copy[row][col] for row in y_idx for col in x_idx if (row, col) != (y, x)]
    return neigbours

def count_living_cells(cell_list):
    """Count the living cells."""
    accu = 0
    for cell in cell_list:
        if cell == 'o':
            accu = accu + 1
    return accu

def update_field(field_copy):
    """Update the field to the next generation."""
    new_field = copy.deepcopy(field_copy)
    for row in range(len(field_copy)):
        for col in range(len(field_copy[0])):
            living_neighbours = count_living_cells(get_neighbours(field_copy, col, row))
            if living_neighbours < 2 or living_neighbours > 3:
                new_field[row][col] = '.'
            elif living_neighbours == 3:
                new_field[row][col] = 'o'
    return new_field
# MAIN
# Convert the initial playfield to an array
field = str.splitlines(INITIAL_FIELD)
field = field[1:] # Getting rid of the empty first element due to the multiline string
field = [list(x) for x in field]
print("Generation 0")
print_field(field)
for generation in range(1, N_GENERATIONS+1):
    field = update_field(field)
    print(f"Generation {generation}")
    print("")
    print_field(field)
    print("")
I would prefer using e.g. a numpy array to hold the field rather than a string. – Dschoni, Aug 27, 2020 at 14:15
3 Answers
I think your get_neighbours function can be cleaned up using min and max, and by making use of ranges:
def get_neighbours(field_copy, x, y):
    """Get all neighbours around a cell with position x and y
    and return them in a list."""
    n_rows = len(field_copy)
    n_cols = len(field_copy[0])
    min_x = max(0, x - 1)
    max_x = min(x + 1, n_cols - 1)
    min_y = max(0, y - 1)
    max_y = min(y + 1, n_rows - 1)
    return [field_copy[row][col]
            for row in range(min_y, max_y + 1)
            for col in range(min_x, max_x + 1)
            if (row, col) != (y, x)]
It's still quite long, but it does away with all the messy if dispatching to hard-coded lists of indices. I also broke up the list comprehension over a few lines. Whenever my comprehensions start to get a little long, I break them up like that. I find it significantly helps readability.
For
"\n".join(["".join(x) for x in field_copy])
you don't need the []:
"\n".join("".join(x) for x in field_copy)
Without the square brackets, it's a generator expression instead of a list comprehension. Generator expressions are lazy, which avoids building a throwaway list just to feed it to a consumer such as sum or any. For join specifically the gain is mostly stylistic, since CPython's str.join collects its argument into a sequence internally anyway, but the shorter form is the more common idiom.
I wouldn't represent the board as a 2D list of strings. This likely uses up more memory than necessary, and especially with how you have it now, you're forced to remember what string symbol represents what. On top of that, you have two sets of string symbols: one used internally for logic ('o' and '.'), and the other for when you print out (' ' and 'x'). This is more confusing than it needs to be.
If you really wanted to use strings, you should have a global constant at the top that clearly defines what string is what:
DEAD_CELL = '.'  # At the very top somewhere
ALIVE_CELL = 'o'
. . .
if living_neighbours < 2 or living_neighbours > 3:  # Later on in a function
    new_field[row][col] = DEAD_CELL
elif living_neighbours == 3:
    new_field[row][col] = ALIVE_CELL
Strings like '.' floating around fall into the category of "magic numbers": values that are used loose in a program that don't have a self-explanatory meaning. If the purpose of a value isn't self-evident, store it in a variable with a descriptive name so you and your readers know exactly what's going on in the code.
Personally though, when I write GoL implementations, I use a 1D or 2D list of Boolean values, or a set of tuples representing alive cells. For the Boolean list versions, if a cell is alive, it's true, and if it's dead it's false. For the set version, a cell is alive if it's in the set, otherwise it's dead.
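To make the set-of-tuples idea concrete, here is a minimal sketch. It assumes an unbounded board (unlike the fixed-size field in the question), and the names step and blinker are made up for this illustration:

```python
from collections import Counter


def step(alive):
    """Return the next generation for a set of (row, col) live cells."""
    # For every cell adjacent to a live cell, count its live neighbours.
    counts = Counter(
        (r + dr, c + dc)
        for r, c in alive
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # Birth on exactly 3 neighbours; survival on 2 or 3.
    return {
        cell
        for cell, n in counts.items()
        if n == 3 or (n == 2 and cell in alive)
    }


# A horizontal "blinker" flips to vertical and back again.
blinker = {(1, 0), (1, 1), (1, 2)}
print(sorted(step(blinker)))
```

Because each call builds a fresh set, no copying of the previous generation is needed with this representation.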
I'd tuck all the stuff at the bottom into a main function. You don't necessarily always want all of that running simply because you loaded the file.
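One common way to arrange that is sketched below. The tiny field and the placeholder update_field are stand-ins so the snippet runs on its own; in the real program they would be the existing definitions:

```python
N_GENERATIONS = 2


def update_field(field):
    """Placeholder for the real update logic (returns the field unchanged)."""
    return field


def main():
    """Run the simulation and print every generation."""
    field = [list(row) for row in ("...", "ooo", "...")]
    for generation in range(N_GENERATIONS + 1):
        print(f"Generation {generation}")
        print("\n".join("".join(row) for row in field))
        field = update_field(field)


# Only run the simulation when the file is executed as a script,
# not when it is imported from another module or a test.
if __name__ == "__main__":
    main()
```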
For the sake of efficiency, instead of constantly creating new field copies every generation, a common trick is to create two right at the start, then swap them every generation.
The way I do it is one field is the write_field and one is the read_field. As the names suggest, all writes happen to the write_field, and all reads from read_field. After each "tick", you simply swap them; read_field becomes the new write_field and write_field becomes read_field. This saves you from the expensive deepcopy call once per tick.
You can do this swap quite simply in Python:
write_field, read_field = read_field, write_field
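Put together, the two-buffer approach might look roughly like this. It is a sketch rather than a drop-in replacement: the two-argument update_field and the helper count_living_neighbours are assumptions made for this example, and the field is a small 3x3 board:

```python
ALIVE, DEAD = 'o', '.'


def count_living_neighbours(field, row, col):
    """Count live cells around (row, col), clamping at the board edges."""
    n_rows, n_cols = len(field), len(field[0])
    return sum(
        field[r][c] == ALIVE
        for r in range(max(0, row - 1), min(row + 1, n_rows - 1) + 1)
        for c in range(max(0, col - 1), min(col + 1, n_cols - 1) + 1)
        if (r, c) != (row, col)
    )


def update_field(read_field, write_field):
    """Write the next generation of read_field into write_field."""
    for row in range(len(read_field)):
        for col in range(len(read_field[0])):
            n = count_living_neighbours(read_field, row, col)
            if n == 3:
                write_field[row][col] = ALIVE
            elif n == 2:
                # Unchanged cells must be copied as well, because
                # write_field still holds data from an older generation.
                write_field[row][col] = read_field[row][col]
            else:
                write_field[row][col] = DEAD


read_field = [list(row) for row in ("...", "ooo", "...")]
write_field = [row[:] for row in read_field]  # second buffer, allocated once

for _ in range(2):
    update_field(read_field, write_field)
    write_field, read_field = read_field, write_field
```

After the swap, read_field always holds the newest generation, so that is the one to print.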
You forgot to mention the pythonic way of doing a swap: write_field, read_field = read_field, write_field. – Mark Ransom, Aug 27, 2020 at 18:52
@MarkRansom Added. Thank you. – Carcigenicate, Aug 27, 2020 at 19:22
Thanks for all your great suggestions! I have implemented them. When it comes to the read_field and write_field suggestion, I have changed the update_field function to accept two arguments (both fields) and added an else condition in order to copy the read_field value in case there are 2 neighbours (and thus no update). With that, the swap works correctly in the for loop. Is this what you meant? – p.vitzliputzli, Aug 28, 2020 at 7:32
@p.vitzliputzli That sounds about right. If you can read C to any extent, you can see my first C program here, which was the GoL, and implements that technique. You can see that I'm holding both arrays in a World struct at the top. – Carcigenicate, Aug 28, 2020 at 12:03
Actually, I'm realizing now that I'm doing it a kind of dumb way there and doing a copy from the one array to the other instead of swapping. Idk why I did it that way. Ffs younger self. – Carcigenicate, Aug 28, 2020 at 12:08
Comment 1
There is no need to have a special case for printing Generation 0.
Just let your range start from 0 and print before you update.
for generation in range(N_GENERATIONS+1):
    print(f"Generation {generation}")
    print("")
    print_field(field)
    print("")
    field = update_field(field)
Comment 2
Also, it looks like you are adjusting your code quite a bit to the way you define INITIAL_FIELD as a multiline string, just because it looks nice that way in the code window. This is backwards.
You should rather define it as a list of strings so that you don't have to do splitlines and those other things on it before starting the program. It stays human-readable with one string per line, and no backslash line continuations (\) are needed, because Python allows line breaks inside the square brackets.
INITIAL_FIELD = [
    "...................",
    "...................",
    # etc.
]
Comment 3
def print_field(field_copy, dead_cells=' ', living_cells='x'):
This function accepts two optional parameters (dead_cells and living_cells), but no call to it ever passes them in. So they are in fact just internal variables and should not be in the function definition.
Comment 4
field_string = field_string.replace('.', dead_cells)
field_string = field_string.replace('o', living_cells)
print(field_string)
This is unnecessary repetition and hard to read. I would rather chain those three lines into one:
print(field_string.replace('.', dead_cells).replace('o', living_cells))
Comment 5
def count_living_cells(cell_list):
    """Count the living cells."""
    accu = 0
    for cell in cell_list:
        if cell == 'o':
            accu = accu + 1
    return accu
This is also backwards, due to how you represent your cells as characters and strings.
It would be more sensible I think to prioritize simple program logic and let the print functions adjust as needed.
If you represent live cells as the number 1 and dead cells as the number 0, then a cell list would look like [0,1,1,0,0,1,0] and this function could be written as
return sum(cell_list)
Actually, you wouldn't even need a function anymore, since this is so short.
In your print function you could then replace 1 by some other character and 0 by some other character before printing.
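As a rough sketch of that split between logic and presentation (the function name field_as_string and its parameter names are invented for this example):

```python
def count_living_cells(cell_list):
    """With 1 for alive and 0 for dead, counting is just a sum."""
    return sum(cell_list)


def field_as_string(field, dead_cell=' ', living_cell='x'):
    """Translate the numeric field into characters only when displaying it."""
    return "\n".join(
        "".join(living_cell if cell else dead_cell for cell in row)
        for row in field
    )


field = [
    [0, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]
print(count_living_cells([0, 1, 1, 0, 0, 1, 0]))
print(field_as_string(field, dead_cell='.'))
```

The display characters now live in exactly one place, and the update logic never has to compare against them.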
Thanks for your suggestions! Especially comment 5 is great! – p.vitzliputzli, Aug 27, 2020 at 18:44
The code you posted offers a good example of the benefits that can flow downstream from a greater investment upfront in conceptual and naming consistency. As written, the code has two different ways to represent alive or dead cells, it toggles back and forth between the language of rows/columns and the language of x/y coordinates, and it switches between field and field_copy.
When you hit that point in the development of a program, it's useful to step back and commit yourself to some consistency. For example:
field : list of rows
row : list of cells
cell : either 'x' (alive) or space (dead)
r : row index
c : column index
And let's also start on a solid foundation by putting all code in functions, adding a tiny bit of flexibility to usage so we can vary the number of generations on the command line (handy for debugging and testing). In addition, we want to maintain a strict separation between the algorithmic parts of the program and the parts of the program that deal with printing and presentation. Here's one way to start on that path:
import sys

ALIVE = 'x'
DEAD = ' '

# Every row is 19 characters wide, matching the field in the question.
INITIAL_FIELD_TEMPLATE = [
    '                   ',
    '                   ',
    '                   ',
    '                   ',
    ' xxxxx xxxxx xxxxx ',
    '                   ',
    '                   ',
    '                   ',
    '                   ',
]

DEFAULT_GENERATIONS = 10

def main(args):
    # Setup: initial field and N of generations.
    init = [list(row) for row in INITIAL_FIELD_TEMPLATE]
    # Appending the default means args[0] exists even with no CLI argument.
    args.append(DEFAULT_GENERATIONS)
    n_generations = int(args[0])
    # Run Conway: we now have the fields for all generations.
    fields = list(conway(n_generations, init))
    # Analyze, report, whatever.
    for i, f in enumerate(fields):
        s = field_as_str(f)
        print(f'\nGeneration {i}:\n{s}')

def conway(n, field):
    for _ in range(n + 1):
        yield field  # Temporary implementation.

def field_as_str(field):
    return '\n'.join(''.join(row) for row in field)

if __name__ == '__main__':
    main(sys.argv[1:])
Starting on that foundation, the next step is to make conway() do something interesting -- namely, compute the field for the next generation. The new_field() implementation is easy if we define a couple of range constants.
RNG_R = range(len(INITIAL_FIELD_TEMPLATE))
RNG_C = range(len(INITIAL_FIELD_TEMPLATE[0]))

def new_field(field):
    return [
        [new_cell_value(field, r, c) for c in RNG_C]
        for r in RNG_R
    ]

def new_cell_value(field, r, c):
    return field[r][c]  # Temporary implementation.
And then the next step is to implement a real new_cell_value(), which we know will lead us to thinking about neighboring cells. In these 2D grid situations, neighbor logic can often be simplified by expressing the neighbors in relative (R, C) terms in a simple data structure:
NEIGHBOR_SHIFTS = [
    (-1, -1), (-1, 0), (-1, 1),
    (0, -1),           (0, 1),
    (1, -1),  (1, 0),  (1, 1),
]

def new_cell_value(field, r, c):
    n_living = sum(
        cell == ALIVE
        for cell in neighbor_cells(field, r, c)
    )
    return (
        field[r][c] if n_living == 2 else
        ALIVE if n_living == 3 else
        DEAD
    )

def neighbor_cells(field, r, c):
    return [
        field[r + dr][c + dc]
        for dr, dc in NEIGHBOR_SHIFTS
        if (r + dr) in RNG_R and (c + dc) in RNG_C
    ]
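The code above never replaces the temporary body of conway(). One possible way to finish it is sketched here. The advance parameter is a hypothetical addition so the snippet runs on its own; in the program above it would simply be new_field:

```python
def conway(n, field, advance):
    """Yield the initial field followed by n successive generations.

    advance is any function mapping one field to the next generation
    (an assumption of this sketch; the program above would use new_field).
    """
    for _ in range(n):
        yield field
        field = advance(field)
    yield field


# Demonstration with integers standing in for fields.
print(list(conway(3, 0, lambda x: x + 1)))
```

Like the temporary version, this yields n + 1 fields in total, so the reporting loop in main() would need no changes.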
One final note: by adopting a consistent naming convention and by decomposing the problem into fairly small functions, we can get away with many short variable names, which lightens the visual weight of the code and helps with readability. Within small scopes and within a clear context (both are crucial), short variable names tend to increase readability. Consider neighbor_cells(): r and c work because our convention is followed everywhere; RNG_R and RNG_C work because they build on that convention; dr and dc work partly for the same reason and partly because they have the context of an explicitly named container, NEIGHBOR_SHIFTS.
Thanks for your great suggestions! I used 'field_copy' instead of field because the linter was complaining about shadowing the global, but with your approach, this is elegantly circumvented. – p.vitzliputzli, Aug 29, 2020 at 13:36