LIFE in Python 3

Question 1

I have started to learn Python and have chosen Conway's game of life as my first program. I would be interested in reading how to write more idiomatic Python. Also, what threw me off for some time was that everything is passed by reference and assignment of a list doesn't copy its values but copies the reference. Therefore, I have used the deepcopy function, but I am thinking that lists might be the wrong choice in this case. What would be a better choice in Python?

""" Implementation of LIFE """
import copy
# PARAMETERS
# Number of generations to simulate
N_GENERATIONS = 10
# Define the field. Dots (.) are dead cells, the letter "o" represents living cells
INITIAL_FIELD = \
"""
...................
...................
...................
...................
.ooooo.ooooo.ooooo.
...................
...................
...................
...................
"""
# FUNCTIONS
def print_field(field_copy, dead_cells=' ', living_cells='x'):
    """Pretty-print the current field."""
    field_string = "\n".join(["".join(x) for x in field_copy])
    field_string = field_string.replace('.', dead_cells)
    field_string = field_string.replace('o', living_cells)
    print(field_string)
def get_neighbours(field_copy, x, y):
    """Get all neighbours around a cell with position x and y
    and return them in a list."""
    n_rows = len(field_copy)
    n_cols = len(field_copy[0])
    if y == 0:
        y_idx = [y, y+1]
    elif y == n_rows - 1:
        y_idx = [y-1, y]
    else:
        y_idx = [y-1, y, y+1]
    if x == 0:
        x_idx = [x, x+1]
    elif x == n_cols - 1:
        x_idx = [x-1, x]
    else:
        x_idx = [x-1, x, x+1]
    neigbours = [field_copy[row][col] for row in y_idx for col in x_idx if (row, col) != (y, x)]
    return neigbours
def count_living_cells(cell_list):
    """Count the living cells."""
    accu = 0
    for cell in cell_list:
        if cell == 'o':
            accu = accu + 1
    return accu
def update_field(field_copy):
    """Update the field to the next generation."""
    new_field = copy.deepcopy(field_copy)
    for row in range(len(field_copy)):
        for col in range(len(field_copy[0])):
            living_neighbours = count_living_cells(get_neighbours(field_copy, col, row))
            if living_neighbours < 2 or living_neighbours > 3:
                new_field[row][col] = '.'
            elif living_neighbours == 3:
                new_field[row][col] = 'o'
    return new_field
# MAIN
# Convert the initial playfield to an array
field = str.splitlines(INITIAL_FIELD)
field = field[1:] # Getting rid of the empty first element due to the multiline string
field = [list(x) for x in field]
print("Generation 0")
print_field(field)
for generation in range(1, N_GENERATIONS+1):
    field = update_field(field)
    print(f"Generation {generation}")
    print("")
    print_field(field)
    print("")

Question 2

I would prefer using e.g. a numpy array to hold the field rather than a string.

Question 3

I think your get_neighbours function can be cleaned up using min and max, and by making use of ranges:

def get_neighbours(field_copy, x, y):
    """Get all neighbours around a cell with position x and y
    and return them in a list."""
    n_rows = len(field_copy)
    n_cols = len(field_copy[0])
    min_x = max(0, x - 1)
    max_x = min(x + 1, n_cols - 1)
    min_y = max(0, y - 1)
    max_y = min(y + 1, n_rows - 1)
    return [field_copy[row][col]
            for row in range(min_y, max_y + 1)
            for col in range(min_x, max_x + 1)
            if (row, col) != (y, x)]

It's still quite long, but it does away with all the messy if dispatching to hard-coded lists of indices. I also broke up the list comprehension over a few lines. Whenever my comprehensions start to get a little long, I break them up like that. I find it significantly helps readability.

For

"\n".join(["".join(x) for x in field_copy])

You don't need the []:

"\n".join("".join(x) for x in field_copy)

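Without the square brackets, it's a generator expression instead of a list comprehension. Generator expressions are lazy, which saves you from building a list just so it can be fed into another function. The difference here is negligible (CPython's str.join collects its argument into a sequence internally before joining), but when feeding long sequences into functions such as sum or any, it can save memory.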

I wouldn't represent the board as a 2D list of strings. This likely uses up more memory than necessary, and especially with how you have it now, you're forced to remember what string symbol represents what. On top of that, you have two sets of string symbols: one used internally for logic ('o' and '.'), and the other for when you print out (' ' and 'x'). This is more confusing than it needs to be.

If you really wanted to use strings, you should have a global constant at the top that clearly defines what string is what:

DEAD_CELL = '.' # At the very top somewhere
ALIVE_CELL = 'o'
. . .
if living_neighbours < 2 or living_neighbours > 3: # Later on in a function
    new_field[row][col] = DEAD_CELL
elif living_neighbours == 3:
    new_field[row][col] = ALIVE_CELL

Strings like '.' floating around fall into the category of "magic numbers": values that are used loose in a program that don't have a self-explanatory meaning. If the purpose of a value isn't self-evident, store it in a variable with a descriptive name so you and your readers know exactly what's going on in the code.

Personally though, when I write GoL implementations, I use a 1D or 2D list of Boolean values, or a set of tuples representing alive cells. For the Boolean list versions, if a cell is alive, it's true, and if it's dead it's false. For the set version, a cell is alive if it's in the set, otherwise it's dead.
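As a rough sketch of the set-of-tuples representation: the names step and neighbours are made up for illustration, and unlike the question's code this board is unbounded rather than clipped to a fixed field, so treat it as one possible shape rather than a drop-in replacement.

```python
from collections import Counter

def neighbours(cell):
    """Yield the eight cells surrounding `cell`, given as a (row, col) tuple."""
    row, col = cell
    for d_row in (-1, 0, 1):
        for d_col in (-1, 0, 1):
            if (d_row, d_col) != (0, 0):
                yield (row + d_row, col + d_col)

def step(alive):
    """Return the next generation for a set of live (row, col) cells."""
    # Count how many live neighbours every candidate cell has.
    counts = Counter(n for cell in alive for n in neighbours(cell))
    # A cell is born with exactly 3 neighbours and survives with 2 or 3.
    return {
        cell
        for cell, count in counts.items()
        if count == 3 or (count == 2 and cell in alive)
    }

# Hypothetical starting pattern: a horizontal "blinker".
blinker = {(1, 0), (1, 1), (1, 2)}
next_generation = step(blinker)
```

Because only live cells are stored, a dead cell costs nothing, and there is no deepcopy involved: each call to step builds a fresh set.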

I'd tuck all the stuff at the bottom into a main function. You don't necessarily always want all of that running simply because you loaded the file.

For the sake of efficiency, instead of constantly creating new field copies every generation, a common trick is to create two right at the start, then swap them every generation.

The way I do it is one field is the write_field and one is the read_field. As the names suggest, all writes happen to the write_field, and all reads from read_field. After each "tick", you simply swap them; read_field becomes the new write_field and write_field becomes read_field. This saves you from the expensive deepcopy call once per tick.

You can do this swap quite simply in Python:

write_field, read_field = read_field, write_field
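Put together, the two-buffer approach might look roughly like this. It is a sketch under assumptions: cells are Booleans, the field is clipped at its edges like the original, and the run helper and the two-argument update_field signature are illustrative rather than taken from the question.

```python
def update_field(read_field, write_field):
    """Write the next generation of read_field into write_field."""
    n_rows, n_cols = len(read_field), len(read_field[0])
    for row in range(n_rows):
        for col in range(n_cols):
            # Sum the Boolean neighbours, clipping at the field edges.
            living = sum(
                read_field[r][c]
                for r in range(max(0, row - 1), min(row + 2, n_rows))
                for c in range(max(0, col - 1), min(col + 2, n_cols))
                if (r, c) != (row, col)
            )
            if living == 3:
                write_field[row][col] = True
            elif living == 2:
                # Unchanged cells must still be copied across, because
                # write_field holds stale data from an earlier generation.
                write_field[row][col] = read_field[row][col]
            else:
                write_field[row][col] = False

def run(field, n_generations):
    """Advance field by n_generations using two pre-allocated buffers."""
    read_field = [row[:] for row in field]
    write_field = [[False] * len(row) for row in field]
    for _ in range(n_generations):
        update_field(read_field, write_field)
        write_field, read_field = read_field, write_field
    return read_field

# Hypothetical 3x3 starting field: a horizontal "blinker".
blinker = [
    [False, False, False],
    [True, True, True],
    [False, False, False],
]
```

The explicit copy in the living == 2 branch is the price of reusing buffers: every cell of write_field has to be overwritten on every tick.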

Question 4

You forgot to mention the pythonic way of doing a swap: write_field, read_field = read_field, write_field.

Question 5

@MarkRansom Added. Thank you.

Question 6

Thanks for all your great suggestions! I have implemented them. For the read_field and write_field suggestion, I changed the update_field function to accept two arguments (both fields) and added an else branch that copies the read_field value when there are exactly 2 neighbours (and thus no update). With that, the swap works correctly in the for loop. Is this what you meant?

Question 7

@p.vitzliputzli That sounds about right. If you can read C to any extent, you can see my first C program here, which was the GoL, and implements that technique. You can see that I'm holding both arrays in a World struct at the top.

Question 8

Actually, I'm realizing now that I'm doing it a kind of dumb way there and doing a copy from the one array to the other instead of swapping. Idk why I did it that way. Ffs younger self.

Question 9

Comment 1

There is no need to have a special case for printing Generation 0.

Just let your range start from 0 and print before you update.

for generation in range(N_GENERATIONS+1):
    print(f"Generation {generation}")
    print("")
    print_field(field)
    print("")
    field = update_field(field)

Comment 2

Also, it looks like you are adjusting your code quite a bit to the way you define INITIAL_FIELD as a multiline string, just because it looks nice that way in the code window. This is backwards.

You should rather define it as a list of strings so that you don't have to call splitlines and do the other clean-up on it before starting the program. If you still want it to be human-readable, put one row per line; inside the square brackets Python continues lines implicitly, so no backslash (\) is needed.

INITIAL_FIELD = [
    "...................",
    "...................",
    # etc.
]

Comment 3

def print_field(field_copy, dead_cells=' ', living_cells='x'):

This function accepts two optional parameters (dead_cells and living_cells), but no call to it ever passes them in. So they are in fact just internal variables and should not be in the function definition.

Comment 4

field_string = field_string.replace('.', dead_cells)
field_string = field_string.replace('o', living_cells)
print(field_string)

This is unnecessary repetition and hard to read. I would rather chain those 3 lines into one

print(field_string.replace('.', dead_cells).replace('o', living_cells))

Comment 5

def count_living_cells(cell_list):
    """Count the living cells."""
    accu = 0
    for cell in cell_list:
        if cell == 'o':
            accu = accu + 1
    return accu

This is also backwards, due to how you represent your cells as characters and strings.

It would be more sensible I think to prioritize simple program logic and let the print functions adjust as needed. If you represent live cells as the number 1 and dead cells as the number 0, then a cell list would look like [0,1,1,0,0,1,0] and this function could be written as

return sum(cell_list)

Actually, you wouldn't even need a function anymore, since this is so short.

In your print function you could then replace 1 by some other character and 0 by some other character before printing.
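A small sketch of that idea, assuming cells are stored as 0 and 1. The ALIVE and DEAD constants and the field_as_text helper are illustrative names, not part of the original program.

```python
ALIVE, DEAD = 1, 0

def count_living_cells(cell_list):
    """Count the living cells; live cells are 1, so the sum is the count."""
    return sum(cell_list)

def field_as_text(field, dead_char=' ', living_char='x'):
    """Translate a field of 0/1 cells into printable text."""
    symbols = {DEAD: dead_char, ALIVE: living_char}
    return "\n".join("".join(symbols[cell] for cell in row) for row in field)

# Hypothetical two-row field used only to show the conversion.
field = [
    [0, 1, 0],
    [0, 1, 1],
]
print(field_as_text(field))
```

The characters used for display now live only in the printing code, so the game logic never has to know about them.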

Question 10

Thanks for your suggestions! Especially comment 5 is great!

Question 11

The code you posted offers a good example of the benefits that can flow downstream from a greater investment upfront in conceptual and naming consistency. As written, the code has two different ways to represent alive or dead cells, it toggles back and forth between the language of rows/columns and the language of x/y coordinates, and it switches between field and field_copy.

When you hit that point in the development of a program, it's useful to step back and commit yourself to some consistency. For example:

field : list of rows
row : list of cells
cell : either 'x' (alive) or space (dead)
r : row index
c : column index

And let's also start on a solid foundation by putting all code in functions, adding a tiny bit of flexibility to usage so we can vary the N of generations on the command line (handy for debugging and testing). In addition, we want to maintain a strict separation between the algorithmic parts of the program and the parts of the program that deal with printing and presentation. Here's one way to start on that path:

import sys
ALIVE = 'x'
DEAD = ' '
INITIAL_FIELD_TEMPLATE = [
    '                   ',
    '                   ',
    '                   ',
    '                   ',
    ' xxxxx xxxxx xxxxx ',
    '                   ',
    '                   ',
    '                   ',
    '                   ',
]
DEFAULT_GENERATIONS = 10
def main(args):
    # Setup: initial field and N of generations.
    init = [list(row) for row in INITIAL_FIELD_TEMPLATE]
    # The default is appended last, so it is only used when args is empty.
    args.append(DEFAULT_GENERATIONS)
    n_generations = int(args[0])
    # Run Conway: we now have the fields for all generations.
    fields = list(conway(n_generations, init))
    # Analyze, report, whatever.
    for i, f in enumerate(fields):
        s = field_as_str(f)
        print(f'\nGeneration {i}:\n{s}')
def conway(n, field):
    for _ in range(n + 1):
        yield field # Temporary implementation.
def field_as_str(field):
    return '\n'.join(''.join(row) for row in field)
if __name__ == '__main__':
    main(sys.argv[1:])

Starting on that foundation, the next step is to make conway() do something interesting -- namely, compute the field for the next generation. The new_field() implementation is easy if we define a couple of range constants.

RNG_R = range(len(INITIAL_FIELD_TEMPLATE))
RNG_C = range(len(INITIAL_FIELD_TEMPLATE[0]))
def new_field(field):
    return [
        [new_cell_value(field, r, c) for c in RNG_C]
        for r in RNG_R
    ]
def new_cell_value(field, r, c):
    return field[r][c] # Temporary implementation.

And then the next step is to implement a real new_cell_value(), which we know will lead us to thinking about neighboring cells. In these 2D grid situations, neighbor logic can often be simplified by expressing the neighbors in relative (R, C) terms in a simple data structure:

NEIGHBOR_SHIFTS = [
    (-1, -1), (-1, 0), (-1, 1),
    (0, -1),           (0, 1),
    (1, -1),  (1, 0),  (1, 1),
]
def new_cell_value(field, r, c):
    n_living = sum(
        cell == ALIVE
        for cell in neighbor_cells(field, r, c)
    )
    return (
        field[r][c] if n_living == 2 else
        ALIVE if n_living == 3 else
        DEAD
    )
def neighbor_cells(field, r, c):
    return [
        field[r + dr][c + dc]
        for dr, dc in NEIGHBOR_SHIFTS
        if (r + dr) in RNG_R and (c + dc) in RNG_C
    ]
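The temporary conway() from earlier still yields the same field every time. One way it could be finished is sketched below. To keep the snippet self-contained, the step function is passed in as a parameter, which is an assumption of this sketch; in the code above, conway(n, field) takes two arguments and would simply call new_field() directly. The flip function is a toy stand-in used only to show the generator's behaviour.

```python
def conway(n, field, step):
    """Yield the initial field, then n further generations.

    `step` computes the next field from the current one; in the code
    above that role would be played by new_field().
    """
    yield field
    for _ in range(n):
        field = step(field)
        yield field

def flip(field):
    """Toy stand-in for new_field(): reverse every row."""
    return [row[::-1] for row in field]

generations = list(conway(2, [['x', ' ']], flip))
```

Yielding before the loop keeps the contract of the temporary version (n + 1 fields, starting with generation 0) without computing a generation that is never used.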

One final note: by adopting a consistent naming convention and by decomposing the problem into fairly small functions, we can get away with many short variable names, which lightens the visual weight of the code and helps with readability. Within small scopes and within a clear context (both are crucial), short variable names tend to increase readability. Consider neighbor_cells(): r and c work because our convention is followed everywhere; RNG_R and RNG_C work because they build on that convention; dr and dc work partly for the same reason and partly because they have the context of an explicitly named container, NEIGHBOR_SHIFTS.

Question 12

Thanks for your great suggestions! I used 'field_copy' instead of field because the linter was complaining about shadowing the global, but with your approach, this is elegantly circumvented.

Accepted answer: Carcigenicate, 2020-08-27 15:05:17Z

