Optimise Game of Life in Rust

Question 1

I recently picked up Rust and was making a CLI for Conway's Game of Life. I got it working but, looking back at it, there are places it could be improved. The main one is the function that generates the next board. Specifically, the part of that function which calculates the amount of alive neighbours a cell has.

fn next_step(game_array: [[bool; WIDTH]; HEIGHT]) -> [[bool; WIDTH]; HEIGHT] {
 let mut next_state = [[false; WIDTH]; HEIGHT];
 let mut neighbours: u8; // Will never be above 8
 const HEIGHT_I: isize = HEIGHT as isize;
 const WIDTH_I: isize = WIDTH as isize;
 const NEIGHBOUR_LIST: [[isize; 2]; 8] = [[-1, -1], [-1, 0], [-1, 1],
 [ 0, -1], [ 0, 1],
 [ 1, -1], [ 1, 0], [ 1, 1]];
 for rownum in 0..HEIGHT {
 for cellnum in 0..WIDTH {
 // FROM HERE
 neighbours = 0;
 for [j, k] in NEIGHBOUR_LIST {
 // This will break if width and height are set to larger than isize::MAX
 if game_array[(((rownum as isize + j % HEIGHT_I) + HEIGHT_I) % HEIGHT_I) as usize]
 [(((cellnum as isize + k % WIDTH_I) + WIDTH_I) % WIDTH_I) as usize] {
 neighbours += 1;
 }
 }
 // TO HERE
 // This is the cleanest way I could find to implement Life rules
 if neighbours == 3 || (game_array[rownum][cellnum] && neighbours == 2) {
 next_state[rownum][cellnum] = true;
 }
 }
 }
 next_state
}

I wanted the edges of the board to loop and this is the best way I could find to check the neighbours of a cell. However, it is verbose and hard to read. Is there a better way that I am missing?

HEIGHT and WIDTH are the height and width of the board and will never be above isize::MAX.

Question 2

Your solution looks good! I would suggest the following improvements in order to make it more readable:

Use type aliases to improve the readability:

type State = [[bool; WIDTH]; HEIGHT];

Extract the coordinate wrapping logic into its own type to make it easier to understand:

#[derive(Debug, Copy, Clone)]
struct Coord {
 value: usize,
 max: usize,
}
impl Coord {
 fn new(value: usize, max: usize) -> Self {
 Coord {
 value,
 max,
 }
 }
 fn increment(&mut self) {
 self.value += 1;
 if self.value >= self.max {
 self.value = 0;
 }
 }
 fn decrement(&mut self) {
 self.value = self.value.checked_sub(1).unwrap_or(self.max - 1);
 }
}

Create a Cell and NeighborsIter structures that encapsulate the logic of iterating over all neigbhours of a given cell:

#[derive(Debug, Copy, Clone)]
struct Cell {
 x: Coord,
 y: Coord,
}
impl Cell {
 fn new(x: usize, y: usize) -> Self {
 Cell {
 x: Coord::new(x, HEIGHT),
 y: Coord::new(y, WIDTH),
 }
 }
 fn into_neighbors_iter(self) -> NeighborsIter {
 NeighborsIter::new(self)
 }
}
struct NeighborsIter {
 state: usize,
 cell: Cell,
}
impl NeighborsIter {
 fn new(init: Cell) -> Self {
 Self {
 state: 0,
 cell: init,
 }
 }
}
impl Iterator for NeighborsIter {
 type Item = Cell;
 fn next(&mut self) -> Option<Self::Item> {
 match self.state {
 0 => {
 self.cell.x.decrement();
 self.cell.y.decrement()
 }
 1 => { self.cell.y.increment() }
 2 => { self.cell.y.increment() }
 3 => { self.cell.x.increment() }
 4 => { self.cell.x.increment() }
 5 => { self.cell.y.decrement() }
 6 => { self.cell.y.decrement() }
 7 => { self.cell.x.decrement() }
 _ => { return None; }
 }
 self.state += 1;
 Some(self.cell)
 }
}

Separate the counting of live neigbhors into its own function:

fn count_neighbors(&self, cell: Cell) -> u8 {
 let mut neighbours: u8 = 0;
 for cell in cell.into_neighbors_iter() {
 if self.state[cell.x.value][cell.y.value] {
 neighbours += 1;
 }
 }
 neighbours
}

Create a Board struct and move the next_step function into it as a method:

#[derive(Debug)]
struct Board {
 state: State
}
impl Board {
 pub fn new() -> Self {
 Board {
 state: [[false; WIDTH]; HEIGHT]
 }
 }
 pub fn from(state: State) -> Self {
 Board {
 state
 }
 }
 pub fn next_step(self) -> Board {
 let mut next_state = [[false; WIDTH]; HEIGHT];
 for row in 0..HEIGHT {
 for col in 0..WIDTH {
 match self.count_neighbors(Cell::new(row, col)) {
 3 => next_state[row][col] = true,
 2 if self.state[row][col] => next_state[row][col] = true,
 _ => {}
 }
 }
 }
 Board {
 state: next_state
 }
 }
 fn count_neighbors(&self, cell: Cell) -> u8 {
 let mut neighbours: u8 = 0;
 for cell in cell.into_neighbors_iter() {
 if self.state[cell.x.value][cell.y.value] {
 neighbours += 1;
 }
 }
 neighbours
 }
}

Final Code:

const WIDTH: usize = 10;
const HEIGHT: usize = 10;
type State = [[bool; WIDTH]; HEIGHT];
#[derive(Debug, Copy, Clone)]
struct Coord {
 value: usize,
 max: usize,
}
impl Coord {
 fn new(value: usize, max: usize) -> Self {
 Coord {
 value,
 max,
 }
 }
 fn increment(&mut self) {
 self.value += 1;
 if self.value >= self.max {
 self.value = 0;
 }
 }
 fn decrement(&mut self) {
 self.value = self.value.checked_sub(1).unwrap_or(self.max - 1);
 }
}
#[derive(Debug, Copy, Clone)]
struct Cell {
 x: Coord,
 y: Coord,
}
impl Cell {
 fn new(x: usize, y: usize) -> Self {
 Cell {
 x: Coord::new(x, HEIGHT),
 y: Coord::new(y, WIDTH),
 }
 }
 fn into_neighbors_iter(self) -> NeighborsIter {
 NeighborsIter::new(self)
 }
}
struct NeighborsIter {
 state: usize,
 cell: Cell,
}
impl NeighborsIter {
 fn new(init: Cell) -> Self {
 Self {
 state: 0,
 cell: init,
 }
 }
}
impl Iterator for NeighborsIter {
 type Item = Cell;
 fn next(&mut self) -> Option<Self::Item> {
 match self.state {
 0 => {
 self.cell.x.decrement();
 self.cell.y.decrement()
 }
 1 => { self.cell.y.increment() }
 2 => { self.cell.y.increment() }
 3 => { self.cell.x.increment() }
 4 => { self.cell.x.increment() }
 5 => { self.cell.y.decrement() }
 6 => { self.cell.y.decrement() }
 7 => { self.cell.x.decrement() }
 _ => { return None; }
 }
 self.state += 1;
 Some(self.cell)
 }
}
#[derive(Debug)]
struct Board {
 state: State
}
impl Board {
 pub fn new() -> Self {
 Board {
 state: [[false; WIDTH]; HEIGHT]
 }
 }
 pub fn from(state: State) -> Self {
 Board {
 state
 }
 }
 pub fn next_step(self) -> Board {
 let mut next_state = [[false; WIDTH]; HEIGHT];
 for row in 0..HEIGHT {
 for col in 0..WIDTH {
 match self.count_neighbors(Cell::new(row, col)) {
 3 => next_state[row][col] = true,
 2 if self.state[row][col] => next_state[row][col] = true,
 _ => {}
 }
 }
 }
 Board {
 state: next_state
 }
 }
 fn count_neighbors(&self, cell: Cell) -> u8 {
 let mut neighbours: u8 = 0;
 for cell in cell.into_neighbors_iter() {
 if self.state[cell.x.value][cell.y.value] {
 neighbours += 1;
 }
 }
 neighbours
 }
}

Performance:

After benchmarking the performance of the original implementation from the author and my new implementation, it is clear that the original modulus approach yields significantly slower performance on top of being more unreadable.

Experiment 1 parameters:
Grid - 100x100
Starting sate - Randomly generated (but same between the two methods)
Game steps - 1000

Experiment 1 average execution time:
Original implementation: 381.86 ms
New implementation: 162.13 ms

Experiment 2 parameters:
Grid - 10x10
Starting sate - Randomly generated (but same between the two methods)
Game steps - 10000

Experiment 2 average execution time:
Original implementation: 33.209 ms
New implementation: 14.175 ms

Additional information:

The library used for the benchmarking is criterion. And the raw data can be seen bellow.

game-of-life new impl, (100x100), random state, 1000 steps 
 time: [158.70 ms 162.13 ms 166.53 ms]
Found 7 outliers among 100 measurements (7.00%)
 2 (2.00%) high mild
 5 (5.00%) high severe
---------------------------------------------------------------------
game-of-life original impl, (100x100), random state, 1000 steps 
 time: [381.63 ms 381.86 ms 382.13 ms]
Found 6 outliers among 100 measurements (6.00%)
 2 (2.00%) low mild
 2 (2.00%) high mild
 2 (2.00%) high severe
--------------------------------------------------------------------
game-of-life new impl, (10x10), random state, 10000 steps 
 time: [13.847 ms 14.175 ms 14.580 ms]
Found 10 outliers among 100 measurements (10.00%)
 2 (2.00%) high mild
 8 (8.00%) high severe
---------------------------------------------------------------------
game-of-life original impl, (10x10), random state, 10000 steps 
 time: [33.176 ms 33.209 ms 33.247 ms]
Found 9 outliers among 100 measurements (9.00%)
 2 (2.00%) low mild
 4 (4.00%) high mild
 3 (3.00%) high severe

Question 3

While this is nicely implemented, I'm quite certain that his will perform significantly slower than OPs original version. Especially your decrement and increment functions are very critical, performance wise.

Question 4

I would strongly disagree with you. Please see the benchmarking data in my updated answer.

Question 5

Would you mind sharing the benchmark code so I can reproduce it?

Question 6

github.com/DobromirM/game-of-life

Question 7

I have to agree that you get a speedup. But I don't think that's because your code is fast, but because OP's original code is very slow. On my machine. OP's version is 37 ms, yours is 15.3 ms, and my version with unsafe and ArrayExt is 1.95 ms.

Question 8

I've refactored your code a bunch. Nothing major, just a lot of little opinionated optimizations.

Apply @Andrea's change to your offset calculation
Use iterators instead of a for loop calculate the number of neighbors

pub fn next_step(game_array: [[bool; WIDTH]; HEIGHT]) -> [[bool; WIDTH]; HEIGHT] {
 let mut next_state = [[false; WIDTH]; HEIGHT];
 const NEIGHBOUR_LIST: [[isize; 2]; 8] = [[-1, -1], [-1, 0], [-1, 1],
 [ 0, -1], [ 0, 1],
 [ 1, -1], [ 1, 0], [ 1, 1]];
 for rownum in 0..HEIGHT {
 for cellnum in 0..WIDTH {
 // Calculate neighbors
 let neighbours = NEIGHBOUR_LIST
 .iter()
 .filter(|[j, k]| {
 game_array[(rownum as isize + j).rem_euclid(HEIGHT as isize) as usize]
 [(cellnum as isize + k).rem_euclid(WIDTH as isize) as usize]
 })
 .count();
 // Apply game-of-life game rules
 if neighbours == 3 || (game_array[rownum][cellnum] && neighbours == 2) {
 next_state[rownum][cellnum] = true;
 }
 }
 }
 next_state
}

Further, in a separate code snippet, because it's a more major refactoring:

Use array.map to fill next_state, to avoid filling it with false values first

pub fn next_step(game_array: [[bool; WIDTH]; HEIGHT]) -> [[bool; WIDTH]; HEIGHT] {
 const NEIGHBOUR_LIST: [[isize; 2]; 8] = [[-1, -1], [-1, 0], [-1, 1],
 [ 0, -1], [ 0, 1],
 [ 1, -1], [ 1, 0], [ 1, 1]];
 let mut rownum: usize = 0;
 [[(); WIDTH]; HEIGHT].map(|row| {
 let mut cellnum: usize = 0;
 let row = row.map(|()| {
 // Calculate neighbors
 let neighbours = NEIGHBOUR_LIST
 .iter()
 .filter(|[j, k]| {
 game_array[(rownum as isize + j).rem_euclid(HEIGHT as isize) as usize]
 [(cellnum as isize + k).rem_euclid(WIDTH as isize) as usize]
 })
 .count();
 let cell_active = game_array[rownum][cellnum];
 cellnum += 1;
 // Apply game-of-life game rules
 neighbours == 3 || (cell_active && neighbours == 2)
 });
 rownum += 1;
 row
 })
}

Further remark - Owned vs Referenced

Arrays are usually Copy. Therefore it could be that you loose a lot of performance (unless the compiler optimizes this away) by using owned values as input/output. For performance reasons, I'd allocate two buffers globally and then pass them in via references:

pub fn next_step(current_step: &[[bool; WIDTH]; HEIGHT], next_step: &mut [[bool; WIDTH]; HEIGHT]) {
 const NEIGHBOUR_LIST: [[isize; 2]; 8] = [[-1, -1], [-1, 0], [-1, 1],
 [ 0, -1], [ 0, 1],
 [ 1, -1], [ 1, 0], [ 1, 1]];
 for (rownum, row) in next_step.iter_mut().enumerate() {
 for (cellnum, cell) in row.iter_mut().enumerate() {
 // Calculate neighbors
 let neighbours = NEIGHBOUR_LIST
 .iter()
 .filter(|[j, k]| {
 current_step[(rownum as isize + j).rem_euclid(HEIGHT as isize) as usize]
 [(cellnum as isize + k).rem_euclid(WIDTH as isize) as usize]
 })
 .count();
 // Apply game-of-life game rules
 *cell = neighbours == 3 || (current_step[rownum][cellnum] && neighbours == 2)
 }
 }
}

Further remark #2 - Extracting the offset into a function

You could wrap the offset calculation in an inline function, for improved readability:

pub fn next_step(current_step: &[[bool; WIDTH]; HEIGHT], next_step: &mut [[bool; WIDTH]; HEIGHT]) {
 const NEIGHBOUR_LIST: [[isize; 2]; 8] = [[-1, -1], [-1, 0], [-1, 1],
 [ 0, -1], [ 0, 1],
 [ 1, -1], [ 1, 0], [ 1, 1]];
 #[inline(always)]
 fn wrapping_add<const LIMIT: usize>(value: usize, offset: isize) -> usize {
 (value as isize + offset).rem_euclid(LIMIT as isize) as usize
 }
 for (rownum, row) in next_step.iter_mut().enumerate() {
 for (cellnum, cell) in row.iter_mut().enumerate() {
 // Calculate neighbors
 let neighbours = NEIGHBOUR_LIST
 .iter()
 .filter(|[j, k]| {
 current_step[wrapping_add::<HEIGHT>(rownum, *j)]
 [wrapping_add::<WIDTH>(cellnum, *k)]
 })
 .count();
 // Apply game-of-life game rules
 *cell = neighbours == 3 || (current_step[rownum][cellnum] && neighbours == 2)
 }
 }
}

Further remark #3 - Unsafe and extending the slice type

As you can prove fairly easily that your access to game_array will never be out of bounds, you can use unsafe{ get_unchecked() } for another small performance increase. If you feel ok with unsafe.

pub fn next_step(current_step: &[[bool; WIDTH]; HEIGHT], next_step: &mut [[bool; WIDTH]; HEIGHT]) {
 const NEIGHBOUR_LIST: [[isize; 2]; 8] = [[-1, -1], [-1, 0], [-1, 1],
 [ 0, -1], [ 0, 1],
 [ 1, -1], [ 1, 0], [ 1, 1]];
 #[inline(always)]
 fn wrapping_add<const LIMIT: usize>(value: usize, offset: isize) -> usize {
 (value as isize + offset).rem_euclid(LIMIT as isize) as usize
 }
 for (rownum, row) in next_step.iter_mut().enumerate() {
 for (cellnum, cell) in row.iter_mut().enumerate() {
 // Calculate neighbors
 let neighbours = NEIGHBOUR_LIST
 .iter()
 .filter(|[j, k]| *unsafe {
 current_step
 .get_unchecked(wrapping_add::<HEIGHT>(rownum, *j))
 .get_unchecked(wrapping_add::<WIDTH>(cellnum, *k))
 })
 .count();
 // Apply game-of-life game rules
 *cell = neighbours == 3 || (current_step[rownum][cellnum] && neighbours == 2)
 }
 }
}

You could even go one step further and add this functionality to the slice type itself:

trait ArrayExt<T, const SIZE: usize> {
 fn get_wrapped(&self, index: isize) -> &T;
}
impl<T, const SIZE: usize> ArrayExt<T, SIZE> for [T; SIZE] {
 #[inline(always)]
 fn get_wrapped(&self, index: isize) -> &T {
 unsafe { self.get_unchecked(index.rem_euclid(SIZE as isize) as usize) }
 }
}
pub fn next_step(current_step: &[[bool; WIDTH]; HEIGHT], next_step: &mut [[bool; WIDTH]; HEIGHT]) {
 const NEIGHBOUR_LIST: [[isize; 2]; 8] = [[-1, -1], [-1, 0], [-1, 1],
 [ 0, -1], [ 0, 1],
 [ 1, -1], [ 1, 0], [ 1, 1]];
 for (rownum, row) in next_step.iter_mut().enumerate() {
 for (cellnum, cell) in row.iter_mut().enumerate() {
 // Calculate neighbors
 let neighbours = NEIGHBOUR_LIST
 .iter()
 .filter(|[j, k]| {
 *current_step
 .get_wrapped(rownum as isize + *j)
 .get_wrapped(cellnum as isize + *k)
 })
 .count();
 // Apply game-of-life game rules
 *cell = neighbours == 3 || (current_step[rownum][cellnum] && neighbours == 2)
 }
 }
}

Further remark #4 - Dynamically sized slices

You most likely don't want to operate exclusively on compile-time known array sizes, so here is a version that can operate on anything that fullfills &[&[bool]]:

trait SliceExt<T> {
 fn get_wrapped(&self, index: isize) -> &T;
}
impl<T, S> SliceExt<T> for S
where
 S: AsRef<[T]>,
{
 #[inline(always)]
 fn get_wrapped(&self, index: isize) -> &T {
 let slice = self.as_ref();
 unsafe { slice.get_unchecked(index.rem_euclid(slice.len() as isize) as usize) }
 }
}
pub fn next_step<In, NestedIn, Out, NestedOut>(current_step: In, mut next_step: Out)
where
 In: AsRef<[NestedIn]>,
 Out: AsMut<[NestedOut]>,
 NestedIn: AsRef<[bool]>,
 NestedOut: AsMut<[bool]>,
{
 const NEIGHBOUR_LIST: [[isize; 2]; 8] = [[-1, -1], [-1, 0], [-1, 1],
 [ 0, -1], [ 0, 1],
 [ 1, -1], [ 1, 0], [ 1, 1]];
 for (rownum, row) in next_step.as_mut().iter_mut().enumerate() {
 for (cellnum, cell) in row.as_mut().iter_mut().enumerate() {
 // Calculate neighbors
 let neighbours = NEIGHBOUR_LIST
 .iter()
 .filter(|[j, k]| {
 *current_step
 .get_wrapped(rownum as isize + *j)
 .get_wrapped(cellnum as isize + *k)
 })
 .count();
 // Apply game-of-life game rules
 *cell = neighbours == 3
 || (current_step.as_ref()[rownum].as_ref()[cellnum] && neighbours == 2)
 }
 }
}

This should work for both [[bool; WIDTH]; HEIGHT] and Vec<Vec<bool>> and anything in between.

There might be a very slight performance hit due to the dynamic .len() instead of a compile time constant. It should be very small, though. But that of course would need to be benchmarked, it's hard to say just by looking at it.

Remark #5 - Multithreading

The advantage of using iterators is that it is now trivial to introduce multithreading using the rayon library:

use rayon::prelude::*;
trait SliceExt<T> {
 fn get_wrapped(&self, index: isize) -> &T;
}
impl<T, S> SliceExt<T> for S
where
 S: AsRef<[T]>,
{
 #[inline(always)]
 fn get_wrapped(&self, index: isize) -> &T {
 let slice = self.as_ref();
 unsafe { slice.get_unchecked(index.rem_euclid(slice.len() as isize) as usize) }
 }
}
pub fn next_step<In, NestedIn, Out, NestedOut>(current_step: In, mut next_step: Out)
where
 In: AsRef<[NestedIn]> + Sync,
 Out: AsMut<[NestedOut]> + Send,
 NestedIn: AsRef<[bool]>,
 NestedOut: AsMut<[bool]> + Send,
{
 const NEIGHBOUR_LIST: [[isize; 2]; 8] = [[-1, -1], [-1, 0], [-1, 1],
 [ 0, -1], [ 0, 1],
 [ 1, -1], [ 1, 0], [ 1, 1]];
 next_step
 .as_mut()
 .par_iter_mut()
 .enumerate()
 .for_each(|(rownum, row)| {
 for (cellnum, cell) in row.as_mut().iter_mut().enumerate() {
 // Calculate neighbors
 let neighbours = NEIGHBOUR_LIST
 .iter()
 .filter(|[j, k]| {
 *current_step
 .get_wrapped(rownum as isize + *j)
 .get_wrapped(cellnum as isize + *k)
 })
 .count();
 // Apply game-of-life game rules
 *cell = neighbours == 3
 || (current_step.as_ref()[rownum].as_ref()[cellnum] && neighbours == 2)
 }
 })
}

Performance

After running my versions in @DobromirM's benchmarking setup, I realized that the biggest problem with your existing code is indeed the ownership transfer in and out of next_step, which causes two copies of the entire game_array.

My fastest version is the one with ArrayExt. This is the time it got:

game-of-life new impl, (10x10), random state, 10000 steps 
 time: [14.955 ms 15.036 ms 15.144 ms]
Found 9 outliers among 100 measurements (9.00%)
 6 (6.00%) high mild
 3 (3.00%) high severe
game-of-life original impl, (10x10), random state, 10000 steps 
 time: [38.076 ms 38.725 ms 39.437 ms]
Found 9 outliers among 100 measurements (9.00%)
 6 (6.00%) high mild
 3 (3.00%) high severe
game-of-life finomnis impl, (10x10), random state, 10000 steps 
 time: [1.9600 ms 1.9819 ms 2.0087 ms]
Found 10 outliers among 100 measurements (10.00%)
 3 (3.00%) high mild
 7 (7.00%) high severe
game-of-life new impl, (100x100), random state, 1000 steps 
 time: [165.55 ms 168.42 ms 171.62 ms]
Found 12 outliers among 100 measurements (12.00%)
 6 (6.00%) high mild
 6 (6.00%) high severe
game-of-life original impl, (100x100), random state, 1000 steps 
 time: [436.55 ms 441.34 ms 446.67 ms]
Found 3 outliers among 100 measurements (3.00%)
 2 (2.00%) high mild
 1 (1.00%) high severe
game-of-life finomnis impl, (100x100), random state, 1000 steps 
 time: [43.646 ms 43.870 ms 44.165 ms]
Found 6 outliers among 100 measurements (6.00%)
 2 (2.00%) high mild
 4 (4.00%) high severe

Experiment 1 average execution time:
Original implementation: 436.55 ms
DobromirM's implementation: 168.42 ms
My 'ArrayExt' implementation: 43.64 ms

Experiment 2 average execution time:
Original implementation: 38.076 ms
DobromirM's implementation: 14.955 ms
My 'ArrayExt' implementation: 1.960 ms

Fastest version and conclusion

I assume that your field will be quite large, probably more than 1000x1000. With that in mind, combining all the things I said, this is the fastest version I could come up with:

use rayon::prelude::*;
trait ArrayExt<T, const SIZE: usize> {
 fn get_wrapped(&self, index: isize) -> &T;
}
impl<T, const SIZE: usize> ArrayExt<T, SIZE> for [T; SIZE] {
 #[inline(always)]
 fn get_wrapped(&self, index: isize) -> &T {
 unsafe { self.get_unchecked(index.rem_euclid(SIZE as isize) as usize) }
 }
}
pub fn next_step<const WIDTH: usize, const HEIGHT: usize>(
 current_step: &[[bool; WIDTH]; HEIGHT],
 next_step: &mut [[bool; WIDTH]; HEIGHT],
) {
 const NEIGHBOUR_LIST: [[isize; 2]; 8] = [[-1, -1], [-1, 0], [-1, 1],
 [ 0, -1], [ 0, 1],
 [ 1, -1], [ 1, 0], [ 1, 1]];
 next_step
 .par_iter_mut()
 .enumerate()
 .for_each(|(rownum, row)| {
 for (cellnum, cell) in row.iter_mut().enumerate() {
 // Calculate neighbors
 let neighbours: u8 = NEIGHBOUR_LIST
 .iter()
 .map(|[j, k]| {
 *current_step
 .get_wrapped(rownum as isize + *j)
 .get_wrapped(cellnum as isize + *k) as u8
 })
 .sum();
 // Apply game-of-life game rules
 *cell = neighbours == 3 || (neighbours == 2 && current_step[rownum][cellnum])
 }
 });
}

1000x1000, 10 steps:

Orginal: 420.63 ms
DobromirM: 208.59 ms
Mine (ArrayExt, without rayon): 54.772 ms
Mine (ArrayExt, with rayon): 22.385 ms

This was when I hit a limit. The stack started to overflow.

Then I used this post to create a Box<[[bool; WIDTH]; HEIGHT]> on the heap:

10000x10000, 1 step:

Orginal: 4123.4 ms
DobromirM: 1811.1 ms
Mine (ArrayExt, without rayon): 691.3 ms
Mine (ArrayExt, with rayon): 179.1 ms

Of course, this performance is nothing compared to what GPUs are capable of.

To close this chapter, the main takeaway for me is: Moves are not free.

Was fun, I hope you took something away from this answer.

And thanks to @DobromirM for providing the benchmarking code.

Question 9

This looks more or less like how I would have written it, so I think it is nice. :)

(((rownum as isize + j % HEIGHT_I) + HEIGHT_I) % HEIGHT_I) as usize

I recognise this pattern from code I've written before in other languages: you want a modulo where the result is positive even if the dividend is negative, right? You can use Rust's .rem_euclid for that:

(rownum as isize + j).rem_euclid(HEIGHT_I) as usize

(Warning: a.rem_euclid(b) is only the same as (((a % b) + b) % b) if b is positive. I think it's rare that you'd want a negative divisor though.)

let mut neighbours: u8; // Will never be above 8

Is there any particular reason this is defined outside the loop? I don't think you ever want to carry over the neighbour count from one cell to the next, right? You could sink this declaration (let) to where you set the initial value (= 0) and combine them.

DobromirM DobromirM 1,1235 silver badges15 bronze badges · Accepted Answer · 2022-08-06 15:31:58Z

Your solution looks good! I would suggest the following improvements in order to make it more readable:

Use type aliases to improve the readability:

type State = [[bool; WIDTH]; HEIGHT];

Extract the coordinate wrapping logic into its own type to make it easier to understand:

#[derive(Debug, Copy, Clone)]
struct Coord {
 value: usize,
 max: usize,
}
impl Coord {
 fn new(value: usize, max: usize) -> Self {
 Coord {
 value,
 max,
 }
 }
 fn increment(&mut self) {
 self.value += 1;
 if self.value >= self.max {
 self.value = 0;
 }
 }
 fn decrement(&mut self) {
 self.value = self.value.checked_sub(1).unwrap_or(self.max - 1);
 }
}

Create a Cell and NeighborsIter structures that encapsulate the logic of iterating over all neigbhours of a given cell:

#[derive(Debug, Copy, Clone)]
struct Cell {
 x: Coord,
 y: Coord,
}
impl Cell {
 fn new(x: usize, y: usize) -> Self {
 Cell {
 x: Coord::new(x, HEIGHT),
 y: Coord::new(y, WIDTH),
 }
 }
 fn into_neighbors_iter(self) -> NeighborsIter {
 NeighborsIter::new(self)
 }
}
struct NeighborsIter {
 state: usize,
 cell: Cell,
}
impl NeighborsIter {
 fn new(init: Cell) -> Self {
 Self {
 state: 0,
 cell: init,
 }
 }
}
impl Iterator for NeighborsIter {
 type Item = Cell;
 fn next(&mut self) -> Option<Self::Item> {
 match self.state {
 0 => {
 self.cell.x.decrement();
 self.cell.y.decrement()
 }
 1 => { self.cell.y.increment() }
 2 => { self.cell.y.increment() }
 3 => { self.cell.x.increment() }
 4 => { self.cell.x.increment() }
 5 => { self.cell.y.decrement() }
 6 => { self.cell.y.decrement() }
 7 => { self.cell.x.decrement() }
 _ => { return None; }
 }
 self.state += 1;
 Some(self.cell)
 }
}

Separate the counting of live neigbhors into its own function:

fn count_neighbors(&self, cell: Cell) -> u8 {
 let mut neighbours: u8 = 0;
 for cell in cell.into_neighbors_iter() {
 if self.state[cell.x.value][cell.y.value] {
 neighbours += 1;
 }
 }
 neighbours
}

Create a Board struct and move the next_step function into it as a method:

#[derive(Debug)]
struct Board {
 state: State
}
impl Board {
 pub fn new() -> Self {
 Board {
 state: [[false; WIDTH]; HEIGHT]
 }
 }
 pub fn from(state: State) -> Self {
 Board {
 state
 }
 }
 pub fn next_step(self) -> Board {
 let mut next_state = [[false; WIDTH]; HEIGHT];
 for row in 0..HEIGHT {
 for col in 0..WIDTH {
 match self.count_neighbors(Cell::new(row, col)) {
 3 => next_state[row][col] = true,
 2 if self.state[row][col] => next_state[row][col] = true,
 _ => {}
 }
 }
 }
 Board {
 state: next_state
 }
 }
 fn count_neighbors(&self, cell: Cell) -> u8 {
 let mut neighbours: u8 = 0;
 for cell in cell.into_neighbors_iter() {
 if self.state[cell.x.value][cell.y.value] {
 neighbours += 1;
 }
 }
 neighbours
 }
}

Final Code:

const WIDTH: usize = 10;
const HEIGHT: usize = 10;
type State = [[bool; WIDTH]; HEIGHT];
#[derive(Debug, Copy, Clone)]
struct Coord {
 value: usize,
 max: usize,
}
impl Coord {
 fn new(value: usize, max: usize) -> Self {
 Coord {
 value,
 max,
 }
 }
 fn increment(&mut self) {
 self.value += 1;
 if self.value >= self.max {
 self.value = 0;
 }
 }
 fn decrement(&mut self) {
 self.value = self.value.checked_sub(1).unwrap_or(self.max - 1);
 }
}
#[derive(Debug, Copy, Clone)]
struct Cell {
 x: Coord,
 y: Coord,
}
impl Cell {
 fn new(x: usize, y: usize) -> Self {
 Cell {
 x: Coord::new(x, HEIGHT),
 y: Coord::new(y, WIDTH),
 }
 }
 fn into_neighbors_iter(self) -> NeighborsIter {
 NeighborsIter::new(self)
 }
}
struct NeighborsIter {
 state: usize,
 cell: Cell,
}
impl NeighborsIter {
 fn new(init: Cell) -> Self {
 Self {
 state: 0,
 cell: init,
 }
 }
}
impl Iterator for NeighborsIter {
 type Item = Cell;
 fn next(&mut self) -> Option<Self::Item> {
 match self.state {
 0 => {
 self.cell.x.decrement();
 self.cell.y.decrement()
 }
 1 => { self.cell.y.increment() }
 2 => { self.cell.y.increment() }
 3 => { self.cell.x.increment() }
 4 => { self.cell.x.increment() }
 5 => { self.cell.y.decrement() }
 6 => { self.cell.y.decrement() }
 7 => { self.cell.x.decrement() }
 _ => { return None; }
 }
 self.state += 1;
 Some(self.cell)
 }
}
#[derive(Debug)]
struct Board {
 state: State
}
impl Board {
 pub fn new() -> Self {
 Board {
 state: [[false; WIDTH]; HEIGHT]
 }
 }
 pub fn from(state: State) -> Self {
 Board {
 state
 }
 }
 pub fn next_step(self) -> Board {
 let mut next_state = [[false; WIDTH]; HEIGHT];
 for row in 0..HEIGHT {
 for col in 0..WIDTH {
 match self.count_neighbors(Cell::new(row, col)) {
 3 => next_state[row][col] = true,
 2 if self.state[row][col] => next_state[row][col] = true,
 _ => {}
 }
 }
 }
 Board {
 state: next_state
 }
 }
 fn count_neighbors(&self, cell: Cell) -> u8 {
 let mut neighbours: u8 = 0;
 for cell in cell.into_neighbors_iter() {
 if self.state[cell.x.value][cell.y.value] {
 neighbours += 1;
 }
 }
 neighbours
 }
}

Performance:

After benchmarking the performance of the original implementation from the author and my new implementation, it is clear that the original modulus approach yields significantly slower performance on top of being more unreadable.

Experiment 1 parameters:
Grid - 100x100
Starting sate - Randomly generated (but same between the two methods)
Game steps - 1000

Experiment 1 average execution time:
Original implementation: 381.86 ms
New implementation: 162.13 ms

Experiment 2 parameters:
Grid - 10x10
Starting sate - Randomly generated (but same between the two methods)
Game steps - 10000

Experiment 2 average execution time:
Original implementation: 33.209 ms
New implementation: 14.175 ms

Additional information:

The library used for the benchmarking is criterion. And the raw data can be seen bellow.

game-of-life new impl, (100x100), random state, 1000 steps 
 time: [158.70 ms 162.13 ms 166.53 ms]
Found 7 outliers among 100 measurements (7.00%)
 2 (2.00%) high mild
 5 (5.00%) high severe
---------------------------------------------------------------------
game-of-life original impl, (100x100), random state, 1000 steps 
 time: [381.63 ms 381.86 ms 382.13 ms]
Found 6 outliers among 100 measurements (6.00%)
 2 (2.00%) low mild
 2 (2.00%) high mild
 2 (2.00%) high severe
--------------------------------------------------------------------
game-of-life new impl, (10x10), random state, 10000 steps 
 time: [13.847 ms 14.175 ms 14.580 ms]
Found 10 outliers among 100 measurements (10.00%)
 2 (2.00%) high mild
 8 (8.00%) high severe
---------------------------------------------------------------------
game-of-life original impl, (10x10), random state, 10000 steps 
 time: [33.176 ms 33.209 ms 33.247 ms]
Found 9 outliers among 100 measurements (9.00%)
 2 (2.00%) low mild
 4 (4.00%) high mild
 3 (3.00%) high severe

While this is nicely implemented, I'm quite certain that his will perform significantly slower than OPs original version. Especially your decrement and increment functions are very critical, performance wise.
I would strongly disagree with you. Please see the benchmarking data in my updated answer.
Would you mind sharing the benchmark code so I can reproduce it?
I have to agree that you get a speedup. But I don't think that's because your code is fast, but because OP's original code is very slow. On my machine. OP's version is 37 ms, yours is 15.3 ms, and my version with unsafe and ArrayExt is 1.95 ms.

Stack Exchange Network

Optimise Game of Life in Rust

3 Answers 3

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Optimise Game of Life in Rust

3 Answers 3

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions