I recently picked up Rust and was making a CLI for Conway's Game of Life. I got it working but, looking back at it, there are places it could be improved. The main one is the function that generates the next board. Specifically, the part of that function which calculates the amount of alive neighbours a cell has.
fn next_step(game_array: [[bool; WIDTH]; HEIGHT]) -> [[bool; WIDTH]; HEIGHT] {
let mut next_state = [[false; WIDTH]; HEIGHT];
let mut neighbours: u8; // Will never be above 8
const HEIGHT_I: isize = HEIGHT as isize;
const WIDTH_I: isize = WIDTH as isize;
const NEIGHBOUR_LIST: [[isize; 2]; 8] = [[-1, -1], [-1, 0], [-1, 1],
[ 0, -1], [ 0, 1],
[ 1, -1], [ 1, 0], [ 1, 1]];
for rownum in 0..HEIGHT {
for cellnum in 0..WIDTH {
// FROM HERE
neighbours = 0;
for [j, k] in NEIGHBOUR_LIST {
// This will break if width and height are set to larger than isize::MAX
if game_array[(((rownum as isize + j % HEIGHT_I) + HEIGHT_I) % HEIGHT_I) as usize]
[(((cellnum as isize + k % WIDTH_I) + WIDTH_I) % WIDTH_I) as usize] {
neighbours += 1;
}
}
// TO HERE
// This is the cleanest way I could find to implement Life rules
if neighbours == 3 || (game_array[rownum][cellnum] && neighbours == 2) {
next_state[rownum][cellnum] = true;
}
}
}
next_state
}
I wanted the edges of the board to loop and this is the best way I could find to check the neighbours of a cell. However, it is verbose and hard to read. Is there a better way that I am missing?
HEIGHT
and WIDTH
are the height and width of the board and will never be above isize::MAX
.
3 Answers 3
Your solution looks good! I would suggest the following improvements in order to make it more readable:
Use type aliases to improve the readability:
type State = [[bool; WIDTH]; HEIGHT];
Extract the coordinate wrapping logic into its own type to make it easier to understand:
#[derive(Debug, Copy, Clone)]
struct Coord {
value: usize,
max: usize,
}
impl Coord {
fn new(value: usize, max: usize) -> Self {
Coord {
value,
max,
}
}
fn increment(&mut self) {
self.value += 1;
if self.value >= self.max {
self.value = 0;
}
}
fn decrement(&mut self) {
self.value = self.value.checked_sub(1).unwrap_or(self.max - 1);
}
}
Create a Cell
and NeighborsIter
structures that encapsulate the logic of iterating over all neigbhours of a given cell:
#[derive(Debug, Copy, Clone)]
struct Cell {
x: Coord,
y: Coord,
}
impl Cell {
fn new(x: usize, y: usize) -> Self {
Cell {
x: Coord::new(x, HEIGHT),
y: Coord::new(y, WIDTH),
}
}
fn into_neighbors_iter(self) -> NeighborsIter {
NeighborsIter::new(self)
}
}
struct NeighborsIter {
state: usize,
cell: Cell,
}
impl NeighborsIter {
fn new(init: Cell) -> Self {
Self {
state: 0,
cell: init,
}
}
}
impl Iterator for NeighborsIter {
type Item = Cell;
fn next(&mut self) -> Option<Self::Item> {
match self.state {
0 => {
self.cell.x.decrement();
self.cell.y.decrement()
}
1 => { self.cell.y.increment() }
2 => { self.cell.y.increment() }
3 => { self.cell.x.increment() }
4 => { self.cell.x.increment() }
5 => { self.cell.y.decrement() }
6 => { self.cell.y.decrement() }
7 => { self.cell.x.decrement() }
_ => { return None; }
}
self.state += 1;
Some(self.cell)
}
}
Separate the counting of live neigbhors into its own function:
fn count_neighbors(&self, cell: Cell) -> u8 {
let mut neighbours: u8 = 0;
for cell in cell.into_neighbors_iter() {
if self.state[cell.x.value][cell.y.value] {
neighbours += 1;
}
}
neighbours
}
Create a Board
struct and move the next_step
function into it as a method:
#[derive(Debug)]
struct Board {
state: State
}
impl Board {
pub fn new() -> Self {
Board {
state: [[false; WIDTH]; HEIGHT]
}
}
pub fn from(state: State) -> Self {
Board {
state
}
}
pub fn next_step(self) -> Board {
let mut next_state = [[false; WIDTH]; HEIGHT];
for row in 0..HEIGHT {
for col in 0..WIDTH {
match self.count_neighbors(Cell::new(row, col)) {
3 => next_state[row][col] = true,
2 if self.state[row][col] => next_state[row][col] = true,
_ => {}
}
}
}
Board {
state: next_state
}
}
fn count_neighbors(&self, cell: Cell) -> u8 {
let mut neighbours: u8 = 0;
for cell in cell.into_neighbors_iter() {
if self.state[cell.x.value][cell.y.value] {
neighbours += 1;
}
}
neighbours
}
}
Final Code:
const WIDTH: usize = 10;
const HEIGHT: usize = 10;
type State = [[bool; WIDTH]; HEIGHT];
#[derive(Debug, Copy, Clone)]
struct Coord {
value: usize,
max: usize,
}
impl Coord {
fn new(value: usize, max: usize) -> Self {
Coord {
value,
max,
}
}
fn increment(&mut self) {
self.value += 1;
if self.value >= self.max {
self.value = 0;
}
}
fn decrement(&mut self) {
self.value = self.value.checked_sub(1).unwrap_or(self.max - 1);
}
}
#[derive(Debug, Copy, Clone)]
struct Cell {
x: Coord,
y: Coord,
}
impl Cell {
fn new(x: usize, y: usize) -> Self {
Cell {
x: Coord::new(x, HEIGHT),
y: Coord::new(y, WIDTH),
}
}
fn into_neighbors_iter(self) -> NeighborsIter {
NeighborsIter::new(self)
}
}
struct NeighborsIter {
state: usize,
cell: Cell,
}
impl NeighborsIter {
fn new(init: Cell) -> Self {
Self {
state: 0,
cell: init,
}
}
}
impl Iterator for NeighborsIter {
type Item = Cell;
fn next(&mut self) -> Option<Self::Item> {
match self.state {
0 => {
self.cell.x.decrement();
self.cell.y.decrement()
}
1 => { self.cell.y.increment() }
2 => { self.cell.y.increment() }
3 => { self.cell.x.increment() }
4 => { self.cell.x.increment() }
5 => { self.cell.y.decrement() }
6 => { self.cell.y.decrement() }
7 => { self.cell.x.decrement() }
_ => { return None; }
}
self.state += 1;
Some(self.cell)
}
}
#[derive(Debug)]
struct Board {
state: State
}
impl Board {
pub fn new() -> Self {
Board {
state: [[false; WIDTH]; HEIGHT]
}
}
pub fn from(state: State) -> Self {
Board {
state
}
}
pub fn next_step(self) -> Board {
let mut next_state = [[false; WIDTH]; HEIGHT];
for row in 0..HEIGHT {
for col in 0..WIDTH {
match self.count_neighbors(Cell::new(row, col)) {
3 => next_state[row][col] = true,
2 if self.state[row][col] => next_state[row][col] = true,
_ => {}
}
}
}
Board {
state: next_state
}
}
fn count_neighbors(&self, cell: Cell) -> u8 {
let mut neighbours: u8 = 0;
for cell in cell.into_neighbors_iter() {
if self.state[cell.x.value][cell.y.value] {
neighbours += 1;
}
}
neighbours
}
}
Performance:
After benchmarking the performance of the original implementation from the author and my new implementation, it is clear that the original modulus approach yields significantly slower performance on top of being more unreadable.
Experiment 1 parameters:
Grid - 100x100
Starting sate - Randomly generated (but same between the two methods)
Game steps - 1000
Experiment 1 average execution time:
Original implementation: 381.86 ms
New implementation: 162.13 ms
Experiment 2 parameters:
Grid - 10x10
Starting sate - Randomly generated (but same between the two methods)
Game steps - 10000
Experiment 2 average execution time:
Original implementation: 33.209 ms
New implementation: 14.175 ms
Additional information:
The library used for the benchmarking is criterion. And the raw data can be seen bellow.
game-of-life new impl, (100x100), random state, 1000 steps
time: [158.70 ms 162.13 ms 166.53 ms]
Found 7 outliers among 100 measurements (7.00%)
2 (2.00%) high mild
5 (5.00%) high severe
---------------------------------------------------------------------
game-of-life original impl, (100x100), random state, 1000 steps
time: [381.63 ms 381.86 ms 382.13 ms]
Found 6 outliers among 100 measurements (6.00%)
2 (2.00%) low mild
2 (2.00%) high mild
2 (2.00%) high severe
--------------------------------------------------------------------
game-of-life new impl, (10x10), random state, 10000 steps
time: [13.847 ms 14.175 ms 14.580 ms]
Found 10 outliers among 100 measurements (10.00%)
2 (2.00%) high mild
8 (8.00%) high severe
---------------------------------------------------------------------
game-of-life original impl, (10x10), random state, 10000 steps
time: [33.176 ms 33.209 ms 33.247 ms]
Found 9 outliers among 100 measurements (9.00%)
2 (2.00%) low mild
4 (4.00%) high mild
3 (3.00%) high severe
-
\$\begingroup\$ While this is nicely implemented, I'm quite certain that his will perform significantly slower than OPs original version. Especially your
decrement
andincrement
functions are very critical, performance wise. \$\endgroup\$Finomnis– Finomnis2022年08月12日 09:01:55 +00:00Commented Aug 12, 2022 at 9:01 -
\$\begingroup\$ I would strongly disagree with you. Please see the benchmarking data in my updated answer. \$\endgroup\$DobromirM– DobromirM2022年08月12日 10:17:28 +00:00Commented Aug 12, 2022 at 10:17
-
\$\begingroup\$ Would you mind sharing the benchmark code so I can reproduce it? \$\endgroup\$Finomnis– Finomnis2022年08月12日 10:24:02 +00:00Commented Aug 12, 2022 at 10:24
-
\$\begingroup\$ github.com/DobromirM/game-of-life \$\endgroup\$DobromirM– DobromirM2022年08月12日 10:50:40 +00:00Commented Aug 12, 2022 at 10:50
-
\$\begingroup\$ I have to agree that you get a speedup. But I don't think that's because your code is fast, but because OP's original code is very slow. On my machine. OP's version is
37 ms
, yours is15.3 ms
, and my version with unsafe andArrayExt
is1.95 ms
. \$\endgroup\$Finomnis– Finomnis2022年08月12日 11:56:20 +00:00Commented Aug 12, 2022 at 11:56
I've refactored your code a bunch. Nothing major, just a lot of little opinionated optimizations.
- Apply @Andrea's change to your offset calculation
- Use iterators instead of a
for
loop calculate the number of neighbors
pub fn next_step(game_array: [[bool; WIDTH]; HEIGHT]) -> [[bool; WIDTH]; HEIGHT] {
let mut next_state = [[false; WIDTH]; HEIGHT];
const NEIGHBOUR_LIST: [[isize; 2]; 8] = [[-1, -1], [-1, 0], [-1, 1],
[ 0, -1], [ 0, 1],
[ 1, -1], [ 1, 0], [ 1, 1]];
for rownum in 0..HEIGHT {
for cellnum in 0..WIDTH {
// Calculate neighbors
let neighbours = NEIGHBOUR_LIST
.iter()
.filter(|[j, k]| {
game_array[(rownum as isize + j).rem_euclid(HEIGHT as isize) as usize]
[(cellnum as isize + k).rem_euclid(WIDTH as isize) as usize]
})
.count();
// Apply game-of-life game rules
if neighbours == 3 || (game_array[rownum][cellnum] && neighbours == 2) {
next_state[rownum][cellnum] = true;
}
}
}
next_state
}
Further, in a separate code snippet, because it's a more major refactoring:
- Use
array.map
to fillnext_state
, to avoid filling it withfalse
values first
pub fn next_step(game_array: [[bool; WIDTH]; HEIGHT]) -> [[bool; WIDTH]; HEIGHT] {
const NEIGHBOUR_LIST: [[isize; 2]; 8] = [[-1, -1], [-1, 0], [-1, 1],
[ 0, -1], [ 0, 1],
[ 1, -1], [ 1, 0], [ 1, 1]];
let mut rownum: usize = 0;
[[(); WIDTH]; HEIGHT].map(|row| {
let mut cellnum: usize = 0;
let row = row.map(|()| {
// Calculate neighbors
let neighbours = NEIGHBOUR_LIST
.iter()
.filter(|[j, k]| {
game_array[(rownum as isize + j).rem_euclid(HEIGHT as isize) as usize]
[(cellnum as isize + k).rem_euclid(WIDTH as isize) as usize]
})
.count();
let cell_active = game_array[rownum][cellnum];
cellnum += 1;
// Apply game-of-life game rules
neighbours == 3 || (cell_active && neighbours == 2)
});
rownum += 1;
row
})
}
Further remark - Owned vs Referenced
Arrays are usually Copy
. Therefore it could be that you loose a lot of performance (unless the compiler optimizes this away) by using owned values as input/output. For performance reasons, I'd allocate two buffers globally and then pass them in via references:
pub fn next_step(current_step: &[[bool; WIDTH]; HEIGHT], next_step: &mut [[bool; WIDTH]; HEIGHT]) {
const NEIGHBOUR_LIST: [[isize; 2]; 8] = [[-1, -1], [-1, 0], [-1, 1],
[ 0, -1], [ 0, 1],
[ 1, -1], [ 1, 0], [ 1, 1]];
for (rownum, row) in next_step.iter_mut().enumerate() {
for (cellnum, cell) in row.iter_mut().enumerate() {
// Calculate neighbors
let neighbours = NEIGHBOUR_LIST
.iter()
.filter(|[j, k]| {
current_step[(rownum as isize + j).rem_euclid(HEIGHT as isize) as usize]
[(cellnum as isize + k).rem_euclid(WIDTH as isize) as usize]
})
.count();
// Apply game-of-life game rules
*cell = neighbours == 3 || (current_step[rownum][cellnum] && neighbours == 2)
}
}
}
Further remark #2 - Extracting the offset into a function
You could wrap the offset calculation in an inline function, for improved readability:
pub fn next_step(current_step: &[[bool; WIDTH]; HEIGHT], next_step: &mut [[bool; WIDTH]; HEIGHT]) {
const NEIGHBOUR_LIST: [[isize; 2]; 8] = [[-1, -1], [-1, 0], [-1, 1],
[ 0, -1], [ 0, 1],
[ 1, -1], [ 1, 0], [ 1, 1]];
#[inline(always)]
fn wrapping_add<const LIMIT: usize>(value: usize, offset: isize) -> usize {
(value as isize + offset).rem_euclid(LIMIT as isize) as usize
}
for (rownum, row) in next_step.iter_mut().enumerate() {
for (cellnum, cell) in row.iter_mut().enumerate() {
// Calculate neighbors
let neighbours = NEIGHBOUR_LIST
.iter()
.filter(|[j, k]| {
current_step[wrapping_add::<HEIGHT>(rownum, *j)]
[wrapping_add::<WIDTH>(cellnum, *k)]
})
.count();
// Apply game-of-life game rules
*cell = neighbours == 3 || (current_step[rownum][cellnum] && neighbours == 2)
}
}
}
Further remark #3 - Unsafe and extending the slice type
As you can prove fairly easily that your access to game_array
will never be out of bounds, you can use unsafe{ get_unchecked() }
for another small performance increase. If you feel ok with unsafe
.
pub fn next_step(current_step: &[[bool; WIDTH]; HEIGHT], next_step: &mut [[bool; WIDTH]; HEIGHT]) {
const NEIGHBOUR_LIST: [[isize; 2]; 8] = [[-1, -1], [-1, 0], [-1, 1],
[ 0, -1], [ 0, 1],
[ 1, -1], [ 1, 0], [ 1, 1]];
#[inline(always)]
fn wrapping_add<const LIMIT: usize>(value: usize, offset: isize) -> usize {
(value as isize + offset).rem_euclid(LIMIT as isize) as usize
}
for (rownum, row) in next_step.iter_mut().enumerate() {
for (cellnum, cell) in row.iter_mut().enumerate() {
// Calculate neighbors
let neighbours = NEIGHBOUR_LIST
.iter()
.filter(|[j, k]| *unsafe {
current_step
.get_unchecked(wrapping_add::<HEIGHT>(rownum, *j))
.get_unchecked(wrapping_add::<WIDTH>(cellnum, *k))
})
.count();
// Apply game-of-life game rules
*cell = neighbours == 3 || (current_step[rownum][cellnum] && neighbours == 2)
}
}
}
You could even go one step further and add this functionality to the slice type itself:
trait ArrayExt<T, const SIZE: usize> {
fn get_wrapped(&self, index: isize) -> &T;
}
impl<T, const SIZE: usize> ArrayExt<T, SIZE> for [T; SIZE] {
#[inline(always)]
fn get_wrapped(&self, index: isize) -> &T {
unsafe { self.get_unchecked(index.rem_euclid(SIZE as isize) as usize) }
}
}
pub fn next_step(current_step: &[[bool; WIDTH]; HEIGHT], next_step: &mut [[bool; WIDTH]; HEIGHT]) {
const NEIGHBOUR_LIST: [[isize; 2]; 8] = [[-1, -1], [-1, 0], [-1, 1],
[ 0, -1], [ 0, 1],
[ 1, -1], [ 1, 0], [ 1, 1]];
for (rownum, row) in next_step.iter_mut().enumerate() {
for (cellnum, cell) in row.iter_mut().enumerate() {
// Calculate neighbors
let neighbours = NEIGHBOUR_LIST
.iter()
.filter(|[j, k]| {
*current_step
.get_wrapped(rownum as isize + *j)
.get_wrapped(cellnum as isize + *k)
})
.count();
// Apply game-of-life game rules
*cell = neighbours == 3 || (current_step[rownum][cellnum] && neighbours == 2)
}
}
}
Further remark #4 - Dynamically sized slices
You most likely don't want to operate exclusively on compile-time known array sizes, so here is a version that can operate on anything that fullfills &[&[bool]]
:
trait SliceExt<T> {
fn get_wrapped(&self, index: isize) -> &T;
}
impl<T, S> SliceExt<T> for S
where
S: AsRef<[T]>,
{
#[inline(always)]
fn get_wrapped(&self, index: isize) -> &T {
let slice = self.as_ref();
unsafe { slice.get_unchecked(index.rem_euclid(slice.len() as isize) as usize) }
}
}
pub fn next_step<In, NestedIn, Out, NestedOut>(current_step: In, mut next_step: Out)
where
In: AsRef<[NestedIn]>,
Out: AsMut<[NestedOut]>,
NestedIn: AsRef<[bool]>,
NestedOut: AsMut<[bool]>,
{
const NEIGHBOUR_LIST: [[isize; 2]; 8] = [[-1, -1], [-1, 0], [-1, 1],
[ 0, -1], [ 0, 1],
[ 1, -1], [ 1, 0], [ 1, 1]];
for (rownum, row) in next_step.as_mut().iter_mut().enumerate() {
for (cellnum, cell) in row.as_mut().iter_mut().enumerate() {
// Calculate neighbors
let neighbours = NEIGHBOUR_LIST
.iter()
.filter(|[j, k]| {
*current_step
.get_wrapped(rownum as isize + *j)
.get_wrapped(cellnum as isize + *k)
})
.count();
// Apply game-of-life game rules
*cell = neighbours == 3
|| (current_step.as_ref()[rownum].as_ref()[cellnum] && neighbours == 2)
}
}
}
This should work for both [[bool; WIDTH]; HEIGHT]
and Vec<Vec<bool>>
and anything in between.
There might be a very slight performance hit due to the dynamic .len()
instead of a compile time constant. It should be very small, though. But that of course would need to be benchmarked, it's hard to say just by looking at it.
Remark #5 - Multithreading
The advantage of using iterators is that it is now trivial to introduce multithreading using the rayon
library:
use rayon::prelude::*;
trait SliceExt<T> {
fn get_wrapped(&self, index: isize) -> &T;
}
impl<T, S> SliceExt<T> for S
where
S: AsRef<[T]>,
{
#[inline(always)]
fn get_wrapped(&self, index: isize) -> &T {
let slice = self.as_ref();
unsafe { slice.get_unchecked(index.rem_euclid(slice.len() as isize) as usize) }
}
}
pub fn next_step<In, NestedIn, Out, NestedOut>(current_step: In, mut next_step: Out)
where
In: AsRef<[NestedIn]> + Sync,
Out: AsMut<[NestedOut]> + Send,
NestedIn: AsRef<[bool]>,
NestedOut: AsMut<[bool]> + Send,
{
const NEIGHBOUR_LIST: [[isize; 2]; 8] = [[-1, -1], [-1, 0], [-1, 1],
[ 0, -1], [ 0, 1],
[ 1, -1], [ 1, 0], [ 1, 1]];
next_step
.as_mut()
.par_iter_mut()
.enumerate()
.for_each(|(rownum, row)| {
for (cellnum, cell) in row.as_mut().iter_mut().enumerate() {
// Calculate neighbors
let neighbours = NEIGHBOUR_LIST
.iter()
.filter(|[j, k]| {
*current_step
.get_wrapped(rownum as isize + *j)
.get_wrapped(cellnum as isize + *k)
})
.count();
// Apply game-of-life game rules
*cell = neighbours == 3
|| (current_step.as_ref()[rownum].as_ref()[cellnum] && neighbours == 2)
}
})
}
Performance
After running my versions in @DobromirM's benchmarking setup, I realized that the biggest problem with your existing code is indeed the ownership transfer in and out of next_step
, which causes two copies of the entire game_array
.
My fastest version is the one with ArrayExt
. This is the time it got:
game-of-life new impl, (10x10), random state, 10000 steps
time: [14.955 ms 15.036 ms 15.144 ms]
Found 9 outliers among 100 measurements (9.00%)
6 (6.00%) high mild
3 (3.00%) high severe
game-of-life original impl, (10x10), random state, 10000 steps
time: [38.076 ms 38.725 ms 39.437 ms]
Found 9 outliers among 100 measurements (9.00%)
6 (6.00%) high mild
3 (3.00%) high severe
game-of-life finomnis impl, (10x10), random state, 10000 steps
time: [1.9600 ms 1.9819 ms 2.0087 ms]
Found 10 outliers among 100 measurements (10.00%)
3 (3.00%) high mild
7 (7.00%) high severe
game-of-life new impl, (100x100), random state, 1000 steps
time: [165.55 ms 168.42 ms 171.62 ms]
Found 12 outliers among 100 measurements (12.00%)
6 (6.00%) high mild
6 (6.00%) high severe
game-of-life original impl, (100x100), random state, 1000 steps
time: [436.55 ms 441.34 ms 446.67 ms]
Found 3 outliers among 100 measurements (3.00%)
2 (2.00%) high mild
1 (1.00%) high severe
game-of-life finomnis impl, (100x100), random state, 1000 steps
time: [43.646 ms 43.870 ms 44.165 ms]
Found 6 outliers among 100 measurements (6.00%)
2 (2.00%) high mild
4 (4.00%) high severe
Experiment 1 average execution time:
Original implementation: 436.55 ms
DobromirM's implementation: 168.42 ms
My 'ArrayExt' implementation: 43.64 ms
Experiment 2 average execution time:
Original implementation: 38.076 ms
DobromirM's implementation: 14.955 ms
My 'ArrayExt' implementation: 1.960 ms
Fastest version and conclusion
I assume that your field will be quite large, probably more than 1000x1000. With that in mind, combining all the things I said, this is the fastest version I could come up with:
use rayon::prelude::*;
trait ArrayExt<T, const SIZE: usize> {
fn get_wrapped(&self, index: isize) -> &T;
}
impl<T, const SIZE: usize> ArrayExt<T, SIZE> for [T; SIZE] {
#[inline(always)]
fn get_wrapped(&self, index: isize) -> &T {
unsafe { self.get_unchecked(index.rem_euclid(SIZE as isize) as usize) }
}
}
pub fn next_step<const WIDTH: usize, const HEIGHT: usize>(
current_step: &[[bool; WIDTH]; HEIGHT],
next_step: &mut [[bool; WIDTH]; HEIGHT],
) {
const NEIGHBOUR_LIST: [[isize; 2]; 8] = [[-1, -1], [-1, 0], [-1, 1],
[ 0, -1], [ 0, 1],
[ 1, -1], [ 1, 0], [ 1, 1]];
next_step
.par_iter_mut()
.enumerate()
.for_each(|(rownum, row)| {
for (cellnum, cell) in row.iter_mut().enumerate() {
// Calculate neighbors
let neighbours: u8 = NEIGHBOUR_LIST
.iter()
.map(|[j, k]| {
*current_step
.get_wrapped(rownum as isize + *j)
.get_wrapped(cellnum as isize + *k) as u8
})
.sum();
// Apply game-of-life game rules
*cell = neighbours == 3 || (neighbours == 2 && current_step[rownum][cellnum])
}
});
}
1000x1000, 10 steps:
- Orginal: 420.63 ms
- DobromirM: 208.59 ms
- Mine (ArrayExt, without rayon): 54.772 ms
- Mine (ArrayExt, with rayon): 22.385 ms
This was when I hit a limit. The stack started to overflow.
Then I used this post to create a Box<[[bool; WIDTH]; HEIGHT]>
on the heap:
10000x10000, 1 step:
- Orginal: 4123.4 ms
- DobromirM: 1811.1 ms
- Mine (ArrayExt, without rayon): 691.3 ms
- Mine (ArrayExt, with rayon): 179.1 ms
Of course, this performance is nothing compared to what GPUs are capable of.
To close this chapter, the main takeaway for me is: Moves are not free.
Was fun, I hope you took something away from this answer.
And thanks to @DobromirM for providing the benchmarking code.
This looks more or less like how I would have written it, so I think it is nice. :)
(((rownum as isize + j % HEIGHT_I) + HEIGHT_I) % HEIGHT_I) as usize
I recognise this pattern from code I've written before in other languages: you want a modulo where the result is positive even if the dividend is negative, right? You can use Rust's .rem_euclid
for that:
(rownum as isize + j).rem_euclid(HEIGHT_I) as usize
(Warning: a.rem_euclid(b)
is only the same as (((a % b) + b) % b)
if b
is positive. I think it's rare that you'd want a negative divisor though.)
let mut neighbours: u8; // Will never be above 8
Is there any particular reason this is defined outside the loop? I don't think you ever want to carry over the neighbour count from one cell to the next, right? You could sink this declaration (let
) to where you set the initial value (= 0
) and combine them.