I am practicing Rust by creating data structures, such as this N-dimensional array.
The purpose of this structure is to easily define and address into arbitrarily-deeply nested arrays without having to write out Vec<Vec<Vec<T>>>
and incur the performance penalty coming from this (although performance is not a priority).
I am looking for feedback on how idiomatic the interface and implementation are, as well as how to improve my tests since I don't think they're very robust.
use std::ops::{Index, IndexMut};
#[derive(Debug, Clone)]
pub struct NdArray<T, const N: usize> {
dim: [usize; N],
data: Vec<T>,
}
impl<T, const N: usize> NdArray<T, N> {
fn calculate_capacity(dim: &[usize; N]) -> usize {
let mut cap = 1;
for &x in dim {
cap = usize::checked_mul(cap, x).expect("vector capacity overflowed usize");
}
cap
}
pub fn new(dim: [usize; N], default: T) -> Self
where
T: Clone,
{
let cap = Self::calculate_capacity(&dim);
NdArray {
dim,
data: vec![default; cap],
}
}
pub fn new_with(dim: [usize; N], generator: impl FnMut() -> T) -> Self {
let cap = Self::calculate_capacity(&dim);
NdArray {
dim,
data: {
let mut v = Vec::new();
v.resize_with(cap, generator);
v
},
}
}
fn get_flat_idx(&self, idx: &[usize; N]) -> usize {
let mut i = 0;
for d in 0..self.dim.len() {
assert!(
idx[d] < self.dim[d],
"index {} is out of bounds for dimension {} with size {}",
idx[d],
d,
self.dim[d],
);
// This cannot overflow since we already checked product of all dimensions fits usize
// and idx < self.dim.
i = i * self.dim[d] + idx[d];
}
i
}
}
impl<T, const N: usize> Index<&[usize; N]> for NdArray<T, N> {
type Output = T;
fn index(&self, idx: &[usize; N]) -> &T {
&self.data[self.get_flat_idx(idx)]
}
}
impl<T, const N: usize> IndexMut<&[usize; N]> for NdArray<T, N> {
fn index_mut(&mut self, idx: &[usize; N]) -> &mut T {
let i = self.get_flat_idx(idx);
&mut self.data[i]
}
}
#[cfg(test)]
mod tests {
use super::*;
use std::panic::catch_unwind;
#[test]
fn nd_array_1d() {
let mut array = NdArray::new_with([10], || 1);
assert_eq!(array[&[0]], 1);
assert_eq!(array[&[9]], 1);
assert!(catch_unwind(|| array[&[10]]).is_err());
array[&[5]] = 99;
assert_eq!(array[&[4]], 1);
assert_eq!(array[&[5]], 99);
assert_eq!(array[&[6]], 1);
}
#[test]
fn nd_array() {
let mut array = NdArray::new([2, 3, 4], 0);
array[&[1, 2, 3]] = 10;
assert_eq!(array[&[1, 2, 3]], 10);
assert_eq!(array[&[1, 0, 0]], 0);
assert!(catch_unwind(|| array[&[2, 0, 0]]).is_err());
assert_eq!(array[&[0, 2, 0]], 0);
assert!(catch_unwind(|| array[&[0, 3, 0]]).is_err());
assert_eq!(array[&[0, 0, 3]], 0);
assert!(catch_unwind(|| array[&[0, 0, 4]]).is_err());
}
#[test]
fn nd_array_overflow() {
// Panics at the allocator code since sizeof(usize) > 2
// NdArray::new([usize::MAX / 2], 0);
assert!(catch_unwind(|| NdArray::new([usize::MAX / 2 + 1, 2], 0)).is_err());
assert!(catch_unwind(|| NdArray::new(
[usize::MAX / 3, usize::MAX / 3, usize::MAX / 3 + 1],
0
))
.is_err());
}
}
1 Answer 1
Nice example of const generics! In general your code is well organized and idiomatic. Some comments:
In your
new()
function, you are using an argument as a default value. It's more idiomatic to use theDefault
trait for this:pub fn new(dim: [usize; N]) -> Self where T: Clone + Default, { let cap = Self::calculate_capacity(&dim); NdArray { dim, data: vec![Default::default(); cap], } }
In some of your unit tests this means you have to use the turbofish to initialize: e.g.
NdArray::new
becomesNdArray::<usize, 2>::new
. However, this only occurs when the user does not access / modify anything in theNdArray
, and if they do that then type inference kicks in. So I recommend providing both options to the user, sayNdArray::new(dim: ...)
forT: Default
andNdArray::new_fillwith(dim: ..., default: T)
if the user wants to specify their own default value.Your structure could use some more functionality. For any collection of
T
,.len()
and.iter()
would be the minimum:pub fn len(&self) -> usize { self.data.len() } pub fn iter(&self) -> impl Iterator<Item = &T> { self.data.iter() }
What about adding two nd-arrays? Multiplying by a constant? Concatenating them along a certain dimension? Generalized matrix multiplication? Some of these will probably require certain utility functions.
- For example the following would probably be quite useful and a good exercise to write: iteration along a sub-dimensional slice of the array, like
pub fn iter_subarray(&self, coords: &[Option<usize>; N]) -> impl Iterator<Item = &T>
where if the coord isNone
then you iterate along all coordinates in that dimension, but if it'sSome(x)
then that coordinate is fixed.
- For example the following would probably be quite useful and a good exercise to write: iteration along a sub-dimensional slice of the array, like
You probably want to
#[derive(PartialEq, Eq)]
in addition toClone
andDebug
.Both the implementation and unit tests could use some handling of edge cases, in particular when the dimension is empty and when any of the dimension lengths is zero. Whether you allow these cases is up to you; I would personally allow the former but not the latter. Here's a unit test:
#[test] fn nd_array_edgecases() { let array_dim0 = NdArray::<usize, 0>::new([]); assert_eq!(array_dim0.len(), 1); assert_eq!(array_dim0[&[]], 0); let array_dim1 = NdArray::<usize, 1>::new([0]); assert_eq!(array_dim1.len(), 0); assert!(catch_unwind(|| array_dim1[&[0]]).is_err()); }
Great job with the unit tests for vector overflow, that is super useful.
In
get_flat_idx
, Clippy recommends:for (d, &dlen) in self.dim.iter().enumerate()
(then usedlen
instead ofself.dim[d]
). I agree, though in this case we still have to used
to index intoidx
.In your implementation, the field
.data
is not intended to dynamically change size. For this reason, you could initialize withVec::with_capacity(n)
instead ofVec::new()
in yournew_with()
function. You might also consider using a fixed-size or fixed-capacity array implementation, for example the ArrayVec crate.
-
\$\begingroup\$ For the naming of
new
/new_with
I followed the interface ofVec::resize
which explicitly doesn't makeDefault
special, but I agree with all of the other feedback. Thanks for the review! \$\endgroup\$apilat– apilat2021年02月25日 14:42:56 +00:00Commented Feb 25, 2021 at 14:42 -
\$\begingroup\$ @apilat That is fair. Perhaps
new
for your version andnew_default
for theDefault
version would be the best names. \$\endgroup\$Caleb Stanford– Caleb Stanford2021年02月25日 17:54:06 +00:00Commented Feb 25, 2021 at 17:54