I have a structure that is organized much like a Sudoku board, where individual cells reside in an n x n region, and those regions reside in an n x n area. I need to be able to address both the regions and the individual cells within the regions separately.
When operating on every individual cell (in a particular order), I (previously) had a number of huge, unwieldy, 4-deep 'for' loops like so (traversing in a doubly row-major fashion):
for (int i_row = 0; i_row < BOUND; ++i_row) {
for (int j_row = 0; j_row < BOUND; ++j_row) {
for (int i_col = 0; i_col < BOUND; ++i_col) {
for (int j_col = 0; j_col < BOUND; ++j_col) {
// operate on cell at (i_row, i_col, j_row, j_col)
}
}
}
}
To eliminate these, I created a utility class so I could write it once and use it everywhere:
public class TraversalUtil {
// private to prevent instantiation
private TraversalUtil() {}
public static void traverseDoubleRowMajor(TraversalFunc func) {
int bound = Context.getInstance().getBound();
for (int i_row = 0; i_row < bound; ++i_row) {
for (int j_row = 0; j_row < bound; ++j_row) {
for (int i_col = 0; i_col < bound; ++i_col) {
for (int j_col = 0; j_col < bound; ++j_col) {
func.exec(i_row, i_col, j_row, j_col);
}
}
}
}
}
@FunctionalInterface
public interface TraversalFunc {
public void exec(int i_row, int i_col, int j_row, int j_col);
}
}
So usage becomes:
TraversalUtil.traverseDoubleRowMajor( (i, j, k, m) -> {
System.out.println(getCellAt(i, j, k, m)); // or whatever
});
I'm wondering if this is bad practice or if anyone has any suggestions/changes/improvements. I'm reluctant to change the '2d array of 2d arrays' structure itself because it is a dream to work with in other contexts and circumstances, so fundamental design changes to that model is a much lower priority for me.
I have run into an issue where I've needed external variables (mainly primitives) to be available to the content inside the lambda. To work around that I've been using wrapped primitives marked as final, which seems to work pretty well for me.
Any thoughts are greatly appreciated!
1 Answer 1
where individual cells reside in an n x n region, and those regions reside in an n x n area
First problem, there are no Area
or Region
or Cell
in the code provided.
I need to be able to address both the regions and the individual cells within the regions separately.
Then you should have written something like
class Area {
Region getRegion(int row, int col);
}
class Region {
Cell getCell(int row, int coll);
}
class Cell {
}
so I could write it once and use it everywhere:
Then you should write something composable. void exec(...)
is not very composable; because it returns void
it can't be pipelined. Extract Stream
s (or methods returning a value), and keep Consumer
s (or methods returning void) simple.
For example if your utility method traverseDoubleRowMajor
were something like :
Stream<Cell> cellsOf(Area area)
instead of :
TraversalUtil.traverseDoubleRowMajor( (i, j, k, m) -> {
System.out.println(getCellAt(i, j, k, m)); // or whatever
});
you could then simply do :
cellsOf(area).forEach(System.out::println);
So how could you write a cellsOfRegion
? It'd be easy if you'd first made areas and regions etc explicit as I've shown above:
private static Stream<Cell> cellsOf(Area area) {
return gridElements(BOUND, BOUND, area::getRegion)
.flatMap(region -> gridElements(BOUND, BOUND, region::getCell));
}
private static <R> Stream<R> gridElements(int maxRow, int maxCol, BiFunction<Integer, Integer, R> f) {
return IntStream.range(0, maxRow).boxed().flatMap(row -> (
IntStream.range(0, maxCol).boxed().map(col -> f.apply(row, col))));
}
EDIT
For low iterations, how big is the performance hit for setting up the pipeline for a steam?
Short answer, you should try for yourself. Longer prediction: In almost all cases IO dominates anyway. Where IO is not involved memory access should dominate. And a pipeline using streams should be faster than a pipeline that puts the intermediate results in collections.
List<A> inputElements = elementsOf(input);
List<A> result1 = pipelineStep1(inputElements);
List<A> result2 = pipelineStep2(result1);
// etc etc...
B resultN = pipelineStepN(resultNMinusOne);
should be slower than
B result = elementsOf(input)
.filter(pipelineStep1)
.flatMap(pipelineStep2)
// etc etc...
.collect(pipelineStepN);
And second, is there a way to achieve different traversal orders (i.e. not just row major)?
Sure, one solution would be:
private static <R> Stream<R> colMajor(int maxRow, int maxCol, BiFunction<Integer, Integer, R> f) {
return IntStream.range(0, maxCol).boxed().flatMap(col -> (
IntStream.range(0, maxRow).boxed().map(row -> f.apply(row, col))));
}
Or with less repetition just
private static <R> Stream<R> colMajor(int maxRow, int maxCol, BiFunction<Integer, Integer, R> f) {
return gridElements(maxCol, maxRow, (col, row) -> f.apply(row, col);
}
If you do this you may consider renaming method and parameter names accordingly. e.g. instead of names like row
and col
in gridElements
, which you now would be using for both row-major and col-major traversal; canonical names for integer function parameters/loop counters i
, j
or m
, n
etc.
-
\$\begingroup\$ Ah, I really like this approach. I do have a few questions, however. For low iterations, how big is the performance hit for setting up the pipeline for a steam? And second, is there a way to achieve different traversal orders (i.e. not just row major)? \$\endgroup\$pulse0ne– pulse0ne2015年11月16日 19:07:41 +00:00Commented Nov 16, 2015 at 19:07
-
\$\begingroup\$ And for the record I do have explicit areas and regions defined in separate classes. I omitted those pieces of code here to put the focus on reviewing the traversal methods and resulting usage \$\endgroup\$pulse0ne– pulse0ne2015年11月16日 19:20:51 +00:00Commented Nov 16, 2015 at 19:20
i_col
&i_row
variable names, expecially not in combinations with thej
variants. I think thei
throughm
reads somewhat better. \$\endgroup\$