I have a method in Java that I'm using to partition byte[] arrays into a number of fixed-size sub-arrays. It's not necessary for the last array to be that size, but all preceding ones must be, and the maximum size of a sub array is the partition size.
Here's the code I've been using:
private static byte[][] partition(byte[] in, int partitionSize) {
int partitionCount = (int)Math.ceil((double)in.length / (double) partitionSize);
byte[][] temp = new byte[partitionCount][partitionSize];
for (int i = 0; i < partitionCount; i++) {
if (in.length < (partitionSize * (i + 1))) {
temp[i] = new byte[(in.length - (partitionSize * i))];
}
for(int j = 0; (j < partitionSize && (partitionSize * i + j) < in.length); j++) {
temp[i][j] = in[(partitionSize * i + j)];
}
}
return temp;
}
Are there any obvious ways I could optimize this for speed? The partitionSize I've been using so far is 971, and even though I was expecting it to take a while for large byte arrays (around a gigabyte in size) I'm hoping to be able to squeeze out a bit more performance.
1 Answer 1
You have to use System.arraycopy()
to get best performance on array copies.
Another Small Point : Your point of 'final array can be less in size' is ignored during the 2d array creation. i.e., the second dimension cannot be partitionSize
. It has to be dynamic based on the which partition it is.
The below code is high performance and also it is taking care of this last partition size also in the final returned 2d array.
private static byte[][] partition2(byte[] in, int partitionSize)
{
int partitionCount = (int)Math.ceil((double)in.length / (double) partitionSize);
byte[][] temp = new byte[partitionCount][];
for (int p = 0; p < partitionCount; p++)
{
int start = p * partitionSize;
int len = (p != partitionCount - 1) ? partitionSize : in.length - start;
byte[] partition = new byte[len];
System.arraycopy(in, start, partition, 0, len);
temp[p] = partition;
}
return temp;
}
With this snippet, you will be able to feel the difference if the input array size is significant.
-
1\$\begingroup\$ Definitely seeing a difference! Also, just to point out, my code was handling the final array being of a different size in the if statement. \$\endgroup\$Lev Knoblock– Lev Knoblock2018年08月16日 06:02:47 +00:00Commented Aug 16, 2018 at 6:02
-
\$\begingroup\$ I transpiled this to javascript, and noticed that it is slower than one of my implementations of a similar thing in JS. Is it possible that this could be improved even more? Also note that
in.length
should be static, so we don't need to constantly retrieve it every time. \$\endgroup\$FreezePhoenix– FreezePhoenix2018年08月16日 13:08:07 +00:00Commented Aug 16, 2018 at 13:08 -
\$\begingroup\$ @FreezePhoenix We aren't constantly retrieving it \$\endgroup\$somebody– somebody2018年08月16日 13:50:46 +00:00Commented Aug 16, 2018 at 13:50
-
\$\begingroup\$ @somebody In every iteration they retrieve that value again. \$\endgroup\$FreezePhoenix– FreezePhoenix2018年08月16日 13:53:02 +00:00Commented Aug 16, 2018 at 13:53
-
\$\begingroup\$ @FreezePhoenix ... No? it resolves to
partitionSize
and never retrievesin.length
except in the last iteration... \$\endgroup\$somebody– somebody2018年08月16日 22:57:28 +00:00Commented Aug 16, 2018 at 22:57
Array.forEach
? \$\endgroup\$row * row_length + column
\$\endgroup\$