I have implemented an Iterator
for my Pojo
class.
The purpose here is to lazily decode multiple Pojo
s encoded in a buffer (bytes array).
To improve performance, it needs to be thread-safe to be correctly used from multiple Thread (parallel Stream
).
The buffer is formed as follows :
- The first 4 bytes represent an int indicating how many bytes the next element uses in the buffer :
n
. - The following
n
bytes are encoded representation of a singlePojo
. - 4 other bytes for the length of the next element.
- And so on ...
I think it is achieved with this code :
public class StringDecodeIterator implements Iterator<Pojo> {
private final int size;
private final PojoDecoder decoder;
private final byte[] buffer;
private int offset = 0;
private int count = 0;
private StringDecodeIterator(int size, PojoDecoder decoder, byte[] buffer) {
this.size = size;
this.decoder = decoder;
this.buffer = buffer;
}
@Override
public synchronized boolean hasNext() {
return this.count < this.size;
}
@Override
public Pojo next() {
return this.decoder.decode(offset(), this.buffer);
}
private synchronized int offset() {
if (!(this.count < this.size)) {
throw new NoSuchElementException("No more element");
}
this.count++;
int o = this.offset;
this.offset += Utils.getInt(o) + 4;
return o;
}
}
Here are some explanations/details :
size
is the number of element to decode : It is a known value; we know how manyPojo
s are encoded.PojoDecoder decoder
is an object that can decodePojo
s from a buffer at a given position.interface PojoDecoder { Pojo decode(int offset, byte[] buffer); }
Utils#getInt(int);
just converts 4 bytes to an int from the given position.PojoDecoder#decode(int, byte[])
does not need to be in the thread-safety scope. Same instance can be used from multiple threads, as it does not hold any internal state.
-
4\$\begingroup\$ I don't thing it makes sense to try to make an Iterator thread safe. An iterator expects the caller to execute hasNext() and next() in series so the thread safety relies completely on the caller putting them in the same synchronized block. So there's not much point in trying to add synchronized keywords to the interator anymore. And if you need to support parallel streams, it's better to use a Spliterator instead. \$\endgroup\$TorbenPutkonen– TorbenPutkonen2024年02月12日 06:57:15 +00:00Commented Feb 12, 2024 at 6:57
-
1\$\begingroup\$ Welcome to Code Review! I have rolled back your last edits. Please do not update the code in your question to incorporate feedback from answers, doing so goes against the Question + Answer style of Code Review. This is not a forum where you should keep the most updated version in your question. Please see what you may and may not do after receiving answers . \$\endgroup\$Heslacher– Heslacher2024年02月12日 10:46:12 +00:00Commented Feb 12, 2024 at 10:46
1 Answer 1
In its current form, the code doesn't work:
- unless it's an inner class of some enclosing one, the constructor should not be private;
Utils.getInt(...)
needs a second param: the array to read from.
I'm guessing that both StringDecodeIterator
and Utils
were originally inner classes of the same enclosing class, that also had a reference to buffer
which was used by Utils.getInt(int)
... For the future please make sure that the code you post here works in exactly the form it is posted in: it makes reviewing much easier.
The rest are just nitpicks:
- class name: I fail to see any
String
s here: how aboutByteArrayDecodingIterator
? - generalization suggestion: not sure if it will be useful in your case, but you can make this class and the decoder interface generic, so that you can use it also for any other type apart from just currently hard-coded
Pojo
. - naming: IMO
getNextOffset()
would be slightly more clear regarding its purpose than justoffset()
. - naming:
o
is a very ambiguous name, even for a local: how aboutresultOffset
? if (!(this.count < this.size))
: why not justif (this.count >= this.size)
?- exception message: I don't think
"No more element"
brings any value given that the class of the exception isNoSuchElementException
. Also, I'm not a native speaker, but I'm pretty sure it should be"No more elements"
(plural). - not sure if it's worth having a utility method
Utils.getInt(...)
just for
ByteBuffer.wrap(buffer, offset, Integer.BYTES).getInt()
(unless this needs to be blazing fast for some reason and even creating a temporaryByteBuffer
is too much of an overhead in your situation). - magic numbers: use
Integer.BYTES
instead of4
.
-
1\$\begingroup\$ Thanks for your answer, indeed, the class was inner in the past. I've taken your comments into account. About
ByteBuffer.wrap(...)
, in fact, this is what is used in the utils method, I've encapsulated it to simplify the review \$\endgroup\$William– William2024年02月12日 10:53:20 +00:00Commented Feb 12, 2024 at 10:53 -
1\$\begingroup\$ The much, much bigger problem than all of those listed, is that it's simply impossible to provide a thread-safe iterator, since the underlying API is simply not designed for such a use case. There's an inherent race condition with the
HasNext()/Next()
sequence. \$\endgroup\$Voo– Voo2024年02月12日 14:12:02 +00:00Commented Feb 12, 2024 at 14:12 -
\$\begingroup\$ @Voo indeed, good point. This should however rather be the answer to the OP (that i will gladly upvote), than a comment to this answer, that the OP may not even notice ;-) \$\endgroup\$morgwai– morgwai2024年02月12日 16:55:38 +00:00Commented Feb 12, 2024 at 16:55