TCP Packet design

Question 1

I'm implementing a client/server app using JSON (as String) via TCP/IP.

String --> Packet --> byte[] --> tcp/ip --> byte[] --> Packet --> String

The String will be put into a Packet and sent via TCP/IP.

This way I can simply send a Packet by getting its raw byte[] and send it into the TCP/IP socket. On the other side I can simply create a new Packet by putting the received byte[] into a Packet again.

I'm asking you if the Packet class looks okay and if there is anything else wrong with it.

public class Packet {
 private final String content;
 private final byte[] raw;
 private static final int LENGTH_OFFSET = 2;
 public Packet (byte[] data) {
 if (data == null) {
 throw new IllegalArgumentException("data may not be null");
 }
 if (data.length < 2) {
 throw new IllegalArgumentException("invalid data lenght (must be at least two byte)");
 }
 byte[] lengthByte = new byte[2];
 System.arraycopy(data, 0, lengthByte, 0, LENGTH_OFFSET);
 int contentLength = getLength(lengthByte);
 if (contentLength > data.length-LENGTH_OFFSET) {
 throw new IllegalArgumentException("corrupted data - length > data");
 }
 raw = new byte[contentLength];
 System.arraycopy(data, LENGTH_OFFSET, raw, 0, contentLength);
 content = new String(raw);
 }
 public Packet (String cnt) {
 if (cnt == null) {
 throw new IllegalArgumentException("content may not be null");
 }
 if (cnt.length() < 0) {
 throw new IllegalArgumentException("content must not be empty");
 } 
 content = cnt;
 int contentLength = content.length();
 byte[] lengthByte = getLength(contentLength);
 byte[] stringData = content.getBytes();
 raw = new byte[contentLength + LENGTH_OFFSET];
 System.arraycopy(lengthByte, 0, raw, 0, LENGTH_OFFSET);
 System.arraycopy(stringData, 0, raw, LENGTH_OFFSET, contentLength); 
 }
 private static byte[] getLength(int length) {
 if (length > 0xFFFF) {
 throw new IllegalArgumentException("data is too long");
 }
 byte high = trimSignum((length & 0xFF00) >> 8);
 byte low = trimSignum((length & 0xFF)); 
 return new byte[] {high, low};
 }
 private static byte trimSignum(int byteWithSignum) {
 if (byteWithSignum > 127) {
 return (byte)(0xFF - byteWithSignum);
 }
 return (byte)byteWithSignum;
 }
 public String getContent() {
 return content;
 }
 public byte[] getRaw() {
 return raw;
 }
 private static int getLength(byte[] lb) {
 int high = (lb[0] & 0xFF) << 8;
 int low = (lb[1] & 0xFF); 
 return high + low;
 }
}

Question 2

Naming

LENGTH_OFFSET sounds like the offset at which you find the length, not the offset after the length. I'd cede that I can't think of a better name though, that doesn't sound silly...
Personally I'd prefer lengthBytes to lengthByte, since there are more than one (is this a relic of previously using 8bit lengths?).
getLength sounds like an accessor: really this is doing a translation/conversion, and the name should reflect that.
cnt isn't very meaningful, and your ArgumentExceptions all refer to content.

String Encoding

I would be worried about using String.getBytes and String(byte[]), as you are not explicitly indicating the encoding. I'm not a Java programmer, but judging by this answer on StackOverflow and the documentation, you should really be using overloads which allow you to specifying the encoding (e.g. UTF8), which will enable reliable communication between machines, and between implementations of the protocol written in different languages.

Misc

raw is the whole (length + content) packet when you using the Packet(String cnt) constructor, but doesn't include the length when you use the Packet(byte[] data) constructor.
Spelling error in exception message: "lenght"
Nice to see lots of guard clauses: would checking for negative lengths be sensible?
A few more empty lines would be appreciated to break up logical parts of the larger methods (e.g. in the byte[] constructor, first run some checks (empty line) then read the length (empty line) then read the content).

Question 3

Double Counting

Your packet class has both a String and a byte representation of its data:

private final String content;
private final byte[] raw;

This seems unnecessary to me. Unless you're expecting multiple calls to fetch the content/raw from the class then caching the answer seems unnecessary. I'd store either the String value, or the byte value, but not both. I'm on the fence as to which is the best approach since it probably depends on the environment. On the one hand, using String internally removes mutability concerns around the caller changing the byte array. On the other, the internal byte array seems more intuitive for a Packet.

Testing

You haven't included unit tests in your review. To facilitate the review, I wrote a simple test to validate that given a message, it can be encoded, decoded and end up with the same message. The test looks something like this:

String messageToSend = "This is a string to send";
Packet encodedMessage = new Packet(messageToSend);
assertThat(encodedMessage.getContent()).isEqualTo(messageToSend);
Packet decodedMessage = new Packet(encodedMessage.getRaw());
assertThat(decodedMessage.getContent()).isEqualTo(messageToSend);

This is really several tests in one, which is why it's got multiple asserts.

String Encoding + Length

As @VisualMelon said, supplying an encoding to the getBytes call is almost certainly necessary, unless both sides of the communication are on the same machine. I think it's also worth pointing out that you're assuming one-byte encoding. When you're deciding the length of the packet, you use content.length(), where content is the String. However you should really be using the length of the byte array returned from content.getBytes(StandardCharsets.UTF_8) or similar... Whilst the lengths are possibly the same (standard ASCII in the String), as soon as you start using extended characters in the String the length of the byte array and the String length start to diverge. To demonstrate it, you can try a messageToSend like "This is a string to send \u0207!".

Array copying

It feels to me like you are doing unnecessary copying of sections of the array. Most of the methods that you're calling are quite happy working from the original array, using offsets. The only reason you might want to copy the buffer is if you don't own it (you don't trust the caller not to change it after supplying it to the constructor) and you continue to store it internally in which case you want to build a single buffer of the correct size. So, instead of:

byte[] lengthByte = new byte[2];
System.arraycopy(data, 0, lengthByte, 0, LENGTH_OFFSET);
int contentLength = getLength(lengthByte);
if (contentLength > data.length-LENGTH_OFFSET) {
 throw new IllegalArgumentException("corrupted data - length > data");
}
raw = new byte[contentLength];
System.arraycopy(data, LENGTH_OFFSET, raw, 0, contentLength);
content = new String(raw);

You can do this:

int contentLength = getLength(data);
int messageLength = HEADER_SIZE + contentLength;
if (messageLength > data.length) {
 throw new IllegalArgumentException("corrupted data - length > data");
}
raw = new byte[messageLength];
System.arraycopy(data, 0, raw, 0, messageLength);
content = new String(raw, HEADER_SIZE, contentLength, StandardCharsets.UTF_8)

Few things to note: * getLength is nondestructive and just operates on the first two bytes, it doesn't really care that the byte array may be bigger, so we don't need to give it its own array to work with. * I've renamed LENGTH_OFFSET to HEADER_SIZE, which makes more sense to me since it represents the size of the length field, which is effectively the header for the packet being sent. * I've used messageLength as the size of the header + the size of the content to make subsequent statements more readable * I'm passing the complete byte array, including its length, to the String constructor, but I'm specifying a start position (after the header) and the number of bytes to use.

Bytes and Signs

I don't think this does what you think it does:

private static byte trimSignum(int byteWithSignum) {
 if (byteWithSignum > 127) {
 return (byte)(0xFF - byteWithSignum);
 }
 return (byte)byteWithSignum;
}

I think you're trying to accommodate for byte being signed and you don't want 130 to be represent by in a signed byte as '-126', but the result is that you're doing 255 - 130, which is 125. You're effectively removing the sign without putting the information anywhere (when you decode how do you know if it's 125 or 125 representing 130?). This gets worse the closer byteWithSigNum gets to 255. The function is fine just being:

return (byte)byteWithSignum;

You can demonstrate the error with:

String messageToSend = "This is a string to send 5555555555555555555555555555555555555555555555555555665555555555555555555555555555555555555555555555550";

VisualMelon VisualMelon 7,58123 silver badges46 bronze badges · Answer 1 · 2018-07-11 10:10:46Z

Naming

LENGTH_OFFSET sounds like the offset at which you find the length, not the offset after the length. I'd cede that I can't think of a better name though, that doesn't sound silly...
Personally I'd prefer lengthBytes to lengthByte, since there are more than one (is this a relic of previously using 8bit lengths?).
getLength sounds like an accessor: really this is doing a translation/conversion, and the name should reflect that.
cnt isn't very meaningful, and your ArgumentExceptions all refer to content.

String Encoding

I would be worried about using String.getBytes and String(byte[]), as you are not explicitly indicating the encoding. I'm not a Java programmer, but judging by this answer on StackOverflow and the documentation, you should really be using overloads which allow you to specifying the encoding (e.g. UTF8), which will enable reliable communication between machines, and between implementations of the protocol written in different languages.

Misc

raw is the whole (length + content) packet when you using the Packet(String cnt) constructor, but doesn't include the length when you use the Packet(byte[] data) constructor.
Spelling error in exception message: "lenght"
Nice to see lots of guard clauses: would checking for negative lengths be sensible?
A few more empty lines would be appreciated to break up logical parts of the larger methods (e.g. in the byte[] constructor, first run some checks (empty line) then read the length (empty line) then read the content).

forsvarir forsvarir 11.8k7 gold badges39 silver badges72 bronze badges · Answer 2 · 2020-03-05 10:09:32Z

Double Counting

Your packet class has both a String and a byte representation of its data:

private final String content;
private final byte[] raw;

This seems unnecessary to me. Unless you're expecting multiple calls to fetch the content/raw from the class then caching the answer seems unnecessary. I'd store either the String value, or the byte value, but not both. I'm on the fence as to which is the best approach since it probably depends on the environment. On the one hand, using String internally removes mutability concerns around the caller changing the byte array. On the other, the internal byte array seems more intuitive for a Packet.

Testing

You haven't included unit tests in your review. To facilitate the review, I wrote a simple test to validate that given a message, it can be encoded, decoded and end up with the same message. The test looks something like this:

String messageToSend = "This is a string to send";
Packet encodedMessage = new Packet(messageToSend);
assertThat(encodedMessage.getContent()).isEqualTo(messageToSend);
Packet decodedMessage = new Packet(encodedMessage.getRaw());
assertThat(decodedMessage.getContent()).isEqualTo(messageToSend);

This is really several tests in one, which is why it's got multiple asserts.

String Encoding + Length

As @VisualMelon said, supplying an encoding to the getBytes call is almost certainly necessary, unless both sides of the communication are on the same machine. I think it's also worth pointing out that you're assuming one-byte encoding. When you're deciding the length of the packet, you use content.length(), where content is the String. However you should really be using the length of the byte array returned from content.getBytes(StandardCharsets.UTF_8) or similar... Whilst the lengths are possibly the same (standard ASCII in the String), as soon as you start using extended characters in the String the length of the byte array and the String length start to diverge. To demonstrate it, you can try a messageToSend like "This is a string to send \u0207!".

Array copying

It feels to me like you are doing unnecessary copying of sections of the array. Most of the methods that you're calling are quite happy working from the original array, using offsets. The only reason you might want to copy the buffer is if you don't own it (you don't trust the caller not to change it after supplying it to the constructor) and you continue to store it internally in which case you want to build a single buffer of the correct size. So, instead of:

byte[] lengthByte = new byte[2];
System.arraycopy(data, 0, lengthByte, 0, LENGTH_OFFSET);
int contentLength = getLength(lengthByte);
if (contentLength > data.length-LENGTH_OFFSET) {
 throw new IllegalArgumentException("corrupted data - length > data");
}
raw = new byte[contentLength];
System.arraycopy(data, LENGTH_OFFSET, raw, 0, contentLength);
content = new String(raw);

You can do this:

int contentLength = getLength(data);
int messageLength = HEADER_SIZE + contentLength;
if (messageLength > data.length) {
 throw new IllegalArgumentException("corrupted data - length > data");
}
raw = new byte[messageLength];
System.arraycopy(data, 0, raw, 0, messageLength);
content = new String(raw, HEADER_SIZE, contentLength, StandardCharsets.UTF_8)

Few things to note: * getLength is nondestructive and just operates on the first two bytes, it doesn't really care that the byte array may be bigger, so we don't need to give it its own array to work with. * I've renamed LENGTH_OFFSET to HEADER_SIZE, which makes more sense to me since it represents the size of the length field, which is effectively the header for the packet being sent. * I've used messageLength as the size of the header + the size of the content to make subsequent statements more readable * I'm passing the complete byte array, including its length, to the String constructor, but I'm specifying a start position (after the header) and the number of bytes to use.

Bytes and Signs

I don't think this does what you think it does:

private static byte trimSignum(int byteWithSignum) {
 if (byteWithSignum > 127) {
 return (byte)(0xFF - byteWithSignum);
 }
 return (byte)byteWithSignum;
}

I think you're trying to accommodate for byte being signed and you don't want 130 to be represent by in a signed byte as '-126', but the result is that you're doing 255 - 130, which is 125. You're effectively removing the sign without putting the information anywhere (when you decode how do you know if it's 125 or 125 representing 130?). This gets worse the closer byteWithSigNum gets to 255. The function is fine just being:

return (byte)byteWithSignum;

You can demonstrate the error with:

String messageToSend = "This is a string to send 5555555555555555555555555555555555555555555555555555665555555555555555555555555555555555555555555555550";

Stack Exchange Network

TCP Packet design

2 Answers 2

Naming

String Encoding

Misc

Double Counting

Testing

String Encoding + Length

Array copying

Bytes and Signs

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

TCP Packet design

2 Answers 2

Naming

String Encoding

Misc

Double Counting

Testing

String Encoding + Length

Array copying

Bytes and Signs

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions