Neatly printing integers interspersed with spaces - follow-up

Question 1

At this point, I have

"digit group" length assumed to be always 3
the user can pass the delimiter character of his/her own
handles (seems to handle) the minus sign gracefully
hacked away a test in a conditional

import java.util.Scanner;
public class Main {
 /**
 * Handles <b>most</b> locales.
 */
 private static final int GROUP_LENGTH = 3;
 private static final char DEFAULT_DELIMITER_CHAR = ' ';
 public static String neatify(final long number,
 final char delimiter) {
 final char[] charArray = Long.toString(number).toCharArray();
 final StringBuilder sb = new StringBuilder();
 int index;
 if (charArray[0] == '-') {
 sb.append('-');
 index = 1;
 } else {
 index = 0;
 }
 sb.append(charArray[index++]);
 for (; index < charArray.length; ++index) {
 if ((charArray.length - index) % GROUP_LENGTH == 0) {
 sb.append(delimiter);
 }
 sb.append(charArray[index]);
 }
 return sb.toString();
 }
 public static String neatify(final long number) {
 return neatify(number, DEFAULT_DELIMITER_CHAR);
 }
 public static void main(final String... args) {
 final Scanner scanner = new Scanner(System.in);
 while (scanner.hasNextLong()) {
 System.out.println(neatify(scanner.nextLong()));
 }
 }
}

By default, performance is important for me. If you know how to squeeze just a little bit more of that, let me know!

Question 2

Style

Why make the delimiter an input parameter, but not the group length (3)? The DEFAULT_DELIMITER_CHAR is also... out of place, either you have DEFAULT values for both, or you have for none, but the mixed concept is cumbersome.

Apart from that incongruity, the code is otherwise really quite neat, and understandable.

For what it's worth, I tend to use magic numbers, and not constants, when confronted with these situations. I am aware that magic numbers are not generally welcomed, but your code here:

public static String neatify(final long number) {
 return neatify(number, DEFAULT_DELIMITER_CHAR);
}

is really not much better, or perhaps even worse, than:

public static String neatify(final long number) {
 return neatify(number, ' ');
}

That's a personal thing.

Algorithm

I dislike the char-by-char algorithm you have chosen. The reason is that, as I see it, you are working in an "arse-backwards" way. The problem is about right-referenced values. The groups-of-3 blocks are referenced to the right-side of the value, not the left, so the trick is getting the reference correct to start with, and then you don't have to play tricks to keep it correct.

The trick is to find the right 'start' point that's correct, relative to the right, but then referenced from the left... let me explain:

 final String raw = Long.toString(val);
 int end = raw.length() - span * ((raw.length() - 1) / span);
 int start = 0;

What does the above code do? It converts the long input value to a string without any spaces. It uses the length() and from that, it subtracts the full number of blocks that have to be padded.

So, the following table gives the 'input' and 'end' result:

1 -> 1
12 -> 2
123 -> 3
1234 -> 1
12345 -> 2
123456 -> 3
.....

So, the start position for the first group is at char 0, and the end position is calculated above. From there on, it is just a simple case of getting the next group, and so on.

The full code I would recommend is:

public static String neatify(final long val, final char pad, final int span) {
 if (val == Long.MIN_VALUE) {
 long minblock = (long)Math.pow(10, span);
 // recursive hot-fix for otherwise impossible-to-make-positive value.
 return neatify(val / minblock, pad, span) + pad + neatify(-(val % minblock), pad, span);
 }
 if (val < 0) {
 return "-" + neatify(-val, pad, span);
 }
 
 final String raw = Long.toString(val);
 int end = raw.length() - span * ((raw.length() - 1) / span);
 int start = 0;
 StringBuilder sb = new StringBuilder(raw.length() + (raw.length() / span) + 1);
 while (start < raw.length()) {
 sb.append(pad);
 sb.append(raw.substring(start, end));
 start = end;
 end = start + span;
 }
 return sb.substring(1);
}

Note that, for this system, I have enabled all three input parameters, and I have done some simple 1-level-down recursive tricks to handle negative values (and Long.MIN_VALUE which is a PITA).

Update - Performance

Since the question was edited to include performance as a criteria, then the following code will do essentially the same as above, but with some performance tweaks:

it won't use recursion for negatives, but instead do some more complicated sign-juggling
it keeps everything as primitives (chars, etc.) instead of Strings
it pre-computes the output sizes, etc.

It is still referenced against the right-side of the output, and has the same basic concepts, just... faster. About twice as fast, and about 20% faster than the OP's...

public static String neatifyRLP(final long val, final char pad, final int span) {
 final String raw = Long.toString(val);
 final int signlen = val < 0 ? 1 : 0;
 final int digitlen = raw.length() - signlen;
 final char[] out = new char[signlen + digitlen + (digitlen - 1) / span];
 int pos = out.length - 1;
 int src = raw.length() - 1;
 final int ospan = span + 1;
 while (pos >= signlen) {
 out[pos] = (out.length - pos) % ospan == 0 ? pad : raw.charAt(src--);
 pos--;
 }
 if (val < 0) {
 out[0] = '-';
 }
 return new String(out);
}

I am not at all certain that it is more readable, but, it is not horrible either.

Here are the performance numbers based on neatifying 1000 random long values:

Task NumberPad -> OP: (Unit: MILLISECONDS)
 Count : 10000 Average : 0.2202
 Fastest : 0.1960 Slowest : 9.8499
 95Pctile : 0.2927 99Pctile : 0.4620
 TimeBlock : 0.265 0.222 0.222 0.215 0.218 0.217 0.217 0.208 0.206 0.211
 Histogram : 9769 218 7 1 4 1
Task NumberPad -> RL: (Unit: MILLISECONDS)
 Count : 10000 Average : 0.3524
 Fastest : 0.3127 Slowest : 15.0630
 95Pctile : 0.5521 99Pctile : 0.6260
 TimeBlock : 0.416 0.368 0.360 0.340 0.366 0.347 0.330 0.334 0.328 0.335
 Histogram : 9900 88 6 2 3 1
Task NumberPad -> RLP: (Unit: MILLISECONDS)
 Count : 10000 Average : 0.1866
 Fastest : 0.1683 Slowest : 5.9859
 95Pctile : 0.2280 99Pctile : 0.4241
 TimeBlock : 0.218 0.189 0.193 0.181 0.185 0.185 0.177 0.178 0.179 0.182
 Histogram : 9826 165 3 2 2 2

Question 3

(1) On numbers 1, 2, ..., 1e6, your implementation runs 4+ times longer than mine. (2) Both routines produce the same results.

Question 4

If performance is a concern of yours, you should state it in your question.

Question 5

Ah, maan. It's inconclusive. The winner routine really depends on execution order.

Question 6

The dependence was obviously due to the fact that I had to populate the lists in order to be able to assert that the routines agree on common input. After removing it, everything turned out to be fine... for me. My routine is still (slightly) faster. ;)

Question 7

@coderodde - FYI - Gist, with results gist.github.com/rolfl/5c979b71c95a97578dea

Question 8

Small things that come to my mind are,

else {
 index = 0;
}

That part can be avoided.

Also I feel that the errors should be handled more gracefully. For instance if I pass in a value that is out of long's range, then the problem should react in a different way, than to passing a sting which cannot be converted to a long. So, why not handle the errors at the neatify() method level instead of restricting it in the main().

Also I feel delimiter should be restricted to only few values. Why not use an enum and allow the user to just restrict the choice of the delimiters.

IMO, I avoid using for loops without initialization. I feel that is what while and do-while's are for. I feel while loops in that case would offer more readability. But that's just fine if you use for loop ;)...

Hope that helps.

rolfl rolfl 98.1k17 gold badges219 silver badges419 bronze badges · Accepted Answer · 2015-05-11 15:57:02Z

Style

Why make the delimiter an input parameter, but not the group length (3)? The DEFAULT_DELIMITER_CHAR is also... out of place, either you have DEFAULT values for both, or you have for none, but the mixed concept is cumbersome.

Apart from that incongruity, the code is otherwise really quite neat, and understandable.

For what it's worth, I tend to use magic numbers, and not constants, when confronted with these situations. I am aware that magic numbers are not generally welcomed, but your code here:

public static String neatify(final long number) {
 return neatify(number, DEFAULT_DELIMITER_CHAR);
}

is really not much better, or perhaps even worse, than:

public static String neatify(final long number) {
 return neatify(number, ' ');
}

That's a personal thing.

Algorithm

I dislike the char-by-char algorithm you have chosen. The reason is that, as I see it, you are working in an "arse-backwards" way. The problem is about right-referenced values. The groups-of-3 blocks are referenced to the right-side of the value, not the left, so the trick is getting the reference correct to start with, and then you don't have to play tricks to keep it correct.

The trick is to find the right 'start' point that's correct, relative to the right, but then referenced from the left... let me explain:

 final String raw = Long.toString(val);
 int end = raw.length() - span * ((raw.length() - 1) / span);
 int start = 0;

What does the above code do? It converts the long input value to a string without any spaces. It uses the length() and from that, it subtracts the full number of blocks that have to be padded.

So, the following table gives the 'input' and 'end' result:

1 -> 1
12 -> 2
123 -> 3
1234 -> 1
12345 -> 2
123456 -> 3
.....

So, the start position for the first group is at char 0, and the end position is calculated above. From there on, it is just a simple case of getting the next group, and so on.

The full code I would recommend is:

public static String neatify(final long val, final char pad, final int span) {
 if (val == Long.MIN_VALUE) {
 long minblock = (long)Math.pow(10, span);
 // recursive hot-fix for otherwise impossible-to-make-positive value.
 return neatify(val / minblock, pad, span) + pad + neatify(-(val % minblock), pad, span);
 }
 if (val < 0) {
 return "-" + neatify(-val, pad, span);
 }
 
 final String raw = Long.toString(val);
 int end = raw.length() - span * ((raw.length() - 1) / span);
 int start = 0;
 StringBuilder sb = new StringBuilder(raw.length() + (raw.length() / span) + 1);
 while (start < raw.length()) {
 sb.append(pad);
 sb.append(raw.substring(start, end));
 start = end;
 end = start + span;
 }
 return sb.substring(1);
}

Note that, for this system, I have enabled all three input parameters, and I have done some simple 1-level-down recursive tricks to handle negative values (and Long.MIN_VALUE which is a PITA).

Update - Performance

Since the question was edited to include performance as a criteria, then the following code will do essentially the same as above, but with some performance tweaks:

it won't use recursion for negatives, but instead do some more complicated sign-juggling
it keeps everything as primitives (chars, etc.) instead of Strings
it pre-computes the output sizes, etc.

It is still referenced against the right-side of the output, and has the same basic concepts, just... faster. About twice as fast, and about 20% faster than the OP's...

public static String neatifyRLP(final long val, final char pad, final int span) {
 final String raw = Long.toString(val);
 final int signlen = val < 0 ? 1 : 0;
 final int digitlen = raw.length() - signlen;
 final char[] out = new char[signlen + digitlen + (digitlen - 1) / span];
 int pos = out.length - 1;
 int src = raw.length() - 1;
 final int ospan = span + 1;
 while (pos >= signlen) {
 out[pos] = (out.length - pos) % ospan == 0 ? pad : raw.charAt(src--);
 pos--;
 }
 if (val < 0) {
 out[0] = '-';
 }
 return new String(out);
}

I am not at all certain that it is more readable, but, it is not horrible either.

Here are the performance numbers based on neatifying 1000 random long values:

Task NumberPad -> OP: (Unit: MILLISECONDS)
 Count : 10000 Average : 0.2202
 Fastest : 0.1960 Slowest : 9.8499
 95Pctile : 0.2927 99Pctile : 0.4620
 TimeBlock : 0.265 0.222 0.222 0.215 0.218 0.217 0.217 0.208 0.206 0.211
 Histogram : 9769 218 7 1 4 1
Task NumberPad -> RL: (Unit: MILLISECONDS)
 Count : 10000 Average : 0.3524
 Fastest : 0.3127 Slowest : 15.0630
 95Pctile : 0.5521 99Pctile : 0.6260
 TimeBlock : 0.416 0.368 0.360 0.340 0.366 0.347 0.330 0.334 0.328 0.335
 Histogram : 9900 88 6 2 3 1
Task NumberPad -> RLP: (Unit: MILLISECONDS)
 Count : 10000 Average : 0.1866
 Fastest : 0.1683 Slowest : 5.9859
 95Pctile : 0.2280 99Pctile : 0.4241
 TimeBlock : 0.218 0.189 0.193 0.181 0.185 0.185 0.177 0.178 0.179 0.182
 Histogram : 9826 165 3 2 2 2

(1) On numbers 1, 2, ..., 1e6, your implementation runs 4+ times longer than mine. (2) Both routines produce the same results.
If performance is a concern of yours, you should state it in your question.
Ah, maan. It's inconclusive. The winner routine really depends on execution order.
The dependence was obviously due to the fact that I had to populate the lists in order to be able to assert that the routines agree on common input. After removing it, everything turned out to be fine... for me. My routine is still (slightly) faster. ;)
@coderodde - FYI - Gist, with results gist.github.com/rolfl/5c979b71c95a97578dea

Stack Exchange Network

Neatly printing integers interspersed with spaces - follow-up

2 Answers 2

Style

Algorithm

Update - Performance

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Neatly printing integers interspersed with spaces - follow-up

2 Answers 2

Style

Algorithm

Update - Performance

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions