Return to Revisions

4 of 4

Commonmark migration

edited Jun 10, 2020 at 13:24

Style

Why make the delimiter an input parameter, but not the group length (3)? The DEFAULT_DELIMITER_CHAR is also... out of place, either you have DEFAULT values for both, or you have for none, but the mixed concept is cumbersome.

Apart from that incongruity, the code is otherwise really quite neat, and understandable.

For what it's worth, I tend to use magic numbers, and not constants, when confronted with these situations. I am aware that magic numbers are not generally welcomed, but your code here:

public static String neatify(final long number) {
 return neatify(number, DEFAULT_DELIMITER_CHAR);
}

is really not much better, or perhaps even worse, than:

public static String neatify(final long number) {
 return neatify(number, ' ');
}

That's a personal thing.

Algorithm

I dislike the char-by-char algorithm you have chosen. The reason is that, as I see it, you are working in an "arse-backwards" way. The problem is about right-referenced values. The groups-of-3 blocks are referenced to the right-side of the value, not the left, so the trick is getting the reference correct to start with, and then you don't have to play tricks to keep it correct.

The trick is to find the right 'start' point that's correct, relative to the right, but then referenced from the left... let me explain:

 final String raw = Long.toString(val);
 int end = raw.length() - span * ((raw.length() - 1) / span);
 int start = 0;

What does the above code do? It converts the long input value to a string without any spaces. It uses the length() and from that, it subtracts the full number of blocks that have to be padded.

So, the following table gives the 'input' and 'end' result:

1 -> 1
12 -> 2
123 -> 3
1234 -> 1
12345 -> 2
123456 -> 3
.....

So, the start position for the first group is at char 0, and the end position is calculated above. From there on, it is just a simple case of getting the next group, and so on.

The full code I would recommend is:

public static String neatify(final long val, final char pad, final int span) {
 if (val == Long.MIN_VALUE) {
 long minblock = (long)Math.pow(10, span);
 // recursive hot-fix for otherwise impossible-to-make-positive value.
 return neatify(val / minblock, pad, span) + pad + neatify(-(val % minblock), pad, span);
 }
 if (val < 0) {
 return "-" + neatify(-val, pad, span);
 }
 
 final String raw = Long.toString(val);
 int end = raw.length() - span * ((raw.length() - 1) / span);
 int start = 0;
 StringBuilder sb = new StringBuilder(raw.length() + (raw.length() / span) + 1);
 while (start < raw.length()) {
 sb.append(pad);
 sb.append(raw.substring(start, end));
 start = end;
 end = start + span;
 }
 return sb.substring(1);
}

Note that, for this system, I have enabled all three input parameters, and I have done some simple 1-level-down recursive tricks to handle negative values (and Long.MIN_VALUE which is a PITA).

Update - Performance

Since the question was edited to include performance as a criteria, then the following code will do essentially the same as above, but with some performance tweaks:

it won't use recursion for negatives, but instead do some more complicated sign-juggling
it keeps everything as primitives (chars, etc.) instead of Strings
it pre-computes the output sizes, etc.

It is still referenced against the right-side of the output, and has the same basic concepts, just... faster. About twice as fast, and about 20% faster than the OP's...

public static String neatifyRLP(final long val, final char pad, final int span) {
 final String raw = Long.toString(val);
 final int signlen = val < 0 ? 1 : 0;
 final int digitlen = raw.length() - signlen;
 final char[] out = new char[signlen + digitlen + (digitlen - 1) / span];
 int pos = out.length - 1;
 int src = raw.length() - 1;
 final int ospan = span + 1;
 while (pos >= signlen) {
 out[pos] = (out.length - pos) % ospan == 0 ? pad : raw.charAt(src--);
 pos--;
 }
 if (val < 0) {
 out[0] = '-';
 }
 return new String(out);
}

I am not at all certain that it is more readable, but, it is not horrible either.

Here are the performance numbers based on neatifying 1000 random long values:

Task NumberPad -> OP: (Unit: MILLISECONDS)
 Count : 10000 Average : 0.2202
 Fastest : 0.1960 Slowest : 9.8499
 95Pctile : 0.2927 99Pctile : 0.4620
 TimeBlock : 0.265 0.222 0.222 0.215 0.218 0.217 0.217 0.208 0.206 0.211
 Histogram : 9769 218 7 1 4 1
Task NumberPad -> RL: (Unit: MILLISECONDS)
 Count : 10000 Average : 0.3524
 Fastest : 0.3127 Slowest : 15.0630
 95Pctile : 0.5521 99Pctile : 0.6260
 TimeBlock : 0.416 0.368 0.360 0.340 0.366 0.347 0.330 0.334 0.328 0.335
 Histogram : 9900 88 6 2 3 1
Task NumberPad -> RLP: (Unit: MILLISECONDS)
 Count : 10000 Average : 0.1866
 Fastest : 0.1683 Slowest : 5.9859
 95Pctile : 0.2280 99Pctile : 0.4241
 TimeBlock : 0.218 0.189 0.193 0.181 0.185 0.185 0.177 0.178 0.179 0.182
 Histogram : 9826 165 3 2 2 2

answered May 11, 2015 at 15:57

rolfl

98.1k
17
219
419

default