I'm mostly curious if there's a faster or more efficient algorithm I could use. I created this for a unique side project, as well as for an issue at work, so what the function does (e.g., return a negative int) isn't what I'm focused on.
In short, can I make this faster or more efficient?
// GetBufferLength returns the minimum length of a buffer that can hold the
// string representation of all the numbers with in the given range of
// integers, N.
// ADD will add additional space for extra characters. E.g., if one
// wanted "1,ドル2,ドル3ドル...", ADD should be 2 in order to gain the additional
// space for the comma and the dollar sign.
// Will return a negative value if N is negative.
func GetBufferLength(n, add int) int {
// Handle some edge cases
if n < 10 {
return n + (n * add)
}
if n <= 100 {
return (n*(2+add) - 9) + 1
}
// Gather the largest width of all our numbers.
if 0 < n {
// Account for 0
n++
// Find the greatest power of 10.
i, x := 1, 0
for i < n {
i *= 10
// Go doesn't fully support C-style for/while loops,
// so occasionally we have to do some janky stuff.
if i > n {
i /= 10
break
}
// Number of powers.
x++
}
// Count width of numbers.
l := 10
for 1 < x {
// Algorithm is X = W*N, where W is the current width of the
// number and N is the number of numbers (i.e., the number
// itself).
l *= ((x + add) * 10)
x--
// Since we start with X being the largest power, we need to
// subtract the previous power, otherwise we'd double up.
//
// E.g., given 1000 we'd end up with (4*1000) = 4000, and
// that would imply that _all_ numbers from 1-1000 are
// 4 digits wide. Since 0-9 are one digit, 10-99 are two.
// and 100-999 are three, we need to compute _their_ widths
// and subtract their difference from the largest power.
l -= ((x + add) * 10)
}
// Include single digit numbers
return (n-i)*(2+add) + l
}
return n
}
1 Answer 1
Note that your algorithm is not correct:
For example input n=9, add=2
it properly returns 27
(9*1 + 9*2
), but for n=10
it should be 31
(+ 2 for representing "10"
and +2 as defined by add
) but your algorithm returns 32
.
The error increases drastically as n
grows. For example n=10000, add=2
it returns 1098374
! Even if all the 10000 numbers would be 5 characters long, it would be a really high estimation of 10000*5 + 10000*2 = 70000
(instead of a million).
All the following implementations along with the benchmarking code can be found on the Go Playground. The code posted there is not an executable but a test file. Save it in a file like XX_test.go
, and run go test -bench .
to run the benchmarks.
The algorithm
Let's examine the algoritm. Since the extra space add
has to be added for each number, it always means n * add
to the number of total digits. We can incorporate this at the end, so we don't have to think about it. So let's reduce the problem and algorithm putting add
aside for now.
Example
Let's take the input number n=23456
. How can we calculate the total length of all numbers starting from 1
up to 23456
?
- The numbers whose length is 5 digits is
13456 + 1
, their length:13456*5 + 5
- Numbers with 4 digits and their length:
9000*4
- Numbers with 3 digits and their length:
900*3
- Numbers with 2 digits and their length:
90*2
- Numbers with 1 digit and their length:
9*1
Formalize
Let's formalize the algorithm:
- Subtract 1 from the first digit and add +1 to the result, and multiply the result with the digits count of this number.
- Add the total number of digits of all the numbers having less digits.
- Add extra digits specified by the
add
parameter.
Implementation
Things to note:
To determine which power of 10 preceeds the input n
, we may use a loop always multiplying the previous number by 10
. These multiplications (the powers of 10
) are constants, we don't need to do this in the function, we can precalculate them and store them in a slice beforehand. Note that for practical reasons, I will store the power of 10's -1. See below the implementation for details.
var pows = []int{0, 9, 99, 999, 9999, 99999, 999999, 9999999, 99999999, 999999999}
If we have a sorted slice of the power of 10's, we can use binary search to find the power of 10 for our input number. However as this slice is small (~10 elements), it is faster to just loop over it in a sequencial way.
The total number of digits of all numbers having the same width (same amount of digits) are also constant values. We will precalculate these and store them so we don't have to repeat this constant calculation. For the first 8 digits:
var sums = []int{0, 9, 189, 2889, 38889, 488889, 5888889, 68888889, 788888889}
To find the power of 10, it's a simple loop, something like this:
for i := 1; ; i++ {
if n <= pows[i] {
// Found it, n is lower than pow(10, i) and greater than pow(10, i-1)
}
}
The code
Once we have the power of 10 (its index is i
), the formalized algorithm:
- Subtract 1 from the first digit:
n - (pows[i-1]+1)
Add +1:
n - (pows[i-1]+1) + 1 = n - pows[i-1]
(this is why we stored powers of 10 -1)
Multiply with the digits count:
(n-pows[i-1])*i
- Add the total number of digits of all the numbers having less digits:
(n-pows[i-1])*i + sums[i-1]
- Add extra digits specified by the
add
parameter:
(n-pows[i-1])*i + sums[i-1] + n*add
And that's all! It's extremely short and compact! It contains 2 multiplications, no divisions, and maybe like 10 additions and subtractions!
The complete code:
var sums = []int{0, 9, 189, 2889, 38889, 488889, 5888889, 68888889, 788888889}
var pows = []int{0, 9, 99, 999, 9999, 99999, 999999, 9999999, 99999999, 999999999}
// n must be in the range of 1..999999999
func GetBufLen(n, add int) int {
for i := 1; ; i++ {
if n <= pows[i] {
return (n-pows[i-1])*i + sums[i-1] + n*add
}
}
}
Further optimization for small numbers
If the GetBufLen()
is called frequently with small numbers, it is profitable to put some additional code which "hardcodes" simplified handling of these cases. Here it is:
// n must be in the range of 1..999999999
func GetBufLenOpt(n, add int) int {
if n < 10 {
return n + n*add
}
if n < 100 {
return n*2 - 9 + n*add
}
for i := 3; ; i++ {
if n <= pows[i] {
return (n-pows[i-1])*i + sums[i-1] + n*add
}
}
}
Benchmarking, speed analysis
Let's compare the speed of this algorithm with yours. We'll write a short benchmarking code, one for each of the functions (all look like this, except they call the appropriate function):
func BenchmarkLen(b *testing.B) {
for i := 0; i < b.N; i++ {
for j := 1; j < testUpTo; j++ {
GetBufLen(j, 2)
}
}
}
Benchmark results
testUpTo = 1000: ~21% faster
BenchmarkLen 200000 7750 ns/op
BenchmarkLenOpt 200000 6365 ns/op
BenchmarkLen2 200000 8120 ns/op
testUpTo = 100000: ~26% faster
BenchmarkLen 2000 1025102 ns/op
BenchmarkLenOpt 2000 974097 ns/op
BenchmarkLen2 1000 1328132 ns/op
testUpTo = 1000000: ~28% faster
BenchmarkLen 100 11591159 ns/op
BenchmarkLenOpt 100 11691169 ns/op
BenchmarkLen2 100 16331633 ns/op
-
1\$\begingroup\$ Urgh, thanks. I'm not sure how I 1) missed your answer, and 2) came up with such ugly code. Thanks. \$\endgroup\$Eric Lagergren– Eric Lagergren2015年10月12日 19:32:50 +00:00Commented Oct 12, 2015 at 19:32
for
loops. What's wrong withx := 0; for i := 10; i <= n; i*=10 {x++}
? That should give you the samex
value you're calculating. Presumably you could also usex := int(math.Log10(float64(n)))
instead of a loop. \$\endgroup\$i
to 1000, when it really should be100
.i
is used later in the function as well. \$\endgroup\$