I have an algorithm that performs the base conversion. The input is a string and two ints b1
and b2
. The string represents an int in base b1
and converts it into base in b2
.
My code is:
def convert_base(num_as_string, b1, b2):
if num_as_string == '0' or num_as_string == '-0':
return num_as_string
base10, degree = 0, 0
is_neg = False
if num_as_string[0] == '-':
is_neg, num_as_string = True, num_as_string[1:]
if b1 == 10:
base10 = int(num_as_string)
else:
for digit in num_as_string[::-1]:
base10 += string.hexdigits.index(digit.lower()) * (b1 ** degree)
degree += 1
if b2 == 10:
return '-' + str(base10) if is_neg else str(base10)
converted = []
while base10 > 0:
digit = base10 % b2
converted.append(digit)
base10 //= b2
res = ''
for i in converted[::-1]:
res += string.hexdigits[i].upper()
return '-' + res if is_neg else res
Now I am trying to see if this can be improved in terms of time and space complexity. But I am not sure how to analyze the complexity of this code.
I know everything is constant before for digit in num_as_string[::-1]:
. In this loop, it's just \$O(n)\$ where \$n\$ is just number of digits of the input.
Then in while base10 > 0
, it runs the look while base10
becomes 0. So, this would be something like O(number of digits in base10)
Finally, on for i in converted[::-1]
, this would also be O(number of digits in base10)
.
So, I am assuming the time complexity of this code is something like O(n) + 2 * O(number of digits in base10)
which is linear.
For space, I am assuming it's O(number of digits in base10)
because I store it in converted
list.
Are my observations correct? If not, what am I missing? Also, can this code be improved in terms of time and space complexity?
2 Answers 2
...
for digit in num_as_string[::-1]:
. In this loop, it's just \$O(n)\$ where n is just number of digits of the input.I am assuming the time complexity of this code is something like O(n) + 2 * O(number of digits in base10) which is linear.
This is not quite right. The second and third loops will loop the number of digits in base \$b_2\$ (not in base 10), which is approximately \$n * \frac {\log b_1}{\log b_2}\$ times, so your time complexity would be:
$$O(n) + 2 * \frac{\log b_1}{\log b_2} * O(n)$$
which is of course is still simply \$O(n)\$.
This also means your space complexity is not "O(number of digits in base10)"; it is O(number digits in \$b_2\$), but again, these are constant factors, and becomes simply \$O(n)\$.
Still, it is unusual to express it the complexity in terms of the number of digits of the input. Usually, you have an input value N, (which can be expressed in \$\log_{b_1}N\$ digits), and would express the complexity of the algorithm as \$O(\log N)\$.
Except ...
res = ''
for i in converted[::-1]:
res += string.hexdigits[i].upper()
Which actually makes this an \$O(n^2)\$ algorithm, since while you are looping, you are copying all previous digits to add one character. Convert all the digits into the appropriate character, and then join them all together at once:
res = ''.join(string.hexdigits[digit] for digit in converted[::-1]).upper()
Using % b2
and //= b2
back-to-back is generally inefficient. When the math library computes one, it almost always has computed the other as a side effect. The divmod()
function returns both values:
while base10 > 0:
base10, digit = divmod(base10, b2)
converted.append(digit)
Practice for a coding interview? You'd better clean up this code considerably. In addition to @Reinderien's suggestions, look at your two return
statements
return '-' + str(base10) if is_neg else str(base10)
return '-' + res if is_neg else res
These look exactly the same, if res = str(base10)
. Try to rework your code to handle the is_neg
test only once, and only use 1 return
statement.
-
1\$\begingroup\$ whenever I have timed it, separate
%=
and//=
statements have been faster than adivmod
call. I suspect it is because the interpreter has op codes for the former and needs to do a function call for the later. However, thedivmod
call is usually clearer, particularly with well named variables. \$\endgroup\$RootTwo– RootTwo2019年08月27日 02:12:35 +00:00Commented Aug 27, 2019 at 2:12 -
\$\begingroup\$ @RootTwo Interesting. My timing experiments seem to agree ... as long as you operate on integers,
divmod
takes about 50% longer. With either argument as a float,divmod
appears to be slightly faster. \$\endgroup\$AJNeufeld– AJNeufeld2019年08月27日 20:03:34 +00:00Commented Aug 27, 2019 at 20:03
Much of the following feedback is not really performance-related, but will make the code more easily legible and easily analyzed for the purposes of performance improvement.
Add type hints
b1
and b2
are presumably int
, so add : int
after their declaration. num_as_string
and the function's return value are both str
.
Multiple assignment
...has its limited applications, and in this case:
is_neg, num_as_string = True, num_as_string[1:]
there's not really an advantage to combining these two statements. Just do them separately, for legibility.
Exponentiation
You do this:
base10 += string.hexdigits.index(digit.lower()) * (b1 ** degree)
degree += 1
But rather than maintaining degree
as a number that increases linearly, it's probably a better idea to maintain a number that increases by a multiplicative factor of b1
on every iteration.
Separation of concerns
Fundamentally, convert_base
is doing two different things: parsing a string given a particular base into a number, and formatting a number given a particular base into a string. So just make two functions. It's more useful, testable, legible and maintainable.
-
\$\begingroup\$ Thanks for the detailed answer. Do you think my analysis of the time and space complexity looks correct? \$\endgroup\$Dawn17– Dawn172019年08月26日 21:35:51 +00:00Commented Aug 26, 2019 at 21:35
-
\$\begingroup\$ Have a read through wiki.python.org/moin/TimeComplexity .
num_as_string[1:]
is not constant time, strictly-speaking; it's an O(k) "get slice". \$\endgroup\$Reinderien– Reinderien2019年08月28日 02:37:13 +00:00Commented Aug 28, 2019 at 2:37 -
\$\begingroup\$ Otherwise, refer to the other answer. \$\endgroup\$Reinderien– Reinderien2019年08月28日 02:39:48 +00:00Commented Aug 28, 2019 at 2:39
b2
is, say, 20? \$\endgroup\$b1
andb2
is between 2 and 16 \$\endgroup\$