Time and space complexity of base conversion code

Question 1

I have an algorithm that performs the base conversion. The input is a string and two ints b1 and b2. The string represents an int in base b1 and converts it into base in b2.

My code is:

def convert_base(num_as_string, b1, b2):
 if num_as_string == '0' or num_as_string == '-0':
 return num_as_string
 base10, degree = 0, 0
 is_neg = False
 if num_as_string[0] == '-':
 is_neg, num_as_string = True, num_as_string[1:]
 if b1 == 10:
 base10 = int(num_as_string)
 else:
 for digit in num_as_string[::-1]:
 base10 += string.hexdigits.index(digit.lower()) * (b1 ** degree)
 degree += 1
 if b2 == 10:
 return '-' + str(base10) if is_neg else str(base10)
 converted = []
 while base10 > 0:
 digit = base10 % b2
 converted.append(digit)
 base10 //= b2
 res = ''
 for i in converted[::-1]:
 res += string.hexdigits[i].upper()
 return '-' + res if is_neg else res

Now I am trying to see if this can be improved in terms of time and space complexity. But I am not sure how to analyze the complexity of this code.

I know everything is constant before for digit in num_as_string[::-1]:. In this loop, it's just \$O(n)\$ where \$n\$ is just number of digits of the input.

Then in while base10 > 0, it runs the look while base10 becomes 0. So, this would be something like O(number of digits in base10)

Finally, on for i in converted[::-1], this would also be O(number of digits in base10).

So, I am assuming the time complexity of this code is something like O(n) + 2 * O(number of digits in base10) which is linear.

For space, I am assuming it's O(number of digits in base10) because I store it in converted list.

Are my observations correct? If not, what am I missing? Also, can this code be improved in terms of time and space complexity?

Question 2

Just a quick question to check your goal here. Do you only want suggestions for your algorithm, or do you also want general programming advice (e.g. suggestions to use python's standard library for some of the base conversions)?

Question 3

What happens if b2 is, say, 20?

Question 4

@IvoMerchiers Just for my algorithm. This is a practice for a coding interview. @Mathias Ettinger Sorry. Forgot to mention that b1 and b2 is between 2 and 16

Question 5

What if num_as_string starts with a '+' sign?

Question 6

... for digit in num_as_string[::-1]:. In this loop, it's just \$O(n)\$ where n is just number of digits of the input.

I am assuming the time complexity of this code is something like O(n) + 2 * O(number of digits in base10) which is linear.

This is not quite right. The second and third loops will loop the number of digits in base \$b_2\$ (not in base 10), which is approximately \$n * \frac {\log b_1}{\log b_2}\$ times, so your time complexity would be:

$$O(n) + 2 * \frac{\log b_1}{\log b_2} * O(n)$$

which is of course is still simply \$O(n)\$.

This also means your space complexity is not "O(number of digits in base10)"; it is O(number digits in \$b_2\$), but again, these are constant factors, and becomes simply \$O(n)\$.

Still, it is unusual to express it the complexity in terms of the number of digits of the input. Usually, you have an input value N, (which can be expressed in \$\log_{b_1}N\$ digits), and would express the complexity of the algorithm as \$O(\log N)\$.

Except ...

res = ''
for i in converted[::-1]:
 res += string.hexdigits[i].upper()

Which actually makes this an \$O(n^2)\$ algorithm, since while you are looping, you are copying all previous digits to add one character. Convert all the digits into the appropriate character, and then join them all together at once:

res = ''.join(string.hexdigits[digit] for digit in converted[::-1]).upper()

Using % b2 and //= b2 back-to-back is generally inefficient. When the math library computes one, it almost always has computed the other as a side effect. The divmod() function returns both values:

while base10 > 0:
 base10, digit = divmod(base10, b2)
 converted.append(digit)

Practice for a coding interview? You'd better clean up this code considerably. In addition to @Reinderien's suggestions, look at your two return statements

return '-' + str(base10) if is_neg else str(base10)
return '-' + res if is_neg else res

These look exactly the same, if res = str(base10). Try to rework your code to handle the is_neg test only once, and only use 1 return statement.

Question 7

whenever I have timed it, separate %= and //= statements have been faster than a divmod call. I suspect it is because the interpreter has op codes for the former and needs to do a function call for the later. However, the divmod call is usually clearer, particularly with well named variables.

Question 8

@RootTwo Interesting. My timing experiments seem to agree ... as long as you operate on integers, divmod takes about 50% longer. With either argument as a float, divmod appears to be slightly faster.

Question 9

Much of the following feedback is not really performance-related, but will make the code more easily legible and easily analyzed for the purposes of performance improvement.

Add type hints

b1 and b2 are presumably int, so add : int after their declaration. num_as_string and the function's return value are both str.

Multiple assignment

...has its limited applications, and in this case:

is_neg, num_as_string = True, num_as_string[1:]

there's not really an advantage to combining these two statements. Just do them separately, for legibility.

Exponentiation

You do this:

 base10 += string.hexdigits.index(digit.lower()) * (b1 ** degree)
 degree += 1

But rather than maintaining degree as a number that increases linearly, it's probably a better idea to maintain a number that increases by a multiplicative factor of b1 on every iteration.

Separation of concerns

Fundamentally, convert_base is doing two different things: parsing a string given a particular base into a number, and formatting a number given a particular base into a string. So just make two functions. It's more useful, testable, legible and maintainable.

Question 10

Thanks for the detailed answer. Do you think my analysis of the time and space complexity looks correct?

Question 11

Have a read through wiki.python.org/moin/TimeComplexity . num_as_string[1:] is not constant time, strictly-speaking; it's an O(k) "get slice".

Question 12

Otherwise, refer to the other answer.

AJNeufeld 35.3k5 gold badges41 silver badges103 bronze badges · Accepted Answer · 2019-08-26 23:17:36Z

... for digit in num_as_string[::-1]:. In this loop, it's just \$O(n)\$ where n is just number of digits of the input.

I am assuming the time complexity of this code is something like O(n) + 2 * O(number of digits in base10) which is linear.

This is not quite right. The second and third loops will loop the number of digits in base \$b_2\$ (not in base 10), which is approximately \$n * \frac {\log b_1}{\log b_2}\$ times, so your time complexity would be:

$$O(n) + 2 * \frac{\log b_1}{\log b_2} * O(n)$$

which is of course is still simply \$O(n)\$.

This also means your space complexity is not "O(number of digits in base10)"; it is O(number digits in \$b_2\$), but again, these are constant factors, and becomes simply \$O(n)\$.

Still, it is unusual to express it the complexity in terms of the number of digits of the input. Usually, you have an input value N, (which can be expressed in \$\log_{b_1}N\$ digits), and would express the complexity of the algorithm as \$O(\log N)\$.

Except ...

res = ''
for i in converted[::-1]:
 res += string.hexdigits[i].upper()

Which actually makes this an \$O(n^2)\$ algorithm, since while you are looping, you are copying all previous digits to add one character. Convert all the digits into the appropriate character, and then join them all together at once:

res = ''.join(string.hexdigits[digit] for digit in converted[::-1]).upper()

Using % b2 and //= b2 back-to-back is generally inefficient. When the math library computes one, it almost always has computed the other as a side effect. The divmod() function returns both values:

while base10 > 0:
 base10, digit = divmod(base10, b2)
 converted.append(digit)

Practice for a coding interview? You'd better clean up this code considerably. In addition to @Reinderien's suggestions, look at your two return statements

return '-' + str(base10) if is_neg else str(base10)
return '-' + res if is_neg else res

These look exactly the same, if res = str(base10). Try to rework your code to handle the is_neg test only once, and only use 1 return statement.

whenever I have timed it, separate %= and //= statements have been faster than a divmod call. I suspect it is because the interpreter has op codes for the former and needs to do a function call for the later. However, the divmod call is usually clearer, particularly with well named variables.
@RootTwo Interesting. My timing experiments seem to agree ... as long as you operate on integers, divmod takes about 50% longer. With either argument as a float, divmod appears to be slightly faster.

Stack Exchange Network

Time and space complexity of base conversion code

2 Answers 2

Except ...

Add type hints

Multiple assignment

Exponentiation

Separation of concerns

You must log in to answer this question.

Hot Network Questions

Time and space complexity of base conversion code

2 Answers 2

Except ...

Add type hints

Multiple assignment

Exponentiation

Separation of concerns

You must log in to answer this question.

Related

Hot Network Questions