I'm trying to convert an array of integers into their binary representations in Python. I know that Python has a built-in function, bin, that does this. NumPy has a similar function: numpy.binary_repr.

The problem is that neither of these is vectorized: each takes a single value at a time. So to convert a whole array of inputs, I have to use a for-loop and call the function once per element, which isn't very efficient.

Is there any way to perform this conversion without for-loops? Are there vectorized forms of these functions? I've tried numpy.apply_along_axis with no luck. I've also tried np.fromiter with map, and that didn't work either.

I know similar questions have been asked a few other times (like here), but none of the answers given are actually vectorized.

Pointing me into any direction would be greatly appreciated!

Thanks =)

rafaelc
asked Jul 23, 2018 at 2:59
  • Just out of curiosity, how big is your data? I just ran a simple [np.binary_repr(z) for z in x] list comprehension on 1MM items and it took 1.41 s ± 182 ms per loop (mean ± std. dev. of 7 runs, 1 loop each). Commented Jul 23, 2018 at 3:20
  • The datasets I'll be using will likely not be that large, usually on the order of 1,000, maybe 10,000 items. But I'm going to have to evaluate an objective function many times for optimization, so I'm trying to shave off processing time wherever I can. Commented Jul 23, 2018 at 3:31
  • How many bits are your integers, and are they signed or unsigned? Commented Jul 23, 2018 at 4:38
  • Unsigned integers - I'll only be working with positive ints. The largest int I'll have to convert is probably 10,000. So probably 16 bits would do just fine. Commented Jul 23, 2018 at 4:41
  • Take a look at the answers to "Convert integer to binary array with suitable padding" Commented Jul 23, 2018 at 4:43
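Given the constraints stated in the comments above (unsigned values up to about 10,000, so 16 bits suffice), one loop-free option is to skip strings and build a matrix of bits with broadcasting. This is only a sketch: the function name to_bit_matrix and its width parameter are hypothetical, and it assumes non-negative inputs that fit in 32 bits with width no larger than 32.

```python
import numpy as np

def to_bit_matrix(values, width=16):
    """Return an (n, width) uint8 array of bits, most significant bit first.

    Hypothetical helper; assumes non-negative inputs and width <= 32.
    """
    values = np.asarray(values, dtype=np.uint32)
    # Shift amounts width-1, ..., 0, so column 0 holds the most significant bit.
    shifts = np.arange(width - 1, -1, -1, dtype=np.uint32)
    # Broadcasting (n, 1) against (width,) yields (n, width) without a Python loop.
    return ((values[:, None] >> shifts) & 1).astype(np.uint8)

x = np.array([0, 1, 5, 10000])
bits = to_bit_matrix(x)
print(bits.shape)
print(bits)
```

Each row holds the bits of one input value. Whether this beats a list comprehension for arrays of 1,000 to 10,000 items would need to be measured on the actual workload.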

2 Answers

The easiest way is to wrap np.binary_repr in np.vectorize, which preserves the original array shape, e.g.:

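import numpy as np

# Wrap the scalar function so it can be applied element-wise to arrays
binary_repr_v = np.vectorize(np.binary_repr)
x = np.arange(-9, 21).reshape(3, 2, 5)
print(x)
print()
# The second argument is the width: 8-character two's-complement strings
print(binary_repr_v(x, 8))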

The output:

[[[-9 -8 -7 -6 -5]
  [-4 -3 -2 -1  0]]

 [[ 1  2  3  4  5]
  [ 6  7  8  9 10]]

 [[11 12 13 14 15]
  [16 17 18 19 20]]]

[[['11110111' '11111000' '11111001' '11111010' '11111011']
  ['11111100' '11111101' '11111110' '11111111' '00000000']]

 [['00000001' '00000010' '00000011' '00000100' '00000101']
  ['00000110' '00000111' '00001000' '00001001' '00001010']]

 [['00001011' '00001100' '00001101' '00001110' '00001111']
  ['00010000' '00010001' '00010010' '00010011' '00010100']]]
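Note that np.vectorize is documented as a convenience wrapper whose implementation is essentially a for-loop, so it keeps the shape but does not remove the per-element Python call. As a hedged alternative for the same 8-bit case, the sketch below produces the same strings with array operations only. It assumes every value fits in a signed 8-bit integer; the intermediate names are illustrative, not part of any library.

```python
import numpy as np

x = np.arange(-9, 21).reshape(3, 2, 5)

# Assumption: every value fits in a signed 8-bit integer (-128..127).
# Reinterpret the two's-complement bytes as unsigned so they can be unpacked.
as_bytes = x.astype(np.int8).view(np.uint8)

# Unpack each byte into 8 bits along a new trailing axis, most significant bit first.
bits = np.unpackbits(as_bytes[..., np.newaxis], axis=-1)

# Map 0/1 to the ASCII codes of "0"/"1", then reinterpret each run of
# 8 bytes as one 8-character byte string and convert to str.
ascii_codes = bits + np.uint8(ord("0"))
strings = ascii_codes.view("S8")[..., 0].astype(str)

print(strings.shape)
print(strings[0, 0])
```

The result has the same shape as x. For wider integers the same idea would need a wider intermediate type and attention to byte order, which this sketch does not cover.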
answered Mar 28, 2019 at 21:08

The quickest way I've found (so far) is to use the pd.Series.apply() function.

Here are the testing results:

# Run in IPython or Jupyter: %timeit is an IPython magic, not plain Python
import pandas as pd
import numpy as np
x = np.random.randint(1,10000000,1000000)
# Fastest method
%timeit pd.Series(x).apply(bin)
# 135 ms ± 539 μs per loop (mean ± std. dev. of 7 runs, 10 loops each)
# rafaelc's method
%timeit [np.binary_repr(z) for z in x]
# 725 ms ± 5.31 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
# aparpara's method
binary_repr_v = np.vectorize(np.binary_repr)
%timeit binary_repr_v(x, 8)
# 7.46 s ± 24.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
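When comparing these timings, keep in mind that the timed calls do not return identically formatted strings: bin adds a "0b" prefix, while np.binary_repr does not and only pads when given a width. A small sketch of the difference, using 10000 as an arbitrary sample value and 16 as an assumed width:

```python
import numpy as np

z = 10000  # arbitrary sample value

with_prefix = bin(z)               # built-in: "0b" prefix, no padding
unpadded = np.binary_repr(z)       # NumPy: no prefix, no padding
padded = np.binary_repr(z, 16)     # NumPy: zero-padded to 16 characters
formatted = format(z, "016b")      # built-in format spec: zero-padded, no prefix

print(with_prefix)
print(unpadded)
print(padded)
print(formatted)
```

If fixed-width output without the prefix is what is needed, passing a function such as lambda v: format(v, "016b") to pd.Series.apply instead of bin would be one way to get it, at some cost in speed that would have to be measured.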
answered May 11, 2021 at 16:41
