I'm trying to convert an array of integers into their binary representations in Python. I know that native Python has a function called bin that does this. NumPy also has a similar function: numpy.binary_repr.
The problem is that neither of these is vectorized: each takes only a single value at a time. So, to convert a whole array of inputs, I have to use a for-loop and call the function once per element, which isn't very efficient.
Is there any way to perform this conversion without for-loops? Are there any vectorized forms of these functions? I've tried numpy.apply_along_axis but no luck. I've also tried using np.fromiter and map and it was also a no go.
I know similar questions have been asked a few other times (like here), but none of the answers given are actually vectorized.
Pointing me into any direction would be greatly appreciated!
Thanks =)
2 Answers
The easiest way is to wrap binary_repr with np.vectorize. This preserves the original array shape, e.g.:
import numpy as np
binary_repr_v = np.vectorize(np.binary_repr)
x = np.arange(-9, 21).reshape(3, 2, 5)
print(x)
print()
print(binary_repr_v(x, 8))
The output:
[[[-9 -8 -7 -6 -5]
  [-4 -3 -2 -1  0]]

 [[ 1  2  3  4  5]
  [ 6  7  8  9 10]]

 [[11 12 13 14 15]
  [16 17 18 19 20]]]

[[['11110111' '11111000' '11111001' '11111010' '11111011']
  ['11111100' '11111101' '11111110' '11111111' '00000000']]

 [['00000001' '00000010' '00000011' '00000100' '00000101']
  ['00000110' '00000111' '00001000' '00001001' '00001010']]

 [['00001011' '00001100' '00001101' '00001110' '00001111']
  ['00010000' '00010001' '00010010' '00010011' '00010100']]]
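Keep in mind that np.vectorize is a convenience wrapper: it still calls the Python function once per element, so it is not vectorized in the performance sense. If an array of 0/1 bits is acceptable instead of strings, the per-element calls can be avoided with shifts and broadcasting. The following is a minimal sketch, not a drop-in replacement for binary_repr; the helper name int_to_bits is made up for illustration, and it assumes an integer dtype and a fixed width (bits above the width are simply dropped).

```python
import numpy as np

def int_to_bits(values, width):
    """Return an array of shape values.shape + (width,) holding 0/1 bits.

    Hypothetical helper: most significant bit first, two's complement
    for negative values, high bits beyond `width` are discarded.
    """
    values = np.asarray(values)
    # Shift amounts width-1 ... 0, in the same dtype as the input so the
    # shift does not trigger an unexpected type promotion.
    shifts = np.arange(width - 1, -1, -1, dtype=values.dtype)
    # Broadcasting adds a trailing axis: one column per bit position.
    return ((values[..., None] >> shifts) & 1).astype(np.uint8)

x = np.array([[5, -9], [0, 20]])
bits = int_to_bits(x, 8)
print(bits.shape)
print(bits[0, 1])  # bits of -9
```

The original shape is kept and a trailing axis of length width is added, so bits[i, j] holds the bits of x[i, j].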
The quickest way I've found (so far) is to use the pd.Series.apply() method.
Here are the timing results:
import pandas as pd
import numpy as np
x = np.random.randint(1,10000000,1000000)
# Fastest method
%timeit pd.Series(x).apply(bin)
# 135 ms ± 539 μs per loop (mean ± std. dev. of 7 runs, 10 loops each)
# rafaelc's method
%timeit [np.binary_repr(z) for z in x]
# 725 ms ± 5.31 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
# aparpara's method
binary_repr_v = np.vectorize(np.binary_repr)
%timeit binary_repr_v(x, 8)
# 7.46 s ± 24.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
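All three methods above still call a Python function once per element. For comparison, here is a rough sketch that builds fixed-width binary strings with array operations only. It is an assumption-laden alternative, not one of the timed methods: the name to_binary_strings is hypothetical, the input must have an integer dtype, and unlike np.binary_repr every value is cut or padded to exactly width characters. No timing is claimed for it; measure it on your own data.

```python
import numpy as np

def to_binary_strings(values, width):
    """Hypothetical helper: fixed-width binary strings, same shape as input."""
    values = np.asarray(values)
    if not np.issubdtype(values.dtype, np.integer):
        raise TypeError("expected an array with an integer dtype")
    # One shift per bit position, most significant bit first.
    shifts = np.arange(width - 1, -1, -1, dtype=values.dtype)
    bits = (values[..., None] >> shifts) & 1
    # Turn 0/1 into the ASCII codes of '0'/'1', one byte per character.
    chars = np.ascontiguousarray((bits + ord("0")).astype(np.uint8))
    # Reinterpret each run of `width` bytes as one byte string.
    packed = chars.view(f"S{width}")[..., 0]
    # Decode to ordinary str elements (dtype '<U{width}').
    return packed.astype(str)

x = np.arange(-9, 21).reshape(3, 2, 5)
result = to_binary_strings(x, 8)
print(result[0, 0])
```

For inputs that fit in the chosen width, this should agree with np.vectorize(np.binary_repr)(x, width), including two's complement output for negative values.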
Comments
The [np.binary_repr(z) for z in x] list comprehension for 1MM items took 1.41 s ± 182 ms per loop (mean ± std. dev. of 7 runs, 1 loop each).