I have a list of integers, e.g. i=[1,7,3,1,5]
which I first transform to a list of the respective binary representations of length L
, e.g. b=["001","111","011","001","101"]
with L=3
.
Now I want to compute at how many of the L
positions in the binary representation there is a 1
as well as a zero 0
. In my example the result would be return=2
since there is always a 1
in the third (last) position for these entries. I want to compute this inside a function with a numba decorator.
Currently my code is:
@nb.njit
def count_mixed_bits_v2(lst):
andnumber = lst[0] & lst[1]
ornumber = lst[0] | lst[1]
for i in range(1, len(lst)-1):
andnumber = andnumber & lst[i+1]
ornumber = ornumber | lst[i+1]
xornumber = andnumber ^ ornumber
result = 0
while xornumber > 0:
result += xornumber & 1
xornumber = xornumber >> 1
return result
First I take the AND of all numbers, ans also the OR of all numbers, then the XOR of those two results will have a 1 where the condition is fulfilled. In the end I count the number of 1's in the binary representation. My code seems a bit lengthy and I'm wondering if it could be more efficient as well. Thanks for any comment!
Edit: Without the numba decorator the following function works:
def count_mixed_bits(lst):
xor = reduce(and_, lst) ^ reduce(or_, lst)
return bin(xor).count("1")
(Credit to trincot)
-
1\$\begingroup\$ Please do not edit the question after you have received an answer, it is against the rules. \$\endgroup\$pacmaninbw– pacmaninbw ♦2020年11月12日 14:21:43 +00:00Commented Nov 12, 2020 at 14:21
-
2\$\begingroup\$ I've not changed my initial question, but was asked to compare the execution time of the solutions. Why is that a problem? \$\endgroup\$HighwayJohn– HighwayJohn2020年11月12日 14:25:51 +00:00Commented Nov 12, 2020 at 14:25
-
1\$\begingroup\$ @pacmaninbw Against which of those rules exactly? \$\endgroup\$Stefan Pochmann– Stefan Pochmann2020年11月12日 14:43:37 +00:00Commented Nov 12, 2020 at 14:43
-
1\$\begingroup\$ @StefanPochmann The rule is that everyone that sees the question and the answer should see the same question that the person who answered did. \$\endgroup\$pacmaninbw– pacmaninbw ♦2020年11月12日 15:24:04 +00:00Commented Nov 12, 2020 at 15:24
-
1\$\begingroup\$ "Do not change the code in the question after receiving an answer." seems to apply here. Adding the timing code (and generally all the changes, including revision 2) seem to fall under that. \$\endgroup\$Vogel612– Vogel6122020年11月12日 15:41:04 +00:00Commented Nov 12, 2020 at 15:41
2 Answers 2
I don't know numba, but here's a little rewrite:
- Shorter variable names like
and_
, using the underscore as suggested by PEP 8 ("used by convention to avoid conflicts with Python keyword") and as done byoperator.and_
. - Yours crashes if the list has fewer than two elements, I start with neutral values instead.
- Looping over the list elements rather than the indexes.
- Using augmented assignments like
&=
. - In the result loop, drop the last 1-bit so you only have as many iterations as there are 1-bits.
def count_mixed_bits(lst):
and_, or_ = ~0, 0
for x in lst:
and_ &= x
or_ |= x
xor_ = and_ ^ or_
result = 0
while xor_ > 0:
result += 1
xor_ &= xor_ - 1
return result
-
\$\begingroup\$ Oh, this is indeed really fast! \$\endgroup\$HighwayJohn– HighwayJohn2020年11月12日 13:56:37 +00:00Commented Nov 12, 2020 at 13:56
-
\$\begingroup\$ @HighwayJohn What times do you get for the various solutions, and how are you measuring? Can you share your benchmark code? \$\endgroup\$Kelly Bundy– Kelly Bundy2020年11月12日 13:59:48 +00:00Commented Nov 12, 2020 at 13:59
-
\$\begingroup\$ Yes, let me make another edit. I think my current benchmark was flawed, since it compiled the numba function each time. \$\endgroup\$HighwayJohn– HighwayJohn2020年11月12日 14:03:12 +00:00Commented Nov 12, 2020 at 14:03
-
\$\begingroup\$ Unfortunately my benchmark comparison was deleted but your code is the fastest. Thanks a lot \$\endgroup\$HighwayJohn– HighwayJohn2020年11月12日 14:26:15 +00:00Commented Nov 12, 2020 at 14:26
I only see a few micro optimisations:
- Iterate the list instead of a range, so that you don't have to do another lookup with
list[i+1]
- Use more assignment operators, such as
&=
,|=
and>>=
- It is not needed to use
lst[1]
inandnumber = lst[0] & lst[1]
. It can be justandnumber = lst[0]
So:
def count_mixed_bits_v2(lst):
andnumber = ornumber = lst[0]
for value in lst:
andnumber &= value
ornumber |= value
xornumber = andnumber ^ ornumber
result = 0
while xornumber > 0:
result += xornumber & 1
xornumber >>= 1
return result
This visits the first list value again (in the first iteration), even though it is not necessary. But that probably does not really hurt performance, and keeps the code simple.
-
2\$\begingroup\$ I feel like
functools.reduce
might be more suited to the operations. \$\endgroup\$hjpotter92– hjpotter922020年11月12日 13:05:19 +00:00Commented Nov 12, 2020 at 13:05 -
1\$\begingroup\$ Yes, I had the same reaction before. The OP asked a previous question on Stack Overflow where I answered like that, but apparently, that is not supported by numba, which only supports a subset of Python. \$\endgroup\$trincot– trincot2020年11月12日 13:10:17 +00:00Commented Nov 12, 2020 at 13:10
-
\$\begingroup\$ On numba.pydata.org/numba-doc/dev/reference/pysupported.html it says "The functools.reduce() function is supported but the initializer argument is required." But I could not get it to work. \$\endgroup\$HighwayJohn– HighwayJohn2020年11月12日 13:26:15 +00:00Commented Nov 12, 2020 at 13:26
-
\$\begingroup\$ So what did your
reduce
call look like? It should be likereduce(and_, lst, lst[0])
then \$\endgroup\$trincot– trincot2020年11月12日 13:28:27 +00:00Commented Nov 12, 2020 at 13:28 -
\$\begingroup\$ @trincot Yeah, you are right. \$\endgroup\$GZ0– GZ02020年11月12日 13:53:02 +00:00Commented Nov 12, 2020 at 13:53