4
\$\begingroup\$

I have a list of integers, e.g. i=[1,7,3,1,5] which I first transform to a list of the respective binary representations of length L, e.g. b=["001","111","011","001","101"] with L=3.

Now I want to compute at how many of the L positions in the binary representation there is a 1 as well as a zero 0. In my example the result would be return=2 since there is always a 1 in the third (last) position for these entries. I want to compute this inside a function with a numba decorator. Currently my code is:

@nb.njit
def count_mixed_bits_v2(lst):
 andnumber = lst[0] & lst[1]
 ornumber = lst[0] | lst[1]
 for i in range(1, len(lst)-1):
 andnumber = andnumber & lst[i+1]
 ornumber = ornumber | lst[i+1]
 xornumber = andnumber ^ ornumber
 result = 0
 while xornumber > 0:
 result += xornumber & 1
 xornumber = xornumber >> 1
 return result

First I take the AND of all numbers, ans also the OR of all numbers, then the XOR of those two results will have a 1 where the condition is fulfilled. In the end I count the number of 1's in the binary representation. My code seems a bit lengthy and I'm wondering if it could be more efficient as well. Thanks for any comment!

Edit: Without the numba decorator the following function works:

def count_mixed_bits(lst):
 xor = reduce(and_, lst) ^ reduce(or_, lst)
 return bin(xor).count("1")

(Credit to trincot)

pacmaninbw
26.2k13 gold badges47 silver badges113 bronze badges
asked Nov 12, 2020 at 12:08
\$\endgroup\$
7
  • 1
    \$\begingroup\$ Please do not edit the question after you have received an answer, it is against the rules. \$\endgroup\$ Commented Nov 12, 2020 at 14:21
  • 2
    \$\begingroup\$ I've not changed my initial question, but was asked to compare the execution time of the solutions. Why is that a problem? \$\endgroup\$ Commented Nov 12, 2020 at 14:25
  • 1
    \$\begingroup\$ @pacmaninbw Against which of those rules exactly? \$\endgroup\$ Commented Nov 12, 2020 at 14:43
  • 1
    \$\begingroup\$ @StefanPochmann The rule is that everyone that sees the question and the answer should see the same question that the person who answered did. \$\endgroup\$ Commented Nov 12, 2020 at 15:24
  • 1
    \$\begingroup\$ "Do not change the code in the question after receiving an answer." seems to apply here. Adding the timing code (and generally all the changes, including revision 2) seem to fall under that. \$\endgroup\$ Commented Nov 12, 2020 at 15:41

2 Answers 2

4
\$\begingroup\$

I don't know numba, but here's a little rewrite:

  • Shorter variable names like and_, using the underscore as suggested by PEP 8 ("used by convention to avoid conflicts with Python keyword") and as done by operator.and_.
  • Yours crashes if the list has fewer than two elements, I start with neutral values instead.
  • Looping over the list elements rather than the indexes.
  • Using augmented assignments like &=.
  • In the result loop, drop the last 1-bit so you only have as many iterations as there are 1-bits.
def count_mixed_bits(lst):
 and_, or_ = ~0, 0
 for x in lst:
 and_ &= x
 or_ |= x
 xor_ = and_ ^ or_
 result = 0
 while xor_ > 0:
 result += 1
 xor_ &= xor_ - 1
 return result
answered Nov 12, 2020 at 13:52
\$\endgroup\$
4
  • \$\begingroup\$ Oh, this is indeed really fast! \$\endgroup\$ Commented Nov 12, 2020 at 13:56
  • \$\begingroup\$ @HighwayJohn What times do you get for the various solutions, and how are you measuring? Can you share your benchmark code? \$\endgroup\$ Commented Nov 12, 2020 at 13:59
  • \$\begingroup\$ Yes, let me make another edit. I think my current benchmark was flawed, since it compiled the numba function each time. \$\endgroup\$ Commented Nov 12, 2020 at 14:03
  • \$\begingroup\$ Unfortunately my benchmark comparison was deleted but your code is the fastest. Thanks a lot \$\endgroup\$ Commented Nov 12, 2020 at 14:26
4
\$\begingroup\$

I only see a few micro optimisations:

  • Iterate the list instead of a range, so that you don't have to do another lookup with list[i+1]
  • Use more assignment operators, such as &=, |= and >>=
  • It is not needed to use lst[1] in andnumber = lst[0] & lst[1]. It can be just andnumber = lst[0]

So:

def count_mixed_bits_v2(lst):
 andnumber = ornumber = lst[0]
 for value in lst:
 andnumber &= value
 ornumber |= value
 xornumber = andnumber ^ ornumber
 result = 0
 while xornumber > 0:
 result += xornumber & 1
 xornumber >>= 1
 return result

This visits the first list value again (in the first iteration), even though it is not necessary. But that probably does not really hurt performance, and keeps the code simple.

answered Nov 12, 2020 at 13:01
\$\endgroup\$
5
  • 2
    \$\begingroup\$ I feel like functools.reduce might be more suited to the operations. \$\endgroup\$ Commented Nov 12, 2020 at 13:05
  • 1
    \$\begingroup\$ Yes, I had the same reaction before. The OP asked a previous question on Stack Overflow where I answered like that, but apparently, that is not supported by numba, which only supports a subset of Python. \$\endgroup\$ Commented Nov 12, 2020 at 13:10
  • \$\begingroup\$ On numba.pydata.org/numba-doc/dev/reference/pysupported.html it says "The functools.reduce() function is supported but the initializer argument is required." But I could not get it to work. \$\endgroup\$ Commented Nov 12, 2020 at 13:26
  • \$\begingroup\$ So what did your reduce call look like? It should be like reduce(and_, lst, lst[0]) then \$\endgroup\$ Commented Nov 12, 2020 at 13:28
  • \$\begingroup\$ @trincot Yeah, you are right. \$\endgroup\$ Commented Nov 12, 2020 at 13:53

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.