Return the number of bits (N) that will need to be changed in order to convert an integer(X), into another integer(Y).
Constraint: 0 <= X <= Y <= 2**128
Example:
X = 1, Y = 2 => N = 2
X = 2, Y = 3 => N = 1
X = 0, Y = 4 => N = 1
I have written following code, how to improve it?
def bit_diff(n1,n2):
cmn_d = n1 & n2
all_d = n1 | n2
diff = all_d - cmn_d
cnt = 0
while diff:
if diff & 1:
cnt+=1
diff>>=1
return cnt
print bit_diff(1,2)
3 Answers 3
cmn_d = n1 & n2 all_d = n1 | n2 diff = all_d - cmn_d
There is an operator for that:
diff = n1 ^ n2
Having found the bits that differ, you want to count the set bits of diff
, also known as the population count or the Hamming weight. The approach you use is simple and valid, but there are higher performing approaches which parallelise it (see here and following sections).
However, the standard approaches for the fastest parallel implementations without direct hardware support are based on fixed-width integers, and they typically go to 32 bits or 64 bits; since your specification says that you want to support 128-bit integers, you'll have to extrapolate, although it's not that hard.
Style-wise, you could work on your variable naming (realizing that naming is one of the two hardest things in programming).
cmn_d
could be common_digits
, all_d
all_digits
and cnt
count
. This would make it slightly easier to read
Python also has an official style-guide, PEP8, which recommends using spaces around operators, so you would write diff >>= 1
and count += 1
.
Performance wise, your code is already quite fast (not that surprising, considering that it does just bit-shifts), so I don't think a different approach is needed here, although the improvement in @PeterTaylors answer is slightly faster:
+----+----+--------+--------------+
| n1 | n2 | Harsha | Peter Taylor |
+----+----+--------+--------------+
| 1 | 2 | 322 ns | 276 ns |
| 2 | 3 | 255 ns | 202 ns |
| 0 | 4 | 389 ns | 326 ns |
+----+----+--------+--------------+
So, unless you need to run this more than a few million times per second on small cases, your implementation should be fine (with the small modification of using n1 ^ n2
).
-
\$\begingroup\$ There is nowhere near enough context in the question to say whether it's worth trying to improve performance or not. The first rule of optimisation says that it isn't, but if it does turn out to be the major bottleneck then a) the parallelisation approach should run about 16 times faster in the average case assuming uniformity; b) the benchmarks should use at least one case where
n1 ^ n2
is 128 bits long, to cover that average case. \$\endgroup\$Peter Taylor– Peter Taylor2017年04月18日 15:22:41 +00:00Commented Apr 18, 2017 at 15:22 -
\$\begingroup\$ @PeterTaylor True, in general. However, anything that runs under a micro-second is probably going to be fine (unless it is a really tight loop). But to know that it is, you need to profile it (which is usually the third rule of optimization, always profile before considering optimizations). So my post is basically saying exactly what you are saying: probably there are no optimizations needed, but I added the information that now you can decide if it is needed (do you need to perform this calculation more than a million times a second on simple cases?). \$\endgroup\$Graipher– Graipher2017年04月18日 15:28:24 +00:00Commented Apr 18, 2017 at 15:28
-
\$\begingroup\$ @PeterTaylor Re the benchmark cases, I just used the examples given by the OP, but sure, if you have a better benchmark case, I'd be happy to include it... \$\endgroup\$Graipher– Graipher2017年04月18日 15:28:37 +00:00Commented Apr 18, 2017 at 15:28
Python is slow compared to C, but it'll work:
def bit_diff(n1,n2):
i = n1 ^ n2
c = 0
while i:
i &= i - 1
c += 1
return c
print bit_diff(1,2)
This is faster:
def bit_diff(n1,n2):
i = n1 ^ n2
c = bin(i).count("1")
return c
print (bit_diff(34247483290175849302758**6500,43287543289074382905743**6500))
2506164
[Finished in 1.0s]
You can also use lookup tables if you need to do this faster, Google the terms "Hamming weight" and "popcount". SSE4 includes an instruction POPCNT, if you use C it will be very fast.
Explore related questions
See similar questions with these tags.