2
\$\begingroup\$

I solved LeetCode 137 in C++. TLDR of the problem: array of numbers, nums is given; all numbers appear 3 times, except one which appear only once. Find the number that appears once.

When trying to convert my C++ solution to python, I came across few funny things, i.e. the bit operations seem to behave a bit differently in python, on negative integers.

So, I switched to using formatting as binary string - but, to my surprise, INT_MIN (-2^32 - 1) resulted in a string of length 33, containing the sign as well. Where could I read more about this and understand why it happens?

How can I improve the code below, to make it more pythonic?

def singleNumber(self, nums: List[int]) -> int:
 INT_BASE = 33 # because of INT32_MIN
 # give python what it likes
 counts_nz = [0 for _ in range(INT_BASE)]
 vals_bit = ["0" for _ in range(INT_BASE)]
 for num in nums:
 # 33 because of INT32_MIN takes 33 bits to represent.
 for idx, bin_val in enumerate(f"{num:033b}"):
 if bin_val != "0": # can be "1" or "-"
 counts_nz[idx] += 1
 vals_bit[idx] = bin_val
 # make the bits binary string -- set to value for M3 + 1 and 0 otherwise
 bin_res = "".join(
 [
 vals_bit[idx] if count % 3 == 1 else "0"
 for idx, count in enumerate(counts_nz)
 ]
 )
 return int(bin_res, 2)
Reinderien
70.9k5 gold badges76 silver badges256 bronze badges
asked Jul 4, 2023 at 18:40
\$\endgroup\$
1
  • 1
    \$\begingroup\$ With immutable types: use the * operator instead of list comprehension. [0] * INT_BASE and ['0'] * INT_BASE \$\endgroup\$ Commented Jul 5, 2023 at 8:13

1 Answer 1

2
\$\begingroup\$

INT_MIN

Yeah, python doesn't really expose a "machine integer" concept. Page 5 of the 1974 MACLISP manual explained

it is impossible to get "overflow" in bignum arithmetic ...

Now if C code assigns int16 n = 32767 and then n++ increments, we may get the most negative int rather than 32768. Which is fine by programmers but makes mathematicians a bit cross.

In python the sign bit is entirely separate.

The bit counting trick is nice, I like it. Here is an implementation of that which treats the sign bit seperately. Also, pep-8 asks for a snake_case function name. And List[int] works but in modern interpreters we prefer a lower list[int] annotation.

import random
import unittest
from hypothesis import given
import hypothesis.strategies as st
def single_number(nums: list[int]) -> int:
 res = 0
 for bit in range(32):
 counts = sum(num < 0 for num in nums)
 mask = 1 << bit
 for num in nums:
 if abs(num) & mask:
 counts += 1
 if counts % 3 == 1:
 res |= mask
 BIAS = 2**31
 if res >= BIAS: # caller will need to see a negative result
 res -= (2 * BIAS - 1)
 return res
class TestLeet137(unittest.TestCase):
 def test_leet137(self):
 self.assertEqual(3, single_number([2, 2, 3, 2]))
 self.assertEqual(0, single_number([2, 1, 2, 1, 2, 1, 0]))
 self.assertEqual(-99, single_number([0, 1, 0, 1, 0, 1, -99]))
 def test_negative_binary(self):
 num = -6
 self.assertEqual("-00000000000000000000000000000110", f"{num:033b}")
# These constants are specified by https://leetcode.com/problems/single-number-ii
LO = -(2**31)
HI = 2**31 - 1
@given(st.lists(st.integers(min_value=LO, max_value=HI), min_size=1))
def test_leet(randoms: list[int]):
 target, *distractors = randoms
 arg = [target] + distractors * 3
 assert single_number(arg) == target
 arg.sort()
 assert single_number(arg) == target
 random.shuffle(arg)
 assert single_number(arg) == target

The loop effectively produces a 32-bit machine word, so we clean up at the end in order to properly return a negative result.

Also in the 2nd test I was highlighting the not-very-binary {"-", "0", "1"} bits you were getting from that string conversion.

This might be a good opportunity to import ctypes or fixedint.


Given that we need to apply a bias anyway, it seems more convenient to operate on strictly non-negative numbers:

def single_number(nums: list[int]) -> int:
 BIAS = 2**31
 nums = [num + BIAS for num in nums]
 assert all(num >= 0 for num in nums)
 res = 0
 for bit in range(32):
 counts = 0
 mask = 1 << bit
 for num in nums:
 if abs(num) & mask:
 counts += 1
 if counts % 3 == 1:
 res |= mask
 return res - BIAS

No loop change after the counts assignment -- we're just removing negatives up front and then re-adjusting at end. Well, ok, I suppose that pasted abs() is superfluous and should be deleted.

answered Jul 5, 2023 at 0:35
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.