Skip to main content
Code Review

Return to Answer

bytearray, not the read-only bytes object!
Source Link
AJNeufeld
  • 35.2k
  • 5
  • 41
  • 103

Finally, bit manipulation will always incur an overhead over direct indexing. Using a bytesbytearray(26) object to hold the twenty-six flags is likely faster than using a bitarray. It is no longer meeting your implied goal of storing the flags inside a single integer. It requires perhaps 22 additional bytes of memory, but does not require installation of an external package.

def is_unique(string: str) -> bool:
 """Determine if a string contains unique lowercase letters
 Returns `True` if all lowercase letters are unique, `False` otherwise.
 Calling the function with `string` containing anything other than
 lowercase letters results in undefined behaviour.
 """
 letter_flags = bytesbytearray(26)
 first = ord('a')
 for ch in string:
 flag = ord(ch) - first
 if letter_flags[flag]:
 return False
 letter_flags[flag] = 1
 return True

Finally, bit manipulation will always incur an overhead over direct indexing. Using a bytes(26) object to hold the twenty-six flags is likely faster than using a bitarray. It is no longer meeting your implied goal of storing the flags inside a single integer. It requires perhaps 22 additional bytes of memory, but does not require installation of an external package.

def is_unique(string: str) -> bool:
 """Determine if a string contains unique lowercase letters
 Returns `True` if all lowercase letters are unique, `False` otherwise.
 Calling the function with `string` containing anything other than
 lowercase letters results in undefined behaviour.
 """
 letter_flags = bytes(26)
 first = ord('a')
 for ch in string:
 flag = ord(ch) - first
 if letter_flags[flag]:
 return False
 letter_flags[flag] = 1
 return True

Finally, bit manipulation will always incur an overhead over direct indexing. Using a bytearray(26) object to hold the twenty-six flags is likely faster than using a bitarray. It is no longer meeting your implied goal of storing the flags inside a single integer. It requires perhaps 22 additional bytes of memory, but does not require installation of an external package.

def is_unique(string: str) -> bool:
 """Determine if a string contains unique lowercase letters
 Returns `True` if all lowercase letters are unique, `False` otherwise.
 Calling the function with `string` containing anything other than
 lowercase letters results in undefined behaviour.
 """
 letter_flags = bytearray(26)
 first = ord('a')
 for ch in string:
 flag = ord(ch) - first
 if letter_flags[flag]:
 return False
 letter_flags[flag] = 1
 return True
Add explanation why bit manipulation is slow in Python; add bitarray & bytes solutions.
Source Link
AJNeufeld
  • 35.2k
  • 5
  • 41
  • 103

You are tryingtrying to use prime numbers to store flags in ana single integer, statingto indicate whether or not a lowercase letter has been seen. Using bits to store these flags in a single integer is much simpler.

def is_unique(string: str) -> bool:
 """Determine if a string contains unique lowercase letters
 Returns `True` if all lowercase letters are unique, `False` otherwise.
 Calling the function with `string` containing anything other than
 lowercase letters results in undefined behaviour.
 """
 letter_flags = 0
 first = ord('a')
 for ch in string:
 flag = 1 << (ord(ch) - first)
 if letter_flags & flag:
 return False
 letter_flags |= flag
 return True

Since larger integers in Python are stored as objects on the heap, and are immutable, bit manipulation requires creating a new object when the bits of the integer are changed. As such, bit manipulation in Python is not as fast as in languages like C, C++, or Java.

There is a bitarray package which can be installed (pip install bitarray) which may be used to create mutable bit arrays. Using a bitarray instead of an integer will be much faster, yet still keeps the memory footprint of the application near its absolute minimum. Since a bitarray can be thought of as the bits of an integer, this can still be thought of as storing your "seen flags" in a single integer.

from bitarray import bitarray
def is_unique(string: str) -> bool:
 """Determine if a string contains unique lowercase letters
 Returns `True` if all lowercase letters are unique, `False` otherwise.
 Calling the function with `string` containing anything other than
 lowercase letters results in undefined behaviour.
 """
 letter_flags = bitarray(26)
 letter_flags.setall(False)
 first = ord('a')
 for ch in string:
  flag = ord(ch) - first
 if letter_flags[flag]:
 return False
 letter_flags[flag] = True
 return True

Finally, bit manipulation will always incur an overhead over direct indexing. Using a bytes(26) object to hold the twenty-six flags is likely faster than using a bitarray. It is no longer meeting your implied goal of storing the flags inside a single integer. It requires perhaps 22 additional bytes of memory, but does not require installation of an external package.

def is_unique(string: str) -> bool:
 """Determine if a string contains unique lowercase letters
 Returns `True` if all lowercase letters are unique, `False` otherwise.
 Calling the function with `string` containing anything other than
 lowercase letters results in undefined behaviour.
 """
 letter_flags = bytes(26)
 first = ord('a')
 for ch in string:
 flag = ord(ch) - first
 if letter_flags[flag]:
 return False
 letter_flags[flag] = 1

 return True

You are trying to use prime numbers to store flags in an integer, stating whether or not a lowercase letter has been seen. Using bits is much simpler.

def is_unique(string: str) -> bool:
 """Determine if a string contains unique lowercase letters
 Returns `True` if all lowercase letters are unique, `False` otherwise.
 Calling the function with `string` containing anything other than
 lowercase letters results in undefined behaviour.
 """
 letter_flags = 0
 first = ord('a')
 for ch in string:
 flag = 1 << (ord(ch) - first)
 if letter_flags & flag:
 return False
 letter_flags |= flag
 return True

You are trying to use prime numbers to store flags in a single integer, to indicate whether or not a lowercase letter has been seen. Using bits to store these flags in a single integer is much simpler.

def is_unique(string: str) -> bool:
 """Determine if a string contains unique lowercase letters
 Returns `True` if all lowercase letters are unique, `False` otherwise.
 Calling the function with `string` containing anything other than
 lowercase letters results in undefined behaviour.
 """
 letter_flags = 0
 first = ord('a')
 for ch in string:
 flag = 1 << (ord(ch) - first)
 if letter_flags & flag:
 return False
 letter_flags |= flag
 return True

Since larger integers in Python are stored as objects on the heap, and are immutable, bit manipulation requires creating a new object when the bits of the integer are changed. As such, bit manipulation in Python is not as fast as in languages like C, C++, or Java.

There is a bitarray package which can be installed (pip install bitarray) which may be used to create mutable bit arrays. Using a bitarray instead of an integer will be much faster, yet still keeps the memory footprint of the application near its absolute minimum. Since a bitarray can be thought of as the bits of an integer, this can still be thought of as storing your "seen flags" in a single integer.

from bitarray import bitarray
def is_unique(string: str) -> bool:
 """Determine if a string contains unique lowercase letters
 Returns `True` if all lowercase letters are unique, `False` otherwise.
 Calling the function with `string` containing anything other than
 lowercase letters results in undefined behaviour.
 """
 letter_flags = bitarray(26)
 letter_flags.setall(False)
 first = ord('a')
 for ch in string:
  flag = ord(ch) - first
 if letter_flags[flag]:
 return False
 letter_flags[flag] = True
 return True

Finally, bit manipulation will always incur an overhead over direct indexing. Using a bytes(26) object to hold the twenty-six flags is likely faster than using a bitarray. It is no longer meeting your implied goal of storing the flags inside a single integer. It requires perhaps 22 additional bytes of memory, but does not require installation of an external package.

def is_unique(string: str) -> bool:
 """Determine if a string contains unique lowercase letters
 Returns `True` if all lowercase letters are unique, `False` otherwise.
 Calling the function with `string` containing anything other than
 lowercase letters results in undefined behaviour.
 """
 letter_flags = bytes(26)
 first = ord('a')
 for ch in string:
 flag = ord(ch) - first
 if letter_flags[flag]:
 return False
 letter_flags[flag] = 1

 return True
avoid magic constant
Source Link
AJNeufeld
  • 35.2k
  • 5
  • 41
  • 103
def is_unique(string: str) -> bool:
 # Pigeon hole principle: a string longer than 26 characters must have duplicates!
 if len(string) > 26:
 return False
 return len(string) == len(set(string))
from string import ascii_lowercase
def is_unique(string: str) -> bool:
 # Pigeon hole principle: a string longer than 26 characters must have duplicates!
 if len(string) > len(ascii_lowercase):
 return False
 return len(string) == len(set(string))
def is_unique(string: str) -> bool:
 # Pigeon hole principle: a string longer than 26 characters must have duplicates!
 if len(string) > 26:
 return False
 return len(string) == len(set(string))
from string import ascii_lowercase
def is_unique(string: str) -> bool:
 # Pigeon hole principle: a string longer than 26 characters must have duplicates!
 if len(string) > len(ascii_lowercase):
 return False
 return len(string) == len(set(string))
formatting
Source Link
AJNeufeld
  • 35.2k
  • 5
  • 41
  • 103
Loading
Source Link
AJNeufeld
  • 35.2k
  • 5
  • 41
  • 103
Loading
lang-py

AltStyle によって変換されたページ (->オリジナル) /