Creating a bijection between integers and a set of strings

Question 1

I have a set of unique strings, and I want to create a unique integer identifier for each string.

Usage: I want a function that goes back and forth. If I give it an integer, it returns the corresponding string, and vice versa.

Here is how I am doing it:

def str_to_int(S):
    integers = list(range(len(S)))
    my_dict = dict(zip(S, integers))
    rev_dict = dict(zip(integers, S))
    return my_dict, rev_dict

If I need to get the integer identifier of an item of S, I need to call the function and then look the item up in the appropriate returned dictionary.

I want something simpler: given an integer or a string, the function works out automatically whether it is an int or a str and returns the other identifier (i.e. if I give it an int, it returns the str identifier, and vice versa). Is it possible to do this in a single function? (If possible, without having to recreate the dictionaries on each call.)

Edit: I thought of writing two functions, str_to_int(S: set, string: str) -> int and int_to_str(S: set, integer: int) -> str, but the problems are that 1) that's two functions, and 2) two dictionaries are created on every call.

Answer 1 · Samwise · 2020-02-10 20:20:37Z

Since what you want is something a little more complicated than what a normal dictionary can do, I think you want to encapsulate all of this in a class that behaves the way you want the dict to behave. You can make it "look like" a dict by implementing __getitem__, something like:

from typing import Dict, List, Set, Union, overload

class StringTable:
    """Associate strings with unique integer IDs."""

    def __init__(self, strings: Set[str]):
        """Initialize the string table with the given set of strings."""
        self._keys: List[str] = []
        self._ids: Dict[str, int] = {}
        for key in strings:
            self._ids[key] = len(self._keys)
            self._keys.append(key)

    @overload
    def __getitem__(self, o: int) -> str: ...
    @overload
    def __getitem__(self, o: str) -> int: ...
    def __getitem__(self, o: Union[int, str]) -> Union[str, int]:
        """Accepts either a string or int and returns its counterpart."""
        if isinstance(o, int):
            return self._keys[o]
        elif isinstance(o, str):
            return self._ids[o]
        else:
            raise TypeError("Bad argument!")

    def __len__(self) -> int:
        return len(self._keys)

Now you can use it like:

strings = {"foo", "bar", "baz"}
bijection = StringTable(strings)
for s in strings:
    print(s, bijection[s])
    assert bijection[bijection[s]] == s

etc.

Didn't know about overload and that usage of Ellipsis, nice!
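For anyone else new to that pattern, here is a minimal sketch of how `@overload` behaves. The function `flip` and its toy mapping are made up for illustration and are not part of the answer above. The decorated stubs, whose bodies are just `...`, exist only for type checkers; at runtime only the last, undecorated definition is called.

```python
from typing import Union, overload


@overload
def flip(o: int) -> str: ...
@overload
def flip(o: str) -> int: ...
def flip(o: Union[int, str]) -> Union[str, int]:
    """Hypothetical helper: map 0 <-> "zero" to show the overload pattern."""
    if isinstance(o, int):
        return "zero"
    return 0


# Only the final definition runs; the stubs above are typing-only.
print(flip(0))       # zero
print(flip("zero"))  # 0
```

With the stubs in place, a type checker can infer that `flip(0)` is a `str` and `flip("zero")` is an `int`, rather than the wider `Union[str, int]`.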

Answer 2 · Juho · 2020-02-10 19:54:53Z

I'm not sure why you insist on doing this only with a single function, but we surely can. Also, just pass in the prebuilt dictionaries so that you don't need to build them on every call.

def build_mapping(S):
    integers = list(range(len(S)))
    return dict(zip(S, integers)), dict(zip(integers, S))

def get_value(key, conv, rev_conv):
    return conv[key] if isinstance(key, str) else rev_conv[key]

S = ['foo', 'bar', 'baz', 'hello', 'world']
conv, rev_conv = build_mapping(S)
key = 'hello'
key2 = 3
# prints "3 hello"
print(get_value(key, conv, rev_conv), get_value(key2, conv, rev_conv))

I thought a single function would be more readable and take fewer lines of code. But I guess you don't recommend it. Am I right?
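If a single callable is the goal, one possible middle ground is to build the lookup tables once and capture them in a closure, so nothing has to be passed in or rebuilt on each call. This is only a sketch under that assumption; the names `make_bijection` and `lookup` are hypothetical and do not come from the answer above.

```python
from typing import Callable, Iterable, Union

Key = Union[int, str]


def make_bijection(strings: Iterable[str]) -> Callable[[Key], Key]:
    """Build both lookup tables once and return a single lookup function."""
    keys = list(strings)                      # int -> str, by position
    ids = {s: i for i, s in enumerate(keys)}  # str -> int

    def lookup(key: Key) -> Key:
        # Strings map to their integer ID; integers map back to the string.
        return ids[key] if isinstance(key, str) else keys[key]

    return lookup


lookup = make_bijection(['foo', 'bar', 'baz', 'hello', 'world'])
print(lookup('hello'), lookup(3))  # 3 hello
```

Because the reverse table here is a list, a negative integer indexes from the end instead of raising an error. Using a dict for the reverse direction, as in the answer above, avoids that.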

Answer 3 · RootTwo · 2020-02-11 07:15:44Z

If the keys are strings and ints, they can't collide, so they can go in the same dict.

strings = ['one', 'alpha', 'blue']
# mapping from strings to ints and ints to strings
two_way_dict = {}
for i, s in enumerate(strings):
    two_way_dict.update([(s, i), (i, s)])
bijection = two_way_dict.get
# example
bijection('alpha')  # -> 1
bijection(1)        # -> 'alpha'
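One thing to keep in mind with this approach: `dict.get` returns `None` for a key that is not in the dict rather than raising. If an error on unknown keys is preferred, the dict's `__getitem__` can be bound instead. The sketch below rebuilds the same two-way dict; the names `lenient` and `strict` are just illustrative.

```python
strings = ['one', 'alpha', 'blue']

# Same single-dict idea: each string and its index point at each other.
two_way_dict = {}
for i, s in enumerate(strings):
    two_way_dict[s] = i
    two_way_dict[i] = s

lenient = two_way_dict.get         # unknown keys return None
strict = two_way_dict.__getitem__  # unknown keys raise KeyError

print(lenient('missing'))  # None
try:
    strict('missing')
except KeyError as err:
    print('unknown key:', err)  # unknown key: 'missing'
```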
