Dictionary additional function implementations

Question 1

I've written a utility class BetterDict to help me on some of my other projects. I've implemented the &, +, and - operators. It extends from the base class dict so I can use all the builtin functions without having to rewrite them all. I would like some feedback on my implementation, best practices, more pythonic way of doing this, etc.

from typing import NewType
BetterDict = NewType('BetterDict', object)
class BetterDict(dict):
 
 def __and__(self, other: BetterDict) -> BetterDict:
 """
 Returns a dictionary containing only the keys that are
 present in both dicts.
 """
 result = BetterDict()
 for (sk, _), (ok, ov) in zip(self.items(), other.items()):
 if sk == ok:
 result[sk] = ov
 return result
 
 def __add__(self, other: BetterDict) -> BetterDict:
 """
 Combines two dictionaries, replacing same keys with "other" values.
 """
 result = BetterDict()
 for k, v in self.items():
 result[k] = v
 for k, v in other.items():
 result[k] = v
 return result
 
 def __sub__(self, other: BetterDict) -> BetterDict:
 """
 Returns a dictionary containing only the keys that are not
 present in both dicts.
 """
 result = BetterDict()
 keys = list(set(self.keys()).symmetric_difference(other.keys()))
 for key in keys:
 if key in self.keys():
 result[key] = self[key]
 if key in other.keys():
 result[key] = other[key]
 return result

Example Code

from better_dict import BetterDict
a = BetterDict({'a': 1})
b = BetterDict({'a': 3, 'b': 2})
print(a & b)
print(a + b)
print(a - b)

Question 2

Your implementation of __and__() does not work as claimed. For example, print(BetterDict({'z': 3, 'a': 2, 'b': 1}) & BetterDict({'a': 3, 'b': 2})) produces an empty dict. You need to figure out which keys exist in both dicts, not simply assume that items from each dict are stored in the same order (chances are, they are not).

Question 3

As @FMc already commented, your method __add__() assumes that both dicts have the same keys at the same positions, which is a very bold assumption.

Furthermore, the documentation of __add__() does not indicate, that values of other will override values of self, which may be surprising to the user (and users generally don't like to be surprised).

Some further nitpicks:

You restrict other to BetterDict, though it would also work with a plain old dict.
I don't like for (sk, _), (ok, ov) in zip(self.items(), other.items()): since you throw away the value of the first item anyway. It should read for sk, (ok, ov) in zip(self.keys(), other.items()): imho. But since this assumes, that both dicts have the same keys in the same places, this should be rewritten entirely anyways.
Instead of the NewType you should use typing.Self when referring to BetterDict within its methods.

Question 4

Interesting class.

What motivated it? You might come up with some more descriptive name than BetterDict, perhaps a name drawn from the business domain that motivated it.

comprehensions

The first two methods, {__and__, __or__}, might become just "return some dict comprehension".

C-language intersection

The builtin set operators are implemented in C rather than python bytecode. I wonder if, at least for some workloads, __and__ would go quicker if we were to use the intersection of these sets: self.keys() & other.keys()

dict addition

This is idiomatically a one-liner:

 return { **self, **other }

We have defined certain semantics, but they are not the ones the word "addition" would have called to mind. For example, adding a pair of Counter objects won't sum the counts. Adding a pair of id_to_list dicts won't yield .values() with longer lists.

Well now, this is funny. I was about to suggest to you that dict | union seems the more natural description of the semantics this offers, similar to set union. So I did a little digging around, and learned that pep-584 implemented d1 | d2 a few years back, in cPython 3.9. Notice that it doesn't commute: (d1 | d2) != (d2 | d1)

dict subtraction

I guess I'm gonna have to quibble about this notation, too.

You're explicitly relying on .symmetric_difference(), which is very nice, I'm sure it's fast, it works fine for a pair of sets s1 ^ s2.

The natural notation for your dictionary operation would be d1 ^ d2. I feel the current notation is misleading, for same reason as above with addition.

I wonder if a pair of filter() operations would go quicker than the current code, at least for some workloads.

unit tests

It wouldn't hurt to throw in an automated test suite. The three print's don't tell the Gentle Reader what the expected output is, and they won't show a Red bar if a maintenance engineer introduces a regression.

Example workloads and benchmark timings wouldn't hurt, either.

I don't agree with the notation choices, but this codebase does achieve its original design goals.

I would be willing to delegate or accept maintenance tasks on it.

Richard Neumann Richard Neumann 5262 silver badges9 bronze badges · Accepted Answer · 2023-11-27 07:23:31Z

As @FMc already commented, your method __add__() assumes that both dicts have the same keys at the same positions, which is a very bold assumption.

Furthermore, the documentation of __add__() does not indicate, that values of other will override values of self, which may be surprising to the user (and users generally don't like to be surprised).

Some further nitpicks:

You restrict other to BetterDict, though it would also work with a plain old dict.
I don't like for (sk, _), (ok, ov) in zip(self.items(), other.items()): since you throw away the value of the first item anyway. It should read for sk, (ok, ov) in zip(self.keys(), other.items()): imho. But since this assumes, that both dicts have the same keys in the same places, this should be rewritten entirely anyways.
Instead of the NewType you should use typing.Self when referring to BetterDict within its methods.

Stack Exchange Network

Dictionary additional function implementations

2 Answers 2

comprehensions

C-language intersection

dict addition

dict subtraction

unit tests

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Dictionary additional function implementations

2 Answers 2

comprehensions

C-language intersection

dict addition

dict subtraction

unit tests

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions