Python: finding keys with unique values in a dictionary?

Question 1

I receive a dictionary as input, and want to return a list of keys for which the dictionary values are unique in the scope of that dictionary.

I will clarify with an example. Say my input is dictionary a, constructed as follows:

a = dict()
a['cat'] = 1
a['fish'] = 1
a['dog'] = 2 # <-- unique
a['bat'] = 3
a['aardvark'] = 3
a['snake'] = 4 # <-- unique
a['wallaby'] = 5
a['badger'] = 5

The result I expect is ['dog', 'snake'].

There are obvious brute force ways to achieve this, however I wondered if there's a neat Pythonian way to get the job done.

Question 2

I think efficient way if dict is too large would be

countMap = {}
for v in a.itervalues():
 countMap[v] = countMap.get(v,0) + 1
uni = [ k for k, v in a.iteritems() if countMap[v] == 1]

Question 3

It would be prettier with collections.defaultdict(int), IMO

Question 4

yes, but I would leave it so that people know what we use to do when there were no defaultdicts

Question 5

Note that this actually is a bruteforce:

l = a.values()
b = [x for x in a if l.count(a[x]) == 1]

Question 6

it will not output ['dog', 'snake']

Question 7

Isn't l.count('dog') zero? l is [3, 3, 2, 1, 4, 5, 1, 5] on my system.

Question 8

ok, i see that cobbal has already corrected the code. thanks.

Question 9

Here is a solution that only requires traversing the dict once:

def unique_values(d):
 seen = {} # dict (value, key)
 result = set() # keys with unique values
 for k,v in d.iteritems():
 if v in seen:
 result.discard(seen[v])
 else:
 seen[v] = k
 result.add(k)
 return list(result)

Question 10

If a value occurs 3 times, you will try to remove a non-existent element from result ... docs say """remove(elem) Remove element elem from the set. Raises KeyError if elem is not contained in the set."""

Question 11

Right you are! I have corrected it to use discard() instead.

Question 12

>>> b = []
>>> import collections
>>> bag = collections.defaultdict(lambda: 0)
>>> for v in a.itervalues():
... bag[v] += 1
...
>>> b = [k for (k, v) in a.iteritems() if bag[v] == 1]
>>> b.sort() # optional
>>> print b
['dog', 'snake']
>>>

Question 13

@Ryan: True but lambda: 0 is more explicit than int ... AFAICT, until defaultdict arrived [2.5], the number of persons who knew that int() produced 0 [since 2.2] instead of an exception was < epsilon and the number who had a use for that knowledge was even smaller :-)

Question 14

A little more verbose, but does need only one pass over a:

revDict = {}
for k, v in a.iteritems():
 if v in revDict:
 revDict[v] = None
 else:
 revDict[v] = k
[ x for x in revDict.itervalues() if x != None ]

( I hope it works, since I can't test it here )

Question 15

Doesn't work if one of the dictionary keys is None. For example if a is {None: 1} the output should be [None] but the above code will produce []. Also: x is not None is preferable to x != None.

Question 16

Thanks for the comment! You are totally right. In praxis, it seldom happens that None is used ... but even then, one could create some DummyObject: "Dummy = object()" instead of using None.

Question 17

What about subclassing?

class UniqueValuesDict(dict):
 def __init__(self, *args):
 dict.__init__(self, *args)
 self._inverse = {}
 def __setitem__(self, key, value):
 if value in self.values():
 if value in self._inverse:
 del self._inverse[value]
 else:
 self._inverse[value] = key
 dict.__setitem__(self, key, value)
 def unique_values(self):
 return self._inverse.values()
a = UniqueValuesDict()
a['cat'] = 1
a['fish'] = 1
a[None] = 1
a['duck'] = 1
a['dog'] = 2 # <-- unique
a['bat'] = 3
a['aardvark'] = 3
a['snake'] = 4 # <-- unique
a['wallaby'] = 5
a['badger'] = 5
assert a.unique_values() == ['dog', 'snake']

Question 18

This has the advantage of a smaller memory footprint, but you end up doing an O(N) search every time you set an item, so it's liable to be a lot slower than the dictionary tabulation method. Also, I think you could use a set for _inverse instead of a dict.

Question 19

Another problem: the OP placed no constraints on how the contents of the dict were obtained. So one would expect that del a['bat']; print a.unique_values() would cause aardvark to appear in the output but sadly it doesn't and fixing that would require even more convolutions and double__underscores :-(

Question 20

Here's another variation.

>>> import collections
>>> inverse= collections.defaultdict(list)
>>> for k,v in a.items():
... inverse[v].append(k)
... 
>>> [ v[0] for v in inverse.values() if len(v) == 1 ]
['dog', 'snake']

I'm partial to this because the inverted dictionary is such a common design pattern.

Question 21

You want [ v[0] for k,v ...] in the last line to get ['dog', 'snake'] as requested.

Question 22

(1) Instead of .items(), use .iteritems(). (2) The last line extracts the key needlessly; should be [v[0] for v in inverse.itervalues() if len(v) == 1 (3) In any case building the whole inverted dict is overkill.

Question 23

You could do something like this (just count the number of occurrences for each value):

def unique(a):
 from collections import defaultdict
 count = defaultdict(lambda: 0)
 for k, v in a.iteritems():
 count[v] += 1
 for v, c in count.iteritems():
 if c <= 1:
 yield v

Question 24

This is yielding values (2, 4) when it should yield keys ('dog', 'snake').

Question 25

I find defaultdict(int) to be a bit more clear than defaultdict(lambda:0). Since a default dict of almost any other type will simply use the type name.

Question 26

Use nested list comprehensions!

print [v[0] for v in 
 dict([(v, [k for k in a.keys() if a[k] == v])
 for v in set(a.values())]).values()
 if len(v) == 1]

Question 27

I don't see how using list comprehensions in this way is a win. For me, this just makes the solution harder to comprehend (no pun intended). Readability is key and this solution is just not that readable IMO.

Question 28

Rax asked for "a neat Pythonian way to get the job done," as opposed to "obvious" solutions to an otherwise trivial problem.

Question 29

(1) Use k in a instead of k in a.keys() (2) Use whatever.itervalues() instead of whatever.values() (3) The dict(yadda yadda) part is building the already-overkill inverse of a inefficiently (4) It's neither neat nor Python(ic|ian) ... but it's certainly not obvious! (5) Count the number of responders whose first effort at the so-called trivial problem was a stuff-up.

Question 30

This solution can be edited (using only the delete key!) to get rid of building the inverse; still O(N^2) though: print [v[0] for v in [[k for k in a if a[k] == v] for v in set(a.values())] if len(v) == 1]

score 15 · Accepted Answer · 2009-06-23 13:02:44Z

15

I think efficient way if dict is too large would be

countMap = {}
for v in a.itervalues():
 countMap[v] = countMap.get(v,0) + 1
uni = [ k for k, v in a.iteritems() if countMap[v] == 1]

Share

Improve this answer

edited Jul 29, 2011 at 0:34

answered Jun 23, 2009 at 13:02

Anurag Uniyal's user avatar

Anurag Uniyal Anurag Uniyal

89.2k41 gold badges181 silver badges222 bronze badges

2 Comments

Ryan Ginstrom

Ryan Ginstrom Over a year ago

It would be prettier with collections.defaultdict(int), IMO

2009年06月24日T01:13:36.047Z+00:00

Anurag Uniyal

Anurag Uniyal Over a year ago

yes, but I would leave it so that people know what we use to do when there were no defaultdicts

2009年06月24日T13:49:03.297Z+00:00

CollectivesTM on Stack Overflow

Python: finding keys with unique values in a dictionary?

9 Answers 9

2 Comments

3 Comments

2 Comments

1 Comment

2 Comments

2 Comments

2 Comments

2 Comments

4 Comments

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

CollectivesTM on Stack Overflow

9 Answers 9

2 Comments

3 Comments

2 Comments

1 Comment

2 Comments

2 Comments

2 Comments

2 Comments

4 Comments

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related