Given a list of pairs (email, date):
email_date_table = [
    ['[email protected]', '2 august 1976'],
    ['[email protected]', '3 august 2001'],
    ['[email protected]', '2 october 2001'],
    ['[email protected]', '2 october 2001'],
    ['[email protected]', '2 august 2001']]
I need to turn it into a dictionary grouped by date (one date → many emails).
I can do it with for loops:
def emails_for_one_date():
    date_for_emails_dict = {}
    for each_entry in email_date_table:
        current_date_as_key = each_entry[1]
        # Collect every email whose date matches the current date.
        list_emails = []
        for each_entry2 in email_date_table:
            if each_entry2[1] == current_date_as_key:
                list_emails.append(each_entry2[0])
        date_for_emails_dict[current_date_as_key] = list_emails
    print("------------------- dict here")
    print(date_for_emails_dict)
But Python is such a powerful language, I want to know the 'pythonic' way of doing it! I suspect it can be one line.
3 Answers
A good function should accept the data as a parameter and return the resulting data structure. Call print(emails_by_date(email_date_table)) to put it all together.
Two Pythonic techniques you want to use are:
- dict.setdefault(key[, default]), which gets rid of the need for if, as well as the requirement for the input to be grouped chronologically.
- Multiple assignment in the for-loop, which removes the ugly [0] and [1] indexing:
  "If the target list is a comma-separated list of targets: The object must be an iterable with the same number of items as there are targets in the target list, and the items are assigned, from left to right, to the corresponding targets."
def emails_by_date(pairs):
    ret = {}
    for email, date in pairs:
        ret.setdefault(date, []).append(email)
    return ret
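As a quick check of the function above, here is a minimal sketch run on made-up data. The function body is the one from this answer; sample_table and its example.com addresses are hypothetical placeholders, not the data from the question.

```python
def emails_by_date(pairs):
    """Group emails by date: each date maps to a list of emails."""
    ret = {}
    for email, date in pairs:  # unpacking replaces pair[0] / pair[1]
        # setdefault returns the list already stored for this date, or
        # stores and returns a new empty list on first sight of the date.
        ret.setdefault(date, []).append(email)
    return ret


# Hypothetical sample data (placeholder addresses, not from the question).
sample_table = [
    ['a@example.com', '2 august 1976'],
    ['b@example.com', '3 august 2001'],
    ['c@example.com', '2 october 2001'],
    ['d@example.com', '2 october 2001'],
]

grouped = emails_by_date(sample_table)
print(grouped)
```

Because the function only reads its argument and returns a new dictionary, it can be reused on any iterable of two-item pairs, in any order.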
As you suspect, there is an easier way. The first thing we'll need is defaultdict from the collections module.
from collections import defaultdict
A defaultdict acts like a dictionary, with the difference that the callable passed at construction (usually a type such as int or list) is called to produce a default value whenever a requested key does not exist in the dictionary. For example:
>>> d = defaultdict(int)
>>> d[1]
0
>>> d
defaultdict(<class 'int'>, {1: 0})
With a normal dictionary, this would give you a KeyError (as 1 did not exist). With a defaultdict, it sees that 1 does not exist, and inserts an int() (which is 0).
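The difference described above can be seen side by side in a small sketch. The variable names (plain, raised, value, labels) are made up for illustration only.

```python
from collections import defaultdict

# A plain dict raises KeyError when a missing key is indexed.
plain = {}
try:
    plain[1]
    raised = False
except KeyError:
    raised = True

# A defaultdict calls its factory instead, stores the result, and returns it.
d = defaultdict(int)
value = d[1]  # int() is called, so 0 is stored under key 1

# Any zero-argument callable can serve as the factory, not only a type.
labels = defaultdict(lambda: 'unknown')
label = labels['missing']

print(raised, value, dict(d), label)
```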
In our case, we're going to want a list to be the default:
d = defaultdict(list)
Given that we have a list of pairs, we can unpack each pair into name and date as we go:
for name, date in email_date_table:
Now we just look up the date key, and append the current name:
d[date].append(name)
If d[date] doesn't exist, it will create an empty list and append name to it. If it does exist, it will simply give us back the existing list of names mapped to that key, and name is appended to that list.
The whole thing then becomes:
d = defaultdict(list)
for name, date in email_date_table:
    d[date].append(name)
- With defaultdict, be aware that simply querying the resulting data structure for non-existent dates will cause it to be littered with empty list entries. – 200_success, Jul 31, 2014 at 8:16
- Both answers are great - I did not know that just by creating a dictionary with date as key, it puts all corresponding emails in a list! – ERJAN, Jul 31, 2014 at 9:02
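The caveat raised in the first comment can be reproduced in a few lines. This is only a sketch: the dates and the example.com address are hypothetical, and the two workarounds shown (dict.get and copying into a plain dict) are common options rather than something the answer itself prescribes.

```python
from collections import defaultdict

d = defaultdict(list)
d['2 october 2001'].append('c@example.com')  # hypothetical placeholder

# Indexing a date that was never added silently inserts an empty list.
_ = d['1 january 1900']
keys_after_lookup = sorted(d)

# Looking up without inserting: .get() and `in` do not call the factory.
d2 = defaultdict(list)
d2['2 october 2001'].append('c@example.com')
missing = d2.get('1 january 1900', [])
still_absent = '1 january 1900' not in d2

# Copying into a plain dict restores the usual KeyError behaviour.
frozen = dict(d2)

print(keys_after_lookup, missing, still_absent, frozen)
```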
Here's the one-liner, not that it's recommended:
email_date_dict = {date: [name for name, date2 in email_date_table if date2 == date] for _, date in email_date_table}
It's inefficient (because it iterates over email_date_table repeatedly) and not very readable, but it illustrates how powerful Python's list and dictionary comprehensions can be.
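To see that the one-liner and a single-pass loop agree, the sketch below builds both on made-up data and compares them. The sample_table contents are hypothetical placeholders; the loop is the defaultdict approach from the other answer, used here only as a point of comparison.

```python
from collections import defaultdict

# Hypothetical sample data (placeholder addresses, not from the question).
sample_table = [
    ['a@example.com', '2 august 1976'],
    ['b@example.com', '3 august 2001'],
    ['c@example.com', '2 october 2001'],
    ['d@example.com', '2 october 2001'],
]

# One-liner: for every entry, the inner list comprehension rescans the
# whole table, so the work grows quadratically with the table size.
one_liner = {date: [name for name, date2 in sample_table if date2 == date]
             for _, date in sample_table}

# Single pass: each entry is visited exactly once.
single_pass = defaultdict(list)
for name, date in sample_table:
    single_pass[date].append(name)

same = one_liner == dict(single_pass)
print(same)
```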