12
\$\begingroup\$

I'm combining two lists of dicts, based on the value of the key 'time'.

I have two lists, for example:

a=[{'time': '25 APR', 'total': 10, 'high': 10}, 
 {'time': '26 APR', 'total': 5, 'high': 5}]
b=[{'time': '24 APR', 'total': 10, 'high': 10}, 
 {'time': '26 APR', 'total': 15, 'high': 5}]

These lists contain values per day, for a specific source, say a and b.

I would like to add the dicts in the lists, based on the value of the key 'time'. So the end result would be:

c=[{'time': '24 APR', 'total': 10, 'high': 10}, 
 {'time': '25 APR', 'total': 10, 'high': 10}, 
 {'time': '26 APR', 'total': 20, 'high': 10}]

Notice that the results for 26 APR are added.

The way I do this now, is as follows:

from collections import Counter
import itertools

# totals_30_days and past_30_days are the two input lists (a and b above).
lst = sorted(itertools.chain(totals_30_days, past_30_days), key=lambda x: x['time'])
f = []
for k, v in itertools.groupby(lst, key=lambda x: x['time']):
    v = list(v)
    # Check if there is more than one item, because adding an empty Counter()
    # will delete any keys with zero or negative values.
    if len(v) > 1:
        e = Counter()
        for i in v:
            c = Counter(i)
            time = c.pop('time', None)
            e = e + c
        e['time'] = time
        f.append(dict(e))
    else:
        f.append(v[0])
print(f)

The result is correct:

[{'high': 10, 'total': 10, 'time': '24 APR'}, {'high': 10, 'total': 10, 'time': '25 APR'}, {'high': 10, 'total': 20, 'time': '26 APR'}]
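
The caveat mentioned in the code comment can be checked in isolation. The snippet below is only a small sketch with made-up values (with_zero, summed and kept are illustrative names): Counter addition keeps only keys whose resulting count is positive, whereas Counter.update() adds counts in place and keeps zero entries.

```python
from collections import Counter

# A record-like Counter in which one value happens to be zero.
with_zero = Counter({'total': 10, 'high': 0})

# Counter addition drops keys whose summed count is not positive.
summed = Counter() + with_zero

# update() adds counts in place and does not drop anything.
kept = Counter()
kept.update(with_zero)

print(dict(summed))  # 'high' is missing here
print(dict(kept))    # 'high' is still present with value 0
```

This is why the single-item case is handled separately in the loop above; accumulating with update() instead of + would presumably make that special case unnecessary.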

But I wonder if it could be more efficient. Any ideas?

asked Apr 3, 2015 at 20:19
\$\endgroup\$

1 Answer

8
\$\begingroup\$

itertools.groupby() is a good start. To process each group, you want to take advantage of functools.reduce(). reduce() does the right thing, whether there is just one record in a group or multiple records.
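
As a quick illustration of that last point, here is a minimal sketch using plain integers rather than records (single and several are just illustrative names). When no initial value is given, reduce() on a one-element sequence returns that element without ever calling the function:

```python
from functools import reduce
from operator import add

# One element, no initial value: the element is returned unchanged
# and add() is never called.
single = reduce(add, [10])

# Several elements: add() is applied cumulatively from left to right.
several = reduce(add, [5, 15, 20])

print(single, several)
```

So a group containing a single record needs no special handling.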

Your code suffers from readability problems. One obvious issue is that a, b, c, d, e, and f are horribly meaningless variable names. Another problem is that the only way to see what it does is to trace through the code. (Well, it would help if you wrote the entire introductory text to this question as a giant comment, but ideally the code should be eloquent enough to speak for itself.)

Let's start with the goal of being able to write this:

a=[{'time': '25 APR', 'total': 10, 'high': 10}, 
 {'time': '26 APR', 'total': 5, 'high': 5}]
b=[{'time': '24 APR', 'total': 10, 'high': 10}, 
 {'time': '26 APR', 'total': 15, 'high': 5}]
merger = merge_list_of_records_by('time', add)
print(merger(a + b))

Then, it's a matter of writing a merge_list_of_records_by() function to make that happen.

from functools import reduce
from itertools import groupby
from operator import add, itemgetter


def merge_records_by(key, combine):
    """Returns a function that merges two records rec_a and rec_b.
    The records are assumed to have the same value for rec_a[key]
    and rec_b[key]. For all other keys, the values are combined
    using the specified binary operator.
    """
    return lambda rec_a, rec_b: {
        k: rec_a[k] if k == key else combine(rec_a[k], rec_b[k])
        for k in rec_a
    }


def merge_list_of_records_by(key, combine):
    """Returns a function that merges a list of records, grouped by
    the specified key, with values combined using the specified
    binary operator."""
    keyprop = itemgetter(key)
    return lambda lst: [
        reduce(merge_records_by(key, combine), records)
        for _, records in groupby(sorted(lst, key=keyprop), keyprop)
    ]
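
One assumption worth spelling out: merge_records_by() iterates over the keys of rec_a, so every record in a group is expected to have the same keys. If that might not hold, something along the following lines could work. This is only a sketch; the name merge_by and the choice to carry a missing value over unchanged are hypothetical, not part of the code above.

```python
from functools import reduce
from itertools import groupby
from operator import add, itemgetter


def merge_by(key, combine, records):
    """Group records by `key` and combine the remaining values.

    Hypothetical variant: a key missing from one record is carried
    over from the other instead of raising KeyError.
    """
    def merge_two(rec_a, rec_b):
        merged = dict(rec_a)  # copy, so the inputs are not mutated
        for k, value in rec_b.items():
            if k == key:
                continue  # the grouping key is the same in both records
            merged[k] = combine(merged[k], value) if k in merged else value
        return merged

    keyprop = itemgetter(key)
    return [
        reduce(merge_two, group)
        for _, group in groupby(sorted(records, key=keyprop), keyprop)
    ]


a = [{'time': '25 APR', 'total': 10, 'high': 10},
     {'time': '26 APR', 'total': 5, 'high': 5}]
b = [{'time': '24 APR', 'total': 10, 'high': 10},
     {'time': '26 APR', 'total': 15}]  # assumed input with 'high' missing

result = merge_by('time', add, a + b)
print(result)
```

Here the '26 APR' entry ends up with total 20 and keeps high 5 from the only record that has it.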
answered Apr 3, 2015 at 22:40
\$\endgroup\$
1
  • Hi @200_success, thanks for the answer. This works perfectly. I really like the way you keep it flexible by letting the caller choose the operator. I timed both solutions, and yours is almost an order of magnitude quicker. Thanks a lot. Commented Apr 14, 2015 at 8:09
