When sorting by multiple keys, I need the second key to alternate between standard and reverse sorting. This is to minimize the change in the second key. For example, if I have a list of dictionaries, I want the first sort (group key) to be sorted and then the second sort (sort key) to alternate. This should produce a sort like this:
[{'group': 'a', 'sort': 1},
{'group': 'a', 'sort': 2},
{'group': 'a', 'sort': 2},
{'group': 'b', 'sort': 2},
{'group': 'b', 'sort': 1},
{'group': 'b', 'sort': 1},
{'group': 'c', 'sort': 1},
{'group': 'c', 'sort': 1},
{'group': 'c', 'sort': 2}]
That limits the change in the field called "sort" as it alternates each time as the "group" changes.
I have looked around and did not see a standard way to accomplish this without rolling my own. I think the code is fairly straight forward but curious if I might be missing something or an easier/faster way.
Any comments would be appreciated.
#!/usr/bin/env python3
""" Misc sort routines """
from operator import itemgetter
import pprint
def alternate_sort(data, group_key, sort_key,
group_reverse=False, sort_reverse=False):
""" Sort data within groups while alternating asc/desc on sort key. """
partitions = [] # Tuples of start/end indexes per group
last_key = None # Remember when groups change
group_start = 0 # Remember start index of group
final_sort = [] # The final sorted listing
# Make sure the data is sorted by group
data = sorted(data, key=itemgetter(group_key), reverse=group_reverse)
# Gather the group partitions
for i, g in enumerate(data):
# When the group changes
if last_key != g[group_key]:
# End the previous group partition if there is one
if i > 0:
partitions.append((group_start, i))
# Start a new group partition
group_start = i
last_key = g[group_key]
# Close the last group partition
partitions.append((group_start, len(data)))
# Go through the partitions, alternate the sort each time using sort key.
for i, p in enumerate(partitions):
part = data[p[0]: p[1]]
if i % 2 == 0:
final_sort += sorted(
part, key=itemgetter(sort_key), reverse=sort_reverse)
else:
final_sort += sorted(
part, key=itemgetter(sort_key), reverse=not sort_reverse)
return final_sort
if __name__ == "__main__":
unsorted_data = [
{"group": "a", "sort" : 1},
{"group": "a", "sort" : 2},
{"group": "b", "sort" : 2},
{"group": "b", "sort" : 1},
{"group": "c", "sort" : 1},
{"group": "a", "sort" : 2},
{"group": "c", "sort" : 2},
{"group": "b", "sort" : 1},
{"group": "c", "sort" : 1}
]
sorted_data = alternate_sort(unsorted_data, "group", "sort",
group_reverse=True, sort_reverse=True)
pprint.pprint(sorted_data)
sorted_data = alternate_sort(unsorted_data, "group", "sort")
pprint.pprint(sorted_data)
-
1\$\begingroup\$ Can you edit your post to describe the problem this code solves? An example would also help. Imagine a reader who has no idea what you mean by "group key". \$\endgroup\$Gareth Rees– Gareth Rees2015年05月03日 16:48:55 +00:00Commented May 3, 2015 at 16:48
-
\$\begingroup\$ @GarethRees, changes made, hope that gives a better idea of what I am attempting to do with this code. \$\endgroup\$clutton– clutton2015年05月03日 16:58:42 +00:00Commented May 3, 2015 at 16:58
1 Answer 1
1. Quick review
This function returns a new list, like the built-in
sorted
. It does not update data in place, like thesort
method on lists. So a name containing "sorted" would give a better hint at its behaviour.The docstring is not detailed enough to explain how to call the function. What arguments do I pass for
group_key
andsort_key
? What happens if I passTrue
forgroup_reverse
orsort_reverse
?The function calls
operator.itemgetter
to make key functions. But this means that the function will only work when the keys are dictionary or sequence values. If you want to use object properties as keys, you're out of luck (that would needoperator.attrgetter
). And if you want to compute something more complicated by passing in your own function, that's no good either.I would recommend accepting any function for
group_key
andsort_key
and letting the caller useoperator.itemgetter
if that's what they need. This makes the function behave more like the built-insorted
and similar functions that takekey
arguments.Extend a list using the
extend
method, which updates the list in-place. If you use augmented assignment+=
, Python may have to create a new list and copy all the elements across.
2. Rewrite
The algorithm you've chosen is:
Sort data by the group key.
Iterate over the data, collecting the groups.
Iterate over the groups, sorting each one.
Steps 2 and 3 can be combined into one iteration using itertools.groupby
. The alternation between ascending and descending can be implemented using itertools.cycle
. These two iterations can be combined using zip
, like this:
from itertools import cycle, groupby
def sorted_alternately(data, group_key, sort_key,
group_reverse=False, sort_reverse=False):
"""Sort data on group_key (in reverse order if group_reverse=True) but
within each group of items, alternately sort in ascending and
descending order on sort_key (in descending and ascending order in
sort_reverse=True).
>>> from itertools import product
>>> from operator import itemgetter as get
>>> sorted_alternately(product(range(3), repeat=2), get(0), get(1))
[(0, 0), (0, 1), (0, 2), (1, 2), (1, 1), (1, 0), (2, 0), (2, 1), (2, 2)]
"""
data = sorted(data, key=group_key, reverse=group_reverse)
groups = groupby(data, key=group_key)
reverses = cycle((sort_reverse, not sort_reverse))
result = []
for (_, group), reverse in zip(groups, reverses):
result.extend(sorted(group, key=sort_key, reverse=reverse))
return result
-
\$\begingroup\$ Excellent! It felt like I was doing it the long (and less functional) way and I guess I was. Now off to read more of the itertools documentation. \$\endgroup\$clutton– clutton2015年05月03日 20:08:48 +00:00Commented May 3, 2015 at 20:08