43

I've got a list of integers and I want to be able to identify contiguous blocks of duplicates: that is, I want to produce an order-preserving list of duples where each duples contains (int_in_question, number of occurrences).

For example, if I have a list like:

[0, 0, 0, 3, 3, 2, 5, 2, 6, 6]

I want the result to be:

[(0, 3), (3, 2), (2, 1), (5, 1), (2, 1), (6, 2)]

I have a fairly simple way of doing this with a for-loop, a temp, and a counter:

result_list = []
current = source_list[0]
count = 0
for value in source_list:
 if value == current:
 count += 1
 else:
 result_list.append((current, count))
 current = value
 count = 1
result_list.append((current, count))

But I really like python's functional programming idioms, and I'd like to be able to do this with a simple generator expression. However I find it difficult to keep sub-counts when working with generators. I have a feeling a two-step process might get me there, but for now I'm stumped.

Is there a particularly elegant/pythonic way to do this, especially with generators?

Denis de Bernardy
79.2k14 gold badges138 silver badges158 bronze badges
asked Jun 15, 2011 at 2:37
1

1 Answer 1

84
>>> from itertools import groupby
>>> L = [0, 0, 0, 3, 3, 2, 5, 2, 6, 6]
>>> grouped_L = [(k, sum(1 for i in g)) for k,g in groupby(L)]
>>> # Or (k, len(list(g))), but that creates an intermediate list
>>> grouped_L
[(0, 3), (3, 2), (2, 1), (5, 1), (2, 1), (6, 2)]

Batteries included, as they say.

Suggestion for using sum and generator expression from JBernardo; see comment.

answered Jun 15, 2011 at 2:42
Sign up to request clarification or add additional context in comments.

6 Comments

+1, maybe you could change len(list(g)) for sum(1 for i in g) to avoid intermediate storage.
@JBernardo: Good suggestion, thanks. Creating a list from g has always kind of bothered me when I use groupby for this.
@JBernardo: Actually I'm gonna go with creating the intermediate list. Although perhaps doing the sum would be more efficient, I think the former is far more readable (really states exactly what we want to happen) and thus more pythonic! I do think that this "adding ones" solution is hinting at something lacking in generators, specifically that there's no way of explicitly, with a built-in function, telling how many elements will be generated. Might this be amended in the future?
@machine: It's in principle impossible. Consider: def long_gen(): while True: yield 1 What is the len of this? See: stackoverflow.com/questions/390852/…
@machine: You're welcome. I've seen this use of sum in other places but hadn't thought to use it in this case. I think it would be quickly understood by most readers.
|

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.