collections.Counter in Python 3

Question 1

Can you suggest how I might make the following code more efficient:

from collections import Counter
a = [1,1,3,3,1,3,2,1,4,1,6,6]
c = Counter(a)
length = len(set(c.values()))
normalisedValueCount = {}
previousCount = 0
i = 0
for key in sorted(c, reverse=True):
    count = c[key]
    if not previousCount == count:
        i = i+1
        previousCount = count
    normalisedValueCount[key] = i/length
print(normalisedValueCount)

It basically gives a dictionary similar to a Counter, but instead of holding the number of occurrences of each value, it holds a weighting based on the number of occurrences.

The number 1 is associated with 1.0 (4/length) because it occurs the most often.
2 and 4 occur the least often and are associated with the value 1/length.
6 is the second least occurring value and is associated with 2/length.
3 is the third least occurring value and is associated with 3/length.

Some more examples:

The list a = [1,2,3] results in a normalisedValueCount of 1:1.0, 2:1.0, 3:1.0.
The list a = [2,1,2] results in a normalisedValueCount of 2:1.0, 1:0.5.
The list a = [2,1,2,3] results in a normalisedValueCount of 2:1.0, 1:0.5, 3:0.5.
The list a = [2,2,2,2,2,2,2,2,2,2,1,2,3,3] results in a normalisedValueCount of 2:1.0, 3:0.66666, 1:0.33333.
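For reference, here is a minimal sketch of the weighting as the prose and examples describe it: rank the distinct counts from smallest to largest and divide each rank by the number of distinct counts. The function name normalised_value_count and the helper dictionary rank_of are made up for illustration, and the ranking rule is my reading of the description rather than a confirmed specification.

```python
from collections import Counter


def normalised_value_count(values):
    """Map each value to rank / length, ranking distinct counts in ascending order."""
    counts = Counter(values)
    # The smallest distinct count gets rank 1, the next one rank 2, and so on.
    distinct_counts = sorted(set(counts.values()))
    rank_of = {count: rank for rank, count in enumerate(distinct_counts, start=1)}
    length = len(rank_of)
    return {value: rank_of[count] / length for value, count in counts.items()}


print(normalised_value_count([1, 1, 3, 3, 1, 3, 2, 1, 4, 1, 6, 6]))
# {1: 1.0, 3: 0.75, 2: 0.25, 4: 0.25, 6: 0.5}
```

Under this reading, the four example lists above give the weightings stated for them.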

Question 2

Your most recent edit changed the contents of the last list but didn't update the normalisedValueCount. Are those the results you would be expecting?

Question 3

I have no idea what your code is doing; it doesn't look like any normalization I've seen. I'd offer a more specific suggestion on restructuring if I understood what you are doing.

You:

Put a in a counter
Put that counter's values into a set
Sort the keys of the counter

I'd look for another approach that doesn't involve so much moving around.

if not previousCount == count:

is better written as

if previousCount != count:

EDIT

Counter.most_common returns the (element, count) pairs ordered from most to least common, which is what I take sorted(c, reverse=True) to be fetching
itertools.groupby allows you to group together adjacent elements that share a key (such as the same count)
enumerate can be used to count over the elements in a list rather than keeping track of the counter yourself
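To make the three tools above concrete, here is a small sketch of what each one produces for the list from the question. The variable names pairs and groups are only illustrative.

```python
from collections import Counter
from itertools import groupby

a = [1, 1, 3, 3, 1, 3, 2, 1, 4, 1, 6, 6]
c = Counter(a)

# most_common() returns (element, count) pairs, highest count first.
pairs = c.most_common()
print(pairs)

# groupby() merges adjacent pairs that share a key, here the count.
# It only looks at neighbours, so the input must already be ordered by count.
groups = [(count, [element for element, _ in items])
          for count, items in groupby(pairs, key=lambda pair: pair[1])]
print(groups)

# enumerate() numbers the groups, replacing a hand-maintained counter.
for i, (count, elements) in enumerate(groups):
    print(i, count, elements)
```

Note that groupby relies on the ordering that most_common already provides; on unsorted input, equal counts that are not adjacent would end up in separate groups.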

My code:

import itertools
from collections import Counter

c = Counter(a)
length = len(set(c.values()))
# most_common() yields (element, count) pairs; group the pairs that share a count
counter_groups = itertools.groupby(c.most_common(), key=lambda x: x[1])
normalisedValueCount = {}
for i, (group_key, items) in enumerate(counter_groups):
    for key, count in items:
        normalisedValueCount[key] = (i+1)/length

Question 4

Shouldn't it be: counter_groups = itertools.groupby(reversed(combinationsCount.most_common()), key = lambda x: x[1])?

Question 5

@Baz, I may be wrong, but I don't think so. most_common returns the elements from the most common to the least common. Of course, you're the best judge of whether that is what you wanted.

Question 6

Yes, but if the items are sorted from most to least common then the for loop section will start with the most common number when i = 0. This will result in the most common values being associated with the smaller normalisedValueCounts ((i+1)/length), which is the opposite of what we want. Thanks again!

Question 7

@Baz, okay, I guess I misread the original code. You're welcome.
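Putting the comment thread together, a sketch of the groupby approach with the ordering reversed might look like the following. It assumes the goal is the weighting described in the question, where the least common values get the smallest weights, and it uses the question's sample list.

```python
from collections import Counter
from itertools import groupby

a = [1, 1, 3, 3, 1, 3, 2, 1, 4, 1, 6, 6]
c = Counter(a)
length = len(set(c.values()))

# Walk the pairs from least to most common so the rarest group gets i == 0.
counter_groups = groupby(reversed(c.most_common()), key=lambda pair: pair[1])

normalisedValueCount = {}
for i, (group_key, items) in enumerate(counter_groups):
    for key, count in items:
        normalisedValueCount[key] = (i + 1) / length

print(normalisedValueCount)
```

With this ordering, 1 maps to 1.0, 3 to 0.75, 6 to 0.5, and 2 and 4 to 0.25, matching the weights described in the question.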

Winston Ewert · Answer 1 · 2011-09-23 22:12:45Z
