Can you suggest how I might make the following code more efficient:
from collections import Counter

a = [1,1,3,3,1,3,2,1,4,1,6,6]
c = Counter(a)
length = len(set(c.values()))
normalisedValueCount = {}
previousCount = 0
i = 0
for key in sorted(c, reverse=True):
    count = c[key]
    if not previousCount == count:
        i = i+1
        previousCount = count
    normalisedValueCount[key] = i/length
print(normalisedValueCount)
It basically gives a dictionary similar to counter, but instead of counting the number of occurrences, it contains a weighting based on the number of occurrences.
- The number 1 is associated with 1.0 (4/length) because it occurs the most often.
- 2 and 4 occur the least often and are associated with the value 1/length.
- 6 is the second least occurring value and is associated with 2/length.
- 3 is the third least occurring value and is associated with 3/length.
Some more examples:
- The list a[1,2,3] results in a normalisedValueCount of 1:1.0, 2:1.0, 3:1.0.
- The list a[2,1,2] results in a normalisedValueCount of 2:1.0, 1:0.5.
- The list a[2,1,2,3] results in a normalisedValueCount of 2:1.0, 1:0.5, 3:0.5.
- The list a[2,2,2,2,2,2,2,2,2,2,1,2,3,3] results in a normalisedValueCount of 2:1.0, 3:0.66666, 1:0.33333.
1 Answer
I have no idea what your code is doing; it doesn't look like any normalization I've seen. I'd offer a more specific suggestion on restructuring if I understood what you are doing.
You:
- Put a in a counter
- Put that counter's values into a set
- Sort the keys of the counter
I'd look for another approach that doesn't involve so much moving around.
    if not previousCount == count:
is better written as
    if previousCount != count:
EDIT
- Counter.most_common returns what you are fetching using sorted(c, reverse=True)
- itertools.groupby allows you to group together common elements nicely (such as the same count)
- enumerate can be used to count over the elements in a list rather than keeping track of the counter
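To see what these pieces return, here is a small sketch run on the list from the question. The names pairs and groups are only illustrative. Note that groupby only merges adjacent items, so it relies on most_common having already put equal counts next to each other.

```python
from collections import Counter
from itertools import groupby

a = [1, 1, 3, 3, 1, 3, 2, 1, 4, 1, 6, 6]

# (value, count) pairs, ordered from the highest count to the lowest
pairs = Counter(a).most_common()

# Collect the values that share a count; groupby only merges neighbours,
# which is fine here because pairs is already ordered by count.
groups = [
    (count, [value for value, _ in items])
    for count, items in groupby(pairs, key=lambda pair: pair[1])
]

# enumerate supplies the running index, so no manual counter is needed
for i, (count, values) in enumerate(groups):
    print(i, count, values)
```

With this input the groups come out in the count order 5, 3, 2, 1, and the last group holds both 2 and 4.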
My code:
from collections import Counter
import itertools

c = Counter(a)
length = len(set(c.values()))
# most_common() yields (value, count) pairs, most frequent first
counter_groups = itertools.groupby(c.most_common(), key=lambda x: x[1])
normalisedValueCount = {}
for i, (group_key, items) in enumerate(counter_groups):
    for key, count in items:
        normalisedValueCount[key] = (i+1)/length
- Shouldn't it be: counter_groups = itertools.groupby(reversed(combinationsCount.most_common()), key = lambda x: x[1])? – Baz, Sep 27, 2011 at 19:55
- @Baz, I may be wrong, but I don't think so. most_common returns the elements from the most common to the least common. Of course you're the best judge of whether that is what you wanted. – Winston Ewert, Sep 27, 2011 at 21:41
- Yes, but if the items are sorted from most to least common then the for loop section will start with the most common number when i = 0. This will result in the most common values being associated with the smaller normalisedValueCounts ((i+1)/length), which is the opposite of what we want. Thanks again! – Baz, Sep 28, 2011 at 18:50
- @Baz, okay, I guess I misread the original code. You're welcome. – Winston Ewert, Sep 28, 2011 at 19:59
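For reference, a minimal sketch of the reversed variant discussed in the comments above, checked against the examples in the question. The function name normalised_value_count and its local names are hypothetical, chosen only for this illustration. It assumes Python 3, where / is true division.

```python
from collections import Counter
from itertools import groupby


def normalised_value_count(values):
    """Weight each value by the rank of its count, rarest group first."""
    counts = Counter(values)
    distinct_counts = len(set(counts.values()))
    # most_common() runs from most to least frequent; reversing it puts
    # the rarest values first so that they receive rank 1.
    rarest_first = reversed(counts.most_common())
    grouped = groupby(rarest_first, key=lambda pair: pair[1])
    result = {}
    for rank, (_, items) in enumerate(grouped, start=1):
        for value, _ in items:
            result[value] = rank / distinct_counts
    return result


print(normalised_value_count([1, 1, 3, 3, 1, 3, 2, 1, 4, 1, 6, 6]))
```

For the list in the question this associates 2 and 4 with 0.25, 6 with 0.5, 3 with 0.75 and 1 with 1.0, which is the weighting the question describes.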