\$\begingroup\$

Can you suggest how I might make the following code more efficient:

from collections import Counter
a = [1,1,3,3,1,3,2,1,4,1,6,6]
c = Counter(a)
length = len(set(c.values()))
normalisedValueCount = {}
previousCount = 0
i = 0
for key in sorted(c, reverse=True):
    count = c[key]
    if not previousCount == count:
        i = i+1
        previousCount = count
    normalisedValueCount[key] = i/length
print(normalisedValueCount)

It basically gives a dictionary similar to a Counter, but instead of the number of occurrences, each key maps to a weighting based on how its number of occurrences ranks among the others.

  • The number 1 is associated with 1.0 (4/length) because it occurs the most often.
  • 2 and 4 occur the least often and are associated with the value 1/length.
  • 6 is the second least occurring value and is associated with 2/length.
  • 3 is the third least occurring value and is associated with 3/length.

Some more examples:

  • The list a = [1,2,3] results in a normalisedValueCount of 1:1.0, 2:1.0, 3:1.0.
  • The list a = [2,1,2] results in a normalisedValueCount of 2:1.0, 1:0.5.
  • The list a = [2,1,2,3] results in a normalisedValueCount of 2:1.0, 1:0.5, 3:0.5.
  • The list a = [2,2,2,2,2,2,2,2,2,2,1,2,3,3] results in a normalisedValueCount of 2:1.0, 3:0.66666, 1:0.33333.
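Read as a dense ranking of the occurrence counts, the behaviour described above can be sketched as follows. This is only one reading of the description: the function name `normalised_value_count` is made up for illustration, and Python 3 true division is assumed.

```python
from collections import Counter


def normalised_value_count(values):
    """Map each value to (rank of its count) / (number of distinct counts)."""
    counts = Counter(values)
    # Distinct counts in ascending order: the rarest count gets rank 1.
    distinct_counts = sorted(set(counts.values()))
    rank = {count: i for i, count in enumerate(distinct_counts, start=1)}
    length = len(distinct_counts)
    return {key: rank[count] / length for key, count in counts.items()}


print(normalised_value_count([1, 1, 3, 3, 1, 3, 2, 1, 4, 1, 6, 6]))
```

Values that share a count share a rank, so ties such as 2 and 4 in the first list end up with the same weighting.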
Jamal
asked Sep 23, 2011 at 20:13
\$\endgroup\$
  • \$\begingroup\$ Your most recent edit changed the contents of the last list but didn't update the normalisedValueCount results. Are those the results you would be expecting? \$\endgroup\$ Commented Sep 25, 2011 at 19:28

1 Answer

\$\begingroup\$

I have no idea what your code is doing; it doesn't look like any normalization I've seen. I'd offer a more specific suggestion on restructuring if I understood what you are doing.

You:

  1. Put a in a counter
  2. Put that counter's values into a set
  3. Sort the keys of the counter

I'd look for another approach that doesn't involve so much moving around.

if not previousCount == count:

better as

if previousCount != count:

EDIT

  1. Counter.most_common returns (element, count) pairs ordered from most to least common, which can stand in for the sorted(c, reverse=True) call
  2. itertools.groupby allows you to group together adjacent elements that share a key (such as the same count)
  3. enumerate can be used to count over the elements in a list rather than keeping track of the counter yourself
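As a quick illustration of what these three tools yield on the question's sample list, here is a small sketch. The variable names `pairs` and `groups` are illustrative only.

```python
from collections import Counter
from itertools import groupby

a = [1, 1, 3, 3, 1, 3, 2, 1, 4, 1, 6, 6]
c = Counter(a)

# most_common() returns (element, count) pairs, highest count first.
pairs = c.most_common()

# groupby() merges *adjacent* pairs that share the same count, and
# enumerate() numbers the resulting groups starting from 0.
groups = [
    (i, count, [key for key, _ in items])
    for i, (count, items) in enumerate(groupby(pairs, key=lambda pair: pair[1]))
]
print(groups)
```

Because groupby only merges neighbouring items, it relies on the pairs already being ordered by count, which most_common provides.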

My code:

import itertools
from collections import Counter

c = Counter(a)
length = len(set(c.values()))
# Group the (key, count) pairs from most_common() by their count.
counter_groups = itertools.groupby(c.most_common(), key=lambda x: x[1])
normalisedValueCount = {}
for i, (group_key, items) in enumerate(counter_groups):
    for key, count in items:
        normalisedValueCount[key] = (i+1)/length
answered Sep 23, 2011 at 22:12
\$\endgroup\$
  • \$\begingroup\$ Shouldn't it be: counter_groups = itertools.groupby(reversed(combinationsCount.most_common()), key = lambda x: x[1])? \$\endgroup\$ Commented Sep 27, 2011 at 19:55
  • \$\begingroup\$ @Baz, I may be wrong, but I don't think so. most_common returns the elements from the most common to the least common. Of course, you're the best judge of whether that is what you wanted. \$\endgroup\$ Commented Sep 27, 2011 at 21:41
  • \$\begingroup\$ Yes, but if the items are sorted from most to least common then the for loop section will start with the most common number when i = 0. This will result in the most common values being associated with the smaller normalisedValueCounts ((i+1)/length), which is the opposite of what we want. Thanks again! \$\endgroup\$ Commented Sep 28, 2011 at 18:50
  • \$\begingroup\$ @Baz, okay, I guess I misread the original code. You're welcome. \$\endgroup\$ Commented Sep 28, 2011 at 19:59
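
Following the point raised in these comments, here is a sketch of the groupby approach with the order reversed so that the least common values receive the smallest weights. It assumes Python 3 (true division), and the wrapper function `normalise_with_groupby` is a name introduced here for illustration.

```python
from collections import Counter
from itertools import groupby


def normalise_with_groupby(values):
    c = Counter(values)
    length = len(set(c.values()))
    # reversed() walks most_common() from least to most common, so the
    # rarest count forms group 0 and the most common count the last group.
    counter_groups = groupby(reversed(c.most_common()), key=lambda x: x[1])
    result = {}
    for i, (_count, items) in enumerate(counter_groups):
        for key, _ in items:
            result[key] = (i + 1) / length
    return result


print(normalise_with_groupby([2, 1, 2, 3]))
```

With the reversal in place, the last group (the most common count) gets index length - 1, so its weight is length/length = 1.0, as in the question's examples.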
