I wrote a solution to the leetcode problem Sort characters by frequency and the solution passes 34/35 test cases. On the last test case, there is a "time limit exceeded" error with an input string of 100000+ "a" and "b" characters. How can the following code be optimized to handle an input of that size?
def frequencySort(s):
"""
:type s: str
:rtype: str
"""
freq = {}
for i in s:
if i in freq:
freq[i] +=1
else:
freq[i] = 1
output = ""
newDict = sorted(freq.items(), key=lambda kv: kv[1],reverse = True)
for k,v in newDict:
for i in range(v):
output += k
return output
3 Answers 3
The "set or increment dictionary value" logic
freq = {}
for i in s:
if i in freq:
freq[i] +=1
else:
freq[i] = 1
can be simplified using a defaultdict
:
from collections import defaultdict
# ...
freq = defaultdict(int)
for i in s:
freq[i] += 1
But instead of creating, updating and sorting your own dictionary you can
use the built-in Counter
class:
A counter tool is provided to support convenient and rapid tallies.
and its most_common()
method:
Return a list of the n most common elements and their counts from the most common to the least.
That simplifies the code to
from collections import Counter
def frequencySort(s):
"""
:type s: str
:rtype: str
"""
counter = Counter(s)
output = "".join(char * freq for char, freq in counter.most_common())
return output
-
1\$\begingroup\$ I'm new here, and don't yet know the local customs of codereview.SE. Would you normally also make PEP8 improvements by putting the function name in lower case and adding another newline before the def? \$\endgroup\$Oddthinking– Oddthinking2018年09月04日 02:31:15 +00:00Commented Sep 4, 2018 at 2:31
-
2\$\begingroup\$ @Oddthinking: Yes, and PEP8 improvements are in fact part of many answers to Python questions on CR. In this particular case, the function name is given by the LeetCode submission template, so that OP cannot change it. (But it would still be a valid point for a review. You can write an answer if you want.) – The missing newline before the def was my fault (there is no import statement in the original code). \$\endgroup\$Martin R– Martin R2018年09月04日 06:24:08 +00:00Commented Sep 4, 2018 at 6:24
-
\$\begingroup\$
join
is a bit faster when provided with alist
rather than agenerator
, so I would write"".join([...)
. Other than that, excellent answer, +1. \$\endgroup\$Ma0– Ma02018年09月04日 07:33:58 +00:00Commented Sep 4, 2018 at 7:33 -
\$\begingroup\$ If you prefer functional style you can also use
itertools.starmap
andoperator.mul
(orstr.__mul__
):return "".join(starmap(operator.mul, Counter(s).most_common()))
. \$\endgroup\$David Foerster– David Foerster2018年09月04日 09:26:39 +00:00Commented Sep 4, 2018 at 9:26
The section:
for i in range(v):
output += k
Can be rewritten as:
output += k*v
So that characters can be appended to the output in chunks this is much faster then doing it character by character
I think your code is getting a timeout because of an O(n^2) on the last part.
for k,v in newDict:
for i in range(v):
output += k
The solution below maps every item in the ordered dict object to an ordered tuple, which is then much easier to iterate:
def solution(st):
dict_ = {}
for s in st:
if s in dict_:
dict_[s] += 1
else:
dict_[s] = 1
items = sorted(dict_.items(), key=lambda kv: kv[1], reverse=True)
string = ""
for item in items:
If item[1] != 0:
string = string + item[0]*item[1]
return string
-
1\$\begingroup\$ Welcome to CodeReview@SE. I agree with your problem analysis. Did you digest the older answers? \$\endgroup\$greybeard– greybeard2020年05月18日 09:03:28 +00:00Commented May 18, 2020 at 9:03
Explore related questions
See similar questions with these tags.
output += k
--- this rebuilds the string for each additional character as strings are immutable in Python (see Shlemiel the painter's algorithm). You're better off using''.join(chars)
, wherechars
is a list or generator. \$\endgroup\$