This is the question I tried Leetcode: String Compression
Given an array of characters chars, compress it using the following algorithm:
Begin with an empty string s. For each group of consecutive repeating characters in chars:
If the group's length is 1, append the character to s. Otherwise, append the character followed by the group's length. The compressed string s should not be returned separately, but instead, be stored in the input character array chars. Note that group lengths that are 10 or longer will be split into multiple characters in chars.
After you are done modifying the input array, return the new length of the array.
You must write an algorithm that uses only constant extra space.
I tried to solve this solution in O(n) time complexity. Even though I did it, it is a lot slower.
def compress(self, chars: List[str]) -> int:
i = 0
while i < len(chars):
count = 0
for char in chars[i:]:
if char != chars[i]:
break
count += 1
if count != 1:
list_appending = list(str(count))
chars[i+1:i+count] = list_appending
i += 1 + len(list_appending)
else:
i += 1
return len(chars)
Please can you give the reason why my solution is not so fast?? Why is my O(n) solution for Leetcode (443) String Compression not optimal?
-
1\$\begingroup\$ It is not \$O(n)\$. List slicing itself is linear in the length of the slice. \$\endgroup\$vnp– vnp2021年09月01日 16:57:34 +00:00Commented Sep 1, 2021 at 16:57
-
\$\begingroup\$ Pardon me for that I didn't notice. \$\endgroup\$ravinteja chilukamari– ravinteja chilukamari2021年09月02日 06:32:45 +00:00Commented Sep 2, 2021 at 6:32
1 Answer 1
It's not \$O(n)\$ but \$O(n^2)\$, because:
chars[i:]
takes \$O(n)\$ time and space. You could instead move a second indexj
forwards until the last occurrence and then compute the length asj - i + 1
.chars[i+1:i+count] = list_appending
takes \$O(n)\$ time. Instead of replacing allcount
equal characters, better replace exactly as many as you need for the digits. Then the remaining characters don't get moved and it's \$O(1)\$ (amortized).
You shrink the list. I'd say you're not supposed to. If you were supposed to shrink the list, why would they want you to return the list length? I think you're supposed to write the compressed data into a prefix of the list and return where that compressed data ends. Like they requested in an earlier problem.
Converting str(count)
to a list is unnecessary. And I'd find digits
a more meaningful name than list_appending
.
PEP 8 suggests to write chars[i+1:i+count]
as chars[i+1 : i+count]
(and I agree).
This is essentially a job for itertools.groupby
, here's a solution using that:
def compress(self, chars: List[str]) -> int:
def compress():
for c, g in groupby(chars):
yield c
k = countOf(g, c)
if k > 1:
yield from str(k)
for i, chars[i] in enumerate(compress()):
pass
return i + 1
Just for fun a bad but also fast regex solution:
def compress(self, chars: List[str]) -> int:
chars[:] = re.sub(
r'(.)1円+',
lambda m: m[1] + str(len(m[0])),
''.join(chars))
return len(chars)
-
\$\begingroup\$ I really appreciate your efforts. And thanks for the information I would follow rule about clean coding too. But still that code ran so slow only about beating 20% users. Can I know why?? \$\endgroup\$ravinteja chilukamari– ravinteja chilukamari2021年09月02日 06:30:00 +00:00Commented Sep 2, 2021 at 6:30
-
\$\begingroup\$ It's not much slower than the faster ones, if at all. Look at the time distribution graph. And LeetCode's timing is unstable. How often did you submit it, and what were the individual results? \$\endgroup\$no comment– no comment2021年09月02日 15:34:52 +00:00Commented Sep 2, 2021 at 15:34