This is the Kindergarten Excursion problem from kattis.com:
The kindergarten teachers had finally managed to get all the kids in a line for the walk to the bus station and the weekly excursion. What they hadn’t thought of was the fact that today different kids were supposed to go to different excursions. They should walk as one line to the bus station, but to avoid total chaos when they arrive there, all the kids going to the zoo should be in the beginning of the line, the ones going to the lake should be in the middle and the rest, going to the science museum, should be in the end.
Since it takes a lot of time to get all the kids to stand in a line no kid may step out of the line. To get the line organized after excursion group, kids standing next to each other can swap places. Now the kindergarten teachers wonder, if they will have time to do this and still catch their bus.
Task
You will be given a sequence of numbers 0, 1, and 2, denoting the excursion destinations of each kid from first to last in the line. Pairs of adjacent kids can swap positions, and the line should be organized after destination number starting with 0 and ending with 2. What is the minimum number of swaps required to organize the line?
Input
The only line of input contains a string consisting of characters 0, 1 or 2, denoting the destinations of kids. The length of the string will be at most 1,000,000 characters.
Output
Output one line with one integer – the minimum number of swaps needed to get the kids in order.
Sample Input 1
10210
Sample Output 1
5
This is my code. My solution is too slow and I need help with optimization or a different approach for this problem.
def excursion(str):
digits = [0, 0, 0]
x = 0
for i in range(len(str)):
for j in range((int(str[i])) + 1, 3):
x += digits[j]
digits[int(str[i])] += 1
return x
print(excursion(input()))
2 Answers 2
The code in the post runs in linear time, so we can't hope to improve the asymptotic complexity, but we might be able to make some piecewise optimizations. Let's start by making a random test case:
>>> import random
>>> CASE = ''.join(random.choice('012') for _ in range(10**6))
Timing the performance on this test case establishes a baseline for any improvements.
>>> from timeit import timeit
>>> timeit(lambda:excursion(CASE), number=1)
1.1623761800583452
The expression
int(str[i]))
appears twice, so let's remember it in a local variable:def excursion2(str): digits = [0, 0, 0] result = 0 for i in range(len(str)): digit = int(str[i]) for j in range(digit + 1, 3): result += digits[j] digits[digit] += 1 return result
This saves about 25% of the runtime:
>>> timeit(lambda:excursion2(CASE), number=1) 0.867930110078305
The code only uses the index
i
to look up the corresponding character ofstr
. So instead of iterating over the indexes, iterate over the characters themselves:def excursion3(str): digits = [0, 0, 0] result = 0 for c in str: digit = int(c) for j in range(digit + 1, 3): result += digits[j] digits[digit] += 1 return result
This saves about 3%:
>>> timeit(lambda:excursion3(CASE), number=1) 0.8365003759972751
We can save an assignment statement by using
map
to apply the functionint
to the characters ofstr
:def excursion4(str): digits = [0, 0, 0] result = 0 for digit in map(int, str): for j in range(digit + 1, 3): result += digits[j] digits[digit] += 1 return result
This saves about 4%:
>>> timeit(lambda:excursion4(CASE), number=1) 0.7819818791467696
We could cache the range objects, to avoid rebuilding them on each loop:
def excursion5(str): n = 3 digits = [0] * n ranges = [range(i + 1, n) for i in range(n)] result = 0 for digit in map(int, str): for j in ranges[digit]: result += digits[j] digits[digit] += 1 return result
This saves about 25%:
>>> timeit(lambda:excursion5(CASE), number=1) 0.497486931970343
The cumulative impact of these changes reduced the overall runtime by about 60%. Is that good enough to pass the time limit?
Since you know that there will only be 3 distinct values, you should use bucket sort. We will store the counts in a dictionary, and then build the string at the end to avoid the cost of repeated concatenation. This code is both simpler and faster.
def excursion(string):
digits = {'0':0, '1':0, '2':0}
for char in string:
digits[char] += 1
return '0'*digits['0'] + '1'*digits['1'] + '2'*digits['2']
n=0b10000001000000
print(len(excursion('2'*n+'1'*n+'0'*n)))
-
1\$\begingroup\$ The challenge is not to perform the sorting, but to count the number of swaps involved in a bubble sort. (If you wanted to do a counting sort,
itertools.Counter
would have resulted in an even simpler implementation.) \$\endgroup\$200_success– 200_success2018年02月05日 20:44:09 +00:00Commented Feb 5, 2018 at 20:44
Explore related questions
See similar questions with these tags.