4
\$\begingroup\$

I'm attempting to solve the problem "Synchronizing Lists" on Kattis.com.

In a nutshell, the problem involves being given two lists: the lists are then to be sorted smallest-to-largest, correspondingly indexed elements paired together, then the second list is output, in the original order of the corresponding elements of the first list.

While functional, my code runs afoul of the time limit when pitted against the second test case, after I submit it.

import sys
class Correspondence():
 def __init__(self, num1, num2, ID):
 self.num1 = num1
 self.num2 = num2
 self.ID = ID
def SynchronizeLists(case_size):
 list1 = []
 list2 = []
 correspondences = []
 results = []
 for i in range(case_size):
 list1.append(int(sys.stdin.readline()))
 for i in range(case_size):
 list2.append(int(sys.stdin.readline()))
 sorted_list1 = list1[:]
 sorted_list1.sort()
 sorted_list2 = list2[:]
 sorted_list2.sort()
 for i in range(len(sorted_list1)):
 correspondence = Correspondence(sorted_list1[i], sorted_list2[i], i)
 correspondences.append(correspondence)
 for i in list1:
 results.append([correspondence.num2 for correspondence in correspondences if correspondence.num1 == i][0])
 return results
output = []
case_size = int(sys.stdin.readline())
while case_size != 0:
 output.extend(SynchronizeLists(case_size))
 output.append('')
 case_size = int(sys.stdin.readline())
del output[-1]
for i in output:
 print(i)

cProfile tells me this:

 126 function calls in 1.270 seconds
 Ordered by: standard name
 ncalls tottime percall cumtime percall filename:lineno(function)
 2 0.000 0.000 0.000 0.000 <string>:11(SynchronizeLists)
 1 0.000 0.000 1.269 1.269 <string>:3(<module>)
 11 0.000 0.000 0.000 0.000 <string>:33(<listcomp>)
 1 0.000 0.000 0.000 0.000 <string>:5(Correspondence)
 11 0.000 0.000 0.000 0.000 <string>:6(__init__)
 2 0.000 0.000 0.000 0.000 codecs.py:318(decode)
 2 0.000 0.000 0.000 0.000 codecs.py:330(getstate)
 2 0.000 0.000 0.000 0.000 {built-in method _codecs.utf_8_decode}
 1 0.000 0.000 0.000 0.000 {built-in method builtins.__build_class__}
 1 0.001 0.001 1.270 1.270 {built-in method builtins.exec}
 2 0.000 0.000 0.000 0.000 {built-in method builtins.len}
 12 0.000 0.000 0.000 0.000 {built-in method builtins.print}
 46 0.000 0.000 0.000 0.000 {method 'append' of 'list' objects}
 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
 2 0.000 0.000 0.000 0.000 {method 'extend' of 'list' objects}
 25 1.269 0.051 1.269 0.051 {method 'readline' of '_io.TextIOWrapper' objects}
 4 0.000 0.000 0.000 0.000 {method 'sort' of 'list' objects}

I'm not worried about exec, as I suspect that's simply a consequence of how I'm using cProfile. It seems like my biggest time sink involves input: readline(), to be precise.

I've already tried swapping out all calls to readline() for calls to input(), but the results barely changed.

How can I speed this up?

asked Jun 27, 2018 at 17:52
\$\endgroup\$
0

2 Answers 2

4
\$\begingroup\$

Performance

The slowness comes from this bit:

for i in list1:
 results.append([correspondence.num2 for correspondence in correspondences if correspondence.num1 == i][0])

What do you think is the time complexity of this operation?

\$O(n^2)\$

The for i in list1 loop is a linear search, and the list comprehension inside the loop is also a linear search.

You can improve this by changing the logic, for example:

  • From list1, create a pair of values with the current indexes
  • Sort the list of pairs by the values
  • Sort list2 (or a copy of it)
  • Create a list of the target size
  • For i in 0 .. case_size, get the index to set from the sorted pairs, and the value to set from the sorted list2

Something like this:

pairs = zip(list1, range(case_size))
sorted_pairs = sorted(zip(range(case_size), list1), key=lambda p: p[1])
sorted_list2 = sorted(list2)
results = [0] * case_size
for i, p in enumerate(sorted_pairs):
 results[p[0]] = sorted_list2[i]

What is the time complexity of this operation?

\$O(n)\$ -- a linear loop with random accesses into arrays

What is the overall time complexity of the solution?

\$O(n \log n)\$ -- the slowest operation is the sorting of lists

Style

Instead of this:

for i in range(case_size):
 list1.append(int(sys.stdin.readline()))

You can use a list comprehension:

list1 = [int(sys.stdin.readline()) for _ in range(case_size)]

Similarly, in the earlier example I replaced the two-step list copy and sorting with a single call to sorted.


It's not recommended to have code automatically executing in global scope, it's better to wrap call code in an if __name__ == '__main__':, that way the code can be imported and run as needed, and unit tested.

answered Jun 27, 2018 at 19:31
\$\endgroup\$
3
\$\begingroup\$

As an alternative to accessing into an array, we can do what the word correspondence already suggests, namely define a mapping of values in list 1 to values in list 2. This assumes that this mapping is unique, so that no value in list 1 is repeated (which is guaranteed in the problem description).

Then we can simply do:

def synchronize(l1, l2):
 mapping = dict(zip(sorted(l1), sorted(l2)))
 return [mapping[x] for x in l1]

This will in total have a runtime of \$ \mathcal{O}(N\log N + N)\$ (due to having to sort both lists and then traversing one of them).

At least for the larger of the two example inputs, this is even faster:

In [24]: l1, l2
Out[24]: ([98, 23, 61, 49, 1, 79, 9], [1, 15, 32, 47, 68, 39, 24])
In [25]: %timeit synchronize(l1, l2)
2.75 μs ± 57.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [26]: %timeit synchronize_janos(l1, l2)
4.56 μs ± 64.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

For larger inputs the other answer by @janos dominates, though:

enter image description here

Both single-handedly beat your original approach, though:

enter image description here

(Note the log-scale on the y-axis and the shorter range in x so it actually runs in under a minute.)


As for the input function, I would use something simple like this:

def main():
 while True:
 n = int(input())
 if not n:
 break
 l1 = [int(input()) for _ in range(n)]
 l2 = [int(input()) for _ in range(n)]
 for x in synchronize(l1, l2):
 print(x)
 print()
answered Jun 27, 2018 at 19:47
\$\endgroup\$
1
  • \$\begingroup\$ Beautifully done! \$\endgroup\$ Commented Jun 28, 2018 at 20:15

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.