Synchronizing Lists

Question 1

I'm attempting to solve the problem "Synchronizing Lists" on Kattis.com.

In a nutshell, the problem involves being given two lists: the lists are then to be sorted smallest-to-largest, correspondingly indexed elements paired together, then the second list is output, in the original order of the corresponding elements of the first list.

While functional, my code runs afoul of the time limit when pitted against the second test case, after I submit it.

import sys
class Correspondence():
 def __init__(self, num1, num2, ID):
 self.num1 = num1
 self.num2 = num2
 self.ID = ID
def SynchronizeLists(case_size):
 list1 = []
 list2 = []
 correspondences = []
 results = []
 for i in range(case_size):
 list1.append(int(sys.stdin.readline()))
 for i in range(case_size):
 list2.append(int(sys.stdin.readline()))
 sorted_list1 = list1[:]
 sorted_list1.sort()
 sorted_list2 = list2[:]
 sorted_list2.sort()
 for i in range(len(sorted_list1)):
 correspondence = Correspondence(sorted_list1[i], sorted_list2[i], i)
 correspondences.append(correspondence)
 for i in list1:
 results.append([correspondence.num2 for correspondence in correspondences if correspondence.num1 == i][0])
 return results
output = []
case_size = int(sys.stdin.readline())
while case_size != 0:
 output.extend(SynchronizeLists(case_size))
 output.append('')
 case_size = int(sys.stdin.readline())
del output[-1]
for i in output:
 print(i)

cProfile tells me this:

 126 function calls in 1.270 seconds
 Ordered by: standard name
 ncalls tottime percall cumtime percall filename:lineno(function)
 2 0.000 0.000 0.000 0.000 <string>:11(SynchronizeLists)
 1 0.000 0.000 1.269 1.269 <string>:3(<module>)
 11 0.000 0.000 0.000 0.000 <string>:33(<listcomp>)
 1 0.000 0.000 0.000 0.000 <string>:5(Correspondence)
 11 0.000 0.000 0.000 0.000 <string>:6(__init__)
 2 0.000 0.000 0.000 0.000 codecs.py:318(decode)
 2 0.000 0.000 0.000 0.000 codecs.py:330(getstate)
 2 0.000 0.000 0.000 0.000 {built-in method _codecs.utf_8_decode}
 1 0.000 0.000 0.000 0.000 {built-in method builtins.__build_class__}
 1 0.001 0.001 1.270 1.270 {built-in method builtins.exec}
 2 0.000 0.000 0.000 0.000 {built-in method builtins.len}
 12 0.000 0.000 0.000 0.000 {built-in method builtins.print}
 46 0.000 0.000 0.000 0.000 {method 'append' of 'list' objects}
 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
 2 0.000 0.000 0.000 0.000 {method 'extend' of 'list' objects}
 25 1.269 0.051 1.269 0.051 {method 'readline' of '_io.TextIOWrapper' objects}
 4 0.000 0.000 0.000 0.000 {method 'sort' of 'list' objects}

I'm not worried about exec, as I suspect that's simply a consequence of how I'm using cProfile. It seems like my biggest time sink involves input: readline(), to be precise.

I've already tried swapping out all calls to readline() for calls to input(), but the results barely changed.

How can I speed this up?

Question 2

Performance

The slowness comes from this bit:

for i in list1:
 results.append([correspondence.num2 for correspondence in correspondences if correspondence.num1 == i][0])

What do you think is the time complexity of this operation?

\$O(n^2)\$

The for i in list1 loop is a linear search, and the list comprehension inside the loop is also a linear search.

You can improve this by changing the logic, for example:

From list1, create a pair of values with the current indexes
Sort the list of pairs by the values
Sort list2 (or a copy of it)
Create a list of the target size
For i in 0 .. case_size, get the index to set from the sorted pairs, and the value to set from the sorted list2

Something like this:

pairs = zip(list1, range(case_size))
sorted_pairs = sorted(zip(range(case_size), list1), key=lambda p: p[1])
sorted_list2 = sorted(list2)
results = [0] * case_size
for i, p in enumerate(sorted_pairs):
 results[p[0]] = sorted_list2[i]

What is the time complexity of this operation?

\$O(n)\$ -- a linear loop with random accesses into arrays

What is the overall time complexity of the solution?

\$O(n \log n)\$ -- the slowest operation is the sorting of lists

Style

Instead of this:

for i in range(case_size):
 list1.append(int(sys.stdin.readline()))

You can use a list comprehension:

list1 = [int(sys.stdin.readline()) for _ in range(case_size)]

Similarly, in the earlier example I replaced the two-step list copy and sorting with a single call to sorted.

It's not recommended to have code automatically executing in global scope, it's better to wrap call code in an if __name__ == '__main__':, that way the code can be imported and run as needed, and unit tested.

Question 3

As an alternative to accessing into an array, we can do what the word correspondence already suggests, namely define a mapping of values in list 1 to values in list 2. This assumes that this mapping is unique, so that no value in list 1 is repeated (which is guaranteed in the problem description).

Then we can simply do:

def synchronize(l1, l2):
 mapping = dict(zip(sorted(l1), sorted(l2)))
 return [mapping[x] for x in l1]

This will in total have a runtime of \$ \mathcal{O}(N\log N + N)\$ (due to having to sort both lists and then traversing one of them).

At least for the larger of the two example inputs, this is even faster:

In [24]: l1, l2
Out[24]: ([98, 23, 61, 49, 1, 79, 9], [1, 15, 32, 47, 68, 39, 24])
In [25]: %timeit synchronize(l1, l2)
2.75 μs ± 57.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [26]: %timeit synchronize_janos(l1, l2)
4.56 μs ± 64.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

For larger inputs the other answer by @janos dominates, though:

enter image description here

Both single-handedly beat your original approach, though:

enter image description here

(Note the log-scale on the y-axis and the shorter range in x so it actually runs in under a minute.)

As for the input function, I would use something simple like this:

def main():
 while True:
 n = int(input())
 if not n:
 break
 l1 = [int(input()) for _ in range(n)]
 l2 = [int(input()) for _ in range(n)]
 for x in synchronize(l1, l2):
 print(x)
 print()

Question 4

Beautifully done!

janos janos 113k15 gold badges154 silver badges396 bronze badges · Accepted Answer · 2018-06-27 19:31:49Z

Performance

The slowness comes from this bit:

for i in list1:
 results.append([correspondence.num2 for correspondence in correspondences if correspondence.num1 == i][0])

What do you think is the time complexity of this operation?

\$O(n^2)\$

The for i in list1 loop is a linear search, and the list comprehension inside the loop is also a linear search.

You can improve this by changing the logic, for example:

From list1, create a pair of values with the current indexes
Sort the list of pairs by the values
Sort list2 (or a copy of it)
Create a list of the target size
For i in 0 .. case_size, get the index to set from the sorted pairs, and the value to set from the sorted list2

Something like this:

pairs = zip(list1, range(case_size))
sorted_pairs = sorted(zip(range(case_size), list1), key=lambda p: p[1])
sorted_list2 = sorted(list2)
results = [0] * case_size
for i, p in enumerate(sorted_pairs):
 results[p[0]] = sorted_list2[i]

What is the time complexity of this operation?

\$O(n)\$ -- a linear loop with random accesses into arrays

What is the overall time complexity of the solution?

\$O(n \log n)\$ -- the slowest operation is the sorting of lists

Style

Instead of this:

for i in range(case_size):
 list1.append(int(sys.stdin.readline()))

You can use a list comprehension:

list1 = [int(sys.stdin.readline()) for _ in range(case_size)]

Similarly, in the earlier example I replaced the two-step list copy and sorting with a single call to sorted.

It's not recommended to have code automatically executing in global scope, it's better to wrap call code in an if __name__ == '__main__':, that way the code can be imported and run as needed, and unit tested.

Stack Exchange Network

Synchronizing Lists

2 Answers 2

Performance

Style

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Synchronizing Lists

2 Answers 2

Performance

Style

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions