1
\$\begingroup\$

I am trying to implement a recursive count of inversions in a list of integers by using merge sort and counting the number of split inversions on each recursion. My code works but runs painfully slowly above a list of about 100 integers, and I need to count for ~3 orders of magnitude greater. Can anyone see what I'm doing wrong that makes it so slow?

inversions = np.loadtxt("inversion_text.txt", int).tolist()
def sort_count_inversions(X):
 n = len(X)
 # For subproblems of size 1 there are no inversions and no sorting required
 if n == 1:
 return 0, X
 a = X[:n//2]
 b = X[n//2:]
 # Count the inversions in subproblems
 # Get back the count of inversions and the sorted subproblems
 count_a, sorted_a = sort_count_inversions(a)
 count_b, sorted_b = sort_count_inversions(b)
 # Count the split inversions between sorted sub problems
 # Get back the count of split inversions and the the merge-sorted subproblems (sorted X)
 split_count, a_b = merge_count_split_inversions(sorted_a,sorted_b)
 # return the count of inversions in X and merge-sorted X
 count = split_count + count_a + count_b
 return count, a_b
def merge_count_split_inversions(X, Y):
 n = len(X) + len(Y)
 j = 0
 k = 0
 X_Y = []
 count_split = 0
 # Iterate through X and Y comparing value by value
 for i in range(n):
 # If we have reached the end of X or Y, there are no more inversions so append the remaining values to X_Y
 if j == len(X):
 X_Y.extend(Y[k:])
 elif k == len(Y):
 X_Y.extend(X[j:])
 # Add the smaller of the two compared elements to X_Y
 # If adding from Y, increment count of split inversions by the number of remaining elements in X
 elif X[j] < Y[k]: 
 X_Y.append(X[j])
 j += 1
 else:
 X_Y.append(Y[k])
 k += 1
 count_split += len(X) - j
 # Return the total count of split inversions and the result of merge_sorting X and Y (X_Y) 
 return count_split, X_Y 
sort_count_inversions(inversions)
200_success
145k22 gold badges190 silver badges478 bronze badges
asked May 15, 2018 at 13:35
\$\endgroup\$
3
  • 1
    \$\begingroup\$ Happy to take a look, but have you profiled your code yet? \$\endgroup\$ Commented May 15, 2018 at 13:43
  • \$\begingroup\$ Could you provide a sample of inversions? \$\endgroup\$ Commented May 15, 2018 at 13:46
  • 1
    \$\begingroup\$ You say the algorithm works, but the sorted result that gets returned is much larger than the original input array. Whatever is causing that might also be the source of the slowness. \$\endgroup\$ Commented May 15, 2018 at 13:52

1 Answer 1

2
\$\begingroup\$

There's a bug in merge_count_split_inversions.

 for i in range(n):
 if j == len(X):
 add len(Y)-k elements to X_Y
 elif k == len(Y):
 add len(X)-j elements to X_Y
 elif X[j] < Y[k]:
 add 1 element to X_Y
 j += 1
 else:
 add 1 element to X_Y
 k += 1

This should have the invariant that after the ith time round the loop it has added i elements to X_Y. Either the first two cases need to add a single element, or they need to break.

answered May 15, 2018 at 14:25
\$\endgroup\$
2
  • \$\begingroup\$ Great thanks for the spot, all runs fine and quickly now :-) \$\endgroup\$ Commented May 16, 2018 at 17:28
  • \$\begingroup\$ @Felix I have rolled back Rev 4 → 2. Please see What to do when someone answers . \$\endgroup\$ Commented May 16, 2018 at 17:46

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.