Recursive merge-sort inversion count algorithm very slow

Question 1

I am trying to implement a recursive count of inversions in a list of integers by using merge sort and counting the number of split inversions on each recursion. My code works but runs painfully slowly above a list of about 100 integers, and I need to count for ~3 orders of magnitude greater. Can anyone see what I'm doing wrong that makes it so slow?

inversions = np.loadtxt("inversion_text.txt", int).tolist()
def sort_count_inversions(X):
 n = len(X)
 # For subproblems of size 1 there are no inversions and no sorting required
 if n == 1:
 return 0, X
 a = X[:n//2]
 b = X[n//2:]
 # Count the inversions in subproblems
 # Get back the count of inversions and the sorted subproblems
 count_a, sorted_a = sort_count_inversions(a)
 count_b, sorted_b = sort_count_inversions(b)
 # Count the split inversions between sorted sub problems
 # Get back the count of split inversions and the the merge-sorted subproblems (sorted X)
 split_count, a_b = merge_count_split_inversions(sorted_a,sorted_b)
 # return the count of inversions in X and merge-sorted X
 count = split_count + count_a + count_b
 return count, a_b
def merge_count_split_inversions(X, Y):
 n = len(X) + len(Y)
 j = 0
 k = 0
 X_Y = []
 count_split = 0
 # Iterate through X and Y comparing value by value
 for i in range(n):
 # If we have reached the end of X or Y, there are no more inversions so append the remaining values to X_Y
 if j == len(X):
 X_Y.extend(Y[k:])
 elif k == len(Y):
 X_Y.extend(X[j:])
 # Add the smaller of the two compared elements to X_Y
 # If adding from Y, increment count of split inversions by the number of remaining elements in X
 elif X[j] < Y[k]: 
 X_Y.append(X[j])
 j += 1
 else:
 X_Y.append(Y[k])
 k += 1
 count_split += len(X) - j
 # Return the total count of split inversions and the result of merge_sorting X and Y (X_Y) 
 return count_split, X_Y 
sort_count_inversions(inversions)

Question 2

Happy to take a look, but have you profiled your code yet?

Question 3

Could you provide a sample of inversions?

Question 4

You say the algorithm works, but the sorted result that gets returned is much larger than the original input array. Whatever is causing that might also be the source of the slowness.

Question 5

There's a bug in merge_count_split_inversions.

 for i in range(n):
 if j == len(X):
 add len(Y)-k elements to X_Y
 elif k == len(Y):
 add len(X)-j elements to X_Y
 elif X[j] < Y[k]:
 add 1 element to X_Y
 j += 1
 else:
 add 1 element to X_Y
 k += 1

This should have the invariant that after the ith time round the loop it has added i elements to X_Y. Either the first two cases need to add a single element, or they need to break.

Question 6

Great thanks for the spot, all runs fine and quickly now :-)

Question 7

@Felix I have rolled back Rev 4 → 2. Please see What to do when someone answers .

score 2 · Accepted Answer · 2018-05-15 14:25:40Z

There's a bug in merge_count_split_inversions.

 for i in range(n):
 if j == len(X):
 add len(Y)-k elements to X_Y
 elif k == len(Y):
 add len(X)-j elements to X_Y
 elif X[j] < Y[k]:
 add 1 element to X_Y
 j += 1
 else:
 add 1 element to X_Y
 k += 1

This should have the invariant that after the ith time round the loop it has added i elements to X_Y. Either the first two cases need to add a single element, or they need to break.

Great thanks for the spot, all runs fine and quickly now :-)
@Felix I have rolled back Rev 4 → 2. Please see What to do when someone answers .

Stack Exchange Network

Recursive merge-sort inversion count algorithm very slow

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Recursive merge-sort inversion count algorithm very slow

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions