4
\$\begingroup\$

I've become a bit rusty in Java. This is an attempt at brushing up my skills.

In computer science, merge sort is an efficient, general-purpose, comparison-based sorting algorithm.

Questions

  • How can I improve my test cases? (TestNG)
  • I've used Java-11 here, can I use better idioms?
  • Can this implementation improved for performance?
  • Can style / readability improved?

Code

Sorting.java

import java.util.Arrays;
import java.util.Comparator;
import java.util.Objects;
public class Sorting {
 /**
 * Day 08 - Merge Sort
 *
 * @param data data array
 * @param comparator comparator of given data type
 * @param ascending sort order
 * @param <T> data type
 * @return sorted array (in place)
 */
 public static <T> T[] mergeSort(T[] data, Comparator<T> comparator, boolean ascending) {
 Objects.requireNonNull(data);
 Objects.requireNonNull(comparator);
 if (data.length <= 1) return data;
 var actualComparator = ascending ? comparator : comparator.reversed();
 return mergeSort(data, actualComparator, 0, data.length - 1);
 }
 private static <T> T[] mergeSort(T[] data, Comparator<T> comparator, int start, int stop) {
 if (start >= stop) return data;
 int pivot = start + (stop - start) / 2;
 mergeSort(data, comparator, start, pivot);
 mergeSort(data, comparator, pivot + 1, stop);
 return merge(data, comparator, start, pivot, stop);
 }
 private static <T> T[] merge(T[] data, Comparator<T> comparator, int start, int pivot, int stop) {
 T[] left = Arrays.copyOfRange(data, start, pivot + 1);
 T[] right = Arrays.copyOfRange(data, pivot + 1, stop + 1);
 int lPos = 0;
 int rPos = 0;
 int pos = start;
 while (lPos < left.length && rPos < right.length) {
 if (comparator.compare(left[lPos], right[rPos]) <= 0) {
 data[pos++] = left[lPos++];
 } else {
 data[pos++] = right[rPos++];
 }
 }
 while (lPos < left.length) {
 data[pos++] = left[lPos++];
 }
 while (rPos < right.length) {
 data[pos++] = right[rPos++];
 }
 return data;
 }
}

TestMergeSorting.java

import org.testng.Assert;
import org.testng.annotations.Test;
public class TestMergeSorting {
 @Test
 public void testMergeSort() {
 Assert.assertEquals(asc(6, 5, 9, 3, 2, 1), ar(1, 2, 3, 5, 6, 9));
 Assert.assertEquals(asc(1, 1, 1, 1, 2, 1), ar(1, 1, 1, 1, 1, 2));
 Assert.assertEquals(asc(2, 1, 1, 1, 1, 2, 1), ar(1, 1, 1, 1, 1, 2, 2));
 Assert.assertEquals(asc(1, 1, 1, 1, 1, 2, 1), ar(1, 1, 1, 1, 1, 1, 2));
 Assert.assertEquals(asc(3, 2, 1), ar(1, 2, 3));
 Assert.assertEquals(asc(1, 2, 3), ar(1, 2, 3));
 Assert.assertEquals(asc(2, 0, -1, 2, 1), ar(-1, 0, 1, 2, 2));
 Assert.assertEquals(dsc(1, 2, 3), ar(3, 2, 1));
 Assert.assertEquals(dsc(3, 2, 1), ar(3, 2, 1));
 Assert.assertEquals(dsc(2, 0, -1, 2, 1), ar(2, 2, 1, 0, -1));
 Assert.assertEquals(asc(2, 1), ar(1, 2));
 Assert.assertEquals(asc(1), ar(1));
 Assert.assertEquals(asc(), ar());
 Assert.assertEquals(dsc(1), ar(1));
 Assert.assertEquals(dsc(), ar());
 }
 @Test(expectedExceptions = {NullPointerException.class})
 public void testNullExcept() {
 Sorting.mergeSort(null, Integer::compare, true);
 }
 private static Integer[] ar(Integer... objects) {
 return objects;
 }
 private static Integer[] asc(Integer... objects) {
 return Sorting.mergeSort(objects, Integer::compare, true);
 }
 private static Integer[] dsc(Integer... objects) {
 return Sorting.mergeSort(objects, Integer::compare, false);
 }
}
asked Dec 27, 2019 at 15:39
\$\endgroup\$

3 Answers 3

4
\$\begingroup\$

Naming

Prefer to use conventional names. That is, the same names used by the Arrays class in the Java standard library:

  • data => a
  • comparator => c
  • start => fromIndex
  • stop => toIndex

I've never seen the center element in merge sort being referred to as the pivot. I think a more fitting name is mid.

Inclusive start, exclusive end

It is better to use inclusive start indices and exclusive stop indices, matching the convention established by Arrays.sort and many other array functions in the Java standard library.

This way, if you make your other mergeSort overload public:

public static <T> void mergeSort(T[] a,
 int fromIndex, int toIndex,
 Comparator<T> c) {

It becomes very easy for people to use it because the signature matches the Arrays.sort(T[], int, int, Comparator) function.

Garbage generation

The two calls to Arrays.copyOfRange in merge creates temporary arrays which causes unnecessary garbage collection pressure. It is more efficient to preallocate a temporary buffer and then pass that one around.

I've rewritten your merge function to use such a temporary buffer. It makes the code a little more complex, but I think it is worth it.

Modified code with comments removed:

import java.util.Arrays;
import java.util.Comparator;
import java.util.Objects;
public class Sorting {
 public static <T> void mergeSort(T[] a,
 Comparator<? super T> c,
 boolean ascending) {
 Comparator<? super T> actualComparator = ascending ? c : c.reversed();
 mergeSort(a, 0, a.length, actualComparator);
 }
 public static <T> void mergeSort(T[] a,
 int fromIndex, int toIndex,
 Comparator<? super T> c) {
 Objects.requireNonNull(a);
 Objects.requireNonNull(c);
 if (fromIndex > toIndex) {
 throw new IllegalArgumentException(
 "fromIndex(" + fromIndex + ") > toIndex(" + toIndex + ")");
 }
 if (fromIndex < 0) {
 throw new ArrayIndexOutOfBoundsException(fromIndex);
 }
 if (toIndex > a.length) {
 throw new ArrayIndexOutOfBoundsException(toIndex);
 }
 T[] buf = Arrays.copyOf(a, a.length);
 mergeSortInternal(a, fromIndex, toIndex, c, buf);
 }
 private static <T> void mergeSortInternal(T[] a,
 int fromIndex, int toIndex,
 Comparator<? super T> c,
 T[] buf) {
 if (toIndex - fromIndex <= 1)
 return;
 int mid = fromIndex + (toIndex - fromIndex) / 2;
 mergeSortInternal(a, fromIndex, mid, c, buf);
 mergeSortInternal(a, mid, toIndex, c, buf);
 merge(a, fromIndex, mid, toIndex, c, buf);
 }
 private static <T> void merge(T[] a,
 int fromIndex, int mid, int toIndex,
 Comparator<? super T> c,
 T[] buf) {
 System.arraycopy(a, fromIndex, buf, fromIndex, toIndex - fromIndex);
 int lPos = fromIndex;
 int rPos = mid;
 int lEnd = mid;
 int rEnd = toIndex;
 int pos = fromIndex;
 while (lPos < lEnd && rPos < rEnd) {
 if (c.compare(buf[lPos], buf[rPos]) <= 0) {
 a[pos++] = buf[lPos++];
 } else {
 a[pos++] = buf[rPos++];
 }
 }
 while (lPos < lEnd) {
 a[pos++] = buf[lPos++];
 }
 while (rPos < rEnd) {
 a[pos++] = buf[rPos++];
 }
 }
}
answered Dec 28, 2019 at 2:16
\$\endgroup\$
5
  • \$\begingroup\$ Are you really proposing to rename the perfectly fine comparator to the meaningless c because in one implementation of the standard library they did the same? I assume he should also introduce the age-old overflow error they had for over a decade in the standard library as well? \$\endgroup\$ Commented Dec 28, 2019 at 20:05
  • \$\begingroup\$ Yes. One letter names are fine if it is clear what they mean. \$\endgroup\$ Commented Dec 28, 2019 at 21:17
  • \$\begingroup\$ I like your answer but I don't like a and c :) \$\endgroup\$ Commented Dec 28, 2019 at 22:26
  • \$\begingroup\$ @Björn So what does a stand for then? \$\endgroup\$ Commented Dec 28, 2019 at 23:01
  • \$\begingroup\$ a is the array to merge sort. \$\endgroup\$ Commented Dec 28, 2019 at 23:18
6
\$\begingroup\$

One thing I noticed, the sorting routine is modifying the array and returning the array. I would suggest either make the functions void or create a new array to do the merge in. A sort function with a void return type automatically tells the user that the sort is in place. Where as, a return type of an array, tells the user that a new array is created, but that the original array isn't touched.

answered Dec 28, 2019 at 0:12
\$\endgroup\$
2
\$\begingroup\$
  • If you are going to provide more than one variant of the method, then I recommend that you provide every possible variant. The indices are there or not (and defaults to the whole array). The comparator is optional (and defaults to the natural comparator). The ascending flag is optional (and defaults to true). Eight variants, or 16 if you want copying and in-place versions.
  • I believe the "right" array is unneeded. The original data is can't be overwritten.
  • It is worth remembering that below some size, bubble sort (or variations thereof) can be the fastest sort! (This may not matter as an exercise in brushing up skills, but it you want to actually use it...)
answered Dec 28, 2019 at 3:20
\$\endgroup\$
1
  • \$\begingroup\$ I think newer versions of Java comes with timsort which is better. :) This is only an exercise anyway 🥳 \$\endgroup\$ Commented Dec 28, 2019 at 16:05

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.