I want to merge two sorted lists of Integers but this is not the general mergesort case because the same number may appear in both lists. (However, each number can only appear once in a particular list.)
The easiest way I've found to do that is (mergeWithSet method):
- Add all numbers from list one to a treeSet
- Add all numbers from list two to the treeSet
- Create a new List from the treeSet
It works but I think the efficiency would be around O(n+m+(n+m)log(n+m)), i.e. O(Nlog(N)) and I think the this problem can be solved in O(n+m) taking advantage of the fact that both lists are already sorted. A quick change in the general case of merge works (mergeWithGet method) but in order to practice Iterators' use, I've written a mergeWithIterator method that only uses Iterators.
The question is:
- mergeWithGet is a neat method. Why is mergeWithIterator so complicated? Am I missing something?
- Should I see a big difference in efficiency for huge lists or in this scenario the three algorithms behave more or less the same?
- Any Java 8 stream solution? Just to check if parallelism works in this case
Thank in advance and any suggestion will be very welcome. The three methods work fine. The code:
import java.util.*;
public class MergeLists {
public static List<Integer> mergeWithIterator(List<Integer> list1, List<Integer> list2) {
List<Integer> sorted = new ArrayList<>();
Iterator<Integer> itList1 = list1.iterator();
Iterator<Integer> itList2 = list2.iterator();
Integer currentList1 = itList1.next();
Integer currentList2 = itList2.next();
while (itList1.hasNext() && itList2.hasNext()) {
if (currentList1 > currentList2) {
sorted.add(currentList2);
currentList2 = itList2.next();
} else if (currentList1 < currentList2) {
sorted.add(currentList1);
currentList1 = itList1.next();
} else {
sorted.add(currentList1);
currentList1 = itList1.next();
currentList2 = itList2.next();
}
}
//one (or both) of the currents have the last number of their list
//Special Case: Both lists have the same size
if (!itList1.hasNext() && !itList2.hasNext()) {
if (currentList1 > currentList2) {
sorted.add(currentList2);
sorted.add(currentList1);
} else if (currentList1 < currentList2) {
sorted.add(currentList1);
sorted.add(currentList2);
} else {
sorted.add(currentList1);
}
return sorted;
}
//General case:One list is longer than the other: we add from the longer till the number is
//greater than the one that is last in the other list, then the last and the rest of the longer list
Iterator<Integer> itLongestList;
Integer lastFromShortest;
Integer lastFromLongest;
if (itList1.hasNext()) {
itLongestList = itList1;
lastFromShortest = currentList2;
lastFromLongest = currentList1;
} else {
itLongestList = itList2;
lastFromShortest = currentList1;
lastFromLongest = currentList2;
}
while (lastFromLongest < lastFromShortest && itLongestList.hasNext()) {
sorted.add(lastFromLongest);
lastFromLongest = itLongestList.next();
}
if (lastFromShortest < lastFromLongest) {
sorted.add(lastFromShortest);
sorted.add(lastFromLongest);
} else if (lastFromShortest > lastFromLongest) {
sorted.add(lastFromLongest);
sorted.add(lastFromShortest);
} else {
sorted.add(lastFromShortest);
}
while (itLongestList.hasNext()) {
sorted.add(itLongestList.next());
}
return sorted;
}
public static List<Integer> mergeWithGet(List<Integer> list1, List<Integer> list2) {
List<Integer> sorted = new ArrayList<>();
int i = 0;
int j = 0;
while (i < list1.size() && j < list2.size()) {
if (list1.get(i) < list2.get(j))
sorted.add(list1.get(i++));
else if (list1.get(i) > list2.get(j)) {
sorted.add(list2.get(j++));
} else {
sorted.add(list1.get(i));
i++;
j++;
}
}
// Store remaining elements of first list
while (i < list1.size())
sorted.add(list1.get(i++));
// Store remaining elements of second list
while (j < list2.size())
sorted.add(list2.get(j++));
return sorted;
}
public static List<Integer> mergeWithSet(List<Integer> list1, List<Integer> list2) {
Set<Integer> sortedSet = new TreeSet<>();
sortedSet.addAll(list1);
sortedSet.addAll(list2);
return new ArrayList<>(sortedSet);
}
public static void main(String[] args) {
List<Integer> list1 = Arrays.asList(-30, 0, 10, 20, 40, 50);
List<Integer> list2 = Arrays.asList(0, 5, 15, 20, 40);
//0, 20 and 40 should appear only once in the result
System.out.println(mergeWithGet(list1, list2));
System.out.println(mergeWithIterator(list1, list2));
System.out.println(mergeWithSet(list1, list2));
}
}
i
3 Answers 3
I took a first try to shorten the iterator implementation while trying to keep it readeable. The difficulty I think comes in part in that with an iterator we actually consume elements, so you can not arbitrarily index elements. I came up with something like this:
public static List<Integer> mergeWithIterator(List<Integer> list1, List<Integer> list2) {
List<Integer> sorted = new ArrayList<>();
Iterator<Integer> iterator1 = list1.iterator();
Iterator<Integer> iterator2 = list2.iterator();
Integer element1 = null;
Integer element2 = null;
while (iterator1.hasNext() && iterator2.hasNext()) {
if (element1 == null) {
element1 = iterator1.next();
}
if (element2 == null) {
element2 = iterator2.next();
}
if (element1 < element2) {
sorted.add(element1);
element1 = null;
} else if (element1 > element2) {
sorted.add(element2);
element2 = null;
} else {
sorted.add(element1);
element1 = null;
element2 = null;
}
}
if (element1 != null) {
sorted.add(element1);
}
if (element2 != null) {
sorted.add(element2);
}
while (iterator1.hasNext()) {
sorted.add(iterator1.next());
}
while (iterator2.hasNext()) {
sorted.add(iterator2.next());
}
return sorted;
}
This of course assumes null
is an invalid element of your list (it would currently blow up anyway). You could even make some inline checks in order to remove the add remaining things after the loop, but I'd wager it will hinder the readeability. EDIT: tried it nonetheless:
public static List<Integer> mergeWithIterator(List<Integer> list1, List<Integer> list2) {
List<Integer> sorted = new ArrayList<>();
Iterator<Integer> iterator1 = list1.iterator();
Iterator<Integer> iterator2 = list2.iterator();
Integer element1 = iterator1.hasNext() ? iterator1.next() : null;
Integer element2 = iterator2.hasNext() ? iterator2.next() : null;
while (!(element1 == null && element2 == null)) {
if (element2 == null || (element1 != null && element1 < element2)) {
sorted.add(element1);
element1 = iterator1.hasNext() ? iterator1.next() : null;
} else if (element1 == null || (element2 != null && element2 < element1)) {
sorted.add(element2);
element2 = iterator2.hasNext() ? iterator2.next() : null;
} else {
sorted.add(element1);
element1 = iterator1.hasNext() ? iterator1.next() : null;
element2 = iterator2.hasNext() ? iterator2.next() : null;
}
}
return sorted;
}
A simple (but possibly naive) streams implementation could be:
public static List<Integer> mergeWithStreams(List<Integer> list1, List<Integer> list2) {
return Stream.concat(list1.stream(), list2.stream())
.sorted()
.distinct()
.collect(toList());
}
As a sidenote, you you could make the methods more generic by defining the types of the elements as implementing the Comparable
interface. This of course could invalidate the iterator implementation using null
.
About performance, I think you should measure. For sufficiently large lists the iterator implementation might indeed get that extra bit of speed, as it's complexity is indeed linear w.r.t. the inputs.
For streaming and parallel, I do not think I have enough knowledge to answer that. My guess is that it would depend a lot on the format of the input, but in general would not be an improvement performance wise. Measure, measure, measure! It definitely is an improvement readeability wise though.
Maybe an other nice way would be to return an implementation of a list using given lists (or copies of them) as the backing collections. I'd imagine the implementation would be quite simple (depending on requirements).
-
\$\begingroup\$ Tried much the same using a
SaneIterator<E>
providingE current()
. \$\endgroup\$greybeard– greybeard2018年03月03日 21:28:19 +00:00Commented Mar 3, 2018 at 21:28
Here's the neatest solution poor Java iterators allowed me:
import java.util.*;
import java.util.function.BiFunction;
public class SortedListMerger<T> {
public static void main(String[] args) {
Integer[] l1 = new Integer[] {1, 3, 5};
Integer[] l2 = new Integer[] {1, 3, 4};
SortedListMerger<Integer> m = new SortedListMerger<>((a, b)-> a * b, Comparator.comparingInt(a -> a));
ArrayList<Integer> d = new ArrayList<>();
m.merge(Arrays.asList(l1), Arrays.asList(l2), d);
for (int i : d)
System.out.println(i);
}
private BiFunction<T, T, T> ifEqual;
private Comparator<T> cmp;
public SortedListMerger(BiFunction<T, T, T> ifEqual, Comparator<T> cmp) {
this.ifEqual = ifEqual;
this.cmp = cmp;
}
public void merge(Iterable<T> src1, Iterable<T> src2, List<T> dest) {
Iterator<T> src1i = src1.iterator();
Iterator<T> src2i = src2.iterator();
T elem1 = src1i.hasNext() ? src1i.next() : null;
T elem2 = src2i.hasNext() ? src2i.next() : null;
T prev = null;
T curr;
while (elem1 != null || elem2 != null) {
if (elem1 != null && elem2 != null) {
curr = cmp.compare(elem1, elem2) <= 0 ? elem1 : elem2;
if (curr == elem1) {
elem1 = src1i.hasNext() ? src1i.next() : null;
} else {
elem2 = src2i.hasNext() ? src2i.next() : null;
}
} else if (elem1 != null) {
curr = elem1;
elem1 = src1i.hasNext() ? src1i.next() : null;
} else {
curr = elem2;
elem2 = src2i.hasNext() ? src2i.next() : null;
}
if (prev != null && cmp.compare(prev, curr) == 0) {
prev = ifEqual.apply(prev, curr);
} else {
dest.add(curr);
prev = curr;
}
}
}
}
the main method, merge, is only 30 lines
As an alternative to saner handling of Iterator
having a next()
, but no current()
method, consider procedural decomposition:
/** @return as accurate an approximation of the number of elements
* as can be expected to be near zero cost. */
static int cheapSize(@SuppressWarnings("rawtypes")
java.util.Collection c) {
return c instanceof java.util.RandomAccess ? c.size()
: c.isEmpty() ? 0 : 1;
}
/** Merges elements of ordered lists <code>list1</code> and
* <code>lista</code> keeping only one of every pair comparing equal.
* @return the merged list,
* or one of the original lists if the other was empty */
public static <T extends Comparable<T>> List<? extends T>
mergeWithIterator(List<T> list1, List<T> lista) {
final int n1, na;
if ((n1 = cheapSize(list1)) <= 0)
return lista;
if ((na = cheapSize(lista)) <= 0)
return list1;
return merge(list1.iterator(), lista.iterator(),
new java.util.ArrayList<>(n1+na) //, Math.min(n1, na)
);
}
/** Appends <code>next</code> and all elements iterated by
* <code>tail</code> to <code>head</code>.
* @return <code>head</code> */
static <T extends Comparable<T>> List<? extends T>
addAll(List<T> head, T next, Iterator<T> tail) {
head.add(next);
return addAll(head, tail);
}
/** Appends all elements iterated by <code>tail</code> to <code>head</code>.
* @return <code>head</code> */
static <T extends Comparable<T>> List<? extends T>
addAll(List<T> head, Iterator<T> tail) {
while (tail.hasNext())
head.add(tail.next());
return head;
}
/** Merges elements iterated by <code>it1</code> and
* <code>ita</code> keeping only one of every pair comparing equal.
* @return the merged list */
static <T extends Comparable<T>> List<? extends T>
merge(Iterator<T> ita, Iterator<T> it1, List<T>sorted//, int n
) {
T itema = ita.next(),
item1 = it1.next();
while (true) {
int cmp = itema.compareTo(item1);
if (0 < cmp) {
sorted.add(item1);
if (!it1.hasNext())
return addAll(sorted, itema, ita);
item1 = it1.next();
} else {
sorted.add(itema);
if (!ita.hasNext())
return 0 == cmp ? addAll(sorted, it1)
: addAll(sorted, item1, it1);
itema = ita.next();
if (0 == cmp) {
if (!it1.hasNext())
return addAll(sorted, itema, ita);
item1 = it1.next();
}
}
}
}
Explore related questions
See similar questions with these tags.
list1 = [1]
andlist2 = [1, 1]
. \$\endgroup\$Collection
. \$\endgroup\$