Here is a method that takes two lists (l1
and l2
) and pairs up elements that "match". As you can see at the end of the code block these lists consist of matching elements but in different indexes. The method pairUp
outputs matching elements paired up, and discards those without a pair.
I spent way too much time writing this method, and it feels clumsy and complex, not scala idiomatic.
How could I have done this simpler? and could I make it faster?
case class A(serial:Int) {
def matches(b:B):Boolean = this.serial == b.serial
}
case class B(serial:Int)
import scala.annotation.tailrec
@tailrec def pairUp(
list1:List[A],
list2:List[B],
i:Int=0,
pairs:List[(A, B)]=List.empty[(A,B)]
):List[(A,B)] = {
if (list1.isEmpty || list2.isEmpty) return pairs
else {
if (i == list2.length) // this list1 element has no match
return pairUp(
list1.tail, // so discard
list2,
0, // reset counter
pairs)
else {
if (list1.head matches list2.head) // these elements match
return pairUp(
list1.tail,
list2.tail,
0,
pairs :+ ((list1.head, list2.head)))
else
return pairUp(
list1,
list2.tail :+ list2.head, // list2 element to back
i + 1,
pairs)
}
}
}
val l1 = List(0, 1, 2, 3, 4, 5).map(A(_))
val l2 = List(1, 3, 0, 2, 4).map(B(_))
val pairs = pairUp(l1, l2)
// List((A(0),B(0)), (A(1),B(1)), (A(2),B(2)), (A(3),B(3)), (A(4),B(4)))
println(pairs)
-
1\$\begingroup\$ Simpler or faster, pick one. :) \$\endgroup\$Daniel C. Sobral– Daniel C. Sobral2014年01月30日 21:49:23 +00:00Commented Jan 30, 2014 at 21:49
5 Answers 5
I am not an expert at scala, but this works for me
def pairUp2( list1: List[A], list2: List[B]): List[(A, B)] = {
(for{
a <- list1
b <- list2
if a.serial == b.serial
} yield (a,b))
}
val pairs = pairUp2(l1, l2)
pairs: List[(A, B)] = List((A(0),B(0)), (A(1),B(1)), (A(2),B(2)), (A(3),B(3)), (A(4),B(4)))
The for comprehension, takes each element of list1 and an element of list2 and yields a tuple only if the serial values match.
-
\$\begingroup\$ Simplicity of this is refreshing. However, I'm trying to wrap my head around how to tell which method provides more speed and/or lower footprint. \$\endgroup\$Dominykas Mostauskis– Dominykas Mostauskis2014年01月29日 08:37:28 +00:00Commented Jan 29, 2014 at 8:37
-
\$\begingroup\$ @DominykasMostauskis This solution is O(nm), where n and m are the respective sizes of list1 and list2. \$\endgroup\$Daniel C. Sobral– Daniel C. Sobral2014年01月30日 22:40:33 +00:00Commented Jan 30, 2014 at 22:40
Here's a solution that is fairly compact and fast for large lists, but returns the two disordered:
def pairUp(list1: List[A], list2: List[B]): List[(A, B)] = {
val g1 = list1.groupBy(_.serial).toList
val g2 = list2.groupBy(_.serial)
g1.flatMap{ case (k,as) => g2.get(k).toList.flatMap(as zip _) }
}
If you want to keep them ordered, then it's a bit more bookkeeping:
def pairUp2(list1: List[A], list2: List[B]): List[(A, B)] = {
val g1 = list1.zipWithIndex.groupBy(_._1.serial).toList
val g2 = list2.groupBy(_.serial)
val temp = g1.flatMap{ case (k,as) =>
g2.get(k).toList.flatMap(as zip _)
}
temp.sortBy(_._1._2).map{ case ((a,_), b) => (a,b) }
}
Note that the direct O(n^2)
search will be faster if n
is small.
A version that's probably no faster but slightly less verbose. I'm sure there's a much better way but it's a start:
case class A(serial: Int) {
def matches(b: B): Boolean = this.serial == b.serial
}
case class B(serial: Int)
def pairUp2(list1: List[A], list2: List[B]): List[(A,B)] = {
def pairInc(l1: List[A], l2: List[B], pairs: List[(A,B)]): List[(A,B)] = l1 match {
case a :: rest => l2.indexWhere(b => a.matches(b)) match {
case -1 => pairInc(rest, l2, pairs)
case i => pairInc(rest, l2.patch(i, List.empty[B], 1), pairs :+ (a, l2(i)))
}
case Nil => pairs
}
pairInc(list1, list2, List.empty)
}
val l1 = List(0, 1, 2, 3, 4, 5).map(A)
val l2 = List(1, 3, 0, 2, 4).map(B)
val pairs = pairUp2(l1, l2, matchFn)
// List((A(0),B(0)), (A(1),B(1)), (A(2),B(2)), (A(3),B(3)), (A(4),B(4)))
println(pairs)
This assumes that duplicate values in one list but not the other will only pair once. Otherwise you could use foldLeft, like so:
def pairUp3(list1: List[A], list2: List[B]): List[(A,B)] = {
list1.foldLeft(List.empty[(A,B)]) { case (p, a) =>
list2.find(b => a.matches(b)) match {
case Some(b) => p :+ (a, b)
case None => p
}
}
}
If you first sort the lists, matching only takes a single traversal, making the solution O(n log(n)) (which is the time complexity of the sorts). I've also generalized over the lists' element types, only requiring projection functions proj*
that can convert A
and B
into some common type C
that can then be ordered (for sorting) and compared with ==
.
import scala.annotation.tailrec
def zipMatching[A, B, C](a: List[A], b: List[B], projA: A => C, projB: B => C)
(implicit ord: Ordering[C]): List[(A, B)] = {
import ord._ // Get access to `>`
@tailrec
def zipSortedMatching(
aSorted: List[A], bSorted: List[B], accum: List[(A, B)]
): List[(A, B)] = {
(aSorted, bSorted) match {
case (aHead +: aRest, bHead +: bRest) =>
val compA = projA(aHead)
val compB = projB(bHead)
if (compA == compB)
zipSortedMatching(aRest, bRest, accum :+ (aHead, bHead))
else if (compA > compB)
zipSortedMatching(aSorted, bRest, accum)
else
zipSortedMatching(aRest, bSorted, accum)
case _ =>
accum
}
}
zipSortedMatching(a.sortBy(projA), b.sortBy(projB), Nil)
}
case class AA(serial: Int)
case class BB(serial: Int)
val l1 = List(0, 1, 2, 3, 4, 5).map(AA(_))
val l2 = List(1, 3, 0, 2, 4).map(BB(_))
def projAA(a: AA) = a.serial
def projBB(b: BB) = b.serial
val pairs = zipMatching(l1, l2, projAA, projBB)
// pairs: List[(AA, BB)] = List((AA(0),BB(0)), (AA(1),BB(1)), (AA(2),BB(2)), (AA(3),BB(3)), (AA(4),BB(4)))
A small comment: it's awkward that matches
is a member function of A
and not B
. It would be cleaner to have matches
as an individual function.
-
\$\begingroup\$ This may be a valuable comment, but it should be posted as one. I.e. this doesn't qualify as an answer. \$\endgroup\$Dominykas Mostauskis– Dominykas Mostauskis2017年08月13日 09:56:31 +00:00Commented Aug 13, 2017 at 9:56
-
\$\begingroup\$ @Dominykas A code review is basically just a list of comments. My list has only one item. \$\endgroup\$toto2– toto22017年08月13日 16:45:14 +00:00Commented Aug 13, 2017 at 16:45
-
\$\begingroup\$ Fair enough, I retract my critique. However, SE doesn't let me retract my downvote, unless the answer is edited. \$\endgroup\$Dominykas Mostauskis– Dominykas Mostauskis2017年08月14日 16:52:31 +00:00Commented Aug 14, 2017 at 16:52