
Wednesday, March 10, 2010

How to reverse a map

Suppose you wish to take a map and swap its keys with its values. The Stack Overflow question on an elegant way to reverse a map in Scala offers some good suggestions:

    scala> val nodupes = Map(1 -> "a", 2 -> "b", 3 -> "c")
    nodupes: scala.collection.immutable.Map[Int,java.lang.String] = Map((1,a), (2,b), (3,c))
    // Scala 2.8+
    scala> nodupes map {_.swap}
    res4: scala.collection.immutable.Map[java.lang.String,Int] = Map((a,1), (b,2), (c,3))
    // Scala 2.7
    scala> Map() ++ (nodupes map {case (k,v) => (v,k)})
    res5: scala.collection.immutable.Map[java.lang.String,Int] = Map((a,1), (b,2), (c,3))
    // watch out: if the values contain duplicates you will lose information
    scala> val dupes = Map(1 -> "a", 2 -> "b", 3 -> "b")
    dupes: scala.collection.immutable.Map[Int,java.lang.String] = Map((1,a), (2,b), (3,b))
    scala> dupes map {_.swap}
    res6: scala.collection.immutable.Map[java.lang.String,Int] = Map((a,1), (b,3))
    // a way to not lose any data
    scala> dupes groupBy {_._2} map {case (key,value) => (key, value.unzip._1)}
    res12: scala.collection.Map[java.lang.String,scala.collection.immutable.Iterable[Int]] = Map((a,List(1)), (b,List(2, 3)))
    // I wanted to do the following for performance:
    scala> dupes.view groupBy {_._2} map {case (key,value) => (key, value.unzip._1)}
    java.lang.UnsupportedOperationException: IterableView((1,a), (2,b), (3,b)).newBuilder
    at scala.collection.TraversableViewLike$class.newBuilder(TraversableViewLike.scala:40)
    at scala.collection.IterableLike$$anon$1.newBuilder(IterableLike.scala:363)
    at scala.collection.TraversableLike$$anonfun$groupBy$1.apply(TraversableLike.scala:370)
    // but as you can see, a view cannot yet be grouped. Perhaps in the future.
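
The two approaches above can be wrapped up as small helpers. This is only a sketch, assuming a recent Scala version (2.13 or 3) where groupBy on an immutable Map returns an immutable Map; the object and method names (InvertMap, invertUnique, invertAll) are made up for illustration and are not part of the standard library.

```scala
object InvertMap {
  // Swap keys and values. If two keys share a value, only one of them
  // survives, just as in the `dupes map {_.swap}` example.
  def invertUnique[K, V](m: Map[K, V]): Map[V, K] =
    m.map(_.swap)

  // Group the entries by value and keep every key that pointed at it,
  // so no information is lost when values are duplicated.
  def invertAll[K, V](m: Map[K, V]): Map[V, Set[K]] =
    m.groupBy(_._2).map { case (value, entries) => value -> entries.keySet }
}
```

With the dupes map from the transcript, InvertMap.invertAll(dupes) gives Map(a -> Set(1), b -> Set(2, 3)). Returning a Set rather than a List is a design choice here: the keys of a map are unique and unordered anyway.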

Friday, October 30, 2009

Groupby - collection processing

Iterator and Iterable provide most of the methods you need when working with collections. fold, map and filter are probably the most common, but other very useful methods include grouped/groupBy, sliding, find, forall, foreach, and many more. In this topic I want to cover Iterable's groupBy method.

groupBy is available in Scala 2.8 and later. It is similar to partition in that it divides (or partitions) a collection. partition takes a function that returns a Boolean and splits the collection in two depending on the result. groupBy takes a function that returns an arbitrary object and returns a Map whose keys are the function's return values and whose values are the elements that produced each key. This allows an arbitrary number of partitions to be made from the collection.
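
To make the difference concrete, here is a minimal side-by-side sketch. The object name PartitionVsGroupBy and the sample numbers are only for illustration.

```scala
object PartitionVsGroupBy {
  val nums = List(1, 2, 3, 4, 5, 6)

  // partition: a Boolean test, so there are always exactly two results
  val (evens, odds) = nums.partition(_ % 2 == 0)

  // groupBy: the function may return any key, so the number of groups
  // depends on how many distinct keys it produces (three here)
  val byRemainder = nums.groupBy(_ % 3)
}
```

Here evens is List(2, 4, 6) and odds is List(1, 3, 5), while byRemainder maps 0 to List(3, 6), 1 to List(1, 4) and 2 to List(2, 5).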

Here is the method signature:
    def groupBy[K](f : (A) => K) : Map[K, This]

A bit of context is required to understand the three type parameters A, K and This. The method is defined in TraversableLike, a supertype of the collections (I will briefly discuss this in the next topic). TraversableLike takes two type parameters: the type of the collection and the type contained in the collection. Therefore, in this method definition, 'This' refers to the collection type (List, for example) and A refers to the contained type (perhaps Int). Finally, K refers to the type returned by the function; its values are the keys of the groups formed by the method.
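
One way to see what 'This' means in practice is to annotate the result types explicitly. The sketch below assumes Scala 2.13 or later, where groupBy returns an immutable Map; the object name GroupByTypes is hypothetical.

```scala
object GroupByTypes {
  // A = Int, K = Boolean, This = List[Int]
  val fromList: Map[Boolean, List[Int]] = List(1, 2, 3).groupBy(_ > 1)

  // Same A and K, but This = Vector[Int]: each group keeps the collection type
  val fromVector: Map[Boolean, Vector[Int]] = Vector(1, 2, 3).groupBy(_ > 1)

  // A = Char, K = Char, This = String: the groups of a String are Strings
  val fromString: Map[Char, String] = "banana".groupBy(identity)
}
```

For example, fromString maps 'b' to "b", 'a' to "aaa" and 'n' to "nn". If the annotations did not match the collection being grouped, the code would not compile.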

Examples:
    scala> val groups = (1 to 20).toList groupBy {
         | case i if(i<5) => "g1"
         | case i if(i<10) => "g2"
         | case i if(i<15) => "g3"
         | case _ => "g4"
         | }
    groups: scala.collection.Map[java.lang.String,List[Int]] = Map(g1 -> List(1, 2, 3, 4), g2 -> List(5, 6, 7, 8, 9), g3 -> List(10, 11, 12, 13, 14), g4 -> List(15, 16, 17, 18, 19, 20))
    scala> groups.keySet
    res6: scala.collection.Set[java.lang.String] = Set(g1, g2, g3, g4)
    scala> groups("g1")
    res7: List[Int] = List(1, 2, 3, 4)
    scala> val mods = (1 to 20).toList groupBy ( _ % 4 )
    mods: scala.collection.Map[Int,List[Int]] = Map(1 -> List(1, 5, 9, 13, 17), 2 -> List(2, 6, 10, 14, 18), 3 -> List(3, 7, 11, 15, 19), 0 -> List(4, 8, 12, 16, 20))
    scala> mods.keySet
    res9: scala.collection.Set[Int] = Set(1, 2, 3, 0)
    scala> mods(1)
    res11: List[Int] = List(1, 5, 9, 13, 17)