First steps in Scala: does this look idiomatic?

Question 1

I an a newcomer in the Scala world, and I would like some advice from more experienced people to find out whether what I am writing goes in the direction of idiomatic code.

In particular, I have to write a simple algorithm that will do the following. We are given an integer number of, say, candies to be distributed, and a list of proportions, which are also integers. We should split the candies into groups, such that the proportions are respected as accurately as possible, given that a single candy cannot be broken up.

For instance, given 17 candies to distribute in proportions of (3, 2, 5) we should obtain (5, 4, 8).

Here is my code, including a quick ScalaCheck test.

import org.scalacheck.Prop._
import org.scalacheck.Gen
object Distribution {
 private class IntWithDivision(val self: Int) {
 def @/(other: Int): Float = self.toFloat / other
 }
 private implicit def divisibleInt(int: Int) = new IntWithDivision(int)
 def distribute(amount: Int, proportions: Seq[Int]): Seq[Int] = {
 val sum = proportions.sum
 val byDifect = proportions map { x => (amount * x @/ sum).toInt }
 val approximations = (byDifect, proportions).zipped map { (x, y) => x @/ y }
 val lowValuesIndices = (
 approximations.view.zipWithIndex
 sortBy { _._1 }
 take (amount - byDifect.sum)
 map { _._2 }
 ).force
 val remainders = (
 approximations.view.zipWithIndex
 map { x => if (lowValuesIndices contains x._2) 1 else 0 }
 ).force
 (byDifect, remainders).zipped map { _ + _ }
 }
 def main(args: Array[String]) {
 val pos = Gen.choose(1, 50000)
 val posList = Gen.containerOf[List, Int](pos)
 val propSum = forAll(pos, posList) { (amount: Int, proportions: List[Int]) =>
 (proportions.size > 0) ==> (amount == distribute(amount, proportions).sum)
 }
 propSum.check
 }
}

In particular, I would like advice on the following points:

The algorithm goes over the list of proportions and derived lists a few times. I would like to use more laziness, but using proportions.view at the beginning and then returning a force gives a type error.
The code is more convoluted than I would like. I think it may be simplified knowing the Collections API better.

Any help or suggestion is welcomed.

Question 2

Arguably it should be 5, 3, 9 -- the second group has a smaller piece than the last one, so it should be the last one to get it. What's the criteria for distributing remains?

Question 3

Well, the criterion I have adopted is to compute the ratio between the original proportions and the computed proportions and minimize the variability (max - min).

Question 4

In practice this means that I distribute the candies computing the exact proportion and rounding by difect. Then I distribute the leftovers starting from the kid for which the ratio assignedCandies/assignedProportion is lowest, and repeat. It is easy to see that in this way, the no kid will get more than one leftover, so I directly compute those kids having the lowest value and assing one candy to all of them.

Question 5

I hope now the algorithm makes a little more sense.

Question 6

Ok, I'm changing the answer now that I understand what you are doing.

The main problem here is @/ -- while Scala people, in general, don't mind special operators, they don't add operators just because they can either. You can replace @/ with the existing / just by adding .toFloat to any one of the terms.

Views aren't often used either, and it's important to have a very good understanding of how they work if you are going to use them, and it's not that easy to gain performance with them, since the machinery they use to support non-strictness is quite heavy, and not everything takes advantage of it. For example, sortBy will create a new collection before take and map are applied.

Views can gain when you have many mapping/slicing steps, and few elements of it are ever used. Most of the time, iterators will gain you much more performance, at the cost of the mutability problems iterators have.

If you want to reduce the number of times you iterate through the list of proportions, there's at least one place where you can simplify:

 val remainders = (
 approximations.view.zipWithIndex
 map { x => if (lowValuesIndices contains x._2) 1 else 0 }
 ).force
 (byDifect, remainders).zipped map { _ + _ }

Can be reduced to

 byDifect.zipWithIndex map {
 case (bd, i) => bd + (if (lowValuesIndices contains i) 1 else 0)
 }

One could also keep byDifect a Float, then either use it alone when computing approximation (instead of zipping stuff), or skip that altogether and put that computation on sortBy -- incurring the cost of computation O(nlogn) times instead of O(n) times. It would make the code shorter, but whether it would be faster or not is something I'd leave to a benchmark with a real application -- I'm guessing it would depend on actual sizes for proportions.

So, let's talk a bit about performance. Before Scala 2.10, if you want performance you should avoid methods added through implicits on critical paths. The code you wrote will probably get inlined by JIT. You can also reduce the number of computations by pre-computing amount / sum, and if you make that amount.toFloat / sum, then you don't need /@.

More specifically, views are not guarantees of speed, particularly if the computations are light, such as here. I'd not use them at all, unless I'm specifically optimizing the code.

Doing a fixed size of multiple passes on small data structures is often not a problem. You are not changing the complexity, just losing memory locality. If the data is bigger, you can incur in gc overheads, which are more substantial. If maximum performance is required, just drop immutability and go to mutable arrays.

Finally, contains is faster on Set than Seq -- and, in this particular case, a BitSet would be way faster. Call it apply, however, since contains is a general method on traversables, while set's apply is a fundamental operation. If one of them is less than optimized, it will be contains.

This is the most idiomatic beginner's code I have ever seen... do you come from another functional language?

Question 7

Of course we are not allowed to leave any candies, otherwise the problem would be trivial. In any case, I would be happy to know whether the way I have written the algorithm in Scala is idiomatic.

Question 8

@Andrea Ok, revised the answer.

Question 9

Thank you very much for your advice! I also do not like an exceeding use of operators in public APIs, but I figured in this case it was all kept private, just make some lines more readable. As you suggest, it is better to keep amount/sum as a float and avoid the issue altogether. As for my background, I mostly come from Python and Javascript, but I have dabbled with Clojure, Scheme and Haskell before.

Question 10

@DanielC.Sobral, I'd like to abuse of that answer to bring this question to your attention: codereview.stackexchange.com/questions/101339/kosaraju-in-scala. Thanks in advance if you have any time for that :-)

Daniel C. Sobral Daniel C. Sobral 9064 silver badges6 bronze badges · Accepted Answer · 2012-08-10 15:04:03Z

Ok, I'm changing the answer now that I understand what you are doing.

The main problem here is @/ -- while Scala people, in general, don't mind special operators, they don't add operators just because they can either. You can replace @/ with the existing / just by adding .toFloat to any one of the terms.

Views aren't often used either, and it's important to have a very good understanding of how they work if you are going to use them, and it's not that easy to gain performance with them, since the machinery they use to support non-strictness is quite heavy, and not everything takes advantage of it. For example, sortBy will create a new collection before take and map are applied.

Views can gain when you have many mapping/slicing steps, and few elements of it are ever used. Most of the time, iterators will gain you much more performance, at the cost of the mutability problems iterators have.

If you want to reduce the number of times you iterate through the list of proportions, there's at least one place where you can simplify:

 val remainders = (
 approximations.view.zipWithIndex
 map { x => if (lowValuesIndices contains x._2) 1 else 0 }
 ).force
 (byDifect, remainders).zipped map { _ + _ }

Can be reduced to

 byDifect.zipWithIndex map {
 case (bd, i) => bd + (if (lowValuesIndices contains i) 1 else 0)
 }

One could also keep byDifect a Float, then either use it alone when computing approximation (instead of zipping stuff), or skip that altogether and put that computation on sortBy -- incurring the cost of computation O(nlogn) times instead of O(n) times. It would make the code shorter, but whether it would be faster or not is something I'd leave to a benchmark with a real application -- I'm guessing it would depend on actual sizes for proportions.

So, let's talk a bit about performance. Before Scala 2.10, if you want performance you should avoid methods added through implicits on critical paths. The code you wrote will probably get inlined by JIT. You can also reduce the number of computations by pre-computing amount / sum, and if you make that amount.toFloat / sum, then you don't need /@.

More specifically, views are not guarantees of speed, particularly if the computations are light, such as here. I'd not use them at all, unless I'm specifically optimizing the code.

Doing a fixed size of multiple passes on small data structures is often not a problem. You are not changing the complexity, just losing memory locality. If the data is bigger, you can incur in gc overheads, which are more substantial. If maximum performance is required, just drop immutability and go to mutable arrays.

Finally, contains is faster on Set than Seq -- and, in this particular case, a BitSet would be way faster. Call it apply, however, since contains is a general method on traversables, while set's apply is a fundamental operation. If one of them is less than optimized, it will be contains.

This is the most idiomatic beginner's code I have ever seen... do you come from another functional language?

Of course we are not allowed to leave any candies, otherwise the problem would be trivial. In any case, I would be happy to know whether the way I have written the algorithm in Scala is idiomatic.
Thank you very much for your advice! I also do not like an exceeding use of operators in public APIs, but I figured in this case it was all kept private, just make some lines more readable. As you suggest, it is better to keep amount/sum as a float and avoid the issue altogether. As for my background, I mostly come from Python and Javascript, but I have dabbled with Clojure, Scheme and Haskell before.
@DanielC.Sobral, I'd like to abuse of that answer to bring this question to your attention: codereview.stackexchange.com/questions/101339/kosaraju-in-scala. Thanks in advance if you have any time for that :-)

Stack Exchange Network

First steps in Scala: does this look idiomatic?

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

First steps in Scala: does this look idiomatic?

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions