functional programming: impact of typedef-ing datatypes on code readability and maintenance

Question 1

In functional programming languages such as Scala, data types and structures are really important. I am in two minds about whether type-defs help with the readability of code that manipulates non-trivial data structures.

Here is an example of a function that takes a generic collection in Scala, traverses it once in parallel and calculates its average value. Here I have used a type-def simply in order not to have (Int,Int) all over the place:

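import scala.collection.GenTraversable

def average(xs: GenTraversable[Int]): Int = {
  type IntTuple = (Int, Int)
  // Component-wise addition of two pairs
  def addIntTuples(x: IntTuple, y: IntTuple): IntTuple = (x._1 + y._1, x._2 + y._2)
  // Pair each element with a count of 1, then combine everything into (sum, length)
  val (sum, len) = xs.map(x => (x, 1))
    .aggregate((0, 0))(addIntTuples, addIntTuples)
  sum / len
}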

Here is another version of the above function which tries to give the reader a better idea about what the function is doing by introducing typedefs indicating what the values in the tuple represent.

def readableAverage(xs: GenTraversable[Int]): Int = {
  type Sum = Int
  type Len = Int
  type SumLen = (Sum, Len)
  def add(x: SumLen, y: SumLen): SumLen = (x._1 + y._1, x._2 + y._2)
  val (sum, len) = xs.map(x => (x, 1))
    .aggregate((0, 0))(add, add)
  sum / len
}

The second version is longer, but it perhaps gives the reader more insight into how the function operates. The question is: firstly, do you consider the second version actually more readable and insightful? If so, is the added benefit worth the increase in code length?

Question 2

Migrated from Code Review to Software Engineering because this question is primarily about the benefits of type aliases, with the code presented merely as an example.

Question 3

I strongly prefer the first version: addIntTuples does exactly what it says. It is a generic method that could even exist outside of this scope. This means that when I reason about the code, I can think:

okay, this function just adds pairs of Ints, simple, let's see what the rest does...

The other version forces a specific meaning that I need to appreciate before looking at how it is actually used. Then I have to go back and check:

What is this SumLen again? Ah.. just a tuple of these Sum and Len... What type was Sum again? Int or Double? Int, (why?) Okay, let's go back again...

This is of course exaggerated for small functions, but you can see it can become an issue for larger ones. I generally find type aliases that obscure the underlying type annoying.

When two approaches look of similar complexity, I always opt for the one that is the most generic. E.g. try to separate the essence of what a method does from utility-like methods. That means you can easily factor out a commonly used utility, and IMHO it makes code easier to reason about.
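As a rough sketch of what "factoring out the utility" could look like: the names GenericPairs and addPairs below are made up for illustration, and I use a plain Seq with foldLeft instead of GenTraversable and aggregate so that it does not depend on the older parallel-capable collection hierarchy.

```scala
object GenericPairs {
  // Hypothetical utility: adds two pairs component-wise. Nothing in it is
  // specific to sums or lengths, so it can live outside of `average`.
  def addPairs(x: (Int, Int), y: (Int, Int)): (Int, Int) =
    (x._1 + y._1, x._2 + y._2)

  // The interesting part: one pass that accumulates (sum, count).
  // Assumes a non-empty input; an empty Seq would divide by zero.
  def average(xs: Seq[Int]): Int = {
    val (sum, len) = xs.map(x => (x, 1)).foldLeft((0, 0))(addPairs)
    sum / len
  }
}
```

The reader of average only has to trust that addPairs adds pairs; the meaning of the two components is given by the names in the pattern (sum, len) at the point of use.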

EDIT:

The main benefit of having generic helper/util methods is that you communicate that there is "nothing to see here": no tricky business logic, just something that you wanted to hide/abstract from the actual interesting parts of the code.

Check this relevant SO answer that uses scalaz semigroup:

import scalaz._, Scalaz._
scala> (1, 2.5) |+| (3, 4.4)
res0: (Int, Double) = (4,6.9)

or the second answer that uses Numeric to create essentially the same thing that scalaz provides:

implicit class Tupple2Add[A : Numeric, B : Numeric](t: (A, B)) {
 import Numeric.Implicits._
 def |+| (p: (A, B)) = (p._1 + t._1, p._2 + t._2)
}
((2.0, 1) |+| (1.0, 2)) == (3.0, 3) // parentheses needed: == binds tighter than |+|

These not only create reusable code, but do something more important: they communicate that there is nothing special there. E.g. there is nothing special about Int; it works with any type that has a Numeric instance, which is what allows the addition p._1 + t._1.

There is a very nice talk that touches on this topic, "Constraints Liberate, Liberties Constrain" by Runar Bjarnason. In a nutshell:

def f[T](a:T):T has only one valid implementation (leaving aside exceptions, null and non-termination): def f[T](a:T):T = a. Being so generic, the method is constrained to a single valid implementation. def f(a:Int):Int, by contrast, has an enormous number of valid implementations, since each of the 2^32 possible inputs can map to any of the 2^32 possible outputs.

The takeaway message is that leaving your code needlessly specific to a particular use case opens it up to multiple (and maybe incorrect) implementations and mental interpretations.
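To make that concrete, here is a small sketch (the object and method names are mine, not from the talk). The generic signature leaves essentially one thing to write, while the Int signature is satisfied by many unrelated bodies:

```scala
object Parametricity {
  // Knowing nothing about T, the only total, side-effect-free result
  // available is the argument itself.
  def identityLike[T](a: T): T = a

  // With Int fixed, all of these have the same signature Int => Int,
  // and the type alone does not tell the reader which one was intended.
  def same(a: Int): Int = a
  def increment(a: Int): Int = a + 1
  def alwaysZero(a: Int): Int = 0
}
```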

As for the type aliases, I don't really like them because they just give a different name to the same type, and the compiler will happily accept either. I prefer value classes and tagged types (http://eed3si9n.com/learning-scalaz/Tagged+type.html). Both create a type that is distinct from the original, e.g. Int, so the compiler will complain if you use e.g. a Len where it expects a Sum.
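A minimal sketch of the difference, using value classes only (tagged types need scalaz or similar). All names here (meanAlias, SumV, LenV, meanChecked) are invented for the example:

```scala
object AliasVsValueClass {
  // Type aliases: Sum and Len are both just Int to the compiler.
  type Sum = Int
  type Len = Int
  def meanAlias(sum: Sum, len: Len): Int = sum / len

  // Value classes: two distinct types, each wrapping an Int.
  final case class SumV(value: Int) extends AnyVal
  final case class LenV(value: Int) extends AnyVal
  def meanChecked(sum: SumV, len: LenV): Int = sum.value / len.value

  val total: Sum = 10
  val count: Len = 4

  // Compiles even though the arguments are swapped, because the aliases
  // are interchangeable: this computes 4 / 10.
  val swapped: Int = meanAlias(count, total)

  // Computes 10 / 4 with integer division.
  val ok: Int = meanChecked(SumV(10), LenV(4))

  // meanChecked(LenV(4), SumV(10)) // rejected by the compiler: type mismatch
}
```

The swapped call is exactly the kind of mistake the aliases cannot catch; with the value classes the same mistake does not compile.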

Question 4

I agree. So essentially, the helper functions are either just slightly more sophisticated than simple lambdas, in which case type-defs are over-kill, or are perhaps bigger and potentially reusable in different contexts, hence typedefing with local specifics works against re-usability. So they are essentially not a very good idea. Would you therefore say type-defing is more of an object oriented practice and not exactly compatible with functional programming?

Question 5

Hi, have a look at my edit.

Question 6

+1 for value classes. Type aliases with compile-time checking are a really, really nice language feature.

Question 7

If I were going to go to those kinds of lengths to improve the readability of my code, I would just create a SumLen class:

case class SumLen(sum: Int, len: Int) {
 def add(that: SumLen): SumLen = {SumLen(sum + that.sum, len + that.len)}
}

With adjustments to the syntax of the remaining code.

I'm sure that typedefs are great for some situations, but in this case I just don't feel like they pull their weight.
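The "adjustments to the syntax of the remaining code" are not spelled out above. One possible reading, as a sketch only (the wrapping object and the use of Seq with foldLeft are my assumptions), would be:

```scala
object SumLenAverage {
  case class SumLen(sum: Int, len: Int) {
    def add(that: SumLen): SumLen = SumLen(sum + that.sum, len + that.len)
  }

  // Wraps each element as SumLen(x, 1) and folds them together.
  // Assumes a non-empty input; an empty Seq would divide by zero.
  def average(xs: Seq[Int]): Int = {
    val total = xs.map(x => SumLen(x, 1))
      .foldLeft(SumLen(0, 0))((acc, next) => acc.add(next))
    total.sum / total.len
  }
}
```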

Question 8

You do have a point, but you are looking at it from a very OO mindset. In this case adding such a class would not be a very good idea, as you'd have to create so many instances of this class while what you really want to do is aggregate the existing structure. I am taking it as a given that we are dealing with types first and foremost.

Question 9

Not so much an OO mindset as a practical mindset. My boss pays me to "make it work" and that means maintainability, which includes simplicity and readability and does not include premature optimization. We (and I mean all programmers, including myself) can get way too invested in our own cleverness. Sometimes, it really is just a nail, and all that is really needed is just a hammer.

Question 10

Like I said, I do appreciate your point about premature optimisation. What I was trying to say is that if we "were going to go to those kinds of lengths to improve the readability of code", we shouldn't add a class simply for readability's sake. The added burden of boxing and unboxing into this class overshadows the benefit of the added readability.

Question 11

"you'd have to create so many instances of this class": Keep in mind that when you use a tuple, you create an instance of the Tuple2 class, so there is not a massive difference really.

Question 12

@V-Lamp I think the best trade-off would be implicit classes with tuple constructors if we want to go down this route, i.e.: implicit class SumLen(val sumLen: (Int, Int)) { def sum: Int = sumLen._1; def len: Int = sumLen._2 }
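Spelled out a bit more, that idea might look like the sketch below. The wrapping object TupleSyntax and the average function are my additions for illustration; the aggregation still works on plain tuples, and the implicit class only adds named accessors on top.

```scala
object TupleSyntax {
  // Wraps an existing (Int, Int) so that its components can be read by name.
  implicit class SumLen(val sumLen: (Int, Int)) {
    def sum: Int = sumLen._1
    def len: Int = sumLen._2
  }

  // Accumulates a plain (sum, count) tuple; the named accessors are only
  // used at the end. Assumes a non-empty input.
  def average(xs: Seq[Int]): Int = {
    val acc = xs.foldLeft((0, 0)) { case ((s, n), x) => (s + x, n + 1) }
    acc.sum / acc.len
  }
}
```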

V-Lamp · Accepted Answer · 2017-04-10 13:34:37Z

