Monday, May 9, 2011
Why Eager Languages Don't Have Products and Lazy Languages Don't Have Sums
In The Point of Laziness, Bob Harper mentioned a core duality between lazy and eager languages[1]. Eager languages have sum types but not product types and lazy languages have product types but no sum types. I've seen a few questions about what he means. I thought I'd try to explain. It all has to do with non-termination, our old friend "bottom" which I'll write ⊥. In this entire article I'm focusing on languages that don't allow any side effects other than ⊥ - that way I don't have to deal with ordering print statements or whatever. Also, I'm using an MLish notation that hopefully isn't too confusing.
Update: This comment by Dan Doel does a better job of explaining the issues than my whole article.
Product Types
A product type is a pair. The pair type (A,B) is the product of A and B, and is often written as A * B or A × B. A 3-tuple[2] (A,B,C) can be seen as another way of saying ((A,B),C), etc., so any n-tuple is just a pretty way to write nested pairs. The simplest way to define exactly what a pair means is through the projection functions "first" and "second", which pull out the first or second element. The equations we want are
first (a,b) == a
second (a,b) == b
Pretty straightforward, right? Well, those equations hold just fine in a lazy language even when a or b is ⊥. But in an eager language, first (a,⊥) == ⊥. Same deal with second. Oops. Hence lazy languages have product types in the sense that they completely meet these equations where eager languages don't.
That has a practical consequence. A compiler for an eager language can't necessarily simplify first(a, someComplicatedExpression) unless it can prove that someComplicatedExpression is never ⊥, and in general it can't do that. A lazy language compiler would always be free to throw away someComplicatedExpression.
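To make that concrete, here's a minimal Scala sketch. Scala is eager by default, but by-name parameters let me fake a lazy pair; LazyPair and bottom are my own illustrative definitions, nothing standard.

def bottom: Nothing = bottom // diverges; Scala turns self tail calls into loops, so no stack overflow

// A lazy pair: the components are only evaluated when projected.
final class LazyPair[A, B](a: => A, b: => B) {
  def first: A = a
  def second: B = b
}

val p = new LazyPair(42, bottom)
p.first // 42 - the equation first (a,⊥) == a holds
// (42, bottom)._1 // with Scala's eager built-in pairs, just building the pair diverges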
Sum Types
A sum type is an alternative between two choices. A | B is the sum of the two things, A and B, and is frequently written A + B. Choices among three elements, like A | B | C can be considered as just a pretty form of (A | B) | C - a nesting of binary choices. The simplest sum type is good old Boolean, the choice between true or false. Here's one equation we might expect to hold for Booleans and something that examines booleans, the if expression[3]
((if condition then x else y), z) == if condition then (x,z) else (y,z)
Ignore the possibility of x, y, and z being ⊥ and focus on "condition". In an eager language, condition must be evaluated in either of the two forms above and either way you get ⊥ if condition is ⊥. But in a lazy language, the left side will compute (⊥, z) while the right side will compute ⊥. Pulling the second element from those two results gives differing results, proving they are unequal.
Again, this has consequences. An eager language can simplify the right side form to the left side form if it wants to. A lazy language can't.
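Using the same illustrative LazyPair and bottom definitions from the sketch above, you can watch the two sides come apart in Scala when condition diverges:

def left(condition: => Boolean, z: Int) = // ((if condition then x else y), z)
  new LazyPair(if (condition) 1 else 2, z)

def right(condition: => Boolean, z: Int) = // if condition then (x,z) else (y,z)
  if (condition) new LazyPair(1, z) else new LazyPair(2, z)

left(bottom, 42).second // 42 - the condition is never forced
right(bottom, 42).second // diverges - the if must evaluate the condition first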
Footnotes
- Strictly speaking I should say strict and non-strict instead of eager and lazy, but I'll go with Dr. Harper's wording because the difference isn't that important to the point.
- Or a record (aka a struct). A tuple can be seen as a record where the field labels are numbers or a record can be seen as a tuple with some convenient labels for the positions.
- If you are more familiar with C like languages that treat "if/else" as statements instead of expressions then pretend I'm using a ternary operator "condition ? x : y" instead of "if condition then x else y"
Monday, October 18, 2010
Phantom Types In Haskell and Scala
/*
Some literate Haskell and Scala to demonstrate
1) phantom types
2) I am a masochist
3) ???
4) profit!
The code is probably best viewed with a syntax colorizer
for one language or the other but I've colorized all my
comments.
> {-# LANGUAGE EmptyDataDecls #-}
> module RocketModule(test1, test2, createRocket, addFuel, addO2, launch) where
*/
object RocketModule {
/*
None of these data types have constructors, so there are
no values with these types. That's okay because I only
need the types at compile time. Hence "phantom" -
ethereal and unreal.
> data NoFuel
> data Fueled
> data NoO2
> data HasO2
*/
sealed trait NoFuel
sealed trait Fueled
sealed trait NoO2
sealed trait HasO2
/*
The Rocket data type takes two type parameters, fuel and
o2. But the constructor doesn't touch them. I don't
export the constructor so only this module can create
rockets.
> data Rocket fuel o2 = Rocket
*/
final case class Rocket[Fuel, O2] private[RocketModule]()
/*
createRocket produces a rocket with no fuel and no o2
> createRocket :: Rocket NoFuel NoO2
> createRocket = Rocket
*/
def createRocket = Rocket[NoFuel, NoO2]()
/*
addFuel takes a rocket with no fuel and returns one with
fuel. It doesn't touch the o2
> addFuel :: Rocket NoFuel o2 -> Rocket Fueled o2
> addFuel x = Rocket
*/
def addFuel[O2](x : Rocket[NoFuel, O2]) = Rocket[Fueled, O2]()
/*
Similarly, addO2 adds o2 without touching the fuel
> addO2 :: Rocket fuel NoO2 -> Rocket fuel HasO2
> addO2 x = Rocket
*/
def addO2[Fuel](x : Rocket[Fuel, NoO2]) = Rocket[Fuel, HasO2]()
/*
launch will only launch a rocket with both fuel and o2
> launch :: Rocket Fueled HasO2 -> String
> launch x = "blastoff"
*/
def launch(x : Rocket[Fueled, HasO2]) = "blastoff"
/*
This is just a pretty way of stringing things together,
stolen shamelessly from F#. Adding infix operations is
a bit verbose in Scala.
> a |> b = b a
*/
implicit def toPiped[V] (value:V) = new {
def |>[R] (f : V => R) = f(value)
}
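/*
On Scala 2.10 and later the same pipe can also be written as an
implicit value class, which avoids the structural type above (and
the reflective call it implies). Commented out so it doesn't
clash with toPiped.
*/
// implicit class Piped[V](val value: V) extends AnyVal {
//   def |>[R](f : V => R): R = f(value)
// }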
/*
Create a rocket, fuel it, add o2, then
launch it
> test1 = createRocket |> addFuel |> addO2 |> launch
*/
def test1 = createRocket |> addFuel |> addO2 |> launch
/*
This compiles just fine, too. It doesn't matter which
order we put in the fuel and o2
> test2 = createRocket |> addO2 |> addFuel |> launch
*/
def test2 = createRocket |> addO2 |> addFuel |> launch
//This won't compile - there's no fuel
// > test3 = createRocket |> addO2 |> launch
// def test3 = createRocket |> addO2 |> launch
// This won't compile either - there's no o2
// > test4 = createRocket |> addFuel |> launch
// def test4 = createRocket |> addFuel |> launch
// This won't compile because you can't add fuel twice
// > test5 = createRocket |> addFuel |> addO2 |> addFuel |> launch
// def test5 = createRocket |> addFuel |> addO2 |> addFuel |> launch
}
Monday, March 8, 2010
Robert Fischer Finally Admits that Scala is Functional
Robert Fischer says
If your language doesn’t lead people to re-discover point free programming at least in the small, then the language really isn’t taking function manipulation and functional language type conceptions seriously.
Thus Fischer has finally recognized that Scala is functional.
val double : Int => Int = 2 * // point free function definition
List(1,2,3) map double /* removes one point from List(1,2,3) map {x => double(x)} */
Thank you Robert for finally getting to the truth.
Oh, the Underscore
Now, Robert may object because I didn't combine map and double into one super function, mapDouble. Okay
val mapDouble : List[Int] => List[Int] = _ map double
mapDouble(List(1,2,3))
Crap, there's an underscore. Is it a "point"? Good question. And the answer is no. It's actually a hole in the expression. Oleg Kiselyov, somebody who actually knows about functional programming, says they're related to delimited continuations and what could be more functional than delimited continuations?
Also, Erlang Remains Unfunctional
For some reason Fischer really hates on Erlang. Point-free is definitely not Erlang's strong suit at all due to its ability to overload on arity. I don't know why the Erlangers don't go picket his blog.
Post Script on Converses
David MacIver points out (correctly) that my post assumes that a statement implies its converse. To that I say "pffft, write your own damn parody, David!"
Friday, August 21, 2009
Getting to the Bottom of Nothing At All
Except when I'm being a total smart ass or self indulgent emo telling the Ruby community to f' off, most of what I write talks about practical stuff, often with a small dose of theory. But this time it's about theory with a small dose of practice.
Last time I talked about the way programmers in popular C derived languages C++, Java, and C# think of a void function as being one that returns nothing. I contrasted this with the functional view that such functions return something, they just return the unit value (), something that doesn't carry any information. To expand on this I want to talk about actually returning nothing.
In any Turing complete language it is possible to write a function that really truly never returns anything, not even a made-up singleton value like (). In C style languages the simplest such function might contain an infinite loop like
while(true);
For my purposes a better way to explore the idea is with infinite recursion. Pretend your favorite C style language does tail call optimization[1] so we can ignore stack overflow issues and examine the following
X foo(){ return foo(); }
The question is what should the X be? And the answer is that if you plug that line of code into C, C++, C#, or Java then X can be just about any type you want[2].
If that doesn't give you pause then consider this: your type system appears to be lying to you. The function is promising an X but has no hope of delivering. And it's quite happy to promise you any X. If you write "String foo(){ return foo(); }" then your type system considers the expression "foo();" to have type String so that you can write "String myFoo = foo();". But guess what, foo doesn't return a String and myFoo will never get a String. Is the type system lying?
The answer is no, of course not. It's just proving something a bit weaker than you might think at first. Instead of proving that "foo() will compute something that is a String" it's proving that "foo() won't compute something that isn't a String." If your "Law of the Excluded Middle" sense just tweaked, then you're on the right page. You'd think there are only two cases to consider: either the function definitely computes a String or it definitely doesn't. But there's this third case where it doesn't compute anything, and since the type checker can't reliably detect non-termination (c.f. Turing Halting Problem) it has to do something a bit odd. First, it assumes that the foo() declaration is correct and does return a String. Then it goes looking for a contradiction such as returning an int. Inside the function the compiler sees that the return is the result of the expression foo(). But the compiler has already assumed that foo() returns a String. There's no contradiction between declared and result so all is happy. The word "tautology" should come to mind right about now. By assuming X is true the type checker has proved that X is true. This weakening is a practical consequence of the Turing Halting Problem and there are at least two good ways to think about it. But first some more examples of the phenomenon.
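Scala makes a compact demo of that tautology. This compiles without a peep, even though no String will ever materialize:

def foo(): String = foo() // the checker assumes foo() : String, finds no contradiction, and is satisfied
val myFoo: String = foo() // type checks fine; at runtime it just never finishes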
Exceptions and Exits and ...
I very casually suggested that we ignore stack overflow issues in the infinite recursion above. But exceptions are an important part of this story because they are, in some interesting way, related to non-termination. Consider this function (or its equivalent in your favorite language)
X foo(){ throw new RuntimeException(); }
Once again, X can be any type. And once again foo() does not in fact compute an X. Clearly these two definitions of foo are different things. Non-termination means we're stuck whereas an exception means we might be able to recover and try something else, or at least notify that there's a problem and shutdown cleanly. None-the-less they both behave similarly. First, they hijack the normal flow of control in such a way that it never returns back to the caller. And second they are both examples of a kind of weakening in the type system since any type can be substituted for X. Formally they are both examples of functions that diverge (functions that return normally are said to converge).
The Type
Here's a third example of a diverging function in Java. Translate as necessary for your language (in Java System.exit(n) stops the program immediately and returns n to the calling process).
X foo(){ System.exit(1); return null; }
Yet another case where foo() won't compute anything, it diverges. In fact, the return statement after the exit call is dead code. However, this example is slightly different from the other examples because we had to create the dead code to keep the compiler happy[3] and, in fact, for a different "X" like int the return value might have to be changed. That bit of dead code is closely related to the heart of this article.
Java can detect some kinds of dead code. If you write "X foo() { throw new RuntimeException(); return null; };" then Java recognizes that the return is unreachable and complains. In my System.exit(1) example Java didn't recognize that the call to exit() will never return so it required a following return statement. Obviously "throw" is a keyword and can get special attention that a mere library function can't but it would be useful to be able to let Java know that, like throw, exit() diverges.
One of the best ways to tell a compiler how a function behaves is by using types and, in type theory, there's a type that expresses just what we need. The type is called bottom (often written ⊥), and while there are different ways to look at the bottom type I'm going to go with a subtyping based view that should be accessible to C++, C#, and Java programmers.
If a language has subtyping and a function says it will return a type "X" then the function is allowed to return a "Y" instead as long as Y is a subtype of X. In my example of a function that just throws an exception the return type could be anything at all. So if we wanted System.exit(1) to indicate that it diverges the same way throw does, then its return type should be a subtype of all types. And indeed, that's exactly what bottom is.[4] bottom is a subtype of String, and int, and File, and List<Foo>, and every other type. Usefully, conventional type hierarchies are written with supertypes above subtypes, which makes a convenient mnemonic for "bottom" which needs to go below everything on such a hierarchy.
Now, if you're used to OO thinking then you expect a value with a certain subtype to in some sense be substitutable everywhere that a supertype is expected. But how can any one object behave like a String, an int, a File, etc? Remember that bottom indicates divergence: an expression with type bottom can never compute a value. So if exit()'s return type was bottom it would be totally safe to write "String foo() { return System.exit(1); }" while another bit of code could have "int bar() {return System.exit(1); }."
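Scala, which I'll get to shortly, makes this concrete: its sys.exit is declared to return the bottom type, so both of these compile as-is.

def foo(): String = sys.exit(1) // fine: Nothing is a subtype of String
def bar(): Int = sys.exit(1)    // fine: Nothing is a subtype of Int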
Making it Useful, A Divergence to Scala Divergence
Occasionally it might be useful to indicate that a function diverges. Examples are functions like System.exit(1), or functions that always throw an exception, perhaps after logging something or doing some useful calculation to create the exception. But interestingly, out of all the statically typed languages that have any following outside of pure research, only Scala has an explicit bottom type, which it calls Nothing. The reason Scala has a bottom type is tied to its ability to express variance in type parameters.

For some reason a lot of programmers run screaming into the night when you say "covariance" or "contravariance." It's silly. I won't get into all the details of variance, but I will say that in Scala the declaration "class List[+T] {...}" means that List[Subtype] is a subtype of List[Supertype]. No screaming necessary. And List[+T] brings me to one extremely practical use of bottom - what type should an empty List have?
Well, an empty List can have type List[String] or List[Foo] or List[Int]. T can be whatever. And what's a subtype of whatever for all values of whatever? You guessed it: bottom (Nothing). Indeed, Scala has one constant called Nil whose type is List[Nothing]: a subtype of List[String], List[Int], and List[whatever]. It all ties up in a bow when you consider that a List[T] has a method called head which returns the first element of the list as type T. But an empty list has no first value, so it must throw an exception. And sure enough, head in List[Nothing] has type Nothing.
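A quick sketch of Nil at work in Scala:

val nil: List[Nothing] = Nil
val strings: List[String] = nil // ok: List[Nothing] is a subtype of List[String]
val ints: List[Int] = nil       // ok: List[Nothing] is a subtype of List[Int]
// and head on the empty list has static type Nothing, so all it can do is diverge:
// scala> nil.head
// java.util.NoSuchElementException: head of empty list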
C# 4.0 is supposed to be getting definition site variance similar to Scala's but using the clever mnemonic keywords "in" and "out". I haven't heard anything yet on whether it will also add a bottom type, but it would make a lot of sense.
Java has usage site variance using wildcards. You can say "List<? extends Supertype> x" to indicate that x can have a List<Supertype> or a List<Subtype>. The bottom type would be useful in Java, too, although not as compelling since wildcards are so verbose people rarely use them even when they would make sense. Plus, Java folk tend to mutate everything and List[Nothing] in part makes sense in Scala because Scala Lists are immutable.
C++ does not have any simple way to express this kind of variance so the bottom type is even less compelling in C++ than it is in Java.
Back On Track
Haskell and languages in the ML family don't have an explicit bottom type. Their type systems don't have subtyping so adding bottom as a subtype would confuse things. None-the-less, they do have a nice way to express bottom that can be teleported back to Java, C#, and C++ (but not C). Recall that bottom is a subtype of all types. Another way of saying that is that if a function returns type bottom then for all types A, the function returns something compatible with A so why not express that directly? In Haskell the "error" function takes a string and throws an exception.
Prelude> :type error
error :: [Char] -> a
In Haskell, a lower case identifier in type position is always a type parameter, and [Char] means "list of Char", aka String. So for all types a, if "error" doesn't diverge then it will take a String and return an a. That can pretty much be expressed directly in Java
public static <A> A error(String message) { throw new RuntimeException(message); }
or C++
class message_exception : public std::exception {
public:
explicit message_exception(const std::string& message) : message_(message) {}
virtual ~message_exception() throw() {}
virtual const char * what() const throw() { return message_.c_str(); }
private:
const std::string message_;
};
template <typename A>
A error(const std::string& message) {throw message_exception(message); }
And for either language, usage would be something like
int divide(int x, int y) {
if (y == 0) {
return error<int>("divide by zero"); // drop the "<int>" in Java
} else {
return x / y;
}
}
Haskell also has a function called "undefined" that simply throws an exception with the message "undefined." It's useful when you want to get started writing some code without fully specifying it.
Prelude> :type undefined
undefined :: a
The function isn't as interesting as the type, which promises that for any type a "undefined" can compute an a or diverge. Since "undefined" can't possibly just produce a value of any arbitrary type, it has no choice but to diverge. The same idea can be added to Java
public static <A> A undefined() { return error("undefined"); }
or C++
template <typename A>
A undefined() { return error<A>("undefined"); }
In either language it might be used as
string IDontKnowHowToComputeThis(int input) {
return undefined<string>(); // again, make appropriate changes for Java
}
Given the requirement to write the "return" keyword in C#, Java, and C++ I'm not sure how practical something like a generified error function really is as compared to having it return an exception and making the user write 'throw error("blah")', nor whether "undefined" is that much more useful than just throwing an UndefinedException, but this does illustrate the relationship between the bottom type and functions that, in Haskell terms, compute "forall a.a" or, in C#/Java/C++ terms, return the type parameter A without taking an A as an argument.
Also, as always care should be taken when transliterating from one language to another. Java would allow a function like "undefined" to return a null instead of diverging. C++ would allow it to return anything at all and would only fail to compile if it were used in an incompatible way. That contrasts with languages like Haskell and ML in which the only way to implement "undefined :: a" is to make it diverge in some form or fashion[5].
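For what it's worth, Scala's Predef eventually shipped exactly this idea under the name ???, defined roughly like so and usable wherever any type is expected:

def ??? : Nothing = throw new NotImplementedError
def iDontKnowHowToComputeThis(input: Int): String = ??? // compiles for any result type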
The Bottom Value
I've spent some time talking about the bottom type as having no values. But it does have expressions like "undefined()" and that leads to a rather philosophical notion of how to think of bottom as having a value. Sorta. Skip this section if you don't like total gray beard philosophizing. If you're brave, then stick with me and imagine a subset of your favorite C derived language that does not allow side effects. No mutation, no IO, and no exceptions. In this strange world functions either just compute values or they diverge by not halting. In such a world the order of evaluation mostly isn't important as long as data dependencies are met - in f(a(), b()), you can compute a() before b() or b() before a(), whatever, they can't interfere with each other. Doesn't matter. The only time order of evaluation is forced on us is when there are data dependencies, so "a() + b()" must compute a() and b() (or b() and a()) before their sum can be computed. Nice.

Well, almost. Order of evaluation can matter for expressions that diverge. Let me give an example.
int infiniteLoop() {return infiniteLoop();}
int f(int x) {return 42;}
int result = f(infiniteLoop());
Because f ignores its argument, if we call "f(infiniteLoop());" there are two possible outcomes. If "infiniteLoop()" is eagerly evaluated before f is called then the program will diverge. On the other hand, if the expression "infiniteLoop()" is lazily remembered as being potentially needed later then f can successfully return 42 and the diverging expression can be forgotten just like it never happened (because it didn't).
We've gone to the pain of eliminating side effects from our language so it's a little irritating to have to keep thinking about evaluation order just to deal with divergence, so we perform a little mental trick; a little white lie. Imagine that functions like infiniteLoop above don't get stuck, they compute a value called ⊥, which is the only value of type bottom.
Now, since the bottom type is a subtype of all types and ⊥ is the bottom value, then it follows that every type must be extended with ⊥. Boolean = {⊥, true, false}, int = {⊥, 0, 1, -1, 2, -2, ...}, and unit = {⊥, ()}. That means we need some rules for ⊥ and how it interacts with everything else. In the vast majority of languages including Java, C#, and C++ but also impure functional languages like F# and OCaml doing just about anything with ⊥ can only compute ⊥. In other words, for all functions f, f(⊥) = ⊥. If you write f(infiniteLoop()) in those languages then the result is ⊥. This kind of rule is called "strictness".
In contrast, Haskell is often called a "lazy language," meaning that expressions aren't evaluated until they're needed. That's not quite technically correct. The Haskell spec just says that it is "non-strict." The spec doesn't care when expressions are evaluated so long as programs allow ⊥ to slide as far as possible. An expression like f(infiniteLoop()) must evaluate to 42. Haskell basically forces an expression involving ⊥ to evaluate to ⊥ only when the argument must be used[6]. The distinction between "lazy" and "non-strict" is subtle, but by being "non-strict" rather than "lazy" a Haskell compiler can use eager evaluation anytime it can prove that doing so doesn't change behavior in the face of ⊥. If a function always uses its first argument in a comparison, then Haskell is free to use eager evaluation on that argument. Since Haskell truly does forbid side effects (unlike our imagined neutered language above), the choice of evaluation strategy is up to the compiler and invisible except for performance consequences[7].
C++, Java, and C# have just a tiny bit of non-strictness. In these languages "true || ⊥" is true and "false && ⊥" is false. If these languages were totally strict then "true || ⊥" would be ⊥. Users of these languages call this behavior "short circuiting" and it's done for performance reasons rather than being a philosophical goal, but it's still a curious departure from their normal rules.
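Scala lets you watch strict and non-strict argument passing side by side, since a by-name parameter plays the non-strict role. A minimal sketch, with infiniteLoop standing in for ⊥:

def infiniteLoop(): Int = infiniteLoop()
def fStrict(x: Int): Int = 42  // argument evaluated before the call, so fStrict(⊥) = ⊥
def fLazy(x: => Int): Int = 42 // by-name: x is never touched, so ⊥ slides on by

fLazy(infiniteLoop()) // 42
// fStrict(infiniteLoop()) // diverges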
There you have it. The bottom value ⊥ is a clever mental hack to allow purely declarative functional code to be reasoned about without injecting sequencing into the logic. It allows people to talk about the difference between a purely declarative strict language and a purely declarative non-strict language without getting into details of evaluation order. But since we're talking about languages that aren't so purely declarative, we can take off our philosopher hats and return back to the world where side effects are unrestricted, bottom is a type with no values and divergence means that flow of control goes sideways.
Tuples and Aesthetics
Last time I talked about the unit type and how, if you interpret types as logical propositions, the unit type behaves as a "true" and tuple types act as logical and "∧." I also talked about an algebraic interpretation where the unit type acts like 1 and tuple types act like multiplication "×". So the type (A,B) can also be written as A∧B or A×B. The type (A,unit) is isomorphic to A, A×1 = A, and A∧True <=> A.

Bottom has similar interpretations as 0 or False. The type (A,bottom) is isomorphic to the bottom type because you can't compute any values of type (A,bottom). A×0 = 0, and A∧False <=> False. Nice how it all hangs together, eh?
Bottom behaves like False in another way. In logic if you assume that False is true you can prove anything. Similarly in type systems the bottom type allows the programmer to promise to do even the impossible. For instance, here's a Java function signature that promises to convert anything to anything.
public static <A,B> B convert(A arg) {...}
If you ignore dynamic casting and null (they're both weird in different ways) there's no meaningful way to implement that function except by diverging. More on this in an upcoming episode.
Conclusion
I somehow don't think anybody will be running into the office saying "I just read an article on the bottom type, so now I know how to solve our biggest engineering challenge." But the bottom type is still interesting. It's a kind of hole in your static type system that follows inevitably from the Turing Halting Problem[8]. It says that a function can't promise to compute a string, it can only promise to not compute something that isn't a string. It might compute nothing at all. And that in turn leads to the conclusion that in a Turing complete language static types don't classify values (as much as we pretend they do) they classify expressions.
Footnotes
- gcc does do tail call optimization under some circumstances.
- Oddly, in C, C#, and Java X can't be "void" because it's an error to return an expression of type void. C++ does allow void to make writing templates a bit easier.
- Yes, I can make it a void function and not have a return. But again that would illustrate how exit() is different from a throw: throw doesn't require the return type to be void.
- Note I'm saying "subtype" here and not "subclass." int, List<String>, and List<File> are 3 different types. In C++, Java, and C# int doesn't have a class. In Java and C# List<String> and List<File> both come from the same class. Yet bottom must be a subtype of all 3 so it can't be a subclass.
- A plea to my readers: I'm too lazy to see if "forall a.a" is bottom in F# or C#, or if, like Java, null would be allowed. My bet is that null won't be allowed because only Java has the peculiar notion that type parameters must be bound to reference types.
- Very loosely speaking. Please see the Haskell Report for all the gory details about when Haskell implementations are allowed to diverge.
- Haskell has another clever hack where throwing an exception isn't an effect, but catching one is. Since order of evaluation is unspecified in Haskell, the same program with the same inputs could conceivably cause different exceptions. Exceptions are theoretically non-deterministic in Haskell.
- A language that isn't Turing complete can prove termination and does not need a bottom type, but such a language isn't powerful enough to have a program that can interpret the language.
Thursday, July 23, 2009
Void vs Unit
I suspect most people who come to one of the statically typed functional languages like SML, OCaml, Haskell, F#, or Scala will have only seen static typing in one of the popular C derived languages like C++, Java, or C#. Some differences in type system become clear fast. One particular spot is the relationship and difference between void and unit.
So imagine your favorite C derived language. Consider your language's equivalent of the following function definition [1]
void printHello(String name) { printf("hello %s", name); }
What does the word "void" mean? To any C, C++, Java, or C# programmer it means that printHello doesn't return anything. Simple.
But a functional programmer[2] sees it quite differently. This article will explain the different point of view and why it's useful.
What's a Function?
It turns out to be very hard to write a single definition of "function" that covers all the ways the word has been used. But to a functional programmer at the very least a function is something that maps inputs to outputs. In the impure functional languages that function may also have an effect like printing "hello." I'll stick with the impure definition in this article.
Now, when a functional programmer talks about mapping inputs to outputs he or she doesn't mean taking a string as an input and outputting it on the console. Instead he or she is talking about arguments and returns of a function. If something is to have any pretense at being a function, or at least a function that returns normally, then it must return something.
printHello clearly does return normally so a functional programmer insists that it returns something. But a functional programmer also must recognize that it doesn't return anything "meaningful."
Thus rather than the concept "void," a functional programmer thinks of unit. In some languages the unit type is written as (), but for clarity I'm going to stick with "unit." Now, unit is a type, conceptually not much different from say "int". But unit, as its name somewhat implies, has exactly one value conventionally represented as an empty pair of parentheses ().
In OO terms () is a "singleton". There's only one instance of it. It's also immutable. In a very real sense it carries no information. You can't write a program that behaves differently based on what a function of type unit returns - it always returns the same ().
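Scala spells the type Unit, and you can see the one-value-ness directly in a tiny sketch:

val a: Unit = println("hello") // println returns Unit
val b: Unit = ()
a == b // true - there's nothing to distinguish them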
A Performance Diversion
At this point a little guy in the back of your head might be asking "doesn't returning something meaningless just waste cycles?" Well, it's important to distinguish between abstraction and implementation. The unit value, (), is an abstraction that the implementation is free to optimize away. If you write in Scala
def printHello : Unit = {
println("hello")
()
}
Then the compiler will generate the equivalent of the original void based code. If in Scala you write
print(printHello)
Then the compiler might generate something like
printHello
print( () )
Because () is a singleton it doesn't matter at the machine code/bytecode level that it's not "actually" returned. The compiler can figure it out with no performance hit at all.
Making It Useful
Okay, blah, blah, blah, functional think this, unit that. Whatever. What's the point? () is a value with no information, unit is a type with only that value...what's the practical use?
Well, it turns out that it's useful for writing generic[3] code. Java and C++, somewhat famously, do not have first class functions. But they can be faked. Here's the Java
interface Function<Arg, Return> {
Return apply(Arg arg);
}
Function<String, Integer> stringLength = new Function<String,Integer>() {
public Integer apply(String arg) {
return arg.length();
}
};
In C++ you can do it essentially the same way with a template and operator() overloading[4]
template <typename Arg, typename Result>
struct Function {
virtual Result operator()(Arg arg) = 0;
};
struct StringLength : Function<string, int> {
int operator()(string arg) {
return arg.length();
}
};
Function<string, int> *stringLength = new StringLength();
C# has a function type and lambdas so it's very easy to write.
Func<string, int> stringLength = s => s.Length;
So, if stringLength is an object of type Function<String, Integer> / Function<string, int>* / Func<string, int> then I can write the following
int result = stringLength.apply("hello"); // Java
int result = (*stringLength)("hello"); // C++
int result = stringLength("hello"); // C#
So here's the question: how would I convert the printHello function earlier into a Function<String,X> / Function<string, X>* / Func<string, X>. What type is X?
If you've been playing along so far you recognize that it "should be" void, but Java and C# don't allow it. It's invalid to write Function<String, void> in either language.
Let's pretend you could. After all, C++ will let you write a class PrintHello : Function<std::string, void>. But there's still a problem even in C++. Generic code might expect a result and none of these languages let you denote a value of type void. For instance (in Java) imagine this simple bit of reusable code.
public static <A,B> List<B> map(List<A> in, Function<A,B> f) {
List<B> out = new ArrayList<B>(in.size());
for (A a : in) {
out.add(f.apply(a));
}
return out;
}
C++ has std::transform and C# has Enumerable.Select, which are more or less the same idea.
Anyway, given something like the above, calling map/transform/Select with a list of strings and the printHello object should result in a List<void> of the same length as the original list. But what's in the list? The answer, in a functional language, is a bunch of references to (). Tada, the meaningless is useful for writing generic code.
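For contrast, in a language where unit is a real type this just works with the ordinary map. A quick Scala sketch, with my own printHello:

def printHello(name: String): Unit = println("hello " + name)
val units: List[Unit] = List("Alice", "Bob").map(printHello) // prints twice, yields List((), ())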
C++ will allow a Function<std::string, void> but any attempt to call something like map with it will fail to compile. C++ doesn't have a () value so our generic code has become slightly less generic.
The Work Arounds
Java, C++, and C# programmers do solve this problem all the time. One common option is to not use Function<string, void> but some other concept like "Action<string>" to mean a function that takes a string, performs a side effect, and returns nothing useful. Then you essentially duplicate "map" into something called perhaps "foreach" that expects an Action<T> and returns void. That probably makes sense in this case since a list of references to a meaningless value is perhaps silly, but it also means a great deal of code duplication if this kind of higher order programming is common. For instance, you can't write just one compose function that creates a function from two other functions; you also have to write a compose that takes an action and a function to create an action (with a real unit type that duplication evaporates - see the sketch after these workarounds).
Another answer is to make the printHello function object have type Function<String,Object>/Function<string,void*>/Func<string,object> and return a null. That's pretty disgusting.
Perhaps the best option is to create your own Unit enumeration with only a single value. That does leave Unit nullable in Java, but railing against nullability in Java is another blog post.
enum Unit { unit }; // C++, Java, or C#
As for good old C, well, the type system doesn't do parametric polymorphism so generic code in C is problematic no matter what you do. Still, with function pointers it's conceivable you might find a use for a unit value. Typically a C programmer would use 0.
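To make the compose duplication point concrete: in Scala, where Unit is a real type, one generic compose covers functions and "actions" alike, because an action is just a function returning Unit. A sketch with made-up names:

def compose[A, B, C](f: A => B, g: B => C): A => C = a => g(f(a))

val lengthOf: String => Int = _.length
val printInt: Int => Unit = println(_)
val lengthThenPrint: String => Unit = compose(lengthOf, printInt) // no separate Action compose needed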
A Quick Aside About Purity
For the most part I've casually used effects everywhere. Most functional languages like SML, OCaml, F#, and Scala are impure and let the programmer get away with that. Some functional languages, e.g. Haskell without unsafe* extensions, do not. They must track effects using the type system. But even with this caveat, the unit type is still useful, for instance, to distinguish a function that fetches a string from the keyboard, which might have type IO String, from one that prints to the console, which might have type String -> IO Unit.
Conclusion
Void and unit are essentially the same concept. They're used to indicate that a function is called only for its side effects. The difference is that unit actually has a value. That means that unit functions can be used generically wherever a generic first class function is required. The C derived languages miss out by treating void functions as functions that don't return anything instead of as functions that don't return anything meaningful.
Next time I'll talk about functions that really, truly return nothing.
Postscript on Aesthetics and Tuples
Mostly I've talked about the practical value of the unit type in this article, but I do love it when convenient practice and beautiful theory line up. So here's an attack from a different direction.
Many functional languages have tuples. A tuple is an ordered fixed size collection of values of (possibly) different types[5]. Tuple types are usually written with parens. For instance, (Foo, Bar) is the type of pairs (2-tuples) with one Foo, one Bar. The same type is sometimes written as Foo × Bar, which makes sense too. You can see the type as a set that consists of a kind of Cartesian product of the Foo and Bar sets. For instance if Foo has only two possible values {f1, f2} and Bar has only two possible values {b1, b2} then the type Foo × Bar has 4 possible values, {(f1, b1), (f1, b2), (f2, b1), and (f2, b2)}.
You can have tuples with higher numbers of members, like triples and 4-tuples and such.
What about a 1-tuple? Well, the type Foo can be seen as a 1-tuple. There would be no useful difference between the type Foo and the type (Foo).
The question to ask is, given that we have 3-tuples, 2-tuples, and 1-tuples does it make sense to have a 0 tuple? The answer, dear reader, is unit. The unit value is a tuple with no members. That's why Haskell calls both the type and the value "()" - it's like the type (A,B) without the A,B. And just as an empty list and empty set aren't quite "nothing" neither is the empty tuple.
And here's another interesting thing, type theorists will also call the unit type "1" (not to be confused with the value 1). To see why, look at the type (Foo, unit) which can also be written as Foo × 1. If Foo only has 2 possible values {f1, f2} then the entire set of possible values for the type Foo × 1 is {(f1, ()), (f2, ())}. It's the same 2 values extended with a little bit of "meaningless" information that can be stripped off or added back in as desired. Formally speaking it's isomorphic to the type Foo. To abuse notation slightly, Foo × 1 = Foo. So unit is not only the type that has only 1 value, it's the type that behaves like a 1 at the type level.
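In code the isomorphism is a pair of one-liners. Here's a Scala sketch (strip and extend are my names, nothing standard):

def strip[A](p: (A, Unit)): A = p._1
def extend[A](a: A): (A, Unit) = (a, ())
// strip(extend(a)) == a and extend(strip(p)) == p - no information gained or lost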
Footnotes
- In C of course it would be char *name, in C++ it might be char * or I might use the string and iostream libraries. Whatever. The differences aren't that relevant to this discussion.
- To save myself a lot of time I'm using phrases like "functional language" to mean the statically typed functional languages that have been influenced to a large extent by ML. Some of what I'm talking about does translate into some dynamically typed languages as well, although they often "pun" things so that there isn't a unit value that's distinct from nil, null, false or some other singleton value.
- I use the phrase "generic" here to mean what the functional community means by "parametrically polymorphic" or just "polymorphic." However, my target audience is more likely to associate polymorphism with OO dispatch, so "generic" it is.
- In C++ you can also do it without the pure virtual base class and just use template structural typing based on operator(), but shortly it'll be easier for me to describe a problem if I use a base class to give the concept a name. I could also allocate on the stack but that slightly confuses my point. I'm going this route to keep things symmetrical among the languages. Besides, this isn't a tutorial on why C++ isn't Java. On a related note it's a shame C++0x won't have concepts.
- This is quite different from a Python tuple which is really just a list.
Tuesday, April 14, 2009
But, But...You Didn't Mutate Any Variables
In a private email, somebody asked of my previous article "Okay, I can see that there must be state since you've created a state machine. But you aren't mutating any variables so where does the state come from?" It's a good question. If you feel you "got it" then skip this article. But if you're still hesitant or think I'm misusing the word "state" please read on.
First, let me remind that I contend that state is 1) something responding to the same inputs differently over time as far as another observer is concerned and 2) an abstraction and not the representation. None-the-less, state always ends up with some representation. For instance, files can represent state. And even users of programs can hold state (quick, I'm in a drunken state and I plan to write a long rant, who am I?).
I want to ignore all but memory based state and show that, ultimately, even the Erlang code must manipulate mutable memory and that message sending effectively leaks a little bit of that detail. Now, on Reddit an offended Erlanger called me a Java fanboy. As any of my loyal readers know, it is true that I think Java is the most incredibly awesome language EVER and I can hardly stop writing glowing things about it. But for this article I'll struggle against my natural instincts and use Ruby, Haskell, assembly, and Erlang.
I'll use Ruby to illustrate some points about state and representation. First the tired old state diagram and then some Ruby
class SockMachineSimple
  def initialize
    @state = :zerocoins
  end

  def insertcoin
    case @state
    when :zerocoins
      @state = :onecoin
      :nothing
    when :onecoin
      @state = :twocoins
      :nothing
    when :twocoins
      :coin
    end
  end

  def pushbutton
    if @state == :twocoins
      @state = :zerocoins
      :tubeofsocks
    else
      :nothing
    end
  end
end
If you don't read Ruby: control flow structures generally result in the value of their last contained expression, so explicit returns aren't needed; :foo is a symbol (kinda like a string); and @state is an instance variable (private to the object) named state. I called it state because it so clearly matches the state space specified by the diagram. But here's another way to write SockMachine with a completely different representation
class SockMachineComplex
  def initialize
    @count = 0
  end

  def insertcoin
    if @count % 3 == 2
      :coin
    else
      @count = @count + 1
      :nothing
    end
  end

  def pushbutton
    if @count % 3 == 2
      @count = @count + 1
      :tubeofsocks
    else
      :nothing
    end
  end
end
Instances of this class keep track of the total number of coins or button pushes ever accepted and use the modulus operator to decide what to do about it. The representation is quite different from that of SockMachineSimple, but to anybody using this class the difference is mostly irrelevant - it still conforms to the same state diagram. Internally it has a very large number of states but externally it still has the same three. And here's one last stateful machine
class SockMachineDelegated
  def initialize
    @machine = SockMachineComplex.new
  end

  def insertcoin
    @machine.insertcoin
  end

  def pushbutton
    @machine.pushbutton
  end
end
By looking at the source it would appear that this version never mutates any variables after being initialized, but of course it does by calling methods on SockMachineComplex. Hiding the manipulation of representation is not the same thing as making something stateless. And now one last example, one that is totally different because it is not stateful.
class SockMachineStateless
  def pushbutton
    machine = SockMachineComplex.new
    machine.insertcoin
    machine.insertcoin
    machine.pushbutton
  end
end
Internally, pushbutton uses a stateful machine but externally it is not stateful. Short of using a debugger you can't observe the state changes in SockMachineStateless's internal machine. Actually, I guess you could monkey patch SockMachineComplex to do some kind of IO or callback to expose the workings so maybe I shouldn't have used Ruby to illustrate my OO points. Too late now.
Hiding State
Okay, but that's OO stuff. Erlang is a functional language and the code never appeared to have any mutable variables or nested mutable objects. So what gives? Well, to illustrate Erlang I'm going to use Haskell, another functional language. Haskell's great advantage to this exposition is that it's really, really easy to get at some nice clean assembly.
So here's a very silly Haskell function for determining if a positive Int is even or odd. Its implementation is silly, but it allows me to make a point.
flurb :: Int -> String
flurb n = if n == 0 then "even"
          else if n == 1 then "odd"
          else flurb $ n - 2
This is a pure function - it mutates nothing. Yet when compiled with ghc -S -O to get AT&T syntax assembly, the function looks like this
Main_zdwflurb_info:
    movl (%ebp),%eax
    cmpl 1,ドル%eax
    jl .Lczq
    cmpl 1,ドル%eax
    jne .Lczo
    movl $rxA_closure,%esi
    addl 4,ドル%ebp
    andl $-4,%esi
    jmp *(%esi)
.Lczo:
    addl $-2,%eax
    movl %eax,(%ebp)
    jmp Main_zdwflurb_info
.Lczq:
    testl %eax,%eax
    jne .Lczo
    movl $rxy_closure,%esi
    addl 4,ドル%ebp
    andl $-4,%esi
    jmp *(%esi)
GHC has compiled the flurb function down to a loop and the immutable n variable is represented with the mutable eax register [1]. Some pseudo Haskell/assembly mix might help illustrate
flurb n =
    movl n, eax                -- copy n into the eax register
  Main_zdwflurb_info:
    if eax == 0 then "even"
    else if eax == 1 then "odd"
    else
      addl $-2, eax            -- decrement eax by 2
      jmp Main_zdwflurb_info   -- jump to the top of the loop
As you can see, the Haskell code does use mutable "variables" at runtime. This is a common optimization technique in functional languages for dealing with direct tail recursion. But all that machinery is hidden from you as a programmer so just like SockMachineStateless the end result is stateless unless you peel back the covers with a debugger.
Finally, Some Damn Erlang
All right, I've written Ruby and Haskell, generated some assembly, and then written in an impossible chimeric language. But my original question was about Erlang. Here's a simple Erlang counter actor
-module(counter).
-export([create/0, increment/1]).

create() -> spawn(fun() -> loop(0) end).

increment(I) ->
    I ! self(),
    receive X -> X end.

loop(N) ->
    receive
        From ->
            From ! N,
            loop(N + 1)
    end.
Again, no variables are mutated. But if I assume that Erlang does the same kind of direct tail call optimization that Haskell does, the pseudo Erlang/Assembly loop looks like this
loop(N) ->
    movl N, eax        % copy N into the eax register
  .looptop:
    receive
        From ->
            From ! eax,    % send the current value in eax back to the original sender
            inc eax        % increment eax by 1
            jmp .looptop   % jump back to the top of the loop
    end.
It's still a loop like the Haskell one, but there's an important difference. Each message receive sends back the current value of the mutable eax register then mutates it. So this behaves a bit like SockMachineDelegated - the original code didn't appear to directly manipulate any mutable variables, but there was mutation under the covers, and unlike SockMachineStateless but like SockMachineDelegated this change of state is visible beyond the local confines. [2]
Now, there are other ways to deal with recursion. I don't know Erlang's implementation, but it doesn't matter. Something is being mutated somewhere and that change is being made visible by messages being sent back. Typically non tail calls mutate the stack pointer so that it points to new stack frames that hold the current value of arguments, or then again some arguments might stay in registers. Tail calls that aren't direct tail recursion might mutate the stack region pointed to by an unchanging stack pointer. Whatever, it always involves mutation, and it's always hidden from the programmer when using pure functions. But actor loops are not pure functions. The sends and receives are side effects that can allow a little tiny bit of the mutable machine representation to be mapped to state that an observer can witness.
But Really Where Are The Mutable Variables?
So all that's fine, but it doesn't really show where the state is in my original Erlang code. The code didn't have an N to stick in a mutable register or a mutable stack frame. Here's the core of it.
zerocoins() ->
receive
{coin, From} ->
From ! nothing,
onecoin();
{button, From} ->
From ! nothing,
zerocoins()
end.
onecoin() ->
receive
{coin, From} ->
From ! nothing,
twocoins();
{button, From} ->
From ! nothing,
onecoin()
end.
twocoins() ->
receive
{coin, From} ->
From ! coin,
twocoins();
{button, From} ->
From ! tubeofsocks,
zerocoins()
end.
For this final answer I'm afraid I have to do a bit of hand waving. The explanation is even more abstract than the mutable variable as register/stack local explanation. You see, no matter what, your code always mutates one register: the instruction pointer. Its job is to point to the next instruction to be executed. As the CPU executes instructions the IP moves around, typically just being bumped up a bit for everything but jumps, which move it more substantially.
In purely functional code the IP moves around but these moves can't be observed by other parts of the program as they are happening. In other words, in purely functional code, the changing IP is invisible. It's a completely black box unless you use a low level debugger. It's very much like SockMachineStateless where the internals were mutable but invisible to the outside world.
But with message receives the IP can be induced to move around based on things communicated from outside the function. The IP's current instruction can then define in large part what a message received by a process will do. If the IP points at the receive in the "zerocoins()" function then that's one behavior and if it points to the receive in the "twocoins()" function then that's another. The different behaviors can be observed by other actors via messages sent back to them. When an actor sends a SockMachine a coin or buttonpress message it may indirectly be causing the IP to be mutated. And when it gets back nothing, coin, or tubeofsocks it's really being told, in a very indirect way, something about the value of the IP.
Conclusion
State is not the same thing as representation. None-the-less, with an omniscient view of things you can always find the representation of state. It might be in bits stored on a drive, it might be in neurons in a user's head, or it might be in a computer's memory and registers. The representation might not be directly visible in the source you're looking at but hidden in low level machinery. It might just be the instruction pointer pointing at different parts of your code base.
If you equate state with representation you end up with the very un-useful property that everything is stateful since at some level of representation every bit of executing code on your computer involves stateful processes. This view robs you of the ability to think about high level languages, especially purely functional ones like Haskell, at an appropriate level of abstraction.[3]
Purely functional code ensures that the state represented by registers and stack pointers and instruction pointers is kept local. But side effects like message sends and receives allow that low level machinery to be used as the representation of shared mutable state even if you don't see it in your code.
In the end, it's far more useful in most cases to think of state from the point of view of an observer than the point of view of its implementation. The SockMachine actor was stateful regardless of how that state was represented simply because other actors could observe it changing state. Digging down into how send and receive allow a modicum of access to the instruction pointer just isn't a useful model for thinking about stateful actors normally. So the short answer to the original question is "Who cares how the state was represented? It was shared mutable state."
Footnotes
- Actually, it appears to be moving n back and forth between memory pointed to by ebp and eax, which seems oddly wasteful given that it never branches out of this function. It also does 3 tests when only 2 should be necessary.
- Technically the Erlang code probably can't keep the current value of N in a register when calling receive since receive may cause a task switch to another process, so assume that receive copies registers out to memory before executing then swaps them back in when its done.
- Some people equate state with data, but that's just as bad. Code is data so all code must be stateful - another useless conclusion.
Monday, April 6, 2009
Erlang Style Actors Are All About Shared State
Actors have become quite the popular topic. Besides Erlang, there's a famous library implementation in Scala and at least 3 for Java. But the "no shared state" propaganda is setting people up for failure. In last week's exciting episode I defined both what state is and what it isn't. It is the fact that something responds differently to the same inputs over time. It is not the details of how that is represented. Based on that I can now show why saying that Erlang style actors don't share state is totally wrong headed - that, in fact, if you don't want to share state the actor model is a very clunky way of doing things.
Before I go on, let me emphasize that I like Erlang. I like actors, too. It's a very nifty model. But it's not a stateless model. Let me show what I'm ranting against:
- "The actor model consists of a few key principles: No shared state"
- "Erlang is an inherently stateless system."
- "Erlang does not have [...] 'mutable state'"
The problem is that Erlang style actors can have shared, mutable (changeable), state. The model is all about shared state. If you didn't need to share mutable state then there are better alternatives than actors. And, importantly, even when you are using actors well you must think about all of the problems of shared mutable state. All of them. People who believe the propaganda are in for rude shocks.
The Situation
Last time I described a simple sock tube dispensing machine. Insert two coins, press the button, get a tube of socks. Insert too many coins and you get the extra coins returned. Push the button before you've inserted enough coins and you get nothing. Here's the diagram.
Imagine that Alice and Bob work at Crap Corp. ("Making High Quality Crap Since 1913"). Once a day they each like to saunter down to the break room and purchase a nice warm tube of socks. But, this is Crap Corp. and the machines don't take regular coins but Crap Corp. Coins instead. Each CCC weighs 40 pounds (very roughly 18.1436948 kilograms).
Naturally, Alice and Bob don't want to carry 80 pounds of crappy tokens around with them so they each laboriously drag a token down to a machine, insert it, walk back to their cubicle, grab another and repeat. Now, if Alice and Bob take their sock breaks at very different times that's probably going to work fine. But if they tend to overlap bad things happen. It's possible for Alice to insert her first coin, Bob to insert his first coin, Alice to insert her second coin and get a coin back (BONUS! she cries happily, experiencing the greatest joy she's ever experienced at Crap Corp.) So Alice pushes the button, gets her tube of socks, and merrily skips back to her cube. Well, maybe not skip exactly, but whatever you do when you're ecstatically happy while carrying 40 pounds of crap.
Then Bob shows up, inserts his second coin, eagerly smashes the button to get his well deserved tube of socks and ... gets nothing. Feeling cheated he pounds on the machine, kicks it, shakes it, and rocks it back and forth. Inevitably, the machine tips over, falls on Bob, and crushes him with all the tons of Crap Corp. Coins that have been inserted over the weeks. A tragic ending for somebody who just wanted some socks.
Now, that outcome isn't guaranteed even if Bob and Alice start about the same time. On the way to inserting her first coin Alice could be waylaid by the boss looking for his TPS report. As Alice patiently explains that a) TPS reports were never her job and b) they were discontinued three years ago and c) her eyes are on her face not her chest, Bob could have merrily taken the two coin trips and safely received a tube of socks without ever knowing the mortal injury he narrowly avoided.
Finally, Some Damn Code
Any time something unwanted can happen as the result of unpredictable delays, scheduler priorities, workload, etc., you have a race condition. What could be more unwanted than being crushed by a vending machine? And what could be more unpredictable than a pointy haired boss? We can write this up exactly in Erlang.
Put it all in a file called sockmachine.erl. First, a little standard module definition and export business.
-module(sockmachine).
-export([start/0, insertcoin/1, pushbutton/1, test/2]).
Here are the guts of the machine. zerocoins(), onecoin(), and twocoins() are the states of the machine. When one is called it blocks, waiting for a message in its inbox. Based on the message it gets, it responds with {nothing} if nothing happens, {coin} if it needs to return a coin, or {tubeofsocks} for the win. It also then calls the appropriate function for the next state - which might be the same state. These are all private functions not exported by the module. Note, there are more clever ways to write this - but for explanatory purposes I like this.
zerocoins() ->
receive
{coin, From} ->
From ! {nothing},
onecoin();
{button, From} ->
From ! {nothing},
zerocoins()
end.
onecoin() ->
receive
{coin, From} ->
From ! {nothing},
twocoins();
{button, From} ->
From ! {nothing},
onecoin()
end.
twocoins() ->
receive
{coin, From} ->
From ! {coin},
twocoins();
{button, From} ->
From ! {tubeofsocks},
zerocoins()
end.
Start spawns a new sock machine actor in the zerocoins state.
start() -> spawn(fun() -> zerocoins() end).
insertcoin and pushbutton are RPC-style convenience functions that insert a coin or push the button. Or did I get that backwards? Well, whichever, they each return whatever they receive as a message back from the machine.
insertcoin(Machine) ->
Machine ! {coin, self()},
receive X -> X
end.
pushbutton(Machine) ->
Machine ! {button, self()},
receive X -> X
end.
Test spawns as many concurrent test loops as requested to simultaneously pound one machine.
test(Machine, Processes) ->
if
Processes > 0 ->
spawn(fun() -> testloop(Processes, Machine, 10) end),
test(Machine, Processes - 1);
true ->
io:format("all test processes launched~n")
end.
Testloop repeatedly walks through the cycle of inserting 2 coins and pushing the button. It calls helper functions that mirror the state of the sock machine to show what it expects to happen at each step, complaining when things don't go well.
testloop(Process, Machine, Count) ->
if
Count > 0 -> testzerocoins(Process, Machine,Count);
true -> io:format("[~w] testing completed~n", [Process])
end.
testzerocoins(Process, Machine, Count) ->
case insertcoin(Machine) of
{nothing} -> testonecoin(Process, Machine,Count);
{coin} ->
io:format("[~w] BONUS COIN!~n", [Process]),
testtwocoins(Process, Machine,Count)
end.
testonecoin(Process, Machine, Count) ->
case insertcoin(Machine) of
{nothing} -> testtwocoins(Process, Machine,Count);
{coin} ->
io:format("[~w] BONUS COIN!~n", [Process]),
testtwocoins(Process, Machine,Count)
end.
testtwocoins(Process, Machine, Count) ->
case pushbutton(Machine) of
{tubeofsocks} -> io:format("[~w] Got my socks.~n", [Process]);
{nothing} -> io:format("[~w] Blasted machine ate my money! Give it to me! Rattle, rattle, ARRRGGHGHHRHRHGH~n", [Process])
end,
testloop(Process, Machine, Count - 1).
Now fire up erl, compile, start a machine, and test it with only 1 running test loop
1> c(sockmachine).
{ok,sockmachine}
2> Machine = sockmachine:start().
<0.38.0>
3> sockmachine:test(Machine,1).
all test processes launched
ok
[1] Got my socks.
[1] Got my socks.
[1] Got my socks.
[1] Got my socks.
[1] Got my socks.
[1] Got my socks.
[1] Got my socks.
[1] Got my socks.
[1] Got my socks.
[1] Got my socks.
[1] testing completed
Ah, sweet, sweet success! But now run another test with 2 concurrent test loops. 1 = Bob, 2 = Alice...or was that the other way around?
4> sockmachine:test(Machine,2).
all test processes launched
[2] BONUS COIN!
[1] BONUS COIN!
ok
[2] Got my socks.
[1] Blasted machine ate my money! Give it to me! Rattle, rattle, ARRRGGHGHHRHRHGH
[2] BONUS COIN!
[1] BONUS COIN!
[2] Got my socks.
[1] Blasted machine ate my money! Give it to me! Rattle, rattle, ARRRGGHGHHRHRHGH
[2] BONUS COIN!
[1] BONUS COIN!
[2] Got my socks.
[1] Blasted machine ate my money! Give it to me! Rattle, rattle, ARRRGGHGHHRHRHGH
[2] BONUS COIN!
[1] BONUS COIN!
[2] Got my socks.
[1] Blasted machine ate my money! Give it to me! Rattle, rattle, ARRRGGHGHHRHRHGH
[2] BONUS COIN!
[1] BONUS COIN!
[2] Got my socks.
[1] Blasted machine ate my money! Give it to me! Rattle, rattle, ARRRGGHGHHRHRHGH
[2] BONUS COIN!
[1] BONUS COIN!
[2] Got my socks.
[1] Blasted machine ate my money! Give it to me! Rattle, rattle, ARRRGGHGHHRHRHGH
[2] BONUS COIN!
[1] BONUS COIN!
[2] Got my socks.
[1] Blasted machine ate my money! Give it to me! Rattle, rattle, ARRRGGHGHHRHRHGH
[2] BONUS COIN!
[1] BONUS COIN!
[2] Got my socks.
[1] Blasted machine ate my money! Give it to me! Rattle, rattle, ARRRGGHGHHRHRHGH
[2] BONUS COIN!
[1] BONUS COIN!
[2] Got my socks.
[1] Blasted machine ate my money! Give it to me! Rattle, rattle, ARRRGGHGHHRHRHGH
[2] BONUS COIN!
[1] BONUS COIN!
[2] Got my socks.
[1] Blasted machine ate my money! Give it to me! Rattle, rattle, ARRRGGHGHHRHRHGH
[2] testing completed
[1] testing completed
It's a litany of socks, bonus coins and crushed intestines. On my machine it's an oddly predictable litany, but in a real, distributed Erlang app it would be a much more interesting and random litany, as network delays would emulate pointy haired bosses even better than Erlang's scheduler does.
Some Last Notes
With Erlang style programming, actors are the central unit of statefulness. Multiple actors can share access to one stateful actor. Hence shared state, race conditions, and ruptured spleens. Am I saying that Erlang or actors are bad? No, in fact I quite like them. What the Erlang model does very nicely is separate that which must be stateful because it is concurrent from that which is more pure computation. By making state so much more painful to write than "foo.x = foo.x + 1" the actor model encourages you to think about the consequences of sharing it. It also cleanly mirrors the mechanics of distributed computing and asynchronous IO. It's nice, but it's not stateless.
One last note. I started with "actors are all about shared state." Naturally one might ask "well, what about stateless actors - actors that don't change state or depend on state via IO?" Certainly those are viable uses of actors. But that's no longer concurrency, that's parallelism, and IMHO futures, data flow variables, and Haskell's data parallelism are all cleaner ways to deal with parallelism. Someday soon I hope to write about them. In the meantime, the whole point of having the complexity of message passing instead of those simpler mechanisms is precisely to deal with the complexity of concurrent state.
One really last note. Sadly, simple straight up actors don't automatically compose very well. You can design a set of actors that interact correctly for one use case but that don't interact at all well when plugged into a different system. This is another aspect that actors share with traditional manual locks and shared mutable memory. To date the best known way to deal with composable state is transactions (optimistic, pessimistic, distributed, compensating, software, hardware, database, whatever). There are very nice transactional capabilities available for Erlang, but this is yet another area where the "no shared state" mantra can lead people to think that actors are the entire answer without needing anything else.
Try not to get crushed and good luck with your socks!
Postscript
It has been suggested in the comments that when people say that Erlang style actors don't share state they mean it doesn't share memory. First, I clearly defined state in the previous article as being different from its representation. But, just as importantly, I think that saying "we don't share memory" is a distinction without much relevance. It's mostly an implementor's point of view that doesn't reflect how a user must think about actors.
Here's some Erlang code for simple shared mutable variables. This isn't recommended Erlang usage, but it doesn't break the model in any way.
-module(variable).
-export([malloc/0, fetch/1, store/2]).
malloc() -> spawn(fun() -> loop(0) end).
loop(Value) ->
receive
{fetch, From} ->
From ! Value,
loop(Value);
{store, NewValue} ->
loop(NewValue)
end.
fetch(Variable) ->
Variable ! {fetch, self()},
receive X -> X end.
store(Variable, Value) ->
Variable ! {store, Value}.
And here's me using it.
1> c(variable).
{ok,variable}
2> X = variable:malloc().
<0.38.0>
3> variable:store(X,42).
{store,42}
4> variable:fetch(X).
42
5> variable:store(X, variable:fetch(X) + 1).
{store,43}
6> variable:fetch(X).
43
And, since these variables are actors, they are just as easy to share as my sock machine example.
Friday, July 18, 2008
Java Is Too Academic
Occasionally you'll hear people praising Java because it's incredibly readable; there's only one way to do things; and other languages are too academic. But I have proof that they're wrong. After you view the code below I think you'll agree with me that with Java:
- It's far too easy to write code that's illegible to most programmers.
- There are far too many ways to do things.
- It's far too academic.
I have reason to believe other popular languages like Ruby and Python allow this sort of thing, too. When will we finally stop inventing such academic languages and let real world working programmers get some real world work done?
The proof that Java is too academic and obscure? Right here: factorial using a fixed point operation in an applicative order lambda calculus. "Obscure" and "academic" don't even begin to describe this code*.
public interface Lambda {
public Object apply(Object x);
}
public class Factorial {
public static void main(String[] args) {
final int x = new Integer(args[0]);
System.out.println(x + "! = " +
((Lambda)(new Lambda() {
public Object apply(final Object f) {
return new Lambda() {
public Object apply(final Object x) {
return ((Lambda)f).apply(new Lambda(){
public Object apply(final Object y) {
return ((Lambda)((Lambda)x).apply(x)).apply(y);
}
});
}
}.apply(new Lambda() {
public Object apply(final Object x) {
return ((Lambda)f).apply(new Lambda(){
public Object apply(final Object y) {
return ((Lambda)((Lambda)x).apply(x)).apply(y);
}
});
}
});
}
}.apply(new Lambda(){
public Object apply(final Object f) {
return new Lambda() {
public Object apply(final Object x1) {
final int x = (Integer)x1;
return x == 0 ? 1 : x * (Integer)((Lambda)f).apply(x -1);
}
};
}
}))).apply(x));
}
}
* "Obscure" and "academic" may not begin to describe it, but "perverse" starts getting close
Monday, June 23, 2008
Lambda in the Sun
Announcing the creation of Southern California Functional Programmers (SoCalFP), a group for people in LA, Orange County, and San Diego to meet in person and/or virtually to discuss, debate, present, and learn about functional programming concepts and techniques in various languages.
You might ask, "why a functional programming group in Southern California?"
Well, SoCal, wake up and smell the lambda. There's increasing interest in bastions of functional programming like Haskell and various Lisps; popular mainstream languages like Ruby and Python have lambda (or lambda like) capabilities; hybrid OO/functional languages like F# and Scala are generating buzz; C# 3.0 has embraced core functional ideas like closures and monads; and even staid, conservative Java may get some functional goodness in the next version. Perhaps most importantly, programmers can't ignore the oncoming multi-core freight train and Erlang has shown that concurrency and functional programming go together like peanut butter and chocolate.
If you're intrigued come visit our main site and join our mailing list.
Tuesday, November 6, 2007
Monads are Elephants Part 4
Until you experience an adult elephant first hand you won't really understand just how big they can be. If monads are elephants then so far in this series of articles I've only presented baby elephants like List and Option. But now it's time to see a full grown adult pachyderm. As a bonus, this one will even do a bit of circus magic.
Functional Programming and IO
In functional programming there's a concept called referential transparency. Referential transparency means you can call a particular function anywhere, at any time, and the same arguments will always give the same results. As you might imagine, a referentially transparent function is easier to use and debug than one that isn't.
There's one area where referential transparency would seem impossible to achieve: IO. Several calls to the same readLine console function may result in any number of different strings depending on things like what the user ate for breakfast. Sending a network packet may end in successful delivery or it might not.
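To make the contrast concrete, here's a minimal sketch of my own (not part of anything that follows) comparing a referentially transparent function with readLine:

def square(x: Int) = x * x

// Referentially transparent: these two expressions are interchangeable
// anywhere they appear.
val twiceA = square(3) + square(3)
val twiceB = { val s = square(3); s + s }

// Not referentially transparent: these two are NOT interchangeable.
// The first reads two (possibly different) lines; the second reads
// one line and doubles it.
def linesA = Console.readLine + Console.readLine
def linesB = { val s = Console.readLine; s + s }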
But we can't get rid of IO just to accomplish referential transparency. A program without IO is just a complicated way to make your CPU hot.
You might guess that monads provide a solution for referentially transparent IO given the topic of this series but I'm going to work my way up from some simple principles. I'll solve the problem for reading and writing strings on the console but the same solution can be extended to arbitrary kinds of IO like file and network.
Of course, you may not think that referentially transparent IO is terribly important in Scala. I'm not here to preach the one true way of purely functional referential transparency. I'm here to talk about monads and it just so happens that the IO monad is very illustrative of how several monads work.
The World In a Cup
Reading a string from the console wasn't referentially transparent because readLine depends on the state of the user and "user" isn't one of its parameters. A file reading function would depend on the state of the file system. A function that reads a web page would depend on the state of the target web server, the Internet, and the local network. Equivalent output functions have similar dependencies.
All this could be summed up by creating a class called WorldState and making it both a parameter and a result for all IO functions. Unfortunately, the world is a big place. My first attempt to write a WorldState resulted in a compiler crash as it ran out of memory. So instead I'll try for something a bit smaller than modeling the whole universe. That's where a bit of circus magic comes in.
The sleight of hand I'll use is to model only a few aspects of the world and just pretend WorldState knows about the rest of the world. Here are some aspects that would be useful:
- The state of the world changes between IO functions.
- The world's state is what it is. You can't just create new ones whenever you want (val coolWorldState = new WorldState(){def jamesIsBillionaire = true}).
- The world is in exactly one state at any moment in time.
Property 3 is a bit tricky so let's deal with properties 1 and 2 first.
Here's a rough sketch for property 1
//file RTConsole.scala
object RTConsole_v1 {
def getString(state: WorldState) =
(state.nextState, Console.readLine)
def putString(state: WorldState, s: String) =
(state.nextState, Console.print(s) )
}
getString and putString use functions defined in scala.Console as raw primitive functions. They take a world state and return a tuple consisting of a new world state and the result of the primitive IO.
Here's how I'll implement property 2
//file RTIO.scala
sealed trait WorldState{def nextState:WorldState}
abstract class IOApplication_v1 {
private class WorldStateImpl(id:BigInt)
extends WorldState {
def nextState = new WorldStateImpl(id + 1)
}
final def main(args:Array[String]):Unit = {
iomain(args, new WorldStateImpl(0))
}
def iomain(
args:Array[String],
startState:WorldState):(WorldState, _)
}
WorldState is a sealed trait; it can only be extended within the same file. IOApplication defines the only implementation privately so nobody else can instantiate it. IOApplication also defines a main function that can't be overridden and calls a function named iomain that must be implemented in a subclass. All of this is plumbing that is meant to be hidden from programmers who use the IO library.
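As a quick sanity check on that plumbing, here are two hypothetical snippets of mine that the compiler will reject - which is exactly the point:

// Won't compile: WorldState is sealed, so it can't be extended
// outside RTIO.scala.
// val fake = new WorldState { def nextState = this }

// Won't compile: main is final, so a subclass can't bypass iomain.
// class Sneaky extends IOApplication_v1 {
//   override def main(args:Array[String]):Unit = ()
//   def iomain(args:Array[String], s:WorldState) = (s, ())
// }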
Here's what hello world looks like given all this
// file HelloWorld.scala
class HelloWorld_v1 extends IOApplication_v1 {
import RTConsole_v1._
def iomain(
args:Array[String],
startState:WorldState) =
putString(startState, "Hello world")
}
That Darn Property 3
The 3rd property said that the world can only be in one state at any given moment in time. I haven't solved that one yet and here's why it's a problem
class Evil_v1 extends IOApplication_v1 {
import RTConsole_v1._
def iomain(
args:Array[String],
startState:WorldState) = {
val (stateA, a) = getString(startState)
val (stateB, b) = getString(startState)
assert(a == b)
(startState, b)
}
}
Here I've called getString twice with the same inputs. If the code were referentially transparent then the results, a and b, would be the same, but of course they won't be unless the user types the same thing twice. The problem is that "startState" is visible at the same time as the other world states, stateA and stateB.
Inside Out
As a first step towards a solution, I'm going to turn everything inside out. Instead of iomain being a function from WorldState to WorldState, iomain will return such a function and the main driver will execute it. Here's the code
//file RTConsole.scala
object RTConsole_v2 {
def getString = {state:WorldState =>
(state.nextState, Console.readLine)}
def putString(s: String) = {state: WorldState =>
(state.nextState, Console.print(s))}
}
getString and putString no longer get or put a string - instead they each return a new function that's "waiting" to be executed once a WorldState is provided.
//file RTIO.scala
sealed trait WorldState{def nextState:WorldState}
abstract class IOApplication_v2 {
private class WorldStateImpl(id:BigInt)
extends WorldState {
def nextState = new WorldStateImpl(id + 1)
}
final def main(args:Array[String]):Unit = {
val ioAction = iomain(args)
ioAction(new WorldStateImpl(0));
}
def iomain(args:Array[String]):
WorldState => (WorldState, _)
}
IOApplication's main driver calls iomain to get the function it will execute, then executes that function with an initial WorldState. HelloWorld doesn't change too much except it no longer takes a WorldState.
//file HelloWorld.scala
class HelloWorld_v2 extends IOApplication_v2 {
import RTConsole_v2._
def iomain(args:Array[String]) =
putString("Hello world")
}
At first glance we seem to have solved our problem because WorldState is nowhere to be found in HelloWorld. But it turns out it's just been buried a bit.
Oh That Darn Property 3
class Evil_v2 extends IOApplication_v2 {
import RTConsole_v2._
def iomain(args:Array[String]) = {
{startState:WorldState =>
val (statea, a) = getString(startState)
val (stateb, b) = getString(startState)
assert(a == b)
(startState, b)
}
}
}
Evil creates exactly the kind of function that iomain is supposed to return but once again things are broken. As long as the programmer can create arbitrary IO functions he or she can see through the WorldState trick.
Property 3 Squashed For Good
All we need to do is prevent the programmer from creating arbitrary functions with the right signature. Um...we need to do what now?
Okay, as we saw with WorldState it's easy to prevent programmers from creating subclasses. So let's turn our function signature into a trait.
sealed trait IOAction[+A] extends
  Function1[WorldState, (WorldState, A)]
private class SimpleAction[+A](
  expression: => A) extends IOAction[A]...
Unlike WorldState we do need to create IOAction instances. For example, getString and putString are in a separate file but they would need to create new IOActions. We just need them to do so safely. It's a bit of a dilemma until we realize that getString and putString have two separate pieces: the piece that does the primitive IO and the piece that turns the input world state into the next world state. A bit of a factory method might help keep things clean, too.
//file RTIO.scala
sealed trait IOAction_v3[+A] extends
Function1[WorldState, (WorldState, A)]
object IOAction_v3 {
def apply[A](expression: => A):IOAction_v3[A] =
new SimpleAction(expression)
private class SimpleAction [+A](
expression: => A) extends IOAction_v3[A] {
def apply(state:WorldState) =
(state.nextState, expression)
}
}
sealed trait WorldState{def nextState:WorldState}
abstract class IOApplication_v3 {
private class WorldStateImpl(id:BigInt)
extends WorldState {
def nextState = new WorldStateImpl(id + 1)
}
final def main(args:Array[String]):Unit = {
val ioAction = iomain(args)
ioAction(new WorldStateImpl(0));
}
def iomain(args:Array[String]):IOAction_v3[_]
}
The IOAction object is just a nice factory to create SimpleActions. SimpleAction's constructor takes a lazy expression as an argument, hence the "=> A" annotation. That expression won't be evaluated until SimpleAction's apply method is called. To call SimpleAction's apply method, a WorldState must be passed in. What comes out is a tuple with the new WorldState and the result of the expression.
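The "=> A" by-name parameter is doing real work there, so here's a tiny standalone sketch (the names are mine, not the library's) demonstrating that a by-name expression isn't evaluated until it's used:

class Delayed[+A](expression: => A) {
  def force = expression
}

object ByNameDemo {
  def main(args:Array[String]):Unit = {
    val d = new Delayed({ println("evaluating!"); 42 })
    println("nothing has been evaluated yet")
    println(d.force) // "evaluating!" prints only now, followed by 42
  }
}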
Here's what our IO methods look like now
//file RTConsole.scala
object RTConsole_v3 {
def getString = IOAction_v3(Console.readLine)
def putString(s: String) =
IOAction_v3(Console.print(s))
}
And finally our HelloWorld class doesn't change a bit
class HelloWorld_v3 extends IOApplication_v3 {
import RTConsole_v3._
def iomain(args:Array[String]) =
putString("Hello world")
}
A little thought shows that there's no way to create an Evil IOApplication now. A programmer simply has no access to a WorldState. It has become totally sealed away. The main driver will only pass a WorldState to an IOAction's apply method, and we can't create arbitrary IOAction subclasses with custom definitions of apply.
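To convince yourself, here's a hypothetical Evil_v3 attempt, shown commented out because it won't compile:

// class Evil_v3 extends IOApplication_v3 {
//   def iomain(args:Array[String]) =
//     new IOAction_v3[Unit] {            // error: illegal inheritance
//       def apply(state:WorldState) =    // from sealed trait IOAction_v3
//         (state, ())
//     }
// }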
Unfortunately, we've got a combining problem. We can't combine multiple IOActions so we can't do something as simple as "What's your name", Bob, "Hello Bob."
Hmmmm, IOAction is a container for an expression and monads are containers. IOAction needs to be combined and monads are combinable. Maybe, just maybe...
Ladies and Gentlemen, I Present the Mighty IO Monad
The IOAction.apply factory method takes an expression of type A and returns an IOAction[A]. It sure looks like "unit." It's not, but it's close enough for now. And if we knew what flatMap was for this monad then the monad laws would tell us how to create map using it and unit. But what's flatMap going to be? The signature needs to look like def flatMap[B](f: A=>IOAction[B]):IOAction[B]. But what does it do?
What we want it to do is chain an action to a function that returns an action and when activated causes the two actions to occur in order. In other words, getString.flatMap{y => putString(y)} should result in a new IOAction monad that, when activated, first activates the getString action then does the action that putString returns. Let's give it a whirl.
//file RTIO.scala
sealed abstract class IOAction_v4[+A] extends
Function1[WorldState, (WorldState, A)] {
def map[B](f:A => B):IOAction_v4[B] =
flatMap {x => IOAction_v4(f(x))}
def flatMap[B](f:A => IOAction_v4[B]):IOAction_v4[B]=
new ChainedAction(this, f)
private class ChainedAction[+A, B](
action1: IOAction_v4[B],
f: B => IOAction_v4[A]) extends IOAction_v4[A] {
def apply(state1:WorldState) = {
val (state2, intermediateResult) =
action1(state1);
val action2 = f(intermediateResult)
action2(state2)
}
}
}
object IOAction_v4 {
def apply[A](expression: => A):IOAction_v4[A] =
new SimpleAction(expression)
private class SimpleAction[+A](expression: => A)
extends IOAction_v4[A] {
def apply(state:WorldState) =
(state.nextState, expression)
}
}
// the rest remains the same
sealed trait WorldState{def nextState:WorldState}
abstract class IOApplication_v4 {
private class WorldStateImpl(id:BigInt) ...
The IOAction factory and SimpleAction remain the same. The IOAction class gets the monad methods. Per the monad laws, map is just defined in terms of flatMap and what we're using as unit for now. flatMap defers all the hard work to a new IOAction implementation called ChainedAction.
The trick in ChainedAction is its apply method. First it calls action1 with the first world state. This results in a second world state and an intermediate result. The function it was chained to needs that result, and in return the function generates another action: action2. action2 is called with the second world state and the tuple that comes out is the end result. Remember that none of this will happen until the main driver passes in an initial WorldState object.
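Here's a hand trace of that threading (an illustration of mine; the world states are really created by the main driver). Suppose the user is going to type "hi":

// val action = getString flatMap {s => putString(s)}
//
// action(world0) proceeds as:
//   1. getString(world0)       == (world1, "hi")   // run action1
//   2. f("hi")                 == putString("hi")  // build action2
//   3. putString("hi")(world1) == (world2, ())     // run action2
//
// Final result: (world2, ()) - every action saw its own world state.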
A Test Drive
At some point you may have wondered why getString and putString weren't renamed to something like createGetStringAction/createPutStringAction since that's in fact what they do. For an answer, look at what happens when we stick 'em in our old friend "for".
object HelloWorld_v4 extends IOApplication_v4 {
import RTConsole_v4._
def iomain(args:Array[String]) = {
for{
_ <- putString(
"This is an example of the IO monad.");
_ <- putString("What's your name?");
name <- getString;
_ <- putString("Hello " + name)
} yield ()
}
}
It's as if "for" and getString/putString work together to create a mini language just for creating complex IOActions.
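That mini language is ordinary method calls under the hood. The compiler rewrites the for expression above into nested flatMaps capped by a map, roughly like this:

putString("This is an example of the IO monad.") flatMap { _ =>
  putString("What's your name?") flatMap { _ =>
    getString flatMap { name =>
      putString("Hello " + name) map { _ => () }
    }
  }
}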
Take a Deep Breath
Now's a good moment to sum up what we've got. IOApplication is pure plumbing. Users subclass it and create a method called iomain which is called by main. What comes back is an IOAction - which could in fact be a single action or several actions chained together. This IOAction is just "waiting" for a WorldState object before it can do its work. The ChainedAction class is responsible for ensuring that the WorldState is changed and threaded through each chained action in turn.
getString and putString don't actually get or put Strings as their names might indicate. Instead, they create IOActions. But, since IOAction is a monad we can stick it into a "for" statement and the result looks as if getString/putString really do what they say they do.
It's a good start; we've almost got a perfectly good monad in IOAction. We've got two problems. The first is that, because unit changes the world state, we're breaking the monad laws slightly (e.g. m flatMap unit === m) - the sketch after the next paragraph shows exactly how. That's kinda trivial in this case because it's invisible. But we might as well fix it.
The second problem is that, in general, IO can fail and we haven't captured that just yet.
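Before tackling those, here's the promised sketch of the first problem. Using SimpleAction as unit adds an extra state transition that a law-abiding unit wouldn't:

// For a SimpleAction m producing value a:
//   m(state)                                  == (state.nextState, a)
//   (m flatMap {x => IOAction_v4(x)})(state)
//     == IOAction_v4(a)(state.nextState)      // ChainedAction runs m first
//     == (state.nextState.nextState, a)       // "unit" ticks the state again
//
// Same value, one world state too many, so m flatMap unit is not m.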
IO Errors
In monadic terms, failure is represented by a zero. So all we need to do is map the native concept of failure (exceptions) to our monad. At this point I'm going to take a different tack from what I've been doing so far: I'll write one final version of the library with comments inline as I go.
The IOAction object remains a convenient module to hold several factories and private implementations (which could be anonymous classes, but it's easier to explain with names). SimpleAction remains the same and IOAction's apply method is a factory for them.
//file RTIO.scala
object IOAction {
private class SimpleAction[+A](expression: => A)
extends IOAction[A] {
def apply(state:WorldState) =
(state.nextState, expression)
}
def apply[A](expression: => A):IOAction[A] =
new SimpleAction(expression)
UnitAction is a class for unit actions - actions that return the specified value but don't change the world state. unit is a factory method for it. It's kind of odd to make a distinction from SimpleAction, but we might as well get in good monad habits now for monads where it does matter.
private class UnitAction[+A](value: A)
extends IOAction[A] {
def apply(state:WorldState) =
(state, value)
}
def unit[A](value:A):IOAction[A] =
new UnitAction(value)
FailureAction is a class for our zeros. It's an IOAction that always throws an exception. UserException is one such possible exception. The fail and ioError methods are factory methods for creating zeroes. Fail takes a string and results in an action that will raise a UserException whereas ioError takes an arbitrary exception and results in an action that will throw that exception.
private class FailureAction(e:Exception)
extends IOAction[Nothing] {
def apply(state:WorldState) = throw e
}
private class UserException(msg:String)
extends Exception(msg)
def fail(msg:String) =
ioError(new UserException(msg))
def ioError[A](e:Exception):IOAction[A] =
new FailureAction(e)
}
IOAction's flatMap and ChainedAction remain the same. Map changes to actually call the unit method so that it complies with the monad laws. I've also added two bits of convenience: >> and <<. Where flatMap sequences this action with a function that returns an action, >> and << sequence this action with another action. It's just a question of which result you get back. >>, which can be pronounced "then", creates an action that returns the second result, so putString("What's your name? ") >> getString creates an action that will display a prompt then return the user's response. Conversely, <<, which can be pronounced "before", creates an action that will return the result from the first action.
sealed abstract class IOAction[+A]
extends Function1[WorldState, (WorldState, A)] {
def map[B](f:A => B):IOAction[B] =
flatMap {x => IOAction.unit(f(x))}
def flatMap[B](f:A => IOAction[B]):IOAction[B]=
new ChainedAction(this, f)
private class ChainedAction[+A, B](
action1: IOAction[B],
f: B => IOAction[A]) extends IOAction[A] {
def apply(state1:WorldState) = {
val (state2, intermediateResult) =
action1(state1);
val action2 = f(intermediateResult)
action2(state2)
}
}
def >>[B](next: => IOAction[B]):IOAction[B] =
for {
_ <- this;
second <- next
} yield second
def <<[B](next: => IOAction[B]):IOAction[A] =
for {
first <- this;
_ <- next
} yield first
Because we've got a zero now, it's possible to add a filter method by just following the monad laws. But here I've created two forms of filter method. One takes a user specified message to indicate why the filter didn't match whereas the other complies with Scala's required interface and uses a generic error message.
def filter(
p: A => Boolean,
msg:String):IOAction[A] =
flatMap{x =>
if (p(x)) IOAction.unit(x)
else IOAction.fail(msg)}
def filter(p: A => Boolean):IOAction[A] =
filter(p, "Filter mismatch")
A zero also means we can create a monadic plus. As some infrastructure for creating it, HandlingAction is an action that wraps another action and if that action throws an exception then it sends that exception to a handler function. onError is a factory method for creating HandlingActions. Finally, "or" is the monadic plus. It basically says that if this action fails with an exception then try the alternative action.
private class HandlingAction[+A](
action:IOAction[A],
handler: Exception => IOAction[A])
extends IOAction[A] {
def apply(state:WorldState) = {
try {
action(state)
} catch {
case e:Exception => handler(e)(state)
}
}
}
def onError[B >: A](
handler: Exception => IOAction[B]):
IOAction[B] =
new HandlingAction(this, handler)
def or[B >: A](
alternative:IOAction[B]):IOAction[B] =
this onError {ex => alternative}
}
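Before moving on, here are a few hypothetical combinations of mine (assuming the getString and putString actions from RTConsole are in scope) to show how the new operations read. As always, nothing actually runs until the main driver supplies a WorldState:

val promptAndRead = putString("What's your name? ") >> getString
val readThenThank = getString << putString("Thanks!")
val nonEmptyRead  = getString.filter(_ != "", "empty input")
                      .or(IOAction.unit("anonymous"))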
The final version of IOApplication stays the same
sealed trait WorldState{def nextState:WorldState}
abstract class IOApplication {
private class WorldStateImpl(id:BigInt)
extends WorldState {
def nextState = new WorldStateImpl(id + 1)
}
final def main(args:Array[String]):Unit = {
val ioaction = iomain(args)
ioaction(new WorldStateImpl(0));
}
def iomain(args:Array[String]):IOAction[_]
}
RTConsole stays mostly the same, but I've added a putLine method as an analog to println. I've also changed getString to be a val. Why not? It's always the same action.
//file RTConsole.scala
object RTConsole {
val getString = IOAction(Console.readLine)
def putString(s: String) =
IOAction(Console.print(s))
def putLine(s: String) =
IOAction(Console.println(s))
}
And now a HelloWorld application to exercise some of this new functionality. sayHello creates an action from a string. If the string is a recognized name then the result is an appropriate (or inappropriate) greeting. Otherwise it's a failure action.
Ask is a convenience method that creates an action that will display a specified string then get one. The >> operator ensures that the action's result will be the result of getString.
processString takes an arbitrary string and, if it's 'quit', creates an action that will say goodbye and be done. On any other string sayHello is called. The result is combined with another action using 'or' in case sayHello fails. Either way the action is sequenced with the loop action.
Loop is interesting. It's defined as a val just because it can be - a def would work just as well. So it's not quite a loop in the sense of being a recursive function, but it is a recursive value since it's defined in terms of processString which in turn is defined based on loop.
The iomain function kicks everything off by creating an action that will display an intro then do what the loop action specifies.
Warning: because of the way the library is implemented this loop will eventually blow the stack. Do not use it in production code. Read the comments to see why.
object HelloWorld extends IOApplication {
import IOAction._
import RTConsole._
def sayHello(n:String) = n match {
case "Bob" => putLine("Hello, Bob")
case "Chuck" => putLine("Hey, Chuck")
case "Sarah" => putLine("Helloooo, Sarah")
case _ => fail("match exception")
}
def ask(q:String) =
putString(q) >> getString
def processString(s:String) = s match {
case "quit" => putLine("Catch ya later")
case _ => (sayHello(s) or
putLine(s + ", I don't know you.")) >>
loop
}
val loop:IOAction[Unit] =
for {
name <- ask("What's your name? ");
_ <- processString(name)
} yield ()
def iomain(args:Array[String]) = {
putLine(
"This is an example of the IO monad.") >>
putLine("Enter a name or 'quit'") >>
loop
}
}
Conclusion for Part 4
In this article I've called the IO monad 'IOAction' to make it clear that instances are actions that are waiting to be performed. Many will find the IO monad of little practical value in Scala. That's okay, I'm not here to preach about referential transparency. However, the IO monad is one of the simplest monads that's clearly not a collection in any sense.
Still, instances of the IO monad can be seen as containers. But instead of containing values they contain expressions. flatMap and map in essence turn the embedded expressions into more complex expressions.
Perhaps a more useful mental model is to see instances of the IO monad as computations or functions. flatMap can be seen as applying a function to the computation to create a more complex computation.
In the last part of this series I'll cover a way to unify the container and computation models. But first I want to reinforce how useful monads can be by showing an application that uses an elephantine herd of monads to do something a bit more complicated.