The definition of a Y combinator in F# is
let rec y f x = f (y f) x
f expects as its first argument a continuation for the recursive subproblems. Using y f as that continuation, we see that f gets applied to successive calls, since we can expand
y f x = f (y f) x = f (f (y f)) x = f (f (f (y f))) x = ...
The problem is that, a priori, this scheme precludes any tail call optimization: there might be some operation pending in the f's, in which case we can't just reuse the local stack frame associated with f.
So:
- on the one hand, using the Y combinator requires an explicit continuation distinct from the function itself;
- on the other hand, to apply TCO we would like to have no operation pending in f, and to call only f itself.
Do you know of any way in which those two could be reconciled? Like a Y-with-accumulator trick, or a Y-with-CPS trick? Or an argument proving that there is no way it can be done?
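To make the "pending operation" concrete, here is a minimal sketch, transcribed into Haskell (following the second answer's switch of language); the names factStep and the factorial example are mine, not part of the question:

```haskell
-- The same eta-expanded Y as the F# definition above.
y :: ((a -> b) -> a -> b) -> a -> b
y f x = f (y f) x

-- A factorial built with y. The multiplication (n *) is an operation
-- still pending after the recursive call, so that call is NOT a tail
-- call and each step needs its own stack frame.
factStep :: (Integer -> Integer) -> Integer -> Integer
factStep self n = if n <= 1 then 1 else n * self (n - 1)

main :: IO ()
main = print (y factStep 5)  -- 120
```

The `n *` left behind at every level is exactly what prevents reusing the frame.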
2 Answers
Do you know of any way in which those two could be reconciled?
No, and with good reason, IMHO.
The Y-combinator is a theoretical construct and is only needed to make the lambda calculus Turing complete (remember, there are no loops in the lambda calculus, nor do lambdas have names we could use for recursion).
As such, the Y combinator is truly fascinating.
But: Nobody actually uses the Y-combinator for actual recursion! (Except maybe for fun, to show that it really works.)
Tail-call optimization, OTOH, is, as the name says, an optimization. It adds nothing to the expressiveness of a language; it is only because of practical considerations like stack space and the performance of recursive code that we care about it.
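As a reminder of the shape TCO exploits, a minimal sketch (my own example, not from the answer): the recursive call is the entire result, so the frame can be reused; the bang pattern forces the accumulator, which matters in Haskell specifically so no thunk chain builds up.

```haskell
{-# LANGUAGE BangPatterns #-}

-- Tail-recursive factorial with an accumulator: nothing is pending
-- after the recursive call, so it can compile to a loop.
factAcc :: Integer -> Integer -> Integer
factAcc !acc n = if n <= 1 then acc else factAcc (acc * n) (n - 1)

main :: IO ()
main = print (factAcc 1 5)  -- 120
```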
So your question is like: Is there hardware support for beta reduction? (Beta reduction is how lambda expressions are reduced, you know.) But no functional language (as far as I am aware) compiles its source code to a representation of lambda expressions that will be beta reduced at runtime.
-
The Y-combinator is like retying a knot that keeps getting untied after each use. Most systems short-cut this and tie the knot at the meta-level such that it never needs to be retied. – Dan D., Jul 17, 2013 at 16:27
-
As to the last paragraph, consider Haskell, which at its heart uses graph reduction to do lazy evaluation. But my favorite is optimal reduction, which always takes the path in the Church-Rosser lattice with the least reductions to full normal form, such as appears in Asperti and Guerrini's The Optimal Implementation of Functional Programming Languages. Also see BOHM 1.1. – Dan D., Jul 17, 2013 at 16:29
-
@DanD. Thanks for the links, I'll try them later in a PostScript-aware browser. Surely there is something for me to learn. But are you sure that compiled Haskell does graph reduction? I doubt this. – Ingo, Jul 17, 2013 at 16:35
-
Actually, it does use graph reduction: "GHC compiles to the spineless tagless G-machine (STG). This is a notional graph reduction machine (i.e., a virtual machine that performs graph reductions as described above)." From... For more on the STG machine, see Simon Peyton Jones's Implementing lazy functional languages on stock hardware: the Spineless Tagless G-machine. – Dan D., Jul 17, 2013 at 17:10
-
@DanD. In the same article you linked, it reads further down that GHC "does a number of optimisations on that representation, before finally compiling it into real machine code (possibly via C using GCC)." – Ingo, Jul 17, 2013 at 19:51
I'm not completely sure about this answer, but it is the best I could come up with.
The Y combinator is inherently lazy; in strict languages the laziness must be added manually through extra lambdas.
let rec y f x = f (y f) x
Your definition looks like it requires laziness in order to terminate; otherwise the (y f) argument would never finish evaluating, and it would have to be evaluated whether or not f used it. TCO in a lazy context is more complicated, and furthermore the result of (y f) is repeated function composition without application to the x. I'm not sure this need take O(n) memory, where n is the depth of the recursion, but I doubt you could achieve the same kind of TCO as is possible with something like (switching to Haskell because I don't actually know F#)
length acc [] = acc
length acc (a:b) = length (acc+1) b
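To illustrate the laziness point above, a sketch (the fib example is mine): in Haskell the un-eta-expanded Y works directly, because (y f) is only forced when f demands it; in a strict language, evaluating that argument up front would loop forever.

```haskell
-- The lazy Y: no eta-expansion needed, since (yLazy f) is a thunk
-- that is only evaluated when f actually uses its first argument.
yLazy :: (a -> a) -> a
yLazy f = f (yLazy f)

-- Fibonacci built with the lazy Y.
fib :: Integer -> Integer
fib = yLazy (\self n -> if n < 2 then n else self (n - 1) + self (n - 2))

main :: IO ()
main = print (fib 10)  -- 55
```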
If you aren't already aware of it, the difference between foldl and foldl' in Haskell may shed some light on the situation. foldl is written as it would be in an eager language. But instead of being TCO'd, it is actually worse than foldr, because the accumulator stores a potentially enormous thunk that cannot be partially evaluated. (This is related to why neither foldl nor foldl' works on infinite lists.) Thus in more recent versions of Haskell, foldl' was added, which forces the evaluation of the accumulator each time the function recurs, ensuring no enormous thunk is created. I am sure http://www.haskell.org/haskellwiki/Foldr_Foldl_Foldl%27 can explain this better than I.
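A minimal sketch of foldl' in use (the list and the name total are mine): forcing the accumulator at each step keeps the fold in constant space, whereas foldl would build a million-deep chain of (+) thunks here.

```haskell
import Data.List (foldl')

-- foldl' evaluates the accumulator at every step, so this runs in
-- constant space instead of accumulating suspended additions.
total :: [Integer] -> Integer
total = foldl' (+) 0

main :: IO ()
main = print (total [1 .. 1000000])  -- 500000500000
```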
f. We can see that y could tailcall f with a thunk (y f), but as you say f might have some pending operation. I think it would be interesting to know if there's a separate combinator that is more tailcall-friendly. I wonder if this question would get better attention on the CS Stackexchange site?