
Tuesday, August 5, 2025

Recursive Prompting

What if we give the LLM the ability to prompt itself? I added a “tool” to the LLM prompt that allows the LLM to prompt itself by calling the promptLLM function with a string.

I guess it isn't surprising that this creates an infinite loop. The tool appears to have a higher affinity than the token prediction engine, so the LLM will always try to call the tool rather than do the work itself. The result is that the LLM calls the tool, which calls the LLM, which calls the tool, which calls the LLM, etc.

We can easily fix this by not binding the tool in the recursive call to the LLM. The recursive call will not have the tool, so it will engage in the token prediction process. Its results come back to the tool, which passes them back to the calling LLM, which returns the results to us.
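Here is a minimal sketch of the shape of this fix, assuming a hypothetical prompt-llm client function that takes a prompt string and a list of tool handlers (neither name comes from a real API):

;; Hypothetical client interface: PROMPT-LLM takes a prompt string and a
;; list of tool handlers that the LLM may invoke.
(defun make-self-prompt-tool ()
  "Tool that re-prompts the LLM.  The recursive call passes an empty
tool list, so the inner LLM cannot call the tool again and must answer
by ordinary token prediction."
  (lambda (prompt-string)
    (prompt-llm prompt-string :tools '())))

(defun prompt-with-self-call (prompt-string)
  ;; The outer call binds the tool; the recursive call does not.
  (prompt-llm prompt-string :tools (list (make-self-prompt-tool))))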

Could there be a point to doing this, or is this just some recursive wankery that Lisp hackers like to engage in? Actually, this has some interesting applications. When the tool makes the recursive call, it can pass a different set of generation parameters to the LLM. This could be a different tool set or a different set of system instructions. We could erase the context on the recursive call so that the LLM can generate "cold" responses on purpose. We could also use this to implement a sort of "call-with-current-continuation" on the LLM where we save the current context and then restore it later.

The recursive call to the LLM is not tail recursive, though. Yeah, you knew that was coming. If you tried to use self prompting to set up an LLM state machine, you would eventually run out of stack. A possible solution to this is to set up the LLM client as a trampoline. You'd have some mechanism for the LLM to signal to the LLM client that the returned string is to be used to re-prompt the LLM. Again, you'd have to be careful to avoid infinite self-calls. To avoid accumulating state on a tail call, the LLM client would have to remove the recent context elements so that the "tail prompt" is not a continuation of the previous prompt.
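A minimal sketch of such a trampoline, reusing the hypothetical prompt-llm client above plus a made-up reprompt-request-p predicate for recognizing the re-prompt signal:

(defun run-llm-trampoline (initial-prompt &key (max-bounces 100))
  "Drive the LLM as a trampoline: each tail prompt replaces the
previous one instead of growing a chain of recursive calls.  Context is
reset on each bounce so the tail prompt is not a continuation of the
previous prompt."
  (let ((prompt initial-prompt))
    (dotimes (bounce max-bounces (error "Too many self-prompts."))
      (let ((response (prompt-llm prompt :context '())))
        (if (reprompt-request-p response)   ; hypothetical signal check
            (setq prompt (reprompt-text response))
            (return-from run-llm-trampoline response))))))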

Recursive prompting could also be used to search the prompt space for prompts that produce particular desired results.

If you had two LLMs, you could give each the tools needed to call the other. The LLM could consult a different LLM to get a “second opinion” on some matter. You could give one an “optimistic” set of instructions and the other a “pessimistic” set.

The possibilities for recursive prompting are endless.

Monday, April 7, 2025

Are You Tail Recursive?

I was trying to write a procedure that would test whether tail recursion was enabled or not. It turns out that this is semi-decidable. You can only tell if it is not tail recursive by blowing the stack, but you can never be sure that somebody didn’t make a really, really big stack.

But for the sake of argument, let’s assume that 2^28 recursive calls are enough to blow the stack. Also, let’s assume that the system will correctly signal an error and be able to recover from the blown stack once it is unwound. Then we can test for tail recursion as follows:

(defconstant +stack-test-size+ (expt 2 28))

(defun test-self-tail-recursion (x)
  (or (> x +stack-test-size+)
      (test-self-tail-recursion (1+ x))))

(defun test-mutual-tail-recursion (x)
  (or (> x +stack-test-size+)
      (test-mutual-tail-recursion-1 (1+ x))))

(defun test-mutual-tail-recursion-1 (x)
  (or (> x +stack-test-size+)
      (test-mutual-tail-recursion (1+ x))))

(defun self-tail-recursive? ()
  (ignore-errors (test-self-tail-recursion 0)))

(defun mutually-tail-recursive? ()
  (ignore-errors (test-mutual-tail-recursion 0)))

Naturally, your optimize settings are going to affect whether or not you have tail recursion enabled, and even if this code says it is enabled, it doesn’t mean that a different compilation unit compiled at a different time with different optimizations would be tail recursive as well. But if you run this in your default environment and it returns NIL, then you can be pretty sure your default is not tail recursive. If it returns T, however, then there are at least some conditions under which your system is tail recursive.

Use of certain features of Common Lisp will “break” tail recursion: special variable binding, multiple-value-bind, unwind-protect, catch, and basically anything that marks the stack or dynamically allocates on the stack. If control has to return to the current function after a call, then the call is not a tail call, but if control can return directly to the caller of the current function, then the compiler should be able to make a tail recursive call.
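For instance, wrapping the recursive call from the earlier test in an unwind-protect should blow the stack even where tail recursion is otherwise enabled, because control must return to the current frame to run the cleanup form (a sketch, reusing +stack-test-size+ from above):

(defun test-broken-tail-recursion (x)
  (unwind-protect
       (or (> x +stack-test-size+)
           (test-broken-tail-recursion (1+ x)))
    nil))   ; the cleanup form runs after the call returns,
            ; so the call above is not in tail position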

Most, but not all, Common Lisp implementations can be configured to enable tail recursion, and it is usually possible to disable it if desired. If you want tail recursion, but your implementation cannot provide it, you are SOL. If you don’t want tail recursion, but your implementation provides it by default, there are a number of things you can do to disable it. Usually, you can set the debugging options to retain every stack frame, or you can set the debug optimization to disable tail recursion. If the recursive call is not in tail position, then it is incorrect to tail call it, so you can disable tail recursion by moving the recursive call out of tail position. For example, you could write a without-tail-call macro:

(defmacro without-tail-call (form)
  (let ((value (gensym)))
    `((lambda (,value) ,value) ,form)))

;; Use like this:
(defun factorial (x &optional (acc 1))
  (if (zerop x)
      (progn (cerror "Return the factorial" "Computed a factorial")
             acc)
      (without-tail-call
        (factorial (1- x) (* acc x)))))

Running this program will drop us into the debugger because of the cerror and we can inspect the stack.

> (factorial 100)
Computed a factorial
   [Condition of type SIMPLE-ERROR]
Restarts:
 0: [CONTINUE] Return the factorial
 1: [RETRY] Retry SLY mREPL evaluation request.
 2: [*ABORT] Return to SLY’s top level.
 3: [ABORT] abort thread (#<THREAD tid=179 "sly-channel-1-mrepl-remote-1" RUNNING {1000BA8003}>)
Backtrace:
 0: (FACTORIAL 0 2432902008176640000)
 1: (FACTORIAL 1 2432902008176640000)
 2: (FACTORIAL 2 1216451004088320000)
 3: (FACTORIAL 3 405483668029440000)
 4: (FACTORIAL 4 101370917007360000)
 5: (FACTORIAL 5 20274183401472000)
 6: (FACTORIAL 6 3379030566912000)
 ...

As requested, each recursive call pushes a stack frame.

But what if the compiler “optimizes” ((λ (x) x) (foo)) to simply be (foo)? That is a questionable “optimization”. Lisp is a call-by-value language, meaning that the arguments to a function are fully evaluated before the function is applied to them. The call to (foo) should be fully evaluated before applying (λ (x) x) to it. The “optimization” essentially treats (foo) as unevaluated and then tail calls it after applying the lambda expression. This is call-by-name evaluation. While anything is possible, an “optimization” that sometimes changes the function call semantics from call-by-value to call-by-name is dubious.

Friday, January 17, 2025

Iteration

Iteration is simply that special case of recursion that doesn’t accumulate storage in the long term. It’s a notable special case because computer storage is finite, and you want to be able to write algorithms that are bound by constant space.

There are two common strategies that computer languages use to approach iteration. Functional languages like Scheme and Haskell make sure that normal function calls do not accumulate storage per se. Function calls can be used to direct the control flow, and if they direct the control flow in a loop, you will iterate. Most other languages achieve iteration via special iteration constructs that you must use if you want to iterate. Each of these approaches has its own advantages and disadvantages.

The advantages of using special iteration constructs are these:

  • It is clear that you are iterating.
  • Special constructs are usually optimized for iteration and have particular compiler support to make them efficient.
  • Special constructs are constrained so that you cannot accidentally write non-iterative code.

The disadvantages of using special iteration constructs are these:

  • Special constructs are drawn from a fixed set of constructs that are built in to the language. If you want to iterate differently, you are out of luck.
  • Special constructs usually do not cross function boundaries. Iteration must reside in a single function.
  • You have to decide beforehand that you want to iterate and choose an iteration construct.
  • Special constructs are usually imperative in nature and operate via side effects.

The alternative approach used by functional languages is to make the language implementation tail recursive. This has these advantages:

  • Iteration is automatic. You don’t have to decide that you want to iterate; it just happens when it can.
  • Iteration can cross function boundaries.
  • You can write your own iteration constructs and build them out of ordinary functions.
  • Iteration can be done purely functionally, without side effects.

The disadvantages of using tail recursion for iteration are these:

  • It is not obvious that you are iterating or intended to.
  • You have to be careful to place all the iteration in tail position or you will blow the stack. Beginner programmers often have difficulty recognizing which calls are tail calls and can find it hard to avoid blowing the stack.
  • Small, innocent looking changes in the code can change its behavior to be non tail recursive, again blowing the stack.
  • The stack no longer contains a complete call history. If you rely on the stack as a call history buffer for debugging, you may find debugging more difficult.

The code in an iteration can be classified as being part of the machinery of iteration — the part that sets up the iteration, tests the ending condition, and advances to the next iteration — or part of the logic of the iteration — the specific part that you are repeating. The machinery of the iteration is usually the same across many iterations, while the logic of the iteration is idiomatic to the specific instance of iteration. For example, all iterations over a list will have a null test, a call to CDR to walk down the list, and a call to CAR to fetch the current element, but each specific iteration over a list will do something different to the current element.

There are several goals in writing iterative code. One is to have efficient code that performs well. Another is to have clear code that is easy to understand, debug, and maintain. You choose how to iterate based on these considerations. For the highest performing code, you will want detailed control over what the code is doing. You may wish to resort to using individual assignments and GOTO statements to squeeze the last clock cycles out of an inner loop. For the clearest code, you will want to use a high degree of abstraction. A clever compiler can generate efficient code from highly abstracted code, and experienced programmers know how to write abstract code that can be compiled to efficient code.

Here are some examples of iteration strategies in Lisp. To make these examples easy to compare I chose a simple problem to solve: given a list of numbers, return both a list of the squares of the numbers and the sum of the squares. This is a simple problem that can be solved in many ways.

Tagbody and Go

A tagbody is a block of code that is labeled with tags. You can jump to a tag with a go statement. This is a very low level form of iteration that is not used much in modern Lisp programming. Here is an example of a tagbody:

(defun iteration-example-with-tagbody (numbers)
  (let ((squares '())
        (total 0)
        (nums numbers))
    (tagbody
     start
       (if (null nums)
           (go end))
       (let ((square (* (car nums) (car nums))))
         (setq squares (cons square squares))
         (incf total square))
       (setq nums (cdr nums))
       (go start)
     end)
    (values (nreverse squares) total)))

This is like programming in assembly code. The go instructions turn into jumps. This code is very efficient, but it is not particularly clear. The machinery of the iteration is mixed in with the logic of the iteration, making it hard to see what is going on. The code is not very abstract.

State Machine via Mutual Tail Recursion

Here we use tail recursion to iterate. The compiler will turn the tail recursive call into a jump and the variable rebinding into assignments, so this code will be about as efficient as the tagbody code above.

(defun iteration-example-tail-recursive (numbers &optional (squares '()) (total 0))
  (if (null numbers)
      (values (nreverse squares) total)
      (let ((square (* (car numbers) (car numbers))))
        (iteration-example-tail-recursive
         (cdr numbers)
         (cons square squares)
         (+ total square)))))

This state machine only has one state, so it is not a very interesting state machine. The ultimate in iteration control is to write an iterative state machine using mutually tail recursive functions. The compiler will generate very efficient code for this, and you can write the code in a very abstract way. Here is an example of a state machine that simulates the action of a turnstile:

(defun turnstile (actions)
  "State machine to simulate a turnstile with actions 'push', 'coin', and 'slug'."
  (locked-state actions '() '()))

(defun locked-state (actions collected return-bucket)
  (cond ((null actions) (list collected return-bucket))
        ((eql (car actions) 'coin)
         (unlocked-state (cdr actions) collected return-bucket))
        ((eql (car actions) 'push)
         (locked-state (cdr actions) collected return-bucket))  ;; Ignore push in locked state
        ((eql (car actions) 'slug)
         (locked-state (cdr actions) collected (append return-bucket '(slug))))  ;; Return slug
        (t (locked-state (cdr actions) collected return-bucket))))

(defun unlocked-state (actions collected return-bucket)
  (cond ((null actions) (list collected return-bucket))
        ((eql (car actions) 'push)
         (locked-state (cdr actions) (append collected '(coin)) return-bucket))
        ((eql (car actions) 'coin)
         (unlocked-state (cdr actions) collected (append return-bucket '(coin))))  ;; Return coin
        ((eql (car actions) 'slug)
         (unlocked-state (cdr actions) collected (append return-bucket '(slug))))  ;; Return slug
        (t (unlocked-state (cdr actions) collected return-bucket))))

;; Example usage:
(turnstile '(coin push coin push))      ;; => ((coin coin) ())
(turnstile '(push coin push))           ;; => ((coin) ())
(turnstile '(coin coin push push))      ;; => ((coin) (coin))
(turnstile '(push))                     ;; => (NIL NIL)
(turnstile '(coin push push))           ;; => ((coin) ())
(turnstile '(coin coin coin push))      ;; => ((coin) (coin coin))
(turnstile '(slug coin push))           ;; => ((coin) (slug))
(turnstile '(coin slug push))           ;; => ((coin) (slug))
(turnstile '(slug slug push coin push)) ;; => ((coin) (slug slug))

The iteration machinery is still interwoven with the logic of the code. We’re still finding calls to null and cdr sprinkled around the code. Nonetheless, structuring iterative code this way is a big step up from using a tagbody and go. This is my go-to method for complex iterations that cannot easily be expressed as a map or reduce.

Loop Macro

Common Lisp’s loop macro is a very powerful iteration construct that can be used to express a wide variety of iteration patterns.

(defun loop-iteration-example (numbers)
  (loop for num in numbers
        for square = (* num num)
        collect square into squares
        sum square into total
        finally (return (values squares total))))

Call me a knee-jerk anti-loopist, but this doesn’t look like Lisp to me. It has some major problems:

  • It is highly imperative. To understand what is going on, you have to follow the code in the order it is written. You need to have a mental model of the state of the loop at each point in the iteration. Running into a loop when reading functional code takes you out of the zen of functional programming.
  • The bound variables are not lexical, they are scattered around the code. You have to carefully examine each clause to figure out what variables are being bound.
  • You need a parser to walk the code. There is nothing that delimits the clauses of the loop; it is a flat list of random symbols and forms. You couldn’t easily write a program that takes a loop form and transforms it in some way.

Do and Friends

The do macro, and its friends dolist, dotimes, and do*, etc., are the most common iteration constructs in Common Lisp.

(defun iteration-example-with-do (numbers)
  (let ((squares '())
        (total 0))
    (do ((nums numbers (cdr nums)))
        ((null nums) (values (nreverse squares) total))
      (let ((square (* (car nums) (car nums))))
        (setq squares (cons square squares))
        (incf total square)))))

The do macros have some drawbacks:

  • They are imperative. The body of a do loop ultimately must have some sort of side effect or non-local exit to “get a value out”. Notice how we bind accumulator variables in an outer scope and assign them in the inner one. This is a common pattern in a do loop.
  • They do not compose. You can nest a dotimes inside a dolist, e.g., but you cannot run a dotimes in parallel with a dolist.
  • They are incomplete. There is no do-array or do-string, for example.

But at least you can parse them and transform them. They are structured, and you can write a program that walks the clauses of a do loop and does something with them.
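To be fair, plain do can step several variables in parallel within a single loop, which is more than dolist and dotimes can do together. A small illustration:

;; Walk a list and count positions in lockstep, all in one DO.
(defun enumerate (items)
  (do ((tail items (cdr tail))
       (index 0 (1+ index))
       (result '() (cons (cons index (car tail)) result)))
      ((null tail) (nreverse result))))

;; (enumerate '(a b c)) => ((0 . A) (1 . B) (2 . C))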

Map and Reduce

Map and reduce abstract the machinery of iteration away from the logic of the iteration through the use of higher-order functions. The resulting code is clear and concise:

(defun iteration-example-with-map-reduce (numbers)
  (let* ((squares (map 'list (lambda (num) (* num num)) numbers))
         (total (reduce #'+ squares)))
    (values squares total)))

The looping is implicit in the map and reduce functions. You can usually assume that the language implementors have optimized these functions to be reasonably efficient.

I often see programmers writing looping code when a perfectly good library function exists that does the same thing. For example, it is common to want to count the number of items in a sequence, and Common Lisp supplies the count function just for this purpose. There is no need to write a loop.

Common Lisp provides a filter function, but it is called remove-if-not.
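For example:

(count #\a "banana")               ; => 3
(count-if #'evenp '(1 2 3 4 5 6))  ; => 3
(remove-if-not #'oddp '(1 2 3 4))  ; => (1 3)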

The drawback of using these functions is that large intermediate sequences can be created. In our example code, the entire list of squares is constructed prior to reducing it with #’+. Of course the entire list is one of the return values, so you need it anyway, but if you only needed the sum of the squares, you would prefer to sum it incrementally as you go along rather than constructing a list of squares and then summing it. For small sequences, it doesn’t make a difference.
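If only the sum were needed, the intermediate list could be avoided entirely; reduce's :key argument applies the squaring to each element as the reduction goes along:

(defun sum-of-squares (numbers)
  (reduce #'+ numbers :key (lambda (num) (* num num))))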

Series

The series macro suite attempts to bring you the best of both worlds. You write series expressions that look like sequence functions, but the macro recognizes that you are iterating and generates efficient incremental code.

(defun iteration-example-with-series (numbers)
  (let ((squares (map-fn 'integer (lambda (n) (* n n)) (scan 'list numbers))))
    (values (collect 'list squares)
            (collect-sum squares))))

This code is very similar to the sequence case, but the series macro will generate code that does not construct the entire list of squares before summing them. It will sum them incrementally as it goes along.

Series will expand into a tagbody. For example, the above code will expand into something like this:

(COMMON-LISP:LET* ((#:OUT-1015 NUMBERS))
 (COMMON-LISP:LET (#:ELEMENTS-1012
 (#:LISTPTR-1013 #:OUT-1015)
 (SQUARES 0)
 #:SEQ-1018
 (#:LIMIT-1019
 (COMMON-LISP:MULTIPLE-VALUE-BIND (SERIES::X SERIES::Y)
 (SERIES::DECODE-SEQ-TYPE (LIST 'QUOTE 'LISTS))
 (DECLARE (IGNORE SERIES::X))
 SERIES::Y))
 (#:LST-1020 NIL)
 (#:SUM-1023 0))
 (DECLARE (TYPE LIST #:LISTPTR-1013)
 (TYPE INTEGER SQUARES)
 (TYPE (SERIES::NULL-OR SERIES::NONNEGATIVE-INTEGER) #:LIMIT-1019)
 (TYPE LIST #:LST-1020)
 (TYPE NUMBER #:SUM-1023))
 (TAGBODY
 #:LL-1026
 (IF (ENDP #:LISTPTR-1013)
 (GO SERIES::END))
 (SETQ #:ELEMENTS-1012 (CAR #:LISTPTR-1013))
 (SETQ #:LISTPTR-1013 (CDR #:LISTPTR-1013))
 (SETQ SQUARES ((LAMBDA (N) (* N N)) #:ELEMENTS-1012))
 (SETQ #:LST-1020 (CONS SQUARES #:LST-1020))
 (SETQ #:SUM-1023 (+ #:SUM-1023 SQUARES))
 (GO #:LL-1026)
 SERIES::END)
 (COMMON-LISP:LET ((SERIES::NUM (LENGTH #:LST-1020)))
 (DECLARE (TYPE SERIES::NONNEGATIVE-INTEGER SERIES::NUM))
 (SETQ #:SEQ-1018 (MAKE-SEQUENCE 'LISTS (OR #:LIMIT-1019 SERIES::NUM)))
 (DO ((SERIES::I (1- SERIES::NUM) (1- SERIES::I)))
 ((MINUSP SERIES::I))
 (SETF (ELT #:SEQ-1018 SERIES::I) (POP #:LST-1020))))
 (VALUES #:SEQ-1018 #:SUM-1023)))

90% of the time, the series macro will produce very efficient code, but 10% of the time the macro loses its lunch. It takes a little practice to get used to when the series macro will work and to write code that the series macro can handle.

Conclusion

There are many ways to iterate in Lisp, some more efficient than others, some more abstract than others. You choose the way that suits your needs. I like the abstraction of the series macro, but I will also use a library function like count when it is appropriate. When I need tight control, I’ll write a state machine.

Friday, January 3, 2025

REBOL 1.0 Was Slow

Rebol 1.0 was slow. I paid little attention to speed in the implementation — I was concerned with correctness. The interpreter was intended to be a reference implementation, with well-defined behavior on every edge case. My intent was to add a compiler at a later date.

One source of slowness was the liberal use of first-class continuations in the interpreter. Rebol 1.0 used a “Cheney on the MTA” interpretation strategy, where no function ever returned a value and the stack simply got deeper and deeper. When the stack overflowed, a stack garbage collection was triggered. Since most of the stack was garbage, this was a fast operation (I used a garbage collector that used time proportional to live storage). With such an implementation, first-class continuations were trivial to implement — all continuations were first-class, it was just a question of whether you surfaced them to the user. I didn’t have an ideological belief either way, but there they were, so why not? Many control flow constructs that would otherwise require an ad hoc implementation can be easily implemented with first-class continuations.

Rebol had return statements that would return control to the caller from within the function. 99% of the time, the caller is sitting on the stack just above the current frame. But 1% of the time, the user would do something weird like create a lexical closure over the return statement and pass it downward. Like as not he didn’t deliberately do this, but rather used some library that was implemented in continuation-passing style. If this happened, the return statement might have to unwind an arbitrary amount of stack. To implement this, I captured the current continuation at the entry of each function and bound it to the implicit “return” variable. Invoking return invoked the continuation and returned control to the caller. The advantage of doing it this way was that return statements had the correct semantics under all circumstances. There were no special rules governing use of return and no code had to have special cases for unexpected returns.

A similar thing happened in the implementation of break and continue in loops. These were implemented by capturing the continuation at the entry of the loop and binding it to the implicit break variable, and capturing the continuation on each iteration and binding it to the implicit continue variable. Because these were first-class continuations, they could be used to restart the loop after it exited. That wasn’t a requirement. I was perfectly happy to stipulate that break and continue only work while a loop is in progress, but in Rebol 1.0, they’d continue to work after the loop finished.

Worrying about continuations captured in lexical closures may seem weird, but it’s a real issue. It is common to introduce implicit lexical contours in a program: even a let expression does it. You would like to be able to use break and continue in the body of a let expression in a loop. Some Rebol constructs were implemented by implicitly macroexpanding the code into a call to a helper function. break and continue would work across function call boundaries, so there were no limitations on introducing helper functions within a loop.

A more traditional language has a handful of ad hoc iteration constructs that are implemented with special purpose code. The special purpose code knows it is a loop and can be optimized for this. break and continue statements have a special dependency on the enclosing loop.

Rebol 1.0 was properly tail recursive, so there was no special implementation of loops. They were ordinary functions that happened to call themselves. Non-standard iteration constructs could be created by the user by simply writing code that called itself. break and continue just surfaced the interpreter’s continuation to the user. As a consequence, loops in Rebol 1.0 were implemented completely in Rebol code but had significant interpreter overhead.

Rebol 2.0 and later are not properly tail recursive. As a consequence, special looping constructs are required to be written in C to support iteration. Common iteration constructs such as for and while are provided and do not have interpreter overhead, but if you want a non-standard iteration construct, there is no way to achieve it. You have to re-write your code to use one of the built-in iteration constructs or go without and risk blowing the stack.

My intent was to eventually write a compiler for Rebol. I wrote a prototype called Sherman that compiled to MIT-Scheme and was supported by the MIT-Scheme runtime library. Loops compiled with Sherman ran quickly as expected.

Wednesday, June 26, 2024

Goto not that harmful

I use goto all the time. When I have a loop, I goto the top of the loop after each iteration. When a function delegates to another function, I just goto the other function.

Gotos that just transfer control have the problem that the context is implicit. It isn’t obvious from the code what parts of the context are expected to persist across the control transfer. But if you explicitly pass arguments along with your control transfer, then you can see exactly what is carried across the control transfer.

A tail call isn’t “optimized” to a goto, it is a goto. It is a goto that passes arguments. Gotos that pass arguments aren’t harmful.
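The contrast is easy to see in Lisp. Here is the same loop written twice (a sketch): once with a bare go, where the loop state lives implicitly in surrounding variables, and once as a tail call, where everything that survives the transfer is passed explicitly as an argument:

;; Implicit context: TOTAL and ITEMS persist silently across the GO.
(defun sum-list-goto (items)
  (let ((total 0))
    (tagbody
     top
       (unless (null items)
         (incf total (car items))
         (setq items (cdr items))
         (go top)))
    total))

;; Explicit context: everything carried across the transfer is an argument.
(defun sum-list-tail (items total)
  (if (null items)
      total
      (sum-list-tail (cdr items) (+ total (car items)))))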

Friday, May 24, 2024

Why I Want Tail Recursion

The reason I want tail recursion is not to write loops (I can do that with a while loop), but to write in continuation passing style if I need to. Continuation passing style allows you to implement any control flow pattern you can imagine, not just the ones intrinsic to the language. You don’t want to use it all the time, but it’s a valuable fallback when you need some ad hoc advanced control flow. Without tail recursion, any non-trivial use of continuation passing style risks blowing the stack.
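As a small illustration (my sketch, not from the original post), here is a backtracking search written in continuation passing style. The failure continuation is an ad hoc control flow path the language never had to provide, and with tail recursion each call to succeed or fail is a goto:

(defun find-leaf (tree predicate succeed fail)
  "Search a cons tree for a leaf satisfying PREDICATE.  SUCCEED is
called with the leaf; FAIL is called with no arguments to backtrack.
All calls are in tail position."
  (cond ((null tree) (funcall fail))
        ((atom tree)
         (if (funcall predicate tree)
             (funcall succeed tree)
             (funcall fail)))
        (t (find-leaf (car tree) predicate succeed
                      (lambda ()
                        (find-leaf (cdr tree) predicate succeed fail))))))

;; (find-leaf '((1 3) (4 5)) #'evenp #'identity (lambda () 'not-found)) => 4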

Iteration is just the special case of linear recursion that doesn’t accumulate state. 99 percent of the time, you know beforehand that you are looping and can use a looping construct, but sometimes you have the general case where whether you loop or not depends on the data at runtime. If you have tail recursion, you just write the code recursively and the tail recursion mechanism will turn it into a loop if it notices that you aren’t accumulating state.

If your language has lambda and tail recursion, it can implement any other control flow that might have been overlooked by the language designer. If it doesn’t, you're limited to the control flow the language designer bothered to implement.

Thursday, May 23, 2024

Rewrite rules

I like rewrite rules. They are simple to understand and easy to implement. If you have a rewrite rule (or a set of them) that makes incremental progress in making an expression "simpler" in some sense, you can keep applying it (or them) repeatedly until you have finished the calculation.

If a function directly returns the result of calling another function, we say that the call is in "tail position". We can view this as a simple rewrite rule. Consider the fact-mul function which computes the factorial of a number n and multiplies it by another number m:

fact-mul (n, m) = n! * m

We can define two rewrite rules for this function. If n is 0, then the factorial is 1, so we just rewrite this as m. If n is not 0, we can rewrite this as fact-mul (n-1, n*m). This translates directly into Lisp:

(defun fact-mul (n m)
  (if (zerop n)
      m
      (fact-mul (- n 1) (* n m))))

The rewrite rule doesn't have to rewrite into the same function. For instance, we can rewrite odd? (x) to even? (x-1) and even? (x) to odd? (x-1):

(defun odd? (x)
  (if (zerop x)
      nil
      (even? (1- x))))

(defun even? (x)
  (if (zerop x)
      t
      (odd? (1- x))))

Yes, this is bog-simple tail recursion, but somehow it seems even simpler to me if I think of it as a rewrite rule.

Even if you don't have tail recursion, or if you have disabled it, it still works, but the number of rewrites is limited by the stack depth.

Monday, August 30, 2021

Tail recursion and fold-left

fold-left has this basic recursion:

(fold-left f init ()) = init
(fold-left f init (a . d)) = (fold-left f (f init a) d)

A straightforward implementation of this is:

(defun fold-left (f init list)
  (if (null list)
      init
      (fold-left f (funcall f init (car list)) (cdr list))))

The straightforward implementation uses slightly more space than necessary. The call to f occurs in a subproblem position, so the stack frame for fold-left is preserved on each call and the result of the call is returned to that stack frame.

But the result of fold-left is the result of the last call to f, so we don't need to retain the stack frame for fold-left on the last call. We can end the iteration on a tail call to f on the final element by unrolling the loop once:

(defun fold-left (f init list)
  (if (null list)
      init
      (fold-left-1 f init (car list) (cdr list))))

(defun fold-left-1 (f init head tail)
  (if (null tail)
      (funcall f init head)
      (fold-left-1 f (funcall f init head) (car tail) (cdr tail))))

There aren't many problems where this would make a difference (a challenge to readers is to come up with a program that runs fine with the unrolled loop but causes a stack overflow with the straightforward implementation), but depending on how extreme your position on tail recursion is, this might be worthwhile.

Monday, April 19, 2021

η-conversion and tail recursion

Consider this lambda expression: (lambda (x) (sqrt x)). This function simply calls sqrt on its argument and returns whatever sqrt returns. There is no argument you could provide to this function that would cause it to return a different result than you would get from calling sqrt directly. We say that this function and the sqrt function are extensionally equal. We can replace this lambda expression with a literal reference to the sqrt function without changing the value produced by our code.

You can go the other way, too. If you find a literal reference to a function, you can replace it with a lambda expression that calls the function. This is η-conversion. η-reduction is removing an unnecessary lambda wrapper; η-expansion is introducing one.

η-conversion comes with caveats. First, it only works on functions. If I have a string "foo", and I attempt to η-expand this into (lambda (x) ("foo" x)), I get nonsense. Second, a reduction strategy that incorporates η-reduction can be weaker than one that does not. Consider this expression: (lambda (x) ((compute-f) x)). We can η-reduce this to (compute-f), but this makes a subtle difference. When wrapped with the lambda, (compute-f) is evaluated just before it is applied to x. In fact, we won't call (compute-f) unless we invoke the result of the lambda expression somewhere. However, once η-reduced, (compute-f) is evaluated at the point the original lambda was evaluated, which can be quite a bit earlier.
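The difference is easy to observe with a side effect (a sketch; compute-f is the stand-in name from the paragraph above, written here in Common Lisp):

(defun compute-f ()
  (print 'computing-f)
  #'sqrt)

;; η-expanded: COMPUTE-F does not run until the lambda is applied.
(defparameter *wrapped* (lambda (x) (funcall (compute-f) x)))

;; η-reduced: COMPUTE-F runs immediately, printing COMPUTING-F now.
(defparameter *reduced* (compute-f))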


When a function foo calls another function bar as a subproblem, an implicit continuation is passed to bar. bar invokes this continuation on the return value that it computes. We can characterize this continuation like this:

kbar = (lambda (return-value)
         (kfoo (finish-foo return-value)))

This just says that when bar returns, we'll finish running the code in foo and further continue by invoking the continuation supplied to foo.

If foo makes a tail call to bar, then foo is just returning what bar computes. There is no computation for foo to finish, so the continuation is just

kbar = (lambda (return-value)
         (kfoo return-value))

But this η-reduces to just kfoo, so we don't have to allocate a new trivial continuation when foo tail calls bar; we can just pass along the continuation that was passed to foo.

Tail recursion is equivalent to η-reducing the implicit continuations to functions where possible. A Scheme aficionado might prefer to say avoiding η-expanding where unnecessary.

This is a mathematical curiosity, but does it have practical significance? If you're programming in continuation passing style, you should be careful to η-reduce (or avoid η-expanding) your code.

Years ago I was writing an interpreter for the REBOL language. I was getting frustrated trying to make it tail recursive. I kept finding places in the interpreter where the REBOL source code was making a tail call, but the interpreter itself wasn't, so the stack would grow without bound. I decided to investigate the problem by rewriting the interpreter in continuation passing style and seeing why I couldn't η-convert the tail calls. Once in CPS, I could see that eval took two continuations and I could achieve tail recursion by η-reducing one of them.

Thursday, February 6, 2020

Dispatching

There are times when you are faced with a complex piece of control flow:

try {
    if (condition())
    {
        ... block 1 ...
    }
    else
    {
        switch (someValue())
        {
        case CASE_A:
            ... block 2 ...
            break;
        case CASE_B:
            ... block 3 ...
            break;
        default:
            ... block 4 ...
            break;
        }
    }
} catch (SomeException someException) {
    ... block 5 ...
}
and you want to abstract the control flow — all the conditionals, switches, and try/catches — away from the code that does the work — the various blocks. In fact, here I've abstracted it away by putting "... block n ..." in place of the blocks of code.

If I were writing this in Scheme or Common Lisp, I'd consider using continuation passing style. I'd write a function dispatch-on-foo that would perform the dispatch, but then invoke one of several first-class procedures passed in as arguments:

(defun dispatch-on-foo (foo bar case1 case2 case3 case4 case5)
  (if (... complex conditional ...)
      (funcall case1)
      (handler-case
          (case some-value
            ((case-a) (funcall case2))
            ((case-b) (funcall case3))
            (t (funcall case4)))
        (error (condition) (funcall case5)))))
At the call site, I'd write:

(dispatch-on-foo <arg1> <arg2>
  (lambda ()
    ... block 1 ...)
  (lambda ()
    ... block 2 ...)
  (lambda ()
    ... block 3 ...)
  (lambda ()
    ... block 4 ...)
  (lambda ()
    ... block 5 ...))
This is a win when the complexity of the dispatch is enough that you don't want to replicate it at every call site. Notice how the nested blocks of code have been pulled up to the same level and linearized. Granted, you've cluttered up the call site with lambda expressions, but as Steele pointed out, you can think of these as anonymous go tags: dispatch-on-foo in essence will end up doing a jump to one of these tags and execute the block there, skipping and ignoring the other blocks. Once you get used to thinking in this way, the lambdas disappear just like the parens do for a seasoned Lisp hacker. They just look like jump targets or case labels, and the call site looks a lot like a case expression. It is a bit more powerful than an ordinary case expression because you could arrange for dispatch-on-foo to funcall the appropriate closure on an argument (and have the lambda expression take an argument of course).

You could do something analogous with Java 8's lambdas, but on rare occasions I've wanted to do something similar in Java 7. The problem is that Java 7 doesn't have lambda expressions. The solution is to change these anonymous lambdas into named callback methods. First we define a generic interface with our callbacks:
interface DispatchOnFooCases<T> {
    T caseOne();
    T caseTwo();
    T caseThree();
    ... etc. ...
}
then we define the dispatch method:
<T> T dispatchOnFoo (FooClass foo, BarClass bar, DispatchOnFooCases<T> dispatchOnFooCases)
{
    try {
        if (conditional())
            return dispatchOnFooCases.caseOne();
        else
            switch (someValue()) {
            case CASE_A:
                return dispatchOnFooCases.caseTwo();
            case CASE_B:
                return dispatchOnFooCases.caseThree();
            default:
                return dispatchOnFooCases.caseFour();
            }
    } catch (SomeException someException) {
        return dispatchOnFooCases.caseFive();
    }
}
finally, at the call site, we write this:
{
    int value =
        dispatchOnFoo (foo, bar,
            new DispatchOnFooCases<Integer> ()
            {
                @Override
                public Integer caseOne ()
                {
                    ... block 1 ...
                    return aValue;
                }

                @Override
                public Integer caseTwo ()
                {
                    ... block 2 ...
                    return aDifferentValue;
                }
                ... etc. ...
            });
}
The good news is that we've accomplished our goal of abstracting the complex conditional dispatch from the code that does the real work — the method bodies at the call site.

There is, unfortunately, a fair amount of bad news. First, if you thought lambda expressions introduced clutter, then this is a serious amount of clutter. Between @Overrides, type declarations, interfaces, and methods, there is just a lot of extra stuff you have to type. It still might be worth the clutter if the dispatch conditions are complex enough. They just need to be that much more complex to justify all this machinery. (We've actually done the work the compiler would do to allocate and pass a “multi-closure”.) There are cases where this pays off, though.

The second piece of bad news is that Java is not (in general) tail recursive. This means that the call to dispatchOnFoo and the callback to one of the cases both introduce a new stack frame. So although the case methods run in the same lexical environment as where they are defined, they are running two stack frames deeper. This won't make much of a difference unless you try to loop by recursively calling the code. In that case, you need to be very careful to limit the amount of recursion or you will overflow the stack. It is best to avoid recursion as much as possible in the bodies of the cases.

You probably won't need to resort to doing this. It can be a case of the cure being worse than the disease. The complexity of introducing callbacks can exceed the complexity of the conditional you are trying to abstract. But this is an interesting way to abstract a very complex conditional, and it can come in handy when you can justify using it. I have actually used this technique in production code to separate some complex control flow from the code that did the actual work.

Tuesday, January 21, 2020

But what if you really want to push a stack frame?

If you really don't want tail recursion, the solution is simple: don't put the call in “tail position”. We define a rather trivial function dont-tail-call and use it like this:
(dont-tail-call
 (foo x))
The semantics of Scheme are that the arguments are evaluated before the call to the function, so the call to foo is required to occur before the call to dont-tail-call, which necessitates allocation of a continuation.

But what about a really clever compiler that somehow “optimizes” away the call to dont-tail-call? Such an “optimization” is highly questionable because it changes the observable semantics of the program so it is no longer call-by-value. A compiler that performed that transformation wouldn't be a Scheme compiler because Scheme is a call-by-value language.

But what if we had really clever compiler and we disregard the change in semantics? We can easily thwart the compiler by deferring the definition of dont-tail-call until run time. Even the cleverest compiler cannot see into the future.

The definition of dont-tail-call is left to the reader, as is how to defer its definition until run time.

Afraid of Tail Recursion

It's a well-known fact among proponents of tail recursion that some people just don't get it. They view tail recursion at best as a quirky compiler optimization that turns some recursive calls into loops. At worst, they see it as some sort of voodoo, or a debugging pessimization. They see little value in it. Some have outright disdain for it.

Tail recursion isn't just about turning recursive calls into loops. It's about changing how you look at function calling. Tail recursion just happens to fall out of this new viewpoint.

Most programmers, I think, view function calls as if they were akin to a short vacation. You pack up the arguments in your luggage, travel to the destination, unpack your luggage, do some stuff, repack your luggage with some souvenirs, return home, unpack everything and resume life where you left off. Your souvenirs are the return value.

Should you need a vacation from your vacation, you do the same thing: pack up the arguments in your luggage, travel to your new destination, unpack your luggage, do some stuff, repack your luggage with some souvenirs, return to your original vacation spot and resume your original vacation.

Tail recursion aficionados realize that the journey itself is the important part of the function call, and that a vacation includes two journeys. On the first journey you pack up the arguments, including the return ticket, in your luggage, use the outbound ticket to journey to the destination, unpack your luggage, and start doing stuff. When you run out of stuff to do, you make the second journey. You fetch the return ticket, repack your luggage, take the ticket to wherever it leads (presumably back home), unpack everything, and resume doing whatever you were doing there.

But perhaps you want to visit grandma instead of going directly home. Then we change the script slightly. When you run out of things to do on your vacation, you pack up your luggage with your souvenirs and the return ticket, then you journey to grandma's house, where you unpack and start doing stuff. Eventually you are done visiting grandma, so then you fetch the return ticket, repack your luggage, take the ticket to wherever it leads, unpack everything, and resume doing stuff there. It's a three-legged journey. You don't go from grandma's back to the vacation resort — there's nothing left for you to do there. You take the return ticket directly home.

Viewing things this way, a function call involves packaging the arguments in a suitable way, deallocating any temporary storage, and then making an unconditional transfer to the function, where we unpack the arguments and resume execution of the program. It is simply “a goto that passes arguments”.*

A function return is simply “a goto that passes a return value”. It involves packaging the return value in a suitable way, deallocating any temporary storage, and then making an unconditional transfer to the return address, where we resume execution of the program.

A tail recursive function call is simply “a goto that passes arguments”. It involves packaging the arguments in a suitable way, deallocating any temporary storage and then making an unconditional transfer to the function, where we resume execution of the program.

Do we really deallocate temporary storage before every control transfer? Certainly a return pops the topmost stack frame, and as often implemented, a tail recursive function call deallocates its stack frame or replaces it before transferring control, but a non tail recursive call? It does so as well; it's just that it also has to pack those values into a new continuation for the return trip. We use an implementation trick to avoid the absurdity of actually moving these values around: we move the base-of-frame pointer instead. Voila, we simultaneously deallocate the stack frame and allocate the continuation with the right values already in place.

Deallocating storage before each control transfer is an important part of the protocol. We're making an unconditional transfer to a destination with the hope, but no guarantee, that we'll come back, so we'd better tidy up before we leave. This ensures that we won't leave a trail of garbage behind us on each transfer, which would accumulate and adversely affect the space complexity of our program.

Once you view a function call and a function return not as a single round trip, but as two separate, virtually identical control transfers, then tail recursion becomes a natural consequence. Tail recursion isn't a special case of function call; it is the same thing as a function call, the only difference being whether a new continuation (the "return ticket") is allocated in order to come back. Even function returns are the same thing, the only difference being that the destination is (usually) determined dynamically rather than statically. Tail recursion isn't just another optimization; it's the result of treating inter-procedural control transfer symmetrically.

Another natural consequence is greatly increased options for control flow. Instead of a strict call/return pattern, you can make "three-legged" trips, or however many legs you need. You can make loops that incorporate one, two, or even a dynamically changing number of functions. You can replace compiler-generated returns with user-provided function calls (continuation-passing style) and implement arbitrarily complex control and data flow like multiple return values, exception handling, backtracking, coroutines, and patterns that don't even have names yet. And of course you can mix and match these patterns with the standard call and return pattern as well.

The phrase "tail recursion" is shorthand for this symmetric view of interprocedural control flow and is meant to encompass all these consequences and control flow options that spring from it. It's not about simply turning recursive functions into loops.

People who are afraid of tail recursion seem unable to see any value in taking up the symmetric viewpoint despite the fact that it opens up a whole new set of control flow techniques (in particular continuation-passing style). They find the notion that a procedure call is “a goto that passes arguments” “nonsensical”. A lot of good research has come from taking this nonsense seriously.


*The view that a function call is simply a “a goto that passes arguments” was developed by Steele in his “Lambda papers”.

The important point of cleaning up before the control transfer was formalized by Clinger in “Proper Tail Recursion and Space Efficiency”.

Someone — it might have been Clinger, but I cannot find a reference — called tail recursion “garbage collection for the stack”. The stack, being so much more limited in size than the heap, needs it that much more. Indeed Clinger notes the tight connection between tail recursion and heap garbage collection and points out that heap garbage collection is hampered if the stack is retaining pointers to logically dead data structures. If the dead structures are large enough, garbage collection can be rendered useless. Yet many popular languages provide garbage collection but not tail recursion.

The only difference between a call and return is that typically the call is to a statically known location and the return address is dynamically passed as a "hidden" argument. But some compilers, like Siskind's Stalin compiler, statically determine the return address as well.

The only difference between a function call and a tail recursive function call is when you need to return to the caller to complete some work. In this case, the caller needs to allocate a new continuation so that control is eventually returned. If there is no further work to be done in the caller, it doesn't create a new continuation, but simply passes along the one that was passed to it.

Many compilers have been written that handle function calls, tail recursive function calls, and returns identically. They only change what code they emit for handling the continuation allocation. These compilers naturally produce tail recursive code.

Most machines provide special purpose support for a LIFO stack. It is tempting to use the stack for allocation of continuations because they are almost always allocated and deallocated in LIFO order, and a stack gives higher performance when this is the case. Many compilers do in fact use the stack for continuations and argument passing. Some, like Winkelmann's Chicken compiler, follow Baker's suggestion and treat the stack as a "nursery" for the heap. Others avoid using the stack altogether because of the extra complexity it entails. And others cannot use the stack because of constraints placed on stack usage by OS conventions or the underlying virtual machine model.