Iteration is simply that special case of recursion that doesn’t
accumulate storage in the long term. It’s a notable special case
because computer storage is finite, and you want to be able to write
agorithms that are bound by constant space.
There are two common strategies that computer languages use to
approach iteration. Functional languages like Scheme and Haskell
make sure that normal function calls do not accumulate
storage per se. Function calls can be used to direct the
control flow, and if they direct the control flow in a loop, you
will iterate. Most other languages achieve iteration via
special iteration constructs that you must use if you want to
iterate. Each of these approaches has its own advantages and
disadvantages.
The advantage of using special iteration constructs are these:
- It is clear that you are iterating.
- Special constructs are usually optimized for iteration and have
particular compiler support to make them efficient.
- Special constructs are constrained so that you cannot accidentally
write non-iterative code.
The disadvantage of using special iteration constructs are
these:
- Special constructs are drawn from a fixed set of constructs
that are built in to the language. If you want to iterate differently, you
are out of luck.
- Special constructs usually do not cross function boundaries.
Iteration must reside in a single function.
- You have to decide beforehand that you want to iterate and
choose an iteration construct.
- Special constructs are usually imperative in nature and operate
via side effects.
The alternative approach used by functional languages is to make
the language implementation tail recursive. This has these
advantages:
- Iteration is automatic. You don’t have to decide that you
want to iterate, it just happens when it can.
- Iteration can cross function boundaries.
- You can write your own iteration constructs and build them out
of ordinary functions.
- Iteration can be done purely functionally, without side
effects.
The disadvantages of using tail recursion for iteration are
these:
- It is not obvious that you are iterating or intended to.
- You have to be careful to place all the iteration in tail
position or you will blow the stack. Beginner programmers often
have difficulty recognizing which calls are tail calls and can find
it hard to avoid blowing the stack.
- Small, innocent looking changes in the code can change its
behavior to be non tail recursive, again blowing the stack.
- The stack no longer contains a complete call history. If you
rely on the stack as a call history buffer for debugging, you may
find debugging more difficult.
The code in an iteration can be classified as being part of the
machinery of iteration — the part that sets up the itertion,
tests the ending conditional, and advances to the next iteration
— or part of the logic of the iteration — the specific
part that you are repeating. The machinery of the iteration is
usually the same across many iterations, while the logic of the
iteration is idiomatic to the specific instance of iteration. For
example, all iterations over a list will have a null test, a call
to CDR to walk down the list, and a call to CAR to fetch the current
element, but each specific iteration over a list will do something
different to the current element.
There are several goals in writing iterative code. One is to have
efficient code that performs well. Another is to have clear code
that is easy to understand, debug, and maintain. You choose how to
iterate based on these considerations. For the highest performing
code, you will want detailed control over what the code is doing.
You may wish to resort to using individual assignments
and GOTO statements to squeeze the last clock cycles
out of an inner loop. For the clearest code, you will want to use a
high degree of abstraction. A clever compiler can generate
efficient code from highly abstracted code, and experienced
programmers know how to write abstract code that can be compiled to
efficient code.
Here are some examples of iteration strategies Lisp. To make these
examples easy to compare I chose a simple problem to solve: given a
list of numbers, return both a list of the squares of the numbers
and the sum of the squares. This is a simple problem that can be
solved in many ways.
Tagbody and Go
A tagbody is a block of code that is labeled with
tags. You can jump to a tag with a go statement. This
is a very low level form of iteration that is not used much in
modern Lisp programming. Here is an example of
a tagbody:
(defun iteration-example-with-tagbody (numbers)
(let ((squares ’())
(total 0)
(nums numbers))
(tagbody
start
(if (null nums)
(go end))
(let ((square (* (car nums) (car nums))))
(setq squares (cons square squares))
(incf total square))
(setq nums (cdr nums))
(go start)
end
(values (nreverse squares) total))))
This is like programming in assembly code. The go
instructions turn into jumps. This code is
very efficient, but it is not particularly clear. The
machinery of the iteration is mixed in with the logic of the
iteration, making it hard to see what is going on. The code is not
very abstract.
State Machine via Mutual Tail Recursion
Here we use tail recursion to iterate. The compiler will turn the
tail recursive call into a jump and the variable rebinding into
assignments, so this code will be about as efficient as
the tagbody code above.
(defun iteration-example-tail-recursive (numbers &optional (squares ’()) (total 0))
(if (null numbers)
(values (nreverse squares) total)
(let ((square (* (car numbers) (car numbers))))
(iteration-example-tail-recursive
(cdr numbers)
(cons square squares)
(+ total square)))))
This state machine only has one state, so it is not a very
interesting state machine. The ultimate in iteration control is to
write an iterative state machine using mutually tail recursive
functions. The compiler will generate very efficient code for this,
and you can write the code in a very abstract way. Here is an
example of a state machine that simulates the action of a turnstile:
(defun turnstile (actions)
"State machine to simulate a turnstile with actions ‘push’, ‘coin’, and ‘slug’."
(locked-state actions ’() ’()))
(defun locked-state (actions collected return-bucket)
(cond ((null actions) (list collected return-bucket))
((eql (car actions) ’coin)
(unlocked-state (cdr actions) collected return-bucket))
((eql (car actions) ’push)
(locked-state (cdr actions) collected return-bucket)) ;; Ignore push in locked state
((eql (car actions) ’slug)
(locked-state (cdr actions) collected (append return-bucket ’(slug)))) ;; Return slug
(t (locked-state (cdr actions) collected return-bucket))))
(defun unlocked-state (actions collected return-bucket)
(cond ((null actions) (list collected return-bucket))
((eql (car actions) ’push)
(locked-state (cdr actions) (append collected ’(coin)) return-bucket))
((eql (car actions) ’coin)
(unlocked-state (cdr actions) collected (append return-bucket ’(coin)))) ;; Return coin
((eql (car actions) ’slug)
(unlocked-state (cdr actions) collected (append return-bucket ’(slug)))) ;; Return slug
(t (unlocked-state (cdr actions) collected return-bucket))))
;; Example usage:
(turnstile ’(coin push coin push)) ;; => ((coin coin) ())
(turnstile ’(push coin push)) ;; => ((coin) ())
(turnstile ’(coin coin push push)) ;; => ((coin) (coin))
(turnstile ’(push)) ;; => (NIL NIL)
(turnstile ’(coin push push)) ;; => ((coin) ())
(turnstile ’(coin coin coin push)) ;; => ((coin) (coin coin))
(turnstile ’(slug coin push)) ;; => ((coin) (slug))
(turnstile ’(coin slug push)) ;; => ((coin) (slug))
(turnstile ’(slug slug push coin push)) ;; => ((coin) (slug slug))
The iteration machinery is still interwoven with the logic of
the code. We’re still finding calls to null
and cdr sprinkled around the code. Nonetheless,
structuring iterative code this way is a big step up from using a
tagbody and go. This is my go-to method
for compex iterations that cannot easily be expressed as
a map or reduce.
Loop Macro
Common Lisp’s loop macro is a very powerful iteration
construct that can be used to express a wide variety of iteration
patterns.
defun loop-iteration-example (numbers)
(loop for num in numbers
for square = (* num num)
collect square into squares
sum square into total
finally (return (values squares total))))
Call me a knee-jerk anti-loopist, but this doesn’t look like Lisp
to me. It has some major problems:
- It is highly imperative. To understand what is going on, you
have to follow the code in the order it is written. You need to
have a mental model of the state of the loop at each point in the
iteration. Running into a
loop when reading functional
code takes you out of the zen of functional programming.
- The bound variables are not lexical, they are scattered around
the code. You have to carefully examine each clause to figure out what
variables are being bound.
- You need a parser to walk the code. There is nothing that
delimits the clauses of the loop; it is a flat list of random
symbols and forms. You couldn’t easily write a program that takes a
loop form and transforms it in some way.
Do and Friends
The do macro, and its friends dolist,
dotimes, and do*, etc., are the most common
iteration constructs in Common Lisp.
(defun iteration-example-with-do (numbers)
(let ((squares ’())
(total 0))
(do ((nums numbers (cdr nums)))
((null nums) (values (nreverse squares) total))
(let ((square (* (car nums) (car nums))))
(setq squares (cons square squares))
(incf total square)))))
The do macros have some drawbacks:
- They are imperative. The body of a
do loop
ultimately must have some sort of side effect or non-local exit to
“get a value out”. Notice how we bind accumulator
variables in an outer scope and assign them in the inner one.
This is a common pattern in a do loop.
- They do not compose. You can nest a
dotimes inside
a dolist, e.g., but you cannot run
a dotimes in parallel with a dolist.
- They are incomplete. There is no
do-array
or do-string, for example.
But at least you can parse them and transform them. They are
structured, and you can write a program that walks the clauses of a
do loop and does something with them.
Map and Reduce
Map and reduce abstract the machinery of iteration away from the
logic of the iteration through use of a monoid (a higher order
function). The resulting code is clear and concise:
(defun iteration-example-with-map-reduce (numbers)
(let* ((squares (map ’list (lambda (num) (* num num)) numbers))
(total (reduce #’+ squares)))
(values squares total)))
The looping is implicit in the mapcar and
reduce functions. You can usually make the assumption
that the language implemetors have optimized these functions to be
reasonably efficient.
I often see programmers writing looping code when a perfectly good
library function exists that does the same thing. For example, it
is common to want to count the number of items in a sequence, and
Commmon Lisp supplies the count function just for this
purpose. There is no need to write a loop.
Common Lisp provides a filter function, but it is
called remove-if-not.
The drawback of using these functions is that large intermediate
sequences can be created. In our example code, the entire list of
squares is constructed prior to reducing it with #’+. Of course the
entire list is one of the return values, so you need it anyway, but
if you only needed the sum of the squares, you would prefer to sum
it incrementally as you go along rather than constructing a list of
squares and then summing it. For small sequences, it doesn’t make
a difference.
Series
The series macro suite attempt to bring you best of both worlds.
You write series expressions that look like sequence functions, but
the macro recognizes that you are iterating and generates efficient
incremental code.
(defun iteration-example-with-series (numbers)
(let ((squares (map-fn ’integer (lambda (n) (* n n)) (scan ’list numbers)))
(values (collect ’list squares)
(collect-sum squares))))
This code is very similar to the sequence case, but the series
macro will generate code that does not construct the entire list of
squares before summing them. It will sum them incrementally as it
goes along.
Series will expand into a tagboy. For example, the
above code will expand into something like this:
(COMMON-LISP:LET* ((#:OUT-1015 NUMBERS))
(COMMON-LISP:LET (#:ELEMENTS-1012
(#:LISTPTR-1013 #:OUT-1015)
(SQUARES 0)
#:SEQ-1018
(#:LIMIT-1019
(COMMON-LISP:MULTIPLE-VALUE-BIND (SERIES::X SERIES::Y)
(SERIES::DECODE-SEQ-TYPE (LIST ’QUOTE ’LISTS))
(DECLARE (IGNORE SERIES::X))
SERIES::Y))
(#:LST-1020 NIL)
(#:SUM-1023 0))
(DECLARE (TYPE LIST #:LISTPTR-1013)
(TYPE INTEGER SQUARES)
(TYPE (SERIES::NULL-OR SERIES::NONNEGATIVE-INTEGER) #:LIMIT-1019)
(TYPE LIST #:LST-1020)
(TYPE NUMBER #:SUM-1023))
(TAGBODY
#:LL-1026
(IF (ENDP #:LISTPTR-1013)
(GO SERIES::END))
(SETQ #:ELEMENTS-1012 (CAR #:LISTPTR-1013))
(SETQ #:LISTPTR-1013 (CDR #:LISTPTR-1013))
(SETQ SQUARES ((LAMBDA (N) (* N N)) #:ELEMENTS-1012))
(SETQ #:LST-1020 (CONS SQUARES #:LST-1020))
(SETQ #:SUM-1023 (+ #:SUM-1023 SQUARES))
(GO #:LL-1026)
SERIES::END)
(COMMON-LISP:LET ((SERIES::NUM (LENGTH #:LST-1020)))
(DECLARE (TYPE SERIES::NONNEGATIVE-INTEGER SERIES::NUM))
(SETQ #:SEQ-1018 (MAKE-SEQUENCE ’LISTS (OR #:LIMIT-1019 SERIES::NUM)))
(DO ((SERIES::I (1- SERIES::NUM) (1- SERIES::I)))
((MINUSP SERIES::I))
(SETF (ELT #:SEQ-1018 SERIES::I) (POP #:LST-1020))))
(VALUES #:SEQ-1018 #:SUM-1023)))
90% of the time, the series macro will produce very efficient code,
but 10% of the time the macro loses its lunch. It takes a little
practice to get use to when the series macro will work and to write
code that the series macro can handle.
Conclusion
There are many ways to iterate in Lisp, some are more efficient
than others, some are more abstrac than others. You choose the way
that suits your needs. I like the abstraction of the series macro,
but I will also use a library function like count when
it is appropriate. When I need tight control, I’ll write a state machine.