
Monday, September 1, 2025

Lisp Still Matters

Lisp is not the most popular computer language, yet it continues to attract a small but devoted following. Sure, you can find nutters for any language, but a lot of Lisp programmers are high-powered expert programmers. Lisp was developed by the world's top researchers and was a language of choice at places like MIT, Stanford, Carnegie Mellon, Xerox PARC, and Bell Labs. It may be a niche, but your companions in the niche are some of the best programmers in the world. What is it about Lisp that attracts the top talent?

Lisp is a powerful language because it can be easily extended to express new ideas that were not foreseen when the language was originally designed. The original designers were well aware that their new computer language couldn't take into account ideas that would be developed in the future, so they made it easy to extend the language beyond its original design. Lisp was based on Church's lambda calculus, which is a logical formalism designed to allow you to reason about logical formalism itself. When you extend the language, you can use the base language to reason about the extension.

The fundamental building block of lambda calculus is the function. In The Calculi of Lambda-Conversion, Church shows how to bootstrap the rest of mathematics from functions. With Lisp, McCarthy showed the isomorphism between s-expressions and lambda expressions, and developed a universal function within Lisp. Thus if you extend Lisp, you can use Lisp to implement and interpret your extension from within Lisp. You don't have to exit the language to extend it.

Lisp syntax, such as it is, is based upon s-expressions, which are tree structures. Lisp allows you to invent your own nodes in the tree and assign your own semantics to them. You can express the new semantics via function calls, which Church showed to be universal. You don't need to hack the parser or the compiler to extend the language, a simple defmacro will extend the syntax and a handful of defuns will assign semantics to the new syntax. If you are so inclined, you can use compiler macros to generate efficient code for your new constructs.
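For example, here is a minimal sketch of that kind of extension: Common Lisp has no while loop, but a single defmacro adds the syntax, and the new construct compiles like any built-in.

(defmacro while (test &body body)
  ;; WHILE is not part of Common Lisp; one macro extends the syntax.
  `(do () ((not ,test)) ,@body))

;; The new construct is immediately usable:
(defun countdown (n)
  (while (plusp n)
    (print n)
    (decf n)))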

With Lisp, I have the option of expressing the solution to my problems in Lisp, or the option to extend Lisp to express the solution more naturally. I can tailor my problem to be solved in Lisp, or I can tailor Lisp to solve my problem. Or work from both ends to meet in the middle.

It works. Lately I've been using a forty year old dialect of a sixty year old language to interact with cutting edge LLMs. I've not only built an API to the LLM, but I've extended Lisp to include features implemented by the LLM. This was certainly not foreseen by McCarthy et al when Lisp was first designed. I decided to use Lisp to explore the integration because I knew that Lisp was amenable to extension. I could have used another language, but then I could not have whipped up a prototype or proof of concept by myself over a couple of weekends.

In conclusion, Lisp's enduring appeal to expert programmers lies in its powerful, foundational design. By allowing a programmer to extend the language from within, Lisp provides a unique environment for tackling novel problems and developing new abstractions. The principles of lambda calculus and the flexibility of s-expressions empower developers to not only solve problems but also to build the perfect tool for the job. This capacity to adapt and evolve, as demonstrated by its modern use with LLMs, ensures that Lisp remains a relevant and potent language for those who wish to go beyond the confines of a predefined system.

Monday, April 7, 2025

Are You Tail Recursive?

I was trying to write a procedure that would test whether tail recursion was enabled or not. It turns out that this is semi-decidable. You can only tell if it is not tail recursive by blowing the stack, but you can never be sure that somebody didn’t make a really, really big stack.

But for the sake of argument, let's assume that 2^28 recursive calls are enough to blow the stack. Also, let's assume that the system will correctly signal an error and be able to recover from the blown stack once it is unwound. Then we can test for tail recursion as follows:

(defconstant +stack-test-size+ (expt 2 28))

(defun test-self-tail-recursion (x)
  (or (> x +stack-test-size+)
      (test-self-tail-recursion (1+ x))))

(defun test-mutual-tail-recursion (x)
  (or (> x +stack-test-size+)
      (test-mutual-tail-recursion-1 (1+ x))))

(defun test-mutual-tail-recursion-1 (x)
  (or (> x +stack-test-size+)
      (test-mutual-tail-recursion (1+ x))))

(defun self-tail-recursive? ()
  (ignore-errors (test-self-tail-recursion 0)))

(defun mutually-tail-recursive? ()
  (ignore-errors (test-mutual-tail-recursion 0)))

Naturally, your optimize settings are going to affect whether or not you have tail recursion enabled, and even if this code says it is enabled, it doesn’t mean that a different compilation unit compiled at a different time with different optimizations would be tail recursive as well. But if you run this in your default environment and it returns NIL, then you can be pretty sure your default is not tail recursive. If it returns T, however, then there are at least some conditions under which your system is tail recursive.

Use of certain features of Common Lisp will “break” tail recursion: special variable binding, multiple-value-bind, unwind-protect, catch, and basically anything that marks the stack or dynamically allocates on the stack. If control has to return to the current function after a call, then the call is not a tail call, but if control can return directly to the caller of the current function, then the compiler should be able to make a tail call.
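For example, in this sketch the recursive call is syntactically in tail position, but the special binding of *depth* must be unwound when the call returns, so a frame stays live and the compiler cannot tail call it:

(defvar *depth* 0)

(defun count-down (x)
  (let ((*depth* (1+ *depth*)))   ; special binding marks the stack
    (if (zerop x)
        *depth*
        (count-down (1- x)))))    ; control must return here to unwind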

Most, but not all, Common Lisp implementations can be configured to enable tail recursion, and it is usually possible to disable it if desired. If you want tail recursion, but your implementation cannot provide it, you are SOL. If you don’t want tail recursion, but your implementation provides it by default, there are a number of things you can do to disable it. Usually, you can set the debugging options to retain every stack frame, or you can set the debug optimization to disable tail recursion. If the recursive call is not in tail position, then it is incorrect to tail call it, so you can disable tail recursion by moving the recursive call out of tail position. For example, you could write a without-tail-call macro:

(defmacro without-tail-call (form)
  (let ((value (gensym)))
    `((lambda (,value) ,value) ,form)))

;; Use like this:

(defun fact (x &optional (acc 1))
  (if (zerop x)
      (progn (cerror "Return the factorial" "Computed a factorial")
             acc)
      (without-tail-call
       (fact (1- x) (* acc x)))))

Running this program drops us into the debugger because of the cerror, and we can inspect the stack.

> (fact 20)
Computed a factorial
 [Condition of type SIMPLE-ERROR]
Restarts:
 0: [CONTINUE] Return the factorial
 1: [RETRY] Retry SLY mREPL evaluation request.
 2: [*ABORT] Return to SLY’s top level.
 3: [ABORT] abort thread (#<THREAD tid=179 "sly-channel-1-mrepl-remote-1" RUNNING {1000BA8003}>)
Backtrace:
 0: (FACT 0 2432902008176640000)
 1: (FACT 1 2432902008176640000)
 2: (FACT 2 1216451004088320000)
 3: (FACT 3 405483668029440000)
 4: (FACT 4 101370917007360000)
 5: (FACT 5 20274183401472000)
 6: (FACT 6 3379030566912000)
 ...

As requested, each recursive call pushes a stack frame.

But what if the compiler “optimizes” ((λ (x) x) (foo)) to simply be (foo)? That is a questionable “optimization”. Lisp is a call-by-value language, meaning that the arguments to a function are fully evaluated before the function is applied to them. The call to (foo) should be fully evaluated before applying (λ (x) x) to it. The “optimization” essentially treats (foo) as unevaluated and then tail calls it after applying the lambda expression. This is call-by-name evaluation. While anything is possible, an “optimization” that sometimes changes the function call semantics from call-by-value to call-by-name is dubious.
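Consider a sketch in the same spirit as without-tail-call above:

(defun deep (n)
  (if (zerop n)
      n
      ;; The identity application keeps the recursive call out of
      ;; tail position, so call-by-value requires a frame per call.
      ((lambda (x) x) (deep (1- n)))))

If the compiler rewrites the identity application into a bare call to deep, the function becomes tail recursive, and a recursion depth that should blow the stack instead returns normally. The observable behavior, not just the speed, has changed.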

Tuesday, January 14, 2025

λ Calculus

A lambda calculus is a set of rules and strategies for manipulating logical expressions. As Church defined them, these logical expressions are linear lists of symbols. A symbol is effectively a variable. Two expressions in sequence indicate a function application. The special symbol λ is just a marker to indicate a function. Parentheses can be used to group expressions.

McCarthy’s S-expressions are an alternative representation of a logical expression that is more suitable for a computer. Rather than a linear list of symbols, S-expressions use a tree structure of linked lists in memory. Symbols are still variables, lists represent function application, the special symbol lambda at the beginning of a list indicates a function, and grouping is achieved by nesting a list within another.
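For example, here is the same function (the Church numeral two, which turns up again below) in both notations:

;; Church's linear notation:   λf.λx.f (f x)
;; As an S-expression:         (lambda (f) (lambda (x) (f (f x))))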

When McCarthy invented S-expressions, he wanted to show that the nested list structure of S-expressions could faithfully represent the logical expressions from lambda calculus. (It can.) A lambda calculus can be restated as a set of rules and strategies for manipulating S-expressions. This makes it easier for a computer to do lambda calculus. As a Lisp hacker, I find it also makes it easier for me to think about lambda calculus.

Your basic lambda calculus just has symbols, lists, and λ expressions. That’s it. But let us introduce one more element. Recall that we can think of a LET expression as syntactic sugar for a list (function call) where the first element (the operator) is a lambda expression. We’ll keep our S-expressions fully sugared and write all such lists as LET expressions. So now our S-expressions have symbols, lists, λ expressions, and LET expressions.
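Concretely, the desugaring looks like this:

;; Sugared:
(let ((x 2) (y 3))
  (* x y))

;; Desugared: a list whose operator is a λ expression.
((lambda (x y) (* x y)) 2 3)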

The two basic rules for manipulating S-expressions are α, which is a recursive rule for renaming a symbol in an S-expression, and β, which gets rid of a selected LET expression. A basic lambda calculus consists of these two rules and a strategy for selecting which LET expressions to get rid of. β reduces a LET expression by substituting the bindings for the bound variables in the body of the LET. α is used as needed to avoid unwanted variable capture during β-reduction. β eliminates one LET expression, but it can introduce more if you end up substituting a λ expression into operator position.
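Here is a small example of why α is needed. Naively β-reducing this LET would capture the free y of the binding under the inner λ, so we α-rename the inner bound variable first:

(let ((x (+ y 1)))
  (lambda (y) (+ x y)))

;; After α-renaming the inner y to y1 and β-reducing the LET:
(lambda (y1) (+ (+ y 1) y1))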

If an expression contains no LET expressions, we say it is in “normal form”. A common task in lambda calculus is to try to reduce an expression to normal form by attempting to eliminate all the LET expressions. Sometimes you cannot achieve this goal because every time you apply the β rule to eliminate a LET expression, it ends up introducing further LET expressions.
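The classic example is the LET-sugared form of Ω. β-reducing its one LET just reproduces the same expression, so it can never reach normal form:

(let ((x (lambda (x) (x x))))
  (x x))

;; β-reduces to ((lambda (x) (x x)) (lambda (x) (x x))),
;; which is this very LET expression, desugared.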

There are many strategies for selecting LET expressions to eliminate. Not all strategies will necessarily end up getting you to a normal form, but all strategies that do end up at a normal form end up at the same normal form (modulo the variable names). One strategy is of note: selecting the leftmost, outermost LET expression and reducing it first is called “normal order”. It is notable because if any strategy converges to normal form, then the normal order strategy will, too. However, the normal order strategy can lead to an exponential explosion of intermediate expressions. There are other strategies that avoid the exponential explosion, but they don’t always converge to normal form. Pick your poison.
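A small sketch of how the strategies differ: in the expression below, normal order reduces the outer LET first, discards the unused binding, and stops at the normal form y, while a strategy that insists on first reducing the Ω inside the binding never terminates.

(let ((unused (let ((x (lambda (x) (x x))))
                (x x))))
  y)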

α and β are the only rules we need to compute with S-expressions. The simple lambda calculus with α and β is universal — it can compute anything that can be computed. It is Turing complete.

I don’t know about you, but I find it quite remarkable that this can compute anything, let alone everything. Nothing is going on here. α just renames symbols. Using α-conversion to rename all the foos to bars doesn’t change anything but the symbol names. We define expression equivalence modulo α, so the actual names of the symbols aren’t important. Apparently β-reduction does computation, but it is hard to say how, exactly. It is just simplifying LET expressions by replacing variables with what they are bound to. But a variable is simply a name for a binding. When you replace a variable with what it is bound to, you don’t change any values. The resulting expression may be simpler, but it means the same thing as the original.

We use β reduction as a model of subroutine (function) calls. In a subroutine call, the values of the arguments are bound to the names of the arguments before evaluating the body of the subroutine. In β reduction, the value expressions are substituted for the bound names in the body of the expression. The lambda calculus model of a computer program will have a β reduction wherever the program has a subroutine call. A lambda calculus expression with opportunities for β reduction can be translated into a computer program with subroutine calls at those locations. It’s a one-to-one mapping. Since we can compute anything using just the α and β rules, we can likewise compute anything with just function calls. I think that’s pretty remarkable, too.
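The correspondence is direct. β-reducing the LET below mirrors calling the subroutine it binds:

(let ((square (lambda (n) (* n n))))
  (square 4))

;; Substitute the binding for square ...
((lambda (n) (* n n)) 4)

;; ... which is the sugared LET (let ((n 4)) (* n n)); reduce again:
(* 4 4)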

Turing’s machine formalism was designed to be understandable as a physical machine. Turing was particular that his machine could be realized as a mechanical object or electronically. It is far less clear how to make a lambda calculus into a physical machine. But once we recognize that β can be realized as a subroutine call in software, we can see that Church’s lambda calculus formalism is understandable as a virtual machine.

Church’s The Calculi of Lambda-Conversion is a cool book where he lays out the principles of lambda calculus. It is pretty accessible if you have experience in Lisp, and the examples in the book will run in a Scheme interpreter if you translate them.

Saturday, November 2, 2024

Don't Try to Program in Lisp

A comment on my previous post said,

The most difficult thing when coming to a different language is to leave the other language behind. The kind of friction experienced here is common when transliterating ideas from one language to another. Go (in this case) is telling you it just doesn’t like to work like this.
Try writing simple Go, instead of reaching for Lisp idioms. Then find the ways that work for Go to express the concepts you find.

That's not at all how I approach programming.

A friend of mine once paid me a high compliment. He said, “Even your C code looks like Lisp.”

When I write code, I don't think in terms of the language I'm using, I think in terms of the problem I'm solving. I'm a mostly functional programmer, so I like to think in terms of functions and abstractions. I mostly reason about my code informally, but I draw upon the formal framework of Lambda Calculus. Lambda Calculus is a simple, but powerful (and universal) model of computation.

Programming therefore becomes a matter of expressing the solution to a problem with the syntax and idioms of the language I'm using. Lisp was inspired by Lambda Calculus, so there is little friction in expressing computations in Lisp. Lisp is extensible and customizable, so I can add new syntax and idioms as desired.

Other languages are less accommodating. Some computations are not easily expressible in the syntax of the language, or the semantics of the language are quirky and inconsistent. Essentially, every general purpose fourth generation programming language can be viewed as a poorly-specified, half-assed, incomplete, bug-ridden implementation of half of Common Lisp. The friction comes from working around the limitations of the language.

Wednesday, September 27, 2023

Greenspun's tenth rule of programming states

Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified, bug ridden, slow implementation of half of Common Lisp.
Observe that the Python interpreter is written in C.

In fact, most popular computer languages can be thought of as a poorly implemented Common Lisp. There is a reason for this. Church's lambda calculus is a great foundation for reasoning about programming language semantics. Lisp can be seen as a realization of a lambda calculus interpreter. By reasoning about a language's semantics in Lisp, we're essentially reasoning about the semantics in a variation of lambda calculus.

Saturday, May 15, 2021

β-conversion

If you have an expression that is an application, and the operator of the application is a lambda expression, then you can β-reduce the application by substituting the arguments of the application for the bound variables of the lambda within the body of the lambda.

;; BETA attempts a single reduction at the root of EXPRESSION.  It calls
;; IF-REDUCED with the reduced form on success, or IF-NOT-REDUCED with
;; no arguments if EXPRESSION is not a reducible application.
(defun beta (expression if-reduced if-not-reduced)
  (if (application? expression)
      (let ((operator (application-operator expression))
            (operands (application-operands expression)))
        (if (lambda? operator)
            (let ((bound-variables (lambda-bound-variables operator))
                  (body (lambda-body operator)))
              (if (same-length? bound-variables operands)
                  (funcall if-reduced
                           (xsubst body
                                   (table/extend* (table/empty)
                                                  bound-variables
                                                  operands)))
                  (funcall if-not-reduced)))
            (funcall if-not-reduced)))
      (funcall if-not-reduced)))

* (beta '((lambda (x y) (lambda (z) (* x y z))) a (+ z 3))
        #'identity (constantly nil))
(LAMBDA (#:Z460) (* A (+ Z 3) #:Z460))

A large, complex expression may or may not have subexpressions that can be β-reduced. If neither an expression nor any of its subexpressions can be β-reduced, then we say the expression is in “β-normal form”. We may be able to reduce an expression to β-normal form by β-reducing where possible. A β-reduction can introduce further reducible expressions if we substitute a lambda expression for a symbol in operator position, so reducing to β-normal form is an iterative process where we continue to reduce any reducible expressions that arise from substitution.

(defun beta-normalize-step (expression)
  (expression-dispatch expression
    ;; case application
    (lambda (subexpressions)
      ;; Normal order reduction
      ;; First, try to beta reduce the outermost application,
      ;; otherwise, recursively descend the subexpressions, working
      ;; from left to right.
      (beta expression
            #'identity
            (lambda ()
              (labels ((l (subexpressions)
                         (if (null subexpressions)
                             '()
                             (let ((new-sub (beta-normalize-step (car subexpressions))))
                               (if (eq new-sub (car subexpressions))
                                   (let ((new-tail (l (cdr subexpressions))))
                                     (if (eq new-tail (cdr subexpressions))
                                         subexpressions
                                         (cons (car subexpressions) new-tail)))
                                   (cons new-sub (cdr subexpressions)))))))
                (let ((new-subexpressions (l subexpressions)))
                  (if (eq new-subexpressions subexpressions)
                      expression
                      (make-application new-subexpressions)))))))
    ;; case lambda
    (lambda (bound-variables body)
      (let ((new-body (beta-normalize-step body)))
        (if (eql new-body body)
            expression
            (make-lambda bound-variables new-body))))
    ;; case symbol
    (constantly expression)))

;;; A normalized expression is a fixed point of the
;;; beta-normalize-step function.
(defun beta-normalize (expression)
  (do ((expression expression (beta-normalize-step expression))
       (expression1 '() expression)
       (count 0 (+ count 1)))
      ((eq expression expression1)
       (format t "~%~d beta reductions" (- count 1))
       expression)))

You can compute just by using β-reduction. Here we construct an expression that reduces to the factorial of 3. We only have β-reduction, so we have to encode numbers with Church encoding.

(defun test-form ()
  (let ((table
         (table/extend*
          (table/empty)
          '(one
            three
            *
            pred
            zero?
            y)
          '(;; Church numeral one
            (lambda (f) (lambda (x) (f x)))
            ;; Church numeral three
            (lambda (f) (lambda (x) (f (f (f x)))))
            ;; * (multiply Church numerals)
            (lambda (m n)
              (lambda (f)
                (m (n f))))
            ;; pred (subtract 1 from Church numeral)
            (lambda (n)
              (lambda (f)
                (lambda (x) (((n (lambda (g)
                                   (lambda (h)
                                     (h (g f)))))
                              (lambda (u) x))
                             (lambda (u) u)))))
            ;; zero? (test if Church numeral is zero)
            (lambda (n t f) ((n (lambda (x) f)) t))
            ;; Y operator for recursion
            (lambda (f)
              ((lambda (x) (f (x x)))
               (lambda (x) (f (x x))))))))
        (expr
         '((lambda (factorial)
             (factorial three))
           (y (lambda (fact)
                (lambda (x)
                  (zero? x
                         one
                         (* (fact (pred x)) x))))))))
    (xsubst expr table)))
* (test-form)
((LAMBDA (FACTORIAL) (FACTORIAL (LAMBDA (F) (LAMBDA (X) (F (F (F X)))))))
 ((LAMBDA (F) ((LAMBDA (X) (F (X X))) (LAMBDA (X) (F (X X)))))
  (LAMBDA (FACT)
    (LAMBDA (X)
      ((LAMBDA (N T F) ((N (LAMBDA (X) F)) T)) X
       (LAMBDA (F) (LAMBDA (X) (F X)))
       ((LAMBDA (M N) (LAMBDA (F) (M (N F))))
        (FACT
         ((LAMBDA (N)
            (LAMBDA (F)
              (LAMBDA (X)
                (((N (LAMBDA (G) (LAMBDA (H) (H (G F))))) (LAMBDA (U) X))
                 (LAMBDA (U) U)))))
          X))
        X))))))
* (beta-normalize (test-form))
127 beta reductions
(LAMBDA (F) (LAMBDA (X) (F (F (F (F (F (F X))))))))

This is the Church numeral for 6.

I find it pretty amazing that we can bootstrap ourselves up to arithmetic just by repeatedly β-reducing where we can. It doesn't seem like we're actually doing any work. We're just replacing names with what they stand for.

The β-substitution above replaces all the bound variables with their arguments if there is the correct number of arguments. One could easily implement a partial β-substitution that replaced only some of the bound variables. You'd still have an application, but some of the bound variables in the lambda would be eliminated and the corresponding arguments would be removed.
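A sketch of what such a partial substitution might produce (it is not implemented above): substituting only x leaves a smaller application behind.

;; Before: an application with two operands.
((lambda (x y) (+ x y)) 1 (f 2))

;; After substituting only x := 1:
((lambda (y) (+ 1 y)) (f 2))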
