System F

Simply typed lambda calculus is restrictive. The let-polymorphism of Hindley-Milner gives us more breathing room, but can we do better?

System F frees the type system further by introducing parts of lambda calculus at the type level. We have type abstraction terms and type application terms, which define and apply functions that take types as arguments and return terms. At the same time, System F remains normalizing.

System F is rich enough that the self-application \x.x x is typable.

If we focus on the types, System F is a gentle generalization of Hindley-Milner. In the latter, any universal quantifiers (∀) must appear at the beginning of type, meaning they are scoped over the entire type. In System F, they can be arbitrary scoped. For example. (∀X.X->X)->(∀X.X->X) is a System F type, but not a Hindley-Milner type, while ∀X.(X->X)->X->X is a type in both systems.

This seemingly small change has deep consequences. Recall Hindley-Milner allows:

let id = \x.x in id succ(id 0)

but rejects:

(\f.f succ(f 0)) (\x.x)

This is because algorithm W assigns x a type variable, say X, then finds the conflicting constraints X = Nat and X = Nat -> Nat. A locally scoped generalized variable fixes this by causing fresh type variables to be generated for each use, so the resulting constraints, say, X = Nat and Y = Nat -> Nat, live and let live. Let-free let-polymorphism!

It’s easy to demonstrate this in Haskell. The following fails:

(\f->f succ(f 0))(\x->x)

We can make it run by enabling an extension, and annotating the identity function appropriately:

:set -XRankNTypes
((\f->f succ(f 0)) :: ((forall x.x->x)->Int))(\x->x)

We write type abstractions as lambda abstractions without a type annotation. The simplest example is the polymorphic identity function:

id=\X.\x:X.x

For clarity, we capitalize the first letter of type variables in our examples.

The above represents a function that takes a type, and then returns an identity function for that type.

We write type applications like term applications except we surround the argument with square brackets. For example:

id [Nat] 42

evaluates to 42.

Type spam

Our new features have a heavy price tag. In System F, type reconstruction is undecidable. We must add explicit types to every binding in every lambda abstraction. Moreover, applying type abstractions require us to state even more types.

Haskell magically fills in missing types if enough hints are given, which is why our example above got away with relatively little annotation. Our implementation of System F lacks this ability, so we must always supply types. (This is why we used Haskell above instead of our System F interpreter!)

Because types must frequently be specified, few practical languages are built on System F. (Perhaps it’s also because System F is a relatively recent discovery?) The Hindley-Milner system is often preferable, due to type inference.

However, behind the scenes, modern Haskell is an extension of System F. Certain language features require type annotation, and they generate unseen intermediate code full of type abstractions and type applications.

Type operators make types less eye-watering.

Definitions

We replace the GV constructor representing Hindley-Milner generalized variables with its scoped version Forall.

We add type applications and type abstractions to the Term data type. A type application is like a term application, except it expects a type as input (and during printing should enclose it within square brackets). A type abstraction is like a term abstraction, except its variable, being a type variable, has no type annotation.

data Type = TV String | Forall String Type | Type :-> Type
data Term = Var String | App Term Term | Lam (String, Type) Term
 | Let String Term Term
 | TLam String Term | TApp Term Type
instance Show Type where
 show (TV s) = s
 show (Forall s t) = "∀" ++ s ++ "." ++ show t
 show (t :-> u) = showL t ++ " -> " ++ showR u where
 showL (Forall _ _) = "(" ++ show t ++ ")"
 showL (_ :-> _) = "(" ++ show t ++ ")"
 showL _ = show t
 showR (Forall _ _) = "(" ++ show u ++ ")"
 showR _ = show u
instance Show Term where
 show (Lam (x, t) y) = "λ" ++ x ++ showT t ++ showB y where
 showB (Lam (x, t) y) = " " ++ x ++ showT t ++ showB y
 showB expr = '.':show expr
 showT (TV "_") = ""
 showT t = ':':show t
 show (TLam s t) = "λ" ++ s ++ showB t where
 showB (TLam s t) = " " ++ s ++ showB t
 showB expr = '.':show expr
 show (Var s) = s
 show (App x y) = showL x ++ showR y where
 showL (Lam _ _) = "(" ++ show x ++ ")"
 showL _ = show x
 showR (Var s) = ' ':s
 showR _ = "(" ++ show y ++ ")"
 show (TApp x y) = showL x ++ "[" ++ show y ++ "]" where
 showL (Lam _ _) = "(" ++ show x ++ ")"
 showL _ = show x
 show (Let x y z) =
 "let " ++ x ++ " = " ++ show y ++ " in " ++ show z

Universal types complicate type comparison, because bound type variables may be renamed arbitrarily. That is, types are unique up to alpha-conversion.

instance Eq Type where
 t1 == t2 = f [] t1 t2 where
 f alpha (TV s) (TV t)
 | Just t' <- lookup s alpha = t' == t
 | Just _ <- lookup t ((\(a, b) -> (b, a)) <$> alpha) = False
 | otherwise = s == t
 f alpha (Forall s x) (Forall t y)
 | s == t = f alpha x y
 | otherwise = f ((s, t):alpha) x y
 f alpha (a :-> b) (c :-> d) = f alpha a c && f alpha b d
 f alpha _ _ = False

Parsing

Parsing is trickier because elements of lambda calculus have invaded the type level. For each abstraction, we look for a type binding to determine if it’s at the term or type level. For applications, we look for square brackets to decide.

We follow Haskell and accept forall as well as the harder-to-type ∀ symbol. Conventionally, the arrow type constructor (->) has higher precedence.

We accept inputs that omit all but the last period and all but the first quantifier in a sequence of universal quantified type variables, an abbreviation similar to the one we’ve been using in sequences of abstractions.

jsEval "curl_module('../compiler/Charser.ob')"

import Charser
data FLine = None | TopLet String Term | Run Term deriving Show
line = between ws eof $
 TopLet <$> v <*> (str "=" *> term) <|> Run <$> term <|> pure None where
 term = letx <|> lam <|> app
 letx = Let <$> (str "let" *> v) <*> (str "=" *> term) <*> (str "in" *> term)
 lam = flip (foldr pickLam) <$> between lam0 lam1 (some vt) <*> term where
 lam0 = str "\\" <|> str "λ"
 lam1 = str "."
 vt = (,) <$> v <*> (str ":" *> (Just <$> typ) <|> pure Nothing)
 pickLam (s, Nothing) = TLam s
 pickLam (s, Just t) = Lam (s, t)
 typ = forallt <|> fun
 forallt = flip (foldr Forall) <$> between fa0 fa1 (some v) <*> fun where
 fa0 = str "forall" <|> str "∀"
 fa1 = str "."
 fun = (TV <$> v
 <|> between (str "(") (str ")") typ) `chainr1` (str "->" *> pure (:->))
 app = termArg >>= moreArg
 termArg = (Var <$> v) <|> between (str "(") (str ")") term
 moreArg t = ((App t <$> termArg
 <|> (TApp t <$> between (str "[") (str "]") typ)) >>= moreArg) <|> pure t
 v = do
 s <- some alphaNumChar
 when (s `elem` words "let in forall") $ Charser $ const $ Left "unexpected keyword"
 ws
 pure s
 str = (<* ws) . string
 ws = space *> (string "--" *> many (sat $ const True) *> pure () <|> pure ())
chainr1 p op = go id where
 go d = do
 x <- p
 (op >>= \f -> go (d . (f x:))) <|> pure (foldr ($) x $ d [])

Typing

We’ve abandoned type inference, which simplifies typing. Nonetheless, we must handle the new terms. Type abstractions are trivial. As for type applications, once again, a routine used at the term level must now be written at the type level: we must rename type variables when they conflict with free type variables.

typeOf :: [(String, Type)] -> Term -> Either String Type
typeOf gamma t = case t of
 Var s | Just t <- lookup s gamma -> pure t
 | otherwise -> Left $ "undefined " ++ s
 App x y -> do
 tx <- rec x
 ty <- rec y
 case tx of
 ty' :-> tz | ty == ty' -> pure tz
 _ -> Left $ show tx ++ " apply " ++ show ty
 Lam (x, t) y -> do
 u <- typeOf ((x, t):gamma) y
 pure $ t :-> u
 TLam s t -> Forall s <$> typeOf gamma t
 TApp x y -> do
 tx <- typeOf gamma x
 case tx of
 Forall s t -> pure $ beta t where
 beta (TV v) | s == v = y
 | otherwise = TV v
 beta (Forall u v)
 | s == u = Forall u v
 | u `elem` fvs = let u1 = newName u fvs in
 Forall u1 $ beta $ tRename u u1 v
 | otherwise = Forall u $ beta v
 beta (m :-> n) = beta m :-> beta n
 fvs = tfv [] y
 _ -> Left $ "TApp " ++ show tx
 Let s t u -> do
 tt <- typeOf gamma t
 typeOf ((s, tt):gamma) u
 where rec = typeOf gamma
tfv vs (TV s) | s `elem` vs = []
 | otherwise = [s]
tfv vs (x :-> y) = tfv vs x `union` tfv vs y
tfv vs (Forall s t) = tfv (s:vs) t
tRename x x1 ty = case ty of
 TV s | s == x -> TV x1
 | otherwise -> ty
 Forall s t
 | s == x -> ty
 | otherwise -> Forall s (rec t)
 a :-> b -> rec a :-> rec b
 where rec = tRename x x1
newName x ys = head $ filter (`notElem` ys) $ (s ++) . show <$> [1..] where
 s = takeWhile (/= '_') x ++ "_"

Evaluation

Evaluation remains about the same. We erase types as we go.

As usual, the function eval takes us to weak head normal form:

eval env (Let x y z) = eval env $ beta (x, y) z
eval env (App m a) = let m' = eval env m in case m' of
 Lam (v, _) f -> eval env $ beta (v, a) f
 _ -> App m' a
eval env (TApp m _) = eval env m
eval env (TLam _ t) = eval env t
eval env term@(Var v) | Just x <- lookup v (snd env) = case x of
 Var v' | v == v' -> x
 _ -> eval env x
eval _ term = term
beta (v, a) f = case f of
 Var s | s == v -> a
 | otherwise -> Var s
 Lam (s, _) m
 | s == v -> Lam (s, TV "_") m
 | s `elem` fvs -> let s1 = newName s $ v : fvs `union` fv [] m in
 Lam (s1, TV "_") $ rec $ rename s s1 m
 | otherwise -> Lam (s, TV "_") (rec m)
 App m n -> App (rec m) (rec n)
 TLam s t -> TLam s (rec t)
 TApp t ty -> TApp (rec t) ty
 Let x y z -> Let x (rec y) (rec z)
 where
 fvs = fv [] a
 rec = beta (v, a)
fv vs (Var s) | s `elem` vs = []
 | otherwise = [s]
fv vs (Lam (s, _) f) = fv (s:vs) f
fv vs (App x y) = fv vs x `union` fv vs y
fv vs (Let _ x y) = fv vs x `union` fv vs y
fv vs (TLam _ t) = fv vs t
fv vs (TApp x _) = fv vs x
rename x x1 term = case term of
 Var s | s == x -> Var x1
 | otherwise -> term
 Lam (s, t) b
 | s == x -> term
 | otherwise -> Lam (s, t) (rec b)
 App a b -> App (rec a) (rec b)
 Let a b c -> Let a (rec b) (rec c)
 TLam s t -> TLam s (rec t)
 TApp a b -> TApp (rec a) b
 where rec = rename x x1

The function norm recurses to find the normal form:

norm env@(gamma, lets) term = case eval env term of
 Var v -> Var v
 -- Record abstraction variable to avoid clashing with let definitions.
 Lam (v, _) m -> Lam (v, TV "_") (norm (gamma, (v, Var v):lets) m)
 App m n -> App (rec m) (rec n)
 Let x y z -> Let x (rec y) (rec z)
 TApp m _ -> rec m
 TLam _ t -> rec t
 where rec = norm env

Booleans, Integers, Pairs, Lists

Hindley-Milner supports Church-encodings of booleans, integers, pairs, and lists. System F is more general, so of course supports them too. However, we must be explicit about types. With Hindley-Milner, globally scoped universal quantifiers for all type variables are implied. With System F, our terms must start with type abstractions or a term abstraction annotated with a universal type:

true = \X x:X y:X.x
false = \X x:X y:X.y
not = \b:forall X.X->X->X X t:X f:X.b [X] f t
0 = \X s:X->X z:X.z
succ = \n:(forall X.(X->X)->X->X) X s:X->X z:X.s(n[X] s z)
pair = \X Y x:X y:Y Z f:X->Y->Z.f x y
fst = \X Y p:forall Z.(X->Y->Z)->Z.p [X] (\x:X y:Y.x)
snd = \X Y p:forall Z.(X->Y->Z)->Z.p [Y] (\x:X y:Y.y)
nil = \X.(\R.\c:X->R->R.\n:R.n)
cons = \X h:X t:forall R.(X->R->R)->R->R.(\R c:X->R->R n:R.c h (t [R] c n))

Apply yourself!

In our previous systems, the term \x.x x has been untypable. No longer! Self-application can be expressed in System F:

\x:forall X.X->X.x[forall X.X->X] x

Of course, self-application of self-application ((\x.x x)(\x.x x)) remains untypable, because it has no normal form.

Information hiding

It turns out universal types can represent existential types. These types can enforce information hiding. For example, we can give the programmer access to an API but forbid access to the implementation details.

Knowledge is power. Languages with simpler type systems still benefit if their designers know System F. For example, Haskell 98 is strictly Hindley-Milner, but researchers exploited existential types to design and prove theorems about a language extension enabling the ST monad.

System F in System F

The polymorphic identity function is typable in System F, hence we can construct the self-interpreter described in Breaking Through the Normalization Barrier: A Self-Interpreter for F-omega by Matt Brown and Jens Palsberg.

E=\T q:(forall X.X->X)->T.q(\X x:X.x)

As we demonstrated for the shallow encoding of untyped lambda calculus terms, the trick is to block every application (of types or terms) by adding one more layer of abstraction. The evaluation proceeds once we instantiate the abstraction with the polymorphic identity function.

Since we must specify types in System F, computing even a shallow encoding involves type checking.

shallow gamma term = (Forall "_0" (TV "_0" :-> TV "_0") :-> t,
 Lam ("id", Forall "X" (TV "X" :-> TV "X")) q) where
 (t, q) = f gamma term where
 f gamma term = case term of
 Var s -> (ty, Var s) where Just ty = lookup s gamma
 App m n -> (tn, App (App (TApp (Var "id") tm) qm) qn) where
 (tm, qm) = f gamma m
 (tn, qn) = f gamma n
 Lam (s, ty) t -> (ty :-> tt, Lam (s, ty) qt) where
 (tt, qt) = f ((s, ty):gamma) t
 TLam s t -> (Forall s tt, TLam s qt) where
 (tt, qt) = f gamma t
 TApp x ty -> (beta t, TApp (App (TApp (Var "id") tx) q) ty) where
 (tx@(Forall s t), q) = f gamma x
 beta (TV v) | s == v = ty
 | otherwise = TV v
 beta (Forall u v)
 | s == u = Forall u v
 | u `elem` fvs = let u1 = newName u fvs in
 Forall u1 $ beta $ tRename u u1 v
 | otherwise = Forall u $ beta v
 beta (m :-> n) = beta m :-> beta n
 fvs = tfv [] ty
 Let s t u -> f ((s, fst $ f gamma t):gamma) $ beta (s, t) u
expandShallow gamma = go where
 go = \case
 Var s -> Var s
 App (Var "shallow") t -> snd $ shallow gamma t
 App m a -> App (go m) (go a)
 Lam x m -> Lam x (go m)
 Let s t u -> Let s (go t) (go u)
 TLam s t -> TLam s (go t)
 TApp t y -> TApp (go t) y

Why, then, do books and papers claim self-interpreters are impossible for a strongly normalizing language? It turns out the definition of "self-interpreter" varies.

User Interface

The less said the better.

eval = do
 es <- map (parse line "") . lines <$> jsEval "input.value;"
 jsEval $ "output.value = `" ++ run ([], []) es "" ++ "`;"
 where
 run env@(gamma, lets) = \case
 [] -> id
 h:t -> let next = run env t in case h of
 Left err -> ("parse error: "++) . shows err . ("\n"++) . next
 Right m -> case m of
 None -> next
 Run term -> let x = expandShallow gamma term in case typeOf gamma x of
 Left m -> ("type error: "++) . shows (term,x) . (": "++) .
 (m++) . ("\n"++) . next
 Right _ -> shows (norm env x) . ("\n"++) . next
 TopLet s term -> let x = expandShallow gamma term in case typeOf gamma x of
 Left m -> ("type error: "++) . shows term . (": "++) .
 (m++) . ("\n"++) . next
 Right y -> ("["++) . (s++) . (":"++) . shows x . ("]\n"++) .
 run ((s, y):gamma, (s, x):lets) t
jsEval [r|
 evalB.addEventListener("click", (ev) => { repl.run("chat", ["Main"], "eval"); });
|]

⏴ PCF Type operators ⏵

Contents

[back to top]

Ben Lynn blynn@cs.stanford.edu 💡