It’s about time to add support for modules: our last compiler is almost 2000 lines of code.
Continuing our RPG analogy, we liken modules to parties of heroes. Even in single-player games, we often direct a group of specialists rather than one powerful multi-talented being. Perhaps we enjoy observing teamwork, or following gaming traditions.
But perhaps we also like parties for the same reasons we decompose a large problem into digestible subproblems. For example, we might only need to think about the intricacies of magic spells when controlling the wizard character.
As usual, our first stab has limitations. This party is just getting started.
To avoid dealing with files, our compiler reads the concatenation of all the modules on standard input.
To keep parser changes minimal, all symbols are exported, and all module imports are unqualified.
Fixity declarations must precede all occurrences of their corresponding operators in the standard input.
At most one module should declare FFI imports.
Cyclic module dependencies cause an infinite loop.
On the other hand, we proudly support import statements anywhere in the file,
and multiple modules in the same file.
In fact, this is why fixity declarations must appear first in the input. GHC insists on one module per file with imports appearing before other declarations, hence its parser can process imports before reaching any expressions and determine the fixity of any operators that appear when it later reaches them. With our scheme, we may encounter an operator in an expression before learning its fixity, which confuses our simple parser. In a later compiler we’ll address this issue.
We tweak the parser to support module and import, and add a new field to
Neat that hold the imports of a module. A module with no explicit module
declaration is taken to be the Main module. Concatenation implies such a
module would have to appear first.
We add a new Link field to the Extra data type, which holds the module,
symbol, and type of a top-level variable defined in another module. During
inference, we replace a V field with a Link field if we find it is exported
from one of the imported modules.
We introduce a one-off Dep monad because we lack monad transformers, and
would like a combination of the Either and State monads when finding the
dependencies of a definition.
Up until now, all symbols were global across a single file. As we Scott-encoded
ADTs and generated types and selector functions for typeclass methods, we
simply threw them on a big pile in a Neat value being passed around.
Modules force us to be more careful.
We invent a special module "#" preloaded with built-in definitions required by the Haskell syntax we support:
The unit type and value () is part of the language.
If expressions and guards require Bool, True, and False.
Pairs are part of the language, even though suitably defined ADTs could be
used instead (the IOCCC edition of our compiler does this
to save room). Curiously, Haskell has no built-in type for the dual of
pairs; requiring the programmer to define Either.
Lists are famously built into Haskell.
String literals require lists.
We compile recursive let definitions with fix.
Operations involving native integer types: chr ord intAdd intMul
and so on.
Primitives for IO monad methods.
The RTS reduces fail# on failed case matches.
Each module implicitly imports this special "#" module, so these built-in primitives are accessible to all.
This is a good time to mention that rewriting means:
Ranges become expressions involving enumFromTo and enumFrom.
Failed pattern matches are undefined.
We need pure (for pure ()) and >>= to support do notation.
Patterns containing integer and character literals require (==).
List comprehensions are expressed in terms of concatMap and pure.
None of these are built-in; they must be explicitly defined at the top level if
these language features are used. The last of these implies we must define an
Applicative instance for lists. To remove these gotchas, we could define
low-level primitives as we do for the others.
Code generation now has two phases. The first corresponds to GHC incrementally
compiling a module: it resolves all locally defined symbols, and leaves Link
values indicating where to put addresses of symbols defined elsewhere. The
generated code is not position-independent; rather, for each module, we are
given the current heap pointer, and return an updated heap pointer.
The second phase replaces all Link values with heap addresses, as all
entities are in the heap by this point.
Modules make us regret older expedient decisions regarding typeclasses.
We threw default method definitions in one data structure, and lumped together
method type signatures and instances in another. But now we might find a
typeclass definition in one module, and an instance of it in another, so our
code that searches imports for this information is messy. For example,
the fillSigs helper raids other modules for the types of methods.
We had previously substituted the syntax trees for default method
implementations straight into instances. If we one day want incremental
compilation, then it is likely easier to compile a default implementation
once, then access it from other modules via a layer of indirection. With this
in mind, for each method foo, we generate a method called {default}foo
whose body is the default method implementation of foo if given, and
fail# otherwise.
Since we toss dictionary selector functions on to a big pile of ordinary
definitions, to find the type of a method we add typeOfMethod, whose logic is
similar to findImportSym, but differs enough that we implement it
independently.
We modify the code to insert dictionaries one strongly-connected-component at a time rather than one function at a time. This is required to correctly compile mutually recursive functions that use typeclasses. Each function of the component may wind up calling any other, so it needs all the relevant dictionaries.
Up until now we had preserved topological order of the top-level definitions as they made their way through our compiler. We change the code generator so it no longer needs this precondition, so that we can store compiled functions and modules in maps rather than delicately manicured lists.
We introduce a single combinator to act as BK which frequently occurs due to
Scott encoding.
If B K x y z = x y is reduced individually, our virtual machine allocates
a new app-cell for K (x y), only to immediately rewrite it as I (x y),
which again must be reduced to yield x y at last. The BK combinator
avoids this needless song and dance.
A dedicated BK combinator is also aesthetically pleasing. Consider some
three-argument combinator given x y z. We can leave x alone or apply it to
z, and similarly for y, and then apply the first thing to the second:
(x )(y ) (x )(y z) (x z)(y ) (x z)(y z)
The last 3 are the B C S combinators. The first one is BK.
Smullyan appears not to have assigned a bird to this combinator, so we resort
to the clunky name BK throughout our code.
The BK combinator makes it easy for optim to rewrite B BK V as CONS.
We add the LEFT combinator, which is equivalent to B BK T and also arises
frequently in Scott encodings; indeed, the data constructor Left compiles to
LEFT. We add the KI combinator to shave off a few more reductions.
-- Modules.
infixr 9 !
infixr 9 .
infixl 7 * , `div` , `mod`
infixl 6 + , -
infixr 5 ++
infixl 4 <*> , <$> , <* , *>
infix 4 == , /= , <=
infixl 3 && , <|>
infixl 2 ||
infixl 1 >> , >>=
infixr 0 $
foreign import ccall "putchar_shim" putChar :: Char -> IO ()
foreign import ccall "getchar_shim" getChar :: IO Char
foreign import ccall "eof_shim" isEOFInt :: IO Int
foreign import ccall "getargcount" getArgCount :: IO Int
foreign import ccall "getargchar" getArgChar :: Int -> Int -> IO Char
libc = [r|#include<stdio.h>
static int env_argc;
int getargcount() { return env_argc; }
static char **env_argv;
int getargchar(int n, int k) { return env_argv[n][k]; }
static int nextCh, isAhead;
int eof_shim() {
if (!isAhead) {
isAhead = 1;
nextCh = getchar();
}
return nextCh == -1;
}
void exit(int);
void putchar_shim(int c) { putchar(c); }
int getchar_shim() {
if (!isAhead) nextCh = getchar();
if (nextCh == -1) exit(1);
isAhead = 0;
return nextCh;
}
void errchar(int c) { fputc(c, stderr); }
void errexit() { fputc('\n', stderr); }
|]
class Functor f where fmap :: (a -> b) -> f a -> f b
class Applicative f where
pure :: a -> f a
(<*>) :: f (a -> b) -> f a -> f b
class Monad m where
return :: a -> m a
(>>=) :: m a -> (a -> m b) -> m b
(<$>) = fmap
liftA2 f x y = f <$> x <*> y
(>>) f g = f >>= \_ -> g
class Eq a where (==) :: a -> a -> Bool
instance Eq Int where (==) = intEq
instance Eq Char where (==) = charEq
($) f x = f x
id x = x
const x y = x
flip f x y = f y x
(&) x f = f x
class Ord a where
(<=) :: a -> a -> Bool
x <= y = case compare x y of
LT -> True
EQ -> True
GT -> False
compare :: a -> a -> Ordering
compare x y = if x <= y then if y <= x then EQ else LT else GT
instance Ord Int where (<=) = intLE
instance Ord Char where (<=) = charLE
data Ordering = LT | GT | EQ
instance Ord a => Ord [a] where
xs <= ys = case xs of
[] -> True
x:xt -> case ys of
[] -> False
y:yt -> if x <= y then if y <= x then xt <= yt else True else False
compare xs ys = case xs of
[] -> case ys of
[] -> EQ
_ -> LT
x:xt -> case ys of
[] -> GT
y:yt -> if x <= y then if y <= x then compare xt yt else LT else GT
data Maybe a = Nothing | Just a
data Either a b = Left a | Right b
fst (x, y) = x
snd (x, y) = y
uncurry f (x, y) = f x y
first f (x, y) = (f x, y)
second f (x, y) = (x, f y)
not a = if a then False else True
x /= y = not $ x == y
(.) f g x = f (g x)
(||) f g = if f then True else g
(&&) f g = if f then g else False
instance Eq a => Eq [a] where
xs == ys = case xs of
[] -> case ys of
[] -> True
_ -> False
x:xt -> case ys of
[] -> False
y:yt -> x == y && xt == yt
take 0 xs = []
take _ [] = []
take n (h:t) = h : take (n - 1) t
maybe n j m = case m of Nothing -> n; Just x -> j x
instance Functor Maybe where fmap f = maybe Nothing (Just . f)
instance Applicative Maybe where pure = Just ; mf <*> mx = maybe Nothing (\f -> maybe Nothing (Just . f) mx) mf
instance Monad Maybe where return = Just ; mf >>= mg = maybe Nothing mg mf
instance Alternative Maybe where empty = Nothing ; x <|> y = maybe y Just x
foldr c n = \case [] -> n; h:t -> c h $ foldr c n t
length = foldr (\_ n -> n + 1) 0
mapM f = foldr (\a rest -> liftA2 (:) (f a) rest) (pure [])
mapM_ f = foldr ((>>) . f) (pure ())
foldM f z0 xs = foldr (\x k z -> f z x >>= k) pure xs z0
instance Applicative IO where pure = ioPure ; (<*>) f x = ioBind f \g -> ioBind x \y -> ioPure (g y)
instance Monad IO where return = ioPure ; (>>=) = ioBind
instance Functor IO where fmap f x = ioPure f <*> x
class Show a where
showsPrec :: Int -> a -> String -> String
showsPrec _ x = (show x++)
show :: a -> String
show x = shows x ""
showList :: [a] -> String -> String
showList = showList__ shows
shows = showsPrec 0
showList__ _ [] s = "[]" ++ s
showList__ showx (x:xs) s = '[' : showx x (showl xs)
where
showl [] = ']' : s
showl (y:ys) = ',' : showx y (showl ys)
showInt__ n
| 0 == n = id
| True = showInt__ (n`div`10) . (chr (48+n`mod`10):)
instance Show () where show () = "()"
instance Show Bool where
show True = "True"
show False = "False"
instance Show a => Show [a] where showsPrec _ = showList
instance Show Int where
showsPrec _ n
| 0 == n = ('0':)
| 1 <= n = showInt__ n
| 2 * n == 0 = ("-2147483648"++)
| True = ('-':) . showInt__ (0 - n)
showLitChar__ '\n' = ("\\n"++)
showLitChar__ '\\' = ("\\\\"++)
showLitChar__ c = (c:)
instance Show Char where
showsPrec _ '\'' = ("'\\''"++)
showsPrec _ c = ('\'':) . showLitChar__ c . ('\'':)
showList s = ('"':) . foldr (.) id (map go s) . ('"':) where
go '"' = ("\\\""++)
go c = showLitChar__ c
instance (Show a, Show b) => Show (a, b) where
showsPrec _ (a, b) = showParen True $ shows a . (',':) . shows b
isEOF = (0 /=) <$> isEOFInt
putStr = mapM_ putChar
putStrLn = (>> putChar '\n') . putStr
print = putStrLn . show
getContents = isEOF >>= \b -> if b then pure [] else getChar >>= \c -> (c:) <$> getContents
interact f = getContents >>= putStr . f
getArgs = getArgCount >>= \n -> mapM (go 0) [1..n-1] where
go k n = getArgChar n k >>= \c -> if ord c == 0 then pure [] else (c:) <$> go (k + 1) n
error s = unsafePerformIO $ putStr s >> putChar '\n' >> exitSuccess
undefined = error "undefined"
foldr1 c l@(h:t) = maybe undefined id $ foldr (\x m -> Just $ maybe x (c x) m) Nothing l
foldl f a bs = foldr (\b g x -> g (f x b)) (\x -> x) bs a
foldl1 f (h:t) = foldl f h t
elem k xs = foldr (\x t -> x == k || t) False xs
find f xs = foldr (\x t -> if f x then Just x else t) Nothing xs
(++) = flip (foldr (:))
concat = foldr (++) []
map = flip (foldr . ((:) .)) []
head (h:_) = h
tail (_:t) = t
isSpace c = elem (ord c) [32, 9, 10, 11, 12, 13, 160]
instance Functor [] where fmap = map
instance Applicative [] where pure = (:[]); f <*> x = concatMap (<$> x) f
instance Monad [] where return = (:[]); (>>=) = flip concatMap
concatMap = (concat .) . map
lookup s = foldr (\(k, v) t -> if s == k then Just v else t) Nothing
filter f = foldr (\x xs -> if f x then x:xs else xs) []
union xs ys = foldr (\y acc -> (if elem y acc then id else (y:)) acc) xs ys
intersect xs ys = filter (\x -> maybe False (\_ -> True) $ find (x ==) ys) xs
last (x:xt) = go x xt where go x xt = case xt of [] -> x; y:yt -> go y yt
init (x:xt) = case xt of [] -> []; _ -> x : init xt
intercalate sep = \case [] -> []; x:xt -> x ++ concatMap (sep ++) xt
intersperse sep = \case [] -> []; x:xt -> x : foldr ($) [] (((sep:) .) . (:) <$> xt)
all f = foldr (&&) True . map f
any f = foldr (||) False . map f
and = foldr (&&) True
or = foldr (||) False
zipWith f xs ys = case xs of [] -> []; x:xt -> case ys of [] -> []; y:yt -> f x y : zipWith f xt yt
zip = zipWith (,)
data State s a = State (s -> (a, s))
runState (State f) = f
instance Functor (State s) where fmap f = \(State h) -> State (first f . h)
instance Applicative (State s) where
pure a = State (a,)
(State f) <*> (State x) = State \s -> case f s of (g, s') -> first g $ x s'
instance Monad (State s) where
return a = State (a,)
(State h) >>= f = State $ uncurry (runState . f) . h
evalState m s = fst $ runState m s
get = State \s -> (s, s)
put n = State \s -> ((), n)
either l r e = case e of Left x -> l x; Right x -> r x
instance Functor (Either a) where fmap f e = either Left (Right . f) e
instance Applicative (Either a) where
pure = Right
ef <*> ex = case ef of
Left s -> Left s
Right f -> either Left (Right . f) ex
instance Monad (Either a) where
return = Right
ex >>= f = either Left f ex
class Alternative f where
empty :: f a
(<|>) :: f a -> f a -> f a
asum = foldr (<|>) empty
(*>) = liftA2 \x y -> y
(<*) = liftA2 \x y -> x
many p = liftA2 (:) p (many p) <|> pure []
some p = liftA2 (:) p (many p)
sepBy1 p sep = liftA2 (:) p (many (sep *> p))
sepBy p sep = sepBy1 p sep <|> pure []
between x y p = x *> (p <* y)
-- Map.
data Map k a = Tip | Bin Int k a (Map k a) (Map k a)
instance Functor (Map k) where
fmap f m = case m of
Tip -> Tip
Bin sz k x l r -> Bin sz k (f x) (fmap f l) (fmap f r)
size m = case m of Tip -> 0 ; Bin sz _ _ _ _ -> sz
node k x l r = Bin (1 + size l + size r) k x l r
singleton k x = Bin 1 k x Tip Tip
singleL k x l (Bin _ rk rkx rl rr) = node rk rkx (node k x l rl) rr
doubleL k x l (Bin _ rk rkx (Bin _ rlk rlkx rll rlr) rr) =
node rlk rlkx (node k x l rll) (node rk rkx rlr rr)
singleR k x (Bin _ lk lkx ll lr) r = node lk lkx ll (node k x lr r)
doubleR k x (Bin _ lk lkx ll (Bin _ lrk lrkx lrl lrr)) r =
node lrk lrkx (node lk lkx ll lrl) (node k x lrr r)
balance k x l r = f k x l r where
f | size l + size r <= 1 = node
| 5 * size l + 3 <= 2 * size r = case r of
Tip -> node
Bin sz _ _ rl rr -> if 2 * size rl + 1 <= 3 * size rr
then singleL
else doubleL
| 5 * size r + 3 <= 2 * size l = case l of
Tip -> node
Bin sz _ _ ll lr -> if 2 * size lr + 1 <= 3 * size ll
then singleR
else doubleR
| True = node
insert kx x t = case t of
Tip -> singleton kx x
Bin sz ky y l r -> case compare kx ky of
LT -> balance ky y (insert kx x l) r
GT -> balance ky y l (insert kx x r)
EQ -> Bin sz kx x l r
insertWith f kx x t = case t of
Tip -> singleton kx x
Bin sy ky y l r -> case compare kx ky of
LT -> balance ky y (insertWith f kx x l) r
GT -> balance ky y l (insertWith f kx x r)
EQ -> Bin sy kx (f x y) l r
mlookup kx t = case t of
Tip -> Nothing
Bin _ ky y l r -> case compare kx ky of
LT -> mlookup kx l
GT -> mlookup kx r
EQ -> Just y
fromList = foldl (\t (k, x) -> insert k x t) Tip
member k t = maybe False (const True) $ mlookup k t
t ! k = maybe undefined id $ mlookup k t
foldrWithKey f = go where
go z t = case t of
Tip -> z
Bin _ kx x l r -> go (f kx x (go z r)) l
toAscList = foldrWithKey (\k x xs -> (k,x):xs) []
keys = map fst . toAscList
-- Syntax tree.
data Type = TC String | TV String | TAp Type Type
arr a b = TAp (TAp (TC "->") a) b
data Extra = Basic String | Const Int | ChrCon Char | StrCon String | Link String String Qual
data Pat = PatLit Ast | PatVar String (Maybe Pat) | PatCon String [Pat]
data Ast = E Extra | V String | A Ast Ast | L String Ast | Pa [([Pat], Ast)] | Proof Pred
data Constr = Constr String [Type]
data Pred = Pred String Type
data Qual = Qual [Pred] Type
instance Eq Type where
(TC s) == (TC t) = s == t
(TV s) == (TV t) = s == t
(TAp a b) == (TAp c d) = a == c && b == d
_ == _ = False
instance Eq Pred where (Pred s a) == (Pred t b) = s == t && a == b
data Instance = Instance
-- Type, e.g. Int for Eq Int.
Type
-- Dictionary name, e.g. "{Eq Int}"
String
-- Context.
[Pred]
-- Method definitions
(Map String Ast)
data Tycl = Tycl [String] [Instance]
data Neat = Neat
(Map String Tycl)
-- | Top-level definitions
[(String, Ast)]
-- | Typed ASTs, ready for compilation, including ADTs and methods,
-- e.g. (==), (Eq a => a -> a -> Bool, select-==)
[(String, (Qual, Ast))]
-- | Data constructor table.
(Map String [Constr]) -- AdtTab
-- | FFI declarations.
[(String, Type)]
-- | Exports.
[(String, String)]
-- | Module imports.
[String]
patVars = \case
PatLit _ -> []
PatVar s m -> s : maybe [] patVars m
PatCon _ args -> concat $ patVars <$> args
fvPro bound expr = case expr of
V s | not (elem s bound) -> [s]
A x y -> fvPro bound x `union` fvPro bound y
L s t -> fvPro (s:bound) t
Pa vsts -> foldr union [] $ map (\(vs, t) -> fvPro (concatMap patVars vs ++ bound) t) vsts
_ -> []
overFreePro s f t = case t of
E _ -> t
V s' -> if s == s' then f t else t
A x y -> A (overFreePro s f x) (overFreePro s f y)
L s' t' -> if s == s' then t else L s' $ overFreePro s f t'
Pa vsts -> Pa $ map (\(vs, t) -> (vs, if any (elem s . patVars) vs then t else overFreePro s f t)) vsts
beta s a t = case t of
E _ -> t
V v -> if s == v then a else t
A x y -> A (beta s a x) (beta s a y)
L v u -> if s == v then t else L v $ beta s a u
showParen b f = if b then ('(':) . f . (')':) else f
-- Parser.
data ParserState = ParserState
[(Char, (Int, Int))]
String
[Int]
(Map String (Int, Assoc))
readme (ParserState x _ _ _) = x
landin (ParserState _ x _ _) = x
indents (ParserState _ _ x _) = x
precs (ParserState _ _ _ x) = x
putReadme x (ParserState _ a b c) = ParserState x a b c
putLandin x (ParserState a _ b c) = ParserState a x b c
modIndents f (ParserState a b x c) = ParserState a b (f x) c
data Parser a = Parser (ParserState -> Either String (a, ParserState))
getParser (Parser p) = p
instance Functor Parser where fmap f x = pure f <*> x
instance Applicative Parser where
pure x = Parser \inp -> Right (x, inp)
(Parser f) <*> (Parser x) = Parser \inp -> do
(fun, t) <- f inp
(arg, u) <- x t
pure (fun arg, u)
instance Monad Parser where
return = pure
(Parser x) >>= f = Parser \inp -> do
(a, t) <- x inp
getParser (f a) t
instance Alternative Parser where
empty = bad ""
x <|> y = Parser \inp -> either (const $ getParser y inp) Right $ getParser x inp
getPrecs = Parser \st -> Right (precs st, st)
putPrecs ps = Parser \(ParserState a b c _) -> Right ((), ParserState a b c ps)
notFollowedBy p = do
saved <- Parser \pasta -> Right (pasta, pasta)
ret <- p *> pure (bad "") <|> pure (pure ())
Parser \_ -> Right ((), saved)
ret
parse f str = getParser f $ ParserState (rowcol str (1, 1)) [] [] $ singleton ":" (5, RAssoc) where
rowcol s rc = case s of
[] -> []
h:t -> (h, rc) : rowcol t (advanceRC (ord h) rc)
advanceRC n (r, c)
| n `elem` [10, 11, 12, 13] = (r + 1, 1)
| n == 9 = (r, (c + 8)`mod`8)
| True = (r, c + 1)
indentOf pasta = case readme pasta of
[] -> 1
(_, (_, c)):_ -> c
ins c pasta = putLandin (c:landin pasta) pasta
angle n pasta = case indents pasta of
m:ms | m == n -> ins ';' pasta
| n + 1 <= m -> ins '}' $ angle n $ modIndents tail pasta
_ -> pasta
curly n pasta = case indents pasta of
m:ms | m + 1 <= n -> ins '{' $ modIndents (n:) pasta
[] | 1 <= n -> ins '{' $ modIndents (n:) pasta
_ -> ins '{' . ins '}' $ angle n pasta
sat f = Parser \pasta -> case landin pasta of
c:t -> if f c then Right (c, putLandin t pasta) else Left "unsat"
[] -> case readme pasta of
[] -> case indents pasta of
[] -> Left "EOF"
m:ms | m /= 0 && f '}' -> Right ('}', modIndents tail pasta)
_ -> Left "unsat"
(h, _):t | f h -> let
p' = putReadme t pasta
in case h of
'}' -> case indents pasta of
0:ms -> Right (h, modIndents tail p')
_ -> Left "unsat"
'{' -> Right (h, modIndents (0:) p')
_ -> Right (h, p')
_ -> Left "unsat"
char c = sat (c ==)
rawSat f = Parser \pasta -> case readme pasta of
[] -> Left "EOF"
(h, _):t -> if f h then Right (h, putReadme t pasta) else Left "unsat"
eof = Parser \pasta -> case pasta of
ParserState [] [] _ _ -> Right ((), pasta)
_ -> badpos pasta "want eof"
comment = rawSat ('-' ==) *> some (rawSat ('-' ==)) *>
(rawSat isNewline <|> rawSat (not . isSymbol) *> many (rawSat $ not . isNewline) *> rawSat isNewline) *> pure True
spaces = isNewline <$> rawSat isSpace
whitespace = do
offside <- or <$> many (spaces <|> comment)
Parser \pasta -> Right ((), if offside then angle (indentOf pasta) pasta else pasta)
hexValue d
| d <= '9' = ord d - ord '0'
| d <= 'F' = 10 + ord d - ord 'A'
| d <= 'f' = 10 + ord d - ord 'a'
isNewline c = ord c `elem` [10, 11, 12, 13]
isSymbol = (`elem` "!#$%&*+./<=>?@\\^|-~:")
isSmall c = c <= 'z' && 'a' <= c || c == '_'
small = sat isSmall
large = sat \x -> (x <= 'Z') && ('A' <= x)
hexit = sat \x -> (x <= '9') && ('0' <= x)
|| (x <= 'F') && ('A' <= x)
|| (x <= 'f') && ('a' <= x)
digit = sat \x -> (x <= '9') && ('0' <= x)
decimal = foldl (\n d -> 10*n + ord d - ord '0') 0 <$> some digit
hexadecimal = foldl (\n d -> 16*n + hexValue d) 0 <$> some hexit
nameTailChar = small <|> large <|> digit <|> char '\''
nameTailed p = liftA2 (:) p $ many nameTailChar
escape = char '\\' *> (sat (`elem` "'\"\\") <|> char 'n' *> pure '\n' <|> char '0' *> pure (chr 0) <|> char 'x' *> (chr <$> hexadecimal))
tokOne delim = escape <|> rawSat (delim /=)
charSeq = mapM char
tokChar = between (char '\'') (char '\'') (tokOne '\'')
quoteStr = between (char '"') (char '"') $ many $ many (charSeq "\\&") *> tokOne '"'
quasiquoteStr = charSeq "[r|" *> quasiquoteBody
quasiquoteBody = charSeq "|]" *> pure [] <|> (:) <$> rawSat (const True) <*> quasiquoteBody
tokStr = quoteStr <|> quasiquoteStr
integer = char '0' *> (char 'x' <|> char 'X') *> hexadecimal <|> decimal
literal = lexeme . fmap E $ Const <$> integer <|> ChrCon <$> tokChar <|> StrCon <$> tokStr
varish = lexeme $ nameTailed small
bad s = Parser \pasta -> badpos pasta s
badpos pasta s = Left $ loc $ ": " ++ s where
loc = case readme pasta of
[] -> ("EOF"++)
(_, (r, c)):_ -> ("row "++) . shows r . (" col "++) . shows c
varId = do
s <- varish
if elem s
["export", "case", "class", "data", "default", "deriving", "do", "else", "foreign", "if", "import", "in", "infix", "infixl", "infixr", "instance", "let", "module", "newtype", "of", "then", "type", "where", "_"]
then bad $ "reserved: " ++ s else pure s
varSymish = lexeme $ (:) <$> sat (\c -> isSymbol c && c /= ':') <*> many (sat isSymbol)
varSym = lexeme $ do
s <- varSymish
if elem s ["..", "=", "\\", "|", "<-", "->", "@", "~", "=>"] then bad $ "reserved: " ++ s else pure s
conId = lexeme $ nameTailed large
conSymish = lexeme $ liftA2 (:) (char ':') $ many $ sat isSymbol
conSym = do
s <- conSymish
if elem s [":", "::"] then bad $ "reserved: " ++ s else pure s
special c = lexeme $ sat (c ==)
comma = special ','
semicolon = special ';'
lParen = special '('
rParen = special ')'
lBrace = special '{'
rBrace = special '}'
lSquare = special '['
rSquare = special ']'
backquote = special '`'
lexeme f = f <* whitespace
lexemePrelude = whitespace *>
Parser \pasta -> case getParser (res "module" <|> (:[]) <$> char '{') pasta of
Left _ -> Right ((), curly (indentOf pasta) pasta)
Right _ -> Right ((), pasta)
curlyCheck f = do
Parser \pasta -> Right ((), modIndents (0:) pasta)
r <- f
Parser \pasta -> let pasta' = modIndents tail pasta in case readme pasta of
[] -> Right ((), curly 0 pasta')
('{', _):_ -> Right ((), pasta')
(_, (_, col)):_ -> Right ((), curly col pasta')
pure r
conOf (Constr s _) = s
specialCase (h:_) = '|':conOf h
mkCase t cs = (specialCase cs,
( Qual [] $ arr t $ foldr arr (TV "case") $ map (\(Constr _ ts) -> foldr arr (TV "case") ts) cs
, E $ Basic "I"))
mkStrs = snd . foldl (\(s, l) u -> ('@':s, s:l)) ("@", [])
scottEncode _ ":" _ = E $ Basic "CONS"
scottEncode vs s ts = foldr L (foldl (\a b -> A a (V b)) (V s) ts) (ts ++ vs)
scottConstr t cs (Constr s ts) = (s,
(Qual [] $ foldr arr t ts , scottEncode (map conOf cs) s $ mkStrs ts))
mkAdtDefs t cs = mkCase t cs : map (scottConstr t cs) cs
mkFFIHelper n t acc = case t of
TC s -> acc
TAp (TC "IO") _ -> acc
TAp (TAp (TC "->") x) y -> L (show n) $ mkFFIHelper (n + 1) y $ A (V $ show n) acc
updateDcs cs dcs = foldr (\(Constr s _) m -> insert s cs m) dcs cs
addAdt t cs (Neat tycl fs typed dcs ffis ffes ims) =
Neat tycl fs (mkAdtDefs t cs ++ typed) (updateDcs cs dcs) ffis ffes ims
emptyTycl = Tycl [] []
addClass classId v (sigs, defs) (Neat tycl fs typed dcs ffis ffes ims) = let
vars = take (size sigs) $ show <$> [0..]
selectors = zipWith (\var (s, t) -> (s, (Qual [Pred classId v] t,
L "@" $ A (V "@") $ foldr L (V var) vars))) vars $ toAscList sigs
defaults = map (\(s, t) -> if member s sigs then ("{default}" ++ s, t) else error $ "bad default method: " ++ s) $ toAscList defs
Tycl ms is = maybe emptyTycl id $ mlookup classId tycl
tycl' = insert classId (Tycl (keys sigs) is) tycl
in if null ms then Neat tycl' (defaults ++ fs) (selectors ++ typed) dcs ffis ffes ims
else error $ "duplicate class: " ++ classId
addInstance classId ps ty ds (Neat tycl fs typed dcs ffis ffes ims) = let
Tycl ms is = maybe emptyTycl id $ mlookup classId tycl
tycl' = insert classId (Tycl ms $ Instance ty name ps (fromList ds):is) tycl
name = '{':classId ++ (' ':shows ty "}")
in Neat tycl' fs typed dcs ffis ffes ims
addFFI foreignname ourname t (Neat tycl fs typed dcs ffis ffes ims) = let
fn = A (E $ Basic "F") $ E $ Const $ length ffis
in Neat tycl fs ((ourname, (Qual [] t, mkFFIHelper 0 t fn)) : typed) dcs ((foreignname, t):ffis) ffes ims
addDefs ds (Neat tycl fs typed dcs ffis ffes ims) = Neat tycl (ds ++ fs) typed dcs ffis ffes ims
addImport im (Neat tycl fs typed dcs ffis exs ims) = Neat tycl fs typed dcs ffis exs (im:ims)
addExport e f (Neat tycl fs typed dcs ffis ffes ims) = Neat tycl fs typed dcs ffis ((e, f):ffes) ims
parseErrorRule = Parser \pasta -> case indents pasta of
m:ms | m /= 0 -> Right ('}', modIndents tail pasta)
_ -> badpos pasta "missing }"
res w@(h:_) = reservedSeq *> pure w <|> bad ("want \"" ++ w ++ "\"") where
reservedSeq = if elem w ["let", "where", "do", "of"]
then curlyCheck $ lexeme $ charSeq w *> notFollowedBy nameTailChar
else lexeme $ charSeq w *> notFollowedBy (if isSmall h then nameTailChar else sat isSymbol)
paren = between lParen rParen
braceSep f = between lBrace (rBrace <|> parseErrorRule) $ foldr ($) [] <$> sepBy ((:) <$> f <|> pure id) semicolon
maybeFix s x = if elem s $ fvPro [] x then A (V "fix") (L s x) else x
nonemptyTails [] = []
nonemptyTails xs@(x:xt) = xs : nonemptyTails xt
joinIsFail t = A (L "join#" t) (V "fail#")
addLets ls x = foldr triangle x components where
vs = fst <$> ls
ios = foldr (\(s, dsts) (ins, outs) ->
(foldr (\dst -> insertWith union dst [s]) ins dsts, insertWith union s dsts outs))
(Tip, Tip) $ map (\(s, t) -> (s, intersect (fvPro [] t) vs)) ls
components = scc (\k -> maybe [] id $ mlookup k $ fst ios) (\k -> maybe [] id $ mlookup k $ snd ios) vs
triangle names expr = let
tnames = nonemptyTails names
suball t = foldr (\(x:xt) t -> overFreePro x (const $ foldl (\acc s -> A acc (V s)) (V x) xt) t) t tnames
insLams vs t = foldr L t vs
in foldr (\(x:xt) t -> A (L x t) $ maybeFix x $ insLams xt $ suball $ maybe undefined joinIsFail $ lookup x ls) (suball expr) tnames
data Assoc = NAssoc | LAssoc | RAssoc
instance Eq Assoc where
NAssoc == NAssoc = True
LAssoc == LAssoc = True
RAssoc == RAssoc = True
_ == _ = False
precOf s precTab = maybe 9 fst $ mlookup s precTab
assocOf s precTab = maybe LAssoc snd $ mlookup s precTab
opFold precTab f x xs = case xs of
[] -> pure x
(op, y):xt -> case find (\(op', _) -> assocOf op precTab /= assocOf op' precTab) xt of
Nothing -> case assocOf op precTab of
NAssoc -> case xt of
[] -> pure $ f op x y
y:yt -> bad "NAssoc repeat"
LAssoc -> pure $ foldl (\a (op, y) -> f op a y) x xs
RAssoc -> pure $ foldr (\(op, y) b -> \e -> f op e (b y)) id xs $ x
Just y -> bad "Assoc clash"
qconop = conSym <|> res ":" <|> between backquote backquote conId
qconsym = conSym <|> res ":"
op = qconsym <|> varSym <|> between backquote backquote (conId <|> varId)
con = conId <|> paren qconsym
var = varId <|> paren varSym
tycon = do
s <- conId
pure $ if s == "String" then TAp (TC "[]") (TC "Char") else TC s
aType =
lParen *>
( rParen *> pure (TC "()")
<|> (foldr1 (TAp . TAp (TC ",")) <$> sepBy1 _type comma) <* rParen)
<|> tycon
<|> TV <$> varId
<|> (lSquare *> (rSquare *> pure (TC "[]") <|> TAp (TC "[]") <$> (_type <* rSquare)))
bType = foldl1 TAp <$> some aType
_type = foldr1 arr <$> sepBy bType (res "->")
fixityDecl w a = do
res w
n <- lexeme integer
os <- sepBy op comma
precs <- getPrecs
putPrecs $ foldr (\o m -> insert o (n, a) m) precs os
fixity = fixityDecl "infix" NAssoc <|> fixityDecl "infixl" LAssoc <|> fixityDecl "infixr" RAssoc
cDecls = first fromList . second fromList . foldr ($) ([], []) <$> braceSep cDecl
cDecl = first . (:) <$> genDecl <|> second . (++) <$> defSemi
genDecl = (,) <$> var <*> (res "::" *> _type)
classDecl = res "class" *> (addClass <$> conId <*> (TV <$> varId) <*> (res "where" *> cDecls))
simpleClass = Pred <$> conId <*> _type
scontext = (:[]) <$> simpleClass <|> paren (sepBy simpleClass comma)
instDecl = res "instance" *>
((\ps cl ty defs -> addInstance cl ps ty defs) <$>
(scontext <* res "=>" <|> pure [])
<*> conId <*> _type <*> (res "where" *> braceDef))
letin = addLets <$> between (res "let") (res "in") braceDef <*> expr
ifthenelse = (\a b c -> A (A (A (V "if") a) b) c) <$>
(res "if" *> expr) <*> (res "then" *> expr) <*> (res "else" *> expr)
listify = foldr (\h t -> A (A (V ":") h) t) (V "[]")
alts = joinIsFail . Pa <$> braceSep ((\x y -> ([x], y)) <$> pat <*> guards "->")
cas = flip A <$> between (res "case") (res "of") expr <*> alts
lamCase = curlyCheck (res "case") *> alts
lam = res "\\" *> (lamCase <|> liftA2 onePat (some apat) (res "->" *> expr))
flipPairize y x = A (A (V ",") x) y
moreCommas = foldr1 (A . A (V ",")) <$> sepBy1 expr comma
thenComma = comma *> ((flipPairize <$> moreCommas) <|> pure (A (V ",")))
parenExpr = (&) <$> expr <*> (((\v a -> A (V v) a) <$> op) <|> thenComma <|> pure id)
rightSect = ((\v a -> L "@" $ A (A (V v) $ V "@") a) <$> (op <|> (:"") <$> comma)) <*> expr
section = lParen *> (parenExpr <* rParen <|> rightSect <* rParen <|> rParen *> pure (V "()"))
maybePureUnit = maybe (V "pure" `A` V "()") id
stmt = (\p x -> Just . A (V ">>=" `A` x) . onePat [p] . maybePureUnit) <$> pat <*> (res "<-" *> expr)
<|> (\x -> Just . maybe x (\y -> (V ">>=" `A` x) `A` (L "_" y))) <$> expr
<|> (\ds -> Just . addLets ds . maybePureUnit) <$> (res "let" *> braceDef)
doblock = res "do" *> (maybePureUnit . foldr ($) Nothing <$> braceSep stmt)
compQual =
(\p xs e -> A (A (V "concatMap") $ onePat [p] e) xs)
<$> pat <*> (res "<-" *> expr)
<|> (\b e -> A (A (A (V "if") b) e) $ V "[]") <$> expr
<|> addLets <$> (res "let" *> braceDef)
sqExpr = between lSquare rSquare $
((&) <$> expr <*>
( res ".." *>
( (\hi lo -> (A (A (V "enumFromTo") lo) hi)) <$> expr
<|> pure (A (V "enumFrom"))
)
<|> res "|" *>
((. A (V "pure")) . foldr (.) id <$> sepBy1 compQual comma)
<|> (\t h -> listify (h:t)) <$> many (comma *> expr)
)
)
<|> pure (V "[]")
atom = ifthenelse <|> doblock <|> letin <|> sqExpr <|> section
<|> cas <|> lam <|> (paren comma *> pure (V ","))
<|> V <$> (con <|> var) <|> literal
aexp = foldl1 A <$> some atom
withPrec precTab n p = p >>= \s ->
if n == precOf s precTab then pure s else Parser $ const $ Left ""
exprP n = if n <= 9
then getPrecs >>= \precTab
-> exprP (succ n) >>= \a
-> many ((,) <$> withPrec precTab n op <*> exprP (succ n)) >>= \as
-> opFold precTab (\op x y -> A (A (V op) x) y) a as
else aexp
expr = exprP 0
gcon = conId <|> paren (qconsym <|> (:"") <$> comma) <|> (lSquare *> rSquare *> pure "[]")
apat = PatVar <$> var <*> (res "@" *> (Just <$> apat) <|> pure Nothing)
<|> flip PatVar Nothing <$> (res "_" *> pure "_")
<|> flip PatCon [] <$> gcon
<|> PatLit <$> literal
<|> foldr (\h t -> PatCon ":" [h, t]) (PatCon "[]" [])
<$> between lSquare rSquare (sepBy pat comma)
<|> paren (foldr1 pairPat <$> sepBy1 pat comma <|> pure (PatCon "()" []))
where pairPat x y = PatCon "," [x, y]
binPat f x y = PatCon f [x, y]
patP n = if n <= 9
then getPrecs >>= \precTab
-> patP (succ n) >>= \a
-> many ((,) <$> withPrec precTab n qconop <*> patP (succ n)) >>= \as
-> opFold precTab binPat a as
else PatCon <$> gcon <*> many apat <|> apat
pat = patP 0
maybeWhere p = (&) <$> p <*> (res "where" *> (addLets <$> braceDef) <|> pure id)
guards s = maybeWhere $ res s *> expr <|> foldr ($) (V "join#") <$> some ((\x y -> case x of
V "True" -> \_ -> y
_ -> A (A (A (V "if") x) y)
) <$> (res "|" *> expr) <*> (res s *> expr))
onePat vs x = joinIsFail $ Pa [(vs, x)]
defOnePat vs x = Pa [(vs, x)]
opDef x f y rhs = [(f, defOnePat [x, y] rhs)]
leftyPat p expr = case pvars of
[] -> []
(h:t) -> let gen = '@':h in
(gen, expr):map (\v -> (v, A (Pa [([p], V v)]) $ V gen)) pvars
where
pvars = filter (/= "_") $ patVars p
def = liftA2 (\l r -> [(l, r)]) var (liftA2 defOnePat (many apat) $ guards "=")
<|> (pat >>= \x -> opDef x <$> varSym <*> pat <*> guards "=" <|> leftyPat x <$> guards "=")
coalesce = \case
[] -> []
h@(s, x):t -> case t of
[] -> [h]
(s', x'):t' -> let
f (Pa vsts) (Pa vsts') = Pa $ vsts ++ vsts'
f _ _ = error "bad multidef"
in if s == s' then coalesce $ (s, f x x'):t' else h:coalesce t
defSemi = coalesce . concat <$> sepBy1 def (some semicolon)
braceDef = concat <$> braceSep defSemi
simpleType c vs = foldl TAp (TC c) (map TV vs)
conop = conSym <|> between backquote backquote conId
constr = (\x c y -> Constr c [x, y]) <$> aType <*> conop <*> aType
<|> Constr <$> conId <*> many aType
adt = addAdt <$> between (res "data") (res "=") (simpleType <$> conId <*> many varId) <*> sepBy constr (res "|")
impDecl = addImport <$> (res "import" *> conId)
topdecls = braceSep
$ adt
<|> classDecl
<|> instDecl
<|> res "foreign" *>
( res "import" *> var *> (addFFI <$> lexeme tokStr <*> var <*> (res "::" *> _type))
<|> res "export" *> var *> (addExport <$> lexeme tokStr <*> var)
)
<|> addDefs <$> defSemi
<|> fixity *> pure id
<|> impDecl
haskell = between lexemePrelude eof $ some $ (,) <$> (res "module" *> conId <* res "where" <|> pure "Main") <*> topdecls
parseProgram s = fst <$> parse haskell s
-- Primitives.
primAdts =
[ (TC "()", [Constr "()" []])
, (TC "Bool", [Constr "True" [], Constr "False" []])
, (TAp (TC "[]") (TV "a"), [Constr "[]" [], Constr ":" [TV "a", TAp (TC "[]") (TV "a")]])
, (TAp (TAp (TC ",") (TV "a")) (TV "b"), [Constr "," [TV "a", TV "b"]])
]
prims = let
ro = E . Basic
dyad s = TC s `arr` (TC s `arr` TC s)
bin s = A (ro "Q") (ro s)
in map (second (first $ Qual [])) $
[ ("intEq", (arr (TC "Int") (arr (TC "Int") (TC "Bool")), bin "EQ"))
, ("intLE", (arr (TC "Int") (arr (TC "Int") (TC "Bool")), bin "LE"))
, ("charEq", (arr (TC "Char") (arr (TC "Char") (TC "Bool")), bin "EQ"))
, ("charLE", (arr (TC "Char") (arr (TC "Char") (TC "Bool")), bin "LE"))
, ("fix", (arr (arr (TV "a") (TV "a")) (TV "a"), ro "Y"))
, ("if", (arr (TC "Bool") $ arr (TV "a") $ arr (TV "a") (TV "a"), ro "I"))
, ("chr", (arr (TC "Int") (TC "Char"), ro "I"))
, ("ord", (arr (TC "Char") (TC "Int"), ro "I"))
, ("ioBind", (arr (TAp (TC "IO") (TV "a")) (arr (arr (TV "a") (TAp (TC "IO") (TV "b"))) (TAp (TC "IO") (TV "b"))), ro "C"))
, ("ioPure", (arr (TV "a") (TAp (TC "IO") (TV "a")), ro "V"))
, ("primitiveError", (arr (TAp (TC "[]") (TC "Char")) (TV "a"), ro "ERR"))
, ("newIORef", (arr (TV "a") (TAp (TC "IO") (TAp (TC "IORef") (TV "a"))), ro "NEWREF"))
, ("readIORef", (arr (TAp (TC "IORef") (TV "a")) (TAp (TC "IO") (TV "a")),
A (ro "T") (ro "READREF")))
, ("writeIORef", (arr (TAp (TC "IORef") (TV "a")) (arr (TV "a") (TAp (TC "IO") (TC "()"))),
A (A (ro "R") (ro "WRITEREF")) (ro "B")))
, ("exitSuccess", (TAp (TC "IO") (TV "a"), ro "END"))
, ("unsafePerformIO", (arr (TAp (TC "IO") (TV "a")) (TV "a"), A (A (ro "C") (A (ro "T") (ro "END"))) (ro "K")))
, ("join#", (TV "a", A (V "unsafePerformIO") (V "exitSuccess")))
, ("fail#", (TV "a", A (V "unsafePerformIO") (V "exitSuccess")))
]
++ map (\(s, v) -> (s, (dyad "Int", bin v)))
[ ("intAdd", "ADD")
, ("intSub", "SUB")
, ("intMul", "MUL")
, ("intDiv", "DIV")
, ("intMod", "MOD")
, ("intQuot", "DIV")
, ("intRem", "MOD")
]
-- Conversion to De Bruijn indices.
data LC = Ze | Su LC | Pass IntTree | La LC | App LC LC
debruijn n e = case e of
E x -> Pass $ Lf x
V v -> maybe (Pass $ LfVar v) id $
foldr (\h found -> if h == v then Just Ze else Su <$> found) Nothing n
A x y -> App (debruijn n x) (debruijn n y)
L s t -> La (debruijn (s:n) t)
-- Kiselyov bracket abstraction.
data IntTree = Lf Extra | LfVar String | Nd IntTree IntTree
data Sem = Defer | Closed IntTree | Need Sem | Weak Sem
lf = Lf . Basic
x ## y = case x of
Defer -> case y of
Defer -> Need $ Closed (Nd (Nd (lf "S") (lf "I")) (lf "I"))
Closed d -> Need $ Closed (Nd (lf "T") d)
Need e -> Need $ Closed (Nd (lf "S") (lf "I")) ## e
Weak e -> Need $ Closed (lf "T") ## e
Closed d -> case y of
Defer -> Need $ Closed d
Closed dd -> Closed $ Nd d dd
Need e -> Need $ Closed (Nd (lf "B") d) ## e
Weak e -> Weak $ Closed d ## e
Need e -> case y of
Defer -> Need $ Closed (lf "S") ## e ## Closed (lf "I")
Closed d -> Need $ Closed (Nd (lf "R") d) ## e
Need ee -> Need $ Closed (lf "S") ## e ## ee
Weak ee -> Need $ Closed (lf "C") ## e ## ee
Weak e -> case y of
Defer -> Need e
Closed d -> Weak $ e ## Closed d
Need ee -> Need $ Closed (lf "B") ## e ## ee
Weak ee -> Weak $ e ## ee
babs t = case t of
Ze -> Defer
Su x -> Weak $ babs x
Pass x -> Closed x
La t -> case babs t of
Defer -> Closed $ lf "I"
Closed d -> Closed $ Nd (lf "K") d
Need e -> e
Weak e -> Closed (lf "K") ## e
App x y -> babs x ## babs y
nolam x = (\(Closed d) -> d) $ babs $ debruijn [] x
optim t = case t of
Nd x y -> go (optim x) (optim y)
_ -> t
where
go (Lf (Basic "I")) q = q
go p q@(Lf (Basic c)) = case c of
"K" -> case p of
Lf (Basic "B") -> lf "BK"
_ -> Nd p q
"I" -> case p of
Lf (Basic r) -> case r of
"C" -> lf "T"
"B" -> lf "I"
"K" -> lf "KI"
_ -> Nd p q
Nd p1 p2 -> case p1 of
Lf (Basic "B") -> p2
Lf (Basic "R") -> Nd (lf "T") p2
_ -> Nd (Nd p1 p2) q
_ -> Nd p q
"T" -> case p of
Nd (Lf (Basic "B")) (Lf (Basic r)) -> case r of
"C" -> lf "V"
"BK" -> lf "LEFT"
_ -> Nd p q
_ -> Nd p q
"V" -> case p of
Nd (Lf (Basic "B")) (Lf (Basic "BK")) -> lf "CONS"
_ -> Nd p q
_ -> Nd p q
go p q = Nd p q
app01 s x y = maybe (A (L s x) y) snd $ go x where
go expr = case expr of
E _ -> Just (False, expr)
V v -> Just $ if s == v then (True, y) else (False, expr)
A l r -> do
(a, l') <- go l
(b, r') <- go r
if a && b then Nothing else pure (a || b, A l' r')
L v t -> if v == s then Just (False, expr) else second (L v) <$> go t
optiApp t = case t of
A x y -> let
x' = optiApp x
y' = optiApp y
in case x' of
L s v -> app01 s v y'
_ -> A x' y'
L s x -> L s (optiApp x)
_ -> t
-- Pattern compiler.
rewritePats dcs = \case
[] -> pure $ V "join#"
vsxs@((as0, _):_) -> case as0 of
[] -> pure $ foldr1 (A . L "join#") $ snd <$> vsxs
_ -> do
let k = length as0
n <- get
put $ n + k
let vs@(vh:vt) = take k $ (`shows` "#") <$> [n..]
cs <- flip mapM vsxs \(a:at, x) -> (a,) <$> foldM (\b (p, v) -> rewriteCase dcs v Tip [(p, b)]) x (zip at vt)
flip (foldr L) vs <$> rewriteCase dcs vh Tip cs
patEq lit b x y = A (L "join#" $ A (A (A (V "if") (A (A (V "==") lit) b)) x) $ V "join#") y
rewriteCase dcs caseVar tab = \case
[] -> flush $ V "join#"
((v, x):rest) -> go v x rest
where
rec = rewriteCase dcs caseVar
go v x rest = case v of
PatLit lit -> patEq lit (V caseVar) x <$> rec Tip rest >>= flush
PatVar s m -> let x' = beta s (V caseVar) x in case m of
Nothing -> A (L "join#" x') <$> rec Tip rest >>= flush
Just v' -> go v' x' rest
PatCon con args -> rec (insertWith (flip (.)) con ((args, x):) tab) rest
flush onFail = case toAscList tab of
[] -> pure onFail
-- TODO: Check rest of `tab` lies in cs.
(firstC, _):_ -> do
let cs = maybe undefined id $ dcs firstC
jumpTable <- mapM (\(Constr s ts) -> case mlookup s tab of
Nothing -> pure $ foldr L (V "join#") $ const "_" <$> ts
Just f -> rewritePats dcs $ f []
) cs
pure $ A (L "join#" $ foldl A (A (V $ specialCase cs) $ V caseVar) jumpTable) onFail
secondM f (a, b) = (a,) <$> f b
patternCompile dcs t = optiApp $ evalState (go t) 0 where
go t = case t of
E _ -> pure t
V _ -> pure t
A x y -> liftA2 A (go x) (go y)
L s x -> L s <$> go x
Pa vsxs -> mapM (secondM go) vsxs >>= rewritePats dcs
-- Unification and matching.
apply sub t = case t of
TC v -> t
TV v -> maybe t id $ lookup v sub
TAp a b -> TAp (apply sub a) (apply sub b)
(@@) s1 s2 = map (second (apply s1)) s2 ++ s1
occurs s t = case t of
TC v -> False
TV v -> s == v
TAp a b -> occurs s a || occurs s b
varBind s t = case t of
TC v -> Right [(s, t)]
TV v -> Right $ if v == s then [] else [(s, t)]
TAp a b -> if occurs s t then Left "occurs check" else Right [(s, t)]
ufail t u = Left $ ("unify fail: "++) . shows t . (" vs "++) . shows u $ ""
mgu t u = case t of
TC a -> case u of
TC b -> if a == b then Right [] else ufail t u
TV b -> varBind b t
TAp a b -> ufail t u
TV a -> varBind a u
TAp a b -> case u of
TC b -> ufail t u
TV b -> varBind b t
TAp c d -> mgu a c >>= unify b d
unify a b s = (@@ s) <$> mgu (apply s a) (apply s b)
merge s1 s2 = if all (\v -> apply s1 (TV v) == apply s2 (TV v))
$ map fst s1 `intersect` map fst s2 then Just $ s1 ++ s2 else Nothing
match h t = case h of
TC a -> case t of
TC b | a == b -> Just []
_ -> Nothing
TV a -> Just [(a, t)]
TAp a b -> case t of
TAp c d -> case match a c of
Nothing -> Nothing
Just ac -> case match b d of
Nothing -> Nothing
Just bd -> merge ac bd
_ -> Nothing
-- Type inference.
instantiate' t n tab = case t of
TC s -> ((t, n), tab)
TV s -> case lookup s tab of
Nothing -> let va = TV $ show n in ((va, n + 1), (s, va):tab)
Just v -> ((v, n), tab)
TAp x y -> let
((t1, n1), tab1) = instantiate' x n tab
((t2, n2), tab2) = instantiate' y n1 tab1
in ((TAp t1 t2, n2), tab2)
instantiatePred (Pred s t) ((out, n), tab) = first (first ((:out) . Pred s)) (instantiate' t n tab)
instantiate (Qual ps t) n = first (Qual ps1) $ fst $ instantiate' t n1 tab where
((ps1, n1), tab) = foldr instantiatePred (([], n), []) ps
proofApply sub a = case a of
Proof (Pred cl ty) -> Proof (Pred cl $ apply sub ty)
A x y -> A (proofApply sub x) (proofApply sub y)
L s t -> L s $ proofApply sub t
_ -> a
typeAstSub sub (t, a) = (apply sub t, proofApply sub a)
infer typed loc ast csn@(cs, n) = case ast of
E x -> Right $ case x of
Const _ -> ((TC "Int", ast), csn)
ChrCon _ -> ((TC "Char", ast), csn)
StrCon _ -> ((TAp (TC "[]") (TC "Char"), ast), csn)
Link im s q -> insta q
V s -> maybe (Left $ "undefined: " ++ s) Right
$ (\t -> ((t, ast), csn)) <$> lookup s loc
<|> insta . fst <$> mlookup s typed
A x y -> infer typed loc x (cs, n + 1) >>=
\((tx, ax), csn1) -> infer typed loc y csn1 >>=
\((ty, ay), (cs2, n2)) -> unify tx (arr ty va) cs2 >>=
\cs -> Right ((va, A ax ay), (cs, n2))
L s x -> first (\(t, a) -> (arr va t, L s a)) <$> infer typed ((s, va):loc) x (cs, n + 1)
where
va = TV $ show n
insta ty = ((ty1, foldl A ast (map Proof preds)), (cs, n1))
where (Qual preds ty1, n1) = instantiate ty n
findInstance tycl qn@(q, n) p@(Pred cl ty) insts = case insts of
[] -> let v = '*':show n in Right (((p, v):q, n + 1), V v)
(modName, Instance h name ps _):rest -> case match h ty of
Nothing -> findInstance tycl qn p rest
Just subs -> foldM (\(qn1, t) (Pred cl1 ty1) -> second (A t)
<$> findProof tycl (Pred cl1 $ apply subs ty1) qn1) (qn, if modName == "" then V name else E $ Link modName name undefined) ps
findProof tycl pred@(Pred classId t) psn@(ps, n) = case lookup pred ps of
Nothing -> findInstance tycl psn pred $ tycl classId
Just s -> Right (psn, V s)
prove tycl psn a = case a of
Proof pred -> findProof tycl pred psn
A x y -> prove tycl psn x >>= \(psn1, x1) ->
second (A x1) <$> prove tycl psn1 y
L s t -> second (L s) <$> prove tycl psn t
_ -> Right (psn, a)
data Dep a = Dep ([String] -> Either String ([String], a))
instance Functor Dep where
fmap f = \(Dep mf) -> Dep \g -> do
(g', x) <- mf g
pure (g', f x)
instance Applicative Dep where
pure x = Dep \g -> Right (g, x)
(Dep mf) <*> (Dep mx) = Dep \g -> do
(g', f) <- mf g
(g'', x) <- mx g'
pure (g'', f x)
addDep s = Dep \deps -> Right (if s `elem` deps then deps else s : deps, ())
badDep s = Dep $ const $ Left s
runDep (Dep f) = f []
astLink typed locals imps mods ast = runDep $ go [] ast where
go bound ast = case ast of
V s
| elem s bound -> pure ast
| member s locals -> case findImportSym imps mods s of
[] -> (if member s typed then pure () else addDep s) *> pure ast
_ -> badDep $ "ambiguous: " ++ s
| True -> case findImportSym imps mods s of
[] -> badDep $ "missing: " ++ s
[(im, t)] -> pure $ E $ Link im s t
_ -> badDep $ "ambiguous: " ++ s
A x y -> A <$> go bound x <*> go bound y
L s t -> L s <$> go (s:bound) t
_ -> pure ast
depthFirstSearch = (foldl .) \relation st@(visited, sequence) vertex ->
if vertex `elem` visited then st else second (vertex:)
$ depthFirstSearch relation (vertex:visited, sequence) (relation vertex)
spanningSearch = (foldl .) \relation st@(visited, setSequence) vertex ->
if vertex `elem` visited then st else second ((:setSequence) . (vertex:))
$ depthFirstSearch relation (vertex:visited, []) (relation vertex)
scc ins outs = spanning . depthFirst where
depthFirst = snd . depthFirstSearch outs ([], [])
spanning = snd . spanningSearch ins ([], [])
forFree cond f bound t = case t of
E _ -> t
V s -> if (not $ s `elem` bound) && cond s then f t else t
A x y -> A (rec bound x) (rec bound y)
L s t' -> L s $ rec (s:bound) t'
where rec = forFree cond f
inferno tycl typed defmap syms = let
loc = zip syms $ TV . (' ':) <$> syms
principal (acc, (subs, n)) s = do
expr <- maybe (Left $ "missing: " ++ s) Right (mlookup s defmap)
((t, a), (ms, n1)) <- infer typed loc expr (subs, n)
cs <- unify (TV (' ':s)) t ms
Right ((s, (t, a)):acc, (cs, n1))
gatherPreds (acc, psn) (s, (t, a)) = do
(psn, a) <- prove tycl psn a
pure ((s, (t, a)):acc, psn)
in do
(stas, (soln, _)) <- foldM principal ([], ([], 0)) syms
stas <- pure $ second (typeAstSub soln) <$> stas
(stas, (ps, _)) <- foldM gatherPreds ([], ([], 0)) $ second (typeAstSub soln) <$> stas
let
preds = fst <$> ps
dicts = snd <$> ps
applyDicts (s, (t, a)) = (s, (Qual preds t,
foldr L (forFree (`elem` syms) (\t -> foldl A t $ V <$> dicts) [] a) dicts))
pure $ map applyDicts stas
findImportSym imps mods s = concat [maybe [] (\(t, _) -> [(im, t)]) $ mlookup s qas | im <- imps, let qas = fst $ mods ! im]
inferDefs tycl defs typed = do
let
insertUnique m (s, (_, t)) = case mlookup s m of
Nothing -> case mlookup s typed of
Nothing -> Right $ insert s t m
_ -> Left $ "reserved: " ++ s
_ -> Left $ "duplicate: " ++ s
addEdges (sym, (deps, _)) (ins, outs) = (foldr (\dep es -> insertWith union dep [sym] es) ins deps, insertWith union sym deps outs)
graph = foldr addEdges (Tip, Tip) defs
defmap <- foldM insertUnique Tip defs
let
ins k = maybe [] id $ mlookup k $ fst graph
outs k = maybe [] id $ mlookup k $ snd graph
typeTab = fst <$> typed
inferComponent typed syms = foldr (uncurry insert) typed <$> inferno tycl typed defmap syms
foldM inferComponent typed $ scc ins outs $ keys defmap
dictVars ps n = (zip ps $ map (('*':) . show) [n..], n + length ps)
inferTypeclasses tycl typeOfMethod typed dcs linker ienv = foldM perClass typed $ toAscList ienv where
perClass typed (classId, Tycl sigs insts) = foldM perInstance typed insts where
perInstance typed (Instance ty name ps idefs) = do
let
dvs = map snd $ fst $ dictVars ps 0
perMethod s = do
let rawExpr = maybe (V $ "{default}" ++ s) id $ mlookup s idefs
expr <- snd <$> linker (patternCompile dcs rawExpr)
(ta, (sub, n)) <- either (Left . (name++) . (" "++) . (s++) . (": "++)) Right
$ infer typed [] expr ([], 0)
let
(tx, ax) = typeAstSub sub ta
-- e.g. qc = Eq a => a -> a -> Bool
-- We instantiate: Eq a1 => a1 -> a1 -> Bool.
qc = typeOfMethod s
(Qual [Pred _ headT] tc, n1) = instantiate qc n
-- Mix the predicates `ps` with the type of `headT`, applying a
-- substitution such as (a1, [a]) so the variable names match.
-- e.g. Eq a => [a] -> [a] -> Bool
Just subc = match headT ty
(Qual ps2 t2, n2) = instantiate (Qual ps $ apply subc tc) n1
case match tx t2 of
Nothing -> Left "class/instance type conflict"
Just subx -> do
((ps3, _), tr) <- prove tycl (dictVars ps2 0) (proofApply subx ax)
if length ps2 /= length ps3
then Left $ ("want context: "++) . (foldr (.) id $ shows . fst <$> ps3) $ name
else pure tr
ms <- mapM perMethod sigs
pure $ insert name (Qual [] $ TC "DICTIONARY", flip (foldr L) dvs $ L "@" $ foldl A (V "@") ms) typed
neatNew = Neat Tip [] [] Tip [] [] []
neatPrim = foldr (uncurry addAdt) (Neat Tip [] prims Tip [] [] []) primAdts
typedAsts (Neat _ _ tas _ _ _ _) = tas
typeclasses (Neat tcs _ _ _ _ _ _) = tcs
dataCons (Neat _ _ _ dcs _ _ _) = dcs
soloPrim = singleton "#" (fromList $ typedAsts neatPrim, ([], []))
tabulateModules mods = foldM ins (singleton "#" neatPrim) mods where
go = foldr ($) neatNew
ins tab (k, prog) = case mlookup k tab of
Nothing -> Right $ insert k (go prog) tab
Just _ -> Left $ "duplicate module: " ++ k
null xs = case xs of
[] -> True
_ -> False
inferModule tab acc name = case mlookup name acc of
Nothing -> do
let
Neat rawIenv defs typedList adtTab ffis ffes rawImps = tab ! name
typed = fromList typedList
fillSigs (cl, Tycl sigs is) = (cl,) $ case sigs of
[] -> Tycl (findSigs cl) is
_ -> Tycl sigs is
findSigs cl = maybe (error $ "no sigs: " ++ cl) id $ find (not . null) [maybe [] (\(Tycl sigs _) -> sigs) $ mlookup cl $ typeclasses (tab ! im) | im <- imps]
ienv = fromList $ fillSigs <$> toAscList rawIenv
imps = "#":rawImps
locals = fromList $ map (, ()) $ (fst <$> typedList) ++ (fst <$> defs)
insts im (Tycl _ is) = (im,) <$> is
classes im = if im == "" then ienv else typeclasses $ tab ! im
tycl classId = concat [maybe [] (insts im) $ mlookup classId $ classes im | im <- "":imps]
dcs s = foldr (<|>) (mlookup s adtTab) $ map (\im -> mlookup s $ dataCons $ tab ! im) imps
typeOfMethod s = maybe undefined id $ foldr (<|>) (fst <$> mlookup s typed) [fmap fst $ lookup s $ typedAsts $ tab ! im | im <- imps]
acc' <- foldM (inferModule tab) acc imps
let linker = astLink typed locals imps acc'
depdefs <- mapM (\(s, t) -> (s,) <$> linker (patternCompile dcs t)) defs
typed <- inferDefs tycl depdefs typed
typed <- inferTypeclasses tycl typeOfMethod typed dcs linker ienv
Right $ insert name (typed, (ffis, ffes)) acc'
Just _ -> Right acc
untangle s = do
tab <- parseProgram s >>= tabulateModules
foldM (inferModule tab) soloPrim $ keys tab
optiComb' (subs, combs) (s, lamb) = let
gosub t = case t of
LfVar v -> maybe t id $ lookup v subs
Nd a b -> Nd (gosub a) (gosub b)
_ -> t
c = optim $ gosub $ nolam $ optiApp lamb
combs' = combs . ((s, c):)
in case c of
Lf (Basic _) -> ((s, c):subs, combs')
LfVar v -> if v == s then (subs, combs . ((s, Nd (lf "Y") (lf "I")):)) else ((s, gosub c):subs, combs')
_ -> (subs, combs')
optiComb lambs = ($[]) . snd $ foldl optiComb' ([], id) lambs
instance Show Type where
showsPrec _ = \case
TC s -> (s++)
TV s -> (s++)
TAp (TAp (TC "->") a) b -> showParen True $ shows a . (" -> "++) . shows b
TAp a b -> showParen True $ shows a . (' ':) . shows b
instance Show Pred where
showsPrec _ (Pred s t) = (s++) . (' ':) . shows t . (" => "++)
instance Show Qual where
showsPrec _ (Qual ps t) = foldr (.) id (map shows ps) . shows t
instance Show Extra where
showsPrec _ = \case
Basic s -> (s++)
Const i -> shows i
ChrCon c -> shows c
StrCon s -> shows s
Link im s _ -> (im++) . ('.':) . (s++)
instance Show Pat where
showsPrec _ = \case
PatLit e -> shows e
PatVar s mp -> (s++) . maybe id ((('@':) .) . shows) mp
PatCon s ps -> (s++) . foldr (.) id (((' ':) .) . shows <$> ps)
showVar s@(h:_) = showParen (elem h ":!#$%&*+./<=>?@\\^|-~") (s++)
instance Show Ast where
showsPrec prec = \case
E e -> shows e
V s -> showVar s
A x y -> showParen (1 <= prec) $ shows x . (' ':) . showsPrec 1 y
L s t -> showParen True $ ('\\':) . (s++) . (" -> "++) . shows t
Pa vsts -> ('\\':) . showParen True (foldr (.) id $ intersperse (';':) $ map (\(vs, t) -> foldr (.) id (intersperse (' ':) $ map (showParen True . shows) vs) . (" -> "++) . shows t) vsts)
Proof p -> ("{Proof "++) . shows p . ("}"++)
instance Show IntTree where
showsPrec prec = \case
LfVar s -> showVar s
Lf extra -> shows extra
Nd x y -> showParen (1 <= prec) $ showsPrec 0 x . (' ':) . showsPrec 1 y
disasm (s, t) = (s++) . (" = "++) . shows t . (";\n"++)
dumpWith dumper s = case untangle s of
Left err -> err
Right tab -> foldr ($) [] $ map (\(name, mod) -> ("module "++) . (name++) . ('\n':) . (foldr (.) id $ dumper mod)) $ toAscList tab
dumpCombs (typed, _) = map disasm $ optiComb $ lambsList typed
dumpLambs (typed, _) = map (\(s, (_, t)) -> (s++) . (" = "++) . shows t . ('\n':)) $ toAscList typed
dumpTypes (typed, _) = map (\(s, (q, _)) -> (s++) . (" :: "++) . shows q . ('\n':)) $ toAscList typed
-- Code generation.
appCell (hp, bs) x y = (Right hp, (hp + 2, bs . (x:) . (y:)))
enc tab mem = \case
Lf n -> case n of
Basic c -> (Right $ comEnum c, mem)
Const c -> appCell mem (Right $ comEnum "NUM") $ Right c
ChrCon c -> appCell mem (Right $ comEnum "NUM") $ Right $ ord c
StrCon s -> enc tab mem $ foldr (\h t -> Nd (Nd (lf "CONS") (Lf $ ChrCon h)) t) (lf "K") s
Link m s _ -> (Left (m, s), mem)
LfVar s -> maybe (error $ "resolve " ++ s) (, mem) $ mlookup s tab
Nd x y -> let
(xAddr, mem') = enc tab mem x
(yAddr, mem'') = enc tab mem' y
in appCell mem'' xAddr yAddr
asm hp0 combs = tabmem where
tabmem = foldl (\(as, m) (s, t) -> let (p, m') = enc (fst tabmem) m t
in (insert s p as, m')) (Tip, (hp0, id)) combs
argList t = case t of
TC s -> [TC s]
TV s -> [TV s]
TAp (TC "IO") (TC u) -> [TC u]
TAp (TAp (TC "->") x) y -> x : argList y
cTypeName (TC "()") = "void"
cTypeName (TC "Int") = "int"
cTypeName (TC "Char") = "int"
ffiDeclare (name, t) = let tys = argList t in concat
[cTypeName $ last tys, " ", name, "(", intercalate "," $ cTypeName <$> init tys, ");\n"]
ffiArgs n t = case t of
TC s -> ("", ((True, s), n))
TAp (TC "IO") (TC u) -> ("", ((False, u), n))
TAp (TAp (TC "->") x) y -> first (((if 3 <= n then ", " else "") ++ "num(" ++ shows n ")") ++) $ ffiArgs (n + 1) y
ffiDefine n ffis = case ffis of
[] -> id
(name, t):xt -> let
(args, ((isPure, ret), count)) = ffiArgs 2 t
lazyn = ("lazy2(" ++) . shows (if isPure then count - 1 else count + 1) . (", " ++)
cont tgt = if isPure then ("_I, "++) . tgt else ("app(arg("++) . shows (count + 1) . ("), "++) . tgt . ("), arg("++) . shows count . (")"++)
longDistanceCall = (name++) . ("("++) . (args++) . ("); "++) . lazyn
in ("case " ++) . shows n . (": " ++) . if ret == "()"
then longDistanceCall . cont ("_K"++) . ("); break;"++) . ffiDefine (n - 1) xt
else ("{u r = "++) . longDistanceCall . cont ("app(_NUM, r)" ++) . ("); break;}\n"++) . ffiDefine (n - 1) xt
genMain n = "int main(int argc,char**argv){env_argc=argc;env_argv=argv;rts_reduce(" ++ shows n ");return 0;}\n"
resolve bigmap (m, s) = either (resolve bigmap) id $ (bigmap ! m) ! s
mayResolve bigmap (m, s) = mlookup m bigmap
>>= fmap (either (resolve bigmap) id) . mlookup s
lambsList typed = toAscList $ snd <$> typed
codegenLocal (name, (typed, _)) (bigmap, (hp, f)) =
(insert name localmap bigmap, (hp', f . memF))
where
(localmap, (hp', memF)) = asm hp $ optiComb $ lambsList typed
codegen mods = (bigmap, mem) where
(bigmap, (_, memF)) = foldr codegenLocal (Tip, (128, id)) $ toAscList mods
mem = either (resolve bigmap) id <$> memF []
getIOType (Qual [] (TAp (TC "IO") t)) = Right t
getIOType q = Left $ "main : " ++ shows q ""
ffcat (name, (_, (ffis, ffes))) (xs, ys) = (ffis ++ xs, ((name,) <$> ffes) ++ ys)
compile s = either id id do
mods <- untangle s
let
(bigmap, mem) = codegen mods
(ffis, ffes) = foldr ffcat ([], []) $ toAscList mods
mustType modName s = case mlookup s (fst $ mods ! modName) of
Just (Qual [] t, _) -> t
_ -> error "TODO: report bad exports"
mayMain = do
mainAddr <- mayResolve bigmap ("Main", "main")
(mainType, _) <- mlookup "main" (fst $ mods ! "Main")
pure (mainAddr, mainType)
mainStr <- case mayMain of
Nothing -> pure ""
Just (a, q) -> do
getIOType q
pure $ genMain a
pure
$ ("#include<stdio.h>\n"++)
. ("typedef unsigned u;\n"++)
. ("enum{_UNDEFINED=0,"++)
. foldr (.) id (map (\(s, _) -> ('_':) . (s++) . (',':)) comdefs)
. ("};\n"++)
. ("static const u prog[]={" ++)
. foldr (.) id (map (\n -> shows n . (',':)) mem)
. ("};\nstatic const u prog_size="++) . shows (length mem) . (";\n"++)
. ("static u root[]={" ++)
. foldr (\(modName, (_, ourName)) f -> shows (resolve bigmap (modName, ourName)) . (", " ++) . f) id ffes
. ("0};\n" ++)
. (preamble++)
. (libc++)
. (concatMap ffiDeclare ffis ++)
. ("static void foreign(u n) {\n switch(n) {\n" ++)
. ffiDefine (length ffis - 1) ffis
. ("\n }\n}\n" ++)
. runFun
. foldr (.) id (zipWith (\(modName, (expName, ourName)) n -> ("EXPORT(f"++) . shows n . (", \""++) . (expName++) . ("\")\n"++)
. genExport (arrCount $ mustType modName ourName) n) ffes [0..])
$ mainStr
genExport m n = ("void f"++) . shows n . ("("++)
. foldr (.) id (intersperse (',':) $ map (("u "++) .) xs)
. ("){rts_reduce("++)
. foldl (\s x -> ("app("++) . s . (",app(_NUM,"++) . x . ("))"++)) rt xs
. (");}\n"++)
where
xs = map ((('x':) .) . shows) [0..m - 1]
rt = ("root["++) . shows n . ("]"++)
arrCount = \case
TAp (TAp (TC "->") _) y -> 1 + arrCount y
_ -> 0
-- Main VM loop.
comdefsrc = [r|
F x = "foreign(num(1));"
Y x = x "sp[1]"
Q x y z = z(y x)
S x y z = x z(y z)
B x y z = x (y z)
BK x y z = x y
C x y z = x z y
R x y z = y z x
V x y z = z x y
T x y = y x
K x y = "_I" x
KI x y = "_I" y
I x = "sp[1] = arg(1); sp++;"
LEFT x y z = y x
CONS x y z w = w x y
NUM x y = y "sp[1]"
ADD x y = "_NUM" "num(1) + num(2)"
SUB x y = "_NUM" "num(1) - num(2)"
MUL x y = "_NUM" "num(1) * num(2)"
DIV x y = "_NUM" "num(1) / num(2)"
MOD x y = "_NUM" "num(1) % num(2)"
EQ x y = "num(1) == num(2) ? lazy2(2, _I, _K) : lazy2(2, _K, _I);"
LE x y = "num(1) <= num(2) ? lazy2(2, _I, _K) : lazy2(2, _K, _I);"
REF x y = y "sp[1]"
NEWREF x y z = z ("_REF" x) y
READREF x y z = z "num(1)" y
WRITEREF x y z w = w "((mem[arg(2) + 1] = arg(1)), _K)" z
END = "return;"
ERR = "sp[1]=app(app(arg(1),_ERREND),_ERR2);sp++;"
ERR2 = "lazy3(2, arg(1), _ERROUT, arg(2));"
ERROUT = "errchar(num(1)); lazy2(2, _ERR, arg(2));"
ERREND = "errexit(); return;"
|]
comb = (,) <$> conId <*> ((,) <$> many varId <*> (res "=" *> combExpr))
combExpr = foldl1 A <$> some
(V <$> varId <|> E . StrCon <$> lexeme tokStr <|> paren combExpr)
comdefs = case parse (lexemePrelude *> braceSep comb <* eof) comdefsrc of
Left e -> error e
Right (cs, _) -> cs
comEnum s = maybe (error s) id $ lookup s $ zip (fst <$> comdefs) [1..]
comName i = maybe undefined id $ lookup i $ zip [1..] (fst <$> comdefs)
preamble = [r|#define EXPORT(f, sym, n) void f() asm(sym) __attribute__((export_name(sym))); void f(){rts_reduce(root[n]);}
void *malloc(unsigned long);
enum { FORWARD = 127, REDUCING = 126 };
enum { TOP = 1<<24 };
static u *mem, *altmem, *sp, *spTop, hp;
static inline u isAddr(u n) { return n>=128; }
static u evac(u n) {
if (!isAddr(n)) return n;
u x = mem[n];
while (isAddr(x) && mem[x] == _T) {
mem[n] = mem[n + 1];
mem[n + 1] = mem[x + 1];
x = mem[n];
}
if (isAddr(x) && mem[x] == _K) {
mem[n + 1] = mem[x + 1];
x = mem[n] = _I;
}
u y = mem[n + 1];
switch(x) {
case FORWARD: return y;
case REDUCING:
mem[n] = FORWARD;
mem[n + 1] = hp;
hp += 2;
return mem[n + 1];
case _I:
mem[n] = REDUCING;
y = evac(y);
if (mem[n] == FORWARD) {
altmem[mem[n + 1]] = _I;
altmem[mem[n + 1] + 1] = y;
} else {
mem[n] = FORWARD;
mem[n + 1] = y;
}
return mem[n + 1];
default: break;
}
u z = hp;
hp += 2;
mem[n] = FORWARD;
mem[n + 1] = z;
altmem[z] = x;
altmem[z + 1] = y;
return z;
}
static void gc() {
hp = 128;
u di = hp;
sp = altmem + TOP - 1;
for(u *r = root; *r; r++) *r = evac(*r);
*sp = evac(*spTop);
while (di < hp) {
u x = altmem[di] = evac(altmem[di]);
di++;
if (x != _NUM) altmem[di] = evac(altmem[di]);
di++;
}
spTop = sp;
u *tmp = mem;
mem = altmem;
altmem = tmp;
}
static inline u app(u f, u x) { mem[hp] = f; mem[hp + 1] = x; return (hp += 2) - 2; }
static inline u arg(u n) { return mem[sp [n] + 1]; }
static inline int num(u n) { return mem[arg(n) + 1]; }
static inline void lazy2(u height, u f, u x) {
u *p = mem + sp[height];
*p = f;
*++p = x;
sp += height - 1;
*sp = f;
}
static void lazy3(u height,u x1,u x2,u x3){u*p=mem+sp[height];sp[height-1]=*p=app(x1,x2);*++p=x3;*(sp+=height-2)=x1;}
|]
runFun = ([r|static void run() {
for(;;) {
if (mem + hp > sp - 8) gc();
u x = *sp;
if (isAddr(x)) *--sp = mem[x]; else switch(x) {
|]++)
. foldr (.) id (genComb <$> comdefs)
. ([r|
}
}
}
void rts_init() {
mem = malloc(TOP * sizeof(u)); altmem = malloc(TOP * sizeof(u));
hp = 128;
for (u i = 0; i < prog_size; i++) mem[hp++] = prog[i];
spTop = mem + TOP - 1;
}
void rts_reduce(u n) {
static u ready;if (!ready){ready=1;rts_init();}
*(sp = spTop) = app(app(n, _UNDEFINED), _END);
run();
}
|]++)
genArg m a = case a of
V s -> ("arg("++) . (maybe undefined shows $ lookup s m) . (')':)
E (StrCon s) -> (s++)
A x y -> ("app("++) . genArg m x . (',':) . genArg m y . (')':)
genArgs m as = foldl1 (.) $ map (\a -> (","++) . genArg m a) as
genComb (s, (args, body)) = let
argc = ('(':) . shows (length args)
m = zip args [1..]
in ("case _"++) . (s++) . (':':) . (case body of
A (A x y) z -> ("lazy3"++) . argc . genArgs m [x, y, z] . (");"++)
A x y -> ("lazy2"++) . argc . genArgs m [x, y] . (");"++)
E (StrCon s) -> (s++)
) . ("break;\n"++)
main = getArgs >>= \case
"comb":_ -> interact $ dumpWith dumpCombs
"lamb":_ -> interact $ dumpWith dumpLambs
"type":_ -> interact $ dumpWith dumpTypes
_ -> interact compile
iterate f x = x : iterate f (f x)
takeWhile _ [] = []
takeWhile p xs@(x:xt)
| p x = x : takeWhile p xt
| True = []
class Enum a where
succ :: a -> a
pred :: a -> a
toEnum :: Int -> a
fromEnum :: a -> Int
enumFrom :: a -> [a]
enumFromTo :: a -> a -> [a]
instance Enum Int where
succ = (+1)
pred = (+(0-1))
toEnum = id
fromEnum = id
enumFrom = iterate succ
enumFromTo lo hi = takeWhile (<= hi) $ enumFrom lo
instance Enum Char where
succ = chr . (+1) . ord
pred = chr . (+(0-1)) . ord
toEnum = chr
fromEnum = ord
enumFrom = iterate succ
enumFromTo lo hi = takeWhile (<= hi) $ enumFrom lo
(+) = intAdd
(-) = intSub
(*) = intMul
div = intDiv
mod = intMod
We put our changes to the test by splitting party.hs into modules.
(We really should do the same for our "marginally" compiler, namely create
an intermediate stage that is exactly the same except we use indentation
instead of braces and semicolons. This would make it easier to compare against
its successor "methodically".)
cat Base0.hs Ast.hs Map.hs Parser.hs Kiselyov.hs Unify.hs RTS.hs Typer.hs party.hs
module Base where
infixr 9 .
infixl 7 * , `div` , `mod`
infixl 6 + , -
infixr 5 ++
infixl 4 <*> , <$> , <* , *>
infix 4 == , /= , <=
infixl 3 && , <|>
infixl 2 ||
infixl 1 >> , >>=
infixr 0 $
class Functor f where fmap :: (a -> b) -> f a -> f b
class Applicative f where
pure :: a -> f a
(<*>) :: f (a -> b) -> f a -> f b
class Monad m where
return :: a -> m a
(>>=) :: m a -> (a -> m b) -> m b
(<$>) = fmap
liftA2 f x y = f <$> x <*> y
(>>) f g = f >>= \_ -> g
class Eq a where (==) :: a -> a -> Bool
instance Eq Int where (==) = intEq
instance Eq Char where (==) = charEq
($) f x = f x
id x = x
const x y = x
flip f x y = f y x
(&) x f = f x
class Ord a where (<=) :: a -> a -> Bool
compare x y = if x <= y then if y <= x then EQ else LT else GT
instance Ord Int where (<=) = intLE
instance Ord Char where (<=) = charLE
data Ordering = LT | GT | EQ
instance Ord a => Ord [a] where
xs <= ys = case xs of
[] -> True
x:xt -> case ys of
[] -> False
y:yt -> if x <= y then if y <= x then xt <= yt else True else False
compare xs ys = case xs of
[] -> case ys of
[] -> EQ
_ -> LT
x:xt -> case ys of
[] -> GT
y:yt -> if x <= y then if y <= x then compare xt yt else LT else GT
data Maybe a = Nothing | Just a
data Either a b = Left a | Right b
fst (x, y) = x
snd (x, y) = y
uncurry f (x, y) = f x y
first f (x, y) = (f x, y)
second f (x, y) = (x, f y)
not a = if a then False else True
x /= y = not $ x == y
(.) f g x = f (g x)
(||) f g = if f then True else g
(&&) f g = if f then g else False
instance Eq a => Eq [a] where
xs == ys = case xs of
[] -> case ys of
[] -> True
_ -> False
x:xt -> case ys of
[] -> False
y:yt -> x == y && xt == yt
take 0 xs = []
take _ [] = []
take n (h:t) = h : take (n - 1) t
maybe n j m = case m of Nothing -> n; Just x -> j x
instance Functor Maybe where fmap f = maybe Nothing (Just . f)
instance Applicative Maybe where pure = Just ; mf <*> mx = maybe Nothing (\f -> maybe Nothing (Just . f) mx) mf
instance Monad Maybe where return = Just ; mf >>= mg = maybe Nothing mg mf
instance Alternative Maybe where empty = Nothing ; x <|> y = maybe y Just x
foldr c n = \case [] -> n; h:t -> c h $ foldr c n t
length = foldr (\_ n -> n + 1) 0
mapM f = foldr (\a rest -> liftA2 (:) (f a) rest) (pure [])
mapM_ f = foldr ((>>) . f) (pure ())
foldM f z0 xs = foldr (\x k z -> f z x >>= k) pure xs z0
error = primitiveError
undefined = error "undefined"
foldr1 c l@(h:t) = maybe undefined id $ foldr (\x m -> Just $ maybe x (c x) m) Nothing l
foldl f a bs = foldr (\b g x -> g (f x b)) (\x -> x) bs a
foldl1 f (h:t) = foldl f h t
elem k xs = foldr (\x t -> x == k || t) False xs
find f xs = foldr (\x t -> if f x then Just x else t) Nothing xs
(++) = flip (foldr (:))
concat = foldr (++) []
map = flip (foldr . ((:) .)) []
head (h:_) = h
tail (_:t) = t
isSpace c = elem (ord c) [32, 9, 10, 11, 12, 13, 160]
instance Functor [] where fmap = map
instance Applicative [] where pure = (:[]); f <*> x = concatMap (<$> x) f
instance Monad [] where return = (:[]); (>>=) = flip concatMap
concatMap = (concat .) . map
lookup s = foldr (\(k, v) t -> if s == k then Just v else t) Nothing
filter f = foldr (\x xs -> if f x then x:xs else xs) []
union xs ys = foldr (\y acc -> (if elem y acc then id else (y:)) acc) xs ys
intersect xs ys = filter (\x -> maybe False (\_ -> True) $ find (x ==) ys) xs
last (x:xt) = go x xt where go x xt = case xt of [] -> x; y:yt -> go y yt
init (x:xt) = case xt of [] -> []; _ -> x : init xt
intercalate sep = \case [] -> []; x:xt -> x ++ concatMap (sep ++) xt
intersperse sep = \case [] -> []; x:xt -> x : foldr ($) [] (((sep:) .) . (:) <$> xt)
all f = foldr (&&) True . map f
any f = foldr (||) False . map f
and = foldr (&&) True
or = foldr (||) False
zipWith f xs ys = case xs of [] -> []; x:xt -> case ys of [] -> []; y:yt -> f x y : zipWith f xt yt
zip = zipWith (,)
data State s a = State (s -> (a, s))
runState (State f) = f
instance Functor (State s) where fmap f = \(State h) -> State (first f . h)
instance Applicative (State s) where
pure a = State (a,)
(State f) <*> (State x) = State \s -> case f s of (g, s') -> first g $ x s'
instance Monad (State s) where
return a = State (a,)
(State h) >>= f = State $ uncurry (runState . f) . h
evalState m s = fst $ runState m s
get = State \s -> (s, s)
put n = State \s -> ((), n)
either l r e = case e of Left x -> l x; Right x -> r x
instance Functor (Either a) where fmap f e = either Left (Right . f) e
instance Applicative (Either a) where
pure = Right
ef <*> ex = case ef of
Left s -> Left s
Right f -> either Left (Right . f) ex
instance Monad (Either a) where
return = Right
ex >>= f = either Left f ex
class Alternative f where
empty :: f a
(<|>) :: f a -> f a -> f a
asum = foldr (<|>) empty
(*>) = liftA2 \x y -> y
(<*) = liftA2 \x y -> x
many p = liftA2 (:) p (many p) <|> pure []
some p = liftA2 (:) p (many p)
sepBy1 p sep = liftA2 (:) p (many (sep *> p))
sepBy p sep = sepBy1 p sep <|> pure []
between x y p = x *> (p <* y)
showParen b f = if b then ('(':) . f . (')':) else f
iterate f x = x : iterate f (f x)
takeWhile _ [] = []
takeWhile p xs@(x:xt)
| p x = x : takeWhile p xt
| True = []
class Enum a where
succ :: a -> a
pred :: a -> a
toEnum :: Int -> a
fromEnum :: a -> Int
enumFrom :: a -> [a]
enumFromTo :: a -> a -> [a]
instance Enum Int where
succ = (+1)
pred = (+(0-1))
toEnum = id
fromEnum = id
enumFrom = iterate succ
enumFromTo lo hi = takeWhile (<= hi) $ enumFrom lo
instance Enum Char where
succ = chr . (+1) . ord
pred = chr . (+(0-1)) . ord
toEnum = chr
fromEnum = ord
enumFrom = iterate succ
enumFromTo lo hi = takeWhile (<= hi) $ enumFrom lo
(+) = intAdd
(-) = intSub
(*) = intMul
div = intDiv
mod = intMod
instance (Eq a, Eq b) => Eq (a, b) where
(a1, b1) == (a2, b2) = a1 == a2 && b1 == b2
instance (Ord a, Ord b) => Ord (a, b) where
(a1, b1) <= (a2, b2) = a1 <= a2 && (not (a2 <= a1) || b1 <= b2)
null xs = case xs of
[] -> True
_ -> False
instance Applicative IO where pure = ioPure ; (<*>) f x = ioBind f \g -> ioBind x \y -> ioPure (g y)
instance Monad IO where return = ioPure ; (>>=) = ioBind
instance Functor IO where fmap f x = ioPure f <*> x
class Show a where
showsPrec :: Int -> a -> String -> String
showsPrec _ x = (show x++)
show :: a -> String
show x = shows x ""
showList :: [a] -> String -> String
showList = showList__ shows
shows = showsPrec 0
showList__ _ [] s = "[]" ++ s
showList__ showx (x:xs) s = '[' : showx x (showl xs)
where
showl [] = ']' : s
showl (y:ys) = ',' : showx y (showl ys)
showInt__ n
| 0 == n = id
| True = showInt__ (n`div`10) . (chr (48+n`mod`10):)
instance Show () where show () = "()"
instance Show Bool where
show True = "True"
show False = "False"
instance Show a => Show [a] where showsPrec _ = showList
instance Show Int where
showsPrec _ n
| 0 == n = ('0':)
| 1 <= n = showInt__ n
| 2 * n == 0 = ("-2147483648"++)
| True = ('-':) . showInt__ (0 - n)
showLitChar__ '\n' = ("\\n"++)
showLitChar__ '\\' = ("\\\\"++)
showLitChar__ c = (c:)
instance Show Char where
showsPrec _ '\'' = ("'\\''"++)
showsPrec _ c = ('\'':) . showLitChar__ c . ('\'':)
showList s = ('"':) . foldr (.) id (map go s) . ('"':) where
go '"' = ("\\\""++)
go c = showLitChar__ c
instance (Show a, Show b) => Show (a, b) where
showsPrec _ (a, b) = showParen True $ shows a . (',':) . shows b
module Map where import Base infixl 9 ! data Map k a = Tip | Bin Int k a (Map k a) (Map k a) instance Functor (Map k) where fmap f m = case m of Tip -> Tip Bin sz k x l r -> Bin sz k (f x) (fmap f l) (fmap f r) size m = case m of Tip -> 0 ; Bin sz _ _ _ _ -> sz node k x l r = Bin (1 + size l + size r) k x l r singleton k x = Bin 1 k x Tip Tip singleL k x l (Bin _ rk rkx rl rr) = node rk rkx (node k x l rl) rr doubleL k x l (Bin _ rk rkx (Bin _ rlk rlkx rll rlr) rr) = node rlk rlkx (node k x l rll) (node rk rkx rlr rr) singleR k x (Bin _ lk lkx ll lr) r = node lk lkx ll (node k x lr r) doubleR k x (Bin _ lk lkx ll (Bin _ lrk lrkx lrl lrr)) r = node lrk lrkx (node lk lkx ll lrl) (node k x lrr r) balance k x l r = f k x l r where f | size l + size r <= 1 = node | 5 * size l + 3 <= 2 * size r = case r of Tip -> node Bin sz _ _ rl rr -> if 2 * size rl + 1 <= 3 * size rr then singleL else doubleL | 5 * size r + 3 <= 2 * size l = case l of Tip -> node Bin sz _ _ ll lr -> if 2 * size lr + 1 <= 3 * size ll then singleR else doubleR | True = node insert kx x t = case t of Tip -> singleton kx x Bin sz ky y l r -> case compare kx ky of LT -> balance ky y (insert kx x l) r GT -> balance ky y l (insert kx x r) EQ -> Bin sz kx x l r insertWith f kx x t = case t of Tip -> singleton kx x Bin sy ky y l r -> case compare kx ky of LT -> balance ky y (insertWith f kx x l) r GT -> balance ky y l (insertWith f kx x r) EQ -> Bin sy kx (f x y) l r mlookup kx t = case t of Tip -> Nothing Bin _ ky y l r -> case compare kx ky of LT -> mlookup kx l GT -> mlookup kx r EQ -> Just y fromList = foldl (\t (k, x) -> insert k x t) Tip member k t = maybe False (const True) $ mlookup k t t ! k = maybe undefined id $ mlookup k t foldrWithKey f = go where go z t = case t of Tip -> z Bin _ kx x l r -> go (f kx x (go z r)) l mapWithKey _ Tip = Tip mapWithKey f (Bin sx kx x l r) = Bin sx kx (f kx x) (mapWithKey f l) (mapWithKey f r) toAscList = foldrWithKey (\k x xs -> (k,x):xs) [] keys = map fst . toAscList elems = map snd . toAscList assocs = toAscList
-- Add `Show` instance.
module Kiselyov where
import Base
import Ast
-- Conversion to De Bruijn indices.
data LC = Ze | Su LC | Pass IntTree | La LC | App LC LC
debruijn n e = case e of
E x -> Pass $ Lf x
V v -> maybe (Pass $ LfVar v) id $
foldr (\h found -> if h == v then Just Ze else Su <$> found) Nothing n
A x y -> App (debruijn n x) (debruijn n y)
L s t -> La (debruijn (s:n) t)
-- Kiselyov bracket abstraction.
data IntTree = Lf Extra | LfVar String | Nd IntTree IntTree
data Sem = Defer | Closed IntTree | Need Sem | Weak Sem
instance Show IntTree where
showsPrec prec = \case
LfVar s -> showVar s
Lf extra -> shows extra
Nd x y -> showParen (1 <= prec) $ showsPrec 0 x . (' ':) . showsPrec 1 y
lf = Lf . Basic
x ## y = case x of
Defer -> case y of
Defer -> Need $ Closed (Nd (Nd (lf "S") (lf "I")) (lf "I"))
Closed d -> Need $ Closed (Nd (lf "T") d)
Need e -> Need $ Closed (Nd (lf "S") (lf "I")) ## e
Weak e -> Need $ Closed (lf "T") ## e
Closed d -> case y of
Defer -> Need $ Closed d
Closed dd -> Closed $ Nd d dd
Need e -> Need $ Closed (Nd (lf "B") d) ## e
Weak e -> Weak $ Closed d ## e
Need e -> case y of
Defer -> Need $ Closed (lf "S") ## e ## Closed (lf "I")
Closed d -> Need $ Closed (Nd (lf "R") d) ## e
Need ee -> Need $ Closed (lf "S") ## e ## ee
Weak ee -> Need $ Closed (lf "C") ## e ## ee
Weak e -> case y of
Defer -> Need e
Closed d -> Weak $ e ## Closed d
Need ee -> Need $ Closed (lf "B") ## e ## ee
Weak ee -> Weak $ e ## ee
babs t = case t of
Ze -> Defer
Su x -> Weak $ babs x
Pass x -> Closed x
La t -> case babs t of
Defer -> Closed $ lf "I"
Closed d -> Closed $ Nd (lf "K") d
Need e -> e
Weak e -> Closed (lf "K") ## e
App x y -> babs x ## babs y
nolam x = (\(Closed d) -> d) $ babs $ debruijn [] x
-- Optimizations.
optim t = case t of
Nd x y -> go (optim x) (optim y)
_ -> t
where
go (Lf (Basic "I")) q = q
go p q@(Lf (Basic c)) = case c of
"K" -> case p of
Lf (Basic "B") -> lf "BK"
_ -> Nd p q
"I" -> case p of
Lf (Basic r) -> case r of
"C" -> lf "T"
"B" -> lf "I"
"K" -> lf "KI"
_ -> Nd p q
Nd p1 p2 -> case p1 of
Lf (Basic "B") -> p2
Lf (Basic "R") -> Nd (lf "T") p2
_ -> Nd (Nd p1 p2) q
_ -> Nd p q
"T" -> case p of
Nd (Lf (Basic "B")) (Lf (Basic r)) -> case r of
"C" -> lf "V"
"BK" -> lf "LEFT"
_ -> Nd p q
_ -> Nd p q
"V" -> case p of
Nd (Lf (Basic "B")) (Lf (Basic "BK")) -> lf "CONS"
_ -> Nd p q
_ -> Nd p q
go p q = Nd p q
(There are more files, which I’ll include if I get around to writing a tool to help show several source files in HTML. For now, see the git repo.)
The main obstacle to compiling our modules with GHC is the Prelude. We define
entities such as Monad and (==) from scratch, which breaks do notation
for example because GHC always uses Prelude.Monad.
We remove this obstacle by simply removing any overlap with the Prelude. We use
a stripped-down Base.hs when testing with GHC, whereas our compilers really
use files like Base0.hs.
This implies much of our Base code is untested, but for this special case,
perhaps we can add a wrapper to test it on its own with GHC.
-- GHC-compatible version. module Base where import qualified Data.Char (chr, ord, isSpace) hide_prelude_here = hide_prelude_here chr = Data.Char.chr ord = Data.Char.ord isSpace = Data.Char.isSpace first f (x, y) = (f x, y) second f (x, y) = (x, f y) infixl 3 <|> instance Alternative Maybe where empty = Nothing ; x <|> y = maybe y Just x class Alternative f where empty :: f a (<|>) :: f a -> f a -> f a (&) x f = f x liftA2 f x y = f <$> x <*> y many p = liftA2 (:) p (many p) <|> pure [] some p = liftA2 (:) p (many p) sepBy1 p sep = liftA2 (:) p (many (sep *> p)) sepBy p sep = sepBy1 p sep <|> pure [] between x y p = x *> (p <* y) asum = foldr (<|>) empty find f xs = foldr (\x t -> if f x then Just x else t) Nothing xs intersect xs ys = filter (\x -> maybe False (\_ -> True) $ find (x ==) ys) xs union xs ys = foldr (\y acc -> (if elem y acc then id else (y:)) acc) xs ys intercalate sep = \case [] -> []; x:xt -> x ++ concatMap (sep ++) xt intersperse sep = \case [] -> []; x:xt -> x : foldr ($) [] (((sep:) .) . (:) <$> xt) foldM f z0 xs = foldr (\x k z -> f z x >>= k) pure xs z0 data State s a = State (s -> (a, s)) runState (State f) = f instance Functor (State s) where fmap f = \(State h) -> State (first f . h) instance Applicative (State s) where pure a = State (a,) (State f) <*> (State x) = State \s -> case f s of (g, s') -> first g $ x s' instance Monad (State s) where return a = State (a,) (State h) >>= f = State $ uncurry (runState . f) . h evalState m s = fst $ runState m s get = State \s -> (s, s) put n = State \s -> ((), n) integerSignList x f = f (x >= 0) $ go x where go 0 = [] go n = r : go q where (q, r) = divMod n $ 2^32 intFromWord = fromIntegral when x y = if x then y else pure () unless x y = if x then pure () else y
Another obstacle is our built-in support for quasi-quoted raw strings. We solve this by adding the line:
import_qq_here = import_qq_here
immediately after the import statements. Then we enable the C pre-processor and
define import_qq_here to be import Text.RawString.QQ --.
We perform similar tricks to hide Prelude symbols we define in the System
module.
module RTS where
import Base
import Ast
import Kiselyov
import Map
import Parser
import_qq_here = import_qq_here
libc = [r|#include<stdio.h>
static int env_argc;
int getargcount() { return env_argc; }
static char **env_argv;
int getargchar(int n, int k) { return env_argv[n][k]; }
static int nextCh, isAhead;
int eof_shim() {
if (!isAhead) {
isAhead = 1;
nextCh = getchar();
}
return nextCh == -1;
}
void exit(int);
void putchar_shim(int c) { putchar(c); }
int getchar_shim() {
if (!isAhead) nextCh = getchar();
if (nextCh == -1) exit(1);
isAhead = 0;
return nextCh;
}
void errchar(int c) { fputc(c, stderr); }
void errexit() { fputc('\n', stderr); }
|]
preamble = [r|#define EXPORT(f, sym, n) void f() asm(sym) __attribute__((export_name(sym))); void f(){rts_reduce(root[n]);}
void *malloc(unsigned long);
enum { FORWARD = 127, REDUCING = 126 };
enum { TOP = 1<<24 };
static u *mem, *altmem, *sp, *spTop, hp;
static inline u isAddr(u n) { return n>=128; }
static u evac(u n) {
if (!isAddr(n)) return n;
u x = mem[n];
while (isAddr(x) && mem[x] == _T) {
mem[n] = mem[n + 1];
mem[n + 1] = mem[x + 1];
x = mem[n];
}
if (isAddr(x) && mem[x] == _K) {
mem[n + 1] = mem[x + 1];
x = mem[n] = _I;
}
u y = mem[n + 1];
switch(x) {
case FORWARD: return y;
case REDUCING:
mem[n] = FORWARD;
mem[n + 1] = hp;
hp += 2;
return mem[n + 1];
case _I:
mem[n] = REDUCING;
y = evac(y);
if (mem[n] == FORWARD) {
altmem[mem[n + 1]] = _I;
altmem[mem[n + 1] + 1] = y;
} else {
mem[n] = FORWARD;
mem[n + 1] = y;
}
return mem[n + 1];
default: break;
}
u z = hp;
hp += 2;
mem[n] = FORWARD;
mem[n + 1] = z;
altmem[z] = x;
altmem[z + 1] = y;
return z;
}
static void gc() {
hp = 128;
u di = hp;
sp = altmem + TOP - 1;
for(u *r = root; *r; r++) *r = evac(*r);
*sp = evac(*spTop);
while (di < hp) {
u x = altmem[di] = evac(altmem[di]);
di++;
if (x != _NUM) altmem[di] = evac(altmem[di]);
di++;
}
spTop = sp;
u *tmp = mem;
mem = altmem;
altmem = tmp;
}
static inline u app(u f, u x) { mem[hp] = f; mem[hp + 1] = x; return (hp += 2) - 2; }
static inline u arg(u n) { return mem[sp [n] + 1]; }
static inline int num(u n) { return mem[arg(n) + 1]; }
static inline void lazy2(u height, u f, u x) {
u *p = mem + sp[height];
*p = f;
*++p = x;
sp += height - 1;
*sp = f;
}
static void lazy3(u height,u x1,u x2,u x3){u*p=mem+sp[height];sp[height-1]=*p=app(x1,x2);*++p=x3;*(sp+=height-2)=x1;}
|]
-- Main VM loop.
comdefsrc = [r|
F x = "foreign(num(1));"
Y x = x "sp[1]"
Q x y z = z(y x)
S x y z = x z(y z)
B x y z = x (y z)
BK x y z = x y
C x y z = x z y
R x y z = y z x
V x y z = z x y
T x y = y x
K x y = "_I" x
KI x y = "_I" y
I x = "sp[1] = arg(1); sp++;"
LEFT x y z = y x
CONS x y z w = w x y
NUM x y = y "sp[1]"
ADD x y = "_NUM" "num(1) + num(2)"
SUB x y = "_NUM" "num(1) - num(2)"
MUL x y = "_NUM" "num(1) * num(2)"
DIV x y = "_NUM" "num(1) / num(2)"
MOD x y = "_NUM" "num(1) % num(2)"
EQ x y = "num(1) == num(2) ? lazy2(2, _I, _K) : lazy2(2, _K, _I);"
LE x y = "num(1) <= num(2) ? lazy2(2, _I, _K) : lazy2(2, _K, _I);"
REF x y = y "sp[1]"
NEWREF x y z = z ("_REF" x) y
READREF x y z = z "num(1)" y
WRITEREF x y z w = w "((mem[arg(2) + 1] = arg(1)), _K)" z
END = "return;"
ERR = "sp[1]=app(app(arg(1),_ERREND),_ERR2);sp++;"
ERR2 = "lazy3(2, arg(1), _ERROUT, arg(2));"
ERROUT = "errchar(num(1)); lazy2(2, _ERR, arg(2));"
ERREND = "errexit(); return;"
|]
argList t = case t of
TC s -> [TC s]
TV s -> [TV s]
TAp (TC "IO") (TC u) -> [TC u]
TAp (TAp (TC "->") x) y -> x : argList y
cTypeName (TC "()") = "void"
cTypeName (TC "Int") = "int"
cTypeName (TC "Char") = "int"
ffiDeclare (name, t) = let tys = argList t in concat
[cTypeName $ last tys, " ", name, "(", intercalate "," $ cTypeName <$> init tys, ");\n"]
ffiArgs n t = case t of
TC s -> ("", ((True, s), n))
TAp (TC "IO") (TC u) -> ("", ((False, u), n))
TAp (TAp (TC "->") x) y -> first (((if 3 <= n then ", " else "") ++ "num(" ++ shows n ")") ++) $ ffiArgs (n + 1) y
ffiDefine n ffis = case ffis of
[] -> id
(name, t):xt -> let
(args, ((isPure, ret), count)) = ffiArgs 2 t
lazyn = ("lazy2(" ++) . shows (if isPure then count - 1 else count + 1) . (", " ++)
cont tgt = if isPure then ("_I, "++) . tgt else ("app(arg("++) . shows (count + 1) . ("), "++) . tgt . ("), arg("++) . shows count . (")"++)
longDistanceCall = (name++) . ("("++) . (args++) . ("); "++) . lazyn
in ("case " ++) . shows n . (": " ++) . if ret == "()"
then longDistanceCall . cont ("_K"++) . ("); break;"++) . ffiDefine (n - 1) xt
else ("{u r = "++) . longDistanceCall . cont ("app(_NUM, r)" ++) . ("); break;}\n"++) . ffiDefine (n - 1) xt
genMain n = "int main(int argc,char**argv){env_argc=argc;env_argv=argv;rts_reduce(" ++ shows n ");return 0;}\n"
arrCount = \case
TAp (TAp (TC "->") _) y -> 1 + arrCount y
_ -> 0
genExport m n = ("void f"++) . shows n . ("("++)
. foldr (.) id (intersperse (',':) $ map (("u "++) .) xs)
. ("){rts_reduce("++)
. foldl (\s x -> ("app("++) . s . (",app(_NUM,"++) . x . ("))"++)) rt xs
. (");}\n"++)
where
xs = map ((('x':) .) . shows) [0..m - 1]
rt = ("root["++) . shows n . ("]"++)
genArg m a = case a of
V s -> ("arg("++) . (maybe undefined shows $ lookup s m) . (')':)
E (StrCon s) -> (s++)
A x y -> ("app("++) . genArg m x . (',':) . genArg m y . (')':)
genArgs m as = foldl1 (.) $ map (\a -> (","++) . genArg m a) as
genComb (s, (args, body)) = let
argc = ('(':) . shows (length args)
m = zip args [1..]
in ("case _"++) . (s++) . (':':) . (case body of
A (A x y) z -> ("lazy3"++) . argc . genArgs m [x, y, z] . (");"++)
A x y -> ("lazy2"++) . argc . genArgs m [x, y] . (");"++)
E (StrCon s) -> (s++)
) . ("break;\n"++)
comb = (,) <$> conId <*> ((,) <$> many varId <*> (res "=" *> combExpr))
combExpr = foldl1 A <$> some
(V <$> varId <|> E . StrCon <$> lexeme tokStr <|> paren combExpr)
comdefs = case parse (lexemePrelude *> braceSep comb <* eof) comdefsrc of
Left e -> error e
Right (cs, _) -> cs
comEnum s = maybe (error s) id $ lookup s $ zip (fst <$> comdefs) [1..]
comName i = maybe undefined id $ lookup i $ zip [1..] (fst <$> comdefs)
runFun = ([r|static void run() {
for(;;) {
if (mem + hp > sp - 8) gc();
u x = *sp;
if (isAddr(x)) *--sp = mem[x]; else switch(x) {
|]++)
. foldr (.) id (genComb <$> comdefs)
. ([r|
}
}
}
void rts_init() {
mem = malloc(TOP * sizeof(u)); altmem = malloc(TOP * sizeof(u));
hp = 128;
for (u i = 0; i < sizeof(prog)/sizeof(*prog); i++) mem[hp++] = prog[i];
spTop = mem + TOP - 1;
}
void rts_reduce(u n) {
static u ready;if (!ready){ready=1;rts_init();}
*(sp = spTop) = app(app(n, _UNDEFINED), _END);
run();
}
|]++)
resolve bigmap (m, s) = either (resolve bigmap) id $ (bigmap ! m) ! s
mayResolve bigmap (m, s) = mlookup m bigmap
>>= fmap (either (resolve bigmap) id) . mlookup s
appCell (hp, bs) x y = (Right hp, (hp + 2, bs . (x:) . (y:)))
enc tab mem = \case
Lf n -> case n of
Basic c -> (Right $ comEnum c, mem)
Const c -> appCell mem (Right $ comEnum "NUM") $ Right c
ChrCon c -> appCell mem (Right $ comEnum "NUM") $ Right $ ord c
StrCon s -> enc tab mem $ foldr (\h t -> Nd (Nd (lf "CONS") (Lf $ ChrCon h)) t) (lf "K") s
Link m s _ -> (Left (m, s), mem)
LfVar s -> maybe (error $ "resolve " ++ s) (, mem) $ mlookup s tab
Nd x y -> let
(xAddr, mem') = enc tab mem x
(yAddr, mem'') = enc tab mem' y
in appCell mem'' xAddr yAddr
asm hp0 combs = tabmem where
tabmem = foldl (\(as, m) (s, t) -> let (p, m') = enc (fst tabmem) m t
in (insert s p as, m')) (Tip, (hp0, id)) combs
optiComb' (subs, combs) (s, lamb) = let
gosub t = case t of
LfVar v -> maybe t id $ lookup v subs
Nd a b -> Nd (gosub a) (gosub b)
_ -> t
c = optim $ gosub $ nolam lamb
combs' = combs . ((s, c):)
in case c of
Lf (Basic _) -> ((s, c):subs, combs')
LfVar v -> if v == s then (subs, combs . ((s, Nd (lf "Y") (lf "I")):)) else ((s, gosub c):subs, combs')
_ -> (subs, combs')
optiComb lambs = ($[]) . snd $ foldl optiComb' ([], id) lambs
lambsList typed = toAscList $ snd <$> typed
codegenLocal (name, (typed, _)) (bigmap, (hp, f)) =
(insert name localmap bigmap, (hp', f . memF))
where
(localmap, (hp', memF)) = asm hp $ optiComb $ lambsList typed
codegen mods = (bigmap, mem) where
(bigmap, (_, memF)) = foldr codegenLocal (Tip, (128, id)) $ toAscList mods
mem = either (resolve bigmap) id <$> memF []
getIOType (Qual [] (TAp (TC "IO") t)) = Right t
getIOType q = Left $ "main : " ++ shows q ""
ffcat (name, (_, (ffis, ffes))) (xs, ys) = (ffis ++ xs, ((name,) <$> ffes) ++ ys)
compile mods = do
let
(bigmap, mem) = codegen mods
(ffis, ffes) = foldr ffcat ([], []) $ toAscList mods
mustType modName s = case mlookup s (fst $ mods ! modName) of
Just (Qual [] t, _) -> t
_ -> error "TODO: report bad exports"
mayMain = do
mainAddr <- mayResolve bigmap ("Main", "main")
(mainType, _) <- mlookup "main" $ fst $ mods ! "Main"
pure (mainAddr, mainType)
mainStr <- case mayMain of
Nothing -> pure ""
Just (a, q) -> do
getIOType q
pure $ genMain a
pure
$ ("typedef unsigned u;\n"++)
. ("enum{_UNDEFINED=0,"++)
. foldr (.) id (map (\(s, _) -> ('_':) . (s++) . (',':)) comdefs)
. ("};\n"++)
. ("static const u prog[]={" ++)
. foldr (.) id (map (\n -> shows n . (',':)) mem)
. ("};\nstatic u root[]={" ++)
. foldr (\(modName, (_, ourName)) f -> shows (resolve bigmap (modName, ourName)) . (", " ++) . f) id ffes
. ("0};\n" ++)
. (preamble++)
. (libc++)
. (concatMap ffiDeclare ffis ++)
. ("static void foreign(u n) {\n switch(n) {\n" ++)
. ffiDefine (length ffis - 1) ffis
. ("\n }\n}\n" ++)
. runFun
. foldr (.) id (zipWith (\(modName, (expName, ourName)) n -> ("EXPORT(f"++) . shows n . (", \""++) . (expName++) . ("\")\n"++)
. genExport (arrCount $ mustType modName ourName) n) ffes [0..])
$ mainStr
Our source now works with GHC with the following options:
:set "-Dhide_prelude_here=import Prelude hiding (getChar, putChar, getContents, putStr, putStrLn, interact) --" :set "-Dimport_qq_here=import Text.RawString.QQ --" :set -cpp -XQuasiQuotes :set -XBlockArguments -XLambdaCase -XTupleSections :set -XNoMonomorphismRestriction -XMonoLocalBinds
In the inn subdirectory:
$ ghci -ghci-script compat.ghci party.hs ../stub.o
Here, the stub.o has been created from stub.c with clang -c or similar.
We gave the nice filenames to GHC, which expects to find modules in files with matching names. Our compilers tolerate weird filename prefixes and suffixes because we can simply concatenate different files. An alternative is to manage different subdirectories containing the same filenames.
We can test later iterations with GHCi by symlinking appropriate versions of each file in a dedicated subdirectory.
Modules feel revolutionary. Our source becomes clearer, because modularization forces us to think about interdependencies, which guided refactoring so breaking up was less hard to do. And we can progress by making a small change to a small file, like our earliest compilers back in the day.
However, we face new challenges. Addressing the limitations listed above will require effort. Prepending a little wrapper no longer suffices for GHC interoperability. And how are we going to keep track of many versions of many files?
For now we answer the last question by tweaking an existing filename and
Makefile rule. The module name remains the same but we concatenate a
different file.
For mutual let definitions we wrote code that traversed a syntax tree to substitute certain variables. An alternative is to build a syntax tree that describes this substitution. After all, lambda calculus is substitution incarnate. In other words, we rely more on dynamic rather than static semantics, a distinction that sometimes blurs because beta-reducing may occur during optimization.
One advantage of this approach is we can remove overFreePro, a helper that
traverses over syntax trees before case expressions and pattern matches have
been transformed away.
We extend the parser to support named record fields in data type declarations such as:
data Foo = Foo { bar :: Int, baz :: String } | Qux
For accessors, we generate one function definition per field. For example:
bar = \case Foo bar baz -> bar
except at a lower level, exploiting our knowledge that our data types are Scott-encoded.
Record updates and initialization are more challenging. We need more than plain function definitions, and furthermore, we only have all valid field names after parsing. This means we ought to extend our syntax tree to hold lists of field bindings for record updates and initializations.
Instead of adding a new data constructor to our Ast type, we invent two
basic combinators Basic "{=" and Basic "=}" which act as delimiters
for a list of field bindings, where the A data constructor acts like a cons.
An alternative is to use recursion schemes for our many variants of syntax
trees.
By pattern compilation, we know all the field names, so at this point we call
resolveFieldBinds to transform, say:
x { bar = 42 }
into:
case x of \Foo {orig}bar {orig}baz -> Foo 42 {orig}baz
though again using a lower level representation since we know we’re
Scott-encoding the data types. The {orig} added by our code to each variable
name guards against variable capture.
For record initializations, we only generate the right-hand side of the case
match and use undefined for missing fields instead of {orig} variables.
We implement deriving for Eq and Show. It would be nice to automatically
derive Eq for our primitive data types (unit, boolean, pairs, lists) but this
would require all programs to define the Eq class.
Recall for data types, we maintain a map from a data constructor name to the
list of all data constructors of the same type, along with the types of any
field they may have. Even though we need to generate a unique and predictable
symbol per type to represent corresponding case expressions, the function
specialCase simply builds this symbol from the first data constructor.
We barely modify this map for named fields. As a result, there’s no easy way
for findField to look up relevant information based on a field name. We
inefficiently search linearly through possibly repeated entries. It may be
better to add a separate map for named fields, but it’s tedious to add fields
to the Neat type when our current compiler lacks support for naming them!
Once again, a proto-chicken comes first.
To test with GHC, we create a new directory containing appropriately named symlinks to the desired versions of the modules. Incremental development means we only need to change a few symlinks at a time, but in the long run, we ought to automate symlinking from a given set of module files.
-- Record fields.
-- Remove `overFreePro`.
module Ast where
import Base
import Map
data Type = TC String | TV String | TAp Type Type
arr a b = TAp (TAp (TC "->") a) b
data Extra = Basic String | Const Int | ChrCon Char | StrCon String | Link String String Qual
data Pat = PatLit Ast | PatVar String (Maybe Pat) | PatCon String [Pat]
data Ast = E Extra | V String | A Ast Ast | L String Ast | Pa [([Pat], Ast)] | Proof Pred
data Constr = Constr String [(String, Type)]
data Pred = Pred String Type
data Qual = Qual [Pred] Type
instance Eq Type where
(TC s) == (TC t) = s == t
(TV s) == (TV t) = s == t
(TAp a b) == (TAp c d) = a == c && b == d
_ == _ = False
instance Eq Pred where (Pred s a) == (Pred t b) = s == t && a == b
data Instance = Instance
-- Type, e.g. Int for Eq Int.
Type
-- Dictionary name, e.g. "{Eq Int}"
String
-- Context.
[Pred]
-- Method definitions
(Map String Ast)
data Tycl = Tycl [String] [Instance]
data Neat = Neat
(Map String Tycl)
-- | Top-level definitions
[(String, Ast)]
-- | Typed ASTs, ready for compilation, including ADTs and methods,
-- e.g. (==), (Eq a => a -> a -> Bool, select-==)
[(String, (Qual, Ast))]
-- | Data constructor table.
(Map String [Constr])
-- | FFI declarations.
[(String, Type)]
-- | Exports.
[(String, String)]
-- | Module imports.
[String]
patVars = \case
PatLit _ -> []
PatVar s m -> s : maybe [] patVars m
PatCon _ args -> concat $ patVars <$> args
fvPro bound expr = case expr of
V s | not (elem s bound) -> [s]
A x y -> fvPro bound x `union` fvPro bound y
L s t -> fvPro (s:bound) t
Pa vsts -> foldr union [] $ map (\(vs, t) -> fvPro (concatMap patVars vs ++ bound) t) vsts
_ -> []
beta s a t = case t of
E _ -> t
V v -> if s == v then a else t
A x y -> A (beta s a x) (beta s a y)
L v u -> if s == v then t else L v $ beta s a u
instance Show Type where
showsPrec _ = \case
TC s -> (s++)
TV s -> (s++)
TAp (TAp (TC "->") a) b -> showParen True $ shows a . (" -> "++) . shows b
TAp a b -> showParen True $ shows a . (' ':) . shows b
instance Show Pred where
showsPrec _ (Pred s t) = (s++) . (' ':) . shows t . (" => "++)
instance Show Qual where
showsPrec _ (Qual ps t) = foldr (.) id (map shows ps) . shows t
instance Show Extra where
showsPrec _ = \case
Basic s -> (s++)
Const i -> shows i
ChrCon c -> shows c
StrCon s -> shows s
Link im s _ -> (im++) . ('.':) . (s++)
instance Show Pat where
showsPrec _ = \case
PatLit e -> shows e
PatVar s mp -> (s++) . maybe id ((('@':) .) . shows) mp
PatCon s ps -> (s++) . foldr (.) id (((' ':) .) . shows <$> ps)
showVar s@(h:_) = showParen (elem h ":!#$%&*+./<=>?@\\^|-~") (s++)
instance Show Ast where
showsPrec prec = \case
E e -> shows e
V s -> showVar s
A x y -> showParen (1 <= prec) $ shows x . (' ':) . showsPrec 1 y
L s t -> showParen True $ ('\\':) . (s++) . (" -> "++) . shows t
Pa vsts -> ('\\':) . showParen True (foldr (.) id $ intersperse (';':) $ map (\(vs, t) -> foldr (.) id (intersperse (' ':) $ map (showParen True . shows) vs) . (" -> "++) . shows t) vsts)
Proof p -> ("{Proof "++) . shows p . ("}"++)
typedAsts (Neat _ _ tas _ _ _ _) = tas
typeclasses (Neat tcs _ _ _ _ _ _) = tcs
dataCons (Neat _ _ _ dcs _ _ _) = dcs
typeVars = \case
TC _ -> []
TV v -> [v]
TAp x y -> typeVars x `union` typeVars y
depthFirstSearch = (foldl .) \relation st@(visited, sequence) vertex ->
if vertex `elem` visited then st else second (vertex:)
$ depthFirstSearch relation (vertex:visited, sequence) (relation vertex)
spanningSearch = (foldl .) \relation st@(visited, setSequence) vertex ->
if vertex `elem` visited then st else second ((:setSequence) . (vertex:))
$ depthFirstSearch relation (vertex:visited, []) (relation vertex)
scc ins outs = spanning . depthFirst where
depthFirst = snd . depthFirstSearch outs ([], [])
spanning = snd . spanningSearch ins ([], [])
-- Record fields.
-- Deriving `Eq`, `Show`.
module Parser where
import Base
import Ast
import Map
-- Parser.
data ParserState = ParserState
[(Char, (Int, Int))]
String
[Int]
(Map String (Int, Assoc))
readme (ParserState x _ _ _) = x
landin (ParserState _ x _ _) = x
indents (ParserState _ _ x _) = x
precs (ParserState _ _ _ x) = x
putReadme x (ParserState _ a b c) = ParserState x a b c
putLandin x (ParserState a _ b c) = ParserState a x b c
modIndents f (ParserState a b x c) = ParserState a b (f x) c
data Parser a = Parser (ParserState -> Either String (a, ParserState))
getParser (Parser p) = p
instance Functor Parser where fmap f x = pure f <*> x
instance Applicative Parser where
pure x = Parser \inp -> Right (x, inp)
(Parser f) <*> (Parser x) = Parser \inp -> do
(fun, t) <- f inp
(arg, u) <- x t
pure (fun arg, u)
instance Monad Parser where
return = pure
(Parser x) >>= f = Parser \inp -> do
(a, t) <- x inp
getParser (f a) t
instance Alternative Parser where
empty = bad ""
x <|> y = Parser \inp -> either (const $ getParser y inp) Right $ getParser x inp
getPrecs = Parser \st -> Right (precs st, st)
putPrecs ps = Parser \(ParserState a b c _) -> Right ((), ParserState a b c ps)
notFollowedBy p = do
saved <- Parser \pasta -> Right (pasta, pasta)
ret <- p *> pure (bad "") <|> pure (pure ())
Parser \_ -> Right ((), saved)
ret
parse f str = getParser f $ ParserState (rowcol str (1, 1)) [] [] $ singleton ":" (5, RAssoc) where
rowcol s rc = case s of
[] -> []
h:t -> (h, rc) : rowcol t (advanceRC (ord h) rc)
advanceRC n (r, c)
| n `elem` [10, 11, 12, 13] = (r + 1, 1)
| n == 9 = (r, (c + 8)`mod`8)
| True = (r, c + 1)
indentOf pasta = case readme pasta of
[] -> 1
(_, (_, c)):_ -> c
ins c pasta = putLandin (c:landin pasta) pasta
angle n pasta = case indents pasta of
m:ms | m == n -> ins ';' pasta
| n + 1 <= m -> ins '}' $ angle n $ modIndents tail pasta
_ -> pasta
curly n pasta = case indents pasta of
m:ms | m + 1 <= n -> ins '{' $ modIndents (n:) pasta
[] | 1 <= n -> ins '{' $ modIndents (n:) pasta
_ -> ins '{' . ins '}' $ angle n pasta
sat f = Parser \pasta -> case landin pasta of
c:t -> if f c then Right (c, putLandin t pasta) else Left "unsat"
[] -> case readme pasta of
[] -> case indents pasta of
[] -> Left "EOF"
m:ms | m /= 0 && f '}' -> Right ('}', modIndents tail pasta)
_ -> Left "unsat"
(h, _):t | f h -> let
p' = putReadme t pasta
in case h of
'}' -> case indents pasta of
0:ms -> Right (h, modIndents tail p')
_ -> Left "unsat"
'{' -> Right (h, modIndents (0:) p')
_ -> Right (h, p')
_ -> Left "unsat"
char c = sat (c ==)
rawSat f = Parser \pasta -> case readme pasta of
[] -> Left "EOF"
(h, _):t -> if f h then Right (h, putReadme t pasta) else Left "unsat"
eof = Parser \pasta -> case pasta of
ParserState [] [] _ _ -> Right ((), pasta)
_ -> badpos pasta "want eof"
comment = rawSat ('-' ==) *> some (rawSat ('-' ==)) *>
(rawSat isNewline <|> rawSat (not . isSymbol) *> many (rawSat $ not . isNewline) *> rawSat isNewline) *> pure True
spaces = isNewline <$> rawSat isSpace
whitespace = do
offside <- or <$> many (spaces <|> comment)
Parser \pasta -> Right ((), if offside then angle (indentOf pasta) pasta else pasta)
hexValue d
| d <= '9' = ord d - ord '0'
| d <= 'F' = 10 + ord d - ord 'A'
| d <= 'f' = 10 + ord d - ord 'a'
isNewline c = ord c `elem` [10, 11, 12, 13]
isSymbol = (`elem` "!#$%&*+./<=>?@\\^|-~:")
isSmall c = c <= 'z' && 'a' <= c || c == '_'
small = sat isSmall
large = sat \x -> (x <= 'Z') && ('A' <= x)
hexit = sat \x -> (x <= '9') && ('0' <= x)
|| (x <= 'F') && ('A' <= x)
|| (x <= 'f') && ('a' <= x)
digit = sat \x -> (x <= '9') && ('0' <= x)
decimal = foldl (\n d -> 10*n + ord d - ord '0') 0 <$> some digit
hexadecimal = foldl (\n d -> 16*n + hexValue d) 0 <$> some hexit
nameTailChar = small <|> large <|> digit <|> char '\''
nameTailed p = liftA2 (:) p $ many nameTailChar
escape = char '\\' *> (sat (`elem` "'\"\\") <|> char 'n' *> pure '\n' <|> char '0' *> pure (chr 0) <|> char 'x' *> (chr <$> hexadecimal))
tokOne delim = escape <|> rawSat (delim /=)
charSeq = mapM char
tokChar = between (char '\'') (char '\'') (tokOne '\'')
quoteStr = between (char '"') (char '"') $ many $ many (charSeq "\\&") *> tokOne '"'
quasiquoteStr = charSeq "[r|" *> quasiquoteBody
quasiquoteBody = charSeq "|]" *> pure [] <|> (:) <$> rawSat (const True) <*> quasiquoteBody
tokStr = quoteStr <|> quasiquoteStr
integer = char '0' *> (char 'x' <|> char 'X') *> hexadecimal <|> decimal
literal = lexeme . fmap E $ Const <$> integer <|> ChrCon <$> tokChar <|> StrCon <$> tokStr
varish = lexeme $ nameTailed small
bad s = Parser \pasta -> badpos pasta s
badpos pasta s = Left $ loc $ ": " ++ s where
loc = case readme pasta of
[] -> ("EOF"++)
(_, (r, c)):_ -> ("row "++) . shows r . (" col "++) . shows c
varId = do
s <- varish
if elem s
["export", "case", "class", "data", "default", "deriving", "do", "else", "foreign", "if", "import", "in", "infix", "infixl", "infixr", "instance", "let", "module", "newtype", "of", "then", "type", "where", "_"]
then bad $ "reserved: " ++ s else pure s
varSymish = lexeme $ (:) <$> sat (\c -> isSymbol c && c /= ':') <*> many (sat isSymbol)
varSym = lexeme $ do
s <- varSymish
if elem s ["..", "=", "\\", "|", "<-", "->", "@", "~", "=>"] then bad $ "reserved: " ++ s else pure s
conId = lexeme $ nameTailed large
conSymish = lexeme $ liftA2 (:) (char ':') $ many $ sat isSymbol
conSym = do
s <- conSymish
if elem s [":", "::"] then bad $ "reserved: " ++ s else pure s
special c = lexeme $ sat (c ==)
comma = special ','
semicolon = special ';'
lParen = special '('
rParen = special ')'
lBrace = special '{'
rBrace = special '}'
lSquare = special '['
rSquare = special ']'
backquote = special '`'
lexeme f = f <* whitespace
lexemePrelude = whitespace *>
Parser \pasta -> case getParser (res "module" <|> (:[]) <$> char '{') pasta of
Left _ -> Right ((), curly (indentOf pasta) pasta)
Right _ -> Right ((), pasta)
curlyCheck f = do
Parser \pasta -> Right ((), modIndents (0:) pasta)
r <- f
Parser \pasta -> let pasta' = modIndents tail pasta in case readme pasta of
[] -> Right ((), curly 0 pasta')
('{', _):_ -> Right ((), pasta')
(_, (_, col)):_ -> Right ((), curly col pasta')
pure r
conOf (Constr s _) = s
specialCase (h:_) = '|':conOf h
mkCase t cs = (specialCase cs,
( Qual [] $ arr t $ foldr arr (TV "case") $ map (\(Constr _ sts) -> foldr arr (TV "case") $ snd <$> sts) cs
, E $ Basic "I"))
mkStrs = snd . foldl (\(s, l) u -> ('@':s, s:l)) ("@", [])
scottEncode _ ":" _ = E $ Basic "CONS"
scottEncode vs s ts = foldr L (foldl (\a b -> A a (V b)) (V s) ts) (ts ++ vs)
scottConstr t cs (Constr s sts) = (s,
(Qual [] $ foldr arr t ts , scottEncode (map conOf cs) s $ mkStrs ts))
: [(field, (Qual [] $ t `arr` ft, L s $ foldl A (V s) $ inj $ proj field)) | (field, ft) <- sts, field /= ""]
where
ts = snd <$> sts
proj fd = foldr L (V fd) $ fst <$> sts
inj x = map (\(Constr s' _) -> if s' == s then x else V "undefined") cs
mkAdtDefs t cs = mkCase t cs : concatMap (scottConstr t cs) cs
mkFFIHelper n t acc = case t of
TC s -> acc
TAp (TC "IO") _ -> acc
TAp (TAp (TC "->") x) y -> L (show n) $ mkFFIHelper (n + 1) y $ A (V $ show n) acc
updateDcs cs dcs = foldr (\(Constr s _) m -> insert s cs m) dcs cs
addAdt t cs ders (Neat tycl fs typed dcs ffis ffes ims) = foldr derive ast ders where
ast = Neat tycl fs (mkAdtDefs t cs ++ typed) (updateDcs cs dcs) ffis ffes ims
derive "Eq" = addInstance "Eq" (mkPreds "Eq") t
[("==", Pa $ map eqCase cs
)]
derive "Show" = addInstance "Show" (mkPreds "Show") t
[("showsPrec", L "prec" $ Pa $ map showCase cs
)]
derive der = error $ "bad deriving: " ++ der
prec0 = E $ Const 0
showCase (Constr con args) = let as = show <$> [1..length args]
in ([PatCon con $ mkPatVar "" <$> as], case args of
[] -> A (V "++") (E $ StrCon con)
_ -> case con of
':':_ -> A (A (V "showParen") $ V "True") $ foldr1
(\f g -> A (A (V ".") f) g)
[ A (A (V "showsPrec") prec0) (V "1")
, A (V "++") (E $ StrCon $ ' ':con++" ")
, A (A (V "showsPrec") prec0) (V "2")
]
_ -> A (A (V "showParen") $ A (A (V "<=") prec0) $ V "prec")
$ A (A (V ".") $ A (V "++") (E $ StrCon con))
$ foldr (\f g -> A (A (V ".") f) g) (L "x" $ V "x")
$ map (\a -> A (A (V ".") (A (V ":") (E $ ChrCon ' '))) $ A (A (V "showsPrec") prec0) (V a)) as
)
mkPreds classId = Pred classId . TV <$> typeVars t
mkPatVar pre s = PatVar (pre ++ s) Nothing
eqCase (Constr con args) = let as = show <$> [1..length args]
in ([PatCon con $ mkPatVar "l" <$> as], Pa
[ ([PatCon con $ mkPatVar "r" <$> as], foldr (\x y -> (A (A (V "&&") x) y)) (V "True")
$ map (\n -> A (A (V "==") (V $ "l" ++ n)) (V $ "r" ++ n)) as)
, ([PatVar "_" Nothing], V "False")])
emptyTycl = Tycl [] []
addClass classId v (sigs, defs) (Neat tycl fs typed dcs ffis ffes ims) = let
vars = take (size sigs) $ show <$> [0..]
selectors = zipWith (\var (s, t) -> (s, (Qual [Pred classId v] t,
L "@" $ A (V "@") $ foldr L (V var) vars))) vars $ toAscList sigs
defaults = map (\(s, t) -> if member s sigs then ("{default}" ++ s, t) else error $ "bad default method: " ++ s) $ toAscList defs
Tycl ms is = maybe emptyTycl id $ mlookup classId tycl
tycl' = insert classId (Tycl (keys sigs) is) tycl
in if null ms then Neat tycl' (defaults ++ fs) (selectors ++ typed) dcs ffis ffes ims
else error $ "duplicate class: " ++ classId
addInstance classId ps ty ds (Neat tycl fs typed dcs ffis ffes ims) = let
Tycl ms is = maybe emptyTycl id $ mlookup classId tycl
tycl' = insert classId (Tycl ms $ Instance ty name ps (fromList ds):is) tycl
name = '{':classId ++ (' ':shows ty "}")
in Neat tycl' fs typed dcs ffis ffes ims
addFFI foreignname ourname t (Neat tycl fs typed dcs ffis ffes ims) = let
fn = A (E $ Basic "F") $ E $ Const $ length ffis
in Neat tycl fs ((ourname, (Qual [] t, mkFFIHelper 0 t fn)) : typed) dcs ((foreignname, t):ffis) ffes ims
addDefs ds (Neat tycl fs typed dcs ffis ffes ims) = Neat tycl (ds ++ fs) typed dcs ffis ffes ims
addImport im (Neat tycl fs typed dcs ffis exs ims) = Neat tycl fs typed dcs ffis exs (im:ims)
addExport e f (Neat tycl fs typed dcs ffis ffes ims) = Neat tycl fs typed dcs ffis ((e, f):ffes) ims
parseErrorRule = Parser \pasta -> case indents pasta of
m:ms | m /= 0 -> Right ('}', modIndents tail pasta)
_ -> badpos pasta "missing }"
res w@(h:_) = reservedSeq *> pure w <|> bad ("want \"" ++ w ++ "\"") where
reservedSeq = if elem w ["let", "where", "do", "of"]
then curlyCheck $ lexeme $ charSeq w *> notFollowedBy nameTailChar
else lexeme $ charSeq w *> notFollowedBy (if isSmall h then nameTailChar else sat isSymbol)
paren = between lParen rParen
braceSep f = between lBrace (rBrace <|> parseErrorRule) $ foldr ($) [] <$> sepBy ((:) <$> f <|> pure id) semicolon
maybeFix s x = if elem s $ fvPro [] x then A (V "fix") (L s x) else x
nonemptyTails [] = []
nonemptyTails xs@(x:xt) = xs : nonemptyTails xt
joinIsFail t = A (L "join#" t) (V "fail#")
addLets ls x = foldr triangle x components where
vs = fst <$> ls
ios = foldr (\(s, dsts) (ins, outs) ->
(foldr (\dst -> insertWith union dst [s]) ins dsts, insertWith union s dsts outs))
(Tip, Tip) $ map (\(s, t) -> (s, intersect (fvPro [] t) vs)) ls
components = scc (\k -> maybe [] id $ mlookup k $ fst ios) (\k -> maybe [] id $ mlookup k $ snd ios) vs
triangle names expr = let
tnames = nonemptyTails names
appem vs = foldl1 A $ V <$> vs
suball expr = foldl A (foldr L expr $ init names) $ appem <$> init tnames
redef tns expr = foldr L (suball expr) tns
in foldr (\(x:xt) t -> A (L x t) $ maybeFix x $ redef xt $ maybe undefined joinIsFail $ lookup x ls) (suball expr) tnames
data Assoc = NAssoc | LAssoc | RAssoc
instance Eq Assoc where
NAssoc == NAssoc = True
LAssoc == LAssoc = True
RAssoc == RAssoc = True
_ == _ = False
precOf s precTab = maybe 9 fst $ mlookup s precTab
assocOf s precTab = maybe LAssoc snd $ mlookup s precTab
opFold precTab f x xs = case xs of
[] -> pure x
(op, y):xt -> case find (\(op', _) -> assocOf op precTab /= assocOf op' precTab) xt of
Nothing -> case assocOf op precTab of
NAssoc -> case xt of
[] -> pure $ f op x y
y:yt -> bad "NAssoc repeat"
LAssoc -> pure $ foldl (\a (op, y) -> f op a y) x xs
RAssoc -> pure $ foldr (\(op, y) b -> \e -> f op e (b y)) id xs $ x
Just y -> bad "Assoc clash"
qconop = conSym <|> res ":" <|> between backquote backquote conId
qconsym = conSym <|> res ":"
op = qconsym <|> varSym <|> between backquote backquote (conId <|> varId)
con = conId <|> paren qconsym
var = varId <|> paren varSym
tycon = do
s <- conId
pure $ if s == "String" then TAp (TC "[]") (TC "Char") else TC s
aType =
lParen *>
( rParen *> pure (TC "()")
<|> (foldr1 (TAp . TAp (TC ",")) <$> sepBy1 _type comma) <* rParen)
<|> tycon
<|> TV <$> varId
<|> (lSquare *> (rSquare *> pure (TC "[]") <|> TAp (TC "[]") <$> (_type <* rSquare)))
bType = foldl1 TAp <$> some aType
_type = foldr1 arr <$> sepBy bType (res "->")
fixityDecl w a = do
res w
n <- lexeme integer
os <- sepBy op comma
precs <- getPrecs
putPrecs $ foldr (\o m -> insert o (n, a) m) precs os
fixity = fixityDecl "infix" NAssoc <|> fixityDecl "infixl" LAssoc <|> fixityDecl "infixr" RAssoc
cDecls = first fromList . second fromList . foldr ($) ([], []) <$> braceSep cDecl
cDecl = first . (:) <$> genDecl <|> second . (++) <$> defSemi
genDecl = (,) <$> var <*> (res "::" *> _type)
classDecl = res "class" *> (addClass <$> conId <*> (TV <$> varId) <*> (res "where" *> cDecls))
simpleClass = Pred <$> conId <*> _type
scontext = (:[]) <$> simpleClass <|> paren (sepBy simpleClass comma)
instDecl = res "instance" *>
((\ps cl ty defs -> addInstance cl ps ty defs) <$>
(scontext <* res "=>" <|> pure [])
<*> conId <*> _type <*> (res "where" *> braceDef))
letin = addLets <$> between (res "let") (res "in") braceDef <*> expr
ifthenelse = (\a b c -> A (A (A (V "if") a) b) c) <$>
(res "if" *> expr) <*> (res "then" *> expr) <*> (res "else" *> expr)
listify = foldr (\h t -> A (A (V ":") h) t) (V "[]")
alts = joinIsFail . Pa <$> braceSep ((\x y -> ([x], y)) <$> pat <*> guards "->")
cas = flip A <$> between (res "case") (res "of") expr <*> alts
lamCase = curlyCheck (res "case") *> alts
lam = res "\\" *> (lamCase <|> liftA2 onePat (some apat) (res "->" *> expr))
flipPairize y x = A (A (V ",") x) y
moreCommas = foldr1 (A . A (V ",")) <$> sepBy1 expr comma
thenComma = comma *> ((flipPairize <$> moreCommas) <|> pure (A (V ",")))
parenExpr = (&) <$> expr <*> (((\v a -> A (V v) a) <$> op) <|> thenComma <|> pure id)
rightSect = ((\v a -> L "@" $ A (A (V v) $ V "@") a) <$> (op <|> (:"") <$> comma)) <*> expr
section = lParen *> (parenExpr <* rParen <|> rightSect <* rParen <|> rParen *> pure (V "()"))
maybePureUnit = maybe (V "pure" `A` V "()") id
stmt = (\p x -> Just . A (V ">>=" `A` x) . onePat [p] . maybePureUnit) <$> pat <*> (res "<-" *> expr)
<|> (\x -> Just . maybe x (\y -> (V ">>=" `A` x) `A` (L "_" y))) <$> expr
<|> (\ds -> Just . addLets ds . maybePureUnit) <$> (res "let" *> braceDef)
doblock = res "do" *> (maybePureUnit . foldr ($) Nothing <$> braceSep stmt)
compQual =
(\p xs e -> A (A (V "concatMap") $ onePat [p] e) xs)
<$> pat <*> (res "<-" *> expr)
<|> (\b e -> A (A (A (V "if") b) e) $ V "[]") <$> expr
<|> addLets <$> (res "let" *> braceDef)
sqExpr = between lSquare rSquare $
((&) <$> expr <*>
( res ".." *>
( (\hi lo -> (A (A (V "enumFromTo") lo) hi)) <$> expr
<|> pure (A (V "enumFrom"))
)
<|> res "|" *>
((. A (V "pure")) . foldr (.) id <$> sepBy1 compQual comma)
<|> (\t h -> listify (h:t)) <$> many (comma *> expr)
)
)
<|> pure (V "[]")
fbind = A <$> (V <$> var) <*> (res "=" *> expr)
fBinds v = (do
fbs <- between lBrace rBrace $ sepBy1 fbind comma
pure $ A (E $ Basic "{=") $ foldr A (E $ Basic "=}") $ v:fbs
) <|> pure v
atom = ifthenelse <|> doblock <|> letin <|> sqExpr <|> section
<|> cas <|> lam <|> (paren comma *> pure (V ","))
<|> V <$> (con <|> var) <|> literal
>>= fBinds
aexp = foldl1 A <$> some atom
withPrec precTab n p = p >>= \s ->
if n == precOf s precTab then pure s else Parser $ const $ Left ""
exprP n = if n <= 9
then getPrecs >>= \precTab
-> exprP (succ n) >>= \a
-> many ((,) <$> withPrec precTab n op <*> exprP (succ n)) >>= \as
-> opFold precTab (\op x y -> A (A (V op) x) y) a as
else aexp
expr = exprP 0
gcon = conId <|> paren (qconsym <|> (:"") <$> comma) <|> (lSquare *> rSquare *> pure "[]")
apat = PatVar <$> var <*> (res "@" *> (Just <$> apat) <|> pure Nothing)
<|> flip PatVar Nothing <$> (res "_" *> pure "_")
<|> flip PatCon [] <$> gcon
<|> PatLit <$> literal
<|> foldr (\h t -> PatCon ":" [h, t]) (PatCon "[]" [])
<$> between lSquare rSquare (sepBy pat comma)
<|> paren (foldr1 pairPat <$> sepBy1 pat comma <|> pure (PatCon "()" []))
where pairPat x y = PatCon "," [x, y]
binPat f x y = PatCon f [x, y]
patP n = if n <= 9
then getPrecs >>= \precTab
-> patP (succ n) >>= \a
-> many ((,) <$> withPrec precTab n qconop <*> patP (succ n)) >>= \as
-> opFold precTab binPat a as
else PatCon <$> gcon <*> many apat <|> apat
pat = patP 0
maybeWhere p = (&) <$> p <*> (res "where" *> (addLets <$> braceDef) <|> pure id)
guards s = maybeWhere $ res s *> expr <|> foldr ($) (V "join#") <$> some ((\x y -> case x of
V "True" -> \_ -> y
_ -> A (A (A (V "if") x) y)
) <$> (res "|" *> expr) <*> (res s *> expr))
onePat vs x = joinIsFail $ Pa [(vs, x)]
defOnePat vs x = Pa [(vs, x)]
opDef x f y rhs = [(f, defOnePat [x, y] rhs)]
leftyPat p expr = case pvars of
[] -> []
(h:t) -> let gen = '@':h in
(gen, expr):map (\v -> (v, A (Pa [([p], V v)]) $ V gen)) pvars
where
pvars = filter (/= "_") $ patVars p
def = liftA2 (\l r -> [(l, r)]) var (liftA2 defOnePat (many apat) $ guards "=")
<|> (pat >>= \x -> opDef x <$> varSym <*> pat <*> guards "=" <|> leftyPat x <$> guards "=")
coalesce = \case
[] -> []
h@(s, x):t -> case t of
[] -> [h]
(s', x'):t' -> let
f (Pa vsts) (Pa vsts') = Pa $ vsts ++ vsts'
f _ _ = error "bad multidef"
in if s == s' then coalesce $ (s, f x x'):t' else h:coalesce t
defSemi = coalesce . concat <$> sepBy1 def (some semicolon)
braceDef = concat <$> braceSep defSemi
simpleType c vs = foldl TAp (TC c) (map TV vs)
conop = conSym <|> between backquote backquote conId
fieldDecl = (\vs t -> map (, t) vs) <$> sepBy1 var comma <*> (res "::" *> _type)
constr = (\x c y -> Constr c [("", x), ("", y)]) <$> aType <*> conop <*> aType
<|> Constr <$> conId <*>
( concat <$> between lBrace rBrace (fieldDecl `sepBy` comma)
<|> map ("",) <$> many aType)
dclass = conId
_deriving = (res "deriving" *> ((:[]) <$> dclass <|> paren (dclass `sepBy` comma))) <|> pure []
adt = addAdt <$> between (res "data") (res "=") (simpleType <$> conId <*> many varId) <*> sepBy constr (res "|") <*> _deriving
impDecl = addImport <$> (res "import" *> conId)
topdecls = braceSep
$ adt
<|> classDecl
<|> instDecl
<|> res "foreign" *>
( res "import" *> var *> (addFFI <$> lexeme tokStr <*> var <*> (res "::" *> _type))
<|> res "export" *> var *> (addExport <$> lexeme tokStr <*> var)
)
<|> addDefs <$> defSemi
<|> fixity *> pure id
<|> impDecl
haskell = between lexemePrelude eof $ some $ (,) <$> (res "module" *> conId <* res "where" <|> pure "Main") <*> topdecls
parseProgram s = fst <$> parse haskell s
-- Record fields.
module Typer where
import Base
import Map
import Ast
import Parser
import Unify
app01 s x y = maybe (A (L s x) y) snd $ go x where
go expr = case expr of
E _ -> Just (False, expr)
V v -> Just $ if s == v then (True, y) else (False, expr)
A l r -> do
(a, l') <- go l
(b, r') <- go r
if a && b then Nothing else pure (a || b, A l' r')
L v t -> if v == s then Just (False, expr) else second (L v) <$> go t
optiApp t = case t of
A x y -> let
x' = optiApp x
y' = optiApp y
in case x' of
L s v -> app01 s v y'
_ -> A x' y'
L s x -> L s (optiApp x)
_ -> t
-- Pattern compiler.
findCon dcs s = foldr (<|>) Nothing $ mlookup s <$> dcs
rewritePats dcs = \case
[] -> pure $ V "join#"
vsxs@((as0, _):_) -> case as0 of
[] -> pure $ foldr1 (A . L "join#") $ snd <$> vsxs
_ -> do
let k = length as0
n <- get
put $ n + k
let vs@(vh:vt) = take k $ (`shows` "#") <$> [n..]
cs <- flip mapM vsxs \(a:at, x) -> (a,) <$> foldM (\b (p, v) -> rewriteCase dcs v Tip [(p, b)]) x (zip at vt)
flip (foldr L) vs <$> rewriteCase dcs vh Tip cs
patEq lit b x y = A (L "join#" $ A (A (A (V "if") (A (A (V "==") lit) b)) x) $ V "join#") y
rewriteCase dcs caseVar tab = \case
[] -> flush $ V "join#"
((v, x):rest) -> go v x rest
where
rec = rewriteCase dcs caseVar
go v x rest = case v of
PatLit lit -> patEq lit (V caseVar) x <$> rec Tip rest >>= flush
PatVar s m -> let x' = beta s (V caseVar) x in case m of
Nothing -> A (L "join#" x') <$> rec Tip rest >>= flush
Just v' -> go v' x' rest
PatCon con args -> rec (insertWith (flip (.)) con ((args, x):) tab) rest
flush onFail = case toAscList tab of
[] -> pure onFail
-- TODO: Check rest of `tab` lies in cs.
(firstC, _):_ -> do
let cs = maybe undefined id $ findCon dcs firstC
jumpTable <- mapM (\(Constr s ts) -> case mlookup s tab of
Nothing -> pure $ foldr L (V "join#") $ const "_" <$> ts
Just f -> rewritePats dcs $ f []
) cs
pure $ A (L "join#" $ foldl A (A (V $ specialCase cs) $ V caseVar) jumpTable) onFail
findField dcs f = case [(con, fields) | tab <- dcs, (_, cons) <- toAscList tab, Constr con fields <- cons, (f', _) <- fields, f == f'] of
[] -> error $ "no such field: " ++ f
h:_ -> h
resolveFieldBinds dcs t = go t where
go t = case t of
E _ -> t
V _ -> t
A (E (Basic "{=")) (A rawExpr fbsAst) -> let
expr = go rawExpr
fromAst t = case t of
A (A (V f) body) rest -> (f, go body):fromAst rest
E (Basic "=}") -> []
fbs@((firstField, _):_) = fromAst fbsAst
(con, fields) = findField dcs firstField
cs = maybe undefined id $ findCon dcs con
newValue = foldl A (V con) [maybe (V $ "[old]"++f) id $ lookup f fbs | (f, _) <- fields]
initValue = foldl A expr [maybe (V "undefined") id $ lookup f fbs | (f, _) <- fields]
updater = foldr L newValue $ ("[old]"++) . fst <$> fields
inj x = map (\(Constr con' _) -> if con' == con then x else V "undefined") cs
allPresent = all (`elem` (fst <$> fields)) $ fst <$> fbs
isCon = case expr of
V (h:_) -> 'A' <= h && h <= 'Z'
_ -> False
in if allPresent
then if isCon then initValue else foldl A (A (V $ specialCase cs) expr) $ inj updater
else error "bad fields in update"
A x y -> A (go x) (go y)
L s x -> L s $ go x
secondM f (a, b) = (a,) <$> f b
patternCompile dcs t = optiApp $ resolveFieldBinds dcs $ evalState (go t) 0 where
go t = case t of
E _ -> pure t
V _ -> pure t
A x y -> liftA2 A (go x) (go y)
L s x -> L s <$> go x
Pa vsxs -> mapM (secondM go) vsxs >>= rewritePats dcs
-- Type inference.
instantiate' t n tab = case t of
TC s -> ((t, n), tab)
TV s -> case lookup s tab of
Nothing -> let va = TV $ show n in ((va, n + 1), (s, va):tab)
Just v -> ((v, n), tab)
TAp x y -> let
((t1, n1), tab1) = instantiate' x n tab
((t2, n2), tab2) = instantiate' y n1 tab1
in ((TAp t1 t2, n2), tab2)
instantiatePred (Pred s t) ((out, n), tab) = first (first ((:out) . Pred s)) (instantiate' t n tab)
instantiate (Qual ps t) n = first (Qual ps1) $ fst $ instantiate' t n1 tab where
((ps1, n1), tab) = foldr instantiatePred (([], n), []) ps
proofApply sub a = case a of
Proof (Pred cl ty) -> Proof (Pred cl $ apply sub ty)
A x y -> A (proofApply sub x) (proofApply sub y)
L s t -> L s $ proofApply sub t
_ -> a
typeAstSub sub (t, a) = (apply sub t, proofApply sub a)
infer typed loc ast csn@(cs, n) = case ast of
E x -> Right $ case x of
Const _ -> ((TC "Int", ast), csn)
ChrCon _ -> ((TC "Char", ast), csn)
StrCon _ -> ((TAp (TC "[]") (TC "Char"), ast), csn)
Link im s q -> insta q
V s -> maybe (Left $ "undefined: " ++ s) Right
$ (\t -> ((t, ast), csn)) <$> lookup s loc
<|> insta . fst <$> mlookup s typed
A x y -> infer typed loc x (cs, n + 1) >>=
\((tx, ax), csn1) -> infer typed loc y csn1 >>=
\((ty, ay), (cs2, n2)) -> unify tx (arr ty va) cs2 >>=
\cs -> Right ((va, A ax ay), (cs, n2))
L s x -> first (\(t, a) -> (arr va t, L s a)) <$> infer typed ((s, va):loc) x (cs, n + 1)
where
va = TV $ show n
insta ty = ((ty1, foldl A ast (map Proof preds)), (cs, n1))
where (Qual preds ty1, n1) = instantiate ty n
findInstance tycl qn@(q, n) p@(Pred cl ty) insts = case insts of
[] -> let v = '*':show n in Right (((p, v):q, n + 1), V v)
(modName, Instance h name ps _):rest -> case match h ty of
Nothing -> findInstance tycl qn p rest
Just subs -> foldM (\(qn1, t) (Pred cl1 ty1) -> second (A t)
<$> findProof tycl (Pred cl1 $ apply subs ty1) qn1) (qn, if modName == "" then V name else E $ Link modName name undefined) ps
findProof tycl pred@(Pred classId t) psn@(ps, n) = case lookup pred ps of
Nothing -> findInstance tycl psn pred $ tycl classId
Just s -> Right (psn, V s)
prove tycl psn a = case a of
Proof pred -> findProof tycl pred psn
A x y -> prove tycl psn x >>= \(psn1, x1) ->
second (A x1) <$> prove tycl psn1 y
L s t -> second (L s) <$> prove tycl psn t
_ -> Right (psn, a)
data Dep a = Dep ([String] -> Either String ([String], a))
instance Functor Dep where
fmap f = \(Dep mf) -> Dep \g -> do
(g', x) <- mf g
pure (g', f x)
instance Applicative Dep where
pure x = Dep \g -> Right (g, x)
(Dep mf) <*> (Dep mx) = Dep \g -> do
(g', f) <- mf g
(g'', x) <- mx g'
pure (g'', f x)
addDep s = Dep \deps -> Right (if s `elem` deps then deps else s : deps, ())
badDep s = Dep $ const $ Left s
runDep (Dep f) = f []
astLink typed locals imps mods ast = runDep $ go [] ast where
go bound ast = case ast of
V s
| elem s bound -> pure ast
| member s locals -> case findImportSym imps mods s of
[] -> (if member s typed then pure () else addDep s) *> pure ast
_ -> badDep $ "ambiguous: " ++ s
| True -> case findImportSym imps mods s of
[] -> badDep $ "missing: " ++ s
[(im, t)] -> pure $ E $ Link im s t
_ -> badDep $ "ambiguous: " ++ s
A x y -> A <$> go bound x <*> go bound y
L s t -> L s <$> go (s:bound) t
_ -> pure ast
forFree cond f bound t = case t of
E _ -> t
V s -> if (not $ s `elem` bound) && cond s then f t else t
A x y -> A (rec bound x) (rec bound y)
L s t' -> L s $ rec (s:bound) t'
where rec = forFree cond f
inferno tycl typed defmap syms = let
loc = zip syms $ TV . (' ':) <$> syms
principal (acc, (subs, n)) s = do
expr <- maybe (Left $ "missing: " ++ s) Right (mlookup s defmap)
((t, a), (ms, n1)) <- infer typed loc expr (subs, n)
cs <- unify (TV (' ':s)) t ms
Right ((s, (t, a)):acc, (cs, n1))
gatherPreds (acc, psn) (s, (t, a)) = do
(psn, a) <- prove tycl psn a
pure ((s, (t, a)):acc, psn)
in do
(stas, (soln, _)) <- foldM principal ([], (Tip, 0)) syms
stas <- pure $ second (typeAstSub soln) <$> stas
(stas, (ps, _)) <- foldM gatherPreds ([], ([], 0)) $ second (typeAstSub soln) <$> stas
let
preds = fst <$> ps
dicts = snd <$> ps
applyDicts (s, (t, a)) = (s, (Qual preds t,
foldr L (forFree (`elem` syms) (\t -> foldl A t $ V <$> dicts) [] a) dicts))
pure $ map applyDicts stas
findImportSym imps mods s = concat [maybe [] (\(t, _) -> [(im, t)]) $ mlookup s qas | im <- imps, let qas = fst $ mods ! im]
inferDefs tycl defs typed = do
let
insertUnique m (s, (_, t)) = case mlookup s m of
Nothing -> case mlookup s typed of
Nothing -> Right $ insert s t m
_ -> Left $ "reserved: " ++ s
_ -> Left $ "duplicate: " ++ s
addEdges (sym, (deps, _)) (ins, outs) = (foldr (\dep es -> insertWith union dep [sym] es) ins deps, insertWith union sym deps outs)
graph = foldr addEdges (Tip, Tip) defs
defmap <- foldM insertUnique Tip defs
let
ins k = maybe [] id $ mlookup k $ fst graph
outs k = maybe [] id $ mlookup k $ snd graph
inferComponent typed syms = foldr (uncurry insert) typed <$> inferno tycl typed defmap syms
foldM inferComponent typed $ scc ins outs $ keys defmap
dictVars ps n = (zip ps $ map (('*':) . show) [n..], n + length ps)
inferTypeclasses tycl typeOfMethod typed dcs linker ienv = foldM perClass typed $ toAscList ienv where
perClass typed (classId, Tycl sigs insts) = foldM perInstance typed insts where
perInstance typed (Instance ty name ps idefs) = do
let
dvs = map snd $ fst $ dictVars ps 0
perMethod s = do
let rawExpr = maybe (V $ "{default}" ++ s) id $ mlookup s idefs
expr <- snd <$> linker (patternCompile dcs rawExpr)
(ta, (sub, n)) <- either (Left . (name++) . (" "++) . (s++) . (": "++)) Right
$ infer typed [] expr (Tip, 0)
let
(tx, ax) = typeAstSub sub ta
-- e.g. qc = Eq a => a -> a -> Bool
-- We instantiate: Eq a1 => a1 -> a1 -> Bool.
qc = typeOfMethod s
(Qual [Pred _ headT] tc, n1) = instantiate qc n
-- Mix the predicates `ps` with the type of `headT`, applying a
-- substitution such as (a1, [a]) so the variable names match.
-- e.g. Eq a => [a] -> [a] -> Bool
Just subc = match headT ty
(Qual ps2 t2, n2) = instantiate (Qual ps $ apply subc tc) n1
case match tx t2 of
Nothing -> Left "class/instance type conflict"
Just subx -> do
((ps3, _), tr) <- prove tycl (dictVars ps2 0) (proofApply subx ax)
if length ps2 /= length ps3
then Left $ ("want context: "++) . (foldr (.) id $ shows . fst <$> ps3) $ name
else pure tr
ms <- mapM perMethod sigs
pure $ insert name (Qual [] $ TC "DICTIONARY", flip (foldr L) dvs $ L "@" $ foldl A (V "@") ms) typed
primAdts =
[ (TC "()", [Constr "()" []])
, (TC "Bool", [Constr "True" [], Constr "False" []])
, (TAp (TC "[]") (TV "a"), [Constr "[]" [], Constr ":" $ map ("",) [TV "a", TAp (TC "[]") (TV "a")]])
, (TAp (TAp (TC ",") (TV "a")) (TV "b"), [Constr "," $ map ("",) [TV "a", TV "b"]])
]
prims = let
ro = E . Basic
dyad s = TC s `arr` (TC s `arr` TC s)
bin s = A (ro "Q") (ro s)
in map (second (first $ Qual [])) $
[ ("intEq", (arr (TC "Int") (arr (TC "Int") (TC "Bool")), bin "EQ"))
, ("intLE", (arr (TC "Int") (arr (TC "Int") (TC "Bool")), bin "LE"))
, ("charEq", (arr (TC "Char") (arr (TC "Char") (TC "Bool")), bin "EQ"))
, ("charLE", (arr (TC "Char") (arr (TC "Char") (TC "Bool")), bin "LE"))
, ("fix", (arr (arr (TV "a") (TV "a")) (TV "a"), ro "Y"))
, ("if", (arr (TC "Bool") $ arr (TV "a") $ arr (TV "a") (TV "a"), ro "I"))
, ("chr", (arr (TC "Int") (TC "Char"), ro "I"))
, ("ord", (arr (TC "Char") (TC "Int"), ro "I"))
, ("ioBind", (arr (TAp (TC "IO") (TV "a")) (arr (arr (TV "a") (TAp (TC "IO") (TV "b"))) (TAp (TC "IO") (TV "b"))), ro "C"))
, ("ioPure", (arr (TV "a") (TAp (TC "IO") (TV "a")), ro "V"))
, ("primitiveError", (arr (TAp (TC "[]") (TC "Char")) (TV "a"), ro "ERR"))
, ("newIORef", (arr (TV "a") (TAp (TC "IO") (TAp (TC "IORef") (TV "a"))), ro "NEWREF"))
, ("readIORef", (arr (TAp (TC "IORef") (TV "a")) (TAp (TC "IO") (TV "a")),
A (ro "T") (ro "READREF")))
, ("writeIORef", (arr (TAp (TC "IORef") (TV "a")) (arr (TV "a") (TAp (TC "IO") (TC "()"))),
A (A (ro "R") (ro "WRITEREF")) (ro "B")))
, ("exitSuccess", (TAp (TC "IO") (TV "a"), ro "END"))
, ("unsafePerformIO", (arr (TAp (TC "IO") (TV "a")) (TV "a"), A (A (ro "C") (A (ro "T") (ro "END"))) (ro "K")))
, ("join#", (TV "a", A (V "unsafePerformIO") (V "exitSuccess")))
, ("fail#", (TV "a", A (V "unsafePerformIO") (V "exitSuccess")))
]
++ map (\(s, v) -> (s, (dyad "Int", bin v)))
[ ("intAdd", "ADD")
, ("intSub", "SUB")
, ("intMul", "MUL")
, ("intDiv", "DIV")
, ("intMod", "MOD")
, ("intQuot", "DIV")
, ("intRem", "MOD")
]
neatImportPrim = Neat Tip [] [] Tip [] [] ["#"]
tabulateModules mods = foldM ins Tip mods where
go = foldr ($) neatImportPrim
ins tab (k, prog) = case mlookup k tab of
Nothing -> Right $ insert k (go prog) tab
Just _ -> Left $ "duplicate module: " ++ k
inferModule tab acc name = case mlookup name acc of
Nothing -> do
let
Neat rawIenv defs typedList adtTab ffis ffes imps = tab ! name
typed = fromList typedList
fillSigs (cl, Tycl sigs is) = (cl,) $ case sigs of
[] -> Tycl (findSigs cl) is
_ -> Tycl sigs is
findSigs cl = maybe (error $ "no sigs: " ++ cl) id $ find (not . null) [maybe [] (\(Tycl sigs _) -> sigs) $ mlookup cl $ typeclasses (tab ! im) | im <- imps]
ienv = fromList $ fillSigs <$> toAscList rawIenv
locals = fromList $ map (, ()) $ (fst <$> typedList) ++ (fst <$> defs)
insts im (Tycl _ is) = (im,) <$> is
classes im = if im == "" then ienv else typeclasses $ tab ! im
tycl classId = concat [maybe [] (insts im) $ mlookup classId $ classes im | im <- "":imps]
dcs = adtTab : map (dataCons . (tab !)) imps
typeOfMethod s = maybe undefined id $ foldr (<|>) (fst <$> mlookup s typed) [fmap fst $ lookup s $ typedAsts $ tab ! im | im <- imps]
acc' <- foldM (inferModule tab) acc imps
let linker = astLink typed locals imps acc'
depdefs <- mapM (\(s, t) -> (s,) <$> linker (patternCompile dcs t)) defs
typed <- inferDefs tycl depdefs typed
typed <- inferTypeclasses tycl typeOfMethod typed dcs linker ienv
Right $ insert name (typed, (ffis, ffes)) acc'
Just _ -> Right acc
untangle s = do
tab <- insert "#" neatPrim <$> (parseProgram s >>= tabulateModules)
foldM (inferModule tab) Tip $ keys tab
neatPrim = foldr (\(a, b) -> addAdt a b []) (Neat Tip [] prims Tip [] [] []) primAdts
We replace a list with a Map for faster type unification.
-- Replace list with map.
module Unify where
import Base
import Map
import Ast
apply sub t = case t of
TC v -> t
TV v -> maybe t id $ mlookup v sub
TAp a b -> TAp (apply sub a) (apply sub b)
(@@) s1 s2 = foldr (\(k, v) m -> insert k v m) (apply s1 <$> s2) $ toAscList s1
occurs s t = case t of
TC v -> False
TV v -> s == v
TAp a b -> occurs s a || occurs s b
varBind s t = case t of
TC v -> Right $ singleton s t
TV v -> Right $ if v == s then Tip else singleton s t
TAp a b -> if occurs s t then Left "occurs check" else Right $ singleton s t
ufail t u = Left $ ("unify fail: "++) . shows t . (" vs "++) . shows u $ ""
mgu t u = case t of
TC a -> case u of
TC b -> if a == b then Right Tip else ufail t u
TV b -> varBind b t
TAp a b -> ufail t u
TV a -> varBind a u
TAp a b -> case u of
TC b -> ufail t u
TV b -> varBind b t
TAp c d -> mgu a c >>= unify b d
unify a b s = (@@ s) <$> mgu (apply s a) (apply s b)
merge s1 s2 = foldM go s2 $ toAscList s1 where
go subs (v, t) = case mlookup v s2 of
Nothing -> Just $ insert v t subs
Just _ | apply s1 (TV v) == apply s2 (TV v) -> Just subs
| True -> Nothing
match h t = case h of
TC a -> case t of
TC b | a == b -> Just Tip
_ -> Nothing
TV a -> Just $ singleton a t
TAp a b -> case t of
TAp c d -> case match a c of
Nothing -> Nothing
Just ac -> case match b d of
Nothing -> Nothing
Just bd -> merge ac bd
_ -> Nothing
Recall we require a fixity declaration to precede the use of its corresponding operator, which forces us to concatenate module sources in a particular order. We remove this wart by adding a new phase. Once done, not only may we paste together modules in any order, but we may also declare fixities anywhere within a module.
During parsing, operators have the same precedence. When a chain of two or more
appear in a row, we abuse the syntax tree to store them in a right-associative
list, for example: [1 + 2, * 3, - 4, + 5].
For patterns, we use the list field of a PatCon value; a made-up data
constructor "{+" indicates the beginning of such a list. Expressions are
clumsier; we bookend chains with L "(" and V ")", and fashion a list out of
A and V nodes.
Later, once all fixity declarations are known, we traverse the syntax tree, and
we re-associate each specially marked infix chain. The algorithm starts with
the first binary infix expression, that is, two operands and one operator such
as 1 + 2. For each operator and operand we add on the right, we walk down the
right spine of the current syntax tree until we reach a node of higher
precedence; leaf nodes are considered to have maximum precedence. Then we
insert the operator and operand at this point. We also check for illegal infix
operator conflicts.
The code is messy due to a couple of wrinkles. Firstly, we have two distinct ad hoc representations of lists for holding infix chains. Secondly, we temporarily store the AST being reshaped in one-off tree structures.
However, we’re still cheating: we maintain one giant fixity declaration table for all operators across all modules, which relies on operators being distinct. Also, we only allow top-level fixity declarations. We could add support for scoped fixity declarations with yet more ad hoc encodings that we later use to create scoped fixity lookup tables that override the global ones.
We fix the problem with foreign imports across multiple modules. In the lone-module days, we numbered the imports as we parsed the source. Now, the numbering must be consistent across all modules.
In the spirit of incremental compilation, we replace the number of an import with its name in the syntax tree, which we map to a number during our code generation that corresponds to linking.
We reuse the Link data constructor for this. The special {foreign} module
indicates the function name is foreign.
We also check for name conflicts among foreign imports and exports.
We take advantage of our new ability to derive Eq and Show instances,
and also name the fields of the Neat data type.
The Haskell report describes layout parsing as a separate phase, and we had followed this closely in our first implementation of the feature.
It turns out to be less code to inline the various bits and pieces of the layout logic in the parser. We do wind up with a touch more complexity, as comments and whitespace must be parsed in a different code path, but it’s manageable.
For parsing one of let where do of, and also \case since we act as if the
LambdaCase is enabled, the curlyCheck function temporarily disables the
angle-bracket indentation rule by placing an indent value of 0 at the head of
the indents list.
-- Use `deriving`.
-- Change `isEOF` and `getChar` to behave more like Haskell's.
module Base where
infixr 9 .
infixl 7 * , `div` , `mod`
infixl 6 + , -
infixr 5 ++
infixl 4 <*> , <$> , <* , *>
infix 4 == , /= , <= , < , >= , >
infixl 3 && , <|>
infixl 2 ||
infixl 1 >> , >>=
infixr 1 =<<
infixr 0 $
class Functor f where fmap :: (a -> b) -> f a -> f b
class Applicative f where
pure :: a -> f a
(<*>) :: f (a -> b) -> f a -> f b
class Monad m where
return :: a -> m a
(>>=) :: m a -> (a -> m b) -> m b
(<$>) = fmap
liftA2 f x y = f <$> x <*> y
(>>) f g = f >>= \_ -> g
(=<<) = flip (>>=)
class Eq a where (==) :: a -> a -> Bool
instance Eq () where () == () = True
instance Eq Bool where
True == True = True
False == False = True
_ == _ = False
instance (Eq a, Eq b) => Eq (a, b) where
(a1, b1) == (a2, b2) = a1 == a2 && b1 == b2
instance Eq a => Eq [a] where
xs == ys = case xs of
[] -> case ys of
[] -> True
_ -> False
x:xt -> case ys of
[] -> False
y:yt -> x == y && xt == yt
instance Eq Int where (==) = intEq
instance Eq Char where (==) = charEq
($) f x = f x
id x = x
const x y = x
flip f x y = f y x
(&) x f = f x
class Ord a where
(<=) :: a -> a -> Bool
x <= y = case compare x y of
LT -> True
EQ -> True
GT -> False
compare :: a -> a -> Ordering
compare x y = if x <= y then if y <= x then EQ else LT else GT
instance Ord Int where (<=) = intLE
instance Ord Char where (<=) = charLE
data Ordering = LT | GT | EQ deriving (Eq, Show)
instance Ord a => Ord [a] where
xs <= ys = case xs of
[] -> True
x:xt -> case ys of
[] -> False
y:yt -> if x <= y then if y <= x then xt <= yt else True else False
compare xs ys = case xs of
[] -> case ys of
[] -> EQ
_ -> LT
x:xt -> case ys of
[] -> GT
y:yt -> if x <= y then if y <= x then compare xt yt else LT else GT
data Maybe a = Nothing | Just a deriving (Eq, Show)
data Either a b = Left a | Right b deriving (Eq, Show)
fst (x, y) = x
snd (x, y) = y
uncurry f (x, y) = f x y
first f (x, y) = (f x, y)
second f (x, y) = (x, f y)
not a = if a then False else True
x /= y = not $ x == y
(.) f g x = f (g x)
(||) f g = if f then True else g
(&&) f g = if f then g else False
take 0 xs = []
take _ [] = []
take n (h:t) = h : take (n - 1) t
drop n xs | n <= 0 = xs
drop _ [] = []
drop n (_:xs) = drop (n-1) xs
splitAt n xs = (take n xs, drop n xs)
maybe n j m = case m of Nothing -> n; Just x -> j x
instance Functor Maybe where fmap f = maybe Nothing (Just . f)
instance Applicative Maybe where pure = Just ; mf <*> mx = maybe Nothing (\f -> maybe Nothing (Just . f) mx) mf
instance Monad Maybe where return = Just ; mf >>= mg = maybe Nothing mg mf
instance Alternative Maybe where empty = Nothing ; x <|> y = maybe y Just x
foldr c n = \case [] -> n; h:t -> c h $ foldr c n t
length = foldr (\_ n -> n + 1) 0
mapM f = foldr (\a rest -> liftA2 (:) (f a) rest) (pure [])
mapM_ f = foldr ((>>) . f) (pure ())
foldM f z0 xs = foldr (\x k z -> f z x >>= k) pure xs z0
when x y = if x then y else pure ()
unless x y = if x then pure () else y
error = primitiveError
undefined = error "undefined"
foldr1 c l@(h:t) = maybe undefined id $ foldr (\x m -> Just $ maybe x (c x) m) Nothing l
foldl f a bs = foldr (\b g x -> g (f x b)) (\x -> x) bs a
foldl1 f (h:t) = foldl f h t
elem k xs = foldr (\x t -> x == k || t) False xs
find f xs = foldr (\x t -> if f x then Just x else t) Nothing xs
(++) = flip (foldr (:))
concat = foldr (++) []
map = flip (foldr . ((:) .)) []
head (h:_) = h
tail (_:t) = t
xs!!0 = head xs
xs!!n = tail xs!!(n - 1)
replicate 0 _ = []
replicate n x = x : replicate (n - 1) x
null [] = True
null _ = False
reverse = foldl (flip (:)) []
dropWhile _ [] = []
dropWhile p xs@(x:xt)
| p x = dropWhile p xt
| True = xs
span _ [] = ([], [])
span p xs@(x:xt)
| p x = first (x:) $ span p xt
| True = ([],xs)
break p = span (not . p)
isSpace c = elem (ord c) [32, 9, 10, 11, 12, 13, 160]
words s = case dropWhile isSpace s of
"" -> []
s' -> w : words s'' where (w, s'') = break isSpace s'
instance Functor [] where fmap = map
instance Applicative [] where pure = (:[]); f <*> x = concatMap (<$> x) f
instance Monad [] where return = (:[]); (>>=) = flip concatMap
concatMap = (concat .) . map
lookup s = foldr (\(k, v) t -> if s == k then Just v else t) Nothing
filter f = foldr (\x xs -> if f x then x:xs else xs) []
union xs ys = foldr (\y acc -> (if elem y acc then id else (y:)) acc) xs ys
intersect xs ys = filter (\x -> maybe False (\_ -> True) $ find (x ==) ys) xs
last (x:xt) = go x xt where go x xt = case xt of [] -> x; y:yt -> go y yt
init (x:xt) = case xt of [] -> []; _ -> x : init xt
intercalate sep = \case [] -> []; x:xt -> x ++ concatMap (sep ++) xt
intersperse sep = \case [] -> []; x:xt -> x : foldr ($) [] (((sep:) .) . (:) <$> xt)
all f = and . map f
any f = or . map f
and = foldr (&&) True
or = foldr (||) False
zipWith f xs ys = case xs of [] -> []; x:xt -> case ys of [] -> []; y:yt -> f x y : zipWith f xt yt
zip = zipWith (,)
data State s a = State (s -> (a, s))
runState (State f) = f
instance Functor (State s) where fmap f = \(State h) -> State (first f . h)
instance Applicative (State s) where
pure a = State (a,)
(State f) <*> (State x) = State \s -> let (g, s') = f s in first g $ x s'
instance Monad (State s) where
return a = State (a,)
(State h) >>= f = State $ uncurry (runState . f) . h
evalState m s = fst $ runState m s
get = State \s -> (s, s)
put n = State \s -> ((), n)
either l r e = case e of Left x -> l x; Right x -> r x
instance Functor (Either a) where fmap f e = either Left (Right . f) e
instance Applicative (Either a) where
pure = Right
ef <*> ex = case ef of
Left s -> Left s
Right f -> either Left (Right . f) ex
instance Monad (Either a) where
return = Right
ex >>= f = either Left f ex
class Alternative f where
empty :: f a
(<|>) :: f a -> f a -> f a
asum = foldr (<|>) empty
(*>) = liftA2 \x y -> y
(<*) = liftA2 \x y -> x
many p = liftA2 (:) p (many p) <|> pure []
some p = liftA2 (:) p (many p)
sepBy1 p sep = liftA2 (:) p (many (sep *> p))
sepBy p sep = sepBy1 p sep <|> pure []
between x y p = x *> (p <* y)
showParen b f = if b then ('(':) . f . (')':) else f
iterate f x = x : iterate f (f x)
takeWhile _ [] = []
takeWhile p xs@(x:xt)
| p x = x : takeWhile p xt
| True = []
class Enum a where
succ :: a -> a
pred :: a -> a
toEnum :: Int -> a
fromEnum :: a -> Int
enumFrom :: a -> [a]
enumFromTo :: a -> a -> [a]
instance Enum Int where
succ = (+1)
pred = (+(0-1))
toEnum = id
fromEnum = id
enumFrom = iterate succ
enumFromTo lo hi = takeWhile (<= hi) $ enumFrom lo
instance Enum Char where
succ = chr . (+1) . ord
pred = chr . (+(0-1)) . ord
toEnum = chr
fromEnum = ord
enumFrom = iterate succ
enumFromTo lo hi = takeWhile (<= hi) $ enumFrom lo
(+) = intAdd
(-) = intSub
(*) = intMul
div = intDiv
mod = intMod
instance (Ord a, Ord b) => Ord (a, b) where
(a1, b1) <= (a2, b2) = a1 <= a2 && (not (a2 <= a1) || b1 <= b2)
a < b = a <= b && a /= b
a > b = b <= a && a /= b
(>=) = flip(<=)
instance Applicative IO where pure = ioPure ; (<*>) f x = ioBind f \g -> ioBind x \y -> ioPure (g y)
instance Monad IO where return = ioPure ; (>>=) = ioBind
instance Functor IO where fmap f x = ioPure f <*> x
class Show a where
showsPrec :: Int -> a -> String -> String
showsPrec _ x = (show x++)
show :: a -> String
show x = shows x ""
showList :: [a] -> String -> String
showList = showList__ shows
shows = showsPrec 0
showList__ _ [] s = "[]" ++ s
showList__ showx (x:xs) s = '[' : showx x (showl xs)
where
showl [] = ']' : s
showl (y:ys) = ',' : showx y (showl ys)
showInt__ n
| 0 == n = id
| True = showInt__ (n`div`10) . (chr (48+n`mod`10):)
instance Show () where show () = "()"
instance Show Bool where
show True = "True"
show False = "False"
instance Show a => Show [a] where showsPrec _ = showList
instance Show Int where
showsPrec _ n
| 0 == n = ('0':)
| 1 <= n = showInt__ n
| 2 * n == 0 = ("-2147483648"++)
| True = ('-':) . showInt__ (0 - n)
showLitChar__ '\n' = ("\\n"++)
showLitChar__ '\\' = ("\\\\"++)
showLitChar__ c = (c:)
instance Show Char where
showsPrec _ '\'' = ("'\\''"++)
showsPrec _ c = ('\'':) . showLitChar__ c . ('\'':)
showList s = ('"':) . foldr (.) id (map go s) . ('"':) where
go '"' = ("\\\""++)
go c = showLitChar__ c
instance (Show a, Show b) => Show (a, b) where
showsPrec _ (a, b) = showParen True $ shows a . (',':) . shows b
-- FFI across multiple modules.
-- Rewrite with named fields, deriving.
module Ast where
import Base
import Map
data Type = TC String | TV String | TAp Type Type deriving Eq
arr a b = TAp (TAp (TC "->") a) b
data Extra = Basic String | Const Int | ChrCon Char | StrCon String | Link String String Qual
data Pat = PatLit Ast | PatVar String (Maybe Pat) | PatCon String [Pat]
data Ast = E Extra | V String | A Ast Ast | L String Ast | Pa [([Pat], Ast)] | Proof Pred
data Constr = Constr String [(String, Type)]
data Pred = Pred String Type deriving Eq
data Qual = Qual [Pred] Type
instance Show Type where
showsPrec _ = \case
TC s -> (s++)
TV s -> (s++)
TAp (TAp (TC "->") a) b -> showParen True $ shows a . (" -> "++) . shows b
TAp a b -> showParen True $ shows a . (' ':) . shows b
instance Show Pred where
showsPrec _ (Pred s t) = (s++) . (' ':) . shows t . (" => "++)
instance Show Qual where
showsPrec _ (Qual ps t) = foldr (.) id (map shows ps) . shows t
instance Show Extra where
showsPrec _ = \case
Basic s -> (s++)
Const i -> shows i
ChrCon c -> shows c
StrCon s -> shows s
Link im s _ -> (im++) . ('.':) . (s++)
instance Show Pat where
showsPrec _ = \case
PatLit e -> shows e
PatVar s mp -> (s++) . maybe id ((('@':) .) . shows) mp
PatCon s ps -> (s++) . foldr (.) id (((' ':) .) . shows <$> ps)
showVar s@(h:_) = showParen (elem h ":!#$%&*+./<=>?@\\^|-~") (s++)
instance Show Ast where
showsPrec prec = \case
E e -> shows e
V s -> showVar s
A x y -> showParen (1 <= prec) $ shows x . (' ':) . showsPrec 1 y
L s t -> showParen True $ ('\\':) . (s++) . (" -> "++) . shows t
Pa vsts -> ('\\':) . showParen True (foldr (.) id $ intersperse (';':) $ map (\(vs, t) -> foldr (.) id (intersperse (' ':) $ map (showParen True . shows) vs) . (" -> "++) . shows t) vsts)
Proof p -> ("{Proof "++) . shows p . ("}"++)
data Instance = Instance
-- Type, e.g. Int for Eq Int.
Type
-- Dictionary name, e.g. "{Eq Int}"
String
-- Context.
[Pred]
-- Method definitions
(Map String Ast)
data Assoc = NAssoc | LAssoc | RAssoc deriving Eq
data Neat = Neat
{ typeclasses :: Map String [String]
, instances :: Map String [Instance]
, topDefs :: [(String, Ast)]
-- | Typed ASTs, ready for compilation, including ADTs and methods,
-- e.g. (==), (Eq a => a -> a -> Bool, select-==)
, typedAsts :: [(String, (Qual, Ast))]
, dataCons :: Map String [Constr]
, ffiImports :: Map String Type
, ffiExports :: Map String String
, moduleImports :: [String]
, opFixity :: Map String (Int, Assoc)
}
neatEmpty = Neat Tip Tip [] [] Tip Tip Tip [] Tip
patVars = \case
PatLit _ -> []
PatVar s m -> s : maybe [] patVars m
PatCon _ args -> concat $ patVars <$> args
fvPro bound expr = case expr of
V s | not (elem s bound) -> [s]
A x y -> fvPro bound x `union` fvPro bound y
L s t -> fvPro (s:bound) t
Pa vsts -> foldr union [] $ map (\(vs, t) -> fvPro (concatMap patVars vs ++ bound) t) vsts
_ -> []
beta s a t = case t of
E _ -> t
V v -> if s == v then a else t
A x y -> A (beta s a x) (beta s a y)
L v u -> if s == v then t else L v $ beta s a u
typeVars = \case
TC _ -> []
TV v -> [v]
TAp x y -> typeVars x `union` typeVars y
depthFirstSearch = (foldl .) \relation st@(visited, sequence) vertex ->
if vertex `elem` visited then st else second (vertex:)
$ depthFirstSearch relation (vertex:visited, sequence) (relation vertex)
spanningSearch = (foldl .) \relation st@(visited, setSequence) vertex ->
if vertex `elem` visited then st else second ((:setSequence) . (vertex:))
$ depthFirstSearch relation (vertex:visited, []) (relation vertex)
scc ins outs = spanning . depthFirst where
depthFirst = snd . depthFirstSearch outs ([], [])
spanning = snd . spanningSearch ins ([], [])
-- FFI across multiple modules.
-- Rewrite with named fields.
-- Fix fixities after parsing.
module Parser where
import Base
import Ast
import Map
-- Parser.
data ParserState = ParserState
{ readme :: [(Char, (Int, Int))]
, landin :: String
, indents :: [Int]
}
data Parser a = Parser { getParser :: ParserState -> Either String (a, ParserState) }
instance Functor Parser where fmap f x = pure f <*> x
instance Applicative Parser where
pure x = Parser \inp -> Right (x, inp)
(Parser f) <*> (Parser x) = Parser \inp -> do
(fun, t) <- f inp
(arg, u) <- x t
pure (fun arg, u)
instance Monad Parser where
return = pure
(Parser x) >>= f = Parser \inp -> do
(a, t) <- x inp
getParser (f a) t
instance Alternative Parser where
empty = bad ""
x <|> y = Parser \inp -> either (const $ getParser y inp) Right $ getParser x inp
notFollowedBy p = do
saved <- Parser \pasta -> Right (pasta, pasta)
ret <- p *> pure (bad "") <|> pure (pure ())
Parser \_ -> Right ((), saved)
ret
parse f str = getParser f $ ParserState (rowcol str (1, 1)) [] [] where
rowcol s rc = case s of
[] -> []
h:t -> (h, rc) : rowcol t (advanceRC (ord h) rc)
advanceRC n (r, c)
| n `elem` [10, 11, 12, 13] = (r + 1, 1)
| n == 9 = (r, (c + 8)`mod`8)
| True = (r, c + 1)
indentOf pasta = case readme pasta of
[] -> 1
(_, (_, c)):_ -> c
ins c pasta = pasta { landin = c:landin pasta }
angle n pasta = case indents pasta of
m:ms | m == n -> ins ';' pasta
| n + 1 <= m -> ins '}' $ angle n pasta { indents = ms }
_ -> pasta
curly n pasta = case indents pasta of
m:ms | m + 1 <= n -> ins '{' pasta { indents = n:m:ms }
[] | 1 <= n -> ins '{' pasta { indents = [n] }
_ -> ins '{' . ins '}' $ angle n pasta
sat f = Parser \pasta -> case landin pasta of
c:t -> if f c then Right (c, pasta { landin = t }) else Left "unsat"
[] -> case readme pasta of
[] -> case indents pasta of
[] -> Left "EOF"
m:ms | m /= 0 && f '}' -> Right ('}', pasta { indents = ms })
_ -> Left "unsat"
(h, _):t | f h -> let
p' = pasta { readme = t }
in case h of
'}' -> case indents pasta of
0:ms -> Right (h, p' { indents = ms })
_ -> Left "unsat"
'{' -> Right (h, p' { indents = 0:indents p' })
_ -> Right (h, p')
_ -> Left "unsat"
char c = sat (c ==)
rawSat f = Parser \pasta -> case readme pasta of
[] -> Left "EOF"
(h, _):t -> if f h then Right (h, pasta { readme = t }) else Left "unsat"
eof = Parser \pasta -> case pasta of
ParserState [] [] _ -> Right ((), pasta)
_ -> badpos pasta "want eof"
comment = rawSat ('-' ==) *> some (rawSat ('-' ==)) *>
(rawSat isNewline <|> rawSat (not . isSymbol) *> many (rawSat $ not . isNewline) *> rawSat isNewline) *> pure True
spaces = isNewline <$> rawSat isSpace
whitespace = do
offside <- or <$> many (spaces <|> comment)
Parser \pasta -> Right ((), if offside then angle (indentOf pasta) pasta else pasta)
hexValue d
| d <= '9' = ord d - ord '0'
| d <= 'F' = 10 + ord d - ord 'A'
| d <= 'f' = 10 + ord d - ord 'a'
isNewline c = ord c `elem` [10, 11, 12, 13]
isSymbol = (`elem` "!#$%&*+./<=>?@\\^|-~:")
isSmall c = c <= 'z' && 'a' <= c || c == '_'
small = sat isSmall
large = sat \x -> (x <= 'Z') && ('A' <= x)
hexit = sat \x -> (x <= '9') && ('0' <= x)
|| (x <= 'F') && ('A' <= x)
|| (x <= 'f') && ('a' <= x)
digit = sat \x -> (x <= '9') && ('0' <= x)
decimal = foldl (\n d -> 10*n + ord d - ord '0') 0 <$> some digit
hexadecimal = foldl (\n d -> 16*n + hexValue d) 0 <$> some hexit
nameTailChar = small <|> large <|> digit <|> char '\''
nameTailed p = liftA2 (:) p $ many nameTailChar
escape = char '\\' *> (sat (`elem` "'\"\\") <|> char 'n' *> pure '\n' <|> char '0' *> pure (chr 0) <|> char 'x' *> (chr <$> hexadecimal))
tokOne delim = escape <|> rawSat (delim /=)
charSeq = mapM char
tokChar = between (char '\'') (char '\'') (tokOne '\'')
quoteStr = between (char '"') (char '"') $ many $ many (charSeq "\\&") *> tokOne '"'
quasiquoteStr = charSeq "[r|" *> quasiquoteBody
quasiquoteBody = charSeq "|]" *> pure [] <|> (:) <$> rawSat (const True) <*> quasiquoteBody
tokStr = quoteStr <|> quasiquoteStr
integer = char '0' *> (char 'x' <|> char 'X') *> hexadecimal <|> decimal
literal = lexeme . fmap E $ Const <$> integer <|> ChrCon <$> tokChar <|> StrCon <$> tokStr
varish = lexeme $ nameTailed small
bad s = Parser \pasta -> badpos pasta s
badpos pasta s = Left $ loc $ ": " ++ s where
loc = case readme pasta of
[] -> ("EOF"++)
(_, (r, c)):_ -> ("row "++) . shows r . (" col "++) . shows c
varId = do
s <- varish
if elem s
["export", "case", "class", "data", "default", "deriving", "do", "else", "foreign", "if", "import", "in", "infix", "infixl", "infixr", "instance", "let", "module", "newtype", "of", "then", "type", "where", "_"]
then bad $ "reserved: " ++ s else pure s
varSymish = lexeme $ (:) <$> sat (\c -> isSymbol c && c /= ':') <*> many (sat isSymbol)
varSym = lexeme $ do
s <- varSymish
if elem s ["..", "=", "\\", "|", "<-", "->", "@", "~", "=>"] then bad $ "reserved: " ++ s else pure s
conId = lexeme $ nameTailed large
conSymish = lexeme $ liftA2 (:) (char ':') $ many $ sat isSymbol
conSym = do
s <- conSymish
if elem s [":", "::"] then bad $ "reserved: " ++ s else pure s
special c = lexeme $ sat (c ==)
comma = special ','
semicolon = special ';'
lParen = special '('
rParen = special ')'
lBrace = special '{'
rBrace = special '}'
lSquare = special '['
rSquare = special ']'
backquote = special '`'
lexeme f = f <* whitespace
lexemePrelude = whitespace *>
Parser \pasta -> case getParser (res "module" <|> (:[]) <$> char '{') pasta of
Left _ -> Right ((), curly (indentOf pasta) pasta)
Right _ -> Right ((), pasta)
curlyCheck f = do
Parser \pasta -> Right ((), pasta { indents = 0:indents pasta })
r <- f
Parser \pasta -> let pasta' = pasta { indents = tail $ indents pasta } in case readme pasta of
[] -> Right ((), curly 0 pasta')
('{', _):_ -> Right ((), pasta')
(_, (_, col)):_ -> Right ((), curly col pasta')
pure r
conOf (Constr s _) = s
specialCase (h:_) = '|':conOf h
mkCase t cs = (specialCase cs,
( Qual [] $ arr t $ foldr arr (TV "case") $ map (\(Constr _ sts) -> foldr arr (TV "case") $ snd <$> sts) cs
, E $ Basic "I"))
mkStrs = snd . foldl (\(s, l) u -> ('@':s, s:l)) ("@", [])
scottEncode _ ":" _ = E $ Basic "CONS"
scottEncode vs s ts = foldr L (foldl (\a b -> A a (V b)) (V s) ts) (ts ++ vs)
scottConstr t cs (Constr s sts) = (s,
(Qual [] $ foldr arr t ts , scottEncode (map conOf cs) s $ mkStrs ts))
: [(field, (Qual [] $ t `arr` ft, L s $ foldl A (V s) $ inj $ proj field)) | (field, ft) <- sts, field /= ""]
where
ts = snd <$> sts
proj fd = foldr L (V fd) $ fst <$> sts
inj x = map (\(Constr s' _) -> if s' == s then x else V "undefined") cs
mkAdtDefs t cs = mkCase t cs : concatMap (scottConstr t cs) cs
mkFFIHelper n t acc = case t of
TC s -> acc
TAp (TC "IO") _ -> acc
TAp (TAp (TC "->") x) y -> L (show n) $ mkFFIHelper (n + 1) y $ A (V $ show n) acc
updateDcs cs dcs = foldr (\(Constr s _) m -> insert s cs m) dcs cs
addAdt t cs ders neat = foldr derive neat' ders where
neat' = neat
{ typedAsts = mkAdtDefs t cs ++ typedAsts neat
, dataCons = updateDcs cs $ dataCons neat
}
derive "Eq" = addInstance "Eq" (mkPreds "Eq") t
[("==", Pa $ map eqCase cs
)]
derive "Show" = addInstance "Show" (mkPreds "Show") t
[("showsPrec", L "prec" $ Pa $ map showCase cs
)]
derive der = error $ "bad deriving: " ++ der
prec0 = E $ Const 0
showCase (Constr con args) = let as = show <$> [1..length args]
in ([PatCon con $ mkPatVar "" <$> as], case args of
[] -> A (V "++") (E $ StrCon con)
_ -> case con of
':':_ -> A (A (V "showParen") $ V "True") $ foldr1
(\f g -> A (A (V ".") f) g)
[ A (A (V "showsPrec") prec0) (V "1")
, A (V "++") (E $ StrCon $ ' ':con++" ")
, A (A (V "showsPrec") prec0) (V "2")
]
_ -> A (A (V "showParen") $ A (A (V "<=") prec0) $ V "prec")
$ A (A (V ".") $ A (V "++") (E $ StrCon con))
$ foldr (\f g -> A (A (V ".") f) g) (L "x" $ V "x")
$ map (\a -> A (A (V ".") (A (V ":") (E $ ChrCon ' '))) $ A (A (V "showsPrec") prec0) (V a)) as
)
mkPreds classId = Pred classId . TV <$> typeVars t
mkPatVar pre s = PatVar (pre ++ s) Nothing
eqCase (Constr con args) = let as = show <$> [1..length args]
in ([PatCon con $ mkPatVar "l" <$> as], Pa
[ ([PatCon con $ mkPatVar "r" <$> as], foldr (\x y -> (A (A (V "&&") x) y)) (V "True")
$ map (\n -> A (A (V "==") (V $ "l" ++ n)) (V $ "r" ++ n)) as)
, ([PatVar "_" Nothing], V "False")])
addClass classId v (sigs, defs) neat = if not $ member classId $ typeclasses neat then neat
{ typeclasses = insert classId (keys sigs) $ typeclasses neat
, typedAsts = selectors ++ typedAsts neat
, topDefs = defaults ++ topDefs neat
} else error $ "duplicate class: " ++ classId
where
vars = take (size sigs) $ show <$> [0..]
selectors = zipWith (\var (s, t) -> (s, (Qual [Pred classId v] t,
L "@" $ A (V "@") $ foldr L (V var) vars))) vars $ toAscList sigs
defaults = map (\(s, t) -> if member s sigs then ("{default}" ++ s, t) else error $ "bad default method: " ++ s) $ toAscList defs
addInstance classId ps ty ds neat = neat
{ instances = insertWith (++) classId [Instance ty name ps (fromList ds)] $ instances neat
} where
name = '{':classId ++ (' ':shows ty "}")
addForeignImport foreignname ourname t neat = let ffis = ffiImports neat in neat
{ typedAsts = (ourname, (Qual [] t, mkFFIHelper 0 t $ A (E $ Basic "F") $ A (E $ Basic "NUM") $ E $ Link "{foreign}" foreignname $ Qual [] t)) : typedAsts neat
, ffiImports = insertWith (error $ "duplicate import: " ++ foreignname) foreignname t ffis
}
addForeignExport e f neat = neat { ffiExports = insertWith (error $ "duplicate export: " ++ e) e f $ ffiExports neat }
addDefs ds neat = neat { topDefs = ds ++ topDefs neat }
addImport im neat = neat { moduleImports = im:moduleImports neat }
addFixities os prec neat = neat { opFixity = foldr (\o tab -> insert o prec tab) (opFixity neat) os }
parseErrorRule = Parser \pasta -> case indents pasta of
m:ms | m /= 0 -> Right ('}', pasta { indents = ms })
_ -> badpos pasta "missing }"
res w@(h:_) = reservedSeq *> pure w <|> bad ("want \"" ++ w ++ "\"") where
reservedSeq = if elem w ["let", "where", "do", "of"]
then curlyCheck $ lexeme $ charSeq w *> notFollowedBy nameTailChar
else lexeme $ charSeq w *> notFollowedBy (if isSmall h then nameTailChar else sat isSymbol)
paren = between lParen rParen
braceSep f = between lBrace (rBrace <|> parseErrorRule) $ foldr ($) [] <$> sepBy ((:) <$> f <|> pure id) semicolon
maybeFix s x = if elem s $ fvPro [] x then A (V "fix") (L s x) else x
nonemptyTails [] = []
nonemptyTails xs@(x:xt) = xs : nonemptyTails xt
joinIsFail t = A (L "join#" t) (V "fail#")
addLets ls x = foldr triangle x components where
vs = fst <$> ls
ios = foldr (\(s, dsts) (ins, outs) ->
(foldr (\dst -> insertWith union dst [s]) ins dsts, insertWith union s dsts outs))
(Tip, Tip) $ map (\(s, t) -> (s, intersect (fvPro [] t) vs)) ls
components = scc (\k -> maybe [] id $ mlookup k $ fst ios) (\k -> maybe [] id $ mlookup k $ snd ios) vs
triangle names expr = let
tnames = nonemptyTails names
appem vs = foldl1 A $ V <$> vs
suball expr = foldl A (foldr L expr $ init names) $ appem <$> init tnames
redef tns expr = foldr L (suball expr) tns
in foldr (\(x:xt) t -> A (L x t) $ maybeFix x $ redef xt $ maybe undefined joinIsFail $ lookup x ls) (suball expr) tnames
qconop = conSym <|> res ":" <|> between backquote backquote conId
qconsym = conSym <|> res ":"
op = qconsym <|> varSym <|> between backquote backquote (conId <|> varId)
con = conId <|> paren qconsym
var = varId <|> paren varSym
tycon = do
s <- conId
pure $ if s == "String" then TAp (TC "[]") (TC "Char") else TC s
aType =
lParen *>
( rParen *> pure (TC "()")
<|> (foldr1 (TAp . TAp (TC ",")) <$> sepBy1 _type comma) <* rParen)
<|> tycon
<|> TV <$> varId
<|> (lSquare *> (rSquare *> pure (TC "[]") <|> TAp (TC "[]") <$> (_type <* rSquare)))
bType = foldl1 TAp <$> some aType
_type = foldr1 arr <$> sepBy bType (res "->")
fixityDecl w a = do
res w
n <- lexeme integer
os <- sepBy op comma
pure $ addFixities os (n, a)
fixity = fixityDecl "infix" NAssoc <|> fixityDecl "infixl" LAssoc <|> fixityDecl "infixr" RAssoc
cDecls = first fromList . second fromList . foldr ($) ([], []) <$> braceSep cDecl
cDecl = first . (:) <$> genDecl <|> second . (++) <$> defSemi
genDecl = (,) <$> var <*> (res "::" *> _type)
classDecl = res "class" *> (addClass <$> conId <*> (TV <$> varId) <*> (res "where" *> cDecls))
simpleClass = Pred <$> conId <*> _type
scontext = (:[]) <$> simpleClass <|> paren (sepBy simpleClass comma)
instDecl = res "instance" *>
((\ps cl ty defs -> addInstance cl ps ty defs) <$>
(scontext <* res "=>" <|> pure [])
<*> conId <*> _type <*> (res "where" *> braceDef))
letin = addLets <$> between (res "let") (res "in") braceDef <*> expr
ifthenelse = (\a b c -> A (A (A (V "if") a) b) c) <$>
(res "if" *> expr) <*> (res "then" *> expr) <*> (res "else" *> expr)
listify = foldr (\h t -> A (A (V ":") h) t) (V "[]")
alts = joinIsFail . Pa <$> braceSep ((\x y -> ([x], y)) <$> pat <*> guards "->")
cas = flip A <$> between (res "case") (res "of") expr <*> alts
lamCase = curlyCheck (res "case") *> alts
lam = res "\\" *> (lamCase <|> liftA2 onePat (some apat) (res "->" *> expr))
flipPairize y x = A (A (V ",") x) y
moreCommas = foldr1 (A . A (V ",")) <$> sepBy1 expr comma
thenComma = comma *> ((flipPairize <$> moreCommas) <|> pure (A (V ",")))
parenExpr = (&) <$> expr <*> (((\v a -> A (V v) a) <$> op) <|> thenComma <|> pure id)
rightSect = ((\v a -> L "@" $ A (A (V v) $ V "@") a) <$> (op <|> (:"") <$> comma)) <*> expr
section = lParen *> (parenExpr <* rParen <|> rightSect <* rParen <|> rParen *> pure (V "()"))
maybePureUnit = maybe (V "pure" `A` V "()") id
stmt = (\p x -> Just . A (V ">>=" `A` x) . onePat [p] . maybePureUnit) <$> pat <*> (res "<-" *> expr)
<|> (\x -> Just . maybe x (\y -> (V ">>=" `A` x) `A` (L "_" y))) <$> expr
<|> (\ds -> Just . addLets ds . maybePureUnit) <$> (res "let" *> braceDef)
doblock = res "do" *> (maybePureUnit . foldr ($) Nothing <$> braceSep stmt)
compQual =
(\p xs e -> A (A (V "concatMap") $ onePat [p] e) xs)
<$> pat <*> (res "<-" *> expr)
<|> (\b e -> A (A (A (V "if") b) e) $ V "[]") <$> expr
<|> addLets <$> (res "let" *> braceDef)
sqExpr = between lSquare rSquare $
((&) <$> expr <*>
( res ".." *>
( (\hi lo -> (A (A (V "enumFromTo") lo) hi)) <$> expr
<|> pure (A (V "enumFrom"))
)
<|> res "|" *>
((. A (V "pure")) . foldr (.) id <$> sepBy1 compQual comma)
<|> (\t h -> listify (h:t)) <$> many (comma *> expr)
)
)
<|> pure (V "[]")
fbind = A <$> (V <$> var) <*> (res "=" *> expr)
fBinds v = (do
fbs <- between lBrace rBrace $ sepBy1 fbind comma
pure $ A (E $ Basic "{=") $ foldr A (E $ Basic "=}") $ v:fbs
) <|> pure v
atom = ifthenelse <|> doblock <|> letin <|> sqExpr <|> section
<|> cas <|> lam <|> (paren comma *> pure (V ","))
<|> V <$> (con <|> var) <|> literal
>>= fBinds
aexp = foldl1 A <$> some atom
chain a = \case
[] -> a
A f b:rest -> case rest of
[] -> A (A f a) b
_ -> L "(" $ A (A (A f a) b) $ foldr A (V ")") rest
_ -> error "unreachable"
expr = chain <$> aexp <*> many (A <$> (V <$> op) <*> aexp)
gcon = conId <|> paren (qconsym <|> (:"") <$> comma) <|> (lSquare *> rSquare *> pure "[]")
apat = PatVar <$> var <*> (res "@" *> (Just <$> apat) <|> pure Nothing)
<|> flip PatVar Nothing <$> (res "_" *> pure "_")
<|> flip PatCon [] <$> gcon
<|> PatLit <$> literal
<|> foldr (\h t -> PatCon ":" [h, t]) (PatCon "[]" [])
<$> between lSquare rSquare (sepBy pat comma)
<|> paren (foldr1 pairPat <$> sepBy1 pat comma <|> pure (PatCon "()" []))
where pairPat x y = PatCon "," [x, y]
patChain a = \case
[] -> a
PatCon f [b]:rest -> case rest of
[] -> PatCon f [a, b]
_ -> PatCon "{+" $ PatCon f [a, b] : rest
_ -> error "unreachable"
patAtom = PatCon <$> gcon <*> many apat <|> apat
pat = patChain <$> patAtom <*> many (PatCon <$> qconop <*> ((:[]) <$> patAtom))
maybeWhere p = (&) <$> p <*> (res "where" *> (addLets <$> braceDef) <|> pure id)
guards s = maybeWhere $ res s *> expr <|> foldr ($) (V "join#") <$> some ((\x y -> case x of
V "True" -> \_ -> y
_ -> A (A (A (V "if") x) y)
) <$> (res "|" *> expr) <*> (res s *> expr))
onePat vs x = joinIsFail $ Pa [(vs, x)]
defOnePat vs x = Pa [(vs, x)]
opDef x f y rhs = [(f, defOnePat [x, y] rhs)]
leftyPat p expr = case pvars of
[] -> []
(h:t) -> let gen = '@':h in
(gen, expr):map (\v -> (v, A (Pa [([p], V v)]) $ V gen)) pvars
where
pvars = filter (/= "_") $ patVars p
def = liftA2 (\l r -> [(l, r)]) var (liftA2 defOnePat (many apat) $ guards "=")
<|> (pat >>= \x -> opDef x <$> varSym <*> pat <*> guards "=" <|> leftyPat x <$> guards "=")
coalesce = \case
[] -> []
h@(s, x):t -> case t of
[] -> [h]
(s', x'):t' -> let
f (Pa vsts) (Pa vsts') = Pa $ vsts ++ vsts'
f _ _ = error "bad multidef"
in if s == s' then coalesce $ (s, f x x'):t' else h:coalesce t
defSemi = coalesce . concat <$> sepBy1 def (some semicolon)
braceDef = concat <$> braceSep defSemi
simpleType c vs = foldl TAp (TC c) (map TV vs)
conop = conSym <|> between backquote backquote conId
fieldDecl = (\vs t -> map (, t) vs) <$> sepBy1 var comma <*> (res "::" *> _type)
constr = (\x c y -> Constr c [("", x), ("", y)]) <$> aType <*> conop <*> aType
<|> Constr <$> conId <*>
( concat <$> between lBrace rBrace (fieldDecl `sepBy` comma)
<|> map ("",) <$> many aType)
dclass = conId
_deriving = (res "deriving" *> ((:[]) <$> dclass <|> paren (dclass `sepBy` comma))) <|> pure []
adt = addAdt <$> between (res "data") (res "=") (simpleType <$> conId <*> many varId) <*> sepBy constr (res "|") <*> _deriving
impDecl = addImport <$> (res "import" *> conId)
topdecls = braceSep
$ adt
<|> classDecl
<|> instDecl
<|> res "foreign" *>
( res "import" *> var *> (addForeignImport <$> lexeme tokStr <*> var <*> (res "::" *> _type))
<|> res "export" *> var *> (addForeignExport <$> lexeme tokStr <*> var)
)
<|> addDefs <$> defSemi
<|> fixity
<|> impDecl
haskell = between lexemePrelude eof $ some $ (,) <$> (res "module" *> conId <* res "where" <|> pure "Main") <*> topdecls
parseProgram s = fst <$> parse haskell s
-- FFI across multiple modules.
module Typer where
import Base
import Map
import Ast
import Parser
import Unify
app01 s x y = maybe (A (L s x) y) snd $ go x where
go expr = case expr of
E _ -> Just (False, expr)
V v -> Just $ if s == v then (True, y) else (False, expr)
A l r -> do
(a, l') <- go l
(b, r') <- go r
if a && b then Nothing else pure (a || b, A l' r')
L v t -> if v == s then Just (False, expr) else second (L v) <$> go t
optiApp t = case t of
A x y -> let
x' = optiApp x
y' = optiApp y
in case x' of
L s v -> app01 s v y'
_ -> A x' y'
L s x -> L s (optiApp x)
_ -> t
-- Pattern compiler.
findCon dcs s = foldr (<|>) Nothing $ mlookup s <$> dcs
rewritePats dcs = \case
[] -> pure $ V "join#"
vsxs@((as0, _):_) -> case as0 of
[] -> pure $ foldr1 (A . L "join#") $ snd <$> vsxs
_ -> do
let k = length as0
n <- get
put $ n + k
let vs = take k $ (`shows` "#") <$> [n..]
cs <- flip mapM vsxs \(a:at, x) -> (a,) <$> foldM (\b (p, v) -> rewriteCase dcs v Tip [(p, b)]) x (zip at $ tail vs)
flip (foldr L) vs <$> rewriteCase dcs (head vs) Tip cs
patEq lit b x y = A (L "join#" $ A (A (A (V "if") (A (A (V "==") lit) b)) x) $ V "join#") y
rewriteCase dcs caseVar tab = \case
[] -> flush $ V "join#"
((v, x):rest) -> go v x rest
where
rec = rewriteCase dcs caseVar
go v x rest = case v of
PatLit lit -> flush =<< patEq lit (V caseVar) x <$> rec Tip rest
PatVar s m -> let x' = beta s (V caseVar) x in case m of
Nothing -> flush =<< A (L "join#" x') <$> rec Tip rest
Just v' -> go v' x' rest
PatCon con args -> rec (insertWith (flip (.)) con ((args, x):) tab) rest
flush onFail = case toAscList tab of
[] -> pure onFail
-- TODO: Check rest of `tab` lies in cs.
(firstC, _):_ -> do
let cs = maybe undefined id $ findCon dcs firstC
jumpTable <- mapM (\(Constr s ts) -> case mlookup s tab of
Nothing -> pure $ foldr L (V "join#") $ const "_" <$> ts
Just f -> rewritePats dcs $ f []
) cs
pure $ A (L "join#" $ foldl A (A (V $ specialCase cs) $ V caseVar) jumpTable) onFail
findField dcs f = case [(con, fields) | tab <- dcs, (_, cons) <- toAscList tab, Constr con fields <- cons, (f', _) <- fields, f == f'] of
[] -> error $ "no such field: " ++ f
h:_ -> h
resolveFieldBinds dcs t = go t where
go t = case t of
E _ -> t
V _ -> t
A (E (Basic "{=")) (A rawExpr fbsAst) -> let
expr = go rawExpr
fromAst t = case t of
A (A (V f) body) rest -> (f, go body):fromAst rest
E (Basic "=}") -> []
fbs@((firstField, _):_) = fromAst fbsAst
(con, fields) = findField dcs firstField
cs = maybe undefined id $ findCon dcs con
newValue = foldl A (V con) [maybe (V $ "[old]"++f) id $ lookup f fbs | (f, _) <- fields]
initValue = foldl A expr [maybe (V "undefined") id $ lookup f fbs | (f, _) <- fields]
updater = foldr L newValue $ ("[old]"++) . fst <$> fields
inj x = map (\(Constr con' _) -> if con' == con then x else V "undefined") cs
allPresent = all (`elem` (fst <$> fields)) $ fst <$> fbs
isCon = case expr of
V (h:_) -> 'A' <= h && h <= 'Z'
_ -> False
in if allPresent
then if isCon then initValue else foldl A (A (V $ specialCase cs) expr) $ inj updater
else error "bad fields in update"
A x y -> A (go x) (go y)
L s x -> L s $ go x
fixFixity precs t = case t of
E _ -> t
V _ -> t
A x y -> A (go x) (go y)
L s b
| s == "(" -> infixer precs $ go b
| True -> L s $ go b
Pa vsxs -> Pa $ map (\(ps, a) -> (patFixFixity precs <$> ps, go a)) vsxs
where
go = fixFixity precs
data OpTree = OpLeaf Ast | OpNode String Ast OpTree
infixer precs (A (A (A (V s) x) y) t) = go (OpNode s x (OpLeaf y)) t
where
go acc = \case
A (A (V s) z) rest -> go (ins s z acc) rest
V ")" -> decode acc
_ -> error "unreachable"
ins s z t = case t of
OpNode s' x y
| isStronger precs s s' -> OpNode s' x (ins s z y)
| True -> OpNode s (decode t) (OpLeaf z)
OpLeaf x -> OpNode s x (OpLeaf z)
decode = \case
OpNode f x y -> A (A (V f) x) (decode y)
OpLeaf x -> x
isStronger precs s s' = if prec <= prec'
then if prec == prec'
then if assoc == assoc'
then case assoc of
LAssoc -> False
RAssoc -> True
NAssoc -> error $ "adjacent NAssoc: " ++ s ++ " vs " ++ s'
else error $ "assoc mismatch: " ++ s ++ " vs " ++ s'
else False
else True
where
(prec, assoc) = findPrec s
(prec', assoc') = findPrec s'
findPrec s = if s == ":" then (5, RAssoc) else maybe defPrec id $ mlookup s precs
defPrec = (9, LAssoc)
patFixFixity precs p = case p of
PatLit _ -> p
PatVar s m -> PatVar s $ go <$> m
PatCon "{+" args -> patFixer precs args
PatCon con args -> PatCon con $ go <$> args
where
go = patFixFixity precs
data PopTree = PopLeaf Pat | PopNode String Pat PopTree
patFixer precs (PatCon f [a, b]:rest) = go seed rest where
seed = PopNode f a (PopLeaf b)
go acc = \case
[] -> decode acc
PatCon s [z]:rest -> go (ins s z acc) rest
ins s z t = case t of
PopNode s' x y -> case isStronger precs s s' of
True -> PopNode s' x $ ins s z y
False -> PopNode s (decode t) (PopLeaf z)
PopLeaf x -> PopNode s x (PopLeaf z)
decode = \case
PopNode f x y -> PatCon f [x, decode y]
PopLeaf x -> x
secondM f (a, b) = (a,) <$> f b
patternCompile dcs t = optiApp $ resolveFieldBinds dcs $ evalState (go t) 0 where
go t = case t of
E _ -> pure t
V _ -> pure t
A x y -> liftA2 A (go x) (go y)
L s x -> L s <$> go x
Pa vsxs -> mapM (secondM go) vsxs >>= rewritePats dcs
-- Type inference.
instantiate' t n tab = case t of
TC s -> ((t, n), tab)
TV s -> case lookup s tab of
Nothing -> let va = TV $ show n in ((va, n + 1), (s, va):tab)
Just v -> ((v, n), tab)
TAp x y -> let
((t1, n1), tab1) = instantiate' x n tab
((t2, n2), tab2) = instantiate' y n1 tab1
in ((TAp t1 t2, n2), tab2)
instantiatePred (Pred s t) ((out, n), tab) = first (first ((:out) . Pred s)) (instantiate' t n tab)
instantiate (Qual ps t) n = first (Qual ps1) $ fst $ instantiate' t n1 tab where
((ps1, n1), tab) = foldr instantiatePred (([], n), []) ps
proofApply sub a = case a of
Proof (Pred cl ty) -> Proof (Pred cl $ apply sub ty)
A x y -> A (proofApply sub x) (proofApply sub y)
L s t -> L s $ proofApply sub t
_ -> a
typeAstSub sub (t, a) = (apply sub t, proofApply sub a)
infer typed loc ast csn@(cs, n) = case ast of
E x -> Right $ case x of
Const _ -> ((TC "Int", ast), csn)
ChrCon _ -> ((TC "Char", ast), csn)
StrCon _ -> ((TAp (TC "[]") (TC "Char"), ast), csn)
Link im s q -> insta q
V s -> maybe (Left $ "undefined: " ++ s) Right
$ (\t -> ((t, ast), csn)) <$> lookup s loc
<|> insta . fst <$> mlookup s typed
A x y -> infer typed loc x (cs, n + 1) >>=
\((tx, ax), csn1) -> infer typed loc y csn1 >>=
\((ty, ay), (cs2, n2)) -> unify tx (arr ty va) cs2 >>=
\cs -> Right ((va, A ax ay), (cs, n2))
L s x -> first (\(t, a) -> (arr va t, L s a)) <$> infer typed ((s, va):loc) x (cs, n + 1)
where
va = TV $ show n
insta ty = ((ty1, foldl A ast (map Proof preds)), (cs, n1))
where (Qual preds ty1, n1) = instantiate ty n
findInstance tycl qn@(q, n) p@(Pred cl ty) insts = case insts of
[] -> let v = '*':show n in Right (((p, v):q, n + 1), V v)
(modName, Instance h name ps _):rest -> case match h ty of
Nothing -> findInstance tycl qn p rest
Just subs -> foldM (\(qn1, t) (Pred cl1 ty1) -> second (A t)
<$> findProof tycl (Pred cl1 $ apply subs ty1) qn1) (qn, if modName == "" then V name else E $ Link modName name undefined) ps
findProof tycl pred@(Pred classId t) psn@(ps, n) = case lookup pred ps of
Nothing -> findInstance tycl psn pred $ tycl classId
Just s -> Right (psn, V s)
prove tycl psn a = case a of
Proof pred -> findProof tycl pred psn
A x y -> prove tycl psn x >>= \(psn1, x1) ->
second (A x1) <$> prove tycl psn1 y
L s t -> second (L s) <$> prove tycl psn t
_ -> Right (psn, a)
data Dep a = Dep ([String] -> Either String ([String], a))
instance Functor Dep where
fmap f = \(Dep mf) -> Dep \g -> do
(g', x) <- mf g
pure (g', f x)
instance Applicative Dep where
pure x = Dep \g -> Right (g, x)
(Dep mf) <*> (Dep mx) = Dep \g -> do
(g', f) <- mf g
(g'', x) <- mx g'
pure (g'', f x)
addDep s = Dep \deps -> Right (if s `elem` deps then deps else s : deps, ())
badDep s = Dep $ const $ Left s
runDep (Dep f) = f []
astLink typed locals imps mods ast = runDep $ go [] ast where
go bound ast = case ast of
V s
| elem s bound -> pure ast
| member s locals -> case findImportSym imps mods s of
[] -> (if member s typed then pure () else addDep s) *> pure ast
_ -> badDep $ "ambiguous: " ++ s
| True -> case findImportSym imps mods s of
[] -> badDep $ "missing: " ++ s
[(im, t)] -> pure $ E $ Link im s t
_ -> badDep $ "ambiguous: " ++ s
A x y -> A <$> go bound x <*> go bound y
L s t -> L s <$> go (s:bound) t
_ -> pure ast
forFree cond f bound t = case t of
E _ -> t
V s -> if (not $ s `elem` bound) && cond s then f t else t
A x y -> A (rec bound x) (rec bound y)
L s t' -> L s $ rec (s:bound) t'
where rec = forFree cond f
inferno tycl typed defmap syms = let
loc = zip syms $ TV . (' ':) <$> syms
principal (acc, (subs, n)) s = do
expr <- maybe (Left $ "missing: " ++ s) Right (mlookup s defmap)
((t, a), (ms, n1)) <- infer typed loc expr (subs, n)
cs <- unify (TV (' ':s)) t ms
Right ((s, (t, a)):acc, (cs, n1))
gatherPreds (acc, psn) (s, (t, a)) = do
(psn, a) <- prove tycl psn a
pure ((s, (t, a)):acc, psn)
in do
(stas, (soln, _)) <- foldM principal ([], (Tip, 0)) syms
stas <- pure $ second (typeAstSub soln) <$> stas
(stas, (ps, _)) <- foldM gatherPreds ([], ([], 0)) $ second (typeAstSub soln) <$> stas
let
preds = fst <$> ps
dicts = snd <$> ps
applyDicts (s, (t, a)) = (s, (Qual preds t,
foldr L (forFree (`elem` syms) (\t -> foldl A t $ V <$> dicts) [] a) dicts))
pure $ map applyDicts stas
findImportSym imps mods s = concat [maybe [] (\(t, _) -> [(im, t)]) $ mlookup s qas | im <- imps, let qas = fst $ mods ! im]
inferDefs tycl defs typed = do
let
insertUnique m (s, (_, t)) = case mlookup s m of
Nothing -> case mlookup s typed of
Nothing -> Right $ insert s t m
_ -> Left $ "reserved: " ++ s
_ -> Left $ "duplicate: " ++ s
addEdges (sym, (deps, _)) (ins, outs) = (foldr (\dep es -> insertWith union dep [sym] es) ins deps, insertWith union sym deps outs)
graph = foldr addEdges (Tip, Tip) defs
defmap <- foldM insertUnique Tip defs
let
ins k = maybe [] id $ mlookup k $ fst graph
outs k = maybe [] id $ mlookup k $ snd graph
inferComponent typed syms = foldr (uncurry insert) typed <$> inferno tycl typed defmap syms
foldM inferComponent typed $ scc ins outs $ keys defmap
dictVars ps n = (zip ps $ map (('*':) . show) [n..], n + length ps)
inferTypeclasses tycl typeOfMethod typed dcs precs linker iMap mergedSigs = foldM inferInstance typed [(classId, inst) | (classId, insts) <- toAscList iMap, inst <- insts] where
inferInstance typed (classId, Instance ty name ps idefs) = let
dvs = map snd $ fst $ dictVars ps 0
perMethod s = do
let rawExpr = maybe (V $ "{default}" ++ s) id $ mlookup s idefs
expr <- snd <$> linker (patternCompile dcs $ fixFixity precs rawExpr)
(ta, (sub, n)) <- either (Left . (name++) . (" "++) . (s++) . (": "++)) Right
$ infer typed [] expr (Tip, 0)
let
(tx, ax) = typeAstSub sub ta
-- e.g. qc = Eq a => a -> a -> Bool
-- We instantiate: Eq a1 => a1 -> a1 -> Bool.
qc = typeOfMethod s
(Qual [Pred _ headT] tc, n1) = instantiate qc n
-- Mix the predicates `ps` with the type of `headT`, applying a
-- substitution such as (a1, [a]) so the variable names match.
-- e.g. Eq a => [a] -> [a] -> Bool
Just subc = match headT ty
(Qual ps2 t2, n2) = instantiate (Qual ps $ apply subc tc) n1
case match tx t2 of
Nothing -> Left "class/instance type conflict"
Just subx -> do
((ps3, _), tr) <- prove tycl (dictVars ps2 0) (proofApply subx ax)
if length ps2 /= length ps3
then Left $ ("want context: "++) . (foldr (.) id $ shows . fst <$> ps3) $ name
else pure tr
in do
ms <- mapM perMethod $ maybe (error $ "missing class: " ++ classId) id $ mlookup classId mergedSigs
pure $ insert name (Qual [] $ TC "DICTIONARY", flip (foldr L) dvs $ L "@" $ foldl A (V "@") ms) typed
primAdts =
[ (TC "()", [Constr "()" []])
, (TC "Bool", [Constr "True" [], Constr "False" []])
, (TAp (TC "[]") (TV "a"), [Constr "[]" [], Constr ":" $ map ("",) [TV "a", TAp (TC "[]") (TV "a")]])
, (TAp (TAp (TC ",") (TV "a")) (TV "b"), [Constr "," $ map ("",) [TV "a", TV "b"]])
]
prims = let
ro = E . Basic
dyad s = TC s `arr` (TC s `arr` TC s)
bin s = A (ro "Q") (ro s)
in map (second (first $ Qual [])) $
[ ("intEq", (arr (TC "Int") (arr (TC "Int") (TC "Bool")), bin "EQ"))
, ("intLE", (arr (TC "Int") (arr (TC "Int") (TC "Bool")), bin "LE"))
, ("charEq", (arr (TC "Char") (arr (TC "Char") (TC "Bool")), bin "EQ"))
, ("charLE", (arr (TC "Char") (arr (TC "Char") (TC "Bool")), bin "LE"))
, ("fix", (arr (arr (TV "a") (TV "a")) (TV "a"), ro "Y"))
, ("if", (arr (TC "Bool") $ arr (TV "a") $ arr (TV "a") (TV "a"), ro "I"))
, ("chr", (arr (TC "Int") (TC "Char"), ro "I"))
, ("ord", (arr (TC "Char") (TC "Int"), ro "I"))
, ("ioBind", (arr (TAp (TC "IO") (TV "a")) (arr (arr (TV "a") (TAp (TC "IO") (TV "b"))) (TAp (TC "IO") (TV "b"))), ro "C"))
, ("ioPure", (arr (TV "a") (TAp (TC "IO") (TV "a")), ro "V"))
, ("primitiveError", (arr (TAp (TC "[]") (TC "Char")) (TV "a"), ro "ERR"))
, ("newIORef", (arr (TV "a") (TAp (TC "IO") (TAp (TC "IORef") (TV "a"))), ro "NEWREF"))
, ("readIORef", (arr (TAp (TC "IORef") (TV "a")) (TAp (TC "IO") (TV "a")),
A (ro "T") (ro "READREF")))
, ("writeIORef", (arr (TAp (TC "IORef") (TV "a")) (arr (TV "a") (TAp (TC "IO") (TC "()"))),
A (A (ro "R") (ro "WRITEREF")) (ro "B")))
, ("exitSuccess", (TAp (TC "IO") (TV "a"), ro "END"))
, ("unsafePerformIO", (arr (TAp (TC "IO") (TV "a")) (TV "a"), A (A (ro "C") (A (ro "T") (ro "END"))) (ro "K")))
, ("join#", (TV "a", A (V "unsafePerformIO") (V "exitSuccess")))
, ("fail#", (TV "a", A (V "unsafePerformIO") (V "exitSuccess")))
]
++ map (\(s, v) -> (s, (dyad "Int", bin v)))
[ ("intAdd", "ADD")
, ("intSub", "SUB")
, ("intMul", "MUL")
, ("intDiv", "DIV")
, ("intMod", "MOD")
, ("intQuot", "QUOT")
, ("intRem", "REM")
, ("intXor", "XOR")
, ("intAnd", "AND")
, ("intOr", "OR")
]
tabulateModules mods = snd <$> foldM ins (Tip, Tip) mods where
go precs = foldr ($) neatEmpty{moduleImports = ["#"], opFixity = precs}
ins (accprecs, tab) (k, prog) = case mlookup k tab of
Nothing -> let v = go accprecs prog in Right (opFixity v, insert k v tab)
Just _ -> Left $ "duplicate module: " ++ k
slowUnionWith f x y = foldr go x $ toAscList y where go (k, v) m = insertWith f k v m
inferModule tab acc name = case mlookup name acc of
Nothing -> do
let
Neat mySigs iMap defs typedList adtTab ffis ffes imps precs = tab ! name
typed = fromList typedList
mergedSigs = foldr (slowUnionWith const) Tip $ mySigs : map (typeclasses . (tab !)) imps
mergedInstances = foldr (slowUnionWith (++)) Tip [fmap (map (im,)) m | (im, m) <- ("", iMap) : map (\im -> (im, instances $ tab ! im)) imps]
locals = fromList $ map (, ()) $ (fst <$> typedList) ++ (fst <$> defs)
tycl classId = maybe [] id $ mlookup classId mergedInstances
dcs = adtTab : map (dataCons . (tab !)) imps
typeOfMethod s = maybe undefined id $ foldr (<|>) (fst <$> mlookup s typed) [fmap fst $ lookup s $ typedAsts $ tab ! im | im <- imps]
acc' <- foldM (inferModule tab) acc imps
let linker = astLink typed locals imps acc'
depdefs <- mapM (\(s, t) -> (s,) <$> linker (patternCompile dcs $ fixFixity precs t)) defs
typed <- inferDefs tycl depdefs typed
typed <- inferTypeclasses tycl typeOfMethod typed dcs precs linker iMap mergedSigs
Right $ insert name (typed, (ffis, ffes)) acc'
Just _ -> Right acc
untangle s = do
tab <- insert "#" neatPrim <$> (parseProgram s >>= tabulateModules)
foldM (inferModule tab) Tip $ keys tab
neatPrim = foldr (\(a, b) -> addAdt a b []) (Neat Tip Tip [] prims Tip Tip Tip [] Tip) primAdts
-- FFI across multiple modules.
-- Change `div` and `mod` to round down instead towards zero for `Int`.
module RTS where
import Base
import Ast
import Kiselyov
import Map
import Parser
import_qq_here = import_qq_here
libc = [r|#include<stdio.h>
static int env_argc;
int getargcount() { return env_argc; }
static char **env_argv;
int getargchar(int n, int k) { return env_argv[n][k]; }
static int nextCh, isAhead;
int eof_shim() {
if (!isAhead) {
isAhead = 1;
nextCh = getchar();
}
return nextCh == -1;
}
void exit(int);
void putchar_shim(int c) { putchar(c); }
int getchar_shim() {
if (!isAhead) nextCh = getchar();
if (nextCh == -1) exit(1);
isAhead = 0;
return nextCh;
}
void errchar(int c) { fputc(c, stderr); }
void errexit() { fputc('\n', stderr); }
|]
preamble = [r|#define EXPORT(f, sym) void f() asm(sym) __attribute__((export_name(sym)));
void *malloc(unsigned long);
enum { FORWARD = 127, REDUCING = 126 };
enum { TOP = 1<<24 };
static u *mem, *altmem, *sp, *spTop, hp;
static inline u isAddr(u n) { return n>=128; }
static u evac(u n) {
if (!isAddr(n)) return n;
u x = mem[n];
while (isAddr(x) && mem[x] == _T) {
mem[n] = mem[n + 1];
mem[n + 1] = mem[x + 1];
x = mem[n];
}
if (isAddr(x) && mem[x] == _K) {
mem[n + 1] = mem[x + 1];
x = mem[n] = _I;
}
u y = mem[n + 1];
switch(x) {
case FORWARD: return y;
case REDUCING:
mem[n] = FORWARD;
mem[n + 1] = hp;
hp += 2;
return mem[n + 1];
case _I:
mem[n] = REDUCING;
y = evac(y);
if (mem[n] == FORWARD) {
altmem[mem[n + 1]] = _I;
altmem[mem[n + 1] + 1] = y;
} else {
mem[n] = FORWARD;
mem[n + 1] = y;
}
return mem[n + 1];
default: break;
}
u z = hp;
hp += 2;
mem[n] = FORWARD;
mem[n + 1] = z;
altmem[z] = x;
altmem[z + 1] = y;
return z;
}
static void gc() {
hp = 128;
u di = hp;
sp = altmem + TOP - 1;
for(u *r = root; *r; r++) *r = evac(*r);
*sp = evac(*spTop);
while (di < hp) {
u x = altmem[di] = evac(altmem[di]);
di++;
if (x != _NUM) altmem[di] = evac(altmem[di]);
di++;
}
spTop = sp;
u *tmp = mem;
mem = altmem;
altmem = tmp;
}
static inline u app(u f, u x) { mem[hp] = f; mem[hp + 1] = x; return (hp += 2) - 2; }
static inline u arg(u n) { return mem[sp [n] + 1]; }
static inline int num(u n) { return mem[arg(n) + 1]; }
static inline void lazy2(u height, u f, u x) {
u *p = mem + sp[height];
*p = f;
*++p = x;
sp += height - 1;
*sp = f;
}
static void lazy3(u height,u x1,u x2,u x3){u*p=mem+sp[height];sp[height-1]=*p=app(x1,x2);*++p=x3;*(sp+=height-2)=x1;}
|]
-- Main VM loop.
comdefsrc = [r|
F x = "foreign(num(1));"
Y x = x "sp[1]"
Q x y z = z(y x)
S x y z = x z(y z)
B x y z = x (y z)
BK x y z = x y
C x y z = x z y
R x y z = y z x
V x y z = z x y
T x y = y x
K x y = "_I" x
KI x y = "_I" y
I x = "sp[1] = arg(1); sp++;"
LEFT x y z = y x
CONS x y z w = w x y
NUM x y = y "sp[1]"
ADD x y = "_NUM" "num(1) + num(2)"
SUB x y = "_NUM" "num(1) - num(2)"
MUL x y = "_NUM" "num(1) * num(2)"
QUOT x y = "_NUM" "num(1) / num(2)"
REM x y = "_NUM" "num(1) % num(2)"
DIV x y = "_NUM" "div(num(1), num(2))"
MOD x y = "_NUM" "mod(num(1), num(2))"
XOR x y = "_NUM" "num(1) ^ num(2)"
AND x y = "_NUM" "num(1) & num(2)"
OR x y = "_NUM" "num(1) | num(2)"
EQ x y = "num(1) == num(2) ? lazy2(2, _I, _K) : lazy2(2, _K, _I);"
LE x y = "num(1) <= num(2) ? lazy2(2, _I, _K) : lazy2(2, _K, _I);"
REF x y = y "sp[1]"
NEWREF x y z = z ("_REF" x) y
READREF x y z = z "num(1)" y
WRITEREF x y z w = w "((mem[arg(2) + 1] = arg(1)), _K)" z
END = "return;"
ERR = "sp[1]=app(app(arg(1),_ERREND),_ERR2);sp++;"
ERR2 = "lazy3(2, arg(1), _ERROUT, arg(2));"
ERROUT = "errchar(num(1)); lazy2(2, _ERR, arg(2));"
ERREND = "errexit(); return;"
|]
argList t = case t of
TC s -> [TC s]
TV s -> [TV s]
TAp (TC "IO") (TC u) -> [TC u]
TAp (TAp (TC "->") x) y -> x : argList y
_ -> [t]
cTypeName (TC "()") = "void"
cTypeName (TC "Int") = "int"
cTypeName (TC "Char") = "int"
cTypeName _ = "int"
ffiDeclare (name, t) = let tys = argList t in (concat
[cTypeName $ last tys, " ", name, "(", intercalate "," $ cTypeName <$> init tys, ");\n"]++)
ffiArgs n t = case t of
TAp (TC "IO") u -> ("", ((False, u), n))
TAp (TAp (TC "->") _) y -> first (((if 3 <= n then ", " else "") ++ "num(" ++ shows n ")") ++) $ ffiArgs (n + 1) y
_ -> ("", ((True, t), n))
needsNum t = case t of
TC "Int" -> True
TC "Char" -> True
_ -> False
ffiDefine n (name, t) = ("case " ++) . shows n . (": " ++) . if ret == TC "()"
then longDistanceCall . cont ("_K"++) . ("); break;"++)
else ("{u r = "++) . longDistanceCall . cont ((if needsNum ret then "app(_NUM, r)" else "r") ++) . ("); break;}\n"++)
where
(args, ((isPure, ret), count)) = ffiArgs 2 t
lazyn = ("lazy2(" ++) . shows (if isPure then count - 1 else count + 1) . (", " ++)
cont tgt = if isPure then ("_I, "++) . tgt else ("app(arg("++) . shows (count + 1) . ("), "++) . tgt . ("), arg("++) . shows count . (")"++)
longDistanceCall = (name++) . ("("++) . (args++) . ("); "++) . lazyn
genMain n = "int main(int argc,char**argv){env_argc=argc;env_argv=argv;rts_reduce(" ++ shows n ");return 0;}\n"
arrCount = \case
TAp (TAp (TC "->") _) y -> 1 + arrCount y
_ -> 0
genExport m n = ("void f"++) . shows n . ("("++)
. foldr (.) id (intersperse (',':) $ map (("u "++) .) xs)
. ("){rts_reduce("++)
. foldl (\s x -> ("app("++) . s . (",app(_NUM,"++) . x . ("))"++)) rt xs
. (");}\n"++)
where
xs = map ((('x':) .) . shows) [0..m - 1]
rt = ("root["++) . shows n . ("]"++)
genArg m a = case a of
V s -> ("arg("++) . (maybe undefined shows $ lookup s m) . (')':)
E (StrCon s) -> (s++)
A x y -> ("app("++) . genArg m x . (',':) . genArg m y . (')':)
genArgs m as = foldl1 (.) $ map (\a -> (","++) . genArg m a) as
genComb (s, (args, body)) = let
argc = ('(':) . shows (length args)
m = zip args [1..]
in ("case _"++) . (s++) . (':':) . (case body of
A (A x y) z -> ("lazy3"++) . argc . genArgs m [x, y, z] . (");"++)
A x y -> ("lazy2"++) . argc . genArgs m [x, y] . (");"++)
E (StrCon s) -> (s++)
) . ("break;\n"++)
comb = (,) <$> conId <*> ((,) <$> many varId <*> (res "=" *> combExpr))
combExpr = foldl1 A <$> some
(V <$> varId <|> E . StrCon <$> lexeme tokStr <|> paren combExpr)
comdefs = case parse (lexemePrelude *> braceSep comb <* eof) comdefsrc of
Left e -> error e
Right (cs, _) -> cs
comEnum s = maybe (error s) id $ lookup s $ zip (fst <$> comdefs) [1..]
comName i = maybe undefined id $ lookup i $ zip [1..] (fst <$> comdefs)
runFun = ([r|
static int div(int a, int b) { int q = a/b; return q - (((u)(a^b)) >> 31)*(q*b!=a); }
static int mod(int a, int b) { int r = a%b; return r + (((u)(a^b)) >> 31)*(!!r)*b; }
static void run() {
for(;;) {
if (mem + hp > sp - 8) gc();
u x = *sp;
if (isAddr(x)) *--sp = mem[x]; else switch(x) {
|]++)
. foldr (.) id (genComb <$> comdefs)
. ([r|
}
}
}
void rts_init() {
mem = malloc(TOP * sizeof(u)); altmem = malloc(TOP * sizeof(u));
hp = 128;
for (u i = 0; i < sizeof(prog)/sizeof(*prog); i++) mem[hp++] = prog[i];
spTop = mem + TOP - 1;
}
void rts_reduce(u n) {
static u ready;if (!ready){ready=1;rts_init();}
*(sp = spTop) = app(app(n, _UNDEFINED), _END);
run();
}
|]++)
resolve bigmap (m, s) = either (resolve bigmap) id $ (bigmap ! m) ! s
mayResolve bigmap (m, s) = mlookup m bigmap
>>= fmap (either (resolve bigmap) id) . mlookup s
appCell (hp, bs) x y = (Right hp, (hp + 2, bs . (x:) . (y:)))
enc tab mem = \case
Lf n -> case n of
Basic c -> (Right $ comEnum c, mem)
Const c -> appCell mem (Right $ comEnum "NUM") $ Right c
ChrCon c -> appCell mem (Right $ comEnum "NUM") $ Right $ ord c
StrCon s -> enc tab mem $ foldr (\h t -> Nd (Nd (lf "CONS") (Lf $ ChrCon h)) t) (lf "K") s
Link m s _ -> (Left (m, s), mem)
LfVar s -> maybe (error $ "resolve " ++ s) (, mem) $ mlookup s tab
Nd x y -> let
(xAddr, mem') = enc tab mem x
(yAddr, mem'') = enc tab mem' y
in appCell mem'' xAddr yAddr
asm hp0 combs = tabmem where
tabmem = foldl (\(as, m) (s, t) -> let (p, m') = enc (fst tabmem) m t
in (insert s p as, m')) (Tip, (hp0, id)) combs
rewriteCombs tab = optim . go where
go = \case
LfVar v -> let t = follow [v] v in case t of
Lf (Basic _) -> t
LfVar w -> if v == w then Nd (lf "Y") (lf "I") else t
_ -> LfVar v
Nd a b -> Nd (go a) (go b)
t -> t
follow seen v = case tab ! v of
LfVar w | w `elem` seen -> LfVar $ last seen
| True -> follow (w:seen) w
t -> t
codegenLocal (name, (typed, _)) (bigmap, (hp, f)) =
(insert name localmap bigmap, (hp', f . memF))
where
rawCombs = optim . nolam . snd <$> typed
combs = toAscList $ rewriteCombs rawCombs <$> rawCombs
(localmap, (hp', memF)) = asm hp combs
codegen ffis mods = (bigmap', mem) where
(bigmap, (_, memF)) = foldr codegenLocal (Tip, (128, id)) $ toAscList mods
bigmap' = (resolveGlobal <$>) <$> bigmap
mem = resolveGlobal <$> memF []
ffiIndex = fromList $ zip (keys ffis) [0..]
resolveGlobal = \case
Left (m, s) -> if m == "{foreign}"
then ffiIndex ! s
else resolveGlobal $ (bigmap ! m) ! s
Right n -> n
getIOType (Qual [] (TAp (TC "IO") t)) = Right t
getIOType q = Left $ "main : " ++ show q
compile mods = do
let
ffis = foldr (\(k, v) m -> insertWith (error $ "duplicate import: " ++ k) k v m) Tip $ concatMap (toAscList . fst . snd) $ elems mods
(bigmap, mem) = codegen ffis mods
ffes = foldr (\(expName, v) m -> insertWith (error $ "duplicate export: " ++ expName) expName v m) Tip
[ (expName, (addr, argcount))
| (modName, (_, (_, ffes))) <- toAscList mods
, (expName, ourName) <- toAscList ffes
, let addr = maybe (error $ "missing: " ++ ourName) id $ mlookup ourName $ bigmap ! modName
, let argcount = arrCount $ mustType modName ourName
]
mustType modName s = case mlookup s $ fst $ mods ! modName of
Just (Qual [] t, _) -> t
_ -> error "TODO: report bad exports"
mayMain = do
mainAddr <- mlookup "main" =<< mlookup "Main" bigmap
(mainType, _) <- mlookup "main" $ fst $ mods ! "Main"
pure (mainAddr, mainType)
mainStr <- case mayMain of
Nothing -> pure ""
Just (a, q) -> do
getIOType q
pure $ genMain a
pure
$ ("typedef unsigned u;\n"++)
. ("enum{_UNDEFINED=0,"++)
. foldr (.) id (map (\(s, _) -> ('_':) . (s++) . (',':)) comdefs)
. ("};\n"++)
. ("static const u prog[]={" ++)
. foldr (.) id (map (\n -> shows n . (',':)) mem)
. ("};\nstatic u root[]={" ++)
. foldr (.) id (map (\(addr, _) -> shows addr . (',':)) $ elems ffes)
. ("0};\n" ++)
. (preamble++)
. (libc++)
. foldr (.) id (ffiDeclare <$> toAscList ffis)
. ("static void foreign(u n) {\n switch(n) {\n" ++)
. foldr (.) id (zipWith ffiDefine [0..] $ toAscList ffis)
. ("\n }\n}\n" ++)
. runFun
. foldr (.) id (zipWith (\(expName, (_, argcount)) n -> ("EXPORT(f"++) . shows n . (", \""++) . (expName++) . ("\")\n"++) . genExport argcount n) (toAscList ffes) [0..])
$ mainStr
-- FFI across multiple modules.
module Main where
import Base
import Map
import Ast
import RTS
import Typer
import Kiselyov
import System
hide_prelude_here' = hide_prelude_here'
dumpWith dumper s = case untangle s of
Left err -> err
Right tab -> foldr ($) [] $ map (\(name, mod) -> ("module "++) . (name++) . ('\n':) . (foldr (.) id $ dumper mod)) $ toAscList tab
dumpLambs (typed, _) = map (\(s, (_, t)) -> (s++) . (" = "++) . shows t . ('\n':)) $ toAscList typed
dumpTypes (typed, _) = map (\(s, (q, _)) -> (s++) . (" :: "++) . shows q . ('\n':)) $ toAscList typed
dumpCombs (typed, _) = map go combs where
rawCombs = optim . nolam . snd <$> typed
combs = toAscList $ rewriteCombs rawCombs <$> rawCombs
go (s, t) = (s++) . (" = "++) . shows t . (";\n"++)
main = getArgs >>= \case
"comb":_ -> interact $ dumpWith dumpCombs
"lamb":_ -> interact $ dumpWith dumpLambs
"type":_ -> interact $ dumpWith dumpTypes
_ -> interact \s -> either id id $ untangle s >>= compile