Background
I'm building a lisp-like toy language in Haskell with the following (stripped down) AST:
type Lisp = ExceptT Error (StateT Env IO)
type PrimOp = [Value] -> Lisp Value
data Value
-- | any identifier, e.g. "x", "lambda"
= VName Name
-- | s-expression, e.g. "(x y z)"
| VExpr (NonEmpty Value)
-- | curried lambda, e.g. "((lambda (x y) (y x)) 123)"
-- or even simply "(lambda (x) x)"
| VCurry (NonEmpty Name) Value Env
-- | primitive / built-in syntactic operations, e.g. "lambda"
| VPrimOp PrimOp
Part of the language is a primitive operation named lambda
:
(lambda (<args>...) <value>) -- syntax
(lambda (x y) x) -- example: const function
... which gets translated to the following in Haskell AST:
VExpr (VName "lambda" :| [VExpr (VName "x" :| [VName "y"]), VName "x"])
... then evaluated by the primitive operation primLambda
, converting it from a s-expression to a curried function:
VCurry ("x" :| ["y"]) (VName "x") <env> <- primlambda <above-ast>
The Problematic Code
Due to various quirks of the language, I cannot write the lambda
into the parser. Thus I have ended up with the following implementation of lambda
as a VPrimOp
:
primLambda :: PrimOp
primLambda values = do
(params, result) <- case values of
[params, result] -> pure (params, result)
_ -> throwError $ EArgCount "lambda" 2 (length values)
names <- case params of
VExpr names -> pure names
_ -> throwError $ EArgType "lambda" "VExpr" params
names' <- forM names $ \case
VName name -> pure name
name -> throwError $ EArgType "lambda" "VName" name
gets (VCont names' result)
Obviously there is a repeated pattern here: I'm doing some pattern matching at each of the 3 steps, and calling throwError
if the pattern does not match. However, the second and third step looks only at part of the first result, and the third step has to match everything in a list. Is there a more concise way to do this?
2 Answers 2
The amount of repeated code isn't large, I don't think that's worth worrying about. It does seem like you've got a lot of buried case
s making your nice do
block hard to read.
I don't think you can use MonadFail
for this (although I'm sure you've got the instance on hand); I assume you want the rich custom errors you're making.
My first suggestion would be to move this logic to helpers that live with your Error
class, something like
forceArgCount :: (MonadError Error m) => Int -> String -> [a] -> m [a]
forceArgCount neededLen label as = if neededLen == (length as)
then pure as
else throwError $ EArgCount label neededLen (length as)
...
primLambda values = do
[params, result] <- forceArgCount 2 "lambda" values
names <- unpackVExpr "lambda" params
names' <- forM names (unpackVName "lambda")
gets (VCont names' result)
This is unsatasfying in two ways:
- Pattern matching on a list of fixed size is runtime-failable.
Also value-constants that describe the shape of a data-structure (e.g.
2
) are smelly.
You could improve it (arguably) by using type-application to passneededLen
as a type-level Nat literal, and the return type would be a fixed-length vector with that length. (I was gonna crib together an example based on a project I did last year, but I'm not sure I should actually be recommending the libraries I was using. There are multiple options.) - You'll have to write a different
unpackX
function for every use-case. It looks like the derivable classData.Data.Data
(in base) is supposed to help you write a more generic version (by letting you handle constructors as values); it's not something I've done.
After tinkering around, I settled on this solution using optics and some type classes:
data Match a b = a :& b
infixr 1 :&
type family MatchLength (o :: Type) :: Nat
type instance MatchLength () = 0
type instance MatchLength (Match a b) = 1 + MatchLength b
class MatchOptic o s a | o -> s, o -> a where
matchOptic :: Text -> o -> s -> Lisp a
instance
( Is k An_AffineTraversal, Is k A_Review, Data s
) => MatchOptic (Optic' k i s a) s a where
matchOptic name optic value = either throws pure (matching optic value) where
throws _ = throwError (EArgType name
(show (toConstr (review optic undefined)))
(show (toConstr value)))
class MatchOptics o s r | o -> r where
matchOptics :: Name -> o -> [s] -> Lisp r
instance MatchOptics () s () where
matchOptics _ _ [] = pure ()
matchOptics name _ values = throwError (EArgCount name 0 (length values))
instance
( KnownNat (MatchLength os), MatchOptic o s a, MatchOptics os s as
) => MatchOptics (Match o os) s (Match a as) where
matchOptics name (optic :& optics) (value:values) = do
result <- matchOptic name optic value
results <- matchOptics name optics values `catchError` \case
EArgCount n x y -> throwError (EArgCount n (x + 1) (y + 1))
err -> throwError err
pure (result :& results)
matchOptics name optics [] = throwError (EArgCount name plen 0) where
plen = fromIntegral (natVal (Proxy @(MatchLength os)))
MatchOptic
matches a single Value
with an optic (more specifically a Prism
), and raises EArgType
if the optic does not produce a match. It uses Data
to automatically get the constructor of a value (e.g. show (toConstr (Just 1)) == "Just"
).
MatchOptics
matches a list of [Value]
with multiple optics, and uses :&
as a "polymorphic" list to store the Optic
s and the matched results.
The resulting primLambda
code looks like this:
primLambda :: PrimOp
primLambda values = do
names :& result :& () <- matchOptics "lambda"
(_VExpr :& simple @Value :& ()) values
names' <- mapM (matchOptic "lambda" _VName) names
gets (VCont names' result)
The code is much more concise and does length-matching and constructor-matching in one step. It also avoids the need to write many different unpackX
functions like in @ShapeOfMatter's answer, since optics are automatically generated using template-haskell.
The solution can be extended by providing alternative MatchOptic
instances. For example, the following instance and expression can be used to match aeson arrays that can also be null:
instance Is k A_Fold => MatchOptic (String, Optic' k i s a) s a where
matchOptic name (cons, optic) value = maybe throws pure (headOf optic value)
where throws = throwError (EArgType name cons "")
matchOptic "abc" ("optional array", failing
(_Null % to (const Nothing)) (_Array % to Just))
-
\$\begingroup\$ If you want a review of the updated code you need to add a follow up question rather than answering your own question. Since someone has already upvoted this answer, it is up to you to decide what you want. \$\endgroup\$2023年02月11日 19:29:33 +00:00Commented Feb 11, 2023 at 19:29
-
\$\begingroup\$ This is the solution I'm satisfied with, so I'm not looking for a review of it. \$\endgroup\$mingmingrr– mingmingrr2023年02月13日 01:16:52 +00:00Commented Feb 13, 2023 at 1:16
Explore related questions
See similar questions with these tags.
throwError
as part of theLisp
monad stack, because the user gave the interpreter an invalid program. Edited to clarify. \$\endgroup\$