2
\$\begingroup\$

Background

I'm building a lisp-like toy language in Haskell with the following (stripped down) AST:

type Lisp = ExceptT Error (StateT Env IO)
type PrimOp = [Value] -> Lisp Value
data Value
 -- | any identifier, e.g. "x", "lambda"
 = VName Name
 -- | s-expression, e.g. "(x y z)"
 | VExpr (NonEmpty Value)
 -- | curried lambda, e.g. "((lambda (x y) (y x)) 123)"
 -- or even simply "(lambda (x) x)"
 | VCurry (NonEmpty Name) Value Env
 -- | primitive / built-in syntactic operations, e.g. "lambda"
 | VPrimOp PrimOp

Part of the language is a primitive operation named lambda:

(lambda (<args>...) <value>) -- syntax
(lambda (x y) x) -- example: const function

... which gets translated to the following in Haskell AST:

VExpr (VName "lambda" :| [VExpr (VName "x" :| [VName "y"]), VName "x"])

... then evaluated by the primitive operation primLambda, converting it from a s-expression to a curried function:

VCurry ("x" :| ["y"]) (VName "x") <env> <- primlambda <above-ast>

The Problematic Code

Due to various quirks of the language, I cannot write the lambda into the parser. Thus I have ended up with the following implementation of lambda as a VPrimOp:

primLambda :: PrimOp
primLambda values = do
 (params, result) <- case values of
 [params, result] -> pure (params, result)
 _ -> throwError $ EArgCount "lambda" 2 (length values)
 names <- case params of
 VExpr names -> pure names
 _ -> throwError $ EArgType "lambda" "VExpr" params
 names' <- forM names $ \case
 VName name -> pure name
 name -> throwError $ EArgType "lambda" "VName" name
 gets (VCont names' result)

Obviously there is a repeated pattern here: I'm doing some pattern matching at each of the 3 steps, and calling throwError if the pattern does not match. However, the second and third step looks only at part of the first result, and the third step has to match everything in a list. Is there a more concise way to do this?

asked Dec 13, 2022 at 21:50
\$\endgroup\$
2
  • \$\begingroup\$ Welcome to Code Review! By "and throwing an error if the pattern does not match" does that mean that the code is not working as expected? \$\endgroup\$ Commented Dec 13, 2022 at 21:56
  • 2
    \$\begingroup\$ @sᴀᴍ-onᴇᴌᴀ I meant "error" as in the code uses throwError as part of the Lisp monad stack, because the user gave the interpreter an invalid program. Edited to clarify. \$\endgroup\$ Commented Dec 13, 2022 at 22:04

2 Answers 2

2
\$\begingroup\$

The amount of repeated code isn't large, I don't think that's worth worrying about. It does seem like you've got a lot of buried cases making your nice do block hard to read.

I don't think you can use MonadFail for this (although I'm sure you've got the instance on hand); I assume you want the rich custom errors you're making.

My first suggestion would be to move this logic to helpers that live with your Error class, something like

forceArgCount :: (MonadError Error m) => Int -> String -> [a] -> m [a]
forceArgCount neededLen label as = if neededLen == (length as)
 then pure as
 else throwError $ EArgCount label neededLen (length as)
...
primLambda values = do
 [params, result] <- forceArgCount 2 "lambda" values
 names <- unpackVExpr "lambda" params
 names' <- forM names (unpackVName "lambda")
 gets (VCont names' result)

This is unsatasfying in two ways:

  • Pattern matching on a list of fixed size is runtime-failable. Also value-constants that describe the shape of a data-structure (e.g. 2) are smelly.
    You could improve it (arguably) by using type-application to pass neededLen as a type-level Nat literal, and the return type would be a fixed-length vector with that length. (I was gonna crib together an example based on a project I did last year, but I'm not sure I should actually be recommending the libraries I was using. There are multiple options.)
  • You'll have to write a different unpackX function for every use-case. It looks like the derivable class Data.Data.Data (in base) is supposed to help you write a more generic version (by letting you handle constructors as values); it's not something I've done.
answered Dec 14, 2022 at 15:24
\$\endgroup\$
2
\$\begingroup\$

After tinkering around, I settled on this solution using optics and some type classes:

data Match a b = a :& b
infixr 1 :&
type family MatchLength (o :: Type) :: Nat
type instance MatchLength () = 0
type instance MatchLength (Match a b) = 1 + MatchLength b
class MatchOptic o s a | o -> s, o -> a where
 matchOptic :: Text -> o -> s -> Lisp a
instance
 ( Is k An_AffineTraversal, Is k A_Review, Data s
 ) => MatchOptic (Optic' k i s a) s a where
 matchOptic name optic value = either throws pure (matching optic value) where
 throws _ = throwError (EArgType name
 (show (toConstr (review optic undefined)))
 (show (toConstr value)))
class MatchOptics o s r | o -> r where
 matchOptics :: Name -> o -> [s] -> Lisp r
instance MatchOptics () s () where
 matchOptics _ _ [] = pure ()
 matchOptics name _ values = throwError (EArgCount name 0 (length values))
instance
 ( KnownNat (MatchLength os), MatchOptic o s a, MatchOptics os s as
 ) => MatchOptics (Match o os) s (Match a as) where
 matchOptics name (optic :& optics) (value:values) = do
 result <- matchOptic name optic value
 results <- matchOptics name optics values `catchError` \case
 EArgCount n x y -> throwError (EArgCount n (x + 1) (y + 1))
 err -> throwError err
 pure (result :& results)
 matchOptics name optics [] = throwError (EArgCount name plen 0) where
 plen = fromIntegral (natVal (Proxy @(MatchLength os)))

MatchOptic matches a single Value with an optic (more specifically a Prism), and raises EArgType if the optic does not produce a match. It uses Data to automatically get the constructor of a value (e.g. show (toConstr (Just 1)) == "Just").

MatchOptics matches a list of [Value] with multiple optics, and uses :& as a "polymorphic" list to store the Optics and the matched results.

The resulting primLambda code looks like this:

primLambda :: PrimOp
primLambda values = do
 names :& result :& () <- matchOptics "lambda"
 (_VExpr :& simple @Value :& ()) values
 names' <- mapM (matchOptic "lambda" _VName) names
 gets (VCont names' result)

The code is much more concise and does length-matching and constructor-matching in one step. It also avoids the need to write many different unpackX functions like in @ShapeOfMatter's answer, since optics are automatically generated using template-haskell.

The solution can be extended by providing alternative MatchOptic instances. For example, the following instance and expression can be used to match aeson arrays that can also be null:

instance Is k A_Fold => MatchOptic (String, Optic' k i s a) s a where
 matchOptic name (cons, optic) value = maybe throws pure (headOf optic value)
 where throws = throwError (EArgType name cons "")
matchOptic "abc" ("optional array", failing 
 (_Null % to (const Nothing)) (_Array % to Just))
answered Feb 11, 2023 at 18:34
\$\endgroup\$
2
  • \$\begingroup\$ If you want a review of the updated code you need to add a follow up question rather than answering your own question. Since someone has already upvoted this answer, it is up to you to decide what you want. \$\endgroup\$ Commented Feb 11, 2023 at 19:29
  • \$\begingroup\$ This is the solution I'm satisfied with, so I'm not looking for a review of it. \$\endgroup\$ Commented Feb 13, 2023 at 1:16

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.