Sparse: Test Generation for Simple S-expression Language Parsers🔗 i

Sparse is a tool (written in Racket) for generating test cases for parsers for simple S-expression languages. It is intended primarily for use by instructors of Programming Languages classes in which students implement a simple S-expression-based programming language. It takes as input a description of the grammar of the language to be implemented and produces test cases for the students’ parsers.

1Motivation🔗 i

Producing test cases for parsers by hand is tedious and difficult to do exhaustively. In addition, it is likely that students will have to implement several different versions of the language, in which case several different sets of test cases must be maintained. Furthermore, if any changes to the language are made from term to term, the test cases must be updated–and doing so may not always be a simple find-and-replace. It would be nice to have a tool which automatically generates exhaustive test cases based solely on the language’s grammar. This is exactly what Sparse does.

1.1Example🔗 i

Consider the following grammar for a simple language (dubbed PHYM4) which has conditionals, local variable binding, lambdas, function applications, numbers, strings, and identifiers:

Expr={ifExprExprExpr}
| {var {Id = Expr} ... Expr} |{lam{Id...}Expr}
| {Expr Expr ...} |Number
| String |Id

Here, Number, String, and Id are not to be taken literally, but rather as placeholders for literal numbers, strings, and identifiers. Additionally, if, var, lam, and = are to be forbidden as identifiers. Built-in operations, functions, and constants like +, *, <=, substring, or true are treated like any other identifier from the point of view of this language and so are not treated specially by the grammar. An example program from this language is

{{lam{xy}{if{<=xy}xy}}54}

which introduces a function to determine the minimum of two numbers, and then applies it to 5 and 4. Note, that we use braces as delimiters instead of parentheses to distinguish this tiny language from Racket. This language is very simple, and yet writing a suite of input programs to test students’ parsers would be tedious–and would likely fail to be comprehensive.

Sparse takes as input a description of the grammar which is not unlike the grammar shown above. It is worth noting here the largest limitation of Sparse. Sparse’s input grammar may only have one nonterminal. In this example, this is Expr. For languages whose grammars have more than one nonterminal, this can be worked around by only generating test cases for the core expression grammar, which is more likely to be able to be expressed as a grammar with one nonterminal, and leaving the full language to be tested with manually-written tests.

(definephym4-grammar
'([truefalsenull+-*/eq?<=substringarraynew-arrayarefaset!begin]
[lam=varif]
[Expr
{ifExprExprExpr}
{ExprExpr...}
{var{[id]=Expr}...Expr}
{lam{[id]...}Expr}
[id]
[string"Hello""World"""]
[number01-12.2-22/7]]))

The details of the format of the input grammar are described below. Using this grammar, one can call generate-valid-testcase to generate one large test case which will test to make sure the student’s parser accepts all valid programs. The following is produced for the grammar defined above:

{if{lam{}{lam{}{{lam{}{var{substring={lam{}{if{{{var{/={{if{var{-=
{var{+={if{iftrue{lam{truefalsenull+}false}{lam{-*/}"Hello"}}{null
{lam{eq?<=}0}{"World"{var{true=+}{1{if""{var{false="Hello"}{var{null
=-1}{if2.2{if{var""}-{{{{var/}-22/7}"World"}*}}{var0}}}}{if{var{lam
{substring}{if{if<=substring-1}1"World"}}}"Hello"eq?}}array}}new-arrayaref}
aset!begin}a}}b}}{*=c}d}ef}}}{eq?=g}{<==h}i}}}jk}}}{array=l}
{new-array=m}{aref=n}o}}}}}pq}

Additionally, one can call generate-invalid-testcases to generate a list of test cases which do *not* conform to the grammar to make sure that the student’s parser rejects all invalid programs. The following are produced for the grammar defined above.

{lamtruefalsenull}
{iflam"Hello""World"}
{if0lam1}
{if+-lam}
{}
{if}
{if""}
{if-1"Hello"}
{if0"Hello"eq?"World"}
{if<=1-1substring""}
{if"Hello"array2.2new-array"World"-22/7}
{=aref}
{""=}
{lam{true=e}f}
{var{lam=""}"World"}
{var{falselam1}1}
{var{null=var}j}
{var{}""}
{var{+}-1}
{var{-=}k}
{var{eq?=g"World"}l}
{var{<==2.2h""}"World"}
{var{substring=-22/7i"Hello"0}-22/7}
{var{array=m}if}
{var}
{var{new-array=""}}
{var{begin="World"}o""}
{var{a=-1}2.2p"Hello"}
{var{b=q}-22/7r"World"0}
{={true}s}
{lam{lam}""}
{lam{aref}=}
{lam}
{lam{aset!}}
{lam{b}2.2w}
{lam{c}"Hello"-22/7x}
{lam{d}"World"0y""}
lam
var

Since the "valid" test case produced is so large, it would likely not be very enlightening to a student trying to understand where exactly their parser is failing. To help alleviate this, Sparse includes a utility for producing minimal failing test cases for a student’s parser. For example, for one student’s parser, the following minimal failing test cases were identified:

{lam{truefalsenull+}false}
{lam{-*/}"Hello"}
{lam{eq?<=}1}
{lam{array}0}
{var{false="World"}x}
{var{+=0}x}

Here, the student’s mistake was forbidding the built-ins as parameter names; the language’s grammar and semantics allow redefining the names. It is worth noting that this student had actually passed all of the instructor’s original test cases. Since the instructor’s test cases only tried redefining a built-in once, a student whose submission was rejected the first time could simply special-case that one built-in and resubmit their assignment successfully. Manually writing test cases which redefine each built-in is rather tedious and requires a fair bit of foresight on the part of the instructor to know that that is an area where students might mistakes. By automating the generation of test cases, we can both reduce the tedium of writing test cases and make the test suite more comprehensive.

1.2Usage🔗 i

(require sparse ) package: sparse

Sparse exposes six functions: generate-valid-testcase , generate-invalid-testcases , generate-valid-testcase-for-minimization , conforms-to-grammar? produce-minimum-failing-testcases , and strip-minimization-info .

The first four of these functions take an S-expression representing the grammar for which to generate test cases. Several examples are provided in ‘example-grammars.rkt‘ Consider the example grammar from before:

(definephym4-grammar
;;ThefirstlistisforanylegalidentifiersyouwantSparsetouseexplicitly.
;;Thiscanbeusefulifyouthinkstudentsmightforbididentifiersthatarelegal.
'([truefalsenull+-*/eq?<=substringarraynew-arrayarefaset!begin]
;;Thesecondlistisforthesymbolsthatarenotlegalidentifiers
[lam=varif]
;;Thethirdlistisfortheproductionrulesofthegrammar.Thefirstelement
;;shouldbeasymbolforthegrammar'snonterminal.
;;Unfortunately,Sparsedoesnotsupportmorethanonenonterminal.
[Expr
;;Theproductionrules.Anynumbersandsymbolsotherthanthenameof
;;nonterminalaretreatedasliterals.
{ifExprExprExpr}
;;'...'meansthatthetermimmediatelytotheleftmayappearzeroormoretimes.
{ExprExpr...}
;;[id](or(id)or{id})referstovalididentifiers.Thiswillusethe
;;identifiersprovidedabove,aswellasthelettersofthealphabeta-z.
{var{[id]=Expr}...Expr}
{lam{[id]...}Expr}
[id]
;;Thisrepresentsanystring.Atleastoneexamplestringtousemustbeprovided.
[string"Hello""World"""]
;;Thisrepresentsanynumber.Atleastoneexamplenumbertousemustbeprovided
[number01-12.2-22/7]]))

procedure
( generate-valid-testcase grammar)→s-expression?
grammar:grammar?

Using the grammar, produces one large S-expression which conforms to the grammar and is comprehensive in the following sense: each different production rule in the grammar will be used in place of each instance of a nonterminal in each production rule. For example, using the above grammar, the produced S-expression is guaranteed to have a subexpression like {if_{var{_=_}_}_}. A var expression is guaranteed to appear in each of the three subexpressions of an if expression. Likewise, each different ’type’ of expression is guaranteed to appear in each spot where a subexpression is allowed. This process is deterministic; there’s no point in generating multiple expressions from the same grammar.

procedure
( generate-invalid-testcases grammar)→(listofs-expression?)
grammar:grammar?

Using the grammar, produces a list of simple S-expressions which do not conform to the grammar. The produced S-expressions are typically invalid because they use illegal identifiers or because they have incorrect lengths of lists. For instance, the above grammar will produce expressions that use lam in places where lam is not allowed, and will create if expressions that don’t have exactly three subexpressions.

procedure
( generate-valid-testcase-for-minimization grammar)
→s-expression?
grammar:grammar?

Produces an expression just like generate-valid-testcase , but that is additionally annotated with information that lets the test case minimization know which parts of the S-expression it is allowed to recurse into. Use strip-minimization-info to remove this information and get a pure S-expression.

procedure
( conforms-to-grammar? grammarexpression)→boolean?
grammar:grammar?
expression:sexp?

Checks whether the expression conforms the grammar specified. This will return true for the return value of generate-valid-testcase and false for all of the return values of generate-invalid-testcases .

procedure
( produce-minimum-failing-testcases test-case
parser
preprocess)
→(listofs-expression?)
test-case:s-expression?
parser:(->s-expression?any)
preprocess:(->s-expression?s-expression?)

Given a test case generated by generate-valid-testcase-for-minimization and a student’s parse function, produces a list of "minimal" test cases on which the student’s parser fails (i.e., throws an exception). The list of test cases is not exhaustive in the sense that there might be expressions on which the parser fails that won’t be in the list until other expressions in the list have successfully parsed. The list of expressions will only be empty if the student’s parser successfully parses the whole, large test case. If you find that the minimization takes longer than you would like, you might instruct your students to memoize their parse function. The preprocess argument is applied to any expression before it is passed to the student’s parser. This is useful if, for instance, the language in question has a top-level construct around which expressions must be wrapped.

procedure
( strip-minimization-info minimization-testcase)→s-expression?
minimization-testcase:s-expression?

Strips the annotations from the return values of generate-valid-testcase-for-minimization to give a pure S-expression. (strip-minimization-info (generate-valid-testcase-for-minimization grammar)) will produce the same value as (generate-valid-testcase grammar).

top ← prev up next →