\$\begingroup\$

I'm working on a grammar for the Visual Basic for Applications (VBA) programming language. I've discovered a way to make assertions about how a parse tree should be generated by using the ANTLR ToStringTree method.

My method is this:

  1. Create a parser and give it a code snippet.
  2. Obtain the parse tree result and turn it into a string using ToStringTree.
  3. Perform a string comparison ignoring whitespace.

I suspect, though, that I may be too prescriptive in the final step. As an example, I'm working through the valid identifier use cases in my unit tests:

public class ModuleStatementTests
{
    private readonly List<string> ambiguousIdentifiers = new List<string>()
    {
        "Access",
        "Alias",
        "Append",
        "Base",
        "Binary",
        "ClassInit",
        "ClassTerm",
        "CLngLng",
        "Compare",
        "Database",
        "DefLngLng",
        "Error",
        "Explicit",
        "Lib",
        "Line",
        "LongLong",
        "Mid",
        "MidB",
        "Module",
        "Object",
        "Output",
        "Property",
        "PtrSafe",
        "Random",
        "Read",
        "Reset",
        "Step",
        "Text",
        "Width"
    };

    [Fact]
    public void CanParseAmbiguousIdentifierInVariableDeclaration()
    {
        const string variableDeclarationTemplate = "Dim {0} As String";
        const string expectedVariableDeclaration =
            "(variableDeclaration Dim (variableDclList (variableDcl (untypedVariableDcl (identifier {0}) (asClause (asType As (typeSpec (typeExpression (builtInType (reservedTypeIdentifier String))))))))))";
        CanParseAllAmbiguousIdentifiers(variableDeclarationTemplate, expectedVariableDeclaration, p => p.variableDeclaration());
    }

    private void CanParseAllAmbiguousIdentifiers(string sourceTemplate, string expectedOutputTemplate, Func<VbaParser, ParserRuleContext> rule)
    {
        foreach (var id in ambiguousIdentifiers)
        {
            var source = string.Format(sourceTemplate, id);
            var expectedTree = string.Format(expectedOutputTemplate, id);
            CanParseSource(source, expectedTree, rule);
        }
    }

    private static void CanParseSource(string source, string expectedTree, Func<VbaParser, ParserRuleContext> rule)
    {
        var parser = VbaCompilerHelper.BuildVbaParser(source);
        var result = rule(parser);
        Assert.Null(result.exception);
        ParseTreeHelper.TreesAreEqual(expectedTree, result.ToStringTree(parser));
    }
}
internal static class ParseTreeHelper
{
    internal static void TreesAreEqual(string expected, string actual)
    {
        if (expected == null || actual == null)
        {
            Assert.True(false, "Expected and/or Actual are null.");
        }
        var filteredExpected = RemoveWhiteSpace(expected);
        var filteredActual = RemoveWhiteSpace(actual);
        Assert.Equal(filteredExpected, filteredActual);
    }

    private static string RemoveWhiteSpace(string input)
    {
        // The final "\\t" replacement strips the literal two-character sequence \t,
        // which ANTLR seems to add to the ToStringTree output.
        return input.Replace("\t", "").Replace(" ", "").Replace("\\t", "");
    }
}
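
One possible generalisation of RemoveWhiteSpace, sketched here under the assumption that the tree string may also contain newlines or the escaped sequences \n and \r alongside \t (not confirmed against the ANTLR runtime used in this project). The name WhitespaceFilterSketch is hypothetical and not part of the code base:

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original project.
internal static class WhitespaceFilterSketch
{
    // Matches runs of real whitespace, plus the literal two-character
    // sequences \t, \n and \r (assumed to be how escaped whitespace appears).
    private static readonly Regex Noise = new Regex(@"\s+|\\[tnr]");

    internal static string Strip(string input)
    {
        return Noise.Replace(input, "");
    }
}
```

Note that this would also strip a backslash followed by t, n or r inside token text (a file path, for instance), so it only suits snippets where that cannot occur.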

My concern is that, by specifying the entire parse tree output (in string form), I'm being very prescriptive in what the test expects. I've already broken a lot of unit tests just by refactoring my grammar, even though the refactoring should not have broken them (semantically, at least). As a result, I have to alter my tests as often as my grammar, which makes the two highly coupled.

The purpose of this test is to see whether the grammar will accept each of the ambiguous identifiers, not to check the rest of the parse tree structure. This example is relatively simple; when a more complex snippet is used, the parse tree string gets crazy real quick!

PS: I've elided a bunch of code, so if I've missed anything relevant to understanding this, let me know and I'll update. The full source is in the link earlier as well.

asked Jul 18, 2015 at 9:25
\$\endgroup\$
  • \$\begingroup\$ I think ParseTreeHelper.TreesAreEqual() may be really relevant to your question, but I'm not entirely sure. \$\endgroup\$ Commented Jul 18, 2015 at 10:36

1 Answer

\$\begingroup\$

I wanted to post an idea I had to address my concern over the brittleness of my unit tests. I realised that, since I am performing a string comparison, I could use regular expressions and ignore the parts of the parse tree string that are not interesting. So I have this as a potential solution:

[Fact]
public void CanParseAmbiguousIdentifierInVariableDeclaration()
{
    const string variableDeclarationTemplate = "Dim {0} As String";
    const string regexTemplate = @"(\(variableDeclaration.*\(identifier {0}\).*\))";
    CanParseAllAmbiguousIdentifiersRegex(variableDeclarationTemplate, regexTemplate,
        p => p.variableDeclaration());
}

private void CanParseAllAmbiguousIdentifiersRegex(string sourceTemplate, string regexTemplate, Func<VbaParser, ParserRuleContext> rule)
{
    foreach (var id in ambiguousIdentifiers)
    {
        var source = string.Format(sourceTemplate, id);
        var regex = string.Format(regexTemplate, id);
        CanParseSourceRegex(source, regex, rule);
    }
}

private static void CanParseSourceRegex(string source, string regex, Func<VbaParser, ParserRuleContext> rule)
{
    var parser = VbaCompilerHelper.BuildVbaParser(source);
    var result = rule(parser);
    Assert.Null(result.exception);
    Assert.True(Regex.IsMatch(result.ToStringTree(parser), regex));
}
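
One caveat with the template above is that the identifier is spliced into the pattern unescaped. That is harmless for the identifiers listed in the question, but it could misfire if a test value ever contained a regex metacharacter. A minimal sketch of a more defensive match, run against a hard-coded tree string rather than a real parser; the names TreePatternSketch and ContainsIdentifier are hypothetical and not part of the project:

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original project.
internal static class TreePatternSketch
{
    // Same shape as the regex template above; {0} receives the escaped identifier.
    private const string Template = @"\(variableDeclaration.*\(identifier {0}\).*\)";

    internal static bool ContainsIdentifier(string tree, string id)
    {
        // Regex.Escape neutralises any metacharacters in the identifier.
        var pattern = string.Format(Template, Regex.Escape(id));

        // Singleline lets '.' match newlines, in case the tree string spans lines.
        return Regex.IsMatch(tree, pattern, RegexOptions.Singleline);
    }
}
```

Given the expected tree from the question with Access substituted for {0}, ContainsIdentifier(tree, "Access") matches while ContainsIdentifier(tree, "Alias") does not, so the assertion still pins down the identifier node without depending on the rest of the tree.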
answered Jul 20, 2015 at 3:30
\$\endgroup\$
