2
\$\begingroup\$

I had to implement a parser according to a given grammar as part of a school project. However, only the program execution was graded and I want to know if my code adheres to best practices. I would really appreciate it if you could give me pointers on how to improve my code.

Search conditions are of the form: a + b < 10 OR c = 1. The following code parses the search condition into an expression tree. The input to the function is a vector of the tokens in the search condition.

void recursiveSplit(node *parent, vector<string> const &tokens, string delimiter) {
 /* Given a string, parse it into an expression tree with nodes as the
 * operators and the children as the operands */
 if (tokens.size() == 1) {
 /* Base case for the recursion, the leaf nodes are either strings or
 * integers */
 node *mathNode;
 mathNode = new node(tokens[0]);
 parent->subTree.push_back(mathNode);
 return;
 } else {
 vector<string>::const_iterator it;
 for (it = tokens.begin(); it != tokens.end(); it++) {
 if (*it == delimiter) {
 break;
 }
 }
 if (it != tokens.end()) {
 /* Case 1 :delimiter was found */
 node *delimNode;
 delimNode = new node(delimiter);
 parent->subTree.push_back(delimNode);
 vector<string> leftTokens(tokens.begin(), it), rightTokens(it+1, tokens.end());
 string newDelimiter = switchDelimiter(delimiter);
 recursiveSplit(delimNode, leftTokens, newDelimiter);
 /* In rightTokens, we still need to search for the current
 * delimiter since we may have multiple instances of the delimiter
 * */
 recursiveSplit(delimNode, rightTokens, delimiter);
 } else {
 // delimiter is not found
 string newDelimiter = switchDelimiter(delimiter);
 recursiveSplit(parent, tokens, newDelimiter);
 }
 }
}
string switchDelimiter(string delimiter) {
 /* Return the next delimiter based on the priority. The priority of
 * operations is : OR, AND, <, >, =, +, -, * */
 char delim;
 if (delimiter == "OR") {
 delim = '|';
 } else if (delimiter == "AND") {
 delim = '&';
 } else {
 delim = delimiter[0];
 }
 switch(delim) {
 case '|':
 return "AND";
 case '&':
 return "<";
 case '<':
 return ">";
 case '>':
 return "=";
 case '=':
 return "+";
 case '+':
 return "-";
 case '-':
 return "*";
 case '*':
 return " ";
 }
}
node *createTree(const string& searchStrBuf) {
 string buf;
 stringstream ss(searchStrBuf);
 vector<string> tokens;
 while(ss >> buf) {
 if ((buf != ")") && (buf != "(")) {
 tokens.push_back(buf);
 }
 }
 node *root = new node("searchTreeRoot");
 recursiveSplit(root, tokens, "OR");
 return root;
}
class node {
 public:
 string nodeType;
 vector<node *> subTree;
 node(string strNodeType) {
 nodeType = strNodeType;
 }
};
Jamal
35.2k13 gold badges134 silver badges238 bronze badges
asked Jan 6, 2018 at 4:39
\$\endgroup\$

1 Answer 1

1
\$\begingroup\$

You don't show how this is being called. So we have to make some assumptions:

  1. I assume that the search conditions have been parsed into tokens as one-token-per-string in the vector? That is, for search conditions like "a + b < 10 OR c = 1" (your example), the tokens vector looks like { "a", "+", "b", "<", "10", "OR", "c", "=", "1" }.

  2. I assume that the node type contains a vector<node *> called subTree.

  3. You also don't specify if you expect the resulting tree to be a binary tree or not. Your code is written that way, but it doesn't seem to be a requirement. Other than the relational operators, the rest of your operators could easily handle more than two operands: a OR b OR c, 1 + 2 + 3, etc. I assume that the tree must be binary, since you coded it that way.

With those assumptions, I observe:

  1. You are not handling the case of a zero-length tokens vector. This would be an empty input, or an error of some kind.

  2. You are not handling syntax errors at all, really. What should your code be returning for something like a + * b?

  3. You are using recursion when it is not necessary. Specifically, your switchDelimiter function is really just a way to hide iteration. It would be better, IMO, to write your code to loop over the delimiters instead of calling a function and then recursing with the same tokens and a new delimiter.

  4. You are using recursion when it is not necessary. Specifically, your "right subtree" case recurses with the same delimiter, and it + 1. That's a sign that you can just continue your existing loop, after updating the parent node to a new value. (You will have to move your "new node" code inside the loop, though.)

  5. 3.
answered Jan 8, 2018 at 3:38
\$\endgroup\$
2
  • \$\begingroup\$ I can't seem to figure out how to write my code with iterations. I make a loop over the delimiters and for each delimiter, I add all its occurrences to the tree. But then, I need to connect the next delimiter to its parent node, how do I choose the parent node among the multiple instances found earlier? \$\endgroup\$ Commented Jan 13, 2018 at 20:40
  • \$\begingroup\$ You don't. When you find a delimiter, you recurse to build a subtree for the left node using a higher-precedence delimiter (you can pass the current delimiter, or just let it run). That recursion should produce your left subtree going from tokens.begin() up to the current it location. When you come back from the recursion, update your parent node to point to the node you just created, and keep looping. Because you update parent, the next delimiter will create a new right subtree. \$\endgroup\$ Commented Jan 13, 2018 at 21:00

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.