Recursive Descent Parser Stack Overflow

Question 1

I have the following code for a parser that accepts a linked list as a parameter from a parent class. It throws a stack overflow error after I input the expression.

I pass the input expression from a jTextField in a Swing GUI class, then return the boolean result to a jLabel in the same class. What could be causing the stack overflow error? Help please, Thanks!!

Example input:

1+2+3

(1+2)/(3+4)

import java.util.LinkedList;
class Token {
 public static final int PLUS_MINUS = 0;
 public static final int MULTIPLY_DIVIDE = 1;
 public static final int EXPONENT = 2;
 public static final int NUMBER = 3;
 public static final int IDENTIFIER = 4;
 public static final int OPEN = 5;
 public static final int CLOSE = 6;
 //public static final int NEGATIVE = 7;
 public final int token; // FIELDS TO HOLD DATA PER TOKEN
 public final String sequence;
 public Token (int token, String sequence) {
 super();
 this.token = token;
 this.sequence = sequence;
 }
}
public class Parser {
 private Token next; // POINTER FOR NEXT TOKEN
 private LinkedList<Token> tokens; // LIST OF TOKENS PRODUCED BY TOKENIZER
 private int counter = 0;
 public Parser(LinkedList tokens) 
 {
 this.tokens = (LinkedList<Token>) tokens.clone(); // GET LINKEDLIST
 this.tokens.getFirst(); // ASSIGNS FIRST ELEMENT OF LINKEDLIST
 }
 //////// START OF PARSING METHODS ////////
 /*
 GRAMMAR:
 E -> E+T | E-T | T | -E
 T -> T*X | T/X | X
 X -> X^F | F
 F -> (E) | NUMBERS | IDENTIFIERS
 F -> (E) | N | I
 N -> D | ND
 I -> IDENTIFIERS
 */
 public boolean Parse ()
 {
 return E(); // INVOKE START SYMBOL
 }
 private boolean term (int token) // GETS NEXT TOKEN
 {
 boolean flag = false;
 if(next.token == token)
 flag = true;
 counter++; // INCREMENT COUNTER
 if(counter < tokens.size()) // POINT TO NEXT TOKEN
 next = tokens.get(counter);
 return flag;
 }
 ///////// START OF LIST OF PRODUCTIONS /////////
 //////// E -> E+T | E-T | T | -E ////////
 private boolean E() 
 {
 return E1() || E2() || E3();
 }
 private boolean E1 ()
 {
 // E -> E+T | E-T
 int flag = counter;
 boolean result = true;
 if(!(E() && term(Token.PLUS_MINUS) && T() ))
 {
 counter = flag; // BACKTRACK
 if(counter < tokens.size()) // POINT TO PREVIOUS TOKEN
 next = tokens.get(counter);
 result = false;
 }
 return result;
 }
 private boolean E2 ()
 {
 // E -> T
 int flag = counter;
 boolean result = true;
 if(!T())
 {
 counter = flag; // BACKTRACK
 if(counter < tokens.size()) // POINT TO PREVIOUS TOKEN
 next = tokens.get(counter);
 result = false;
 }
 return result;
 }
 private boolean E3 ()
 {
 // E -> -E
 int flag = counter;
 boolean result = true;
 if(!(term(Token.PLUS_MINUS) && E() ))
 {
 counter = flag; // BACKTRACK
 if(counter < tokens.size()) // POINT TO PREVIOUS TOKEN
 next = tokens.get(counter);
 result = false;
 }
 return result;
 }
 //////// T -> T*X | T/X | X ////////
 private boolean T()
 {
 return T1() || T2();
 }
 private boolean T1 ()
 {
 // T -> T*X | T/X
 int flag = counter;
 boolean result = true;
 if(!(T() && term(Token.MULTIPLY_DIVIDE) && X() ))
 {
 counter = flag; // BACKTRACK
 if(counter < tokens.size()) // POINT TO PREVIOUS TOKEN
 next = tokens.get(counter);
 result = false;
 }
 return result;
 }
 private boolean T2 ()
 {
 // T -> X
 int flag = counter;
 boolean result = true;
 if(!X())
 {
 counter = flag; // BACKTRACK
 if(counter < tokens.size()) // POINT TO PREVIOUS TOKEN
 next = tokens.get(counter);
 result = false;
 }
 return result;
 }
 //////// X -> X^F | F ////////
 private boolean X()
 {
 return X1() || X2();
 }
 private boolean X1()
 {
 // X-> X^F
 int flag = counter;
 boolean result = true;
 if(!(X() && term(Token.EXPONENT) && F()))
 {
 counter = flag; // BACKTRACK
 if(counter < tokens.size()) // POINT TO PREVIOUS TOKEN
 next = tokens.get(counter);
 result = false;
 }
 return result;
 }
 private boolean X2()
 {
 // X-> F
 int flag = counter;
 boolean result = true;
 if(!F())
 {
 counter = flag; // BACKTRACK
 if(counter < tokens.size()) // POINT TO PREVIOUS TOKEN
 next = tokens.get(counter);
 result = false;
 }
 return result;
 }
 //////// F -> (E) | NUMBERS | IDENTIFIERS ////////
 private boolean F()
 {
 return F1() || F2() || F3();
 }
 private boolean F1()
 {
 // F -> (E)
 int flag = counter;
 boolean result = true;
 if(!(term(Token.OPEN) && E() && term(Token.CLOSE)))
 {
 counter = flag; // BACKTRACK
 if(counter < tokens.size()) // POINT TO PREVIOUS TOKEN
 next = tokens.get(counter);
 result = false;
 }
 return result;
 }
 private boolean F2()
 {
 // F -> NUMBERS
 int flag = counter;
 boolean result = true;
 if(!term(Token.NUMBER))
 {
 counter = flag; // BACKTRACK
 if(counter < tokens.size()) // POINT TO PREVIOUS TOKEN
 next = tokens.get(counter);
 result = false;
 }
 return result;
 }
 private boolean F3()
 {
 // F -> IDENTIFIERS
 int flag = counter;
 boolean result = true;
 if(!term(Token.IDENTIFIER))
 {
 counter = flag; // BACKTRACK
 if(counter < tokens.size()) // POINT TO PREVIOUS TOKEN
 next = tokens.get(counter);
 result = false;
 }
 return result;
 }
}

Question 2

Do you have an example of the input?

Question 3

Yes hello this is Stack Overflow how may we help you? (And so you know, it's generally better to trim your code just to the minimal amount. And some of your comments are kinda trash -- explain why, not what! And your brackets are on the wrong line for Java. And... Er, why am I criticizing your code instead o answering? And why am I whispering? Oh well. Gonna go pretend to be smart somewhere else.)

Question 4

For handling left recursive rules in a left recursive parser, see stackoverflow.com/questions/2245962/…

Question 5

Standard reminder that source-level debuggers, or print statements in lieu thereof, are the obvious way to try to understand what the code is doing when you can't figure it out by inspection. (Programming: the art of debugging an empty file.)

Question 6

Your problem is that recursive descent parsing cannot handle left recursive grammars. your first production says "E -> E + T", we call this left recursive, because the first thing being derived is the thing you are defining. The way recursive descent works is to first match an "E" then a "+" then a "T". The problem is that your "E" method first calls the "E1" method, which then immediately calls "E" again, which calls "E1" and encounters an infinite recursive loop. You need to left factor your grammar if you want to use recursive descent. I've copied a link that has more information on left factoring: http://en.wikipedia.org/wiki/Left_recursion. So in summary, you are overflowing the stack because you have an infinite recursive loop.

Question 7

Usually when you get a stack overflow it means that the program recursively calls one or methods without end, so let's look at how your program would execute.

It appears that your code is executed using the Parse method (by the way in Java the naming convention is to have lower case method names)

public boolean Parse() {
 return E(); // INVOKE START SYMBOL
}

Not too bad so far; the grammar specifies that E is first to be parsed, so E() is called. Let's look at the definition of E().

private boolean E() {
 return E1() || E2() || E3();
}

Let's see what happens when this executes. Java will evaluate this boolean expression by doing E1(), then E2(), and finally E3(), so let's look at E1 now.

private boolean E1 () {
 // E -> E+T | E-T
 int flag = counter;
 boolean result = true;
 if(!(E() && term(Token.PLUS_MINUS) && T() )) {
 counter = flag; // BACKTRACK
 ...

Here is the problem. Your flag is set, result is set to true, and the if statement immediately executes E(). Remember that E() was what was just being evaluated, and now E1() is calling E() again, which will execute E1(), forever (and if you debugged the program you would see in the application stack alternating calls to E1() and E()).

This is part of the trouble of recursive descent parsing. It is an elegant solution to parsing, but the grammar sometimes requires some rewriting or else this is the exact problem you run into, where you get trapped in a grammar rule. In order for this to work you need to rewrite the grammar (see this document on recursive descent parsing I found with a quick Google search).

There are two requirements for the grammar: it must be deterministic and it can't contain left recursion.

The problem you run into is left recursion, in nearly every rule:

E -> E+T | E-T | T | -E

This says that the token E can be the token E + T, and to recognize this you have to recognize the token E, which can be E + T, ... (forever). This caused the problem in your program.

Rewriting the grammar by eliminating the left recursion will solve this problem, and make sure when you finish you have a deterministic grammar.

Matt 4,0573 gold badges22 silver badges37 bronze badges · Accepted Answer · 2015-05-25 02:01:16Z

Your problem is that recursive descent parsing cannot handle left recursive grammars. your first production says "E -> E + T", we call this left recursive, because the first thing being derived is the thing you are defining. The way recursive descent works is to first match an "E" then a "+" then a "T". The problem is that your "E" method first calls the "E1" method, which then immediately calls "E" again, which calls "E1" and encounters an infinite recursive loop. You need to left factor your grammar if you want to use recursive descent. I've copied a link that has more information on left factoring: http://en.wikipedia.org/wiki/Left_recursion. So in summary, you are overflowing the stack because you have an infinite recursive loop.

CollectivesTM on Stack Overflow

Recursive Descent Parser Stack Overflow

2 Answers 2

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

CollectivesTM on Stack Overflow

2 Answers 2

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related