Code too large #2161

Answered by MarcMazas
ssteinhauser asked this question in Q&A
Feb 10, 2025 · 17 comments · 11 replies

Hi,

I am currently struggling with the Code too large problem while I am trying to implement the IMPORT and EXPORT statements of Exasol (https://docs.exasol.com/db/latest/sql/import.htm / https://docs.exasol.com/db/latest/sql/export.htm).
You can see the current progress of my implementation here: https://github.com/ssteinhauser/JSqlParser/tree/feature/exasol-import-export

Did anyone already experience this problem and can provide a solution for that?

Thank you in advance!


Greetings.

I was worried that we would hit this at some point. It's a limitation of JavaCC: when the grammar is too complex, the generated parser becomes too large (beyond the 64 KB mark, which is the JVM's bytecode limit for a single method).
As far as I understand it, there is no immediate solution. You can only try to split your productions into smaller chunks.

In general, it's a JavaCC issue and should be discussed there.
The only immediate ideas I have are:

  1. split Productions into smaller chunks
  2. split off DDL from DML and Queries
  3. switch to CongoCC which appears to be the superior, but toxic technology
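
To illustrate idea 1: JavaCC emits one Java method per grammar production, and the 64 KB limit applies to each method separately, so moving alternatives into sub-productions spreads the generated code over several methods. The following is only a hand-written analogy; the class and method names are made up for illustration and none of this is actual JavaCC output.

```java
// Hand-written analogy with made-up names; this is not actual JavaCC output.
public class SplitProductionSketch {

    // Before: one method handles every alternative, so all of its bytecode
    // counts against a single 64 KB method budget.
    static String statementMonolithic(String keyword) {
        switch (keyword) {
            case "SELECT": return "query";
            case "INSERT":
            case "UPDATE": return "dml";
            case "CREATE":
            case "DROP":   return "ddl";
            default:       return "unknown";
        }
    }

    // After: the top-level method only dispatches; each helper below is a
    // separate method with its own budget.
    static String statementSplit(String keyword) {
        String kind = queryOrDml(keyword);
        if (kind == null) {
            kind = ddl(keyword);
        }
        return kind != null ? kind : "unknown";
    }

    // Returns null when the keyword does not start a query or DML statement.
    static String queryOrDml(String keyword) {
        switch (keyword) {
            case "SELECT": return "query";
            case "INSERT":
            case "UPDATE": return "dml";
            default:       return null;
        }
    }

    // Returns null when the keyword does not start a DDL statement.
    static String ddl(String keyword) {
        switch (keyword) {
            case "CREATE":
            case "DROP": return "ddl";
            default:     return null;
        }
    }

    public static void main(String[] args) {
        for (String keyword : new String[] {"SELECT", "UPDATE", "DROP", "MERGE"}) {
            System.out.println(keyword + " -> " + statementSplit(keyword));
        }
    }
}
```

Splitting of this kind only helps when the oversized method is one generated from a production.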

I opened a case at JavaCC, see javacc/javacc#297


Hi @manticore-projects ,
thank you for the quick reply.
It's unfortunate to hear that there is no actual solution for that. However, thank you for opening the issue at JavaCC.

The error occurs in the TokenManager class, so I guess splitting the productions further won't fix the issue. From my understanding (possibly I am wrong), it is due to the number of tokens.

So I guess my implementation is being blocked by that problem and we need to wait for a solution or workaround provided at JavaCC.


@ssteinhauser: please give it a few days to collect feedback and then we will look for an acceptable way forward.


@ssteinhauser: I have slept on this and cleared my thoughts. Without any promise, the most likely scenario now looks like this:

  1. I am aiming for a final JSQLParser 5.2 release (based on JavaCC) because I need the new PipeSQL parser released

  2. after that, I will likely look for a migration to CongoCC although I have many concerns around that:

    a) I will need to overhaul the complete Grammar

    b) I will need to rewrite parts of the tool chain (drop Maven, write a Gradle module, rewrite the Rail Road builder, rewrite the Reserved Keywords logic)

    c) there are some "people issues" to overcome

So it looks like we are aiming for a JSQLParser 6 release, with a lot of work to do.

@wumpz: Your opinion and endorsement will be needed for this one to work.


Hi @manticore-projects ,
alright, thank you for sharing your thoughts.

Regarding the overhaul of the complete Grammar, you can give the official syntax converter a try.

Please also let me know if I can or should already provide a PR with my changes, so that you could convert that syntax to CongoCC as well and I don't have to repeat the work on my end for my newly implemented syntax.


Hi.
We advise you to take some time to check that JavaCC 8 solves your issue without functional and performance regressions.
I cloned https://github.com/ssteinhauser/JSqlParser/tree/feature/exasol-import-export, modified pom.xml:

...
<!-- needed for parsing the Keywords via JTree in ParserKeywordsUtils -->
<dependency>
    <!-- <groupId>net.java.dev.javacc</groupId>-->
    <!-- <artifactId>javacc</artifactId>-->
    <!-- <version>[7.0.13,)</version>-->
    <groupId>org.javacc.generator</groupId>
    <artifactId>java</artifactId>
    <version>8.0.1</version>
    <!-- <version>8.1.0-SNAPSHOT</version>-->
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>org.javacc</groupId>
    <artifactId>core</artifactId>
    <version>8.0.1</version>
    <!-- <version>8.1.0-SNAPSHOT</version>-->
    <scope>test</scope>
</dependency>
...
<plugin>
    <groupId>org.javacc.plugin</groupId>
    <artifactId>javacc-maven-plugin</artifactId>
    <version>3.0.3</version>
    <executions>
        <execution>
            <id>javacc</id>
            <phase>generate-sources</phase>
            <goals>
                <goal>jjtree-javacc</goal>
            </goals>
            <configuration>
                <codeGenerator>java</codeGenerator>
                <!-- <grammarEncoding>UTF-8</grammarEncoding>-->
                <!-- <isStatic>false</isStatic>-->
                <!-- <jdkVersion>1.8</jdkVersion>-->
            </configuration>
        </execution>
    </executions>
    <dependencies>
        <!-- <dependency>-->
        <!-- <groupId>net.java.dev.javacc</groupId>-->
        <!-- <artifactId>javacc</artifactId>-->
        <!-- <version>[7.0.13,)</version>-->
        <!-- </dependency>-->
        <dependency>
            <groupId>org.javacc.generator</groupId>
            <artifactId>java</artifactId>
            <version>8.0.1</version>
            <!-- <version>8.1.0-SNAPSHOT</version>-->
        </dependency>
        <dependency>
            <groupId>org.javacc</groupId>
            <artifactId>core</artifactId>
            <version>8.0.1</version>
            <!-- <version>8.1.0-SNAPSHOT</version>-->
        </dependency>
    </dependencies>
</plugin>
...

You'll have to change your SimpleNode references to Node (and other things in your tests, which I did not investigate).

Tell me how things evolve.


@MarcMazas: Thank you a lot! I definitely will give this a try soonest and appreciate your help and support. I will return with findings and hopefully a success story.

Cheers and best
Andreas

Answer selected by manticore-projects

Hi,

By the way: the version you'll find on Maven Central is 8.0.1; I'm on version 8.1.0, which is currently not released. It has a lot of "laundry" changes that should normally have no impact; the main benefits for users should be prettier generated code and better debug traces (lookahead, token manager, parser). Would it be possible for you to check, without too much effort, that JSqlParser works fine with 8.0.1, and then spend more time checking that it works with 8.1.0 (for example on performance, traces…)?

TIA
Regards
Marc

Sure, I will try this out too -- but only next week.
Thank you again for helping!

About the "Code too large" issue: from what I've seen, you had 2 errors:

  • one for the static part (which is the one discussed in the article you mentioned)
  • one for a generated method (and the author of the article does not claim to guarantee that the same problem will not happen with his parser).

@ssteinhauser: There is some good news. After running into Code too large myself I have been finally forced to take this seriously and I managed to start a JavaCC-8 migration.

  1. it does indeed solve the code too large problem caused by too many tokens
  2. it works in general
  3. but it fails on two instances of token manipulation (T-SQL square brackets and advanced text escaping)

Here is the new branch: https://github.com/JSQLParser/JSqlParser/tree/javacc8

You can continue your implementation on that.


@ssteinhauser: We now have a fully functional JavaCC-8 based JSQLParser, which passes all tests and is on par in both performance and functionality. Please see: https://github.com/JSQLParser/JSqlParser/tree/javacc8_without_semantic_lookahead

It has no limitation on tokens, will be the basis for the next major release and can already be used as a drop-in replacement.
Please continue your development on that while I am sorting out the logistics in the background.


JSQLParser 5.4 Snapshot is running on JavaCC 8.1 Snapshot via 675e8b6.
This solves the restrictions on the token manager effectively.


I've just merged the master branch into my feature branch and tried to run a build, but it fails with "Could not find artifact org.javacc.generator:java:jar:8.1.0-SNAPSHOT".
Can you please help and provide guidance with that?

Greetings @ssteinhauser.

Unfortunately I can't reproduce your problem: I cloned your branch from Git and was then able to compile with Gradle as well as with Maven:

are@archlinux ~ [1]> mkdir test
are@archlinux ~> cd test
are@archlinux ~/test> git clone --single-branch --branch feature/exasol-import-export https://github.com/ssteinhauser/JSqlParser.git
are@archlinux ~/test> cd JSqlParser/
are@archlinux ~/t/JSqlParser (feature/exasol-import-export)> gradle check
are@archlinux ~/t/JSqlParser (feature/exasol-import-export)> mvn verify -DskipTests

I get errors regarding abstract classes, but this seems to be plain Java development:

[ERROR] COMPILATION ERROR : 
[INFO] -------------------------------------------------------------
[ERROR] /home/are/test/JSqlParser/src/main/java/net/sf/jsqlparser/statement/imprt/Import.java:[23,7] error: Import is not abstract and does not override abstract method setSampleClause(SampleClause) in FromItem
[INFO] 1 error

Everything looks good to me.
We can share log files or even screen when needed.

Good luck and cheers!


In case the artifact still can't be found (for whatever reason), you can build it and install it into your local Maven repository yourself:

mkdir javacc-8
cd javacc-8
git clone git@github.com:javacc/javacc-8.git
cd javacc-8
# pin javacc.java.version to the released 8.0.1 before building
sed -i 's|<javacc.java.version>[^<]*</javacc.java.version>|<javacc.java.version>8.0.1</javacc.java.version>|' pom.xml
# the profile name is quoted so that shells with history expansion (e.g. bash) do not interpret the '!'
mvn clean install -P '!run-its'
cd ..
git clone git@github.com:javacc/javacc-8-core.git
cd javacc-8-core
mvn clean install -P '!run-its'
cd ..
git clone git@github.com:javacc/javacc-8-java.git
cd javacc-8-java
mvn clean install -P '!run-its'
cd ..

Thanks for the quick reply. I've retried using a fresh installation of maven, but I still receive the same error.
Can you possibly share the link to the artifact in the corresponding public repository? Because I can't even find it when searching manually.

Before I proceed with installing the artifacts locally from GitHub, I'd like to mention that the sed command in the instructions above sets version 8.0.1, whereas the pom.xml of JSqlParser is looking for version 8.1.0-SNAPSHOT.

Hi
You will find the javacc-8 snapshots under https://central.sonatype.com/service/rest/repository/browse/maven-snapshots/org/javacc/.
For example, the published snapshots for the java generator 8.1.0-SNAPSHOT are under https://central.sonatype.com/service/rest/repository/browse/maven-snapshots/org/javacc/generator/java/8.1.0-SNAPSHOT/.
The reason why the sed command replaces 8.1.0-SNAPSHOT with 8.0.1 is to avoid the chicken-and-egg circular dependency between the core and the Java generator.


Thanks again for the quick reply. Now I was able to mitigate this issue by manually installing the dependencies in my local repository. I've also fixed the missing implementation of SampleClause.
Unfortunately, I am still facing the code too large issue for the CCJSqlParserTokenManager: .../JSqlParser/target/generated-sources/javacc/net/sf/jsqlparser/parser/CCJSqlParserTokenManager.java:[62,1] code too large

Since there are many new tokens in this feature branch, I assume it's due to them. Do you have any suggestions on how to proceed?


I will open another issue at JavaCC: javacc/javacc#309


I have just published a javacc-8-java snapshot that solves the error (I moved some static data into a static inner class).
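
For readers wondering why relocating static data can help: all static field initializers of a class are compiled into one synthetic initializer method (`<clinit>`), which is subject to the same 64 KB limit as any other method. A nested static class has its own initializer, so large tables placed there no longer count against the outer class. The sketch below uses made-up names and tiny tables; it is not the code that javacc-8-java actually generates.

```java
// Illustrative sketch with hypothetical names; not actual JavaCC 8 output.
public class TokenManagerSketch {

    // Small data can stay in the outer class initializer.
    static final String[] LEXICAL_STATES = {"DEFAULT"};

    // Each nested class gets its own static initializer, so this array is
    // initialized outside the outer class's <clinit> method.
    private static final class TokenImages {
        static final String[] IMAGES = {
            "<EOF>", "\"SELECT\"", "\"FROM\"", "\"IMPORT\"", "\"EXPORT\""
        };
    }

    // A second holder, again with a separate initializer.
    private static final class NextStates {
        static final int[] TABLE = {0, 1, 1, 2, 2};
    }

    // Accessors hide where the data lives from the rest of the class.
    static String imageOf(int kind) {
        return TokenImages.IMAGES[kind];
    }

    static int nextStateOf(int kind) {
        return NextStates.TABLE[kind];
    }

    public static void main(String[] args) {
        System.out.println(imageOf(3) + " -> state " + nextStateOf(3));
    }
}
```

A side effect of this layout is that each holder class is only initialized when one of its fields is first accessed.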


@ssteinhauser:

I can confirm this is working now after a hard refresh of the dependencies:

gradle clean test --no-build-cache --refresh-dependencies

@MarcMazas : thank you so much!


@ssteinhauser: please also keep an eye on the JMH benchmarks. I will not merge any changes that result in a significant performance deterioration, especially when the dialect/feature is rather exotic.

The following command will indicate problems early:

gradle jmh

And so far, your implementation looks very good:

Benchmark                               (version)  Mode  Cnt   Score   Error  Units
JSQLParserBenchmark.parseSQLStatements     latest  avgt   15  80.159 ± 2.467  ms/op
JSQLParserBenchmark.parseSQLStatements        5.3  avgt   15  80.410 ± 2.729  ms/op
JSQLParserBenchmark.parseSQLStatements        5.1  avgt   15  81.835 ± 2.689  ms/op

I take the opportunity to suggest that you consider not defining the infrequent keywords as tokens, but rather keeping them as identifiers matched by a semantic lookahead.
I have the strong feeling that keeping only the most common keywords (select, table...) as tokens would lead to better performance, because of less work for the token manager and the parser's jj2 routines.

Example 1 on simple cases:

void SqlAndPlSqlFunctions0() : // NOT to be used with semantic lookahead
{ }
{
  (
    LOOKAHEAD( { "EMPTY_BLOB".equalsIgnoreCase(getToken(1).image)
              || "EMPTY_CLOB".equalsIgnoreCase(getToken(1).image) } )
    < IDENTIFIER > "(" ")" // has no parameters but has parentheses
  | LOOKAHEAD( { "SYSDATE".equalsIgnoreCase(getToken(1).image) } )
    < IDENTIFIER >
    [
      LOOKAHEAD(2)
      "(" ")"
    ] // found with parentheses in prc010g01.fmb.txt line 42060 and without in other programs
  | LOOKAHEAD( { "SYSTIMESTAMP".equalsIgnoreCase(getToken(1).image)
              || "USER".equalsIgnoreCase(getToken(1).image) } )
    < IDENTIFIER >
  )
}

Example 2 on more choices:
in the grammar

...
 |
 LOOKAHEAD( { PFMOS_PlSqlMate.laucFormsBuiltInsProcedures0N(getTk1UC()) } )
 FormsBuiltInsProcedures0N()
 [
 "(" [ ArgumentList() ] ")"
 ]
 |
...

in the java section or in a helper class

  /**
   * Set used for semantic lookahead for FormsBuiltInsFunctions0
   */
  public static final Set<String> FBIF0 = new HashSet<>(16, 1f);
  static {
    FBIF0.add("DBMS_ERROR_CODE");
    FBIF0.add("DBMS_ERROR_TEXT");
    FBIF0.add("ERROR_CODE"); // also a PlSqlCursorAttributeTokens
    FBIF0.add("ERROR_TEXT");
    FBIF0.add("ERROR_TYPE");
    FBIF0.add("FORM_FAILURE");
    FBIF0.add("FORM_FATAL");
    FBIF0.add("FORM_SUCCESS");
    FBIF0.add("GET_MESSAGE");
    // | < LAST_OLE_ERROR // 6i
    FBIF0.add("MESSAGE_CODE");
    FBIF0.add("MESSAGE_TEXT");
    FBIF0.add("MESSAGE_TYPE");
  }

  /**
   * Semantic lookahead for FormsBuiltInsFunctions0.
   *
   * @param aTokenImageUC - the (next) token image in upper case
   * @return true if the token is a FormsBuiltInsFunctions0, false otherwise
   */
  static final boolean laucFormsBuiltInsFunctions0(final String aTokenImageUC) {
    return FBIF0.contains(aTokenImageUC);
  }

  /**
   * Gets the next token image in upper case (for semantic lookaheads).
   *
   * @return the next token image in upper case
   */
  final String getTk1UC() {
    return getToken(1).image.toUpperCase();
  }
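
A possible variation on the helper above, offered only as a sketch: a `TreeSet` ordered by `String.CASE_INSENSITIVE_ORDER` answers membership queries without upper-casing the token image first, which combines the case-insensitive matching of example 1 with the set lookup of example 2. The class name `RareKeywords` and the method `isRareKeyword` are hypothetical; in a grammar it might be called as `LOOKAHEAD( { RareKeywords.isRareKeyword(getToken(1).image) } )`, assuming `getToken(1).image` holds the matched text.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper class; the names are illustrative, not part of JSqlParser or JavaCC.
public final class RareKeywords {

    // Case-insensitive, read-only set of keywords that are NOT declared as tokens.
    private static final Set<String> KEYWORDS;
    static {
        Set<String> set = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        Collections.addAll(set, "EMPTY_BLOB", "EMPTY_CLOB", "SYSDATE", "SYSTIMESTAMP", "USER");
        KEYWORDS = Collections.unmodifiableSet(set);
    }

    private RareKeywords() {
        // static helper, no instances
    }

    // Returns true if the token image is one of the rare keywords, ignoring case.
    public static boolean isRareKeyword(String tokenImage) {
        return tokenImage != null && KEYWORDS.contains(tokenImage);
    }

    public static void main(String[] args) {
        System.out.println(isRareKeyword("sysdate"));
        System.out.println(isRareKeyword("SELECT"));
    }
}
```

The trade-off is a tree lookup with string comparisons instead of a hash lookup on an upper-cased copy; which one is faster for a given grammar would need to be measured.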

LOOKAHEAD( { "EMPTY_BLOB".equalsIgnoreCase(getToken(1).image)

Makes sense, but does not work with current JavaCC-8.
Please see javacc/javacc#307 (comment)


I'll look at it next week. I may have played a little too much with labels and images. I will first check what getToken(1).image returns. And I believe you need an Identifier() after the LOOKAHEAD.

@MarcMazas: Thank you!

We can continue this discussion here: javacc/javacc#307 (comment)
I have set up a simplified Grammar to test those observed issues with minimal complexity.
