Code too large #2161
Hi,
I am currently struggling with the Code too large problem while trying to implement the IMPORT and EXPORT statements of Exasol (https://docs.exasol.com/db/latest/sql/import.htm / https://docs.exasol.com/db/latest/sql/export.htm).
You can see the current progress of my implementation here: https://github.com/ssteinhauser/JSqlParser/tree/feature/exasol-import-export
Has anyone already experienced this problem and can suggest a solution?
Thank you in advance!
Greetings.
I was worried that we would hit this at some point. It's a limitation of JavaCC when the grammar is too complex and the generated parser becomes too large (beyond the 64 KB mark, which is the JVM's limit on the bytecode of a single method).
As far as I understand it, there is no immediate solution. You can only try to split your productions into smaller chunks.
In general, it's a JavaCC issue and should be discussed there.
The only prompt ideas I have are:
- split productions into smaller chunks
- split off DDL from DML and queries
- switch to CongoCC, which appears to be the superior, but toxic technology
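To illustrate the first idea: JavaCC generates one Java method per production, and the 64 KB limit applies to each method separately, so splitting an oversized production spreads its code across several methods. The following is a minimal hand-written sketch of that principle, not generated JavaCC code. The class SplitProductionSketch and its heavily simplified IMPORT/EXPORT syntax are hypothetical and only serve as an illustration.

```java
import java.util.List;

// Hypothetical hand-written analogue of a generated parser: every production
// is one Java method, and each method has its own 64 KB bytecode budget.
final class SplitProductionSketch {
    private final List<String> tokens;
    private int pos = 0;

    SplitProductionSketch(List<String> tokens) {
        this.tokens = tokens;
    }

    // The top-level production only dispatches, so its method body stays small.
    String statement() {
        String first = peek();
        if ("IMPORT".equalsIgnoreCase(first)) {
            return importStatement();
        }
        if ("EXPORT".equalsIgnoreCase(first)) {
            return exportStatement();
        }
        throw new IllegalStateException("Unexpected token: " + first);
    }

    // Each alternative lives in its own production, i.e. in its own method.
    private String importStatement() {
        expect("IMPORT");
        expect("INTO");
        return "import into " + next();
    }

    private String exportStatement() {
        expect("EXPORT");
        String table = next();
        expect("INTO");
        return "export " + table + " into " + next();
    }

    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : "<EOF>";
    }

    private String next() {
        if (pos >= tokens.size()) {
            throw new IllegalStateException("Unexpected end of input");
        }
        return tokens.get(pos++);
    }

    private void expect(String keyword) {
        String actual = next();
        if (!keyword.equalsIgnoreCase(actual)) {
            throw new IllegalStateException("Expected " + keyword + " but found " + actual);
        }
    }
}
```

Note that this only helps when the oversized method is a parser production. It does not shrink methods or static initializers that the generator emits for the token manager.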
I opened a case at JavaCC, see javacc/javacc#297
Hi @manticore-projects ,
thank you for the quick reply.
It's unfortunate to hear that there is no actual solution for that. However, thank you for opening the issue at JavaCC.
The error occurs in the TokenManager class, so I guess splitting the productions further won't fix the issue. From my understanding (I may be wrong), it is due to the number of tokens.
So I guess my implementation is blocked by that problem and we need to wait for a solution or workaround from JavaCC.
@ssteinhauser: please give it a few days to collect feedback and then we will look for an acceptable way forward.
@ssteinhauser: I have slept on this and cleared my thoughts. Without any promise, the most likely scenario now looks like this:
- I am aiming for a final JSQLParser 5.2 release (based on JavaCC) because I need the new PipeSQL parser released
- after that, I will likely look into a migration to CongoCC, although I have many concerns around that:
a) I will need to overhaul the complete Grammar
b) I will need to rewrite parts of the tool chain (drop Maven, write a Gradle module, rewrite the Rail Road builder, rewrite the Reserved Keywords logic)
c) there are some "people issues" to overcome
So it looks like aiming for a JSQLParser 6 release, with a lot of work to do.
@wumpz: Your opinion and endorsement will be needed for this one to work.
Hi @manticore-projects ,
alright, thank you for sharing your thoughts.
Regarding the overhaul of the complete Grammar, you can give the official syntax converter a try.
Please also let me know whether I can or should already provide a PR with my changes, so that you can convert that syntax to CongoCC as well and I don't have to do the same thing again on my end for my newly implemented syntax.
Hi.
We advise you to take some time to check that JavaCC 8 solves your issue without functional or performance regressions.
I cloned https://github.com/ssteinhauser/JSqlParser/tree/feature/exasol-import-export, modified pom.xml:
...
<!-- needed for parsing the Keywords via JTree in ParserKeywordsUtils -->
<dependency>
    <!-- <groupId>net.java.dev.javacc</groupId>-->
    <!-- <artifactId>javacc</artifactId>-->
    <!-- <version>[7.0.13,)</version>-->
    <groupId>org.javacc.generator</groupId>
    <artifactId>java</artifactId>
    <version>8.0.1</version>
    <!-- <version>8.1.0-SNAPSHOT</version>-->
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>org.javacc</groupId>
    <artifactId>core</artifactId>
    <version>8.0.1</version>
    <!-- <version>8.1.0-SNAPSHOT</version>-->
    <scope>test</scope>
</dependency>
...
<plugin>
    <groupId>org.javacc.plugin</groupId>
    <artifactId>javacc-maven-plugin</artifactId>
    <version>3.0.3</version>
    <executions>
        <execution>
            <id>javacc</id>
            <phase>generate-sources</phase>
            <goals>
                <goal>jjtree-javacc</goal>
            </goals>
            <configuration>
                <codeGenerator>java</codeGenerator>
                <!-- <grammarEncoding>UTF-8</grammarEncoding>-->
                <!-- <isStatic>false</isStatic>-->
                <!-- <jdkVersion>1.8</jdkVersion>-->
            </configuration>
        </execution>
    </executions>
    <dependencies>
        <!-- <dependency>-->
        <!-- <groupId>net.java.dev.javacc</groupId>-->
        <!-- <artifactId>javacc</artifactId>-->
        <!-- <version>[7.0.13,)</version>-->
        <!-- </dependency>-->
        <dependency>
            <groupId>org.javacc.generator</groupId>
            <artifactId>java</artifactId>
            <version>8.0.1</version>
            <!-- <version>8.1.0-SNAPSHOT</version>-->
        </dependency>
        <dependency>
            <groupId>org.javacc</groupId>
            <artifactId>core</artifactId>
            <version>8.0.1</version>
            <!-- <version>8.1.0-SNAPSHOT</version>-->
        </dependency>
    </dependencies>
</plugin>
...
You'll have to change your SimpleNode references to Node (and other things in your tests, which I did not investigate).
Tell me how things evolve.
@MarcMazas: Thank you a lot! I definitely will give this a try soonest and appreciate your help and support. I will return with findings and hopefully a success story.
Cheers and best
Andreas
Sure, I will try this out too -- but only next week.
Thank you again for helping!
About the "Code too large" issue: from what I've seen, you had 2 errors:
- one for the static part (which is the one discussed in the article you mentioned)
- one for a generated method (and the author of the article does not claim to guarantee that the same problem will not happen with his parser).
@ssteinhauser: There is some good news. After running into Code too large myself, I was finally forced to take this seriously and managed to start a JavaCC-8 migration.
- it does indeed solve the code too large problem caused by too many tokens
- it works in general
- but it fails on two instances of token manipulation (T-SQL square brackets and advanced text escaping)
Here is the new branch: https://github.com/JSQLParser/JSqlParser/tree/javacc8
You can continue your implementation on that.
@ssteinhauser: We now have a fully functional JavaCC-8 based JSQLParser, which passes all tests and is on par in both performance and functionality. Please see: https://github.com/JSQLParser/JSqlParser/tree/javacc8_without_semantic_lookahead
It has no limit on the number of tokens, will be the basis for the next major release, and can already be used as a drop-in replacement.
Please continue your development on that while I am sorting out the logistics in the background.
JSQLParser 5.4 Snapshot is running on JavaCC 8.1 Snapshot via 675e8b6.
This solves the restrictions on the token manager effectively.
I've just merged the master branch into my feature branch and tried to run a build, but it fails with "Could not find artifact org.javacc.generator:java:jar:8.1.0-SNAPSHOT".
Can you please help and provide guidance with that?
Greetings @ssteinhauser.
Unfortunately, I can't reproduce your problem: I cloned your branch from Git and was then able to compile with Gradle as well as with Maven:
are@archlinux ~ [1]> mkdir test
are@archlinux ~> cd test
are@archlinux ~/test> git clone --single-branch --branch feature/exasol-import-export https://github.com/ssteinhauser/JSqlParser.git
are@archlinux ~/test> cd JSqlParser/
are@archlinux ~/t/JSqlParser (feature/exasol-import-export)> gradle check
are@archlinux ~/t/JSqlParser (feature/exasol-import-export)> mvn verify -DskipTests
I get errors regarding abstract classes, but that looks like an ordinary Java development issue:
[ERROR] COMPILATION ERROR :
[INFO] -------------------------------------------------------------
[ERROR] /home/are/test/JSqlParser/src/main/java/net/sf/jsqlparser/statement/imprt/Import.java:[23,7] error: Import is not abstract and does not override abstract method setSampleClause(SampleClause) in FromItem
[INFO] 1 error
Everything looks good to me.
We can share log files or even a screen if needed.
Good luck and cheers!
In case the artifact still can't be found (for whatever reason), you can deploy to local by yourself:
mkdir javacc-8
cd javacc-8
git clone git@github.com:javacc/javacc-8.git
cd javacc-8
sed -i 's|<javacc.java.version>[^<]*</javacc.java.version>|<javacc.java.version>8.0.1</javacc.java.version>|' pom.xml
mvn clean install -P!run-its
cd ..
git clone git@github.com:javacc/javacc-8-core.git
cd javacc-8-core
mvn clean install -P!run-its
cd ..
git clone git@github.com:javacc/javacc-8-java.git
cd javacc-8-java
mvn clean install -P!run-its
cd ..
Thanks for the quick reply. I've retried using a fresh installation of maven, but I still receive the same error.
Can you possibly share the link to the artifact in the corresponding public repository? Because I can't even find it when searching manually.
Before I proceed with deploying the artifact locally from GitHub, I'd like to mention that you are referring to version 8.0.1 in the sed command, but the pom.xml of JSqlParser is looking for version 8.1.0-SNAPSHOT.
Hi
You will find the javacc-8 snapshots under https://central.sonatype.com/service/rest/repository/browse/maven-snapshots/org/javacc/.
For example, the published snapshots for the java generator 8.1.0-SNAPSHOT are under https://central.sonatype.com/service/rest/repository/browse/maven-snapshots/org/javacc/generator/java/8.1.0-SNAPSHOT/.
The reason why the sed command replaces 8.1.0-SNAPSHOT with 8.0.1 is to avoid the chicken-and-egg circular dependency between the core and the Java generator.
Thanks again for the quick reply. Now I was able to mitigate this issue by manually installing the dependencies in my local repository. I've also fixed the missing implementation of SampleClause.
Unfortunately, I am still facing the code too large issue for the CCJSqlParserTokenManager:
.../JSqlParser/target/generated-sources/javacc/net/sf/jsqlparser/parser/CCJSqlParserTokenManager.java:[62,1] code too large
Since there are many new tokens in this feature branch, I assume it's due to them. Do you have any suggestions on how to proceed?
I will open another issue at JavaCC: javacc/javacc#309
I have just published a javacc-8-java snapshot that solves the error (I moved some static data into an inner static class).
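For readers wondering why moving static data helps: the initializers of all static fields of a class are compiled into a single static initializer method, which is subject to the same 64 KB limit as any other method. A nested class is a separate class with its own static initializer. Here is a minimal sketch of that general technique, with hypothetical names and tiny tables. It is not the actual change made in javacc-8-java.

```java
// Hypothetical token manager; names and data are illustrative only
// and do not reproduce the code generated by JavaCC.
final class TokenManagerSketch {

    // If this table were a static field of TokenManagerSketch itself, its
    // initialization code would count against the outer class's static initializer.
    // Inside a nested class it is compiled into that class's own static initializer.
    private static final class TokenImages {
        static final String[] IMAGES = {"<EOF>", "SELECT", "FROM", "IMPORT", "EXPORT"};
    }

    // A second nested holder gets yet another separate static initializer.
    private static final class LexStates {
        static final int[] NEW_STATE = {-1, 0, 0, 1, 1};
    }

    // Accessors read through to the holders, so callers do not see the split.
    static String image(int kind) {
        return TokenImages.IMAGES[kind];
    }

    static int newState(int kind) {
        return LexStates.NEW_STATE[kind];
    }
}
```

A side effect of this layout is that each holder class is only initialized on first access, so tables that are never used are never built.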
I can confirm this working now by hard refreshing dependencies:
gradle clean test --no-build-cache --refresh-dependencies
@MarcMazas : thank you so much!
@ssteinhauser: please also keep an eye on the JMH benchmarks. I will not merge any changes that result in a significant performance deterioration, especially when the dialect/feature is rather exotic.
The following command will indicate problems early:
gradle jmh
And so far, your implementation looks very good:
Benchmark                               (version)  Mode  Cnt   Score   Error  Units
JSQLParserBenchmark.parseSQLStatements     latest  avgt   15  80.159 ± 2.467  ms/op
JSQLParserBenchmark.parseSQLStatements        5.3  avgt   15  80.410 ± 2.729  ms/op
JSQLParserBenchmark.parseSQLStatements        5.1  avgt   15  81.835 ± 2.689  ms/op
I take the opportunity to suggest that you consider not defining the infrequent keywords as tokens, but rather keeping them as identifiers matched by a semantic lookahead.
I have the strong feeling that keeping only the most common keywords (select, table...) as tokens would lead to better performance, because there is less work for the token manager and the parser's jj2 routines.
Example 1 on simple cases:
void SqlAndPlSqlFunctions0() : // NOT to be used with semantic lookahead
{ }
{
  (
    LOOKAHEAD( { "EMPTY_BLOB".equalsIgnoreCase(getToken(1).image)
              || "EMPTY_CLOB".equalsIgnoreCase(getToken(1).image) } )
    < IDENTIFIER > "(" ")" // has no parameters but has parentheses
  | LOOKAHEAD( { "SYSDATE".equalsIgnoreCase(getToken(1).image) } )
    < IDENTIFIER >
    [
      LOOKAHEAD(2)
      "(" ")"
    ] // found with parentheses in prc010g01.fmb.txt line 42060 and without in other programs
  | LOOKAHEAD( { "SYSTIMESTAMP".equalsIgnoreCase(getToken(1).image)
              || "USER".equalsIgnoreCase(getToken(1).image) } )
    < IDENTIFIER >
  )
}
Example 2 on more choices:
in the grammar
...
  |
    LOOKAHEAD( { PFMOS_PlSqlMate.laucFormsBuiltInsProcedures0N(getTk1UC()) } )
    FormsBuiltInsProcedures0N()
    [
      "(" [ ArgumentList() ] ")"
    ]
  |
...
in the Java section or in a helper class
/**
 * Set used for semantic lookahead for FormsBuiltInsFunctions0
 */
public static final Set<String> FBIF0 = new HashSet<>(16, 1f);
static {
    FBIF0.add("DBMS_ERROR_CODE");
    FBIF0.add("DBMS_ERROR_TEXT");
    FBIF0.add("ERROR_CODE"); // also a PlSqlCursorAttributeTokens
    FBIF0.add("ERROR_TEXT");
    FBIF0.add("ERROR_TYPE");
    FBIF0.add("FORM_FAILURE");
    FBIF0.add("FORM_FATAL");
    FBIF0.add("FORM_SUCCESS");
    FBIF0.add("GET_MESSAGE");
    // | < LAST_OLE_ERROR // 6i
    FBIF0.add("MESSAGE_CODE");
    FBIF0.add("MESSAGE_TEXT");
    FBIF0.add("MESSAGE_TYPE");
}

/**
 * Semantic lookahead for FormsBuiltInsFunctions0.
 *
 * @param aTokenImageUC - the (next) token image in upper case
 * @return true if the token is a FormsBuiltInsFunctions0, false otherwise
 */
static final boolean laucFormsBuiltInsFunctions0(final String aTokenImageUC) {
    return FBIF0.contains(aTokenImageUC);
}

/**
 * Gets the next token image in upper case (for semantic lookaheads).
 *
 * @return the next token image in upper case
 */
final String getTk1UC()
{
    return getToken(1).image.toUpperCase();
}
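The helper side of this idea can be tried on its own, without a grammar. The following is a self-contained sketch under a few assumptions: the class name KeywordLookaheadSketch, the method isRareKeyword and the shortened keyword list are hypothetical, and the token image is passed in as a plain string instead of being read via getToken(1).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: infrequent keywords are kept in a set instead of being
// declared as tokens, and a semantic lookahead would ask this class about them.
final class KeywordLookaheadSketch {

    // Shortened, illustrative keyword list, stored in upper case.
    private static final Set<String> RARE_KEYWORDS = Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList("ERROR_CODE", "ERROR_TEXT", "FORM_FAILURE")));

    private KeywordLookaheadSketch() {
    }

    // Returns true if the given token image is one of the rare keywords,
    // ignoring case. A null image is treated as "not a keyword".
    static boolean isRareKeyword(String tokenImage) {
        return tokenImage != null
                && RARE_KEYWORDS.contains(tokenImage.toUpperCase(Locale.ROOT));
    }
}
```

The sketch upper-cases with Locale.ROOT because String.toUpperCase() without an argument uses the default locale, and under a Turkish locale a lower-case "i" becomes a dotted capital "İ", so "form_failure" would no longer match "FORM_FAILURE".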
LOOKAHEAD( { "EMPTY_BLOB".equalsIgnoreCase(getToken(1).image)
Makes sense, but does not work with current JavaCC-8.
Please see javacc/javacc#307 (comment)
@MarcMazas: Thank you!
We can continue this discussion here: javacc/javacc#307 (comment)
I have set up a simplified Grammar to test those observed issues with minimal complexity.