How to add double dot `".."` token? · JSQLParser/JSqlParser · Discussion #2157

ssteinhauser
Feb 5, 2025

Hi,

I am currently implementing the IMPORT statement of Exasol DBMS: https://docs.exasol.com/db/latest/sql/import.htm
In the csv_cols part of the syntax, it uses a double dot "..", which I wanted to define as a single token so that it only matches when there is no whitespace between the dots. Unfortunately, this breaks some SelectTest unit tests such as testMultiPartTableNameWithDatabaseName, because the lexer now matches the two dots in the identifier chain as one ".." token instead of two separate "." tokens:

net.sf.jsqlparser.JSQLParserException: Encountered unexpected token: ".." ".."
 at line 1, column 36.
Was expecting one of:
 <EOF>
 <ST_SEMICOLON>
	at net.sf.jsqlparser/net.sf.jsqlparser.parser.CCJSqlParserManager.parse(CCJSqlParserManager.java:25)
	at net.sf.jsqlparser/net.sf.jsqlparser.statement.select.SelectTest.testMultiPartTableNameWithDatabaseName(SelectTest.java:158)
	at java.base/java.lang.reflect.Method.invoke(Method.java:569)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
Caused by: net.sf.jsqlparser.parser.ParseException: Encountered unexpected token: ".." ".."
 at line 1, column 36.
Was expecting one of:
 <EOF>
 <ST_SEMICOLON>
	at net.sf.jsqlparser/net.sf.jsqlparser.parser.CCJSqlParser.generateParseException(CCJSqlParser.java:54047)
	at net.sf.jsqlparser/net.sf.jsqlparser.parser.CCJSqlParser.jj_consume_token(CCJSqlParser.java:53862)
	at net.sf.jsqlparser/net.sf.jsqlparser.parser.CCJSqlParser.Statement(CCJSqlParser.java:341)
	at net.sf.jsqlparser/net.sf.jsqlparser.parser.CCJSqlParserManager.parse(CCJSqlParserManager.java:23)
	... 4 more
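
The failure above follows from longest-match tokenization: JavaCC's token manager prefers the longest literal that matches at the current position, so once ".." exists as a token anywhere in the grammar, two adjacent dots are no longer delivered as two "." tokens. A minimal sketch of that behavior, using a hypothetical hand-written tokenizer (`LongestMatchDemo` and `tokenize` are illustrative names, not part of JSqlParser or JavaCC):

```java
import java.util.ArrayList;
import java.util.List;

class LongestMatchDemo {
    // Splits the input into tokens, always preferring the longest literal
    // that matches at the current position (maximal munch).
    static List<String> tokenize(String input, List<String> literals) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            char c = input.charAt(pos);
            if (Character.isWhitespace(c)) {
                pos++; // whitespace is skipped between tokens
                continue;
            }
            String best = null;
            for (String lit : literals) {
                if (input.startsWith(lit, pos)
                        && (best == null || lit.length() > best.length())) {
                    best = lit;
                }
            }
            if (best != null) {
                tokens.add(best);
                pos += best.length();
                continue;
            }
            // Otherwise read a simple identifier: letters, digits, underscore.
            int start = pos;
            while (pos < input.length()
                    && (Character.isLetterOrDigit(input.charAt(pos))
                        || input.charAt(pos) == '_')) {
                pos++;
            }
            if (pos == start) {
                throw new IllegalArgumentException("Unexpected character at " + pos);
            }
            tokens.add(input.substring(start, pos));
        }
        return tokens;
    }

    public static void main(String[] args) {
        String sql = "db..tbl";
        // Only "." defined: the identifier chain sees two single dots.
        System.out.println(tokenize(sql, List.of(".")));       // [db, ., ., tbl]
        // Adding ".." changes the token stream for every production.
        System.out.println(tokenize(sql, List.of(".", "..")));  // [db, .., tbl]
    }
}
```

Because the token stream is produced before any production runs, a production that expects "." "." never gets the chance to see them.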

You can see the current progress of my implementation here: https://github.com/ssteinhauser/JSqlParser/tree/feature/exasol-import-statement

Do you have any advice or idea how to overcome this issue?

Thank you in advance!

Answered by manticore-projects

Feb 7, 2025


Replies: 8 comments 3 replies

manticore-projects
Feb 5, 2025
Collaborator

Greetings @ssteinhauser

I will look into this tomorrow morning and find a solution (it's evening here already).

0 replies

manticore-projects
Feb 5, 2025
Collaborator

I believe you need to define your own Identifier and QuotedIdentifier without any double dots.
Or we need to find a way to disable the double dots in multipart table names (because they apply only to Sybase/SQLServer).

0 replies

manticore-projects
Feb 5, 2025
Collaborator

Basically, we can define another Feature like allowDoubleDotMultipartIdentifers and then make conditional LookAheads.
Or you define your own special RelObjectNames without allowing delimiters.

// table names seem to allow ":" delimiters, e.g. for Informix see #1134
ObjectNames RelObjectNames() : {
 String token = null;
 Token delimiter = null;
 List<String> data = new ArrayList<String>();
 List<String> delimiters = new ArrayList<String>();
} {
 token = RelObjectNameExt() { data.add(token); }
 (
 LOOKAHEAD (2) ( delimiter = "." | delimiter = ":" ) { delimiters.add(delimiter.image); } (( delimiter = "." | delimiter = ":" ) { data.add(null); delimiters.add(delimiter.image); })*
 token = RelObjectNameExt2() { data.add(token); }
 ) *
 { return new ObjectNames(data, delimiters); }
}
// column names do not allow ":" delimiters as those represent JSON `GET` operators
ObjectNames ColumnIdentifier() : {
 String token = null;
 Token delimiter = null;
 List<String> data = new ArrayList<String>();
 List<String> delimiters = new ArrayList<String>();
} {
 token = RelObjectNameExt() { data.add(token); }
 (
 LOOKAHEAD (2) ( delimiter = "." ) { delimiters.add(delimiter.image); } (( delimiter = "." ) { data.add(null); delimiters.add(delimiter.image); })*
 token = RelObjectNameExt2() { data.add(token); }
 ) *
 { return new ObjectNames(data, delimiters); }
}

0 replies

ssteinhauser
Feb 5, 2025
Author

Thank you for your quick reply @manticore-projects ,

can you please elaborate on your thinking?

Maybe I am missing something, but to my knowledge, introducing a feature flag or implementing my own RelObjectNames won't solve the issue. The mere existence of the ".." token (no matter in which context it is used) causes the unit tests to fail.
In the case of the IMPORT statement, I don't use it with RelObjectNames; it is used in between <S_LONG> tokens.

If we implemented a feature flag, we would have to implement it in such a way that support for double-dot multipart identifiers is mutually exclusive with the IMPORT statement. Even then, I am not sure it would work.

0 replies

manticore-projects
Feb 6, 2025
Collaborator

Good Morning @ssteinhauser.

Please see 9f51831 (sorry for committing directly, it was an accident).
I did not see any test for the IMPORT statement, but it should work when the new feature allowSkippingPartsInIdentifiers is set to FALSE.

1 reply

@ssteinhauser

ssteinhauser Feb 6, 2025
Author

Hi @manticore-projects ,

thank you for your contribution, I really appreciate your help on this.

I still have concerns about this. As I feared, this feature makes skipping parts in identifiers mutually exclusive with that specific part of the IMPORT statement.
If we split the double dot ".." into two single dots "." ".", the grammar allows whitespace between the two dots (which Exasol does not support at this point). If we keep using the double-dot token "..", we still have the same problem as before, even with the feature.
So the feature won't fix the problem. Maybe we can live with using two single dots and allowing whitespace between them?

Sorry for not implementing the unit tests yet, will do that asap.
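
For context on the whitespace concern: if two single "." tokens are used, adjacency could in principle be checked after the fact by comparing token positions. The sketch below is only an illustration of that idea and is not what the thread ended up doing. `Tok` is a hypothetical stand-in whose fields mirror the position fields of JavaCC's generated `Token` class (`beginLine`, `beginColumn`, `endLine`, `endColumn`), and it assumes column numbers are inclusive.

```java
class AdjacentDots {
    // Hypothetical stand-in for a lexer token carrying source positions.
    static final class Tok {
        final String image;
        final int beginLine;
        final int beginColumn;
        final int endLine;
        final int endColumn;

        Tok(String image, int beginLine, int beginColumn, int endLine, int endColumn) {
            this.image = image;
            this.beginLine = beginLine;
            this.beginColumn = beginColumn;
            this.endLine = endLine;
            this.endColumn = endColumn;
        }
    }

    // True when `second` starts in the column directly after the last
    // column of `first`, on the same line (no whitespace in between).
    static boolean isAdjacent(Tok first, Tok second) {
        return first.endLine == second.beginLine
                && first.endColumn + 1 == second.beginColumn;
    }

    public static void main(String[] args) {
        Tok dot1 = new Tok(".", 1, 5, 1, 5);
        Tok dot2 = new Tok(".", 1, 6, 1, 6); // "1..3": directly follows dot1
        Tok dot3 = new Tok(".", 1, 8, 1, 8); // "1. .3": separated by whitespace
        System.out.println(isAdjacent(dot1, dot2)); // true
        System.out.println(isAdjacent(dot1, dot3)); // false
    }
}
```

Such a check would push validation into the semantic actions, which is a design trade-off rather than a clear win for a tool that aims to parse rather than validate.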

manticore-projects
Feb 6, 2025
Collaborator

Good Morning @ssteinhauser .

I fully understand your concern, but .. is really a nightmare.

We could rework those RelObjectNames() productions and introduce the same tokens there, e.g. ., .. or ..., but I am honestly not very keen on this, because the current approach just works and Exasol is a rather exotic use case.

Right now, I would suggest ignoring the fact that whitespace can occur between the dots. JSQLParser is meant to be a parser and not a validation tool.

Suggestion:

1. finish your work and ignore the possible whitespace between the dots
2. we merge in order to have a full working copy
3. we try to rework those productions and introduce proper tokens ., .., ...

1 reply

@ssteinhauser

ssteinhauser Feb 6, 2025
Author

Alright, I agree and will continue as you proposed. I will provide the PR as soon as it's ready to be merged.
Thank you again for your support.

manticore-projects
Feb 6, 2025
Collaborator

I tried to work with tokens in those two productions instead of chaining the single dots, but could not get it working in a reasonable time.

0 replies

manticore-projects
Feb 7, 2025
Collaborator

Good Morning @ssteinhauser .

I believe I have been able to fix/improve the dot-handling. The following 2 productions work:

// table names seem to allow ":" delimiters, e.g. for Informix see #1134
ObjectNames RelObjectNames() : {
 String token = null;
 Token delimiter = null;
 List<String> data = new ArrayList<String>();
 List<String> delimiters = new ArrayList<String>();
} {
 token = RelObjectNameExt() { data.add(token); }
 (
 LOOKAHEAD (2) (
 ( delimiter = "..." { delimiters.add("."); data.add(null); delimiters.add("."); data.add(null); delimiters.add("."); } )
 |
 ( delimiter = ".." { delimiters.add("."); data.add(null); delimiters.add("."); } )
 |
 ( ( delimiter = "." | delimiter = ":" ) { delimiters.add(delimiter.image); } )
 )
 token = RelObjectNameExt2() { data.add(token); }
 ) *
 { return new ObjectNames(data, delimiters); }
}
// column names do not allow ":" delimiters as those represent JSON `GET` operators
ObjectNames ColumnIdentifier() : {
 String token = null;
 Token delimiter = null;
 List<String> data = new ArrayList<String>();
 List<String> delimiters = new ArrayList<String>();
} {
 token = RelObjectNameExt() { data.add(token); }
 (
 LOOKAHEAD (2) (
 ( delimiter = "..." { delimiters.add("."); data.add(null); delimiters.add("."); data.add(null); delimiters.add("."); } )
 |
 ( delimiter = ".." { delimiters.add("."); data.add(null); delimiters.add("."); } )
 |
 ( delimiter = "." { delimiters.add(delimiter.image); } )
 )
 token = RelObjectNameExt2() { data.add(token); }
 ) *
 { return new ObjectNames(data, delimiters); }
}

This should allow you to use tokens like .. and ... in your Import production, and it is also more correct regarding whitespace handling.
I have already taken this upstream via 38d0e36. Sorry for the delayed solution.
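
To make the semantic actions in these productions concrete: a run of n dots contributes n "." delimiters and n-1 skipped (null) name parts. The following sketch replays that bookkeeping on a plain list of token images. `DotRunExpansion` and `fromTokens` are hypothetical names for illustration only; the real grammar builds an `ObjectNames` from the two lists.

```java
import java.util.ArrayList;
import java.util.List;

class DotRunExpansion {
    final List<String> data = new ArrayList<>();
    final List<String> delimiters = new ArrayList<>();

    // Replays the actions of the productions on a list of token images:
    // "."   -> one delimiter
    // ".."  -> two delimiters with one null part in between
    // "..." -> three delimiters with two null parts in between
    static DotRunExpansion fromTokens(List<String> tokens) {
        DotRunExpansion result = new DotRunExpansion();
        for (String t : tokens) {
            if (t.equals(":")) {
                result.delimiters.add(t);
            } else if (t.matches("\\.{1,3}")) {
                for (int i = 0; i < t.length(); i++) {
                    if (i > 0) {
                        result.data.add(null); // skipped part between two dots
                    }
                    result.delimiters.add(".");
                }
            } else {
                result.data.add(t); // a regular name part
            }
        }
        return result;
    }

    public static void main(String[] args) {
        DotRunExpansion r = fromTokens(List.of("db", "..", "tbl"));
        System.out.println(r.data);       // [db, null, tbl]
        System.out.println(r.delimiters); // [., .]
    }
}
```

The resulting lists are the same as those the earlier chained-dot version would have produced for `db..tbl`, which is why downstream code does not need to know whether the dots arrived as one token or several.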

1 reply

@manticore-projects

manticore-projects Feb 7, 2025
Collaborator

Totally unrelated: I would like to plead for cutting your mega-productions into smaller chunks wherever possible, with the (auto-generated) railroad diagrams in mind. Thank you!

Answer selected by ssteinhauser
