How to add double dot ".." token? #2157


Hi,

I am currently implementing the IMPORT statement of Exasol DBMS: https://docs.exasol.com/db/latest/sql/import.htm
In the csv_cols part of the syntax, it uses a double dot "..", which I wanted to define as a single token so that it only matches when there is no whitespace between the dots. Unfortunately, this breaks some SelectTest unit tests, such as testMultiPartTableNameWithDatabaseName: the lexer now matches the two consecutive dots in the identifier chain as one ".." token instead of two separate "." tokens:

net.sf.jsqlparser.JSQLParserException: Encountered unexpected token: ".." ".."
 at line 1, column 36.
Was expecting one of:
 <EOF>
 <ST_SEMICOLON>
	at net.sf.jsqlparser/net.sf.jsqlparser.parser.CCJSqlParserManager.parse(CCJSqlParserManager.java:25)
	at net.sf.jsqlparser/net.sf.jsqlparser.statement.select.SelectTest.testMultiPartTableNameWithDatabaseName(SelectTest.java:158)
	at java.base/java.lang.reflect.Method.invoke(Method.java:569)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
Caused by: net.sf.jsqlparser.parser.ParseException: Encountered unexpected token: ".." ".."
 at line 1, column 36.
Was expecting one of:
 <EOF>
 <ST_SEMICOLON>
	at net.sf.jsqlparser/net.sf.jsqlparser.parser.CCJSqlParser.generateParseException(CCJSqlParser.java:54047)
	at net.sf.jsqlparser/net.sf.jsqlparser.parser.CCJSqlParser.jj_consume_token(CCJSqlParser.java:53862)
	at net.sf.jsqlparser/net.sf.jsqlparser.parser.CCJSqlParser.Statement(CCJSqlParser.java:341)
	at net.sf.jsqlparser/net.sf.jsqlparser.parser.CCJSqlParserManager.parse(CCJSqlParserManager.java:23)
	... 4 more
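
The failure above is consistent with longest-match tokenization: once ".." exists as a token literal, a lexer that always prefers the longest match will never split two adjacent dots into two "." tokens, no matter which production is active. The following is a minimal, self-contained sketch of that behavior. It is not JSqlParser or JavaCC code; the class name, the `tokenize` helper and the sample input `db..tbl` are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class MaximalMunchDemo {
    // Hypothetical toy lexer: at each position, take the longest literal from
    // `literals` that matches; otherwise read an identifier (letters, digits, '_').
    static List<String> tokenize(String input, List<String> literals) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            if (Character.isWhitespace(input.charAt(pos))) {
                pos++;
                continue;
            }
            String best = null;
            for (String lit : literals) {
                if (input.startsWith(lit, pos) && (best == null || lit.length() > best.length())) {
                    best = lit;
                }
            }
            if (best != null) {
                tokens.add(best);
                pos += best.length();
                continue;
            }
            int start = pos;
            while (pos < input.length()
                    && (Character.isLetterOrDigit(input.charAt(pos)) || input.charAt(pos) == '_')) {
                pos++;
            }
            if (pos == start) {
                throw new IllegalArgumentException("Unexpected character at index " + pos);
            }
            tokens.add(input.substring(start, pos));
        }
        return tokens;
    }

    public static void main(String[] args) {
        String name = "db..tbl";
        // Only "." is known: the two dots stay separate tokens.
        System.out.println(tokenize(name, List.of(".")));       // [db, ., ., tbl]
        // ".." is also known: the longest match wins everywhere.
        System.out.println(tokenize(name, List.of(".", ".."))); // [db, .., tbl]
    }
}
```

In this sketch, any rule that expects two single "." tokens stops matching as soon as ".." is added to the literal list, which mirrors the "unexpected token" error in the trace.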

You can see the current progress of my implementation here: https://github.com/ssteinhauser/JSqlParser/tree/feature/exasol-import-statement

Do you have any advice or idea how to overcome this issue?

Thank you in advance!


Greetings @ssteinhauser

I will look into this tomorrow morning and find a solution (it's already evening here).


I believe you need to define your own Identifier and QuotedIdentifier without any double dots.
Otherwise, we need to find a way to disable the double dots in multipart table names (they apply only to Sybase/SQL Server).


Basically, we can define another Feature, such as allowDoubleDotMultipartIdentifers, and then make the LookAheads conditional on it.
Or you define your own special RelObjectNames without allowing delimiters.

// table names seem to allow ":" delimiters, e.g. for Informix see #1134
ObjectNames RelObjectNames() : {
    String token = null;
    Token delimiter = null;
    List<String> data = new ArrayList<String>();
    List<String> delimiters = new ArrayList<String>();
} {
    token = RelObjectNameExt() { data.add(token); }
    (
        LOOKAHEAD (2)
        ( delimiter = "." | delimiter = ":" ) { delimiters.add(delimiter.image); }
        ( ( delimiter = "." | delimiter = ":" ) { data.add(null); delimiters.add(delimiter.image); } )*
        token = RelObjectNameExt2() { data.add(token); }
    )*
    { return new ObjectNames(data, delimiters); }
}
// column names do not allow ":" delimiters as those represent JSON `GET` operators
ObjectNames ColumnIdentifier() : {
    String token = null;
    Token delimiter = null;
    List<String> data = new ArrayList<String>();
    List<String> delimiters = new ArrayList<String>();
} {
    token = RelObjectNameExt() { data.add(token); }
    (
        LOOKAHEAD (2)
        ( delimiter = "." ) { delimiters.add(delimiter.image); }
        ( ( delimiter = "." ) { data.add(null); delimiters.add(delimiter.image); } )*
        token = RelObjectNameExt2() { data.add(token); }
    )*
    { return new ObjectNames(data, delimiters); }
}

Thank you for your quick reply @manticore-projects ,

Can you please elaborate on your thinking?

Maybe I am missing something, but to my knowledge, introducing a feature flag or implementing my own RelObjectNames won't solve the issue. The mere existence and usage of the ".." token (no matter in which context) causes the unit tests to fail.
In the case of the IMPORT statement, I don't use it with RelObjectNames; it is used in between <S_LONG> tokens.

If we implemented a feature flag, we would have to implement it in such a way that support for double-dot multipart identifiers is mutually exclusive with the IMPORT statement. Even then, I am not sure it would work.


Good Morning @ssteinhauser.

Please see 9f51831 (sorry for committing directly, it was an accident).
I did not see any test for the IMPORT statement, but it should work once the new feature allowSkippingPartsInIdentifiers is set to FALSE.


Hi @manticore-projects ,

thank you for your contribution, I really appreciate your help on this.

I still have concerns about this. As I feared, this feature makes skipping parts in identifiers mutually exclusive with that specific part of the IMPORT statement.
Splitting the double dot ".." into two single dots "." "." allows whitespace between the two dots (which Exasol does not support at this point), but if we keep using the double-dot token "..", we still have the same problem as before, even with the feature.
So the feature won't fix the problem. Should we perhaps live with using two single dots and allowing whitespace between them?
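
If the strictness ever turned out to matter, one conceivable way to keep two single "." tokens and still reject whitespace would be to compare token positions inside the production. The sketch below only illustrates that idea and is not part of the branch: `Tok` is a hypothetical stand-in whose fields are named after the position fields of JavaCC's generated Token class (beginLine, beginColumn, endLine, endColumn), and the column values are made up.

```java
public class AdjacentDotsCheck {
    // Hypothetical stand-in for a lexer token with JavaCC-style position fields.
    static final class Tok {
        final String image;
        final int beginLine, beginColumn, endLine, endColumn;

        Tok(String image, int beginLine, int beginColumn, int endLine, int endColumn) {
            this.image = image;
            this.beginLine = beginLine;
            this.beginColumn = beginColumn;
            this.endLine = endLine;
            this.endColumn = endColumn;
        }
    }

    // True when `second` starts in the column directly after `first` ends,
    // i.e. nothing (no whitespace, no comment) sits between the two tokens.
    static boolean adjacent(Tok first, Tok second) {
        return first.endLine == second.beginLine
                && first.endColumn + 1 == second.beginColumn;
    }

    public static void main(String[] args) {
        Tok dot1 = new Tok(".", 1, 10, 1, 10);
        Tok tight = new Tok(".", 1, 11, 1, 11);  // as in "1..3"
        Tok spaced = new Tok(".", 1, 12, 1, 12); // as in "1. .3"
        System.out.println(adjacent(dot1, tight));  // true
        System.out.println(adjacent(dot1, spaced)); // false
    }
}
```

Such a check would sit in the parser rather than the lexer, so it would not add a ".." token that could interfere with identifier chains.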

Sorry for not implementing the unit tests yet, will do that asap.


Good Morning @ssteinhauser .

I fully understand your concern, but ".." is really a nightmare.

We could rework those RelObjectNames() productions and introduce the same tokens there, e.g. ".", ".." or "...", but I am honestly not very eager to do this, because the current approach just works and Exasol is a rather exotic use case.

Right now, I would suggest ignoring that there can be whitespace in between. JSQLParser is meant to be a parser and not a validation tool.

Suggestion:

  1. finish your work and ignore the possible whitespace between the dots
  2. we merge in order to have a full working copy
  3. we try to rework those productions and introduce proper tokens ".", ".." and "..."

Alright, I agree and will continue as you proposed. I will provide the PR as soon as it's ready to be merged.
Thank you again for your support.


I tried to work with tokens in those two productions instead of chaining the ".", but could not get it working in a reasonable time.


Good Morning @ssteinhauser .

I believe I have been able to fix/improve the dot-handling. The following 2 productions work:

// table names seem to allow ":" delimiters, e.g. for Informix see #1134
ObjectNames RelObjectNames() : {
    String token = null;
    Token delimiter = null;
    List<String> data = new ArrayList<String>();
    List<String> delimiters = new ArrayList<String>();
} {
    token = RelObjectNameExt() { data.add(token); }
    (
        LOOKAHEAD (2) (
            ( delimiter = "..." { delimiters.add("."); data.add(null); delimiters.add("."); data.add(null); delimiters.add("."); } )
            |
            ( delimiter = ".." { delimiters.add("."); data.add(null); delimiters.add("."); } )
            |
            ( ( delimiter = "." | delimiter = ":" ) { delimiters.add(delimiter.image); } )
        )
        token = RelObjectNameExt2() { data.add(token); }
    )*
    { return new ObjectNames(data, delimiters); }
}
// column names do not allow ":" delimiters as those represent JSON `GET` operators
ObjectNames ColumnIdentifier() : {
    String token = null;
    Token delimiter = null;
    List<String> data = new ArrayList<String>();
    List<String> delimiters = new ArrayList<String>();
} {
    token = RelObjectNameExt() { data.add(token); }
    (
        LOOKAHEAD (2) (
            ( delimiter = "..." { delimiters.add("."); data.add(null); delimiters.add("."); data.add(null); delimiters.add("."); } )
            |
            ( delimiter = ".." { delimiters.add("."); data.add(null); delimiters.add("."); } )
            |
            ( delimiter = "." { delimiters.add(delimiter.image); } )
        )
        token = RelObjectNameExt2() { data.add(token); }
    )*
    { return new ObjectNames(data, delimiters); }
}

This should allow you to work with tokens like ".." and "..." in your Import production, and it also handles whitespace more correctly.
I have already taken this upstream via 38d0e36. Sorry for the delayed solution.
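
To make the effect of the action blocks easier to follow, here is a small standalone sketch of how a ".." or "..." token is expanded into single "." delimiters with null placeholders for the skipped name parts. It only mirrors the list bookkeeping of the two productions; the class, the `parse` helper and the sample names are hypothetical and do not exist in JSqlParser.

```java
import java.util.ArrayList;
import java.util.List;

public class DelimiterExpansionDemo {
    final List<String> data = new ArrayList<>();
    final List<String> delimiters = new ArrayList<>();

    // Each dot contributes one "." delimiter; every gap between two
    // consecutive dots contributes a null placeholder for the skipped part.
    void addDelimiter(String image) {
        if (image.equals("..") || image.equals("...")) {
            for (int i = 0; i < image.length(); i++) {
                if (i > 0) {
                    data.add(null);
                }
                delimiters.add(".");
            }
        } else {
            delimiters.add(image); // a plain "." or ":"
        }
    }

    // Expects alternating name and delimiter tokens: name (delimiter name)*
    static DelimiterExpansionDemo parse(String... tokens) {
        DelimiterExpansionDemo result = new DelimiterExpansionDemo();
        result.data.add(tokens[0]);
        for (int i = 1; i + 1 < tokens.length; i += 2) {
            result.addDelimiter(tokens[i]);
            result.data.add(tokens[i + 1]);
        }
        return result;
    }

    public static void main(String[] args) {
        DelimiterExpansionDemo two = parse("db", "..", "tbl");
        System.out.println(two.data);       // [db, null, tbl]
        System.out.println(two.delimiters); // [., .]

        DelimiterExpansionDemo three = parse("srv", "...", "tbl");
        System.out.println(three.data);       // [srv, null, null, tbl]
        System.out.println(three.delimiters); // [., ., .]
    }
}
```

The resulting lists have the same shape as the ones the earlier chained-dot productions built, so code consuming ObjectNames should not need to distinguish between the two tokenizations.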


On a totally unrelated note, I would like to plead for cutting your mega-productions into smaller chunks wherever possible, with the (auto-generated) railroad diagrams in mind. Thank you!

Answer selected by ssteinhauser