How to add double dot ".." token? #2157
-
Hi,
I am currently implementing the IMPORT statement of Exasol DBMS: https://docs.exasol.com/db/latest/sql/import.htm
In the csv_cols part of the syntax, it uses double dots "..", which I wanted to use as a token that matches without any whitespace character in between. Unfortunately, this breaks some SelectTest unit tests like testMultiPartTableNameWithDatabaseName, since the double dots in the identifier chain are now matched as ".." instead of each dot being matched separately as a single dot ".":
net.sf.jsqlparser.JSQLParserException: Encountered unexpected token: ".." ".." at line 1, column 36.
Was expecting one of: <EOF> <ST_SEMICOLON>
    at net.sf.jsqlparser/net.sf.jsqlparser.parser.CCJSqlParserManager.parse(CCJSqlParserManager.java:25)
    at net.sf.jsqlparser/net.sf.jsqlparser.statement.select.SelectTest.testMultiPartTableNameWithDatabaseName(SelectTest.java:158)
    at java.base/java.lang.reflect.Method.invoke(Method.java:569)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
Caused by: net.sf.jsqlparser.parser.ParseException: Encountered unexpected token: ".." ".." at line 1, column 36.
Was expecting one of: <EOF> <ST_SEMICOLON>
    at net.sf.jsqlparser/net.sf.jsqlparser.parser.CCJSqlParser.generateParseException(CCJSqlParser.java:54047)
    at net.sf.jsqlparser/net.sf.jsqlparser.parser.CCJSqlParser.jj_consume_token(CCJSqlParser.java:53862)
    at net.sf.jsqlparser/net.sf.jsqlparser.parser.CCJSqlParser.Statement(CCJSqlParser.java:341)
    at net.sf.jsqlparser/net.sf.jsqlparser.parser.CCJSqlParserManager.parse(CCJSqlParserManager.java:23)
    ... 4 more
You can see the current progress of my implementation here: https://github.com/ssteinhauser/JSqlParser/tree/feature/exasol-import-statement
Do you have any advice or idea how to overcome this issue?
Thank you in advance!
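
For readers following along, the failure above can be reproduced outside of JavaCC with a toy longest-match tokenizer. This is only a sketch under assumptions: the class `MaximalMunchDemo`, its `tokenize` method and the input string are hypothetical and are not part of JSqlParser, and the real generated lexer is far more involved. It illustrates why merely declaring a ".." literal changes how an identifier chain is split into tokens.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of JSqlParser.
final class MaximalMunchDemo {

    // Toy tokenizer: at each position, take the LONGEST literal that matches,
    // otherwise read an identifier made of letters, digits and underscores.
    static List<String> tokenize(String input, List<String> literals) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            if (Character.isWhitespace(input.charAt(pos))) {
                pos++; // whitespace separates tokens but is not a token itself
                continue;
            }
            String best = null;
            for (String literal : literals) {
                if (input.startsWith(literal, pos)
                        && (best == null || literal.length() > best.length())) {
                    best = literal;
                }
            }
            if (best != null) {
                tokens.add(best);
                pos += best.length();
                continue;
            }
            int start = pos;
            while (pos < input.length() && isIdentifierChar(input.charAt(pos))) {
                pos++;
            }
            if (pos == start) {
                throw new IllegalArgumentException("Unexpected character at index " + pos);
            }
            tokens.add(input.substring(start, pos));
        }
        return tokens;
    }

    static boolean isIdentifierChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_';
    }

    public static void main(String[] args) {
        String sql = "databaseName..tableName"; // hypothetical input, not the unit test's SQL
        // Only "." is known: the two dots come out as two separate tokens.
        System.out.println(tokenize(sql, Arrays.asList(".")));
        // ".." is also known: the lexer now prefers the longer match.
        System.out.println(tokenize(sql, Arrays.asList(".", "..")));
    }
}
```

With only "." declared, the chain yields two single-dot tokens, which is what a production written in terms of "." expects. Once ".." exists, the same input yields one ".." token, and any production that only accepts "." no longer matches.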
-
Greetings @ssteinhauser
I will look into this tomorrow morning and find a solution (it's evening here already).
-
I believe you need to define your own Identifier and QuotedIdentifier without any double dots.
Or we need to find a way to disable the double dots in the multipart table names (because they apply only to Sybase/SQL Server).
-
Basically, we can define another Feature like allowDoubleDotMultipartIdentifers and then make conditional LookAheads.
Or you define your own special RelObjectNames without allowing delimiters.
// table names seem to allow ":" delimiters, e.g. for Informix see #1134
ObjectNames RelObjectNames() : {
    String token = null;
    Token delimiter = null;
    List<String> data = new ArrayList<String>();
    List<String> delimiters = new ArrayList<String>();
} {
    token = RelObjectNameExt() { data.add(token); }
    (
        LOOKAHEAD (2) ( delimiter = "." | delimiter = ":" ) { delimiters.add(delimiter.image); }
        (( delimiter = "." | delimiter = ":" ) { data.add(null); delimiters.add(delimiter.image); })*
        token = RelObjectNameExt2() { data.add(token); }
    ) *
    { return new ObjectNames(data, delimiters); }
}

// column names do not allow ":" delimiters as those represent JSON `GET` operators
ObjectNames ColumnIdentifier() : {
    String token = null;
    Token delimiter = null;
    List<String> data = new ArrayList<String>();
    List<String> delimiters = new ArrayList<String>();
} {
    token = RelObjectNameExt() { data.add(token); }
    (
        LOOKAHEAD (2) ( delimiter = "." ) { delimiters.add(delimiter.image); }
        (( delimiter = "." ) { data.add(null); delimiters.add(delimiter.image); })*
        token = RelObjectNameExt2() { data.add(token); }
    ) *
    { return new ObjectNames(data, delimiters); }
}
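
To make the effect of these productions easier to follow, here is a plain-Java sketch of the bookkeeping they perform on an already tokenized identifier chain. It is an illustration only: `SkippedPartsSketch` and `collect` are hypothetical names, the input is a simple list of strings rather than JavaCC tokens, and the real `ObjectNames` class is not used. Every delimiter that directly follows another delimiter records a skipped part as `null`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the data/delimiters bookkeeping, not JSqlParser code.
final class SkippedPartsSketch {
    final List<String> data = new ArrayList<>();
    final List<String> delimiters = new ArrayList<>();

    static boolean isDelimiter(String token) {
        return ".".equals(token) || ":".equals(token);
    }

    // Expects: name (delimiter+ name)*
    static SkippedPartsSketch collect(List<String> tokens) {
        if (tokens.isEmpty() || isDelimiter(tokens.get(0))) {
            throw new IllegalArgumentException("Chain must start with a name");
        }
        SkippedPartsSketch result = new SkippedPartsSketch();
        result.data.add(tokens.get(0));
        int i = 1;
        while (i < tokens.size()) {
            if (!isDelimiter(tokens.get(i))) {
                throw new IllegalArgumentException("Expected a delimiter at index " + i);
            }
            // First delimiter of the group separates two parts.
            result.delimiters.add(tokens.get(i++));
            // Each further consecutive delimiter means one part was skipped.
            while (i < tokens.size() && isDelimiter(tokens.get(i))) {
                result.data.add(null);
                result.delimiters.add(tokens.get(i++));
            }
            if (i >= tokens.size()) {
                throw new IllegalArgumentException("Chain must not end with a delimiter");
            }
            result.data.add(tokens.get(i++));
        }
        return result;
    }

    public static void main(String[] args) {
        SkippedPartsSketch r = collect(Arrays.asList("db", ".", ".", "tbl"));
        System.out.println(r.data + " " + r.delimiters);
    }
}
```

Note that this approach depends on each dot arriving as its own token, which is exactly what stops working once a ".." token is introduced elsewhere in the grammar.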
-
Thank you for your quick reply @manticore-projects,
can you please provide deeper insight into your thoughts?
Maybe I am missing something, but to my knowledge, introducing a feature flag or implementing my own RelObjectNames won't solve the issue. The mere existence and usage of the ".." token (no matter in which context) causes the unit tests to fail.
In the case of the IMPORT statement, I don't use it with RelObjectNames; it is used between <S_LONG> tokens.
If we implemented a feature flag, we would have to implement it in such a way that support for double-dot multipart identifiers is mutually exclusive with the IMPORT statement. Even then, I am not sure it would work.
-
Good Morning @ssteinhauser.
Please see 9f51831 (sorry for committing directly, it was an accident).
I did not see any test for the IMPORT statement though, but it should work when setting the new feature allowSkippingPartsInIdentifiers to FALSE.
-
Hi @manticore-projects,
thank you for your contribution, I really appreciate your help on this.
I still have concerns about this. As I feared, this feature makes skipping parts in identifiers mutually exclusive with that specific part of the IMPORT statement.
However, splitting the double dot ".." into two single dots "." "." allows whitespace between those two dots (which Exasol does not support at this point), whereas if we keep using the double-dot token "..", we still have the same problem as before, even with the feature.
So the feature won't fix the problem. Maybe we could live with using two single dots and allowing whitespace between them?
Sorry for not implementing the unit tests yet, I will do that asap.
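
As a side note on the whitespace concern: if two single-dot tokens are accepted, adjacency could in principle be checked afterwards from token positions. The following is a rough sketch under assumptions, not something the thread or JSqlParser implements. `Tok` is a hypothetical stand-in that mimics the `beginLine`, `beginColumn`, `endLine` and `endColumn` fields of a JavaCC-generated `Token`, and it assumes columns are 1-based and inclusive.

```java
// Hypothetical sketch: detect whether two tokens touch each other in the input.
final class AdjacentDotsCheck {

    // Minimal stand-in for the position fields of a JavaCC-generated Token.
    static final class Tok {
        final String image;
        final int beginLine;
        final int beginColumn;
        final int endLine;
        final int endColumn;

        Tok(String image, int beginLine, int beginColumn, int endLine, int endColumn) {
            this.image = image;
            this.beginLine = beginLine;
            this.beginColumn = beginColumn;
            this.endLine = endLine;
            this.endColumn = endColumn;
        }
    }

    // A single "." occupies exactly one column on one line.
    static Tok dotAt(int line, int column) {
        return new Tok(".", line, column, line, column);
    }

    // True when 'second' starts in the column right after 'first' ends, on the same line.
    static boolean isAdjacent(Tok first, Tok second) {
        return first.endLine == second.beginLine
                && first.endColumn + 1 == second.beginColumn;
    }

    public static void main(String[] args) {
        System.out.println(isAdjacent(dotAt(1, 5), dotAt(1, 6))); // ".." without a gap
        System.out.println(isAdjacent(dotAt(1, 5), dotAt(1, 7))); // ". ." with a space
    }
}
```

Whether such a check belongs in a parser at all is a design question; it would add validation logic to a production that otherwise only needs to recognize structure.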
-
Good Morning @ssteinhauser .
I fully understand your concern, but ".." is really a nightmare.
We could rework those RelObjectNames() productions and introduce the same tokens there, e.g. ".", ".." or "..." -- but I am honestly not very eager to do this because it just works, while Exasol is a rather exotic use case.
Right now, I would like to suggest ignoring that there can be whitespace in between. JSQLParser is meant to be a parser and not a validation tool.
Suggestion:
- finish your work and ignore the possible whitespace between the dots
- we merge in order to have a full working copy
- we try to rework those productions and introduce proper tokens ".", "..", "..."
-
Alright, I agree and will continue as you proposed. I will provide the PR as soon as it's ready to be merged.
Thank you again for your support.
-
I tried to work with Tokens in those two productions instead of chaining the "." -- could not get it working in a reasonable time.
-
Good Morning @ssteinhauser .
I believe I have been able to fix/improve the dot-handling. The following 2 productions work:
// table names seem to allow ":" delimiters, e.g. for Informix see #1134
ObjectNames RelObjectNames() : {
    String token = null;
    Token delimiter = null;
    List<String> data = new ArrayList<String>();
    List<String> delimiters = new ArrayList<String>();
} {
    token = RelObjectNameExt() { data.add(token); }
    (
        LOOKAHEAD (2) (
            ( delimiter = "..." { delimiters.add("."); data.add(null); delimiters.add("."); data.add(null); delimiters.add("."); } )
            |
            ( delimiter = ".." { delimiters.add("."); data.add(null); delimiters.add("."); } )
            |
            ( ( delimiter = "." | delimiter = ":" ) { delimiters.add(delimiter.image); } )
        )
        token = RelObjectNameExt2() { data.add(token); }
    ) *
    { return new ObjectNames(data, delimiters); }
}

// column names do not allow ":" delimiters as those represent JSON `GET` operators
ObjectNames ColumnIdentifier() : {
    String token = null;
    Token delimiter = null;
    List<String> data = new ArrayList<String>();
    List<String> delimiters = new ArrayList<String>();
} {
    token = RelObjectNameExt() { data.add(token); }
    (
        LOOKAHEAD (2) (
            ( delimiter = "..." { delimiters.add("."); data.add(null); delimiters.add("."); data.add(null); delimiters.add("."); } )
            |
            ( delimiter = ".." { delimiters.add("."); data.add(null); delimiters.add("."); } )
            |
            ( delimiter = "." { delimiters.add(delimiter.image); } )
        )
        token = RelObjectNameExt2() { data.add(token); }
    ) *
    { return new ObjectNames(data, delimiters); }
}
This should allow you to work with tokens like ".." and "..." in your Import production and is also more correct regarding whitespace handling.
I have taken this upstream already via 38d0e36. Sorry for the delayed solution.
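
The semantic actions in the two productions above all follow one pattern: a delimiter token made of n dots contributes n "." delimiters and n - 1 skipped parts. A compact Java sketch of that pattern follows, with the caveat that `DotTokenExpansion`, `addDelimiter` and `collect` are hypothetical names for illustration and the input is a list of strings instead of JavaCC tokens.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of how ".", ".." and "..." tokens expand, not JSqlParser code.
final class DotTokenExpansion {
    final List<String> data = new ArrayList<>();
    final List<String> delimiters = new ArrayList<>();

    // Append what a single delimiter token stands for.
    void addDelimiter(String image) {
        if (":".equals(image)) {
            delimiters.add(":");
            return;
        }
        if (!image.matches("\\.{1,3}")) {
            throw new IllegalArgumentException("Not a delimiter token: " + image);
        }
        for (int i = 0; i < image.length(); i++) {
            if (i > 0) {
                data.add(null); // a part was skipped between two consecutive dots
            }
            delimiters.add(".");
        }
    }

    // Expects: name (delimiterToken name)*
    static DotTokenExpansion collect(List<String> tokens) {
        if (tokens.size() % 2 == 0) {
            throw new IllegalArgumentException("Expected name (delimiter name)*");
        }
        DotTokenExpansion result = new DotTokenExpansion();
        result.data.add(tokens.get(0));
        for (int i = 1; i < tokens.size(); i += 2) {
            result.addDelimiter(tokens.get(i));
            result.data.add(tokens.get(i + 1));
        }
        return result;
    }

    public static void main(String[] args) {
        DotTokenExpansion r = collect(Arrays.asList("db", "..", "tbl"));
        System.out.println(r.data + " " + r.delimiters);
    }
}
```

Because the multi-dot tokens are expanded back into single "." delimiters and `null` parts, the resulting lists have the same shape as before, while dots separated by whitespace are no longer treated as one multi-dot delimiter.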
-
Totally unrelated: I would like to plead for cutting your mega-productions into smaller chunks wherever possible, with the (auto-generated) railroad diagrams in mind. Thank you!