Modifying Spark SQL query #23
Hello folks,
I've been trying to capture and modify SQL queries for certain use cases. My current approach is as follows:
- Override parsePlan in AbstractSqlParser, capture the SQL query, and modify it if needed
- Then pass the modified query into the original parsePlan
- Build a jar containing the extended parsePlan and set spark.jars and spark.sql.extensions in spark-conf.sh
After this, when we run Spark SQL commands from a client (spark-shell, spark-submit, etc.), no additional signaling is needed and the queries are modified according to my logic.
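For reference, here is a minimal sketch of this kind of setup. It assumes the Spark 3.3 to 3.5 ParserInterface (other versions have a slightly different set of abstract methods). Instead of subclassing AbstractSqlParser, it wraps the session's existing parser through SparkSessionExtensions.injectParser and delegates to it. The names QueryRewriter, RewritingParser, and RewritingExtensions, and the rewrite rule itself, are hypothetical placeholders and not my actual logic.

```scala
import org.apache.spark.sql.{SparkSession, SparkSessionExtensions}
import org.apache.spark.sql.catalyst.{FunctionIdentifier, TableIdentifier}
import org.apache.spark.sql.catalyst.expressions.Expression
import org.apache.spark.sql.catalyst.parser.ParserInterface
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.types.{DataType, StructType}

// Hypothetical rewrite rule, for illustration only:
// redirect references from one database name to another.
object QueryRewriter {
  def rewrite(sqlText: String): String =
    sqlText.replace("legacy_db.", "current_db.")
}

// Wraps the parser Spark would otherwise use. Only parsePlan rewrites the
// SQL text; every other method is forwarded to the delegate unchanged.
// Method set assumed from Spark 3.3 to 3.5.
class RewritingParser(delegate: ParserInterface) extends ParserInterface {
  override def parsePlan(sqlText: String): LogicalPlan =
    delegate.parsePlan(QueryRewriter.rewrite(sqlText))

  override def parseQuery(sqlText: String): LogicalPlan =
    delegate.parseQuery(sqlText)
  override def parseExpression(sqlText: String): Expression =
    delegate.parseExpression(sqlText)
  override def parseTableIdentifier(sqlText: String): TableIdentifier =
    delegate.parseTableIdentifier(sqlText)
  override def parseFunctionIdentifier(sqlText: String): FunctionIdentifier =
    delegate.parseFunctionIdentifier(sqlText)
  override def parseMultipartIdentifier(sqlText: String): Seq[String] =
    delegate.parseMultipartIdentifier(sqlText)
  override def parseTableSchema(sqlText: String): StructType =
    delegate.parseTableSchema(sqlText)
  override def parseDataType(sqlText: String): DataType =
    delegate.parseDataType(sqlText)
}

// Entry point whose fully qualified class name goes into spark.sql.extensions.
class RewritingExtensions extends (SparkSessionExtensions => Unit) {
  override def apply(extensions: SparkSessionExtensions): Unit =
    extensions.injectParser(
      (_: SparkSession, delegate: ParserInterface) => new RewritingParser(delegate)
    )
}
```

With the jar listed in spark.jars and the fully qualified name of RewritingExtensions set in spark.sql.extensions, every session builds its parser through this wrapper, so clients need no code changes.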
What I want to know is whether I am modifying the query in the right place, or whether there are other options for doing so. I am worried that this modification happens in the middle of query processing, which could cause problems for the client.
My requirements are as follows:
- I am looking for hooks/extensions that I can use to modify the query
- I don't want to modify the Spark source code directly
- The client shouldn't need any additional signaling to use my custom jars, etc., beyond adding them to a conf.
Any answers from the community would be very helpful, and any reference sites are much appreciated. Thanks in advance!
P.S. Sorry if this question is irrelevant to this repo.