[Question] TPC-DS 1TB Benchmarking results for Non-Partitioned Delta tables with Velox Backend #11463

shadowmmu started this conversation in General

Hi Gluten Community,

I am currently exploring the performance of Apache Gluten with the Velox backend specifically for Delta Lake workloads.

While there are several TPC-DS benchmark reports available for Parquet/ORC, I am looking for insights or existing benchmarking results for the following specific setup:

  • Scale Factor: 1TB (TPC-DS)
  • Data Format: Delta Lake (non-partitioned)
  • Backend: Velox
  • Storage: GCS

Context:
We are evaluating the overhead of the Delta Log reading process versus the native acceleration provided by Velox. Specifically, we are interested in:

  1. How non-partitioned Delta tables perform compared to standard Parquet in a Gluten environment.
  2. Whether anyone has observed specific bottlenecks in metadata handling or scan performance with this configuration.
  3. Recommended Spark/Gluten configurations to optimize the Delta-Velox scan path for large-scale non-partitioned data.

If anyone has run these benchmarks or has a performance comparison (Native Spark vs. Gluten+Velox) for this setup, I would greatly appreciate it if you could share your findings or any tuning tips!

Thanks!

Replies: 2 comments

Delta tables perform somewhat worse than plain Hive tables. Delta uses SQL queries to read its metadata during query processing, but some operators in those metadata queries are not supported. In some cases this causes frequent C2R (columnar-to-row) and R2C (row-to-columnar) transitions, and the query can end up slower than vanilla Spark. Contributions to fix this are welcome.
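One rough way to see how often a query crosses the row/columnar boundary is to count the transition operators in the physical plan text. The snippet below is only a sketch: it assumes the transitions print as ColumnarToRow / RowToColumnar (vanilla Spark) or VeloxColumnarToRow / RowToVeloxColumnar (Gluten with Velox), and the sample plan is hypothetical, written by hand for illustration rather than taken from a real run.

```python
import re

# Assumed operator names (check them against your own plan output):
#   vanilla Spark:  ColumnarToRow, RowToColumnar
#   Gluten + Velox: VeloxColumnarToRow, RowToVeloxColumnar
C2R_PATTERN = re.compile(r"\b\w*ColumnarToRow\b")
R2C_PATTERN = re.compile(r"\bRowTo\w*Columnar\b")


def count_transitions(plan_text: str) -> dict:
    """Count columnar-to-row and row-to-columnar operators in a plan string."""
    return {
        "c2r": len(C2R_PATTERN.findall(plan_text)),
        "r2c": len(R2C_PATTERN.findall(plan_text)),
    }


# Hypothetical plan text, hand-written for illustration only.
SAMPLE_PLAN = """
VeloxColumnarToRow
+- ^(1) ProjectExecTransformer [id#0]
   +- RowToVeloxColumnar
      +- Filter isnotnull(id#0)
         +- VeloxColumnarToRow
            +- ^(1) FileScanTransformer [id#0]
"""

if __name__ == "__main__":
    print(count_transitions(SAMPLE_PLAN))
```

In PySpark the plan text can usually be obtained with `df._jdf.queryExecution().executedPlan().toString()`. With adaptive query execution enabled, the final plan is only available after the query has run. A high count on Delta metadata queries compared with the same query on plain Parquet would point at the fallback behaviour described above.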


Thanks @FelixYBW for your detailed response.
I am happy to contribute. Could you please guide me on how to proceed?
