February 27, 2026: Weekly Status Update in Gluten #11671
GlutenPerfBot started this conversation in General
This weekly update is generated by LLMs. You're welcome to join us on GitHub for in-depth discussions.
Overall Activity Summary
The Apache Gluten project has been highly active over the past 7 days with 42 pull requests and 20+ issues, focusing on major infrastructure improvements, performance optimizations, and Spark 4.x compatibility. The community is preparing for the 1.6.0 release while advancing multiple backend enhancements.
Key Ongoing Projects
- Build System Modernization: @baibaichen delivered major improvements ([GLUTEN-11559][Build] Improve incremental build time for test-compile phase #11560, [GLUTEN-11559][VL] Add incremental C++ build script for fast development iteration #11595), reducing incremental build times from ~3 minutes to under 30 seconds through Ninja build system adoption and smart caching
- Bloop Integration: @liuneng1994 added Bloop build server integration ([CORE] Add Bloop integration for faster Scala incremental compilation #11645), achieving a 35.9x speedup for incremental compilation
- Spark 4.x Compatibility: Multiple contributors are working on test suite stabilization (Spark 4.x: Tracking disabled test suites #11550, [GLUTEN-11550][UT] Enable GlutenXmlExpressionsSuite for spark4x and exclude 'from_xml- invalid data' #11580) with 51 unique suites across Spark 4.0/4.1 versions
- Performance Optimizations: @JkSelf implemented a broadcast hash join optimization ([GLUTEN-7548][VL] Optimize BHJ in velox backend #8931), showing a 1.29x performance improvement in TPC-DS benchmarks
- Iceberg Integration: @rui-mo and @jinchengchenghh are working on enabling Iceberg tests ([VL] Iceberg tests failed and were skipped #11630, [GLUTEN-11630][VL] Enable iceberg tests #11631, [GLUTEN-11630][VL] Enable iceberg tests #11641)
Priority Items
- GPU CI Infrastructure: @zhouyuan needs help restoring the GPU CI job ([VL] GPU CI job is down #11611) - container updates are required due to the FBOS upgrade
- Memory Management: @wForget's RSS shuffle writer OOM issue ([VL][1.5] Not enough spark off-heap execution memory on rss shuffle writer #11542) requires immediate attention for production stability
- Arm64 Build Issues: @odidev and @huangshiyou reported compilation failures on Azure Arm64 (Build failures when building Apache Gluten with Velox on Azure Arm64 (Ubuntu 24.04) #11633, Bundle build failure on Azure Cobalt aarch64 #11639)
- Scala Compilation: @baibaichen fixed the incremental compilation mode ([GLUTEN-11658][CORE] Restore scala.recompile.mode default to 'all' and introduce fast-build profile #11659) - critical for developer productivity
Notable Discussions
- Gluten Release 1.6.0 #11603: Gluten 1.6.0 release coordination by @zhztheplayer - Spark 3.2 support deprecated, preview Spark 4.0 support included
- [VL] useful Velox PRs not merged into upstream #11585: @FelixYBW is tracking useful Velox PRs not merged upstream - community coordination effort
- Add a new backend: Bolt #10929: @WangGuangxin is proposing Bolt backend integration - Velox fork from ByteDance with production features
Emerging Trends
- AI-Driven Development: Multiple PRs explicitly mention AI tooling usage (Claude, GitHub Copilot) for development acceleration
- Developer Experience Focus: Significant engineering effort on build performance and tooling improvements
- Production Readiness: Shift from basic functionality to production-ready features like memory management and comprehensive testing
- Multi-Backend Strategy: Growing interest in supporting multiple execution backends beyond Velox
Good First Issues
- [VL] Input_file_name() returns "" on iceberg tables #11513: Well-defined bug with clear scope, good introduction to Gluten's data lake integration
- [VL] Caching java dependencies in testing docker #11501: Docker dependency caching - Straightforward infrastructure improvement to pre-install Java dependencies in CI Docker images
- [VL] Add ANSI mode support #10134: ANSI mode support - Comprehensive tracking issue with individual tasks that can be picked up independently, excellent for learning Spark SQL internals
- [VL] Track on Spark-4.1.x failed unit tests #11400: Spark 4.1.x failed unit tests - Multiple test failures need investigation, good for understanding Gluten's test framework