February 20, 2026: Weekly Status Update in Gluten · apache/gluten · Discussion #11638

GlutenPerfBot
Feb 20, 2026

This weekly update is generated by LLMs. You're welcome to join us on GitHub for in-depth discussions.

Overall Activity Summary

The Apache Gluten project has been highly active over the past 7 days with 42 pull requests and 20+ issues, focusing on major infrastructure improvements, performance optimizations, and Spark 4.x compatibility. The community is preparing for the 1.6.0 release while advancing multiple backend enhancements.

Key Ongoing Projects

Build System Modernization

Gradle Build Support: @liuneng1994 is leading an effort ([WIP] [BUILD] Add Gradle build support and replace Maven in CI #11576) to add Gradle as an alternative to Maven, featuring multi-version support, native C++ integration, and significant build performance improvements.
Incremental Build Optimization: @baibaichen delivered major improvements ([GLUTEN-11559][Build] Improve incremental build time for test-compile phase #11560, [GLUTEN-11559][VL] Add incremental C++ build script for fast development iteration #11595), reducing incremental build times from ~3 minutes to under 30 seconds by adopting the Ninja build system and smart caching.

Performance & Memory Management

Native Delta Statistics Writer: @zhztheplayer achieved a 61% performance improvement ([GLUTEN-10215][VL] Delta write: Native statistics tracker to eliminate C2R overhead #11419) by replacing the columnar-to-row (C2R) conversion with native Velox aggregation tasks.
Broadcast Hash Join Optimization: @JkSelf implemented executor-level hash table caching ([GLUTEN-7548][VL] Optimize BHJ in velox backend #8931), showing a 1.29x performance improvement in TPC-DS benchmarks.
Memory Management: Multiple PRs address off-heap memory issues in shuffle and window operations ([VL][1.5] Not enough spark off-heap execution memory on rss shuffle writer #11542, [VL][1.5] Not enough spark off-heap execution memory on window #11540).
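
For readers unfamiliar with the idea behind executor-level hash table caching, the following is a minimal, language-neutral sketch in Python. It is not Gluten's or Velox's implementation: the class `HashTableCache`, the `broadcast_id` key, and the `probe` helper are all hypothetical names chosen for illustration. It only shows the general pattern of building the build-side table once per process and letting later tasks reuse it.

```python
import threading
from collections import defaultdict


class HashTableCache:
    """Process-wide cache holding one built hash table per broadcast id.

    Hypothetical illustration only; not the API of Gluten or Velox.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._tables = {}
        self.builds = 0  # how many times a table was actually built

    def get_or_build(self, broadcast_id, rows, key_index=0):
        # Hold the lock while building so concurrent tasks never build twice.
        with self._lock:
            table = self._tables.get(broadcast_id)
            if table is None:
                table = defaultdict(list)
                for row in rows:
                    table[row[key_index]].append(row)
                self._tables[broadcast_id] = table
                self.builds += 1
            return table

    def invalidate(self, broadcast_id):
        # Drop the cached table, e.g. when the broadcast is no longer needed.
        with self._lock:
            self._tables.pop(broadcast_id, None)


def probe(table, probe_rows, key_index=0):
    """Inner-join probe rows against a cached build-side table."""
    return [
        (build_row, probe_row)
        for probe_row in probe_rows
        for build_row in table.get(probe_row[key_index], [])
    ]
```

In this sketch, two tasks that call `get_or_build` with the same `broadcast_id` receive the same table object, so the build cost is paid once per process rather than once per task.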

Spark 4.x Compatibility

Python 3.10 Migration: @ReemaAlzaid completed CI updates ([VL] Update CI Python to 3.10 for Spark 4.1 and enable ArrowEvalPythonExecSuite tests #11481, [VL][CI] Migrate Spark 4.1 tests to CentOS 9 #11519) to support Spark 4.1's Python requirements.
Test Suite Stabilization: @baibaichen and team are systematically fixing disabled test suites (Spark 4.x: Tracking disabled test suites #11550, [GLUTEN-11550][UT] Enable GlutenXmlExpressionsSuite for spark4x and exclude 'from_xml- invalid data' #11580), covering 51 unique suites across Spark 4.0 and 4.1.

Priority Items

Critical Infrastructure

GPU CI Infrastructure: @zhouyuan temporarily disabled GPU CI ([GLUTEN-11611][VL] Temporary disable GPU CI job #11612) due to FBOS upgrade compatibility issues; the containers need to be updated before it can be re-enabled.
S3 Integration Testing: @Mariamalmesfer enabled S3 integration tests ([VL] Add S3 integration gluten tests #11516), closing a long-standing gap.

Function Support Expansion

ANSI Mode Implementation: @philo-he is coordinating ANSI SQL compliance work ([VL] Add ANSI mode support #10134), with multiple contributors working on type casting and arithmetic functions.
Missing Spark Functions: @zhztheplayer added support for approx_count_distinct_for_intervals ([VL] Add support for approx_count_distinct_for_intervals #11599), which is essential for Spark's cost-based optimizer (CBO) and histogram functionality.

Notable Discussions

Release Planning

Gluten 1.6.0 Release: @zhztheplayer is coordinating the upcoming release (Gluten Release 1.6.0 #11603); the development version has already been bumped to 1.7.0-SNAPSHOT ([CORE] Bump version to 1.7.0-SNAPSHOT #11592).

New Backend Introduction

Bolt Backend Integration: @WangGuangxin initiated a discussion (Add a new backend: Bolt #10929) about integrating Bolt, a Velox fork from ByteDance with production-hardened features and LLVM-based JIT compilation.

Emerging Trends

AI-Driven Development: Multiple PRs explicitly mention using AI tooling (Claude, GitHub Copilot) to accelerate development.
Production Optimization: Focus is shifting from basic functionality to production-ready features such as memory management, performance tuning, and comprehensive testing.
Multi-Backend Strategy: Interest is growing in supporting multiple execution backends beyond Velox.
Build Performance: Significant engineering effort is going into developer experience improvements.

Good First Issues

#10134: ANSI Mode Support
Skills needed: Scala, SQL, Type Systems
Why it's good: Comprehensive tracking issue with individual tasks that can be picked up independently, excellent for learning Spark SQL internals

#11513: Input_file_name() returns "" on iceberg tables
Skills needed: Java/Scala, Iceberg integration
Why it's good: Well-defined bug with clear scope, good introduction to Gluten's data lake integration

#11501: Docker Dependency Caching
Skills needed: Docker, CI/CD, Maven
Why it's good: Straightforward infrastructure improvement with a clear requirement: pre-install Java dependencies in the CI Docker images for faster builds
