-
Notifications
You must be signed in to change notification settings - Fork 506
feat: add alter column nullable to non-nullable support #5589
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Code Review
P1: Bug - Nested column validation will fail
In validate_no_nulls_before_making_non_nullable, the code uses batch.column_by_name(path) where path can be a nested path like "b.c". However, RecordBatch::column_by_name only searches top-level column names, not nested paths.
When you call scanner.project(&["b.c"]), the resulting batch schema depends on how Lance handles nested projections - the column won't be accessible via batch.column_by_name("b.c").
Suggestion: Consider using the batch's first (and only) column directly since the projection contains exactly one column:
let col = batch.column(0);
Or alternatively, extract the leaf field name from the path:
let leaf_name = path.rsplit('.').next().unwrap_or(path); let col = batch.column_by_name(leaf_name)...
Recommendation: Add a test for nested columns
The current tests only cover top-level columns. Given that alter_columns explicitly supports nested paths (as seen in test_rename_columns at line 1324), please add a test case for making a nested nullable column non-nullable.
Otherwise the implementation looks correct - the validation approach of scanning all data to check for nulls before allowing the schema change is sound.
wjones127
commented
Dec 30, 2025
+1 on the review comment. Would like to have this supported in nested columns.
This PR will add alter column nullable to non-nullable support
Parts of this PR were drafted with assistance from Codex (with
gpt-5.2) and fully reviewed and edited by me. I take full responsibility for all changes.