-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Preserve input field nullability in ArrayAgg return field #19868
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Preserve input field nullability in ArrayAgg return field #19868
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR attempts to preserve the input field's nullability in the ArrayAgg aggregate function's return field. When the input field is non-nullable, the result field would be marked as non-nullable; when the input is nullable, the result field would remain nullable.
Changes:
- Added a
return_fieldmethod override to theArrayAggimplementation that sets the result field's nullability to match the input field's nullability - Added a unit test to verify that the schema metadata correctly reflects the input field's nullability
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
668ce8c to
b833445
Compare
114f5bc to
2b2a41b
Compare
- Override return_field() to preserve input field nullability in list elements - Add input_nullable field to all ArrayAgg accumulators - List itself remains nullable (NULL for empty groups) - Inner elements nullability now matches input field nullability
024785b to
24e57cb
Compare
- Separate state() from evaluate() in all three accumulators - state() always uses nullable=true for inner elements (matches state_fields()) - evaluate() uses input_nullable to preserve nullability for final output - Fix clippy: use std::slice::from_ref() instead of .clone() on FieldRef
- Fix array_sort return_type() to preserve input inner field nullability - Fix clippy: use Arc::clone() instead of .clone() on FieldRef - Update unnest_with_redundant_columns snapshot for non-nullable inner
- Make return_type() return error since return_field() is implemented - Simplify comments in return_field() - Revert array_sort changes (unrelated to original issue) - Revert spark/collect.rs changes (UDAF return_field not fixed)
84c066f to
402927a
Compare
|
ok ill update this all |
|
@GaneshPatil7517 Please refrain from closing a discussion when no action has been taken or no comment was left. My comments regarding the state related functions don't seem to have been addressed. |
- Update state_fields() to preserve input nullability (same as return_field) - Simplify state() to just call evaluate() since they now share the same nullability - Remove duplicate handling code in all three accumulators
fce0b8b to
f7c3206
Compare
array_sort must preserve the input list's inner field nullability to match the type returned by array_agg which now preserves input nullability.
e744abe to
4d0118d
Compare
Jefffrey
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@GaneshPatil7517 I strongly urge you to do a self-review on PRs; I'm seeing an unrelated file be accidentally committed, as well as the array-sort changes coming back without any reason given
| } | ||
|
|
||
| fn state(&mut self) -> Result<Vec<ScalarValue>> { | ||
| if !self.is_input_pre_ordered { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we removing this line?
| DataType::Null => Ok(DataType::Null), | ||
| DataType::List(field) => { | ||
| Ok(DataType::new_list(field.data_type().clone(), true)) | ||
| // Preserve the inner field's nullability from input |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are these changes coming back again?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove this file
Which issue does this PR close?
Fixes #19768
Rationale for this change
Currently,
ArrayAggalways produces an output with data type: