SPARK-55589: TransformingEncoder support from Option to nullable#54366
SPARK-55589: TransformingEncoder support from Option to nullable#54366eejbyfeldt wants to merge 1 commit intoapache:masterfrom
Conversation
ea24cb3 to
a498d31
Compare
|
@hvanhovell and @chris-twiner you have both be involved in previous MRs making the AgnosticEncoders more extensible. Would you mind having a look at this one? |
|
Do you have any tests which show what wasn't working before, I feel this is something that would have been handled as part of the value class logic. |
|
The added test case does not pass if the other change is removed. You could also add an assert https://github.com/apache/spark/blob/master/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala#L709 here that out it never null and that will not break any existing tests. |
lol, blame it on a caffeine deficit. Yeah that makes sense, it's interesting that it wasn't picked up by some of the frameless tests, I'm probably working around it without realising. |
What changes were proposed in this pull request?
Make it possible to use TransformingEncoder to map from an
Optionto a nullable type and back to theOption.Why are the changes needed?
We are currently using Expression based decoders to derive some custom encoders. Our custom encoders supported nested
Optionby automatically inserting a struct the include to format for each extra Option. Trying to migrate this code to use the AgnosticEncoders ran into this issue as null are not passed into the decode function so that it can be wrapped back to a Option.Here are some related MRs also trying to add feature parity to AgnosticEncoders: #50023, #51319, #51313
I believe this was missed in #50023 as that introduced a codec with null handling here https://github.com/apache/spark/blob/master/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala#L709 but because of this bug that null check will never be reached. But that particular implementation does propagates null manually that will not be noticed.
Does this PR introduce any user-facing change?
Yes, libraries trying to derive custom encoders with have more flexibility.
How was this patch tested?
Existing and new unittests.
Was this patch authored or co-authored using generative AI tooling?
No.