Skip to content

Comments

SPARK-55589: TransformingEncoder support from Option to nullable#54366

Open
eejbyfeldt wants to merge 1 commit intoapache:masterfrom
eejbyfeldt:SPARK-55589
Open

SPARK-55589: TransformingEncoder support from Option to nullable#54366
eejbyfeldt wants to merge 1 commit intoapache:masterfrom
eejbyfeldt:SPARK-55589

Conversation

@eejbyfeldt
Copy link
Contributor

@eejbyfeldt eejbyfeldt commented Feb 18, 2026

What changes were proposed in this pull request?

Make it possible to use TransformingEncoder to map from an Option to a nullable type and back to the Option.

Why are the changes needed?

We are currently using Expression based decoders to derive some custom encoders. Our custom encoders supported nested Option by automatically inserting a struct the include to format for each extra Option. Trying to migrate this code to use the AgnosticEncoders ran into this issue as null are not passed into the decode function so that it can be wrapped back to a Option.

Here are some related MRs also trying to add feature parity to AgnosticEncoders: #50023, #51319, #51313

I believe this was missed in #50023 as that introduced a codec with null handling here https://github.com/apache/spark/blob/master/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala#L709 but because of this bug that null check will never be reached. But that particular implementation does propagates null manually that will not be noticed.

Does this PR introduce any user-facing change?

Yes, libraries trying to derive custom encoders with have more flexibility.

How was this patch tested?

Existing and new unittests.

Was this patch authored or co-authored using generative AI tooling?

No.

@eejbyfeldt eejbyfeldt marked this pull request as ready for review February 19, 2026 13:20
@eejbyfeldt
Copy link
Contributor Author

@hvanhovell and @chris-twiner you have both be involved in previous MRs making the AgnosticEncoders more extensible. Would you mind having a look at this one?

@chris-twiner
Copy link
Contributor

chris-twiner commented Feb 19, 2026

Do you have any tests which show what wasn't working before, I feel this is something that would have been handled as part of the value class logic.
That said it intuitively looks right and similar to other propagate logic.

@eejbyfeldt
Copy link
Contributor Author

The added test case does not pass if the other change is removed.

You could also add an assert https://github.com/apache/spark/blob/master/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala#L709 here that out it never null and that will not break any existing tests.

@chris-twiner
Copy link
Contributor

The added test case does not pass if the other change is removed.

You could also add an assert https://github.com/apache/spark/blob/master/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala#L709 here that out it never null and that will not break any existing tests.

lol, blame it on a caffeine deficit. Yeah that makes sense, it's interesting that it wasn't picked up by some of the frameless tests, I'm probably working around it without realising.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants