@Shekharrajak
Contributor

Closes #3181

Rationale for this change

Enable native execution for Spark's LEFT(str, len) function to improve query performance by avoiding JVM fallback.

What changes are included in this PR?

  • Added a CometLeft serializer that serializes LEFT(str, len) as the existing Substring(str, 1, len) protobuf expression
  • Registered Left expression in QueryPlanSerde string expressions map
  • No protobuf or Rust changes required (reuses existing Substring implementation)
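The rewrite described in the list above can be illustrated with a minimal, self-contained sketch. Everything here is hypothetical: `Expr`, `Column`, `IntLit`, `LeftExpr`, `SubstringExpr`, and `rewrite` are stand-ins invented for illustration, not Spark's or Comet's actual classes, and the real CometLeft serializer emits protobuf rather than another expression tree.

```scala
object LeftRewriteSketch {
  // Hypothetical, simplified expression tree (not Spark's or Comet's real types).
  sealed trait Expr
  final case class Column(name: String) extends Expr
  final case class IntLit(value: Int) extends Expr
  final case class LeftExpr(str: Expr, len: Expr) extends Expr
  final case class SubstringExpr(str: Expr, pos: Expr, len: Expr) extends Expr

  // Rewrite LEFT(str, len) as SUBSTRING(str, 1, len), recursing into children
  // so that nested LEFT calls are rewritten as well.
  def rewrite(e: Expr): Expr = e match {
    case LeftExpr(s, n)         => SubstringExpr(rewrite(s), IntLit(1), rewrite(n))
    case SubstringExpr(s, p, n) => SubstringExpr(rewrite(s), rewrite(p), rewrite(n))
    case leaf                   => leaf
  }
}
```

Because the output contains only substring nodes, a backend that already executes substring needs no new operator, which is the reason the PR gives for requiring no protobuf or Rust changes.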

How are these changes tested?

Added 4 test cases to CometExpressionSuite:

  • Basic functionality (various lengths: 0, -1, positive, exceeds length)
  • Unicode character handling (emoji, multi-byte chars)
  • Equivalence verification with SUBSTRING(str, 1, len)
  • Dictionary encoding preservation
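The edge cases in this list can be modeled in plain Scala. The sketch below is an assumption about the intended semantics, not code from the PR: `substringSql` is a hypothetical helper that follows 1-based, character-counted substring behavior as commonly described for Spark SQL, and `left` simply delegates to it with position 1. Characters are counted as Unicode code points so that an emoji counts as one character.

```scala
object LeftSemanticsSketch {
  // Split a string into code points so a multi-unit character (e.g. an emoji)
  // is treated as a single element.
  private def codePoints(s: String): Vector[String] =
    s.codePoints().toArray().toVector.map(cp => new String(Character.toChars(cp)))

  // Hypothetical model of SQL-style substring: pos is 1-based, pos = 0 is
  // treated like 1, and a negative pos counts from the end of the string.
  def substringSql(s: String, pos: Int, len: Int): String = {
    val chars = codePoints(s)
    val n = chars.length
    if (pos > n) ""
    else {
      val rawStart = if (pos > 0) pos - 1 else if (pos < 0) n + pos else 0
      val end = if (n - rawStart < len) n else rawStart + len
      val start = math.max(rawStart, 0)
      // A zero or negative length produces an empty result.
      if (end <= start) "" else chars.slice(start, end).mkString
    }
  }

  // LEFT(str, len) modeled as SUBSTRING(str, 1, len).
  def left(s: String, len: Int): String = substringSql(s, 1, len)
}
```

Under this model, `left("hello", 0)` and `left("hello", -1)` both give an empty string, `left("hello", 10)` gives the whole string, and `left(s, n)` equals `substringSql(s, 1, n)` by construction, which is the equivalence the third test case checks against the real engine.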

@andygrove
Member

@Shekharrajak can you run make to update the generated configs.md?

Member

@andygrove andygrove left a comment

LGTM pending CI. Thanks @Shekharrajak

@andygrove
Member

@Shekharrajak could you take a look at the CI failures?

@Shekharrajak
Contributor Author

> @Shekharrajak could you take a look at the CI failures?

Updated configs.md; the checks should succeed now. Please trigger the workflow.

Moved Left.enabled config entry to come after LastDay.enabled
to maintain alphabetical ordering in the expression configs table.
@Shekharrajak
Contributor Author

Looks like the order was not correct last time.

Moved the Left.enabled config entry to come after LastDay.enabled in configs.md.

@codecov-commenter

codecov-commenter commented Jan 21, 2026

Codecov Report

❌ Patch coverage is 10.52632% with 17 lines in your changes missing coverage. Please review.
✅ Project coverage is 59.99%. Comparing base (f09f8af) to head (476e04a).
⚠️ Report is 864 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...rc/main/scala/org/apache/comet/serde/strings.scala | 5.55% | 17 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #3206      +/-   ##
============================================
+ Coverage     56.12%   59.99%   +3.86%     
- Complexity      976     1418     +442     
============================================
  Files           119      170      +51     
  Lines         11743    15793    +4050     
  Branches       2251     2610     +359     
============================================
+ Hits           6591     9475    +2884     
- Misses         4012     4998     +986     
- Partials       1140     1320     +180     

@Shekharrajak
Contributor Author

Checks are looking fine now.

@andygrove andygrove merged commit 035aeff into apache:main Jan 22, 2026
119 checks passed
@andygrove
Member

Thanks again @Shekharrajak

@Shekharrajak Shekharrajak deleted the feature/support-left-expression branch January 23, 2026 04:13
Development

Successfully merging this pull request may close these issues.

[Feature] Support Spark expression: left

3 participants