perf: Optimize scalar performance for cot #19888

kumarUjjawal · 2026-01-19T15:29:08Z

Which issue does this PR close?

Part of [EPIC] Optimize performance for slow expressions datafusion-comet#2986.

Rationale for this change

The cot function currently converts scalar inputs to arrays before processing, even for single scalar values. This adds unnecessary overhead from array allocation and conversion. Adding a scalar fast path avoids this overhead.
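
The difference between the two paths can be sketched in a few lines. This is an illustration only, not the code from this PR: `Value` is a hypothetical stand-in for DataFusion's `ColumnarValue`, and cotangent is assumed to be computed as `1.0 / tan(x)`.

```rust
/// Hypothetical stand-in for a columnar value: one scalar or a full array.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Scalar(Option<f64>),
    Array(Vec<Option<f64>>),
}

/// Assumed definition of cotangent: 1 / tan(x).
fn cot(x: f64) -> f64 {
    1.0 / x.tan()
}

/// Slow path: a scalar is first wrapped in a one-element array.
fn cot_via_array(input: &Value) -> Value {
    let values: Vec<Option<f64>> = match input {
        Value::Scalar(s) => vec![*s], // extra allocation for a single value
        Value::Array(a) => a.clone(),
    };
    Value::Array(values.into_iter().map(|v| v.map(cot)).collect())
}

/// Fast path: a scalar is handled directly and the result stays a scalar.
fn cot_fast(input: &Value) -> Value {
    match input {
        Value::Scalar(s) => Value::Scalar(s.map(cot)),
        Value::Array(a) => Value::Array(a.iter().map(|v| v.map(cot)).collect()),
    }
}
```

In this sketch, `cot_fast(&Value::Scalar(Some(1.0)))` returns a `Value::Scalar` without touching the heap, while `cot_via_array` allocates a `Vec` for the same input.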

What changes are included in this PR?

Added a scalar fast path
Added a benchmark
Updated tests

| Benchmark | Before | After | Speedup |
|---|---|---|---|
| cot_f64_scalar | 229 ns | 67 ns | 3.4x |
| cot_f32_scalar | 247 ns | 59 ns | 4.2x |

Are these changes tested?

Are there any user-facing changes?

martin-g · 2026-01-20T11:03:35Z

datafusion/functions/src/math/cot.rs

+            ColumnarValue::Scalar(_) => {
+                panic!("Expected an array value")
+            }
+        }


There are no tests for the Scalar input/output (the fast path).
It would also be good to add tests for inputs like NULL, 0.0 and std::f64::consts::PI.

The existing sqllogictests should already cover the functionality. Aren't the changes just an optimization?

There is no .slt test for the non-Spark cot function:

❯ rg cot datafusion/sqllogictest/
datafusion/sqllogictest/test_files/spark/math/cot.slt
24:## Original Query: SELECT cot(1);
27:#SELECT cot(1::int);

datafusion/sqllogictest/test_files/aggregates_topk.slt
203:('y', 'apricot'),

datafusion/sqllogictest/test_files/imdb.slt
850: (24, 'Ridley Scott', NULL, NULL, 'm', NULL, NULL, NULL, NULL),

Or maybe datafusion/sqllogictest/test_files/spark/math/cot.slt is not really for Spark because I see no cot in https://github.com/apache/datafusion/blob/main/datafusion/spark/src/function/math/mod.rs

Anyway, https://github.com/apache/datafusion/blob/main/datafusion/sqllogictest/test_files/spark/math/cot.slt contains only commented out code, so there are no SLT tests for cot.

Added unit tests for these. Thanks for the feedback.
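
For reference, the three edge cases raised in this thread (NULL, 0.0 and PI) can be checked in isolation. The sketch below is not the unit tests added in the PR: `cot_f64` and `describe` are hypothetical helpers, NULL is modeled as `None`, and cotangent is assumed to be `1.0 / tan(x)` in `f64`.

```rust
/// Hypothetical helper: cotangent as 1 / tan(x); NULL is modeled as `None`.
fn cot_f64(x: Option<f64>) -> Option<f64> {
    x.map(|v| 1.0 / v.tan())
}

/// Classifies the result so each edge case can be asserted on by name.
fn describe(x: Option<f64>) -> &'static str {
    match cot_f64(x) {
        None => "null",
        Some(v) if v.is_nan() => "nan",
        Some(v) if v.is_infinite() => "infinite",
        Some(_) => "finite",
    }
}
```

Under these assumptions a NULL input stays NULL, `0.0` yields positive infinity (since `tan(0.0)` is exactly `0.0`), and `std::f64::consts::PI` yields a very large negative finite value rather than infinity, because PI is not exactly representable in `f64`.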

martin-g · 2026-01-20T11:07:49Z

datafusion/functions/src/math/cot.rs

-                .unary::<_, Float32Type>(|x: f32| compute_cot32(x)),
-        ) as ArrayRef),
-        other => exec_err!("Unsupported data type {other:?} for function cot"),
+        let return_type = args.return_type().clone();


This variable is used just once. It could be moved inside the if scalar.is_null() { branch to avoid the clone when it is not needed.
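
A minimal sketch of that suggestion, using hypothetical types rather than the PR's code: `Args` stands in for the function arguments, a `String` plays the role of the return data type, and the clone happens only in the NULL branch.

```rust
/// Hypothetical stand-in for the function arguments.
struct Args {
    return_type: String, // plays the role of the return data type
    scalar: Option<f64>, // `None` models a NULL scalar
}

/// Result of the sketch: a typed NULL or a computed value.
#[derive(Debug, PartialEq)]
enum Output {
    Null(String),
    Value(f64),
}

/// Clones `return_type` only where it is actually needed.
fn eval(args: &Args) -> Output {
    match args.scalar {
        // The clone lives in the NULL branch, not at the top of the function.
        None => Output::Null(args.return_type.clone()),
        // The non-NULL path never pays for the clone.
        Some(x) => Output::Value(1.0 / x.tan()),
    }
}
```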

martin-g · 2026-01-20T11:09:25Z

datafusion/functions/src/math/cot.rs

+                        .unary::<_, Float32Type>(compute_cot32),
+                ))),
+                other => {
+                    internal_err!("Unexpected data type {other:?} for function cot")


Is it intentional to use internal_err!() instead of exec_err!() (old line 116) ?!

If we reach the other => branch, the type coercion/signature code has a bug. This should never happen in normal execution, hence internal_err.
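
The distinction can be illustrated with a small standalone sketch. `SketchError`, `Ty` and `cot_typed` are hypothetical names, not DataFusion's error type or macros. The sketch assumes an earlier coercion step only lets Float32 and Float64 through.

```rust
/// Hypothetical error type separating user-facing failures from engine bugs.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum SketchError {
    /// The query cannot be executed as written (for example, bad input).
    Execution(String),
    /// An invariant was violated; reaching this indicates a bug.
    Internal(String),
}

/// Hypothetical set of types that could be passed to the function.
#[derive(Debug, Clone, Copy)]
enum Ty {
    Float32,
    Float64,
    Utf8,
}

/// Any type other than Float32/Float64 should have been rejected earlier,
/// so the fallback arm reports an internal error rather than an execution error.
fn cot_typed(ty: Ty, x: f64) -> Result<f64, SketchError> {
    match ty {
        Ty::Float64 => Ok(1.0 / x.tan()),
        Ty::Float32 => Ok(f64::from(1.0 / (x as f32).tan())),
        other => Err(SketchError::Internal(format!(
            "Unexpected data type {other:?} for function cot"
        ))),
    }
}
```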

martin-g · 2026-01-20T11:25:48Z

datafusion/functions/benches/cot.rs

+                        .invoke_with_args(ScalarFunctionArgs {
+                            args: scalar_f32_args.clone(),
+                            arg_fields: scalar_f32_arg_fields.clone(),
+                            number_rows: 1,


Suggested change

-                            number_rows: 1,
+                            number_rows: size,

Currently the input is the same for every value of size. Maybe number_rows could be used to vary it?

The benchmark loop already varies size for array benchmarks. For scalar, the point is to measure single-value performance regardless of batch size context.

In that case there is no need for the scalar bench to be inside for size in [1024, 4096, 8192] {. Currently it executes the very same logic with the very same config three times (once for each size).

Yeah right!
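
The agreed structure can be sketched without any benchmarking crate. The harness below is hypothetical (`bench` and `run_benches` are illustrative names, not the PR's benchmark code) and uses only the standard library. The scalar case is measured once, outside the size loop, and only the array case varies with `size`.

```rust
use std::hint::black_box;
use std::time::Instant;

/// Hypothetical mini-harness: runs `f` a fixed number of times and records the name.
fn bench(names: &mut Vec<String>, name: String, mut f: impl FnMut()) {
    let start = Instant::now();
    for _ in 0..100 {
        f();
    }
    let _elapsed = start.elapsed(); // a real harness would report this
    names.push(name);
}

/// Registers the scalar benchmark once and one array benchmark per size.
fn run_benches() -> Vec<String> {
    let mut names = Vec::new();

    // The scalar input does not depend on `size`, so it is measured once.
    bench(&mut names, "cot_f64_scalar".to_string(), || {
        let _ = black_box(1.0 / black_box(1.0_f64).tan());
    });

    // Only the array benchmarks vary with the batch size.
    for size in [1024usize, 4096, 8192] {
        let data: Vec<f64> = (0..size).map(|i| i as f64 + 1.0).collect();
        bench(&mut names, format!("cot_f64_array_{size}"), || {
            let out: Vec<f64> = data.iter().map(|x| 1.0 / x.tan()).collect();
            let _ = black_box(out);
        });
    }
    names
}
```

Calling `run_benches()` registers one scalar entry and three array entries, instead of repeating the identical scalar measurement three times.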

martin-g · 2026-01-20T11:25:57Z

datafusion/functions/benches/cot.rs

+                        .invoke_with_args(ScalarFunctionArgs {
+                            args: scalar_f64_args.clone(),
+                            arg_fields: scalar_f64_arg_fields.clone(),
+                            number_rows: 1,


Suggested change

-                            number_rows: 1,
+                            number_rows: size,

Jefffrey · 2026-01-21T03:44:50Z

Thanks @kumarUjjawal & @martin-g

perf: Optimize scalar performance for cot

7f789c5

github-actions bot added the functions label (Changes to functions implementation) Jan 19, 2026

Jefffrey approved these changes Jan 19, 2026

martin-g reviewed Jan 20, 2026

kumarUjjawal added 2 commits January 20, 2026 20:29

moved return type to avoid clone & scalar benchmark ourside size loop

6d3bfd3

added unit tests for scalar path

1314484

martin-g approved these changes Jan 20, 2026

Jefffrey added this pull request to the merge queue Jan 21, 2026

Merged via the queue into apache:main with commit 4d8d48c Jan 21, 2026
28 checks passed
