perf: Optimize scalar performance for cot #19888
    ColumnarValue::Scalar(_) => {
        panic!("Expected an array value")
    }
}
There are no tests for the Scalar input/output (the fast path).
It would also be good to add tests for inputs like NULL, 0.0 and std::f64::consts::PI.
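A minimal sketch of what such edge-case checks could look like, using only the standard library. It assumes cot(x) is computed as 1 / tan(x); `compute_cot64`, `cot_scalar` and `check_scalar_edge_cases` are hypothetical names for illustration (only `compute_cot32` appears in the PR), and `None` stands in for SQL NULL.

```rust
use std::f64::consts::PI;

/// Assumed definition for this sketch: cot(x) = 1 / tan(x).
fn compute_cot64(x: f64) -> f64 {
    1.0 / x.tan()
}

/// Hypothetical scalar entry point; `None` models SQL NULL.
fn cot_scalar(x: Option<f64>) -> Option<f64> {
    x.map(compute_cot64)
}

/// Edge-case checks for the scalar path.
fn check_scalar_edge_cases() {
    // NULL in, NULL out.
    assert_eq!(cot_scalar(None), None);

    // tan(0.0) is 0.0, so the reciprocal is positive infinity.
    assert_eq!(cot_scalar(Some(0.0)), Some(f64::INFINITY));

    // PI as an f64 is not exactly pi, so tan(PI) is a tiny negative
    // number and the reciprocal is a very large finite negative value.
    let at_pi = cot_scalar(Some(PI)).unwrap();
    assert!(at_pi.is_finite() && at_pi < -1.0e15);

    // Sanity check at a well-behaved point: cot(pi / 4) is 1.
    let at_quarter_pi = cot_scalar(Some(PI / 4.0)).unwrap();
    assert!((at_quarter_pi - 1.0).abs() < 1e-12);
}
```

The expected values here follow from the 1 / tan(x) assumption, not from the PR itself; the real function may define the behavior at 0.0 differently.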
The existing sqllogictests should already cover the functionality. Aren't the changes just an optimization?
There is no .slt test for the non-Spark cot function:
❯ rg cot datafusion/sqllogictest/
datafusion/sqllogictest/test_files/spark/math/cot.slt
24:## Original Query: SELECT cot(1);
27:#SELECT cot(1::int);
datafusion/sqllogictest/test_files/aggregates_topk.slt
203:('y', 'apricot'),
datafusion/sqllogictest/test_files/imdb.slt
850: (24, 'Ridley Scott', NULL, NULL, 'm', NULL, NULL, NULL, NULL),
Or maybe datafusion/sqllogictest/test_files/spark/math/cot.slt is not really for Spark because I see no cot in https://github.com/apache/datafusion/blob/main/datafusion/spark/src/function/math/mod.rs
Anyway, https://github.com/apache/datafusion/blob/main/datafusion/sqllogictest/test_files/spark/math/cot.slt contains only commented out code, so there are no SLT tests for cot.
Added unit tests for these. Thanks for the feedback.
datafusion/functions/src/math/cot.rs
Outdated
        .unary::<_, Float32Type>(|x: f32| compute_cot32(x)),
    ) as ArrayRef),
    other => exec_err!("Unsupported data type {other:?} for function cot"),
    let return_type = args.return_type().clone();
This variable is used just once. It could be moved inside the if scalar.is_null() { block so that the clone only happens when the value is actually needed.
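The suggestion can be sketched as follows. Everything here is a hypothetical stand-in written with the standard library only (`SketchType`, `SketchArgs`, `SketchOutput` and `cot_scalar` are not DataFusion types), and cot is assumed to be 1 / tan(x). The point is only where the clone happens.

```rust
/// Hypothetical stand-in for a data type descriptor that is costly to clone.
#[derive(Clone, Debug, PartialEq)]
enum SketchType {
    Float32,
    Float64,
}

/// Hypothetical stand-in for the function arguments.
struct SketchArgs {
    return_type: SketchType,
    /// `None` models a NULL scalar input.
    scalar: Option<f64>,
}

impl SketchArgs {
    fn return_type(&self) -> &SketchType {
        &self.return_type
    }
}

#[derive(Debug, PartialEq)]
enum SketchOutput {
    /// A typed NULL result; this is the only case that needs the type.
    Null(SketchType),
    Value(f64),
}

fn cot_scalar(args: &SketchArgs) -> SketchOutput {
    match args.scalar {
        // The clone happens only on the NULL branch, where it is used.
        None => SketchOutput::Null(args.return_type().clone()),
        // The common non-NULL branch never touches the return type.
        Some(x) => SketchOutput::Value(1.0 / x.tan()),
    }
}
```

With the clone hoisted above the branch, every call pays for it even though only the NULL case uses the result.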
        .unary::<_, Float32Type>(compute_cot32),
    ))),
    other => {
        internal_err!("Unexpected data type {other:?} for function cot")
Is it intentional to use internal_err!() instead of exec_err!() (old line 116)?
If we reach the other => branch, the type coercion/signature code has a bug. This should never happen in normal execution, hence internal_err.
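The distinction can be illustrated with a small standalone sketch. `SketchError`, `SketchType`, `parse_arg` and `cot_dispatch` are hypothetical names that only mirror the idea of an execution error versus an internal error; they are not DataFusion's error type or macros, and cot is assumed to be 1 / tan(x).

```rust
/// Hypothetical error type with the two categories discussed above.
#[derive(Debug, PartialEq)]
enum SketchError {
    /// A failure caused by the query or its data at run time.
    Execution(String),
    /// A broken invariant: reaching this means the engine has a bug.
    Internal(String),
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum SketchType {
    Float32,
    Float64,
    Utf8,
}

/// Bad user input is an execution error: the engine is working as designed.
fn parse_arg(raw: &str) -> Result<f64, SketchError> {
    raw.trim()
        .parse::<f64>()
        .map_err(|e| SketchError::Execution(format!("cannot parse {raw:?}: {e}")))
}

/// Type coercion is assumed to have run already, so only float types
/// should arrive here; anything else signals a bug upstream.
fn cot_dispatch(data_type: SketchType, x: f64) -> Result<f64, SketchError> {
    match data_type {
        SketchType::Float64 => Ok(1.0 / x.tan()),
        SketchType::Float32 => Ok(f64::from(1.0 / (x as f32).tan())),
        other => Err(SketchError::Internal(format!(
            "Unexpected data type {other:?} for function cot"
        ))),
    }
}
```

Keeping the two categories apart tells whoever reads the error whether to fix the query or to report a bug.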
datafusion/functions/benches/cot.rs
Outdated
.invoke_with_args(ScalarFunctionArgs {
    args: scalar_f32_args.clone(),
    arg_fields: scalar_f32_arg_fields.clone(),
    number_rows: 1,
Suggested change:
- number_rows: 1,
+ number_rows: size,
Currently the input is identical for every value of size. Maybe number_rows could be used to vary it a bit?
The benchmark loop already varies size for array benchmarks. For scalar, the point is to measure single-value performance regardless of batch size context.
In that case there is no need for the Scalar bench to be inside for size in [1024, 4096, 8192] {. Currently it executes the very same logic with the very same config three times (once for each size).
Yeah right!
datafusion/functions/benches/cot.rs
Outdated
.invoke_with_args(ScalarFunctionArgs {
    args: scalar_f64_args.clone(),
    arg_fields: scalar_f64_arg_fields.clone(),
    number_rows: 1,
Suggested change:
- number_rows: 1,
+ number_rows: size,
Thanks @kumarUjjawal & @martin-g
Which issue does this PR close?
Rationale for this change
The cot function currently converts scalar inputs to arrays before processing, even for single scalar values. This adds unnecessary overhead from array allocation and conversion. Adding a scalar fast path avoids this overhead.
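The idea behind the fast path can be sketched in a few lines of standalone Rust. `Value` is a hypothetical, f64-only stand-in for DataFusion's `ColumnarValue`, `compute_cot64` is an assumed helper name, and cot is assumed to be 1 / tan(x); this is an illustration of the approach, not the code in this PR.

```rust
/// Hypothetical stand-in for a columnar value, reduced to f64 only.
/// `None` models SQL NULL.
#[derive(Debug, PartialEq)]
enum Value {
    Scalar(Option<f64>),
    Array(Vec<Option<f64>>),
}

/// Assumed definition for this sketch: cot(x) = 1 / tan(x).
fn compute_cot64(x: f64) -> f64 {
    1.0 / x.tan()
}

fn cot(input: &Value) -> Value {
    match input {
        // Fast path: compute directly on the single value and return a
        // scalar, so no intermediate array is allocated.
        Value::Scalar(v) => Value::Scalar(v.map(compute_cot64)),
        // Array path: element-wise computation; NULL slots stay NULL.
        Value::Array(values) => {
            Value::Array(values.iter().map(|v| v.map(compute_cot64)).collect())
        }
    }
}
```

Without the scalar arm, a single input value would first be expanded into a one-element array and the result unpacked again, which is the overhead described above.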
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?