Skip to content

Fix or remove num_traits for performance #520

@LegNeato

Description

@LegNeato

We can't get rid of num_traits for silly rust reasons:
core::f32::powf is marked as #[rustc_allow_incoherent_impl], meaning it doesn't actually "exist" unless some other crate declares it. I'm not too familiar on the details, but it seems like when you have std it exists and if you only have core it's declared but doesn't exist and using it will fail with:

  error[E0599]: no function or associated item named `powf` found for type `f32` in the current scope
     --> crates/restir-shader/src/material/pbr/eval.rs:159:25
      |
  159 |     f0 + (1.0 - f0) * f32::powf(f32::clamp(1.0 - cos_theta, 0.0, 1.0), 5.0)
      |                            ^^^^ function or associated item not found in `f32`
      |
      = help: items from traits can only be used if the trait is in scope

see rust-lang/rust#149347 on this sillyness

The workaround has usually been this use statement:

#[cfg(target_arch = "spirv")]
use spirv_std::num_traits::Float;

num_traits with std feature will call std functions and with libm feature / in #[no_std] environments call the appropriate libm function. For SPIR-V, our spirv-std configures the feature for you and we then intercept those libm calls and replace them with intrinsics.

While researching this, I've noticed this has been discussed in 2021 on the embark repo already, and their conclusion was:

I think this is a fully external issue, then - rust-gpu is working "as intended", faithfully compiling the code that it is given, which doesn't use an intrinsic. The way to fix this, then, would be to change upstream code to be less branchy (e.g. requesting that libm adds a powi intrinsic, and having num-traits call it).

I don't mind if we want to merge this anyway, so it's not a trap to end users @LegNeato.

But num-traits isn't a perfect solution either: Technically, it doesn't actually reimplement all f32 functions. For example, it excludes euclid, for which you need to use the Euclid trait, which takes args by reference instead of by value, so you'll get awful code like this (which initially sparked my investigation into this last summer):

#[cfg(feature = "std")]
let rem = |x: f32| x.rem_euclid(1.);
#[cfg(not(feature = "std"))]
let rem = |x: f32| x.rem_euclid(&1.);

Originally posted by @Firestar99 in #518 (comment)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions