Skip to content

Conversation

@fitzgen
Copy link
Member

@fitzgen fitzgen commented Dec 12, 2025

This new Error has an API that is 99% identical to anyhow::Error's API, but additionally handles memory exhaustion.

This commit only introduces the wasmtime_internal_error crate into our workspace, along with its regular tests and OOM tests. This commit does not, however, migrate Wasmtime's internals or public-facing API over to the new error type yet. That is left for follow up work.

In order to continue fitting Result<(), Error> in one word, there is quite a bit of unsafe code in Error's implementation, mostly surrounding the manual creation of our own moral equivalent of Box<dyn Error> with explicit vtables and type erasure so that we get a thin pointer to a trait object rather than Box<dyn Error>'s fat pointer. To alleviate the associated risks, I've been testing this code under MIRI throughout its whole development, as well as thoroughly testing the API so that MIRI can dynamically exercise all the code paths. Furthermore, I've enabled testing this crate under MIRI in CI.

Part of #12069

@fitzgen fitzgen requested review from a team as code owners December 12, 2025 18:17
@fitzgen fitzgen requested review from alexcrichton and removed request for a team December 12, 2025 18:17
@fitzgen fitzgen force-pushed the wasmtime-error branch 2 times, most recently from ab7ef38 to 1329403 Compare December 12, 2025 18:33
@fitzgen
Copy link
Member Author

fitzgen commented Dec 12, 2025

I am not sure why the version bump followed by cargo vet is failing. Digging in...

https://github.com/bytecodealliance/wasmtime/actions/runs/20176633165/job/57926334414?pr=12163

@fitzgen
Copy link
Member Author

fitzgen commented Dec 12, 2025

I am not sure why the version bump followed by cargo vet is failing. Digging in...

Third bullet point over in #12164

This new `Error` has an API that is 99% identical to `anyhow::Error`'s API, but
additionally handles memory exhaustion.

This commit only introduces the `wasmtime_internal_error` crate into our
workspace, along with its regular tests and OOM tests. This commit does not,
however, migrate Wasmtime's internals or public-facing API over to the new error
type yet. That is left for follow up work.

In order to continue fitting `Result<(), Error>` in one word, there is quite a
bit of unsafe code in `Error`'s implementation, mostly surrounding the manual
creation of our own moral equivalent of `Box<dyn Error>` with explicit vtables
and type erasure so that we get a thin pointer to a trait object rather than
`Box<dyn Error>`'s fat pointer. To alleviate the associated risks, I've been
testing this code under MIRI throughout its whole development, as well as
thoroughly testing the API so that MIRI can dynamically exercise all the code
paths. Furthermore, I've enabled testing this crate under MIRI in CI.
For compatibility with `anyhow`'s API.

This unfortunately means we have to move the actual error implementation out
into a submodule so that this function isn't visible, or else every `match .. {
Ok(_) => ... }` stops type checking.
@fitzgen
Copy link
Member Author

fitzgen commented Dec 12, 2025

Third bullet point over in #12164

Rebased on top of that PR now that it merged

@fitzgen
Copy link
Member Author

fitzgen commented Dec 12, 2025

Grrrr this bump-and-vet CI failure is really annoying me. Works when I run the exact same commands locally:

Details
nick@hamilton :: (wasmtime-error) :: ~/wasmtime
    $ rustc scripts/publish.rs && ./publish bump-patch
bumping `component-macro-test-helpers`...
bumping `component-async-tests`...
bumping `wizer-fuzz`...
bumping `regex-test`...
bumping `regex-bench`...
bumping `uap-bench`...
bumping `wasmtime-environ-fuzz`...
bumping `wasmtime-bench-api`...
bumping `wasi-preview1-component-adapter`...
bumping `verify-component-adapter`...
bumping `byte-array-literals`...
bumping `wasi-nn-example-winml`...
bumping `wasi-nn-example-named`...
bumping `wasi-nn-example`...
bumping `classification-component-onnx`...
bumping `wasi-nn-example-pytorch`...
bumping `wasmtime-test-macros`...
bumping `wiggle-test`...
bumping `test-programs`...
bumping `test-programs-artifacts`...
bumping `wasmtime-test-util`...
bumping `wasmtime-fuzzing`...
bumping `wasm-spec-interpreter`...
bumping `wasmtime-c-api`...
bumping `cranelift-tools`...
bumping `cranelift-assembler-x64-fuzz`...
bumping `cranelift-fuzzgen`...
bumping `cranelift-filetests`...
bumping `isle-fuzz`...
bumping `veri_ir`...
bumping `veri_engine`...
bumping `islec`...
bumping `pulley-interpreter-fuzz`...
bumping `cranelift-bitset`...
  0.128.0 => 0.128.1
bumping `wasmtime-internal-math`...
bumping `pulley-macros`...
bumping `pulley-interpreter`...
bumping `cranelift-srcgen`...
  0.128.0 => 0.128.1
bumping `cranelift-assembler-x64-meta`...
  0.128.0 => 0.128.1
bumping `cranelift-assembler-x64`...
  0.128.0 => 0.128.1
bumping `cranelift-isle`...
  0.128.0 => 0.128.1
bumping `cranelift-entity`...
  0.128.0 => 0.128.1
bumping `cranelift-bforest`...
  0.128.0 => 0.128.1
bumping `cranelift-codegen-shared`...
  0.128.0 => 0.128.1
bumping `cranelift-codegen-meta`...
  0.128.0 => 0.128.1
bumping `cranelift-control`...
  0.128.0 => 0.128.1
bumping `cranelift-codegen`...
  0.128.0 => 0.128.1
bumping `cranelift-reader`...
  0.128.0 => 0.128.1
bumping `cranelift-serde`...
  0.128.0 => 0.128.1
bumping `cranelift-module`...
  0.128.0 => 0.128.1
bumping `cranelift-frontend`...
  0.128.0 => 0.128.1
bumping `cranelift-native`...
  0.128.0 => 0.128.1
bumping `cranelift-object`...
  0.128.0 => 0.128.1
bumping `cranelift-interpreter`...
  0.128.0 => 0.128.1
bumping `wasmtime-internal-jit-icache-coherence`...
bumping `wasmtime-internal-unwinder`...
bumping `cranelift-jit`...
  0.128.0 => 0.128.1
bumping `cranelift`...
  0.128.0 => 0.128.1
bumping `wiggle-generate`...
bumping `wiggle-macro`...
bumping `wasmtime-internal-error`...
bumping `wasmtime-internal-versioned-export-macros`...
bumping `wasmtime-internal-slab`...
bumping `wasmtime-internal-component-util`...
bumping `wasmtime-internal-wit-bindgen`...
bumping `wasmtime-internal-component-macro`...
bumping `wasmtime-internal-jit-debug`...
bumping `wasmtime-internal-fiber`...
bumping `wasmtime-environ`...
bumping `wasmtime-internal-wmemcheck`...
bumping `wasmtime-internal-cranelift`...
bumping `wasmtime-internal-cache`...
bumping `winch-codegen`...
bumping `wasmtime-internal-winch`...
bumping `wasmtime`...
bumping `wiggle`...
bumping `wasi-common`...
bumping `wasmtime-wasi-io`...
bumping `wasmtime-wasi`...
bumping `wasmtime-wasi-http`...
bumping `wasmtime-wasi-nn`...
bumping `wasmtime-wasi-config`...
bumping `wasmtime-wasi-keyvalue`...
bumping `wasmtime-wasi-threads`...
bumping `wasmtime-wasi-tls`...
bumping `wasmtime-wasi-tls-nativetls`...
bumping `wasmtime-wast`...
bumping `wasmtime-internal-c-api-macros`...
bumping `wasmtime-c-api-impl`...
bumping `wasmtime-wizer`...
bumping `wasmtime-cli-flags`...
bumping `wasmtime-internal-explorer`...
bumping `wasmtime-cli`...
  41.0.0 => 41.0.1
Running: `"cargo" "fetch" "--offline"`

nick@hamilton :: (wasmtime-error *) :: ~/wasmtime
    $ echo $?
0

Edit: d'oh I missed a command from CI. Can repro.

@fitzgen
Copy link
Member Author

fitzgen commented Dec 12, 2025

Looks like fbac398 did indeed fix the cargo vet issues.

@github-actions github-actions bot added the fuzzing Issues related to our fuzzing infrastructure label Dec 12, 2025
@github-actions
Copy link

Subscribe to Label Action

cc @fitzgen

Details This issue or pull request has been labeled: "fuzzing"

Thus the following users have been cc'd because of the following labels:

  • fitzgen: fuzzing

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

@fitzgen
Copy link
Member Author

fitzgen commented Dec 15, 2025

@cfallin do you feel comfortable reviewing this since Alex is out-of-office right now?

@cfallin
Copy link
Member

cfallin commented Dec 15, 2025

Sure, happy to do so!

@cfallin cfallin self-requested a review December 15, 2025 19:05
Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Partial review but I wanted to checkpoint these comments at least -- will keep reading from lib.rs onward in the new crate tomorrow!

This is very nice code overall; just little nits mostly.

use backtrace::Backtrace;
use std::{alloc::GlobalAlloc, cell::Cell, mem, ptr, time};
use wasmtime::{Error, Result};
use wasmtime_error::{Error, OutOfMemory, Result, bail};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume that we are doing this direct import from the error crate for now since it's not integrated into Wasmtime, but is it the long-term goal to reexport the public error types from wasmtime itself? If so, should we leave a TODO/FIXME here noting that?

/// until the pass (or fail).
pub fn new() -> Self {
let _ = env_logger::try_init();
wasmtime_error::disable_backtrace();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the idea that backtracing is not yet OOM-safe? (Will it be eventually or will we document the settings needed for OOM-safety, including disabling backtraces?) Can we leave a comment here describing why disable?


/// A `dyn ErrorExt` trait object that we know the concrete type of.
///
/// XXX: Must have a compatible layout with `DynError`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we have a static-assert somewhere that (instantiating for some error-type E) asserts that vtable and backtrace have the same offsets in both types? It's almost tautological today as they're right next to each other, list the same field prefix, and are repr(C), but it can't hurt to double-check if someone adds fields in the future...

}

/// Morally a `dyn ErrorExt` trait object that holds its own vtable.
///
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add a comment here describing why we need to manually implement vtables rather than (say) use a dyn Trait fat-pointer? I know there are some interesting layout-size tradeoffs you navigated here to arrive at this.

/// When the `"anyhow"` feature is enabled, there is a `From<wasmtime::Error>
/// for anyhow::Error` implementation. You can always call that implementation
/// explicitly if needed, but `?`-propagation allows the conversion to happen
/// seemlessly from functions that return a `Result<T, wasmtime::Error>` to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/seemlessly/seamlessly/

// `ErrorExt::ext_is` implementation and the casts that are performed
// using that method's return value.
#[repr(transparent)]
struct ForeignError<E>(E);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason this has to be nested within new()? I understand the encapsulation is nice (it's only instantiated here, nowhere else) but the indents are a little deep and it makes the code harder to read for me at least -- in-place struct impls are nicest when they're small or ad-hoc IMHO.

(Not a huge deal, just my aesthetic preference)

impl OomOrDynError {
const _SIZE: () = assert!(mem::size_of::<OomOrDynError>() == mem::size_of::<usize>());

// Our pointer tagging relies on this property.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very nice and clever trick, on the one hand. On the other, I worry that it feels a little brittle: for example if there is some weird platform in the future (perhaps one that we support only Pulley on) that has restrictive alignment or padding requirements, or if we change the fields of the two structs below such that they get the same alignment, or whatever, or if dangling no longer returns the first valid pointer (such that it will be unaligned for the higher alignment), this is no longer true.

In particular, I see the docs for NonNull::dangling only specify that it returns a valid, aligned pointer, but we aren't asserting here that that pointer is not aligned for DynError. Concretely, if the alignments are 8 and 16 respectively, nothing in the docs seems to disallow NonNull::<OutOfMemory>::dangling() from returning 0x10 rather than 0x08, I think? Or, randomly, some bits that turn out to be a valid box allocation address for a DynError?

The docs say:

Note that the address of the returned pointer may potentially be that of a valid pointer, which means this must not be used as a “not yet initialized” sentinel value.

So: perhaps we could do traditional LSB pointer tagging/masking? Or is that not viable for other reasons?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

fuzzing Issues related to our fuzzing infrastructure

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants