-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Introduce an OOM-handling Error type for Wasmtime
#12163
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
ab7ef38 to
1329403
Compare
|
I am not sure why the version bump followed by https://github.com/bytecodealliance/wasmtime/actions/runs/20176633165/job/57926334414?pr=12163 |
Third bullet point over in #12164 |
This new `Error` has an API that is 99% identical to `anyhow::Error`'s API, but additionally handles memory exhaustion. This commit only introduces the `wasmtime_internal_error` crate into our workspace, along with its regular tests and OOM tests. This commit does not, however, migrate Wasmtime's internals or public-facing API over to the new error type yet. That is left for follow up work. In order to continue fitting `Result<(), Error>` in one word, there is quite a bit of unsafe code in `Error`'s implementation, mostly surrounding the manual creation of our own moral equivalent of `Box<dyn Error>` with explicit vtables and type erasure so that we get a thin pointer to a trait object rather than `Box<dyn Error>`'s fat pointer. To alleviate the associated risks, I've been testing this code under MIRI throughout its whole development, as well as thoroughly testing the API so that MIRI can dynamically exercise all the code paths. Furthermore, I've enabled testing this crate under MIRI in CI.
For compatibility with `anyhow`'s API.
This unfortunately means we have to move the actual error implementation out
into a submodule so that this function isn't visible, or else every `match .. {
Ok(_) => ... }` stops type checking.
557d9f8 to
2b871b0
Compare
Rebased on top of that PR now that it merged |
|
Grrrr this bump-and-vet CI failure is really annoying me. DetailsEdit: d'oh I missed a command from CI. Can repro. |
|
Looks like fbac398 did indeed fix the |
Subscribe to Label Actioncc @fitzgen DetailsThis issue or pull request has been labeled: "fuzzing"Thus the following users have been cc'd because of the following labels:
To subscribe or unsubscribe from this label, edit the |
|
@cfallin do you feel comfortable reviewing this since Alex is out-of-office right now? |
|
Sure, happy to do so! |
cfallin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Partial review but I wanted to checkpoint these comments at least -- will keep reading from lib.rs onward in the new crate tomorrow!
This is very nice code overall; just little nits mostly.
| use backtrace::Backtrace; | ||
| use std::{alloc::GlobalAlloc, cell::Cell, mem, ptr, time}; | ||
| use wasmtime::{Error, Result}; | ||
| use wasmtime_error::{Error, OutOfMemory, Result, bail}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume that we are doing this direct import from the error crate for now since it's not integrated into Wasmtime, but is it the long-term goal to reexport the public error types from wasmtime itself? If so, should we leave a TODO/FIXME here noting that?
| /// until the pass (or fail). | ||
| pub fn new() -> Self { | ||
| let _ = env_logger::try_init(); | ||
| wasmtime_error::disable_backtrace(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the idea that backtracing is not yet OOM-safe? (Will it be eventually or will we document the settings needed for OOM-safety, including disabling backtraces?) Can we leave a comment here describing why disable?
|
|
||
| /// A `dyn ErrorExt` trait object that we know the concrete type of. | ||
| /// | ||
| /// XXX: Must have a compatible layout with `DynError`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we have a static-assert somewhere that (instantiating for some error-type E) asserts that vtable and backtrace have the same offsets in both types? It's almost tautological today as they're right next to each other, list the same field prefix, and are repr(C), but it can't hurt to double-check if someone adds fields in the future...
| } | ||
|
|
||
| /// Morally a `dyn ErrorExt` trait object that holds its own vtable. | ||
| /// |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we add a comment here describing why we need to manually implement vtables rather than (say) use a dyn Trait fat-pointer? I know there are some interesting layout-size tradeoffs you navigated here to arrive at this.
| /// When the `"anyhow"` feature is enabled, there is a `From<wasmtime::Error> | ||
| /// for anyhow::Error` implementation. You can always call that implementation | ||
| /// explicitly if needed, but `?`-propagation allows the conversion to happen | ||
| /// seemlessly from functions that return a `Result<T, wasmtime::Error>` to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/seemlessly/seamlessly/
| // `ErrorExt::ext_is` implementation and the casts that are performed | ||
| // using that method's return value. | ||
| #[repr(transparent)] | ||
| struct ForeignError<E>(E); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason this has to be nested within new()? I understand the encapsulation is nice (it's only instantiated here, nowhere else) but the indents are a little deep and it makes the code harder to read for me at least -- in-place struct impls are nicest when they're small or ad-hoc IMHO.
(Not a huge deal, just my aesthetic preference)
| impl OomOrDynError { | ||
| const _SIZE: () = assert!(mem::size_of::<OomOrDynError>() == mem::size_of::<usize>()); | ||
|
|
||
| // Our pointer tagging relies on this property. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a very nice and clever trick, on the one hand. On the other, I worry that it feels a little brittle: for example if there is some weird platform in the future (perhaps one that we support only Pulley on) that has restrictive alignment or padding requirements, or if we change the fields of the two structs below such that they get the same alignment, or whatever, or if dangling no longer returns the first valid pointer (such that it will be unaligned for the higher alignment), this is no longer true.
In particular, I see the docs for NonNull::dangling only specify that it returns a valid, aligned pointer, but we aren't asserting here that that pointer is not aligned for DynError. Concretely, if the alignments are 8 and 16 respectively, nothing in the docs seems to disallow NonNull::<OutOfMemory>::dangling() from returning 0x10 rather than 0x08, I think? Or, randomly, some bits that turn out to be a valid box allocation address for a DynError?
The docs say:
Note that the address of the returned pointer may potentially be that of a valid pointer, which means this must not be used as a “not yet initialized” sentinel value.
So: perhaps we could do traditional LSB pointer tagging/masking? Or is that not viable for other reasons?
This new
Errorhas an API that is 99% identical toanyhow::Error's API, but additionally handles memory exhaustion.This commit only introduces the
wasmtime_internal_errorcrate into our workspace, along with its regular tests and OOM tests. This commit does not, however, migrate Wasmtime's internals or public-facing API over to the new error type yet. That is left for follow up work.In order to continue fitting
Result<(), Error>in one word, there is quite a bit of unsafe code inError's implementation, mostly surrounding the manual creation of our own moral equivalent ofBox<dyn Error>with explicit vtables and type erasure so that we get a thin pointer to a trait object rather thanBox<dyn Error>'s fat pointer. To alleviate the associated risks, I've been testing this code under MIRI throughout its whole development, as well as thoroughly testing the API so that MIRI can dynamically exercise all the code paths. Furthermore, I've enabled testing this crate under MIRI in CI.Part of #12069