Cranelift: update regalloc2 to 0.15.0 to permit more VRegs. #12611
cfallin merged 2 commits into bytecodealliance:main
Conversation
This pulls in bytecodealliance/regalloc2#257 to permit more VRegs to be used in a single function body, addressing #12229 and our followup discussions about supporting function body sizes up to the standard Wasm implementation limits.

In addition to the RA2 upgrade, this also includes a bit more explicit limit-checking on the Cranelift side: note that we don't directly use `regalloc2::VReg` but instead further bitpack it into `Reg`, which is logically a sum type of `VReg`, `PReg` and `SpillSlot` (the last one needed to represent stack allocation locations on defs, e.g. on callsites with many returns). `PReg`s are packed into the beginning of the `VReg` index space, but `SpillSlot`s are distinguished by stealing the upper bit of a `u32`. This was previously not a problem given the smaller `VReg` index space, but now we need to check explicitly; hence `Reg::from_virtual_reg_checked` and its use in the lowering vreg allocator. Because the `VReg` encoding packs the class into the bottom two bits and the index into the upper 30, but we steal one bit at the top, the true limit for VReg count is actually 2^29, or 512M.

Fixes #12229.
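For readers not familiar with the packing, here is a minimal sketch of the bit layout described above; `PackedReg`, `SPILLSLOT_BIT`, and `MAX_VREG_INDEX` are illustrative names rather than Cranelift's actual definitions, and the detail of `PReg`s occupying the bottom of the index space is omitted:

```rust
/// Register classes, mirroring the two-bit class field at the bottom
/// of a regalloc2 `VReg`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum RegClass {
    Int = 0,
    Float = 1,
    Vector = 2,
}

/// A `Reg`-like sum type packed into a single `u32`:
///
///   bit 31       : set for spill slots, clear for registers
///   bits 2..=30  : register index (29 bits, hence the 2^29 limit)
///   bits 0..=1   : register class
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct PackedReg(u32);

const SPILLSLOT_BIT: u32 = 1 << 31;
const MAX_VREG_INDEX: u32 = 1 << 29;

impl PackedReg {
    /// Checked constructor in the spirit of `Reg::from_virtual_reg_checked`:
    /// refuse indices that would collide with the stolen spill-slot bit.
    fn from_virtual_reg_checked(index: u32, class: RegClass) -> Option<Self> {
        if index >= MAX_VREG_INDEX {
            return None;
        }
        Some(PackedReg((index << 2) | class as u32))
    }

    /// Spill slots reuse the same `u32`, distinguished by the top bit.
    fn from_spillslot(slot: u32) -> Self {
        debug_assert!(slot & SPILLSLOT_BIT == 0);
        PackedReg(SPILLSLOT_BIT | slot)
    }

    fn is_spillslot(self) -> bool {
        self.0 & SPILLSLOT_BIT != 0
    }

    /// Class bits; only meaningful when this is not a spill slot.
    fn class(self) -> RegClass {
        match self.0 & 0b11 {
            0 => RegClass::Int,
            1 => RegClass::Float,
            _ => RegClass::Vector,
        }
    }

    /// Virtual-register index; only meaningful when this is not a spill slot.
    fn vreg_index(self) -> u32 {
        (self.0 & !SPILLSLOT_BIT) >> 2
    }
}
```

In this sketch, any index of 2^29 or above makes the checked constructor return `None`, which is the case the new explicit check in the lowering vreg allocator has to handle.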
The failing test (
I think that test already takes forever to compile/run in debug mode on all platforms, so jettisoning that test doesn't seem unreasonable to me, and agreed that in theory it should not be possible to write. Maybe it's sufficient to have a local script to run or some online corpus of "must compile big modules" or something like that? Either that or we could create a dedicated test job for "compile wasmtime in release on one fast platform and compile big modules" to make sure everything succeeds.
This updates the `ValueDataPacked` scheme from the old
```
(enum tag) (CLIF type) (value 1) (value 2)
/// | tag:2 | type:14 | x:24 | y:24 |
```
encoding in a `u64` to a new
```
/// | tag:2 | type:14 | x:32 | y:32 |
```
encoding, with a `packed` tag attribute to ensure the struct fits in
10 bytes. This permits the full range of `Value` (a `u32` entity
index) to be encoded, removing the remaining major limit on function
body size after the work in #12611 to address #12229.
Curiously, this appears to be a *speedup* in compile time of 3-5% on
bz2 and 3% on spidermonkey-json (Sightglass, 50 data points each). My
best guess as to why is that putting the value fields in their own
`u32`s allows for quick access without shifts/masks, which is actually
better than the unaligned accesses (caused by 10-byte size) -- which
have no penalty on modern mainstream CPUs -- and 25% size inflation of
the value-definitions array.
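For concreteness, here is a minimal sketch of what a packed 10-byte layout along these lines can look like; the struct and accessor names are mine, and placing the tag in the top two bits of the 16-bit word is an assumption, so this is not the actual `ValueDataPacked` in cranelift-codegen:

```rust
/// | tag:2 | type:14 | x:32 | y:32 |  -- 80 bits total.
///
/// `repr(packed)` drops the padding that `u32` alignment would
/// otherwise force, so the struct is exactly 10 bytes.
#[repr(packed)]
#[derive(Clone, Copy)]
struct ValueDataPackedSketch {
    /// Two-bit enum tag and 14-bit CLIF type, packed into one `u16`.
    tag_and_ty: u16,
    /// First 32-bit payload, wide enough for a full `Value` entity index.
    x: u32,
    /// Second 32-bit payload.
    y: u32,
}

impl ValueDataPackedSketch {
    fn new(tag: u16, ty: u16, x: u32, y: u32) -> Self {
        debug_assert!(tag < (1 << 2) && ty < (1 << 14));
        ValueDataPackedSketch {
            tag_and_ty: (tag << 14) | ty,
            x,
            y,
        }
    }

    fn tag(self) -> u16 {
        self.tag_and_ty >> 14
    }

    fn ty(self) -> u16 {
        self.tag_and_ty & 0x3fff
    }

    /// The payloads are plain `u32` fields, read without shifts or masks.
    fn x(self) -> u32 {
        self.x
    }

    fn y(self) -> u32 {
        self.y
    }
}

// The point of `packed`: 2 + 4 + 4 bytes with no padding.
const _: () = assert!(core::mem::size_of::<ValueDataPackedSketch>() == 10);
```

The `x`/`y` accessors being plain field reads is the "no shifts/masks" property credited above for the compile-time win, traded against unaligned loads and the 8-to-10-byte (25%) growth of each entry.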
Sure thing -- I'll push the "corpus of big inputs to test" as followup (after #12613 lands as well); removed the