@owenhilyard
Contributor

Many AMD iGPUs are paired with a dGPU, and the BIOS has the option to disable the iGPU. Doing so can work around a variety of problems with software that does not handle multiple GPUs. When the iGPU is disabled, rocm-smi prints an error message to stderr because the amdgpu driver is not initialized, but still returns 0. Checking that data was actually returned fixes the crash for those with this hardware configuration (amd-smi correctly returns 255 in this case).
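
For readers who want to see the shape of the check, here is a minimal Python sketch. It is not the code from this PR (the diff is not shown here), and `query_tool` is a hypothetical helper. It assumes only what the description says: the tool may exit with status 0 while writing nothing but an error to stderr, so an empty stdout has to be treated the same as a failure. The command is passed in as a parameter so that no particular rocm-smi or amd-smi flags are assumed.

```python
import subprocess
import sys


def query_tool(cmd):
    """Run a GPU query command and return its non-empty stdout lines.

    Returns [] when the tool is missing, exits non-zero, or exits 0
    without printing any data (the disabled-iGPU case described above).
    """
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    except FileNotFoundError:
        # Tool is not installed on this machine.
        return []
    if result.returncode != 0:
        # Tool reported failure through its exit status.
        return []
    # Exit status 0 alone is not trusted: keep only lines that carry data.
    return [line for line in result.stdout.splitlines() if line.strip()]


# Stand-in commands that imitate the three behaviours (hypothetical output):
ERROR_BUT_ZERO = [sys.executable, "-c",
                  "import sys; print('driver not initialized', file=sys.stderr)"]
FAILS_WITH_255 = [sys.executable, "-c", "import sys; sys.exit(255)"]
PRINTS_DATA = [sys.executable, "-c", "print('some-gpu')"]

if __name__ == "__main__":
    for cmd in (ERROR_BUT_ZERO, FAILS_WITH_255, PRINTS_DATA):
        print(query_tool(cmd))
```

A caller would treat an empty list as "no usable AMD GPU found" and carry on, rather than trying to parse output that is not there.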

Signed-off-by: Owen Hilyard <hilyard.owen@gmail.com>
@keith keith merged commit 0e1a699 into modular:main Jan 5, 2026
8 checks passed
@keith
Contributor

keith commented Jan 5, 2026

thanks!

modularbot pushed a commit to modular/modular that referenced this pull request Jan 6, 2026
To pull in modular/rules_mojo#69

MODULAR_ORIG_COMMIT_REV_ID: 4b48679f15145911d973acf0df57d5cf474e62a7

2 participants