wsl: report a single "all" device to kubelet#1671

Open
elezar wants to merge 1 commit into NVIDIA:main from elezar:wsl-single-all-device

Conversation

Member

elezar commented Mar 21, 2026

On WSL, there is no isolation between the GPUs on a system, because they are all accessed through the same /dev/dxg device. The NVIDIA Container Toolkit reflects this by always generating a CDI spec that contains a single all device.

This is incompatible with the device plugin when a CDI-based device list strategy is used, since the device name reported by the plugin includes the device UUID or index, neither of which matches the all device in the generated spec.

The change in this PR ensures that the device plugin always reports a single device whose UUID and index are both all, so that the reported device is compatible with the generated CDI spec.
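
To illustrate the idea, here is a minimal sketch of how a stateless "all" device could collapse a device map to a single entry. It is not the code from this PR: the `Device` interface, `buildDeviceMap`, and `cdiDeviceName` are hypothetical simplifications, and the `nvidia.com/gpu` kind is assumed to be the one used in the generated CDI spec.

```go
package main

// Device is a hypothetical, simplified stand-in for the plugin's device
// abstraction; the real interfaces in the device plugin differ.
type Device interface {
	UUID() string
	Index() string
	Paths() []string
}

// wslAllGPUsDevice carries no per-GPU state: every instance describes the
// same "all" device backed by /dev/dxg.
type wslAllGPUsDevice struct{}

func (wslAllGPUsDevice) UUID() string    { return "all" }
func (wslAllGPUsDevice) Index() string   { return "all" }
func (wslAllGPUsDevice) Paths() []string { return []string{"/dev/dxg"} }

// buildDeviceMap (hypothetical) keys devices by UUID. Because every WSL
// device reports the same UUID, any number of physical GPUs collapses to a
// single map entry.
func buildDeviceMap(devices []Device) map[string]Device {
	m := make(map[string]Device)
	for _, d := range devices {
		m[d.UUID()] = d
	}
	return m
}

// cdiDeviceName (hypothetical) builds a fully-qualified CDI device name of
// the form <kind>=<name> from the device UUID.
func cdiDeviceName(kind string, d Device) string {
	return kind + "=" + d.UUID()
}
```

With two physical GPUs, `buildDeviceMap` would return a map with one entry, and `cdiDeviceName("nvidia.com/gpu", ...)` would resolve to `nvidia.com/gpu=all`, which is the only device name the WSL CDI spec defines.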

elezar force-pushed the wsl-single-all-device branch from 9beb67b to b55dfe1 on March 21, 2026 20:51
On WSL, all GPUs are accessed through /dev/dxg. Replace the per-GPU
wslDevice (which reported one device per physical GPU with individual
UUIDs) with a stateless wslAllGPUsDevice that always returns UUID "all"
and path "/dev/dxg". This causes the device map to collapse to a single
entry per resource, so kubelet sees exactly one GPU device on WSL.

When allocated, this flows naturally through all strategy paths
(envvar, CDI, volume mounts) to set NVIDIA_VISIBLE_DEVICES=all, which
is what nvidia-container-runtime on WSL expects.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Evan Lezar <elezar@nvidia.com>