Fix build failure with CUDA < 13.0: unused private fields in symmetric_tensor.h #6032

Open
x41lakazam wants to merge 1 commit into main from fix/symmetric-tensor-unused-fields
Conversation

@x41lakazam
Collaborator

…cTensor

Building with CUDA 12.x and Clang (`-Werror,-Wunused-private-field`) fails with
5 errors in `symmetric_tensor.h`:

```
error: private field 'mcast_handle_' is not used [-Werror,-Wunused-private-field]
error: private field 'cu_dev_' is not used [-Werror,-Wunused-private-field]
error: private field 'mc_base_ptr_' is not used [-Werror,-Wunused-private-field]
error: private field 'exporter_rank_' is not used [-Werror,-Wunused-private-field]
error: private field 'peer_fd_' is not used [-Werror,-Wunused-private-field]
```

These five fields are only used inside `#if (CUDA_VERSION >= 13000)` blocks
(in the destructor and `setupMulticast`), but they are declared unconditionally.
When building with CUDA 12.x, the usage is compiled out but the declarations
remain, triggering Clang's `-Wunused-private-field`.

`mc_ptr_` and `is_multicast_setup_` are intentionally left outside the guard
because they are used unconditionally in `multicastPtr()`.
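
The guard placement described above can be sketched roughly as follows. This is a simplified illustration, not the actual header: `SymmetricTensorSketch` is a hypothetical class, plain integer and pointer types stand in for the CUDA driver types, the fallback `CUDA_VERSION` definition exists only so the snippet compiles without the CUDA toolkit, and the accessor returns `nullptr` instead of performing the real check.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: CUDA_VERSION normally comes from <cuda.h>. A stand-in value for
// CUDA 12.4 is defined here so the sketch compiles without the CUDA toolkit.
#ifndef CUDA_VERSION
#define CUDA_VERSION 12040
#endif

// Hypothetical, simplified stand-in for the real class.
class SymmetricTensorSketch {
 public:
  ~SymmetricTensorSketch() {
#if (CUDA_VERSION >= 13000)
    // The only code that touches the guarded fields sits behind the same
    // version check as their declarations.
    if (is_multicast_setup_) {
      mc_base_ptr_ = nullptr;
      mcast_handle_ = 0;
      cu_dev_ = 0;
      exporter_rank_ = -1;
      peer_fd_ = -1;
    }
#endif
  }

  // Compiled for every CUDA version, so the fields it reads stay unguarded.
  // Simplification: returns nullptr when multicast was never set up.
  void* multicastPtr() const {
    assert(is_multicast_setup_ || mc_ptr_ == nullptr);
    return is_multicast_setup_ ? mc_ptr_ : nullptr;
  }

 private:
#if (CUDA_VERSION >= 13000)
  // Declared only when the code that uses them is also compiled.
  std::uint64_t mcast_handle_ = 0;
  int cu_dev_ = 0;
  void* mc_base_ptr_ = nullptr;
  std::int64_t exporter_rank_ = -1;
  int peer_fd_ = -1;
#endif
  // Used unconditionally by multicastPtr(), so left outside the guard.
  void* mc_ptr_ = nullptr;
  bool is_multicast_setup_ = false;
};
```

With the declarations and their uses behind the same condition, a CUDA 12.x build sees neither, so Clang has no unused private field to report.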
@x41lakazam requested a review from samnordmann on March 10, 2026 at 10:28
@greptile-apps
Contributor

greptile-apps bot commented Mar 10, 2026

Greptile Summary

This PR fixes a build failure when compiling with CUDA < 13.0 and Clang's -Werror,-Wunused-private-field by wrapping the five private fields that are exclusively used inside #if (CUDA_VERSION >= 13000) blocks into matching preprocessor guards in the class declaration.

Key changes in csrc/multidevice/symmetric_tensor.h:

  • mcast_handle_, cu_dev_, mc_base_ptr_, exporter_rank_, and peer_fd_ are now declared inside #if (CUDA_VERSION >= 13000), matching their actual usage in the destructor and setupMulticast() in the .cpp file.
  • mc_ptr_ is correctly kept outside the guard: it is accessed unconditionally by multicastPtr(), which has no version check of its own. On CUDA < 13.0, multicastPtr() relies on is_multicast_setup_ staying false at runtime, so the field is never returned.
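
The runtime behavior on older CUDA versions could look roughly like the sketch below. It is illustrative only: `MulticastStateSketch` and `placeholder_storage_` are hypothetical, standard exceptions stand in for the `NVF_ERROR` / `NVF_CHECK` macros, the "not set up" message is invented, and the fallback `CUDA_VERSION` value is an assumption so the snippet compiles without the CUDA toolkit.

```cpp
#include <cassert>
#include <stdexcept>

// Assumption: stand-in for the value normally provided by <cuda.h>.
#ifndef CUDA_VERSION
#define CUDA_VERSION 12040
#endif

// Hypothetical, simplified model of the multicast state handling.
class MulticastStateSketch {
 public:
  void setupMulticast() {
#if (CUDA_VERSION >= 13000)
    // Placeholder for the real multicast mapping.
    mc_ptr_ = &placeholder_storage_;
    is_multicast_setup_ = true;
#else
    // Stand-in for NVF_ERROR: the flag is never set on this path.
    throw std::runtime_error("Multicast requires CUDA 13.0+");
#endif
  }

  // No version check here: the runtime flag alone gates access to mc_ptr_.
  void* multicastPtr() const {
    if (!is_multicast_setup_) {
      // Stand-in for NVF_CHECK; the message text is made up for this sketch.
      throw std::logic_error("multicast is not set up");
    }
    assert(mc_ptr_ != nullptr);
    return mc_ptr_;
  }

 private:
#if (CUDA_VERSION >= 13000)
  int placeholder_storage_ = 0;  // hypothetical target of the mapping
#endif
  void* mc_ptr_ = nullptr;
  bool is_multicast_setup_ = false;
};
```

Because `setupMulticast()` fails before setting the flag on CUDA < 13.0, `multicastPtr()` never hands out the uninitialized pointer even though its declaration is unguarded.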

Confidence Score: 5/5

  • This PR is safe to merge — it is a minimal, correct fix for a Clang diagnostic with no behavioral changes.
  • The change is purely a header-level preprocessor guard adjustment. The fields being guarded are only ever accessed inside #if (CUDA_VERSION >= 13000) blocks in the implementation, so guarding their declarations is correct. The one field that needs to remain unconditional (mc_ptr_) is correctly left outside the guard. There is no change to runtime behavior, memory layout concerns are limited to compile-time selection, and the fix exactly matches the stated problem.
  • No files require special attention.

Important Files Changed

Filename Overview
csrc/multidevice/symmetric_tensor.h Wraps five private fields (mcast_handle_, cu_dev_, mc_base_ptr_, exporter_rank_, peer_fd_) inside #if (CUDA_VERSION >= 13000) guards while correctly keeping mc_ptr_ unconditional since it is returned by the non-guarded multicastPtr() accessor.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[SymmetricTensor fields] --> B{CUDA_VERSION >= 13000?}

    B -- Yes --> C[All fields available:\n- mcast_handle_\n- cu_dev_\n- mc_base_ptr_\n- exporter_rank_\n- peer_fd_\n- mc_ptr_]
    B -- No --> D[Only unconditional fields:\n- mc_ptr_\n- is_multicast_setup_]

    C --> E[setupMulticast runs fully]
    D --> F[setupMulticast throws NVF_ERROR\n'Multicast requires CUDA 13.0+']

    E --> G[multicastPtr returns mc_ptr_]
    F --> H[is_multicast_setup_ stays false\nmulticastPtr NVF_CHECK fails]

    G --> I[Destructor: cleans up\nmc_base_ptr_, mcast_handle_,\npeer_fd_ inside #if guard]
    H --> J[Destructor: skips multicast\ncleanup block entirely]

Last reviewed commit: d7ad13a

@x41lakazam
Collaborator Author

!test

@samnordmann
Collaborator

!test
