Add SVE2 BitPerm intrinsics (BDEP, BEXT, BGRP)#2069

Open
ADunfield wants to merge 1 commit into rust-lang:main from ADunfield:sve2-bitperm

Conversation

ADunfield commented Mar 31, 2026

Summary

Add Rust bindings for the three SVE2 bit-permutation instructions (FEAT_SVE_BitPerm):

  • BDEP (svbdep_u{8,16,32,64}) — Bit Deposit: scatter consecutive low bits to mask-selected positions. SVE2 equivalent of x86 BMI2 PDEP.
  • BEXT (svbext_u{8,16,32,64}) — Bit Extract: gather bits from mask-selected positions into consecutive low bits. SVE2 equivalent of x86 BMI2 PEXT.
  • BGRP (svbgrp_u{8,16,32,64}) — Bit Group: partition bits into two groups by mask (selected bits packed low, unselected packed high). No x86 equivalent.

12 intrinsics total, all gated by #[target_feature(enable = "sve2-bitperm")].
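
To make the three operations concrete, here is a minimal scalar sketch of what each instruction does to a single 8-bit element. The function names (`bdep_u8`, `bext_u8`, `bgrp_u8`) are hypothetical and chosen for illustration; this is not the code in the PR, only a model of the semantics described in the list above.

```rust
/// Scalar model of BDEP on one 8-bit element (hypothetical helper):
/// consecutive low bits of `src` are deposited, in order, into the
/// positions where `mask` has a 1 bit. All other result bits are 0.
fn bdep_u8(src: u8, mask: u8) -> u8 {
    let mut out = 0u8;
    let mut k: u32 = 0; // index of the next low bit of `src` to consume
    for i in 0..8u32 {
        if (mask >> i) & 1 == 1 {
            out |= ((src >> k) & 1) << i;
            k += 1;
        }
    }
    out
}

/// Scalar model of BEXT (hypothetical helper): bits of `src` at the
/// mask-selected positions are gathered into consecutive low bits.
fn bext_u8(src: u8, mask: u8) -> u8 {
    let mut out = 0u8;
    let mut k: u32 = 0; // index of the next low result bit to fill
    for i in 0..8u32 {
        if (mask >> i) & 1 == 1 {
            out |= ((src >> i) & 1) << k;
            k += 1;
        }
    }
    out
}

/// Scalar model of BGRP (hypothetical helper): selected bits are packed
/// low, unselected bits are packed directly above them.
fn bgrp_u8(src: u8, mask: u8) -> u8 {
    let selected = mask.count_ones();
    let low = bext_u8(src, mask);
    let high = bext_u8(src, !mask);
    // `checked_shl` returns None when all 8 bits are selected; in that
    // case there are no unselected bits to place, so use 0.
    low | high.checked_shl(selected).unwrap_or(0)
}
```

For example, with mask `0b0011_1000`, `bdep_u8(0b0000_0101, mask)` gives `0b0010_1000`, and `bext_u8` of that result with the same mask returns `0b0000_0101`. With mask `0b0101_0101`, `bgrp_u8(0b1100_1010, mask)` gives `0b1011_1000`.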

Implementation

  • SVE types (svuint8_t through svint64_t): defined using #[rustc_scalable_vector(N)] from rust#143924
  • LLVM intrinsics: llvm.aarch64.sve.{bdep,bext,bgrp}.x.nxv{N}i{M} — verified correct via Clang assembly output on Graviton 4
  • Feature gates: stdarch_aarch64_sve (types), stdarch_aarch64_sve2_bitperm (intrinsics)
  • Scalar _n_ variants deferred until base SVE1 intrinsics (svdup_n_u*) land
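
As a reading aid for the `llvm.aarch64.sve.{bdep,bext,bgrp}.x.nxv{N}i{M}` template, the sketch below expands it into the 12 concrete names. It assumes that N is the lane count per 128-bit granule (N = 128 / M), which matches LLVM's scalable vector types such as `nxv16i8` and `nxv2i64`. The helper name `llvm_intrinsic_names` is hypothetical and not part of the PR.

```rust
/// Expand the intrinsic-name template from the PR description into
/// concrete names. Assumption: N = 128 / M, i.e. lanes per 128-bit
/// granule of a scalable vector.
fn llvm_intrinsic_names() -> Vec<String> {
    const OPS: [&str; 3] = ["bdep", "bext", "bgrp"];
    const ELEMENT_BITS: [u32; 4] = [8, 16, 32, 64];

    let mut names = Vec::new();
    for op in OPS {
        for bits in ELEMENT_BITS {
            let lanes = 128 / bits;
            names.push(format!("llvm.aarch64.sve.{op}.x.nxv{lanes}i{bits}"));
        }
    }
    names
}
```

The first entry is `llvm.aarch64.sve.bdep.x.nxv16i8` and the last is `llvm.aarch64.sve.bgrp.x.nxv2i64`, one name per intrinsic listed in the Summary.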

Testing

  • 13 scalar reference tests verifying BDEP/BEXT/BGRP correctness (exhaustive u8 roundtrip, property-based inverse tests)
  • LLVM intrinsic names validated via Clang -emit-llvm on aarch64
  • Assembly output confirmed (bdep z0.d, z0.d, z1.d etc.) via Clang -O2 on AWS Graviton 4 (c8g, Neoverse V2)
  • cargo check --target aarch64-unknown-linux-gnu passes on nightly 1.96.0
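
The exhaustive u8 roundtrip mentioned above could look roughly like the following. This is a sketch under assumptions, not the PR's test code: the scalar `bdep`/`bext` helpers and `count_roundtrip_failures` are hypothetical names, and the two properties checked are the usual deposit/extract inverses.

```rust
/// Hypothetical scalar reference for BEXT on u8: walk the set bits of
/// `mask` from lowest to highest and pack the matching `src` bits low.
fn bext(src: u8, mut mask: u8) -> u8 {
    let mut out = 0u8;
    let mut k: u32 = 0;
    while mask != 0 {
        let lowest = mask & mask.wrapping_neg(); // isolate lowest set bit
        if src & lowest != 0 {
            out |= 1 << k;
        }
        k += 1;
        mask &= mask - 1; // clear lowest set bit
    }
    out
}

/// Hypothetical scalar reference for BDEP on u8: place the k-th low bit
/// of `src` at the position of the k-th set bit of `mask`.
fn bdep(src: u8, mut mask: u8) -> u8 {
    let mut out = 0u8;
    let mut k: u32 = 0;
    while mask != 0 {
        let lowest = mask & mask.wrapping_neg();
        if (src >> k) & 1 == 1 {
            out |= lowest;
        }
        k += 1;
        mask &= mask - 1;
    }
    out
}

/// Check both inverse properties for every (value, mask) pair of u8 and
/// return the number of pairs that violate either one.
fn count_roundtrip_failures() -> u32 {
    let mut failures = 0;
    for x in 0..=u8::MAX {
        for m in 0..=u8::MAX {
            // Extract then deposit restores the mask-selected bits of x.
            let restores_selected = bdep(bext(x, m), m) == x & m;
            // Deposit then extract restores the low popcount(m) bits of x.
            let low_bits = ((1u16 << m.count_ones()) - 1) as u8;
            let restores_low = bext(bdep(x, m), m) == x & low_bits;
            if !(restores_selected && restores_low) {
                failures += 1;
            }
        }
    }
    failures
}
```

With only 65,536 input pairs, the u8 case can be checked in full; wider element sizes would need sampled or property-based inputs instead.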

Dependencies

Hardware tested on

  • AWS Graviton 4 (c8g.medium, Neoverse V2) — /proc/cpuinfo confirms svebitperm feature flag
  • AWS Graviton 3 (c7g) has SVE but NOT SVE2/BitPerm
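
One way to repeat the /proc/cpuinfo check programmatically is sketched below. The helper `cpuinfo_has_feature` is a hypothetical name, and it assumes the aarch64 Linux layout in which supported features appear on a line starting with `Features` followed by a colon and a space-separated flag list.

```rust
/// Return true if any `Features` line in the given /proc/cpuinfo text
/// lists `flag` as a whole word. Hypothetical helper for illustration.
fn cpuinfo_has_feature(cpuinfo: &str, flag: &str) -> bool {
    cpuinfo
        .lines()
        .filter_map(|line| line.split_once(':'))
        .filter(|(key, _)| key.trim() == "Features")
        .any(|(_, value)| value.split_whitespace().any(|f| f == flag))
}
```

On a Linux machine the input would typically come from `std::fs::read_to_string("/proc/cpuinfo")`, with `"svebitperm"` as the flag to look for.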

Add Rust bindings for the three SVE2 bit-permutation instructions
(FEAT_SVE_BitPerm): BDEP (bit deposit), BEXT (bit extract), and
BGRP (bit group). BDEP and BEXT are the scalable vector equivalents
of x86 BMI2 PDEP/PEXT; BGRP has no x86 equivalent.

Intrinsics added (12 total):
- svbdep_u{8,16,32,64} — scatter source bits to mask positions
- svbext_u{8,16,32,64} — gather bits from mask positions
- svbgrp_u{8,16,32,64} — partition bits into two groups by mask

All gated by #[target_feature(enable = "sve2-bitperm")] and the
unstable feature flag stdarch_aarch64_sve2_bitperm.

Also adds SVE scalable vector type definitions (svuint8_t through
svint64_t) using #[rustc_scalable_vector(N)] from rust#143924,
gated by stdarch_aarch64_sve.

Tested on AWS Graviton 4 (c8g), which has FEAT_SVE_BitPerm.
LLVM intrinsic names verified via Clang -emit-llvm; emitted
instructions confirmed via Clang -O2 assembly output.
Scalar reference tests pass on both x86_64 and aarch64.
@rustbot
Collaborator

rustbot commented Mar 31, 2026

r? @sayantn

rustbot has assigned @sayantn.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

Why was this reviewer chosen?

The reviewer was selected based on:

  • Owners of files modified in this PR: @Amanieu, @folkertdev, @sayantn
  • @Amanieu, @folkertdev, @sayantn expanded to Amanieu, folkertdev, sayantn
  • Random selection from Amanieu, folkertdev, sayantn

@folkertdev
Contributor

Does this even make sense yet? The Arm work in the main repo is still ongoing. We might also want to generate this from the YAML specification instead?
