Add SVE2 BitPerm intrinsics (BDEP, BEXT, BGRP) #2069
Open
ADunfield wants to merge 1 commit into rust-lang:main from
Add Rust bindings for the three SVE2 bit-permutation instructions
(FEAT_SVE_BitPerm): BDEP (bit deposit), BEXT (bit extract), and
BGRP (bit group). BDEP and BEXT are the scalable-vector equivalents
of x86 BMI2 PDEP and PEXT; BGRP has no x86 equivalent.
Intrinsics added (12 total):
- svbdep_u{8,16,32,64} — scatter source bits to mask positions
- svbext_u{8,16,32,64} — gather bits from mask positions
- svbgrp_u{8,16,32,64} — partition bits into two groups by mask
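To make the three operations concrete, here is a minimal scalar sketch of what each one does to a single 64-bit lane. The function names `bdep_u64`, `bext_u64`, and `bgrp_u64` are hypothetical and chosen for illustration; this is not the code in this PR, and the real intrinsics apply the same operation independently to every element of a scalable vector.

```rust
/// Scalar model of BDEP on one 64-bit lane (hypothetical helper):
/// the k-th low bit of `src` is written to the k-th set bit of `mask`.
pub fn bdep_u64(src: u64, mask: u64) -> u64 {
    let mut out = 0u64;
    let mut m = mask;
    let mut k = 0u32; // index of the next source bit to deposit
    while m != 0 {
        let lowest = m & m.wrapping_neg(); // isolate lowest set mask bit
        if (src >> k) & 1 == 1 {
            out |= lowest;
        }
        m &= m - 1; // clear that mask bit
        k += 1;
    }
    out
}

/// Scalar model of BEXT on one 64-bit lane (hypothetical helper):
/// bits of `src` at the set positions of `mask` are packed into the low bits.
pub fn bext_u64(src: u64, mask: u64) -> u64 {
    let mut out = 0u64;
    let mut m = mask;
    let mut k = 0u32; // index of the next output bit
    while m != 0 {
        let lowest = m & m.wrapping_neg();
        if src & lowest != 0 {
            out |= 1u64 << k;
        }
        m &= m - 1;
        k += 1;
    }
    out
}

/// Scalar model of BGRP on one 64-bit lane (hypothetical helper):
/// mask-selected bits are packed low, the remaining bits are packed above them.
pub fn bgrp_u64(src: u64, mask: u64) -> u64 {
    let selected = bext_u64(src, mask);
    let unselected = bext_u64(src, !mask);
    // checked_shl avoids an overflowing shift when all 64 mask bits are set.
    let high = unselected.checked_shl(mask.count_ones()).unwrap_or(0);
    selected | high
}

fn main() {
    // Mask selects bit positions 1, 3 and 4.
    assert_eq!(bdep_u64(0b101, 0b11010), 0b10010);
    assert_eq!(bext_u64(0b10010, 0b11010), 0b101);
    // Odd positions are selected and packed low; even positions go high.
    assert_eq!(bgrp_u64(0b1011_0110, 0b1010_1010), 0b0110_1101);
    println!("scalar models agree with the worked examples");
}
```

A model like this can double as an oracle when checking the vector intrinsics lane by lane, since it runs on any architecture.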
All gated by #[target_feature(enable = "sve2-bitperm")] and the
unstable feature flag stdarch_aarch64_sve2_bitperm.
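Because the intrinsics are gated by a target feature, callers would normally confirm at runtime that the CPU supports it before entering code compiled with that feature. The following is a sketch of such a check, assuming the `"sve2-bitperm"` name is accepted by `std::arch::is_aarch64_feature_detected!` on the toolchain in use; the helper `has_sve2_bitperm` is hypothetical and not part of this PR.

```rust
/// Hypothetical helper: reports whether the running CPU advertises
/// SVE2 BitPerm. Assumes the detection macro accepts "sve2-bitperm".
#[cfg(target_arch = "aarch64")]
pub fn has_sve2_bitperm() -> bool {
    std::arch::is_aarch64_feature_detected!("sve2-bitperm")
}

/// On every other architecture the feature cannot be present.
#[cfg(not(target_arch = "aarch64"))]
pub fn has_sve2_bitperm() -> bool {
    false
}

fn main() {
    if has_sve2_bitperm() {
        // A real caller would dispatch here to a function marked
        // #[target_feature(enable = "sve2-bitperm")].
        println!("SVE2 BitPerm available: vector path can be used");
    } else {
        println!("SVE2 BitPerm not available: falling back to scalar code");
    }
}
```

Keeping the check in one small function lets the same call site compile on x86_64 and aarch64 alike.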
Also adds SVE scalable vector type definitions (svuint8_t through
svint64_t) using #[rustc_scalable_vector(N)] from rust#143924,
gated by stdarch_aarch64_sve.
Tested on AWS Graviton 4 (c8g), which has FEAT_SVE_BitPerm.
LLVM intrinsic names verified via Clang assembly output.
Scalar reference tests pass on both x86_64 and aarch64.
Collaborator
r? @sayantn
rustbot has assigned @sayantn.
Contributor
Does this even make sense yet? The Arm work in the main repo is still ongoing. We might also want to generate this from the YAML specification instead?
Summary
Add Rust bindings for the three SVE2 bit-permutation instructions (FEAT_SVE_BitPerm):
- BDEP (svbdep_u{8,16,32,64}): Bit Deposit. Scatters consecutive low bits to mask-selected positions. SVE2 equivalent of x86 BMI2 PDEP.
- BEXT (svbext_u{8,16,32,64}): Bit Extract. Gathers bits from mask-selected positions into consecutive low bits. SVE2 equivalent of x86 BMI2 PEXT.
- BGRP (svbgrp_u{8,16,32,64}): Bit Group. Partitions bits into two groups by mask (selected bits packed low, unselected packed high). No x86 equivalent.
12 intrinsics total, all gated by #[target_feature(enable = "sve2-bitperm")].
Implementation
- SVE scalable vector types (svuint8_t through svint64_t): defined using #[rustc_scalable_vector(N)] from rust#143924
- LLVM intrinsic names: llvm.aarch64.sve.{bdep,bext,bgrp}.x.nxv{N}i{M}, verified correct via Clang assembly output on Graviton 4
- Feature gates: stdarch_aarch64_sve (types), stdarch_aarch64_sve2_bitperm (intrinsics)
- _n_ variants deferred until base SVE1 intrinsics (svdup_n_u*) land
Testing
- … -emit-llvm on aarch64
- … (bdep z0.d, z0.d, z1.d etc.) via Clang -O2 on AWS Graviton 4 (c8g, Neoverse V2)
- cargo check --target aarch64-unknown-linux-gnu passes on nightly 1.96.0
Dependencies
- rust#143924: #[rustc_scalable_vector(N)] (merged)
Hardware tested on
- /proc/cpuinfo confirms the svebitperm feature flag