feat: Add SVE kernels for TopKV by morgolock · Pull Request #1256 · ARM-software/ComputeLibrary

morgolock · 2026-02-05T10:59:28Z

Change-Id: I7a0c7bd1154b9cb7f35c7fd1c3b8ad54698f8799

src/cpu/kernels/topkv/generic/sve/impl.h

gunes-arm · 2026-02-05T19:43:27Z

filelist.json

-              "src/cpu/kernels/topkv/generic/neon/qasymm8_signed.cpp"
-            ]
-          }
+  "files": {


why the indentation change?

fixed in next patch

src/cpu/kernels/topkv/generic/sve/impl.h

src/cpu/kernels/topkv/generic/sve/integer.cpp

filelist.json

gunes-arm · 2026-03-09T14:12:49Z

src/cpu/kernels/topkv/generic/sve/fp16.cpp

+inline uint32_t count_gt_block<float16_t>(const float16_t *ptr, float16_t thr, uint32_t block_elems)
+{
+    const svbool_t    pg = svwhilelt_b16(static_cast<uint64_t>(0), static_cast<uint64_t>(block_elems));
+    const svfloat16_t v  = svld1_f16(pg, ptr);
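For readers without the full diff, here is a minimal scalar sketch of what the name and signature of `count_gt_block` suggest: count the elements in a block that are strictly greater than a threshold. The name `count_gt_block_scalar` and the strict `>` semantics are assumptions inferred from the hunk above, not code from the patch. The predicate built with `svwhilelt_b16` in the SVE version plays the role of the loop bound here, masking off lanes at or beyond `block_elems`.

```cpp
#include <cstdint>

// Hypothetical scalar counterpart of count_gt_block (not part of the patch).
// Assumption: the kernel counts elements strictly greater than thr.
template <typename T>
inline std::uint32_t count_gt_block_scalar(const T *ptr, T thr, std::uint32_t block_elems)
{
    std::uint32_t count = 0;
    for (std::uint32_t i = 0; i < block_elems; ++i)
    {
        // Each iteration corresponds to one active lane of the SVE predicate.
        count += (ptr[i] > thr) ? 1u : 0u;
    }
    return count;
}
```

A scalar version like this can serve as a cross-check for a vector kernel on the same input block.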


I think I have two questions to ask:

Why do we convert to Fp32 in the Neon(TM) implementation?

We should incorporate epsilon in both implementations.

Addressed in next patch.

Why do we convert to Fp32 in the Neon(TM) implementation

We could do the computation in f16 but the epsilon must be f32 to align with ref. I can address this in a different patch. It's not a serious problem, if anything there is room to optimize the neon kernel even further.

Yes, it's a minor accuracy discrepancy. What I'm not too keen on at the moment is the difference between the implementations:

We compare in Fp16 in CPPTopKV.

We convert everything to Fp32 and do the comparison in Fp32 in the Neon implementation.

We convert the Fp32 thr+eps to Fp16 here, and do the comparison in Fp16 in the SVE implementation.

I think the ideal solution should be doing the same thing for all. The problem in ref. implementation is also something to consider.

By the way, eps in Fp16 is different and is not equal to the epsilon in Fp32 when converted to Fp16. It adds additional numerical complexity.
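The point about epsilon can be illustrated with a small sketch. It assumes "epsilon" means machine epsilon, which is 2^-23 for Fp32 and 2^-10 for Fp16; the thread does not say which value the library actually uses. The helper `round_to_fp16_precision` is hypothetical: it only emulates Fp16's 11 significant bits on a `float` and ignores Fp16 range limits and subnormals. Under those assumptions, adding the Fp32 epsilon to a threshold near 1.0 and then rounding to Fp16 precision gives back the original threshold, so the epsilon has no effect on an Fp16 comparison.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper: round x to 11 significant bits (10 stored + 1 implicit),
// i.e. Fp16 precision. Ignores Fp16 range limits and subnormals.
inline float round_to_fp16_precision(float x)
{
    if (x == 0.0f || !std::isfinite(x))
    {
        return x;
    }
    int exp = 0;
    const float mant   = std::frexp(x, &exp);  // x = mant * 2^exp, 0.5 <= |mant| < 1
    const float scaled = std::ldexp(mant, 11); // magnitude now in [1024, 2048)
    return std::ldexp(std::nearbyint(scaled), exp - 11);
}

// Machine epsilons (assumed meaning of "eps" in the discussion).
constexpr float fp32_epsilon = std::numeric_limits<float>::epsilon(); // 2^-23
constexpr float fp16_epsilon = 0.0009765625f;                         // 2^-10

// True if thr + eps is still distinguishable from thr at Fp16 precision.
inline bool eps_survives_fp16(float thr, float eps)
{
    return round_to_fp16_precision(thr + eps) != round_to_fp16_precision(thr);
}
```

With `thr = 1.0f`, `eps_survives_fp16(thr, fp32_epsilon)` is false while `eps_survives_fp16(thr, fp16_epsilon)` is true, which is one way to see why the three implementations can disagree on borderline values.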

What do you suggest we do?

I'd merge these SVE kernels because they bring a considerable uplift (1.6x) and there are no failures.

If fp16 is a problem I can remove the SVE kernel.

I think the SVE kernel is a very good addition, so I wasn't even considering removing it. I was asking more about the plan for handling this numerical inconsistency.

Two options:

a) Rework all of that in this patch
b) Rework it in a follow-up patch, aligning the FP16 comparison in Neon/SVE with the reference

Resolves MLCE-1719 Change-Id: I7a0c7bd1154b9cb7f35c7fd1c3b8ad54698f8799 Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>

gunes-arm reviewed Feb 6, 2026

morgolock force-pushed the pr/topkv_sve_kernels branch from 1deb127 to 7944860 on February 6, 2026 16:12

morgolock force-pushed the pr/topkv_sve_kernels branch from 7944860 to 21ba8e9 on March 2, 2026 12:56

gunes-arm reviewed Mar 6, 2026

gunes-arm requested changes Mar 9, 2026

morgolock force-pushed the pr/topkv_sve_kernels branch from 21ba8e9 to 36f1ac1 on March 9, 2026 14:28

morgolock requested a review from gunes-arm March 9, 2026 14:29

morgolock force-pushed the pr/topkv_sve_kernels branch from 36f1ac1 to c6b4071 on March 10, 2026 11:49

feat: Add SVE kernels for TopKV.

98ab9ec

morgolock force-pushed the pr/topkv_sve_kernels branch from c6b4071 to 98ab9ec on March 10, 2026 13:42

morgolock wants to merge 1 commit into main from pr/topkv_sve_kernels

morgolock commented Feb 5, 2026

gunes-arm Feb 5, 2026

morgolock Feb 6, 2026

gunes-arm Mar 9, 2026

morgolock Mar 10, 2026

morgolock Mar 10, 2026

gunes-arm Mar 11, 2026

morgolock Mar 11, 2026

gunes-arm Mar 11, 2026

morgolock Mar 12, 2026

2 participants