add adaptive batch size heuristic for filtered search #309

yuejiaointel wants to merge 4 commits into main
Conversation

rfsaliev left a comment:
Thank you for the good proposal.

Requested changes:

- Please apply the same improvements to `range_search()` and to `vamana_index_impl.h` as well.

Suggestions:

- There are some performance-related suggestions in the inline comments.
- During the review I found that the `compute_filtered_batch_size()` logic is a prediction of the further amount of processing, based on previous processing results and the requested number of matches, i.e. `PredictFurtherProcessing(processed, hits, goal)`. So I would declare this function more generically, move it to a utilities header with a more common signature, and reuse it in `vamana_index_impl.h` as well. In that case, the `% max_batch_size` operation should be applied outside of this function:
```cpp
/// @param processed - number of already processed elements (total_checked)
/// @param hits - number of matched elements (found)
/// @param goal - number of requested elements to be matched (needed)
/// @param hint - result to be returned if the prediction fails, e.g. other params == 0
size_t predict_further_processing(size_t processed, size_t hits, size_t goal, size_t hint) {
    if (processed * hits * goal == 0) {
        return hint;
    }
    // use prediction formula below
    ...
}
```

Inline comment on `@@ -136,6 +153,8 @@ class DynamicVamanaIndexImpl`:

```cpp
            }
        }
    }
    batch_size =
        compute_filtered_batch_size(found, k, total_checked, batch_size);
```
Good idea, but from a performance perspective I would slightly change the code:

- Compute the batch size at the beginning of the `do-while` loop; this avoids the computation when `found == k`.
- Increment `total_checked` outside of the `for` loop.
- It might make sense to set the initial batch size to the max of `k` and `search_window_size`.

E.g.:
```cpp
size_t total_checked = 0;
auto batch_size = std::max(k, sp.buffer_config_.get_search_window_size());
do {
    batch_size =
        compute_filtered_batch_size(found, k, total_checked, batch_size);
    iterator.next(batch_size);
    for (auto& neighbor : iterator.results()) {
        if (filter->is_member(neighbor.id())) {
            result.set(neighbor, i, found);
            found++;
            if (found == k) {
                break;
            }
        }
    }
    total_checked += iterator.size();
```
Thanks, added these changes.
```cpp
double hit_rate = static_cast<double>(found) / total_checked;
return static_cast<size_t>((needed - found) / hit_rate);
```
I would also try to improve the performance here:

- FP64 computation is not very performant.
- Computation precision is not very important here.
- There are potential issues in the SVS BatchIterator in case of a huge batch size.

So I would use the following formula:

- `hit_rate_inv = 1 / hit_rate = checked / found`
- `result = (needed - found) / hit_rate = (needed - found) * hit_rate_inv = needed * checked / found - checked`
- The formula `needed * checked / found - checked` is the most precise, but there is a bigger risk of overflow for huge `needed` and `checked` values.
```diff
-double hit_rate = static_cast<double>(found) / total_checked;
-return static_cast<size_t>((needed - found) / hit_rate);
+auto hit_rate = total_checked / found + 1; // found == 0 is handled above; +1 to increase the result, eliminating INT precision issues
+return (needed - found) * hit_rate % max_batch_size; // max_batch_size - constant
```
Alternative (assuming that FP32 is fast enough):

```diff
-double hit_rate = static_cast<double>(found) / total_checked;
-return static_cast<size_t>((needed - found) / hit_rate);
+float new_batch_size = static_cast<float>(needed) * total_checked / found - total_checked;
+return static_cast<size_t>(new_batch_size) % max_batch_size;
```
Thanks, added. We probably need to run some benchmarks before knowing the exact performance impact.
- Rename compute_filtered_batch_size to predict_further_processing and move to svs_runtime_utils.h for reuse
- Use float arithmetic instead of double for hit rate calculation
- Compute batch size at loop start to avoid unnecessary computation
- Use iterator.size() instead of per-element increment for total_checked
- Initial batch size = max(k, search_window_size)
- Apply adaptive batch size to vamana_index_impl.h filtered search
- Cap batch size with std::min instead of modulo to avoid SIGFPE
- Add comments explaining adaptive batch sizing logic
Currently the filtered k-NN search loop uses `batch_size = k` when calling `iterator.next()`. When the filter is restrictive (e.g., only 1% of IDs pass), this results in many expensive graph-traversal rounds to collect enough valid results.

This PR introduces a heuristic that adapts the batch size based on the observed filter hit rate.
For example, with k=10 and a 10% filter pass rate: instead of ~100 rounds of 10 candidates, it converges in ~2 rounds.