feat(btreemap): add direct-mapped node cache for read operations#416

Merged
sasa-tomic merged 55 commits into main from perf/direct-mapped-node-cache
Apr 1, 2026

Member

@sasa-tomic sasa-tomic commented Mar 18, 2026

Motivation

  • Node loading from stable memory is expensive; caching hot nodes significantly improves read performance.

Solution

  • Add direct-mapped node cache (default 16 slots) to BTreeMap, configurable via with_node_cache() and node_cache_resize().
  • Pin root and its children (depth 0-1) to avoid eviction; deeper nodes compete on recency.
  • Expose NodeCacheMetrics for monitoring hit ratio, cold misses, and collision misses.
  • Add read_value_uncached() to avoid inflating cached nodes with large values.
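
As a rough illustration of how the three metric categories could be told apart, here is a minimal sketch. `CacheStats`, `record_lookup`, and `hit_ratio` are hypothetical names made up for this example; they are not the fields or methods of the crate's `NodeCacheMetrics`, which may look quite different.

```rust
/// Hypothetical counters for illustration only; the real
/// `NodeCacheMetrics` type may expose different fields.
#[derive(Debug, Default, Clone, Copy, PartialEq)]
pub struct CacheStats {
    pub hits: u64,
    pub cold_misses: u64,
    pub collision_misses: u64,
}

impl CacheStats {
    /// Classifies one lookup. `slot` is the address currently stored in the
    /// slot that `wanted` maps to, or `None` if that slot is empty.
    pub fn record_lookup(&mut self, slot: Option<u64>, wanted: u64) {
        match slot {
            // The slot holds exactly the node we asked for.
            Some(addr) if addr == wanted => self.hits += 1,
            // The slot is occupied by a different node that maps to it.
            Some(_) => self.collision_misses += 1,
            // The slot has never been filled (or was invalidated).
            None => self.cold_misses += 1,
        }
    }

    /// Fraction of lookups served from the cache, or 0.0 if there were none.
    pub fn hit_ratio(&self) -> f64 {
        let total = self.hits + self.cold_misses + self.collision_misses;
        if total == 0 {
            0.0
        } else {
            self.hits as f64 / total as f64
        }
    }
}
```

A high share of collision misses would suggest that more slots could help, while cold misses are unavoidable on first access.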

Impact

  • API additions only, but users should re-evaluate cache sizing for their workload.

Details

  • get(), contains_key(), first_key_value(), and last_key_value() now use the cache.
  • Write operations (insert, remove, merge) invalidate affected cache slots.
  • Benchmarks show instruction reductions of up to 91% on read-heavy workloads (btreemap suite: p25 -25.74%, min -91.42%).

Related to #166

Add a 32-slot direct-mapped node cache to BTreeMap that avoids
re-loading hot nodes from stable memory. Modeled after CPU caches:
O(1) lookup via (address / page_size) % 32, collision = eviction.
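
The slot mapping described above can be sketched as a toy direct-mapped cache. This is an illustration under stated assumptions, not the code in this PR: `DirectMappedCache`, `slot_index`, and the generic payload `V` are hypothetical names, and the slot count and page size are plain constructor arguments.

```rust
/// A toy direct-mapped cache keyed by node address (illustrative only).
pub struct DirectMappedCache<V> {
    page_size: u64,
    /// Each slot holds at most one `(address, value)` pair.
    slots: Vec<Option<(u64, V)>>,
}

impl<V> DirectMappedCache<V> {
    pub fn new(num_slots: usize, page_size: u64) -> Self {
        assert!(num_slots > 0 && page_size > 0);
        Self {
            page_size,
            slots: (0..num_slots).map(|_| None).collect(),
        }
    }

    /// O(1) slot index: (address / page_size) % num_slots.
    pub fn slot_index(&self, address: u64) -> usize {
        ((address / self.page_size) % self.slots.len() as u64) as usize
    }

    /// Returns the cached value only if the slot holds this exact address.
    pub fn get(&self, address: u64) -> Option<&V> {
        match &self.slots[self.slot_index(address)] {
            Some((addr, value)) if *addr == address => Some(value),
            _ => None,
        }
    }

    /// Stores `value` and returns the entry it displaced, if any.
    /// A collision therefore always evicts the previous occupant.
    pub fn insert(&mut self, address: u64, value: V) -> Option<(u64, V)> {
        let idx = self.slot_index(address);
        self.slots[idx].replace((address, value))
    }
}
```

With 32 slots and a page size of 1024, addresses 1024 and 33 * 1024 both map to slot 1, so inserting the second evicts the first.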

Read paths (get, contains_key, first/last_key_value) use a
take+return pattern to borrow nodes from the cache without
RefCell lifetime issues. Write paths (insert, remove, split,
merge) invalidate affected cache slots.
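
A minimal sketch of the take+return idea, assuming a simplified `Node` and a cache indexed by `address % num_slots`. The names `NodeCache::take`, `put`, `invalidate`, and the free function `contains_key` are written for this example and are not guaranteed to match the signatures in the PR. The point is that each `RefCell` borrow ends inside `take` or `put`, so the caller holds an owned `Node` rather than a `Ref` guard.

```rust
use std::cell::RefCell;

/// Stand-in for a deserialized B-tree node (illustrative only).
#[derive(Debug, PartialEq)]
pub struct Node {
    pub address: u64,
    pub keys: Vec<u64>,
}

pub struct NodeCache {
    slots: RefCell<Vec<Option<Node>>>,
}

impl NodeCache {
    pub fn new(num_slots: usize) -> Self {
        assert!(num_slots > 0);
        Self { slots: RefCell::new((0..num_slots).map(|_| None).collect()) }
    }

    fn index(&self, address: u64) -> usize {
        (address % self.slots.borrow().len() as u64) as usize
    }

    /// Moves the node out of its slot, leaving the slot empty.
    /// The mutable borrow is released before this function returns.
    pub fn take(&self, address: u64) -> Option<Node> {
        let idx = self.index(address);
        let mut slots = self.slots.borrow_mut();
        let is_hit = matches!(&slots[idx], Some(node) if node.address == address);
        if is_hit { slots[idx].take() } else { None }
    }

    /// Puts a node (back) into its slot, replacing any previous occupant.
    pub fn put(&self, node: Node) {
        let idx = self.index(node.address);
        self.slots.borrow_mut()[idx] = Some(node);
    }

    /// Write paths drop the cached copy so later reads reload the node.
    pub fn invalidate(&self, address: u64) {
        let _ = self.take(address);
    }
}

/// Read path: take the node (or load it on a miss), use it, then return it.
pub fn contains_key(
    cache: &NodeCache,
    address: u64,
    key: u64,
    load: impl Fn(u64) -> Node,
) -> bool {
    let node = cache.take(address).unwrap_or_else(|| load(address));
    let found = node.keys.contains(&key);
    cache.put(node);
    found
}
```

Because the node is owned while in use, a nested lookup during traversal cannot trip a "already borrowed" panic; the cost is that the slot is briefly empty.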

Key changes:
- Switch get() from destructive extract_entry_at to node.value()
- Remove unused extract_entry_at method
- Change traverse() closure from Fn(&mut Node) to Fn(&Node)
- Invalidate cache in save_node, deallocate_node, merge, clear_new

Expected improvement: ~15-20% for random reads, ~65% for hot-key
workloads, ~0% overhead for writes (cache.get_mut() bypasses RefCell).
@sasa-tomic sasa-tomic requested a review from a team as a code owner March 18, 2026 17:46
@sasa-tomic sasa-tomic marked this pull request as draft March 18, 2026 17:52

github-actions bot commented Mar 19, 2026

canbench 🏋 (dir: ./benchmarks/btreeset) a910cb9 2026-03-31 14:48:48 UTC

./benchmarks/btreeset/canbench_results.yml is up to date
📦 canbench_results_btreeset.csv available in artifacts

---------------------------------------------------

Summary:
  instructions:
    status:   No significant changes 👍
    counts:   [total 100 | regressed 0 | improved 0 | new 0 | unchanged 100]
    change:   [max +1.06M | p75 0 | median 0 | p25 0 | min 0]
    change %: [max +0.20% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

  heap_increase:
    status:   No significant changes 👍
    counts:   [total 100 | regressed 0 | improved 0 | new 0 | unchanged 100]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

  stable_memory_increase:
    status:   No significant changes 👍
    counts:   [total 100 | regressed 0 | improved 0 | new 0 | unchanged 100]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

---------------------------------------------------
CSV results saved to canbench_results.csv


github-actions bot commented Mar 19, 2026

canbench 🏋 (dir: ./benchmarks/nns) a910cb9 2026-03-31 14:48:49 UTC

./benchmarks/nns/canbench_results.yml is up to date
📦 canbench_results_nns.csv available in artifacts

---------------------------------------------------

Summary:
  instructions:
    status:   Improvements detected 🟢
    counts:   [total 16 | regressed 0 | improved 1 | new 0 | unchanged 15]
    change:   [max +35.76M | p75 +289 | median -1.50K | p25 -413.07K | min -82.85M]
    change %: [max +0.40% | p75 0.00% | median -0.05% | p25 -0.25% | min -2.75%]

  heap_increase:
    status:   No significant changes 👍
    counts:   [total 16 | regressed 0 | improved 0 | new 0 | unchanged 16]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

  stable_memory_increase:
    status:   No significant changes 👍
    counts:   [total 16 | regressed 0 | improved 0 | new 0 | unchanged 16]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

---------------------------------------------------

Only significant changes:
| status | name                              | calls |   ins |  ins Δ% | HI |  HI Δ% | SMI |  SMI Δ% |
|--------|-----------------------------------|-------|-------|---------|----|--------|-----|---------|
|   -    | vote_cascading_stable_chain_10k_5 |       | 2.92B |  -2.75% |  5 |  0.00% |   0 |   0.00% |

ins = instructions, HI = heap_increase, SMI = stable_memory_increase, Δ% = percent change

---------------------------------------------------
CSV results saved to canbench_results.csv


github-actions bot commented Mar 19, 2026

canbench 🏋 (dir: ./benchmarks/vec) a910cb9 2026-03-31 14:48:35 UTC

./benchmarks/vec/canbench_results.yml is up to date
📦 canbench_results_vec.csv available in artifacts

---------------------------------------------------

Summary:
  instructions:
    status:   No significant changes 👍
    counts:   [total 16 | regressed 0 | improved 0 | new 0 | unchanged 16]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

  heap_increase:
    status:   No significant changes 👍
    counts:   [total 16 | regressed 0 | improved 0 | new 0 | unchanged 16]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

  stable_memory_increase:
    status:   No significant changes 👍
    counts:   [total 16 | regressed 0 | improved 0 | new 0 | unchanged 16]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

---------------------------------------------------
CSV results saved to canbench_results.csv


github-actions bot commented Mar 19, 2026

canbench 🏋 (dir: ./benchmarks/memory_manager) a910cb9 2026-03-31 14:48:32 UTC

./benchmarks/memory_manager/canbench_results.yml is up to date
📦 canbench_results_memory-manager.csv available in artifacts

---------------------------------------------------

Summary:
  instructions:
    status:   No significant changes 👍
    counts:   [total 3 | regressed 0 | improved 0 | new 0 | unchanged 3]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

  heap_increase:
    status:   No significant changes 👍
    counts:   [total 3 | regressed 0 | improved 0 | new 0 | unchanged 3]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

  stable_memory_increase:
    status:   No significant changes 👍
    counts:   [total 3 | regressed 0 | improved 0 | new 0 | unchanged 3]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

---------------------------------------------------
CSV results saved to canbench_results.csv


github-actions bot commented Mar 20, 2026

canbench 🏋 (dir: ./benchmarks/io_chunks) a910cb9 2026-03-31 14:49:22 UTC

./benchmarks/io_chunks/canbench_results.yml is up to date
📦 canbench_results_io_chunks.csv available in artifacts

---------------------------------------------------

Summary:
  instructions:
    status:   Improvements detected 🟢
    counts:   [total 18 | regressed 0 | improved 2 | new 0 | unchanged 16]
    change:   [max +87.27M | p75 0 | median 0 | p25 -176 | min -23.13B]
    change %: [max +0.73% | p75 0.00% | median 0.00% | p25 -0.00% | min -59.18%]

  heap_increase:
    status:   No significant changes 👍
    counts:   [total 18 | regressed 0 | improved 0 | new 0 | unchanged 18]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

  stable_memory_increase:
    status:   No significant changes 👍
    counts:   [total 18 | regressed 0 | improved 0 | new 0 | unchanged 18]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

---------------------------------------------------

Only significant changes:
| status | name                    | calls |     ins |  ins Δ% | HI |  HI Δ% | SMI |  SMI Δ% |
|--------|-------------------------|-------|---------|---------|----|--------|-----|---------|
|   -    | read_chunks_btreemap_1m |       |  17.81B | -56.50% |  0 |  0.00% |   0 |   0.00% |
|   -    | read_chunks_btreemap_1k |       | 203.38M | -59.18% |  0 |  0.00% |   0 |   0.00% |

ins = instructions, HI = heap_increase, SMI = stable_memory_increase, Δ% = percent change

---------------------------------------------------
CSV results saved to canbench_results.csv

@sasa-tomic sasa-tomic marked this pull request as ready for review March 20, 2026 11:23

github-actions bot commented Mar 20, 2026

canbench 🏋 (dir: ./benchmarks/btreemap) a910cb9 2026-03-31 14:49:55 UTC

./benchmarks/btreemap/canbench_results.yml is up to date
📦 canbench_results_btreemap.csv available in artifacts

---------------------------------------------------

Summary:
  instructions:
    status:   Improvements detected 🟢
    counts:   [total 229 | regressed 0 | improved 100 | new 0 | unchanged 129]
    change:   [max +62.94M | p75 0 | median -1.62M | p25 -92.98M | min -1.48B]
    change %: [max +1.17% | p75 0.00% | median -0.21% | p25 -25.74% | min -91.42%]

  heap_increase:
    status:   No significant changes 👍
    counts:   [total 229 | regressed 0 | improved 0 | new 0 | unchanged 229]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

  stable_memory_increase:
    status:   No significant changes 👍
    counts:   [total 229 | regressed 0 | improved 0 | new 0 | unchanged 229]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

---------------------------------------------------

Only significant changes:
| status | name                                      | calls |     ins |  ins Δ% | HI |  HI Δ% | SMI |  SMI Δ% |
|--------|-------------------------------------------|-------|---------|---------|----|--------|-----|---------|
|   -    | btreemap_v2_pop_first_vec_32_128          |       |   1.00B |  -6.23% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_pop_first_vec_32_vec128       |       |   1.00B |  -6.23% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_pop_first_blob_256_128        |       |   2.56B |  -6.31% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_pop_last_vec_32_128           |       | 978.64M |  -6.40% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_pop_last_vec_32_vec128        |       | 978.64M |  -6.40% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_pop_last_blob_256_128         |       |   2.47B |  -6.47% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_pop_last_blob_32_1024         |       | 976.88M |  -7.08% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_pop_first_blob_32_1024        |       |   1.01B |  -7.10% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_pop_first_blob_32_128         |       | 763.85M | -10.21% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_pop_last_blob_32_128          |       | 732.59M | -10.27% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_pop_last_u64_u64              |       | 585.12M | -10.59% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_pop_first_blob_32_0           |       | 673.85M | -10.76% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_pop_first_u64_u64             |       | 604.78M | -10.84% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_pop_first_blob_8_128          |       | 530.70M | -11.22% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_pop_last_blob_32_0            |       | 644.48M | -11.35% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_pop_last_principal            |       | 691.07M | -11.88% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_pop_last_blob_8_128           |       | 520.66M | -11.99% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_pop_first_principal           |       | 705.52M | -12.27% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_contains_blob_32_0            |       | 286.88M | -12.95% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_contains_blob_32_1024         |       | 280.85M | -13.87% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_get_u64_blob8                 |       | 193.98M | -14.34% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_get_blob_32_0                 |       | 288.59M | -14.48% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_get_100k_u64_u64              |       |   2.32B | -16.10% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_get_blob_32_1024              |       | 292.71M | -16.49% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_get_vec_128_128               |       | 479.29M | -16.79% |  0 |  0.00% |   0 |   0.00% |
|  ...   | ... 50 rows omitted ...                   |       |         |         |    |        |     |         |
|   -    | btreemap_v2_get_blob_1024_128             |       |   2.97B | -33.30% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_get_blob_256_128              |       | 903.19M | -33.93% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_get_blob_4_128                |       | 168.23M | -34.00% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_get_zipf_10k_u64_u64          |       | 140.00M | -37.01% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_first_key_value_blob_32_1024  |       | 255.02M | -42.74% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_first_key_value_blob_256_128  |       | 352.58M | -46.79% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_get_vec_4_128                 |       | 208.49M | -49.16% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_last_key_value_blob_32_128    |       | 122.35M | -54.16% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_last_key_value_blob_32_1024   |       | 168.95M | -56.65% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_get_zipf_heavy_10k_u64_u64    |       |  87.48M | -59.26% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_first_key_value_blob_32_128   |       | 106.21M | -61.42% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_first_key_value_u64_u64       |       |  84.89M | -63.27% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_first_key_value_blob_32_0     |       | 100.51M | -65.11% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_last_key_value_vec_32_128     |       |  73.27M | -66.86% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_last_key_value_vec_32_vec128  |       |  73.27M | -66.86% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_last_key_value_u64_u64        |       |  81.20M | -66.90% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_last_key_value_blob_256_128   |       | 150.71M | -76.51% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_first_key_value_blob_8_128    |       |  55.70M | -80.92% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_first_key_value_vec_32_128    |       |  38.66M | -82.98% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_first_key_value_vec_32_vec128 |       |  38.66M | -82.98% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_last_key_value_blob_8_128     |       |  52.86M | -83.54% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_last_key_value_blob_32_0      |       |  32.54M | -85.27% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_last_key_value_principal      |       |  32.06M | -85.28% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_contains_10mib_values         |       |  18.48M | -87.00% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_first_key_value_principal     |       |  31.52M | -91.42% |  0 |  0.00% |   0 |   0.00% |

ins = instructions, HI = heap_increase, SMI = stable_memory_increase, Δ% = percent change

---------------------------------------------------
CSV results saved to canbench_results.csv

Member Author

@sasa-tomic sasa-tomic left a comment

Addressing open review comments with code changes and replies.

…ard NULL

- Change DEFAULT_NODE_CACHE_NUM_SLOTS from 0 to 16 (covers top 2 tree levels)
- Remove redundant cache_num_slots field from BTreeMap; NodeCache methods
  now self-guard via is_enabled() and callers no longer need checks
- Add NULL address guard in NodeCache::take to prevent false cache hits
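
One way a false hit could arise, sketched under an assumption: empty slots are marked with a sentinel address of 0 rather than an `Option`. Whether the PR's `NodeCache` stores slots this way is not stated here, so `SentinelCache`, `Slot`, and the `NULL` constant below are hypothetical.

```rust
/// Sentinel for "no node"; assumed here to be address 0.
pub const NULL: u64 = 0;

#[derive(Clone, Debug, PartialEq)]
pub struct Slot {
    pub address: u64, // NULL when the slot is empty
    pub payload: u32, // stand-in for a cached node
}

pub struct SentinelCache {
    slots: Vec<Slot>,
}

impl SentinelCache {
    pub fn new(num_slots: usize) -> Self {
        assert!(num_slots > 0);
        Self { slots: vec![Slot { address: NULL, payload: 0 }; num_slots] }
    }

    fn index(&self, address: u64) -> usize {
        (address % self.slots.len() as u64) as usize
    }

    /// Without the NULL check, `take(NULL)` would compare equal to the
    /// sentinel in an empty slot 0 and return a bogus payload as a "hit".
    pub fn take(&mut self, address: u64) -> Option<u32> {
        if address == NULL {
            return None;
        }
        let idx = self.index(address);
        let slot = &mut self.slots[idx];
        if slot.address != address {
            return None;
        }
        slot.address = NULL; // mark the slot empty again
        Some(slot.payload)
    }

    /// NULL is never cached, so the sentinel stays unambiguous.
    pub fn put(&mut self, address: u64, payload: u32) {
        if address == NULL {
            return;
        }
        let idx = self.index(address);
        self.slots[idx] = Slot { address, payload };
    }
}
```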
Contributor

@maksymar maksymar left a comment

Great results! 🎉 Thank you!

Contributor

maksymar commented Mar 31, 2026

[DONE]
I would suggest waiting to merge this PR until I merge another PR that improves the benchmarks.
Currently our benchmarks do not cover potential cache cases, so I would like to remove some excessive benchmarks and add cache-specific use cases, so that we can also see the difference on that PR.

@maksymar
Contributor

The chart shows how increasing node cache size reduces instruction count (relative to no cache) across operation categories.

  • first/last and pop plateau almost immediately because they need at most tree-height nodes in the cache
  • get and contains keep improving steadily as the cache grows
  • the remaining categories are not yet supported by the cache, but probably will be in the future

From this data I'm happy to see that the default of 16 is already good enough while staying reasonably small; bigger caches still provide more value.
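
A back-of-the-envelope sketch of why a handful of slots can be enough for first/last and why 16 slots can hold the top two levels. The fan-out of 12 children and 11 entries per node used in the note after the code are assumptions for illustration; both helper functions are hypothetical and take those values as parameters.

```rust
/// Maximum number of nodes in the top `levels` levels of a tree in which
/// every node has at most `fanout` children.
pub fn max_nodes_in_top_levels(fanout: u64, levels: u32) -> u64 {
    (0..levels).map(|depth| fanout.pow(depth)).sum()
}

/// Smallest number of levels at which a fully packed tree with `fanout`
/// children and `entries_per_node` entries per node can hold `n` entries.
/// Real trees are not fully packed, so this is a lower bound on height.
pub fn min_height(fanout: u64, entries_per_node: u64, n: u64) -> u32 {
    assert!(fanout > 0 && entries_per_node > 0);
    let mut height = 0;
    let mut nodes_at_level = 1u64;
    let mut capacity = 0u64;
    while capacity < n {
        capacity = capacity.saturating_add(nodes_at_level.saturating_mul(entries_per_node));
        nodes_at_level = nodes_at_level.saturating_mul(fanout);
        height += 1;
    }
    height
}
```

Under those assumptions, `max_nodes_in_top_levels(12, 2)` is 13, which fits in 16 slots, and `min_height(12, 11, 1_000_000)` is 6, so a first/last traversal touches only a few nodes even for a million entries.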

[Chart: instruction count relative to no cache, by node cache size, per operation category]

@sasa-tomic sasa-tomic self-assigned this Apr 1, 2026
@sasa-tomic sasa-tomic changed the title perf: add direct-mapped node cache to BTreeMap feat(btreemap)!: add direct-mapped node cache for read operations Apr 1, 2026
@sasa-tomic sasa-tomic changed the title feat(btreemap)!: add direct-mapped node cache for read operations feat(btreemap): add direct-mapped node cache for read operations Apr 1, 2026
@sasa-tomic sasa-tomic merged commit 8f83191 into main Apr 1, 2026
21 checks passed
@sasa-tomic sasa-tomic deleted the perf/direct-mapped-node-cache branch April 1, 2026 08:37
3 participants