Skip to content

fix(allocator): give MimallocArena a non-zero heap tag to prevent free-list corruption#28026

Open
robobun wants to merge 1 commit intomainfrom
claude/fix-mimalloc-arena-heap-tag-28024
Open

fix(allocator): give MimallocArena a non-zero heap tag to prevent free-list corruption#28026
robobun wants to merge 1 commit intomainfrom
claude/fix-mimalloc-arena-heap-tag-28024

Conversation

@robobun
Copy link
Collaborator

@robobun robobun commented Mar 12, 2026

Summary

  • Fix heap corruption caused by MimallocArena using mi_heap_new() with the default tag 0 (same as the backing heap). When worker threads exit, their abandoned pages are reclaimed by the arena instead of the backing heap, and mi_heap_destroy then frees live blocks, corrupting the malloc free-list.
  • Use mi_heap_new_ex with a non-zero tag (111) so _mi_heap_by_tag routes reclaimed pages to the backing heap where they belong.
  • Add null check for jsDynamicCast result in jsHashProtoFuncUpdate to prevent a potential null pointer dereference if constructFromEncoding fails.

Test plan

  • bun bd compiles successfully
  • bun bd test test/js/node/crypto/crypto.test.ts — 369 tests pass
  • bun bd test test/js/node/crypto/crypto-lazyhash.test.ts — 2 tests pass
  • Hash update stress test with worker threads passes (10K hashes + 4 workers)

Fixes #28024

🤖 Generated with Claude Code

…e-list corruption

mi_heap_new() creates the arena with tag 0, same as the backing heap.
When a worker thread exits, its heap pages are abandoned and mimalloc's
_mi_heap_by_tag routes each reclaimed page to the first heap with a
matching tag — which is the arena (pushed to front by mi_heap_new).
mi_heap_destroy then frees these pages along with their live blocks,
corrupting the malloc free-list.

Fix: use mi_heap_new_ex with a non-zero tag (111) so _mi_heap_by_tag
routes reclaimed tag-0 pages to the backing heap where they belong.

Also add a null check for the jsDynamicCast result in jsHashProtoFuncUpdate
to prevent a potential null pointer dereference if constructFromEncoding
fails to return an ArrayBufferView.

Fixes #28024

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@robobun
Copy link
Collaborator Author

robobun commented Mar 12, 2026

Updated 6:03 PM PT - Mar 11th, 2026

❌ Your commit 0769de65 has some failures in Build #39334 (All Failures)


🧪   To try this PR locally:

bunx bun-pr 28026

That installs a local version of the PR into your bun-28026 executable, so you can run:

bun-28026 --bun

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 12, 2026

Walkthrough

Two modifications addressing stability issues: arena heap creation now uses a non-zero tag for proper page routing, and hash update operations include input validation for decoded buffers.

Changes

Cohort / File(s) Summary
Arena Heap Management
src/allocators/MimallocArena.zig
Modifies heap initialization to use non-zero tag (111) via mi_heap_new_ex(), ensuring reclaimed pages route to backing heap instead of arena, preventing free-list corruption during heap destruction.
Hash Input Validation
src/bun.js/bindings/node/crypto/JSHash.cpp
Adds null check for decoded ArrayBufferView when updating hash with string input, returning CRYPTO_HASH_UPDATE_FAILED if buffer conversion fails.
🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'fix(allocator): give MimallocArena a non-zero heap tag to prevent free-list corruption' clearly summarizes the primary change: addressing heap corruption by using a non-zero tag for MimallocArena.
Description check ✅ Passed The PR description covers both required template sections: 'What does this PR do?' (detailed summary of changes) and 'How did you verify your code works?' (comprehensive test plan with checkmarks).
Linked Issues check ✅ Passed The PR addresses all requirements from issue #28024: fixes heap routing corruption in MimallocArena [#28024] and adds null check for jsDynamicCast result [#28024] to prevent null pointer dereference in jsHashProtoFuncUpdate.
Out of Scope Changes check ✅ Passed All changes align with issue #28024: MimallocArena heap tag modification and jsHashProtoFuncUpdate null check guard against the reported segfault without introducing unrelated modifications.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/allocators/MimallocArena.zig`:
- Around line 168-175: The hardcoded module-level arena_heap_tag causes all
MimallocArena instances to share tag 111; change this to generate a unique
non-zero tag per instance in init() (or enforce single-instance invariant).
Specifically, remove the constant arena_heap_tag usage in init() and instead
obtain a unique tag (e.g., from a process-wide atomic counter) and pass that tag
to mimalloc.mi_heap_new_ex inside MimallocArena.init(); also ensure any teardown
(mi_heap_destroy) and related logic uses the same instance tag. Update
references to arena_heap_tag, init(), mi_heap_new_ex and mi_heap_destroy so each
MimallocArena holds its own tag value.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 7a5573db-b196-4433-a00b-76be3bc40457

📥 Commits

Reviewing files that changed from the base of the PR and between 9d51af8 and 0769de6.

📒 Files selected for processing (2)
  • src/allocators/MimallocArena.zig
  • src/bun.js/bindings/node/crypto/JSHash.cpp

Comment on lines +168 to +175
/// Use a non-zero heap tag so that mimalloc's _mi_heap_by_tag routes
/// reclaimed pages from dead threads to the backing heap (tag 0) instead
/// of to this arena. Without this, mi_heap_destroy frees those reclaimed
/// pages and their live blocks, corrupting the malloc free-list.
const arena_heap_tag = 111;

pub fn init() Self {
const mimalloc_heap = mimalloc.mi_heap_new() orelse bun.outOfMemory();
const mimalloc_heap = mimalloc.mi_heap_new_ex(arena_heap_tag, true, null) orelse bun.outOfMemory();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

printf '\n== MimallocArena construction sites ==\n'
rg -n -C2 '\bMimallocArena\.init\s*\(|\binit\s*\(\)\s*Self\b' src

printf '\n== Tagged mimalloc heap creations ==\n'
rg -n -C2 '\bmi_heap_new_ex\s*\(' src

printf '\n== References to arena_heap_tag / literal 111 in allocator code ==\n'
rg -n -C2 '\barena_heap_tag\b|[^0-9]111[^0-9]' src/allocators src

printf '\n== Heap destroy sites ==\n'
rg -n -C2 '\bmi_heap_destroy\s*\(' src

Repository: oven-sh/bun

Length of output: 50368


🏁 Script executed:

#!/bin/bash
set -euo pipefail

printf '\n== MimallocArena struct and init definition ==\n'
head -n 200 src/allocators/MimallocArena.zig | cat -n

printf '\n\n== Check if there are any comments around arena_heap_tag ==\n'
sed -n '165,180p' src/allocators/MimallocArena.zig | cat -n

printf '\n\n== Deinit and full init/deinit pattern ==\n'
sed -n '160,190p' src/allocators/MimallocArena.zig | cat -n

Repository: oven-sh/bun

Length of output: 9805


Each MimallocArena instance must have a unique non-zero heap tag.

The hardcoded arena_heap_tag = 111 is used by every init() call, meaning all MimallocArena instances—including those in web_worker, JSTranspiler, and test_command—share the same mimalloc heap tag. If multiple arenas are live concurrently (e.g., in worker threads or during parallel transpilation), abandoned pages from one dead thread routed by tag 111 could be adopted by a different live arena and freed by that arena's mi_heap_destroy(), corrupting its free-list.

Replace the module-level constant with a per-instance unique tag (e.g., using an atomic counter) or enforce a documented invariant that only one MimallocArena can be live at a time.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/allocators/MimallocArena.zig` around lines 168 - 175, The hardcoded
module-level arena_heap_tag causes all MimallocArena instances to share tag 111;
change this to generate a unique non-zero tag per instance in init() (or enforce
single-instance invariant). Specifically, remove the constant arena_heap_tag
usage in init() and instead obtain a unique tag (e.g., from a process-wide
atomic counter) and pass that tag to mimalloc.mi_heap_new_ex inside
MimallocArena.init(); also ensure any teardown (mi_heap_destroy) and related
logic uses the same instance tag. Update references to arena_heap_tag, init(),
mi_heap_new_ex and mi_heap_destroy so each MimallocArena holds its own tag
value.

@github-actions
Copy link
Contributor

Found 6 issues this PR may fix:

  1. bun crash while running claude code #21353 - bun crash while running claude code - directly matches JSHash.cpp:177 null pointer dereference fix
  2. Heap corruption (malloc free list) during long-running server with sharp + mongodb #27929 - Heap corruption (malloc free list) during long-running server with sharp + mongodb - exact match for MimallocArena heap corruption fix
  3. Deadlock: all threads permanently blocked due to lock ordering violation between allocator and thread pool (macOS arm64, Bun 1.3.5 SEA) #26762 - Deadlock: all threads permanently blocked due to lock ordering violation between allocator and thread pool - memory allocator issues that could stem from heap corruption
  4. Segmentation fault after long-running session (v1.3.9-canary.51) #26984 - Segmentation fault after long-running session - memory corruption that could be caused by improper arena tagging
  5. Segfault (null pointer at 0x0) in long-running standalone executable on Windows after sustained memory pressure #26862 - Segfault (null pointer at 0x0) in long-running standalone executable on Windows - could be related to either heap corruption or null pointer issues
  6. Segmentation fault (address 0x28) on macOS Silicon after ~3min runtime when running Claude Code #27790 - Segmentation fault (address 0x28) on macOS Silicon after ~3min runtime when running Claude Code - similar usage pattern to JSHash crash, timing suggests memory corruption buildup

If this is helpful, consider adding Fixes #<number> to the PR description to auto-close the issue on merge.

🤖 Generated with Claude Code

Copy link
Collaborator Author

@robobun robobun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Responding to the CodeRabbit suggestion about per-instance unique tags:

A shared non-zero tag (111) is correct and sufficient here. The bug is specifically about tag-0 pages from dead worker threads being routed to the arena instead of the backing heap. Any non-zero tag prevents this.

Per-instance unique tags would add complexity without benefit because:

  1. Arenas call mi_heap_destroy() (not abandonment) on teardown, so tag-111 pages are never orphaned into the reclamation pool
  2. In practice there is at most one arena per thread in the current codebase
  3. mi_heap_tag is uint8_t (0-255), so an atomic counter would wrap around and collide anyway
  4. The upstream mimalloc examples also use a single constant tag for this pattern

@robobun
Copy link
Collaborator Author

robobun commented Mar 12, 2026

Re: CodeRabbit suggestion for per-instance unique heap tags

A shared non-zero tag is correct here. The scenario described (tag-111 pages from one dead thread being adopted by a different arena) doesn't happen because:

  1. Arenas are destroyed, not abandoned. mi_heap_destroy() frees all owned pages immediately — it does not abandon them into the reclamation pool. So there are never orphaned tag-111 pages floating around for another arena to accidentally adopt.

  2. The actual bug is tag-0 collision. Dead worker threads abandon pages with tag 0 (from their default/backing heap). _mi_heap_by_tag(heap, 0) then routes these to the first tag-0 heap on the reclaiming thread — which can be the arena. The fix only needs the arena's tag to differ from 0.

  3. Per-instance unique tags don't scale. mi_heap_tag is uint8_t (0-255), so an atomic counter wraps around after 255 arenas and collides anyway. A constant non-zero tag is simpler and equally correct.

  4. At most one arena per thread in practice. The install.zig code destroys the old arena before creating a new one; all other call sites store arenas in per-thread VM structures.

Copy link
Contributor

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fix looks reasonable — using a non-zero heap tag to prevent page reclamation cross-contamination is the right approach — but this touches low-level allocator internals (mimalloc heap tagging) that warrant a human verifying the tag semantics and the allow_destroy interaction.

Extended reasoning...

Overview

This PR makes two changes: (1) in MimallocArena.zig, it replaces mi_heap_new() with mi_heap_new_ex(arena_heap_tag, true, null) using a non-zero heap tag (111) to prevent mimalloc from routing reclaimed pages from dead worker threads into the arena heap, which would cause heap corruption when mi_heap_destroy is called; (2) in JSHash.cpp, it adds a null check for a jsDynamicCast<JSArrayBufferView*> result before dereferencing.

Security risks

The allocator change directly addresses a heap corruption / use-after-free class vulnerability. The fix itself does not introduce new security risks. The JSHash.cpp null check is purely defensive.

Level of scrutiny

The allocator change, while small, is in critical memory management infrastructure. The correctness of the fix depends on mimalloc internals — specifically that _mi_heap_by_tag routes reclaimed pages by tag and that tag 0 is the default/backing heap tag. A human familiar with the bun mimalloc integration should verify: (a) the tag value 111 does not collide with any other heap tags used in the codebase, (b) the allow_destroy=true parameter is appropriate, and (c) the arena_id=null default is correct.

Other factors

The PR is labeled claude (AI-generated). The bug report found only pre-existing issues (other missing null checks in the same crypto files), not issues introduced by this PR. CI is building. There are no outstanding reviewer comments.

Comment on lines 180 to +183
auto* convertedView = jsDynamicCast<JSC::JSArrayBufferView*>(converted);
if (!convertedView) {
return Bun::ERR::CRYPTO_HASH_UPDATE_FAILED(scope, globalObject);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟣 Pre-existing: Several jsDynamicCast results in this file and in JSHmac.cpp are dereferenced without null checks, which can cause null pointer dereferences. Specifically: (1) jsHashProtoFuncUpdate line 148 -- jsDynamicCast<JSHash*> is dereferenced at line 153 without a null check; (2) jsHashProtoFuncDigest line 208 -- same pattern, dereferenced at line 211; (3) JSHmac.cpp line 149 -- the exact jsDynamicCast<JSArrayBufferView*>(converted) pattern this PR fixes here is left unchecked. Since this PR is already fixing this class of bug, it would be natural to fix the other instances too.

Extended reasoning...

What the bug is

The PR adds a null check for jsDynamicCast<JSC::JSArrayBufferView*>(converted) at lines 180-183 of JSHash.cpp to prevent a null pointer dereference when constructFromEncoding returns a value that is not a JSArrayBufferView. However, there are three other instances of the same class of unchecked jsDynamicCast in nearby code paths.

The specific code paths

1. jsHashProtoFuncUpdate (JSHash.cpp:148/153): JSHash* hash = jsDynamicCast<JSHash*>(thisHash) can return null if thisValue is not a JSHash object. The result is immediately dereferenced at line 153 via hash->m_finalized. Calling Hash.prototype.update.call({}, "data") or Hash.prototype.update.call(nonHashObject, "data") would trigger a null pointer dereference crash.

2. jsHashProtoFuncDigest (JSHash.cpp:208/211): Identical pattern -- JSHash* hash = jsDynamicCast<JSHash*>(callFrame->thisValue()) is dereferenced at line 211 (hash->m_finalized) without a null check. Hash.prototype.digest.call({}) would crash.

3. JSHmac.cpp (line 149/151): The exact same jsDynamicCast<JSC::JSArrayBufferView*>(converted) pattern that this PR fixes in JSHash.cpp exists here. On line 151, convertedView->vector() and convertedView->byteLength() are called without verifying convertedView is non-null.

Why existing code does not prevent it

JavaScript's Function.prototype.call allows invoking any function with an arbitrary this value, bypassing the normal object type guarantees. When jsDynamicCast is called with an object that does not match the expected type, it correctly returns nullptr, but the callers do not check for this.

Proof by example

For bug (1): Hash.prototype.update.call({}, "data") causes jsDynamicCast<JSHash*>({}) to return nullptr, then hash->m_finalized dereferences null, causing a crash. Note that constructHash at line 306 already handles this correctly: original = jsDynamicCast<JSHash*>(algorithmOrHashInstanceValue); if (!original || original->m_finalized) -- it checks for null before dereferencing.

Impact

Any of these can be triggered from JavaScript to cause a null pointer dereference crash (segfault). While these are pre-existing issues not introduced by this PR, they represent the same class of bug the PR is already fixing and could naturally be addressed together.

How to fix

For the JSHash* casts (issues 1 and 2), add a null check and throw a type error, following the established codebase pattern seen in JSSign.cpp and DiffieHellmanFunctions.h (e.g., throwThisTypeError). For the JSArrayBufferView* cast in JSHmac.cpp (issue 3), apply the same null check + error return that this PR adds in JSHash.cpp.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Segfault in Bun v1.3.11 on macOS ARM64: JSHash.cpp jsHashProtoFuncUpdate crash

1 participant