Skip to content

(fix): Type-stable task-local pool accessors for CPU/CUDA/Metal#42

Merged
mgyoo86 merged 2 commits intomasterfrom
fix/type_stability
Apr 2, 2026
Merged

(fix): Type-stable task-local pool accessors for CPU/CUDA/Metal#42
mgyoo86 merged 2 commits intomasterfrom
fix/type_stability

Conversation

@mgyoo86
Copy link
Copy Markdown
Member

@mgyoo86 mgyoo86 commented Apr 2, 2026

Problem

get_task_local_pool(), get_task_local_cuda_pool(), and get_task_local_metal_pool() were type-unstable because their return type assertions used bare abstract types without parametric type parameters:

# Before (type-unstable)
return pool::AdaptiveArrayPool           # missing {RUNTIME_CHECK}
return pool::CuAdaptiveArrayPool         # missing {RUNTIME_CHECK}
return pool::MetalAdaptiveArrayPool      # missing {RUNTIME_CHECK, METAL_STORAGE}

Since task_local_storage() returns IdDict{Any,Any}, the compiler cannot infer concrete return types without explicit parametric assertions. This caused runtime boxing on every pool access — directly undermining the zero-allocation goal.

Root Cause

Julia's type system treats parametric types as invariant: AdaptiveArrayPool{0} <: AdaptiveArrayPool is true, but the compiler cannot narrow ::AdaptiveArrayPool to ::AdaptiveArrayPool{0} without the explicit parameter. The same applied to Dict value types used for multi-device GPU pools.

Fix

Added concrete type parameters to all return assertions and Dict value types:

Backend Before After
CPU ::AdaptiveArrayPool ::AdaptiveArrayPool{RUNTIME_CHECK}
CUDA ::CuAdaptiveArrayPool ::CuAdaptiveArrayPool{RUNTIME_CHECK}
Metal ::MetalAdaptiveArrayPool ::MetalAdaptiveArrayPool{RUNTIME_CHECK, METAL_STORAGE}
CUDA Dict Dict{Int, CuAdaptiveArrayPool} Dict{Int, CuAdaptiveArrayPool{RUNTIME_CHECK}}
Metal Dict Dict{UInt64, MetalAdaptiveArrayPool} Dict{UInt64, MetalAdaptiveArrayPool{RUNTIME_CHECK, METAL_STORAGE}}

Tests

Added @inferred regression tests for all three backends, covering both code paths:

  • Fast path: pool already exists in task-local storage (cache hit)
  • Slow path: fresh Task where pool must be created (Threads.@spawn)

Changed Files

  • src/task_local_pool.jl — CPU return type assertion
  • ext/AdaptiveArrayPoolsCUDAExt/task_local_pool.jl — CUDA return + Dict types
  • ext/AdaptiveArrayPoolsMetalExt/task_local_pool.jl — Metal return + Dict types
  • test/test_task_local_pool.jl — CPU @inferred tests
  • test/cuda/test_extension.jl — CUDA @inferred tests + isa fixes
  • test/metal/test_task_local_pool.jl — Metal @inferred tests + isa fixes
  • test/metal/runtests.jl — import METAL_STORAGE const for tests

Return type assertions on get_task_local_*_pool() were missing type
parameters, causing type instability when retrieved from task-local
storage (IdDict{Any,Any}). Added concrete parametric types to all
return assertions and Dict value types. Added @inferred regression
tests for all three backends covering both fast path (existing pool)
and slow path (fresh task creation).
@codecov
Copy link
Copy Markdown

codecov bot commented Apr 2, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 95.48%. Comparing base (843e7f4) to head (b6330d7).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##           master      #42   +/-   ##
=======================================
  Coverage   95.48%   95.48%           
=======================================
  Files          14       14           
  Lines        3232     3232           
=======================================
  Hits         3086     3086           
  Misses        146      146           
Files with missing lines Coverage Δ
src/task_local_pool.jl 100.00% <100.00%> (ø)
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes type instability in task-local pool accessors by asserting concrete parametric pool types (CPU/CUDA/Metal) and using concretely-typed per-device pool dictionaries, preventing boxing on pool access and preserving the library’s zero-allocation goals.

Changes:

  • Add concrete parametric type assertions to get_task_local_pool, get_task_local_cuda_pool, and get_task_local_metal_pool.
  • Make task-local per-device GPU pool dictionaries concretely typed by pool parameters.
  • Add @inferred regression tests for fast-path (cache hit) and slow-path (fresh Task) access for CPU/CUDA/Metal.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
src/task_local_pool.jl Tightens return type assertion for CPU task-local pool to AdaptiveArrayPool{RUNTIME_CHECK}.
ext/AdaptiveArrayPoolsCUDAExt/task_local_pool.jl Uses Dict{Int, CuAdaptiveArrayPool{RUNTIME_CHECK}} and asserts concrete return type for CUDA task-local pool.
ext/AdaptiveArrayPoolsMetalExt/task_local_pool.jl Uses Dict{UInt64, MetalAdaptiveArrayPool{RUNTIME_CHECK, METAL_STORAGE}} and asserts concrete return type for Metal task-local pool.
test/test_task_local_pool.jl Adds @inferred coverage for CPU task-local pool fast/slow paths.
test/cuda/test_extension.jl Adds @inferred coverage for CUDA task-local pool and updates Dict isa checks to concrete parametric type.
test/metal/test_task_local_pool.jl Adds @inferred coverage for Metal task-local pool and updates Dict isa checks to concrete parametric type.
test/metal/runtests.jl Imports METAL_STORAGE from the extension for parametric type checks in tests.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@mgyoo86 mgyoo86 merged commit a636311 into master Apr 2, 2026
14 checks passed
@mgyoo86 mgyoo86 deleted the fix/type_stability branch April 2, 2026 03:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants