[IR Container] Phase 2.1 Ownership Fusion Special Vals & Axioms#5954
[IR Container] Phase 2.1 Ownership Fusion Special Vals & Axioms#5954mdavis36 wants to merge 9 commits intomd/phase2-ir-refactorfrom
Conversation
|
!test |
|
Review updated until commit 3693e3d Description
|
| Relevant files | |||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| Enhancement |
| ||||||||||
| Bug fix |
| ||||||||||
| Tests |
|
PR Reviewer Guide
Here are some key observations to aid the review process:
| 🧪 PR contains tests |
| ⚡ Recommended focus areas for review |
Memory ownership clarity
|
50c6cd1 to
edcb6dc
Compare
|
!test |
1 similar comment
|
!test |
8112846 to
1588d37
Compare
|
!test |
1588d37 to
cf4ae4a
Compare
|
!test |
Greptile SummaryThis PR successfully moves per- Key changes verified:
All existing behavior is preserved for the single-Fusion case, and the two explicit bug fixes ( Confidence Score: 5/5
Sequence DiagramsequenceDiagram
participant Fusion
participant IrContainer
participant IrBuilder
Note over Fusion,IrBuilder: Lazy special-val creation (new design)
Fusion->>Fusion: zeroVal() — zero_val_ == nullptr
Fusion->>IrBuilder: createInContainer(this, 0L, Index)
IrBuilder->>IrContainer: vals_up_.push_back(unique_ptr<Val>)
IrBuilder-->>Fusion: Val* stored as zero_val_
Note over Fusion,IrContainer: StatementGuard rollback
Fusion->>Fusion: removeStatementsCreatedAfter(num_exprs, num_vals)
Fusion->>Fusion: null zero_val_/one_val_/true_val_/false_val_/magic_zero_val_ if about to be destroyed
Fusion->>IrContainer: vals_.erase + vals_up_.pop_back per destroyed val
Note over Fusion,IrContainer: Fusion copy
Fusion->>IrContainer: IrContainer::copy clones all vals_up_ including special vals
Fusion->>Fusion: remap to->zero_val_ via ir_cloner cache lookup
Fusion->>Fusion: clone axioms_ and metadata_ entries
Last reviewed commit: 4111350 |
cf4ae4a to
7820c63
Compare
6f28a71 to
2b3c4a4
Compare
|
!test |
Moved special values (`zero_val_`, `one_val_`, `true_val_`,
`false_val_`, `magic_zero_val_`) from `IrContainer` to the `Fusion`
class. This ensures that with shared containers, each Fusion has its own
special values, preventing ownership conflicts when one Fusion is
destroyed.
**Option Implemented:** Option A (Move Special Values to Fusion) as
recommended in the prompt.
Added private members and public accessors to Fusion class:
```cpp
// Phase 2: Per-Fusion special values
// With shared containers, each Fusion needs its own special values.
// These are raw pointers - memory is owned by IrContainer's vals_up_.
// Destroying this Fusion removes these vals via
removeStatementsOwnedBy().
Val* zero_val_ = nullptr;
Val* one_val_ = nullptr;
Val* true_val_ = nullptr;
Val* false_val_ = nullptr;
NamedScalar* magic_zero_val_ = nullptr;
```
Public accessors:
- `Val* zeroVal()` - Returns Index 0
- `Val* oneVal()` - Returns Index 1
- `Val* falseVal()` - Returns Bool false
- `Val* trueVal()` - Returns Bool true
- `NamedScalar* magicZeroVal()` - Returns magic zero named scalar
- `Val* zeroVal(DataType dtype)` - Returns 0 for specified dtype
- `Val* oneVal(DataType dtype)` - Returns 1 for specified dtype
Implemented lazy creation pattern for all special value accessors:
```cpp
Val* Fusion::zeroVal() {
if (!zero_val_) {
zero_val_ = IrBuilder::createInContainer<Val>(this, 0L,
DataType::Index);
}
return zero_val_;
}
// Similar implementations for oneVal(), falseVal(), trueVal(),
magicZeroVal()
```
Updated `Fusion::clear()` to reset special value pointers:
```cpp
// Reset per-Fusion special values (they'll be recreated lazily if
needed)
// The actual Val objects were removed by removeStatementsOwnedBy above.
zero_val_ = nullptr;
one_val_ = nullptr;
true_val_ = nullptr;
false_val_ = nullptr;
magic_zero_val_ = nullptr;
```
Removed special value members and added documentation comment:
```cpp
// Note: Special values (zero_val_, one_val_, true_val_, false_val_,
// magic_zero_val_) are now per-Fusion, stored in Fusion class.
// This avoids ownership conflicts when multiple Fusions share an
IrContainer.
// See Fusion::zeroVal(), etc. for the per-Fusion implementation.
```
Removed special value accessor implementations (they're now in Fusion).
All call sites were already updated to use `fusion->zeroVal()` instead
of `ir_container()->zeroVal()`. Verified with grep that no call sites
remain using the old pattern.
Added 8 new unit tests for Task 7:
1. **PerFusionSpecialValuesBasic** - Tests that special values are
created and owned by the Fusion
2. **SpecialValuesOwnedByFusion** - Tests that special values are
tracked in `ownedVals()`
3. **SeparateFusionsHaveOwnSpecialValues** - Tests that two Fusions have
different special value objects
4. **DestroyFusionDoesNotAffectOther** - Tests that destroying one
Fusion doesn't affect another's special values
5. **SpecialValuesLazyCreation** - Tests that same value is returned on
repeated calls
6. **AllSpecialValuesPerFusion** - Tests all five special value
accessors
7. **SpecialValuesClearedOnFusionClear** - Tests that `clear()` resets
special values
8. **SpecialValuesWithDtype** - Tests `zeroVal(dtype)` and
`oneVal(dtype)` accessors
```
[==========] Running 34 tests from 3 test suites.
[ PASSED ] 34 tests.
```
```
[==========] Running 26 tests from 1 test suite.
[ PASSED ] 26 tests.
```
Including 8 new Task 7 tests:
- `Phase2ContainerTest.PerFusionSpecialValuesBasic` - PASSED
- `Phase2ContainerTest.SpecialValuesOwnedByFusion` - PASSED
- `Phase2ContainerTest.SeparateFusionsHaveOwnSpecialValues` - PASSED
- `Phase2ContainerTest.DestroyFusionDoesNotAffectOther` - PASSED
- `Phase2ContainerTest.SpecialValuesLazyCreation` - PASSED
- `Phase2ContainerTest.AllSpecialValuesPerFusion` - PASSED
- `Phase2ContainerTest.SpecialValuesClearedOnFusionClear` - PASSED
- `Phase2ContainerTest.SpecialValuesWithDtype` - PASSED
- `csrc/fusion.h` - Added special value members and accessors
- `csrc/fusion.cpp` - Added accessor implementations, updated `clear()`
- `csrc/ir/container.h` - Removed special values, added comment
- `csrc/ir/container.cpp` - Removed accessor implementations
- `tests/cpp/test_phase2_container_sharing.cpp` - Added 8 unit tests
- [x] Each Fusion has its own special values
- [x] Destroying Fusion A doesn't affect Fusion B's special values
- [x] Special value accessors (`zeroVal()`, `oneVal()`, etc.) return
this Fusion's values
- [x] Lazy creation still works (create on first access)
- [x] Smoke tests pass (34/34)
- [x] Unit tests added (8 tests)
- [x] Unit tests pass (26/26 Phase 2 tests)
- [x] Code compiles without errors
- [x] REPORT.md delivered
1. **Memory ownership:** Special values are raw pointers stored in
Fusion, but the actual memory is owned by IrContainer's `vals_up_`. When
a Fusion is destroyed, `removeStatementsOwnedBy()` cleans up these vals.
2. **Lazy creation pattern:** Special values are created on first
access. This matches the original IrContainer behavior and avoids
creating values that aren't needed.
3. **Clear handling:** `Fusion::clear()` now resets special value
pointers to nullptr after `removeStatementsOwnedBy()` removes the actual
Val objects. This ensures lazy recreation works correctly after clear.
4. **Copy/move handling:** Will be addressed in Tasks 5 and 6. This task
just moves the members and accessors.
Moved `axioms_` and `metadata_` from `IrContainer` to the `Fusion` class. This completes the deprecation of `parent_` usage for val-creating methods, which was necessary because `parent_` implies a 1-1 relationship (container → Fusion), but Phase 2 has 1-many (shared containers). Methods that used `parent_` to create vals were moved to Fusion: - `metadataOf(Val*)` - Now uses `v->container()` to get owning Fusion - `axioms()` - Now creates axiom vals owned by `this` Fusion - `assumePositive/assumeNonNegative` - Now adds to `this` Fusion's axioms - Added `axioms_` and `metadata_` private members - Changed method declarations from forwarding to actual implementations - Added includes for `ir/builder.h` and `ir/internal_nodes.h` - Implemented `metadataOf()`, `axioms()`, `assumePositive()`, `assumeNonNegative()` methods - Updated `clear()` to reset `axioms_` and `metadata_` - Removed `metadataOf()`, `axioms()`, `assumePositive()`, `assumeNonNegative()` declarations - Removed `lazyInitAxioms()` declaration - Removed `axioms_` and `metadata_` members - Removed implementations of above methods - Updated `IrContainer::swap` to remove axioms_/metadata_ swapping - Updated `IrContainer::copy` to remove axioms_/metadata_ handling - Updated `IrContainer::clear` to remove axioms_/metadata_ clearing Each Fusion now has its own axioms and metadata cache. This ensures: 1. No ownership conflicts when multiple Fusions share an IrContainer 2. Correct behavior when one Fusion is destroyed (doesn't affect others) 3. Lazy creation pattern preserved (create on first access) This is a prerequisite for the copy/move semantics changes which will swap/transfer these per-Fusion members.
- Add missing swap of axioms_ and metadata_ in Fusion::swap to prevent dangling pointers after move/assignment - Add missing cloning of axioms_ and metadata_ in Fusion::copy to preserve custom assumptions and metadata cache across copies - Guard Fusion::removeVal against removing cached special vals - Use std::unique_ptr for special vals and steal from vals_up_ to preserve the original invariant (shortcuts in vals_ but not vals_up_) - Fix metadataOf to use 'this' instead of v->container()
The old IrContainer approach popped special vals (zeroVal, oneVal, etc.) from vals_up_ after creation. During Fusion::copy, these vals were not cloned through the normal deterministic_vals() path. Instead, they were first cloned during axiom cloning, which happened AFTER val_type_name_map_ was overridden from the source — causing the name counter to be incremented 1 past the source value. Now that special vals remain in vals_up_, they are properly cloned before the counter override, so the counter stays accurate. This shifts loop index val names down by 1 (e.g., i113 instead of i114). The index expression structure is unchanged.
Special vals (trueVal, falseVal, oneVal, etc.) can be lazily created inside a StatementGuard scope (e.g. by simplifyExpr called from haveDifferentShardings). When the guard rolls back, it pops vals_up_ back to the snapshot, destroying those vals while the Fusion cache pointers still reference them. Subsequent calls return dangling pointers causing UB — this manifested as LoopShardedSplitReshapeIds incorrectly classifying a reshape as resharding on CI. Fusion::removeStatementsCreatedAfter now nulls out any special val cache pointers that are about to be destroyed, so they get re-created on next access.
SubstituteInExpr directly sets mutations_[reference] = substitute
without checking reference == substitute, unlike registerMutation
which guards against this. With per-Fusion special vals, Fusion::copy
now maps zero_val_ through the cloner so that IterDomain extents and
zero_val_ share the same pointer. When concretizeEmptyExtents finds
an extent that IS zero_val_, SubstituteInExpr created a self-mapping
that tripped the two-hop assertion in maybeMutated.
Why this didn't happen before:
Old code (main):
zero_val_ was stored in a separate unique_ptr, popped from
vals_up_. Fusion::copy didn't wire it up — B->zeroVal() lazily
created a brand new Val, so ext != zero always held.
New code (this branch):
zero_val_ lives in vals_up_ like any other Val. Fusion::copy
remaps it via ir_cloner.clone(), so B->zero_val_ IS the same
cloned Val that IterDomain extents reference:
Fusion A Fusion B (clone)
┌─────────────────┐ ┌──────────────────┐
│ zero_val_ ─► 0x1111 │ zero_val_ ─► 0x2222
│ id->extent() ─► 0x1111 │ id->extent() ─► 0x2222
└─────────────────┘ └──────────────────┘
clone maps 0x1111 → 0x2222
So ext == zero, and SubstituteInExpr(ext, zero) created:
mutations_[0x2222] = 0x2222 (self-mapping)
Then maybeMutated looked up 0x2222, found itself, treated
it as a two-hop chain, and asserted.
2b3c4a4 to
eefd453
Compare
3693e3d to
4111350
Compare
|
!test |
wujingyue
left a comment
There was a problem hiding this comment.
special values (
zero_val_,one_val_,true_val_,false_val_,magic_zero_val_)
Why are these scalar values per Fusion not per IrContainer?
For Phase 2 Don't we need to move these back to IrContainer in Phase 3 with scalar sharing?
Since special values have a known static value it is safe to maintain them uniquely for each Fusion. |
That's true but I suspect that has been a premature optimization. Creating them in a shared IrContainer will likely
(Non-blocking of course) |
This was my first instinct too. I thought about making all special values eagerly initialized during IrContainer construction however this still hits the lock contention issue since you need to check / initialize registration of the special value on it's first use to take it from an "un-owned" state to an "owned" one to ensure you have registered the Fusion object with the pre-initialized Val (this relationship is necessary for checks like We've already seen a regression due to lock contention on parallel builds. My current approach to combat this (eventually) is to separate the unique and shared IR, with the hopes of reducing the IR nodes that we need to live in a thread safe container. Since the nature of the shared values is that they are used extensively throughout the IR, keeping them per fusion is easiest and will stop the flip-flopping of ownership between fusion and the shared container. |
Summary
Move per-Fusion state — special values (
zero_val_,one_val_,true_val_,false_val_,magic_zero_val_), axioms, and metadata — fromIrContainertoFusion. This eliminates the 1:1 relationship assumption betweenIrContainerandFusionthat would break when multiple Fusions share a container in Phase 2.Key Changes
Special values (
zero_val_,one_val_,true_val_,false_val_,magic_zero_val_):IrContainertoFusionas raw cache pointers (memory still owned by IrContainer'svals_up_)IrBuilder::createInContainerFusion::clear()nulls cache pointers so they re-create on next accessStatementGuardrollback nulls any special vals created within the guard scope to prevent dangling pointersAxioms and metadata (
axioms_,metadata_):IrContainertoFusionparent_to create vals (metadataOf,axioms,assumePositive,assumeNonNegative) reimplemented on FusionFusion::swapand cloned inFusion::copyBug fixes surfaced by the migration:
SubstituteInExprself-substitution: with per-Fusion special vals,zero_val_is now the same pointer as IterDomain extents (both go through IrCloner).SubstituteInExpr(ext, zero)whereext == zerocreated a self-mapping that tripped a two-hop assertion. Fixed by guarding againstreference == substitute.StatementGuardrollback: special vals lazily created inside a guard scope were destroyed on rollback while Fusion cache pointers still referenced them.Relationship to Phase 2
This is a hard prerequisite for shared containers. Without per-Fusion special vals:
With per-Fusion special vals, each Fusion's
zeroVal()returns its own independently-cached val. Destroying Fusion A has zero effect on Fusion B's special values.CI Risk
Low. Each commit is independently CI-clean. The migration preserves identical behavior for the single-Fusion case (all existing usage). The two bug fixes (
SubstituteInExpr,StatementGuardrollback) address real correctness issues that were latent in the old code and only surfaced because the migration changed pointer identity relationships.