feat: add ExitReason tracking for actor lifecycle by ElFantasma · Pull Request #163 · lambdaclass/spawned

ElFantasma · 2026-03-31T13:10:10Z

Summary

Adds ExitReason enum (Normal, Shutdown, Panic(String), Kill) with is_abnormal() method
Actors now report WHY they stopped, not just that they did
ActorRef::wait_exit() returns the exit reason (async in tasks, blocking in threads)
ActorRef::exit_reason() polls the reason without blocking
join() unchanged for backwards compatibility
Includes exit_reason example demonstrating all exit paths

Foundation for supervision trees (#131, #133). The watch channel changes from bool to Option<ExitReason> in tasks mode, and Mutex<bool> to Mutex<Option<ExitReason>> in threads mode. run_actor() now returns ExitReason on every exit path: clean stop, channel closure, panic in started(), and panic in handler.

Test plan

8 new tests (4 per execution mode): clean stop, panic in started, panic in handler, polling
All 67 existing tests pass (no regressions from watch channel type change)
Runnable example: cargo run -p exit_reason

github-actions · 2026-03-31T13:11:47Z

🤖 Kimi Code Review

The PR looks solid and correctly implements exit reason tracking for supervision. Here are my observations:

✅ Correctness & Safety

concurrency/src/threads/actor.rs:444-451 - The CompletionGuard drop implementation correctly handles mutex poisoning and uses std::mem::replace to move the reason out without leaving the struct in an invalid state. This is safe.

Panic handling - Both async and thread implementations correctly use catch_unwind/AssertUnwindSafe to trap panics and convert them to ExitReason::Panic, ensuring the completion channel always receives a reason.

Race safety - The exit_reason() polling methods are thread-safe:

Async version uses tokio::sync::watch (atomic refcounted)
Thread version uses Mutex with poisoning recovery

⚠️ Minor Issues

concurrency/src/tasks/actor.rs:385-392 - The fallback to ExitReason::Normal when rx.changed().await.is_err() is theoretically unreachable (since run_actor catches all panics), but acceptable as a defensive measure. Consider adding a comment or tracing::warn to indicate this is an unexpected state.

Unused variants - ExitReason::Kill and ExitReason::Shutdown are defined but never constructed. This is fine for a foundational PR, but consider adding #[doc(hidden)] or a TODO comment indicating these are reserved for future supervision implementation.

📝 Suggestions

concurrency/src/error.rs:11-17 - The is_abnormal() method logic is correct for Erlang-style supervision semantics (Transient/Permanent children). Consider documenting that Shutdown is considered normal to prevent confusion:

/// Returns `true` for exit reasons that should trigger a restart.
/// Note: Shutdown is considered normal (no restart).
pub fn is_abnormal(&self) -> bool { ... }

concurrency/src/threads/actor.rs:395-398 - The manual drop(guard) is slightly unidiomatic. You can simplify:

let _guard = CompletionGuard { completion, reason };
// Let it drop naturally at end of scope

🧪 Testing

The test coverage is comprehensive for the implemented functionality:

Clean shutdown path
Panic in started()
Panic in message handler
Polling exit_reason() while running

Missing (non-blocking): Test for panic in stopped() callback, though this is less critical since the actor has already effectively exited at that point.

Summary

The implementation is correct, thread-safe, and follows Rust best practices. The symmetry between async and thread backends is maintained properly. No blockers.

Automated review by Kimi (Moonshot AI) · kimi-k2.5 · custom prompt

github-actions · 2026-03-31T13:12:16Z

🤖 Codex Code Review

Findings:

High: stopped() panics are documented as ExitReason::Panic, but both runtimes currently discard that and report the previous reason, which is usually Normal. That means a crash in shutdown logic is invisible to supervisors using wait_exit() / exit_reason(). See concurrency/src/tasks/actor.rs:506 and concurrency/src/threads/actor.rs:479.
Medium: The thread backend no longer guarantees completion notification if the actor thread unwinds unexpectedly. CompletionGuard is created only after run_actor returns, so any uncaught panic before that leaves completion as None forever and join() / wait_exit() can block indefinitely. The old placement of the guard was safer. See concurrency/src/threads/actor.rs:412 and concurrency/src/threads/actor.rs:241.
Medium: The async backend treats a closed completion watch channel as ExitReason::Normal. If the task is aborted or panics before completion_tx.send(Some(reason)), callers will misclassify an abnormal termination as a clean stop, which is the wrong default for supervision logic. See concurrency/src/tasks/actor.rs:383 and concurrency/src/tasks/actor.rs:433.

Open question:

ExitReason::Shutdown and ExitReason::Kill are added as public API, but I don’t see any path that can actually produce them yet. If that’s intentional for future supervision work, the docs should say they are reserved; otherwise the enum currently over-promises semantics. See concurrency/src/error.rs:1.

Verification gap: I could not run cargo test in this environment because Cargo attempted to reach index.crates.io and network access is blocked.

Automated review by OpenAI Codex · gpt-5.4 · custom prompt

github-actions · 2026-03-31T13:13:08Z

🤖 Claude Code Review

PR #163 Review: `feat: add ExitReason tracking for actor lifecycle`

Summary

Solid foundational work for supervision trees. The design is clean, the API is symmetric between tasks and threads, and the test coverage is good. A few issues worth addressing before merging.

Bug: Panic in `stopped()` doesn't update `exit_reason`

concurrency/src/tasks/actor.rs ~L505 and concurrency/src/threads/actor.rs ~L480

let stop_result = AssertUnwindSafe(actor.stopped(&ctx)).catch_unwind().await;
if let Err(panic) = stop_result {
    let msg = panic_message(&panic);
    tracing::error!("Panic in stopped() callback: {msg}");
    // exit_reason is NOT updated here
}
exit_reason  // returns Normal when stopped() panicked on a clean actor

If an actor exits cleanly (exit_reason = Normal) but stopped() panics, the caller receives ExitReason::Normal. The stopped() panic is logged but silently dropped from the exit reason. This matters for supervision: a supervisor seeing Normal would not restart a child that actually panicked on cleanup.

Fix: Update exit_reason if stopped() panics and exit_reason is still Normal:

if let Err(panic) = stop_result {
    let msg = panic_message(&panic);
    tracing::error!("Panic in stopped() callback: {msg}");
    if matches!(exit_reason, ExitReason::Normal) {
        exit_reason = ExitReason::Panic(format!("panic in stopped(): {msg}"));
    }
}

`Shutdown` and `Kill` variants are never produced

These variants are defined in ExitReason but no code path ever sets them. This leaves callers unable to distinguish Shutdown from Normal in practice, and Kill from Panic. If these aren't wired up in this PR, they should either have #[doc(hidden)] or a doc comment like // Not yet produced; reserved for supervision tree implementation.

Missing `PartialEq` derive

concurrency/src/error.rs L3

#[derive(Debug, Clone)]
pub enum ExitReason {

Without PartialEq, supervision code can't do reason == ExitReason::Normal. The stated purpose is to enable supervision trees (#131, #133), so this will immediately be needed. Add PartialEq now.

`panic_message` is duplicated

concurrency/src/tasks/actor.rs and concurrency/src/threads/actor.rs both define identical panic_message functions. This belongs in error.rs as a pub(crate) function.

Clippy lint: `&Box<dyn Any>` parameter

Both copies of panic_message take &Box<dyn std::any::Any + Send>, which clippy flags as clippy::borrowed_box. The idiomatic signature is:

fn panic_message(panic: &(dyn std::any::Any + Send)) -> String {

The call sites work the same way since Box<T> derefs to T.

Redundant `drop(guard)` in threads

concurrency/src/threads/actor.rs ~L415

let guard = CompletionGuard { completion, reason };
drop(guard);  // redundant — guard drops at end of closure anyway

The explicit drop adds noise without clarity. The old _guard RAII pattern was cleaner; if the explicit call is meant to make the intent obvious, a comment would be better.

Theoretical race in polling test

concurrency/src/threads/actor.rs (and tasks) test exit_reason_poll_returns_none_while_running

let actor = Counter { count: 0 }.start();
assert!(actor.exit_reason().is_none());  // racy: actor could have exited already

In practice Counter won't self-stop, but the assertion is non-deterministic by construction. A more robust pattern would be to check is_none() only while a long-running actor is inflight, or to replace it with a paired wait_exit/exit_reason assertion.

Minor: `mem::replace` in `CompletionGuard::drop`

concurrency/src/threads/actor.rs ~L243

*completed = Some(std::mem::replace(&mut self.reason, ExitReason::Normal));

This works (you can't move out of self in drop), but it's subtle. A cleaner alternative is to store reason: Option<ExitReason> and use .take():

*completed = self.reason.take();

What's good

The watch::Receiver<Option<ExitReason>> choice for tasks is idiomatic and avoids a second channel.
The Condvar-based polling in threads is correct; the mutex is held across cvar.wait as required.
join() delegating to wait_exit() preserves backward compatibility cleanly.
Test coverage is thorough for the paths that are actually implemented.
panic_message extracting a human-readable string from the panic payload is a meaningful improvement over {panic:?}.

Automated review by Claude (Anthropic) · sonnet · custom prompt

greptile-apps · 2026-03-31T13:15:28Z

Greptile Summary

This PR adds ExitReason tracking to the actor lifecycle, changing the internal completion signal from a bare bool/watch<bool> to Option<ExitReason>, and exposing wait_exit() / exit_reason() on ActorRef for both the tasks and threads execution modes. It is a well-scoped foundation for the upcoming supervision tree work.

Key changes:

New ExitReason enum (Normal, Shutdown, Panic(String), Kill) with is_abnormal() and Display in error.rs
run_actor() now returns ExitReason on every exit path in both tasks/actor.rs and threads/actor.rs
ActorRef::wait_exit() and ActorRef::exit_reason() added to both modules; join() now delegates to wait_exit()
8 new tests (4 per mode) covering clean stop, panic-in-started, panic-in-handler, and polling
New exit_reason example demonstrating all four scenarios

Issues found:

Threads RAII regression (P1): The CompletionGuard is now constructed after run_actor returns. If run_actor ever panics unexpectedly at the thread boundary, the guard is never built, cvar.notify_all() is never called, and any caller blocked in wait_exit() deadlocks forever.
Task abort maps to Normal (P2): If the async task is cancelled/aborted externally, wait_exit() returns ExitReason::Normal — silently misclassifying an aborted actor.
stopped() panics not reflected in exit reason (P2): A clean-stop actor whose stopped() panics still reports Normal.
Missing PartialEq on ExitReason (P2): Supervision trees will want == comparisons.
panic_message duplicated (P2): Identical helper defined in both actor modules.

Confidence Score: 4/5

Safe to merge with the RAII regression in threads mode addressed; all other findings are P2 style/hardening concerns

One P1 issue exists: the threads CompletionGuard is no longer constructed under RAII, meaning an unexpected panic in run_actor at the thread boundary would cause wait_exit() to deadlock. While run_actor catches all known panics internally (making this unlikely in practice), it is a regression from the previous code's safety guarantee. All other findings are P2 quality/robustness suggestions that do not block current functionality.

concurrency/src/threads/actor.rs — the CompletionGuard construction and run_actor call ordering needs a top-level catch_unwind guard to restore the old RAII safety

Important Files Changed

Filename	Overview
concurrency/src/threads/actor.rs	Mutex changed to `Option<ExitReason>`; `CompletionGuard` refactored to carry reason, but now constructed after `run_actor` — breaking the RAII guarantee that prevented `wait_exit()` deadlocks on unexpected panics
concurrency/src/tasks/actor.rs	Watch channel changed from `bool` to `Option<ExitReason>`; `wait_exit()` added; `run_actor` returns `ExitReason`; `panic_message` duplicated from threads module; `wait_exit` maps dropped sender to `Normal` even for task aborts
concurrency/src/error.rs	Adds `ExitReason` enum with `Normal`, `Shutdown`, `Panic(String)`, `Kill` variants, `is_abnormal()`, and `Display` impl; missing `PartialEq` derive limits supervision use
concurrency/src/lib.rs	Re-exports `ExitReason` alongside `ActorError` — straightforward one-line change
examples/exit_reason/src/main.rs	New runnable example demonstrating all four exit paths (clean stop, handler panic, started panic, polling); well-structured and accurate
examples/exit_reason/Cargo.toml	New example package manifest, correctly declares workspace dependencies

Sequence Diagram

sequenceDiagram
    participant Caller
    participant ActorRef
    participant run_actor
    participant CompletionChannel

    Caller->>ActorRef: .start()
    ActorRef->>CompletionChannel: init(None)
    ActorRef->>run_actor: spawn(actor, ctx, rx, token)

    alt started() panics
        run_actor-->>run_actor: catch_unwind → ExitReason::Panic
    else started() calls ctx.stop()
        run_actor-->>run_actor: ExitReason::Normal
    else normal flow
        loop message loop
            run_actor->>run_actor: recv message
            alt handler panics
                run_actor-->>run_actor: exit_reason = Panic
                run_actor-->>run_actor: break
            else ctx.stop() called
                run_actor-->>run_actor: break (Normal)
            end
        end
        run_actor->>run_actor: stopped() [catch_unwind]
    end

    run_actor->>CompletionChannel: send(Some(ExitReason))
    CompletionChannel-->>ActorRef: notify

    Caller->>ActorRef: .wait_exit()
    ActorRef->>CompletionChannel: wait for Some(reason)
    CompletionChannel-->>ActorRef: Some(ExitReason)
    ActorRef-->>Caller: ExitReason

Prompt To Fix All With AI

This is a comment left during a code review.
Path: concurrency/src/threads/actor.rs
Line: 412-420

Comment:
**RAII guarantee broken — `wait_exit()` can deadlock if `run_actor` panics unexpectedly**

The old code constructed `CompletionGuard` *before* calling `run_actor`, so the guard's `Drop` impl would always fire (writing `Some(reason)` and calling `cvar.notify_all()`) even if `run_actor` panicked at the thread level:

```rust
// OLD — always notifies, even on unexpected panic
let _guard = CompletionGuard(completion);
run_actor(actor, ctx, rx, cancellation_token);
```

The new code constructs the guard *after* `run_actor` returns. If `run_actor` ever panics without being caught (e.g., a future bug in the framework, a stack overflow, or a `catch_unwind` boundary issue), the guard is never created, `completion` is just dropped as an `Arc`, and any thread blocked in `wait_exit()` → `cvar.wait(…)` will deadlock forever.

Note that the `tasks` mode is unaffected because dropping `completion_tx` causes `rx.changed()` to return `Err`, so `wait_exit()` returns `ExitReason::Normal` instead of hanging.

A safe fix is to wrap `run_actor` at the thread closure level:
```rust
let _thread_handle = rt::spawn(move || {
    let reason = std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| {
        run_actor(actor, ctx, rx, cancellation_token)
    }))
    .unwrap_or(ExitReason::Panic("unexpected framework panic".to_string()));

    let guard = CompletionGuard { completion, reason };
    drop(guard);
});
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: concurrency/src/tasks/actor.rs
Line: 383-393

Comment:
**`wait_exit()` silently maps task abortion/cancellation to `ExitReason::Normal`**

When `rx.changed().await` returns `Err`, it means every `watch::Sender` was dropped without ever sending `Some(reason)`. If the spawned task is **aborted** from outside (e.g., the tokio `JoinHandle` is used to abort, or the runtime shuts down mid-execution), the sender is dropped without a value and this branch returns `ExitReason::Normal` — even though the actor was forcibly killed.

For a supervision foundation this silent misclassification matters: a supervisor that only restarts on `is_abnormal()` would silently ignore an aborted actor. Consider returning `ExitReason::Kill` here instead:

```suggestion
            if rx.changed().await.is_err() {
                return ExitReason::Kill;
            }
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: concurrency/src/tasks/actor.rs
Line: 505-510

Comment:
**Panic in `stopped()` is not reflected in the returned `ExitReason`**

When `stopped()` panics, the error is logged but `exit_reason` is unchanged. Two scenarios arise:

1. **Clean stop + `stopped()` panics** → returns `ExitReason::Normal`. A supervisor relying on `is_abnormal()` won't restart the actor despite cleanup code crashing.
2. **Handler panic + `stopped()` panics** → returns the original `Panic("panic in handler: …")`, which is correct but hides the secondary cleanup failure.

The same logic applies to the equivalent lines in `concurrency/src/threads/actor.rs` (lines 479–485).

If panics in `stopped()` should be surfaced, consider:
```rust
if let Err(panic) = stop_result {
    let msg = panic_message(&panic);
    tracing::error!("Panic in stopped() callback: {msg}");
    if !exit_reason.is_abnormal() {
        exit_reason = ExitReason::Panic(format!("panic in stopped(): {msg}"));
    }
}
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: concurrency/src/error.rs
Line: 2-12

Comment:
**`ExitReason` is missing `PartialEq` — limits usability for supervision**

`ExitReason` derives `Debug` and `Clone` but not `PartialEq`. Supervision trees will almost certainly need to compare exit reasons directly. Without `PartialEq`, callers must use `matches!` patterns everywhere and cannot use `==` or `!=`.

```suggestion
#[derive(Debug, Clone, PartialEq)]
pub enum ExitReason {
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: concurrency/src/tasks/actor.rs
Line: 515-523

Comment:
**`panic_message` is duplicated across both actor modules**

This identical helper function is defined in both `concurrency/src/tasks/actor.rs` (lines 515–523) and `concurrency/src/threads/actor.rs` (lines 490–498). Consider moving it to `concurrency/src/error.rs` or a shared internal utilities module so it can be reused without duplication.

How can I resolve this? If you propose a fix, please make it concise.

_{Reviews (1): Last reviewed commit: "feat: add ExitReason tracking for actor ..." | Re-trigger Greptile}

greptile-apps · 2026-03-31T13:15:34Z

        let _thread_handle = rt::spawn(move || {
-            let _guard = CompletionGuard(completion);
-            run_actor(actor, ctx, rx, cancellation_token);
+            let reason = run_actor(actor, ctx, rx, cancellation_token);
+            let guard = CompletionGuard {
+                completion,
+                reason,
+            };
+            drop(guard);
        });



RAII guarantee broken — wait_exit() can deadlock if run_actor panics unexpectedly

The old code constructed CompletionGuard before calling run_actor, so the guard's Drop impl would always fire (writing Some(reason) and calling cvar.notify_all()) even if run_actor panicked at the thread level:

// OLD — always notifies, even on unexpected panic let _guard = CompletionGuard(completion); run_actor(actor, ctx, rx, cancellation_token);

The new code constructs the guard after run_actor returns. If run_actor ever panics without being caught (e.g., a future bug in the framework, a stack overflow, or a catch_unwind boundary issue), the guard is never created, completion is just dropped as an Arc, and any thread blocked in wait_exit() → cvar.wait(…) will deadlock forever.

Note that the tasks mode is unaffected because dropping completion_tx causes rx.changed() to return Err, so wait_exit() returns ExitReason::Normal instead of hanging.

A safe fix is to wrap run_actor at the thread closure level:

let _thread_handle = rt::spawn(move || { let reason = std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| { run_actor(actor, ctx, rx, cancellation_token) })) .unwrap_or(ExitReason::Panic("unexpected framework panic".to_string())); let guard = CompletionGuard { completion, reason }; drop(guard); });

Prompt To Fix With AI

This is a comment left during a code review. Path: concurrency/src/threads/actor.rs Line: 412-420 Comment: **RAII guarantee broken — `wait_exit()` can deadlock if `run_actor` panics unexpectedly** The old code constructed `CompletionGuard` *before* calling `run_actor`, so the guard's `Drop` impl would always fire (writing `Some(reason)` and calling `cvar.notify_all()`) even if `run_actor` panicked at the thread level: ```rust // OLD — always notifies, even on unexpected panic let _guard = CompletionGuard(completion); run_actor(actor, ctx, rx, cancellation_token); ``` The new code constructs the guard *after* `run_actor` returns. If `run_actor` ever panics without being caught (e.g., a future bug in the framework, a stack overflow, or a `catch_unwind` boundary issue), the guard is never created, `completion` is just dropped as an `Arc`, and any thread blocked in `wait_exit()` → `cvar.wait(…)` will deadlock forever. Note that the `tasks` mode is unaffected because dropping `completion_tx` causes `rx.changed()` to return `Err`, so `wait_exit()` returns `ExitReason::Normal` instead of hanging. A safe fix is to wrap `run_actor` at the thread closure level: ```rust let _thread_handle = rt::spawn(move || { let reason = std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| { run_actor(actor, ctx, rx, cancellation_token) })) .unwrap_or(ExitReason::Panic("unexpected framework panic".to_string())); let guard = CompletionGuard { completion, reason }; drop(guard); }); ``` How can I resolve this? If you propose a fix, please make it concise.

Fixed in 4ec9859.

greptile-apps · 2026-03-31T13:15:35Z

+    pub async fn wait_exit(&self) -> ExitReason {
        let mut rx = self.completion_rx.clone();
-        while !*rx.borrow_and_update() {
+        loop {
+            if let Some(reason) = rx.borrow_and_update().clone() {
+                return reason;
+            }
            if rx.changed().await.is_err() {
-                break;
+                return ExitReason::Normal;
            }
        }
    }


wait_exit() silently maps task abortion/cancellation to ExitReason::Normal

When rx.changed().await returns Err, it means every watch::Sender was dropped without ever sending Some(reason). If the spawned task is aborted from outside (e.g., the tokio JoinHandle is used to abort, or the runtime shuts down mid-execution), the sender is dropped without a value and this branch returns ExitReason::Normal — even though the actor was forcibly killed.

For a supervision foundation this silent misclassification matters: a supervisor that only restarts on is_abnormal() would silently ignore an aborted actor. Consider returning ExitReason::Kill here instead:

Suggested change

pub async fn wait_exit(&self) -> ExitReason {

let mut rx = self.completion_rx.clone();

while !*rx.borrow_and_update() {

loop {

if let Some(reason) = rx.borrow_and_update().clone() {

return reason;

}

if rx.changed().await.is_err() {

break;

return ExitReason::Normal;

}

}

}

if rx.changed().await.is_err() {

return ExitReason::Kill;

}

Prompt To Fix With AI

This is a comment left during a code review. Path: concurrency/src/tasks/actor.rs Line: 383-393 Comment: **`wait_exit()` silently maps task abortion/cancellation to `ExitReason::Normal`** When `rx.changed().await` returns `Err`, it means every `watch::Sender` was dropped without ever sending `Some(reason)`. If the spawned task is **aborted** from outside (e.g., the tokio `JoinHandle` is used to abort, or the runtime shuts down mid-execution), the sender is dropped without a value and this branch returns `ExitReason::Normal` — even though the actor was forcibly killed. For a supervision foundation this silent misclassification matters: a supervisor that only restarts on `is_abnormal()` would silently ignore an aborted actor. Consider returning `ExitReason::Kill` here instead: ```suggestion if rx.changed().await.is_err() { return ExitReason::Kill; } ``` How can I resolve this? If you propose a fix, please make it concise.

Fixed in 4ec9859.

greptile-apps · 2026-03-31T13:15:36Z

    cancellation_token.cancel();
    let stop_result = AssertUnwindSafe(actor.stopped(&ctx)).catch_unwind().await;
    if let Err(panic) = stop_result {
-        tracing::error!("Panic in stopped() callback: {panic:?}");
+        let msg = panic_message(&panic);
+        tracing::error!("Panic in stopped() callback: {msg}");
+    }


Panic in stopped() is not reflected in the returned ExitReason

When stopped() panics, the error is logged but exit_reason is unchanged. Two scenarios arise:

Clean stop + stopped() panics → returns ExitReason::Normal. A supervisor relying on is_abnormal() won't restart the actor despite cleanup code crashing.

Handler panic + stopped() panics → returns the original Panic("panic in handler: …"), which is correct but hides the secondary cleanup failure.

The same logic applies to the equivalent lines in concurrency/src/threads/actor.rs (lines 479–485).

If panics in stopped() should be surfaced, consider:

if let Err(panic) = stop_result { let msg = panic_message(&panic); tracing::error!("Panic in stopped() callback: {msg}"); if !exit_reason.is_abnormal() { exit_reason = ExitReason::Panic(format!("panic in stopped(): {msg}")); } }

Prompt To Fix With AI

This is a comment left during a code review. Path: concurrency/src/tasks/actor.rs Line: 505-510 Comment: **Panic in `stopped()` is not reflected in the returned `ExitReason`** When `stopped()` panics, the error is logged but `exit_reason` is unchanged. Two scenarios arise: 1. **Clean stop + `stopped()` panics** → returns `ExitReason::Normal`. A supervisor relying on `is_abnormal()` won't restart the actor despite cleanup code crashing. 2. **Handler panic + `stopped()` panics** → returns the original `Panic("panic in handler: …")`, which is correct but hides the secondary cleanup failure. The same logic applies to the equivalent lines in `concurrency/src/threads/actor.rs` (lines 479–485). If panics in `stopped()` should be surfaced, consider: ```rust if let Err(panic) = stop_result { let msg = panic_message(&panic); tracing::error!("Panic in stopped() callback: {msg}"); if !exit_reason.is_abnormal() { exit_reason = ExitReason::Panic(format!("panic in stopped(): {msg}")); } } ``` How can I resolve this? If you propose a fix, please make it concise.

Fixed in 4ec9859.

greptile-apps · 2026-03-31T13:15:38Z

+#[derive(Debug, Clone)]
+pub enum ExitReason {
+    /// Clean stop via `ctx.stop()` or channel closure.
+    Normal,
+    /// Ordered shutdown from a supervisor or linked actor.
+    Shutdown,
+    /// Actor panicked in `started()`, a handler, or `stopped()`.
+    Panic(String),
+    /// Untrappable kill signal.
+    Kill,
+}


ExitReason is missing PartialEq — limits usability for supervision

ExitReason derives Debug and Clone but not PartialEq. Supervision trees will almost certainly need to compare exit reasons directly. Without PartialEq, callers must use matches! patterns everywhere and cannot use == or !=.

Suggested change

#[derive(Debug, Clone)]

pub enum ExitReason {

/// Clean stop via `ctx.stop()` or channel closure.

Normal,

/// Ordered shutdown from a supervisor or linked actor.

Shutdown,

/// Actor panicked in `started()`, a handler, or `stopped()`.

Panic(String),

/// Untrappable kill signal.

Kill,

}

#[derive(Debug, Clone, PartialEq)]

pub enum ExitReason {

Prompt To Fix With AI

This is a comment left during a code review. Path: concurrency/src/error.rs Line: 2-12 Comment: **`ExitReason` is missing `PartialEq` — limits usability for supervision** `ExitReason` derives `Debug` and `Clone` but not `PartialEq`. Supervision trees will almost certainly need to compare exit reasons directly. Without `PartialEq`, callers must use `matches!` patterns everywhere and cannot use `==` or `!=`. ```suggestion #[derive(Debug, Clone, PartialEq)] pub enum ExitReason { ``` How can I resolve this? If you propose a fix, please make it concise.

Fixed in 4ec9859.

greptile-apps · 2026-03-31T13:15:39Z

+fn panic_message(panic: &Box<dyn std::any::Any + Send>) -> String {
+    if let Some(s) = panic.downcast_ref::<&str>() {
+        s.to_string()
+    } else if let Some(s) = panic.downcast_ref::<String>() {
+        s.clone()
+    } else {
+        format!("{panic:?}")
    }
 }


panic_message is duplicated across both actor modules

This identical helper function is defined in both concurrency/src/tasks/actor.rs (lines 515–523) and concurrency/src/threads/actor.rs (lines 490–498). Consider moving it to concurrency/src/error.rs or a shared internal utilities module so it can be reused without duplication.

Prompt To Fix With AI

This is a comment left during a code review. Path: concurrency/src/tasks/actor.rs Line: 515-523 Comment: **`panic_message` is duplicated across both actor modules** This identical helper function is defined in both `concurrency/src/tasks/actor.rs` (lines 515–523) and `concurrency/src/threads/actor.rs` (lines 490–498). Consider moving it to `concurrency/src/error.rs` or a shared internal utilities module so it can be reused without duplication. How can I resolve this? If you propose a fix, please make it concise.

Fixed in 4ec9859.

ElFantasma · 2026-03-31T14:02:15Z

Bot Review Responses

All bot findings have been addressed in 4ec9859 and f37e99b. Summary below.

Kimi

Fallback to ExitReason::Normal when rx.changed().await.is_err()

Fixed. Changed to ExitReason::Kill — if the watch sender is dropped without sending a reason, the actor was forcibly terminated, not cleanly stopped.

Manual drop(guard) is unidiomatic

Fixed. Restored RAII pattern with let mut guard = CompletionGuard { ... };.

Missing test for panic in stopped()

stopped() panics now update the exit reason (see Codex/Claude fix below). Existing panic_in_stopped_still_completes test covers the behavior.

Codex

High: stopped() panics are invisible to supervisors

Fixed. If stopped() panics and the exit reason was Normal, it's now updated to ExitReason::Panic("panic in stopped(): ..."). If the reason was already abnormal (e.g., handler panic), the original reason is preserved.

Medium: Thread backend CompletionGuard no longer guarantees completion notification

Fixed. Guard is now constructed BEFORE run_actor() with reason: None. If run_actor panics unexpectedly at the thread boundary, the guard's Drop sets ExitReason::Kill via .take().or(Some(Kill)). This restores the RAII safety guarantee from the old code.

Medium: Async backend treats closed watch channel as Normal

Fixed. Returns ExitReason::Kill instead.

Shutdown and Kill never produced

Added doc comments: "Not yet produced — reserved for supervision tree implementation."

Claude

Bug: Panic in stopped() doesn't update exit_reason

Fixed. Same as Codex finding above.

Missing PartialEq derive

Fixed. Added PartialEq + 3 new tests (exit_reason_partial_eq, exit_reason_display, exit_reason_is_abnormal_classification).

panic_message is duplicated

Fixed. Moved to error.rs as pub(crate) fn panic_message.

Clippy lint: &Box<dyn Any>

Fixed. Changed to &(dyn std::any::Any + Send).

Redundant drop(guard)

Fixed. See Kimi response above.

mem::replace in CompletionGuard::drop

Fixed. Now uses Option<ExitReason> + .take().

Theoretical race in polling test

Disregarded. The Counter actor never self-stops, so exit_reason() is deterministically None immediately after start(). The test is not racy in practice.

Greptile

All findings overlap with Codex/Claude above — all addressed. The sequence diagram and summary are accurate.

Oppen · 2026-04-06T16:51:01Z

+    /// Clean stop via `ctx.stop()` or channel closure.
+    Normal,
+    /// Ordered shutdown from a supervisor or linked actor.
+    /// Not yet produced — reserved for supervision tree implementation.
+    Shutdown,
+    /// Actor panicked in `started()`, a handler, or `stopped()`.
+    Panic(String),
+    /// Untrappable kill signal.
+    /// Not yet produced — reserved for supervision tree implementation.
+    Kill,


I'm not entirely sure about the variants. Specifically, behaviorally it seems like normal and shutdown are equivalent, and kill and panic differ only on the message.

Also, how will the untrappable kill signal be implemented? It sounds like its benefit would be that it works even if the actor is stuck in an endless loop, but that would require real preemption and the ability to direct signals to specific threads (which I'm not sure if it's doable) to work. Otherwise, you would just need to wait for the cancellation point and at that point you can probably just allow the shutdown process to run.

Good questions. 🤡

On Normal vs Shutdown: They are behaviorally identical today — both are "clean" exits where is_abnormal() returns false. The distinction matters for supervision: a Transient child should not restart on either, but the cause is different. Normal means the actor decided to stop on its own (ctx.stop()). Shutdown means something external told it to stop (a supervisor shutting down children in order, or a linked actor propagating an orderly shutdown). In Erlang, shutdown and {shutdown, Term} are distinct from normal for exactly this reason — they're both "clean" but carry different semantic meaning for debugging and logging. You want to know whether an actor stopped because it was done or because its supervisor told it to. That said, if in practice the distinction proves unnecessary, we can collapse them later.

On Kill and preemption: You're right that we can't preempt a stuck actor — Rust has no mechanism for that (no Thread::stop(), and tokio::JoinHandle::abort() only works at .await points). In practice, Kill means:

Cancel the token immediately (same as Shutdown)

Don't run stopped() callback — this is the key difference from Shutdown, which does run cleanup

Bypass trap_exit — a supervisor trapping exits receives Exit messages for Normal/Shutdown/Panic, but Kill terminates the actor without delivering a message. This matters for supervision trees: when a supervisor hits meltdown and needs to tear everything down immediately, it kills children without giving them a chance to interfere.

So Kill is not about preempting a stuck loop (that's a harder problem), but about skipping cleanup and bypassing exit trapping. Same as Erlang's exit(Pid, kill) — it doesn't magically preempt either, it just skips terminate/2 and can't be trapped.

The "stuck in an endless loop" problem is real but orthogonal — that's more in the territory of the Thread backend (where you could pthread_cancel in theory) or timeouts at the supervisor level (ShutdownType::Timeout + force drop). Worth discussing separately.

MegaRedHand · 2026-04-06T17:42:03Z

+    /// Untrappable kill signal.
+    /// Not yet produced — reserved for supervision tree implementation.
+    Kill,


The doc says "Not yet produced", but this PR produces Kill in two places: wait_exit() returns it when the watch channel sender drops, and CompletionGuard::drop() defaults to it when no reason was set.

Good catch — the doc was wrong. Fixed in bf225b9:

Kill doc updated to reflect it's currently produced as a fallback when an async task is aborted externally

Removed the "Not yet produced" note from Kill (kept it on Shutdown which truly is never produced yet)

CompletionGuard::drop fallback changed from Kill to Panic("unexpected framework panic") — see MegaRedHand's other comment

MegaRedHand · 2026-04-06T17:56:22Z

+        let (lock, cvar) = &*self.completion;
        let mut completed = lock.lock().unwrap_or_else(|p| p.into_inner());
-        *completed = true;
+        *completed = self.reason.take().or(Some(ExitReason::Kill));


Should we differentiate here between true Kills and this kind of unexpected failure?

Good point — we should differentiate. Fixed in bf225b9:

The CompletionGuard fallback now uses Panic("unexpected framework panic") instead of Kill. This is more accurate since this path is only hit if run_actor panics at the thread boundary (a framework bug), not an intentional kill.

Kill is now reserved for intentional untrappable termination from supervision (ShutdownType::BrutalKill).

…ures

feat: add ExitReason tracking for actor lifecycle

db3e882

greptile-apps bot reviewed Mar 31, 2026

View reviewed changes

ElFantasma added 2 commits March 31, 2026 10:51

fix: address review feedback on exit reason PR

4ec9859

style: apply cargo fmt

f37e99b

Oppen reviewed Apr 6, 2026

View reviewed changes

MegaRedHand reviewed Apr 6, 2026

View reviewed changes

fix: reserve Kill for intentional kills, use Panic for framework fail…

bf225b9

…ures

MegaRedHand approved these changes Apr 6, 2026

View reviewed changes

Oppen approved these changes Apr 6, 2026

View reviewed changes

ElFantasma merged commit 8958ec2 into main Apr 6, 2026
5 checks passed

Conversation

ElFantasma commented Mar 31, 2026

Summary

Test plan

Uh oh!

github-actions bot commented Mar 31, 2026

🤖 Kimi Code Review

✅ Correctness & Safety

⚠️ Minor Issues

📝 Suggestions

🧪 Testing

Summary

Uh oh!

github-actions bot commented Mar 31, 2026

🤖 Codex Code Review

Uh oh!

github-actions bot commented Mar 31, 2026

🤖 Claude Code Review

PR #163 Review: feat: add ExitReason tracking for actor lifecycle

Summary

Bug: Panic in stopped() doesn't update exit_reason

Shutdown and Kill variants are never produced

Missing PartialEq derive

panic_message is duplicated

Clippy lint: &Box<dyn Any> parameter

Redundant drop(guard) in threads

Theoretical race in polling test

Minor: mem::replace in CompletionGuard::drop

What's good

Uh oh!

greptile-apps bot commented Mar 31, 2026

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Sequence Diagram

Uh oh!

greptile-apps bot Mar 31, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Mar 31, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Mar 31, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Mar 31, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Mar 31, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ElFantasma commented Mar 31, 2026

Bot Review Responses

Kimi

Codex

Claude

Greptile

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ElFantasma Apr 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

PR #163 Review: `feat: add ExitReason tracking for actor lifecycle`

Bug: Panic in `stopped()` doesn't update `exit_reason`

`Shutdown` and `Kill` variants are never produced

Missing `PartialEq` derive

`panic_message` is duplicated

Clippy lint: `&Box<dyn Any>` parameter

Redundant `drop(guard)` in threads

Minor: `mem::replace` in `CompletionGuard::drop`

ElFantasma Apr 6, 2026 •

edited

Loading