fix(execution): queued execution finalization and async correlation by PlaneInABottle · Pull Request #3535 · simstudioai/sim

PlaneInABottle · 2026-03-12T10:31:54Z

Summary

This PR isolates the first two remediation steps from a larger webhook/async execution incident branch.

It does two things:

makes queued execution finalization happen inside the core execution path instead of relying on wrapper-level recovery
preserves request/execution correlation across queued async execution paths so background runs can be traced consistently

Problem

We saw async/queued executions complete their actual work while terminal-state finalization still depended on detached follow-up behavior outside the core execution path.

That creates two operational problems:

an execution can do the actual work, but fail to persist a durable terminal state if the wrapper process is interrupted or recycled at the wrong time
webhook/schedule/workflow async paths do not consistently preserve request/execution correlation, which makes queued runs harder to trace and reconcile once they leave the initial request path

Root cause

1) Terminal-state finalization was not fully owned by the core execution path

Some queued execution flows still depended on wrapper-side cleanup/recovery behavior to finish terminal logging/finalization. That means the execution engine and the durable terminal-state write were not fully coupled.

2) Async correlation was not propagated consistently across queued boundaries

Queued webhook/schedule/workflow execution paths did not always carry the same correlation information all the way through preprocessing, enqueueing, and background execution. That made it harder to connect:

the incoming request
the queued job
the final execution record

What changed

A. Finalize runs inside core execution

This PR moves terminal finalization responsibility into the core execution flow so the same code path that determines the execution outcome is also responsible for finalizing it durably.

Concretely, the changes ensure that:

terminal execution outcomes are finalized from core execution
wrapper-level duplicate/fallback finalization paths are reduced or guarded
timeout/cancellation cleanup runs on an awaited path, and cancellation state is always cleared
logging/session behavior is covered by targeted regression tests
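
The ownership shift listed above can be sketched roughly as follows. This is a minimal illustration, not the code in execution-core.ts: `runCore`, `TerminalStore`, and the `Outcome` shape are hypothetical names introduced here. The point it shows is that the function which determines the outcome also awaits the durable terminal write, instead of launching it as a detached promise.

```typescript
// Hypothetical types for illustration; not the actual sim interfaces.
type Outcome = { status: 'completed' | 'failed'; error?: string }

interface TerminalStore {
  writeTerminal(executionId: string, outcome: Outcome): Promise<void>
  clearCancellation(executionId: string): Promise<void>
}

async function runCore(
  executionId: string,
  work: () => Promise<void>,
  store: TerminalStore
): Promise<Outcome> {
  let outcome: Outcome = { status: 'completed' }
  try {
    await work()
  } catch (err) {
    outcome = { status: 'failed', error: err instanceof Error ? err.message : String(err) }
  }
  try {
    // Awaited: runCore does not return until the terminal state write has settled.
    await store.writeTerminal(executionId, outcome)
  } finally {
    // Cancellation state is cleared whether or not the write succeeded.
    await store.clearCancellation(executionId)
  }
  return outcome
}
```

With this shape, a wrapper that is interrupted after `runCore` resolves has nothing left to persist, which is the property the list above is aiming for.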

B. Preserve async correlation across queued execution paths

This PR also threads correlation data through queued execution paths so the same execution/request identity survives across:

workflow async execution
schedule async execution
webhook preprocessing
webhook background execution
Trigger.dev-backed async dispatch

Concretely, the changes add or preserve:

preassigned execution IDs where needed before queueing
request/execution correlation in preprocessing
correlation on queued payloads and metadata
regression coverage for async webhook/schedule/workflow correlation behavior
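
As a rough sketch of what carrying correlation across the queue boundary means: the identity is built once on the request path, attached to both the queued payload and the job metadata, and reused by the background run rather than regenerated. Every name below (`Correlation`, `QueuedJob`, `enqueue`, `runQueued`) is a hypothetical stand-in, not the types or functions in this PR.

```typescript
// Hypothetical shapes for illustration; the real payload types are in the diff.
type Source = 'workflow' | 'schedule' | 'webhook'

interface Correlation {
  executionId: string // assigned before queueing, reused by the background run
  requestId: string // identifies the incoming request
  source: Source
}

interface QueuedJob {
  payload: { workflowId: string; correlation: Correlation }
  metadata: { correlation: Correlation }
}

interface ExecutionRecord {
  executionId: string
  requestId: string
  source: Source
}

// Request path: attach the same correlation to payload and metadata.
function enqueue(workflowId: string, correlation: Correlation, queue: QueuedJob[]): void {
  queue.push({ payload: { workflowId, correlation }, metadata: { correlation } })
}

// Background path: reuse the queued identity instead of generating new IDs.
function runQueued(job: QueuedJob): ExecutionRecord {
  const { executionId, requestId, source } = job.payload.correlation
  return { executionId, requestId, source }
}
```

Because the final record is derived from the queued correlation, the incoming request, the queued job, and the execution record can all be joined on the same pair of IDs.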

Files / surfaces affected

Main execution/finalization surfaces:

apps/sim/lib/workflows/executor/execution-core.ts
apps/sim/lib/logs/execution/logging-session.ts
apps/sim/background/workflow-execution.ts
apps/sim/background/webhook-execution.ts
apps/sim/background/schedule-execution.ts

Async correlation / queueing surfaces:

apps/sim/app/api/workflows/[id]/execute/route.ts
apps/sim/app/api/schedules/execute/route.ts
apps/sim/lib/webhooks/processor.ts
apps/sim/lib/execution/preprocessing.ts
apps/sim/lib/core/async-jobs/types.ts
apps/sim/lib/core/async-jobs/backends/trigger-dev.ts

Why this PR is intentionally scoped this way

This PR is intentionally limited to the first two fixes from the broader incident work:

finalization durability
async correlation preservation

It does not include the later follow-up work around:

richer stuck-execution diagnostics
progress markers / last-block visibility
stale cleanup classification and reconciliation
later review-driven hardening passes

That split is intentional so this PR stays focused on the minimum behavior changes needed to:

make terminal finalization more reliable for queued runs
make async runs traceable across queue boundaries

Testing

Targeted tests were rerun on this isolated branch:

Finalization / execution-core coverage

bun --cwd apps/sim vitest run lib/workflows/executor/execution-core.test.ts lib/logs/execution/logging-session.test.ts

Async correlation coverage

bun --cwd apps/sim vitest run background/async-execution-correlation.test.ts background/async-preprocessing-correlation.test.ts lib/execution/preprocessing.test.ts lib/execution/preprocessing.webhook-correlation.test.ts "app/api/workflows/[id]/execute/route.async.test.ts" "app/api/schedules/execute/route.test.ts" "app/api/webhooks/trigger/[path]/route.test.ts"

Results on this PR branch:

9 test files passed
34 tests passed

Reviewer notes

The diff is large mostly because the webhook execution / processor surfaces sit at the boundary where correlation has to be preserved end-to-end.

The intended review focus is:

terminal-state ownership moving into core execution
correlation propagation through preprocessing + queueing + background execution
regression coverage matching those two behaviors

cursor · 2026-03-12T10:32:00Z

PR Summary

High Risk
Touches core workflow execution finalization and logging paths, which are critical for durable terminal states and billing/run-count side effects. Also changes async job payload/metadata across workflows, schedules, and webhooks, so regressions could impact tracing or queued execution behavior.

Overview
Moves terminal execution finalization into executeWorkflowCore by awaiting completion/cancellation/pause/error logging and always clearing cancellation state, instead of relying on fire-and-forget wrapper cleanup. Adds a core-level guard (wasExecutionFinalizedByCore) to prevent double-finalization when wrappers also attempt recovery.
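
A minimal sketch of how such a guard could work is shown below. The names `finalizedExecutionIds`, `markExecutionFinalizedByCore`, and `wasExecutionFinalizedByCore` appear in the review summaries for this PR, but the signatures and bodies here are assumptions, and `recoverInWrapper` is a hypothetical wrapper-side helper added only for illustration.

```typescript
// Names mirror those in the review summaries; signatures and bodies are assumed.
const finalizedExecutionIds = new Set<string>()

function markExecutionFinalizedByCore(executionId: string): void {
  finalizedExecutionIds.add(executionId)
}

function wasExecutionFinalizedByCore(executionId: string): boolean {
  return finalizedExecutionIds.has(executionId)
}

// Hypothetical wrapper-side recovery: run the fallback only when core has not
// already finalized this execution. Returns true if the fallback ran.
async function recoverInWrapper(
  executionId: string,
  fallbackFinalize: () => Promise<void>
): Promise<boolean> {
  if (wasExecutionFinalizedByCore(executionId)) return false
  await fallbackFinalize()
  return true
}
```

In this sketch the set only guards executions finalized within the same process, so it prevents duplicate writes from a wrapper's catch block but is not a cross-process lock.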

Threads a new AsyncExecutionCorrelation object (executionId/requestId + source-specific fields) through queue payloads, job metadata, preprocessing (triggerData), execution metadata, and logging so workflow/schedule/webhook async runs keep consistent correlation across enqueueing, background execution, and persisted logs.

Refactors LoggingSession completion to dedupe/retry completion attempts and updates ExecutionLogger completion to preserve any start-time correlation. Adds/updates focused tests covering correlation propagation and core finalization sequencing/fallbacks.
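
The dedupe/retry behavior can be illustrated with a small sketch. `CompletionGate` and `persist` are hypothetical names, not the actual `LoggingSession` API; the sketch assumes that concurrent callers share one in-flight attempt and that a rejected attempt is forgotten so a later call can try again.

```typescript
// Hypothetical class for illustration; not the actual LoggingSession.
class CompletionGate {
  private completionPromise: Promise<void> | null = null
  private readonly persist: () => Promise<void>

  constructor(persist: () => Promise<void>) {
    this.persist = persist
  }

  complete(): Promise<void> {
    // Dedupe: callers that arrive while an attempt is stored share it.
    if (this.completionPromise) return this.completionPromise
    this.completionPromise = this.persist().catch((err: unknown) => {
      // Retry: forget the failed attempt so the next call starts a new one.
      this.completionPromise = null
      throw err
    })
    return this.completionPromise
  }
}
```

A successful attempt stays cached, so repeated calls after success do not write again; only a rejection reopens the gate.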

Written by Cursor Bugbot for commit a8e6e0a.

greptile-apps · 2026-03-12T10:37:04Z

Greptile Summary

This PR fixes two related reliability issues in async execution paths: it moves execution finalization out of fire-and-forget wrappers into the core execution path so terminal state is durably written for webhook, schedule, and workflow runs, and it pre-assigns correlation identifiers at the point of enqueueing so background jobs can trace back to the originating request without generating new IDs.

Key changes:

execution-core.ts: finalizeExecutionOutcome and finalizeExecutionError replace the previous fire-and-forget void (async () => {...})() blocks with awaited calls. A module-level finalizedExecutionIds Set + markExecutionFinalizedByCore/wasExecutionFinalizedByCore prevent duplicate finalization in background wrappers.
logging-session.ts: Stores triggerData.correlation on the session and resets completionPromise to null on rejection (allowing retries after failure); buildCompletedExecutionData in logger.ts now preserves environment, trigger, and correlation from the start record through to the terminal record.
preprocessing.ts: Threads triggerData through all logPreprocessingError call-sites so preprocessing failure logs carry full correlation.
Background executors (schedule-execution.ts, workflow-execution.ts): Correctly add buildXxxCorrelation helpers, pass triggerData: { correlation } to both preprocessExecution and safeStart, and gate post-error finalization on wasExecutionFinalizedByCore.
webhook-execution.ts: Adds the same correlation propagation but is missing triggerData: { correlation } in its preprocessExecution call (unlike the other two executors), leaving preprocessing failure logs without correlation metadata. The file was also reformatted to double-quotes/semicolons, inconsistent with the rest of the codebase.
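
To make the webhook gap concrete, here is a simplified sketch of why omitting `triggerData: { correlation }` matters. The functions below are stand-ins that only borrow the names `preprocessExecution` and `logPreprocessingError` from the summary; their signatures, the `PreprocessOptions` shape, and the empty-ID failure condition are all assumptions made for illustration.

```typescript
// Simplified stand-ins; the real functions take more arguments and are not reproduced here.
interface Correlation {
  executionId: string
  requestId: string
}

interface PreprocessOptions {
  workflowId: string
  triggerData?: { correlation?: Correlation }
}

interface ErrorLog {
  message: string
  correlation?: Correlation
}

function logPreprocessingError(message: string, options: PreprocessOptions): ErrorLog {
  // The failure record can only carry correlation that the caller passed in.
  return { message, correlation: options.triggerData?.correlation }
}

function preprocessExecution(options: PreprocessOptions): ErrorLog | null {
  // Assumed failure condition, used only to trigger the error path in this sketch.
  if (options.workflowId === '') {
    return logPreprocessingError('workflow not found', options)
  }
  return null
}

const correlation: Correlation = { executionId: 'exec-1', requestId: 'req-1' }

// Schedule/workflow style call: the failure log keeps its correlation.
const withCorrelation = preprocessExecution({ workflowId: '', triggerData: { correlation } })

// Webhook style call as described above: the failure log has no correlation.
const withoutCorrelation = preprocessExecution({ workflowId: '' })
```

Under these assumptions, the two calls fail identically, but only the first failure can be traced back to the originating request.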

Confidence Score: 3/5

Safe to merge with two issues to address: a missing correlation argument in the webhook executor's preprocessing call and a quote-style inconsistency that will likely fail linting.
The core finalization logic is sound and a meaningful reliability improvement. The schedule and workflow paths are correctly and completely updated. The webhook path, however, has a gap — preprocessExecution is called without triggerData: { correlation }, breaking the correlation chain for webhook preprocessing failures in a way that's inconsistent with the other two executors. Additionally, webhook-execution.ts was reformatted to double-quotes with semicolons, which is inconsistent with the project's single-quote style and will fail linting. Both issues are straightforward to fix but should be resolved before merge.
Pay close attention to apps/sim/background/webhook-execution.ts — missing triggerData in preprocessExecution and inconsistent quote style throughout the file.

Important Files Changed

Filename	Overview
apps/sim/lib/workflows/executor/execution-core.ts	Core change: moves post-execution finalization from fire-and-forget to awaited calls, adds `finalizeExecutionOutcome`/`finalizeExecutionError` helpers, and introduces a module-level `finalizedExecutionIds` Set with potential unbounded growth in long-lived processes.
apps/sim/background/webhook-execution.ts	Adds correlation propagation and `wasExecutionFinalizedByCore` guard; however `preprocessExecution` is called without `triggerData: { correlation }` (unlike schedule and workflow executors), and the entire file was reformatted to double-quotes/semicolons inconsistent with the codebase style.
apps/sim/background/schedule-execution.ts	Correctly adds `buildScheduleCorrelation`, passes `triggerData: { correlation }` to `preprocessExecution`, adds an explicit `safeStart` with correlation, and gates post-error finalization on `wasExecutionFinalizedByCore`.
apps/sim/background/workflow-execution.ts	Correctly adds `buildWorkflowCorrelation`, passes `triggerData: { correlation }` in both `preprocessExecution` and `safeStart`, and guards the catch block with `wasExecutionFinalizedByCore`.
apps/sim/lib/logs/execution/logging-session.ts	Stores `triggerData.correlation` in a private field and now resets `completionPromise` to null on rejection — a design change that allows retried completions after failure, with a subtle risk of double-completion if internal guards aren't airtight.
apps/sim/lib/logs/execution/logger.ts	Refactors completion logic into `buildCompletedExecutionData`, which now preserves `environment`, `trigger`, and `correlation` from the initial start log through to the final completion record. Clean and correct.
apps/sim/lib/execution/preprocessing.ts	Threads `triggerData` through all error-logging paths so that preprocessing failures carry full correlation metadata when a `LoggingSession` error record is written.
apps/sim/lib/webhooks/processor.ts	Adds a pre-assigned correlation object in `checkWebhookPreprocessing` and `queueWebhookExecution`, and includes it in the job metadata; largely reformatted to double-quotes which is inconsistent with codebase style.