Skip to content

fix(execution): report cancellation durability truthfully#3550

Open
PlaneInABottle wants to merge 25 commits intosimstudioai:stagingfrom
PlaneInABottle:fix/cancel-durability
Open

fix(execution): report cancellation durability truthfully#3550
PlaneInABottle wants to merge 25 commits intosimstudioai:stagingfrom
PlaneInABottle:fix/cancel-durability

Conversation

@PlaneInABottle
Copy link

Summary

  • make the cancel-execution response reflect whether the Redis-backed cancellation marker was durably recorded
  • return durability metadata (durablyRecorded, reason) and avoid misleading success logging when persistence fails
  • fix the targeted cancellation tests by hoisting Vitest mocks in cancellation.test.ts

Problem / current behavior

  • the cancel-execution API previously implied success after attempting cancellation even when the Redis-backed marker was not durably recorded
  • when Redis was unavailable or the write failed, callers and operators could read the response/logging as a successful cancellation even though no durable cancellation marker existed
  • the targeted cancellation tests also needed a follow-up fix so the mocked module state was initialized correctly under Vitest

Proposed fix / behavior after

  • update the cancellation helper to return structured durability results instead of a bare boolean
  • make the cancel route set success based on whether the cancellation marker was durably recorded, and include durablyRecorded plus a failure reason in the response
  • downgrade the non-durable path from misleading success reporting to warning-level reporting
  • hoist the Vitest mocks in apps/sim/lib/execution/cancellation.test.ts so the targeted tests reliably validate the success and failure outcomes

Why this matters

  • prevents false-positive cancellation reporting
  • makes Redis availability/write failures visible and actionable to API consumers and operators
  • preserves regression coverage for the durability contract this branch introduces

Files changed

  • apps/sim/app/api/workflows/[id]/executions/[executionId]/cancel/route.ts
  • apps/sim/app/api/workflows/[id]/executions/[executionId]/cancel/route.test.ts
  • apps/sim/lib/execution/cancellation.ts
  • apps/sim/lib/execution/cancellation.test.ts

Tests run

  • cd /Users/Y_ALTAY1/Documents/simWoot/sim/apps/sim && bun run test -- lib/execution/cancellation.test.ts 'app/api/workflows/[id]/executions/[executionId]/cancel/route.test.ts'

Notes / non-goals / follow-ups

  • this PR is intentionally scoped to truthful cancellation durability reporting plus the follow-up test-only fix
  • it does not redesign broader cancellation orchestration, retries, or worker-side handling beyond surfacing whether the Redis-backed marker was durably written
  • the second commit on this branch is test-only and limited to hoisting Vitest mocks in cancellation.test.ts

waleedlatif1 and others added 24 commits February 16, 2026 00:36
…stash, algolia tools; isolated-vm robustness improvements, tables backend (simstudioai#3271)

* feat(tools): advanced fields for youtube, vercel; added cloudflare and dataverse tools (simstudioai#3257)

* refactor(vercel): mark optional fields as advanced mode

Move optional/power-user fields behind the advanced toggle:
- List Deployments: project filter, target, state
- Create Deployment: project ID override, redeploy from, target
- List Projects: search
- Create/Update Project: framework, build/output/install commands
- Env Vars: variable type
- Webhooks: project IDs filter
- Checks: path, details URL
- Team Members: role filter
- All operations: team ID scope

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style(youtube): mark optional params as advanced mode

Hide pagination, sort order, and filter fields behind the advanced
toggle for a cleaner default UX across all YouTube operations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* added advanced fields for vercel and youtube, added cloudflare and dataverse block

* addded desc for dataverse

* add more tools

* ack comment

* more

* ops

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* feat(tables): added tables (simstudioai#2867)

* updates

* required

* trashy table viewer

* updates

* updates

* filtering ui

* updates

* updates

* updates

* one input mode

* format

* fix lints

* improved errors

* updates

* updates

* chages

* doc strings

* breaking down file

* update comments with ai

* updates

* comments

* changes

* revert

* updates

* dedupe

* updates

* updates

* updates

* refactoring

* renames & refactors

* refactoring

* updates

* undo

* update db

* wand

* updates

* fix comments

* fixes

* simplify comments

* u[dates

* renames

* better comments

* validation

* updates

* updates

* updates

* fix sorting

* fix appearnce

* updating prompt to make it user sort

* rm

* updates

* rename

* comments

* clean comments

* simplicifcaiton

* updates

* updates

* refactor

* reduced type confusion

* undo

* rename

* undo changes

* undo

* simplify

* updates

* updates

* revert

* updates

* db updates

* type fix

* fix

* fix error handling

* updates

* docs

* docs

* updates

* rename

* dedupe

* revert

* uncook

* updates

* fix

* fix

* fix

* fix

* prepare merge

* readd migrations

* add back missed code

* migrate enrichment logic to general abstraction

* address bugbot concerns

* adhere to size limits for tables

* remove conflicting migration

* add back migrations

* fix tables auth

* fix permissive auth

* fix lint

* reran migrations

* migrate to use tanstack query for all server state

* update table-selector

* update names

* added tables to permission groups, updated subblock types

---------

Co-authored-by: Vikhyath Mondreti <vikhyath@simstudio.ai>
Co-authored-by: waleed <walif6@gmail.com>

* fix(snapshot): changed insert to upsert when concurrent identical child workflows are running (simstudioai#3259)

* fix(snapshot): changed insert to upsert when concurrent identical child workflows are running

* fixed ci tests failing

* fix(workflows): disallow duplicate workflow names at the same folder level (simstudioai#3260)

* feat(tools): added redis, upstash, algolia, and revenuecat (simstudioai#3261)

* feat(tools): added redis, upstash, algolia, and revenuecat

* ack comment

* feat(models): add gemini-3.1-pro-preview and update gemini-3-pro thinking levels (simstudioai#3263)

* fix(audit-log): lazily resolve actor name/email when missing (simstudioai#3262)

* fix(blocks): move type coercions from tools.config.tool to tools.config.params (simstudioai#3264)

* fix(blocks): move type coercions from tools.config.tool to tools.config.params

Number() coercions in tools.config.tool ran at serialization time before
variable resolution, destroying dynamic references like <block.result.count>
by converting them to NaN/null. Moved all coercions to tools.config.params
which runs at execution time after variables are resolved.

Fixed in 15 blocks: exa, arxiv, sentry, incidentio, wikipedia, ahrefs,
posthog, elasticsearch, dropbox, hunter, lemlist, spotify, youtube, grafana,
parallel. Also added mode: 'advanced' to optional exa fields.

Closes simstudioai#3258

* fix(blocks): address PR review — move remaining param mutations from tool() to params()

- Moved field mappings from tool() to params() in grafana, posthog,
  lemlist, spotify, dropbox (same dynamic reference bug)
- Fixed parallel.ts excerpts/full_content boolean logic
- Fixed parallel.ts search_queries empty case (must set undefined)
- Fixed elasticsearch.ts timeout not included when already ends with 's'
- Restored dropbox.ts tool() switch for proper default fallback

* fix(blocks): restore field renames to tool() for serialization-time validation

Field renames (e.g. personalApiKey→apiKey) must be in tool() because
validateRequiredFieldsBeforeExecution calls selectToolId()→tool() then
checks renamed field names on params. Only type coercions (Number(),
boolean) stay in params() to avoid destroying dynamic variable references.

* improvement(resolver): resovled empty sentinel to not pass through unexecuted valid refs to text inputs (simstudioai#3266)

* fix(blocks): add required constraint for serviceDeskId in JSM block (simstudioai#3268)

* fix(blocks): add required constraint for serviceDeskId in JSM block

* fix(blocks): rename custom field values to request field values in JSM create request

* fix(trigger): add isolated-vm support to trigger.dev container builds (simstudioai#3269)

Scheduled workflow executions running in trigger.dev containers were
failing to spawn isolated-vm workers because the native module wasn't
available in the container. This caused loop condition evaluation to
silently fail and exit after one iteration.

- Add isolated-vm to build.external and additionalPackages in trigger config
- Include isolated-vm-worker.cjs via additionalFiles for child process spawning
- Add fallback path resolution for worker file in trigger.dev environment

* fix(tables): hide tables from sidebar and block registry (simstudioai#3270)

* fix(tables): hide tables from sidebar and block registry

* fix(trigger): add isolated-vm support to trigger.dev container builds (simstudioai#3269)

Scheduled workflow executions running in trigger.dev containers were
failing to spawn isolated-vm workers because the native module wasn't
available in the container. This caused loop condition evaluation to
silently fail and exit after one iteration.

- Add isolated-vm to build.external and additionalPackages in trigger config
- Include isolated-vm-worker.cjs via additionalFiles for child process spawning
- Add fallback path resolution for worker file in trigger.dev environment

* lint

* fix(trigger): update node version to align with main app (simstudioai#3272)

* fix(build): fix corrupted sticky disk cache on blacksmith (simstudioai#3273)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Lakee Sivaraya <71339072+lakeesiv@users.noreply.github.com>
Co-authored-by: Vikhyath Mondreti <vikhyath@simstudio.ai>
Co-authored-by: Vikhyath Mondreti <vikhyathvikku@gmail.com>
… fixes, removed retired models, hex integration
…ogle tasks and bigquery integrations, workflow lock
Return explicit durability results for execution cancellation so success only reflects persisted cancellation state instead of best-effort Redis availability.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
@cursor
Copy link

cursor bot commented Mar 12, 2026

PR Summary

Medium Risk
Changes the cancel API response contract and the markExecutionCancelled return type, which may affect existing callers that assumed a boolean or unconditional success. Logic is straightforward but touches execution cancellation and operational signaling (logging/redis failure modes).

Overview
Makes execution cancellation reporting truthful about Redis durability. markExecutionCancelled now returns a structured result (durablyRecorded + reason) instead of a boolean, distinguishing redis_unavailable vs redis_write_failed.

Updates the cancel-execution route to set success based on durable recording, return the new metadata (durablyRecorded, reason, derived redisAvailable), and emit a warning when the cancellation marker cannot be persisted. Adds/updates Vitest coverage for both the helper and the API route across success and failure scenarios.

Written by Cursor Bugbot for commit 4f22a7e. Configure here.

@vercel
Copy link

vercel bot commented Mar 12, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
docs Skipped Skipped Mar 12, 2026 11:25pm

Request Review

@icecrasher321
Copy link
Collaborator

@PlaneInABottle please make this point towards staging as well

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 12, 2026

Greptile Summary

This PR improves cancellation durability reporting by refactoring markExecutionCancelled to return a structured discriminated union (ExecutionCancellationRecordResult) instead of a bare boolean, and propagates that result through the cancel API route so callers can distinguish between "durably recorded", "Redis unavailable", and "Redis write failed" outcomes. The fix is well-scoped and the core cancellation.ts changes are clean; however, one semantic issue exists in the route response.

Key changes:

  • cancellation.ts: markExecutionCancelled now returns { durablyRecorded, reason } instead of boolean; the exported discriminated union type is a good contract
  • route.ts: success is now gated on durablyRecorded, and a reason field surfaces the failure mode — but redisAvailable is incorrectly set to cancellation.durablyRecorded, conflating "is Redis reachable" with "did the write succeed"
  • route.test.ts / cancellation.test.ts: New test files with vi.hoisted mocks correctly cover all three outcome variants, though the route tests mirror the redisAvailable bug

Confidence Score: 3/5

  • Safe to merge with a minor fix — the redisAvailable field in the response incorrectly reports false when Redis is reachable but the write fails, misleading API consumers.
  • The core logic in cancellation.ts is correct and well-typed. The route improvement is meaningful (truthful success flag, reason field). The only defect is that redisAvailable is aliased to durablyRecorded rather than being derived from reason !== 'redis_unavailable', producing a self-contradictory response body in the redis_write_failed path. The tests reinforce this incorrect behaviour, so the bug would persist without a targeted fix.
  • apps/sim/app/api/workflows/[id]/executions/[executionId]/cancel/route.ts (line 52) and the corresponding assertion in route.test.ts (lines 77 and 99).

Important Files Changed

Filename Overview
apps/sim/app/api/workflows/[id]/executions/[executionId]/cancel/route.ts Durability reporting upgraded to use structured result; however redisAvailable is incorrectly aliased to durablyRecorded, misreporting Redis availability when the write fails after a successful connection.
apps/sim/lib/execution/cancellation.ts Clean refactor: markExecutionCancelled now returns a discriminated union ExecutionCancellationRecordResult with durablyRecorded and typed reason; all three paths (unavailable, success, write error) are correctly handled.
apps/sim/app/api/workflows/[id]/executions/[executionId]/cancel/route.test.ts New test file with good coverage of all three cancellation outcomes; the redisAvailable: false expectation for the redis_write_failed case mirrors the bug in the route, so it will pass but test incorrect behaviour.
apps/sim/lib/execution/cancellation.test.ts New test file; mocks are properly hoisted with vi.hoisted, covering all three result variants of the refactored markExecutionCancelled.

Sequence Diagram

sequenceDiagram
    participant Client
    participant CancelRoute as POST /cancel
    participant markExecutionCancelled
    participant Redis

    Client->>CancelRoute: POST (workflowId, executionId)
    CancelRoute->>CancelRoute: checkHybridAuth
    CancelRoute->>CancelRoute: authorizeWorkflowByWorkspacePermission
    CancelRoute->>markExecutionCancelled: markExecutionCancelled(executionId)

    alt Redis client is null
        markExecutionCancelled-->>CancelRoute: { durablyRecorded: false, reason: 'redis_unavailable' }
    else Redis client exists
        markExecutionCancelled->>Redis: SET execution:cancel:{id} 1 EX 3600
        alt Write succeeds
            Redis-->>markExecutionCancelled: OK
            markExecutionCancelled-->>CancelRoute: { durablyRecorded: true, reason: 'recorded' }
        else Write throws
            Redis-->>markExecutionCancelled: Error
            markExecutionCancelled-->>CancelRoute: { durablyRecorded: false, reason: 'redis_write_failed' }
        end
    end

    CancelRoute-->>Client: 200 { success, executionId, redisAvailable, durablyRecorded, reason }
Loading

Last reviewed commit: 47dadaa

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
@PlaneInABottle PlaneInABottle changed the base branch from main to staging March 12, 2026 23:26
@PlaneInABottle
Copy link
Author

@icecrasher321 Done — the PR has been retargeted to staging.

@icecrasher321 icecrasher321 changed the title fix: report cancellation durability truthfully fix(execution): report cancellation durability truthfully Mar 12, 2026
@icecrasher321
Copy link
Collaborator

bugbot run

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants