Skip to content

Claude/websocket state machine session 01 timj g nk un4 v mv rck4k p76s#909

Open
marcodejongh wants to merge 10 commits intomainfrom
claude/websocket-state-machine-session_01TimjGNkUn4VMvRCK4kP76s
Open

Claude/websocket state machine session 01 timj g nk un4 v mv rck4k p76s#909
marcodejongh wants to merge 10 commits intomainfrom
claude/websocket-state-machine-session_01TimjGNkUn4VMvRCK4kP76s

Conversation

@marcodejongh
Copy link
Owner

No description provided.

claude and others added 10 commits March 4, 2026 10:48
…overlay

When iOS Safari backgrounds the app, the OS silently kills the WebSocket
connection. Previously the UI stayed fully interactive while changes silently
failed to sync. This adds real-time WebSocket connection state tracking via
graphql-ws on.connected/on.closed callbacks, a visibilitychange handler to
proactively detect dead sockets on iOS foreground return, and a
"Reconnecting..." overlay on the queue control bar that blocks interaction
until the connection recovers.

https://claude.ai/code/session_016bdrJ2mAmjzSmZ2z6vWGN9
The on.connected callback was setting isWebSocketConnected=true immediately,
before handleReconnect could finish its async joinSession/delta-sync path.
This caused isReconnecting to become false prematurely, re-enabling queue
mutations while stale state was still being replayed.

Fix: onConnectionStateChange now receives an isReconnect flag. On
reconnections, the persistent session defers setting isWebSocketConnected=true
until handleReconnect's finally block, keeping the reconnecting overlay
and viewOnlyMode active throughout the entire resync.

Also adds iOS background connection loss documentation to
docs/websocket-implementation.md (section 7 under Failure States).

https://claude.ai/code/session_016bdrJ2mAmjzSmZ2z6vWGN9
- graphql-client.test.ts: Tests onConnectionStateChange callback fires
  with correct (connected, isReconnect) flags across initial connect,
  disconnect, and reconnection sequences. Verifies callback ordering
  (stateChange fires before onReconnect).

- connection-state.test.ts: Tests isReconnecting derivation truth table,
  reconnect lock timing (isReconnecting stays true during resync), and
  visibilitychange debounce behavior (300ms delay, rapid-fire cancellation,
  hidden state ignored).

https://claude.ai/code/session_016bdrJ2mAmjzSmZ2z6vWGN9
The full-page overlay blocked all user interaction. Replace it with a
bottom-center Snackbar/Alert that shows "Reconnecting..." with a spinner
and automatically disappears when the connection is re-established.
Queue mutations are still prevented by viewOnlyMode during reconnection.

https://claude.ai/code/session_016bdrJ2mAmjzSmZ2z6vWGN9
Break the BoardSessionBridge reactivation loop that caused infinite
connect/disconnect cycles after join failures. Track failed session IDs
so the bridge doesn't re-activate a session that was just cleared.

Also harden getSessionMembers() against partial Redis data (missing
username would violate String! schema constraint) and add production-safe
logging to the joinSession resolver.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The "detect failed session" effect in BoardSessionBridge was marking
sessions as failed on initial mount when activeSession is null (not yet
set), preventing the activation effect from ever running. Now only marks
a session as failed after it was previously active and then cleared.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The EVENTS_REPLAY query was using GraphQL field aliases (addedItem: item,
currentItem: item), so the response already had aliased field names. But
transformToSubscriptionEvent was reading event.item (the un-aliased name),
which was undefined at runtime. Remove the aliases so the response matches
the QueueEvent type, and the transform function correctly remaps the fields.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…nnection

Extract the 400-line connection useEffect from persistent-session-context.tsx
into two testable, non-React classes:

- ConnectionStateMachine: explicit FSM replacing 6+ scattered boolean state
  variables, eliminating inconsistent state combinations
- SessionConnection: encapsulates client lifecycle, session join/leave,
  subscription setup, reconnection with delta sync, and transient retry logic

This reduces persistent-session-context.tsx from 1418 to ~990 lines and makes
the connection orchestration independently testable.

https://claude.ai/code/session_01TimjGNkUn4VMvRCK4kP76s
Generated patch from the WebSocket ConnectionStateMachine + SessionConnection
refactor since git push is returning 403.

https://claude.ai/code/session_01TimjGNkUn4VMvRCK4kP76s
@claude
Copy link

claude bot commented Mar 4, 2026

Claude Review

⚠️ Needs attention - Contains unintentional file that should be removed before merging.

Issues

  1. Remove websocket-refactor.patch (77KB) - This git patch file at the repository root is a development artifact and should not be committed. It duplicates all the actual changes in git-format and adds 2067 unnecessary lines.

  2. Potential race in triggerResync() - packages/web/app/components/persistent-session/session-connection.ts:283-284 calls handleReconnect() without awaiting it. If called multiple times rapidly (e.g., visibility changes + corruption detection), could cause concurrent resyncs. Consider adding a guard or awaiting the promise.

Documentation

The docs/websocket-implementation.md is updated with the new iOS background connection loss handling - good.

Tests

New test files have good coverage for:

  • ConnectionStateMachine state transitions and flags
  • isReconnecting derivation logic
  • graphql-client connection state callbacks

No missing test coverage identified for the changes.

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 356f498195

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +262 to +264
} else {
this.scheduleRetry();
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Reset state before scheduling transient reconnect attempts

When joinSession() returns null, connect() treats it as transient and calls scheduleRetry(), but the state machine is still in CONNECTING. The timer then calls connect() again, which exits immediately because of the duplicate-connection guard, so retries never actually run and the session can remain stuck in a perpetual connecting state after a single transient failure.

Useful? React with 👍 / 👎.

Comment on lines +321 to +325
} finally {
// Mark as connected now that resync is complete
if (!this.disposed) {
this.stateMachine.transition('CONNECTED');
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Gate CONNECTED transition on successful reconnect resync

handleReconnect() always transitions to CONNECTED in finally, even when joinSession() failed and the function returned early. Since joinSession() converts errors into null, transient reconnect failures will still flip flags to connected, causing the UI to leave reconnect/view-only mode without a completed rejoin or sync and potentially operate on stale state.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants