Skip to content

Fix iOS connection loss: track WebSocket state and show reconnecting …#906

Open
marcodejongh wants to merge 8 commits intomainfrom
claude/fix-ios-connection-loss-1QjLN
Open

Fix iOS connection loss: track WebSocket state and show reconnecting …#906
marcodejongh wants to merge 8 commits intomainfrom
claude/fix-ios-connection-loss-1QjLN

Conversation

@marcodejongh
Copy link
Owner

…overlay

When iOS Safari backgrounds the app, the OS silently kills the WebSocket connection. Previously the UI stayed fully interactive while changes silently failed to sync. This adds real-time WebSocket connection state tracking via graphql-ws on.connected/on.closed callbacks, a visibilitychange handler to proactively detect dead sockets on iOS foreground return, and a "Reconnecting..." overlay on the queue control bar that blocks interaction until the connection recovers.

https://claude.ai/code/session_016bdrJ2mAmjzSmZ2z6vWGN9

@claude
Copy link

claude bot commented Mar 3, 2026

Claude Review

Ready to merge - Minor issues noted below, but nothing blocking.

Issues

  1. Missing documentation update - This PR adds iOS-specific connection handling (visibilitychange detection, isReconnecting state, reconnecting overlay) that should be documented in docs/websocket-implementation.md under "Failure States and Recovery". The existing docs describe client disconnection recovery but don't mention iOS Safari killing background WebSockets or the new UI feedback mechanism.

  2. No tests added - The PR changes critical connection state handling without test coverage for:

    • isReconnecting state transitions (hasConnected && !isWebSocketConnected)
    • Visibility change handler behavior
    • Overlay rendering conditions

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 27d953fc2b

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@claude
Copy link

claude bot commented Mar 3, 2026

Claude Review

Ready to merge - Minor issues noted below, but nothing blocking.

Issues

  1. No tests for new state machine - The isReconnecting state derivation and the reconnection flow have several edge cases (initial connection vs reconnection, deferred state update after resync). Consider adding unit tests for:
    • Initial connection: isReconnecting should remain false
    • Reconnection: isReconnecting should be true until handleReconnect completes
    • Visibility change debouncing behavior

Notes

  • Documentation properly updated in docs/websocket-implementation.md
  • The reconnection logic with deferred setIsWebSocketConnected(true) in the finally block is correctly implemented
  • Theme-aware overlay using design tokens is consistent with project patterns

@claude
Copy link

claude bot commented Mar 3, 2026

Claude Review

Ready to merge - Well-implemented iOS reconnection handling with proper state tracking, documentation, and tests.

Minor issues:

  1. Test description mismatch - packages/web/app/components/persistent-session/__tests__/connection-state.test.ts:399: Comment says "neither connected" but test passes isWebSocketConnected = true. The assertion is correct but the description is misleading.

  2. Unrelated package-lock changes - The dev to devOptional changes for playwright packages appear unrelated to this PR.

Notes:

  • Documentation properly updated in docs/websocket-implementation.md
  • Test coverage is good for the core logic (graphql-client callbacks, derivation, visibility debounce)

@claude
Copy link

claude bot commented Mar 3, 2026

Claude Review

Ready to merge - Minor issues noted below, but nothing blocking.

Issues

  1. Silently swallowed Redis errors - packages/backend/src/services/distributed-state.ts:299,303 - The .catch(() => {}) pattern silently ignores Redis cleanup errors. While fire-and-forget is reasonable for cleanup, consider at minimum logging the error for debugging.

  2. Unrelated package-lock.json changes - The dev to devOptional changes for playwright/bufferutil are unrelated to the PR purpose. Minor, but worth noting.

  3. Tests duplicate logic rather than test component - packages/web/app/components/persistent-session/__tests__/connection-state.test.ts tests an extracted function that mirrors the component logic rather than testing the actual isReconnecting derivation in context. If the component implementation changes, these tests could pass while the actual behavior differs. (Acknowledged in test comments, but worth considering integration test coverage.)

Documentation

Documentation update to docs/websocket-implementation.md is thorough and matches the implementation.

claude and others added 6 commits March 4, 2026 10:48
…overlay

When iOS Safari backgrounds the app, the OS silently kills the WebSocket
connection. Previously the UI stayed fully interactive while changes silently
failed to sync. This adds real-time WebSocket connection state tracking via
graphql-ws on.connected/on.closed callbacks, a visibilitychange handler to
proactively detect dead sockets on iOS foreground return, and a
"Reconnecting..." overlay on the queue control bar that blocks interaction
until the connection recovers.

https://claude.ai/code/session_016bdrJ2mAmjzSmZ2z6vWGN9
The on.connected callback was setting isWebSocketConnected=true immediately,
before handleReconnect could finish its async joinSession/delta-sync path.
This caused isReconnecting to become false prematurely, re-enabling queue
mutations while stale state was still being replayed.

Fix: onConnectionStateChange now receives an isReconnect flag. On
reconnections, the persistent session defers setting isWebSocketConnected=true
until handleReconnect's finally block, keeping the reconnecting overlay
and viewOnlyMode active throughout the entire resync.

Also adds iOS background connection loss documentation to
docs/websocket-implementation.md (section 7 under Failure States).

https://claude.ai/code/session_016bdrJ2mAmjzSmZ2z6vWGN9
- graphql-client.test.ts: Tests onConnectionStateChange callback fires
  with correct (connected, isReconnect) flags across initial connect,
  disconnect, and reconnection sequences. Verifies callback ordering
  (stateChange fires before onReconnect).

- connection-state.test.ts: Tests isReconnecting derivation truth table,
  reconnect lock timing (isReconnecting stays true during resync), and
  visibilitychange debounce behavior (300ms delay, rapid-fire cancellation,
  hidden state ignored).

https://claude.ai/code/session_016bdrJ2mAmjzSmZ2z6vWGN9
The full-page overlay blocked all user interaction. Replace it with a
bottom-center Snackbar/Alert that shows "Reconnecting..." with a spinner
and automatically disappears when the connection is re-established.
Queue mutations are still prevented by viewOnlyMode during reconnection.

https://claude.ai/code/session_016bdrJ2mAmjzSmZ2z6vWGN9
Break the BoardSessionBridge reactivation loop that caused infinite
connect/disconnect cycles after join failures. Track failed session IDs
so the bridge doesn't re-activate a session that was just cleared.

Also harden getSessionMembers() against partial Redis data (missing
username would violate String! schema constraint) and add production-safe
logging to the joinSession resolver.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@marcodejongh marcodejongh force-pushed the claude/fix-ios-connection-loss-1QjLN branch from 72f6d02 to dd0b4f6 Compare March 4, 2026 09:55
@claude
Copy link

claude bot commented Mar 4, 2026

Claude Review

Ready to merge - Well-designed iOS connection recovery with good test coverage and documentation.

Minor issues:

  1. Unconditional production logging (packages/backend/src/graphql/resolvers/sessions/mutations.ts:130, 243) - Two console.log statements run unconditionally instead of being gated by DEBUG. This adds log noise in production:

    console.log(`[joinSession] connectionId: ${ctx.connectionId.slice(0, 8)}...`);
  2. Unrelated lockfile changes (package-lock.json) - Playwright packages changed from dev to devOptional. Not a problem, but seems unrelated to this PR.

Verification notes:

  • Tests cover the key derivation logic (isReconnecting = hasConnected && !isWebSocketConnected) and visibilitychange debouncing
  • Documentation updated in docs/websocket-implementation.md with sequence diagrams and implementation details
  • The reconnection lock pattern (deferring setIsWebSocketConnected(true) until handleReconnect completes) is correctly implemented with try-finally ensuring the flag is set even on resync failure

The "detect failed session" effect in BoardSessionBridge was marking
sessions as failed on initial mount when activeSession is null (not yet
set), preventing the activation effect from ever running. Now only marks
a session as failed after it was previously active and then cleared.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@claude
Copy link

claude bot commented Mar 4, 2026

Claude Review

Ready to merge - Solid implementation for iOS connection loss detection with proper state management, tests, and documentation.

Notes (non-blocking)

  • Tests are logic-focused: The tests verify the derivation truth table and debounce behavior in isolation, which is appropriate. Integration testing the full WebSocket reconnection flow would be complex to set up and mock properly.

  • Documentation is well-updated: The new "iOS Background Connection Loss" section in docs/websocket-implementation.md accurately describes the implementation with sequence diagrams.

  • Context threading is correct: isReconnecting properly flows through PersistentSessionContextGraphQLQueueContextQueueBridgeContextQueueControlBar.

The EVENTS_REPLAY query was using GraphQL field aliases (addedItem: item,
currentItem: item), so the response already had aliased field names. But
transformToSubscriptionEvent was reading event.item (the un-aliased name),
which was undefined at runtime. Remove the aliases so the response matches
the QueueEvent type, and the transform function correctly remaps the fields.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants