vault: gracefully handle individual blob broadcast failures in Observation (backport 2.39.2) #21781
Merged
prashantkumar1982 merged 3 commits into release/2.39.2 on Mar 30, 2026
…ation

Previously, if any single payload failed to broadcast as a blob during the Observation phase, the entire observation was aborted and returned an error. This is unnecessarily disruptive: one problematic payload (e.g. transient network issue, malformed data) would prevent all other valid payloads from being included in the observation, stalling the OCR round.

Now, individual broadcast failures are logged as warnings (with the request ID and error details) and the failed payload is simply excluded from PendingQueueItems. The remaining payloads continue to be broadcast and observed normally. The blob broadcast logic is extracted into a dedicated broadcastBlobPayloads method for clarity.

Made-with: Cursor
Check ctx.Err() when BroadcastBlob fails so that context.Canceled and context.DeadlineExceeded are returned immediately rather than swallowed. This preserves fail-fast semantics for expired OCR rounds while still skipping item-specific transient errors. Made-with: Cursor
Each parallel BroadcastBlob call now gets a 2-second timeout derived from the parent context. A slow individual broadcast will be cancelled and skipped without stalling the rest of the batch. Parent context cancellation still propagates immediately for round-level failures. Made-with: Cursor
justinkaseman approved these changes on Mar 30, 2026
Backport of #21765 to release/2.39.2.

Summary

During the Observation phase, pending queue payloads are broadcast as blobs in parallel. Previously, if any single broadcast failed, the entire observation was aborted, stalling the OCR round.

This changes the behavior so that individual failures are isolated: a failed broadcast is logged as a warning (with the request ID and error) and that payload is excluded from PendingQueueItems. All remaining payloads continue normally.

Each BroadcastBlob call is given a 2-second timeout so a single slow broadcast cannot stall the entire batch. Parent context cancellation/deadline errors are propagated immediately for fail-fast semantics.

Made with Cursor