Add LWTRetryPolicy: retry CAS timeouts on same host with backoff #783

Draft
mykaul wants to merge 1 commit into scylladb:master from mykaul:feature/lwt-retry-policy

mykaul commented Apr 1, 2026

Summary

LWT queries use Paxos consensus, in which the first replica acts as the Paxos coordinator (leader) and drives the consensus rounds. When a CAS write times out, retrying on a different host causes Paxos contention: the new coordinator must compete with the original one, which can cause cascading timeouts across the cluster.

Currently, none of the built-in retry policies retries CAS write timeouts; each one returns RETHROW immediately:

  • RetryPolicy.on_write_timeout: CAS → RETHROW
  • ExponentialBackoffRetryPolicy.on_write_timeout: CAS → RETHROW
  • DowngradingConsistencyRetryPolicy.on_write_timeout: CAS → RETHROW

This PR adds LWTRetryPolicy, a new retry policy that extends ExponentialBackoffRetryPolicy with LWT-aware behavior:

| Scenario            | Decision                  | Rationale                                       |
|---------------------|---------------------------|-------------------------------------------------|
| CAS write timeout   | RETRY same host + backoff | Stay on Paxos coordinator to avoid contention   |
| Serial read timeout | RETRY same host + backoff | CAS read at serial CL, same coordinator logic   |
| Serial unavailable  | RETRY next host + backoff | Paxos quorum lost on this node, try another     |
| Non-CAS operations  | Delegate to parent        | Standard ExponentialBackoffRetryPolicy behavior |
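
The decision table can be sketched as a small pure function. This is an illustrative model only, not the code in this PR: the `Decision` enum, the `lwt_decision` name, the string event names, and the `retry_num >= max_num_retries` cutoff are all assumptions made for the example, and the backoff delay is left out.

```python
from enum import Enum


class Decision(Enum):
    # Hypothetical labels for the outcomes in the table; not driver constants.
    RETRY_SAME_HOST = "retry same host"
    RETRY_NEXT_HOST = "retry next host"
    RETHROW = "rethrow"
    DELEGATE = "delegate to parent"


# Consistency levels treated as "serial" (Paxos) in this sketch.
SERIAL_LEVELS = {"SERIAL", "LOCAL_SERIAL"}


def lwt_decision(event, write_type=None, consistency=None,
                 retry_num=0, max_num_retries=3):
    """Map an error event to a decision, following the table above.

    event is one of "write_timeout", "read_timeout", "unavailable".
    """
    if event == "write_timeout":
        is_lwt = write_type == "CAS"
    elif event in ("read_timeout", "unavailable"):
        is_lwt = consistency in SERIAL_LEVELS
    else:
        raise ValueError(f"unknown event: {event!r}")

    if not is_lwt:
        # Non-CAS / non-serial operations keep the parent policy's behavior.
        return Decision.DELEGATE
    if retry_num >= max_num_retries:
        # Assumed cutoff: give up once the retry budget is used.
        return Decision.RETHROW
    if event == "unavailable":
        # Paxos quorum lost on this node, so move to another host.
        return Decision.RETRY_NEXT_HOST
    # Timeouts stay on the same Paxos coordinator.
    return Decision.RETRY_SAME_HOST
```

For example, `lwt_decision("write_timeout", write_type="CAS")` gives `Decision.RETRY_SAME_HOST`, while the same call with `write_type="SIMPLE"` gives `Decision.DELEGATE`.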

This is modeled after gocql's LWTRetryPolicy interface, which retries LWT queries on the same host to avoid Paxos contention. The key comment from gocql (line 188):

"Retrying on a different host is fine for normal (non-LWT) queries, but in case of LWTs it will cause Paxos contention and possibly even timeouts if other clients send statements touching the same partition to the same time."

Usage

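from cassandra.cluster import Cluster
from cassandra.policies import LWTRetryPolicy
from cassandra.query import SimpleStatement

# Use as the default retry policy
cluster = Cluster(default_retry_policy=LWTRetryPolicy(max_num_retries=3))

# Or assign to a specific statement (the keyspace, table, and values are illustrative)
statement = SimpleStatement("INSERT INTO ks.users (id, name) VALUES (1, 'a') IF NOT EXISTS")
statement.retry_policy = LWTRetryPolicy(max_num_retries=5)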

Changes

  • cassandra/policies.py: Added LWTRetryPolicy class (extends ExponentialBackoffRetryPolicy)
  • tests/unit/test_policies.py: Added LWTRetryPolicyTest with 21 tests

Tests

21 new tests covering:

  • CAS write timeout retries on same host with backoff
  • Backoff delay increases with retry attempts
  • Max retries exceeded → RETHROW
  • Consistency level preserved across retries
  • Non-CAS writes delegate to parent (SIMPLE→RETHROW, BATCH_LOG→RETRY, COUNTER→RETHROW)
  • Serial read timeout retries on same host (SERIAL and LOCAL_SERIAL)
  • Serial unavailable retries on next host
  • Non-serial operations delegate to parent policy
  • Request errors inherit parent behavior
  • Constructor defaults and customization
  • All methods return proper 3-tuples

All 103 tests in tests/unit/test_policies.py pass.
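
As a rough illustration of what "backoff delay increases with retry attempts" means, the sketch below computes a capped exponential delay. The `backoff_delay` name, the doubling factor, and the `min_interval` / `max_interval` defaults are assumptions for this example; the actual formula in ExponentialBackoffRetryPolicy may differ, for instance by adding jitter.

```python
def backoff_delay(attempt, min_interval=0.1, max_interval=10.0):
    """Return a delay in seconds that doubles per attempt, capped at max_interval.

    Hypothetical helper: parameter names and defaults are illustrative only.
    """
    if attempt < 0:
        raise ValueError("attempt must be non-negative")
    return min(min_interval * (2 ** attempt), max_interval)


# Delays for the first four retry attempts never decrease.
delays = [backoff_delay(n) for n in range(4)]
```

The cap keeps a long run of retries from waiting indefinitely, which matters for LWTs because the retry stays on the same coordinator.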

Related

LWT queries use Paxos consensus where the coordinator is the Paxos leader.
Retrying on a different host causes Paxos contention — the new coordinator
must compete with the original one, potentially causing cascading timeouts.

LWTRetryPolicy (extends ExponentialBackoffRetryPolicy) handles this by:
- CAS write timeouts: retry on SAME host with exponential backoff
- Serial consistency read timeouts: retry on SAME host with backoff
- Serial consistency unavailable: retry on NEXT host (paxos quorum lost)
- Non-CAS operations: delegate to base ExponentialBackoffRetryPolicy

Modeled after gocql's LWTRetryPolicy interface.