4.x: PrivateLink Support Phase 3: Query Implementation, DNS Resolution & Unit Tests 🔌 #807

Merged
dkropachev merged 2 commits into scylladb:scylla-4.x from nikagra:4.x-privatelink-support-queries on Mar 14, 2026

@nikagra nikagra commented Feb 26, 2026

4.x: PrivateLink Support Phase 3: Query Implementation, DNS Resolution & Tests 🔌

Previous phases:

Overview

Completes the runtime implementation of PrivateLink/client-routes support for ScyllaDB Cloud. The
driver can now query system.client_routes at startup and on control-connection reconnect, resolve
node hostnames to InetSocketAddress via a TTL-aware caching DNS resolver, and dynamically
translate addresses used for every CQL connection.

Server-side feature requires scylladb/scylladb#27323.


Changes

Event Negotiation Fix — ProtocolInitHandler (SCYLLADB-850)

  • Introduced a mutable registerEventTypes list (copy of options.eventTypes) so the REGISTER
    step can be retried with a narrowed event set without mutating the original channel options.
  • When the REGISTER step receives a PROTOCOL_ERROR whose message contains
    CLIENT_ROUTES_CHANGE, the driver logs a warning, removes that event type, and retries REGISTER
    with the remaining types (SCHEMA_CHANGE, STATUS_CHANGE, TOPOLOGY_CHANGE).
  • If no event types remain after stripping, the handler calls setConnectSuccess() directly,
    skipping REGISTER altogether.
  • This makes the driver backward-compatible with ScyllaDB Enterprise < 2026.1: client-routes table
    queries may still work on those versions, but live push-updates via the event are disabled.
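
The retry decision described above can be sketched as a small pure function. This is only an illustration under assumptions: `RegisterRetrySketch`, `Next`, and `onProtocolError` are hypothetical names, event types are modelled as plain strings, and the real ProtocolInitHandler works with channel state rather than a return value.

```java
import java.util.List;

/** Hypothetical helper for illustration; not the driver's ProtocolInitHandler. */
final class RegisterRetrySketch {
  static final String CLIENT_ROUTES_CHANGE = "CLIENT_ROUTES_CHANGE";

  /** What the init handler should do after REGISTER was rejected. */
  enum Next { RETRY_REGISTER, CONNECT_SUCCESS, FAIL }

  /**
   * Decides the next step after a PROTOCOL_ERROR on REGISTER.
   * Only the working copy (registerEventTypes) is mutated, never the
   * original channel options.
   */
  static Next onProtocolError(String errorMessage, List<String> registerEventTypes) {
    boolean aboutClientRoutes =
        errorMessage != null && errorMessage.contains(CLIENT_ROUTES_CHANGE);
    // remove() returns false if the type was already stripped, which
    // prevents an endless retry loop on a repeated error.
    if (!aboutClientRoutes || !registerEventTypes.remove(CLIENT_ROUTES_CHANGE)) {
      return Next.FAIL;
    }
    // Nothing left to register for: skip REGISTER altogether.
    return registerEventTypes.isEmpty() ? Next.CONNECT_SUCCESS : Next.RETRY_REGISTER;
  }
}
```

A caller would pass a mutable copy such as `new ArrayList<>(options.eventTypes)`, so the narrowed set applies to this connection attempt only.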

DNS Resolution — CachingDnsResolver (new)

  • TTL-based in-memory cache backed by ConcurrentHashMap<String, CacheEntry> with per-hostname
    Semaphore(1) to prevent concurrent re-resolution storms.
  • Fast-path unlocked read; double-checked locking only when the entry is absent or stale.
  • Last-known-good fallback: on UnknownHostException, the previous successful resolution is
    returned and a warning is logged.
  • clearCache() preserves last-known-good entries so a full session re-init survives transient DNS
    failures.
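
A minimal sketch of the caching pattern listed above, for readers who want to see the pieces together. It is not the CachingDnsResolver from this PR: the class name `TtlDnsCacheSketch`, the `Lookup` interface, and keeping last-known-good entries in a separate map are all assumptions made for illustration.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

/** Illustrative only; structure and names are assumptions. */
final class TtlDnsCacheSketch {

  /** Pluggable lookup so the sketch can run without real DNS. */
  interface Lookup {
    InetAddress resolve(String host) throws UnknownHostException;
  }

  private static final class CacheEntry {
    final InetAddress address;
    final long expiresAtNanos;

    CacheEntry(InetAddress address, long expiresAtNanos) {
      this.address = address;
      this.expiresAtNanos = expiresAtNanos;
    }

    boolean isFresh() {
      return expiresAtNanos - System.nanoTime() > 0;
    }
  }

  private final ConcurrentHashMap<String, CacheEntry> cache = new ConcurrentHashMap<>();
  private final ConcurrentHashMap<String, InetAddress> lastKnownGood = new ConcurrentHashMap<>();
  private final ConcurrentHashMap<String, Semaphore> locks = new ConcurrentHashMap<>();
  private final Lookup lookup;
  private final long ttlNanos;

  TtlDnsCacheSketch(Lookup lookup, long ttlNanos) {
    this.lookup = lookup;
    this.ttlNanos = ttlNanos;
  }

  InetAddress resolve(String host) throws UnknownHostException {
    CacheEntry entry = cache.get(host);
    if (entry != null && entry.isFresh()) {
      return entry.address; // fast path: no lock taken
    }
    Semaphore lock = locks.computeIfAbsent(host, h -> new Semaphore(1));
    lock.acquireUninterruptibly();
    try {
      entry = cache.get(host); // second check: another thread may have refreshed
      if (entry != null && entry.isFresh()) {
        return entry.address;
      }
      try {
        InetAddress resolved = lookup.resolve(host);
        cache.put(host, new CacheEntry(resolved, System.nanoTime() + ttlNanos));
        lastKnownGood.put(host, resolved);
        return resolved;
      } catch (UnknownHostException e) {
        InetAddress fallback = lastKnownGood.get(host);
        if (fallback == null) {
          throw e; // nothing to fall back to
        }
        return fallback; // a real implementation would log a warning here
      }
    } finally {
      lock.release();
    }
  }

  /** Drops TTL entries but keeps last-known-good addresses. */
  void clearCache() {
    cache.clear();
  }
}
```

In production the lookup would typically be `InetAddress::getByName`; injecting it keeps the cache logic testable without network access.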

Client Routes Query — ClientRoutesHandler

  • queryAndResolveRoutes() implemented: executes
    SELECT connection_id, host_id, address, port, tls_port FROM %s WHERE connection_id IN :connectionIds ALLOW FILTERING
    via AdminRequestHandler.
  • CachingDnsResolver wired in, replacing the Phase 2 no-op stub.
  • Route map atomically swapped via AtomicReference.set() so in-flight translations see a
    consistent snapshot.
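
The snapshot-swap idea can be illustrated roughly as follows. Everything except the query text is an assumption: `RouteMapSketch`, `Route`, `replaceAll`, and keying the map by `host_id` as a `UUID` are hypothetical choices for this sketch, and the real handler resolves hostnames through the DNS resolver rather than returning unresolved addresses.

```java
import java.net.InetSocketAddress;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.atomic.AtomicReference;

/** Illustrative only; not the driver's ClientRoutesHandler. */
final class RouteMapSketch {
  static final String QUERY_TEMPLATE =
      "SELECT connection_id, host_id, address, port, tls_port FROM %s "
          + "WHERE connection_id IN :connectionIds ALLOW FILTERING";

  /** One row of the routes table (hypothetical shape). */
  static final class Route {
    final String address;
    final int port;
    final int tlsPort;

    Route(String address, int port, int tlsPort) {
      this.address = address;
      this.port = port;
      this.tlsPort = tlsPort;
    }
  }

  private final AtomicReference<Map<UUID, Route>> routes =
      new AtomicReference<>(Collections.<UUID, Route>emptyMap());

  static String buildQuery(String table) {
    return String.format(QUERY_TEMPLATE, table);
  }

  /** Publishes a complete new map in one step; readers never see a half-built map. */
  void replaceAll(Map<UUID, Route> freshRows) {
    routes.set(Collections.unmodifiableMap(new HashMap<>(freshRows)));
  }

  /** Returns null when no route is known, so the caller can fall back to direct addressing. */
  InetSocketAddress translate(UUID hostId, boolean ssl) {
    Map<UUID, Route> snapshot = routes.get(); // read once, then use only the snapshot
    Route route = snapshot.get(hostId);
    if (route == null) {
      return null;
    }
    return InetSocketAddress.createUnresolved(route.address, ssl ? route.tlsPort : route.port);
  }
}
```

Because each refresh builds a whole new immutable map before publishing it, a translation that started before the swap finishes against the old map, and one that starts after sees only the new one.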

Session Lifecycle — DefaultSession

  • initClientRoutes() inserted between topologyMonitor.init() and
    metadataManager.refreshNodes() in the startup chain.
  • Failure is non-fatal: if the client-routes query fails at startup, the session still opens and
    falls back to direct addressing.
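
One way to express a non-fatal step in an asynchronous startup chain is to recover from its failure before composing the next stage. The sketch below assumes each step returns a `CompletionStage<Void>`; `StartupChainSketch` and its parameter names are hypothetical, and the actual DefaultSession chain contains more steps than shown here.

```java
import java.util.concurrent.CompletionStage;
import java.util.function.Supplier;

/** Illustrative only; the three suppliers stand in for the real init steps. */
final class StartupChainSketch {
  static CompletionStage<Void> init(
      Supplier<CompletionStage<Void>> topologyMonitorInit,
      Supplier<CompletionStage<Void>> initClientRoutes,
      Supplier<CompletionStage<Void>> refreshNodes) {
    return topologyMonitorInit
        .get()
        // A failed client-routes stage is converted into a normal completion,
        // so the chain continues; real code would log the error first.
        .thenCompose(v -> initClientRoutes.get().exceptionally(error -> null))
        .thenCompose(v -> refreshNodes.get());
  }
}
```

Only failures of the client-routes stage are swallowed; a failure in the first or last step still fails the returned stage.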

Reconnect Integration — ControlConnection

  • On a successful reconnect, clientRoutesHandler.refresh() is awaited before refreshNodes(),
    so translated addresses are current before nodes are re-evaluated.

Tests

Unit Tests — 22 new tests across 3 suites

| Suite | # | Coverage |
|---|---|---|
| CachingDnsResolverTest | 7 | Cache hit/miss, TTL expiry, concurrency, last-known-good fallback, clearCache() |
| ClientRoutesHandlerTest | 8 | Query construction, route-map population, translate with/without SSL, refresh no-op when disabled |
| ClientRoutesAddressTranslatorTest | 7 | Address translation pipeline, null pass-through, SSL port selection |

Integration Tests — 6 new tests across 2 suites

| Suite | Requirement | # | Coverage |
|---|---|---|---|
| ClientRoutesIT | ScyllaDB Enterprise ≥ 2026.1 | 5 | Routes loaded on init; refresh() picks up new rows; session opens with empty table; TLS port selection; auto-refresh on control-connection reconnect |
| ClientRoutesUnsupportedVersionIT | ScyllaDB Enterprise < 2026.1 | 1 | Session opens and handler returns null when system.client_routes is absent (graceful degradation) |

Both integration test suites are @Category(IsolatedTests.class) and @ScyllaOnly. Positive tests
use a user-space mirror table (test_client_routes.client_routes) to avoid requiring privileged
write access to system.

Additional notes

This PR also includes a workaround for SCYLLADB-850: older ScyllaDB Enterprise versions (< 2026.1) reject a REGISTER request that includes the CLIENT_ROUTES_CHANGE event with a PROTOCOL_ERROR, which previously caused connection initialisation to fail. The driver now handles this gracefully by stripping the unsupported event and retrying REGISTER with the remaining types, so sessions succeed on any server version.


Related to DRIVER-86, DRIVER-88, SCYLLADB-850

@nikagra nikagra force-pushed the 4.x-privatelink-support-queries branch 2 times, most recently from 57574a3 to 26aebdd Compare February 27, 2026 14:26
@nikagra nikagra requested review from Copilot and dkropachev and removed request for Copilot February 27, 2026 14:55
@nikagra nikagra marked this pull request as ready for review February 27, 2026 14:55
@nikagra nikagra requested a review from Copilot February 27, 2026 14:56
Copilot AI left a comment

Pull request overview

This PR completes Phase 3 of PrivateLink support for the Java driver, implementing the runtime components required for ScyllaDB Cloud deployments. Building on Phase 2's infrastructure (#804), it adds DNS resolution, query execution, and complete lifecycle integration for the client routes feature.

Changes:

  • Implements TTL-based caching DNS resolver with last-known-good fallback and concurrency control
  • Adds query execution for system.client_routes table with hostname resolution and route map management
  • Integrates client routes initialization into session startup and control-connection reconnect lifecycle
  • Implements backward-compatible event negotiation to handle older ScyllaDB versions that don't support CLIENT_ROUTES_CHANGE events

Reviewed changes

Copilot reviewed 29 out of 29 changed files in this pull request and generated 9 comments.

| File | Description |
|---|---|
| pom.xml | Upgrades native protocol to 1.5.2.2 to support CLIENT_ROUTES_CHANGE event |
| manual/core/address_resolution/README.md | Documents client routes configuration and usage with examples |
| core/.../util/AddressParser.java | New utility for parsing contact point addresses with IPv4/IPv6/hostname support |
| core/.../clientroutes/DnsResolver.java | Interface defining DNS resolution with caching requirements |
| core/.../clientroutes/CachingDnsResolver.java | TTL-based DNS caching implementation with semaphore-based concurrency control |
| core/.../clientroutes/ClientRouteInfo.java | Data structure representing raw client_routes table rows |
| core/.../clientroutes/ResolvedClientRoute.java | Resolved route with DNS resolution capability |
| core/.../clientroutes/ClientRoutesHandler.java | Main coordinator implementing query execution and route management |
| core/.../clientroutes/ClientRoutesAddressTranslator.java | AddressTranslator implementation delegating to handler |
| core/.../session/DefaultSession.java | Integrates client routes initialization into session startup |
| core/.../metadata/DefaultTopologyMonitor.java | Passes node metadata (hostId, datacenter, rack) to address translator |
| core/.../control/ControlConnection.java | Adds CLIENT_ROUTES_CHANGE event handling and reconnect integration |
| core/.../context/DefaultDriverContext.java | Wires client routes handler into driver context |
| core/.../context/InternalDriverContext.java | Adds getClientRoutesHandler() method |
| core/.../channel/ProtocolInitHandler.java | Implements backward-compatible event negotiation for CLIENT_ROUTES_CHANGE |
| core/.../session/SessionBuilder.java | Adds withClientRoutesConfig() with mutual exclusivity validation |
| core/.../session/ProgrammaticArguments.java | Stores client routes config in programmatic arguments |
| core/.../config/ClientRoutesConfig.java | Configuration class for client routes with builder |
| core/.../config/ClientRoutesEndpoint.java | Represents a single client routes endpoint |
| core/.../addresstranslation/AddressTranslator.java | Adds overloaded translate() method accepting node metadata |
| core/.../config/DefaultDriverOption.java | Minor formatting cleanup (blank line removal) |
| core/src/main/resources/reference.conf | Minor formatting cleanup (blank line removal) |
| core/src/test/.../util/AddressParserTest.java | Comprehensive unit tests for address parsing |
| core/src/test/.../clientroutes/CachingDnsResolverTest.java | Unit tests for DNS resolver cache behavior |
| core/src/test/.../clientroutes/ClientRoutesHandlerTest.java | Unit tests for handler translate() method |
| core/src/test/.../clientroutes/ClientRoutesAddressTranslatorTest.java | Unit tests for address translator delegation |
| core/src/test/.../session/ClientRoutesSessionBuilderTest.java | Unit tests for session builder configuration |
| core/src/test/.../config/ClientRoutesConfigTest.java | Unit tests for configuration validation |
| core/src/test/.../metadata/DefaultTopologyMonitorTest.java | Updates test verification counts for additional getString() calls |

@nikagra nikagra force-pushed the 4.x-privatelink-support-queries branch from c24ab8c to dcf7511 Compare March 5, 2026 20:37
@nikagra nikagra force-pushed the 4.x-privatelink-support-queries branch 3 times, most recently from b8b99ec to 9149786 Compare March 9, 2026 18:52
@nikagra nikagra force-pushed the 4.x-privatelink-support-queries branch 2 times, most recently from cd23bae to 942f9f5 Compare March 10, 2026 09:04
@nikagra nikagra force-pushed the 4.x-privatelink-support-queries branch 2 times, most recently from bedea4d to a64a501 Compare March 11, 2026 11:14
@dkropachev dkropachev force-pushed the 4.x-privatelink-support-queries branch 3 times, most recently from 84ca83a to 2f9ded7 Compare March 13, 2026 12:43
dkropachev added a commit to nikagra/java-driver that referenced this pull request Mar 14, 2026
- Rewrite connectionAddrOverrides loop with stream API
- Replace FQN java.io.IOException/UncheckedIOException with imports
  in ClientRoutesEndPoint and ClientRoutesEndPointTest
- Fix containsPort(":9042") bug: leading-colon strings were accepted
  as valid addresses; add unit test

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@dkropachev dkropachev force-pushed the 4.x-privatelink-support-queries branch from e291988 to 13221b3 Compare March 14, 2026 12:45
Add support for PrivateLink/NLB client routes that allows the driver to
discover and connect through private endpoints. This includes:

- ClientRoutesConfig and ClientRouteProxy for configuration
- ClientRoutesTopologyMonitor for route discovery and refresh
- ClientRoutesEndPoint for DNS-based endpoint resolution
- Protocol-level CLIENT_ROUTES_CHANGE event handling
- Coalescing queue for efficient route refresh deduplication
- Integration tests with NLB simulator proxy
@dkropachev dkropachev force-pushed the 4.x-privatelink-support-queries branch from 75afb07 to c6a5370 Compare March 14, 2026 18:30
@dkropachev dkropachev merged commit 2dea0ba into scylladb:scylla-4.x Mar 14, 2026
21 checks passed