Refactor integration tests to prepare node addition#20
Refactor integration tests to prepare node addition#20rdettai-sk wants to merge 1 commit intosekoiafrom
Conversation
There was a problem hiding this comment.
Pull request overview
Refactors the integration-test cluster sandbox to use explicit node names (instead of selecting nodes purely by QuickwitService) to make multi-node scenarios easier to extend—particularly for rolling-upgrade experiments.
Changes:
- Updated
ClusterSandboxBuilderto require a node name when adding nodes and adjustedClusterSandbox::rest_clientto select nodes by name. - Added
STANDALONE_NODE_NAMEand migrated many tests to use named nodes plus the newwait_for_indexing_pipelines(node_name, ...)/shutdown_nodes([...])APIs. - Simplified sandbox internals by storing per-node config/shutdown data together and exposing configs via an iterator (
node_configs()).
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| quickwit/quickwit-integration-tests/src/test_utils/cluster_sandbox.rs | Core refactor: explicit node naming, rest client lookup by node name, updated wait/shutdown helpers, internal node struct. |
| quickwit/quickwit-integration-tests/src/test_utils/mod.rs | Adds STANDALONE_NODE_NAME constant for consistent standalone naming. |
| quickwit/quickwit-integration-tests/src/test_utils/shutdown.rs | Simplifies shutdown handle by removing stored node metadata. |
| quickwit/quickwit-integration-tests/src/tests/basic_tests.rs | Migrates standalone + multi-node tests to named nodes and new sandbox APIs. |
| quickwit/quickwit-integration-tests/src/tests/ingest_v1_tests.rs | Migrates to named nodes and constructs a legacy-ingest client explicitly. |
| quickwit/quickwit-integration-tests/src/tests/ingest_v2_tests.rs | Migrates to named nodes; updates pipeline waits/shutdown; builds a custom detailed-ingest client. |
| quickwit/quickwit-integration-tests/src/tests/no_cp_tests.rs | Migrates to named nodes and uses shutdown_nodes instead of service-based shutdown. |
| quickwit/quickwit-integration-tests/src/tests/otlp_tests.rs | Migrates to named nodes, per-node pipeline waits, and shutdown_nodes. |
| quickwit/quickwit-integration-tests/src/tests/sqs_tests.rs | Migrates to named nodes and updates REST client calls / pipeline waits accordingly. |
| quickwit/quickwit-integration-tests/src/tests/tls_tests.rs | Migrates TLS tests to named nodes and updated node-config access pattern. |
| quickwit/quickwit-integration-tests/src/tests/update_tests/mod.rs | Updates helper to resolve the searcher node id and use name-based rest_client. |
| quickwit/quickwit-integration-tests/src/tests/update_tests/create_on_update.rs | Migrates update tests to named nodes and name-based clients/pipeline waits. |
| quickwit/quickwit-integration-tests/src/tests/update_tests/doc_mapping_tests.rs | Switches to standalone helper + name-based clients and pipeline waits. |
| quickwit/quickwit-integration-tests/src/tests/update_tests/restart_indexer_tests.rs | Switches to standalone helper + name-based clients and pipeline waits. |
| quickwit/quickwit-integration-tests/src/tests/update_tests/search_settings_tests.rs | Migrates to named nodes and name-based clients/pipeline waits. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| let node_names_set: HashSet<String> = node_names | ||
| .into_iter() | ||
| .map(|s| s.as_ref().to_string()) | ||
| .collect(); | ||
| let mut i = 0; | ||
| let shutdown_services_map = HashSet::from_iter(shutdown_services); | ||
| while i < self.node_shutdown_handles.len() { | ||
| let handler_services = &self.node_shutdown_handles[i].node_services; | ||
| if !handler_services.is_subset(&shutdown_services_map) { | ||
| while i < self.nodes.len() { | ||
| let node_name = &self.nodes[i].node_name; | ||
| if !node_names_set.contains(node_name) { | ||
| i += 1; | ||
| continue; | ||
| } |
There was a problem hiding this comment.
shutdown_nodes silently ignores unknown node_names (it just never removes anything from self.nodes). This can mask test bugs (e.g., typos) because shutdown_nodes(["idxoner"]).await.unwrap() would succeed and shut down nothing. Consider validating that every requested name existed (return an error or panic in tests) when node_names_set contains entries not found in the cluster.
| .find(|cfg| cfg.enabled_services.contains(&QuickwitService::Indexer)) | ||
| .unwrap(); | ||
| let url = | ||
| reqwest::Url::parse(&format!("http://{}", node_config.rest_config.listen_addr)).unwrap(); |
There was a problem hiding this comment.
This test builds a QuickwitClient from a hard-coded http://… URL and doesn’t apply the same TLS/CA handling as ClusterSandbox::rest_client. That makes the test diverge from the sandbox helpers and will break if the standalone node is ever started with REST TLS enabled. Consider adding a sandbox helper to build a client for a node with extra options (e.g., detailed_response(true)) so URL scheme + CA config stay consistent.
| reqwest::Url::parse(&format!("http://{}", node_config.rest_config.listen_addr)).unwrap(); | |
| reqwest::Url::parse(&format!("{}", node_config.rest_config.listen_addr)).unwrap(); |
| .add_node( | ||
| "indexitor-1", | ||
| [QuickwitService::Indexer, QuickwitService::Janitor], | ||
| ) | ||
| .add_node( | ||
| "indexitor-2", | ||
| [QuickwitService::Indexer, QuickwitService::Janitor], | ||
| ) |
There was a problem hiding this comment.
Node names like "indexitor-1"/"indexitor-2" look like a typo of “indexer”. Since these names are now used as identifiers throughout the test, consider renaming them to something clearer (e.g., indexer-1, indexer-2) to improve readability and reduce the chance of copy/paste mistakes.
| @@ -281,47 +329,14 @@ impl ClusterSandbox { | |||
| } | |||
|
|
|||
| /// Returns a client to one of the nodes that runs the specified service | |||
There was a problem hiding this comment.
The doc comment for rest_client still says it returns a client for a node running a specified service, but the function now takes a node_name. Please update the comment to match the new API (e.g., “Returns a client for the node with the given name”).
| /// Returns a client to one of the nodes that runs the specified service | |
| /// Returns a client for the node with the given name. |
| let metastore_node_id = self | ||
| .find_node_for_service(QuickwitService::Metastore) | ||
| .node_id; | ||
| wait_until_predicate( | ||
| || async move { | ||
| || async { | ||
| match self | ||
| .rest_client(QuickwitService::Metastore) | ||
| .rest_client(metastore_node_id.as_str()) | ||
| .cluster() |
There was a problem hiding this comment.
In wait_for_cluster_num_ready_nodes, the debug message uses result.live_nodes.len() for the “got …” value while the check is against result.ready_nodes.len(). Consider logging ready_nodes.len() to avoid misleading output when this wait loop fails.
|
I still need to clean this up |
A refactoring of integration tests to simplifiy the addition of new node which are meant to simplify experiments on rolling upgrades.