feat(server): add integration test for message deduplication #3099

Open
seokjin0414 wants to merge 6 commits into apache:master from seokjin0414:2872-add-integration-test-for-message-deduplication

@seokjin0414
Contributor

Summary

Closes #2872

  • Add integration test for the message deduplication pipeline (7-step scenario)
  • Fix server panic when all messages in a batch are duplicates
  • Fix partition offset calculation after dedup removes mid-batch messages
  • Fix deduplicator not being created for lazily-initialized partitions

Bug fixes

Empty batch panic (messages.rs): When prepare_for_persistence() removes every message in a batch as a duplicate, the subsequent .unwrap() calls on first_timestamp(), last_timestamp(), and last_offset() panic. Added an empty-batch guard.
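
A minimal sketch of what such a guard can look like. The `Batch` and `Persisted` types and the `persist` function are hypothetical stand-ins for illustration, not the server's actual types; only the accessor names come from the description above.

```rust
/// Hypothetical stand-in for a message batch (not the server's actual type).
#[derive(Debug, Default)]
pub struct Batch {
    /// (offset, timestamp) pairs of the messages that survived deduplication.
    pub entries: Vec<(u64, u64)>,
}

impl Batch {
    pub fn first_timestamp(&self) -> Option<u64> {
        self.entries.first().map(|&(_, ts)| ts)
    }

    pub fn last_timestamp(&self) -> Option<u64> {
        self.entries.last().map(|&(_, ts)| ts)
    }

    pub fn last_offset(&self) -> Option<u64> {
        self.entries.last().map(|&(offset, _)| offset)
    }
}

/// Hypothetical summary of what was written for a batch.
#[derive(Debug, PartialEq, Eq)]
pub struct Persisted {
    pub first_timestamp: u64,
    pub last_timestamp: u64,
    pub last_offset: u64,
}

/// Returns `None` for an empty batch instead of unwrapping its accessors.
pub fn persist(batch: &Batch) -> Option<Persisted> {
    // Guard: if deduplication removed everything, there is nothing to write.
    if batch.entries.is_empty() {
        return None;
    }
    Some(Persisted {
        first_timestamp: batch.first_timestamp()?,
        last_timestamp: batch.last_timestamp()?,
        last_offset: batch.last_offset()?,
    })
}
```

With this shape, an all-duplicate batch yields `None` from `persist` rather than a panic.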

Offset calculation (messages.rs): last_offset was computed as current_offset + count - 1, which does not account for the offset gaps created when dedup removes messages. Changed to use segment.end_offset (the actual last offset from the batch).
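
To illustrate why the arithmetic breaks, here is a small sketch. Both function names are hypothetical; the first reproduces the formula quoted above, the second reads the last offset from the surviving messages.

```rust
/// Arithmetic estimate: only correct when the batch's offsets are contiguous.
/// `count` must be at least 1, otherwise the subtraction underflows.
pub fn last_offset_by_count(current_offset: u64, count: u64) -> u64 {
    current_offset + count - 1
}

/// Actual last offset carried by the surviving messages, if any.
pub fn last_offset_from_batch(offsets: &[u64]) -> Option<u64> {
    offsets.last().copied()
}
```

Suppose a batch was assigned offsets 10 through 14 and dedup removed the messages at 11 and 13. The survivors carry offsets 10, 12, and 14, so the count-based formula gives 10 + 3 - 1 = 12, while the batch's real last offset is 14.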

Deduplicator not created (partitions.rs): init_partition_inner() hardcoded None for the message_deduplicator parameter. Added a create_message_deduplicator() call, matching the bootstrap path.
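
The idea of routing both construction paths through one factory can be sketched as follows. Everything here is an assumption made for illustration: the `DedupConfig`, `Deduplicator`, and `Partition` types, the two constructor functions, and the signature given to `create_message_deduplicator` are hypothetical and do not reflect the server's real API.

```rust
use std::collections::HashSet;

/// Hypothetical config; the field name is illustrative only.
#[derive(Clone, Copy)]
pub struct DedupConfig {
    pub enabled: bool,
}

/// Hypothetical deduplicator that remembers seen message IDs.
#[derive(Default)]
pub struct Deduplicator {
    seen: HashSet<u128>,
}

impl Deduplicator {
    /// Returns true if the ID had not been seen before.
    pub fn try_insert(&mut self, id: u128) -> bool {
        self.seen.insert(id)
    }
}

pub struct Partition {
    pub deduplicator: Option<Deduplicator>,
}

/// Single factory shared by every construction path (assumed signature).
pub fn create_message_deduplicator(config: DedupConfig) -> Option<Deduplicator> {
    config.enabled.then(Deduplicator::default)
}

/// Eager path, used at startup.
pub fn bootstrap_partition(config: DedupConfig) -> Partition {
    Partition { deduplicator: create_message_deduplicator(config) }
}

/// Lazy path: calls the same factory instead of hardcoding `None`.
pub fn init_partition_lazily(config: DedupConfig) -> Partition {
    Partition { deduplicator: create_message_deduplicator(config) }
}
```

Sharing the factory means a partition behaves the same whether it was created at startup or on first use.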

Integration test scenario

| Step | Description | Validates |
|------|-------------|-----------|
| 1 | Send 10 messages with id=0 (auto UUID) | All pass through with unique IDs |
| 2 | Send 10 messages with explicit IDs 1-10 | Normal dedup registration |
| 3 | Re-send IDs 1-10 with different payload | Duplicates rejected, original payload preserved |
| 4 | Send all-duplicate batch | No server crash, count unchanged |
| 5 | Send mixed batch (IDs 6-15) | Only new IDs 11-15 accepted |
| 6 | Verify offsets | Monotonically increasing after dedup |
| 7 | Wait for TTL expiry, re-send IDs 1-10 | Previously seen IDs accepted again |
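
The behavior exercised by steps 2, 3, 5, and 7 can be modeled with a toy TTL deduplicator. This is a sketch under stated assumptions, not the server's implementation: `TtlDeduplicator` is a hypothetical type, the clock is passed in as plain seconds so no real waiting is needed, and a rejected duplicate does not refresh its TTL (the real server may differ). Auto-generated IDs (id=0) are not modeled.

```rust
use std::collections::HashMap;

/// Toy TTL deduplicator keyed by message ID, with a caller-supplied clock.
pub struct TtlDeduplicator {
    ttl_secs: u64,
    /// Time (in seconds) at which each ID was last accepted.
    seen_at: HashMap<u128, u64>,
}

impl TtlDeduplicator {
    pub fn new(ttl_secs: u64) -> Self {
        Self { ttl_secs, seen_at: HashMap::new() }
    }

    /// Keeps only IDs that are new or whose previous acceptance has expired.
    pub fn filter(&mut self, ids: &[u128], now_secs: u64) -> Vec<u128> {
        let mut accepted = Vec::new();
        for &id in ids {
            let fresh = match self.seen_at.get(&id) {
                Some(&t) => now_secs.saturating_sub(t) >= self.ttl_secs,
                None => true,
            };
            if fresh {
                self.seen_at.insert(id, now_secs);
                accepted.push(id);
            }
        }
        accepted
    }
}
```

Feeding IDs 1-10 twice at the same time accepts them once and rejects the repeat; a mixed batch of IDs 6-15 inside the TTL window accepts only 11-15; and re-sending IDs 1-10 after the TTL has elapsed accepts all of them again.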

Test plan

  • cargo fmt --all -- --check
  • cargo clippy -p server -p integration --all-targets -- -D warnings
  • cargo test -p integration --test mod -- message_deduplication (CI)

@codecov

codecov bot commented Apr 11, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 54.54%. Comparing base (2d6562b) to head (76e7c6c).

Additional details and impacted files
@@              Coverage Diff              @@
##             master    #3099       +/-   ##
=============================================
- Coverage     73.17%   54.54%   -18.64%     
  Complexity      943      943               
=============================================
  Files          1123     1121        -2     
  Lines         97892    87279    -10613     
  Branches      75065    64471    -10594     
=============================================
- Hits          71632    47602    -24030     
- Misses        23671    37110    +13439     
+ Partials       2589     2567       -22     
| Components | Coverage Δ |
|------------|------------|
| Rust Core | 48.59% <100.00%> (-25.46%) ⬇️ |
| Java SDK | 62.30% <ø> (ø) |
| C# SDK | 69.11% <ø> (-0.29%) ⬇️ |
| Python SDK | 81.43% <ø> (ø) |
| Node SDK | 91.53% <ø> (+0.22%) ⬆️ |
| Go SDK | 39.41% <ø> (ø) |

| Files with missing lines | Coverage Δ |
|--------------------------|------------|
| core/server/src/shard/system/messages.rs | 88.00% <100.00%> (+0.02%) ⬆️ |
| core/server/src/shard/system/partitions.rs | 78.51% <100.00%> (+0.07%) ⬆️ |
| core/server/src/streaming/partitions/journal.rs | 85.54% <100.00%> (+0.17%) ⬆️ |

... and 247 files with indirect coverage changes

seokjin0414 force-pushed the 2872-add-integration-test-for-message-deduplication branch 4 times, most recently from 6fdbeb7 to 46db1ea, on April 11, 2026 09:18
Comment thread core/integration/tests/server/scenarios/message_deduplication_scenario.rs Outdated
… calculation

- Add empty batch guard after prepare_for_persistence() to prevent
  server panic when all messages in a batch are duplicates
- Fix partition offset calculation to use actual last offset from batch
  instead of arithmetic that ignores gaps created by dedup removal
- Create message deduplicator for lazily-initialized partitions in
  init_partition_inner() instead of hardcoding None

Signed-off-by: shin <sars21@hanmail.net>

Add 7-step scenario testing the full deduplication pipeline:
- Auto-generated IDs (id=0) all pass through with unique UUIDs
- Explicit IDs are accepted on first send
- Duplicate IDs are rejected, original payload preserved
- All-duplicate batch does not crash server (regression for empty batch)
- Mixed batch with partial duplicates only accepts new IDs
- Offsets are monotonically increasing after dedup removal
- TTL expiry allows previously seen IDs to be accepted again

Signed-off-by: shin <sars21@hanmail.net>
Signed-off-by: shin <sars21@hanmail.net>
…urnal offset tracking

- When all messages in a batch are duplicates, advance partition offset
  past the assigned (but removed) offset range to prevent offset reuse
  in subsequent batches
- Fix journal current_offset to use actual last offset from batch
  instead of arithmetic that ignores gaps created by dedup removal

Signed-off-by: shin <sars21@hanmail.net>
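
The offset-advance rule described in this commit can be sketched with a toy partition. The `Partition` type and its methods are hypothetical and only model the stated behavior: offsets are assigned to the whole incoming batch first, duplicates are dropped afterwards, and the partition's next offset moves past the entire assigned range even when nothing survives.

```rust
/// Toy partition that hands out offsets (not the server's actual type).
pub struct Partition {
    next_offset: u64,
}

impl Partition {
    pub fn new() -> Self {
        Self { next_offset: 0 }
    }

    pub fn next_offset(&self) -> u64 {
        self.next_offset
    }

    /// Assigns one offset per incoming message, then drops the duplicates.
    /// Returns the offsets of the surviving messages.
    pub fn append(&mut self, is_duplicate: &[bool]) -> Vec<u64> {
        let base = self.next_offset;
        // Advance past the whole assigned range, even if every message
        // turns out to be a duplicate, so these offsets are never reused.
        self.next_offset = base + is_duplicate.len() as u64;
        is_duplicate
            .iter()
            .enumerate()
            .filter_map(|(i, &dup)| if dup { None } else { Some(base + i as u64) })
            .collect()
    }
}
```

After a two-message batch (offsets 0 and 1) followed by a three-message all-duplicate batch, the next accepted message lands at offset 5 rather than reusing 2.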
- Reduce builder duplication with conditional .id() call
- Add comment linking DEDUP_TTL_SECS to server config in specific.rs

Signed-off-by: shin <sars21@hanmail.net>

seokjin0414 force-pushed the 2872-add-integration-test-for-message-deduplication branch from 46db1ea to e3efc2c on April 18, 2026 05:41

The conditional mut builder pattern causes a type mismatch because
bon's type-state changes the builder type after calling .id().
Use maybe_id() which accepts Option<u128> in a single chain instead.

Signed-off-by: shin <sars21@hanmail.net>
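
The type mismatch mentioned in this commit can be reproduced without bon using a hand-rolled type-state builder. The sketch below is an imitation for illustration only: `MessageBuilder`, `Message`, the marker types, and `make` are hypothetical, and the code bon actually generates is more involved.

```rust
use std::marker::PhantomData;

/// Marker types recording whether the ID setter has been called.
pub struct IdUnset;
pub struct IdSet;

pub struct Message {
    pub id: Option<u128>,
    pub payload: String,
}

/// Hand-rolled type-state builder (a stand-in for a generated one).
pub struct MessageBuilder<State> {
    id: Option<u128>,
    payload: String,
    _state: PhantomData<State>,
}

impl MessageBuilder<IdUnset> {
    pub fn new(payload: &str) -> Self {
        MessageBuilder { id: None, payload: payload.to_string(), _state: PhantomData }
    }

    /// Consumes the builder and returns a *different* type.
    pub fn id(self, id: u128) -> MessageBuilder<IdSet> {
        self.maybe_id(Some(id))
    }

    /// Accepts an `Option`, so the call chain has one type on both branches.
    pub fn maybe_id(self, id: Option<u128>) -> MessageBuilder<IdSet> {
        MessageBuilder { id, payload: self.payload, _state: PhantomData }
    }
}

impl<State> MessageBuilder<State> {
    pub fn build(self) -> Message {
        Message { id: self.id, payload: self.payload }
    }
}

pub fn make(payload: &str, id: Option<u128>) -> Message {
    // This would not compile, because `b` is `MessageBuilder<IdUnset>`
    // and `b.id(..)` returns `MessageBuilder<IdSet>`:
    //
    //     let mut b = MessageBuilder::new(payload);
    //     if let Some(id) = id { b = b.id(id); }
    //
    MessageBuilder::new(payload).maybe_id(id).build()
}
```

Passing the `Option` straight through keeps the whole chain in one expression, so the builder's changing type never has to be stored in a single variable.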
Development

Successfully merging this pull request may close these issues.

Add integration test for message deduplication

2 participants