[lake/iceberg] Support snapshot expiration for lake tables#2881
vamossagar12 wants to merge 3 commits into apache:main
@luoyuxia please review this PR. Thanks!
Pull request overview
Adds Iceberg support for automatic snapshot expiration in Fluss lake tiering commits, controlled by existing table/tiering configs (mirroring prior Paimon behavior) and validated via new Iceberg tiering tests.
Changes:
- Trigger Iceberg snapshot expiration after successful tiering commits when either `table.datalake.auto-expire-snapshot` or `lake.tiering.auto-expire-snapshot` is enabled.
- Pass full `CommitterInitContext` into `IcebergLakeCommitter` so it can read table/tiering configs.
- Add a parameterized Iceberg tiering test to verify snapshot retention/expiration behavior under different config combinations.
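The "either config enables expiration" rule in the first bullet can be sketched in plain Java. This is only an illustration under stated assumptions, not the code in this PR: the class `AutoExpireDecision` and its method are hypothetical, options are modelled as simple string maps rather than the Fluss config types, and a missing key is assumed to mean "disabled". Only the two config key names come from the PR.

```java
import java.util.Map;

// Hypothetical helper, not part of Fluss: models "expire if either flag is on".
final class AutoExpireDecision {
    // Config keys as named in this PR.
    static final String TABLE_KEY = "table.datalake.auto-expire-snapshot";
    static final String TIERING_KEY = "lake.tiering.auto-expire-snapshot";

    private AutoExpireDecision() {}

    // Returns true when the table option OR the tiering service option is "true".
    // Assumption: an absent key counts as disabled.
    static boolean shouldExpire(
            Map<String, String> tableOptions, Map<String, String> tieringOptions) {
        return isEnabled(tableOptions.get(TABLE_KEY))
                || isEnabled(tieringOptions.get(TIERING_KEY));
    }

    // Boolean.parseBoolean(null) is false, so missing keys fall through safely.
    private static boolean isEnabled(String value) {
        return Boolean.parseBoolean(value);
    }
}
```

With this shape, a table that leaves its own option unset still gets expiration when the tiering service option is on, which is the combination the parameterized test is described as covering.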
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| fluss-lake/fluss-lake-iceberg/src/main/java/org/apache/fluss/lake/iceberg/tiering/IcebergLakeTieringFactory.java | Wires CommitterInitContext into the committer so commit-time behavior can depend on configs. |
| fluss-lake/fluss-lake-iceberg/src/main/java/org/apache/fluss/lake/iceberg/tiering/IcebergLakeCommitter.java | Implements optional snapshot expiration after commits based on table/tiering configuration. |
| fluss-lake/fluss-lake-iceberg/src/test/java/org/apache/fluss/lake/iceberg/tiering/IcebergTieringTest.java | Adds parameterized coverage for snapshot expiration and enables passing tiering config into the committer in tests. |
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@luoyuxia gentle ping!
leekeiabstraction left a comment
TY for the PR, left a comment. PTAL
    private void expireSnapshots() {
        try {
            ExpireSnapshots expireSnapshots =
                    icebergTable.expireSnapshots().cleanExpiredFiles(true);
Does this expire all non-latest snapshots? If so, could it cause a recovery failure when the tiering service fails over, restarts from the latest Flink checkpoint, and cannot find an Iceberg snapshot that has already been deleted?
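One way to reason about this question is to model an "older than a maximum age, but always keep the newest N" retention rule, which is the kind of policy the Iceberg table properties `history.expire.max-snapshot-age-ms` and `history.expire.min-snapshots-to-keep` describe. The sketch below is a plain-Java model under that assumption, not the Iceberg implementation and not the code in this PR; `RetentionModel` and its parameters are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical model of age-plus-minimum-count snapshot retention.
final class RetentionModel {
    private RetentionModel() {}

    // Returns the snapshot timestamps that would be expired, oldest first.
    // A snapshot is expired only if it is older than (nowMs - maxAgeMs)
    // AND it is not among the newest minToKeep snapshots.
    static List<Long> expired(
            List<Long> snapshotTimestampsMs, long nowMs, long maxAgeMs, int minToKeep) {
        List<Long> sorted = new ArrayList<>(snapshotTimestampsMs);
        Collections.sort(sorted); // oldest first
        long cutoff = nowMs - maxAgeMs;
        List<Long> result = new ArrayList<>();
        for (int i = 0; i < sorted.size(); i++) {
            int rankFromNewest = sorted.size() - 1 - i; // 0 = newest snapshot
            boolean keptByCount = rankFromNewest < minToKeep;
            boolean olderThanCutoff = sorted.get(i) < cutoff;
            if (!keptByCount && olderThanCutoff) {
                result.add(sorted.get(i));
            }
        }
        return result;
    }
}
```

Under this model, a snapshot referenced by the latest Flink checkpoint is at risk only if it falls into the returned list, so the recovery concern depends on how the maximum age and minimum count compare with the checkpoint interval. Whether the tiering service needs any snapshot other than the latest one on restart is not something this sketch answers.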
Purpose
Linked issue: close #2213
Brief change log
Similar to #2182, this PR adds support for snapshot expiration for Iceberg tables.
Tests
fluss-lake/fluss-lake-iceberg/src/test/java/org/apache/fluss/lake/iceberg/tiering/IcebergTieringTest.java
API and Format
N/A
Documentation
No. The table config `table.datalake.auto-expire-snapshot` and the tiering service config `lake.tiering.auto-expire-snapshot` already exist. This PR makes Iceberg implement them.