Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 4 additions & 2 deletions api/src/main/java/org/apache/iceberg/FileContent.java
Original file line number Diff line number Diff line change
Expand Up @@ -18,11 +18,13 @@
*/
package org.apache.iceberg;

/** Content type stored in a file, one of DATA, POSITION_DELETES, or EQUALITY_DELETES. */
/** Content type stored in a file. */
public enum FileContent {
DATA(0),
POSITION_DELETES(1),
EQUALITY_DELETES(2);
EQUALITY_DELETES(2),
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to make this API change so that TrackingInfo can return the contentType. This is backward compatible since this is a new enum value. If we don't want to make any API changes, I will need to change TrackingInfo.contentType() to return an int instead.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems reasonable to me. We could also introduce a different enum for v4, but that doesn't seem worth it.

DATA_MANIFEST(3),
DELETE_MANIFEST(4);
Comment on lines +21 to +27
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though I agree with the changes, this is the one set of changes that's in the public API.
One option is we hold off on these changes until 1.11, and then that gives more time. The other option is we go ahead but there's a risk (although super unlikely imo) of having to go through deprecation cycle in case we want to remove something. Personally I'm good with moving forward as it is cc @rdblue @nastra in case they have any thoughts here.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I doubt we check this in the write path, which means that if this were actually used we would produce files with bad IDs. It's probably a good idea to look through how writes take place and build some confidence that we won't see this from the v2 or v3 writers. We need to do that for v4 anyway (since this creates the potential to write v4 content types into v3) so we should just do it now before merging this.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I just read through the write path and I think we should be ok. The v2/v3 writers can't produce files with the new content IDs because of compile time checks:

  1. DataFile path: GenericDataFile hardcodes FileContent.DATA in its constructor, and DataFile.content() defaults to
    DATA
    . There's no way to create a DataFile with DATA_MANIFEST or DELETE_MANIFEST.

  2. DeleteFile path: FileMetadata.Builder.build() has a validation that only accepts POSITION_DELETES and EQUALITY_DELETES.

  3. Writer typing: The v2/v3 manifest writers are ManifestWriter<DataFile> and ManifestWriter<DeleteFile>.

The V2/V3 metadata wrappers do serialize content().id() without version-gating, so there's no defense at the serialization layer itself, but the type checking upstream make it unreachable.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a validation here in this commit.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably need DELETE_MANIFEST entries in the root manifest file (even without equality deletes), since V3 to V4 upgrade doesn't require rewriting existing data and delete manifest files to colocate the DVs.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, agreed. When a table is upgraded from v3, existing delete manifests will need to be referenced from root as DELETE_MANIFEST entries.


private final int id;

Expand Down
64 changes: 64 additions & 0 deletions core/src/main/java/org/apache/iceberg/DeletionVector.java
Original file line number Diff line number Diff line change
@@ -0,0 +1,64 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
package org.apache.iceberg;

import org.apache.iceberg.types.Types;

/**
* Metadata about a deletion vector.
*
* <p>Tracks where a DV blob can be read. The DV blob follows the format defined by the
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely don't need to abbreviate in the doc, I get paid by the character

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but are you referring to "DV"? That's how we normally refer to them so it didn't seem strange to me.

* deletion-vector-v1 blob type in the Puffin spec.
*/
interface DeletionVector {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe TrackedDeletionVector

We already have a few DeletionVector named things and this is not actually a DV, it's the reference too one

Types.NestedField LOCATION =
Types.NestedField.required(
155, "location", Types.StringType.get(), "Location of the file containing the DV");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Were these IDs assigned for this PR or are they from the design doc? We need to make sure we identify which ones can change.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assigned them and have updated the design doc. Let me know if these should change.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As long as these match the doc and we update the comment with the next ID to assign, I'm good.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not using existing field?

Types.NestedField FILE_PATH =
      required(100, "file_path", StringType.get(), "Location URI with FS scheme");

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are different fields with different IDs. FILE_PATH is field 100 (the data file's location at the TrackedFile level) while DeletionVector.LOCATION is field 155 (the Puffin file containing the DV blob). Hope that made sense.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah. I forgot that the DV schema is part of the manifest schema along with content file.

I am wondering if we should move all manifest file fields (v1-v4) to a single file so that we can make sure the field ids are all unique. when they are spread across multiple files, it is kind of hard to track.

With a consolidated interface class for all fields, maybe we can also generate a unit test to ensure the uniqueness of the field id assignment .

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed that having a single source of truth for field IDs and a uniqueness test would be useful. I think that's better as a follow-up though. Consolidating all existing field ID definitions (from DataFile, ManifestFile, ManifestEntry, etc.) is a broader refactor.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that we would refactor the ID definitions in DataFile and ManifestFile, but I could be convinced. I think the idea is to consolidate all v4 IDs in a single place. That's fairly reasonable. I agree that we can do it later though.

Types.NestedField OFFSET =
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about reusing existing definition?

  Types.NestedField CONTENT_OFFSET =
      optional(
          144, "content_offset", LongType.get(), "The offset in the file where the content starts");

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good observation. DeletionVector.OFFSET does reuse field ID 144 from DataFile.CONTENT_OFFSET. The field name is shortened to offset because within the deletion_vector struct context since "content_offset" is redundant.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DeletionVector struct is part of the manifest file schema that includes the DataFile schema too. we can't use the same field id?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Field IDs 144 (OFFSET) and 145 (SIZE_IN_BYTES) are reused here. In the v4 TrackedFile schema, CONTENT_OFFSET and CONTENT_SIZE from DataFile are not present at the top level . The DV struct serves the same purpose, so the IDs are semantically aligned. Since 144 and 145 only appear once in the TrackedFile schema (inside the DV struct), there's no conflict.

In v1-v3 manifests, these IDs are used by DataFile. In v4 manifests, they're used by DeletionVector. The two schemas don't coexist in the same manifest format.

Did that make sense?

Copy link
Copy Markdown
Contributor

@rdblue rdblue Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

v3 and earlier are separate from the v4 read path. That is, we are not trying to have a reader that can use the v4 schema as a projection to read v3 files.

Because we can consider the v4 schema separate, we can decide how to assign IDs. The strategy that I've talked about with Anoop and Amogh is that we should reuse the IDs as much as possible when the field represents the same thing, even though the field may be in a different struct for v4. If you agree with that, then I think it is a good idea to use id=144 here since this is the same field: the offset within a file where the DV blob starts.

And it also makes sense that the DV file location is a new field because we are already using the previous location field for the data file's location and this is nested within the same struct (so it has to be a different field).

Types.NestedField.required(
144, "offset", Types.LongType.get(), "Offset in the file where the DV content starts");
Types.NestedField SIZE_IN_BYTES =
Types.NestedField.required(
145,
"size_in_bytes",
Types.LongType.get(),
"Length of the referenced DV content stored in the file");
Types.NestedField CARDINALITY =
Types.NestedField.required(
156,
"cardinality",
Types.LongType.get(),
"Number of set bits (deleted rows) in the vector");

static Types.StructType schema() {
return Types.StructType.of(LOCATION, OFFSET, SIZE_IN_BYTES, CARDINALITY);
}

/** Returns the location of the file containing the deletion vector. */
String location();

/** Returns the offset in the file where the deletion vector content starts. */
long offset();

/** Returns the size in bytes of the deletion vector content. */
long sizeInBytes();

/** Returns the number of set bits (deleted rows) in the vector. */
long cardinality();
}
38 changes: 38 additions & 0 deletions core/src/main/java/org/apache/iceberg/EntryStatus.java
Original file line number Diff line number Diff line change
@@ -0,0 +1,38 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
package org.apache.iceberg;

/** Status of an entry in a manifest file. */
enum EntryStatus {
EXISTING(0),
ADDED(1),
DELETED(2),
/** Indicates an entry that has been replaced by a column update or DV change. Added in v4. */
REPLACED(3);

private final int id;

EntryStatus(int id) {
this.id = id;
}

public int id() {
return id;
}
}
113 changes: 113 additions & 0 deletions core/src/main/java/org/apache/iceberg/ManifestInfo.java
Original file line number Diff line number Diff line change
@@ -0,0 +1,113 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
package org.apache.iceberg;

import java.nio.ByteBuffer;
import org.apache.iceberg.types.Types;

/** Summary information about a manifest referenced by a v4 root manifest entry. */
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we don't need to say "v4 here", since it will go out of date

Suggested change
/** Summary information about a manifest referenced by a v4 root manifest entry. */
/** Summary information about a manifest referenced by a root manifest entry. */

interface ManifestInfo {
Types.NestedField ADDED_FILES_COUNT =
Types.NestedField.required(
504, "added_files_count", Types.IntegerType.get(), "Number of files added");
Types.NestedField EXISTING_FILES_COUNT =
Types.NestedField.required(
505, "existing_files_count", Types.IntegerType.get(), "Number of existing files");
Types.NestedField DELETED_FILES_COUNT =
Types.NestedField.required(
506, "deleted_files_count", Types.IntegerType.get(), "Number of deleted files");
Types.NestedField REPLACED_FILES_COUNT =
Types.NestedField.required(
520, "replaced_files_count", Types.IntegerType.get(), "Number of replaced files");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did we reserve or reference these in the design doc? I don't see any reference to some of these ids or the fields.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is in the last tab of the design doc, where we pivoted to combined entries. @amogh-jahagirdar will likely be moving this into the main tab.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the v4 doc with the fields. cc @amogh-jahagirdar

Types.NestedField ADDED_ROWS_COUNT =
Types.NestedField.required(
512, "added_rows_count", Types.LongType.get(), "Number of rows in added files");
Types.NestedField EXISTING_ROWS_COUNT =
Types.NestedField.required(
513, "existing_rows_count", Types.LongType.get(), "Number of rows in existing files");
Types.NestedField DELETED_ROWS_COUNT =
Types.NestedField.required(
514, "deleted_rows_count", Types.LongType.get(), "Number of rows in deleted files");
Types.NestedField REPLACED_ROWS_COUNT =
Types.NestedField.required(
521, "replaced_rows_count", Types.LongType.get(), "Number of rows in replaced files");
Types.NestedField MIN_SEQUENCE_NUMBER =
Types.NestedField.required(
516,
"min_sequence_number",
Types.LongType.get(),
"Minimum sequence number of files in this manifest");
Types.NestedField DV =
Types.NestedField.optional(
522, "dv", Types.BinaryType.get(), "Deletion vector for manifest entries");
Types.NestedField DV_CARDINALITY =
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DELETED_POSITIONS and REPLACED_POSITIONS don't have cardinality and we may not need it here either. I think that the format could easily write the cardinality into the dv bytes. No need for a change just yet, but we'll watch out for whether we should remove this.

Types.NestedField.optional(
523,
"dv_cardinality",
Types.LongType.get(),
"Number of entries marked as deleted in the DV");

static Types.StructType schema() {
return Types.StructType.of(
ADDED_FILES_COUNT,
EXISTING_FILES_COUNT,
DELETED_FILES_COUNT,
REPLACED_FILES_COUNT,
ADDED_ROWS_COUNT,
EXISTING_ROWS_COUNT,
DELETED_ROWS_COUNT,
REPLACED_ROWS_COUNT,
MIN_SEQUENCE_NUMBER,
DV,
DV_CARDINALITY);
}

/** Returns the number of files added by this manifest. */
int addedFilesCount();

/** Returns the number of existing files referenced by this manifest. */
int existingFilesCount();

/** Returns the number of deleted files in this manifest. */
int deletedFilesCount();

/** Returns the number of replaced files in this manifest. */
int replacedFilesCount();

/** Returns the number of rows in added files. */
long addedRowsCount();

/** Returns the number of rows in existing files. */
long existingRowsCount();

/** Returns the number of rows in deleted files. */
long deletedRowsCount();

/** Returns the number of rows in replaced files. */
long replacedRowsCount();

/** Returns the minimum sequence number of files in this manifest. */
long minSequenceNumber();

/** Returns the deletion vector bitmap, or null if not present. */
ByteBuffer dv();

/** Returns the number of entries marked as deleted in the DV, or null if not present. */
Long dvCardinality();
}
Loading
Loading