
fix: handle Avro reader schema with no fields#9611

Open
mzabaluev wants to merge 3 commits into apache:main from mzabaluev:avro-empty-reader-schema

Conversation

@mzabaluev
Contributor

Which issue does this PR close?

Rationale for this change

In the degenerate case when the Avro reader schema has no fields, the RecordDecoder should be able to produce empty record batches with the number of rows counted from the data. As an optimization for OCF, the reader could skip decoding altogether, relying on record counts provided by data blocks.
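To illustrate the OCF optimization mentioned above, here is a minimal sketch that sums the per-block object counts without decoding any record. It relies only on the block layout from the Avro specification (object count as a long, byte size as a long, the serialized objects, then a 16-byte sync marker). The function names `read_long` and `count_rows_in_blocks` are hypothetical and are not part of arrow-avro; the sketch also assumes the input starts right after the file header and does not verify the sync marker.

```rust
/// Length of the sync marker that terminates every OCF data block.
const SYNC_LEN: usize = 16;

/// Reads one Avro `long` (zigzag varint) from the front of `buf`.
/// Returns the decoded value and the number of bytes consumed.
/// Hypothetical helper for this sketch, not an arrow-avro API.
fn read_long(buf: &[u8]) -> Option<(i64, usize)> {
    let mut value: u64 = 0;
    for (i, &byte) in buf.iter().enumerate().take(10) {
        value |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            let decoded = ((value >> 1) as i64) ^ -((value & 1) as i64);
            return Some((decoded, i + 1));
        }
    }
    None // truncated or over-long varint
}

/// Sums the object counts of all blocks in `blocks`, skipping the
/// serialized objects entirely. Returns `None` on truncated input.
fn count_rows_in_blocks(mut blocks: &[u8]) -> Option<u64> {
    let mut total = 0u64;
    while !blocks.is_empty() {
        // Block header: object count, then size of the serialized objects.
        let (count, used) = read_long(blocks)?;
        blocks = &blocks[used..];
        let (size, used) = read_long(blocks)?;
        blocks = &blocks[used..];
        // Skip the serialized objects and the trailing sync marker.
        let skip = usize::try_from(size).ok()?.checked_add(SYNC_LEN)?;
        blocks = blocks.get(skip..)?;
        total = total.checked_add(u64::try_from(count).ok()?)?;
    }
    Some(total)
}
```

Because the payload is skipped by size, this approach would also work for compressed blocks, since the size field is recorded after the codec is applied.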

What changes are included in this PR?

The RecordDecoder state now maintains a row counter, so the decoder knows how many rows to report even when the reader schema has no fields.
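The idea of counting rows in decoder state can be sketched roughly as follows. This is a toy model and not the actual RecordDecoder: `FieldKind`, `RowCountingDecoder`, and `read_long` are hypothetical names, and only two writer field types are handled. The decoder must still walk the writer's fields to find record boundaries, but it keeps nothing except the count.

```rust
/// Writer-side field kinds, limited to what this sketch needs (hypothetical).
#[derive(Clone, Copy)]
enum FieldKind {
    Long,
    String,
}

/// Reads one Avro `long` (zigzag varint); returns the value and bytes consumed.
fn read_long(buf: &[u8]) -> Option<(i64, usize)> {
    let mut value: u64 = 0;
    for (i, &byte) in buf.iter().enumerate().take(10) {
        value |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            let decoded = ((value >> 1) as i64) ^ -((value & 1) as i64);
            return Some((decoded, i + 1));
        }
    }
    None
}

/// Toy decoder for an empty reader schema: no columns, only a row counter.
struct RowCountingDecoder {
    writer_fields: Vec<FieldKind>,
    row_count: usize,
}

impl RowCountingDecoder {
    fn new(writer_fields: Vec<FieldKind>) -> Self {
        Self { writer_fields, row_count: 0 }
    }

    /// Skips every complete record in `buf`, counting each one.
    /// Returns the number of bytes consumed; a partial record is left unread.
    fn decode(&mut self, buf: &[u8]) -> usize {
        // A record with no writer fields occupies zero bytes, so it cannot
        // be counted from the data; this sketch simply consumes nothing.
        if self.writer_fields.is_empty() {
            return 0;
        }
        let mut pos = 0;
        'rows: while pos < buf.len() {
            let mut cursor = pos;
            for kind in &self.writer_fields {
                let Some((value, used)) = read_long(&buf[cursor..]) else {
                    break 'rows;
                };
                cursor += used;
                if let FieldKind::String = kind {
                    // A string is a length prefix followed by that many bytes.
                    let Ok(len) = usize::try_from(value) else { break 'rows };
                    match cursor.checked_add(len) {
                        Some(end) if end <= buf.len() => cursor = end,
                        _ => break 'rows,
                    }
                }
            }
            pos = cursor;
            self.row_count += 1;
        }
        pos
    }

    /// Returns the rows counted since the last flush and resets the counter.
    fn flush(&mut self) -> usize {
        std::mem::take(&mut self.row_count)
    }
}
```

In a real decoder the flushed count would presumably become the row count of a column-less record batch; that step is omitted here to keep the sketch free of external crates.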

Are these changes tested?

Added tests to verify decoder behavior given an empty reader schema for the data files in the test suite.

Are there any user-facing changes?

No.

@github-actions github-actions bot added arrow Changes to the arrow crate arrow-avro arrow-avro crate labels Mar 24, 2026
In the degenerate case when the Avro reader schema has no fields,
the RecordDecoder should be able to produce empty record batches with
the number of rows counted from the data.
In arrow-avro.
Since the async reader only reads OCF files and uses decode_block to
decode, use flush_block for consistency and to avoid unnecessary
checks.
@mzabaluev-flarion mzabaluev-flarion force-pushed the avro-empty-reader-schema branch from f8a3f4a to c518a19 Compare March 24, 2026 17:32

Labels

arrow Changes to the arrow crate arrow-avro arrow-avro crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Avro decoder can't handle a reader schema with no fields

1 participant