Skip to content

Arrow: Fix vectorized reads for Parquet TIMESTAMP_MILLIS types#14499

Merged
pvary merged 14 commits into
apache:mainfrom
shubham19may:fix/iceberg-arrow-timestamp-vectorization
Nov 19, 2025
Merged

Arrow: Fix vectorized reads for Parquet TIMESTAMP_MILLIS types#14499
pvary merged 14 commits into
apache:mainfrom
shubham19may:fix/iceberg-arrow-timestamp-vectorization

Conversation

@shubham19may

@shubham19may shubham19may commented Nov 4, 2025

Copy link
Copy Markdown
Contributor

Fixes: #14430 & #14046

Description

This PR fixes: Reading Parquet files with TIMESTAMP_MILLIS

Error:

java.lang.ClassCastException: class org.apache.iceberg.shaded.org.apache.arrow.vector.TimeStampMicroTZVector cannot be cast to class org.apache.iceberg.shaded.org.apache.arrow.vector.BigIntVector (org.apache.iceberg.shaded.org.apache.arrow.vector.TimeStampMicroTZVector and org.apache.iceberg.shaded.org.apache.arrow.vector.BigIntVector are in unnamed module of loader 'app')

Cause: when reading parquet files with TIMESTAMP_MILLIS logical type annotation, the VectorizedArrowReader incorrectly allocated a TimeStampMicroTZVector based on iceberg schema (which expects microsecond precision), but the actual reader (TimeStampMicroTZVector) writes raw long values that require a BigIntVector.

Fix:

  • Explicitly create a Field with ArrowType.Int(Long.SIZE, true) type
  • Allocate a BigIntVector that matches what TimestampMillisReader expects and return ReadType.TIMESTAMP_MILLIS to trigger millisecond-to-microsecond conversion

Testing

  1. Creating a Parquet file with spark.sql.parquet.outputTimestampType=TIMESTAMP_MILLIS
  2. Sending the parquet file path to the Iceberg Table to consider it as a data file (an AddFile() API)
  3. Querying with spark.sql.iceberg.vectorization.enabled=true
  4. Confirming successful vectorized reads with correct timestamp values

@github-actions github-actions Bot added the arrow label Nov 4, 2025
Comment on lines +28 to +30
if (System.getProperty(ALLOCATION_MANAGER_TYPE_PROPERTY) == null) {
System.setProperty(ALLOCATION_MANAGER_TYPE_PROPERTY, "Netty");
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this change?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well the issue is, arrow's auto-detection of allocation manager type breaks when classes are shaded org.apache.iceberg.shaded.org.apache.arrow.*, because CheckAllocator.check() inspects JAR paths via ProtectionDomain.getCodeSource().getLocation() and doesn't recognize shaded package structures. Setting this property bypasses the broken-based detection, which is essential since our BigIntVector allocation for TIMESTAMP_MILLIS requires a working RootAllocator in shaded Spark runtime environments.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add a brief comment explaining why we set arrow.memory.allocation.manager.type to Netty, and consider logging at debug when we default it to Netty so it’s visible at runtime?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@nandorKollar

Copy link
Copy Markdown
Contributor

Is there any way to cover this with tests? One option I can think of is to add a test case to Spark where we write the file with spark.sql.parquet.outputTimestampType=TIMESTAMP_MILLIS, and try to read it back via Arrow batch reader. An other idea, which came to my mind is that restructure the code, handle the special case where Iceberg type is timestamp, but the Parquet type is TIMESTAMP_MILLIS, then we can write a simple test case for that in ArrowSchemaUtil. Not sure if this is a good idea, just a thought.

@github-actions github-actions Bot added the spark label Nov 5, 2025

org.apache.arrow.vector.FieldVector eventTimeVector = root.getVector("event_time");
assertThat(eventTimeVector).isNotNull();
assertThat(eventTimeVector).isInstanceOf(org.apache.arrow.vector.BigIntVector.class);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we also assert the value?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

also, in VectorizedParquetDefinitionLevelReader.java, arrow validity buffer was not being properly set for non-null values. (bit should be 1 when Value is not null). Fixed that.

Comment on lines +262 to +272
try (ParquetWriter<InternalRow> writer =
new NativeSparkWriterBuilder(outputFile)
.set("org.apache.spark.sql.parquet.row.attributes", sparkSchema.json())
.set("spark.sql.parquet.writeLegacyFormat", "false")
.set("spark.sql.parquet.outputTimestampType", "TIMESTAMP_MILLIS")
.set("spark.sql.parquet.fieldId.write.enabled", "true")
.build()) {
for (InternalRow row : rows) {
writer.write(row);
}
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a native parquet writer in the base too?
Maybe we can create a test without spark

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, the issue is, iceberg’s native parquet writers explicitly reject TIMESTAMP_MILLIS (check here), as by design iceberg standardizes on microsecond precision.

TIMESTAMP_MILLS support exists only for reading externally-produced files (like from Spark)

To write a test without Spark will add a lot of extra code, and that too using Parquet API. IMO, it’s better to have the current end to end test with Spark.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we use ExampleParquetWriter or something similar?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe use AvroParquetWriter to create the Parquet file? Just an idea, maybe it is easier to write the file that way, and it supports millisec precision.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nandorKollar well, for

(a) ExampleParquetWriter : I fear, we have to manually construct Parquet group objects and manually define the schema with TIMESTAMP_MILLIS logical type, which is much more complex than our current approach. Moreover, Iceberg never used ExampleParquetWriter, using it would require writing low-level parquet code.

(b) AvroParquetWriter : Iceberg’s AvroSchemaUtil always uses timestampMicros , never timestampMillis check here. While avro supports timestamp-millis, you cannot use it through. We have to manually create the avro schema for it, and also handle the data conversion. Again, it will bring more unnecessary code.

IMO, we should stick to our current approach.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we plan to support reading Parquet files with TIMESTAMP_MILLIS logical type, we need to support it in all of the readers.

  • Arrow
  • Spark
  • Flink

For this we need to find a way to test it.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shubham19may it would be better not to rely on Spark module to cover this case with a test, but rather keep the the arrow module 'self-contained', cover it's functionality with tests there. TestVariantReaders already creates a custom Parquet writer with Avro object modell, it doesn't look too complicated. I'll try to put together an example with ExampleParquetWriter, it might be a bit more complicated. As Iceberg's Parquet writer doesn't write timestamp millis, unfortunately we need to implement our test writer for these types. Actually this is not the only case which is not covered, for example Parquet files with unsigned integer types are not covered either, probably there we can't even use the trick to test it in Spark, as unsigned types are unknown in Spark too.

@shubham19may shubham19may Nov 14, 2025

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure @pvary and @nandorKollar.

I have updated the test, moved it away from Spark to the arrow module, inside TestArrowReader.java. Please give it a review whenever you are free, and do tell me if any further changes are required.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @shubham19may!

Comment thread arrow/src/main/java/org/apache/iceberg/arrow/ArrowAllocation.java Outdated
Comment thread arrow/src/main/java/org/apache/iceberg/arrow/ArrowAllocation.java Outdated
Comment thread arrow/src/main/java/org/apache/iceberg/arrow/ArrowAllocation.java Outdated
if (setArrowValidityVector) {
BitVectorHelper.setBit(vector.getValidityBuffer(), bufferIdx);
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: newline here is not needed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@pvary pvary left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
Thanks @shubham19may!

One small formal ask only.

Please @nandorKollar and @huaxingao review.

Thanks,
Peter

@nandorKollar

Copy link
Copy Markdown
Contributor

LGTM. Thanks @shubham19may!

One small formal ask only.

Please @nandorKollar and @huaxingao review.

Thanks, Peter

@shubham19may thanks for adding the test case, looks good!


Table table = tables.create(schema, tableLocation);

File testFile = new File(tempDir, "timestamp-millis-test.parquet");

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: would it be possible to use InMemoryOutputFile instead of temp files? We can maybe rewrite the Arrow tests in a subsequent followup PR.

@nandorKollar

nandorKollar commented Nov 17, 2025

Copy link
Copy Markdown
Contributor

Oh, one more thing: can we please make the title of this change more precise, we no longer fix any 'shaded JAR initialization' problem, that's misleading.

@shubham19may shubham19may changed the title Fix: Arrow vectorized reads for TIMESTAMP_MILLIS and shaded JAR initialization Fix: Arrow vectorized reads for TIMESTAMP_MILLIS Nov 18, 2025
@shubham19may

Copy link
Copy Markdown
Contributor Author

Oh, one more thing: can we please make the title of this change more precise, we no longer fix any 'shaded JAR initialization' problem, that's misleading.

done

@pvary pvary changed the title Fix: Arrow vectorized reads for TIMESTAMP_MILLIS Arrow: Fix vectorized reads for Parquet TIMESTAMP_MILLIS types Nov 18, 2025
@pvary pvary merged commit b2379e5 into apache:main Nov 19, 2025
41 checks passed
@pvary

pvary commented Nov 19, 2025

Copy link
Copy Markdown
Contributor

Merged to main.
Thanks for the fix @shubham19may and @huaxingao and @nandorKollar for the reviews!

@shubham19may

Copy link
Copy Markdown
Contributor Author

Thanks @pvary @huaxingao @nandorKollar for the reviews and help.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Reading Parquet files in Spark Iceberg with TIMESTAMP_MILLIS parquet type causes ClassCastException

4 participants