GH-1116: [Java] Fix compressed buffer prefix write and ZSTD dstCapacity by LuciferYang · Pull Request #1119 · apache/arrow-java

LuciferYang · 2026-04-18T13:35:04Z

What

Two fixes in the compression codec:

AbstractCompressionCodec.compress(): uncompressed-length prefix could be written with a wrong value

The old code read uncompressedBuffer.writerIndex() after doCompress() to populate the 8-byte prefix. But uncompressedBuffer is a shared reference to the vector's internal buffer. If writerIndex changed between doCompress() and the subsequent read -- e.g. due to VectorSchemaRoot reuse (clear/allocateNew) -- the prefix would get a wrong value such as 0.

Fix: capture writerIndex() once at the top of compress() and reuse it for the empty-buffer check, size comparison, and prefix write.
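
The failure mode can be reproduced without Arrow. The following is a minimal sketch, not the codec's actual code: `SharedBuf`, `prefixReadLate`, and `prefixCapturedOnce` are hypothetical names, and the `Runnable` stands in for whatever changes the buffer's writer index while compression is under way.

```java
public class PrefixCaptureSketch {
    /** Hypothetical stand-in for a shared buffer whose owner can reset its writer index. */
    static final class SharedBuf {
        long writerIndex;

        SharedBuf(long writerIndex) {
            this.writerIndex = writerIndex;
        }
    }

    /** Old pattern: the length used for the prefix is read after the compression step. */
    static long prefixReadLate(SharedBuf buf, Runnable compressStep) {
        compressStep.run();
        return buf.writerIndex; // may no longer match the bytes that were compressed
    }

    /** Fixed pattern: the length is captured once, before anything else happens. */
    static long prefixCapturedOnce(SharedBuf buf, Runnable compressStep) {
        final long uncompressedLength = buf.writerIndex;
        compressStep.run();
        return uncompressedLength;
    }

    public static void main(String[] args) {
        SharedBuf a = new SharedBuf(4000);
        SharedBuf b = new SharedBuf(4000);
        // Simulate the owner clearing the buffer before the prefix is written.
        long late = prefixReadLate(a, () -> a.writerIndex = 0);
        long early = prefixCapturedOnce(b, () -> b.writerIndex = 0);
        System.out.println("late=" + late + " early=" + early); // late=0 early=4000
    }
}
```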

ZstdCompressionCodec.doCompress(): dstCapacity overstated by 8 bytes

Zstd.compressUnsafe(dst, dstSize, ...) expects dstSize to be the space available starting at dst. The code advances dst by 8 bytes to skip the prefix, but still passed 8 + maxSize rather than maxSize. In practice the compressBound() headroom prevented an actual out-of-bounds write, but the parameter was semantically wrong.

Fix: pass maxSize instead of dstSize (the local value 8 + maxSize).
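
The capacity arithmetic can be checked with plain `java.nio.ByteBuffer`. This is only an illustration under assumptions: `payloadRegion` is a hypothetical helper, `maxSize = 1024` is an arbitrary stand-in for the compressBound() result, and no zstd-jni call is made.

```java
import java.nio.ByteBuffer;

public class DstCapacitySketch {
    static final int PREFIX_BYTES = 8; // room for the uncompressed-length prefix

    /** Returns a view of the region that starts right after the prefix. */
    static ByteBuffer payloadRegion(ByteBuffer whole) {
        ByteBuffer view = whole.duplicate();
        view.position(PREFIX_BYTES);
        return view.slice(); // capacity equals the bytes remaining after the prefix
    }

    public static void main(String[] args) {
        int maxSize = 1024; // arbitrary stand-in for the compressBound() result
        ByteBuffer whole = ByteBuffer.allocate(PREFIX_BYTES + maxSize);
        ByteBuffer payload = payloadRegion(whole);
        System.out.println("allocated = " + whole.capacity());            // 1032
        System.out.println("available after prefix = " + payload.capacity()); // 1024
    }
}
```

Once the destination pointer has been moved past the prefix, only `maxSize` bytes remain, so that is the capacity a compressor writing from that position should be told about.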

Tests

testMultiBatchZstdStreamWithWideSchemaAndAllNulls -- 100 fields x 10 batches x 500 rows, VectorSchemaRoot reuse with all-null timestamp columns in every 3rd batch, full streaming round-trip with per-cell verification.
testAllNullFixedWidthVectorZstdRoundTrip -- 3469-row all-null TimestampMilliVector, buffer-level compress/decompress, asserts decompressed writerIndex matches the original.
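
The shape of the buffer-level check can be sketched without Arrow or zstd-jni. The example below is an assumption-laden stand-in, not the PR's test: it uses `java.util.zip.Deflater`/`Inflater` in place of ZSTD, a plain `byte[]` of 3469 x 8 zero bytes in place of the all-null vector's data buffer, and hypothetical `compress`/`decompress` helpers that frame the payload with an 8-byte little-endian length prefix.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class PrefixRoundTripSketch {
    static final int PREFIX_BYTES = 8;

    /** Frames the compressed bytes with the uncompressed length, captured once up front. */
    static byte[] compress(byte[] src) {
        final long uncompressedLength = src.length;
        Deflater deflater = new Deflater();
        deflater.setInput(src);
        deflater.finish();
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(chunk);
            body.write(chunk, 0, n);
        }
        deflater.end();
        ByteBuffer out = ByteBuffer.allocate(PREFIX_BYTES + body.size()).order(ByteOrder.LITTLE_ENDIAN);
        out.putLong(uncompressedLength);
        out.put(body.toByteArray());
        return out.array();
    }

    /** Sizes the output from the prefix, so a wrong prefix yields a wrong-sized result. */
    static byte[] decompress(byte[] framed) {
        long uncompressedLength = ByteBuffer.wrap(framed).order(ByteOrder.LITTLE_ENDIAN).getLong();
        byte[] dst = new byte[Math.toIntExact(uncompressedLength)];
        Inflater inflater = new Inflater();
        inflater.setInput(framed, PREFIX_BYTES, framed.length - PREFIX_BYTES);
        int written = 0;
        try {
            while (written < dst.length && !inflater.finished()) {
                int n = inflater.inflate(dst, written, dst.length - written);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // no further progress is possible
                }
                written += n;
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt compressed payload", e);
        } finally {
            inflater.end();
        }
        if (written != dst.length) {
            throw new IllegalStateException("expected " + dst.length + " bytes, got " + written);
        }
        return dst;
    }

    public static void main(String[] args) {
        byte[] original = new byte[3469 * 8]; // all zeros, loosely modeled on an all-null column
        byte[] restored = decompress(compress(original));
        System.out.println("round-trip length = " + restored.length); // 27752
    }
}
```

Because the reader allocates its output from the prefix alone, a prefix of 0 would silently produce an empty buffer, which is why asserting on the decompressed length is a useful check.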

Closes

Closes #1116

github-actions · 2026-04-18T13:38:04Z

Thank you for opening a pull request!

Please label the PR with one or more of:

bug-fix
chore
dependencies
documentation
enhancement

Also, add the 'breaking-change' label if appropriate.

See CONTRIBUTING.md for details.

fix

4c754b1

LuciferYang requested review from jbonofre, laurentgo, lidavidm and wgtmac as code owners April 18, 2026 13:35

LuciferYang changed the title ~~# GH-1116: [Java] Fix compressed buffer prefix write and ZSTD dstCapacity~~ GH-1116: [Java] Fix compressed buffer prefix write and ZSTD dstCapacity Apr 18, 2026

LuciferYang wants to merge 1 commit into apache:main from LuciferYang:fix-zstd-compressed-buffer-prefix-corruption