GH-1116: [Java] Fix compressed buffer prefix write and ZSTD dstCapacity #1119

Open
LuciferYang wants to merge 1 commit into apache:main from LuciferYang:fix-zstd-compressed-buffer-prefix-corruption

@LuciferYang

What

Two fixes in the compression codec:

  1. AbstractCompressionCodec.compress(): uncompressed-length prefix could be written with a wrong value

    The old code read uncompressedBuffer.writerIndex() after doCompress() to populate the 8-byte prefix. But uncompressedBuffer is a shared reference to the vector's internal buffer. If writerIndex changed between doCompress() and the subsequent read -- e.g. due to VectorSchemaRoot reuse (clear/allocateNew) -- the prefix would get a wrong value such as 0.

    Fix: capture writerIndex() once at the top of compress() and reuse it for the empty-buffer check, size comparison, and prefix write.

  2. ZstdCompressionCodec.doCompress(): dstCapacity overstated by 8 bytes

    Zstd.compressUnsafe(dst, dstSize, ...) expects dstSize to be the space available starting at dst. The old code offset dst by 8 bytes to skip the prefix but still passed 8 + maxSize rather than maxSize. In practice the compressBound() headroom prevented an actual out-of-bounds write, but the argument was semantically wrong.

    Fix: pass maxSize instead of dstSize (which is 8 + maxSize).
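
Both fixes can be illustrated without Arrow or zstd-jni. The sketch below is a minimal stand-in, not the code in this PR: `PrefixedCompressSketch`, `SharedBuffer`, and the `afterDoCompress` hook are hypothetical names, `java.util.zip.Deflater` replaces Zstd, and the `maxSize` formula is an assumed generous bound rather than `compressBound()`. It captures the length once before compressing and hands the compressor only the space that remains after the 8-byte prefix, which is written little-endian as in the Arrow IPC format.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;
import java.util.zip.Deflater;

class PrefixedCompressSketch {
    static final int PREFIX_BYTES = 8; // size of the uncompressed-length prefix

    /** Hypothetical stand-in for a vector buffer that its owner may reset at any time. */
    static class SharedBuffer {
        final byte[] data;
        long writerIndex;

        SharedBuffer(byte[] data) {
            this.data = data;
            this.writerIndex = data.length;
        }

        void clear() {
            writerIndex = 0; // simulates reuse via clear/allocateNew
        }
    }

    /** afterDoCompress is a hypothetical hook that runs between compression and the prefix write. */
    static byte[] compress(SharedBuffer in, Runnable afterDoCompress) {
        // Fix 1: read the writer index once and reuse the captured value everywhere.
        final int uncompressedLength = (int) in.writerIndex;

        // Assumed generous worst-case bound for this stand-in codec.
        final int maxSize = uncompressedLength + uncompressedLength / 4 + 64;
        byte[] out = new byte[PREFIX_BYTES + maxSize];

        Deflater deflater = new Deflater();
        deflater.setInput(in.data, 0, uncompressedLength);
        deflater.finish();
        // Fix 2: the payload starts at offset 8, so the capacity passed is maxSize,
        // not PREFIX_BYTES + maxSize.
        int written = deflater.deflate(out, PREFIX_BYTES, maxSize);
        boolean done = deflater.finished();
        deflater.end();
        if (!done) {
            throw new IllegalStateException("destination bound too small");
        }

        afterDoCompress.run(); // the shared buffer may be reset here

        // The prefix uses the captured length, not a fresh read of in.writerIndex.
        ByteBuffer.wrap(out).order(ByteOrder.LITTLE_ENDIAN).putLong(0, uncompressedLength);
        return Arrays.copyOf(out, PREFIX_BYTES + written);
    }

    static long readPrefix(byte[] compressed) {
        return ByteBuffer.wrap(compressed).order(ByteOrder.LITTLE_ENDIAN).getLong(0);
    }
}
```

Calling `compress(buf, buf::clear)` resets the shared buffer between compression and the prefix write. Because the length was captured up front, the prefix still holds the original size instead of 0.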

Tests

  • testMultiBatchZstdStreamWithWideSchemaAndAllNulls -- 100 fields x 10 batches x 500 rows, VectorSchemaRoot reuse with all-null timestamp columns in every 3rd batch, full streaming round-trip with per-cell verification.
  • testAllNullFixedWidthVectorZstdRoundTrip -- 3469-row all-null TimestampMilliVector, buffer-level compress/decompress, asserts decompressed writerIndex matches the original.
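
The buffer-level check in the second test can be approximated with the JDK alone. This is a hedged sketch rather than the test added in this PR: `PrefixRoundTripSketch` is a hypothetical name, `Deflater`/`Inflater` stand in for Zstd, the `maxSize` bound is an assumption, and the all-null data buffer is assumed to be 3469 x 8 zero bytes. The reader sizes its output from the prefix, so a prefix of 0 in front of a non-empty payload cannot round-trip.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class PrefixRoundTripSketch {
    static final int PREFIX_BYTES = 8;

    /** Writes an 8-byte little-endian length prefix followed by the compressed payload. */
    static byte[] compress(byte[] input) {
        int maxSize = input.length + input.length / 4 + 64; // assumed generous bound
        byte[] out = new byte[PREFIX_BYTES + maxSize];
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        int written = deflater.deflate(out, PREFIX_BYTES, maxSize);
        boolean done = deflater.finished();
        deflater.end();
        if (!done) {
            throw new IllegalStateException("destination bound too small");
        }
        ByteBuffer.wrap(out).order(ByteOrder.LITTLE_ENDIAN).putLong(0, input.length);
        return Arrays.copyOf(out, PREFIX_BYTES + written);
    }

    /** Allocates the output from the prefix and rejects payloads that disagree with it. */
    static byte[] decompress(byte[] compressed) {
        long declared = ByteBuffer.wrap(compressed).order(ByteOrder.LITTLE_ENDIAN).getLong(0);
        byte[] out = new byte[Math.toIntExact(declared)];
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed, PREFIX_BYTES, compressed.length - PREFIX_BYTES);
            int produced = inflater.inflate(out);
            if (produced != declared || !inflater.finished()) {
                throw new IllegalStateException("prefix disagrees with payload");
            }
            return out;
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt payload", e);
        } finally {
            inflater.end();
        }
    }
}
```

A round trip then compares the restored length with the original, mirroring the writerIndex assertion described above.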

Closes

Closes #1116

@LuciferYang changed the title from "# GH-1116: [Java] Fix compressed buffer prefix write and ZSTD dstCapacity" to "GH-1116: [Java] Fix compressed buffer prefix write and ZSTD dstCapacity" on Apr 18, 2026
@github-actions

Thank you for opening a pull request!

Please label the PR with one or more of:

  • bug-fix
  • chore
  • dependencies
  • documentation
  • enhancement

Also, add the 'breaking-change' label if appropriate.

See CONTRIBUTING.md for details.

Development

Successfully merging this pull request may close these issues.

[IPC][ZSTD] Compressed buffer prefix can be written as 0 while ZSTD frame has non-zero content size
