Improve RunLengthBitPackingHybridDecoder.readNext to avoid per-call buffer allocation and DataInputStream wrapping #3466

@arouel

Description

Describe the enhancement requested

RunLengthBitPackingHybridDecoder.readNext() allocates a new int[] and byte[] on every PACKED-mode call. In workloads that decode many bit-packed runs (definition levels, repetition levels, RLE-encoded integers), these allocations dominate the read-side allocation profile. The upstream code even acknowledges this with a // TODO: reuse a buffer comment.

Problem 1: per-call buffer allocation

Lines 94–95 allocate fresh arrays on every PACKED-mode readNext():

currentBuffer = new int[currentCount]; // TODO: reuse a buffer
byte[] bytes = new byte[numGroups * bitWidth];

currentCount is always numGroups * 8, and numGroups is typically small (1–16 groups = 8–128 values per run). These allocations are individually modest but occur thousands of times per column chunk — once per bit-packed run. In a 180M-row merge with multiple integer/boolean columns, the cumulative allocation is substantial.

Since currentCount varies between runs (different numGroups values), the fix keeps the existing field-level int[] and adds a field-level byte[], growing each only when the next run requires a larger buffer.
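The grow-only reuse described above could look roughly like the sketch below. It is an illustration under assumptions, not the actual patch: the class PackedRunBuffers and its methods valuesFor and packedFor are hypothetical names, and the sizes simply follow the numGroups * 8 and numGroups * bitWidth expressions quoted from the decoder.

```java
/** Hypothetical holder for grow-only scratch buffers; not the real decoder class. */
final class PackedRunBuffers {
  // Both arrays live for the lifetime of the holder and are reused across runs.
  private int[] values = new int[0];
  private byte[] packed = new byte[0];

  /** Returns an int[] with at least numGroups * 8 slots, reallocating only on growth. */
  int[] valuesFor(int numGroups) {
    int needed = numGroups * 8;
    if (values.length < needed) {
      values = new int[needed];
    }
    return values;
  }

  /** Returns a byte[] with at least numGroups * bitWidth slots, reallocating only on growth. */
  byte[] packedFor(int numGroups, int bitWidth) {
    int needed = numGroups * bitWidth;
    if (packed.length < needed) {
      packed = new byte[needed];
    }
    return packed;
  }
}
```

One consequence of this design is that a reused array may be longer than the current run, so the caller has to rely on currentCount (and the computed byte count) rather than on the array's length.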

Problem 2: per-call DataInputStream wrapping

Line 98 creates a new DataInputStream(in) on every PACKED-mode call:

new DataInputStream(in).readFully(bytes, 0, bytesToRead);

This allocates a DataInputStream wrapper object per call just to access readFully(). A private readFully() method on the decoder itself eliminates this allocation and the virtual dispatch through the wrapper.
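A wrapper-free readFully() might be sketched as follows. The class name StreamReads and the static signature are assumptions made to keep the example self-contained; in the decoder it would presumably be a private method operating on the existing in field. The loop mirrors the contract of DataInputStream.readFully(): keep reading until len bytes have arrived, and throw EOFException if the stream ends first.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper; shows a readFully loop without a DataInputStream wrapper. */
final class StreamReads {
  private StreamReads() {}

  /** Reads exactly len bytes into dst starting at off, or throws EOFException. */
  static void readFully(InputStream in, byte[] dst, int off, int len) throws IOException {
    int done = 0;
    while (done < len) {
      // read() may return fewer bytes than requested, so loop until the range is filled.
      int n = in.read(dst, off + done, len - done);
      if (n < 0) {
        throw new EOFException("stream ended after " + done + " of " + len + " bytes");
      }
      done += n;
    }
  }
}
```

On Java 9 and later, InputStream.readNBytes(byte[], int, int) performs a similar loop but returns a short count at end of stream instead of throwing, so the caller would need to check the result.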

Component(s)

Core
