arrow-select: optimise coalesced takes for primitive and view arrays #9758

Open
ClSlaid wants to merge 3 commits into apache:main from ClSlaid:optimize-pr-8991-followup

ClSlaid (Contributor) commented Apr 18, 2026

Summary

  • add a direct BatchCoalescer::push_batch_with_indices path for primitive, Utf8View, and BinaryView columns when the index array is integer-typed and contains no nulls
  • specialise indexed copying for primitive and byte-view in-progress arrays so supported schemas can coalesce rows directly without materialising an intermediate taken RecordBatch
  • keep other data types on the existing take_record_batch fallback; benchmark work on this branch showed widening the direct path beyond primitive and view arrays regressed Utf8 and dictionary-backed cases
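
The routing rule described above can be sketched as follows. This is a minimal illustration and not the PR's code: `ColumnType`, `TakePath`, `supports_direct_copy`, and `choose_path` are hypothetical names, the enum is a stand-in for arrow's `DataType`, and the sketch assumes the decision is made once per batch with indices that are already integer-typed.

```rust
/// Hypothetical stand-in for arrow's `DataType`; only a few variants are modelled.
#[derive(Debug, Clone, PartialEq)]
enum ColumnType {
    Int32,
    Float64,
    Utf8View,
    BinaryView,
    Utf8,
    Dictionary,
}

/// Which route an indexed push would follow in this sketch.
#[derive(Debug, PartialEq)]
enum TakePath {
    /// Copy selected rows straight into the in-progress arrays.
    Direct,
    /// Materialise a taken batch first (the pre-existing behaviour).
    Fallback,
}

/// Primitive and view columns are the only ones given a direct path here.
fn supports_direct_copy(ty: &ColumnType) -> bool {
    matches!(
        ty,
        ColumnType::Int32 | ColumnType::Float64 | ColumnType::Utf8View | ColumnType::BinaryView
    )
}

/// Assumption: one unsupported column, or any null index, sends the whole
/// batch down the fallback route.
fn choose_path(schema: &[ColumnType], indices_have_nulls: bool) -> TakePath {
    if !indices_have_nulls && schema.iter().all(supports_direct_copy) {
        TakePath::Direct
    } else {
        TakePath::Fallback
    }
}
```

Under these assumptions, a schema of `Int32` and `Utf8View` columns with null-free indices takes the direct route, while adding a `Utf8` or `Dictionary` column, or a null index, selects the fallback.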

Testing

  • cargo test -p arrow-select coalesce --lib
  • cargo clippy -p arrow-select --lib --tests -- -D warnings
  • cargo clippy -p arrow --bench coalesce_kernels --features test_utils -- -D warnings
  • cargo clippy --workspace --all-targets -- -D warnings

Benchmarks

  • take: primitive, 8192, nulls: 0, selectivity: 0.01: 3.5194-3.5796 ms -> 1.8780-1.9136 ms
  • take: primitive, 8192, nulls: 0.1, selectivity: 0.01: 5.5208-5.5708 ms -> 4.0016-4.1647 ms
  • take: primitive, 8192, nulls: 0, selectivity: 0.001: 23.684-23.813 ms -> 5.9713-6.0137 ms
  • take: single_utf8view, 8192, nulls: 0, selectivity: 0.01: 3.0301-3.0830 ms -> 2.4513-2.4854 ms
  • take: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.01: 1.8643-1.8823 ms -> 1.2706-1.2856 ms
  • take: single_binaryview, 8192, nulls: 0, selectivity: 0.01: 3.1346-3.2991 ms -> 2.7578-2.8539 ms
  • take: mixed_binaryview (max_string_len=20), 8192, nulls: 0, selectivity: 0.01: 1.9634-2.0215 ms -> 1.4117-1.4383 ms
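
To read the ranges above as speedups, one rough approach is to divide the midpoint of the "before" range by the midpoint of the "after" range. The helper names `midpoint` and `speedup` below are illustrative, and using the midpoint as the point estimate is an assumption, since only the range bounds are reported here.

```rust
/// Midpoint of a reported timing range, in milliseconds.
fn midpoint(lo: f64, hi: f64) -> f64 {
    (lo + hi) / 2.0
}

/// Speedup factor: baseline time divided by new time, using range midpoints.
fn speedup(before: (f64, f64), after: (f64, f64)) -> f64 {
    midpoint(before.0, before.1) / midpoint(after.0, after.1)
}
```

For example, `speedup((23.684, 23.813), (5.9713, 6.0137))` gives about 3.96 for the primitive case at selectivity 0.001, and `speedup((3.5194, 3.5796), (1.8780, 1.9136))` gives about 1.87 for the primitive case at selectivity 0.01.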

## What
- add a direct `push_batch_with_indices` path in `BatchCoalescer` for primitive, `Utf8View`, and `BinaryView` columns when the index array is integer-typed and contains no nulls
- teach the coalescer internals to copy indexed primitive and view values directly into in-progress output buffers instead of materialising an intermediate taken `RecordBatch`
- add dedicated `take:` coalesce benchmarks and new indexed coalescing tests for primitive, `Utf8View`, and `BinaryView` inputs

## How
- route supported batches through a direct indices path that chunks the input indices across coalesced output batch boundaries and reuses the existing in-progress array builders
- specialise `InProgressPrimitiveArray::copy_indices` to gather values and build the taken null mask directly from the source array
- specialise `InProgressByteViewArray::copy_indices` to gather selected views and nulls directly, compute buffer compaction from the selected views, and lazily compute whole-array buffer usage only when the row-copy path needs it
- keep unsupported types on the existing `take_record_batch` fallback so the optimisation only applies where the benchmark data shows it is profitable
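
The chunking and primitive gather steps above can be sketched with plain vectors. This is a simplified model, not the arrow-rs implementation: `InProgressI32` and `Coalescer` are hypothetical types, validity is a `Vec<bool>` instead of a packed bitmap, and only a single `i32` column is handled.

```rust
/// Hypothetical in-progress primitive column: values plus a validity mask.
#[derive(Default)]
struct InProgressI32 {
    values: Vec<i32>,
    validity: Vec<bool>,
}

impl InProgressI32 {
    /// Gather the rows named by `indices` from the source directly into the
    /// output buffers, building the taken validity mask along the way.
    fn copy_indices(&mut self, src_values: &[i32], src_validity: Option<&[bool]>, indices: &[u32]) {
        self.values.extend(indices.iter().map(|&i| src_values[i as usize]));
        match src_validity {
            Some(valid) => self.validity.extend(indices.iter().map(|&i| valid[i as usize])),
            // A source without a validity mask has no nulls.
            None => self.validity.extend(std::iter::repeat(true).take(indices.len())),
        }
    }

    fn len(&self) -> usize {
        self.values.len()
    }
}

/// Hypothetical single-column coalescer with a fixed target batch size.
struct Coalescer {
    target: usize,
    in_progress: InProgressI32,
    completed: Vec<(Vec<i32>, Vec<bool>)>,
}

impl Coalescer {
    fn new(target: usize) -> Self {
        assert!(target > 0, "target batch size must be positive");
        Coalescer { target, in_progress: InProgressI32::default(), completed: Vec::new() }
    }

    /// Split `indices` at output batch boundaries so that every completed
    /// batch holds exactly `target` rows; no intermediate taken batch is built.
    fn push_with_indices(&mut self, src_values: &[i32], src_validity: Option<&[bool]>, mut indices: &[u32]) {
        while !indices.is_empty() {
            // `room` is never zero because a full batch is flushed immediately.
            let room = self.target - self.in_progress.len();
            let n = room.min(indices.len());
            let (head, tail) = indices.split_at(n);
            self.in_progress.copy_indices(src_values, src_validity, head);
            indices = tail;
            if self.in_progress.len() == self.target {
                let done = std::mem::take(&mut self.in_progress);
                self.completed.push((done.values, done.validity));
            }
        }
    }
}
```

With a target of 3 rows, pushing indices `[4, 1, 0, 3]` over a five-row source completes one batch from the first three indices and leaves the fourth row buffered for the next push.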

## Why It Works
- the previous `push_batch_with_indices` implementation always paid to allocate and populate a temporary taken `RecordBatch` before coalescing
- for primitive and view arrays, the coalescer can write the selected rows straight into its output builders, avoiding that extra batch materialisation and the extra copy it implies
- the view-array path remains safe because it preserves the existing reuse-vs-compact behaviour, but bases sparse compaction decisions on the actually selected views rather than the whole source batch
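
One way to picture a compaction decision driven by the selected views is the sketch below. It is illustrative only: `View` is a simplified stand-in for Arrow's 16-byte view layout, `should_compact` is a hypothetical helper, and the "held bytes exceed twice the needed bytes" threshold is an assumption, not necessarily the heuristic used in `coalesce.rs`.

```rust
/// Simplified, hypothetical view. Arrow packs a view into 16 bytes; values of
/// 12 bytes or fewer are stored inline and reference no data buffer.
#[derive(Clone, Copy)]
struct View {
    len: u32,
    buffer_index: u32,
}

const MAX_INLINE_LEN: u32 = 12;

/// Decide whether to copy the selected values into fresh buffers (compact)
/// instead of keeping the source buffers alive (reuse). Only the views named
/// by `indices` are inspected, not the whole source array.
fn should_compact(views: &[View], buffer_sizes: &[usize], indices: &[u32]) -> bool {
    let mut needed = 0usize; // bytes the selected views actually reference
    let mut referenced = vec![false; buffer_sizes.len()];
    for &i in indices {
        let v = views[i as usize];
        if v.len > MAX_INLINE_LEN {
            needed += v.len as usize;
            referenced[v.buffer_index as usize] = true;
        }
    }
    // Bytes that would stay alive if the referenced buffers were reused as-is.
    let mut held = 0usize;
    for (size, is_referenced) in buffer_sizes.iter().zip(referenced.iter()) {
        if *is_referenced {
            held += *size;
        }
    }
    // Assumed threshold: compact when reuse would hold more than 2x the needed bytes.
    held > needed * 2
}
```

In this model, selecting one 100-byte value out of a 1000-byte buffer triggers compaction, while selecting views that cover the whole buffer does not, even when the rest of the source batch is sparse.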

## Tests And Validation
- added indexed coalescing tests for mixed primitive, mixed `Utf8View`, mixed `BinaryView`, and `Utf8` fallback behaviour in `arrow-select/src/coalesce.rs`
- added `take:` coalesce benchmarks in `arrow/benches/coalesce_kernels.rs` covering primitive, `Utf8View`, `BinaryView`, `Utf8`, and dictionary-backed schemas
- validated with:
  - `cargo test -p arrow-select coalesce --lib`
  - `cargo clippy -p arrow-select --lib --tests -- -D warnings`
  - `cargo clippy -p arrow --bench coalesce_kernels --features test_utils -- -D warnings`

Signed-off-by: cl <cailue@apache.org>
github-actions bot added the arrow (Changes to the arrow crate) label Apr 18, 2026

ClSlaid (Contributor, Author) commented Apr 18, 2026

/cc @alamb please have a look; this is a follow-up PR to #8991

Comment thread arrow-select/src/coalesce.rs Outdated
Comment thread arrow-select/src/coalesce.rs Outdated
Signed-off-by: 蔡略 <cailue@apache.org>