
perf: Add BulkNullStringArrayBuilder trait, use in repeat #21854

Open
neilconway wants to merge 2 commits into apache:main from neilconway:neilc/perf-builder-repeat

Conversation

@neilconway
Contributor

Add a DataFusion-side trait that abstracts over the bulk-NULL string array builders (GenericStringArrayBuilder and StringViewArrayBuilder), so that functions which dispatch over Utf8/LargeUtf8/Utf8View can adopt the new builders without giving up their single-bodied generic implementation.
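
To illustrate the shape of such an abstraction, here is a minimal std-only sketch. The trait `BulkNullBuilder`, the two toy builders, and `repeat_all` are hypothetical names invented for this example; they model the idea (one generic body, several builder layouts, validity supplied in bulk at `finish`) and are not the PR's actual API.

```rust
/// Hypothetical stand-in for the trait described above. The real trait's
/// method names and signatures may differ.
trait BulkNullBuilder {
    /// Append a non-null value.
    fn append_value(&mut self, s: &str);
    /// Append an empty slot for a row that will be marked null later.
    fn append_placeholder(&mut self);
    /// Finish the array, applying the validity mask in one step.
    fn finish(self, validity: &[bool]) -> Vec<Option<String>>;
}

/// Toy offset-based layout (loosely models Utf8 / LargeUtf8).
struct OffsetBuilder {
    data: String,
    offsets: Vec<usize>,
}

impl OffsetBuilder {
    fn new() -> Self {
        Self { data: String::new(), offsets: vec![0] }
    }
}

impl BulkNullBuilder for OffsetBuilder {
    fn append_value(&mut self, s: &str) {
        self.data.push_str(s);
        self.offsets.push(self.data.len());
    }
    fn append_placeholder(&mut self) {
        // Zero-length slot: repeat the previous offset.
        self.offsets.push(self.data.len());
    }
    fn finish(self, validity: &[bool]) -> Vec<Option<String>> {
        self.offsets
            .windows(2)
            .zip(validity)
            .map(|(w, &ok)| ok.then(|| self.data[w[0]..w[1]].to_string()))
            .collect()
    }
}

/// Toy per-row layout (loosely models Utf8View).
struct RowBuilder {
    rows: Vec<String>,
}

impl RowBuilder {
    fn new() -> Self {
        Self { rows: Vec::new() }
    }
}

impl BulkNullBuilder for RowBuilder {
    fn append_value(&mut self, s: &str) {
        self.rows.push(s.to_string());
    }
    fn append_placeholder(&mut self) {
        self.rows.push(String::new());
    }
    fn finish(self, validity: &[bool]) -> Vec<Option<String>> {
        self.rows
            .into_iter()
            .zip(validity)
            .map(|(r, &ok)| ok.then_some(r))
            .collect()
    }
}

/// One generic body serves every builder that implements the trait.
fn repeat_all<B: BulkNullBuilder>(
    mut builder: B,
    values: &[&str],
    validity: &[bool],
    n: usize,
) -> Vec<Option<String>> {
    for (v, &ok) in values.iter().zip(validity) {
        if ok {
            builder.append_value(&v.repeat(n));
        } else {
            builder.append_placeholder();
        }
    }
    builder.finish(validity)
}
```

Calling `repeat_all(OffsetBuilder::new(), ...)` and `repeat_all(RowBuilder::new(), ...)` with the same inputs exercises both layouts through the same function body.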

Convert repeat as the first call site. The output is null iff either input is null, so the per-row null match becomes a single NullBuffer::union over the input null buffers, evaluated once before the loop.
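
The null-union step can be sketched without Arrow as follows. This is an illustrative model only: validity masks are plain `bool` slices (`true` means valid, `None` means "no nulls"), and `union_nulls` and `repeat_rows` are hypothetical helpers that mimic the semantics of `NullBuffer::union`, where a row is valid only if it is valid in both inputs.

```rust
/// Combine two optional validity masks. A row is valid in the result only
/// when it is valid in both inputs; `None` means the input has no nulls.
fn union_nulls(a: Option<&[bool]>, b: Option<&[bool]>) -> Option<Vec<bool>> {
    match (a, b) {
        (None, None) => None,
        (Some(v), None) | (None, Some(v)) => Some(v.to_vec()),
        (Some(x), Some(y)) => Some(x.iter().zip(y).map(|(p, q)| *p && *q).collect()),
    }
}

/// Toy `repeat`: the output null mask is computed once, before the loop,
/// instead of matching on both inputs' nullness for every row.
fn repeat_rows(
    strings: &[&str],
    string_validity: Option<&[bool]>,
    counts: &[usize],
    count_validity: Option<&[bool]>,
) -> Vec<Option<String>> {
    let nulls = union_nulls(string_validity, count_validity);
    let mut out = Vec::with_capacity(strings.len());
    for i in 0..strings.len() {
        let valid = nulls.as_ref().map_or(true, |n| n[i]);
        out.push(if valid {
            Some(strings[i].repeat(counts[i]))
        } else {
            None
        });
    }
    out
}
```

In the real builders the loop would presumably append a placeholder for null rows and hand the combined mask to the builder at the end; the sketch only shows where the mask comes from.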

Also mark the inherent append_value/append_placeholder methods on the new builders as #[inline]; without this, calls through the trait wrapper end up going through a non-inlined inherent method, which slows down small-output paths.
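
The delegation pattern in question looks roughly like the sketch below. `ToyBuilder`, `AppendStr`, and `fill` are hypothetical names for illustration, not the PR's types. Note that `#[inline]` is only a hint to the compiler; whether it pays off is an empirical question, which is why the benchmarks below matter.

```rust
/// Toy builder with inherent append methods (hypothetical, std-only).
pub struct ToyBuilder {
    data: String,
    offsets: Vec<usize>,
}

impl ToyBuilder {
    pub fn new() -> Self {
        Self { data: String::new(), offsets: vec![0] }
    }

    /// `#[inline]` makes this body a candidate for inlining at call sites,
    /// including the trait wrapper below.
    #[inline]
    pub fn append_value(&mut self, s: &str) {
        self.data.push_str(s);
        self.offsets.push(self.data.len());
    }

    #[inline]
    pub fn append_placeholder(&mut self) {
        self.offsets.push(self.data.len());
    }

    /// Number of rows appended so far.
    pub fn len(&self) -> usize {
        self.offsets.len() - 1
    }
}

/// Thin trait wrapper so generic code can drive the builder.
pub trait AppendStr {
    fn append_value(&mut self, s: &str);
    fn append_placeholder(&mut self);
}

impl AppendStr for ToyBuilder {
    #[inline]
    fn append_value(&mut self, s: &str) {
        // `ToyBuilder::append_value` resolves to the inherent method,
        // so this delegates rather than recursing.
        ToyBuilder::append_value(self, s)
    }

    #[inline]
    fn append_placeholder(&mut self) {
        ToyBuilder::append_placeholder(self)
    }
}

/// Generic caller: every append goes through the trait wrapper.
pub fn fill<B: AppendStr>(builder: &mut B, rows: &[Option<&str>]) {
    for row in rows {
        match row {
            Some(s) => builder.append_value(s),
            None => builder.append_placeholder(),
        }
    }
}
```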

Which issue does this PR close?

Rationale for this change

Optimize NULL handling in repeat using the bulk-NULL string builders that have recently been added. This requires adding BulkNullStringArrayBuilder, a trait that is similar in spirit to Arrow's StringLikeArrayBuilder.

Benchmarks:

  • repeat_string overflow [size=1024, repeat_times=1073741824]: 1022.5ns → 1054.5ns (+3.13%)
  • repeat_string overflow [size=4096, repeat_times=1073741824]: 1016.6ns → 1055.3ns (+3.81%)
  • repeat_large_string [size=1024, repeat_times=3]: 32.4µs → 26.6µs (−17.90%)
  • repeat_large_string [size=4096, repeat_times=3]: 127.4µs → 104.0µs (−18.37%)
  • repeat_string [size=1024, repeat_times=3]: 32.6µs → 26.8µs (−17.79%)
  • repeat_string [size=4096, repeat_times=3]: 127.4µs → 105.5µs (−17.19%)
  • repeat_string_view [size=1024, repeat_times=3]: 37.3µs → 31.7µs (−15.01%)
  • repeat_string_view [size=4096, repeat_times=3]: 146.5µs → 124.5µs (−15.02%)
  • repeat_large_string [size=1024, repeat_times=30]: 82.0µs → 80.4µs (−1.95%)
  • repeat_large_string [size=4096, repeat_times=30]: 344.2µs → 338.7µs (−1.60%)
  • repeat_string [size=1024, repeat_times=30]: 81.7µs → 79.7µs (−2.45%)
  • repeat_string [size=4096, repeat_times=30]: 352.2µs → 334.7µs (−4.97%)
  • repeat_string_view [size=1024, repeat_times=30]: 88.1µs → 83.1µs (−5.68%)
  • repeat_string_view [size=4096, repeat_times=30]: 368.8µs → 342.6µs (−7.10%)
  • repeat/scalar_utf8: 174.7ns → 179.2ns (+2.58%)
  • repeat/scalar_utf8view: 174.5ns → 180.5ns (+3.44%)

What changes are included in this PR?

  • Add BulkNullStringArrayBuilder
  • Optimize repeat using BulkNullStringArrayBuilder
  • Mark append_value/append_placeholder on GenericStringArrayBuilder and StringViewArrayBuilder as #[inline]; benchmarking suggests this is a win

Are these changes tested?

Yes.

Are there any user-facing changes?

No.

@github-actions github-actions Bot added the functions Changes to functions implementation label Apr 25, 2026
Comment thread datafusion/functions/src/string/repeat.rs
@github-actions github-actions Bot added the sqllogictest SQL Logic Tests (.slt) label Apr 26, 2026

Labels

functions Changes to functions implementation sqllogictest SQL Logic Tests (.slt)


Development

Successfully merging this pull request may close these issues.

Optimize repeat to use bulk-NULL string builder
