Skip to content

refactor: Share left-side spill file across partitions on OOM fallback#21699

Merged
viirya merged 1 commit intoapache:mainfrom
viirya:nlj-shared-left-spill
Apr 19, 2026
Merged

refactor: Share left-side spill file across partitions on OOM fallback#21699
viirya merged 1 commit intoapache:mainfrom
viirya:nlj-shared-left-spill

Conversation

@viirya
Copy link
Copy Markdown
Member

@viirya viirya commented Apr 17, 2026

Which issue does this PR close?

  • Closes #.

Rationale for this change

To reduce the redundant re-execution of the left side during OOM fallback.

What changes are included in this PR?

Previously when OnceFut failed with OOM, each partition independently re-executed the left child to get its own stream. This was redundant since all partitions need the same left data.

Now the first partition to initiate fallback spills the entire left side to disk via a shared OnceAsync. Other partitions wait on the same future and read from the shared spill file, avoiding redundant re-execution of the left child.

Co-authored-by: Claude Code

Are these changes tested?

Unit test

Are there any user-facing changes?

No

@github-actions github-actions bot added the physical-plan Changes to the physical-plan crate label Apr 17, 2026
@viirya viirya force-pushed the nlj-shared-left-spill branch 2 times, most recently from 17906e9 to ccc6e9d Compare April 17, 2026 17:42
@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Apr 17, 2026
@viirya viirya force-pushed the nlj-shared-left-spill branch from ccc6e9d to 5f900c6 Compare April 17, 2026 18:42
01)ProjectionExec: expr=[count(Int64(1))@0 as count(*)], metrics=[<slt:ignore>]
02)--AggregateExec: mode=Single, gby=[], aggr=[count(Int64(1))], metrics=[<slt:ignore>]
03)----NestedLoopJoinExec: join_type=Inner, filter=v1@0 + v2@1 > 0, projection=[], metrics=[output_rows=100.0 K, <slt:ignore> spill_count=1, <slt:ignore>]
03)----NestedLoopJoinExec: join_type=Inner, filter=v1@0 + v2@1 > 0, projection=[], metrics=[output_rows=100.0 K, <slt:ignore> spill_count=2, <slt:ignore>]
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now the left side spills during OOM, so spill_count increases from 1 to 2.

@viirya viirya requested a review from 2010YOUY01 April 17, 2026 19:08
Copy link
Copy Markdown
Contributor

@2010YOUY01 2010YOUY01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you!

One potential improvement is to let all partitions share the same buffered left-side chunk. This would allow us to process more left-side data per right-side probe pass under the same memory budget, reducing the number of probe passes.

This likely requires replacing the existing OnceAsync with a different coordination mechanism.

That said, the current solution is a good starting point, since spill encoding and decoding are relatively lightweight, we can iterate on this in the future.

Comment thread datafusion/physical-plan/src/joins/nested_loop_join.rs Outdated
Previously when OnceFut failed with OOM, each partition independently
re-executed the left child to get its own stream. This was redundant
since all partitions need the same left data.

Now the first partition to initiate fallback spills the entire left
side to disk via a shared OnceAsync<LeftSpillData>. Other partitions
wait on the same future and read from the shared spill file, avoiding
redundant re-execution of the left child.

Co-authored-by: Claude Code
@viirya viirya force-pushed the nlj-shared-left-spill branch from 5f900c6 to 3ab0cc9 Compare April 18, 2026 22:13
@viirya viirya added this pull request to the merge queue Apr 19, 2026
@viirya
Copy link
Copy Markdown
Member Author

viirya commented Apr 19, 2026

Thanks @2010YOUY01

Merged via the queue into apache:main with commit 6aa5a7e Apr 19, 2026
35 checks passed
@viirya viirya deleted the nlj-shared-left-spill branch April 19, 2026 06:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

physical-plan Changes to the physical-plan crate sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants