fix: HashJoin panic with dictionary-encoded columns in multi-key joins#20441

Open
Tim-53 wants to merge 3 commits into apache:main from Tim-53:fix-20437-hash-join-dictionary-panic

Tim-53 commented Feb 20, 2026

Which issue does this PR close?

Closes #20437.

Rationale for this change

flatten_dictionary_array returned only the dictionary's unique values rather than the fully expanded array when called on a DictionaryArray. Building a StructArray from the result then panicked on a length mismatch, because the flattened column's length no longer matched the row count.

What changes are included in this PR?

Replaced array.values() with arrow::compute::cast(array, value_type) in flatten_dictionary_array, which properly expands the dictionary into a full-length array matching the row count.
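
To illustrate why returning only the dictionary values breaks the length invariant, here is a minimal sketch using only the Rust standard library. It is a toy model, not Arrow code: `ToyDictionary`, `values_only`, `expand`, and `zip_columns` are hypothetical names invented for this example, nulls are ignored, and the length check returns an error instead of panicking as the real StructArray construction did.

```rust
/// Toy stand-in for a dictionary-encoded column (hypothetical type, not
/// Arrow's `DictionaryArray`): each row stores an index into `values`.
struct ToyDictionary {
    keys: Vec<usize>,
    values: Vec<String>,
}

impl ToyDictionary {
    /// Number of logical rows, which is the number of keys.
    fn len(&self) -> usize {
        self.keys.len()
    }

    /// The buggy behaviour in miniature: returns only the distinct values,
    /// so the result length is unrelated to the row count.
    fn values_only(&self) -> Vec<String> {
        self.values.clone()
    }

    /// The fixed behaviour in miniature: looks up one value per row,
    /// so the result always has `self.len()` entries.
    fn expand(&self) -> Vec<String> {
        self.keys.iter().map(|&k| self.values[k].clone()).collect()
    }
}

/// Mimics combining two join-key columns row by row. Like a struct of
/// columns, it requires every column to have the same length.
fn zip_columns(a: &[String], b: &[i64]) -> Result<Vec<(String, i64)>, String> {
    if a.len() != b.len() {
        return Err(format!("length mismatch: {} vs {}", a.len(), b.len()));
    }
    Ok(a.iter().cloned().zip(b.iter().copied()).collect())
}
```

With keys `[0, 1, 0, 1]` over values `["a", "b"]` and a second four-row key column, `zip_columns(&dict.values_only(), &other)` fails the length check, while `zip_columns(&dict.expand(), &other)` succeeds with four rows.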

Are these changes tested?

Yes, both a new unit test and a regression test were added.

Are there any user-facing changes?

Nope

github-actions bot added the core (Core DataFusion crate) and physical-plan (Changes to the physical-plan crate) labels Feb 20, 2026
Tim-53 force-pushed the fix-20437-hash-join-dictionary-panic branch from b1328a2 to ffc5b55 February 20, 2026 00:18
jonathanc-n left a comment

This looks good, just some small comments

}

// Issue #20437: https://github.com/apache/datafusion/issues/20437
#[tokio::test]

We want to keep unit tests to a minimum when possible. Use sqllogictests here instead.

github-actions bot added the sqllogictest (SQL Logic Tests (.slt)) label and removed the core (Core DataFusion crate) label Feb 20, 2026

Labels

physical-plan Changes to the physical-plan crate sqllogictest SQL Logic Tests (.slt)

Development

Successfully merging this pull request may close these issues.

Panic in HashJoin with dictionary-encoded column in multi-column join key

2 participants