feat: Add native support for mode fn by pchintar · Pull Request #3971 · apache/datafusion-comet

pchintar · 2026-04-17T07:57:51Z

Which issue does this PR close?

Closes #3970 .

Rationale for this change

This PR adds native support for the MODE aggregate function in Comet, enabling fully native execution without fallback to Spark.

MODE returns the most frequent value in a group. Unlike simple aggregates, it requires maintaining full frequency state, which is implemented here in a mergeable and execution-friendly form.

What changes are included in this PR?

Added a native Rust implementation (mode.rs) using:
- HashMap-based frequency tracking per aggregation unit
- Mergeable binary state to support partial → final aggregation
Implemented both:
- Accumulator (non-grouped execution)
- GroupsAccumulator (vectorized grouped execution)
GroupsAccumulator maintains per-group frequency maps, ensuring:
- correct grouping semantics
- efficient batched updates
Added full planner + serde wiring:
- proto (Mode)
- Scala serde (CometMode)
- planner integration (AggExprStruct::Mode)

The implementation preserves full correctness across all execution stages:

Partial + Final aggregation
- state stores complete frequency maps (not just local winners)
- guarantees correctness after merge
Null handling
- nulls are ignored during counting
- all-null inputs → NULL
Tie handling
- deterministic selection via max_by_key on counts
Type coverage
- supports all primitive and commonly used Comet-compatible types
Floating-point edge cases
- normalized hashing for NaN values to ensure consistent grouping

How are these changes tested?

Verified across:
- multiple data types
- grouped and non-grouped queries
Edge cases tested:
- empty input
- all-null input
- mixed null/non-null
- skewed distributions (clear vs non-clear modes)
- tie scenarios
Confirmed no fallback in execution plans

pchintar · 2026-04-17T14:43:58Z

Hi, so I understand the issue with the formatting of rust files & so I ran cargo fmt to address the lint failure. But I don't know what's causing the remaining error of "[Spark SQL Tests / spark-sql-auto-sql_core-2/spark-4.0.1"?

Add native support for mode fn

d128992

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: Add native support for mode fn#3971

feat: Add native support for mode fn#3971
pchintar wants to merge 1 commit intoapache:mainfrom
pchintar:mode-native

pchintar commented Apr 17, 2026

Uh oh!

pchintar commented Apr 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

pchintar commented Apr 17, 2026

Which issue does this PR close?

Rationale for this change

What changes are included in this PR?

How are these changes tested?

Uh oh!

pchintar commented Apr 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant