Skip to content

Enhance JSON extraction functions to support null handling in queries#17867

Merged
gortiz merged 3 commits intoapache:masterfrom
gortiz:jsonextractscalar-null-default
Mar 23, 2026
Merged

Enhance JSON extraction functions to support null handling in queries#17867
gortiz merged 3 commits intoapache:masterfrom
gortiz:jsonextractscalar-null-default

Conversation

@gortiz
Copy link
Copy Markdown
Contributor

@gortiz gortiz commented Mar 12, 2026

This pull request enhances the JsonExtractScalarTransformFunction to provide better support for handling null default values, especially when Pinot's null handling feature is enabled. It introduces logic to correctly propagate nulls when a null default is specified, and adds comprehensive tests to validate this behavior.

Enhancements to null handling in JSON extraction:

  • Added a _defaultIsNull flag and logic to detect when the default value for jsonExtractScalar is explicitly set to null, and to handle this case differently depending on whether null handling is enabled (_nullHandlingEnabled). [1] [2]
  • Overrode the getNullBitmap method to ensure rows with a null result are correctly marked as null when a null default is used and null handling is enabled.
  • Updated the init method signature to accept the nullHandlingEnabled parameter, aligning with the new null handling logic.

Testing improvements:

  • Added new test records with null values in the JSON field to the test dataset.
  • Introduced two new tests to verify the behavior of jsonExtractScalar when the default value is null, both with null handling enabled and disabled, ensuring correct output in each scenario.

Miscellaneous:

  • Added the @Language("sql") annotation to improve static analysis of SQL queries in tests.
  • Imported @Nullable and RoaringBitmap to support the new null handling logic.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates Pinot’s jsonExtractScalar transform to better support null defaults under Pinot’s null-handling mode, and adds tests/data to validate the behavior in query execution.

Changes:

  • Enhanced JsonExtractScalarTransformFunction to detect an explicit null default and propagate nulls via a computed null bitmap when null handling is enabled.
  • Extended the JSON test dataset with a record containing an explicit JSON null value for the extracted field.
  • Added query tests covering jsonExtractScalar(..., null) with enableNullHandling toggled on/off, and annotated test SQL strings with @Language("sql").

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
pinot-core/src/main/java/org/apache/pinot/core/operator/transform/function/JsonExtractScalarTransformFunction.java Implements null-default detection and null-bitmap propagation logic for jsonExtractScalar.
pinot-core/src/test/java/org/apache/pinot/queries/BaseJsonQueryTest.java Adds a JSON record with a null field and annotates SQL strings for improved static analysis.
pinot-core/src/test/java/org/apache/pinot/queries/JsonExtractScalarTest.java Adds tests validating null default behavior with null handling enabled vs disabled.

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Mar 12, 2026

Codecov Report

❌ Patch coverage is 69.56522% with 14 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.19%. Comparing base (54bea86) to head (0d981b6).
⚠️ Report is 50 commits behind head on master.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
...m/function/JsonExtractScalarTransformFunction.java 69.56% 10 Missing and 4 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #17867      +/-   ##
============================================
- Coverage     63.26%   63.19%   -0.07%     
- Complexity     1466     1481      +15     
============================================
  Files          3190     3191       +1     
  Lines        192039   192631     +592     
  Branches      29421    29545     +124     
============================================
+ Hits         121484   121728     +244     
- Misses        61042    61365     +323     
- Partials       9513     9538      +25     
Flag Coverage Δ
custom-integration1 100.00% <ø> (ø)
integration 100.00% <ø> (ø)
integration1 100.00% <ø> (ø)
integration2 0.00% <ø> (ø)
java-11 55.46% <69.56%> (-7.78%) ⬇️
java-21 63.17% <69.56%> (-0.06%) ⬇️
temurin 63.19% <69.56%> (-0.07%) ⬇️
unittests 63.18% <69.56%> (-0.07%) ⬇️
unittests1 55.51% <69.56%> (-0.08%) ⬇️
unittests2 34.20% <0.00%> (-0.05%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown
Contributor

@xiangfu0 xiangfu0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please fix the lint

@xiangfu0 xiangfu0 added query Related to query processing json Related to JSON column support labels Mar 12, 2026
@yashmayya yashmayya added the null support Related to NULL value handling label Mar 13, 2026
Comment on lines +166 to +188
RoaringBitmap bitmap = new RoaringBitmap();
for (TransformFunction arg : _arguments.subList(1, _arguments.size() - 1)) {
RoaringBitmap argBitmap = arg.getNullBitmap(valueBlock);
if (argBitmap != null) {
bitmap.or(argBitmap);
}
}
int numDocs = valueBlock.getNumDocs();
RoaringBitmap nullBitmap = new RoaringBitmap();
IntFunction<Object> resultExtractor = getResultExtractor(valueBlock);
for (int i = 0; i < numDocs; i++) {
Object result = null;
try {
result = resultExtractor.apply(i);
} catch (Exception ignored) {
}
if (result == null) {
nullBitmap.add(i);
}
}
if (!nullBitmap.isEmpty()) {
bitmap.or(nullBitmap);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems pretty expensive, we're computing the whole result set again?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I didn't find a better way to do that.

@xiangfu0
Copy link
Copy Markdown
Contributor

please rebase and fix linter

gortiz added 2 commits March 20, 2026 15:53
- Use javax.annotation.Nullable (align with TransformFunction)
- Return null from getNullBitmap when bitmap is empty
- Parse TIMESTAMP defaults via getLongLiteral for correct millis/strings
- Rename tests to satisfy Checkstyle MethodName (no underscores)
@gortiz gortiz merged commit c297fbe into apache:master Mar 23, 2026
13 of 16 checks passed
@gortiz gortiz deleted the jsonextractscalar-null-default branch March 23, 2026 13:20
xiangfu0 added a commit to pinot-contrib/pinot-docs that referenced this pull request Mar 23, 2026
xiangfu0 added a commit to pinot-contrib/pinot-docs that referenced this pull request Mar 23, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

json Related to JSON column support null support Related to NULL value handling query Related to query processing

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants