Enhance JSON extraction functions to support null handling in queries#17867
Enhance JSON extraction functions to support null handling in queries#17867gortiz merged 3 commits intoapache:masterfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR updates Pinot’s jsonExtractScalar transform to better support null defaults under Pinot’s null-handling mode, and adds tests/data to validate the behavior in query execution.
Changes:
- Enhanced
JsonExtractScalarTransformFunctionto detect an explicitnulldefault and propagate nulls via a computed null bitmap when null handling is enabled. - Extended the JSON test dataset with a record containing an explicit JSON
nullvalue for the extracted field. - Added query tests covering
jsonExtractScalar(..., null)withenableNullHandlingtoggled on/off, and annotated test SQL strings with@Language("sql").
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| pinot-core/src/main/java/org/apache/pinot/core/operator/transform/function/JsonExtractScalarTransformFunction.java | Implements null-default detection and null-bitmap propagation logic for jsonExtractScalar. |
| pinot-core/src/test/java/org/apache/pinot/queries/BaseJsonQueryTest.java | Adds a JSON record with a null field and annotates SQL strings for improved static analysis. |
| pinot-core/src/test/java/org/apache/pinot/queries/JsonExtractScalarTest.java | Adds tests validating null default behavior with null handling enabled vs disabled. |
...va/org/apache/pinot/core/operator/transform/function/JsonExtractScalarTransformFunction.java
Show resolved
Hide resolved
...va/org/apache/pinot/core/operator/transform/function/JsonExtractScalarTransformFunction.java
Show resolved
Hide resolved
...va/org/apache/pinot/core/operator/transform/function/JsonExtractScalarTransformFunction.java
Show resolved
Hide resolved
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #17867 +/- ##
============================================
- Coverage 63.26% 63.19% -0.07%
- Complexity 1466 1481 +15
============================================
Files 3190 3191 +1
Lines 192039 192631 +592
Branches 29421 29545 +124
============================================
+ Hits 121484 121728 +244
- Misses 61042 61365 +323
- Partials 9513 9538 +25
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
...va/org/apache/pinot/core/operator/transform/function/JsonExtractScalarTransformFunction.java
Show resolved
Hide resolved
| RoaringBitmap bitmap = new RoaringBitmap(); | ||
| for (TransformFunction arg : _arguments.subList(1, _arguments.size() - 1)) { | ||
| RoaringBitmap argBitmap = arg.getNullBitmap(valueBlock); | ||
| if (argBitmap != null) { | ||
| bitmap.or(argBitmap); | ||
| } | ||
| } | ||
| int numDocs = valueBlock.getNumDocs(); | ||
| RoaringBitmap nullBitmap = new RoaringBitmap(); | ||
| IntFunction<Object> resultExtractor = getResultExtractor(valueBlock); | ||
| for (int i = 0; i < numDocs; i++) { | ||
| Object result = null; | ||
| try { | ||
| result = resultExtractor.apply(i); | ||
| } catch (Exception ignored) { | ||
| } | ||
| if (result == null) { | ||
| nullBitmap.add(i); | ||
| } | ||
| } | ||
| if (!nullBitmap.isEmpty()) { | ||
| bitmap.or(nullBitmap); | ||
| } |
There was a problem hiding this comment.
This seems pretty expensive, we're computing the whole result set again?
There was a problem hiding this comment.
Yes, I didn't find a better way to do that.
|
please rebase and fix linter |
- Use javax.annotation.Nullable (align with TransformFunction) - Return null from getNullBitmap when bitmap is empty - Parse TIMESTAMP defaults via getLongLiteral for correct millis/strings - Rename tests to satisfy Checkstyle MethodName (no underscores)
This pull request enhances the
JsonExtractScalarTransformFunctionto provide better support for handlingnulldefault values, especially when Pinot's null handling feature is enabled. It introduces logic to correctly propagate nulls when anulldefault is specified, and adds comprehensive tests to validate this behavior.Enhancements to null handling in JSON extraction:
_defaultIsNullflag and logic to detect when the default value forjsonExtractScalaris explicitly set tonull, and to handle this case differently depending on whether null handling is enabled (_nullHandlingEnabled). [1] [2]getNullBitmapmethod to ensure rows with anullresult are correctly marked as null when anulldefault is used and null handling is enabled.initmethod signature to accept thenullHandlingEnabledparameter, aligning with the new null handling logic.Testing improvements:
nullvalues in the JSON field to the test dataset.jsonExtractScalarwhen the default value isnull, both with null handling enabled and disabled, ensuring correct output in each scenario.Miscellaneous:
@Language("sql")annotation to improve static analysis of SQL queries in tests.@NullableandRoaringBitmapto support the new null handling logic.