Implement predicate functions: all(), any(), none(), single() #2359
gregfelice wants to merge 4 commits into apache:master
Implement the four openCypher predicate functions (issues apache#552, apache#553, apache#555, apache#556) that test list elements against a predicate:

- all(x IN list WHERE predicate) -- true if all elements match
- any(x IN list WHERE predicate) -- true if at least one matches
- none(x IN list WHERE predicate) -- true if no elements match
- single(x IN list WHERE predicate) -- true if exactly one matches

Implementation approach:

- Add cypher_predicate_function node type with CPFK_ALL/ANY/NONE/SINGLE kind enum, reusing the list comprehension's unnest-based transformation
- Grammar rules in expr_func_subexpr (alongside EXISTS, COALESCE, COUNT)
- Transform to efficient SQL sublinks:
  - all() -> NOT EXISTS (SELECT 1 FROM unnest WHERE NOT pred)
  - any() -> EXISTS (SELECT 1 FROM unnest WHERE pred)
  - none() -> NOT EXISTS (SELECT 1 FROM unnest WHERE pred)
  - single() -> (SELECT count(*) FROM unnest WHERE pred) = 1
- Three new keywords (ANY_P, NONE, SINGLE) added to safe_keywords for backward compatibility as property keys and label names
- Shared extract_iter_variable_name() helper for variable validation

All 32 regression tests pass. New predicate_functions test covers basic semantics, empty lists, graph data integration, boolean combinations, nested predicates, and keyword backward compatibility.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
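The sublink mapping in this commit message can be modeled outside the database. The Python sketch below is illustrative only and is not code from the PR: `sql_not`, `sql_where`, and the four wrapper functions are hypothetical names, `None` stands in for SQL NULL, and a plain Python list stands in for the rows produced by `unnest(list)`. It assumes standard SQL `WHERE` behavior, where a row is kept only when the condition evaluates to TRUE.

```python
from typing import Callable, List, Optional, Sequence, TypeVar

T = TypeVar("T")
Pred = Callable[[T], Optional[bool]]  # None models SQL NULL


def sql_not(value: Optional[bool]) -> Optional[bool]:
    """SQL NOT: NULL stays NULL."""
    return None if value is None else not value


def sql_where(rows: Sequence[T], cond: Pred) -> List[T]:
    """SQL WHERE keeps a row only when the condition is exactly TRUE."""
    return [row for row in rows if cond(row) is True]


def all_exists(rows: Sequence[T], pred: Pred) -> bool:
    # all() -> NOT EXISTS (SELECT 1 FROM unnest WHERE NOT pred)
    return len(sql_where(rows, lambda x: sql_not(pred(x)))) == 0


def any_exists(rows: Sequence[T], pred: Pred) -> bool:
    # any() -> EXISTS (SELECT 1 FROM unnest WHERE pred)
    return len(sql_where(rows, pred)) > 0


def none_exists(rows: Sequence[T], pred: Pred) -> bool:
    # none() -> NOT EXISTS (SELECT 1 FROM unnest WHERE pred)
    return len(sql_where(rows, pred)) == 0


def single_count(rows: Sequence[T], pred: Pred) -> bool:
    # single() -> (SELECT count(*) FROM unnest WHERE pred) = 1
    return len(sql_where(rows, pred)) == 1


def positive(x: Optional[int]) -> Optional[bool]:
    """Example predicate `x > 0`; comparing NULL yields NULL."""
    return None if x is None else x > 0


print(all_exists([1, 2, 3], positive))    # True
print(any_exists([-1, 2], positive))      # True
print(none_exists([-1, -2], positive))    # True
print(single_count([1, 2], positive))     # False
print(any_exists([None], positive))       # False
```

In this model a NULL predicate result is filtered out by `WHERE`, so `any_exists([None], positive)` comes back as `False` rather than NULL.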
Pull request overview
Adds support for the openCypher predicate functions all(), any(), none(), and single() by introducing a new AST node and transforming it into unnest(...)-based SQL subqueries during parsing/analyzing, with a new regression test suite.
Changes:
- Add new `cypher_predicate_function` AST node (+ enum kind) and register/serialize it as an ExtensibleNode.
- Extend the Cypher grammar with `all/any/none/single(variable IN list WHERE predicate)` and add keywords to the lexer + `safe_keywords`.
- Implement query-tree transformation for predicate functions and add regression tests (`predicate_functions`).
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/include/parser/cypher_kwlist.h | Adds any/none/single as Cypher keywords. |
| src/backend/parser/cypher_gram.y | Adds grammar rules + helper to build predicate-function nodes and wraps them in SubLinks. |
| src/include/nodes/cypher_nodes.h | Introduces cypher_predicate_function node + kind enum. |
| src/include/nodes/ag_nodes.h | Registers new node tag for predicate functions. |
| src/backend/nodes/ag_nodes.c | Adds node name and ExtensibleNode methods entry for predicate-function node. |
| src/include/nodes/cypher_outfuncs.h | Declares serialization function for the new node. |
| src/backend/nodes/cypher_outfuncs.c | Implements serialization for cypher_predicate_function. |
| src/backend/parser/cypher_clause.c | Transforms predicate-function node into unnest-based subqueries (EXISTS / count). |
| src/backend/parser/cypher_analyze.c | Adds expression walker support for the new node type. |
| Makefile | Registers the new predicate_functions regression test. |
| regress/sql/predicate_functions.sql | Adds regression SQL coverage for predicate functions. |
| regress/expected/predicate_functions.out | Adds expected output for the new regression test. |
@gregfelice Please see the above comments by Copilot
… perf, tests

- Rewrite predicate functions from EXISTS_SUBLINK to EXPR_SUBLINK with aggregate-based CASE expressions (bool_or + IS TRUE/FALSE/NULL) to preserve three-valued Cypher NULL semantics
- Add list_length check in extract_iter_variable_name() to reject qualified names like x.y as iterator variables
- Add copy/read support for cypher_predicate_function ExtensibleNode to prevent query rewriter crashes
- Use IS TRUE filtering in single() count (LIMIT 2 optimization breaks correlated variable refs in graph contexts -- documented)
- Add 13 NULL regression tests: null list input, null elements, null predicates for all four functions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
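The aggregate-based rewrite in this commit can be approximated in Python to see how the three-valued result falls out. This is a sketch under stated assumptions, not the PR's C code: `None` models NULL, `bool_or` mimics the PostgreSQL aggregate (NULL inputs ignored, NULL when there is no non-NULL input), and the choice of decisive condition per function (`IS TRUE` for any/none, `IS FALSE` for all) is an assumption inferred from the commit text. The helper `_case_over` and the `cypher_*` names are hypothetical.

```python
from typing import Callable, Iterable, Optional, Sequence, TypeVar

T = TypeVar("T")
Tri = Optional[bool]  # True, False, or None (NULL)


def bool_or(values: Iterable[Tri]) -> Tri:
    """Model of PostgreSQL bool_or(): ignores NULLs, NULL if nothing is left."""
    non_null = [v for v in values if v is not None]
    return any(non_null) if non_null else None


def _case_over(rows: Sequence[T], pred: Callable[[T], Tri],
               decisive: Callable[[Tri], bool],
               on_decisive: bool, default: bool) -> Tri:
    """CASE WHEN bool_or(decisive) THEN .. WHEN bool_or(pred IS NULL) THEN NULL ELSE .. END"""
    results = [pred(x) for x in rows]
    if bool_or(decisive(r) for r in results) is True:
        return on_decisive
    if bool_or(r is None for r in results) is True:
        return None
    return default  # also reached for an empty list (both aggregates NULL)


def cypher_any(rows, pred) -> Tri:
    return _case_over(rows, pred, lambda r: r is True, True, False)


def cypher_all(rows, pred) -> Tri:
    return _case_over(rows, pred, lambda r: r is False, False, True)


def cypher_none(rows, pred) -> Tri:
    return _case_over(rows, pred, lambda r: r is True, False, True)


def cypher_single(rows, pred) -> bool:
    # count(*) over rows whose predicate IS TRUE, compared with 1
    return sum(1 for x in rows if pred(x) is True) == 1


def positive(x: Optional[int]) -> Tri:
    """Example predicate `x > 0`; comparing NULL yields NULL."""
    return None if x is None else x > 0


print(cypher_any([None], positive))      # None
print(cypher_all([1, None], positive))   # None
print(cypher_all([-1, None], positive))  # False
print(cypher_none([], positive))         # True
```

A decisive element wins over a NULL one: `cypher_all([-1, None], positive)` is `False` because one element definitely fails, while `cypher_all([1, None], positive)` stays NULL because the unknown element could still fail.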
Addressed all 4 Copilot suggestions:
Bonus: Added copy/read support for All 32 regression tests pass ( |
@gregfelice We'll see what Copilot thinks ;) Btw, in the future, can you put your comments in the reply to Copilot, please.
Pull request overview
Copilot reviewed 16 out of 16 changed files in this pull request and generated 3 comments.
…ate.h

1. Add NULL-list guard for all predicate functions (all/any/none/single). Wraps the result with CASE WHEN list IS NULL THEN NULL ELSE <result> END in the grammar layer. This fixes single(x IN null WHERE ...) returning false instead of NULL. The expr pointer is safely shared between the NullTest and the predicate function node because AGE's expression transformer creates new nodes without modifying the parse tree in-place.
2. Fix single() block comment in transform_cypher_predicate_function: described LIMIT 2 optimization but implementation uses plain count(*). Updated comment to match actual implementation.
3. Keep #include "catalog/pg_aggregate.h" -- Copilot suggested removal but AGGKIND_NORMAL macro requires it (build fails without it).

Regression test: predicate_functions OK.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
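The effect of the NULL-list guard in item 1 can be shown with a small Python model. This is a hedged sketch rather than the grammar-layer code: `single_unguarded` and `with_null_list_guard` are hypothetical names, `None` stands in for NULL, and the model assumes that unnesting a NULL list produces zero rows, which matches the reported pre-fix behavior of `single(x IN null WHERE ...)` returning false.

```python
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")
Pred = Callable[[T], Optional[bool]]


def single_unguarded(items: Optional[Sequence[T]], pred: Pred) -> bool:
    # Assumption: a NULL list unnests to zero rows, so the count is 0.
    rows = [] if items is None else items
    return sum(1 for x in rows if pred(x) is True) == 1


def with_null_list_guard(fn):
    """Wrap fn as: CASE WHEN list IS NULL THEN NULL ELSE <result> END."""
    def guarded(items, pred):
        return None if items is None else fn(items, pred)
    return guarded


single = with_null_list_guard(single_unguarded)


def is_two(x: int) -> bool:
    return x == 2


print(single_unguarded(None, is_two))  # False
print(single(None, is_two))            # None
print(single([1, 2, 3], is_two))       # True
```

Placing the guard around the whole result means a NULL list never reaches the empty-list defaults, while an actual empty list still does.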
Pull request overview
Copilot reviewed 16 out of 16 changed files in this pull request and generated 2 comments.
@gregfelice Please see Copilot above
…mprehensions

- Refactor build_list_comprehension_node() to reuse the shared extract_iter_variable_name() helper, so `var IN list` validation is consistent between list comprehensions and predicate functions (all/any/none/single). Qualified ColumnRefs like `x.y IN list` are now rejected in list comprehensions the same way they are in predicate functions.
- Update list_comprehension expected output for the normalized lowercase "syntax error at or near IN" message.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
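The shared validation described in this commit amounts to accepting only a single-part name as the iterator variable. The Python stand-in below is a loose illustration, not the C helper: the `ColumnRef` dataclass only imitates the field list of a parsed column reference, and the exception type and message are placeholders rather than the error AGE actually reports.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ColumnRef:
    """Hypothetical stand-in for a parsed column reference."""
    fields: List[str]  # ["x"] for `x`, ["x", "y"] for `x.y`


def extract_iter_variable_name(ref: ColumnRef) -> str:
    """Return the iterator name, rejecting qualified references such as x.y."""
    if len(ref.fields) != 1:
        # Placeholder error; the real helper reports a parser syntax error.
        raise ValueError("iterator variable must be an unqualified name")
    return ref.fields[0]


print(extract_iter_variable_name(ColumnRef(["x"])))  # x
```

Routing both list comprehensions and predicate functions through one helper keeps the accepted forms of `var IN list` identical in both constructs.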
@jrgemignani — all Copilot items addressed and threads resolved. Round 3 commit (507a2e5) refactors |
Summary
Implements the four openCypher predicate functions (issues #552, #553, #555, #556):
- `all(x IN list WHERE predicate)`: true if all elements match
- `any(x IN list WHERE predicate)`: true if at least one matches
- `none(x IN list WHERE predicate)`: true if no elements match
- `single(x IN list WHERE predicate)`: true if exactly one matches

These are among the most requested Cypher features for AGE and are critical for users migrating from Neo4j and Kuzu (recently archived).
Implementation
Approach: Builds on the existing list comprehension infrastructure (`unnest`-based subqueries with child parsestates for variable scoping).

SQL transformation strategy: Each predicate function is lowered to an aggregate subquery over `unnest(list)` that preserves Cypher's three-valued (TRUE/FALSE/NULL) semantics:

- `all()`, `any()`, `none()` → `SELECT CASE WHEN bool_or(<first-branch>) THEN <result1> WHEN bool_or(pred IS NULL) THEN NULL ELSE <result2> END FROM unnest(list) AS x`. Two `bool_or()` aggregates compute (1) whether any element satisfies the decisive condition and (2) whether any element's predicate is NULL, combined via `CASE` to return the correct three-valued result. On an empty list both aggregates return NULL, so the `ELSE` branch yields the vacuous-truth defaults (`all()`/`none()` → true, `any()` → false).
- `single()` → `(SELECT count(*) FROM unnest(list) AS x WHERE pred IS TRUE) = 1`. This is an exact count of truthy matches. (LIMIT 2 short-circuit optimization deferred; current form evaluates the full list.)
- All four functions are wrapped in a `CASE WHEN <list-expr> IS NULL THEN NULL ELSE <subquery> END` guard so a NULL list expression propagates NULL rather than collapsing to the empty-list defaults.

This aggregate-based shape (rather than `EXISTS`/`NOT EXISTS`) was chosen specifically to preserve correct NULL handling: `WHERE` in a SQL subquery drops rows where the predicate is NULL, which would incorrectly collapse `any(x IN [NULL] WHERE x > 0)` from NULL to false under an `EXISTS` shape. The aggregate form keeps NULL-producing rows visible and folds them into a proper three-valued result.

Files changed (12):
| File | Description |
|---|---|
| `cypher_nodes.h` | `cypher_predicate_function` node type with `CPFK_ALL/ANY/NONE/SINGLE` enum |
| `ag_nodes.h` / `ag_nodes.c` | |
| `cypher_outfuncs.h` / `cypher_outfuncs.c` | |
| `cypher_kwlist.h` | `ANY_P`, `NONE`, `SINGLE` |
| `cypher_gram.y` | `expr_func_subexpr`, `build_predicate_function_node()` helper with NULL-list guard, `extract_iter_variable_name()` shared helper (rejects qualified ColumnRefs), keywords added to `safe_keywords` |
| `cypher_clause.c` | `transform_cypher_predicate_function()`: builds query tree with `bool_or()` aggregates + `CASE` for three-valued semantics; `make_bool_or_agg()` helper |
| `cypher_analyze.c` | |
| `Makefile` | |
| `regress/sql/predicate_functions.sql` | |
| `regress/expected/predicate_functions.out` | |

Backward compatibility:

- `ANY_P`, `NONE`, `SINGLE` added to `safe_keywords` so they work as property keys and label names (e.g., `{any: 1, none: 2, single: 3}`)
- `ALL` was already a reserved keyword with a `safe_keywords` entry

Regression Tests
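Test queries covering:

- Empty lists (true for `all()`/`none()`, false for `any()`/`single()`)
- Graph data integration (`MATCH (u) WHERE all(x IN u.vals WHERE ...)`)
- Boolean combinations (`any(...) AND all(...)`)
- Nested predicates (`any(x IN ... WHERE all(y IN ... WHERE ...))`)
- Keyword backward compatibility (`{any: 1, none: 2, single: 3}`)
- … (`ORDER BY`)

All regression tests pass.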