Implement predicate functions: all(), any(), none(), single() by gregfelice · Pull Request #2359 · apache/age

gregfelice · 2026-03-25T02:47:11Z

Summary

Implements the four openCypher predicate functions (issues #552, #553, #555, #556):

all(x IN list WHERE predicate) — true if all elements match
any(x IN list WHERE predicate) — true if at least one matches
none(x IN list WHERE predicate) — true if no elements match
single(x IN list WHERE predicate) — true if exactly one matches
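
For intuition, the two-valued core of these four functions can be modelled in a few lines of Python. This is only an illustrative sketch: the helper names (`count_matches`, `cy_all`, `cy_any`, `cy_none`, `cy_single`) are hypothetical and not part of AGE, and NULL handling is deliberately left out here.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def count_matches(items: Iterable[T], pred: Callable[[T], bool]) -> int:
    """Number of elements for which the predicate holds."""
    return sum(1 for x in items if pred(x))


def cy_all(items: List[T], pred: Callable[[T], bool]) -> bool:
    # true if every element matches (vacuously true on an empty list)
    return count_matches(items, pred) == len(items)


def cy_any(items: List[T], pred: Callable[[T], bool]) -> bool:
    # true if at least one element matches
    return count_matches(items, pred) >= 1


def cy_none(items: List[T], pred: Callable[[T], bool]) -> bool:
    # true if no element matches (vacuously true on an empty list)
    return count_matches(items, pred) == 0


def cy_single(items: List[T], pred: Callable[[T], bool]) -> bool:
    # true if exactly one element matches
    return count_matches(items, pred) == 1
```

The empty-list behaviour of this model (all/none true, any/single false) lines up with the edge cases listed under Regression Tests.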

These are among the most requested Cypher features for AGE and are critical for users migrating from Neo4j and Kuzu (recently archived).

Implementation

Approach: Builds on the existing list comprehension infrastructure (unnest-based subqueries with child parsestates for variable scoping).

SQL transformation strategy: Each predicate function is lowered to an aggregate subquery over unnest(list) that preserves Cypher's three-valued (TRUE/FALSE/NULL) semantics:

all(), any(), none() → SELECT CASE WHEN bool_or(<first-branch>) THEN <result1> WHEN bool_or(pred IS NULL) THEN NULL ELSE <result2> END FROM unnest(list) AS x — two bool_or() aggregates compute (1) whether any element satisfies the decisive condition and (2) whether any element's predicate is NULL, combined via CASE to return the correct three-valued result. On an empty list both aggregates return NULL, so the ELSE branch yields the vacuous-truth defaults (all()/none() → true, any() → false).
single() → (SELECT count(*) FROM unnest(list) AS x WHERE pred IS TRUE) = 1, i.e. an exact count of truthy matches. (The LIMIT 2 short-circuit optimization is deferred; the current form evaluates the full list.)
All four are wrapped in a CASE WHEN <list-expr> IS NULL THEN NULL ELSE <subquery> END guard so a NULL list expression propagates NULL rather than collapsing to the empty-list defaults.
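
The lowering described above can be modelled outside the database to check the three-valued results by hand. The following Python sketch is an assumption-laden model, not the PR's C code: `None` stands in for SQL NULL, the function names are hypothetical, and the choice of decisive condition per function (`pred IS TRUE` for any()/none(), `pred IS FALSE` for all()) is inferred from the description rather than copied from the source.

```python
from typing import Callable, List, Optional

Bool3 = Optional[bool]  # True, False, or None standing in for SQL NULL


def bool_or(values: List[Bool3]) -> Bool3:
    """Model of bool_or(): NULL inputs are ignored; NULL when there is no non-NULL input."""
    non_null = [v for v in values if v is not None]
    if not non_null:
        return None
    return any(non_null)


def case_lowering(preds: List[Bool3], decisive: bool,
                  result_if_decisive: bool, default: bool) -> Bool3:
    # CASE WHEN bool_or(pred IS <decisive>) THEN <result1>
    #      WHEN bool_or(pred IS NULL)       THEN NULL
    #      ELSE <result2> END
    if bool_or([p is decisive for p in preds]) is True:
        return result_if_decisive
    if bool_or([p is None for p in preds]) is True:
        return None
    return default


def predicate_function(kind: str, items: Optional[list],
                       pred: Callable[[object], Bool3]) -> Bool3:
    if items is None:  # outer guard: CASE WHEN <list-expr> IS NULL THEN NULL
        return None
    preds = [pred(x) for x in items]
    if kind == "any":
        return case_lowering(preds, True, True, False)
    if kind == "all":
        return case_lowering(preds, False, False, True)
    if kind == "none":
        return case_lowering(preds, True, False, True)
    if kind == "single":
        # (SELECT count(*) ... WHERE pred IS TRUE) = 1
        return sum(1 for p in preds if p is True) == 1
    raise ValueError(f"unknown predicate function: {kind}")


def gt_zero(x: Optional[int]) -> Bool3:
    """Model of `x > 0`, which is NULL when x is NULL."""
    return None if x is None else x > 0
```

In this model, `predicate_function("any", [None], gt_zero)` evaluates to `None`, while an empty list falls through both `WHEN` branches to the `ELSE` default.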

This aggregate-based shape (rather than EXISTS/NOT EXISTS) was chosen specifically to preserve correct NULL handling: WHERE in a SQL subquery drops rows where the predicate is NULL, which would incorrectly collapse any(x IN [NULL] WHERE x > 0) from NULL to false under an EXISTS shape. The aggregate form keeps NULL-producing rows visible and folds them into a proper three-valued result.
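
The difference between the two shapes for any() can be seen in a small model. This is a hedged Python sketch with hypothetical function names; the list `preds` holds the already-evaluated predicate for each unnested row, with `None` standing in for SQL NULL.

```python
from typing import List, Optional

Bool3 = Optional[bool]  # True, False, or None standing in for SQL NULL


def any_exists_shape(preds: List[Bool3]) -> bool:
    # EXISTS (SELECT 1 FROM unnest(list) AS x WHERE pred)
    # WHERE keeps only rows whose predicate is TRUE, so NULL rows vanish.
    surviving_rows = [p for p in preds if p is True]
    return len(surviving_rows) > 0


def any_aggregate_shape(preds: List[Bool3]) -> Bool3:
    # Aggregate shape: every row stays visible, so a NULL predicate
    # can still influence the final result.
    if any(p is True for p in preds):
        return True
    if any(p is None for p in preds):
        return None
    return False


if __name__ == "__main__":
    preds = [None]  # `x > 0` evaluated for the single element x = NULL
    print("EXISTS shape:   ", any_exists_shape(preds))
    print("aggregate shape:", any_aggregate_shape(preds))
```

The two shapes agree whenever no predicate evaluates to NULL; they diverge only when a NULL row is the deciding factor.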

Files changed (12):

File	Change
`cypher_nodes.h`	New `cypher_predicate_function` node type with `CPFK_ALL/ANY/NONE/SINGLE` enum
`ag_nodes.h` / `ag_nodes.c`	Node registration
`cypher_outfuncs.h` / `cypher_outfuncs.c`	Serialization
`cypher_kwlist.h`	Three new keywords: `ANY_P`, `NONE`, `SINGLE`
`cypher_gram.y`	Grammar rules in `expr_func_subexpr`, `build_predicate_function_node()` helper with NULL-list guard, `extract_iter_variable_name()` shared helper (rejects qualified ColumnRefs), keywords added to `safe_keywords`
`cypher_clause.c`	`transform_cypher_predicate_function()` — builds query tree with `bool_or()` aggregates + `CASE` for three-valued semantics; `make_bool_or_agg()` helper
`cypher_analyze.c`	Expression walker for new node type
`Makefile`	Register new regression test
`regress/sql/predicate_functions.sql`	New test file
`regress/expected/predicate_functions.out`	Expected output

Backward compatibility:

ANY_P, NONE, SINGLE added to safe_keywords so they work as property keys and label names (e.g., {any: 1, none: 2, single: 3})
ALL was already a reserved keyword with safe_keywords entry
No grammar conflicts (verified: zero new shift/reduce or reduce/reduce warnings)

Regression Tests

Test queries covering:

Basic true/false for each function
Empty list edge cases (vacuous truth for all()/none(), false for any()/single())
NULL list input (all four return NULL via the guard)
NULL predicate results within non-empty lists (three-valued semantics)
Graph data integration (MATCH (u) WHERE all(x IN u.vals WHERE ...))
Boolean combinations (any(...) AND all(...))
Nested predicates (any(x IN ... WHERE all(y IN ... WHERE ...)))
Keyword backward compatibility ({any: 1, none: 2, single: 3})
Deterministic ordering (all graph queries use ORDER BY)

All regression tests pass.

Implement the four openCypher predicate functions (issues apache#552, apache#553, apache#555, apache#556) that test list elements against a predicate:

all(x IN list WHERE predicate) -- true if all elements match
any(x IN list WHERE predicate) -- true if at least one matches
none(x IN list WHERE predicate) -- true if no elements match
single(x IN list WHERE predicate) -- true if exactly one matches

Implementation approach:

- Add cypher_predicate_function node type with CPFK_ALL/ANY/NONE/SINGLE kind enum, reusing the list comprehension's unnest-based transformation
- Grammar rules in expr_func_subexpr (alongside EXISTS, COALESCE, COUNT)
- Transform to efficient SQL sublinks:
  all() -> NOT EXISTS (SELECT 1 FROM unnest WHERE NOT pred)
  any() -> EXISTS (SELECT 1 FROM unnest WHERE pred)
  none() -> NOT EXISTS (SELECT 1 FROM unnest WHERE pred)
  single() -> (SELECT count(*) FROM unnest WHERE pred) = 1
- Three new keywords (ANY_P, NONE, SINGLE) added to safe_keywords for backward compatibility as property keys and label names
- Shared extract_iter_variable_name() helper for variable validation

All 32 regression tests pass. New predicate_functions test covers basic semantics, empty lists, graph data integration, boolean combinations, nested predicates, and keyword backward compatibility.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

Adds support for the openCypher predicate functions all(), any(), none(), and single() by introducing a new AST node and transforming it into unnest(...)-based SQL subqueries during parsing/analyzing, with a new regression test suite.

Changes:

Add new cypher_predicate_function AST node (+ enum kind) and register/serialize it as an ExtensibleNode.
Extend the Cypher grammar with all/any/none/single(variable IN list WHERE predicate) and add keywords to the lexer + safe_keywords.
Implement query-tree transformation for predicate functions and add regression tests (predicate_functions).

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
src/include/parser/cypher_kwlist.h	Adds `any/none/single` as Cypher keywords.
src/backend/parser/cypher_gram.y	Adds grammar rules + helper to build predicate-function nodes and wraps them in SubLinks.
src/include/nodes/cypher_nodes.h	Introduces `cypher_predicate_function` node + kind enum.
src/include/nodes/ag_nodes.h	Registers new node tag for predicate functions.
src/backend/nodes/ag_nodes.c	Adds node name and ExtensibleNode methods entry for predicate-function node.
src/include/nodes/cypher_outfuncs.h	Declares serialization function for the new node.
src/backend/nodes/cypher_outfuncs.c	Implements serialization for `cypher_predicate_function`.
src/backend/parser/cypher_clause.c	Transforms predicate-function node into `unnest`-based subqueries (EXISTS / count).
src/backend/parser/cypher_analyze.c	Adds expression walker support for the new node type.
Makefile	Registers the new `predicate_functions` regression test.
regress/sql/predicate_functions.sql	Adds regression SQL coverage for predicate functions.
regress/expected/predicate_functions.out	Adds expected output for the new regression test.

jrgemignani · 2026-03-25T20:56:03Z

@gregfelice Please see the above comments by Copilot

… perf, tests

- Rewrite predicate functions from EXISTS_SUBLINK to EXPR_SUBLINK with aggregate-based CASE expressions (bool_or + IS TRUE/FALSE/NULL) to preserve three-valued Cypher NULL semantics
- Add list_length check in extract_iter_variable_name() to reject qualified names like x.y as iterator variables
- Add copy/read support for cypher_predicate_function ExtensibleNode to prevent query rewriter crashes
- Use IS TRUE filtering in single() count (LIMIT 2 optimization breaks correlated variable refs in graph contexts -- documented)
- Add 13 NULL regression tests: null list input, null elements, null predicates for all four functions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

gregfelice · 2026-03-25T21:20:52Z

Addressed all 4 Copilot suggestions:

NULL semantics — Rewrote from EXISTS_SUBLINK to EXPR_SUBLINK with aggregate-based CASE expressions (bool_or(pred IS TRUE/FALSE) + bool_or(pred IS NULL)) to preserve three-valued Cypher NULL semantics. any(x IN [NULL] WHERE x > 0) now correctly returns NULL instead of false.
Unqualified iterator check — Added list_length(cref->fields) != 1 validation in extract_iter_variable_name(). Qualified names like x.y now error with "qualified name not allowed as iterator variable".
NULL regression tests — Added 13 new test cases covering: NULL list input for all four functions, null elements in lists, literal null predicates, and mixed null/non-null elements.
single() performance — Applied IS TRUE filtering so NULL predicates aren't counted. The LIMIT 2 optimization breaks correlated variable references in graph property contexts (e.g., MATCH (u) WHERE single(x IN u.vals WHERE ...)), documented with TODO for future optimization pass.
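
The iterator-variable check from the second item above can be pictured with a small model. This Python sketch is hypothetical: the `ColumnRef` dataclass and `CypherSyntaxError` are stand-ins invented for illustration, and the function only mirrors the `list_length(cref->fields) != 1` rule described in the thread, not the actual C helper in `cypher_gram.y`.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ColumnRef:
    """Hypothetical stand-in for a parse-tree ColumnRef: `x` -> ["x"], `x.y` -> ["x", "y"]."""
    fields: List[str]


class CypherSyntaxError(Exception):
    """Hypothetical error type used only by this sketch."""


def extract_iter_variable_name(cref: ColumnRef) -> str:
    # Mirrors the described rule: exactly one field means an unqualified name.
    if len(cref.fields) != 1:
        raise CypherSyntaxError("qualified name not allowed as iterator variable")
    return cref.fields[0]


if __name__ == "__main__":
    print(extract_iter_variable_name(ColumnRef(["x"])))
    try:
        extract_iter_variable_name(ColumnRef(["x", "y"]))
    except CypherSyntaxError as err:
        print("rejected:", err)
```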

Bonus: Added copy/read support for cypher_predicate_function ExtensibleNode to prevent "unexpected copyObject()" crashes when PostgreSQL's query rewriter copies expression trees.

All 32 regression tests pass (predicate_functions: ok).

jrgemignani · 2026-03-25T23:47:44Z

@gregfelice We'll see what Copilot thinks ;) Btw, in the future, can you put your comments in the reply to Copilot, please.

Copilot

Pull request overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated 3 comments.

…ate.h

1. Add NULL-list guard for all predicate functions (all/any/none/single). Wraps the result with CASE WHEN list IS NULL THEN NULL ELSE <result> END in the grammar layer. This fixes single(x IN null WHERE ...) returning false instead of NULL. The expr pointer is safely shared between the NullTest and the predicate function node because AGE's expression transformer creates new nodes without modifying the parse tree in-place.
2. Fix single() block comment in transform_cypher_predicate_function: described LIMIT 2 optimization but implementation uses plain count(*). Updated comment to match actual implementation.
3. Keep #include "catalog/pg_aggregate.h" -- Copilot suggested removal but AGGKIND_NORMAL macro requires it (build fails without it).

Regression test: predicate_functions OK.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
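
Why single() needed the guard can be sketched in a few lines. This Python model rests on one assumption worth flagging: that unnesting a NULL list produces zero rows, which is consistent with the "returning false instead of NULL" behaviour reported in this commit. All names here are hypothetical, and `None` stands in for SQL NULL.

```python
from typing import Callable, Optional


def unnest(lst: Optional[list]) -> list:
    # Assumption: a NULL list unnests to zero rows.
    return [] if lst is None else list(lst)


def single_unguarded(items: Optional[list],
                     pred: Callable[[object], Optional[bool]]) -> bool:
    # (SELECT count(*) FROM unnest(list) AS x WHERE pred IS TRUE) = 1
    return sum(1 for x in unnest(items) if pred(x) is True) == 1


def single_guarded(items: Optional[list],
                   pred: Callable[[object], Optional[bool]]) -> Optional[bool]:
    # CASE WHEN list IS NULL THEN NULL ELSE <result> END
    if items is None:
        return None
    return single_unguarded(items, pred)


if __name__ == "__main__":
    positive = lambda x: x > 0
    print("unguarded:", single_unguarded(None, positive))
    print("guarded:  ", single_guarded(None, positive))
```

Without the guard, a NULL list is indistinguishable from an empty one: the count is 0, and `0 = 1` is plain false.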

Copilot

Pull request overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated 2 comments.

jrgemignani · 2026-04-06T16:17:59Z

@gregfelice Please see Copilot above

…mprehensions

- Refactor build_list_comprehension_node() to reuse the shared extract_iter_variable_name() helper, so `var IN list` validation is consistent between list comprehensions and predicate functions (all/any/none/single). Qualified ColumnRefs like `x.y IN list` are now rejected in list comprehensions the same way they are in predicate functions.
- Update list_comprehension expected output for the normalized lowercase "syntax error at or near IN" message.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

gregfelice · 2026-04-15T21:57:45Z

@jrgemignani — all Copilot items addressed and threads resolved. Round 3 commit (507a2e5) refactors build_list_comprehension_node() to reuse extract_iter_variable_name(), so iterator-variable validation is consistent between list comprehensions and predicate functions. The PR description already describes the bool_or() aggregate approach correctly. Ready for re-review when you have a moment. Thanks!

jrgemignani requested a review from Copilot March 25, 2026 17:35

Copilot started reviewing on behalf of jrgemignani March 25, 2026 17:37

Copilot AI reviewed Mar 25, 2026

Comment thread src/backend/parser/cypher_clause.c Outdated

Comment thread src/backend/parser/cypher_gram.y

Comment thread regress/sql/predicate_functions.sql

Comment thread src/backend/parser/cypher_clause.c

jrgemignani requested a review from Copilot March 25, 2026 23:45

Copilot started reviewing on behalf of jrgemignani March 25, 2026 23:45

Copilot AI reviewed Mar 25, 2026

Comment thread src/backend/parser/cypher_gram.y

Comment thread src/backend/parser/cypher_clause.c Outdated

Comment thread src/backend/parser/cypher_clause.c

jrgemignani requested a review from Copilot March 26, 2026 18:38

Copilot started reviewing on behalf of jrgemignani March 26, 2026 18:39

Copilot AI reviewed Mar 26, 2026

Comment thread src/backend/parser/cypher_clause.c

Comment thread src/backend/parser/cypher_gram.y

gregfelice mentioned this pull request Apr 15, 2026

FOREACH clause support #2381

Open

jrgemignani approved these changes Apr 16, 2026

gregfelice wants to merge 4 commits into apache:master from gregfelice:feature_predicate_functions

3 participants
