Summary
Add openCypher FOREACH clause support to AGE. FOREACH is a common ETL / iterative-update construct and is one of the few remaining Phase 1 Cypher parity gaps — along with pattern expressions in WHERE (#1577, PR #2360), predicate functions (PR #2359), and MERGE ON CREATE/MATCH SET (PR #2347).
Cypher semantics (openCypher / Neo4j)
FOREACH (var IN list-expression | update-clause [update-clause ...])
- Body may contain only update clauses: CREATE, MERGE, SET, REMOVE, DELETE, and nested FOREACH.
- Body runs once per list element and binds var to the current element, in scope for body clauses only.
- Produces no new rows in the outer query; the outer row set passes through unchanged.
- No read clauses (MATCH, WITH, RETURN) inside the body.
- Empty or NULL list → no-op, outer rows preserved.
Examples:
// Create nodes from a list
FOREACH (name IN ['Alice','Bob','Carol'] | CREATE (:Person {name: name}))
// Per-row iterative update
MATCH (p:Person)
FOREACH (tag IN p.tags | SET p.tag_count = p.tag_count + 1)
// Idempotent tag creation
MATCH (p:Person)
FOREACH (t IN p.tag_names | MERGE (tag:Tag {name: t}) MERGE (p)-[:HAS_TAG]->(tag))
Why FOREACH is not UNWIND
UNWIND flattens a list into the row stream — every element becomes an outer row. FOREACH is the opposite: its body runs side-effecting update clauses per element, but the outer row set is unchanged. You can sometimes rewrite one as the other, but not when there are projections downstream that must not be multiplied by list length.
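To make the row-count difference concrete, here is a small self-contained C model. It is not AGE code: the OuterRow struct and both function names are hypothetical, and the model only counts rows and body executions.

```c
#include <stddef.h>

/* Hypothetical model of an outer row: only the length of its list matters. */
typedef struct OuterRow {
    size_t list_len;
} OuterRow;

/* UNWIND: every list element becomes its own output row. */
size_t unwind_output_rows(const OuterRow *rows, size_t nrows)
{
    size_t out = 0;
    for (size_t i = 0; i < nrows; i++)
        out += rows[i].list_len;
    return out;
}

/*
 * FOREACH: the body runs once per element (reported via *body_runs),
 * but each outer row is passed through exactly once.
 */
size_t foreach_output_rows(const OuterRow *rows, size_t nrows,
                           size_t *body_runs)
{
    size_t runs = 0;
    for (size_t i = 0; i < nrows; i++)
        runs += rows[i].list_len;
    *body_runs = runs;
    return nrows;
}
```

With three outer rows whose lists have 2, 0, and 3 elements, the UNWIND model yields 5 rows (the row with the empty list disappears), while the FOREACH model yields 3 rows and 5 body executions.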
Existing AGE infrastructure that can be reused
- cypher_unwind node + transform_cypher_unwind (src/backend/parser/cypher_clause.c:1440): list iteration, element-variable binding, UNWIND expr AS var grammar shape.
- transform_cypher_set_item_list (src/backend/parser/cypher_clause.c:1862): per-item update list transform, already parameterized via cypher_update_item.
- Existing CustomScan executor nodes for cypher_create, cypher_set, cypher_delete, cypher_merge: these are exactly the body clauses FOREACH needs to invoke per iteration.
Proposed implementation strategy
Two viable paths; happy to take maintainer input before writing code.
Option A: New cypher_foreach CustomScan node (preferred). Analogous to cypher_create. Holds (a) the list expression, (b) pre-built child update-clause plans, (c) a per-element tuple slot. Per outer tuple: iterate the list, bind var, and invoke each child's executor in sequence. The body emits no tuples of its own. This matches AGE's existing architecture for write clauses and gives clean semantics (body runs, outer row passes through).
Option B: Lower to a side-effecting SubPlan. Transform FOREACH (x IN list | body) into a correlated SubPlan that UNWINDs list and runs the body clauses, attached to the outer query so that it runs once per outer row and its output is discarded. It has to be a correlated SubPlan rather than an InitPlan, since an InitPlan is uncorrelated and is not re-evaluated per outer row. Less new code, but harder to reason about row-preservation guarantees.
Option A is probably the path that fits AGE best.
Sketch of the code changes
Grammar (cypher_gram.y)
- New foreach non-terminal mirroring unwind (line ~974).
- New parse node cypher_foreach mirroring cypher_unwind with fields: target_name, expr, body_clauses.
- Register in the clause alternation and the transform dispatch in transform_cypher_clause (cypher_clause.c:504).
- Reject non-update body clauses at parse time with a location-bearing error.
Transform (transform_cypher_foreach)
- Push a parse scope with var bound to the element type.
- Recursively transform each body clause; each becomes its own Query, chained as a child of the cypher_foreach node.
- Validate that the body is update-only (cypher_create / cypher_set / cypher_delete / cypher_merge / nested cypher_foreach).
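The update-only check, including recursion into nested FOREACH bodies, could look roughly like the following. This is a standalone sketch under stated assumptions: ClauseKind, Clause, and first_invalid_location are hypothetical names for illustration and do not correspond to AGE's actual node types or helpers.

```c
#include <stddef.h>

/* Hypothetical clause tags; AGE identifies clause nodes differently. */
typedef enum ClauseKind {
    CLAUSE_CREATE,
    CLAUSE_MERGE,
    CLAUSE_SET,
    CLAUSE_REMOVE,
    CLAUSE_DELETE,
    CLAUSE_FOREACH,
    CLAUSE_MATCH,
    CLAUSE_WITH,
    CLAUSE_RETURN
} ClauseKind;

typedef struct Clause {
    ClauseKind kind;
    int location;               /* offset in the query text, for errors */
    const struct Clause *body;  /* used only when kind == CLAUSE_FOREACH */
    size_t body_len;
} Clause;

/* Update clauses are the only kinds allowed inside a FOREACH body. */
static int is_update_clause(ClauseKind kind)
{
    switch (kind) {
    case CLAUSE_CREATE:
    case CLAUSE_MERGE:
    case CLAUSE_SET:
    case CLAUSE_REMOVE:
    case CLAUSE_DELETE:
    case CLAUSE_FOREACH:
        return 1;
    default:
        return 0;
    }
}

/*
 * Returns the location of the first disallowed clause in the body,
 * searching nested FOREACH bodies as well, or -1 if the body is valid.
 */
int first_invalid_location(const Clause *body, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (!is_update_clause(body[i].kind))
            return body[i].location;
        if (body[i].kind == CLAUSE_FOREACH) {
            int loc = first_invalid_location(body[i].body, body[i].body_len);
            if (loc >= 0)
                return loc;
        }
    }
    return -1;
}
```

Returning a location rather than a boolean keeps the caller free to raise the location-bearing error in whatever way the parser already does.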
Executor
- New src/backend/executor/cypher_foreach.c analogous to cypher_create.c. ExecCypherForeach iterates the evaluated list, sets the ecxt_scantuple element slot, calls each child update executor in sequence, performs per-iteration cleanup, and emits no tuples; the outer tuple passes through.
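The control flow described for ExecCypherForeach can be modelled in plain C as below. This is only a sketch of the loop shape: it uses no PostgreSQL or AGE APIs, and UpdateFn, ForeachModel, Trace, and the two example children are hypothetical stand-ins. A real node would work with tuple slots and an expression context instead of an int array.

```c
#include <stddef.h>

/* Hypothetical stand-in for a child update executor (CREATE, SET, ...). */
typedef void (*UpdateFn)(int element, void *state);

typedef struct ForeachModel {
    const UpdateFn *children;   /* body clauses, in source order */
    size_t nchildren;
    void *state;                /* opaque state handed to every child */
} ForeachModel;

/*
 * Runs the body once per list element for a single outer row.
 * A NULL list is treated like an empty one. Returns the number of rows
 * emitted for this outer row, which is always 1: the row passes through.
 */
size_t foreach_exec_row(const ForeachModel *node,
                        const int *list, size_t list_len)
{
    if (list != NULL) {
        for (size_t i = 0; i < list_len; i++) {
            int element = list[i];  /* bind var for this iteration */
            for (size_t c = 0; c < node->nchildren; c++)
                node->children[c](element, node->state);
            /* per-iteration cleanup would go here */
        }
    }
    return 1;
}

/* Example state and children, used only to observe the side effects. */
typedef struct Trace {
    long sum;
    size_t calls;
} Trace;

void add_to_sum(int element, void *state)
{
    ((Trace *)state)->sum += element;
}

void count_call(int element, void *state)
{
    (void)element;
    ((Trace *)state)->calls++;
}
```

Running the model over the list [1, 2, 3] with both example children leaves sum at 6 and calls at 3, and the function still reports exactly one row; an empty or NULL list leaves the state untouched.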
Regression tests (regress/sql/cypher_foreach.sql)
- Smoke: FOREACH (x IN [1,2,3] | CREATE (:N {v: x})) → count check.
- Nested SET: MATCH (n:Person) FOREACH (tag IN n.tags | SET n.tag_count = n.tag_count + 1).
- MERGE inside FOREACH: idempotent tag creation pattern.
- Nested FOREACH.
- Reject reads: FOREACH (x IN list | MATCH ...) → parse error with location.
- Empty list: no-op, outer rows preserved.
- NULL list: treat as empty (Neo4j semantics).
Open questions for maintainers
- Preference on Option A vs Option B above?
- Any concerns about adding a new CustomScan node in src/backend/executor/ vs slotting into an existing file?
- Should RETURN-inside-FOREACH produce a dedicated error message, or fall through the general "unexpected clause" path?
Happy to own this — wanted to file the issue first to align on strategy before writing code, given the scope.