Describe the bug
OPTIONAL MATCH may incorrectly drop null-preserving outer rows when its WHERE clause contains a correlated subquery predicate.
In the repro below, no matched friend satisfies the EXISTS { ... } predicate. Under Cypher optional semantics, each outer p row should still be preserved with friend = null.
Instead, Apache AGE returns no rows at all.
How are you accessing AGE (Command line, driver, etc.)?
- PostgreSQL
cypher(...) wrapper through the local Python differential-testing harness
- Reproducible directly in
psql inside the Docker container
What data setup do we need to do?
SELECT * FROM cypher('fuzz_graph', $$
CREATE (a:Person {name: 'Alice', age: 30}),
(b:Person {name: 'Bob', age: 25}),
(c:Person {name: 'Charlie', age: 35}),
(a)-[:KNOWS]->(b),
(a)-[:KNOWS]->(c)
$$) AS (v agtype);
What is the necessary configuration info needed?
- Plain Apache AGE Docker image was enough
- Docker image in local repro:
apache/age
- AGE extension version:
1.7.0
- PostgreSQL version:
18.1
- Graph name used in repro:
fuzz_graph
- No extra extensions or special configuration were required
What is the command that caused the error?
SELECT * FROM cypher('fuzz_graph', $$
MATCH (p:Person)
OPTIONAL MATCH (p)-[:KNOWS]->(friend:Person)
WHERE EXISTS { (friend)-[:KNOWS]->(:Person) }
RETURN p.name AS name, friend.name AS friend
ORDER BY name, friend
$$) AS (name agtype, friend agtype);
Returned result on AGE:
Expected behavior
Since neither Bob nor Charlie knows another person in this setup, the optional pattern should fail to bind friend, but the outer p rows should remain:
Alice, null
Bob, null
Charlie, null
Environment (please complete the following information):
- Version: Apache AGE
1.7.0
- PostgreSQL:
18.1
- Host OS: Windows 10
- Architecture: x86_64
- Deployment: Docker
Additional context
Two control cases suggest this is specifically tied to correlated subquery predicates on OPTIONAL MATCH, not to OPTIONAL MATCH in general.
Control case 1, without the correlated subquery predicate, behaves as expected:
SELECT * FROM cypher('fuzz_graph', $$
MATCH (p:Person)
OPTIONAL MATCH (p)-[:KNOWS]->(friend:Person)
RETURN p.name AS name, COUNT(DISTINCT friend.name) AS friendCount
ORDER BY name
$$) AS (name agtype, friendCount agtype);
Expected and observed result:
Alice, 2
Bob, 0
Charlie, 0
Control case 2, with WHERE false, also preserves null rows correctly:
SELECT * FROM cypher('fuzz_graph', $$
MATCH (p:Person)
OPTIONAL MATCH (p)-[:KNOWS]->(friend:Person)
WHERE false
RETURN p.name AS name, friend.name AS friend
ORDER BY name
$$) AS (name agtype, friend agtype);
Expected and observed result:
Alice, null
Bob, null
Charlie, null
A second variant in the same family also reproduces the issue with COUNT { ... }:
SELECT * FROM cypher('fuzz_graph', $$
MATCH (p:Person)
OPTIONAL MATCH (p)-[:FRIEND]->(f)
WHERE COUNT { MATCH (p)-[:FRIEND*..2]->(x) RETURN x } > 1
RETURN p.name AS person, f.name AS friend
ORDER BY person, friend
$$) AS (person agtype, friend agtype);
Describe the bug
OPTIONAL MATCHmay incorrectly drop null-preserving outer rows when itsWHEREclause contains a correlated subquery predicate.In the repro below, no matched
friendsatisfies theEXISTS { ... }predicate. Under Cypher optional semantics, each outerprow should still be preserved withfriend = null.Instead, Apache AGE returns no rows at all.
How are you accessing AGE (Command line, driver, etc.)?
cypher(...)wrapper through the local Python differential-testing harnesspsqlinside the Docker containerWhat data setup do we need to do?
What is the necessary configuration info needed?
apache/age1.7.018.1fuzz_graphWhat is the command that caused the error?
Returned result on AGE:
Expected behavior
Since neither
BobnorCharlieknows another person in this setup, the optional pattern should fail to bindfriend, but the outerprows should remain:Environment (please complete the following information):
1.7.018.1Additional context
Two control cases suggest this is specifically tied to correlated subquery predicates on
OPTIONAL MATCH, not toOPTIONAL MATCHin general.Control case 1, without the correlated subquery predicate, behaves as expected:
Expected and observed result:
Control case 2, with
WHERE false, also preserves null rows correctly:Expected and observed result:
A second variant in the same family also reproduces the issue with
COUNT { ... }: