Skip to content

Feat builder#2

Merged
abnegate merged 191 commits intomainfrom
feat-builder
Apr 23, 2026
Merged

Feat builder#2
abnegate merged 191 commits intomainfrom
feat-builder

Conversation

@abnegate
Copy link
Copy Markdown
Member

@abnegate abnegate commented Mar 4, 2026

No description provided.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 4, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 550a4101-07c3-4838-8fd9-5a84b0ecb557

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat-builder

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

abnegate and others added 22 commits March 5, 2026 10:10
- Fix compileIn/compileNotIn to return `1 = 0` / `1 = 1` for empty arrays instead of invalid `IN ()` / `NOT IN ()`
- Fix NOT MATCH() syntax: wrap in parentheses for valid MySQL (`NOT (MATCH(...) AGAINST(?))`)
- Fix toRawSql SQL injection: escape single quotes in string bindings
- Fix toRawSql corruption: use substr_replace instead of preg_replace to avoid `?` and `$` in values corrupting output
- Fix page(0, n) producing negative offset: clamp to max(0, ...)
- Guard compileLogical/compileExists/compileNotExists against empty arrays producing bare `()`
- Simplify null tracking in compileIn/compileNotIn with boolean flag

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix dotted identifier wrapping (users.id → `users`.`id`)
- Escape wrap character in identifiers to prevent SQL injection
- Wrap _cursor column in identifier quotes
- Return 1=1 for empty raw SQL to prevent invalid WHERE clauses
- Treat COUNT('') as COUNT(*)
- Only emit OFFSET when LIMIT is present
- Escape LIKE metacharacters (% and _) in user input
- Validate JOIN operator against allowlist
- Fix condition provider binding order mismatch with cursor
- Wrap UNION queries in parentheses for correct precedence
- Validate ClickHouse SAMPLE fraction range (0,1)
- Use explicit map for aggregate SQL function names
- Escape backslashes in LIKE pattern values
abnegate and others added 22 commits April 23, 2026 14:07
MySQL::compileJsonContainsExpr and compileJsonOverlapsExpr called
json_encode() with no flags. If the payload contained invalid UTF-8 or
NaN/INF, json_encode returned false which was then silently bound as
boolean false — MySQL would evaluate JSON_CONTAINS(col, 0) against the
cast boolean, producing wrong-but-silent results. Route both through a
shared encodeJsonPayload() helper that passes JSON_THROW_ON_ERROR and
re-wraps JsonException as ValidationException.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…reSQL

Consolidate byte-identical `returning()` and `appendReturning()` into
`Trait\Returning` in the base namespace. Both dialects now use the
single trait and share the `returningColumns` state. MariaDB retains
its upsert guard throwing ValidationException.
Reuse existing Method::sqlFunction() helper across the three aggregate
compile sites (buildAggregationAliasMap, compileAggregate,
aggregateQueryToAstExpression) instead of repeating the identical 15-arm
match in each location. Output SQL is byte-identical.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move the shared skeleton (loop over conflictUpdateColumns, handle
conflictRawSets, join assignments) into a concrete method on SQL.
Each dialect now only implements two small hooks: compileConflictHeader()
(leading clause) and compileConflictAssignment() (RHS for the default
EXCLUDED/VALUES(col) case). Byte-identical SQL output.
Replace the 5-copy countWhen/sumWhen/avgWhen/minWhen/maxWhen bodies in
each dialect with a private aggregateFilter helper. Each *When method is
now a 1-line call-through; dialect-specific SQL emission lives in a
single place per dialect:

- Trait\ConditionalAggregates: portable CASE WHEN ... THEN ... END
- ClickHouse: `-If` combinator (countIf, sumIf, avgIf, minIf, maxIf)

The PostgreSQL dialect (FILTER (WHERE ...)) already has this helper
from an earlier consolidated refactor.

Output SQL is byte-identical.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace 11 repeated `if ($clause !== '') { $parts[] = $clause; }`
blocks in build() with call-throughs to a new private appendIfNotEmpty
helper. The method body drops from 101 LOC to ~20 LOC and the clause
order is now immediately readable as a single linear list.

Output SQL is byte-identical — the helper preserves both the call order
(so side effects like populating $joinFilterWhereClauses happen before
the where clause is built) and the early-exit on empty fragments.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Prefer assertSame over assertEquals in PHPUnit tests - assertSame
checks type and value. Sweep across dialect builder tests, query/
schema/hook/exception tests, and integration tests.

Only exceptions preserved:
- assertEqualsWithDelta calls (legitimate float-delta comparisons)
- One PDO row int coercion cast inline where MySQL returns TINYINT
  as string
- Removed a now-redundant local @var annotation whose shape conflicted
  with the narrower assertSame-inferred type.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Consolidate the $conflictKeys / $conflictUpdateColumns / $conflictRawSets /
$conflictRawSetBindings property declarations into Trait\Inserts, which
is the writer (onConflict()). Trait\Upsert reads them and Trait\Selects
resets them, both via \$this-> since Trait\Inserts is composed on the
base Builder class used by all dialects (SQL + MongoDB + ClickHouse).
Single owner, reachable by every code path.
Resolve the name clash with PostgreSQL's deleteUsing(), which has a
different signature. MySQL's version is symmetric with updateJoin()
(already called deleteAlias + join-table + left/right columns), so
deleteJoin is the natural name. Also rename the internal state
fields (deleteUsingTable/Left/Right -> deleteJoinTable/Left/Right)
for consistency. Update all call sites in the MySQL unit tests.
Shrink tokenize() from 141 to 39 lines by replacing the long if/continue
chain with a match (true) dispatch table and extracting compound branches
(-, /, ., :, $) into readDashPrefix/readSlashPrefix/readDot/
readColonPrefix/readDollarPrefix helpers. Simple single-char tokens share
consumeSingleChar; the default arm goes through readOperatorOrUnknown.
getIdentifierQuoteChar is now fetched once per tokenize() call instead of
per character.

Preserves ordered dispatch semantics and tokens are byte-identical.
classifySQL perf bench stays at ~1.39 us/query (target <2 us).
Grow MariaDB integration tests from 5 to 37 covering joins, unions,
CTEs (including recursive), windows, subqueries, CASE, FOR UPDATE,
JSON, and spatial - while preserving MariaDB-specific RETURNING and
SEQUENCES cases.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Decompose the 231-line buildAggregate() god method into ten
append*-per-stage-group helpers, each appending stages to a pipeline
list passed by reference. buildAggregate() now reads as a linear
composition of pipeline phases.

Also split needsAggregation() into hasPipelineOnlyFeature() and
hasSubqueryFeature() sub-predicates for readability.

Behaviour is unchanged: generated MongoDB aggregation documents and
bindings are byte-identical.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ditionalArrayUpdates/AggregateFilter/DistinctOn/MariaDBReturning

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… suites

Where the full generated SQL is deterministic, replace substring assertions
with exact-match assertSame to catch regressions in spacing, clause ordering,
and quoting. Collapses multi-fragment substring blocks into single assertions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the three parallel-field groups on PostgreSQL with readonly DTOs
(UpdateFrom, DeleteUsing, MergeTarget) so related state moves together
instead of living in loose scalars. Collapses 12 fields to 3 nullable
slots and extracts the shared "merge extra conditions into trailing
WHERE" logic into a private mergeIntoWhereClause helper. Pure
refactor — emitted SQL is byte-identical.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Collapse near-identical groupConcat/jsonArrayAgg/jsonObjectAgg
overrides across MySQL, PostgreSQL, SQLite, and ClickHouse into a
single trait. Each dialect now implements three small *Expr hooks
that return the SQL function-call shape; the trait handles the
alias quoting, order-by compilation, and select() dispatch.

Generated SQL is byte-identical for every dialect -- binding
order for groupConcat's separator is preserved by routing the
`?` placeholder through the dialect hook so each engine keeps
its native ordering relative to ORDER BY (MySQL: before SEPARATOR,
SQLite: before comma, PostgreSQL: after separator, ClickHouse: no
ORDER BY).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move the duplicated \$groupByModifier state field plus the three
withRollup/withCube/withTotals default overrides into a single
trait. Dialects now opt in to whichever subset they support by
overriding the relevant method; the inherited defaults throw
UnsupportedException so unsupported combinations fail clearly.

MySQL enables ROLLUP only (MariaDB inherits from MySQL).
PostgreSQL enables ROLLUP and CUBE.
ClickHouse enables all three (TOTALS, ROLLUP, CUBE).
SQLite stays opted out entirely.

Dialect-specific emission strategy is unchanged -- MySQL and
PostgreSQL keep their post-build() string manipulation, and
ClickHouse keeps its buildAfterGroupByClause() hook -- so
generated SQL is byte-identical.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two deprecation sources removed:

1. ReflectionMethod::setAccessible() — no-op since PHP 8.1, deprecated
   since 8.5. Dropped the single call in MySQLTest (invoke() works
   directly on the ReflectionMethod without it).

2. Query::contains() — introduce Query::containsString() as the
   canonical name for string substring matching (LIKE '%value%').
   contains() stays deprecated for external consumers; tests now call
   containsString() so the suite runs clean. Updated the deprecation
   message to point to the two concrete alternatives.

Also add .claude/worktrees/ to .gitignore so parallel-agent worktree
directories are never committed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three issues flagged by Greptile on PR #2:

1. P1: FETCH NEXT and singular ROW not handled (Parser.php)
   SQL standard (and PostgreSQL/DB2) allow FETCH FIRST|NEXT n ROW|ROWS
   ONLY. Parser only accepted FIRST + plural ROWS. Now matches both
   FIRST and NEXT, and both ROW and ROWS. Added 3 regression tests
   covering the new variants.

2. P2: Dead condition in CTE column list guard (Parser.php)
   `!$this->matchKeyword('AS')` is unconditionally true when the
   current token is a LeftParen (a LeftParen isn't a Keyword token).
   Removed the dead conjunction; peekIsColumnList() already
   disambiguates.

3. P2/security: Star expressions bypass column allowlist
   (ColumnValidator.php)
   visitExpression only inspected Column nodes; `SELECT *` or
   `SELECT t.*` silently passed through, bypassing the whitelist when
   ColumnValidator is used as a security gate. Now rejects Star by
   default; callers that want wildcards opt in via
   `new ColumnValidator($cols, allowStar: true)`. Added 2 tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Upgrade and pin:
  actions/checkout    v4   -> v6.0.2   (de0fac2e4500dabe0009e67214ff5f5447ce83dd)
  shivammathur/setup-php v2 -> 2.37.0  (accd6127cb78bee3e8082180cb391013d204ef9f)

SHA pinning follows the GitHub-recommended hardening practice: a mutable
tag like `@v4` can be re-pointed by the action owner to a different commit,
which would run unreviewed code in our CI. Pinning to the immutable commit
SHA makes every CI run reproducible and defeats compromised-tag supply-chain
attacks. The `# vX.Y.Z` trailing comment keeps the version human-readable
and usable with Dependabot version-tracking.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Changes:
- composer.json: new `test:coverage` script — paratest + --coverage-clover
  coverage.xml + --coverage-text.
- phpunit.xml: add cacheDirectory=".phpunit.cache" (required by PHPUnit 12
  for coverage's static-analysis cache).
- tests.yml: install pcov via setup-php `coverage: pcov`, run the coverage
  script, parse clover totals with a small inline php snippet, write a
  markdown summary to $GITHUB_STEP_SUMMARY, and post/update a sticky PR
  comment with lines/branches/methods/classes percentages. Uses
  marocchino/sticky-pull-request-comment (SHA-pinned).
- .gitignore: ignore .phpunit.cache/ (coverage.xml already ignored).

Coverage is computed from the unit-test suite only — integration tests
hit real DBs and contribute additional coverage, but combining clover
reports across jobs would need a separate workflow pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Restructure CI into a single ci.yml with three jobs:

  unit          — paratest with pcov, uploads unit.cov
  integration   — phpunit with pcov + DB services, uploads integration.cov
  coverage      — needs [unit, integration], downloads both .cov artifacts,
                  merges with phpcov, posts combined sticky PR comment

Previously tests.yml and integration.yml were separate workflows and
could not share artifacts via needs:. Combining lets the coverage job
wait on both and report unified totals.

New composer script test:integration:coverage outputs
coverage/integration.cov. The existing test:coverage script now also
outputs a .cov file (instead of clover) so phpcov can merge across the
two suites.

Added phpunit/phpcov dev dep + pinned two new actions:
  actions/upload-artifact   @ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.3
  actions/download-artifact @d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

Deleted tests.yml and integration.yml (replaced by ci.yml).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 23, 2026

📊 Coverage

Metric Covered Ratio
Lines 91.75% 6797 / 7408
Methods 83.39% 929 / 1114
Classes 63.47% 106 / 167

Full per-file breakdown in the job summary.

abnegate and others added 5 commits April 23, 2026 18:37
Branches: PCOV reports line coverage only — it never emits
conditionals/coveredconditionals. The 0% was misleading. Removed the
row; added a footnote explaining the limitation and why we don't swap
to xdebug (~10× slowdown on the unit suite for just this metric).

Classes: the clover project-level metrics do not carry coveredclasses;
the value was defaulting to zero. Fix by iterating every <class> in the
XML and counting those where coveredmethods == methods (all methods
covered). This matches how Clover-consuming tools interpret class
coverage.

Also include raw covered/total counts beside each percentage so the
numbers are readable when coverage moves.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a baseline job that checks out the PR's base branch (main),
runs the same unit + integration coverage pipeline, merges into
baseline.xml, and uploads as an artifact. The coverage job then
downloads both PR and baseline reports and posts a comparison table:

| Metric  | PR                | Baseline | Δ      |
|---------|-------------------|----------|--------|
| Lines   | 91.75% (12k/13k)  | 91.50%   | +0.25% |
| Methods | 83.39% (1.2k/1.5k)| 83.00%   | +0.39% |
| Classes | 80.00% (N/M)      | 79.50%   | +0.50% |

Baseline result is cached by the base-branch commit SHA so the
expensive recomputation only happens once per main merge — subsequent
PRs against the same main SHA hit the cache and skip test execution.

Baseline is `continue-on-error: true` and the coverage job uses
`always()` gating plus a conditional artifact download, so a broken
main doesn't block PR coverage from posting — the comment just falls
back to the PR-only table.

New pinned action: actions/cache @0057852b # v4.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previously the PR workflow's baseline job ran the full unit +
integration suite against main on every PR (cached by main SHA, but
still executed on the first PR after each main merge — expensive).

New shape:
  - baseline.yml   : push [main] -> compute coverage, save to actions/cache
                     under key coverage-baseline-<commit-sha>
  - ci.yml baseline: restore-only. Looks up
                     coverage-baseline-<pr.base.sha> and re-publishes it
                     as an artifact for the coverage job. No compute, no
                     DB services, no PHP setup — pure cache relay.

Graceful fallback unchanged: if the cache entry doesn't exist yet (PR
opened before the main-push workflow finished), baseline job produces
no artifact and the coverage job reverts to the PR-only table.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@abnegate abnegate merged commit 58d3b7e into main Apr 23, 2026
6 checks passed
@abnegate abnegate deleted the feat-builder branch April 23, 2026 10:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants