DestinationAppwrite: tolerate existing schema on Skip/Upsert re-migration#171
DestinationAppwrite: tolerate existing schema on Skip/Upsert re-migration#171premtsd-code wants to merge 38 commits intomainfrom
Conversation
Re-migration needs the destination to tolerate existing schema resources.
The database library stays minimal — migration owns the reconciliation
decision since it has both source and destination metadata in hand.
Per-resource behavior:
- Database: pre-check `databases` metadata. If exists, hydrate sequence
from the existing document and return. Both Skip and Upsert tolerate
unconditionally — databases contain the entire tree of collections +
rows, dropping would destroy everything.
- Table: pre-check `database_{sequence}` metadata. Same pattern —
tolerate unconditionally, hydrate sequence, never drop. Attribute and
index reconciliation happens per-resource at the lower layer.
- Attribute: pre-check `attributes` metadata by composite id
({databaseSeq}_{tableSeq}_{key}). Skip tolerates. Upsert compares
source's updatedAt vs destination's — if source is strictly newer the
column is dropped (via `dbForDatabases->deleteAttribute`) and recreated
so the spec matches source. Column data is wiped on drop; rows will be
repopulated by the row-level Upsert path that follows.
- Index: same pattern against `indexes` metadata. Drop is cheap because
indexes carry no user data — only rebuild time.
Approach is pre-check rather than try/catch for control flow: migration
is sequential single-writer, so the race condition justification for
try/catch doesn't apply, and pre-check reads top-to-bottom with no
exception-as-control-flow.
Row-level dispatch (unchanged, already committed):
OnDuplicate::Upsert → $dbForDatabases->upsertDocuments(...)
OnDuplicate::Skip → $dbForDatabases->skipDuplicates(fn () => createDocuments(...))
OnDuplicate::Fail → plain createDocuments
Both row-level primitives are existing library APIs — no database-library
changes are required for this feature.
Helpers:
shouldTolerateSchemaDuplicate() // onDuplicate !== Fail
sourceSpecIsNewer($src, $dst) // strtotime compare, false on empty
Greptile SummaryThis PR adds re-migration tolerance to Confidence Score: 4/5Safe to merge for fresh migrations; Overwrite re-migration may trigger unnecessary drop+recreate for attributes where empty filters/formatOptions arrays are stored as null by Appwrite. All previously flagged P1 issues are addressed in this revision. The two remaining comments are P2: a null-vs-empty-array spec mismatch that could trigger unnecessary drop+recreate in Overwrite mode, and a minor missing partner-table cache purge inside dropAttributeForRecreate. Neither breaks fresh migrations or causes data loss. src/Migration/Destinations/Appwrite.php — valuesMatch / attributeSpecMatches null-vs-[] edge case and dropAttributeForRecreate partner cache purge. Important Files Changed
Reviews (32): Last reviewed commit: "Rename SchemaAction cases to match Appwr..." | Re-trigger Greptile |
Guard against malformed datetime strings (e.g. "0000-00-00 00:00:00") that PHP's strtotime() returns false for. Without the explicit check, false gets loose-compared as 0 and the helper silently returns false, meaning a corrupted source updatedAt would always tolerate the existing destination entry and never trigger drop+recreate. Appwrite itself always emits parseable RFC 3339 timestamps, so this is mainly defensive for non-Appwrite sources (Supabase, NHost, CSV). Flagged by greptile P2.
On a first-run migration createField writes TWO metadata documents for a
two-way relationship attribute: parent-side at
{dbSeq}_{tableSeq}_{key} (line 763) and child-side at
{dbSeq}_{relatedTableSeq}_{twoWayKey} (line 819). Migration operates
directly on the database via $dbForProject — it doesn't rely on
Appwrite's attribute-event workers to derive the child side — so both
documents are physically written by migration itself.
The Upsert drop block was only removing the parent-side document +
column. On the recreate pass that followed, line 819 re-wrote the
child-side document and immediately hit DuplicateException because the
stale child doc from the previous migration run was still sitting there.
The inner catch rolled back the parent and the whole attribute aborted
with "Attribute already exists" — breaking Upsert re-migration for every
two-way relationship whose source spec had changed.
Fix: in the Upsert drop branch, when the resource is a two-way
relationship, also delete the child-side metadata document and the
child-side physical column on the related table, then purge caches.
Both deletes are wrapped in try/catch to tolerate the case where a prior
interrupted run (or utopia-php/database's relationship handling) already
cleaned one side — the goal is to guarantee a clean slate for the
downstream recreate, not to require both sides to still exist.
Flagged by greptile P1. Reproducing scenario:
- OnDuplicate::Upsert
- Destination already contains the attribute
- Source's two-way relationship spec was modified between runs
(source.updatedAt > dest.updatedAt)
DestinationAppwrite's shouldTolerateSchemaDuplicate() and sourceSpecIsNewer() were private helpers that encoded mode-specific behavior. Moving them onto OnDuplicate itself puts the behavior on the type that owns it and makes the call sites self-documenting: $this->onDuplicate->toleratesSchemaDuplicate() $this->onDuplicate->shouldReconcileSchema($src, $dst) shouldReconcileSchema() also subsumes the explicit Skip check at the call sites — it returns false for Skip and Fail, so the compound "`Skip || !sourceSpecIsNewer(...)`" condition collapses to a single "`!shouldReconcileSchema(...)`". No behavior change. Lint + PHPStan L3 clean.
OnDuplicate previously exposed two methods (toleratesSchemaDuplicate +
shouldReconcileSchema) that call sites had to combine. That split meant
the decision was spread across two methods and the call site — adding a
new mode later would require updating all three places in sync.
Replace with a single resolveSchemaAction() that returns one of three
SchemaAction enum cases (Create / Tolerate / DropAndRecreate). The
SchemaAction enum lives in the same file as OnDuplicate so the complete
decision surface is in one place.
For data-preserving containers (databases, tables) the method accepts
allowDrop=false so it can never return DropAndRecreate — callers that
use that path don't need to worry about accidentally dropping the
container. In practice the container call sites just use the inline
`$this->onDuplicate !== OnDuplicate::Fail` gate and then a simple
"is empty?" branch; resolveSchemaAction() is reserved for the leaves
(attributes, indexes) that actually benefit from the three-way choice.
Fail-mode short-circuit is preserved at every call site: no
getDocument pre-check when onDuplicate is Fail, so first-time
migrations (the common default-mode path) pay zero overhead. The
short-circuit stays at the call site rather than inside the enum method
so the "don't query on Fail" invariant is visible where the query is
issued.
Call sites:
- createDatabase / createEntity: inline `!== Fail` gate, tolerate on
existing. No drop ever.
- createField: Fail short-circuit, else resolveSchemaAction + match.
DropAndRecreate triggers the two-way relationship cleanup.
- createIndex: same pattern, no two-way concern.
…havioral comments
Three review improvements to the schema-tolerance work:
1. OnDuplicate::resolveSchemaAction — remove the $allowDrop parameter.
No caller passes it; all four call sites rely on the default. Containers
(databases, tables) bypass the method entirely with an inline
"!== Fail" check. The parameter advertised tuning that wasn't real,
confusing readers about what was actually gated. Docstring tightened
to mention that this method is only safe for leaf resources.
2. Extract deleteAttributeCompletely() as a reusable primitive that
fully removes an attribute from destination — both metadata documents
AND physical columns, mirroring createField's two-document write for
two-way relationships. The Upsert drop block in createField shrinks
from ~40 nested lines to a single method call; the two-way cleanup
is now a named concern that future non-Upsert paths (Merge mode,
explicit rollback, etc.) can reuse without duplication.
3. Replace line-number comments ("the first-run createField wrote TWO
metadata docs (parent at line 763, child at line 819)") with
behavioral descriptions ("createField writes a second `attributes`
document on the related table"). Line references rot on every edit;
behavioral descriptions don't.
Pure refactor — no functional change. Pint + PHPStan L3 clean.
… handling resolveSchemaAction is the load-bearing decision point for re-migration tolerance. Unit test locks the full mode × existence × timestamp matrix so a future refactor can't silently shift the mapping. 14 tests, sub-millisecond to run, no destination spin-up required. Coverage: - Not-exists returns Create for all modes. - Fail + exists → Create (routed through create flow so library DuplicateException surfaces as designed, not silently tolerated). - Skip ignores timestamps entirely — "don't touch" contract enforced by verifying both timestamp orderings produce Tolerate. - Upsert with source strictly newer → DropAndRecreate. - Upsert with dest newer or equal timestamps → Tolerate (strict > gate avoids unnecessary destructive rebuild when nothing has changed). - Upsert with unparseable timestamps → Tolerate (conservative, dataset covers empty strings, '0000-00-00 00:00:00', and non-date garbage). - OnDuplicate::values() case ordering guarded against accidental rename/reorder that would break SDK param validators. Writing the test surfaced a real gap in sourceIsNewer: PHP's strtotime() accepts '0000-00-00 00:00:00' leniently and returns a large negative epoch (-62169984000) rather than false. The previous `=== false` check missed the MySQL zero-date sentinel and would have returned true when source was a valid date and dest was '0000-00-00', triggering a destructive drop on garbage input. Harden sourceIsNewer to additionally reject non-positive epochs (<= 0) — any legitimate Appwrite updatedAt is well past 1970-01-01.
Extends the Upsert reconciliation to cover container metadata drift.
Previously databases and tables always tolerated an existing destination
under Upsert — source renames, enable toggles, and (for tables)
permission changes silently never reached destination on re-migration.
Now Upsert + source strictly newer triggers an updateDocument on the
container's own metadata doc, leaving children (rows, sub-resources)
untouched.
Mechanism:
- New SchemaAction::UpdateInPlace case alongside Create / Tolerate /
DropAndRecreate.
- OnDuplicate::resolveSchemaAction gains a canDrop parameter (default
false — safer default, destructive reconciliation must be opt-in):
canDrop: true → DropAndRecreate on Upsert-newer (attributes, indexes)
canDrop: false → UpdateInPlace on Upsert-newer (databases, tables)
- createField / createIndex pass canDrop: true.
- createDatabase / createEntity rely on the default and handle
SchemaAction::UpdateInPlace by calling updateDocument with a scoped
field set:
Database → name, search, enabled, type, originalId, database, $updatedAt
Table → name, search, enabled, $permissions, documentSecurity, $updatedAt
Immutable fields ($id, $createdAt, internal sequences) are never touched.
Source-wins semantics for table permissions follow the full-Upsert
contract: if the caller wants destination drift preserved across re-runs,
they should pick Skip mode, not Upsert. Row Upsert already works this
way; containers now match.
Unit tests cover both canDrop branches and confirm Skip never returns
UpdateInPlace regardless of timestamps (reinforcing the "don't touch"
contract). 16 tests, pint + phpstan L3 clean.
- Track attribute/index keys seen from source per (db, table); after rows land per table (and in a final sweep at end-of-run) drop destination attributes/indexes whose keys source did not emit. Source rename (delete+create) on source now propagates as a drop on destination without leaving the old column orphaned. Gated to Upsert mode only. - resolveSchemaAction now accepts nullable timestamps — attribute docs with missing $updatedAt no longer throw TypeError, they fall through to Tolerate (same conservative handling as zero-date/garbage input). - Tolerate branches mark resources as SKIPPED instead of implicit SUCCESS so re-migration counters reflect 'already exists' no-ops. - Internal refactor: consolidated 3 orphan-tracker state fields into one $orphanScopes map; extracted dropOrphansByKind(), tolerateMissing(), dropTwoWayMirror() helpers; renamed deleteAttributeCompletely to dropAttributeForRecreate.
- $orphanScopes -> $orphansByTable - orphanScopeKey() -> tableIdentity() - cleanupUpsertOrphansForScope() -> cleanupUpsertOrphansForTable() - $scopeKey local -> $tableId - $scope local -> $tracked (what source declared for this table) No behavior change. The old names were consumer-oriented jargon; the new names name the actual unit (one entry per table).
Processing one side of a two-way relationship already reconciles both metadata docs and both physical columns (via createRelationship on Create/DropAndRecreate, or via the mirror in dropTwoWayMirror on DropAndRecreate). Re-processing the partner side would either waste work or — worse — drop the mirror after the partner table's rows already migrated, destroying relationship data. Track a canonical pair-key (sorted (table, key) tuples scoped to the database) as soon as createField finishes a two-way attribute. On the partner's second pass the check at the top of createField sees the pair in $processedTwoWayPairs and short-circuits with STATUS_SKIPPED. Covers the scenario where one side's source _updatedAt is newer than the other: whichever side is processed first wins; the partner is SKIPPED regardless of its own timestamp. Data safety preferred over honoring a second, potentially-divergent schema decision.
Relationships can't take the DropAndRecreate path cleanly: - utopia's deleteAttribute throws for VAR_RELATIONSHIP - Two-way drops cascade to the partner table's physical column and destroy its row data (partner's rows migrated earlier in the run won't get re-migrated) Route relationships through SchemaAction::UpdateInPlace instead: - canDrop: !$isRelationship on resolveSchemaAction - New updateRelationshipInPlace() helper calls $dbForDatabases->updateRelationship() for the fields utopia supports (newTwoWayKey, twoWay, onDelete) + updateDocument() on Appwrite's application-level doc - relationType changes can't be updated in place and would silently diverge utopia's internal metadata from Appwrite's doc; throw a clear error asking the caller to drop+recreate manually Each side of a two-way is authoritative for its own Appwrite-level doc. utopia's updateRelationship cascades internal metadata on both sides in one call and is idempotent for same-target writes, so neither side's pass stomps the other. Dead-code cleanup: delete dropTwoWayMirror() and the relationship branch of dropAttributeForRecreate — both were only reachable from paths that no longer fire. Simplifies dropAttributeForRecreate signature (drops unused $type and $relatedTable params). Fix dropOrphanAttribute for relationship orphans: deleteAttribute throws for VAR_RELATIONSHIP, so use deleteRelationship which cascades both sides' physical columns + utopia metadata. Clean the two Appwrite-level docs separately. All 34 unit tests pass, pint + phpstan L3 clean.
… updates resolveSchemaAction now distinguishes 'source resource was deleted+ recreated' (createdAt diff) from 'source resource was edited' (updatedAt-newer with createdAt unchanged). The first triggers DropAndRecreate; the second triggers UpdateInPlace. Containers (databases, tables) get UpdateInPlace in both cases (canDrop=false) since dropping would cascade child loss. UpdateInPlace is now SDK-aligned. New helpers updateAttributeInPlace (non-relationships) and refactored updateRelationshipInPlace return bool: true if the source change maps to an SDK updateXAttribute endpoint (required, default, size, min/max, elements, newKey, onDelete), false to fall through to DropAndRecreate. Non-SDK fields (type, array, signed, format, formatOptions, filters) and structural relationship changes (relationType, twoWay toggle, twoWayKey, relatedCollection) automatically route to drop+recreate. Both relationship types (one-way and two-way) now use deleteRelationship on the DropAndRecreate path. One-way is a clean drop (no partner data to lose). Two-way drops both columns; partner row data is restored by the subsequent Upsert row pass. canDrop on attributes is now uniformly true. Indexes have no SDK in-place primitive, so UpdateInPlace consults indexSpecMatches: matching specs tolerate (filters phantom updatedAt drift), differing specs drop and recreate. Maintainability passes: extracted databaseCollectionId, tableCollectionId, attributeIndexMetaId, purgeTableCaches, arraysDifferOnKeys helpers. SDK-boundary fields lifted to ATTRIBUTE_NON_SDK_FIELDS and RELATIONSHIP_STRUCTURAL_FIELDS class constants — single point of change when SDK adds new endpoints. Test matrix expanded: createdAt diff × canDrop × mode × unparseable timestamp combinations.
utopia's updateRelationship updates the physical constraint on both sides, but the Appwrite-level attributes meta doc on the partner side keeps the old options.onDelete value. Reads from the related collection's attribute list (Appwrite API, validators, subsequent re-migration pre-checks) see stale data. Mirrors the partner-side cleanup pattern already used in dropAttributeForRecreate. One-way unaffected (onDelete change on one-way already routes to DropAndRecreate).
Source emits both sides of a two-way relationship as separate Column resources, but processing one side already reconciles both physical columns + both Appwrite-level meta docs (utopia's createRelationship/ deleteRelationship cascade). Without dedup, the partner pass can fire DropAndRecreate again — partner's source createdAt is independent of parent's and may not match what we just stamped on dest's partner meta (parent's createdAt). Reintroduces the pair-key dedup that existed in 791dbb1 and was removed in 6e6f825 when relationships routed exclusively through UpdateInPlace. Commit 786a27b reopened the DropAndRecreate path for createdAt-different without restoring the dedup; this fixes that gap. Pair-key is canonical (sorted, scoped to dbSeq) so both sides resolve to the same string. Set after Tolerate, after UpdateInPlace success, and after the create-path completion. Skip check at the top of createField.
Four targeted fixes from a code review pass on the schema-reconciliation
code, picked for clear correctness or future-debug-cost improvements:
1. META_DATABASES / META_ATTRIBUTES / META_INDEXES class constants. ~30
bare collection-name string literals are now phpstan-checkable and a
typo no longer compiles. Avoided false-positive replacements in
getAttribute('attributes', []) / collectionStructure['indexes'] etc.,
which name UtopiaDocument fields, not collections.
2. resetRunState() called at top of run(). $rowBuffer, $orphansByTable,
and $processedTwoWayPairs are instance state that previously leaked
across run() invocations on a reused Appwrite destination instance.
A second run on the same instance would inherit the first run's
processed-pair set and silently SKIP unrelated two-way attributes.
3. tolerateMissing() renamed to bestEffort(). The previous name implied
it caught only NotFound errors, but the implementation swallows all
\Throwable. Honest naming prevents 'why didn't the orphan get
cleaned up' debugging time later.
4. Inconsistent partner-side error message: 'Column limit exceeded' on
line 949 was a copy-paste drift from the parent block's correct
'Attribute limit exceeded'. Same conceptual failure should surface
the same user-visible message.
5. TwoWayPartner value object + resolveTwoWayPartner() helper. Three
sites (dropAttributeForRecreate, refreshTwoWayPartnerOnDelete,
dropOrphanAttribute) independently extracted relatedCollection /
twoWayKey from options with subtly different fallback handling
(?? null vs ?? ''). Single helper returns ?TwoWayPartner with
relatedTable / partnerKey / partnerMetaId; null when not two-way or
the partner table has gone away. Inconsistent fallbacks were a P1
bug waiting to ship.
All four are P1 reviewer concerns. Larger items (split createField,
template-method for the pre-check skeleton) deferred to a follow-up
PR per the user's explicit instruction.
31/31 unit tests pass, pint clean, phpstan level 3 clean.
The 3-field readonly value object was thin enough that an array shape
(typed via phpdoc) covers the same phpstan-checkable surface without a
separate class file. Helper return type narrows from ?TwoWayPartner to
?array{relatedTable, partnerKey, partnerMetaId} and call sites switch
from $partner->field to $partner['field'].
Reflects on appwrite/appwrite's TablesDB and Databases services to build the set of meta-doc fields the SDK can mutate via update*Column / update*Attribute endpoints, then asserts no overlap with our ATTRIBUTE_NON_SDK_FIELDS constant. When the SDK ships a new mutable param (e.g. exposing 'signed' or 'array' on Update), CI fails so we sync the constant. Catches drift in CI rather than at runtime. PARAM_TO_META_FIELD documents the SDK-vs-meta-doc vocabulary mismatch: - xdefault -> default (PHP keyword workaround) - min/max/elements -> null (sub-fields of formatOptions, not direct) - onDelete -> null (sub-field of relationship options) Currently observed SDK-exposed top-level fields: required, default, size. Constant correctly lists: type, array, signed, format, formatOptions, filters. Zero overlap, test passes.
Adds an unconditional spec-match check after resolveSchemaAction returns DropAndRecreate or UpdateInPlace. If source's full attribute/table/db/ index spec already matches dest, action is forced to Tolerate — no DDL, no in-place call, no row-data wipe. Closes the gap where a user dropping + recreating a column on source with the same spec triggered a wasteful drop+recreate cascade on dest (createdAt advanced even though spec was unchanged). Row pass under Upsert mode still propagates source row values via upsertDocuments. Helpers: databaseSpecMatches, tableSpecMatches, attributeSpecMatches (handles relationship via RELATIONSHIP_STRUCTURAL_FIELDS + onDelete), indexSpecMatches (existing). Inserted as a single override at each of the four pre-check sites. createIndex: removed the inline indexSpecMatches IIFE inside UpdateInPlace since the unified outer guard now subsumes it. The 'Index spec unchanged' status message merges into the generic 'Already exists on destination' — slight loss of fidelity, gain in symmetry across the four create methods. 47/47 unit tests pass, pint clean, phpstan level 3 clean.
Mostly mechanical reduction of multi-line WHY blocks to one-liners and removal of obvious WHAT comments. Keeps load-bearing rationale (e.g. 'utopia syncs both physical sides; partner meta has to be refreshed by hand') as one short line per concern.
- valuesMatch helper: ksort assoc arrays before === so {min,max} matches
{max,min}. Lists (filters, columns) keep order-sensitive comparison
since order is semantically meaningful there.
- Apply via attributeSpecMatches (formatOptions, relationship options,
onDelete) and arraysDifferOnKeys (used by updateAttributeInPlace and
updateRelationshipInPlace).
- One-line comment on each spec-match guard documenting why the Create
case is excluded (no existing dest doc to match against).
…ything
resolveSchemaAction's createdAt branch became redundant once the
spec-match guard landed. The flow:
- resolveSchemaAction returns Create / Tolerate / UpdateInPlace
based purely on existence and updatedAt-newer.
- The destination's spec-match guard overrides UpdateInPlace to
Tolerate when source and dest have identical spec.
- Inner attribute/index spec checks fall through to drop+recreate
when the SDK can't express the change in place.
End-state behavior on every scenario is identical (verified before
this commit). The createdAt path was extra work that produced the
same outcome via a longer route.
Removed:
- $sourceCreatedAt / $destCreatedAt / $canDrop params from
resolveSchemaAction
- createdAtDiffers private helper
- SchemaAction::DropAndRecreate enum case (no longer returned by
resolveSchemaAction; inner fallthrough drops directly)
- 4 call sites in Appwrite.php drop the createdAt args + canDrop:
named arg + DropAndRecreate match arm + DropAndRecreate post-match
if-clause
- ~13 createdAt-specific unit tests; remaining 19 test the
exists × mode × updatedAt matrix
1. Stale $table after dropAttributeForRecreate caused false LimitException $table is loaded once at the top of createField. After dropAttributeForRecreate runs (UpdateInPlace fallthrough), the meta doc + physical column are gone from the DB but the in-memory $table object still has the dropped entry in its 'attributes' array. utopia's checkAttribute clones $table and appends the new attribute, then asks the adapter for getCountOfAttributes — at the adapter ceiling this throws LimitException even though the net change is zero. Fix: re-fetch $table after dropAttributeForRecreate. 2. Partner meta-doc deleted from wrong table when relatedCollection changes dropAttributeForRecreate looked up the partner meta via $resource->getOptions() — source's options. But it runs precisely when RELATIONSHIP_STRUCTURAL_FIELDS differ (relatedCollection, twoWayKey, etc), so source's options point to the NEW partner table while dest's existing partner meta lives on the OLD partner table. The OLD partner meta was left orphaned forever. Fix: pass $existingAttr through to dropAttributeForRecreate and use its 'options' (dest's current state) for partner resolution.
…ments $existingAttr is always non-empty at the only call site (UpdateInPlace fallthrough requires $exists=true). Drops the unreachable nullable fallback path and the defensive isEmpty() check. Comments collapsed to one line each — the WHY is small enough to fit.
Greptile #2: source-side filter (Sources/Appwrite::exportColumns) skips RELATION_SIDE_CHILD, so the partner meta-doc created inline inside the parent's createField never reaches the partner table's createField path — and therefore never gets registered with trackOrphanCandidate. If any other column on the partner table is migrated, orphan cleanup sees the partner key absent from the tracked set and deletes it, breaking the relationship's child side. Adds a trackOrphanCandidate call right after the inline createDocument so the partner key is registered against $relatedTable. Phpstan note: adding the strict-typed call inside the existing relationship block shifted phpstan's narrowing analysis and made it flag two pre-existing $relatedTable usages (createRelationship at line 957 and purgeCachedDocument at line 1004). Both are reachable only when type === VAR_RELATIONSHIP, where $relatedTable is guaranteed defined. Suppressed with @phpstan-ignore-next-line on those two specific lines.
Greptile: $count (META_INDEXES query) and IndexValidator both ran
BEFORE the pre-check / drop block. On an Upsert recreate path:
- The to-be-dropped index was included in the count, so a table at
getLimitForIndexes() falsely threw 'Index limit reached' even
though the net change is zero (one drop, one add).
- IndexValidator was initialized with $tableIndexes from the
in-memory $table — including the existing index — so the new
index spec could be flagged as a conflict against itself ('Invalid
index: …') before the drop ran.
Reordered createIndex so:
1. Pre-check + drop block runs first (Tolerate returns early).
2. $table is reloaded after a drop so embedded $tableIndexes
reflects the post-drop state.
3. count, validateFieldsForIndexes, build $index, IndexValidator
all run AFTER the drop, against the fresh $table.
Mirrors the attribute path's structure where checkAttribute already
runs after dropAttributeForRecreate.
| private const META_INDEXES = 'indexes'; | ||
|
|
||
| /** Fields the SDK's per-type updateXAttribute endpoints don't expose; a change here forces drop+recreate. */ | ||
| private const ATTRIBUTE_NON_SDK_FIELDS = [ |
There was a problem hiding this comment.
Let's call this ATTRIBUTE_IMMUTABLE_FIELDS
| /** Names of the platform-DB collections holding Appwrite schema metadata. */ | ||
| private const META_DATABASES = 'databases'; | ||
| private const META_ATTRIBUTES = 'attributes'; | ||
| private const META_INDEXES = 'indexes'; |
There was a problem hiding this comment.
These are project DB collections
| ]; | ||
|
|
||
| /** Fields the SDK's updateRelationshipAttribute doesn't expose (only onDelete/newKey are SDK-reachable). */ | ||
| private const RELATIONSHIP_STRUCTURAL_FIELDS = [ |
There was a problem hiding this comment.
Same here, RELATIONSHIP_IMMUTABLE_FIELDS
| enum SchemaAction | ||
| { | ||
| case Create; | ||
| case Tolerate; | ||
| case UpdateInPlace; |
There was a problem hiding this comment.
Let's make these match Appwrite terms; skip, overwrite and fail
Maintainer review feedback: - ATTRIBUTE_NON_SDK_FIELDS -> ATTRIBUTE_IMMUTABLE_FIELDS - RELATIONSHIP_STRUCTURAL_FIELDS -> RELATIONSHIP_IMMUTABLE_FIELDS - META_* docblock 'platform-DB' -> 'project-DB' (these collections live in dbForProject, not the platform-level DB) Naming shift: "NON_SDK"/"STRUCTURAL" describes the cause (SDK doesn't expose them), "IMMUTABLE" describes the effect (can't change in place, only via drop+recreate). The latter is the audience-facing semantic. SDK-boundary lock-in test (AppwriteAttributeSdkBoundaryTest) still reflects on the renamed constant; updated its references.
Maintainer review: align with Appwrite's terminology (skip / overwrite / fail). Public-API breaking change — value 'upsert' becomes 'overwrite', so any pre-1.9.3 consumer of OnDuplicate::Upsert or the 'upsert' string needs to update. Renamed in this repo: - enum case + value - all OnDuplicate::Upsert references in DestinationAppwrite - internal helpers cleanupUpsertOrphans(ForTable) -> cleanupOverwriteOrphans(ForTable) - comment / docblock mentions of 'Upsert-mode' / 'Upsert recreate' - unit test method names testUpsert* -> testOverwrite* Cross-repo follow-up needed in appwrite/appwrite: - app/controllers/api/migrations.php (3 endpoint param descriptions) - src/Appwrite/Platform/Workers/Migrations.php (OnDuplicate reference) - e2e tests passing 'onDuplicate' => 'upsert' - console UI hardcoded option strings (separate repo)
Maintainer review on utopia-php/migration#171 renamed OnDuplicate::Upsert -> OnDuplicate::Overwrite (value 'upsert' -> 'overwrite') to align with Appwrite terms (skip / overwrite / fail). Applying the cross-repo ripple here: - app/controllers/api/migrations.php: 3 endpoint param descriptions updated ('upsert' -> 'overwrite' in the help text). The validator still uses OnDuplicate::values() so it auto-picks up the new value. - tests/e2e/Services/Migrations/MigrationsBase.php: all 'onDuplicate' => 'upsert' -> 'overwrite'; method names testAppwriteMigrationUpsert* -> testAppwriteMigrationOverwrite*; comments / assertion messages / local var names switched. - Left untouched: utopia's upsertDocuments operation, transaction TransactionState 'upsert' action, Operation validator — those refer to the database-level upsert primitive, not the OnDuplicate enum. composer.lock: utopia-php/migration 7d71505 -> b8ae7bc.
Greptile #3167268161: cleanupOverwriteOrphansForTable removes orphan attributes from dest's META_ATTRIBUTES + physical schema, but the in-memory $table object still holds them in its 'attributes' array. The field-strip loop right after iterates $table->getAttribute( 'attributes') to decide which row payload fields to keep — finds the orphan as 'valid' and leaves it in the row. Subsequent upsertDocuments hits the post-cleanup schema, structure validator rejects the row. Fix: reload $table immediately after orphan cleanup. Mirrors the attribute-path and index-path stale-$table fixes.
Jake's review on PR #171 (line 9 of OnDuplicate.php — SchemaAction enum): align the SchemaAction case names with Appwrite's user-facing terms (skip / overwrite / fail). - SchemaAction::Tolerate -> SchemaAction::Skip - SchemaAction::UpdateInPlace -> SchemaAction::Overwrite - SchemaAction::Create kept (no Appwrite equivalent; means 'no existing dest doc, run the create flow') Now the inner machinery uses the same vocabulary as the user-facing OnDuplicate enum (Fail / Skip / Overwrite). Comments and test method names updated to match (Tolerate -> Skip, UpdateInPlace -> Overwrite where they refer to the action — the test name 'UpdatesInPlace' as a verb phrase stays).
Summary
Re-migration against a destination Appwrite project needs the destination to tolerate existing schema resources — otherwise the second run throws
DuplicateExceptionon every database/table/attribute/index that already exists. This PR makesDestinationAppwritehandle reconciliation locally, with no changes required inutopia-php/database.Schema decisions are routed through a single method on the enum —
OnDuplicate::resolveSchemaAction()— which returns one of threeSchemaActioncases:Create,Tolerate, orUpdateInPlace. The decision usesupdatedAtonly:updatedAtstrictly newer than dest →UpdateInPlaceupdatedAtequal or older than dest →Tolerate(preserve dest drift)Tolerate(conservative)CreateSkipmode → alwaysTolerateon existingFailmode → alwaysCreateon existing (lets the library surfaceDuplicateException)The destination then runs a spec-match guard that overrides
UpdateInPlacetoToleratewhenever source and dest already have identical spec — see below.Spec-match guard
Before dispatching to the
matcharm, every create method runs a full-spec equality check that overridesUpdateInPlacetoToleratewhen source and dest already match:databaseSpecMatches— name, enabled, type, originalId, databasetableSpecMatches— name, enabled, $permissions (sorted), documentSecurityattributeSpecMatches— type, size, required, default, array, signed, format, formatOptions, filters; for relationships: structural fields + onDeleteindexSpecMatches— type, attributes, ordersEffect: a source-side drop+recreate that produces an identical spec on dest no longer triggers a wasteful drop+recreate cascade —
Tolerateskips the work, and the row pass under Upsert refreshes data viaupsertDocumentsregardless. Comparisons use avaluesMatchhelper thatksort()s associative arrays before===soformatOptions = {min:1, max:100}matches{max:100, min:1}even when utopia and source serialize keys in different orders. Lists (filters, columns) keep order-sensitive comparison since their order is semantically meaningful.SDK-aligned UpdateInPlace
UpdateInPlacefor attributes is constrained to changes the Appwrite SDK can express viaupdateXAttribute/updateRelationshipAttribute. Anything outside that surface falls through to drop+recreate. The boundary is encoded as two class constants onDestinationAppwrite:A source-side change touching one of these means the physical column shape is changing and no in-place SDK call can express it → drop+recreate. The
AppwriteAttributeSdkBoundaryTestunit test reflects onappwrite/appwrite'sTablesDB+Databasesservices and asserts these constants stay aligned with the SDK reality, so a new SDK endpoint surfaces in CI within one commit.Per-resource behavior
name,enabled,type,originalId,database,$updatedAtname,enabled,$permissions,documentSecurity,$updatedAtupdateAttributedeleteAttribute+ createupdateRelationshipAttributeforonDelete/newKey(refreshes partner meta)deleteRelationship(clears partner mirror)onDeleteis gated off; falls through to drop+recreate)deleteRelationshipskipDuplicates(createDocuments)upsertDocumentsupsertDocumentsupsertDocumentscreateDocumentsTolerate branches set
Resource::STATUS_SKIPPEDand returnfalseso migration counters showskipinstead ofsuccessfor no-op re-migrations. Immutable container fields ($id,$createdAt, internal sequences) are never touched on UpdateInPlace.Relationship drop uses
deleteRelationship(both types)One-way and two-way relationships both flow through
Database::deleteRelationship()rather thandeleteAttribute. Reason: utopia'sdeleteAttributedoesn't unwire the parent column for relationships at all — onlydeleteRelationshipclears both metadata docs and the physical column on each side. The row pass restores partner data on recreate, so two-way drop is safe.Two-way
onDeleteUpdateInPlace syncs the partner meta docutopia-php/database'supdateRelationshipupdates the physical constraint on both sides of a two-way relationship, but the Appwrite-levelattributesmeta doc on the partner side is application-layer state utopia doesn't know about.updateRelationshipInPlacerefreshes the partner meta'soptions.onDeleteand purges the partner table's caches after the parent-side update, so reads from the related collection don't see staleonDelete. (One-way +onDeletechange is gated off and falls through to drop+recreate, since utopia'supdateRelationshippartner-cascade throws on one-way.)Upsert orphan cleanup
DestinationAppwritetracks every attribute and index key source declares per (database, table) during the run in an$orphansByTablemap. At cleanup time it queries the destination'sattributes/indexescollections for that table and drops any whose key wasn't in the tracked set.When cleanup fires:
createRecordon first row batch flush for a given table. Ensures the Structure validator sees the post-cleanup schema, so rows aren't rejected on orphan required columns.run()override after source streaming completes. Only visits tables that had no rows.Gating: tracking and cleanup are both short-circuited when
$this->onDuplicate !== OnDuplicate::Upsert. Skip and Fail modes never touch destination state they weren't explicitly asked to write.Fail-open on partial failure:
run()callscleanupUpsertOrphans()afterparent::run()returns — if the migration throws mid-stream, the cleanup is skipped and the destination is preserved as-is.Two-way relationships: orphan drops route through
deleteRelationship, which clears the partner mirror automatically.Two-way relationship pair-key dedup
Two-way relationships emit two source-side resources (one per side). Without dedup, the partner pass would re-process and either double-work or destroy partner-side row data already migrated.
Prevention: a canonical pair-key (
{dbSeq}@{min(tuple)}<->{max(tuple)}where tuple ={tableId}::{attributeKey}) is recorded in$processedTwoWayPairsafter the first side completes. The partner's second pass checks the set and short-circuits withSTATUS_SKIPPED.Why pre-check over try/catch
Migration is a sequential single-writer, so the race-condition justification for exception-driven control flow doesn't apply. Pre-check reads top-to-bottom without using exceptions for control flow. On
OnDuplicate::Fail(the default) the pre-check is gated off entirely — fresh-migration paths pay zero overhead.Zero-date and null-timestamp hardening
sourceIsNewer()treats null, empty, garbage, and MySQL zero-date ('0000-00-00 00:00:00') timestamps as "unknown" → no destructive action. PHP'sstrtotime('0000-00-00 00:00:00')returns a large negative epoch rather thanfalse, so the naive=== falsecheck would have missed the sentinel —parseTimestamp()rejects non-positive epochs explicitly.Maintainability — helpers, constants, and run-state isolation
Helpers extracted to keep create methods narrow:
META_DATABASES/META_ATTRIBUTES/META_INDEXESclass constants — replace ~30 magic-string collection-name occurrences. Phpstan-checkable; typo-resistant.databaseCollectionId/tableCollectionId/attributeIndexMetaId— single source of truth for the platform-DB collection ids.purgeTableCaches— collapses 8+ duplicated cache-purge pairs.resolveTwoWayPartnerreturning a typed array shape — single entry point for the 3 sites that previously extractedrelatedCollection/twoWayKey/relatedTablewith inconsistent fallbacks.arraysDifferOnKeys— drives the SDK-boundary diff inupdateAttributeInPlaceandupdateRelationshipInPlace.valuesMatch— order-independent equality for associative arrays (formatOptions, options).databaseSpecMatches/tableSpecMatches/attributeSpecMatches/indexSpecMatches— full-spec-equality checks driving the unified spec-match guard.bestEffort— explicit "swallows all Throwable" name for cleanup-path errors.resetRunState()— invoked at the top ofrun()so reusing a destination instance no longer leaks$rowBuffer,$orphansByTable, or$processedTwoWayPairs.The action-dispatch shape in each create method:
Zero library changes
All of this uses existing
utopia-php/databaseAPIs:getDocument(collection, id)for pre-checkupdateDocumentfor container UpdateInPlaceupdateAttribute/updateRelationshipfor SDK-reachable in-place updatesdeleteAttribute/deleteRelationship/deleteIndexfor leaf drop+recreate and orphan dropsfind('attributes' | 'indexes', ...)for orphan discoveryskipDuplicates()(5.3.22) for row-level SkipupsertDocuments()(existing) for row-level UpsertNo database library bump required.
Tests
OnDuplicateTest— exists × mode × updatedAt matrix forresolveSchemaAction, including null/zero-date/unparseable-timestamp datasets (DataProvider) andOnDuplicate::values()case ordering.AppwriteAttributeSdkBoundaryTest— reflects onappwrite/appwrite'sTablesDBandDatabasesservices, verifiesATTRIBUTE_NON_SDK_FIELDSdoesn't claim any SDK-reachable param as non-SDK. Catches SDK drift in CI whenappwrite/appwriteadds new mutable params.testAppwriteMigrationRowsOnDuplicate— rows under all three modestestAppwriteMigrationReRunIsIdempotent— unchanged source is a no-op (every resource Tolerated)testAppwriteMigrationUpsertUpdatesContainerMetadata— Upsert reconciles database/table drift via UpdateInPlacetestAppwriteMigrationSkipPreservesContainerDrift— Skip preserves dest container drifttestAppwriteMigrationUpsertUpdatesAttributeInPlace— SDK-reachable attribute change propagates viaupdateAttributeInPlace; row data preservedtestAppwriteMigrationSkipPreservesAttributeDrift— Skip preserves dest attribute drift (leaf-level analog)testAppwriteMigrationUpsertUpdatesRelationshipOnDeleteInPlace— two-wayonDeletechange updates in place on both sides; partner meta refreshedtestAppwriteMigrationUpsertTwoWayRecreateSkipsPartnerSide— pair-key dedup prevents partner drop+recreate from wiping rows already migrated this runtestAppwriteMigrationUpsertOneWayRelationshipDropAndRecreate— one-way + onDelete change falls through to drop+recreate (in-place gated off for one-way)testAppwriteMigrationUpsertAttributeRecreateDropsAndRecreates— source drops+recreates with DIFFERENT spec → drop+recreate, row pass refillstestAppwriteMigrationUpsertSameSpecRecreateTolerates— source drops+recreates with SAME spec → spec-match guard forces Tolerate; dest meta untouchedtestAppwriteMigrationUpsertDropsOrphanColumn— Upsert drops destination columns source no longer declarestestAppwriteMigrationSkipKeepsOrphanColumn— Skip preserves orphan columnsTest plan
skipfor tolerated resources.updateAttribute.typechange) → drop+recreate; row pass refills.deleteRelationship(parent column unwired correctly on each side).onDeleteUpdateInPlace refreshes the partner-side Appwrite meta doc.updatedAttolerate instead of dropping.formatOptionsksort-equality survives utopia/source key-order differences.ATTRIBUTE_NON_SDK_FIELDSmatchesappwrite/appwriteUpdate endpoints.vendor/bin/pint --testpasses.vendor/bin/phpstan analyse --level 3 src testspasses.