Consolidate MCPRegistry status to match MCPServer pattern#4643
Consolidate MCPRegistry status to match MCPServer pattern#4643ChrisJBurns merged 4 commits intomainfrom
Conversation
MCPRegistry had an over-engineered three-phase status model with separate SyncStatus and APIStatus sub-objects, a DeriveOverallPhase() method, and a StatusCollector batching pattern. With sync operations moved to the registry server itself, only deployment readiness remains as a status source, making this complexity unnecessary. Flatten MCPRegistryStatus to the standard Kubernetes workload pattern used by MCPServer: Phase + Ready condition + ReadyReplicas + URL. This enables kubectl wait --for=condition=Ready and consistent tooling across all ToolHive workloads. Closes #4251 Closes #4269 Closes #4270 Closes #4510 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Large PR Detected
This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.
How to unblock this PR:
Add a section to your PR description with the following format:
## Large PR Justification
[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformationAlternative:
Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.
See our Contributing Guidelines for more details.
This review will be automatically dismissed once you add the justification section.
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #4643 +/- ##
==========================================
- Coverage 68.89% 68.84% -0.06%
==========================================
Files 505 506 +1
Lines 52425 52597 +172
==========================================
+ Hits 36118 36210 +92
- Misses 13514 13585 +71
- Partials 2793 2802 +9 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
@claude Please review this using the agents |
|
Claude finished @ChrisJBurns's task in 3m 15s —— View job PR Review: Consolidate MCPRegistry status to match MCPServer pattern
OverviewThe refactoring goal is sound and the overall design is correct. The Issues to Fix1. Stale When a registry transitions from // In the not-ready branch:
latest.Status.Phase = mcpv1alpha1.MCPRegistryPhasePending
latest.Status.URL = "" // add this
// In the error branch:
latest.Status.Phase = mcpv1alpha1.MCPRegistryPhaseFailed
latest.Status.URL = "" // add this2. Each call makes a live The fix is to return the readiness result from func (r *MCPRegistryReconciler) updateRegistryStatus(...) (bool, error) { // return isReady
...
isReady := r.registryAPIManager.IsAPIReady(ctx, mcpRegistry)
...
return isReady, r.Status().Update(ctx, latest)
}3. Redundant
4. Status update failure silently ignored on valid PodTemplateSpec path The invalid-spec path returns 5. Use The constant endpoint := fmt.Sprintf("http://%s.%s:%d",
mcpRegistry.GetAPIResourceName(), mcpRegistry.Namespace, registryapi.RegistryAPIPort)Suggestions (non-blocking)6. Missing tests for Codecov reports 7. Duplicate condition reason values These values are identical to 8. Unconditional The status is always written even when nothing changed. This causes a minor reconcile-loop amplification (triggers a new watch event → re-enqueues). This is pre-existing debt shared with the MCPServer controller, so not a regression — worth a follow-up issue. 9. Large PR justification section needed The automated review bot ( ## Large PR Justification
- Deletes the entire `pkg/mcpregistrystatus/` package (7 files, ~1100 lines) atomically — splitting would leave the codebase in an inconsistent state between old and new status models
- CRD and deepcopy files are generated and cannot be partially splitKubernetes Operator Pattern VerificationThe kubernetes-expert agent confirmed:
|
Large PR justification has been provided. Thank you!
|
✅ Large PR justification has been provided. The size review has been dismissed and this PR can now proceed with normal review. |
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ChrisJBurns
left a comment
There was a problem hiding this comment.
Multi-Agent Consensus Review
Agents consulted: kubernetes-expert, code-reviewer, toolhive-expert
Consensus Summary
| # | Finding | Consensus | Severity | Action |
|---|---|---|---|---|
| 1 | Architecture docs have stale refs to deleted package | 10/10 | HIGH | Fix |
| 2 | ObservedGeneration uses stale object |
8/10 | MEDIUM | Fix |
| 3 | 3 redundant Deployment GETs per reconcile | 10/10 | MEDIUM | Fix |
| 4 | Double status update from PodTemplate validation | 10/10 | MEDIUM | Discuss |
| 5 | Error.ConditionType field is dead code |
7/10 | MEDIUM | Fix |
| 6 | Redundant ReadyReplicas assignment |
7/10 | LOW | Fix |
Overall
This is a well-executed simplification that removes ~2,000 lines of unnecessary abstraction. The core design is sound — with sync operations moved to the registry server, the StatusCollector/Deriver pattern was genuinely over-engineered. The new code is cleaner, consistent with MCPServer, and enables standard kubectl wait --for=condition=Ready.
The findings are about polishing the new implementation, not questioning the approach.
Documentation (Finding 1 — HIGH)
Per CLAUDE.md: "When making changes that affect architecture, update relevant docs in docs/arch/."
Two architecture docs still reference the deleted mcpregistrystatus package, old status fields, and removed phases:
docs/arch/06-registry-system.mdlines 636-654: ShowssyncStatus/apiStatusYAML, listsSyncing/Readyphases, referencescmd/thv-operator/pkg/mcpregistrystatus/docs/arch/09-operator-architecture.mdlines 485-549: References "Batched updates via StatusCollector",cmd/thv-operator/pkg/mcpregistrystatus/, andSyncStatusfield (twice)
These need to be updated to reflect the new flat status model (Phase/Ready condition/ReadyReplicas/URL).
Generated with Claude Code
- Use latest.Generation instead of stale mcpRegistry.Generation for ObservedGeneration (Finding 2) - Add GetAPIStatus() combining IsAPIReady + GetReadyReplicas into a single Deployment fetch, and return isReady from updateRegistryStatus to eliminate 3 redundant GETs per reconcile (Finding 3) - Defer PodTemplate condition to updateRegistryStatus instead of writing status twice per reconciliation on the valid path (Finding 4) - Remove dead ConditionType field from registryapi.Error (Finding 5) - Remove redundant ReadyReplicas assignment (Finding 6) - Update architecture docs to reflect simplified status model (Finding 1) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Finding 1 (docs) addressed in 3cd01b4 — updated |
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
MCPRegistry had an over-engineered three-phase status model with separate
SyncStatusandAPIStatussub-objects, aDeriveOverallPhase()method, and aStatusCollectorbatching pattern. With sync operations moved to the registry server itself, only deployment readiness remains as a status source — making this complexity unnecessary and coupling the operator to internal registry server state.This flattens
MCPRegistryStatusto the standard Kubernetes workload pattern used by MCPServer:Phase+Readycondition +ReadyReplicas+URL. This enableskubectl wait --for=condition=Readyand consistent tooling across all ToolHive workloads.Closes #4251
Closes #4269
Closes #4270
Closes #4510
Type of change
Test plan
task test)task lint-fix)Changes
api/v1alpha1/mcpregistry_types.goMCPRegistryStatustoPhase/Message/URL/ReadyReplicas/Conditions/ObservedGeneration. RemoveSyncStatus,APIStatus,StorageReference,SyncPhase,APIPhasetypes. SimplifyMCPRegistryPhasetoPending/Running/Failed/Terminating. Update printer columns to match MCPServer.controllers/mcpregistry_controller.goStatusManager/StatusDeriver. AddupdateRegistryStatus()with refetch-before-update. AddsetRegistryReadyCondition()using sharedConditionTypeReady. SetObservedGenerationon conditions.pkg/registryapi/types.goErrortype from deletedmcpregistrystatuspackage. AddGetReadyReplicas()toManagerinterface.pkg/registryapi/manager.goErrorreferences, replaceConditionAPIReadywithConditionTypeReady, addGetReadyReplicasimplementation.pkg/mcpregistrystatus/pkg/validation/image_validation.goMCPRegistryPhaseReady→MCPRegistryPhaseRunningtest-integration/mcp-registry/DESIGN.mdDoes this introduce a user-facing change?
Yes —
MCPRegistrystatus is simplified:.status.phasevalues change:Readyis nowRunning,Syncingis removed.status.syncStatusand.status.apiStatusfields are removed.status.urland.status.readyReplicaskubectl get mcpregistriescolumns change fromPhase/Sync/API/Servers/Last Sync/AgetoStatus/Ready/Replicas/URL/Agekubectl wait --for=condition=Readynow worksThis is a breaking API change to
v1alpha1which carries no stability guarantee.Large PR justification
This PR touches 26 files (+325/-2270) but the actual review surface is small:
mcpregistrystatus/package and its testszz_generated.deepcopy.goThis is an atomic refactoring that cannot be split: the type changes in
mcpregistry_types.gocascade to the controller, registryapi, validation, and CRDs. Splitting would create intermediate states that don't compile.Key files for review (the substantive changes):
api/v1alpha1/mcpregistry_types.gocontrollers/mcpregistry_controller.goupdateRegistryStatus()andsetRegistryReadyCondition()replacing StatusCollectorpkg/registryapi/manager.goGetReadyReplicas()addition andErrortype relocationSpecial notes for reviewers
v1alpha1has no stability guarantee, no conversion webhook needed. Orphaned etcd fields will be overwritten on next reconciliation.Stoppedphase intentionally omitted — MCPServer has it for scale-to-zero but MCPRegistry has noreplicasspec field.TestMCPRemoteProxyStatusProgressionhas a data race unrelated to this PR. The vmcptask genmockgen failures are a Go version mismatch (go1.24mockgen vsgo1.26runtime) also pre-existing.Generated with Claude Code