Align MCPRegistry CRD with registry server v2 config format#4653
Align MCPRegistry CRD with registry server v2 config format#4653
Conversation
There was a problem hiding this comment.
Large PR Detected
This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.
How to unblock this PR:
Add a section to your PR description with the following format:
## Large PR Justification
[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformationAlternative:
Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.
See our Contributing Guidelines for more details.
This review will be automatically dismissed once you add the justification section.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #4653 +/- ##
==========================================
- Coverage 68.80% 68.70% -0.11%
==========================================
Files 506 507 +1
Lines 52597 52837 +240
==========================================
+ Hits 36189 36301 +112
- Misses 13613 13720 +107
- Partials 2795 2816 +21 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
fee3776 to
821ccda
Compare
The registry server has moved to a v2 config format that separates data sources from registry views, adds claims-based authorization, and supports managed and Kubernetes source types. The operator's MCPRegistry CRD was still generating v1 config (flat registries[] with inline source configs), meaning operators deploying via the CRD got none of the v2 authorization features. This updates the CRD spec to use separate sources[] and registries[] fields, adds support for claims (apiextensionsv1.JSON), authz roles, managed sources, Kubernetes sources with namespace selectors, and public paths on auth config. The config generator now produces v2-compatible YAML that the registry server can consume directly. A default Kubernetes source is auto-injected when the user defines none, preserving backward compatibility for MCP server discovery. Closes #4572 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Users should explicitly define all sources in their MCPRegistry spec, including Kubernetes discovery sources. The auto-injection was adding implicit behavior that would be surprising when combined with claims-based authorization. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
99e64c0 to
d2d6430
Compare
ChrisJBurns
left a comment
There was a problem hiding this comment.
Multi-Agent Code Review
Reviewed by three specialized agents: Code Reviewer (Go style/conventions), Kubernetes Expert (CRD/operator patterns), and Security Advisor (auth, claims, threat modeling). See inline comments.
Legend: 🔴 Must Fix | 🟡 Strongly Recommended | 🟢 Nice to Have
Confirmed Positive
- Sources/registries split is well-designed (K8s Expert)
apiextensionsv1.JSONis the correct type choice for claims (K8s Expert)- DeepCopy generation is correct for all
apiextensionsv1.JSONfields (K8s Expert) deserializeClaimsis safe and well-implemented (Security Advisor)- PVC removal is clean across all files (Code Reviewer)
- Test coverage is thorough (all agents)
Review performed by Claude Code using code-reviewer, kubernetes-expert, and security-advisor agents.
Fix misplaced godoc for buildCACertFilePath that was attached to buildTelemetryConfig. Return error from buildTelemetryConfig instead of silently swallowing ParseFloat failures on sampling values. Add value type validation to deserializeRoleEntry matching the existing claims validation contract. Add CEL XValidation rules: exactly-one-source-type per source config, registry-to-source cross-reference validation, at-most-one managed source, authz rejected with anonymous auth mode. Remove redundant MinItems XValidation rules now covered by field-level markers. Add kubebuilder validation on publicPaths requiring "/" prefix. Update REGISTRY.md examples to v2 format with sources/registries split. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Update the Examples section (Production, Development, Private Git, and Multiple Sources) to use the v2 sources/registries split format. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The nested all/all/exists CEL rule for validating registry-to-source references exceeded the Kubernetes CEL cost budget by over 100x. Remove it and keep validation at reconciliation time only (which already works). Add MaxItems=20 on sources and registries to bound CEL cost for the remaining rules. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Resolve merge conflict: keep pvc_source_test.go deleted since PVC source support was removed. Fix WithUpstreamFormat() which was setting RegistryFormatToolHive instead of RegistryFormatUpstream. Add missing doc comment on RegistryFormatUpstream constant. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Update the default registry API image from v0.6.6 to v1.0.0 which supports the v2 config format with sources/registries split. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Local Validation ResultsTested end-to-end on a Kind cluster with Setup
MCPRegistry Appliedsources:
- name: toolhive-catalog
format: toolhive
configMapRef: { name: toolhive-registry-data, key: registry.json }
syncPolicy: { interval: "1h" }
filter:
tags:
include: ["production"]
- name: k8s-discovery
kubernetes:
namespaces: [test-registry]
registries:
- name: default
sources: [toolhive-catalog, k8s-discovery]Config GenerationGenerated v2
Registry Server
CEL Validation RulesAll admission-time validations working:
Generated with Claude Code |
The MCPRegistry CRD currently mirrors the registry server's config.yaml structure field-by-field, coupling the operator to registry server config changes. PR #4653 demonstrated this cost: 2,993+/2,458- lines to align the CRD with a v2 config format. Add a new decoupled config path alongside the existing typed fields: - configYAML: raw YAML string passed through to the registry server as config.yaml, with no operator parsing or transformation - volumes/volumeMounts: standard corev1 types for user-managed volume wiring (secrets, ConfigMaps, PVCs) - pgpassSecretRef: operator-managed pgpass mount with init container for chmod 0600 (the one Kubernetes plumbing detail users cannot express through volumes alone) The two paths are mutually exclusive (CEL + reconciler validation). The existing typed fields are deprecated but fully functional, allowing gradual migration. A future release will remove the legacy path. Includes RFC document, validation with reserved name/mount path collision detection, 27 new unit tests, and 6 example manifests. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Add decoupled configYAML path to MCPRegistry CRD The MCPRegistry CRD currently mirrors the registry server's config.yaml structure field-by-field, coupling the operator to registry server config changes. PR #4653 demonstrated this cost: 2,993+/2,458- lines to align the CRD with a v2 config format. Add a new decoupled config path alongside the existing typed fields: - configYAML: raw YAML string passed through to the registry server as config.yaml, with no operator parsing or transformation - volumes/volumeMounts: standard corev1 types for user-managed volume wiring (secrets, ConfigMaps, PVCs) - pgpassSecretRef: operator-managed pgpass mount with init container for chmod 0600 (the one Kubernetes plumbing detail users cannot express through volumes alone) The two paths are mutually exclusive (CEL + reconciler validation). The existing typed fields are deprecated but fully functional, allowing gradual migration. A future release will remove the legacy path. Includes RFC document, validation with reserved name/mount path collision detection, 27 new unit tests, and 6 example manifests. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Use raw JSON for volumes/volumeMounts to reduce CRD schema size Change Volumes and VolumeMounts fields from typed []corev1.Volume / []corev1.VolumeMount to []apiextensionsv1.JSON with x-kubernetes-preserve-unknown-fields. This avoids expanding the full Kubernetes volume OpenAPI schema into the CRD manifest, reducing the CRD diff by ~3,900 lines. Add ParseVolumes() and ParseVolumeMounts() helper methods on MCPRegistrySpec that deserialize the raw JSON into typed objects at reconciliation time. Validation of volume structure happens at pod creation time rather than CRD admission time. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Remove RFC document from tracked files The RFC is a design document used to guide implementation, not a permanent repo artifact. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Address review findings from multi-agent consensus review Fix correctness bugs, validation gaps, code duplication, and test coverage identified during the review of the decoupled configYAML path. Correctness: - Config hash now covers full spec (not just ConfigYAML), so changes to volumes/volumeMounts/pgpassSecretRef trigger pod rollouts - Parse errors for volumes/volumeMounts are returned as errors instead of being silently swallowed in the deployment builder Validation: - CEL mutual exclusivity rule now covers all legacy fields (registries, databaseConfig, authConfig, telemetryConfig), not just sources - TelemetryConfig added to reconciler-side validatePathExclusivity - MaxLength=131072 on configYAML, MaxItems=50 on volumes/volumeMounts Duplication: - Extract upsertDeployment to share create-or-update logic between ensureDeployment and ensureDeploymentNewPath - Extract withPGPassMountFromVolume to share init container logic between WithPGPassMount and WithPGPassSecretRefMount Tests: - Add ParseVolumes/ParseVolumeMounts unit tests (8 cases) - Add PodTemplateSpec reserved name/mount path collision tests (3 cases) - Extract pgpassEnvVar constant to fix goconst lint Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Simplify CEL rules and fix has() guards for absent fields Remove the overly complex mutual exclusivity CEL rule — the reconciler handles this as defense-in-depth, and the registry server validates its own config. Keep only the essential admission-time rules: - Either configYAML or sources must be set (operator path selection) - At most one managed source (legacy path constraint) Guard all CEL rules with has() checks so they handle absent optional fields (e.g., sources is absent when using the new configYAML path). Verified end-to-end with Kind cluster + CNPG PostgreSQL + registry server v1.0.0: MCPRegistry accepted, ConfigMap created with raw YAML, registry-api pod running and serving /health, /readiness, /v1/registries. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Validate pgpassSecretRef name and key are non-empty WithPGPassSecretRefMount silently no-ops when name or key is empty, which would produce a deployment without pgpass mounting and a confusing database connection error at runtime. Catch this early in reconciler validation with clear error messages. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Remove MaxLength and MaxItems constraints from new config fields These constraints couple the operator to opinions about size limits that Kubernetes and etcd already enforce naturally (1MB ConfigMap limit, ~1.5MB etcd object limit). Removing them keeps the decoupled path truly pass-through. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Extract validatePGPassSecretRef to fix gocyclo lint Split pgpassSecretRef name/key validation into its own function to bring validatePathExclusivity below the cyclomatic complexity threshold. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Fix all CI lint failures Add nolint:staticcheck directives to legacy code that intentionally references deprecated fields (config.go, pgpass_test.go). Move inline nolint comments above the line to stay within the 130-char line limit. Fix unparam lint on ensureService by removing the unused *corev1.Service return value — both callers discarded it. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Regenerate CRD API reference docs Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
The registry server (
toolhive-registry-server) moved to a v2 config format that separates data sources from registry views and adds claims-based authorization. The operator's MCPRegistry CRD was still generating v1 config (flatregistries[]with inline source configs), so operators deploying via the CRD got none of the v2 authorization features.This updates the MCPRegistry CRD and config generation to produce v2-compatible config with full parity:
spec.registriesintospec.sources(data source definitions) andspec.registries(lightweight views referencing sources by name)apiextensionsv1.JSONManagedSource,KubernetesSource(with namespace selectors), andURLSourceas new source typesauthz(role-based access control) andpublicPathsto auth configmaxMetaSizeanddynamicAuth(AWS RDS IAM) to database configtelemetryConfigwith tracing (sampling) and metrics sub-configsCloses #4572
Type of change
Test plan
task test)task lint-fix)Deployed updated CRDs to a Kind cluster, applied a v2 MCPRegistry resource with sources + registries + claims, ran the operator locally, and verified the generated ConfigMap produces correct v2 YAML with
sources[]andregistries[].Changes
cmd/thv-operator/api/v1alpha1/mcpregistry_types.goMCPRegistrySourceConfig,MCPRegistryViewConfig,ManagedSource,KubernetesSource,URLSource,MCPRegistryAuthzConfig,MCPRegistryRolesConfig,MCPRegistryTelemetryConfig,MCPRegistryDynamicAuthConfig; claims viaapiextensionsv1.JSON;PublicPathson auth;MaxMetaSize/DynamicAuthon database; removedPVCSourcecmd/thv-operator/api/v1alpha1/zz_generated.deepcopy.gocmd/thv-operator/pkg/registryapi/config/config.goSourceConfig/RegistryConfigstructs, claims deserialization, authz mapping, telemetry/database config mapping, URL source support, removed PVC config buildingcmd/thv-operator/pkg/registryapi/config/config_test.gocmd/thv-operator/pkg/registryapi/deployment.goSpec.Sourcesfor git auth mountscmd/thv-operator/pkg/registryapi/podtemplatespec.goWithRegistrySourceMountstakes[]MCPRegistrySourceConfig, removed PVC volume mountingcmd/thv-operator/pkg/registryapi/{*_test.go}cmd/thv-operator/test-integration/mcp-registry/*.godeploy/charts/operator-crds/**docs/operator/crd-api.mdexamples/operator/mcp-registries/*.yamlDoes this introduce a user-facing change?
Yes. This is a breaking CRD change —
spec.registrieschanges meaning from source definitions to registry views. Since this isv1alpha1(explicitly unstable), an in-place update is acceptable. Existing MCPRegistry resources must be updated to the new format.Migration example:
Before:
After:
Special notes for reviewers
apiextensionsv1.JSONfor claims fields on sources and registries, and for role entries in authz config. The config generator deserializes and validates that values arestringor[]stringbefore emittingmap[string]anyin the YAML output.mcpregistry_types.goandconfig.go.Generated with Claude Code