fix: actually grab metric measurements in UI verifications query #1022

adityachoudhari26 merged 2 commits into `main`
Conversation
📝 Walkthrough
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client
    participant Procedure as Status<br/>Procedure
    participant DB as Database
    participant Logic as Phase<br/>Logic
    Client->>Procedure: Request status for jobId
    Procedure->>DB: Query jobVerificationMetricStatus<br/>(where jobId = ?)
    DB-->>Procedure: metric status rows
    alt Metrics exist
        Procedure->>DB: Load jobVerificationMetricMeasurement<br/>rows (ordered) for all<br/>metric status IDs
        DB-->>Procedure: measurement rows
        Procedure->>Logic: Group measurements<br/>by metric
        Logic-->>Procedure: grouped measurements
        Procedure->>Logic: Derive per-metric phase<br/>(failed count, passed streak,<br/>thresholds, count cap)
        Logic-->>Procedure: phase per metric
        Procedure->>Logic: Aggregate phases across<br/>metrics (failed > running > passed)
        Logic-->>Procedure: final status
        Procedure-->>Client: status result
    else No metrics found
        Procedure-->>Client: empty string ("")
    end
```
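The aggregation step at the end of the diagram (failed > running > passed, empty string when no metrics exist) can be sketched as a small reduction. This is an illustrative sketch only; `aggregatePhases` is a hypothetical name, not the actual procedure code.

```typescript
type Phase = "passed" | "failed" | "running";

// Reduce per-metric phases using the precedence failed > running > passed.
// An empty input mirrors the "no metrics found" branch, which returns "".
function aggregatePhases(phases: Phase[]): Phase | "" {
  if (phases.length === 0) return "";
  if (phases.some((p) => p === "failed")) return "failed";
  if (phases.some((p) => p === "running")) return "running";
  return "passed";
}
```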
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (warning)
Pull request overview
Fixes verification status computation so metrics that meet success_threshold via consecutive passes are treated as terminal/passed, preventing jobs from remaining stuck in a “verifying” state (issue #1018).
Changes:
- Extend `GetSiblingMetricStatuses` terminal detection to include consecutive-pass early termination.
- Extend `GetAggregateJobVerificationStatus` so metrics that hit `success_threshold` no longer keep the job in `running`.
- Regenerate sqlc output to reflect the updated SQL.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| apps/workspace-engine/pkg/db/queries/job_verification_metric.sql | Adds consecutive-pass counting logic and updates terminal/aggregate status rules to respect success_threshold. |
| apps/workspace-engine/pkg/db/job_verification_metric.sql.go | Regenerated sqlc code embedding the updated SQL and comments. |
```sql
LEFT JOIN LATERAL (
    SELECT COUNT(*)::int AS consecutive_passes
    FROM job_verification_metric_measurement mm
    WHERE mm.job_verification_metric_status_id = m.id
      AND mm.status = 'passed'
```
The new consecutive_passes lateral join introduces additional scans of job_verification_metric_measurement (including a correlated MAX(measured_at) subquery). In the current schema there is no index on job_verification_metric_measurement.job_verification_metric_status_id (see apps/workspace-engine/pkg/db/queries/schema.sql:419+), so this is likely to devolve into repeated seq scans as measurement volume grows. Consider adding an index (or partial indexes) to support lookups by (job_verification_metric_status_id, status, measured_at) so these terminal-status queries remain fast.
```sql
OR (
    COALESCE(m.success_threshold, 0) > 0
    AND COALESCE(cp.consecutive_passes, 0) >= m.success_threshold
)
```
This PR changes the definition of is_terminal to allow early termination when success_threshold consecutive passes are reached. There are already controller-level tests for success-threshold behavior, but there doesn’t appear to be a DB/query-level regression test to ensure GetSiblingMetricStatuses (and the aggregate status query) reflects the same logic. Adding an integration test that inserts measurements and asserts terminal/aggregate status would help prevent the UI from getting stuck in "verifying" again.
```go
//
// 'running' if any metric is still incomplete,
// 'failed' if any metric has exceeded its failure threshold.
// Returns ” (empty string) if the job has no verification metrics.
```
The generated Go doc comment has a typographic closing quote (”) where it looks like it intended to show an empty string literal. This reads like a copy/paste/smart-quote artifact; please update the source SQL comment (and regenerate) so the Go comment renders correctly.
```diff
-// Returns ” (empty string) if the job has no verification metrics.
+// Returns "" (empty string) if the job has no verification metrics.
```
Actionable comments posted: 1
🧹 Nitpick comments (1)
packages/trpc/src/routes/verifications.ts (1)
**23-33**: Minor: single-pass grouping is cleaner and O(n) instead of O(metrics × measurements). Not a correctness issue (current rowsets are small), but for each metric the code re-scans the entire measurements array. A single pass is also easier to read:
Proposed refactor
```diff
-function groupMeasurementsByMetric(
-  metricIds: string[],
-  measurements: MeasurementRow[],
-): Map<string, MeasurementStatus[]> {
-  return new Map(
-    metricIds.map((id) => [
-      id,
-      measurements.filter((r) => r.metricId === id).map((r) => r.status),
-    ]),
-  );
-}
+function groupMeasurementsByMetric(
+  metricIds: string[],
+  measurements: MeasurementRow[],
+): Map<string, MeasurementStatus[]> {
+  const byMetric = new Map<string, MeasurementStatus[]>(
+    metricIds.map((id) => [id, []]),
+  );
+  for (const r of measurements) {
+    if (r.metricId == null) continue;
+    byMetric.get(r.metricId)?.push(r.status);
+  }
+  return byMetric;
+}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/trpc/src/routes/verifications.ts` around lines 23 - 33, The current groupMeasurementsByMetric scans measurements once per metric; instead build the Map in a single pass: initialize a Map with each metricId -> empty MeasurementStatus[] (use metricIds array), then iterate measurements exactly once and for each MeasurementRow r do if map.has(r.metricId) map.get(r.metricId)!.push(r.status); finally return the map. Update the function groupMeasurementsByMetric to use metricIds, measurements, MeasurementRow, and MeasurementStatus as described.
```ts
function computeMetricPhase(
  measurements: MeasurementStatus[],
  count: number,
  successThreshold: number | null,
  failureThreshold: number | null,
): "passed" | "failed" | "running" {
  const failureLimit = failureThreshold ?? 0;
  const failedCount = getFailedCount(measurements);
  const consecutiveSuccessCount = getConsecutiveSuccessCount(measurements);

  const hasAnyFailures = failedCount > 0;
  const isFailureLimitExceeded = failureLimit > 0 && failedCount > failureLimit;
  if ((failureLimit === 0 && hasAnyFailures) || isFailureLimitExceeded)
    return "failed";

  if (
    successThreshold != null &&
    successThreshold > 0 &&
    consecutiveSuccessCount >= successThreshold
  )
    return "passed";

  if (measurements.length >= count) return "passed";

  return "running";
}
```
❓ Verification inconclusive
🏁 Scripts executed (repository: ctrlplanedev/ctrlplane):

```shell
# First, explore the verifications.ts file structure and context
cd packages/trpc/src/routes && wc -l verifications.ts

# Check if there are test files for this module
find . -type f -name "*verification*" | head -20

# Search for the full context of computeMetricPhase and related functions
cd packages/trpc/src/routes && sed -n '1,100p' verifications.ts

# Look for tests that might clarify the expected behavior
find . -type f \( -name "*.test.ts" -o -name "*.spec.ts" \) | xargs grep -l "computeMetricPhase\|Measurement\|threshold" 2>/dev/null | head -10

# Search for any documentation or comments about thresholds
rg "failureThreshold|successThreshold" --type ts -B2 -A2

# Check if there's documentation about the verification logic
find . -type f \( -name "*.md" -o -name "README*" \) | xargs grep -l "verification\|threshold" 2>/dev/null

# Look for test files specifically for the verifications route
find . -path "*/node_modules" -prune -o -type f \( -name "*verifications*test*" -o -name "*verifications*spec*" \) -print

# Search for test files that might contain tests for computeMetricPhase or similar logic
rg "computeMetricPhase\|getFailedCount\|getConsecutiveSuccessCount" --type ts

# Look at the full status procedure to understand the flow better
cd packages/trpc/src/routes && sed -n '70,136p' verifications.ts

# Check if there are any comments or documentation about verification requirements
rg "verification.*metric|threshold.*behavior" --type ts -i -B2 -A2

# Look at the database schema to understand defaults and constraints
cd packages/db/src/schema && cat -n job-verification-metric.ts

# Check the deployment verification documentation
find . -name "deployment-verification.mdx" -exec cat {} \;

# Search for examples or documentation about threshold semantics
rg "successThreshold|failureThreshold" --type mdx -B3 -A3

# Look for any Go implementation in the workspace-engine (backend) that might clarify semantics
find apps/workspace-engine -name "*.go" -type f | xargs grep -l "failureThreshold\|successThreshold" 2>/dev/null | head -5

# Check if there are any comments or logic in the Go code
find apps/workspace-engine -name "*.go" -type f | xargs grep -B5 -A10 "failureThreshold\|successThreshold" 2>/dev/null | head -80

# Get the full test file to understand the threshold semantics
cat apps/workspace-engine/pkg/oapi/job_verification_test.go

# Also look at the Status() method implementation to understand the logic
rg "func.*Status\(\)" apps/workspace-engine/pkg/oapi/ -A15

# Get the full Status() method implementation
sed -n '/func (jv \*JobVerification) Status()/,/^func /p' apps/workspace-engine/pkg/oapi/oapi.go | head -80
```
Verify the threshold semantics against your verification spec, and decide on the count-cap fallback behavior.
The code has two semantic behaviors worth confirming:
- Asymmetric thresholds are intentional. Failure uses strict `>` while success uses `>=`: with `failureThreshold = 3` the metric fails on the 4th failure, but with `successThreshold = 3` it passes on the 3rd success. This is confirmed by the backend test suite (`FailureLimitReached_NotFailedBeforeExceeded` shows `failureThreshold=2` with 2 failures does NOT fail). Decide if this asymmetry matches your intent or should be unified.
- Count-cap fallback behavior. When `successThreshold` is not set (null or 0, the database defaults), reaching `count` measurements without exceeding the failure limit declares success, even if no actual successes occurred (e.g., all inconclusive). This is the current design, but consider whether inconclusive runs should be treated as "running" or "inconclusive" instead of "passed".
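The asymmetry described above can be demonstrated with a minimal sketch. This is a hypothetical simplification of the quoted `computeMetricPhase` (it inlines the `getFailedCount` and `getConsecutiveSuccessCount` helpers), not the production code.

```typescript
type Status = "passed" | "failed" | "inconclusive";

// Illustrative restatement of the comparisons discussed above:
// - failure is strict ">": failedCount must EXCEED failureThreshold,
// - success is ">=": a streak equal to successThreshold already passes,
// - with no successThreshold, reaching `count` measurements passes by default.
function metricPhase(
  statuses: Status[],
  count: number,
  successThreshold: number | null,
  failureThreshold: number | null,
): "passed" | "failed" | "running" {
  const failureLimit = failureThreshold ?? 0;
  const failedCount = statuses.filter((s) => s === "failed").length;

  // Length of the trailing run of consecutive passes.
  let streak = 0;
  for (const s of statuses) streak = s === "passed" ? streak + 1 : 0;

  if (
    (failureLimit === 0 && failedCount > 0) ||
    (failureLimit > 0 && failedCount > failureLimit)
  )
    return "failed";
  if (successThreshold != null && successThreshold > 0 && streak >= successThreshold)
    return "passed";
  if (statuses.length >= count) return "passed"; // count-cap fallback
  return "running";
}
```

With `failureThreshold = 2`, two failures are still "running" but a third fails; with `successThreshold = 3`, the third consecutive pass already passes; and all-inconclusive runs that hit `count` pass under the current count-cap design.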
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/trpc/src/routes/verifications.ts` around lines 35 - 60,
computeMetricPhase currently treats failureThreshold with strict '>' (fails
after exceeding) and successThreshold with '>=' (passes when reaching), and it
treats reaching count as automatic pass when successThreshold is null/0; confirm
intended semantics and update accordingly: either unify to symmetric comparisons
(use '>=' for failures to fail when failedCount >= failureThreshold, or use '>'
for both to require exceeding) inside computeMetricPhase (referencing
failureThreshold, failedCount, successThreshold, consecutiveSuccessCount) and
change the count-cap fallback behavior (the final if using measurements.length
>= count) to instead treat inconclusive/no-success runs as "running" or a new
"inconclusive" state per spec; ensure logic still consults getFailedCount and
getConsecutiveSuccessCount and add/tests to reflect the chosen behavior.
fixes #1018