TRT-2559: Handle new variants in dbgroupby in regression tracking #3363
When db_groupby is modified to add new variant dimensions, the
regression tracker now gracefully handles existing regressions through
two mechanisms:
1. Subset Matching: FindOpenRegression matches if all regression variants
are present in the input test, allowing the input to have additional
variants. This prevents duplicate regressions when new variants are
added to db_column_groupby.
2. Variant Updating: When a regression is matched via subset matching,
any missing variants from the input test are added to the existing
regression's variant list and persisted to the database.
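The two mechanisms above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `isSubsetMatch` and `missingVariants` are hypothetical helper names, variants are assumed to be stored as `Key:Value` strings, and treating a malformed variant as a non-match is an assumption.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// isSubsetMatch reports whether every regression variant is present in the
// input with the same value. The input may carry additional variants.
func isSubsetMatch(regressionVariants []string, input map[string]string) bool {
	for _, rv := range regressionVariants {
		key, value, ok := strings.Cut(rv, ":")
		if !ok {
			return false // assumption: a malformed variant never matches
		}
		if got, present := input[key]; !present || got != value {
			return false
		}
	}
	return true
}

// missingVariants returns the input variants whose keys the regression does
// not have yet, sorted so the result is stable across runs.
func missingVariants(regressionVariants []string, input map[string]string) []string {
	have := make(map[string]bool, len(regressionVariants))
	for _, rv := range regressionVariants {
		if key, _, ok := strings.Cut(rv, ":"); ok {
			have[key] = true
		}
	}
	var missing []string
	for key, value := range input {
		if !have[key] {
			missing = append(missing, key+":"+value)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	input := map[string]string{"Platform": "aws", "Network": "ovn", "OS": "rhcos9"}
	existing := []string{"Platform:aws", "Network:ovn"}

	fmt.Println("subset match:", isSubsetMatch(existing, input))
	fmt.Println("variants to add:", missingVariants(existing, input))
}
```

In this sketch the match check and the list of variants to persist are kept separate, so a dry-run caller could report the pending update without applying it.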
This enables automatic regression splitting by new variant dimensions:
Example with OS variant added to db_column_groupby:
Initial state:
- Regression ID 1000: test_foo [Platform:aws, Network:ovn]
After adding OS to db_column_groupby:
First test execution (OS:rhcos9):
- Subset match finds regression 1000 (Platform:aws, Network:ovn are present)
- Updates regression 1000 → [Platform:aws, Network:ovn, OS:rhcos9]
Second test execution (OS:rhcos10):
- No match (regression 1000 now has OS:rhcos9, but test has OS:rhcos10)
- Creates regression 1001 → [Platform:aws, Network:ovn, OS:rhcos10]
Result: One regression is split into two, properly separated by the new
OS variant dimension, while preserving the original regression's ID and
opened date for rhcos9.
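The split described in this example can be replayed with a small in-memory model. The `tracker` and `regression` types and the `sync` method below are hypothetical stand-ins for illustration only; the real tracker persists to a database and carries more fields.

```go
package main

import "fmt"

// regression is a simplified, hypothetical model of a tracked regression.
type regression struct {
	ID       int
	TestName string
	Variants map[string]string
}

// tracker holds open regressions in memory for the purpose of this sketch.
type tracker struct {
	nextID      int
	regressions []*regression
}

// subset reports whether every variant in reg appears in input with the same value.
func subset(reg, input map[string]string) bool {
	for k, v := range reg {
		if got, ok := input[k]; !ok || got != v {
			return false
		}
	}
	return true
}

// sync returns the regression a test execution maps to. On a subset match it
// adds the missing variants to the existing regression; otherwise it opens a
// new regression with the full input variant set.
func (t *tracker) sync(testName string, input map[string]string) *regression {
	for _, r := range t.regressions {
		if r.TestName != testName || !subset(r.Variants, input) {
			continue
		}
		for k, v := range input {
			r.Variants[k] = v // existing keys already hold the same value
		}
		return r
	}
	variants := make(map[string]string, len(input))
	for k, v := range input {
		variants[k] = v
	}
	r := &regression{ID: t.nextID, TestName: testName, Variants: variants}
	t.nextID++
	t.regressions = append(t.regressions, r)
	return r
}

func main() {
	trk := &tracker{nextID: 1001, regressions: []*regression{
		{ID: 1000, TestName: "test_foo", Variants: map[string]string{"Platform": "aws", "Network": "ovn"}},
	}}

	first := trk.sync("test_foo", map[string]string{"Platform": "aws", "Network": "ovn", "OS": "rhcos9"})
	second := trk.sync("test_foo", map[string]string{"Platform": "aws", "Network": "ovn", "OS": "rhcos10"})

	fmt.Println(first.ID, first.Variants)
	fmt.Println(second.ID, second.Variants)
}
```

Note that which OS value inherits ID 1000 depends on which execution is processed first; in this model the rhcos9 execution arrives first, as in the example.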
Testing: Validated against the production database in dry-run mode by adding
OS to the 4.22-main view. 206 regressions matched via subset matching and
were correctly identified for variant updates.
@xueqzhan: This pull request references TRT-2559 which is a valid jira issue.
Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.22.0" version, but no target version was set.
Walkthrough
Updated regression variant matching semantics from iterating input variants against a helper function to subset matching of regression variants, added malformed variant handling, and integrated variant expansion during regression sync with dry-run mode gating.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks: ✅ 4 passed
Warning: There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.
🔧 golangci-lint (2.11.3)
Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/api/componentreadiness/middleware/regressiontracker/regressiontracker.go (1)
191-218: ⚠️ Potential issue | 🟠 Major
Subset matching needs an explicit tie-breaker.
After this change, more than one open regression can legitimately match the same input. Returning the first match makes the chosen regression depend on backend ordering, so the middleware can attach triage/history to one record while pkg/api/componentreadiness/regressiontracker.go expands or reopens another. Please define and enforce a deterministic precedence rule here (for example, most-specific variants, or oldest regression if preserving the legacy ID is the goal).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/api/componentreadiness/middleware/regressiontracker/regressiontracker.go` around lines 191 - 218, Multiple open regressions can match the same input; instead of returning matches[0] (which is non-deterministic), sort and pick a deterministic winner: after building matches (slice of tr), implement a stable sort that orders by your chosen precedence (e.g., primary: descending specificity = len(tr.Variants) to prefer most-specific, secondary: ascending creation time like tr.CreatedAt or ascending tr.ID to tie-break by oldest/legacy), then return matches[0]; update the matching block in regressiontracker.go (the matches slice and return) to perform this sort before returning.
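One way the tie-breaker suggested in this comment could look is sketched below. It is only an illustration of the proposed precedence (most variants first, then lowest ID); the `candidate` type and `pickRegression` function are hypothetical, and the maintainers may prefer a different rule, such as oldest regression first.

```go
package main

import (
	"fmt"
	"sort"
)

// candidate is a hypothetical, trimmed-down view of a matching open regression.
type candidate struct {
	ID       uint
	Variants []string
}

// pickRegression chooses one winner from the matching regressions in a way
// that does not depend on the order the backend returned them in:
// most-specific (most variants) first, then lowest ID as the tie-breaker.
func pickRegression(matches []candidate) (candidate, bool) {
	if len(matches) == 0 {
		return candidate{}, false
	}
	// Sort a copy so the caller's slice is left untouched.
	ordered := append([]candidate(nil), matches...)
	sort.SliceStable(ordered, func(i, j int) bool {
		if len(ordered[i].Variants) != len(ordered[j].Variants) {
			return len(ordered[i].Variants) > len(ordered[j].Variants)
		}
		return ordered[i].ID < ordered[j].ID
	})
	return ordered[0], true
}

func main() {
	legacy := candidate{ID: 1000, Variants: []string{"Platform:aws", "Network:ovn"}}
	specific := candidate{ID: 1001, Variants: []string{"Platform:aws", "Network:ovn", "OS:rhcos9"}}

	winner, _ := pickRegression([]candidate{legacy, specific})
	fmt.Println("chosen regression:", winner.ID)
}
```

Because the comparison falls back to the ID, the result is fully determined even when two candidates have the same number of variants.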
🧹 Nitpick comments (2)
pkg/api/componentreadiness/regressiontracker.go (1)
340-360: Persist openReg once after all mutations.
This new branch saves the record immediately for variant expansion, and the existing update/reopen paths can save it again in the same iteration. That adds avoidable DB churn and leaves the regression partially updated if a later save fails; it would be safer to accumulate the in-memory changes and call UpdateRegression once.
Also applies to: 403-408
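The accumulate-then-save pattern requested here might be sketched as below. All names are hypothetical simplifications: the `backend` interface, the `UpdateRegression` signature, and the `syncOne` function are assumptions for illustration and do not reflect the actual types in regressiontracker.go.

```go
package main

import "fmt"

// regression is a hypothetical, simplified record.
type regression struct {
	ID       int
	Variants []string
	Closed   bool
}

// backend is an assumed persistence interface for this sketch.
type backend interface {
	UpdateRegression(r *regression) error
}

// countingBackend records how many writes were issued.
type countingBackend struct{ writes int }

func (b *countingBackend) UpdateRegression(r *regression) error {
	b.writes++
	return nil
}

// syncOne applies every in-memory mutation first and persists at most once,
// skipping the write entirely in dry-run mode or when nothing changed.
func syncOne(b backend, r *regression, newVariants []string, reopen, dryRun bool) error {
	dirty := false
	if len(newVariants) > 0 {
		r.Variants = append(r.Variants, newVariants...)
		dirty = true
	}
	if reopen && r.Closed {
		r.Closed = false
		dirty = true
	}
	if !dirty || dryRun {
		return nil
	}
	return b.UpdateRegression(r)
}

func main() {
	b := &countingBackend{}
	r := &regression{ID: 1000, Variants: []string{"Platform:aws", "Network:ovn"}, Closed: true}

	if err := syncOne(b, r, []string{"OS:rhcos9"}, true, false); err != nil {
		fmt.Println("update failed:", err)
		return
	}
	fmt.Println("writes:", b.writes, "variants:", r.Variants, "closed:", r.Closed)
}
```

With a single write per iteration, a failed save leaves the stored record in its previous state rather than partially updated.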
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/api/componentreadiness/regressiontracker.go` around lines 340 - 360, The code updates openReg and calls rt.backend.UpdateRegression immediately when newVariants are detected, causing multiple DB writes in one iteration; instead, accumulate all in-memory changes to openReg (e.g., appending variantStrs to openReg.Variants) and defer calling rt.backend.UpdateRegression until after all mutation paths (including the existing update/reopen logic) have completed, invoking UpdateRegression exactly once if rt.dryRun is false; update the logic around existingVariantMap, newVariants, the variant-appending block that references regTest.Variants and openReg.Variants, and remove the early call to rt.backend.UpdateRegression so only the consolidated UpdateRegression call remains later in the function.
pkg/api/componentreadiness/middleware/regressiontracker/regressiontracker_test.go (1)
719-837: Add coverage for overlapping subset matches.
Every case here passes at most one candidate regression, so this suite will not catch the ambiguity introduced when both a legacy subset regression and a fully specific regression match the same input. Please add a case with both candidates and assert the intended precedence rule.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/api/componentreadiness/middleware/regressiontracker/regressiontracker_test.go` around lines 719 - 837, The test suite misses a case where two candidate regressions overlap (one legacy/subset and one fully specific) so ambiguity isn’t exercised; add a subtest that creates two regressions in the regressions slice (both TestRegression entries) — one with fewer regressionVars (the subset/legacy) and one with a superset or fully specific regressionVars — then call FindOpenRegression(sampleRelease, testID, inputVariants, regressions) and assert the function returns the correct regression according to the intended precedence rule (explicitly assert which ID — e.g., the fully specific regression ID — should be chosen), and also validate the returned Release, BaseRelease and TestID as in other tests.
📒 Files selected for processing (3)
pkg/api/componentreadiness/middleware/regressiontracker/regressiontracker.go
pkg/api/componentreadiness/middleware/regressiontracker/regressiontracker_test.go
pkg/api/componentreadiness/regressiontracker.go
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: sosiouxme, xueqzhan
/test lint