TRT-2559: Handle new variants in dbgroupby in regression tracking#3363

Merged
openshift-merge-bot[bot] merged 1 commit into openshift:main from xueqzhan:db-groupby-regression
Mar 26, 2026

Conversation

@xueqzhan xueqzhan commented Mar 24, 2026

When db_groupby is modified to add new variant dimensions, the
regression tracker now gracefully handles existing regressions through
two mechanisms:

  1. Subset Matching: FindOpenRegression matches if all regression variants are present in the input test, allowing the input to have additional variants. This prevents duplicate regressions when new variants are added to db_column_groupby.

  2. Variant Updating: When a regression is matched via subset matching, any missing variants from the input test are added to the existing regression's variant list and persisted to the database.
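The two mechanisms can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: it assumes a regression stores its variants as "Key:Value" strings and that the input test's variants arrive as a map, and the helper names isSubsetMatch and missingVariants are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// isSubsetMatch reports whether every "Key:Value" variant recorded on a
// regression is present, with the same value, in the input test's variants.
// The input may carry additional keys. Hypothetical helper for illustration.
func isSubsetMatch(regressionVariants []string, input map[string]string) bool {
	for _, v := range regressionVariants {
		key, value, ok := strings.Cut(v, ":")
		if !ok {
			return false // no separator: treat the regression as not matching
		}
		if got, present := input[key]; !present || got != value {
			return false
		}
	}
	return true
}

// missingVariants returns the input variants whose keys the regression does
// not record yet, formatted as "Key:Value" and sorted for stable output.
func missingVariants(regressionVariants []string, input map[string]string) []string {
	existing := make(map[string]bool, len(regressionVariants))
	for _, v := range regressionVariants {
		if key, _, ok := strings.Cut(v, ":"); ok {
			existing[key] = true
		}
	}
	var missing []string
	for k, v := range input {
		if !existing[k] {
			missing = append(missing, k+":"+v)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	reg := []string{"Platform:aws", "Network:ovn"}
	input := map[string]string{"Platform": "aws", "Network": "ovn", "OS": "rhcos9"}
	fmt.Println(isSubsetMatch(reg, input))
	fmt.Println(missingVariants(reg, input))
}
```

In this sketch a regression with an empty variant list matches any input; whether the real FindOpenRegression behaves that way is not stated here.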

This enables automatic regression splitting by new variant dimensions:

Example with OS variant added to db_column_groupby:

Initial state:
- Regression ID 1000: test_foo [Platform:aws, Network:ovn]

After adding OS to db_column_groupby:

First test execution (OS:rhcos9):
- Subset match finds regression 1000 (Platform:aws, Network:ovn are present)
- Updates regression 1000 → [Platform:aws, Network:ovn, OS:rhcos9]

Second test execution (OS:rhcos10):
- No match (regression 1000 now has OS:rhcos9, but test has OS:rhcos10)
- Creates regression 1001 → [Platform:aws, Network:ovn, OS:rhcos10]

Result: One regression is split into two, properly separated by the new
OS variant dimension, while preserving the original regression's ID and
opened date for rhcos9.
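The split described in this example can be reproduced with a small in-memory simulation. The regression and tracker types and the sync method below are hypothetical simplifications (variants are kept as a map rather than a persisted list), intended only to show why the second execution no longer matches regression 1000.

```go
package main

import "fmt"

// regression is a simplified, hypothetical stand-in for a tracked regression.
type regression struct {
	ID       int
	Variants map[string]string
}

// tracker holds open regressions in memory; nextID is the ID assigned to the
// next regression it opens.
type tracker struct {
	open   []*regression
	nextID int
}

// subset reports whether every variant on the regression appears in input
// with the same value.
func subset(reg, input map[string]string) bool {
	for k, v := range reg {
		if got, ok := input[k]; !ok || got != v {
			return false
		}
	}
	return true
}

// sync returns the open regression that subset-matches input, after copying
// any new input variants onto it. With no match it opens a new regression
// carrying the full input variant set.
func (t *tracker) sync(input map[string]string) *regression {
	for _, r := range t.open {
		if subset(r.Variants, input) {
			for k, v := range input {
				r.Variants[k] = v
			}
			return r
		}
	}
	r := &regression{ID: t.nextID, Variants: map[string]string{}}
	for k, v := range input {
		r.Variants[k] = v
	}
	t.nextID++
	t.open = append(t.open, r)
	return r
}

func main() {
	tr := &tracker{
		open: []*regression{
			{ID: 1000, Variants: map[string]string{"Platform": "aws", "Network": "ovn"}},
		},
		nextID: 1001,
	}
	first := tr.sync(map[string]string{"Platform": "aws", "Network": "ovn", "OS": "rhcos9"})
	second := tr.sync(map[string]string{"Platform": "aws", "Network": "ovn", "OS": "rhcos10"})
	fmt.Println(first.ID, first.Variants)
	fmt.Println(second.ID, second.Variants)
}
```

Which OS value keeps ID 1000 depends on which execution is processed first; in this simulation that is rhcos9, as in the example above.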

Testing: Validated against the production database in dry-run mode by
adding OS to the 4.22-main view. 206 regressions matched via subset
matching and were correctly identified for variant updates.

Summary by CodeRabbit

  • Bug Fixes

    • Improved regression variant matching to use subset matching semantics instead of the previous iteration approach.
    • Added proper dry-run mode support to prevent unintended database writes during test runs.
    • Enhanced variant tracking to automatically expand and consolidate variants when regressions are matched.
  • Tests

    • Added comprehensive test suite validating regression variant subset matching scenarios.

@openshift-ci-robot

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To manually trigger all second-stage jobs, use the /pipeline required command.

This repository is configured in: automatic mode

openshift-ci-robot commented Mar 24, 2026

@xueqzhan: This pull request references TRT-2559 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.22.0" version, but no target version was set.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Mar 24, 2026
coderabbitai bot commented Mar 24, 2026

Walkthrough

Changes regression variant matching from iterating input variants through a helper function to subset matching on the regression's variants, adds handling for malformed variants, and expands variants during regression sync, with persistence gated by dry-run mode.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Variant Matching Logic: pkg/api/componentreadiness/middleware/regressiontracker/regressiontracker.go | Changed FindOpenRegression to perform subset matching by iterating regression variants and validating presence in input map. Removed findVariant helper and added early exit for malformed variant strings (not exactly one key:value pair). |
| Test Coverage: pkg/api/componentreadiness/middleware/regressiontracker/regressiontracker_test.go | Added TestFindOpenRegression_SubsetMatching test suite with six cases validating subset matching behavior, extra variants, missing values, empty variants, and expectation alignment on matched regression properties. |
| Regression Tracking Integration: pkg/api/componentreadiness/regressiontracker.go | Implemented variant expansion when an open regression is found, computing missing variants from regTest.Variants and persisting updates via UpdateRegression. Gated all regression update persistence behind !rt.dryRun check. |
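The malformed-variant early exit mentioned in the first row could look something like the sketch below. The parseVariant helper is hypothetical, and reading "exactly one key:value pair" as "exactly one colon separator" is an assumption, not something the summary confirms.

```go
package main

import (
	"fmt"
	"strings"
)

// parseVariant splits a "Key:Value" string into its two parts. It reports
// ok=false unless the string splits into exactly two parts on ":".
// Hypothetical helper; the real parsing rules may differ.
func parseVariant(s string) (key, value string, ok bool) {
	parts := strings.Split(s, ":")
	if len(parts) != 2 {
		return "", "", false
	}
	return parts[0], parts[1], true
}

func main() {
	for _, s := range []string{"Platform:aws", "Platform", "a:b:c"} {
		key, value, ok := parseVariant(s)
		fmt.Printf("%q -> key=%q value=%q ok=%v\n", s, key, value, ok)
	}
}
```

A caller doing subset matching would stop and report no match as soon as parseVariant returns ok=false for any variant on the regression.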

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Go Error Handling | ✅ Passed | Proper Go error handling patterns found: errors are checked, wrapped with %w format, no panic calls outside init(), no ignored errors with _, and nil checks present before dereferences. |
| Sql Injection Prevention | ✅ Passed | PR does not introduce SQL injection vulnerabilities. Variant values are stored as data in struct fields and persisted via GORM's parameterized Save() method, never used in direct SQL construction. |
| Excessive Css In React Should Use Styles | ✅ Passed | PR modifies Go backend files only; custom check targets React components with inline CSS, which is not applicable to this codebase. |
| Single Responsibility And Clear Naming | ✅ Passed | PR maintains excellent single responsibility with action-oriented method names (FindOpenRegression, SyncRegressionsForReport, UpdateRegression) and clear package separation by concern. Struct field counts are cohesive and appropriate, parameters are reasonable, and unnecessary indirection is removed. |

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.11.3)

Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions
The command is terminated due to an error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions



@openshift-ci openshift-ci bot requested review from dgoodwin and smg247 March 24, 2026 20:42
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 24, 2026
@openshift-ci-robot
Scheduling required tests:
/test e2e

@coderabbitai coderabbitai bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/api/componentreadiness/middleware/regressiontracker/regressiontracker.go (1)

191-218: ⚠️ Potential issue | 🟠 Major

Subset matching needs an explicit tie-breaker.

After this change, more than one open regression can legitimately match the same input. Returning the first match makes the chosen regression depend on backend ordering, so the middleware can attach triage/history to one record while pkg/api/componentreadiness/regressiontracker.go expands or reopens another. Please define and enforce a deterministic precedence rule here (for example, most-specific variants, or oldest regression if preserving the legacy ID is the goal).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/api/componentreadiness/middleware/regressiontracker/regressiontracker.go`
around lines 191 - 218, Multiple open regressions can match the same input;
instead of returning matches[0] (which is non-deterministic), sort and pick a
deterministic winner: after building matches (slice of tr), implement a stable
sort that orders by your chosen precedence (e.g., primary: descending
specificity = len(tr.Variants) to prefer most-specific, secondary: ascending
creation time like tr.CreatedAt or ascending tr.ID to tie-break by
oldest/legacy), then return matches[0]; update the matching block in
regressiontracker.go (the matches slice and return) to perform this sort before
returning.
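One way to make the choice deterministic, following the precedence the comment floats as an example (most variants first, then lowest ID), is sketched below. The candidate type and pickRegression function are hypothetical, and nothing in this thread indicates which rule, if any, was adopted.

```go
package main

import (
	"fmt"
	"sort"
)

// candidate is a hypothetical, reduced view of an open regression that
// subset-matched the input.
type candidate struct {
	ID       uint
	Variants []string
}

// pickRegression returns the candidate with the most variants, breaking ties
// by the lowest ID. It returns nil when there are no candidates and leaves
// the caller's slice order untouched.
func pickRegression(matches []*candidate) *candidate {
	if len(matches) == 0 {
		return nil
	}
	sorted := append([]*candidate(nil), matches...)
	sort.SliceStable(sorted, func(i, j int) bool {
		if len(sorted[i].Variants) != len(sorted[j].Variants) {
			return len(sorted[i].Variants) > len(sorted[j].Variants)
		}
		return sorted[i].ID < sorted[j].ID
	})
	return sorted[0]
}

func main() {
	legacy := &candidate{ID: 1000, Variants: []string{"Platform:aws", "Network:ovn"}}
	specific := &candidate{ID: 1001, Variants: []string{"Platform:aws", "Network:ovn", "OS:rhcos10"}}
	fmt.Println(pickRegression([]*candidate{legacy, specific}).ID)
	fmt.Println(pickRegression([]*candidate{specific, legacy}).ID)
}
```

Preferring the most specific regression keeps an already-split record from being shadowed by a legacy one; if preserving the legacy ID matters more, the two sort keys would be swapped.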
🧹 Nitpick comments (2)
pkg/api/componentreadiness/regressiontracker.go (1)

340-360: Persist openReg once after all mutations.

This new branch saves the record immediately for variant expansion, and the existing update/reopen paths can save it again in the same iteration. That adds avoidable DB churn and leaves the regression partially updated if a later save fails; it would be safer to accumulate the in-memory changes and call UpdateRegression once.

Also applies to: 403-408

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/api/componentreadiness/regressiontracker.go` around lines 340 - 360, The
code updates openReg and calls rt.backend.UpdateRegression immediately when
newVariants are detected, causing multiple DB writes in one iteration; instead,
accumulate all in-memory changes to openReg (e.g., appending variantStrs to
openReg.Variants) and defer calling rt.backend.UpdateRegression until after all
mutation paths (including the existing update/reopen logic) have completed,
invoking UpdateRegression exactly once if rt.dryRun is false; update the logic
around existingVariantMap, newVariants, the variant-appending block that
references regTest.Variants and openReg.Variants, and remove the early call to
rt.backend.UpdateRegression so only the consolidated UpdateRegression call
remains later in the function.
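The accumulate-then-persist pattern suggested here might look like the following sketch. The regression type, the backend interface, and syncOne are hypothetical stand-ins for the real types; only the UpdateRegression name and the dry-run gating come from the discussion above.

```go
package main

import "fmt"

// regression is a simplified, hypothetical stand-in for the tracked record.
type regression struct {
	ID       uint
	Variants []string
	Closed   bool
}

// backend is a hypothetical persistence interface with a single update call.
type backend interface {
	UpdateRegression(r *regression) error
}

// countingBackend records how many writes were requested.
type countingBackend struct{ writes int }

func (b *countingBackend) UpdateRegression(r *regression) error {
	b.writes++
	return nil
}

// syncOne applies every in-memory mutation first, then persists at most once.
// In dry-run mode the in-memory record still changes but nothing is written.
func syncOne(b backend, dryRun bool, r *regression, newVariants []string, reopen bool) error {
	changed := false
	if len(newVariants) > 0 {
		r.Variants = append(r.Variants, newVariants...)
		changed = true
	}
	if reopen && r.Closed {
		r.Closed = false
		changed = true
	}
	if !changed || dryRun {
		return nil
	}
	if err := b.UpdateRegression(r); err != nil {
		return fmt.Errorf("updating regression %d: %w", r.ID, err)
	}
	return nil
}

func main() {
	b := &countingBackend{}
	r := &regression{ID: 1000, Variants: []string{"Platform:aws", "Network:ovn"}, Closed: true}
	if err := syncOne(b, false, r, []string{"OS:rhcos9"}, true); err != nil {
		fmt.Println("error:", err)
	}
	fmt.Println("writes:", b.writes, "variants:", r.Variants, "closed:", r.Closed)
}
```

A single write per regression means a failed save leaves the stored record entirely unchanged rather than partially updated.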
pkg/api/componentreadiness/middleware/regressiontracker/regressiontracker_test.go (1)

719-837: Add coverage for overlapping subset matches.

Every case here passes at most one candidate regression, so this suite will not catch the ambiguity introduced when both a legacy subset regression and a fully specific regression match the same input. Please add a case with both candidates and assert the intended precedence rule.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@pkg/api/componentreadiness/middleware/regressiontracker/regressiontracker_test.go`
around lines 719 - 837, The test suite misses a case where two candidate
regressions overlap (one legacy/subset and one fully specific) so ambiguity
isn’t exercised; add a subtest that creates two regressions in the regressions
slice (both TestRegression entries) — one with fewer regressionVars (the
subset/legacy) and one with a superset or fully specific regressionVars — then
call FindOpenRegression(sampleRelease, testID, inputVariants, regressions) and
assert the function returns the correct regression according to the intended
precedence rule (explicitly assert which ID — e.g., the fully specific
regression ID — should be chosen), and also validate the returned Release,
BaseRelease and TestID as in other tests.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: a9f7c0ca-090d-4dd1-a47b-e3bc945377b3

📥 Commits

Reviewing files that changed from the base of the PR and between 8f46b90 and e22c267.

📒 Files selected for processing (3)
  • pkg/api/componentreadiness/middleware/regressiontracker/regressiontracker.go
  • pkg/api/componentreadiness/middleware/regressiontracker/regressiontracker_test.go
  • pkg/api/componentreadiness/regressiontracker.go

@sosiouxme sosiouxme left a comment

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 25, 2026
openshift-ci bot commented Mar 25, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sosiouxme, xueqzhan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

sosiouxme commented Mar 26, 2026

/test lint
(stbenjam fixed the deps)

@openshift-ci-robot
/retest-required

Remaining retests: 0 against base HEAD ba3d7ee and 2 for PR HEAD e22c267 in total

@openshift-ci-robot
Scheduling required tests:
/test e2e

@openshift-merge-bot openshift-merge-bot bot merged commit 4dedbab into openshift:main Mar 26, 2026
3 of 8 checks passed