OCPBUGS-78534: Make E2E test create helper idempotent and improve test cleanup #16148

Open
cajieh wants to merge 1 commit into openshift:main from cajieh:fix-flaky-deprecated-operator-test

@cajieh cajieh commented Mar 16, 2026

Summary:

  • Change oc create to oc apply for idempotent resource creation.
    Using oc apply instead of oc create follows Kubernetes best practices for declarative resource management
    and is more reliable for test infrastructure.
  • Improve the robustness of the deprecated-operator-warnings test cleanup.

The create() helper function used oc create, which fails with an "AlreadyExists" error if the resource still exists from a previous run. This causes flaky test failures in CI for three reasons:

  1. Kubernetes asynchronous deletion: resources can still be terminating after the delete command completes.
  2. Test timeouts: cleanup hooks don't run when tests time out.
  3. Infrastructure failures: CI crashes can prevent cleanup.

The helper now uses oc apply, which is idempotent: it creates the resource if it doesn't exist and updates it if it does. This improves test reliability regardless of prior cluster state.
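The difference between the two semantics can be sketched with a small in-memory model. This is an illustration only: `FakeCluster`, `Resource`, and the return values are hypothetical names invented for this sketch, not part of `oc`, the console test helpers, or this PR.

```typescript
// Hypothetical in-memory stand-in for a cluster. It only contrasts
// create-style and apply-style semantics; it is not how oc works internally.
type Resource = { kind: string; name: string; spec: Record<string, unknown> };

class FakeCluster {
  private store = new Map<string, Resource>();

  private key(r: Resource): string {
    return `${r.kind}/${r.name}`;
  }

  // Create-style: rejects a resource that is already present,
  // for example one left over from an earlier run.
  create(r: Resource): void {
    if (this.store.has(this.key(r))) {
      throw new Error(`AlreadyExists: ${this.key(r)}`);
    }
    this.store.set(this.key(r), r);
  }

  // Apply-style: creates the resource if missing, otherwise overwrites it.
  // Simplification: any existing resource is reported as "configured".
  apply(r: Resource): 'created' | 'configured' {
    const existed = this.store.has(this.key(r));
    this.store.set(this.key(r), r);
    return existed ? 'configured' : 'created';
  }
}
```

Calling `apply()` twice with the same resource succeeds both times, while a second `create()` throws. That second-call failure is the behavior the description attributes to leftover resources.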

Issues fixed:

Specific: Resolves the flaky "Deprecated operator warnings" test suite, which was failing 51% of the time (9% overall CI impact) with consistent oc create failures for the kiali.subscription.json resource.
Broader impact: The create() helper is used by other test suites, so they also benefit from this change.

Summary by CodeRabbit

Release Notes

  • Tests
    • Improved integration test stability with enhanced resource cleanup procedures and extended timeouts.
    • Updated test resource management for more reliable and idempotent test operations.

@openshift-ci openshift-ci bot requested review from Leo6Leo and TheRealJon March 16, 2026 12:04
@openshift-ci openshift-ci bot added the component/olm, kind/cypress, and approved labels Mar 16, 2026

coderabbitai bot commented Mar 16, 2026

Walkthrough

This pull request changes how the Cypress integration tests create and clean up cluster resources. The integration-tests-cypress support file switches from oc create to oc apply for resource provisioning. The operator-lifecycle-manager test suite adds a dedicated cleanup function that consolidates idempotent teardown of Subscriptions, ClusterServiceVersions, and InstallPlans with standardized flags (--ignore-not-found, extended timeouts, --wait=false). The setup and teardown hooks now call this centralized cleanup instead of running individual delete commands, which makes test resource lifecycle management more consistent and maintainable.
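A centralized cleanup of this kind might be sketched as follows. This is a minimal sketch and not the PR's actual cleanupOperatorResources(): `CleanupTarget`, `buildDeleteCommand`, and `buildCleanupCommands` are hypothetical names, and the resource names and namespace used with them are placeholders.

```typescript
// Hypothetical description of one resource to delete during cleanup.
interface CleanupTarget {
  resource: string; // resource type, e.g. "subscription"
  name: string; // placeholder resource name
  namespace: string; // placeholder namespace
}

// Builds a delete command that tolerates a missing resource
// (--ignore-not-found) and does not block on finalization (--wait=false).
function buildDeleteCommand(t: CleanupTarget): string {
  return `oc delete ${t.resource} ${t.name} -n ${t.namespace} --ignore-not-found --wait=false`;
}

// Produces one command per target, in the order given, so setup and
// teardown hooks can share the same list.
function buildCleanupCommands(targets: CleanupTarget[]): string[] {
  return targets.map(buildDeleteCommand);
}
```

In a Cypress test, each resulting string could be passed to cy.exec with options such as `{ failOnNonZeroExit: false, timeout: 60000 }`, which are the options quoted in the review comment on this PR.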


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@frontend/packages/operator-lifecycle-manager/integration-tests-cypress/tests/deprecated-operator-warnings.cy.ts`:
- Around line 64-65: The afterEach catalogsource deletion is missing the
non-blocking flag; update the teardown command that uses catalogSourceName and
catalogSourceNamespace in the afterEach hook to include --wait=false (to match
the beforeEach invocation) so the oc delete call becomes non-blocking and avoids
blocking teardown under load; locate the afterEach block where the command
string is built and append --wait=false to the oc delete invocation.

📥 Commits

Reviewing files that changed from the base of the PR and between 42e0cce and 079758e.

📒 Files selected for processing (2)
  • frontend/packages/integration-tests-cypress/support/index.ts
  • frontend/packages/operator-lifecycle-manager/integration-tests-cypress/tests/deprecated-operator-warnings.cy.ts
🧬 Code graph analysis (1)
frontend/packages/operator-lifecycle-manager/integration-tests-cypress/tests/deprecated-operator-warnings.cy.ts (1)
frontend/packages/integration-tests-cypress/support/index.ts (1)
  • create (126-134)
🔇 Additional comments (2)
frontend/packages/integration-tests-cypress/support/index.ts (1)

132-132: Idempotent oc apply change looks correct.

Good change for rerun safety and CI flake reduction in resource setup.

frontend/packages/operator-lifecycle-manager/integration-tests-cypress/tests/deprecated-operator-warnings.cy.ts (1)

29-56: Cleanup centralization is a solid reliability improvement.

Extracting cleanupOperatorResources() and reusing it in setup reduces duplication and makes teardown behavior consistent.

Comment on lines +64 to +65
`oc delete catalogsource ${catalogSourceName} -n ${catalogSourceNamespace} --ignore-not-found`,
{ failOnNonZeroExit: false, timeout: 60000 },
⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify catalogsource teardown commands are consistently non-blocking.
# Expected: both before/after delete commands include --wait=false.
rg -n --type=ts -C2 'oc delete catalogsource .*--ignore-not-found'

Repository: openshift/console

Length of output: 1637


Add --wait=false to afterEach catalogsource deletion for consistent non-blocking teardown.

Line 64's afterEach hook omits --wait=false while line 52's beforeEach hook includes it. This asymmetry can cause test flakes under load, as teardown may block on async catalogsource deletion. Align both teardown operations.

Fix
-      `oc delete catalogsource ${catalogSourceName} -n ${catalogSourceNamespace} --ignore-not-found`,
+      `oc delete catalogsource ${catalogSourceName} -n ${catalogSourceNamespace} --ignore-not-found --wait=false`,
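One way to keep the two hooks from drifting apart is to build the command in a single place. The sketch below assumes a hypothetical helper, `catalogSourceDeleteCommand`, and a shared `execOptions` constant; neither exists in the PR as shown here.

```typescript
// Hypothetical shared options, mirroring the values quoted in the review.
const execOptions = { failOnNonZeroExit: false, timeout: 60000 };

// Hypothetical helper: the only place that decides which flags the
// catalogsource delete uses, so beforeEach and afterEach stay identical.
function catalogSourceDeleteCommand(name: string, namespace: string): string {
  const flags = ['--ignore-not-found', '--wait=false'];
  return `oc delete catalogsource ${name} -n ${namespace} ${flags.join(' ')}`;
}
```

Both hooks would then call `cy.exec(catalogSourceDeleteCommand(catalogSourceName, catalogSourceNamespace), execOptions)`, so a flag added for one hook automatically applies to the other.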

The create() helper function used `oc create` which fails with
"AlreadyExists" if the resource exists from a previous failed test run.
This causes flaky test failures in CI, particularly affecting the
"Deprecated operator warnings" test suite.

Changed to `oc apply` which is idempotent - it creates the resource if
it doesn't exist, or updates it if it does. This improves test
reliability when runs don't fully clean up or when resources persist
between test runs.

Made-with: Cursor
@cajieh cajieh force-pushed the fix-flaky-deprecated-operator-test branch from 079758e to e93f6f0 Compare March 16, 2026 13:01

cajieh commented Mar 16, 2026

/retest

@cajieh cajieh changed the title from "[NO-JIRA]: Make E2E test create helper idempotent and improve test cleanup" to "OCPBUGS-78534: Make E2E test create helper idempotent and improve test cleanup" Mar 16, 2026
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference and jira/invalid-bug labels Mar 16, 2026
@openshift-ci-robot

@cajieh: This pull request references Jira Issue OCPBUGS-78534, which is invalid:

  • expected the bug to target the "4.22.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.


cajieh commented Mar 16, 2026

/jira refresh

@openshift-ci-robot openshift-ci-robot added the jira/valid-bug label and removed the jira/invalid-bug label Mar 16, 2026
@openshift-ci-robot

@cajieh: This pull request references Jira Issue OCPBUGS-78534, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.22.0) matches configured target version for branch (4.22.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

cajieh commented Mar 16, 2026

/retest

@openshift-ci-robot

@cajieh: This pull request references Jira Issue OCPBUGS-78534, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.22.0) matches configured target version for branch (4.22.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)
cajieh added a commit to cajieh/console that referenced this pull request Mar 17, 2026
Add cleanup helpers to both global and single-namespace Data Grid
operator installation tests. This addresses flaky test failures
caused by leftover resources from previous test runs.

Changes:
- Add cleanupOperatorResources() helper to delete Subscriptions, CSVs,
  and InstallPlans with --ignore-not-found --wait=false flags
- Call cleanup before operator installation to handle dirty state
- Add after() hook to global install test for consistent teardown
- Add cleanup to single-namespace test's after() hook

This follows the same pattern used in the deprecated-operator-warnings
test fix (PR openshift#16148) and improves test reliability in CI environments.

Made-with: Cursor
@sg00dwin sg00dwin left a comment

/lgtm

@openshift-ci openshift-ci bot added the lgtm label Mar 17, 2026

openshift-ci bot commented Mar 17, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cajieh, sg00dwin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


cajieh commented Mar 17, 2026

/verified later @yapei

@openshift-ci-robot openshift-ci-robot added the verified-later and verified labels Mar 17, 2026

cajieh commented Mar 17, 2026

/retest

@openshift-ci-robot

@cajieh: This PR has been marked to be verified later by @yapei.


cajieh commented Mar 17, 2026

/cherry-pick release-4.16

@openshift-cherrypick-robot

@cajieh: once the present PR merges, I will cherry-pick it on top of release-4.16 in a new PR and assign it to you.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.


openshift-ci bot commented Mar 17, 2026

@cajieh: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-gcp-console | e93f6f0 | link | unknown | /test e2e-gcp-console |

Full PR test history. Your PR dashboard.

