Skip to content

OCPBUGS-77952: Fix devfile sample import by adding fallback for parent resolution failures#16149

Open
jhadvig wants to merge 1 commit intoopenshift:mainfrom
jhadvig:OCPBUGS-77952
Open

OCPBUGS-77952: Fix devfile sample import by adding fallback for parent resolution failures#16149
jhadvig wants to merge 1 commit intoopenshift:mainfrom
jhadvig:OCPBUGS-77952

Conversation

@jhadvig
Copy link
Member

@jhadvig jhadvig commented Mar 16, 2026

Summary:

  • Devfile samples with parent references (e.g., go-basic) fail when the vendored devfile library's OCI pull of parent stack resources from registry.devfile.io breaks, keeping the submit button permanently disabled
  • Add fallback: try flattened parse first, retry without parent resolution on failure
  • Safe because the console backend only uses outer-loop components (image, kubernetes, deploy command) which are all defined in the child devfile

/assign @vikram-raj

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Implemented automatic fallback mechanism for devfile parsing to gracefully handle complex configurations and improve overall reliability.
    • Enhanced handling of devfiles with unreachable registry references or missing parent resources.
  • Tests

    • Added comprehensive test coverage for devfile parsing fallback scenarios.

@openshift-ci-robot openshift-ci-robot added jira/severity-important Referenced Jira bug's severity is important for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. labels Mar 16, 2026
@openshift-ci-robot
Copy link
Contributor

@jhadvig: This pull request references Jira Issue OCPBUGS-77952, which is invalid:

  • expected the bug to target the "4.22.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Details

In response to this:

Summary:

  • Devfile samples with parent references (e.g., go-basic) fail when the vendored devfile library's OCI pull of parent stack resources from registry.devfile.io breaks, keeping the submit button permanently disabled
  • Add fallback: try flattened parse first, retry without parent resolution on failure
  • Safe because the console backend only uses outer-loop components (image, kubernetes, deploy command) which are all defined in the child devfile

/assign @vikram-raj

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. label Mar 16, 2026
@openshift-ci openshift-ci bot requested review from Leo6Leo and TheRealJon March 16, 2026 14:33
@openshift-ci openshift-ci bot added component/backend Related to backend approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Mar 16, 2026
@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Mar 16, 2026
@openshift-ci-robot
Copy link
Contributor

@jhadvig: This pull request references Jira Issue OCPBUGS-77952, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.22.0) matches configured target version for branch (4.22.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @yapei

Details

In response to this:

Summary:

  • Devfile samples with parent references (e.g., go-basic) fail when the vendored devfile library's OCI pull of parent stack resources from registry.devfile.io breaks, keeping the submit button permanently disabled
  • Add fallback: try flattened parse first, retry without parent resolution on failure
  • Safe because the console backend only uses outer-loop components (image, kubernetes, deploy command) which are all defined in the child devfile

/assign @vikram-raj

Summary by CodeRabbit

Release Notes

  • Bug Fixes

  • Implemented automatic fallback mechanism for devfile parsing to gracefully handle complex configurations and improve overall reliability.

  • Enhanced handling of devfiles with unreachable registry references or missing parent resources.

  • Tests

  • Added comprehensive test coverage for devfile parsing fallback scenarios.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot requested a review from yapei March 16, 2026 14:36
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 16, 2026

📝 Walkthrough

Walkthrough

This pull request introduces a two-stage devfile parsing mechanism to improve resilience when handling parent and plugin references. A new parseDevfileWithFallback function attempts parsing with full parent/plugin resolution (flattened) first, then retries with flattening disabled if the initial attempt fails. The DevfileHandler delegates to this function rather than directly calling ParseDevfileAndValidate. New test coverage validates the fallback behavior with three scenarios: successful parsing without parents, fallback retry when a registry is unreachable, and error handling for invalid YAML input.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: implementing a fallback mechanism for devfile parent resolution failures to fix sample imports.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
📝 Coding Plan
  • Generate coding plan for human review comments

Comment @coderabbitai help to get the list of available commands and usage tips.

Tip

CodeRabbit can approve the review once all CodeRabbit's comments are resolved.

Enable the reviews.request_changes_workflow setting to automatically approve the review once all CodeRabbit's comments are resolved.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
pkg/devfile/handler.go (1)

43-58: Gate fallback retries to network/resolution failures; skip for schema violations.

Line 51 currently retries unflattened parsing for any ParseDevfileAndValidate error, including schema validation and YAML syntax failures. This burns unnecessary cycles for invalid inputs and obscures root-cause diagnostics. Since the handler at line 86 already inspects error messages, add a guard before the retry—return the original error immediately for validation failures, and only retry for parent/plugin/OCI resolution timeouts:

 	if err == nil {
 		return devfileObj, nil
 	}
+
+	if !shouldRetryUnflattened(err) {
+		return parser.DevfileObj{}, err
+	}
 
 	klog.Warningf("Flattened devfile parse failed, retrying without parent resolution: %v", err)
 
 	flattenedDevfile := false
 	devfileObj, _, err = devfile.ParseDevfileAndValidate(parser.ParserArgs{
 		Data:             devfileContentBytes,
 		HTTPTimeout:      httpTimeout,
 		FlattenedDevfile: &flattenedDevfile,
 	})
 	return devfileObj, nil
+}
+
+func shouldRetryUnflattened(err error) bool {
+	msg := strings.ToLower(err.Error())
+	return strings.Contains(msg, "oci") ||
+		strings.Contains(msg, "parent") ||
+		strings.Contains(msg, "registry") ||
+		strings.Contains(msg, "timeout") ||
+		strings.Contains(msg, "connection refused")
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/devfile/handler.go` around lines 43 - 58, The current retry always calls
devfile.ParseDevfileAndValidate a second time (setting FlattenedDevfile) even
for validation/YAML errors; change the handler to inspect the original err from
devfile.ParseDevfileAndValidate and only perform the flattened-devfile retry
when the error indicates network/resolution/timeout issues (parent/plugin/OCI
resolution failures), otherwise return the original error immediately; use the
same error inspection approach already used later in the file (the handler's
existing error message checks) to detect validation/schema/YAML errors versus
resolution/timeouts before creating flattenedDevfile and re-calling
devfile.ParseDevfileAndValidate.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/devfile/handler_test.go`:
- Around line 65-66: The test currently relies on an external DNS name via the
registryUrl YAML value and a long wait (about 10s), causing CI flakiness; update
the fallback test to use a deterministic local failure target (replace
registryUrl: 'https://does-not-exist.invalid' with 'http://127.0.0.1:1') and
shorten the related wait/timeout (the ~10s backoff used in the test) to a small
value like 200–500ms so the test fails fast; locate the registryUrl string in
the test fixture and the timeout/wait variable used around the fallback
assertion and make these two changes.

---

Nitpick comments:
In `@pkg/devfile/handler.go`:
- Around line 43-58: The current retry always calls
devfile.ParseDevfileAndValidate a second time (setting FlattenedDevfile) even
for validation/YAML errors; change the handler to inspect the original err from
devfile.ParseDevfileAndValidate and only perform the flattened-devfile retry
when the error indicates network/resolution/timeout issues (parent/plugin/OCI
resolution failures), otherwise return the original error immediately; use the
same error inspection approach already used later in the file (the handler's
existing error message checks) to detect validation/schema/YAML errors versus
resolution/timeouts before creating flattenedDevfile and re-calling
devfile.ParseDevfileAndValidate.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c233660d-44f2-4076-bf3d-b052fc6548a2

📥 Commits

Reviewing files that changed from the base of the PR and between 4d1f6f7 and dc59323.

📒 Files selected for processing (2)
  • pkg/devfile/handler.go
  • pkg/devfile/handler_test.go

Comment on lines +65 to +66
registryUrl: 'https://does-not-exist.invalid'
components:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Remove external DNS/network dependency from fallback test.

The fallback case currently depends on resolving an external invalid domain (Line 65) and can wait up to 10s (Line 111), which may cause CI flakiness/latency. Use a deterministic local failure target (for example http://127.0.0.1:1) and a shorter timeout.

Proposed test hardening
-  registryUrl: 'https://does-not-exist.invalid'
+  registryUrl: 'http://127.0.0.1:1'
@@
-	httpTimeout := 10
+	httpTimeout := 1

Also applies to: 111-125

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/devfile/handler_test.go` around lines 65 - 66, The test currently relies
on an external DNS name via the registryUrl YAML value and a long wait (about
10s), causing CI flakiness; update the fallback test to use a deterministic
local failure target (replace registryUrl: 'https://does-not-exist.invalid' with
'http://127.0.0.1:1') and shorten the related wait/timeout (the ~10s backoff
used in the test) to a small value like 200–500ms so the test fails fast; locate
the registryUrl string in the test fixture and the timeout/wait variable used
around the fallback assertion and make these two changes.

@vikram-raj
Copy link
Member

/retest

Copy link
Member

@vikram-raj vikram-raj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 17, 2026
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 17, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jhadvig, vikram-raj

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 17, 2026

@jhadvig: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. component/backend Related to backend jira/severity-important Referenced Jira bug's severity is important for the branch this PR is targeting. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants