OCPBUGS-77952: Fix devfile sample import by adding fallback for parent resolution failures#16149
OCPBUGS-77952: Fix devfile sample import by adding fallback for parent resolution failures#16149jhadvig wants to merge 1 commit intoopenshift:mainfrom
Conversation
…t resolution failures
|
@jhadvig: This pull request references Jira Issue OCPBUGS-77952, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@jhadvig: This pull request references Jira Issue OCPBUGS-77952, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
Requesting review from QA contact: DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
📝 WalkthroughWalkthroughThis pull request introduces a two-stage devfile parsing mechanism to improve resilience when handling parent and plugin references. A new 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Comment Tip CodeRabbit can approve the review once all CodeRabbit's comments are resolved.Enable the |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
pkg/devfile/handler.go (1)
43-58: Gate fallback retries to network/resolution failures; skip for schema violations.Line 51 currently retries unflattened parsing for any
ParseDevfileAndValidateerror, including schema validation and YAML syntax failures. This burns unnecessary cycles for invalid inputs and obscures root-cause diagnostics. Since the handler at line 86 already inspects error messages, add a guard before the retry—return the original error immediately for validation failures, and only retry for parent/plugin/OCI resolution timeouts:if err == nil { return devfileObj, nil } + + if !shouldRetryUnflattened(err) { + return parser.DevfileObj{}, err + } klog.Warningf("Flattened devfile parse failed, retrying without parent resolution: %v", err) flattenedDevfile := false devfileObj, _, err = devfile.ParseDevfileAndValidate(parser.ParserArgs{ Data: devfileContentBytes, HTTPTimeout: httpTimeout, FlattenedDevfile: &flattenedDevfile, }) return devfileObj, nil +} + +func shouldRetryUnflattened(err error) bool { + msg := strings.ToLower(err.Error()) + return strings.Contains(msg, "oci") || + strings.Contains(msg, "parent") || + strings.Contains(msg, "registry") || + strings.Contains(msg, "timeout") || + strings.Contains(msg, "connection refused") }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/devfile/handler.go` around lines 43 - 58, The current retry always calls devfile.ParseDevfileAndValidate a second time (setting FlattenedDevfile) even for validation/YAML errors; change the handler to inspect the original err from devfile.ParseDevfileAndValidate and only perform the flattened-devfile retry when the error indicates network/resolution/timeout issues (parent/plugin/OCI resolution failures), otherwise return the original error immediately; use the same error inspection approach already used later in the file (the handler's existing error message checks) to detect validation/schema/YAML errors versus resolution/timeouts before creating flattenedDevfile and re-calling devfile.ParseDevfileAndValidate.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/devfile/handler_test.go`:
- Around line 65-66: The test currently relies on an external DNS name via the
registryUrl YAML value and a long wait (about 10s), causing CI flakiness; update
the fallback test to use a deterministic local failure target (replace
registryUrl: 'https://does-not-exist.invalid' with 'http://127.0.0.1:1') and
shorten the related wait/timeout (the ~10s backoff used in the test) to a small
value like 200–500ms so the test fails fast; locate the registryUrl string in
the test fixture and the timeout/wait variable used around the fallback
assertion and make these two changes.
---
Nitpick comments:
In `@pkg/devfile/handler.go`:
- Around line 43-58: The current retry always calls
devfile.ParseDevfileAndValidate a second time (setting FlattenedDevfile) even
for validation/YAML errors; change the handler to inspect the original err from
devfile.ParseDevfileAndValidate and only perform the flattened-devfile retry
when the error indicates network/resolution/timeout issues (parent/plugin/OCI
resolution failures), otherwise return the original error immediately; use the
same error inspection approach already used later in the file (the handler's
existing error message checks) to detect validation/schema/YAML errors versus
resolution/timeouts before creating flattenedDevfile and re-calling
devfile.ParseDevfileAndValidate.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: c233660d-44f2-4076-bf3d-b052fc6548a2
📒 Files selected for processing (2)
pkg/devfile/handler.gopkg/devfile/handler_test.go
| registryUrl: 'https://does-not-exist.invalid' | ||
| components: |
There was a problem hiding this comment.
Remove external DNS/network dependency from fallback test.
The fallback case currently depends on resolving an external invalid domain (Line 65) and can wait up to 10s (Line 111), which may cause CI flakiness/latency. Use a deterministic local failure target (for example http://127.0.0.1:1) and a shorter timeout.
Proposed test hardening
- registryUrl: 'https://does-not-exist.invalid'
+ registryUrl: 'http://127.0.0.1:1'
@@
- httpTimeout := 10
+ httpTimeout := 1Also applies to: 111-125
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/devfile/handler_test.go` around lines 65 - 66, The test currently relies
on an external DNS name via the registryUrl YAML value and a long wait (about
10s), causing CI flakiness; update the fallback test to use a deterministic
local failure target (replace registryUrl: 'https://does-not-exist.invalid' with
'http://127.0.0.1:1') and shorten the related wait/timeout (the ~10s backoff
used in the test) to a small value like 200–500ms so the test fails fast; locate
the registryUrl string in the test fixture and the timeout/wait variable used
around the fallback assertion and make these two changes.
|
/retest |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jhadvig, vikram-raj The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
@jhadvig: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Summary:
/assign @vikram-raj
Summary by CodeRabbit
Release Notes
Bug Fixes
Tests