fix(dx): harden cicd-diagnostics with diagnose.py entry point, continue-on-error detection, and improved error extraction #34866

Merged: spbolton merged 4 commits into main from issue-34865-harden-cicd-diagnostics on Mar 30, 2026

@spbolton (Member) commented Mar 4, 2026

Summary

Closes #34865

Fixes a misdiagnosis bug where the skill blamed a continue-on-error step (CLI Deploy / maven-repo) instead of the actual failure (SDKs Publish / npm 403). Adds diagnose.py as a single entry point that replaces multi-script orchestration.

Key changes

  • diagnose.py — Single entry point with progressive subcommands (--metadata, --jobs, --annotations, --logs, --evidence). One permission pattern covers all operations.
  • fetch-jobs.py — Step-level detail with continue-on-error detection. Correctly identifies which step killed the job vs which errors were masked.
  • Error extraction — Now catches npm error, npm ERR!, BUILD FAILURE in addition to ##[error] lines.
  • WORKFLOWS.md — Documents deployment step continue-on-error flags and artifact-run-id propagation.
  • SKILL.md — Rewritten: triage-first, diagnose.py as primary tool, explicit guidance against ad-hoc commands.
  • github_api.py: DiagnosticError class, _run_gh wrapper, preflight validates dotCMS/core checkout.
  • .claude/settings.json — Permissions for diagnostic script execution.
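
A minimal sketch of the expanded error extraction described above, assuming simple substring-style regexes. The marker strings come from the PR description; the function name `extract_error_lines` and the exact patterns are hypothetical and are not taken from diagnose.py.

```python
import re
from typing import List

# Markers listed in the PR description. The regexes themselves are an
# assumption; the real script may match these differently.
ERROR_PATTERNS = [
    re.compile(r"##\[error\]"),
    re.compile(r"npm error"),
    re.compile(r"npm ERR!"),
    re.compile(r"BUILD FAILURE"),
]


def extract_error_lines(log_text: str) -> List[str]:
    """Return every log line that contains at least one known error marker."""
    return [
        line.rstrip()
        for line in log_text.splitlines()
        if any(pattern.search(line) for pattern in ERROR_PATTERNS)
    ]
```

Matching on several markers means an npm or Maven failure is surfaced even when the only ##[error] line in the log belongs to a masked step.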

Before/After

Before: Skill sees ##[error] Artifact not found for name: maven-repo → blames CLI Deploy → proposes wrong fix.

After: diagnose.py output shows:

[success     ] Step 9: CLI Deploy
[FAIL <- likely caused job failure] Step 11: SDKs Publish
...
npm error 403 Forbidden - You cannot publish over previously published versions: 1.2.5-next.1

Test plan

  • diagnose.py <RUN_ID> — full evidence gathering, correct root cause identification
  • diagnose.py <RUN_ID> --jobs — step-level detail with continue-on-error flags
  • diagnose.py <RUN_ID> --logs <JOB_ID> — npm/build errors extracted alongside ##[error]
  • Preflight fails gracefully outside dotCMS/core checkout
  • Single permission pattern Bash(python3 .claude/skills/cicd-diagnostics/*) sufficient
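
The invocations in the test plan could be parsed along these lines. This is a sketch only: the flag names come from the PR description, while `build_parser`, `selected_stages`, and the choice to let `--logs` take an optional job ID are assumptions about how such a CLI might be wired.

```python
import argparse
from typing import List

ALL_STAGES = ("metadata", "jobs", "annotations", "logs", "evidence")


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags named in the PR description."""
    parser = argparse.ArgumentParser(prog="diagnose.py")
    parser.add_argument("run_id", help="GitHub Actions run ID")
    parser.add_argument("--metadata", action="store_true")
    parser.add_argument("--jobs", action="store_true")
    parser.add_argument("--annotations", action="store_true")
    # Assumption: --logs may be given alone or with a specific job ID.
    parser.add_argument("--logs", nargs="?", const="all", default=None,
                        metavar="JOB_ID")
    parser.add_argument("--evidence", action="store_true")
    return parser


def selected_stages(args: argparse.Namespace) -> List[str]:
    """Return the requested stages, or every stage when no flag is given."""
    chosen = []
    for name in ALL_STAGES:
        value = getattr(args, name)
        if (name == "logs" and value is not None) or (name != "logs" and value):
            chosen.append(name)
    return chosen or list(ALL_STAGES)
```

With this shape, `diagnose.py <RUN_ID>` with no flags falls through to full evidence gathering, which matches the first test-plan item.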

🤖 Generated with Claude Code

This PR fixes: #34865

@spbolton spbolton added the Area : CI/CD PR changes GitHub Actions/workflows label Mar 4, 2026
@github-actions github-actions bot removed the Area : CI/CD PR changes GitHub Actions/workflows label Mar 4, 2026
@spbolton spbolton requested a review from a team as a code owner March 24, 2026 19:46
@spbolton spbolton force-pushed the issue-34865-harden-cicd-diagnostics branch from 9b28510 to 7576bed Compare March 24, 2026 19:48
@claude bot (Contributor) commented Mar 24, 2026

Claude finished @spbolton's task in 3m 8s


Code Review

The core logic is sound and the consolidation from 5 scripts to one entry point is a clear improvement. A few issues worth calling out:


1. ValueError risk in download_logs - diagnose.py:236

targets = [{'id': int(specific_job_id), 'name': f'Job {specific_job_id}'}]

int(specific_job_id) raises ValueError if a non-numeric string is passed. The id is immediately converted back to str via str(job['id']) a few lines later anyway. Just use the string directly:

targets = [{'id': specific_job_id, 'name': f'Job {specific_job_id}'}]
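
If some validation of the job ID is still wanted, one option is to keep the value as a string and reject non-numeric input with an actionable message rather than an uncaught ValueError. The sketch below assumes that approach; `build_log_targets` is a hypothetical helper, and `DiagnosticError` here is a local stand-in for the class the PR adds to github_api.py.

```python
import re
from typing import Dict, List


class DiagnosticError(Exception):
    """Local stand-in for the PR's DiagnosticError (assumed to be an Exception)."""


def build_log_targets(specific_job_id: str) -> List[Dict[str, str]]:
    """Build the single-job target list without converting the ID to int."""
    job_id = specific_job_id.strip()
    # ASCII digits only; anything else gets a readable error.
    if not re.fullmatch(r"[0-9]+", job_id):
        raise DiagnosticError(
            f"Job ID must be numeric, got {specific_job_id!r}"
        )
    return [{"id": job_id, "name": f"Job {job_id}"}]
```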


2. Dead fallback for annotations — diagnose.py:show_annotations

show_annotations silently returns "(annotation scraper not available)" if html_scraper can't be imported, with no fallback. The PR author justified keeping get_workflow_run_annotations() in github_api.py as a "fallback if the HTML scraper breaks", but diagnose.py never calls it. The fallback is dead code. Either wire it in or remove that justification from the PR response.
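
Wiring the fallback in could look roughly like the sketch below. It assumes both the scraper and the API helper can be called with a run ID and return a list of annotations; those signatures are not confirmed by the PR, and `get_annotations` is a hypothetical name.

```python
from typing import Callable, List, Optional, Tuple

Fetcher = Callable[[str], List[dict]]


def get_annotations(
    run_id: str,
    scrape: Optional[Fetcher] = None,
    api_fallback: Optional[Fetcher] = None,
) -> Tuple[str, List[dict]]:
    """Return (source, annotations), preferring the scraper over the API.

    `scrape` stands in for the html_scraper entry point and `api_fallback`
    for get_workflow_run_annotations(); both signatures are assumptions.
    """
    if scrape is not None:
        try:
            return "html", scrape(run_id)
        except Exception:
            # Scraper is present but broken: fall through to the API.
            pass
    if api_fallback is not None:
        return "api", api_fallback(run_id)
    return "none", []
```

Returning the source alongside the data lets the caller report which path produced the annotations instead of printing a generic "not available" message.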


3. Stray blank lines inside list literals — github_api.py:162, github_api.py:243

result = _run_gh(
    [
        "gh", "run", "view", run_id,
                             # ← blank line here
        "--json", "conclusion,...
    ],

Same pattern in get_pr_info. Valid Python, but clearly accidental copy-paste artifacts from the refactor.


4. continue-on-error heuristic is correct

The fix (later_ran = "any later non-cleanup step ran" rather than "succeeded") handles consecutive continue-on-error failures correctly. The logic in show_jobs is:

ran_non_cleanup = {step['number'] for step in steps
                   if step.get('conclusion') and not is_cleanup_step(step)}

later_ran = any(n > number for n in ran_non_cleanup)

For the triggering scenario (CLI Deploy fails w/ continue-on-error, then SDKs Publish also fails as the real killer): CLI Deploy gets later_ran=True → masked; SDKs Publish gets later_ran=False → "likely caused job failure". Correct.

Edge case acknowledged: a continue-on-error step that is the last non-cleanup step will be labeled "likely caused job failure". This is the right call given the API doesn't expose the continue-on-error flag.
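
The heuristic can be exercised end to end with hypothetical step records. In this sketch, `classify_failed_steps` and the label strings are illustrative names; `is_cleanup_step` is reduced to the "Post " prefix check mentioned in the review thread; and treating a `skipped` conclusion as "did not run" is an assumption of the sketch, not something the quoted snippet shows.

```python
from typing import Dict, List


def is_cleanup_step(step: dict) -> bool:
    # Assumption: cleanup steps are the auto-generated "Post ..." steps.
    return step.get("name", "").startswith("Post ")


def classify_failed_steps(steps: List[dict]) -> Dict[int, str]:
    """Label each failed step as masked or as the likely job killer."""
    # Steps that actually ran, ignoring cleanup. Skipped steps are excluded
    # here (sketch assumption) so they do not count as "later step ran".
    ran_non_cleanup = {
        step["number"]
        for step in steps
        if step.get("conclusion") not in (None, "skipped")
        and not is_cleanup_step(step)
    }
    labels = {}
    for step in steps:
        if step.get("conclusion") != "failure":
            continue
        later_ran = any(n > step["number"] for n in ran_non_cleanup)
        labels[step["number"]] = (
            "masked (continue-on-error)" if later_ran
            else "likely caused job failure"
        )
    return labels
```

With a failed step 9, a passing step 10, a failed step 11, a skipped step 12, and a "Post" step 13, step 9 is labeled masked and step 11 is labeled the likely killer.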


5. _run_gh 403 heuristic could misfire — github_api.py:108

if "403" in stderr or "rate limit" in stderr.lower():
    raise DiagnosticError("Rate limit hit...")

"403" in stderr could match on a URL fragment or message body (e.g., if gh embeds response body in the error). For GitHub CLI errors this is unlikely in practice but it would give a misleading "rate limit" message when the actual cause is a 403 permission error. Consider checking for "HTTP 403" or "rate limit" only.

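One way to tighten the check from item 5 is sketched below. It assumes the CLI's stderr carries an "HTTP 403" marker for permission errors, as the review suggests; `classify_gh_stderr` and its return labels are hypothetical.

```python
import re


def classify_gh_stderr(stderr: str) -> str:
    """Classify CLI stderr as 'rate_limit', 'forbidden', or 'unknown'."""
    # Rate-limit wording wins, even when it arrives with a 403 status.
    if "rate limit" in stderr.lower():
        return "rate_limit"
    # Require the explicit status marker so a bare "403" inside a URL or
    # an ID does not trigger a match.
    if re.search(r"\bHTTP 403\b", stderr):
        return "forbidden"
    return "unknown"
```

Separating the two outcomes lets the wrapper raise a permission-specific message instead of reporting every 403 as a rate limit.
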

Summary: Clean PR. Issues 1 and 3 are minor fixes, issue 2 is a documentation inconsistency, issues 4 and 5 are low-risk observations. WORKFLOWS.md addition is a valuable diagnostic reference.

@spbolton spbolton changed the title from "fix(dx): harden cicd-diagnostics scripts with preflight checks and error handling" to "fix(dx): harden cicd-diagnostics with diagnose.py entry point, continue-on-error detection, and improved error extraction" Mar 24, 2026
@semgrep-code-dotcms-test

Semgrep found 1 ssc-c313d835-7f7c-4f67-a6e2-ac5038d1fb9d finding:

Risk: Affected versions of storybook are vulnerable to Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting') / Improper Neutralization of Special Elements in Output Used by a Downstream Component ('Injection') / Missing Origin Validation in WebSockets. Storybook's dev server WebSocket endpoint does not validate the Origin of incoming connections, enabling WebSocket hijacking. If a developer visits a malicious website while a local Storybook dev server is running (or if the dev server is publicly exposed), an attacker can open an unauthorized WebSocket connection and invoke the "create/save story" handlers to write attacker-controlled content into story files, leading to persistent XSS and potentially remote code execution and supply-chain compromise if the injected changes are committed and propagated.

Manual Review Advice: A vulnerability from this advisory is reachable if you visit a malicious website while your local Storybook dev server is running

Fix: Upgrade this library to at least version 10.2.10 at core/core-web/yarn.lock:20288.

Reference(s): GHSA-mjf5-7g4m-gx5w, CVE-2026-27148

If this is a critical or high severity finding, please also link this issue in the #security channel in Slack.

Semgrep found 1 ssc-6193c409-cebc-449c-8a55-f95fa9d0e4f0 finding:

Risk: Affected versions of rollup are vulnerable to Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal'). Rollup is vulnerable to arbitrary file write via path traversal: chunk/asset names derived from user-controlled inputs (e.g., CLI named inputs, manual chunk aliases, or malicious plugins) are insufficiently sanitized, allowing ../ sequences to survive and be passed into path.resolve when computing output paths. This lets an attacker escape the configured output directory and overwrite arbitrary files on the host filesystem that the build process can write to, potentially leading to persistent RCE by clobbering shell/profile or other executable/config files.

Manual Review Advice: A vulnerability from this advisory is reachable if you are running rollup --input

Fix: Upgrade this library to at least version 4.59.0 at core/core-web/yarn.lock:19323.

Reference(s): GHSA-mw96-cpmx-2vgc, CVE-2026-27606

Semgrep found 1 ssc-d1b4e9e7-4dae-4218-8bb1-046e9a0b7e60 finding:

Risk: Affected versions of next are vulnerable to Deserialization of Untrusted Data / Uncontrolled Resource Consumption. A flaw in React Server Components' deserialization allows an attacker to send a specially crafted HTTP request to any App Router Server Function endpoint in Next.js, triggering excessive CPU usage, out-of-memory conditions, or a server crash and resulting in a denial of service.

Fix: Upgrade this library to at least version 15.0.8 at core/core-web/yarn.lock:16728.

Reference(s): GHSA-h25m-26qc-wcjf, CVE-2026-23864

Semgrep found 1 ssc-b94a740c-3b13-43fd-9f2d-4d8bb0fe0b69 finding:

Risk: Affected versions of next are vulnerable to Dependency on Vulnerable Third-Party Component / Deserialization of Untrusted Data / Uncontrolled Resource Consumption. An attacker can send a specially crafted HTTP request to any Server Function endpoint (as used by Next.js' App Router) that, when deserialized by the React Server Components runtime, enters an infinite loop—hanging the server process, exhausting CPU, and resulting in a denial-of-service.

Fix: Upgrade this library to at least version 14.2.35 at core/core-web/yarn.lock:16728.

Reference(s): GHSA-5j59-xgg2-r9c4, CVE-2025-67779

Semgrep found 1 ssc-74b4cbd5-76e9-40fe-adb6-38be9f569d24 finding:

Risk: Affected versions of next are vulnerable to Dependency on Vulnerable Third-Party Component / Deserialization of Untrusted Data / Uncontrolled Resource Consumption. A flaw in Next.js's App Router deserialization allows an attacker to send a specially crafted HTTP request body that, when parsed by the server, triggers excessive CPU work or an infinite loop. By targeting any App Router endpoint with this malicious payload, the server process can hang and become unresponsive, resulting in a denial-of-service.

Fix: Upgrade this library to at least version 14.2.34 at core/core-web/yarn.lock:16728.

Reference(s): GHSA-mwv6-3258-q52c

Semgrep found 1 ssc-cee3e6d5-d7c8-4c35-9815-076aa1ebfd49 finding:

Risk: Affected versions of rollup are vulnerable to Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting').

Manual Review Advice: A vulnerability from this advisory is reachable if you use Rollup to bundle JavaScript with import.meta.url and the output format is set to cjs, umd, or iife formats, while allowing users to inject scriptless HTML elements with unsanitized name attributes

Fix: Upgrade this library to at least version 4.22.4 at core/core-web/yarn.lock:19323.

Reference(s): GHSA-gcx4-mw62-g8wm, CVE-2024-47068

Semgrep found 1 ssc-37ae9e0a-cbf0-4910-8f73-04f2275899a6 finding:

Risk: webpack 5.x before 5.76.0 is vulnerable to Improper Access Control due to ImportParserPlugin.js mishandling the magic comment feature. Due to this, webpack does not avoid cross-realm object access and an attacker who controls a property of an untrusted object can obtain access to the real global object.

Manual Review Advice: A vulnerability from this advisory is reachable if you host an application utilizing webpack and an attacker can control a property of an untrusted object

Fix: Upgrade this library to at least version 5.76.0 at core/core-web/yarn.lock:21903.

Reference(s): GHSA-hc6q-2mpp-qw7j, CVE-2023-28154

@semgrep-code-dotcms-test

Semgrep found 1 spring-tainted-path-traversal finding:

  • dotCMS/src/main/java/com/dotcms/rest/ContentResource.java

The application builds a file path from potentially untrusted data, which can lead to a path traversal vulnerability. An attacker can manipulate the path which the application uses to access files. If the application does not validate user input and sanitize file paths, sensitive files such as configuration or user data can be accessed, potentially creating or overwriting files. To prevent this vulnerability, validate and sanitize any input that is used to create references to file paths. Also, enforce strict file access controls. For example, choose privileges allowing public-facing applications to access only the required files. In Java, you may also consider using a utility method such as org.apache.commons.io.FilenameUtils.getName(...) to only retrieve the file name from the path.
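
The finding concerns Java code, where the advisory points to FilenameUtils.getName(...). As a language-neutral illustration of the same mitigation, here is a Python analogue: keep only the final path component, then confirm the resolved path stays inside the base directory. `safe_join` is a hypothetical helper and is not part of dotCMS.

```python
import os
from pathlib import Path


def safe_join(base_dir: str, untrusted_name: str) -> Path:
    """Join an untrusted file name onto base_dir without escaping it."""
    # Normalise Windows separators, then drop every directory component.
    name = os.path.basename(untrusted_name.replace("\\", "/"))
    if name in ("", ".", ".."):
        raise ValueError(f"unusable file name: {untrusted_name!r}")
    base = Path(base_dir).resolve()
    candidate = (base / name).resolve()
    # Defence in depth: the resolved path must still live under base.
    if base not in candidate.parents:
        raise ValueError(f"path escapes base directory: {untrusted_name!r}")
    return candidate
```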

Dataflow (dotCMS/src/main/java/com/dotcms/rest/ContentResource.java): source multipart (line 1424) → multipartPUTandPOST (line 1428) → multipart (line 1484) → part (line 1499) → processFile (line 1581) → part (line 1613) → badFileName (line 1616) → filename (line 1617) → sink tmpFolder.getAbsolutePath() + File.separator + filename (line 1632).

Semgrep found 1 tainted-file-path finding:

Detected user input controlling a file path. An attacker could control the location of this file, to include going backwards in the directory with '../'. To address this, ensure that user-controlled variables in file paths are sanitized. You may also consider using a utility method such as org.apache.commons.io.FilenameUtils.getName(...) to only retrieve the file name from the path.

Dataflow (dotCMS/src/main/java/com/dotcms/rest/ContentResource.java): source multipart (line 1424) → multipartPUTandPOST (line 1428) → multipart (line 1484) → part (line 1499) → processFile (line 1581) → part (line 1613) → badFileName (line 1616) → filename (line 1617) → sink new File(tmpFolder.getAbsolutePath() + File.separator + filename) (line 1631).

nollymar previously approved these changes Mar 24, 2026
spbolton added a commit that referenced this pull request Mar 25, 2026
…ale message, step numbering

Address review feedback from #34866:
- Fix continue-on-error heuristic: check if later non-cleanup steps *ran* (not
  just succeeded), so consecutive continue-on-error failures are correctly labeled
  instead of the first one being misidentified as the job killer.
- Fix stale "No ##[error] lines" message to reflect expanded error pattern set.
- Fix SKILL.md step numbering gap (was 1,3,4,5,6,7,8 → now 1-7).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@spbolton (Member, Author)

Addressed the review in e80777b. Assessment of each issue:

| # | Issue | Action | Notes |
|---|-------|--------|-------|
| 1 | present_complete_diagnostic extra arg | No change | evidence.py:748 already accepts workspace: Optional[Path] = None. Reviewer couldn't see the signature since evidence.py wasn't in the PR diff. |
| 2 | is_cleanup_step too broad | No change | GitHub Actions auto-generates the "Post " prefix for cleanup steps. User-defined steps named that way are extremely rare in practice. |
| 3 | Consecutive continue-on-error mislabeled | Fixed | Changed heuristic from "any later non-cleanup step succeeded" to "any later non-cleanup step ran". The old check broke when multiple consecutive continue-on-error steps failed before the real killer. |
| 4 | html_scraper not available | No change | utils/html_scraper.py exists in the repo. Reviewer couldn't see it since it wasn't in the diff. |
| 5 | setup() fetches unconditionally | No change | Refactoring adds complexity for marginal gain (cached on second run anyway). |
| 6 | Stale error message | Fixed | Updated to "No error lines found in log." to reflect the expanded pattern set. |
| 7 | Dead get_workflow_run_annotations | No change | Keeping as API-based fallback if the HTML scraper breaks. |
| Minor | SKILL.md step numbering | Fixed | Renumbered from 1,3–8 to sequential 1–7. |

spbolton and others added 4 commits March 25, 2026 10:01
…ror handling (#34865)

Add DiagnosticError, preflight_check(), and _run_gh() wrapper to replace
raw subprocess calls with actionable error messages. Fix dead-code bug
where Path objects were checked with truthiness instead of .exists().

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
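
The truthiness bug mentioned in this commit message is easy to reproduce: a pathlib.Path object is always truthy, whether or not the file exists, so `if some_path:` never takes the "missing" branch. A minimal sketch, with `needs_download` as a hypothetical helper name:

```python
from pathlib import Path


def needs_download(cached_log: Path) -> bool:
    """Return True when the cached log file is absent and must be fetched."""
    # `if cached_log:` would always be true here, because Path defines no
    # __bool__ or __len__; the check has to ask the filesystem explicitly.
    return not cached_log.exists()
```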
…tection, and improved error extraction

- Add diagnose.py as single entry point with progressive subcommands
  (--metadata, --jobs, --annotations, --logs, --evidence)
- fetch-jobs.py now shows all jobs + step-level detail with
  continue-on-error detection (excludes Post/cleanup steps)
- Improved error extraction: catches npm error, BUILD FAILURE in
  addition to ##[error] lines
- WORKFLOWS.md documents trunk deployment step behavior including
  continue-on-error flags and artifact-run-id propagation chain
- SKILL.md rewritten: triage-first approach, diagnose.py as primary
  tool, explicit guidance against ad-hoc gh/python commands
- github_api.py: preflight validates dotCMS/core checkout,
  DiagnosticError class, _run_gh wrapper with actionable errors
- settings.json: add permissions for diagnostic script execution

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…tions

Remove init-diagnostic.py, fetch-metadata.py, fetch-jobs.py,
fetch-logs.py, and fetch-annotations.py. All functionality is
covered by diagnose.py with progressive subcommands.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ale message, step numbering

Address review feedback from #34866:
- Fix continue-on-error heuristic: check if later non-cleanup steps *ran* (not
  just succeeded), so consecutive continue-on-error failures are correctly labeled
  instead of the first one being misidentified as the job killer.
- Fix stale "No ##[error] lines" message to reflect expanded error pattern set.
- Fix SKILL.md step numbering gap (was 1,3,4,5,6,7,8 → now 1-7).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@spbolton spbolton force-pushed the issue-34865-harden-cicd-diagnostics branch from e80777b to feaabe2 Compare March 25, 2026 10:17
@spbolton spbolton enabled auto-merge March 30, 2026 11:41
@spbolton spbolton added this pull request to the merge queue Mar 30, 2026
Merged via the queue into main with commit e10d2f6 Mar 30, 2026
27 checks passed
@spbolton spbolton deleted the issue-34865-harden-cicd-diagnostics branch March 30, 2026 16:15

Development

Successfully merging this pull request may close these issues.

Harden cicd-diagnostics skill scripts with preflight checks and error handling

2 participants