Fix master builds to work with new rc release #69007

Merged
dwoz merged 18 commits into saltstack:master from dwoz:masterfix
May 2, 2026

@dwoz (Contributor) commented Apr 29, 2026

Master is now 3009.x and 3008.x should run with nightly builds.

This works around the 3008.x deprecation warnings for now; they will be resolved when we merge forward from 3008.x. See #68989

dwoz requested a review from a team as a code owner April 29, 2026 04:15
dwoz added the test:full (Run the full test suite) label Apr 29, 2026
dwoz and others added 15 commits April 29, 2026 01:29
Running the packaging helper as a script breaks imports on Windows
(ModuleNotFoundError: tools). Invoke it as a module like other tooling.
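
The general mechanism behind this kind of failure can be reproduced in isolation. The sketch below is an illustration only, not the packaging helper itself: it builds a throwaway package named ``demo_tools`` (a hypothetical name) whose helper imports its own top-level package, then runs that helper once as a script and once with ``-m``. When run as a script, Python puts the script's directory at the front of ``sys.path`` rather than the project root, so the top-level import is not found.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def run_both_ways():
    """Run one helper as a script and as a module; return both results."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        pkg_dir = root / "demo_tools" / "pkg"  # hypothetical layout
        pkg_dir.mkdir(parents=True)
        (root / "demo_tools" / "__init__.py").write_text("")
        (pkg_dir / "__init__.py").write_text("")
        helper = pkg_dir / "helper.py"
        # The helper imports its own top-level package.
        helper.write_text("import demo_tools\n")

        # Drop variables that would change how sys.path is built.
        env = {
            key: value
            for key, value in os.environ.items()
            if key not in ("PYTHONPATH", "PYTHONSAFEPATH")
        }
        common = dict(cwd=root, env=env, capture_output=True, text=True)

        # Script mode: sys.path[0] is the helper's own directory.
        as_script = subprocess.run([sys.executable, str(helper)], **common)
        # Module mode: sys.path[0] is the current directory (the root).
        as_module = subprocess.run(
            [sys.executable, "-m", "demo_tools.pkg.helper"], **common
        )
        return as_script, as_module
```

Comparing ``returncode`` and ``stderr`` of the two results shows the script invocation failing on the import while the module invocation succeeds.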

Made-with: Cursor
Two causes behind every Linux/macOS/Windows Test Salt and Test Package
failure in CI:

1. salt/modules/vsphere.py: warn_until(3008, ...) fires as a hard
   RuntimeError under RAISE_DEPRECATIONS_RUNTIME_ERRORS=1 now that we
   are at 3008.0. Bump to 3009 since the module was not actually
   removed in 3008.

2. salt/features.py: the setup_features() rewrite looped over every
   feature flag and emitted a warn_until per flag, printing to stderr
   on every salt-call invocation. Package tests asserting clean stderr
   then failed. Restore the class-based structure so the deprecation
   warning only fires when callers explicitly use salt.features.get(),
   and bump that warn_until to 3009 as well.
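
As a rough illustration of why a stale gate turns into a hard failure, here is a minimal sketch of a version-gated deprecation helper. It is not Salt's ``warn_until`` implementation: the function name, the tuple version format, and the ``strict`` parameter (standing in for RAISE_DEPRECATIONS_RUNTIME_ERRORS=1) are assumptions made for the example.

```python
import warnings


def warn_until_sketch(removal_version, message, current_version, strict=False):
    """Warn about a deprecation, or raise once the gate has expired.

    Versions are plain tuples such as (3008, 0). ``strict`` is a stand-in
    for a CI switch that turns expired gates into errors (assumption).
    """
    expired = current_version >= removal_version
    if expired and strict:
        raise RuntimeError(
            f"Deprecation gate {removal_version} expired at "
            f"{current_version}: {message}"
        )
    # Gate not yet reached (or strict mode is off): warn only.
    warnings.warn(message, DeprecationWarning, stacklevel=2)
```

With ``current_version=(3008, 0)``, a gate of ``(3008, 0)`` raises in strict mode, while bumping the gate to ``(3009, 0)`` goes back to emitting a warning.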

Made-with: Cursor
Set RAISE_DEPRECATIONS_RUNTIME_ERRORS to 0 in the layout template (which
regenerates ci.yml, nightly.yml, staging.yml, scheduled.yml) and directly in
test-action.yml and test-packages-action.yml while the remaining deprecation
warnings in the codebase are cleaned up.

Made-with: Cursor
Set RAISE_DEPRECATIONS_RUNTIME_ERRORS to 0 in all remaining workflow files
(build-deps-ci-action.yml, depcheck.yml, release.yml) and in
tools/container.py, which was hardcoding "1" and overriding the
workflow-level setting.

Also reverts the bad-branch changes to salt/features.py: removes the
ImmutableDict-breaking opts mutation block and restores warn_until(3008)
to match 3008.x exactly.

Made-with: Cursor
CI sets RAISE_DEPRECATIONS_RUNTIME_ERRORS=1. With Salt at 3008.0+, warn_until(3008, ...) and version="Argon" are treated as expired and raise RuntimeError. Bump numeric gates to 3009, use Potassium for file.shortcut, and align namespaced_function test expectations.

Made-with: Cursor
Correct RST title adornment length for salt.runners.index. Add 3009.0
release stub and template, and include 3009.* in the releases toctree
so Prepare Release / sphinx -W man builds do not warn on orphan pages.
Three independent failures observed on the 3008.0rc1 matrix rows of CI run
25205603902 (PR saltstack#69007), all reproduced locally before fixing:

* tests/pytests/pkg/upgrade/test_salt_upgrade.py — the pre-upgrade PyGithub
  probe wrapped its assertions in ``try/except AssertionError → pytest.skip``,
  which silently skipped the entire upgrade test whenever the salt loader did
  not surface ``github.get_repo_info`` after ``pip.install``. The systemd
  fixture had already left the system at prev_version, so stage 2
  (``--no-install``) then failed on ``test_salt_version``. Replace the skip
  with informational logging and a ``pip_pretest_ok`` flag; gate only the
  post-upgrade re-probe on it. The actual upgrade now runs unconditionally.
  Reproduced and validated on Amazon Linux 2023 and macOS 15 (Intel).

* tests/support/pkg.py — the apt pin file was written with the raw PEP440
  ``prev_version`` (e.g. ``3008.0rc1``), but Debian/Ubuntu publishes the
  pre-release with a tilde (``3008.0~rc1``). Apt pin matching is exact, so
  the pin selected nothing and the locally-installed dev build won. Map
  ``prev_version`` through ``pep440_version_to_rpm_nevra_version`` before
  writing the pin (the helper already produces the tilde form and is a no-op
  on stable releases). Reproduced on Debian 12 downgrade-3008.0rc1.

* tests/unit/modules/test_nxos.py — drop the nine test methods that exercised
  ``nxos.cmd``, ``nxos.show``, ``nxos.system_info``, and ``nxos.add_config``.
  Those functions were removed from ``salt/modules/nxos.py`` in e6b981e
  to clear RAISE_DEPRECATIONS_RUNTIME_ERRORS; the matching test cleanup was
  meant to land in the same change but did not. Also drop the two now-unused
  imports flagged by pylint. Reproduced on Rocky Linux 9 unit-1 chunk.
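
The version mismatch in the apt pin item can be shown with a small sketch. This is not the ``pep440_version_to_rpm_nevra_version`` helper; ``to_tilde_prerelease`` is a hypothetical stand-in that only handles the simple ``a``/``b``/``rc`` suffix case and leaves every other string unchanged.

```python
import re

# Release digits followed directly by a PEP 440 pre-release marker.
_PRERELEASE = re.compile(r"^(?P<base>\d+(?:\.\d+)*)(?P<pre>(?:a|b|rc)\d+)$")


def to_tilde_prerelease(version):
    """Insert a tilde before a pre-release suffix; no-op otherwise."""
    match = _PRERELEASE.match(version)
    if match is None:
        return version  # stable releases and unknown shapes pass through
    return f"{match.group('base')}~{match.group('pre')}"
```

Because the pin has to match the published version string exactly, ``3008.0rc1`` and ``3008.0~rc1`` are different pins even though they name the same release.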
sujitdb previously approved these changes May 1, 2026

@sujitdb (Collaborator) left a comment:

LGTM but since salt/auth/pki.py is removed just wanted to make sure all the test related to removed

``win_pkg.remove`` polls ``list_pkgs`` after the underlying uninstaller
exits, waiting for the Windows Add/Remove Programs registry entry to
clear so the change shows up in the returned ``difference`` dict. The
default 3-second wait was racing the state-level post-check on CI:
Chocolatey-driven Notepad++ (``npp``) regularly returned ``retcode=0``
before the registry entry was gone, the wait loop timed out, the module
returned only ``{'uninstall status': 'success'}`` (no version diff), and
``salt.states.pkg._uninstall`` then ran its own ``pkg.list_pkgs``, still
saw ``npp`` listed, and reported ``result=False``,
``"The following packages failed to remove: npp."``.

Reproduced cleanly on a Windows 11 box with the failing CI artifacts
(salt 3008.0+1111.g76078cd1a4 onedir + nox-windows-amd64 cache):
``test_pkg_005_installed_32bit`` and
``test_pkg_006_installed_32bit_with_version`` failed identically to CI
runs 25205603902 and 25211846568. Polling ``salt-call --local
pkg.list_pkgs`` ~30 seconds after the test exit confirmed the registry
had cleaned itself up — the lag just exceeded the 3-second budget.

Bump the budget to 30 seconds. Quick removes still exit on
``found_chgs`` and pay no extra cost; only laggy uninstallers wait
longer.
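
The wait-with-budget behaviour described here can be sketched as follows. This is an illustration under assumptions, not the ``win_pkg.remove`` code: ``wait_for_changes``, its parameters, and the dict-of-versions snapshot format are all hypothetical.

```python
import time


def wait_for_changes(snapshot, before, timeout=30.0, interval=0.5,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll ``snapshot()`` until a package from ``before`` disappears.

    Returns the removed packages as {name: old_version}, or an empty
    dict if the time budget runs out first.
    """
    deadline = clock() + timeout
    while True:
        after = snapshot()
        changes = {
            name: version
            for name, version in before.items()
            if name not in after
        }
        if changes:
            return changes  # quick removals exit here at no extra cost
        if clock() >= deadline:
            return {}  # budget exhausted, nothing changed
        sleep(interval)
```

A larger ``timeout`` only affects the case where nothing has changed yet, which is why raising the budget does not slow down removals that register immediately.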
sujitdb previously approved these changes May 1, 2026
twangboy previously approved these changes May 1, 2026
dwoz added 2 commits May 1, 2026 16:00
``doc/man-archive.tar.xz`` is generated by ``nox -e docs`` (noxfile.py
session writes it via ``tar -cJvf man-archive.tar.xz _build/man``) and
should not be in the repository. It was added inadvertently in
d7e14c7 alongside legitimate package
build fixes. Remove the binary blob and add the path to ``.gitignore``
next to the existing ``doc/doc-archive.tar.gz`` entry so a future docs
build does not re-introduce it.
Investigation of the recurring ``test_pkg_005_installed_32bit`` /
``test_pkg_006_installed_32bit_with_version`` failures on Windows 2022
and 2025 CI runners traced the root cause to a long-standing
NSIS-installer bug rather than a registry-update timing race:

  Notepad++'s silent uninstaller (``uninstall.exe /S``) returns
  ``retcode=0``, deletes its own files (including ``uninstall.exe``),
  but leaves the
  ``HKLM\Software\Microsoft\Windows\CurrentVersion\Uninstall\Notepad++``
  key in place. ``UninstallString`` continues to point at the now
  missing ``C:\Program Files (x86)\Notepad++\uninstall.exe``.

``salt.modules.win_pkg.list_pkgs`` reads the Add/Remove Programs
registry hives and reports the package as still installed. The
``pkg.removed`` state's post-check then trips and reports
``"The following packages failed to remove: npp."``, even though the
package is functionally gone. Because the registry never self-clears,
no amount of waiting in ``win_pkg.remove`` (neither the previous 3-second
loop nor the bump to 30s in 01ff6b4) makes
a difference; the prior commit is reverted here.

Fix: teach ``_get_reg_software.skip_uninstall_string`` to additionally
skip entries whose ``UninstallString`` parses to an absolute executable
path that does not exist on disk. A new module-level helper
``_uninstall_string_is_orphan`` performs the parse and existence check.
The implementation is conservative: empty strings, unquoted commands
that don't resolve to an absolute path (e.g. ``MsiExec.exe /X{...}``),
quoted strings that fail to close, and any unexpected shape are all
treated as not-orphan so legitimate entries continue to surface.
Environment variables in the path are expanded before the existence
check.
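
A conservative orphan check of this kind might look like the sketch below. It is not the ``_uninstall_string_is_orphan`` helper from this change: the function name, the injectable ``exists`` callable, and the rule of cutting unquoted commands at the first ``.exe`` are assumptions for illustration. ``ntpath`` is used so that Windows path rules apply regardless of the platform running the example.

```python
import ntpath
import os


def uninstall_string_is_orphan(uninstall_string, exists=os.path.exists):
    """Return True only if the command clearly names a missing executable."""
    if not isinstance(uninstall_string, str):
        return False
    command = uninstall_string.strip()
    if not command:
        return False  # empty string: not orphan

    if command.startswith('"'):
        end = command.find('"', 1)
        if end == -1:
            return False  # quote never closes: unexpected shape
        path = command[1:end]
    else:
        # Assumption: an unquoted command ends at the first ".exe".
        index = command.lower().find(".exe")
        if index == -1:
            return False
        path = command[: index + len(".exe")]

    # Expand %VAR% style variables before checking the path.
    path = ntpath.expandvars(path)
    if not ntpath.isabs(path):
        return False  # e.g. "MsiExec.exe /X{...}" resolves via PATH
    return not exists(path)
```

Every ambiguous case returns ``False``, so an entry is hidden only when the parsed path is absolute and the existence check fails.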

Validated end-to-end on a Windows 11 box reproducing the CI failure
exactly (salt 3008.0+1111 onedir + nox-windows-amd64 cache, real
``salt-winrepo-ng`` ``npp`` install/uninstall): both
``test_pkg_005_installed_32bit`` and
``test_pkg_006_installed_32bit_with_version`` PASS in 56s, same total
wall time as the failing baseline. The orphan registry entry is
created by the same npp uninstaller as before; ``pkg.list_pkgs`` simply
no longer reports it.

3 participants