fix: remove transition to CheckingFirmwareRepeatV2 for rms fw updates #2022

Closed
anunna0 wants to merge 2 commits into NVIDIA:main from anunna0:remove_transition_to_checkingfirmwarerepeat

@anunna0 anunna0 commented May 29, 2026

Contributor

Description

After RMS completed the SOT JSON firmware update, BMM was still sending the host into CheckingFirmwareRepeatV2.

That state uses BMM’s old local firmware config, not the SOT JSON. So BMM could decide HGXBmc still needed an update and start a second legacy Redfish firmware update. That second update failed and put the host in FailedFirmwareUpgrade.

Fix: when rack/RMS firmware completes, go straight to the completed host state (Ready) instead of CheckingFirmwareRepeatV2.
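The before/after behaviour described above can be sketched roughly as follows. This is a simplified, hypothetical model and not the actual BMM code: only `RackFirmwareUpgradeState::Completed`, `CheckingFirmwareRepeatV2`, `FailedFirmwareUpgrade`, and `Ready` come from this PR; the `InProgress` and `Failed` variants, the `WaitingForRackFirmware` state, the `HostState` enum, and both function names are assumptions made for illustration.

```rust
// Hypothetical, simplified model of the transition discussed in this PR.
// These are NOT the real BMM types.

#[derive(Debug, Clone, PartialEq)]
enum RackFirmwareUpgradeState {
    InProgress, // assumed variant
    Completed,  // named in the PR diff
    Failed,     // assumed variant
}

#[derive(Debug, Clone, PartialEq)]
enum HostState {
    WaitingForRackFirmware, // assumed placeholder for "still updating"
    CheckingFirmwareRepeatV2,
    FailedFirmwareUpgrade,
    Ready,
}

/// Behaviour described as the bug: a completed rack/RMS update still routes
/// the host through CheckingFirmwareRepeatV2, where the old local firmware
/// config (not the SOT JSON) may trigger a second legacy update.
fn next_state_before(status: &RackFirmwareUpgradeState) -> HostState {
    match status {
        RackFirmwareUpgradeState::Completed => HostState::CheckingFirmwareRepeatV2,
        RackFirmwareUpgradeState::Failed => HostState::FailedFirmwareUpgrade,
        RackFirmwareUpgradeState::InProgress => HostState::WaitingForRackFirmware,
    }
}

/// Behaviour proposed by the fix: a completed rack/RMS update goes straight
/// to the completed host state, which the caller supplies.
fn next_state_after(
    status: &RackFirmwareUpgradeState,
    completed_state: HostState,
) -> HostState {
    match status {
        RackFirmwareUpgradeState::Completed => completed_state,
        RackFirmwareUpgradeState::Failed => HostState::FailedFirmwareUpgrade,
        RackFirmwareUpgradeState::InProgress => HostState::WaitingForRackFirmware,
    }
}
```

In this sketch the completed state is passed in as a parameter rather than hard-coded, so the caller decides which host state counts as "completed" for a given machine.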

Type of Change

  • Add - New feature or capability
  • Change - Changes in existing functionality
  • Fix - Bug fixes
  • Remove - Removed features or deprecated functionality
  • Internal - Internal changes (refactoring, tests, docs, etc.)

Related Issues (Optional)

#2016

Breaking Changes

  • This PR contains breaking changes

Testing

  • Unit tests added/updated
  • Integration tests added/updated
  • Manual testing performed
  • No testing required (docs, internal refactor, etc.)

Additional Notes

copy-pr-bot Bot commented May 29, 2026


Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@anunna0 anunna0 marked this pull request as ready for review May 29, 2026 23:16
@anunna0 anunna0 requested a review from a team as a code owner May 29, 2026 23:16

@zhaozhongn zhaozhongn left a comment

Contributor

Please wait for one more approval from your team.

}

let next_state = match &rack_fw_status.status {
model::rack::RackFirmwareUpgradeState::Completed => scenario.actual_new_state(

Contributor

Does this change the legacy updates in any way? Or should they never set that particular status field?

Contributor Author

It should only affect rack firmware updates, which I believe have no flow outside of RMS/SOT.

Contributor

It breaks Assigned machines.

@vinodchitraliNVIDIA

Contributor

This won't work for Assigned/Ready. Don't merge this code.

@vinodchitraliNVIDIA vinodchitraliNVIDIA self-requested a review May 30, 2026 05:34
@anunna0 anunna0 closed this Jun 1, 2026

4 participants