feat: Expose last seen scout version for a machine#2037

Open
ericpretzel wants to merge 5 commits into
NVIDIA:mainfrom
ericpretzel:expose-scout-version

@ericpretzel
Contributor

Description

When a machine is discovered with forge-scout, the version of forge-scout that was booted is currently not saved anywhere and can be lost. This PR has forge-scout report its version (carbide_version::v!(build_version)) to the API on machine discovery; the API persists it in the machines table as last_seen_scout_version, overwriting the value on each re-discovery. Existing machines will report null until their next discovery.

Note that dpu-agent also performs discovery and will report its version for consistency, but unlike forge-scout, the API currently won't do anything with it because dpu-agent's current running version is already reported elsewhere and is frequently updated.
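
A minimal sketch of the persistence semantics described above, using an in-memory struct in place of the real machines table. `MachineRow`, `DiscoveryReporter`, and `record_discovery` are hypothetical names for illustration only, not the code in this PR. Leaving the stored value untouched when scout sends no version is also an assumption.

```rust
/// Hypothetical, simplified stand-in for the reporter enum in the proto.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DiscoveryReporter {
    Unspecified,
    Scout,
    DpuAgent,
}

/// Hypothetical in-memory stand-in for one row of the machines table.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct MachineRow {
    /// None ("null") until the machine is discovered by a scout that reports a version.
    pub last_seen_scout_version: Option<String>,
}

/// Applies one discovery report to the row.
/// Only a scout report that carries a version overwrites the stored value;
/// every other combination leaves the row unchanged (an assumption of this sketch).
pub fn record_discovery(row: &mut MachineRow, reporter: DiscoveryReporter, version: Option<&str>) {
    if let (DiscoveryReporter::Scout, Some(v)) = (reporter, version) {
        row.last_seen_scout_version = Some(v.to_owned());
    }
}
```

With this shape, a row for an existing machine starts as `None`, a dpu-agent report does not change it, and each scout re-discovery replaces the previous value.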

Type of Change

  • Add - New feature or capability
  • Change - Changes in existing functionality
  • Fix - Bug fixes
  • Remove - Removed features or deprecated functionality
  • Internal - Internal changes (refactoring, tests, docs, etc.)

Related Issues (Optional)

Closes #1614

Breaking Changes

  • This PR contains breaking changes

Testing

  • Unit tests added/updated
  • Integration tests added/updated
  • Manual testing performed
  • No testing required (docs, internal refactor, etc.)

Additional Notes

ericpretzel and others added 2 commits June 1, 2026 05:55
Track and surface the scout version reported during host registration,
persisting it via a new last_seen_scout_version column and exposing it
through the machine API.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Eric Wetzel <ewetzel@nvidia.com>
@ericpretzel ericpretzel requested a review from a team as a code owner June 1, 2026 07:11
Comment thread crates/api-model/src/machine/json.rs Outdated
pub controller_state: ManagedHostState,
pub last_discovery_time: Option<DateTime<Utc>>,
pub last_scout_contact_time: Option<DateTime<Utc>>,
pub last_seen_scout_version: Option<String>,
Contributor

super nit but since we have last_scout_contact_time, maybe your new variable can be last_scout_observed_version? (as in keep the last_scout_* prefix going?)

Comment thread crates/rpc/proto/forge.proto Outdated
}
bool create_machine = 3;
MachineDiscoveryReporter discovery_reporter = 4;
optional string reporter_version = 5;
Contributor

another super nit: discovery_reporter and discovery_reporter_version

)
.await?;
}
}
Contributor

@chet chet Jun 1, 2026

i would turn this into a match on discovery_reporter, with match arms for each variant. ::Scout would obviously be this branch, ::Unspecified would log a warning, and ::DpuAgent would do...oh.. umm... something? i dont see DpuAgent really being used here at all?

i mean all said, i like the idea of reporting the scout version! just a couple of questions!

Contributor Author

@ericpretzel ericpretzel Jun 1, 2026

Yeah... the API handler doesn't do anything with the version when it gets a DpuAgent reporter. From what I understand, dpu-agent's version is already reported in the machine inventory so it would be redundant to persist it here. I just decided to include it like this for the sake of consistency. scout and dpu-agent both go through this same code path and it felt awkward to try to separate the handler just for this change.
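
For reference, the match suggested in this thread could look roughly like the sketch below. It is an illustration under assumptions, not the handler in this PR: the enum is a simplified stand-in for the generated MachineDiscoveryReporter type, and `VersionAction` and `plan_version_update` are hypothetical names. The function returns a decision value instead of touching the database or a logger so that each arm is easy to test.

```rust
/// Simplified stand-in for the generated proto enum (variant names assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MachineDiscoveryReporter {
    Unspecified,
    Scout,
    DpuAgent,
}

/// Hypothetical description of what the handler should do with a reported version.
#[derive(Debug, PartialEq, Eq)]
pub enum VersionAction {
    /// Persist the version as the machine's last observed scout version.
    PersistScoutVersion(String),
    /// Nothing to persist for this report.
    Ignore,
    /// The caller should log a warning with this message.
    Warn(&'static str),
}

/// One match arm per reporter variant, as proposed in the review comment.
pub fn plan_version_update(
    reporter: MachineDiscoveryReporter,
    version: Option<String>,
) -> VersionAction {
    match (reporter, version) {
        // Scout with a version: this is the branch that persists.
        (MachineDiscoveryReporter::Scout, Some(v)) => VersionAction::PersistScoutVersion(v),
        // Scout without a version (e.g. an older scout build): nothing to store.
        (MachineDiscoveryReporter::Scout, None) => VersionAction::Ignore,
        // dpu-agent's version is already reported through the machine inventory.
        (MachineDiscoveryReporter::DpuAgent, _) => VersionAction::Ignore,
        // An unspecified reporter is unexpected and worth a warning.
        (MachineDiscoveryReporter::Unspecified, _) => {
            VersionAction::Warn("discovery reporter was not specified")
        }
    }
}
```

Because the match is exhaustive, adding a new reporter variant later would fail to compile until the handler decides what to do with it, which is the main benefit over a single `if`.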

ericpretzel and others added 3 commits June 1, 2026 08:26
- Rename last_seen_scout_version -> last_scout_observed_version to keep
  the last_scout_* prefix consistent with last_scout_contact_time
- Rename proto field reporter_version -> discovery_reporter_version for
  consistency with discovery_reporter
- Collapse nested if in discover_machine handler (clippy::collapsible_if)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Eric Wetzel <ewetzel@nvidia.com>
Fixes check-format-nightly: import grouping, .await placement, and
trailing whitespace in the machine discovery handler and tests.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Eric Wetzel <ewetzel@nvidia.com>

Development

Successfully merging this pull request may close these issues.

feat: expose last seen scout image version for a machine

2 participants