Skip to content

feat(#5352): healthgroups endpoint#5416

Open
ulischulte wants to merge 6 commits into
masterfrom
feature/5352-healthgroups-endpoint
Open

feat(#5352): healthgroups endpoint#5416
ulischulte wants to merge 6 commits into
masterfrom
feature/5352-healthgroups-endpoint

Conversation

@ulischulte

Copy link
Copy Markdown
Contributor

Closes #5352:

Added caching and a dedicated REST endpoint for health groups.

When SBA polls an application's /health endpoint, the response may include a groups field (e.g., ["liveness", "readiness"]). Previously this data was fetched on-the-fly from the client. Now:

  • HealthGroupsCache — A ConcurrentHashMap-based cache stores health groups per instance, updated automatically during each health status poll.
  • GET /instances/{id}/health-groups — A new REST endpoint serves the cached groups directly, avoiding repeated calls to client applications.
  • HealthGroupsCacheCleanupTrigger — Listens for InstanceDeregisteredEvent and evicts stale cache entries to prevent memory leaks.
  • UI integration — The health details view (details-health.vue) now calls the new server-side endpoint instead of fetching groups from the client's health response.

Key design decisions

  • Cache is populated as a side effect of the existing StatusUpdater health polling (no extra HTTP calls)
  • Thread-safe via ConcurrentHashMap + List.copyOf() for immutable snapshots
  • Backward-compatible: the protected convertStatusInfo(ClientResponse) API in StatusUpdater is preserved for subclasses
  • @ConditionalOnMissingBean on all new beans allows customization

@ulischulte ulischulte marked this pull request as ready for review June 5, 2026 15:18
@ulischulte ulischulte requested a review from a team as a code owner June 5, 2026 15:18
* specific data in {@link HealthGroupsCache} on receiving an
* {@link InstanceDeregisteredEvent}.
*/
public class HealthGroupsCacheCleanupTrigger extends AbstractEventHandler<InstanceDeregisteredEvent> {

@cdprete cdprete Jun 6, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems a bit an overkill to me.

It would be enough to invoke the deletion in the StatusUpdater with a doOnNext as it's already done to persist the groups (actually, it's enough to add an if there to check the event type and either call the save or the delete method).
We're scheduling a thread to constantly check for a single even type, which barely happens, where the StatusUpdater is already handling all of them anyway.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StatusUpdater polls the /health endpoint — it does not receive InstanceDeregisteredEvent. A deregistered instance simply stops being polled (doUpdateStatus returns early if !instance.isRegistered()), so eviction would never fire there.

I'll investigate a little further and probably will either
(a) keep a lightweight event subscription but reuse an existing handler/scheduler, or
(b) accept the small overhead since cleanup-on-deregister is the established pattern in SBA

@cdprete cdprete Jun 12, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StatusUpdater polls the /health endpoint — it does not receive InstanceDeregisteredEvent.

Indeed. For a moment I had the impression it's handled there because the journal view shows all of them.

I'll investigate a little further and probably will either
(a) keep a lightweight event subscription but reuse an existing handler/scheduler, or
(b) accept the small overhead since cleanup-on-deregister is the established pattern in SBA

Wouldn't be enough to have a Spring Boot's @EventListener for the specific application event and remove the entry from the cache when it's received?


describe('SSE reactive updates', () => {
it('should call fetchHealth once on mount, not on SSE version changes', async () => {
it('should call fetchCachedHealthGroups once on mount, not on SSE version changes', async () => {

@cdprete cdprete Jun 6, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I think we should call the fetchHealthGroups when an SSE is received now, otherwise stale data may be potentially used.

This comment is not specifically related to this test case, but it's a more general statement.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, and agreed in principle. The current else branch only invalidates the per-group details (group.data = null) but keeps the previously fetched group list, so a group added/removed at runtime would indeed show stale entries until the instance id changes or the view is remounted (although this seems to be an edge case).

I'll update onInstanceChanged so the same-instance branch also calls fetchHealthGroups() (which already resets healthGroupOpenStatus/healthGroupLoadingMap and clears stale data), making the displayed group list self-correcting on SSE updates.

@cdprete cdprete Jun 12, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep.
Also because, from @SteKoe in a previous PR about a Thymeleaf fix, I discovered that the configuration properties can be changed at runtime from SBA.

Probably, to cover this case as well, we would need SSE also for the health groups, but I don't know how much worthy it really is. Unless a STATUS_CHANGED even is also created when the health response body changes in general and not just by the status (UP, DOWN and so on).
Performance and code clearity wise, it would then indeed be better to update the model of the instance to also have the health groups in it.

@SteKoe SteKoe changed the title Feature/5352 healthgroups endpoint feat(#5352): healthgroups endpoint Jun 12, 2026
this.cache.remove(instanceId);
}
else {
this.cache.put(instanceId, List.copyOf(groups));

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should ensure that groups are unique, either here by using a LinkedHashSet instead of a List or in the UI to avoid potentially duplicated entries.

if (Array.isArray(res.data.groups)) {
this.healthGroups = res.data.groups.map((name: string) => ({
if (Array.isArray(res.data)) {
this.healthGroups = res.data.map((name: string) => ({

@cdprete cdprete Jun 17, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @ulischulte.

Will this assignment trigger a UI re-rendering?
If it's so, we should maybe check the content of the response first and set the new state (health groups + reset of state of the accorditions) only if there is indeed a change in the collection of the returned group names.
Otherwise, on every fetch, we'll be re-rendering the UI for nothing if nothing really has changed in the context of the health groups.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it will trigger a re-rendering of UI. I will add a check to prevent this in terms of UX, as it will also reset the collapsed state. Changing groups in backend will not happen that often, though.

@cdprete cdprete Jun 20, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing groups in backend will not happen that often, though.

I was wondering exactly because their update will be very rare.

ulischulte and others added 6 commits June 19, 2026 12:06
# Conflicts:
#	spring-boot-admin-server/src/test/java/de/codecentric/boot/admin/server/services/StatusUpdaterTest.java
…in/server/services/StatusUpdater.java

Co-authored-by: Cosimo Damiano Prete <8491864+cdprete@users.noreply.github.com>
@SteKoe SteKoe force-pushed the feature/5352-healthgroups-endpoint branch from fa4bcf7 to ffe3250 Compare June 19, 2026 10:07
@cdprete

cdprete commented Jun 20, 2026

Copy link
Copy Markdown
Contributor

Hi @ulischulte.

What do you think about the idea of adding a metadata boolean property that the applications under monitoring can populate to indicate if they use health groups or not?
In this way, the additional HTTP call can be avoided if we already know in advance that a particular application is anyway not exposing (or not willing to expose) any health group.
If the property is missing, then we default it to true and we try to fetch the health groups anyway.

Moreover, another optimization (to be verified if it's indeed possible; for now it's just an idea) would be to not just return the group names, but the name of the health indicators they include as well. For example: my-group: [db, ldap].
In this way, in the UI, their health can be shown directly - since we already have the health of the single health indicators available as part of the status update event - without having to fetch it explicitly.

I'm still in favour of adding them to the STATUS_UPDATED event, to be honest - especially after the changes done for #5450 - but so be it :)

this.healthGroupLoadingMap = {};
}
// Re-fetch the (server-cached) group list on every instance change
this.fetchHealthGroups();

@cdprete cdprete Jun 20, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should do this only if the instance has changed in some meaningful way regarding its status, which means to not do it if the event that gets sent is INFO_UPDATED.

If the process InfoContributor is enabled, the data returned by https://docs.spring.io/spring-boot/api/rest/actuator/info.html changes on every poll, which would mean we're re-fetching the health groups over and over for no real reason.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Enhancement: Cache health groups and expose endpoint to fetch them

3 participants