Skip to content

feat(uptime): wire alarm notifications on uptime transitions#407

Open
capacitron wants to merge 3 commits intodatabuddy-analytics:stagingfrom
capacitron:bounty-fix
Open

feat(uptime): wire alarm notifications on uptime transitions#407
capacitron wants to merge 3 commits intodatabuddy-analytics:stagingfrom
capacitron:bounty-fix

Conversation

@capacitron
Copy link
Copy Markdown

@capacitron capacitron commented Apr 11, 2026

Fixes #268

Approach

Wire the existing alarms system to uptime monitoring DOWN/RECOVERED transitions. No schema changes — the alarms table already has websiteId and triggerType="uptime".

Changes:

  • apps/uptime/src/uptime-transition-emails.ts — adds sendUptimeAlarmNotificationsIfNeeded() that queries enabled uptime alarms for the monitored websiteId on transition, builds a NotificationClient per alarm from its destinations, and dispatches via buildUptimeNotificationPayload
  • apps/uptime/src/index.ts — calls the new function after sendUptimeTransitionEmailsIfNeeded

Supports all existing destination types: Slack, Discord, Teams, Google Chat, Telegram, and Webhook.

Query enabled uptime alarms for the monitored website on DOWN/RECOVERED
transitions and dispatch notifications via NotificationClient to all
configured destinations (Slack, Discord, Teams, Google Chat, Telegram,
Webhook). No schema changes required.

Closes databuddy-analytics#268
@CLAassistant
Copy link
Copy Markdown

CLAassistant commented Apr 11, 2026

CLA assistant check
All committers have signed the CLA.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 11, 2026

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 6900c703-bfc6-4be5-b9ec-f781f418c829

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@vercel
Copy link
Copy Markdown

vercel bot commented Apr 11, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

2 Skipped Deployments
Project Deployment Actions Updated (UTC)
dashboard Skipped Skipped Apr 11, 2026 5:43am
documentation Skipped Skipped Apr 11, 2026 5:43am

@vercel vercel bot temporarily deployed to Preview – dashboard April 11, 2026 04:00 Inactive
@vercel vercel bot temporarily deployed to Preview – documentation April 11, 2026 04:00 Inactive
Copy link
Copy Markdown
Member

@izadoesdev izadoesdev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👋 Welcome to Databuddy, @capacitron! Thanks for picking up #268 — really appreciate the contribution and the thorough PR description.

Summary

This PR wires the existing alarms system to uptime monitoring transitions (DOWN/RECOVERED). When a monitored site changes state, it queries enabled uptime alarms for that websiteId, builds a NotificationClient per alarm from its configured destinations, and dispatches notifications via Slack, Discord, Teams, Google Chat, Telegram, or Webhook. No schema changes needed.

Review

What's good:

  • Clean implementation that follows the existing NotificationClient dispatch pattern from packages/rpc/src/routers/alarms.ts
  • Proper early returns and null checks throughout
  • Per-alarm error isolation with captureError context
  • No breaking changes, no schema changes, no high-impact pipeline modifications
  • PR description is excellent — clear approach, linked issue, AI disclosure

Issues found:

🟡 Unsafe type assertions for Telegram configcfg.botToken as string and cfg.chatId as string skip runtime validation and could pass undefined into the Telegram client if config is malformed. See inline comment for a fix.

🔵 Sequential alarm dispatch — notifications are sent one-by-one which could slow the uptime pipeline if destinations are slow. Promise.allSettled would parallelize safely. Not blocking since it matches the existing email pattern.

🔵 Webhook headers cast — same as pattern concern for cfg.headers.

Assessment: Looks good for merge after addressing the Telegram type safety issue (🟡). The suggestions (🔵) are nice-to-haves.

Comment on lines +238 to +239
channels.push("google-chat");
} else if (dest.type === "telegram") {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Warning: Unsafe type assertions

cfg.botToken and cfg.chatId could be undefined at runtime if a Telegram destination has malformed or incomplete config. Casting straight to string skips any runtime validation and would pass undefined into the Telegram client silently.

Consider guarding before pushing the channel:

const botToken = typeof cfg.botToken === "string" ? cfg.botToken : undefined;
const chatId = dest.identifier || (typeof cfg.chatId === "string" ? cfg.chatId : undefined);

if (botToken && chatId) {
  clientConfig.telegram = { botToken, chatId };
  channels.push("telegram");
}

This way a misconfigured Telegram destination gets skipped instead of causing a runtime error or a silent API failure.

if (channels.length === 0) {
continue;
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 Suggestion: Parallelize alarm dispatch

Alarms are dispatched sequentially here — each client.send() is awaited before moving to the next alarm. If a webhook destination is slow or times out, it blocks notifications for all subsequent alarms and holds up the uptime check pipeline.

Since each alarm is independent, Promise.allSettled would let them run in parallel with per-alarm error isolation:

await Promise.allSettled(
  matchingAlarms
    .filter((a) => a.destinations?.length)
    .map(async (alarm) => {
      // ... build clientConfig + channels ...
      const client = new NotificationClient(clientConfig);
      await client.send(payload, { channels });
    })
);

Not a blocker since it matches the sequential pattern in the existing email function, but worth considering since alarm count per site could grow.

chatId: dest.identifier || (cfg.chatId as string),
};
channels.push("telegram");
} else if (dest.type === "webhook") {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 Suggestion: Same as cast pattern applies to webhook headers

cfg.headers as Record<string, string> | undefined has the same issue as the Telegram casts above. If cfg.headers is unexpectedly a non-object type, this could cause subtle issues downstream. A quick typeof guard or a validation utility would make this more robust.

@capacitron capacitron marked this pull request as ready for review April 11, 2026 04:23
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Apr 11, 2026

Greptile Summary

This PR wires the existing alarms / alarmDestinations tables into uptime DOWN/RECOVERED transitions, dispatching channel notifications (Slack, Discord, Teams, Google Chat, Telegram, Webhook) via a new sendUptimeAlarmNotificationsIfNeeded function that mirrors the anomaly notification pattern. No schema changes are required.

  • P1 – last-one-wins on same-type destinations: clientConfig[type] is overwritten for each destination in the loop, so when an alarm has two destinations of the same channel type (allowed by the (alarmId, type, identifier) unique index), only the last destination's URL is used while earlier ones are silently dropped — and the surviving provider is called once per duplicate channel entry, producing duplicate notifications. The same bug exists in packages/rpc/src/routers/anomalies.ts; this PR replicates it. Fix: instantiate one NotificationClient per destination.
  • P2 – failed send results ignored: client.send() resolves without throwing even on per-channel failures; the returned NotificationResult[] is discarded, so silent delivery failures are never captured.

Confidence Score: 4/5

Not safe to merge as-is: the same-type destination collision will silently drop all but the last webhook and send duplicates to the survivor.

One P1 logic bug (last-one-wins on same-type destinations) means some alarm destinations will never receive notifications when multiple of the same channel type are configured, while the surviving destination gets duplicate notifications. This is a correctness defect on the primary changed code path and should be addressed before merging.

apps/uptime/src/uptime-transition-emails.ts — the destination-loop logic in sendUptimeAlarmNotificationsIfNeeded needs a per-destination client approach.

Important Files Changed

Filename Overview
apps/uptime/src/uptime-transition-emails.ts Adds sendUptimeAlarmNotificationsIfNeeded; contains a P1 last-one-wins bug when an alarm has multiple destinations of the same channel type, and notification send failures are silently swallowed.
apps/uptime/src/index.ts Wires sendUptimeAlarmNotificationsIfNeeded after the existing email step; integration is correct and consistent with the existing pipeline structure.

Sequence Diagram

sequenceDiagram
    participant QS as QStash
    participant UP as Uptime Service
    participant CH as ClickHouse
    participant KF as Kafka (Redpanda)
    participant PG as PostgreSQL
    participant EM as Resend (Email)
    participant NC as NotificationClient

    QS->>UP: POST / (uptime check trigger)
    UP->>UP: lookupSchedule(scheduleId)
    UP->>UP: checkUptime(url)
    UP->>CH: getPreviousMonitorStatus(monitorId)
    CH-->>UP: previousStatus
    UP->>KF: sendUptimeEvent(result)
    UP->>UP: resolveTransitionKind(previous, current)
    alt Transition detected (down / recovered)
        UP->>EM: sendUptimeTransitionEmailsIfNeeded
        UP->>PG: query alarms WHERE websiteId + enabled + triggerType=uptime
        PG-->>UP: matchingAlarms[]
        loop for each alarm
            UP->>NC: new NotificationClient(clientConfig)
            NC->>NC: send(payload, channels)
            Note over NC: Same-type destinations collapse into one config key
        end
    end
    UP-->>QS: 200 OK
Loading

Reviews (1): Last reviewed commit: "feat(uptime): wire alarm notifications o..." | Re-trigger Greptile

Comment on lines +221 to +252
const clientConfig: Record<string, Record<string, unknown>> = {};
const channels: NotificationChannel[] = [];

for (const dest of alarm.destinations) {
const cfg = (dest.config ?? {}) as Record<string, unknown>;

if (dest.type === "slack") {
clientConfig.slack = { webhookUrl: dest.identifier };
channels.push("slack");
} else if (dest.type === "discord") {
clientConfig.discord = { webhookUrl: dest.identifier };
channels.push("discord");
} else if (dest.type === "teams") {
clientConfig.teams = { webhookUrl: dest.identifier };
channels.push("teams");
} else if (dest.type === "google_chat") {
clientConfig.googleChat = { webhookUrl: dest.identifier };
channels.push("google-chat");
} else if (dest.type === "telegram") {
clientConfig.telegram = {
botToken: cfg.botToken as string,
chatId: dest.identifier || (cfg.chatId as string),
};
channels.push("telegram");
} else if (dest.type === "webhook") {
clientConfig.webhook = {
url: dest.identifier,
headers: cfg.headers as Record<string, string> | undefined,
};
channels.push("webhook");
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Multiple same-type destinations silently dropped

The loop overwrites clientConfig[type] on each iteration, so when an alarm has two destinations of the same channel type — permitted by the alarm_destinations_alarm_type_identifier_unique index which is keyed on (alarmId, type, identifier) — only the last destination's config survives. channels still contains one entry per destination, so NotificationClient.send ends up calling the surviving provider twice, sending duplicate notifications to the last URL while all earlier ones receive nothing. The same pattern exists in packages/rpc/src/routers/anomalies.ts buildClientConfig.

The fix is to instantiate one NotificationClient per destination rather than collapsing everything into a single config map. Process each dest individually, build a singleConfig with only that one channel key, and call client.send(payload, { channels: [channel] }) — one client per destination. This ensures every configured URL receives exactly one notification regardless of how many destinations share the same channel type.

Comment on lines +258 to +265
try {
const client = new NotificationClient(clientConfig);
await client.send(payload, { channels });
} catch (error) {
captureError(error, {
error_step: "uptime_alarm_notification",
alarm_id: alarm.id,
});
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Notification send failures silently ignored

client.send() returns NotificationResult[] and resolves successfully even when individual channel sends fail (it uses Promise.allSettled internally, reporting failures in the result objects rather than throwing). Errors are only captured for exceptions. Failed deliveries — e.g. a bad webhook URL or an expired token — will leave the results array with { success: false, error: "..." } entries that are never logged or captured.

…dispatch

- Replace unsafe `as string` casts on botToken/chatId with runtime
  typeof checks, skipping the destination if config is malformed
- Add typeof guard for webhook headers cast
- Parallelize alarm dispatch with Promise.allSettled for better
  performance when multiple alarms have slow destinations
@vercel vercel bot temporarily deployed to Preview – dashboard April 11, 2026 05:43 Inactive
@vercel vercel bot temporarily deployed to Preview – documentation April 11, 2026 05:43 Inactive
@capacitron
Copy link
Copy Markdown
Author

Thanks for the thorough review @izadoesdev!

Addressed both items:

🟡 Telegram type safety — replaced as string casts with runtime typeof checks. If botToken or chatId is missing/non-string, the destination is skipped early rather than passing undefined into the Telegram client.

🔵 Sequential dispatch — switched to Promise.allSettled so all alarms dispatch in parallel. Matches the pattern from the anomaly notifications router.

🔵 Webhook headers — applied the same typeof cfg.headers === 'object' guard before casting.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants