System Messages
Health check alerts and infrastructure consistency notifications
What you see
URL: /system/messages
The system messages page shows health check results detected by PodWarden's periodic infrastructure audit. PodWarden automatically checks for data inconsistencies between its database and your live infrastructure every 15 minutes.
A bell icon in the top navigation bar shows the number of unread messages. The badge color reflects the highest severity among them:
| Color | Meaning |
|---|---|
| Red (pulsing) | Critical issue — e.g. cluster unreachable |
| Red (static) | Error — e.g. node name mismatch preventing canvas visualization |
| Amber | Warning — e.g. host offline or orphaned |
| Blue | Informational — e.g. stale assignment |
Click the bell to open the system messages page.
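The badge rule above can be sketched in a few lines. This is an illustrative sketch, not PodWarden's actual code; `SEVERITY_ORDER`, `BADGE_STYLE`, and `badge_style` are hypothetical names:

```python
# Hypothetical sketch of the badge-color rule described in the table above.
SEVERITY_ORDER = ["info", "warning", "error", "critical"]

BADGE_STYLE = {
    "critical": "red (pulsing)",
    "error": "red (static)",
    "warning": "amber",
    "info": "blue",
}

def badge_style(unread_severities):
    """Return the badge style for the highest severity among unread messages."""
    if not unread_severities:
        return None  # no unread messages, so no badge is shown
    highest = max(unread_severities, key=SEVERITY_ORDER.index)
    return BADGE_STYLE[highest]
```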
Health checks
PodWarden runs these checks automatically:
Node naming
| Check | Severity | What it means |
|---|---|---|
| Node name mismatch | Error | The K8s node name stored in PodWarden doesn't match the actual node name in your cluster. This prevents deployment-to-node connections from appearing on the canvas. |
| Node name missing | Warning | A host is assigned to a cluster but has no K8s node name recorded. |
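The two naming checks reduce to a comparison between PodWarden's records and the live cluster. A minimal sketch, assuming a host record with `hostname` and `k8s_node_name` fields (field and function names are assumptions, not PodWarden's schema):

```python
def check_node_names(hosts, live_node_names):
    """Compare each host's recorded K8s node name against the live cluster.

    `hosts`: list of dicts with 'hostname' and 'k8s_node_name' keys (assumed shape).
    `live_node_names`: set of node names reported by the Kubernetes API.
    Returns (check, severity, hostname) tuples.
    """
    issues = []
    for host in hosts:
        name = host.get("k8s_node_name")
        if not name:
            # Host is in a cluster but has no node name recorded
            issues.append(("node_name_missing", "warning", host["hostname"]))
        elif name not in live_node_names:
            # Recorded name doesn't match any live node: breaks canvas links
            issues.append(("node_name_mismatch", "error", host["hostname"]))
    return issues
```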
Cluster membership
| Check | Severity | What it means |
|---|---|---|
| Cluster unreachable | Critical | PodWarden cannot connect to the Kubernetes API for this cluster. |
| Orphaned host | Warning | A host claims to be in a cluster, but the cluster doesn't have a matching node. |
| Unknown K8s node | Error | A node exists in your K8s cluster that PodWarden doesn't know about. |
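"Orphaned host" and "Unknown K8s node" are two directions of the same comparison: a set difference between PodWarden's records and the node list from the Kubernetes API. A hedged sketch (names are illustrative):

```python
def check_cluster_membership(db_node_names, live_node_names):
    """Two-way set difference between PodWarden's records and the live cluster.

    Both arguments are sets of node-name strings for one cluster.
    """
    orphaned = db_node_names - live_node_names   # host claims membership, no node
    unknown = live_node_names - db_node_names    # node exists, PodWarden unaware
    return sorted(orphaned), sorted(unknown)
```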
Deployments
| Check | Severity | What it means |
|---|---|---|
| Stale assignment | Info | A deployment references a cluster that no longer exists. |
| Unplaceable deployment | Warning | A deployment's placement targets a node that no host matches. |
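Both deployment checks validate references: the cluster a deployment points at, and the node its placement targets. A sketch under assumed field names (`cluster_id`, `target_node` are hypothetical, not PodWarden's schema):

```python
def check_deployments(deployments, cluster_ids, host_node_names):
    """Deployment-level drift checks from the table above (hypothetical names).

    `cluster_ids`: set of known cluster IDs; `host_node_names`: set of node
    names that some host matches.
    """
    issues = []
    for d in deployments:
        if d["cluster_id"] not in cluster_ids:
            # References a cluster that no longer exists
            issues.append(("stale_assignment", "info", d["name"]))
        elif d.get("target_node") and d["target_node"] not in host_node_names:
            # Placement targets a node that no host matches
            issues.append(("unplaceable_deployment", "warning", d["name"]))
    return issues
```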
Hosts
| Check | Severity | What it means |
|---|---|---|
| Host unreachable | Warning | A host hasn't reported stats in over 30 minutes. |
| Tailscale hostname drift | Info | A Tailscale-discovered host is missing its hostname. |
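The unreachable check is a simple threshold on the last stats report. A sketch of the 30-minute rule (the function and field names are assumptions):

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(minutes=30)  # threshold from the table above

def host_unreachable(last_report, now=None):
    """True if the host hasn't reported stats within the threshold.

    `last_report` is a timezone-aware datetime, or None if never reported.
    """
    now = now or datetime.now(timezone.utc)
    return last_report is None or now - last_report > STALE_AFTER
```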
Ingress Drift
| Check | Severity | What it means |
|---|---|---|
| Unmanaged IngressRoute | Warning | A Traefik IngressRoute exists in your cluster but isn't tracked by PodWarden. Someone created it directly via kubectl or Helm. |
| Unmanaged Ingress | Warning | A standard Kubernetes Ingress resource exists but isn't managed by PodWarden. |
| Unmanaged exposed service | Warning | A NodePort or LoadBalancer service exposes traffic outside the cluster without going through PodWarden's ingress management. |
| Ghost ingress rule | Error | PodWarden has an ingress rule marked as active, but the corresponding Kubernetes resource doesn't exist. The rule may have been deleted directly from the cluster. |
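Unmanaged resources and ghost rules are, again, the two directions of one comparison: what the cluster has versus what PodWarden tracks. A minimal sketch, assuming both sides are keyed by `(namespace, name)`:

```python
def check_ingress_drift(managed_routes, cluster_routes):
    """Compare PodWarden-managed ingress rules against live cluster resources.

    Both arguments are sets of (namespace, name) tuples; this is an
    illustrative sketch, not PodWarden's implementation.
    """
    unmanaged = cluster_routes - managed_routes  # exists in cluster, untracked
    ghosts = managed_routes - cluster_routes     # marked active, missing live
    return sorted(unmanaged), sorted(ghosts)
```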
Pod Health
| Check | Severity | What it means |
|---|---|---|
| CrashLoopBackOff | Error (managed) / Warning (unmanaged) | A pod keeps crashing and restarting. Error severity if it belongs to a PodWarden-managed deployment. |
| Evicted | Error (managed) / Warning (unmanaged) | A pod was evicted, usually due to resource pressure (memory, disk). |
| ImagePullBackOff | Error (managed) / Warning (unmanaged) | A pod can't pull its container image. Check that the image exists and registry credentials are correct. |
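All three pod checks share the same severity rule: Error for PodWarden-managed deployments, Warning otherwise. As a sketch (names hypothetical):

```python
POD_ISSUE_REASONS = {"CrashLoopBackOff", "Evicted", "ImagePullBackOff"}

def pod_issue_severity(reason, managed):
    """Severity rule from the table: Error if PodWarden-managed, else Warning."""
    if reason not in POD_ISSUE_REASONS:
        return None  # not a reason this check reports on
    return "error" if managed else "warning"
```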
Infrastructure
| Check | Severity | What it means |
|---|---|---|
| TLS cert expiring (K8s) | Warning (14d) / Error (7d) / Critical (3d) | A TLS certificate stored as a Kubernetes secret is approaching expiry. |
| TLS cert expiring (live) | Warning (14d) / Error (7d) | The live TLS certificate served by a domain is approaching expiry. |
| DNS mismatch | Error | A domain's DNS resolves to a different IP than the gateway host's address. Traffic may not reach your cluster. |
| Gateway Traefik unhealthy | Critical | Traefik pods on a gateway host are in a failed state. Inbound traffic to all services on this gateway is likely down. |
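The certificate checks use tiered thresholds on days until expiry; note that live certs top out at Error, with no 3-day Critical tier. A sketch of that tiering (function name is an assumption):

```python
def cert_expiry_severity(days_left, live=False):
    """Tiered severity from the table above; live certs have no Critical tier."""
    if not live and days_left <= 3:
        return "critical"
    if days_left <= 7:
        return "error"
    if days_left <= 14:
        return "warning"
    return None  # not close enough to expiry to alert
```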
Using the page
Filtering
- Severity chips at the top toggle which severity levels are shown
- Category dropdown filters by check category (node naming, cluster, deployments, hosts, ingress drift, pod health, infrastructure, system apps)
- Status toggle switches between active issues, resolved issues, or all
- Click a message row to expand and see full details and diagnostic data
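The three filters combine as a simple predicate over each message. A sketch, assuming messages carry `severity`, `category`, and `status` fields (these names are illustrative):

```python
def filter_messages(messages, severities, category=None, status="active"):
    """Apply the page's filters (illustrative sketch; field names assumed).

    `severities`: set of severity levels to show; `category`: optional
    category filter; `status`: "active", "resolved", or "all".
    """
    out = []
    for m in messages:
        if m["severity"] not in severities:
            continue
        if category and m["category"] != category:
            continue
        if status != "all" and m["status"] != status:
            continue
        out.append(m)
    return out
```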
Object links
When you expand a message, clickable links appear that take you directly to the affected object's detail page. For example, a "Host unreachable" message links to that host's detail page, and a "DNS mismatch" message links to the Network page. Links are shown for hosts, clusters, deployments, and network resources when the relevant object exists.
Actions
- Mark as read — dismisses the notification badge for you (other users still see it as unread)
- Mark all read — marks all active messages as read
- Run check (admin only) — triggers an immediate health check instead of waiting for the next cycle
- Delete (admin only) — permanently removes a message
Suppressing messages
Some alerts are expected and intentional — for example, an "Unmanaged IngressRoute" for a route you created directly via kubectl, or an "Unknown K8s node" for a node you provisioned outside of PodWarden. These are not bugs; they are known deviations you have chosen to accept. Suppressing them keeps your message list focused on issues that actually need attention.
To suppress a message, click the eye-slash icon (Suppress button) on any message row. The message is immediately hidden from the main list and will not trigger email notifications.
To view suppressed messages, enable the "Show suppressed" toggle in the page header. Suppressed messages reappear dimmed with a "Suppressed" badge so they are visually distinct from active issues.
To unsuppress a message, enable the "Show suppressed" toggle, then click the eye icon (Unsuppress button) on the message you want to restore. The message returns to the normal active list.
Suppression is permanent until you explicitly unsuppress — health check cycles do not clear or reset suppression state. Suppressed messages are also excluded from the notification digest, so re-enabling email notifications will not retroactively send alerts for them.
Auto-resolution
When a previously detected issue is no longer found during a health check (e.g., you fixed a node name mismatch), PodWarden doesn't resolve it immediately. It waits for multiple consecutive clean check cycles (default: 2, configurable in Settings) before marking the message resolved; this prevents noisy alerts from pods that briefly recover before crashing again. Resolved messages are cleaned up after 7 days.
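The clean-cycle rule can be sketched as a counter that resets whenever the issue reappears. This is an illustrative model, not PodWarden's code; the field names are assumptions:

```python
CLEAN_CYCLES_REQUIRED = 2  # default, configurable in Settings

def update_issue(issue, found_this_cycle):
    """Advance one message through a health-check cycle (illustrative sketch).

    `issue` is a dict with a 'clean_cycles' counter and a 'resolved' flag.
    """
    if found_this_cycle:
        issue["clean_cycles"] = 0          # issue still present: reset counter
    else:
        issue["clean_cycles"] += 1
        if issue["clean_cycles"] >= CLEAN_CYCLES_REQUIRED:
            issue["resolved"] = True       # enough consecutive clean cycles
    return issue
```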
Drift Detection Dashboard
URL: /apps/drift-detection
PodWarden provides a dedicated dashboard for drift detection issues, accessible from the navigation menu. The dashboard shows:
- Summary cards — count of active issues by severity (critical, error, warning, info)
- Issues table — all active drift issues with severity, category, title, and time since detection
- Action buttons:
- Run Check Now — trigger an immediate health check without waiting for the next cycle
- Clear Resolved — delete all resolved drift messages
Each issue in the table links to the relevant PodWarden page where you can investigate and fix it.
Configuration
Drift detection settings are in Settings → Drift Detection:
| Setting | Default | Description |
|---|---|---|
| Clean cycles before auto-resolve | 2 | How many consecutive clean check cycles before an issue is automatically resolved. Higher values reduce false resolutions but delay cleanup. |
| Enabled categories | All enabled | Toggle which drift categories run: ingress, pods, infrastructure. Disabling a category stops its checks but preserves existing messages. |
Email notifications
PodWarden can send email alerts when new infrastructure issues are detected. Configure this in Settings → System Config → Notifications.
| Setting | Description |
|---|---|
| Enable email notifications | Master toggle |
| Recipients | Comma-separated list of email addresses |
| Minimum severity | Only issues at this severity or above trigger an email |
Emails are sent as a digest — one email per health check cycle containing all newly detected issues. Previously known issues do not re-trigger emails. If an issue resolves and later reappears, it is treated as new.
Email delivery uses the SMTP settings configured in the same Settings page. SMTP must be configured and working for email notifications to function.
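The digest behavior amounts to filtering each cycle's newly detected issues by the minimum severity. A sketch (names and message shape are assumptions):

```python
SEVERITY_ORDER = ["info", "warning", "error", "critical"]

def build_digest(new_issues, min_severity):
    """Collect newly detected issues at or above the minimum severity.

    `new_issues`: issues first seen this cycle; previously known issues are
    excluded upstream and do not re-trigger emails.
    """
    threshold = SEVERITY_ORDER.index(min_severity)
    return [i for i in new_issues
            if SEVERITY_ORDER.index(i["severity"]) >= threshold]
```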