Notifications

Kymaros sends notifications when a restore test fails or when its score regresses by more than 10 points compared to the previous run. Notifications are configured per RestoreTest in the .spec.notifications field.

Trigger conditions

Notifications fire under two conditions:

  1. Test failure: Any health check fails and .spec.notifications.onFailure is non-empty.
  2. Score regression: The current score is more than 10 points below the previous score (delta < -10) and .spec.notifications.onSuccess is configured. A regression notification uses the onSuccess channels because the test technically passed but the score dropped.

A test that passes without dropping more than 10 points below the previous score sends no notification, even if onSuccess is configured.
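The two trigger conditions can be sketched as a small decision function. This is an illustration of the rules above, not the operator's code: the function name and arguments are hypothetical, and two behaviors are assumptions because the text does not cover them (a failed run is treated as a failure even if the score also dropped, and a first run with no previous score never counts as a regression).

```python
from typing import List, Optional

REGRESSION_THRESHOLD = 10  # points, per the regression rule described above


def channels_to_notify(
    any_check_failed: bool,
    score: int,
    previous_score: Optional[int],
    on_failure: List[dict],
    on_success: List[dict],
) -> List[dict]:
    """Return the channel entries that should be notified for one run."""
    if any_check_failed:
        # Condition 1: a failure goes to onFailure (an empty list sends nothing).
        return list(on_failure)
    if previous_score is not None and score - previous_score < -REGRESSION_THRESHOLD:
        # Condition 2: a passing run that dropped by more than 10 points.
        return list(on_success)
    # Passing run without a regression: no notification.
    return []
```

Note that a drop of exactly 10 points (delta = -10) does not satisfy delta < -10 and therefore sends nothing.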


Configuration structure

spec:
  notifications:
    onFailure:
      - type: slack
        channel: "#alerts"
        webhookSecretRef:
          name: slack-secret
          namespace: kymaros-system
    onSuccess:
      - type: webhook
        webhookSecretRef:
          name: generic-webhook-secret
          namespace: kymaros-system

Each entry in onFailure and onSuccess has these fields:

| Field | Type | Required | Description |
|---|---|---|---|
| type | string | Yes | One of: slack, webhook, pagerduty |
| channel | string | No | Slack channel name (Slack only) |
| webhookSecretRef.name | string | Yes | Name of the Secret containing the webhook URL |
| webhookSecretRef.namespace | string | Yes | Namespace of the Secret |

Slack

Creating the Secret

The Slack integration reads the webhook URL from a Kubernetes Secret. The Secret must contain one of these keys: webhook-url or url.

kubectl create secret generic slack-secret \
  --from-literal=webhook-url=https://hooks.slack.com/services/T.../B.../... \
  --namespace kymaros-system

Or as a manifest:

apiVersion: v1
kind: Secret
metadata:
  name: slack-secret
  namespace: kymaros-system
type: Opaque
stringData:
  webhook-url: "https://hooks.slack.com/services/T.../B.../..."
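To check that a Secret carries one of the accepted keys, the lookup can be sketched in Python. The helper below is hypothetical and only mirrors the rule stated above. It expects the base64-encoded `.data` map that `kubectl get secret slack-secret -n kymaros-system -o json` returns. Which key wins when both are present is not documented, so checking webhook-url first is an assumption.

```python
import base64
from typing import Mapping

ACCEPTED_KEYS = ("webhook-url", "url")  # the two keys named above


def webhook_url_from_secret(data: Mapping[str, str]) -> str:
    """Extract the webhook URL from a Secret's base64-encoded .data map."""
    for key in ACCEPTED_KEYS:  # order is an assumption, not documented behavior
        encoded = data.get(key)
        if encoded:
            return base64.b64decode(encoded).decode("utf-8").strip()
    raise KeyError(f"Secret has none of the expected keys: {ACCEPTED_KEYS}")
```

A KeyError from this helper corresponds to the "Secret exists but uses the wrong key" case listed under Troubleshooting.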

RestoreTest configuration

apiVersion: restore.kymaros.io/v1alpha1
kind: RestoreTest
metadata:
  name: webapp-nightly
  namespace: kymaros-system
spec:
  schedule: "0 2 * * *"
  backupSource:
    name: webapp-backup
    namespace: webapp-prod
  notifications:
    onFailure:
      - type: slack
        channel: "#ops-alerts"
        webhookSecretRef:
          name: slack-secret
          namespace: kymaros-system
    onSuccess:
      - type: slack
        channel: "#ops-info"
        webhookSecretRef:
          name: slack-secret
          namespace: kymaros-system

Message format

Slack messages use an emoji prefix followed by the test name, score, and result:

:red_circle: webapp-nightly | Score: 42/100 | FAILED

For regressions:

:warning: webapp-nightly | Score: 71/100 | REGRESSION (was 85)

The message is sent as plain text to the channel named in the channel field. Slack formatting (bold, code blocks) is deliberately avoided to stay compatible with /slack endpoint integrations.
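For tooling that needs to reproduce or match these lines (for example, a filter in a chat bot), the format can be sketched as follows. The function is hypothetical and derived only from the two examples above. No format is shown for passing runs, so the sketch refuses any other result rather than guess.

```python
from typing import Optional


def format_slack_message(
    test_name: str, score: int, result: str, previous_score: Optional[int] = None
) -> str:
    """Build a plain-text line in the shape of the two documented examples."""
    if result == "REGRESSION":
        return (
            f":warning: {test_name} | Score: {score}/100 | "
            f"REGRESSION (was {previous_score})"
        )
    if result == "FAILED":
        return f":red_circle: {test_name} | Score: {score}/100 | FAILED"
    # Only FAILED and REGRESSION messages are shown in this section.
    raise ValueError(f"no message format documented for result {result!r}")
```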

Timeout

The Slack HTTP request has a fixed 10-second timeout. If the Slack API does not respond within 10 seconds, the operator drops the notification and logs an error. The test result is not affected.
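The drop-and-log behavior can be illustrated with a small standard-library sender. This is a sketch of the pattern, not the operator's implementation: the function name and logger name are hypothetical, and the body shape is left to the caller because this section does not specify the JSON that is sent to Slack.

```python
import http.client
import json
import logging
import urllib.request

log = logging.getLogger("notify-sketch")  # hypothetical logger name
TIMEOUT_SECONDS = 10  # the fixed timeout described above


def post_with_timeout(url: str, body: dict, timeout: float = TIMEOUT_SECONDS) -> bool:
    """POST body as JSON; on any delivery failure, log it and return False."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, http.client.HTTPException) as exc:
        # Covers timeouts, refused connections, and HTTP 4xx/5xx (HTTPError).
        # The notification is dropped; the caller's result is left untouched.
        log.error("notification dropped: %s", exc)
        return False
```

Because the function returns a boolean instead of raising, a slow or unreachable endpoint cannot change the outcome of the code that called it, which matches the statement that the test result is not affected.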


Webhook

The webhook type sends a JSON POST to a URL stored in a Secret. This is a generic integration suitable for custom alerting pipelines, ITSM tools, or internal dashboards.

Payload structure

The JSON body sent to the webhook endpoint corresponds to the Notification struct:

{
  "testName": "webapp-nightly",
  "score": 42,
  "result": "FAILED",
  "reportRef": "webapp-nightly-20260402-020001",
  "message": "Check 'api-health-endpoint' failed: connection refused"
}

| Field | Type | Description |
|---|---|---|
| testName | string | Name of the RestoreTest |
| score | int | Score from 0 to 100 |
| result | string | PASSED, FAILED, or REGRESSION |
| reportRef | string | Name of the RestoreReport resource |
| message | string | Human-readable failure reason or empty |
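On the receiving side, the payload can be modeled with a small dataclass. The class below is a hypothetical Python mirror of the fields listed above, not the operator's Notification struct. The range and enum checks simply restate the table. Treating a missing message as an empty string is an assumption.

```python
import json
from dataclasses import asdict, dataclass

VALID_RESULTS = {"PASSED", "FAILED", "REGRESSION"}


@dataclass
class Notification:
    """Python mirror of the JSON payload fields listed above."""

    testName: str
    score: int
    result: str
    reportRef: str
    message: str = ""

    @classmethod
    def from_json(cls, raw: bytes) -> "Notification":
        fields = json.loads(raw)
        required = ("testName", "score", "result", "reportRef")
        note = cls(
            **{name: fields[name] for name in required},
            message=fields.get("message", ""),  # assumption: absent means empty
        )
        if note.result not in VALID_RESULTS:
            raise ValueError(f"unexpected result: {note.result!r}")
        if not 0 <= note.score <= 100:
            raise ValueError(f"score out of range: {note.score}")
        return note

    def to_json(self) -> str:
        return json.dumps(asdict(self))
```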

Creating the Secret

kubectl create secret generic webhook-secret \
  --from-literal=url=https://your-endpoint.example.com/kymaros \
  --namespace kymaros-system

The Secret must use the key url or webhook-url.

RestoreTest configuration

apiVersion: restore.kymaros.io/v1alpha1
kind: RestoreTest
metadata:
  name: webapp-nightly
  namespace: kymaros-system
spec:
  notifications:
    onFailure:
      - type: webhook
        webhookSecretRef:
          name: webhook-secret
          namespace: kymaros-system

The channel field is ignored for the webhook type.

Timeout

Webhook requests have a fixed 10-second timeout. The receiving endpoint must respond within this window. If it does not, the notification is dropped.
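One way for a receiver to stay inside the 10-second window is to acknowledge the request immediately and do slow work (paging, ticket creation) afterwards. The sketch below uses only the Python standard library. Every name in it is hypothetical, and the 202 and 400 status codes are a design choice of this example, not something the operator requires.

```python
import json
import queue
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

pending: "queue.Queue[dict]" = queue.Queue()


def accept(raw_body: bytes) -> int:
    """Validate and enqueue a payload; return the HTTP status to send back."""
    try:
        payload = json.loads(raw_body)
    except ValueError:  # invalid JSON or invalid UTF-8
        return 400
    if not isinstance(payload, dict) or "testName" not in payload:
        return 400
    pending.put(payload)  # slow work happens later, off the request path
    return 202


class Receiver(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        self.send_response(accept(self.rfile.read(length)))
        self.end_headers()


def worker() -> None:
    """Drain the queue; replace the print with the real downstream action."""
    while True:
        payload = pending.get()
        print(f"{payload['testName']}: {payload.get('result')} ({payload.get('score')})")
        pending.task_done()


def serve(port: int = 8080) -> None:
    """Start the background worker and block serving HTTP on the given port."""
    threading.Thread(target=worker, daemon=True).start()
    HTTPServer(("", port), Receiver).serve_forever()
```

Calling serve() starts the receiver. The in-memory queue is lost on restart, so a durable queue would be a better fit where missed notifications matter.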


PagerDuty

The pagerduty type is accepted in the RestoreTest spec and uses the same webhook mechanism internally. To configure it, point webhookSecretRef at a Secret containing a PagerDuty Events API v2 URL.

Creating the Secret

kubectl create secret generic pagerduty-secret \
  --from-literal=url=https://events.pagerduty.com/v2/enqueue \
  --namespace kymaros-system

RestoreTest configuration

apiVersion: restore.kymaros.io/v1alpha1
kind: RestoreTest
metadata:
  name: webapp-nightly
  namespace: kymaros-system
spec:
  notifications:
    onFailure:
      - type: pagerduty
        webhookSecretRef:
          name: pagerduty-secret
          namespace: kymaros-system

The payload sent to the PagerDuty endpoint is the same Notification JSON structure as the generic webhook. Full PagerDuty Events API v2 formatting (routing key, severity, dedup key) is not implemented in the current release. If you need structured PagerDuty events, use a middleware webhook that transforms the Notification payload into the PagerDuty format.
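The transformation such a middleware performs could look like the sketch below. The output field names (routing_key, event_action, dedup_key, payload.summary, payload.source, payload.severity, payload.custom_details) follow the PagerDuty Events API v2 event format. Everything else is an assumption of this example: the function name, the result-to-severity mapping, the "kymaros" source string, and the choice of one dedup key per RestoreTest.

```python
from typing import Any, Dict

# Assumed mapping from Kymaros results to PagerDuty severities.
SEVERITY_BY_RESULT = {"FAILED": "critical", "REGRESSION": "warning", "PASSED": "info"}


def to_pagerduty_event(notification: Dict[str, Any], routing_key: str) -> Dict[str, Any]:
    """Translate a Notification payload into an Events API v2 trigger event."""
    result = notification["result"]
    summary = f"{notification['testName']} {result} (score {notification['score']}/100)"
    return {
        "routing_key": routing_key,
        "event_action": "trigger",
        # One dedup key per RestoreTest, so repeated failures of the same
        # test update a single open incident instead of creating new ones.
        "dedup_key": f"kymaros-{notification['testName']}",
        "payload": {
            "summary": summary,
            "source": "kymaros",
            "severity": SEVERITY_BY_RESULT.get(result, "error"),
            "custom_details": {
                "reportRef": notification.get("reportRef", ""),
                "message": notification.get("message", ""),
            },
        },
    }
```

In this setup the RestoreTest would use the generic webhook type pointing at the middleware, and the middleware would POST the transformed event to the PagerDuty endpoint with its own routing key.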


Multiple destinations

onFailure and onSuccess are arrays. You can route notifications to multiple destinations simultaneously:

notifications:
  onFailure:
    - type: slack
      channel: "#ops-alerts"
      webhookSecretRef:
        name: slack-secret
        namespace: kymaros-system
    - type: pagerduty
      webhookSecretRef:
        name: pagerduty-secret
        namespace: kymaros-system
    - type: webhook
      webhookSecretRef:
        name: internal-webhook-secret
        namespace: kymaros-system

Each destination is notified independently. A failure to reach one destination does not prevent others from being notified.
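The isolation between destinations can be pictured as a loop that catches errors per destination. This is a hypothetical sketch of the pattern only. It sends sequentially for simplicity, and whether the operator dispatches in parallel is not stated here.

```python
from typing import Callable, Dict, List

Destination = Dict[str, object]
Sender = Callable[[Destination, dict], None]


def notify_all(destinations: List[Destination], payload: dict, send: Sender) -> List[str]:
    """Try every destination; return error messages for the ones that failed."""
    errors: List[str] = []
    for destination in destinations:
        try:
            send(destination, payload)
        except Exception as exc:  # one bad destination must not block the rest
            errors.append(f"{destination.get('type')}: {exc}")
    return errors
```

With the three destinations from the example above, an error raised while sending to pagerduty would be recorded, and the slack and webhook entries would still be attempted.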


Troubleshooting

Notification not sent:

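  • Verify the Secret exists in the namespace given by webhookSecretRef.namespace (kymaros-system in the examples on this page) and contains the correct key (webhook-url or url).
  • Check operator logs: kubectl logs -n kymaros-system -l app=kymaros-operator.
  • Confirm the trigger condition: a test that passes without a regression of more than 10 points sends no notification, and a regression is sent to the onSuccess channels, not onFailure.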

Slack message not appearing:

  • Verify the webhook URL is for an active Slack app with the correct channel scope.
  • The channel field in the RestoreTest must match a channel the Slack app is authorized to post to. An incorrect channel name causes a 404 from the Slack API.

Webhook 4xx/5xx response:

  • The operator logs the HTTP status code from the webhook endpoint. A 4xx response typically indicates an authentication issue (missing API key header) or a malformed payload. Since the payload is fixed, authentication issues are the most common cause.