Your First Report

A RestoreReport is written for every completed RestoreTest run. It contains a confidence score, the result of each validation level, individual health check outcomes, RTO measurement, and resource completeness ratios.


The Confidence Score

The score is an integer from 0 to 100. It indicates how much you can trust that the backup restores into a working application.

| Score range | Result | Meaning |
| --- | --- | --- |
| 90–100 | Pass | Restore succeeded with full or near-full confidence |
| 70–89 | Partial | Restore completed with degraded confidence; investigate the gaps |
| 0–69 | Fail | Restore failed or is insufficient; treat the backup as untrusted |
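
The bands above can be expressed as a small shell function, which may be handy in scripts that post-process scores. This is a minimal sketch: `classify_score` is a hypothetical helper name, not something the tool ships, and it assumes the input is already a valid integer from 0 to 100.

```shell
# Map a confidence score (integer 0-100) to its result band.
# classify_score is a hypothetical helper, not part of the tool.
classify_score() {
  score=$1
  if [ "$score" -ge 90 ]; then
    echo "Pass"
  elif [ "$score" -ge 70 ]; then
    echo "Partial"
  else
    echo "Fail"
  fi
}

classify_score 94   # prints Pass
classify_score 89   # prints Partial
classify_score 69   # prints Fail
```

The boundaries are inclusive at 90 and 70, matching the ranges in the table.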

Validation Levels and Weights

The score is the sum of the points earned across six validation levels, whose maximums add up to 100. A level that passes fully earns all of its points; a level that fails earns zero. Some levels award partial credit: Completeness, for example, earns proportional points based on how many resource types matched.

| Level | Max points | What is checked |
| --- | --- | --- |
| Restore Integrity | 25 | The Velero restore operation completed without errors |
| Completeness | 20 | Resource counts (Deployments, Services, PVCs, ConfigMaps, Secrets) match the source namespace |
| Pod Startup | 20 | All pods reached Running and Ready within the TTL |
| Health Checks | 20 | All checks defined in the linked HealthCheckPolicy passed |
| Cross-NS Dependencies | 10 | Services expected outside the sandbox were reachable |
| RTO Compliance | 5 | Total restore duration was within sla.maxRTO |
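
To make the arithmetic concrete, the sketch below adds up per-level points. `total_score` is a hypothetical helper name, and the second call uses made-up values for an imaginary run. How partial credit is computed inside a level is not modeled here; the function only sums whatever points each level reports.

```shell
# Sum the points earned by each validation level into a confidence score.
# Arguments are per-level point values; total_score is a hypothetical helper.
total_score() {
  total=0
  for points in "$@"; do
    total=$((total + points))
  done
  echo "$total"
}

# Maximum points per level, from the table above: they add up to 100.
total_score 25 20 20 20 10 5   # prints 100

# Hypothetical run: Completeness earns partial credit (16 of 20)
# and Cross-NS Dependencies fails outright (0 of 10).
total_score 25 16 20 20 0 5    # prints 86
```

A total of 86 falls in the 70–89 band, so that hypothetical run would be reported as Partial.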

Example RestoreReport

The following is a complete RestoreReport with inline comments explaining each field.

apiVersion: restore.kymaros.io/v1alpha1
kind: RestoreReport
metadata:
  # Name is derived from: <RestoreTest name>-<run timestamp>
  name: my-first-test-20240115-0300
  namespace: kymaros-system
  labels:
    # Links this report back to the RestoreTest that produced it
    restore.kymaros.io/test-name: my-first-test
  creationTimestamp: "2024-01-15T03:00:00Z"
spec:
  # Reference to the RestoreTest that triggered this run
  restoreTestRef:
    name: my-first-test
    namespace: kymaros-system
status:
  # The confidence score: 0-100 (sum of pointsEarned across all levels)
  score: 98

  # Pass (>=90), Partial (70-89), or Fail (<70)
  result: Pass

  # Timestamps for the full run
  runStartedAt: "2024-01-15T03:00:01Z"
  runCompletedAt: "2024-01-15T03:07:43Z"

  # Measured restore duration in seconds
  # This is compared against sla.maxRTO
  rto:
    measuredSeconds: 462 # 7m 42s actual restore time
    slaMaxSeconds: 900 # 15m SLA (from spec.sla.maxRTO)
    compliant: true # measured < sla

  # The sandbox namespace used for this run (already cleaned up after TTL)
  sandboxNamespace: rp-test-a3f2c1

  # Velero restore resource name created for this run
  veleroRestoreName: kymaros-my-first-test-20240115t030001

  # One entry per validation level
  validationLevels:
    - name: RestoreIntegrity
      maxPoints: 25
      # Points earned: 25/25. Velero restore completed cleanly.
      pointsEarned: 25
      passed: true
      message: "Velero restore completed without errors"

    - name: Completeness
      maxPoints: 20
      # Points earned: 18/20. Most resource types matched;
      # one type (Secrets) had a count mismatch (7 found, 8 expected).
      pointsEarned: 18
      passed: true
      message: "4/5 resource types matched exactly; Secrets: 7/8"
      # Per-resource-type breakdown
      details:
        - resourceType: Deployment
          expected: 3
          found: 3
          matched: true
        - resourceType: Service
          expected: 4
          found: 4
          matched: true
        - resourceType: PersistentVolumeClaim
          expected: 2
          found: 2
          matched: true
        - resourceType: ConfigMap
          expected: 6
          found: 6
          matched: true
        - resourceType: Secret
          expected: 8
          found: 7 # one Secret was excluded by Velero's backup filter
          matched: false

    - name: PodStartup
      maxPoints: 20
      # Points earned: 20/20. All pods reached Ready.
      pointsEarned: 20
      passed: true
      message: "All 6 pods reached Running/Ready state"
      details:
        - podName: api-7d9f8b6c4d-xk2lp
          ready: true
          startupSeconds: 18
        - podName: worker-6c8b9d5f7-p9wqr
          ready: true
          startupSeconds: 22
        - podName: postgres-0
          ready: true
          startupSeconds: 41

    - name: HealthChecks
      maxPoints: 20
      # Points earned: 20/20. All configured health checks passed.
      pointsEarned: 20
      passed: true
      message: "3/3 health checks passed"
      # Individual health check results
      checks:
        - name: api-liveness
          type: http
          # HTTP probe: GET /healthz → 200
          target: "http://api.rp-test-a3f2c1.svc.cluster.local/healthz"
          result: pass
          statusCode: 200
          durationMs: 34

        - name: db-connectivity
          type: exec
          # exec probe: pg_isready connection test inside the postgres pod
          target: "postgres-0"
          command: ["pg_isready", "-U", "app"]
          result: pass
          durationMs: 12

        - name: worker-queue-depth
          type: http
          target: "http://worker.rp-test-a3f2c1.svc.cluster.local/metrics"
          result: pass
          statusCode: 200
          durationMs: 28

    - name: CrossNSDependencies
      maxPoints: 10
      # Points earned: 10/10. External service probe passed
      # (the sandbox was allowed to reach this endpoint via policy).
      pointsEarned: 10
      passed: true
      message: "1/1 external dependency reachable"
      dependencies:
        - name: shared-auth-service
          namespace: platform
          reachable: true
          durationMs: 8

    - name: RTOCompliance
      maxPoints: 5
      # Points earned: 5/5. Restore completed in 7m 42s, well within the 15m SLA.
      pointsEarned: 5
      passed: true
      message: "RTO 7m42s is within SLA of 15m"

  # Human-readable summary printed by kubectl describe
  summary: >
    Score 98/100 (Pass). Restore completed in 7m42s.
    Minor completeness gap: 1 Secret missing from restore (check Velero backup filters).
    All pods healthy, all health checks passed, RTO compliant.
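
The rto block in the report above stores durations in seconds, while the messages quote them in minutes and seconds. The sketch below shows the conversion and the compliance comparison. Both function names are hypothetical, and the strict less-than follows the `measured < sla` comment in the example; how the tool treats a restore that takes exactly the SLA time is not stated, so that boundary is an assumption.

```shell
# Format a duration in whole seconds as <minutes>m<seconds>s.
# format_duration is a hypothetical helper, not part of the tool.
format_duration() {
  seconds=$1
  echo "$((seconds / 60))m$((seconds % 60))s"
}

# Exit status 0 when the measured restore time is under the SLA.
# Strict less-than is an assumption taken from the report comment.
rto_compliant() {
  measured=$1
  sla_max=$2
  [ "$measured" -lt "$sla_max" ]
}

format_duration 462   # prints 7m42s
format_duration 900   # prints 15m0s
if rto_compliant 462 900; then
  echo "compliant"
fi
```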

Checking Reports with kubectl

List all reports for a specific RestoreTest:

kubectl get restorereports \
  -n kymaros-system \
  -l restore.kymaros.io/test-name=my-first-test \
  --sort-by=.metadata.creationTimestamp

Get a quick score summary across all recent reports:

kubectl get restorereports -n kymaros-system \
  -o custom-columns='NAME:.metadata.name,SCORE:.status.score,RESULT:.status.result,RTO:.status.rto.measuredSeconds'

Describe a report for the human-readable summary:

kubectl describe restorereport my-first-test-20240115-0300 -n kymaros-system
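
A score can also gate an automated pipeline. The sketch below is one possible approach, not a feature of the tool: `score_meets_minimum` is a hypothetical helper, and the threshold of 90 simply mirrors the Pass band. The helper rejects empty or non-numeric input so that a missing report is not mistaken for a passing one.

```shell
# Succeed only when the score is a non-negative integer at or above the minimum.
# score_meets_minimum is a hypothetical helper, not part of the tool.
score_meets_minimum() {
  score=$1
  minimum=$2
  case "$score" in
    ''|*[!0-9]*)
      echo "score is missing or not an integer: '$score'" >&2
      return 2
      ;;
  esac
  [ "$score" -ge "$minimum" ]
}

# In a pipeline the score would be read from the cluster, for example:
#   score=$(kubectl get restorereport my-first-test-20240115-0300 \
#     -n kymaros-system -o jsonpath='{.status.score}')
score=94   # sample value for illustration
if score_meets_minimum "$score" 90; then
  echo "backup verified"
else
  echo "backup not trusted" >&2
fi
```

The helper returns 0 when the score meets the minimum, 1 when it falls short, and 2 when the value cannot be read as an integer.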

Next Steps