RestoreReport
API group: restore.kymaros.io/v1alpha1
Kind: RestoreReport
Short name: rr
Scope: Namespaced (typically kymaros-system)
A RestoreReport is created automatically by the Kymaros controller at the end of each RestoreTest run. It records the confidence score, RTO measurement, per-check results, and resource completeness for that single execution. Reports are read-only — you should not create or edit them manually.
The number of reports retained per test is governed by spec.historyLimit on the parent RestoreTest (default: 10). Older reports are deleted automatically when the limit is reached.
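The retention knob lives on the parent RestoreTest, not on the report itself. A minimal sketch (assuming RestoreTest shares the same API group; all fields other than spec.historyLimit are abbreviated or illustrative):

```yaml
apiVersion: restore.kymaros.io/v1alpha1  # assumed: same group as RestoreReport
kind: RestoreTest
metadata:
  name: my-app-nightly
spec:
  historyLimit: 5   # keep only the 5 most recent RestoreReports (default: 10)
  # ... remaining RestoreTest fields (schedule, sla.maxRTO, etc.)
```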
Spec
| Field | Type | Required | Description |
|---|---|---|---|
| testRef | string | Yes | Name of the RestoreTest resource that triggered this report. |
Status
The status block carries the full result of the restore validation run.
Top-level status fields
| Field | Type | Description |
|---|---|---|
| score | int | Confidence score from 0 to 100. Score ≥ 90 = pass. Score 70–89 = partial. Score < 70 = fail. |
| result | string | Aggregated outcome: pass, fail, or partial. |
| startedAt | Time | UTC timestamp when the restore operation started. |
| completedAt | Time | UTC timestamp when all validation checks finished. |
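The relationship between score and result can be written as a simple banding function. This is an illustration of the thresholds above, not the controller's actual implementation:

```python
def classify(score: int) -> str:
    """Map a confidence score (0-100) to the aggregated result string,
    using the documented bands: >=90 pass, 70-89 partial, <70 fail."""
    if score >= 90:
        return "pass"
    if score >= 70:
        return "partial"
    return "fail"

print(classify(96))  # pass
print(classify(75))  # partial
print(classify(42))  # fail
```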
rto
Recovery Time Objective (RTO) measurement for this run.
| Field | Type | Description |
|---|---|---|
| rto.measured | Duration | Actual elapsed time from restore start to validation completion. |
| rto.target | Duration | The sla.maxRTO value from the parent RestoreTest at the time of the run. |
| rto.withinSLA | bool | true if measured is less than or equal to target. |
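The withinSLA flag can be derived from the two duration strings. A minimal sketch in Python, assuming Go-style duration strings restricted to h/m/s units (the exact duration grammar the controller accepts is not specified here):

```python
import re

def parse_duration(s: str) -> float:
    """Parse a Go-style duration string like '11m35s' or '30m' into seconds.
    Only h/m/s units are handled in this sketch."""
    units = {"h": 3600, "m": 60, "s": 1}
    total = 0.0
    for value, unit in re.findall(r"(\d+(?:\.\d+)?)([hms])", s):
        total += float(value) * units[unit]
    return total

def within_sla(measured: str, target: str) -> bool:
    """withinSLA is true when measured <= target."""
    return parse_duration(measured) <= parse_duration(target)

print(within_sla("11m35s", "30m"))  # True
print(within_sla("18m25s", "15m"))  # False
```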
backup
Metadata about the backup that was restored in this run.
| Field | Type | Description |
|---|---|---|
| backup.name | string | Name of the backup object as recorded by the backup provider. |
| backup.age | Duration | Age of the backup at the time of restore (current time minus backup creation time). |
| backup.size | string | Approximate size of the backup data transferred, as reported by the provider. |
checks
checks is an array of CheckResult objects, one per health check executed. Each object has the following fields:
| Field | Type | Description |
|---|---|---|
| name | string | Name of the check as defined in the HealthCheckPolicy. |
| status | string | Outcome of this individual check: pass, fail, or skip. |
| duration | Duration | Time taken to execute this check. |
| message | string | Human-readable detail. On failure, this contains the error or unexpected output. On skip, it explains why the check was not executed. |
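Once extracted (for example with kubectl get ... -o json), the checks array is straightforward to post-process. A short sketch using hypothetical check data shaped like the CheckResult fields above:

```python
from collections import Counter

# Hypothetical .status.checks array (field names from the table above)
checks = [
    {"name": "postgres-pod-ready", "status": "fail", "duration": "5m0s"},
    {"name": "api-http-ready", "status": "skip", "duration": "0s"},
    {"name": "worker-pod-running", "status": "skip", "duration": "0s"},
]

# Tally outcomes and collect the names of failed checks
tally = Counter(c["status"] for c in checks)
failed = [c["name"] for c in checks if c["status"] == "fail"]

print(dict(tally))  # {'fail': 1, 'skip': 2}
print(failed)       # ['postgres-pod-ready']
```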
completeness
Resource count comparison between the source namespace and the restored sandbox. Each value is a string in the format "actual/expected".
| Field | Type | Description |
|---|---|---|
| completeness.deployments | string | Deployment count. Example: "3/3". |
| completeness.services | string | Service count. |
| completeness.secrets | string | Secret count. |
| completeness.configMaps | string | ConfigMap count. |
| completeness.pvcs | string | PersistentVolumeClaim count. |
| completeness.customResources | string | Custom resource count across all CRDs present in the source namespace. |
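Because every completeness value is an "actual/expected" string, spotting missing resources is a matter of splitting and comparing the two numbers. A sketch (the 0/1 PVC mismatch is hypothetical, for illustration):

```python
def parse_count(value: str) -> tuple:
    """Split an 'actual/expected' string such as '3/3' into two integers."""
    actual, expected = value.split("/")
    return int(actual), int(expected)

# Hypothetical .status.completeness block with one missing PVC
completeness = {
    "deployments": "3/3",
    "services": "5/5",
    "pvcs": "0/1",  # hypothetical mismatch for illustration
}

# Keep only the kinds where fewer resources were restored than expected
missing = {
    kind: value
    for kind, value in completeness.items()
    if parse_count(value)[0] < parse_count(value)[1]
}
print(missing)  # {'pvcs': '0/1'}
```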
validationLevels
validationLevels contains the result for each of the six structured validation stages. Each stage is a LevelResult with the following fields:
| Field | Type | Description |
|---|---|---|
| status | string | Stage outcome: pass, fail, or skip. |
| detail | string | Human-readable summary for this stage. |
| tested | []string | Names of resources or checks that were evaluated in this stage. |
| notTested | []string | Names of resources or checks that were identified but skipped (e.g., due to timeout or prior stage failure). |
The six stages in order:
| Stage key | Description |
|---|---|
| restoreIntegrity | The backup restore operation itself succeeded without provider errors. |
| completeness | Resource counts in the sandbox match the source namespace snapshot. |
| podStartup | All pods in the sandbox reached Running and Ready state within the timeout. |
| internalHealth | All checks defined in the referenced HealthCheckPolicy passed. |
| crossNamespaceDeps | Services expected outside the sandbox (e.g., shared databases) are reachable. |
| rtoCompliance | The measured restore duration did not exceed sla.maxRTO. |
Querying reports with kubectl
```shell
# List all RestoreReport objects
kubectl get rr -n kymaros-system

# Show score, result, and source test for all reports
kubectl get rr -n kymaros-system -o wide

# Describe a specific report (full status)
kubectl describe rr my-app-nightly-20240315-030012 -n kymaros-system

# List reports for a specific test (label selector set by the controller)
kubectl get rr -n kymaros-system -l kymaros.io/test=my-app-nightly

# Get the score field directly
kubectl get rr my-app-nightly-20240315-030012 -n kymaros-system \
  -o jsonpath='{.status.score}'

# Get all checks and their status
kubectl get rr my-app-nightly-20240315-030012 -n kymaros-system \
  -o jsonpath='{range .status.checks[*]}{.name}{"\t"}{.status}{"\t"}{.message}{"\n"}{end}'

# List all failed reports across all tests
kubectl get rr -n kymaros-system \
  -o jsonpath='{range .items[?(@.status.result=="fail")]}{.metadata.name}{"\n"}{end}'
```
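For summaries that jsonpath cannot express comfortably, the JSON form of a report can be post-processed with a short script. A sketch against the status fields documented above; the trimmed report literal stands in for real kubectl get rr ... -o json output:

```python
def summarize(report: dict) -> str:
    """Produce a one-line summary from a RestoreReport's status block."""
    status = report["status"]
    rto = status["rto"]
    sla = "within SLA" if rto["withinSLA"] else "SLA exceeded"
    failed = [c["name"] for c in status.get("checks", []) if c["status"] == "fail"]
    line = (f"{report['metadata']['name']}: {status['result']} "
            f"(score {status['score']}, {rto['measured']} vs "
            f"{rto['target']}, {sla})")
    if failed:
        line += f"; failed checks: {', '.join(failed)}"
    return line

# Trimmed-down report document standing in for `kubectl get rr ... -o json`
report = {
    "metadata": {"name": "orders-db-validation-20240316-030008"},
    "status": {
        "score": 42, "result": "fail",
        "rto": {"measured": "18m25s", "target": "15m", "withinSLA": False},
        "checks": [{"name": "postgres-pod-ready", "status": "fail"}],
    },
}
print(summarize(report))
```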
Examples
Passing report
A nightly test that completed within SLA with all checks passing.
```yaml
apiVersion: restore.kymaros.io/v1alpha1
kind: RestoreReport
metadata:
  name: my-app-nightly-20240315-030012
  namespace: kymaros-system
  labels:
    kymaros.io/test: my-app-nightly
spec:
  testRef: my-app-nightly
status:
  score: 96
  result: pass
  startedAt: "2024-03-15T03:00:12Z"
  completedAt: "2024-03-15T03:11:47Z"
  rto:
    measured: "11m35s"
    target: "30m"
    withinSLA: true
  backup:
    name: my-app-20240314-230000
    age: "4h0m12s"
    size: "2.3 GiB"
  checks:
  - name: api-http-ready
    status: pass
    duration: "8s"
    message: "HTTP 200 received from /healthz"
  - name: worker-pod-running
    status: pass
    duration: "12s"
    message: "3/3 pods ready"
  completeness:
    deployments: "3/3"
    services: "5/5"
    secrets: "4/4"
    configMaps: "2/2"
    pvcs: "1/1"
    customResources: "0/0"
  validationLevels:
    restoreIntegrity:
      status: pass
      detail: "Velero restore completed without errors"
      tested:
      - my-app-20240314-230000
      notTested: []
    completeness:
      status: pass
      detail: "All expected resources found in sandbox"
      tested:
      - Deployment/api
      - Deployment/worker
      - Deployment/scheduler
      notTested: []
    podStartup:
      status: pass
      detail: "All pods reached Ready state within 4m30s"
      tested:
      - api
      - worker
      - scheduler
      notTested: []
    internalHealth:
      status: pass
      detail: "2/2 health checks passed"
      tested:
      - api-http-ready
      - worker-pod-running
      notTested: []
    crossNamespaceDeps:
      status: pass
      detail: "No cross-namespace dependencies declared"
      tested: []
      notTested: []
    rtoCompliance:
      status: pass
      detail: "11m35s measured against 30m target"
      tested:
      - sla.maxRTO
      notTested: []
```
Failing report
A test where pod startup timed out for one deployment, causing a cascade of skipped downstream stages.
```yaml
apiVersion: restore.kymaros.io/v1alpha1
kind: RestoreReport
metadata:
  name: orders-db-validation-20240316-030008
  namespace: kymaros-system
  labels:
    kymaros.io/test: orders-db-validation
spec:
  testRef: orders-db-validation
status:
  score: 42
  result: fail
  startedAt: "2024-03-16T03:00:08Z"
  completedAt: "2024-03-16T03:18:33Z"
  rto:
    measured: "18m25s"
    target: "15m"
    withinSLA: false
  backup:
    name: orders-daily-20240315-220000
    age: "5h0m8s"
    size: "8.7 GiB"
  checks:
  - name: postgres-pod-ready
    status: fail
    duration: "5m0s"
    message: "Pod orders-postgres-0 did not reach Ready state within timeout: CrashLoopBackOff (OOMKilled)"
  - name: api-http-ready
    status: skip
    duration: "0s"
    message: "Skipped: postgres-pod-ready failed, downstream checks aborted"
  - name: worker-pod-running
    status: skip
    duration: "0s"
    message: "Skipped: postgres-pod-ready failed, downstream checks aborted"
  completeness:
    deployments: "3/3"
    services: "5/5"
    secrets: "4/4"
    configMaps: "2/2"
    pvcs: "1/1"
    customResources: "2/2"
  validationLevels:
    restoreIntegrity:
      status: pass
      detail: "Velero restore completed without errors"
      tested:
      - orders-daily-20240315-220000
      notTested: []
    completeness:
      status: pass
      detail: "All expected resources found in sandbox"
      tested:
      - Deployment/api
      - Deployment/worker
      - StatefulSet/orders-postgres
      notTested: []
    podStartup:
      status: fail
      detail: "Pod orders-postgres-0 entered CrashLoopBackOff: container OOMKilled"
      tested:
      - api
      - worker
      notTested:
      - orders-postgres-0
    internalHealth:
      status: fail
      detail: "0/3 checks passed: 1 failed, 2 skipped due to upstream failure"
      tested:
      - postgres-pod-ready
      notTested:
      - api-http-ready
      - worker-pod-running
    crossNamespaceDeps:
      status: skip
      detail: "Skipped: pod startup did not complete"
      tested: []
      notTested:
      - shared-redis.platform-infra
    rtoCompliance:
      status: fail
      detail: "18m25s measured against 15m target; exceeded by 3m25s"
      tested:
      - sla.maxRTO
      notTested: []
```