
RestoreReport

API group: restore.kymaros.io/v1alpha1
Kind: RestoreReport
Short name: rr
Scope: Namespaced (typically kymaros-system)

A RestoreReport is created automatically by the Kymaros controller at the end of each RestoreTest run. It records the confidence score, RTO measurement, per-check results, and resource completeness for that single execution. Reports are read-only — you should not create or edit them manually.

The number of reports retained per test is governed by spec.historyLimit on the parent RestoreTest (default: 10). Older reports are deleted automatically when the limit is reached.
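The retention rule above can be sketched as follows. This is only an illustration of the described behaviour, not the controller's code: `prune_reports` and the `(name, created_at)` tuple layout are hypothetical, and the sketch assumes the oldest reports (by creation time) are the ones removed.

```python
from datetime import datetime


def prune_reports(reports, history_limit=10):
    """Split reports into (kept, deleted), keeping the newest `history_limit`.

    `reports` is a list of (name, created_at) tuples. Both the tuple layout
    and this function are hypothetical; the real controller may differ.
    """
    if history_limit < 0:
        raise ValueError("history_limit must be non-negative")
    # Newest first, so everything past the limit is the oldest surplus.
    newest_first = sorted(reports, key=lambda r: r[1], reverse=True)
    return newest_first[:history_limit], newest_first[history_limit:]


# Twelve reports with the default limit of 10: the two oldest are dropped.
reports = [(f"report-{i}", datetime(2024, 3, i + 1)) for i in range(12)]
kept, deleted = prune_reports(reports)
```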


Spec

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| testRef | string | Yes | Name of the RestoreTest resource that triggered this report. |

Status

The status block carries the full result of the restore validation run.

Top-level status fields

| Field | Type | Description |
| --- | --- | --- |
| score | int | Confidence score from 0 to 100. Score ≥ 90 = pass. Score 70–89 = partial. Score < 70 = fail. |
| result | string | Aggregated outcome: pass, fail, or partial. |
| startedAt | Time | UTC timestamp when the restore operation started. |
| completedAt | Time | UTC timestamp when all validation checks finished. |
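The score thresholds in the table can be expressed as a small helper. This is a minimal sketch: `result_for_score` is a hypothetical name, and it assumes `result` follows from `score` alone, which the controller may or may not do.

```python
def result_for_score(score: int) -> str:
    """Map a 0-100 confidence score to pass / partial / fail.

    Thresholds are taken from the field table; deriving `result` purely
    from `score` is an assumption made for illustration.
    """
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    if score >= 90:
        return "pass"
    if score >= 70:
        return "partial"
    return "fail"
```

With the two example reports on this page, a score of 96 maps to pass and a score of 42 maps to fail.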

rto

Recovery Time Objective (RTO) measurement for this run.

| Field | Type | Description |
| --- | --- | --- |
| rto.measured | Duration | Actual elapsed time from restore start to validation completion. |
| rto.target | Duration | The sla.maxRTO value from the parent RestoreTest at the time of the run. |
| rto.withinSLA | bool | true if measured is less than or equal to target. |
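The `withinSLA` rule can be checked client-side from the two duration strings. The sketch below assumes durations are written with `h`, `m`, and `s` units only, as in the examples on this page; `parse_duration` and `within_sla` are hypothetical helpers, not part of Kymaros.

```python
import re
from datetime import timedelta

_UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def parse_duration(text: str) -> timedelta:
    """Parse strings such as "11m35s" or "4h0m12s" into a timedelta.

    Only h, m and s units are handled (an assumption based on the
    examples shown); anything else raises ValueError.
    """
    if not re.fullmatch(r"(?:\d+(?:\.\d+)?[hms])+", text):
        raise ValueError(f"unsupported duration: {text!r}")
    parts = re.findall(r"(\d+(?:\.\d+)?)([hms])", text)
    seconds = sum(float(value) * _UNIT_SECONDS[unit] for value, unit in parts)
    return timedelta(seconds=seconds)


def within_sla(measured: str, target: str) -> bool:
    """True when measured is less than or equal to target."""
    return parse_duration(measured) <= parse_duration(target)
```

For instance, `within_sla("11m35s", "30m")` is true, while `within_sla("18m25s", "15m")` is false.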

backup

Metadata about the backup that was restored in this run.

| Field | Type | Description |
| --- | --- | --- |
| backup.name | string | Name of the backup object as recorded by the backup provider. |
| backup.age | Duration | Age of the backup at the time of restore (current time minus backup creation time). |
| backup.size | string | Approximate size of the backup data transferred, as reported by the provider. |

checks

checks is an array of CheckResult objects, one per health check executed. Each object has the following fields:

| Field | Type | Description |
| --- | --- | --- |
| name | string | Name of the check as defined in the HealthCheckPolicy. |
| status | string | Outcome of this individual check: pass, fail, or skip. |
| duration | Duration | Time taken to execute this check. |
| message | string | Human-readable detail. On failure, this contains the error or unexpected output. On skip, it explains why the check was not executed. |
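When post-processing a report, it can help to tally the `checks` array by status. The following is a minimal sketch over a list of dictionaries shaped like the CheckResult fields above; `summarize_checks` is a hypothetical helper.

```python
from collections import Counter


def summarize_checks(checks):
    """Count CheckResult entries per status (pass, fail, skip).

    `checks` is a list of dicts with at least a "status" key, mirroring
    the fields in the table above.
    """
    counts = Counter(check["status"] for check in checks)
    # Always report all three statuses, even when a count is zero.
    return {status: counts.get(status, 0) for status in ("pass", "fail", "skip")}


# One failed check followed by two skipped downstream checks.
example = [
    {"name": "postgres-pod-ready", "status": "fail"},
    {"name": "api-http-ready", "status": "skip"},
    {"name": "worker-pod-running", "status": "skip"},
]
summary = summarize_checks(example)
```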

completeness

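Resource count comparison between the source namespace and the restored sandbox. Each value is a string in the format "actual/expected", where actual is the count found in the restored sandbox and expected is the count in the source namespace.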

| Field | Type | Description |
| --- | --- | --- |
| completeness.deployments | string | Deployment count. Example: "3/3". |
| completeness.services | string | Service count. |
| completeness.secrets | string | Secret count. |
| completeness.configMaps | string | ConfigMap count. |
| completeness.pvcs | string | PersistentVolumeClaim count. |
| completeness.customResources | string | Custom resource count across all CRDs present in the source namespace. |
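Because the counts are strings rather than numbers, a consumer has to split them before comparing. A minimal sketch, assuming every value follows the "actual/expected" format exactly; both helper names are hypothetical.

```python
def parse_completeness(value: str) -> tuple[int, int]:
    """Split an "actual/expected" string into two integers."""
    actual_text, separator, expected_text = value.partition("/")
    if not separator:
        raise ValueError(f"expected 'actual/expected', got {value!r}")
    return int(actual_text), int(expected_text)


def missing_resources(completeness: dict) -> dict:
    """Return {resource kind: number missing} for kinds below their expected count."""
    missing = {}
    for kind, value in completeness.items():
        actual, expected = parse_completeness(value)
        if actual < expected:
            missing[kind] = expected - actual
    return missing
```

For a block such as `{"deployments": "2/3", "services": "5/5"}`, `missing_resources` reports one missing deployment and nothing for services.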

validationLevels

validationLevels contains the result for each of the six structured validation stages. Each stage is a LevelResult with the following fields:

| Field | Type | Description |
| --- | --- | --- |
| status | string | Stage outcome: pass, fail, or skip. |
| detail | string | Human-readable summary for this stage. |
| tested | []string | Names of resources or checks that were evaluated in this stage. |
| notTested | []string | Names of resources or checks that were identified but skipped (e.g., due to timeout or prior stage failure). |

The six stages in order:

| Stage key | Description |
| --- | --- |
| restoreIntegrity | The backup restore operation itself succeeded without provider errors. |
| completeness | Resource counts in the sandbox match the source namespace snapshot. |
| podStartup | All pods in the sandbox reached Running and Ready state within the timeout. |
| internalHealth | All checks defined in the referenced HealthCheckPolicy passed. |
| crossNamespaceDeps | Services expected outside the sandbox (e.g., shared databases) are reachable. |
| rtoCompliance | The measured restore duration did not exceed sla.maxRTO. |
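Since the stages are ordered, the first failing stage is usually the most useful starting point when triaging a report. The sketch below walks a `validationLevels` mapping in the order listed above; `first_failed_stage` is a hypothetical helper, and treating the earliest failure as the likely root cause is an assumption rather than documented behaviour.

```python
STAGE_ORDER = (
    "restoreIntegrity",
    "completeness",
    "podStartup",
    "internalHealth",
    "crossNamespaceDeps",
    "rtoCompliance",
)


def first_failed_stage(validation_levels: dict):
    """Return the key of the earliest stage whose status is "fail", or None."""
    for key in STAGE_ORDER:
        stage = validation_levels.get(key)
        if stage is not None and stage.get("status") == "fail":
            return key
    return None


# Statuses only; detail, tested and notTested are omitted for brevity.
levels = {
    "restoreIntegrity": {"status": "pass"},
    "completeness": {"status": "pass"},
    "podStartup": {"status": "fail"},
    "internalHealth": {"status": "fail"},
    "crossNamespaceDeps": {"status": "skip"},
    "rtoCompliance": {"status": "fail"},
}
root_stage = first_failed_stage(levels)
```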

Querying reports with kubectl

# List all RestoreReport objects
kubectl get rr -n kymaros-system

# Show score, result, and source test for all reports
kubectl get rr -n kymaros-system -o wide

# Describe a specific report (full status)
kubectl describe rr my-app-nightly-20240315-030012 -n kymaros-system

# List reports for a specific test (label selector set by the controller)
kubectl get rr -n kymaros-system -l kymaros.io/test=my-app-nightly

# Get the score field directly
kubectl get rr my-app-nightly-20240315-030012 -n kymaros-system \
  -o jsonpath='{.status.score}'

# Get all checks and their status
kubectl get rr my-app-nightly-20240315-030012 -n kymaros-system \
  -o jsonpath='{range .status.checks[*]}{.name}{"\t"}{.status}{"\t"}{.message}{"\n"}{end}'

# List all failed reports across all tests
kubectl get rr -n kymaros-system \
  -o jsonpath='{range .items[?(@.status.result=="fail")]}{.metadata.name}{"\n"}{end}'
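For anything beyond a one-line query, it may be easier to post-process the JSON form of the list. The sketch below takes the parsed output of `kubectl get rr -n kymaros-system -o json` (a list object with an `items` array) and returns failed reports ordered by score. `failed_reports` is a hypothetical helper, and it assumes the status fields described on this page.

```python
def failed_reports(report_list: dict):
    """Return (name, score) pairs for reports whose result is "fail".

    `report_list` is the parsed JSON list object produced by
    `kubectl get ... -o json`. Lowest scores come first; a missing
    score is treated as 0 (an assumption made for sorting only).
    """
    rows = []
    for item in report_list.get("items", []):
        status = item.get("status", {})
        if status.get("result") == "fail":
            rows.append((item["metadata"]["name"], status.get("score", 0)))
    return sorted(rows, key=lambda row: row[1])
```

One way to feed it is to pipe the kubectl output into a script that calls `json.load(sys.stdin)` and passes the result to `failed_reports`.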

Examples

Passing report

A nightly test that completed within SLA with all checks passing.

apiVersion: restore.kymaros.io/v1alpha1
kind: RestoreReport
metadata:
  name: my-app-nightly-20240315-030012
  namespace: kymaros-system
  labels:
    kymaros.io/test: my-app-nightly
spec:
  testRef: my-app-nightly
status:
  score: 96
  result: pass
  startedAt: "2024-03-15T03:00:12Z"
  completedAt: "2024-03-15T03:11:47Z"
  rto:
    measured: "11m35s"
    target: "30m"
    withinSLA: true
  backup:
    name: my-app-20240314-230000
    age: "4h0m12s"
    size: "2.3 GiB"
  checks:
    - name: api-http-ready
      status: pass
      duration: "8s"
      message: "HTTP 200 received from /healthz"
    - name: worker-pod-running
      status: pass
      duration: "12s"
      message: "3/3 pods ready"
  completeness:
    deployments: "3/3"
    services: "5/5"
    secrets: "4/4"
    configMaps: "2/2"
    pvcs: "1/1"
    customResources: "0/0"
  validationLevels:
    restoreIntegrity:
      status: pass
      detail: "Velero restore completed without errors"
      tested:
        - my-app-20240314-230000
      notTested: []
    completeness:
      status: pass
      detail: "All expected resources found in sandbox"
      tested:
        - Deployment/api
        - Deployment/worker
        - Deployment/scheduler
      notTested: []
    podStartup:
      status: pass
      detail: "All pods reached Ready state within 4m30s"
      tested:
        - api
        - worker
        - scheduler
      notTested: []
    internalHealth:
      status: pass
      detail: "2/2 health checks passed"
      tested:
        - api-http-ready
        - worker-pod-running
      notTested: []
    crossNamespaceDeps:
      status: pass
      detail: "No cross-namespace dependencies declared"
      tested: []
      notTested: []
    rtoCompliance:
      status: pass
      detail: "11m35s measured against 30m target"
      tested:
        - sla.maxRTO
      notTested: []

Failing report

A test where one pod (orders-postgres-0) never reached Ready, so downstream checks and the cross-namespace stage were skipped and the RTO target was missed.

apiVersion: restore.kymaros.io/v1alpha1
kind: RestoreReport
metadata:
  name: orders-db-validation-20240316-030008
  namespace: kymaros-system
  labels:
    kymaros.io/test: orders-db-validation
spec:
  testRef: orders-db-validation
status:
  score: 42
  result: fail
  startedAt: "2024-03-16T03:00:08Z"
  completedAt: "2024-03-16T03:18:33Z"
  rto:
    measured: "18m25s"
    target: "15m"
    withinSLA: false
  backup:
    name: orders-daily-20240315-220000
    age: "5h0m8s"
    size: "8.7 GiB"
  checks:
    - name: postgres-pod-ready
      status: fail
      duration: "5m0s"
      message: "Pod orders-postgres-0 did not reach Ready state within timeout: CrashLoopBackOff (OOMKilled)"
    - name: api-http-ready
      status: skip
      duration: "0s"
      message: "Skipped: postgres-pod-ready failed, downstream checks aborted"
    - name: worker-pod-running
      status: skip
      duration: "0s"
      message: "Skipped: postgres-pod-ready failed, downstream checks aborted"
  completeness:
    deployments: "3/3"
    services: "5/5"
    secrets: "4/4"
    configMaps: "2/2"
    pvcs: "1/1"
    customResources: "2/2"
  validationLevels:
    restoreIntegrity:
      status: pass
      detail: "Velero restore completed without errors"
      tested:
        - orders-daily-20240315-220000
      notTested: []
    completeness:
      status: pass
      detail: "All expected resources found in sandbox"
      tested:
        - Deployment/api
        - Deployment/worker
        - StatefulSet/orders-postgres
      notTested: []
    podStartup:
      status: fail
      detail: "Pod orders-postgres-0 entered CrashLoopBackOff: container OOMKilled"
      tested:
        - api
        - worker
      notTested:
        - orders-postgres-0
    internalHealth:
      status: fail
      detail: "0/3 checks passed: 1 failed, 2 skipped due to upstream failure"
      tested:
        - postgres-pod-ready
      notTested:
        - api-http-ready
        - worker-pod-running
    crossNamespaceDeps:
      status: skip
      detail: "Skipped: pod startup did not complete"
      tested: []
      notTested:
        - shared-redis.platform-infra
    rtoCompliance:
      status: fail
      detail: "18m25s measured against 15m target, exceeded by 3m25s"
      tested:
        - sla.maxRTO
      notTested: []