Scoring

After each restore validation cycle, Kymaros computes a score from 0 to 100. The score is derived from six independently computed levels and determines whether the test result is reported as pass, partial, or fail.

Formula

The score is the sum of all six level contributions:

Score = L1 + L2 + L3 + L4 + L5 + L6

Where:

// Level 1 — Restore Integrity (25 pts)
if RestoreSucceeded {
    L1 = 25
} else {
    L1 = 0
}

// Level 2 — Completeness (max 20 pts)
L2 = int(CompletenessRatio * 20)

// Level 3 — Pod Startup (max 20 pts)
L3 = int(PodsReadyRatio * 20)

// Level 4 — Health Checks (max 20 pts)
L4 = int(HealthChecksPassRatio * 20)

// Level 5 — Cross-NS Dependencies (max 10 pts, currently 0)
L5 = int(DepsCoverageRatio * 10) // DepsCoverageRatio hardcoded to 0

// Level 6 — RTO Compliance (5 pts)
if RTOWithinSLA {
    L6 = 5
} else {
    L6 = 0
}

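int() truncates toward zero, which is equivalent to floor for the non-negative ratios used here. For example, a ratio of 0.99 contributes int(19.8) = 19, not 20.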
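The formula above can be sketched as a small Go function. This is an illustrative sketch rather than the controller's actual code: the `Inputs` struct and the `Score` function are hypothetical names, and only the field names and weights are taken from the formula.

```go
package main

import "fmt"

// Inputs groups the six signals named in the formula.
// The struct is hypothetical; only the field names come from the formula.
type Inputs struct {
	RestoreSucceeded      bool
	CompletenessRatio     float64 // expected range 0.0 to 1.0
	PodsReadyRatio        float64 // expected range 0.0 to 1.0
	HealthChecksPassRatio float64 // expected range 0.0 to 1.0
	DepsCoverageRatio     float64 // hardcoded to 0 in the current release
	RTOWithinSLA          bool
}

// Score sums the six level contributions.
// int() on a float64 truncates toward zero.
func Score(in Inputs) int {
	l1 := 0
	if in.RestoreSucceeded {
		l1 = 25
	}
	l2 := int(in.CompletenessRatio * 20)
	l3 := int(in.PodsReadyRatio * 20)
	l4 := int(in.HealthChecksPassRatio * 20)
	l5 := int(in.DepsCoverageRatio * 10)
	l6 := 0
	if in.RTOWithinSLA {
		l6 = 5
	}
	return l1 + l2 + l3 + l4 + l5 + l6
}

func main() {
	// Everything restored and healthy, but the RTO was missed.
	in := Inputs{
		RestoreSucceeded:      true,
		CompletenessRatio:     1.0,
		PodsReadyRatio:        1.0,
		HealthChecksPassRatio: 1.0,
		RTOWithinSLA:          false,
	}
	fmt.Println(Score(in)) // prints 85 (25 + 20 + 20 + 20 + 0 + 0)
}
```

Leaving `DepsCoverageRatio` at its zero value mirrors the current release, where Level 5 always contributes 0.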

Thresholds

| Score Range | Result |
| --- | --- |
| >= 90 | Pass |
| 70 – 89 | Partial |
| < 70 | Fail |
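The thresholds can be expressed as a simple classification function. The sketch below is an assumption about how such a mapping might look: the `Classify` name and the exact result strings are hypothetical, and only the boundaries (90 and 70) come from the table.

```go
package main

import "fmt"

// Classify maps a score in the range 0 to 100 to a result name.
// The string values are illustrative; only the boundaries come from the table.
func Classify(score int) string {
	switch {
	case score >= 90:
		return "Pass"
	case score >= 70:
		return "Partial"
	default:
		return "Fail"
	}
}

func main() {
	for _, s := range []int{90, 89, 70, 69} {
		fmt.Printf("%d -> %s\n", s, Classify(s))
	}
}
```

The boundary values are the interesting cases: 90 is the lowest Pass, and 70 is the lowest Partial.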

Maximum Score

The theoretical maximum is 100 (25 + 20 + 20 + 20 + 10 + 5). Because Level 5 (DepsCoverageRatio) is currently hardcoded to 0, the achievable maximum in the current release is 90 (25 + 20 + 20 + 20 + 0 + 5). A score of 90 therefore represents a fully passing restore under current implementation constraints.

Worked Examples

Scenario A — Full Pass, RTO Met (Score: 90)

All resources restored, all pods ready, all health checks passing, restore completed within RTO. Level 5 contributes 0 in the current release.

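| Level | Variable | Value | Contribution |
| --- | --- | --- | --- |
| 1 | RestoreSucceeded | true | 25 |
| 2 | CompletenessRatio | 1.0 | int(1.0 * 20) = 20 |
| 3 | PodsReadyRatio | 1.0 | int(1.0 * 20) = 20 |
| 4 | HealthChecksPassRatio | 1.0 | int(1.0 * 20) = 20 |
| 5 | DepsCoverageRatio | 0 (hardcoded) | 0 |
| 6 | RTOWithinSLA | true | 5 |
| Total | | | 90 (Pass) |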

This is the best achievable score in the current release.


Scenario B — Full Pass, RTO Missed (Score: 85)

Everything restored correctly, but the restore took longer than the configured SLA.

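| Level | Variable | Value | Contribution |
| --- | --- | --- | --- |
| 1 | RestoreSucceeded | true | 25 |
| 2 | CompletenessRatio | 1.0 | int(1.0 * 20) = 20 |
| 3 | PodsReadyRatio | 1.0 | int(1.0 * 20) = 20 |
| 4 | HealthChecksPassRatio | 1.0 | int(1.0 * 20) = 20 |
| 5 | DepsCoverageRatio | 0 (hardcoded) | 0 |
| 6 | RTOWithinSLA | false | 0 |
| Total | | | 85 (Partial) |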

The result is Partial because 85 falls in the 70–89 range. There are two remediation options: reduce the restore duration (for example, through backup storage tuning or snapshot optimization), or revise spec.rtoSLA if the current value does not reflect a realistic target.


Scenario C — Missing Resources (Score: 65)

Restore succeeded but a significant portion of resources and pods are absent or unhealthy. RTO was also missed.

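| Level | Variable | Value | Contribution |
| --- | --- | --- | --- |
| 1 | RestoreSucceeded | true | 25 |
| 2 | CompletenessRatio | 0.75 | int(0.75 * 20) = 15 |
| 3 | PodsReadyRatio | 0.75 | int(0.75 * 20) = 15 |
| 4 | HealthChecksPassRatio | 0.5 | int(0.5 * 20) = 10 |
| 5 | DepsCoverageRatio | 0 (hardcoded) | 0 |
| 6 | RTOWithinSLA | false | 0 |
| Total | | | 65 (Fail) |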

A score of 65 is below 70, so the result is Fail. In this case, 25% of resources were not restored (likely because they were excluded from the backup or failed to create in the sandbox), only 75% of pods became Ready, and half of the health checks failed.
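To make the percentages concrete, the sketch below derives each ratio from raw counts and reproduces the Scenario C total. The counts (9 of 12 resources, 3 of 4 pods, 1 of 2 health checks) are hypothetical values chosen to match the scenario, and the `ratio` helper, including its handling of a zero total, is an assumption rather than documented Kymaros behavior.

```go
package main

import "fmt"

// ratio returns part/total as a float64.
// Returning 0 when total is not positive is an assumption made for this
// sketch, not documented behavior.
func ratio(part, total int) float64 {
	if total <= 0 {
		return 0
	}
	return float64(part) / float64(total)
}

func main() {
	// Hypothetical counts chosen to match Scenario C.
	completeness := ratio(9, 12) // 75% of resources restored
	podsReady := ratio(3, 4)     // 75% of pods Ready
	healthPass := ratio(1, 2)    // half of the health checks passing

	l1 := 25 // restore succeeded
	l2 := int(completeness * 20)
	l3 := int(podsReady * 20)
	l4 := int(healthPass * 20)
	l5 := 0 // Level 5 is hardcoded to 0 in the current release
	l6 := 0 // RTO missed

	fmt.Println(l1 + l2 + l3 + l4 + l5 + l6) // prints 65
}
```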


Scenario D — Failed Restore (Score: 0)

The backup provider reported a complete restore failure. No data reached the sandbox.

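| Level | Variable | Value | Contribution |
| --- | --- | --- | --- |
| 1 | RestoreSucceeded | false | 0 |
| 2 | CompletenessRatio | 0 | int(0 * 20) = 0 |
| 3 | PodsReadyRatio | 0 | int(0 * 20) = 0 |
| 4 | HealthChecksPassRatio | 0 | int(0 * 20) = 0 |
| 5 | DepsCoverageRatio | 0 (hardcoded) | 0 |
| 6 | RTOWithinSLA | false | 0 |
| Total | | | 0 (Fail) |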

A score of 0 is the clearest signal that the backup cannot be restored. Inspect the backup provider logs immediately. Common root causes include corrupted object storage, expired credentials, or a backup that never completed successfully.

Regression Detection

The controller compares the new score against the previous RestoreReport for the same RestoreTest. If the score decreased, the RestoreReport is annotated as a regression. This allows alerting rules and dashboards to surface degradation trends independently of whether the absolute score crosses a threshold.
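The comparison can be sketched as follows. This is a minimal illustration under stated assumptions: the `IsRegression` name and the `hasPrevious` flag are hypothetical, and treating the first run (no previous report) as a non-regression is an assumption, since the behavior for that case is not described here.

```go
package main

import "fmt"

// IsRegression reports whether the current score is strictly lower than
// the previous one. hasPrevious is false when no earlier RestoreReport
// exists; treating that case as "no regression" is an assumption.
func IsRegression(previous, current int, hasPrevious bool) bool {
	return hasPrevious && current < previous
}

func main() {
	// Both scores are in the Partial range, yet the drop is still flagged.
	fmt.Println(IsRegression(85, 75, true)) // prints true

	// An unchanged or improved score is not a regression.
	fmt.Println(IsRegression(85, 85, true)) // prints false
	fmt.Println(IsRegression(85, 90, true)) // prints false

	// First run: nothing to compare against.
	fmt.Println(IsRegression(0, 65, false)) // prints false
}
```

The first call shows why regression detection is useful on its own: a drop from 85 to 75 does not change the reported result, but it is still a decrease worth surfacing.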