
Backup Adapters

Kymaros is backup-provider-agnostic by design. A pluggable adapter layer abstracts the differences between backup tools behind a common Go interface. The RestoreTest spec field spec.provider selects which adapter to use at runtime.

Adapter Interface

Every backup adapter implements the following interface:

type Adapter interface {
	ListBackups(ctx context.Context) ([]Backup, error)
	GetLatestBackup(ctx context.Context) (*Backup, error)
	TriggerRestore(ctx context.Context, backup *Backup, namespaceMapping map[string]string) (string, error)
	WaitForRestore(ctx context.Context, restoreID string) (RestoreResult, error)
	CleanupRestore(ctx context.Context, restoreID string) error
}

| Method | Purpose |
| --- | --- |
| ListBackups | Returns all available backups from the provider, pre-filtered to eligible phases |
| GetLatestBackup | Returns the most recent eligible backup |
| TriggerRestore | Instructs the provider to start a restore, with namespace remapping |
| WaitForRestore | Polls until the restore reaches a terminal state; returns the result |
| CleanupRestore | Removes any provider-side restore objects after validation |

The namespaceMapping parameter in TriggerRestore is a map of sourceNamespace → sandboxNamespace. This is how backup data is directed into isolated sandbox namespaces rather than the original production namespaces.
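
As a rough illustration of the mapping's shape, the sketch below builds a sourceNamespace → sandboxNamespace map and runs a sanity check on it. The validateMapping helper is hypothetical and not part of Kymaros; it only shows the kind of guard that keeps restored data out of production namespaces.

```go
package main

import "fmt"

// validateMapping is a hypothetical pre-flight check, not part of Kymaros.
// It rejects a mapping whose target is empty or is itself a source namespace,
// because such a mapping would not isolate the restore in a sandbox.
func validateMapping(mapping map[string]string) error {
	for src, dst := range mapping {
		if dst == "" {
			return fmt.Errorf("source namespace %q has no sandbox target", src)
		}
		if _, isSource := mapping[dst]; isSource {
			return fmt.Errorf("target %q for %q is also a source namespace", dst, src)
		}
	}
	return nil
}

func main() {
	// sourceNamespace -> sandboxNamespace, in the form passed to TriggerRestore.
	mapping := map[string]string{
		"payments": "kymaros-payments-nightly-a3f8k2",
	}
	fmt.Println(validateMapping(mapping))

	// Mapping a namespace onto itself would restore into production.
	fmt.Println(validateMapping(map[string]string{"payments": "payments"}))
}
```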

Velero Adapter

Velero is the only fully implemented adapter. It targets the velero.io/v1 API and communicates with Velero's CRDs directly.

Backup Filtering

ListBackups queries all Backup objects in the Velero namespace and filters them by phase. Only backups whose phase is marked eligible below are returned:

| Velero Backup Phase | Eligible |
| --- | --- |
| Completed | Yes |
| PartiallyFailed | Yes |
| InProgress | No |
| Failed | No |
| Deleting | No |

Backups in PartiallyFailed are included because they often contain a usable subset of data. Whether a restore from a PartiallyFailed backup succeeds depends on which resources were captured before the failure.
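
The filtering rule can be sketched in a few lines of Go. This is a minimal illustration, not the adapter's actual code: the Backup struct and its fields are placeholders, and choosing the "most recent" backup by creation timestamp is an assumption about how GetLatestBackup orders results.

```go
package main

import (
	"fmt"
	"time"
)

// Backup is a placeholder type; the real Kymaros type is not shown in the docs.
type Backup struct {
	Name      string
	Phase     string
	CreatedAt time.Time
}

// eligiblePhases mirrors the eligibility table: only these phases pass the filter.
var eligiblePhases = map[string]bool{"Completed": true, "PartiallyFailed": true}

// filterEligible keeps only backups whose phase is eligible for restore testing.
func filterEligible(all []Backup) []Backup {
	var out []Backup
	for _, b := range all {
		if eligiblePhases[b.Phase] {
			out = append(out, b)
		}
	}
	return out
}

// latest returns the newest backup by CreatedAt (assumed ordering), or nil if none.
func latest(backups []Backup) *Backup {
	var newest *Backup
	for i := range backups {
		if newest == nil || backups[i].CreatedAt.After(newest.CreatedAt) {
			newest = &backups[i]
		}
	}
	return newest
}

// sampleBackups returns illustrative data covering eligible and ineligible phases.
func sampleBackups() []Backup {
	base := time.Date(2026, 4, 1, 2, 0, 0, 0, time.UTC)
	return []Backup{
		{Name: "payments-2026-03-31", Phase: "Completed", CreatedAt: base.Add(-24 * time.Hour)},
		{Name: "payments-2026-04-01", Phase: "PartiallyFailed", CreatedAt: base},
		{Name: "payments-2026-04-02", Phase: "InProgress", CreatedAt: base.Add(24 * time.Hour)},
		{Name: "payments-2026-03-30", Phase: "Failed", CreatedAt: base.Add(-48 * time.Hour)},
	}
}

func main() {
	eligible := filterEligible(sampleBackups())
	for _, b := range eligible {
		fmt.Println("eligible:", b.Name, b.Phase)
	}
	if newest := latest(eligible); newest != nil {
		fmt.Println("latest:", newest.Name)
	}
}
```

Note that the newest backup overall is still InProgress in this sample, so it is skipped in favor of the newest eligible one.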

Triggering a Restore

TriggerRestore creates a Velero Restore CR in the Velero namespace. The namespace mapping passed by the controller is applied directly to the spec.namespaceMapping field of the Restore object. This causes Velero to redirect resources into the sandbox namespaces during restore execution.

# Example Restore CR created by the Velero adapter
apiVersion: velero.io/v1
kind: Restore
metadata:
  generateName: kymaros-payments-nightly-
  namespace: velero
spec:
  backupName: payments-2026-04-01-02-00-00
  namespaceMapping:
    payments: kymaros-payments-nightly-a3f8k2
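
For illustration, the same object can be assembled in Go as a generic map and serialized. This is only a sketch: buildRestore is a hypothetical helper, the "kymaros-<test name>-" prefix and the "velero" namespace are inferred from the example above, and a real adapter would more likely submit a typed or unstructured object through a Kubernetes client.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildRestore assembles a Velero Restore manifest as a generic map.
// Hypothetical helper: the naming prefix and namespace are assumptions
// taken from the example manifest, not a documented contract.
func buildRestore(testName, backupName string, mapping map[string]string) map[string]any {
	return map[string]any{
		"apiVersion": "velero.io/v1",
		"kind":       "Restore",
		"metadata": map[string]any{
			// generateName lets the API server append a unique suffix.
			"generateName": "kymaros-" + testName + "-",
			"namespace":    "velero",
		},
		"spec": map[string]any{
			"backupName": backupName,
			// Applied as-is: source namespace -> sandbox namespace.
			"namespaceMapping": mapping,
		},
	}
}

func main() {
	obj := buildRestore(
		"payments-nightly",
		"payments-2026-04-01-02-00-00",
		map[string]string{"payments": "kymaros-payments-nightly-a3f8k2"},
	)
	out, err := json.MarshalIndent(obj, "", "  ")
	if err != nil {
		fmt.Println("marshal error:", err)
		return
	}
	fmt.Println(string(out))
}
```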

Polling Restore Status

WaitForRestore polls the created Restore CR every 5 seconds until it reaches a terminal phase. The poll loop returns when:

  • Phase is Completed — restore succeeded.
  • Phase is PartiallyFailed — restore completed with some failures; treated as a partial success.
  • Phase is Failed — restore failed; RestoreSucceeded will be false at Level 1.

The 5-second poll interval inside WaitForRestore is the adapter-internal polling rate. At the controller level, the overall reconcile loop requeues every 30 seconds during PhaseRunning to check whether the restore has reached a terminal state.
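
The adapter-internal loop could look roughly like the following. It is a sketch under stated assumptions: phaseGetter stands in for whatever call reads the Restore CR's status, and the function names are hypothetical. Only the three terminal phases named above end the loop; any other phase is treated as still running.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// phaseGetter is a hypothetical stand-in for reading the Restore CR's phase.
type phaseGetter func(ctx context.Context) (string, error)

// isTerminal reports whether a phase ends the poll loop.
func isTerminal(phase string) bool {
	switch phase {
	case "Completed", "PartiallyFailed", "Failed":
		return true
	}
	return false
}

// waitForTerminalPhase checks the phase immediately, then once per interval,
// until a terminal phase is seen, the getter fails, or the context ends.
func waitForTerminalPhase(ctx context.Context, interval time.Duration, get phaseGetter) (string, error) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		phase, err := get(ctx)
		if err != nil {
			return "", err
		}
		if isTerminal(phase) {
			return phase, nil
		}
		select {
		case <-ctx.Done():
			return "", ctx.Err()
		case <-ticker.C:
		}
	}
}

// demoPoll simulates a restore that moves New -> InProgress -> Completed.
// It uses a short interval so the demo finishes quickly; the adapter uses 5s.
func demoPoll() (string, error) {
	phases := []string{"New", "InProgress", "Completed"}
	i := 0
	get := func(context.Context) (string, error) {
		p := phases[i]
		if i < len(phases)-1 {
			i++
		}
		return p, nil
	}
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	return waitForTerminalPhase(ctx, 10*time.Millisecond, get)
}

func main() {
	phase, err := demoPoll()
	fmt.Println(phase, err)
}
```

Passing a context with a deadline bounds the wait even if the restore never reaches a terminal phase.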

Handling PartiallyFailed Restores

A Velero restore in the PartiallyFailed phase is treated as a partial success by the adapter. The restore result is returned without a hard error, and subsequent validation levels (Completeness, Pod Startup, Health Checks) determine the actual score. This reflects a practical reality: many production backups end up PartiallyFailed because the backup user lacks permission to back up certain cluster-scoped resources, even though all namespace-scoped data is intact.
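
One way to express this mapping from terminal phase to result is sketched below. The RestoreResult fields are hypothetical; in particular, whether the real adapter sets a success flag for PartiallyFailed is not stated here, so treating it as a success with a Partial marker is an assumption.

```go
package main

import "fmt"

// RestoreResult is a placeholder; the field names are illustrative only.
type RestoreResult struct {
	Phase            string
	RestoreSucceeded bool
	Partial          bool
}

// resultForPhase converts a terminal Velero phase into a result.
// PartiallyFailed and Failed both return a nil error, so later
// validation levels still run and decide the final score.
func resultForPhase(phase string) (RestoreResult, error) {
	switch phase {
	case "Completed":
		return RestoreResult{Phase: phase, RestoreSucceeded: true}, nil
	case "PartiallyFailed":
		// Assumption: counted as a success, flagged as partial.
		return RestoreResult{Phase: phase, RestoreSucceeded: true, Partial: true}, nil
	case "Failed":
		return RestoreResult{Phase: phase}, nil
	default:
		return RestoreResult{}, fmt.Errorf("restore phase %q is not terminal", phase)
	}
}

func main() {
	for _, phase := range []string{"Completed", "PartiallyFailed", "Failed", "InProgress"} {
		res, err := resultForPhase(phase)
		fmt.Printf("%+v err=%v\n", res, err)
	}
}
```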

Cleanup

CleanupRestore deletes the Velero Restore CR after validation completes. This prevents accumulation of restore objects in the Velero namespace over time. Cleanup is called by the controller as part of the scoreAndReport flow, after sandbox namespaces have been deleted.

Provider Support Status

| Provider | Status | Notes |
| --- | --- | --- |
| Velero | Implemented | Full support: list, restore, poll, cleanup |
| Kasten K10 | Planned | Factory returns error for unsupported provider |
| TrilioVault | Planned | Factory returns error for unsupported provider |

Kasten and TrilioVault are recognized provider names in the RestoreTest spec, but neither is implemented yet. Selecting one fails during the PhaseRunning transition: the adapter factory returns an explicit unsupported-provider error, and the RestoreTest transitions to PhaseFailed with a descriptive status message.
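
The factory behavior described here might be sketched as follows. Everything in this block is illustrative: the Adapter interface is reduced to a single method for brevity, and the provider strings "velero", "kasten", and "triliovault" as well as the error variable are assumed names, not confirmed values from the Kymaros source.

```go
package main

import (
	"errors"
	"fmt"
)

// Adapter is a reduced stand-in for the full interface, kept small for the sketch.
type Adapter interface {
	Name() string
}

// ErrUnsupportedProvider is a hypothetical sentinel error for the factory.
var ErrUnsupportedProvider = errors.New("unsupported provider")

type veleroAdapter struct{}

func (veleroAdapter) Name() string { return "velero" }

// NewAdapter selects an implementation from the spec.provider value.
// Planned providers are recognized but rejected with a descriptive error.
func NewAdapter(provider string) (Adapter, error) {
	switch provider {
	case "velero":
		return veleroAdapter{}, nil
	case "kasten", "triliovault":
		return nil, fmt.Errorf("%w: %q is planned but not implemented", ErrUnsupportedProvider, provider)
	default:
		return nil, fmt.Errorf("%w: %q", ErrUnsupportedProvider, provider)
	}
}

// isUnsupported reports whether err wraps ErrUnsupportedProvider.
func isUnsupported(err error) bool {
	return errors.Is(err, ErrUnsupportedProvider)
}

func main() {
	for _, p := range []string{"velero", "kasten", "unknown"} {
		a, err := NewAdapter(p)
		if err != nil {
			// A controller would surface this message and mark the test failed.
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("selected adapter:", a.Name())
	}
}
```

Wrapping a sentinel error lets a caller distinguish "provider not supported" from other failures while still carrying a readable message for the status field.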

Adding a New Adapter

To integrate a new backup provider, implement the Adapter interface and register it in the adapter factory. The factory selects the implementation based on spec.provider. No changes to the controller or scoring logic are required; the adapter boundary is the only integration point.
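
A skeleton for a new adapter might start like this. It is a minimal sketch: Backup and RestoreResult are placeholder types, exampleAdapter and the "example" provider name are hypothetical, and the switch in newAdapter only stands in for the real factory, whose registration mechanism is not shown here.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Placeholder types; the real definitions are not shown in this page.
type Backup struct{ Name string }
type RestoreResult struct{ Phase string }

// Adapter repeats the interface from the top of this page.
type Adapter interface {
	ListBackups(ctx context.Context) ([]Backup, error)
	GetLatestBackup(ctx context.Context) (*Backup, error)
	TriggerRestore(ctx context.Context, backup *Backup, namespaceMapping map[string]string) (string, error)
	WaitForRestore(ctx context.Context, restoreID string) (RestoreResult, error)
	CleanupRestore(ctx context.Context, restoreID string) error
}

var errNotImplemented = errors.New("not implemented")

// exampleAdapter is a hypothetical provider with stubbed methods.
type exampleAdapter struct{}

func (a *exampleAdapter) ListBackups(ctx context.Context) ([]Backup, error) {
	return nil, errNotImplemented
}
func (a *exampleAdapter) GetLatestBackup(ctx context.Context) (*Backup, error) {
	return nil, errNotImplemented
}
func (a *exampleAdapter) TriggerRestore(ctx context.Context, backup *Backup, namespaceMapping map[string]string) (string, error) {
	return "", errNotImplemented
}
func (a *exampleAdapter) WaitForRestore(ctx context.Context, restoreID string) (RestoreResult, error) {
	return RestoreResult{}, errNotImplemented
}
func (a *exampleAdapter) CleanupRestore(ctx context.Context, restoreID string) error {
	return errNotImplemented
}

// Compile-time check that exampleAdapter satisfies Adapter.
var _ Adapter = (*exampleAdapter)(nil)

// newAdapter stands in for the factory; adding a case registers the provider.
func newAdapter(provider string) (Adapter, error) {
	switch provider {
	case "example":
		return &exampleAdapter{}, nil
	default:
		return nil, fmt.Errorf("unsupported provider %q", provider)
	}
}

func main() {
	a, err := newAdapter("example")
	if err != nil {
		fmt.Println(err)
		return
	}
	_, err = a.ListBackups(context.Background())
	fmt.Println("ListBackups:", err)
}
```

The compile-time assertion catches a missing or mistyped method before the adapter is ever selected at runtime.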