Backup and restore
Backup and restore guide for PerfectScale self-hosted deployments
Common requirements for all providers
CSI driver
A CSI driver specific to your cloud provider must be installed in the cluster.
Snapshot controller
Install the Kubernetes external-snapshotter’s snapshot-controller
Volume snapshot CRDs
Ensure that the following CRDs are installed:
volumesnapshots.snapshot.storage.k8s.io
volumesnapshotcontents.snapshot.storage.k8s.io
volumesnapshotclasses.snapshot.storage.k8s.io
VolumeSnapshotClass
A VolumeSnapshotClass must be defined and configured for the respective CSI driver.
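To confirm that these prerequisites are in place, you can check for the snapshot CRDs, the snapshot-controller, and at least one VolumeSnapshotClass. This is a quick sketch; the kube-system namespace assumes the upstream external-snapshotter manifests, so adjust it if your snapshot-controller is deployed elsewhere.
# Volume snapshot CRDs
kubectl get crd | grep snapshot.storage.k8s.io

# Snapshot controller (namespace may differ in your installation)
kubectl get pods -n kube-system | grep snapshot-controller

# At least one VolumeSnapshotClass
kubectl get volumesnapshotclass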
AWS (Amazon Web Services)
Requirements
Install the aws-ebs-csi-driver
Install the snapshot-controller
Define the following VolumeSnapshotClass:
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-aws-vsc
  annotations:
    snapshot.storage.kubernetes.io/is-default-class: "true"
driver: ebs.csi.aws.com
deletionPolicy: Delete
IAM permissions
Ensure that the IAM role used by the CSI driver has the following permissions:
ec2:CreateSnapshot
ec2:DeleteSnapshot
ec2:DescribeSnapshots
ec2:CreateTags
ec2:DeleteTags
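As a sketch of how to double-check this on EKS with IAM Roles for Service Accounts, you can confirm that the driver is registered and look up the role ARN annotated on its controller service account. The name ebs-csi-controller-sa is the default used by the aws-ebs-csi-driver Helm chart and EKS add-on; adjust it if your installation differs.
# Confirm the EBS CSI driver is registered
kubectl get csidriver ebs.csi.aws.com

# Show the IAM role bound to the controller service account (IRSA)
kubectl -n kube-system get sa ebs-csi-controller-sa \
  -o jsonpath='{.metadata.annotations.eks\.amazonaws\.com/role-arn}'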
Azure (Microsoft Azure)
Requirements
Install the azure-disk-csi-driver
Enable and configure the snapshot feature
Install the snapshot-controller
Define the following VolumeSnapshotClass:
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-azure-vsc
  annotations:
    snapshot.storage.kubernetes.io/is-default-class: "true"
driver: disk.csi.azure.com
deletionPolicy: Delete
Azure Role-Based Access Control (RBAC)
Ensure that the managed identity associated with the cluster has the needed permissions:
Microsoft.Compute/snapshots/*
Microsoft.Compute/disks/*
Microsoft.Resources/subscriptions/resourceGroups/read
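One way to review these permissions is to list the role assignments of the cluster identity with the Azure CLI. The commands below are illustrative and assume an AKS cluster with a system-assigned identity; depending on your setup, the identity that needs disk and snapshot permissions may instead be the kubelet identity.
# Principal ID of the cluster's managed identity (AKS example)
az aks show --resource-group <RESOURCE_GROUP> --name <CLUSTER_NAME> \
  --query identity.principalId -o tsv

# Role assignments for that identity
az role assignment list --assignee <PRINCIPAL_ID> -o table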
GCP (Google Cloud Platform)
Requirements
Install the gcp-compute-persistent-disk-csi-driver
Install the snapshot-controller
Enable the compute.googleapis.com API in your project
Define the following VolumeSnapshotClass:
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-gce-pd-snapclass
  annotations:
    snapshot.storage.kubernetes.io/is-default-class: "true"
driver: pd.csi.storage.gke.io
deletionPolicy: Delete
IAM permissions
Ensure that the service account used by the node pool has the following permissions:
compute.snapshots.create
compute.snapshots.delete
compute.snapshots.get
compute.disks.createSnapshot
compute.snapshots.useReadOnly
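To verify the project-level prerequisites, you can confirm that the Compute Engine API is enabled and inspect the roles granted to the node pool's service account. The commands below are a sketch; substitute your own project ID and service account email.
# Confirm the Compute Engine API is enabled
gcloud services list --enabled --filter="config.name=compute.googleapis.com"

# Roles bound to the node pool's service account
gcloud projects get-iam-policy <PROJECT_ID> \
  --flatten="bindings[].members" \
  --filter="bindings.members:<NODE_SA_EMAIL>" \
  --format="table(bindings.role)"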
How to back up a PVC
By default, a CronJob named psc-snapshot-pvc-job runs and checks for the existence of a VolumeSnapshotClass. If one is found, it automatically creates snapshots of all PVCs with a retention period of 7 days.
To verify that snapshots are being created, you may use the following command:
kubectl get volumesnapshot -A
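If you need an ad-hoc snapshot outside of the CronJob schedule, you can create a VolumeSnapshot manually. The manifest below is a sketch: the snapshot name and namespace are illustrative, and volumeSnapshotClassName can be omitted if a default class is set (csi-aws-vsc is the AWS example class defined above).
kubectl apply -f - <<EOF
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: manual-snapshot-data-postgres-postgresql-0
  namespace: NAMESPACE
spec:
  volumeSnapshotClassName: csi-aws-vsc
  source:
    persistentVolumeClaimName: data-postgres-postgresql-0
EOF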
How to restore a PVC from a VolumeSnapshot
Use the following guide to create a new PersistentVolumeClaim (PVC) from an existing VolumeSnapshot.
Prerequisites
A valid VolumeSnapshot already exists.
A compatible StorageClass and VolumeSnapshotClass are installed.
Your CSI driver supports volume snapshot restore (e.g., AWS EBS, Azure Disk, GCP PD).
Step 1: Identify your VolumeSnapshot
Run the following command to list the available snapshots:
kubectl get volumesnapshot -A
For example, we found a snapshot for PostgreSQL named snapshot-data-postgres-postgresql-0-1745265598.
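Before restoring, you can confirm that this snapshot is ready to use by checking its READYTOUSE status (using the example name above and your own namespace):
kubectl get volumesnapshot snapshot-data-postgres-postgresql-0-1745265598 -n NAMESPACE
kubectl describe volumesnapshot snapshot-data-postgres-postgresql-0-1745265598 -n NAMESPACE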
Step 2: Create a PVC from the Snapshot
Create a new PVC from the existing snapshot, keeping the following in mind:
Ensure that the service that will use the PVC is scaled to 0 (see the example after this list)
Set the correct volume size
Set the correct namespace
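For the first point, a minimal example of scaling the consumer down, assuming the PostgreSQL StatefulSet is named postgres-postgresql (derived from the PVC name in this example):
kubectl scale sts postgres-postgresql --replicas=0 -n NAMESPACE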
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-postgres-postgresql-0
  namespace: NAMESPACE
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: gp3
  resources:
    requests:
      storage: 8Gi
  dataSource:
    name: snapshot-data-postgres-postgresql-0-1745265598
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
EOF
Verify that the PVC was created:
kubectl get pvc -A
Step 3: Scale up or install the Helm chart with PostgreSQL to use the new PVC
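For example, assuming the same postgres-postgresql StatefulSet name as above, scaling it back up looks like this:
kubectl scale sts postgres-postgresql --replicas=1 -n NAMESPACE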
How to restore ClickHouse
Scale ClickHouse and clickhouse-keeper down to 0 replicas, and delete their PVCs.
kubectl scale sts chi-clickhouse-clickhouse-0-0 chi-clickhouse-clickhouse-0-1 clickhouse-keeper --replicas=0
kubectl delete pvc data-volume-template-chi-clickhouse-clickhouse-0-0-0 data-volume-template-chi-clickhouse-clickhouse-0-1-0 clickhouse-keeper-datadir-volume-clickhouse-keeper-0 clickhouse-keeper-datadir-volume-clickhouse-keeper-1 clickhouse-keeper-datadir-volume-clickhouse-keeper-2
Run the command below to list the available snapshots:
kubectl get volumesnapshot -A
Identify the default StorageClass.
kubectl get storageclass
Use the following commands to create the PVCs from the snapshots (set storageClassName to match your cluster's StorageClass; ebs-sc is used in this example):
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-volume-template-chi-clickhouse-clickhouse-0-0-0
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ebs-sc
  resources:
    requests:
      storage: 150Gi
  dataSource:
    name: snapshot-data-volume-template-chi-clickhouse-clickhouse-0-0-0-202504251454
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
EOF
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-volume-template-chi-clickhouse-clickhouse-0-1-0
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ebs-sc
  resources:
    requests:
      storage: 150Gi
  dataSource:
    name: snapshot-data-volume-template-chi-clickhouse-clickhouse-0-1-0-202504251454
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
EOF
Scale up clickhouse-keeper to 3 replicas, and wait for all 3 replicas to be running.
kubectl scale sts clickhouse-keeper --replicas=3
Scale up ClickHouse:
kubectl scale sts chi-clickhouse-clickhouse-0-0 chi-clickhouse-clickhouse-0-1 --replicas=1
Once all pods are running, exec into a ClickHouse pod and check the status of the tables:
kubectl exec -it chi-clickhouse-clickhouse-0-0-0 -c clickhouse-pod -- bash
clickhouse-client
select count(*) from psc_data_provider.node_reports;
select count(*) from psc_data_provider.workload_reports;
select database, table, replica_name, replica_path from system.replicas where is_readonly;
If the last command returns any tables, run the SYSTEM RESTORE REPLICA command for every table listed:
SYSTEM RESTORE REPLICA psc_data_provider.workload_reports ON CLUSTER `{cluster}`;
SYSTEM RESTORE REPLICA psc_data_provider.node_reports ON CLUSTER `{cluster}`;
SYSTEM RESTORE REPLICA pipeline.flat_workloads ON CLUSTER `{cluster}`;
SYSTEM RESTORE REPLICA pipeline.flat_nodegroups ON CLUSTER `{cluster}`;
SYSTEM RESTORE REPLICA pipeline.aggregated_successful_minutes ON CLUSTER `{cluster}`;
SYSTEM RESTORE REPLICA pipeline.aggregated_1h_workloads ON CLUSTER `{cluster}`;
SYSTEM RESTORE REPLICA pipeline.aggregated_1h_nodegroups ON CLUSTER `{cluster}`;
SYSTEM RESTORE REPLICA pipeline.aggregated_1d_workloads ON CLUSTER `{cluster}`;
SYSTEM RESTORE REPLICA pipeline.aggregated_1d_nodegroups ON CLUSTER `{cluster}`;
Then check again:
select database, table, replica_name, replica_path from system.replicas where is_readonly;
How to restore Minio
Scale minio1 and minio2 to 0 replicas and remove their associated PVCs.
kubectl scale deploy minio1 minio2 --replicas=0
kubectl delete pvc minio1 minio2
List the available snapshots by running the following command:
kubectl get volumesnapshot -A
For example, we found snapshots for MinIO named snapshot-minio1-202506040306 and snapshot-minio2-202506040306.
Identify the default StorageClass.
kubectl get storageclass
Create the PVCs from the snapshots by running the following commands:
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: minio1
  labels:
    app.kubernetes.io/instance: minio
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: minio1
    app.kubernetes.io/version: 2025.4.22
    helm.sh/chart: minio1-16.0.8
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: default
  resources:
    requests:
      storage: 10Gi
  dataSource:
    name: snapshot-minio1-202506040306
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
EOF
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: minio2
  labels:
    app.kubernetes.io/instance: minio
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: minio2
    app.kubernetes.io/version: 2025.4.22
    helm.sh/chart: minio1-16.0.8
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: default
  resources:
    requests:
      storage: 10Gi
  dataSource:
    name: snapshot-minio2-202506040306
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
EOF
Scale minio1 and minio2 to 1 replica each, and wait until both are running.
kubectl scale deploy minio1 minio2 --replicas=1
kubectl get pods | grep minio