Backup and restore

Backup and restore guide for PerfectScale self-hosted deployments.

Common requirements for all providers

CSI driver

A CSI driver specific to your cloud provider must be installed in the cluster.

Snapshot controller

Install the snapshot-controller from the Kubernetes external-snapshotter project.

The snapshot controller can be deployed cluster-wide or per namespace.
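
If your platform does not already include a snapshot controller, one common way to install it, described in the kubernetes-csi/external-snapshotter repository, is to apply its manifests from a local clone (a sketch; pin a release tag that matches your Kubernetes version):

git clone https://github.com/kubernetes-csi/external-snapshotter.git
cd external-snapshotter
kubectl kustomize client/config/crd | kubectl create -f -
kubectl -n kube-system kustomize deploy/kubernetes/snapshot-controller | kubectl create -f -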

Volume snapshot CRDs

Ensure that the following CRDs are installed:

  • volumesnapshots.snapshot.storage.k8s.io

  • volumesnapshotcontents.snapshot.storage.k8s.io

  • volumesnapshotclasses.snapshot.storage.k8s.io
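
To confirm that the CRDs are present, you can run:

kubectl get crd | grep snapshot.storage.k8s.io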

VolumeSnapshotClass

A VolumeSnapshotClass must be defined and configured for the respective CSI driver.
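
Provider-specific examples are given in the sections below. To see which VolumeSnapshotClasses already exist in the cluster, you can run:

kubectl get volumesnapshotclass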

AWS (Amazon Web Services)

Requirements

Define a VolumeSnapshotClass for the AWS EBS CSI driver:

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-aws-vsc
  annotations:
    snapshot.storage.kubernetes.io/is-default-class: "true"
driver: ebs.csi.aws.com
deletionPolicy: Delete

IAM permissions

Ensure that the IAM role used by the CSI driver has the following permissions:

  • ec2:CreateSnapshot

  • ec2:DeleteSnapshot

  • ec2:DescribeSnapshots

  • ec2:CreateTags

  • ec2:DeleteTags
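
For reference, a minimal IAM policy statement covering these actions might look like the following sketch (the Resource scope is left open here and can be tightened for your environment):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:CreateSnapshot",
        "ec2:DeleteSnapshot",
        "ec2:DescribeSnapshots",
        "ec2:CreateTags",
        "ec2:DeleteTags"
      ],
      "Resource": "*"
    }
  ]
}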

Azure (Microsoft Azure)

Requirements

  • Enable and configure the snapshot feature

  • Install the snapshot-controller

  • Define the following VolumeSnapshotClass:

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-azure-vsc
  annotations:
    snapshot.storage.kubernetes.io/is-default-class: "true"
driver: disk.csi.azure.com
deletionPolicy: Delete

Azure Role-Based Access Control (RBAC)

Ensure that the managed identity associated with the cluster has the following permissions:

  • Microsoft.Compute/snapshots/*

  • Microsoft.Compute/disks/*

  • Microsoft.Resources/subscriptions/resourceGroups/read
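
If you prefer a dedicated custom role over a broader built-in role, a sketch of a role definition with these actions could look like this (the role name, description, and subscription ID are placeholders):

{
  "Name": "csi-snapshot-contributor",
  "IsCustom": true,
  "Description": "Permissions used by the Azure Disk CSI driver for snapshots",
  "Actions": [
    "Microsoft.Compute/snapshots/*",
    "Microsoft.Compute/disks/*",
    "Microsoft.Resources/subscriptions/resourceGroups/read"
  ],
  "NotActions": [],
  "AssignableScopes": [
    "/subscriptions/<subscription-id>"
  ]
}

The role can then be created with az role definition create and assigned to the cluster's managed identity with az role assignment create.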

GCP (Google Cloud Platform)

Requirements

Define a VolumeSnapshotClass for the Compute Engine Persistent Disk CSI driver:

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-gce-pd-snapclass
  annotations:
    snapshot.storage.kubernetes.io/is-default-class: "true"
driver: pd.csi.storage.gke.io
deletionPolicy: Delete

IAM permissions

Ensure that the service account used by the node pool has the following permissions:

  • compute.snapshots.create

  • compute.snapshots.delete

  • compute.snapshots.get

  • compute.disks.createSnapshot

  • compute.snapshots.useReadOnly
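
One way to grant these is through a custom role bound to the node pool's service account, roughly as follows (project ID, role ID, and service account are placeholders):

gcloud iam roles create csiSnapshotRole \
  --project=<project-id> \
  --title="CSI Snapshot Role" \
  --permissions=compute.snapshots.create,compute.snapshots.delete,compute.snapshots.get,compute.disks.createSnapshot,compute.snapshots.useReadOnly

gcloud projects add-iam-policy-binding <project-id> \
  --member="serviceAccount:<node-pool-service-account>" \
  --role="projects/<project-id>/roles/csiSnapshotRole"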

How to back up a PVC

By default, we have a CronJob psc-snapshot-pvc-job running that checks for the existence of a VolumeSnapshotClass. If one is found, it automatically creates snapshots for all PVCs with a retention period of 7 days.

To verify that snapshots are being created, you may use the following command:

kubectl get volumesnapshot -A
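
You can also check that the CronJob itself exists and has run recently (the namespace below is a placeholder for the namespace where PerfectScale is installed):

kubectl get cronjob psc-snapshot-pvc-job -n <perfectscale-namespace>
kubectl get jobs -n <perfectscale-namespace> | grep psc-snapshot-pvc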

How to restore a PVC from a VolumeSnapshot

Use the following guide to create a new PersistentVolumeClaim (PVC) from an existing VolumeSnapshot.

Prerequisites

  1. A valid VolumeSnapshot already exists.

  2. A compatible StorageClass and VolumeSnapshotClass are installed.

  3. Your CSI driver supports volume snapshot restore (e.g., AWS EBS, Azure Disk, GCP PD).

Step 1: Identify your VolumeSnapshot

Run the following command to list the available snapshots:

kubectl get volumesnapshot -A

For example, we found a snapshot for PostgreSQL named snapshot-data-postgres-postgresql-0-1745265598.

Step 2: Create a PVC from the Snapshot

We need to create a new PVC from the existing snapshot:

  • Ensure that the workload that will use the PVC is scaled to 0 (see the example command after this list)

  • Set the correct volume size

  • Set the correct Namespace
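
For example, if the PVC belongs to the bundled PostgreSQL, scaling its StatefulSet down to 0 might look like this (the StatefulSet name is illustrative; check yours with kubectl get sts):

kubectl scale statefulset postgres-postgresql --replicas=0 -n NAMESPACE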

kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-postgres-postgresql-0
  namespace: NAMESPACE
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: gp3
  resources:
    requests:
      storage: 8Gi
  dataSource:
    name: snapshot-data-postgres-postgresql-0-1745265598
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
EOF
  • Verify that the PVC was created:

kubectl get pvc -A

Step 3: Scale PostgreSQL back up (or install the Helm chart) so that it uses the new PVC
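
For example, scaling the PostgreSQL StatefulSet back up might look like this (names are illustrative):

kubectl scale statefulset postgres-postgresql --replicas=1 -n NAMESPACE
kubectl get pods -n NAMESPACE | grep postgresql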

How to restore ClickHouse

  1. Scale ClickHouse and clickhouse-keeper down to 0 replicas and delete their PVCs.

kubectl scale sts chi-clickhouse-clickhouse-0-0 chi-clickhouse-clickhouse-0-1 clickhouse-keeper --replicas=0
kubectl delete pvc data-volume-template-chi-clickhouse-clickhouse-0-0-0 data-volume-template-chi-clickhouse-clickhouse-0-1-0 clickhouse-keeper-datadir-volume-clickhouse-keeper-0 clickhouse-keeper-datadir-volume-clickhouse-keeper-1 clickhouse-keeper-datadir-volume-clickhouse-keeper-2
  2. Run the command below to list the available snapshots:

kubectl get volumesnapshot -A

Example: the ClickHouse snapshots named snapshot-data-volume-template-chi-clickhouse-clickhouse-0-1-0-202504251454 and snapshot-data-volume-template-chi-clickhouse-clickhouse-0-0-0-202504251454 were found.

  3. Identify the default StorageClass.

kubectl get storageclass
  4. Use the following commands to create the PVCs from the snapshots:

Ensure you replace the namespace, snapshot name, volume size, and storage class with the actual values.

kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-volume-template-chi-clickhouse-clickhouse-0-0-0
  namespace: NAMESPACE
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ebs-sc
  resources:
    requests:
      storage: 150Gi
  dataSource:
    name: snapshot-data-volume-template-chi-clickhouse-clickhouse-0-0-0-202504251454
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
EOF

kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-volume-template-chi-clickhouse-clickhouse-0-1-0
  namespace: NAMESPACE
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ebs-sc
  resources:
    requests:
      storage: 150Gi
  dataSource:
    name: snapshot-data-volume-template-chi-clickhouse-clickhouse-0-1-0-202504251454
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
EOF
  5. Scale up clickhouse-keeper to 3 replicas, and wait for all 3 replicas to be running.

kubectl scale sts clickhouse-keeper --replicas=3
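
One way to wait for the keeper replicas to become ready (assuming the StatefulSet uses a rolling update strategy) is:

kubectl rollout status statefulset/clickhouse-keeper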
  6. Scale the ClickHouse StatefulSets back up to 1 replica:

kubectl scale sts chi-clickhouse-clickhouse-0-0 chi-clickhouse-clickhouse-0-1 --replicas=1
  7. Once all pods are running, exec into a ClickHouse pod and check the status of the tables:

kubectl exec -it chi-clickhouse-clickhouse-0-0-0 -c clickhouse-pod -- bash
clickhouse-client
select count(*) from psc_data_provider.node_reports;
select count(*) from psc_data_provider.workload_reports;
select database, table, replica_name, replica_path from system.replicas where is_readonly;
  8. If the last query returns any tables, run the restore replica command for every one of them:

SYSTEM RESTORE REPLICA psc_data_provider.workload_reports ON CLUSTER `{cluster}`;
SYSTEM RESTORE REPLICA psc_data_provider.node_reports ON CLUSTER `{cluster}`;
SYSTEM RESTORE REPLICA pipeline.flat_workloads ON CLUSTER `{cluster}`;
SYSTEM RESTORE REPLICA pipeline.flat_nodegroups ON CLUSTER `{cluster}`;
SYSTEM RESTORE REPLICA pipeline.aggregated_successful_minutes ON CLUSTER `{cluster}`;
SYSTEM RESTORE REPLICA pipeline.aggregated_1h_workloads ON CLUSTER `{cluster}`;
SYSTEM RESTORE REPLICA pipeline.aggregated_1h_nodegroups ON CLUSTER `{cluster}`;
SYSTEM RESTORE REPLICA pipeline.aggregated_1d_workloads ON CLUSTER `{cluster}`;
SYSTEM RESTORE REPLICA pipeline.aggregated_1d_nodegroups ON CLUSTER `{cluster}`;

and check again:

select database, table, replica_name, replica_path from system.replicas where is_readonly;

How to restore MinIO

  1. Scale minio1 and minio2 to 0 replicas and remove their associated PVCs.

kubectl scale deploy minio1 minio2 --replicas=0 
kubectl delete pvc minio1 minio2
  2. List the available snapshots by running the following command:

kubectl get volumesnapshot -A

For example, we found MinIO snapshots named snapshot-minio1-202506040306 and snapshot-minio2-202506040306.

  3. Identify the default StorageClass.

kubectl get storageclass
  4. Create the PVCs from the snapshots by running the following commands:

Replace namespace, snapshot name, volume size, and storage class with your actual values.

kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: minio1
  namespace: NAMESPACE
  labels:
    app.kubernetes.io/instance: minio
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: minio1
    app.kubernetes.io/version: 2025.4.22
    helm.sh/chart: minio1-16.0.8
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: default
  resources:
    requests:
      storage: 10Gi
  dataSource:
    name: snapshot-minio1-202506040306
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
EOF

kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: minio2
  namespace: NAMESPACE
  labels:
    app.kubernetes.io/instance: minio
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: minio2
    app.kubernetes.io/version: 2025.4.22
    helm.sh/chart: minio1-16.0.8
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: default
  resources:
    requests:
      storage: 10Gi
  dataSource:
    name: snapshot-minio2-202506040306
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
EOF
  5. Scale minio1 and minio2 to 1 replica, and wait until both are running.

kubectl scale deploy minio1 minio2 --replicas=1
kubectl get pods | grep minio
