# Backup and restore

## **Common requirements for all providers** <a href="#common-requirements-for-all-providers" id="common-requirements-for-all-providers"></a>

#### **CSI driver**

A CSI driver specific to your cloud provider must be installed in the cluster.
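
You can check which CSI drivers are registered in the cluster with:

```
kubectl get csidrivers
```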

#### **Snapshot controller**

Install the [Kubernetes external-snapshotter’s snapshot-controller](https://github.com/kubernetes-csi/external-snapshotter).

{% hint style="info" %}
The Kubernetes external-snapshotter can be deployed cluster-wide or per namespace.
{% endhint %}
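
For example, a minimal installation using the Piraeus Helm chart referenced in the provider sections below (the release name and namespace are illustrative):

```
helm repo add piraeus-charts https://piraeus.io/helm-charts/
helm repo update
helm install snapshot-controller piraeus-charts/snapshot-controller --namespace kube-system
```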

#### **Volume snapshot CRDs**

Ensure that the following CRDs are installed:

* `volumesnapshots.snapshot.storage.k8s.io`
* `volumesnapshotcontents.snapshot.storage.k8s.io`
* `volumesnapshotclasses.snapshot.storage.k8s.io`
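
A quick way to confirm that all three CRDs are present:

```
kubectl get crd | grep snapshot.storage.k8s.io
```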

#### **VolumeSnapshotClass**

A `VolumeSnapshotClass` must be defined and configured for the respective CSI driver.
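
You can list the classes currently defined in the cluster with:

```
kubectl get volumesnapshotclass
```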

## **AWS (Amazon Web Services)** <a href="#aws-amazon-web-services" id="aws-amazon-web-services"></a>

#### **Requirements**

* Install the [**aws-ebs-csi-driver**](https://github.com/kubernetes-sigs/aws-ebs-csi-driver)
* Install the [**snapshot-controller**](https://artifacthub.io/packages/helm/piraeus-charts/snapshot-controller#configuration)
* Define the following `VolumeSnapshotClass`:

```
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-aws-vsc
  annotations:
    snapshot.storage.kubernetes.io/is-default-class: "true"
driver: ebs.csi.aws.com
deletionPolicy: Delete
```

#### **IAM permissions**

Ensure that the IAM role used by the CSI driver has the following permissions:

* `ec2:CreateSnapshot`
* `ec2:DeleteSnapshot`
* `ec2:DescribeSnapshots`
* `ec2:CreateTags`
* `ec2:DeleteTags`
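
As a sketch, these can be granted as an inline policy; the role and policy names below are illustrative, and the role must be the one your CSI driver actually uses (for example, via IRSA):

```
cat > ebs-csi-snapshot-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:CreateSnapshot",
        "ec2:DeleteSnapshot",
        "ec2:DescribeSnapshots",
        "ec2:CreateTags",
        "ec2:DeleteTags"
      ],
      "Resource": "*"
    }
  ]
}
EOF

aws iam put-role-policy \
  --role-name ebs-csi-driver-role \
  --policy-name ebs-csi-snapshot-policy \
  --policy-document file://ebs-csi-snapshot-policy.json
```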

## **Azure (Microsoft Azure)** <a href="#azure-microsoft-azure" id="azure-microsoft-azure"></a>

#### **Requirements**

* Install the [**azure-disk-csi-driver**](https://github.com/kubernetes-sigs/azuredisk-csi-driver)
* Enable and configure the snapshot feature
* Install the **snapshot-controller**
* Define the following `VolumeSnapshotClass`:

```
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-azure-vsc
  annotations:
    snapshot.storage.kubernetes.io/is-default-class: "true"
driver: disk.csi.azure.com
deletionPolicy: Delete
```

#### **Azure Role-Based Access Control (RBAC)**

Ensure that the managed identity associated with the cluster has the following permissions:

* `Microsoft.Compute/snapshots/*`
* `Microsoft.Compute/disks/*`
* `Microsoft.Resources/subscriptions/resourceGroups/read`
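
As a sketch, one way to grant these is a custom role assigned to the cluster's managed identity; the subscription ID, resource group, and identity client ID below are placeholders:

```
cat > csi-snapshot-role.json <<'EOF'
{
  "Name": "CSI Snapshot Operator",
  "Description": "Permissions needed for CSI disk snapshots",
  "Actions": [
    "Microsoft.Compute/snapshots/*",
    "Microsoft.Compute/disks/*",
    "Microsoft.Resources/subscriptions/resourceGroups/read"
  ],
  "AssignableScopes": ["/subscriptions/SUBSCRIPTION_ID"]
}
EOF

az role definition create --role-definition @csi-snapshot-role.json

az role assignment create \
  --assignee MANAGED_IDENTITY_CLIENT_ID \
  --role "CSI Snapshot Operator" \
  --scope /subscriptions/SUBSCRIPTION_ID/resourceGroups/RESOURCE_GROUP
```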

## **GCP (Google Cloud Platform)** <a href="#gcp-google-cloud-platform" id="gcp-google-cloud-platform"></a>

#### **Requirements**

* Install the [**gcp-compute-persistent-disk-csi-driver**](https://github.com/kubernetes-sigs/gcp-compute-persistent-disk-csi-driver)
* Install the **snapshot-controller**
* Enable the `compute.googleapis.com` API in your project
* Define the following `VolumeSnapshotClass`:

```
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-gce-pd-snapclass
  annotations:
    snapshot.storage.kubernetes.io/is-default-class: "true"
driver: pd.csi.storage.gke.io
deletionPolicy: Delete
```

#### **IAM permissions**

Ensure that the service account used by the node pool has the following permissions:

* `compute.snapshots.create`
* `compute.snapshots.delete`
* `compute.snapshots.get`
* `compute.disks.createSnapshot`
* `compute.snapshots.useReadOnly`
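
As a sketch, these can be bundled into a custom role and bound to the node pool's service account; the project ID, role ID, and service account below are placeholders:

```
gcloud iam roles create csiSnapshotOperator \
  --project=PROJECT_ID \
  --permissions=compute.snapshots.create,compute.snapshots.delete,compute.snapshots.get,compute.disks.createSnapshot,compute.snapshots.useReadOnly

gcloud projects add-iam-policy-binding PROJECT_ID \
  --member="serviceAccount:NODE_POOL_SA@PROJECT_ID.iam.gserviceaccount.com" \
  --role="projects/PROJECT_ID/roles/csiSnapshotOperator"
```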

## How to back up a PVC <a href="#how-to-backup-a-pvc" id="how-to-backup-a-pvc"></a>

By default, we run a CronJob, **psc-snapshot-pvc-job**, that checks for the existence of a `VolumeSnapshotClass`. If one is found, it automatically creates snapshots of all PVCs with a retention period of 7 days.
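
To confirm that the CronJob exists and see its schedule (the namespace depends on your installation):

```
kubectl get cronjob -A | grep psc-snapshot-pvc-job
```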

To verify that snapshots are being created, you may use the following command:

```
kubectl get volumesnapshot -A
```

## How to restore a PVC from a VolumeSnapshot <a href="#how-to-restore-a-pvc-from-a-volumesnapshot" id="how-to-restore-a-pvc-from-a-volumesnapshot"></a>

Use the following guide to create a new PersistentVolumeClaim (PVC) from an existing `VolumeSnapshot`.

#### Prerequisites <a href="#prerequisites" id="prerequisites"></a>

1. A valid `VolumeSnapshot` already exists.
2. A compatible `StorageClass` and `VolumeSnapshotClass` are installed.
3. Your CSI driver supports volume snapshot restore (e.g., AWS EBS, Azure Disk, GCP PD).

#### Step 1: Identify your VolumeSnapshot <a href="#step-1-identify-your-volumesnapshot" id="step-1-identify-your-volumesnapshot"></a>

Run the following command to list the available snapshots:

```
kubectl get volumesnapshot -A
```

For example, we found a snapshot for PostgreSQL named **snapshot-data-postgres-postgresql-0-1745265598**.
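
Before restoring, you can confirm that the snapshot is ready to use (the name is from the example above; set the correct namespace):

```
kubectl get volumesnapshot snapshot-data-postgres-postgresql-0-1745265598 -n NAMESPACE -o jsonpath='{.status.readyToUse}'
```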

#### Step 2: Create a PVC from the Snapshot <a href="#step-2-create-a-pvc-from-the-snapshot" id="step-2-create-a-pvc-from-the-snapshot"></a>

We need to create a new PVC from the existing snapshot:

* Ensure that the service that will use it is scaled to 0 (see the example after this list)
* Set the correct volume size
* Set the correct Namespace
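
For example, for a PostgreSQL StatefulSet (the StatefulSet name is illustrative; check yours with `kubectl get sts -n NAMESPACE`):

```
kubectl scale sts postgres-postgresql --replicas=0 -n NAMESPACE
```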

```
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-postgres-postgresql-0
  namespace: NAMESPACE
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: gp3
  resources:
    requests:
      storage: 8Gi
  dataSource:
    name: snapshot-data-postgres-postgresql-0-1745265598
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
EOF
```

* Verify that the PVC was created:

```
kubectl get pvc -A
```

#### Step 3: Scale up or install the PostgreSQL Helm chart to use the new PVC <a href="#step-3-scale-up-or-install-helm-chart-with-postgres-to-use-new-pvc" id="step-3-scale-up-or-install-helm-chart-with-postgres-to-use-new-pvc"></a>
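
For example, if PostgreSQL runs as a StatefulSet, scale it back up and verify that the pod binds the restored PVC (names are illustrative):

```
kubectl scale sts postgres-postgresql --replicas=1 -n NAMESPACE
kubectl get pods -n NAMESPACE | grep postgres
```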

## How to restore ClickHouse <a href="#how-to-restore-clickhouse" id="how-to-restore-clickhouse"></a>

1. Scale ClickHouse and clickhouse-keeper down to 0 replicas, and delete their PVCs:

```
kubectl scale sts chi-clickhouse-clickhouse-0-0 chi-clickhouse-clickhouse-0-1 clickhouse-keeper --replicas=0
kubectl delete pvc data-volume-template-chi-clickhouse-clickhouse-0-0-0 data-volume-template-chi-clickhouse-clickhouse-0-1-0 clickhouse-keeper-datadir-volume-clickhouse-keeper-0 clickhouse-keeper-datadir-volume-clickhouse-keeper-1 clickhouse-keeper-datadir-volume-clickhouse-keeper-2
```

2. Run the command below to list the available snapshots:

```
kubectl get volumesnapshot -A
```

{% hint style="info" %}
**Example:** the snapshots for ClickHouse named **`snapshot-data-volume-template-chi-clickhouse-clickhouse-0-1-0-202504251454`** and **`snapshot-data-volume-template-chi-clickhouse-clickhouse-0-0-0-202504251454`** were found.
{% endhint %}

3. Identify the default StorageClass.

```
kubectl get storageclass
```

4. Use the following commands to create the PVCs from the snapshots:

{% hint style="info" %}
Ensure you replace the namespace, snapshot name, volume size, and storage class with the actual values.
{% endhint %}

```
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-volume-template-chi-clickhouse-clickhouse-0-0-0
  namespace: NAMESPACE
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ebs-sc
  resources:
    requests:
      storage: 150Gi
  dataSource:
    name: snapshot-data-volume-template-chi-clickhouse-clickhouse-0-0-0-202504251454
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
EOF

kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-volume-template-chi-clickhouse-clickhouse-0-1-0
  namespace: NAMESPACE
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ebs-sc
  resources:
    requests:
      storage: 150Gi
  dataSource:
    name: snapshot-data-volume-template-chi-clickhouse-clickhouse-0-1-0-202504251454
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
EOF
```

5. Scale up clickhouse-keeper to 3 replicas, and wait for all 3 replicas to be running.

```
kubectl scale sts clickhouse-keeper --replicas=3
```
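
You can wait for the rollout to finish with:

```
kubectl rollout status sts clickhouse-keeper
```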

6. Scale the ClickHouse StatefulSets back up to 1 replica:

```
kubectl scale sts chi-clickhouse-clickhouse-0-0 chi-clickhouse-clickhouse-0-1 --replicas=1
```

7. Once all pods are running, exec into a ClickHouse pod and check the status of the tables:

```
kubectl exec -it chi-clickhouse-clickhouse-0-0-0 -c clickhouse-pod -- bash
clickhouse-client
select count(*) from psc_data_provider.node_reports;
select count(*) from psc_data_provider.workload_reports;
select database, table, replica_name, replica_path from system.replicas where is_readonly;
```

8. If the last command returns some tables, run the `SYSTEM RESTORE REPLICA` command for every affected table:

```
SYSTEM RESTORE REPLICA psc_data_provider.workload_reports ON CLUSTER `{cluster}`;
SYSTEM RESTORE REPLICA psc_data_provider.node_reports ON CLUSTER `{cluster}`;
SYSTEM RESTORE REPLICA pipeline.flat_workloads ON CLUSTER `{cluster}`;
SYSTEM RESTORE REPLICA pipeline.flat_nodegroups ON CLUSTER `{cluster}`;
SYSTEM RESTORE REPLICA pipeline.aggregated_successful_minutes ON CLUSTER `{cluster}`;
SYSTEM RESTORE REPLICA pipeline.aggregated_1h_workloads ON CLUSTER `{cluster}`;
SYSTEM RESTORE REPLICA pipeline.aggregated_1h_nodegroups ON CLUSTER `{cluster}`;
SYSTEM RESTORE REPLICA pipeline.aggregated_1d_workloads ON CLUSTER `{cluster}`;
SYSTEM RESTORE REPLICA pipeline.aggregated_1d_nodegroups ON CLUSTER `{cluster}`;
```

Then check again:

```
select database, table, replica_name, replica_path from system.replicas where is_readonly;
```

## How to restore MinIO <a href="#how-to-restore-minio" id="how-to-restore-minio"></a>

1. Scale `minio1` and `minio2` to 0 replicas and remove their associated PVCs.

```
kubectl scale deploy minio1 minio2 --replicas=0 
kubectl delete pvc minio1 minio2
```

2. List the available snapshots by running the following command:

```
kubectl get volumesnapshot -A
```

For example, we found snapshots for MinIO named **snapshot-minio1-202506040306** and **snapshot-minio2-202506040306**.

3. Identify the default StorageClass.

```
kubectl get storageclass
```

4. Create the PVCs from the snapshots by running the following commands:

{% hint style="info" %}
Replace **namespace**, **snapshot name**, **volume size**, and **storage class** with your actual values.
{% endhint %}

```
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: minio1
  namespace: NAMESPACE
  labels:
    app.kubernetes.io/instance: minio
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: minio1
    app.kubernetes.io/version: 2025.4.22
    helm.sh/chart: minio1-16.0.8
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: default
  resources:
    requests:
      storage: 10Gi
  dataSource:
    name: snapshot-minio1-202506040306
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
EOF

kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: minio2
  namespace: NAMESPACE
  labels:
    app.kubernetes.io/instance: minio
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: minio2
    app.kubernetes.io/version: 2025.4.22
    helm.sh/chart: minio1-16.0.8
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: default
  resources:
    requests:
      storage: 10Gi
  dataSource:
    name: snapshot-minio2-202506040306
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
EOF
```

5. Scale `minio1` and `minio2` to 1 replica, and wait until both are running.

```
kubectl scale deploy minio1 minio2 --replicas=1
kubectl get pods | grep minio
```
