While Bare Metal K8s with HA describes how to get my cluster running from bare metal, it doesn’t touch on any workloads running inside the cluster.
Argo CD#
To allow declarative setup of workloads inside the k8s cluster, I chose Argo CD.
Since Argo CD needs to be able to bootstrap itself without any other internal resources running, its kustomization is applied by my Ansible role from this repository.
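A minimal sketch of what such a bootstrap task could look like, assuming kubectl and an admin kubeconfig are available on the target host (the path and kubeconfig location are placeholders, not the ones from my role):

# Hypothetical Ansible task: apply the Argo CD kustomization directly with kubectl.
- name: Bootstrap Argo CD from the kustomization
  ansible.builtin.command:
    cmd: kubectl apply -k /opt/argocd/bootstrap
  environment:
    KUBECONFIG: /etc/kubernetes/admin.conf
  changed_when: true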
From here on, workloads can be managed as Argo CD Applications.
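To illustrate, a minimal Application could look roughly like this. The repository URL is the one used later in this post, but the path and namespaces are placeholders rather than my actual setup:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: example-workload          # placeholder name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: ssh://git@git.local.ongy.net:7022/ongy/argocd.git
    path: example-workload        # placeholder path
    targetRevision: main
  destination:
    server: https://kubernetes.default.svc
    namespace: example            # placeholder namespace
  syncPolicy:
    automated:
      prune: true
      selfHeal: true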
Storage#
Rook#
The easiest way to run Ceph seems to be Rook. The operator can easily be deployed with Helm, in this case via an Argo CD Application.
The only interesting bit about the operator itself is the tolerations. Since the control-plane nodes double as storage providers, the respective pods need to be allowed to run on the control-plane nodes.
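For reference, a toleration for the default control-plane taint looks like this; whether it goes into the chart's Helm values or into the CephCluster placement depends on the component and chart version:

tolerations:
  - key: node-role.kubernetes.io/control-plane
    operator: Exists
    effect: NoSchedule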
Ceph Cluster + Pool#
With the operator running and rook CRDs installed, this application sets up a mirrored storage device.
The core of a Rook-based Ceph cluster is the CephCluster CRD.
Similar to the etcd cluster for the control-plane, Ceph needs to be convinced that it’s ok to run with only two storage devices.
cephConfig:
  global:
    osd_pool_default_size: "2"
    osd_pool_default_min_size: "1"
Quite similarly, the osd service and the mgr are pinned to two nodes, but a third node is required to run a mon instance. This way storage and the APIs run twice, while the third mon ensures quorum.
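As a rough sketch of the relevant CephCluster fields (the node names are placeholders, and the actual pinning in my cluster happens via placement rules that aren't shown here):

spec:
  mon:
    count: 3          # the third mon only exists for quorum
  mgr:
    count: 2
  storage:
    useAllNodes: false
    nodes:            # only the two nodes with actual disks provide OSDs
      - name: node-a  # placeholder
      - name: node-b  # placeholder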
CephFS#
At this point, it’s possible to create file systems and block storage on the pool.
The storage class allows volumes to be provisioned automatically.
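Roughly, and following the names from the Rook examples, the file system and storage class look like this. The filesystem name myfs matches the cephfs-myfs snapshot class used later, but the pool layout and secret names here are assumptions based on the Rook defaults, not necessarily what my cluster uses:

apiVersion: ceph.rook.io/v1
kind: CephFilesystem
metadata:
  name: myfs
  namespace: rook-ceph
spec:
  metadataPool:
    replicated:
      size: 2
  dataPools:
    - name: replicated
      replicated:
        size: 2
  metadataServer:
    activeCount: 1
    activeStandby: true
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-cephfs
provisioner: rook-ceph.cephfs.csi.ceph.com
parameters:
  clusterID: rook-ceph
  fsName: myfs
  pool: myfs-replicated   # data pool name depends on the Rook version and pool naming
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-cephfs-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-cephfs-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
reclaimPolicy: Delete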
With that in place, a PVC can be instantiated without an explicit PersistentVolume, e.g. for my Home Assistant config storage:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: homeassistant-config
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  storageClassName: rook-cephfs
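To use it, the claim is referenced as a volume in the workload. A hypothetical pod spec excerpt (the container and mount path are placeholders, not my actual Home Assistant manifest):

spec:
  containers:
    - name: homeassistant           # placeholder container
      image: ghcr.io/home-assistant/home-assistant:stable
      volumeMounts:
        - name: config
          mountPath: /config
  volumes:
    - name: config
      persistentVolumeClaim:
        claimName: homeassistant-config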
Snapshots#
Snapshots aren’t backups. But I like them for the 2 of the 3-2-1 strategy. They make it easy to recover data after messing something up, or to look at a previous state.
They don’t protect against hardware failure, but that’s what the cluster is for. So I consider redundant storage plus snapshots good enough for the local component.
Make sure to keep an offsite backup for the important things though!
Snapshots in rook#
Rook documents its snapshot support.
It’s integrated with the native Kubernetes snapshot concept, i.e. the VolumeSnapshot CRD.
So we only need to configure a VolumeSnapshotClass and can then use the Kubernetes-native mechanisms to create snapshots of volumes.
It’s a really simple resource.
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: cephfs-myfs
driver: rook-ceph.cephfs.csi.ceph.com # csi-provisioner-name
parameters:
  # Specify a string that identifies your cluster. Ceph CSI supports any
  # unique string. When Ceph CSI is deployed by Rook use the Rook namespace,
  # for example "rook-ceph".
  clusterID: rook-ceph # namespace:cluster
  csi.storage.k8s.io/snapshotter-secret-name: rook-csi-cephfs-provisioner
  csi.storage.k8s.io/snapshotter-secret-namespace: rook-ceph # namespace:cluster
deletionPolicy: Delete
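Creating a snapshot of the PVC above is then just another resource, for example (the snapshot name is arbitrary):

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: homeassistant-config-snap   # arbitrary name
spec:
  volumeSnapshotClassName: cephfs-myfs
  source:
    persistentVolumeClaimName: homeassistant-config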
Scheduled Snapshots#
While this gives us the ability to create snapshots, it doesn’t yet ensure that any snapshots actually exist.
There doesn’t seem to be an upstream solution for this, but there’s snapscheduler, which has its own CRDs that allow specifying a cron-like schedule for snapshot creation, along with various retention policies.
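A SnapshotSchedule could look roughly like this, assuming a current snapscheduler version; the schedule and retention values are illustrative, not the ones I actually use:

apiVersion: snapscheduler.backube/v1
kind: SnapshotSchedule
metadata:
  name: daily
spec:
  schedule: "0 3 * * *"            # cron syntax, here: daily at 03:00
  retention:
    maxCount: 7                    # keep the last 7 snapshots
  snapshotTemplate:
    snapshotClassName: cephfs-myfs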
The slightly annoying bit for my installation: these schedules are per Kubernetes namespace, but I use lots of namespaces.
To easily set up the various schedules I use a helm chart. It provides a couple of schedules that can be enabled or disabled via helm values.
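Purely as an illustration of the idea (these value names are hypothetical, not the actual interface of my chart), the values could look something like:

schedules:
  daily:
    enabled: true
    schedule: "0 3 * * *"
    maxCount: 7
  weekly:
    enabled: false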
The chart can then be instantiated in an Argo CD application like this:
- repoURL: ssh://git@git.local.ongy.net:7022/ongy/argocd.git
  path: snapshot-schedule
  targetRevision: main